Newsletter - GoCast Israel

If you have ever stumbled across those online videos of Freddie Mercury singing what sounds like an a cappella rendition of “Another One Bites the Dust” or a version of Alanis Morissette’s “You Oughta Know” featuring only Flea’s distinctive slapped bass, then you’re already familiar with the concept of music source separation.

Simply put, music source separation is the use of technology to break a song into its constituent contributions, such as the vocals, bass, and drums. It is easy to achieve if you own the original multitrack studio recordings: You just adjust the mix to isolate a single track. But if you’re starting with a regular CD or MP3 audio file (where all the instruments and vocals have been mixed into a single stereo recording), even the most sophisticated software programs would struggle to precisely pluck out a single part. That is, until now.
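To see why the multitrack case is easy, here is a toy sketch in Python. It assumes a finished mix is simply the sample-by-sample sum of its stems; the stem names, the sample values, and the `mixdown` helper are all invented for illustration and have nothing to do with how Demucs works.

```python
# Toy illustration: a "song" as the sample-wise sum of its stems.
# The stem names and sample values are made up for this sketch.
stems = {
    "vocals": [0.10, 0.30, -0.20, 0.00],
    "bass":   [0.05, -0.05, 0.05, -0.05],
    "drums":  [0.50, 0.00, 0.00, 0.25],
}

def mixdown(stems, gains=None):
    """Sum the stems sample by sample, applying an optional gain per stem."""
    gains = gains or {}
    length = len(next(iter(stems.values())))
    return [
        sum(gains.get(name, 1.0) * track[i] for name, track in stems.items())
        for i in range(length)
    ]

# The regular release: every stem at full volume, collapsed into one signal.
full_mix = mixdown(stems)

# With the multitrack in hand, isolating the vocals is just a remix:
# mute every other stem by setting its gain to zero.
vocals_only = mixdown(stems, gains={"bass": 0.0, "drums": 0.0})

# Given only full_mix, many different stem combinations add up to the
# same samples, which is why separation from a finished mix is hard.
```

In this simplified picture, going from the stems to the mix is a single addition, while going back from the mix to the stems has no unique answer without some learned knowledge of what each instrument sounds like.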

Facebook’s AI researchers have developed a system that can do just that, with an uncanny level of accuracy. Created by Alexandre Defossez, a research scientist in Facebook AI’s Paris lab, the system uses AI to analyze a tune and then quickly split it into its component tracks. Called Demucs, it works by detecting complex patterns in sound waves, building a high-level understanding of which waveform patterns belong to each instrument or voice, and then neatly separating them.

It’s only a research project for now, but Defossez hopes it will have real-world benefits. He says technology like Demucs won’t just help musicians learn a tricky guitar riff or drum fill; it could also one day make it easier for AI assistants to hear voice commands in a noisy room, and enhance technology such as hearing aids and noise-canceling headphones. Defossez recently published a research paper explaining his work, and he also released the code so that other AI researchers can experiment with and build on Demucs.

Defossez, who is part of a Ph.D. program run by Facebook AI and France’s National Institute for Research in Computer Science and Automation (INRIA), says his goal is to make AI systems adept at recognizing the components of an audio source, just as they can now accurately separate out the different objects in a single photograph. “We haven’t reached the same level with audio,” he says.
