In music, artificial intelligence (AI), and more specifically machine learning (ML), has quietly emerged as the black box behind nearly all of our interactions with music online. In fact, most of us have unknowingly been using AI and ML-based technology for years. Music listening platforms like YouTube, Spotify, Apple Music and Pandora use artificial intelligence to refine our experience on their respective services, for example by recommending the perfect track to play next, removing dead air and adjusting the volume in real time.
Machine learning is a subfield of AI that teaches machines how to learn. It looks for patterns in ‘training data’ and then uses those patterns to build models based on that data. Deep learning is a class of machine learning that, once built, can keep improving without human interference by collecting more data from how it is used and tweaking its output accordingly.
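To make the idea of finding patterns in training data concrete, here is a minimal sketch in Python. It is not how any of the services mentioned here work; it is a toy first-order Markov chain, chosen only because it fits in a few lines. The melodies, the function names `train` and `generate`, and the fixed random seed are all assumptions made for illustration.

```python
import random
from collections import defaultdict

def train(melodies):
    """Build a model by recording which note follows which in the training data."""
    transitions = defaultdict(list)
    for melody in melodies:
        for current, following in zip(melody, melody[1:]):
            transitions[current].append(following)
    return dict(transitions)

def generate(model, start, length, seed=0):
    """Produce a new melody by repeatedly sampling a note seen after the last one."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    notes = [start]
    while len(notes) < length and notes[-1] in model:
        notes.append(rng.choice(model[notes[-1]]))
    return notes

# Hypothetical training data: three short melodies written as note names.
TRAINING_DATA = [
    ["C", "D", "E", "C"],
    ["C", "D", "E", "F", "G"],
    ["E", "F", "G", "E", "C"],
]

model = train(TRAINING_DATA)
melody = generate(model, start="C", length=8)
print(melody)
```

Every step in the generated melody is a transition that appeared somewhere in the training melodies, which is the sense in which the model has "learned" from them. Real systems use far richer models, but the loop of collecting data, extracting patterns and producing output is the same.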
Spotify’s Discover Weekly is the most obvious example of this. Apple’s voice assistant Siri is another: its synthesized voice is learned from real-world recordings, and it gets better at recognizing your voice over time.
But it’s not just recommendations that ML can attempt to master. Generating music is another, as when composer David Cope trained a computer on the Bach catalog to overcome writer’s block. More recently, artists like Actress and Holly Herndon have trained models on their own music, voice, themes and style, and made records with them, creating a virtual collaborator modeled on their own likeness.
Music creation software and plugins have also started to embrace ML, with iZotope’s Neutron heralding a new era in the way we produce music in the studio. More recently, Splice, Loopmasters and others have been using ML to recommend new samples to enhance your track, allowing you to search their libraries of millions of sounds based on more abstract attributes like harmonic profile and pitch.
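As a rough illustration of searching by abstract attributes, the sketch below ranks sounds by how closely their feature vectors match a track's profile, using cosine similarity. This is an assumed approach, not a description of how Splice or Loopmasters actually work. The sample names, the four-number feature vectors and the function names are hypothetical; a real system would derive features such as harmonic profile and pitch from the audio itself.

```python
import math

def cosine_similarity(a, b):
    """Return how closely two feature vectors point in the same direction (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def recommend(query, library, top_n=2):
    """Rank the samples in the library by similarity to the query profile."""
    ranked = sorted(
        library,
        key=lambda name: cosine_similarity(query, library[name]),
        reverse=True,
    )
    return ranked[:top_n]

# Hypothetical library: each sample is described by made-up feature values.
LIBRARY = {
    "warm_pad": [0.9, 0.1, 0.0, 0.8],
    "bright_pluck": [0.1, 0.9, 0.2, 0.1],
    "sub_bass": [0.7, 0.0, 0.1, 0.9],
    "noise_riser": [0.0, 0.3, 0.9, 0.0],
}

track_profile = [1.0, 0.0, 0.0, 1.0]  # assumed profile of the track being worked on
suggestions = recommend(track_profile, LIBRARY)
print(suggestions)
```

The point of the design is that nothing is matched by name or tag: two sounds are "similar" only because their measured attributes are close, which is what makes searching by feel rather than by keyword possible.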
DJing hasn’t been left untouched by ML either. VirtualDJ and Algoriddim’s djay software have introduced real-time stem separation, alongside AI-supported AutoMix capabilities. As with Spotify, Splice and Loopmasters, advanced recommendations and search are becoming more common across DJ software, while Pioneer DJ’s rekordbox recently introduced an AI-assisted vocal detector to avoid the dreaded vocal clashes.
The impact of this technology on the way we make, perform and listen to music will be far-reaching and at times dramatic. Unprecedented new legal and ethical questions will arise, and deepfakes will change our perception of what is real. Musical instruments, DAWs and tools will be completely rethought and redesigned. Basic mixing and production skills will be automated, and virtual DJs will master mixing and track selection. Collaboration, or identity theft, will be possible with any artist, dead or alive.
The music industry is about to face a storm of deepfakes. If we can’t distinguish between a real Beyoncé voice and a fake one, the can of worms is well and truly open.
Holly Herndon has been deeply invested in AI and machine learning for years, releasing albums and projects using the latest technology. For her 2019 album “PROTO” she developed an AI model called Spawn. In July of this year she launched Holly+, a project that allows anyone to upload audio and have it recreated in her voice as interpreted by the Holly+ algorithm, all based on hours of machine learning.
How this will develop is still uncertain. It could go a number of different ways. It could be a total nightmare scenario in which people’s vocal likenesses are used in ways they are not comfortable with, and artists could respond by trying to control it with heavy DRM (digital rights management). Herndon’s view is not to try to limit the technology, but to explore its creative potential.
Mat Dryhurst, a musician, researcher and lecturer at New York University’s Clive Davis Institute for Recorded Music and a long-time collaborator of Herndon, offers his view. “There will be people who do that shit just to get attention, which is annoying,” he says. “It’s so much better to get ahead of it and act responsibly, but at the same time celebrate what’s great about it.”
Whether or not the artist in question embraces the new technology, moral, ethical and legal questions remain unanswered.
“The sound of the voice itself is not copyrighted,” Professor Joe Bennett, forensic musicologist at Berklee College of Music, told Billboard in early 2021. Copyright law recognizes two objects: musical works and sound recordings. The musical work relates to the song, meaning its notes, chords and lyrics, while sound recording protection can only be applied to a specific track. This means that deepfake audio is a gray area, as the voice is not considered part of the composition [by law]. The same question arises when a singer’s sound is imitated by hiring a soundalike, such as her own backup singer, and the inevitable legal battle over deepfakes is looming.
While a model trained on a single singer to reproduce their likeness is a more clear-cut case, the waters become muddy when the data used to train the model is less clear.
“It is described as a ‘black box’, and it’s really hard to say why an algorithm produces what it does,” said Cherie Hu, an award-winning journalist and researcher covering music technology trends and author of the excellent Water & Music newsletter and community. “It’s a common problem with machine learning, and in music it has very specific copyright and royalty implications,” she added.
If you wanted to create a model based on tunes by Bicep, Herbie Hancock and Slipknot, who, if anyone, would be compensated for the use of these artists’ intellectual property? While sampling has its own problems, it is at least possible to identify specific parts of a sound. With modeling there is no start or end point, no clear references, and no way of knowing what music was used in training. Even if you intend to compensate the artists, it is impossible to quantify which chord progression or melody should be attributed to which artist.
“There’s a real concern: if the rightsholders and the artists themselves don’t become at least a little fluent in this, we might run into a problem,” continues Dryhurst. “Google created one of the most powerful companies in the world by scraping the internet without anyone’s permission. Google indexed everything and then sold services on top of being able to navigate through that information. In this new machine learning paradigm, there is the same opportunity to create the largest models. You can imagine a new service or DAW that ties in with one of those mega-models, where you can easily pick who you want your song to sound like.”
Lately, the discussion about fair compensation for musicians and songwriters has accelerated. This summer the UK Government’s Department of Culture, Media and Sport released a report stating that the music streaming model needed a “total reset”. As a new creative opportunity, AI also offers the chance to improve artist compensation across the board.
Where sampling has failed so many times, could modeling be an opportunity to rewrite the compensation rulebook? One innovative solution is the ownership concept behind Herndon’s Holly+. The model is governed by a decentralized autonomous organization (DAO), and revenue from approved works created with Holly+ is returned to the DAO to fund additional tools in the future.
Of course, AI is not just about creating models that enable identity theft, and the Beyoncé button won’t appear tomorrow. But as Dryhurst and Herndon suggest, clarifying potential problems now could head them off in the future. The UK’s performance rights organization agrees.
As important as it is to address these intellectual property issues, it’s also important to celebrate the creative possibilities these models offer. “The ability to collaborate with others using just your voice or, with their permission, to perform as others is pretty new and cool,” added Dryhurst.
“A person who plays the trombone makes very specific musical decisions because of the strange shape of the trombone, and you would make very different decisions if you played a violin. So open these bodies of physical resonance to everyone else.” In 2021, you can also see the OpenAI Jukebox project’s attempts to model certain artists, including Frank Sinatra, Katy Perry and Elvis Presley. It’s not perfect, but as a proof of concept it’s remarkable.