Scraping the Web Now, Asking for Permission Later

Apple

Federico Viticci, writing at MacStories about the details Apple shared on training its AI model on web content:

As a creator and website owner, I guess that these things will never sit right with me. Why should we accept that certain data sets require a licensing fee but anything that is found “on the open web” can be mindlessly scraped, parsed, and regurgitated by an AI? Web publishers (and especially indie web publishers these days, who cannot afford lawsuits or hiring law firms to strike expensive deals) deserve better.

I agree wholeheartedly. I felt similarly when I looked at the data that trained Google’s AI. I see Chorus and our forum very clearly in their training data. We didn’t agree to that. Our community never agreed to that. Google played a massive role in devaluing small and medium-sized websites (and the online ad business), and we’re certainly not going to be the ones getting any publishing deals. None of it sits well with me.

Google Unveils Music AI Sandbox

Google

Ty Pendlebury, writing for CNET:

Google has unveiled a new music-making tool it calls Music AI Sandbox, which enables loops to be created via AI prompts, as part of its I/O 2024 conference.

The tool, shown briefly today in a Google I/O video, appears to accept text input and provides short audio clips or “stems” based on the prompt, complete with a waveform representation of the invented sounds.

Spotify Recommending A.I.-Generated Music

Spotify has been recommending “A.I.-generated music” to some users:

My favorite example of this is AI music spreading across on Spotify right now. A user on X this week spotted an Artist page called Obscurest Vinyl that was promoted by Spotify’s Discovery Weekly.

The story behind the page is interesting. Obscurest Vinyl started as a Facebook page that would photoshop fake album covers for classic records that didn’t exist. The page recently shifted into posting AI songs to go with the fake album covers. As one commenter noted, you can tell the songs are AI because most of them feature bass and drum parts that don’t repeat in any discernible pattern. The account also regularly fights with users on Instagram who gripe about it using AI. 

Look, I think songs titled things like, “I Glued My Balls To My Butthole Again” are, honestly, pretty funny, AI or not. But they’re being uploaded to Apple Music and Spotify, which is where the snake starts to eat its own tail. Popular AI music generators like Suno clearly have datasets that include at least some copyrighted material (likely a lot). Which means, in this instance, Spotify is promoting and monetizing an account using an AI likely trained on the music that’s been uploaded to their platform that they don’t actually pay enough to support the creation of. And this is happening across every corner of the web right now.