Sophia Ciocca, writing for Hacker Noon:
The exact mechanisms behind NLP are beyond the scope of this article, but here’s what happens at a very high level: Spotify constantly crawls the web looking for blog posts and other written text about music, and figures out what people are saying about specific artists and songs: which adjectives and phrases are frequently used about those songs, and which other artists and songs are discussed alongside them.
While I don’t know the specifics of how Spotify then processes its scraped data, I can give you an understanding of how the Echo Nest used to work with such data. The Echo Nest would bucket it into what it called “cultural vectors” or “top terms.” Each artist and song had thousands of top terms that changed daily. Each term had an associated weight, which indicated how important the description was (roughly, the probability that someone would describe the music with that term).
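To make the idea of weighted top terms concrete, here is a minimal sketch in Python. It is not the Echo Nest's or Spotify's actual implementation: the artist names, terms, and weights are invented for illustration, and using cosine similarity to compare two term vectors is an assumption, since the quoted passage does not say how the vectors were compared.

```python
import math

# Hypothetical "top terms" for three made-up artists.
# Each weight stands in for the rough probability that someone
# describes the artist's music with that term (invented values).
top_terms = {
    "artist_a": {"upbeat": 0.07, "disco": 0.06, "catchy": 0.03},
    "artist_b": {"disco": 0.05, "upbeat": 0.04, "funky": 0.02},
    "artist_c": {"melancholic": 0.08, "acoustic": 0.05, "catchy": 0.01},
}


def cosine_similarity(a, b):
    """Compare two sparse term-weight vectors stored as dicts.

    Assumption: cosine similarity is used only as a plausible way to
    compare vectors; the source does not name a specific measure.
    """
    # Only terms present in both vectors contribute to the dot product.
    shared = set(a) & set(b)
    dot = sum(a[term] * b[term] for term in shared)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


sim_ab = cosine_similarity(top_terms["artist_a"], top_terms["artist_b"])
sim_ac = cosine_similarity(top_terms["artist_a"], top_terms["artist_c"])
print(f"a vs b: {sim_ab:.3f}")
print(f"a vs c: {sim_ac:.3f}")
```

In this toy data, artist_a and artist_b share their two heaviest terms, so they score as far more similar than artist_a and artist_c, which overlap only on one low-weight term. Storing each vector as a dict keeps it sparse, which matters if every artist really has thousands of terms.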