The machine learning algorithm I chose to use for my project is logistic regression. Essentially, the logistic function works as such:
1 / (1 + e^-value)
The curve compares two variables and graphs a curve from the scale of 0 to 1. Due to the simple nature of this algorithm, making predictions with it is as easy as plugging numbers into a calculator.
So, how does this relate to my project?:
The main features I am looking at are:
Artist Familiarity vs Song Hotness: essentially, is a song by a popular artist, say Ed Sheeran, going to be more popular than a song with every other feature (tempo, tune, etc) that corresponds to a hit song? Basically, how important is Artist Familiarity. As you would expect, my model proved that artist familiarity (a score from 0 to 1) definitely has something to do with whether a song is popular or not.
The X-axis depicts artist familiarity provided by Spotify’s open source API, while the Y axis has song hotness based on the hotness index through the Million Songs Dataset. (https://labrosa.ee.columbia.edu/millionsong/pages/getting-dataset)
In the upcoming weeks, I will be analyzing many more features and further train this model. I’m also loaded with some college visits, so I’ll keep you updated on my progress next week!
Till Next Time,