Popular Music Dataset

Spotify provides a public API with detailed information on millions of songs, including both musical characteristics—such as tempo, energy, danceability, and mood—and metadata like artist, genre, release year, and popularity. This makes it a valuable resource for exploring patterns in music across different styles and time periods.

In this class, we use a simplified subset of Spotify data that includes the 1,000 most popular songs from each of 114 genres. This curated sample allows students to explore how musical features vary across genres and gain experience working with real-world music data.
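
A quick way to check this structure is to count the rows per genre. The sketch below is a minimal example, assuming the data is available as a CSV file with a track_genre column. The inline sample rows, the file name mentioned in the comment, and the function name count_tracks_per_genre are made up for illustration and are not part of the dataset.

```python
import csv
import io
from collections import Counter

# Made-up sample rows in the same column layout as the dataset.
# For the real data, pass an open file instead, e.g.
# open("dataset.csv", newline="", encoding="utf-8")  (file name is an assumption).
SAMPLE_CSV = """track_id,track_name,popularity,track_genre
id1,Song A,73,acoustic
id2,Song B,55,acoustic
id3,Song C,82,jazz
"""


def count_tracks_per_genre(csv_file):
    """Return a Counter mapping each track_genre value to its number of rows."""
    reader = csv.DictReader(csv_file)
    return Counter(row["track_genre"] for row in reader)


counts = count_tracks_per_genre(io.StringIO(SAMPLE_CSV))
for genre, n in sorted(counts.items()):
    print(genre, n)
```

Run on the full file, a check like this would be expected to list 114 genres with 1,000 tracks each if the subset matches the description above.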

Measurements Within Data

| Feature | Description |
| --- | --- |
| track_id | The Spotify ID for the track |
| artists | The names of the artists who performed the track. Multiple artists are separated by a semicolon (;) |
| album_name | The name of the album on which the track appears |
| track_name | The name of the track |
| popularity | A value between 0 and 100, with 100 being the most popular. It’s based on the total number of plays and how recent those plays are. Duplicate versions of a track are rated independently |
| duration_ms | The track length in milliseconds |
| explicit | Whether or not the track has explicit lyrics (true = yes; false = no or unknown) |
| danceability | How suitable a track is for dancing, based on tempo, rhythm stability, beat strength, and overall regularity. 0.0 = least danceable, 1.0 = most danceable |
| energy | A measure from 0.0 to 1.0 representing intensity and activity. Energetic tracks feel fast, loud, and noisy |
| key | The key the track is in (0 = C, 1 = C♯/D♭, …, 11 = B). If no key is detected, the value is -1 |
| loudness | The overall loudness of a track in decibels (dB) |
| mode | Modality of the track: major (1) or minor (0) |
| speechiness | Presence of spoken words in a track. Values near 1.0 represent speech-like tracks, while values below 0.33 likely represent music |
| acousticness | A confidence measure from 0.0 to 1.0 of whether the track is acoustic |
| instrumentalness | Predicts whether a track contains no vocals. The closer to 1.0, the more likely the track is instrumental |
| liveness | Detects the presence of an audience. Values above 0.8 strongly suggest the track was recorded live |
| valence | A measure from 0.0 to 1.0 describing the musical positiveness of a track. High = happy/cheerful; low = sad/angry |
| tempo | The estimated tempo of a track in beats per minute (BPM) |
| time_signature | The estimated number of beats per bar. Ranges from 3 to 7, representing time signatures from 3/4 to 7/4 |
| track_genre | The genre to which the track belongs |
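
Several of the features above are stored as codes rather than readable values. The sketch below shows one way they might be decoded, assuming the values have been read from a CSV file and converted to integers where needed. The helper names (split_artists, key_name, duration_mm_ss) and the example values are hypothetical. The full list of pitch names follows standard pitch-class numbering, which matches the 0 = C, 1 = C♯/D♭, 11 = B endpoints given in the key description.

```python
# Standard pitch-class names for key values 0 through 11.
PITCH_NAMES = ["C", "C♯/D♭", "D", "D♯/E♭", "E", "F",
               "F♯/G♭", "G", "G♯/A♭", "A", "A♯/B♭", "B"]


def split_artists(artists):
    """Split the semicolon-separated artists field into a list of names."""
    return [name.strip() for name in artists.split(";") if name.strip()]


def key_name(key, mode):
    """Combine key (0-11, or -1 if undetected) and mode (1 = major, 0 = minor)."""
    if key == -1:
        return "unknown"
    quality = "major" if mode == 1 else "minor"
    return f"{PITCH_NAMES[key]} {quality}"


def duration_mm_ss(duration_ms):
    """Convert a duration in milliseconds to a minutes:seconds string."""
    minutes, seconds = divmod(duration_ms // 1000, 60)
    return f"{minutes}:{seconds:02d}"


# Example values are invented for illustration only.
print(split_artists("Artist One;Artist Two"))
print(key_name(1, 0))
print(duration_mm_ss(230666))
```

Keeping these conversions in small functions makes it easy to apply them to a whole column later, whichever data-loading tool the class uses.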

Source: https://www.kaggle.com/datasets/maharshipandya/-spotify-tracks-dataset