The natural language of playlists

This page contains the data and examples described in the paper cited below. The playlist bigram collection is derived from the Art of the Mix playlist database, available here. The data provided here is the subset of the Art of the Mix collection that overlaps with the Million Song Dataset.

If you have any questions or comments, feel free to email the author at brian.mcfee@nyu.edu

Note

This data is provided only to facilitate comparison to prior work. For new research, the AotM-2011 dataset is recommended instead.

Bigram data

aotm-bigrams.csv.gz
66250 song bigrams (869KB) [md5]

Each line in aotm-bigrams.csv is an ordered pair (current song, next song), separated by a comma, with each song identified by its Echo Nest track ID. For example:

TRFHYOH128F9358CEF,TRWXAOC128F42553AE

These track identifiers can be resolved to song titles using the Echo Nest developer API or the metadata provided with the Million Song Dataset.
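
The format described above can be read with a few lines of Python. The following is a minimal sketch, not part of the original release: it assumes the file has no header row, sits in the working directory under its published name, and is UTF-8 text once decompressed. The function names read_bigrams, load_bigrams, and successor_counts are hypothetical helpers introduced only for illustration.

```python
import csv
import gzip
import io
from collections import Counter


def read_bigrams(fileobj):
    """Yield (current_track_id, next_track_id) pairs from CSV text."""
    for row in csv.reader(fileobj):
        if len(row) != 2:
            continue  # skip blank or malformed lines
        yield row[0].strip(), row[1].strip()


def load_bigrams(path="aotm-bigrams.csv.gz"):
    """Read every bigram from the gzipped CSV (path is an assumption)."""
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as f:
        return list(read_bigrams(f))


def successor_counts(bigrams):
    """Map each current track ID to a Counter of the tracks that follow it."""
    counts = {}
    for current, nxt in bigrams:
        counts.setdefault(current, Counter())[nxt] += 1
    return counts


# Parse the example line shown above without touching the real file.
sample = io.StringIO("TRFHYOH128F9358CEF,TRWXAOC128F42553AE\n")
pairs = list(read_bigrams(sample))
```

With the downloaded file in place, successor_counts(load_bigrams()) would give, for each track, how often every other track follows it in the collection.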

References

If you use this data, please cite the following paper:
The natural language of playlists. 12th International Society for Music Information Retrieval Conference (ISMIR), 2011. [bib] [pdf]

Playlist examples