This page contains the data and examples described in the paper cited below. The playlist collection is derived from the Art of the Mix playlist database.

AOTM-2011 spans the period from 1998-01-22 to 2011-06-17. In all, it contains 101,343 unique playlists, each of which has had its songs matched to the Million Song Dataset (MSD). Approximately 98,000 distinct songs were matched to MSD; please see the publication below for details.

If you have any questions or comments, feel free to email the author at

Playlist data

101,343 playlists (50M) [md5]

We provide the data in JSON format. The following Python code can be used to import the data.

import cjson
import gzip

# Open the compressed file and decode its entire contents as JSON.
with gzip.open('aotm2011_playlists.json.gz', 'r') as file_desc:
    playlists = cjson.decode(file_desc.read())
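
The cjson package may not be installable on recent Python versions. As a minimal sketch of an alternative, assuming Python 3 and only the standard library, the same file can be read with the json module. The function name load_playlists is a hypothetical helper introduced here and is not part of the original release.

```python
import gzip
import json


def load_playlists(path):
    """Read a gzip-compressed JSON file and return the decoded object.

    Hypothetical helper: for the AOTM-2011 file the decoded object is
    expected to be a list of playlist dictionaries. Unmatched songs,
    stored as JSON null, come back as None.
    """
    # 'rt' opens the compressed stream in text mode so json.load can read it.
    with gzip.open(path, 'rt', encoding='utf-8') as file_desc:
        return json.load(file_desc)
```

Calling playlists = load_playlists('aotm2011_playlists.json.gz') should then give the same list as the cjson version above.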

The variable playlists will then contain a list of playlists, complete with song identifiers, categorical annotation, and various other metadata. For example:

P[0] = {'category': 'Mixed Genre',
        'filtered_lists': [['SOFDPDC12A58A7D198'],
                           ['SOPIEQP12A8C13F268', 'SOKMCJK12A6D4F6105'],
                           ['SOGTGJR12A6310E08D',
                            'SOLTBYJ12A6310F2BB',
                            'SOBOXXN12A6D4FA1A2',
                            'SOUQUFO12B0B80778E']],
        'mix_id': 89567,
        'playlist': [[['peter murphy', "marlene dietrich's favourite poem"], None],
                     [['the walker brothers', "the sun ain't gonna shine anymore"],
                      'SOFDPDC12A58A7D198'],
                     [['marc almond', 'jacky'], None],
                     [['tindersticks', 'dying slowly'], None],
                     [['tori amos', 'me and a gun'], 'SOPIEQP12A8C13F268'],
                     [['suzanne vega', 'luka'], 'SOKMCJK12A6D4F6105'],
                     [['madonna', 'spanish eyes'], None],
                     [['the angels of light', 'praise your name'], None],
                     [['eurythmics', 'sex crime'], None],
                     [['tom waits', 'drunk on the moon'], None],
                     [['kate bush', 'wuthering heights'], 'SOGTGJR12A6310E08D'],
                     [['david bowie', "new york's in love"], 'SOLTBYJ12A6310F2BB'],
                     [['echo & the bunnymen', 'crocodiles'], 'SOBOXXN12A6D4FA1A2'],
                     [['peter murphy', "i'll fall with your knife"], 'SOUQUFO12B0B80778E']],
        'timestamp': '2005-03-27T10:53:00',
        'user': {'member_since': '2004-03-21T00:00:00',
                 'mixes_posted': '23',
                 'name': 'pulmotor'}}
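
Note that the dates in this record are stored as strings, and so is the mixes_posted count. The following is a minimal sketch of converting them to native Python types and counting matched songs. The function name summarize and the keys of its result are hypothetical, and the timestamp format is assumed to match the example above for every playlist.

```python
from datetime import datetime

# Assumed format, taken from the example record (e.g. '2005-03-27T10:53:00').
TIME_FORMAT = '%Y-%m-%dT%H:%M:%S'


def summarize(p):
    """Return a small summary of one playlist record (hypothetical helper)."""
    uploaded = datetime.strptime(p['timestamp'], TIME_FORMAT)
    joined = datetime.strptime(p['user']['member_since'], TIME_FORMAT)
    # Each playlist entry is [[artist, title], msd_id_or_None].
    matched = sum(1 for _, song_id in p['playlist'] if song_id is not None)
    return {
        'mix_id': p['mix_id'],
        'uploaded': uploaded,
        'days_as_member': (uploaded - joined).days,
        'mixes_posted': int(p['user']['mixes_posted']),
        'matched': matched,
        'total': len(p['playlist']),
    }
```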

The fields of P[0] are as follows.

category: A string describing the category of the playlist. There are about 40 unique categories.
filtered_lists: An array of contiguous segments of the playlist, where each song in a segment could be matched to MSD. This field is provided for convenience, and could be reconstructed from the playlist field described below.
mix_id: A unique numeric identifier for the playlist.
playlist: An array containing the original playlist data. Each element of this array is an array of length 2. The first element contains the artist name and song title, and the second contains the MSD song identifier, or None if the song could not be matched to MSD.
timestamp: The time and date when the user uploaded the playlist.
user: A dictionary containing information about the playlist's author, such as the user name, the date of joining the site, and the number of playlists posted by that user (at time of upload).
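
To illustrate how the contiguous matched segments relate to the original playlist data, here is a minimal sketch that splits a playlist at every unmatched song. The function name contiguous_segments is hypothetical, and the distributed file may have been produced with additional filtering that this sketch does not reproduce.

```python
from itertools import groupby


def contiguous_segments(playlist):
    """Split a playlist into maximal runs of MSD-matched songs.

    Each playlist entry is [[artist, title], msd_id_or_None]. The result
    is a list of lists of MSD song identifiers, one list per run.
    """
    segments = []
    # groupby yields consecutive runs that share the same matched/unmatched flag.
    for is_matched, run in groupby(playlist, key=lambda entry: entry[1] is not None):
        if is_matched:
            segments.append([song_id for _, song_id in run])
    return segments
```

For a playlist whose entries carry the identifiers None, 'A', None, 'B', 'C' in that order, this returns [['A'], ['B', 'C']].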


If you use this data, please cite the following paper:
Hypergraph models of playlist dialects
13th International Society for Music Information Retrieval conference (ISMIR).