Brian McFee :: eHarmony matchings

eHarmony matchings

This data set was provided by eHarmony, Inc., and consists of two subsets (slices) corresponding to two equal-length periods of time. The data is stored in gzip-compressed CSV files, organized as follows.

EH-*-data.csv.gz: Each row describes an individual. The first column is an identification number for that individual, and subsequent columns are feature values.
EH-*-labels.csv.gz: Each row describes a pairwise interaction. The first column indicates whether the interaction is positive (1) or negative (0). The second and third columns contain identification numbers for the corresponding individuals.
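
As a rough illustration of the layout described above, the following sketch reads one data file and one labels file using only the Python standard library. The function names `load_individuals` and `load_interactions` are hypothetical, and the code assumes things the description does not state: that the files have no header row, that feature values parse as floating-point numbers, and that identification numbers are written identically in both files (they are kept as strings here).

```python
import csv
import gzip


def load_individuals(path):
    """Map each identification number to its list of feature values."""
    features = {}
    with gzip.open(path, "rt", newline="") as handle:
        for row in csv.reader(handle):
            if not row:
                continue  # skip blank lines
            # First column: identification number; remaining columns: features.
            # Assumption: every feature value is numeric.
            features[row[0]] = [float(value) for value in row[1:]]
    return features


def load_interactions(path):
    """Return one (label, id_a, id_b) tuple per pairwise interaction."""
    interactions = []
    with gzip.open(path, "rt", newline="") as handle:
        for row in csv.reader(handle):
            if not row:
                continue  # skip blank lines
            # First column: 1 (positive) or 0 (negative);
            # second and third columns: the two individuals' IDs.
            label = int(float(row[0]))
            interactions.append((label, row[1], row[2]))
    return interactions
```

Each interaction can then be joined to the two individuals' feature vectors by looking up `id_a` and `id_b` in the dictionary returned by `load_individuals`. If the files turn out to include a header row, the first row would need to be skipped before parsing.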

Note

To protect the privacy of users, all features have been obfuscated and normalized. I cannot provide names for the features.

Download

Slice 1: data (9.7 MB, md5), labels (2.2 MB, md5)
Slice 2: data (7.4 MB, md5), labels (1.8 MB, md5)
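
Since an MD5 checksum is listed for each file, a download can be checked before use. The sketch below is one way to do that with Python's `hashlib`; the helper name `md5_of_file` is hypothetical, and the expected digest must be taken from the corresponding md5 link above (no digest values are reproduced here).

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Comparing the returned string (case-insensitively) with the published digest indicates whether the file was downloaded intact.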

References

If you use this data, please cite the following paper:

McFee, B., & Lanckriet, G. R. G. (2010). Metric learning to rank. Twenty-seventh International Conference on Machine Learning (ICML).

Source code

The source code for Metric Learning to Rank (MLR) is now hosted on GitHub.