This data set was provided by eHarmony, Inc. It consists of pairs of individuals who either matched (positive example) or did not match (negative example). The data is partitioned into two subsets corresponding to two equal-length segments of time.
The data is stored in CSV files, organized as follows.
- EH-*-data.csv.gz
  - Each row describes an individual. The first column is an
    identification number for that individual, and all subsequent
    columns contain the (numeric) feature values.
- EH-*-labels.csv.gz
  - Each row describes a pairwise interaction. The first column
    indicates whether the interaction is positive (1) or negative (0).
    The second and third columns contain identification numbers for
    the corresponding individuals.
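
The layout described above can be read with a short script. The following is a minimal sketch in Python using only the standard library, not an official loader. It assumes the files have no header row, that every feature value parses as a float, and that identification numbers can be kept as strings; the function names `load_features` and `load_pairs` are hypothetical.

```python
import csv
import gzip


def load_features(path):
    """Map each individual's ID (first column) to its list of feature values."""
    features = {}
    with gzip.open(path, "rt", newline="") as handle:
        for row in csv.reader(handle):
            if not row:
                continue  # skip blank lines
            # Assumption: no header row, and every feature parses as a float.
            features[row[0]] = [float(value) for value in row[1:]]
    return features


def load_pairs(path):
    """Return (label, id_a, id_b) tuples; label is 1 (positive) or 0 (negative)."""
    pairs = []
    with gzip.open(path, "rt", newline="") as handle:
        for row in csv.reader(handle):
            if not row:
                continue  # skip blank lines
            # Assumption: the label is written as 0 or 1 (possibly as 0.0 or 1.0).
            pairs.append((int(float(row[0])), row[1], row[2]))
    return pairs
```

To use it, pass the path of one EH-*-data.csv.gz file to `load_features` and the path of the matching EH-*-labels.csv.gz file to `load_pairs`. The two IDs in each pair can then be looked up in the feature dictionary.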
Please refer to the paper below for more details about this data set.