The challenge consists of three tracks: one track is common to both datasets, and there is a separate track for each dataset. Participating teams are welcome to take part in any track, and the number of tracks a team participates in is not limited.

Both datasets have been anonymized to protect the users of each service. Participants are expected to use the provided anonymized datasets; the use of external information sources, such as IMDb, Wikipedia, or Netflix, is not allowed.

The challenge tracks will be:

Weekly Recommendation Track

This track focuses on the temporal dimension of context. Participants are asked to recommend movies for two different weeks. Either of the two datasets can be used for this track, and the algorithms used for each dataset may differ.

Participants are expected to generate recommendations for:

  • Calendar week 52 of 2009, i.e., the Christmas week
  • Calendar week 9 of 2010, i.e., the week leading up to the Oscars

Due to the effects of Christmas week and Oscars week, the recommended items for the two weeks may differ. The selected test users for each week have been stripped of all ratings made after the specified dates. Most of the stripped ratings for the specified dates can be found in the test set bundled with the dataset.
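As an illustration, the temporal split for each week can be reproduced roughly as follows. This is a minimal sketch, assuming ratings are available as (user, movie, rating, timestamp) tuples with datetime timestamps; the cut-off dates follow the ISO calendar (week 52 of 2009 starts on Monday 2009-12-21, week 9 of 2010 on Monday 2010-03-01) and should be verified against the dataset documentation.

    from datetime import datetime

    # Assumed cut-offs: the Monday starting each evaluation week.
    CUTOFFS = {
        "christmas": datetime(2009, 12, 21),  # ISO week 52 of 2009
        "oscars": datetime(2010, 3, 1),       # ISO week 9 of 2010
    }

    def split_by_date(ratings, cutoff):
        """Split (user, movie, rating, timestamp) tuples at a cut-off date.

        Ratings made before the cut-off may be used for training; ratings
        made on or after it correspond to the stripped test portion.
        """
        train = [r for r in ratings if r[3] < cutoff]
        test = [r for r in ratings if r[3] >= cutoff]
        return train, test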

Each evaluation should report the following metrics: MAP, P@5, P@10, and AUC. The performance of the teams will be published on the leaderboard after the submission deadline. An additional evaluation dataset is withheld by the organizers and will be released for the final evaluation to identify the winners of the track. The best-performing team will be announced at the RecSys dinner.
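For reference, the per-user metrics can be computed along the following lines; P@5 and P@10 correspond to precision_at_k with k = 5 and k = 10. This is an unofficial sketch, and the organizers' exact definitions (e.g., any rank cut-off applied to MAP) take precedence over this illustration.

    def precision_at_k(ranked, relevant, k):
        """Fraction of the top-k recommended items that are relevant."""
        return sum(1 for item in ranked[:k] if item in relevant) / float(k)

    def average_precision(ranked, relevant):
        """Mean of the precision values at the rank of each relevant item."""
        hits, total = 0, 0.0
        for rank, item in enumerate(ranked, start=1):
            if item in relevant:
                hits += 1
                total += hits / float(rank)
        return total / len(relevant) if relevant else 0.0

    def auc(ranked, relevant):
        """Fraction of (relevant, irrelevant) pairs ranked in the correct order."""
        pos = [r for r, item in enumerate(ranked) if item in relevant]
        neg = [r for r, item in enumerate(ranked) if item not in relevant]
        if not pos or not neg:
            return 0.0
        correct = sum(1 for p in pos for n in neg if p < n)
        return correct / float(len(pos) * len(neg))

    # MAP is the mean of average_precision over all test users.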

The five best-performing teams will be asked to recommend a set of movies (upcoming cinema releases and recently released DVDs) to a group of real users during the first week of September 2010. These users will perform a live evaluation of the recommendations, i.e., they will be given cinema tickets and DVDs and asked to rate the recommended movies.

Moviepilot Track

This track focuses on the dataset from Moviepilot, which contains features such as movie mood (in a hierarchical representation), movie location, and intended audience.

Participants in this track are asked to recommend a list of movies for a selection of users based on a given mood. The list of users for whom recommendations should be generated is included in the dataset. The mood is represented by id 16, which appears in the following files:

  • moviepilot_keywords_emotion.txt – field: emotion_keywords_id
  • moviepilot_taggings_emotion.txt – field: id

The users selected for testing have been stripped of a portion of their ratings for the requested movie types. Most of the stripped ratings for the specified mood can be found in the test set bundled with the dataset. The test set also contains ratings tagged with ids other than 16; these have been included to mitigate overfitting. To use them in your own evaluation, look up the emotion ids of those movies in the moviepilot_taggings_emotion.txt file, as in the sketch below. The evaluation submitted in the paper should only consider ratings on movies tagged with mood id 16.
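The lookup and filtering can be done along these lines. This is a minimal sketch; the tab-separated column layout assumed for moviepilot_taggings_emotion.txt is a guess and should be checked against the dataset documentation.

    MOOD_ID = "16"  # the requested mood

    def load_mood_movies(taggings_path, mood_id=MOOD_ID):
        """Collect the ids of movies tagged with the given mood.

        Assumes a tab-separated file whose first two columns are the
        emotion id and the movie id.
        """
        movies = set()
        with open(taggings_path) as f:
            for line in f:
                fields = line.rstrip("\n").split("\t")
                if len(fields) >= 2 and fields[0] == mood_id:
                    movies.add(fields[1])
        return movies

    def filter_test_ratings(test_ratings, mood_movies):
        """Keep only test ratings on movies tagged with the requested mood."""
        return [r for r in test_ratings if r[1] in mood_movies]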

Each evaluation should report the following metrics: MAP, P@5, P@10, and AUC. The performance of the teams will be published on the leaderboard after the submission deadline. An additional evaluation dataset is withheld by the organizers and will be released for the final evaluation to identify the winners of the track.

Filmtipset Track

This track focuses on the social connections between users in the Filmtipset dataset. Many Filmtipset users take part in the service's social network. Users can befriend each other asymmetrically, similar to the follower/following relation on Twitter. Furthermore, the dataset contains additional features, such as movie comments, comments on actors/directors/writers, movie reviews, review ratings, the age and location of certain users, lists of movies, and links between similar movies.
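Since the friendships are directed, they are naturally modeled as a directed graph. The following is a minimal sketch, assuming the friendships are distributed as a whitespace-separated edge list of (follower, followee) pairs; the actual file name and layout should be taken from the dataset documentation.

    from collections import defaultdict

    def load_friend_graph(path):
        """Build a directed 'follows' graph from a friendship edge list.

        Filmtipset friendships are asymmetric, so an edge (a, b) does not
        imply the reverse edge (b, a).
        """
        follows = defaultdict(set)
        with open(path) as f:
            for line in f:
                fields = line.split()
                if len(fields) >= 2:
                    follows[fields[0]].add(fields[1])
        return follows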

Participants are asked to recommend a set of movies for the selected users. The list of users for whom recommendations should be generated is included in the test set. The recommendations should be based on the social network, i.e., a user should be interested in seeing movies that were recently seen by the user's friends (where the definition of 'friend' is left to the participants).
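One simple baseline in this spirit is to rank movies by how many of a user's friends rated them within a recent time window, as in the sketch below; all data structures here are assumptions for illustration, not part of the dataset format.

    from collections import Counter

    def recommend_from_friends(user, follows, recent_ratings, seen, n=10):
        """Rank movies by how many of the user's friends rated them recently.

        follows: user -> set of followed users (one possible notion of
            'friend'); recent_ratings: user -> movies rated within the
            chosen time window; seen: movies the target user already rated.
        """
        counts = Counter()
        for friend in follows.get(user, ()):
            for movie in recent_ratings.get(friend, ()):
                if movie not in seen:
                    counts[movie] += 1
        return [movie for movie, _ in counts.most_common(n)]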

The users selected for testing have been stripped of a portion of their ratings. Most of the stripped ratings for the selected users can be found in the test set bundled with the dataset.

Each evaluation should report the following metrics: MAP, P@5, P@10, and AUC. The performance of the teams will be published on the leaderboard after the submission deadline. An additional evaluation dataset is withheld by the organizers and will be released for the final evaluation to identify the winners of the track.