In the era of information overload, pushing to users information matching their preference has been a valuable application and a major challenge in the field of data science. In this project, we proposed to explore the movielens dataset, a dataset consisting of users’ rating and tagging towards movies, and implement a demo movie recommendation system.
A report summarizing exploratory analysis results, recommendation algorithms used and predicted user-movie rating
A shiny app or dashboard giving movie recommendation
Find popular movies and high-rating movies
Investigate the relationship among ratings and various factors, including movie genre, release year, rating time
Evaluate the similarity among users and among movies
Test the average ratings of the users, to investigate whether the rating criteria are different.(ANOVA)
Analyze the changes in the ratings of the same movie every year, or the changes in the ratings of each category of movies,and try to get a rule through the analysis
Construct model to predict the rating of a specific user towards a specific movie. Preference on specific genre
- Find the average rating after every year a movie published. And find the trend of that.
Boxplot regarding: ratings and years; ratings and genres;
Clustering map of users and movies
Heatmaps denoting user / movie similarity
maybe a linear regression graph to show a specific user’s rating on genre
Convert collected-in-seconds timestamp into years
Split and group movie by different genre
Select proper loss function and decompose the user-movie rating matrix through gradient descent
Scrape movie basic information (length, director, actor, date…) from IMDb for calculating movie-movie similarity
Calculate movie-movie similarity based on movie description and user-generated tags
Build models to calculate user-similarity
Date | Task | Due |
---|---|---|
Nov 4 | Brainstorm | NA |
Nov 9 | Finish draft proposal | NA |
Nov 13 | Submit formal proposal | Nov 13 by 1:00 pm |
Nov 15 | Assign tasks | NA |
Nov 16 - 19 | Project review meeting | Nov 16-19 |
Nov 30 | Finish coding part | NA |
Dec 7 | Construct website | NA |
Dec 11 | Report & Webpage and screencast & Peer asessment | Dec 11 by 4:00 pm |
Dec 14 | Presentation practice | NA |
Dec 16 | In class discussion | Dec 16 |