Homework 5
Data sets
- OMDB Movies | Local
- You can download a version of the AdventureWorks Cycles dataset directly from this GitHub repo
Problems
- Import the data CSV as a DataFrame (see above for the link to the dataset)
- Print first 5 rows
- Print out the number of rows and columns in the dataset
- Print out column names
- Print out the column data types
- How many unique genres are available in the dataset?
- How many movies are available per genre?
- What are the top 5 R-rated movies? (hint: Boolean filters needed! Then sorting!)
- What is the average Rotten Tomatoes score for all available films?
- Same question as above, but for the top 5 films
- What is the five-number summary for the top-rated films according to IMDb?
- Find the ratio between the Rotten Tomatoes rating and the IMDb rating for all films. Update the dataframe to include a Ratings Ratio column (inplace).
- Find the top 3 ratings ratio movies (rated higher on IMDb compared to Rotten Tomatoes)
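As a starting point, the problems above could be approached with pandas roughly as sketched below. This is only an illustration under stated assumptions, not a reference solution: the inline CSV is made-up stand-in data, and the column names (Title, Genre, Rated, imdbRating, tomatoRating) are hypothetical, so check them against the real file's columns. The sketch also assumes both ratings are on the same scale, that "top" means highest IMDb rating, and that the ratio is Rotten Tomatoes divided by IMDb, so films rated higher on IMDb have the smallest ratios.

```python
import io

import pandas as pd

# Made-up stand-in for the real CSV: titles, column names, and values
# are all hypothetical and only exist to make the sketch runnable.
CSV_TEXT = """Title,Genre,Rated,imdbRating,tomatoRating
Movie A,Drama,R,8.0,9.0
Movie B,Comedy,PG,6.0,3.0
Movie C,Drama,R,9.0,6.0
Movie D,Action,R,7.0,7.0
Movie E,Comedy,PG-13,5.0,8.0
Movie F,Drama,R,6.5,5.2
"""

# Import the CSV as a DataFrame (replace the StringIO with the real file path).
df = pd.read_csv(io.StringIO(CSV_TEXT))

print(df.head())            # first 5 rows
n_rows, n_cols = df.shape   # number of rows and columns
print(n_rows, n_cols)
print(df.columns.tolist())  # column names
print(df.dtypes)            # column data types

# Unique genres and number of movies per genre.
n_genres = df["Genre"].nunique()
movies_per_genre = df["Genre"].value_counts()

# Top 5 R-rated movies: Boolean filter first, then sort by IMDb rating.
top_r = (
    df[df["Rated"] == "R"]
    .sort_values("imdbRating", ascending=False)
    .head(5)
)

# Average Rotten Tomatoes score for all films, then for the top 5 by IMDb.
mean_tomato = df["tomatoRating"].mean()
top5 = df.nlargest(5, "imdbRating")
mean_tomato_top5 = top5["tomatoRating"].mean()

# Five-number summary (min, Q1, median, Q3, max) of IMDb ratings for the top 5.
five_num = top5["imdbRating"].quantile([0, 0.25, 0.5, 0.75, 1.0])

# Add the ratio column directly on the existing DataFrame.
df["Ratings Ratio"] = df["tomatoRating"] / df["imdbRating"]

# Assumption: "rated higher on IMDb" means the smallest RT / IMDb ratios.
top3_ratio = df.nsmallest(3, "Ratings Ratio")
print(top3_ratio[["Title", "Ratings Ratio"]])
```

If the real dataset stores Rotten Tomatoes scores on a 0 to 100 scale and IMDb ratings on a 0 to 10 scale, rescale one of them before computing the ratio so that a value of 1 means the two sites agree.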
How to Submit
Please zip up your files and DM them to your IA and instructor.