Homework 5

Data sets

  • OMDB Movies | Local
    • You can download a version of the OMDB Movies dataset directly from this GitHub repo

Problems

  1. Import the dataset CSV as a dataframe (see above for the link to the dataset)
  2. Print the first 5 rows
  3. Print the number of rows and columns in the dataset
  4. Print the column names
  5. Print the column data types
  6. How many unique genres are available in the dataset?
  7. How many movies are available per genre?
  8. What are the top 5 R-rated movies? (hint: Boolean filters needed! Then sorting!)
  9. What is the average Rotten Tomatoes score for all available films?
  10. Same question as above, but for the top 5 films
  11. What is the five-number summary for the top-rated films according to IMDB?
  12. Find the ratio between the Rotten Tomatoes rating and the IMDB rating for all films. Update the dataframe in place to include a Ratings Ratio column.
  13. Find the top 3 ratings ratio movies (rated higher on IMDB compared to Rotten Tomatoes)
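
The kinds of operations these problems call for can be sketched on a small made-up table. This is only an illustration, assuming pandas is the dataframe library in use. The column names (Title, Genre, Rated, imdbRating, tomatoMeter) and every value in the toy data are hypothetical, so check the real CSV's columns before adapting any of it. The ratio direction (Rotten Tomatoes divided by IMDB) is also an assumption.

```python
import io

import pandas as pd

# Hypothetical stand-in for the homework CSV; real column names and values will differ.
csv_text = """Title,Genre,Rated,imdbRating,tomatoMeter
Alpha,Drama,R,8.0,90
Bravo,Comedy,PG,6.5,55
Charlie,Drama,R,7.2,80
Delta,Action,R,9.0,60
Echo,Comedy,PG-13,5.0,75
"""

# With a real file this would be pd.read_csv("<path to the downloaded CSV>").
df = pd.read_csv(io.StringIO(csv_text))

# Basic inspection: first rows, shape, column names, data types.
print(df.head())
print(df.shape)
print(df.columns.tolist())
print(df.dtypes)

# Unique values in a column, and the number of rows per value.
n_genres = df["Genre"].nunique()
per_genre = df["Genre"].value_counts()

# Boolean filter first, then sort, then take the top rows.
top_r = df[df["Rated"] == "R"].sort_values("imdbRating", ascending=False).head(5)

# Mean of a numeric column.
mean_tomato = df["tomatoMeter"].mean()

# Five-number summary: minimum, quartiles, and maximum.
summary = df["imdbRating"].describe()[["min", "25%", "50%", "75%", "max"]]

# Assigning to a new column name modifies the existing dataframe.
# Assumed direction: Rotten Tomatoes score divided by IMDB score.
df["Ratings Ratio"] = df["tomatoMeter"] / df["imdbRating"]
top_ratio = df.nlargest(3, "Ratings Ratio")
print(top_ratio[["Title", "Ratings Ratio"]])
```

Note that in this toy data the two ratings are on different scales (0 to 10 and 0 to 100), so the ratio is only meaningful for ranking films against each other. Whether the real dataset needs rescaling first depends on how its columns are stored.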

How to Submit

Please zip up the files and DM your IA and instructor.