Recitation Assignment

Tasks

Jupyter

Create a Jupyter notebook:

  • Provide cells that show your data cleaning, analysis, and pre-processing.
  • Provide your own implementation of the pseudocode for the daily algorithm(s).

Kaggle

Run the daily algorithm(s) using scikit-learn and submit the results to Kaggle.
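
As a rough illustration of this step, the sketch below fits a scikit-learn model and formats its predictions as a submission. Everything specific in it is an assumption: the synthetic data from `make_classification`, the choice of `LogisticRegression` as a stand-in for the daily algorithm, the helper name `write_submission`, and the `Id`/`Prediction` column names. Check the actual competition page for the required file layout.

```python
import csv
import io

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the competition data (assumption: binary classification).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# LogisticRegression is only a placeholder for whichever daily algorithm applies.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
predictions = model.predict(X_test)


def write_submission(ids, preds, handle):
    """Write (id, prediction) rows to an open text handle in a hypothetical layout."""
    writer = csv.writer(handle)
    writer.writerow(["Id", "Prediction"])  # assumed column names
    for row_id, pred in zip(ids, preds):
        writer.writerow([row_id, int(pred)])


# An in-memory buffer keeps the sketch self-contained; a real run would write a file.
buffer = io.StringIO()
write_submission(range(len(predictions)), predictions, buffer)
submission_text = buffer.getvalue()
```

In a notebook, passing `open("submission.csv", "w", newline="")` as the handle would produce a file to upload instead of a string.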

Timing

Time every implementation with the %%timeit cell magic. If the data changes, re-run all of the daily algorithms on the new data so that the timings remain an apples-to-apples comparison.
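
In a notebook, putting `%%timeit` on the first line of a cell times the whole cell. The sketch below shows the same idea outside a notebook using the standard-library `timeit` module, so it can run as plain Python. The two functions, `mean_loop` and `mean_builtin`, are hypothetical stand-ins for a from-scratch implementation and a library implementation; both are timed on the same data.

```python
import timeit


def mean_loop(values):
    """Return the arithmetic mean using an explicit loop (stand-in for your own code)."""
    total = 0.0
    for v in values:
        total += v
    return total / len(values)


def mean_builtin(values):
    """Return the arithmetic mean using built-ins (stand-in for the library version)."""
    return sum(values) / len(values)


# Both implementations are timed on the same data for a fair comparison.
data = list(range(10_000))

timings = {}
for name, func in [("loop", mean_loop), ("builtin", mean_builtin)]:
    # repeat() returns one total time per repetition; the minimum is the least noisy.
    runs = timeit.repeat(lambda: func(data), number=100, repeat=3)
    timings[name] = min(runs) / 100  # seconds per call

for name, seconds in timings.items():
    print(f"{name}: {seconds * 1e6:.1f} microseconds per call")
```

Taking the minimum over several repetitions reduces the influence of background activity on the machine; the absolute numbers will differ from one machine to another.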

Conclusions and Recommendations

  • Record the timeit results for both implementations (your own and the scikit-learn version) in a cell. Were there any surprises? Why or why not?
  • Record your placement on the Kaggle leaderboard in a cell. Were you surprised by the results? Why or why not?
  • Record the results of the algorithms when re-run on the same data.

Submitting Your Work

  • Report your answers in a Jupyter notebook with either print statements or markdown.
  • For each function you implement, include a docstring describing what it does. Note: You are not required to write tests for any functions.
  • If there is any change in the data (including segmentation), provide the results of re-running the algorithms on the changed data.
  • Create cells that contain your conclusions and recommendations. You may choose to create charts, graphs, or other visualizations of the changes.
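
The docstring requirement above might look like the following minimal sketch. The function `euclidean_distance` is a hypothetical example and not one of the assigned algorithms; the point is the docstring layout, which states what the function does, its arguments, its return value, and the error it can raise.

```python
import math


def euclidean_distance(a, b):
    """Return the Euclidean distance between two equal-length numeric sequences.

    Args:
        a: Sequence of numbers (for example, one data point's features).
        b: Sequence of numbers with the same length as ``a``.

    Returns:
        The distance as a float.

    Raises:
        ValueError: If the two sequences differ in length.
    """
    if len(a) != len(b):
        raise ValueError("a and b must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Example call on a 3-4-5 right triangle.
distance = euclidean_distance([0, 0], [3, 4])
```

Running `help(euclidean_distance)` in a notebook cell displays the docstring, which is a quick way to check that it reads well.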