Lecture 9 - Exploratory Analysis "Cold Open"



The data:

The data came from the NYC Taxi and Limousine Commission (TLC) trip records: http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml

It was preprocessed by a friend using this notebook. I think he told me he pulled out a subset of columns and subsampled the rows, but I don't know any more than that.


Fun question I came up with when posting this notebook:

Could we deduce, from the data alone, the formula used to calculate the fare? In other words, can we predict the fare column from the elapsed time and distance columns?
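
One way to start on this question is a plain least-squares fit. This is only a rough sketch. It assumes the fare is roughly linear in time and distance (base charge + rate per minute + rate per mile), which may well not hold for the real data. I don't have the real column names or units in front of me, so the sketch runs on synthetic trips generated with made-up rates (`BASE`, `PER_MINUTE`, `PER_MILE`) that are not the actual TLC fare rules. The helper `fit_linear` is hypothetical and written in pure Python so it runs anywhere.

```python
import random

def fit_linear(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y."""
    # Prepend a constant 1.0 so the first coefficient is the intercept.
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Build the augmented matrix [X^T X | X^T y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * y for x, y in zip(X, targets))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    # The matrix is now diagonal, so each coefficient is a simple ratio.
    return [A[i][k] / A[i][i] for i in range(k)]

# Synthetic stand-in for the real trips: (elapsed minutes, distance in miles).
# Units and rates here are assumptions for illustration only.
random.seed(0)
trips = [(random.uniform(2, 40), random.uniform(0.5, 15)) for _ in range(200)]
BASE, PER_MINUTE, PER_MILE = 2.50, 0.50, 2.00  # made-up, NOT the real fare rule
fares = [BASE + PER_MINUTE * t + PER_MILE * d for t, d in trips]

base, per_minute, per_mile = fit_linear(trips, fares)
print(f"fare ~ {base:.2f} + {per_minute:.2f} * minutes + {per_mile:.2f} * miles")
```

Because the synthetic fares have no noise, the fit should recover the made-up rates up to floating-point error. On the real data I'd expect a messier picture, so looking at the residuals would be the natural next step to see where a simple linear formula breaks down.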