Lecture 15

Get Yer Data - Discussion

Correlation (does not imply causation)

Announcements:

Goals:

Data Ethics - Get Yer Data

Poll time!

  1. On a scale from 1 (completely unsurprised) to 5 (shocked), how surprised were you by the data?
  2. On a scale from 1 (unperturbed) to 5 (fully spooked), how creeped out were you by the amount or types of data?
  3. On a scale from 1 (completely bogus) to 5 (perfectly accurate), how accurate did you find the data to be?

Discussion in groups of three:

  1. Discuss any particularly notable findings that pertain to the questions you were asked to write about:

    1. How did your data compare to what you expected? Was there anything surprising, or creepy, or just plain strange? Describe the types of data that you see.
    2. How comprehensive was your download? Are you able to determine whether the company gave you everything they had, or were they more selective?
    3. What kinds of data science questions could someone answer about you based solely on this data? What kinds of data science questions could someone with access to millions of records like yours answer?
    4. Are you comfortable with the extent and/or accuracy of the data collected? Does the company have controls for opting out of collection of the sorts of data you’d rather they not have? If not (or if the company suddenly decided tomorrow to remove those controls), what should our society do about this?
  2. Were your reactions similar to or different from those of your group members? Was this due to your attitudes or to differences in the data?

Decide on the single most interesting thing (from any of the discussion above) to share with the class. The person who woke up latest this morning will be your group's spokesperson.

Discussion as a class:

  1. Is there a problem here?
  2. If so, how should society solve it?