This assignment is inspired by a pair of articles (Facebook article /
archival, Google article / archival) in the New York
Times in Spring 2018. One of their tech columnists downloaded all of the
data that those two companies had about him. For this assignment, your
job is to download some of your own data!
Instructions:
- Take a moment and think. What sorts of data do you imagine these
companies have collected about you?
- Use either the Facebook tool or the Google tool to request your data.
(You only need to do one of them.) If you can think of another internet
service that allows you to download your data in bulk, feel free to use
that instead. Be warned: the export might take a while to prepare.
- Poke around in your data a bit. What sort of information has the
company collected about you? Does it look more or less accurate?
(If you went with the Google download, many of the files might be in
JSON format. Firefox can format JSON for you if you load the file in the
browser, and the Sublime and Atom text editors can be configured to
display JSON [sublime, atom]. There are nice web viewers too, but it is
probably unwise to upload your personal data to them!)
- Write an approximately 1-page (single-spaced) reflection about your
findings. Use the questions below to guide your reflections, but don’t
feel the need to answer all of them. The 1-page guideline is very loose,
so please direct your efforts towards conveying substance rather than
filling a page: I’d much rather read 2 good paragraphs than try to fish
the substance out of 4 fluffy ones.
- We will have an in-class discussion on this topic (tentatively) on
Monday, October 18th. Please come to class prepared to discuss your
findings and thoughts with your classmates.
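If you would rather inspect the JSON files from your download without a browser or a web viewer, a few lines of Python can do it entirely on your own machine. The following is only a minimal sketch, not part of the assignment: the function names `pretty_json` and `describe_top_level` are made up for illustration, and the layout of your actual export may differ from the tiny sample used here.

```python
import json


def _describe(value) -> str:
    """Return a short description of a JSON value: its type, plus size for containers."""
    if isinstance(value, (list, dict)):
        return f"{type(value).__name__} with {len(value)} items"
    return type(value).__name__


def pretty_json(text: str, indent: int = 2) -> str:
    """Re-serialize a JSON string with indentation, keeping non-ASCII text readable."""
    return json.dumps(json.loads(text), indent=indent, ensure_ascii=False)


def describe_top_level(data) -> dict:
    """Map each top-level key to a short description of its value.

    This gives a quick overview of a large file without printing its contents.
    """
    if not isinstance(data, dict):
        return {"<root>": _describe(data)}
    return {key: _describe(value) for key, value in data.items()}


# Tiny made-up sample standing in for one file from an export.
sample = '{"searches": ["weather", "bus times"], "account": "example"}'
print(pretty_json(sample))
print(describe_top_level(json.loads(sample)))
```

To try this on a real file, read its text with `pathlib.Path(...).read_text(encoding="utf-8")` and pass the result to the same functions. Nothing leaves your computer.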
Questions:
Note: Nothing in the following questions obligates you to
share personal information about yourself. You are under no pressure to
reveal private information, and in fact it’s much better if you
don’t!
- How did your data compare to what you expected? Was there anything
surprising, or creepy, or just plain strange? Describe the types of data
that you see.
- How comprehensive was your download? Are you able to determine
whether the company gave you everything it had, or whether it was more
selective?
- What kinds of data science questions could someone answer about you
based solely on this data? What kinds of data science questions could
someone with access to millions of records like yours answer?
- Are you comfortable with the extent and/or accuracy of the data
collected? Does the company have controls for opting out of collection
of the sorts of data you’d rather they not have? If not (or if the
company suddenly decided tomorrow to remove those controls), what should
our society do about this?
Submission
Submit your reflections on Canvas in a PDF file called ethics1.pdf.
Rubric
This assignment is out of 10 points, and will be graded relatively
coarsely on effort, thoughtfulness, and clarity. If you got your data,
spent some time looking at it, and produced thoughtful and
clearly-written answers to some of the above questions, you will receive
full credit.