DATA 311 - Data Ethics Assignment 3: Scraping, Crawling, and AI

Scott Wehrwein

Fall 2025

For this assignment, we’ll read some sources on web crawling, scraping, and data collection for AI.

Readings

You will read one of the following articles:

  1. AI crawlers and fetchers are blowing up websites, with Meta and OpenAI the worst offenders

  2. Protect your website from excessive AI crawler traffic with modern defensive strategies

  3. Bartz vs. Anthropic: Order on Fair Use (first 9 pages, covering the Introduction and Statement sections; you can stop at the Analysis section unless you want to geek out over the legalities)

  4. Perplexity AI accused of scraping content against websites’ will with unlisted IP ranges

The article assignments are listed below based on the first few characters of your WWU username:

Username prefix   Article   Username prefix   Article
cor               3         lop               3
dap               4         man               4
deg               2         meu               3
dic               2         mir               3
dic               2         mon               4
fra               4         pat               1
guz               1         sch               2
her               3         smi               3
jac               1         tam               2
kan               1         tri               4
lee               2         viv               1
leo               4         wilc              2
lev               1         will              4
lis               1

Pre-Discussion Prompt

Read your assigned article from the list above, then compose and submit to Canvas a short (1/2 to 1 page) reflection addressing the following questions:

  1. In a few sentences, summarize the article you read.
  2. What are the key points you would explain to someone who hadn’t read the article?
  3. Describe one thing you learned that you found particularly interesting, and explain why.

In class, you will join forces with the classmates who read the same article to present a cohesive summary of the article and its implications to the whole class. This will be followed by a whole-class discussion of the broader issues around crawling, scraping, copyright, and the ethics of data collection.

Rubric

This assignment is out of 10 points and, as usual, will be graded on effort, thoughtfulness, and clarity. Thoughtful, clearly written responses will receive full credit.