L17 - HTML and Web Scraping

Announcements:

Goals:

HTML - HyperText Markup Language

This is the language that web page content is written in. Some basic facts about HTML:

Basic Elements

Instead of reading the following list, let's look at the source for a couple of webpages and see what we find:

Some common HTML elements to know about:

Note: If Jupyter sees HTML in a Markdown cell, it renders it as HTML. That's why I'm able to show you both the code and how it renders in the examples below.
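To see how the pieces of an element (tag, attributes, text) fit together, here is a minimal sketch that walks a tiny snippet with Python's built-in `html.parser` module. The snippet and the `ElementPrinter` class are made up for illustration; they are not from any real page or from the lecture demos.

```python
from html.parser import HTMLParser

# Made-up snippet for illustration; not taken from any real page.
SNIPPET = '<h1>My Page</h1><p>Visit <a href="https://example.com">this link</a>.</p>'


class ElementPrinter(HTMLParser):
    """Record each start tag, end tag, and chunk of text as the parser sees it."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs, e.g. [("href", "...")]
        self.events.append(("start", tag, dict(attrs)))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        # Text that sits between tags
        self.events.append(("text", data))


printer = ElementPrinter()
printer.feed(SNIPPET)
printer.close()

for event in printer.events:
    print(event)
```

Each element shows up as a start event, whatever is nested inside it, and then a matching end event. The `a` element's `href` attribute comes through as a dictionary entry on its start event.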

Web Scraping

So you want some data, but you can only find it buried in some webpage.

Packages you'll need to pip install for all of this to work (and for Lab 5):

Game plan:

Things to demo:

Okay, let's do something with this!

Live coding problem: collect the names and office numbers of all WWU CS faculty from the department directory.
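As a warm-up for this kind of problem, here is a rough sketch of the extraction step on a made-up table. Everything about the markup is an assumption: the real directory's structure isn't shown in these notes, so the `name` and `office` class names, the sample people, and the room numbers are all hypothetical. It also sticks to the standard library's `html.parser` rather than whichever packages we install for the lab.

```python
from html.parser import HTMLParser

# Hypothetical markup: the real directory page will almost certainly look different.
SAMPLE_HTML = """
<table>
  <tr><td class="name">Ada Example</td><td class="office">Room 101</td></tr>
  <tr><td class="name">Bob Sample</td><td class="office">Room 102</td></tr>
</table>
"""


class DirectoryParser(HTMLParser):
    """Collect (name, office) pairs from table rows with the assumed class names."""

    def __init__(self):
        super().__init__()
        self.rows = []        # finished (name, office) pairs
        self._current = {}    # fields collected for the row in progress
        self._field = None    # which field's text we are inside, if any

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "tr":
            self._current = {}
        elif tag == "td" and cls in ("name", "office"):
            self._field = cls

    def handle_data(self, data):
        # Only keep text that falls inside a cell we care about
        if self._field is not None:
            self._current[self._field] = self._current.get(self._field, "") + data

    def handle_endtag(self, tag):
        if tag == "td":
            self._field = None
        elif tag == "tr" and "name" in self._current:
            name = self._current["name"].strip()
            office = self._current.get("office", "").strip()
            self.rows.append((name, office))


parser = DirectoryParser()
parser.feed(SAMPLE_HTML)
parser.close()

faculty = parser.rows
for name, office in faculty:
    print(name, "-", office)
```

For the real problem, the page's HTML would first have to be downloaded (for example with `requests.get(url).text`, assuming the `requests` package is installed), and the tag and class names would need to be adjusted after inspecting the page source.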