Fall 2025
This pre-lab and lab give you hands-on experience with NumPy (“numb-pie”), a powerful, popular Python library for numerical computation. Its implementation relies on highly optimized C and Fortran code, which yields significant speedups over pure Python, and it supports a very large number of numerical operations.
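As a rough illustration of the speedup mentioned above, the sketch below computes the same sum of squares twice: once with a plain Python loop and once with a vectorized NumPy expression. The function names and the array size are chosen for illustration only and are not part of the lab notebook; the timings depend on your machine, so no specific numbers are claimed here.

```python
import time

import numpy as np


def sum_of_squares_loop(values):
    """Sum of squares using a plain Python loop (illustrative helper)."""
    total = 0
    for v in values:
        total += v * v
    return total


def sum_of_squares_numpy(arr):
    """Sum of squares using vectorized NumPy operations (illustrative helper)."""
    return int(np.sum(arr * arr))


n = 1_000_000  # arbitrary size, small enough that int64 does not overflow
values = list(range(n))
arr = np.arange(n, dtype=np.int64)

start = time.perf_counter()
loop_result = sum_of_squares_loop(values)
loop_seconds = time.perf_counter() - start

start = time.perf_counter()
numpy_result = sum_of_squares_numpy(arr)
numpy_seconds = time.perf_counter() - start

# Both approaches should agree on the answer; only the running time differs.
print(f"Python loop: {loop_seconds:.4f} s")
print(f"NumPy:       {numpy_seconds:.4f} s")
print("Results match:", loop_result == numpy_result)
```

On most machines the NumPy version should finish noticeably faster, because the element-wise work runs in compiled code rather than in the Python interpreter.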
You must complete the pre-lab individually, but complete the lab in pairs. Your TA will help facilitate pairing. If an odd number of students attend lab, your TA will arrange one group of three. The lab must be done synchronously and collaboratively with your partner, with no free-loaders and no divide-and-conquer; every answer should reflect the understanding and work of both members. If your group does not finish during lab time, please arrange to meet as a pair outside of class time to finish any remaining work.
The Pre-Lab is due by the start of your lab period - see Canvas for your deadline. Submit a PDF to the “Lab 1 Pre-Lab” assignment on Canvas with your answers.
The Lab is due by 10pm on the Thursday of the week after the lab was assigned. Be sure that you have run all the cells in your notebook, download the notebook in .ipynb format, and submit it to the Lab 1 assignment on Canvas.
The lab involves completing a Jupyter notebook. As shown in class, Jupyter notebooks are a file format that allows you to interleave text and code, in an environment that lets you interactively run (and re-run) code blocks. While there are other ways to work with Jupyter notebooks, the recommended approach for this class is to use the department’s JupyterHub service, which is hosted on the HPC cluster. Jupyter notebooks have both advantages and drawbacks compared with traditional source files; for data science, they suit our purposes for exploratory, explanatory, and illustrative computing. Just keep in mind that notebooks are not a great format for writing “software” in the sense of larger systems with reusable components.
If you choose to use Google Colab or any other alternative notebook hosting service, you must disable any built-in generative AI features. For Google Colab, this can be done via: Tools, Settings, AI Assistance, then uncheck “Show AI-powered inline completions” and “Consented to use generative AI features,” if checked. Then check “Hide generative AI features.” Your TA can help you if you have trouble with this.
Please carefully read “NumPy Illustrated: The Visual Guide to NumPy” (this link bypasses the paywall).
In a text editor of your choice that is capable of exporting to PDF, answer these questions:
What is a key difference between a NumPy array and a Python list? Why might you choose to use a NumPy array over a Python list for certain data science tasks?
What is the command to generate a random NumPy array of 15 integers in the range (0, 10]?
What is the command to generate a random NumPy matrix of integers of shape (2, 6) in the range (2, 7)?
What is a view? How is it different from a copy?
What is the difference between the *, @, and ** operators in NumPy matrix arithmetic?
Please read “NumPy: the Absolute Basics for Beginners” and refer to the NumPy documentation as needed for the following.
In the same pre-lab document as above, answer these additional questions:
Let a be the following NumPy array: [0, 1, 8, 27, 64, 125, 216, 343, 512, 729]. What does a[2] return? What about a[:4]? Note: the first element of an array/vector is always at index 0 when coding in Python.
What is the .shape array attribute? What about .size?
What are three different commands you can use to create a basic array?
What does numpy.mean() do? Briefly describe the parameters and what the function returns.
What is the difference between .flatten() and .ravel()?
For Part 3 of the lab, you will need more than the default resources (1 CPU, 2 GB RAM) for your Jupyter server. For this lab, 2 CPUs and 10 GB of RAM should be sufficient.
With your lab partner, download lab1.ipynb, upload it to JupyterHub, and complete the lab following the instructions in the notebook. Any work not completed during lab time must be completed outside of lab hours, and should only be done with both partners present.