{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "e2362e9d-042e-4e26-82ad-517722f21d55",
   "metadata": {},
   "source": [
    "# DATA 311 Lecture 2 - `numpy` demo\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "20d0f1e8-e323-436e-87a4-b4fad4a03c78",
   "metadata": {
    "id": "4tRd3rnn5YHf"
   },
   "source": [
    "\n",
    "\n",
    "## Markdown Demo (this is a level 2 heading)\n",
    "\n",
    "### Lists\n",
    "* bulleted lists\n",
    "* another item\n",
    "\n",
    "1. numbered lists\n",
    "2. another item\n",
    "\n",
    "### Text formatting\n",
    "**bold**, *italics*, `monospace`\n",
    "\n",
    "A code block:\n",
    "```python\n",
    "a = 4\n",
    "b = 7\n",
    "```\n",
    "### Links and images:\n",
    "\n",
    "[link text](https://facultyweb.cs.wwu.edu/~wehrwes/courses/data311_26s/)\n",
    "\n",
    "![alt text](https://facultyweb.cs.wwu.edu/~wehrwes/courses/data311_25f/lab1/diagonal_example.png)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "478eb7e7-029f-41c4-9349-89c3d7e4a727",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import random\n",
    "import matplotlib.pyplot as plt\n",
    "import imageio.v3 as imageio"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e77cab94-293a-4a71-b3aa-5d7d5471f7fb",
   "metadata": {},
   "source": [
    "### Creating Arrays\n",
    "* `array`, `zeros`, `ones`, `*_like`\n",
    "  * `dtype` argument"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "9327ac12-6e94-47ea-a48f-01cadc3a12b8",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# create a python list with 0..9\n",
    "a = list(range(10))\n",
    "a\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "0a95dc4d-6489-4e52-89fc-ee8d79e82a68",
   "metadata": {},
   "outputs": [],
   "source": [
    "# create a numpy array with the list's contents\n",
    "a = np.array(a)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "20d562b2-0563-4ae2-8778-8d8f1d2c5a0b",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "dtype('int64')"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# show the array's data type\n",
    "a.dtype"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "2709d96d-d7d1-4ea6-80e9-c45528f30afa",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "np.int64(1)"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# get the element at index 1\n",
    "a[1]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "b5ac609c-b331-4984-96bb-8ef813f1ab9d",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(10,)"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# get the shape of the array\n",
    "a.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e116f13f-de96-4ba0-9183-f4c73f1e9c32",
   "metadata": {},
   "source": [
    "### Basic list-like slicing"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "04e7936a-9655-411b-8ee0-c64fcfe79262",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([3, 4, 5])"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# slice with beginning and end\n",
    "a[3:6]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "b84e5597-b0a4-4802-ba0b-a90d67a93d52",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([0, 1, 2, 3])"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# slice with implicit start (0)\n",
    "a[:4]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "b57d654c-4458-40ac-bdec-e169f3a10599",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([5, 6, 7, 8, 9])"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# slice with implicit end (len)\n",
    "a[5:]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "a5050fcd-ddef-4b7c-a010-169b3c77c959",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# slice with implicit start and end\n",
    "a[:]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "bd646389-5025-46c5-b5f2-2e60682eed95",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([1, 3, 5])"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# slice with a step size\n",
    "a[1:7:2]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2326dcaf-8551-4557-8433-ce6aa3aceab7",
   "metadata": {},
   "source": [
    "### Elementwise Operations\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "be7ae120-a010-41e8-a564-f304fd002e2f",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "f6fec9ad-44cb-4812-8799-b6ed556ec2e7",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 4,  5,  6,  7,  8,  9, 10, 11, 12, 13])"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# array + scalar\n",
    "a + 4"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "4f5cb864-b68a-43de-8e3b-d93b93ac479a",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# array + array\n",
    "a + a"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "d6d52581-cc25-415a-b66f-f84a8dc6c7b7",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# scalar * array\n",
    "2 * a"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "id": "fb52730e-9f0c-48e4-a48f-39aa35b8b0ba",
   "metadata": {},
   "outputs": [
    {
     "ename": "ValueError",
     "evalue": "operands could not be broadcast together with shapes (10,) (4,) ",
     "output_type": "error",
     "traceback": [
      "\u001b[31m---------------------------------------------------------------------------\u001b[39m",
      "\u001b[31mValueError\u001b[39m                                Traceback (most recent call last)",
      "\u001b[36mCell\u001b[39m\u001b[36m \u001b[39m\u001b[32mIn[16]\u001b[39m\u001b[32m, line 2\u001b[39m\n\u001b[32m      1\u001b[39m \u001b[38;5;66;03m# array + array dimension matching\u001b[39;00m\n\u001b[32m----> \u001b[39m\u001b[32m2\u001b[39m a + a[:\u001b[32m4\u001b[39m]\n",
      "\u001b[31mValueError\u001b[39m: operands could not be broadcast together with shapes (10,) (4,) "
     ]
    }
   ],
   "source": [
    "# array + array dimension matching\n",
    "a + a[:4]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "66a9ec63-8bd2-4a6b-ae01-ca0182c1ff82",
   "metadata": {},
   "source": [
    "### Exercise 1: Speed Check\n",
    "\n",
    "**In pairs**: I've claimed `numpy` is faster than native Python. Let's find out how much faster.\n",
    "\n",
    "1. In the cell below, create a Python list (not a numpy array!) of 1,000,000 random floating-point numbers between 0.0 and 1.0. Useful tools: `import random`, `random.random()`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "id": "b236aa94-ef90-4764-9e2e-901e3043d8db",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[0.8437826572010122,\n",
       " 0.6696759345208393,\n",
       " 0.14343354017923993,\n",
       " 0.8904407099190859,\n",
       " 0.2182667489356157]"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import random\n",
    "\n",
    "b = []\n",
    "for i in range(1_000_000):\n",
    "    b.append(random.random())\n",
    "b[:5]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "472e5cdf-0425-4dfa-9268-7a12ebf50e16",
   "metadata": {},
   "source": [
    "2. In the next cell, create a new Python list containing the same numbers as the original, but with 0.5 subtracted from each. Don't modify the original list. I've added the ipython magic command `%%time` to the top of the cell to measure and report the time it takes to execute that cell."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "id": "5b5a1d84-821c-48db-8b1c-0c8ad5a73ff4",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "CPU times: user 83.7 ms, sys: 16.1 ms, total: 99.8 ms\n",
      "Wall time: 99.4 ms\n"
     ]
    }
   ],
   "source": [
    "%%time\n",
    "\n",
    "c = []\n",
    "for v in b:\n",
    "    c.append(v - 0.5) \n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ebd4d06a-5b28-4de7-8c99-9910a5e1b148",
   "metadata": {},
   "source": [
    "3. In the next cell, create a numpy array `np_nums` containing the same numbers as your original list (values between 0.0 and 1.0).\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "id": "9dd7bc63-1d0f-4ae0-9f13-e4ad6e862afc",
   "metadata": {},
   "outputs": [],
   "source": [
    "np_nums = np.array(b)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ce97493a-60a5-4087-9ce8-86fb3334e3e4",
   "metadata": {},
   "source": [
    "4. In the cell below, create a new numpy array `np_result` by subtracting 0.5 from `np_nums` (i.e., using elementwise operations). Time this cell's execution. How much faster is the numpy version than the native python version?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "id": "cd0b7e95-fb75-48de-b9c9-6c374007b117",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "CPU times: user 1.84 ms, sys: 2.96 ms, total: 4.8 ms\n",
      "Wall time: 4.14 ms\n"
     ]
    }
   ],
   "source": [
    "%%time\n",
    "np_result = np_nums - 0.5"
   ]
  },
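  {
   "cell_type": "markdown",
   "id": "sketch-timeit-speed-check",
   "metadata": {},
   "source": [
    "`%%time` reports a single run, so the numbers can be noisy. As a rough cross-check, here is a minimal sketch that repeats each version a few times with the standard-library `timeit` module and keeps the best time. The names `nums`, `subtract_list`, and `subtract_numpy` are made up for this illustration, and the exact timings will differ from machine to machine.\n",
    "\n",
    "```python\n",
    "import random\n",
    "import timeit\n",
    "\n",
    "import numpy as np\n",
    "\n",
    "# hypothetical stand-in for the exercise's list of random floats\n",
    "nums = [random.random() for _ in range(1_000_000)]\n",
    "np_nums = np.array(nums)\n",
    "\n",
    "def subtract_list():\n",
    "    # native Python: one subtraction per element, in a Python-level loop\n",
    "    return [v - 0.5 for v in nums]\n",
    "\n",
    "def subtract_numpy():\n",
    "    # numpy: a single vectorized elementwise operation\n",
    "    return np_nums - 0.5\n",
    "\n",
    "# best of 5 single runs for each version, in seconds\n",
    "list_time = min(timeit.repeat(subtract_list, number=1, repeat=5))\n",
    "numpy_time = min(timeit.repeat(subtract_numpy, number=1, repeat=5))\n",
    "print(f'list: {list_time:.4f} s, numpy: {numpy_time:.4f} s')\n",
    "print(f'speedup: about {list_time / numpy_time:.0f}x')\n",
    "```\n",
    "\n",
    "Taking the minimum over several runs is one common way to reduce the influence of other activity on the machine; it is not the only reasonable choice."
   ]
  },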
  {
   "cell_type": "markdown",
   "id": "6d49c281-2964-42b0-b870-d18f962ee67e",
   "metadata": {},
   "source": [
    "### Multidimensional Arrays\n",
    "* 2D arrays, slicing across dimensions\n",
    "* elementwise operations\n",
    "  * comparisons / boolean dtype, masking\n",
    "* visualizing as an image"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "06d0c62a-99f4-4d2e-bdf2-a12b3c4bcd36",
   "metadata": {},
   "source": [
    "More ways of making arrays:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "id": "b7829fb1-2ebc-4f34-9b98-6ef3aee9241c",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([1, 2, 3])"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# create an array from [1, 2, 3]\n",
    "np.array([1, 2, 3])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "id": "629cf52c-1ce9-4758-b1dd-6cc547faded0",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([0., 0., 0., 0., 0., 0.])"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# create an array of 6 zeros\n",
    "np.zeros((6,))\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "id": "64ee80f6-b629-4910-b9f3-4de43a7c8724",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([0, 0, 0, 0, 0, 0])"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# create an array of 6 zeros with 64-bit integer datatype\n",
    "np.zeros((6,), dtype=np.int64)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "id": "8098011e-8960-45ab-987d-387ea8471099",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[0., 0., 0.],\n",
       "       [0., 0., 0.]])"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# create a 2-by-3 array of zeros\n",
    "np.zeros((2, 3))"
   ]
  },
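  {
   "cell_type": "markdown",
   "id": "sketch-ones-and-like",
   "metadata": {},
   "source": [
    "The cells above cover `array` and `zeros`; `ones` and the `*_like` functions follow the same pattern. A minimal sketch, with variable names made up for illustration:\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "# a 2-by-3 array of ones; like zeros, the default dtype is float64\n",
    "ones_2x3 = np.ones((2, 3))\n",
    "\n",
    "# the same shape with an explicit integer dtype\n",
    "int_ones = np.ones((2, 3), dtype=np.int64)\n",
    "\n",
    "# the *_like functions copy the shape and dtype of an existing array\n",
    "template = np.array([[1, 2, 3], [4, 5, 6]])\n",
    "zeros_like_template = np.zeros_like(template)\n",
    "ones_like_template = np.ones_like(template)\n",
    "\n",
    "print(ones_2x3.dtype, int_ones.dtype)\n",
    "print(zeros_like_template.shape, zeros_like_template.dtype == template.dtype)\n",
    "```\n",
    "\n",
    "The `*_like` variants are convenient when a result array should line up with an input whose shape is not known in advance."
   ]
  },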
  {
   "cell_type": "markdown",
   "id": "b99c4784-9470-4687-9055-026f9038b0c7",
   "metadata": {},
   "source": [
    "### Reshaping\n",
    "* more than 2 dimensions"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "id": "e7078b4f-777d-40ae-b80b-8d0d8a136918",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[0, 1, 2],\n",
       "       [3, 4, 5]])"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# set b to an array of 0..5, reshaped to 2-by-3\n",
    "b = np.array(range(6)).reshape((2, 3))\n",
    "b"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "id": "b1dab6f1-7ebb-4087-8737-9be85a7b9277",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "np.int64(5)"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# demo indexing into b\n",
    "b[1,2]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "32a9e33e-3b95-41b1-a98e-e482047bb088",
   "metadata": {},
   "source": [
    "### Aggregation / Projection\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "id": "550de3a9-103e-4e01-a9c6-71ede2ff1f26",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "np.int64(15)"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# find the sum of all elements in b\n",
    "b.sum()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "id": "77946e7d-a010-41a9-a5f9-981c282efed4",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "np.int64(0)"
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# find the minimum value in b\n",
    "b.min()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "id": "2e3bdd9e-6496-48e4-87dc-f08f0c2a1745",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[0, 1, 2],\n",
       "       [3, 4, 5]])"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# display b, just for reference\n",
    "b"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "id": "0bf92ee6-63f0-4e12-831e-e880da468d36",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([3, 5, 7])"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# sum over axis 0 (the row dimension): the rows collapse, leaving one sum per column\n",
    "b.sum(axis=0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "id": "4ebc1ad7-998b-48d1-b285-7ebeb9de032e",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 3, 12])"
      ]
     },
     "execution_count": 31,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# sum over axis 1 (the column dimension): the columns collapse, leaving one sum per row\n",
    "b.sum(axis=1)"
   ]
  },
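  {
   "cell_type": "markdown",
   "id": "sketch-axis-aggregation",
   "metadata": {},
   "source": [
    "The `axis` argument names the dimension that gets collapsed, which is easy to mix up. A small sketch of the same idea, checking the result shapes and trying two other aggregations (`mean` and `max`) that accept the same argument; the variable names are illustrative only:\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "b = np.arange(6).reshape((2, 3))\n",
    "\n",
    "col_sums = b.sum(axis=0)  # axis 0 collapses: shape (2, 3) -> (3,)\n",
    "row_sums = b.sum(axis=1)  # axis 1 collapses: shape (2, 3) -> (2,)\n",
    "print(col_sums.shape, row_sums.shape)\n",
    "\n",
    "# other aggregations take the same axis argument\n",
    "col_means = b.mean(axis=0)  # mean of each column\n",
    "row_max = b.max(axis=1)     # largest value in each row\n",
    "print(col_means, row_max)\n",
    "```"
   ]
  },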
  {
   "cell_type": "markdown",
   "id": "f4f4d18c-146b-4cd8-be3a-828639c81977",
   "metadata": {},
   "source": [
    "### Exercise 2 - Broadcasting\n",
    "\n",
    "**In pairs**: We've seen that, to perform elementwise operations, the dimensions of the arrays must match. There's one convenient exception to this. Let's see it in action below:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "id": "593ae1a4-2b7d-4f5d-b6c2-8965d8330ab3",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(2, 3)"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "b = np.array(range(6)).reshape((2, 3))\n",
    "b.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "id": "48b40f87-c4e6-4ced-9b07-2237e576e475",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(2,)"
      ]
     },
     "execution_count": 33,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "c = np.array([2, 4])\n",
    "c.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "id": "f7fa7e24-4232-4d1f-9d5b-40e3f7a163a8",
   "metadata": {},
   "outputs": [
    {
     "ename": "ValueError",
     "evalue": "operands could not be broadcast together with shapes (2,3) (2,) ",
     "output_type": "error",
     "traceback": [
      "\u001b[31m---------------------------------------------------------------------------\u001b[39m",
      "\u001b[31mValueError\u001b[39m                                Traceback (most recent call last)",
      "\u001b[36mCell\u001b[39m\u001b[36m \u001b[39m\u001b[32mIn[34]\u001b[39m\u001b[32m, line 2\u001b[39m\n\u001b[32m      1\u001b[39m \u001b[38;5;66;03m# dimension mismatch\u001b[39;00m\n\u001b[32m----> \u001b[39m\u001b[32m2\u001b[39m b * c\n",
      "\u001b[31mValueError\u001b[39m: operands could not be broadcast together with shapes (2,3) (2,) "
     ]
    }
   ],
   "source": [
    "# dimension mismatch\n",
    "b * c"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "id": "bd5f466f-2c27-4693-9d7c-a8a973cd6234",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(2, 1)"
      ]
     },
     "execution_count": 35,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# reshape c to be 2D\n",
    "c_2x1 = c.reshape((2, 1))\n",
    "c_2x1.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "id": "e1b166ff-d01d-40cd-89e0-8aee8992953d",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[0, 1, 2],\n",
       "       [3, 4, 5]])"
      ]
     },
     "execution_count": 36,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "b"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "id": "f938a358-fb29-491c-b956-8f93171d86d6",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[2],\n",
       "       [4]])"
      ]
     },
     "execution_count": 37,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "c_2x1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "id": "6171b36d-f9aa-4bc3-a0c5-8f69b918873a",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0,  2,  4],\n",
       "       [12, 16, 20]])"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# example of broadcasting:\n",
    "b * c_2x1"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c70aab89-024d-4abd-baf6-795e43b14713",
   "metadata": {},
   "source": [
    "1. This is called **broadcasting**. Explain what happened here."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1d7d726d-65ef-4ed8-b391-647fc3e94b27",
   "metadata": {},
   "source": [
    "The two values in `c_2x1` (shape `(2, 1)`) were repeated across the column dimension to match the 3 columns of `b`, so each row of `b` was multiplied by its own value from `c_2x1`."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "282e8a06-4b9d-4c19-aa39-3f9e687d3c31",
   "metadata": {},
   "source": [
    "Now, run the following cells to see another example."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "id": "c1647f22-5f79-439f-9fc8-aca6ef54b3be",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(1, 3)"
      ]
     },
     "execution_count": 39,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "d = np.array([1, 0, 1]).reshape(1, 3)\n",
    "d.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "id": "453a956c-ca42-4ed9-af2a-5a708b1d8963",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[0, 1, 2],\n",
       "       [3, 4, 5]])"
      ]
     },
     "execution_count": 40,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "b"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "id": "3e4a0d0a-2341-4c3d-a836-cbaa02b43f93",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[1, 0, 1]])"
      ]
     },
     "execution_count": 41,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "d"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "id": "e43a3fce-f968-4e06-8ac6-acc16ef19b6f",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[0, 0, 2],\n",
       "       [3, 0, 5]])"
      ]
     },
     "execution_count": 42,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "b * d"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "02ac0f5e-ae25-4e23-91db-456a1f3b6277",
   "metadata": {},
   "source": [
    "Now, explain the general rule for:\n",
    "\n",
    "2. What kind of dimension mismatches are allowed?"
   ]
  },
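  {
   "cell_type": "markdown",
   "id": "efb05e2c-b9a0-409c-ba8f-4a563938391c",
   "metadata": {},
   "source": [
    "Corresponding dimensions must match exactly, unless one of the arrays has size 1 in that dimension. Shapes are lined up starting from the last dimension, which is why `(2, 3)` with `(2,)` failed above while `(2, 3)` with `(2, 1)` worked."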
   ]
  },
  {
   "cell_type": "markdown",
   "id": "87b61a43-8b95-4666-be4c-cc08fe054f59",
   "metadata": {},
   "source": [
    "\n",
    "3. How do elementwise operations behave when such a mismatch is present?\n"
   ]
  },
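  {
   "cell_type": "markdown",
   "id": "e96e90cf-2b43-4eb7-842f-6ecc838068ad",
   "metadata": {},
   "source": [
    "The array with size 1 in the mismatched dimension behaves as if its elements were repeated along that dimension until the shapes match; the operation is then applied elementwise."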
   ]
  },
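  {
   "cell_type": "markdown",
   "id": "sketch-broadcasting-rule",
   "metadata": {},
   "source": [
    "To make the rule concrete, here is a small sketch that checks which shapes are compatible with `np.broadcast_shapes` and spells out the repetition with `np.broadcast_to`. It reuses `b` and `c` from the exercise; `c_repeated` and `d_1d` are names made up for this illustration.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "b = np.arange(6).reshape((2, 3))\n",
    "c = np.array([2, 4])\n",
    "\n",
    "# shapes are compared starting from the last dimension\n",
    "print(np.broadcast_shapes((2, 3), (2, 1)))  # (2, 3)\n",
    "print(np.broadcast_shapes((2, 3), (1, 3)))  # (2, 3)\n",
    "\n",
    "# broadcast_to shows the repetition that broadcasting implies\n",
    "c_repeated = np.broadcast_to(c.reshape((2, 1)), b.shape)\n",
    "print(c_repeated)\n",
    "\n",
    "# the broadcast product matches the product with the repeated array\n",
    "assert np.array_equal(b * c.reshape((2, 1)), b * c_repeated)\n",
    "\n",
    "# a 1-D array of shape (3,) is treated like shape (1, 3)\n",
    "d_1d = np.array([1, 0, 1])\n",
    "print(b * d_1d)\n",
    "```\n",
    "\n",
    "Because missing leading dimensions count as size 1, shape `(3,)` works with `(2, 3)` but shape `(2,)` does not, which matches the error seen earlier for `b * c`."
   ]
  },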
  {
   "cell_type": "markdown",
   "id": "390d84d5-f605-444b-9775-a81d5fb893de",
   "metadata": {},
   "source": [
    "## Numpy, Continued\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4e100472-fb81-49d6-84aa-07abf1aadec2",
   "metadata": {},
   "source": [
    "### Fancy indexing\n",
    "* Integer indexing: `a[ list or ndarray of integer indices ]`\n",
    "* Boolean indexing: `a[ list or ndarray of booleans ]` where the list/ndarray's shape matches a's\n",
    "\n",
    "\n",
    "See <https://numpy.org/doc/stable/user/basics.indexing.html> for much more."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "43dc1963-2476-4777-891f-d454f608b996",
   "metadata": {},
   "source": [
    "#### Integer indexing"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e4fd4822-e274-42e5-a8ab-4b143f83b1a7",
   "metadata": {},
   "outputs": [],
   "source": [
    "a = np.array(range(10, 20))\n",
    "a"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4aea9397-4e35-4e5c-a64d-f0253b6339a1",
   "metadata": {},
   "source": [
    "Indexing with a list or array of integers pulls out only the elements at those indices:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "da64fe99-4432-4a59-ab15-eb6286ea6c13",
   "metadata": {},
   "outputs": [],
   "source": [
    "# get the first, third, and fifth elements:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e16212b1-9d14-4b45-93de-1aa048bb2171",
   "metadata": {},
   "outputs": [],
   "source": [
    "# get the fourth, second, and second elements (!):\n"
   ]
  },
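  {
   "cell_type": "markdown",
   "id": "sketch-integer-indexing",
   "metadata": {},
   "source": [
    "One way the two cells above might be filled in, as a sketch (the result names are made up for illustration). Indices are zero-based, so the first, third, and fifth elements sit at indices 0, 2, and 4.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "a = np.array(range(10, 20))\n",
    "\n",
    "# first, third, and fifth elements\n",
    "first_third_fifth = a[[0, 2, 4]]\n",
    "\n",
    "# fourth, second, and second elements: indices may repeat and come in any order\n",
    "fourth_second_second = a[[3, 1, 1]]\n",
    "\n",
    "print(first_third_fifth)     # [10 12 14]\n",
    "print(fourth_second_second)  # [13 11 11]\n",
    "```"
   ]
  },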
  {
   "cell_type": "markdown",
   "id": "39a4a73c-c7f5-4e0f-8ae5-1e473e6beaf4",
   "metadata": {},
   "source": [
    "#### Boolean Indexing\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "faf2fba9-b205-46a7-b150-449eed3016d9",
   "metadata": {},
   "outputs": [],
   "source": [
    "b = np.ones((2, 2))\n",
    "b[0,0] = 2\n",
    "b[1,1] = 0"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f8fd9d3c-13c9-4700-b842-0d54ff406b9d",
   "metadata": {},
   "source": [
    "**Quick quiz: what does b look like now?**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "354df772-c6e9-4dea-916d-1a39f919f183",
   "metadata": {},
   "outputs": [],
   "source": [
    "b"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9fafec39-69ea-4a80-8928-f8ccd6a16e98",
   "metadata": {},
   "source": [
    "Make a \"mask\" of booleans that's the same shape as `b`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "44ad98a0-75e0-4ff0-b092-fa23eb6cdc6b",
   "metadata": {},
   "outputs": [],
   "source": [
    "mask = np.array([\n",
    "    [True, False],\n",
    "    [False, True]\n",
    "])\n",
    "mask"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7e93f608-d307-4c67-b9ba-ce2d2cd3dbbe",
   "metadata": {},
   "outputs": [],
   "source": [
    "# index b with the boolean mask:\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "387408d1-0ccf-4298-bb48-1ed23f24bdcf",
   "metadata": {},
   "source": [
    "A common pattern - comparison operators to generate a mask:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c53f5345-acbe-4fc9-987f-bb4c44436389",
   "metadata": {},
   "outputs": [],
   "source": [
    "# get an array of only the elements of b that are greater than zero:\n"
   ]
  },
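  {
   "cell_type": "markdown",
   "id": "sketch-boolean-indexing",
   "metadata": {},
   "source": [
    "A sketch of how the two boolean-indexing cells above might look, using the same `b` and `mask`; the other names are illustrative. Note that indexing a 2D array with a 2D mask gives back a 1D array of the selected elements.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "b = np.ones((2, 2))\n",
    "b[0, 0] = 2\n",
    "b[1, 1] = 0\n",
    "\n",
    "mask = np.array([\n",
    "    [True, False],\n",
    "    [False, True]\n",
    "])\n",
    "\n",
    "# keep only the elements where mask is True\n",
    "masked = b[mask]\n",
    "print(masked)  # [2. 0.]\n",
    "\n",
    "# a comparison produces a boolean array with the same shape as b\n",
    "positive_mask = b > 0\n",
    "\n",
    "# common pattern: build the mask and index with it in one step\n",
    "positive = b[b > 0]\n",
    "print(positive)  # [2. 1. 1.]\n",
    "```"
   ]
  },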
  {
   "cell_type": "markdown",
   "id": "43acafdd-a851-440e-9d83-869525b66c67",
   "metadata": {},
   "source": [
    "#### Tips for multidimensional arrays\n",
    "\n",
    "* I never display anything that's more than 2D.\n",
    "* I never try to visualize anything that's more than 3D."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d1bfce3c-7b1a-4db4-a872-5460342eb9bf",
   "metadata": {},
   "outputs": [],
   "source": [
    "c = np.array(range(24)).reshape(2, 4, 3)\n",
    "c"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d363f7db-1af7-4b2d-8b3a-2c482140d23d",
   "metadata": {},
   "outputs": [],
   "source": [
    "# take one 2D slice\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e14d5322-bb1f-4350-ac34-f9bb1dc2b45b",
   "metadata": {},
   "outputs": [],
   "source": [
    "# take another 2D slice along a different axis\n"
   ]
  },
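  {
   "cell_type": "markdown",
   "id": "sketch-3d-slices",
   "metadata": {},
   "source": [
    "As a sketch of the two slicing cells above: fixing the index along one axis of a 3D array leaves a 2D array that is easy to display. Which axes to fix is an arbitrary choice here, and the result names are made up for illustration.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "c = np.array(range(24)).reshape(2, 4, 3)\n",
    "\n",
    "# fix the first axis: a 4-by-3 slice\n",
    "slice_axis0 = c[0, :, :]\n",
    "print(slice_axis0.shape)  # (4, 3)\n",
    "\n",
    "# fix the last axis instead: a 2-by-4 slice\n",
    "slice_axis2 = c[:, :, 1]\n",
    "print(slice_axis2.shape)  # (2, 4)\n",
    "print(slice_axis2)\n",
    "```"
   ]
  },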
  {
   "cell_type": "markdown",
   "id": "c913c407-2ebb-464b-883a-7459316fe19a",
   "metadata": {},
   "source": [
    "### Exercise 3 - Play with my cat\n",
    "\n",
    "**In pairs**: In this exercise, we'll manipulate an image as a 2D array.\n",
    "\n",
    "We'll start by loading a picture of my cat, Beans:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a94298d7-179a-4c1e-9ad9-3cfb2c4f09ac",
   "metadata": {},
   "outputs": [],
   "source": [
    "beans = imageio.imread(\"/cluster/academic/DATA311/202620/beans_gray.jpeg\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5f49a796-3ecc-4bbf-988b-ef52330654f8",
   "metadata": {},
   "source": [
    "We'll use `plt.imshow` to visualize the image:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5d461096-969c-4217-840e-3136b6273900",
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.imshow(beans, cmap='gray')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9fb413b6-328d-4aca-a270-4ccfe34c6273",
   "metadata": {},
   "source": [
    "1. What is the dtype of the resulting array? What are the minimum and maximum values?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8988c9c4-3163-4169-ad47-ee39c025b31d",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "6be43a32-a3d1-40cc-82d5-8039978901a3",
   "metadata": {},
   "source": [
    "2. Display a binary image showing which pixels are greater than half the maximum pixel intensity (127)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8e4a13fe-0b4c-4cef-bbb5-4c17ffa8c865",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "a967c097-eab3-4abe-ab05-53e71197845d",
   "metadata": {},
   "source": [
    "3. What is the average value of pixels that have intensity value above 127?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4d801220-fe14-4737-97a5-35de27ea0bb3",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "df00432a-1d3e-49da-9d22-1e7eeb50ead8",
   "metadata": {},
   "source": [
    "4. Which column of the image has the highest average pixel value?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8e90318c-b38c-4b57-b4ba-afeee6bc36fb",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
