{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "gwDkBYVEqm5N"
   },
   "source": [
    "# DATA 311 - Lecture 4: Probability and Statistics; Summary Statistics"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "YTajC-alqoaU"
   },
   "source": [
    "## Announcements\n",
    "\n",
    "* Lecture 3 notebook has the pandas tour cells filled in for you\n",
    "* Lab 2 Pre-Lab is posted on the course webpage. Fun with Pandas!\n",
    "* Reminder - Ethics 1 due Wednesday\n",
    "* How's lab 1 going?\n",
    "  * Numpy slicing magic (plus some) examples:\n",
    "    * <https://github.com/harskish/tlgan?tab=readme-ov-file#disentangling-random-and-cyclic-effects-in-time-lapse-sequences>\n",
    "    * <https://vision.huji.ac.il/videowarping/>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "-qhrUj8UOzvv"
   },
   "source": [
    "## Goals\n",
    "\n",
    "* Develop intuition for the purpose of, and distinction between, *probability* and *statistics* (Skiena 2.1.1)\n",
    "* Know the terminology and properties of basic probability (Skiena 2.1):\n",
    "  * Experiment; Outcome; Sample Space; Event; Probability; Random Variable; Expected Value\n",
    "* Know how to compute and interpret basic summary statistics (Skiena 2.2):\n",
    "  * Centrality measures: Arithmetic Mean, Geometric Mean, Median, (Mode)\n",
    "  * Variability measures: Standard Deviation, Variance\n",
    "\n",
    "* Know how to compute summary statistics in pandas."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "89776e28"
   },
   "source": [
    "## Probability and Statistics\n",
    "\n",
    "What are these words? What is the purpose of each? What's the difference between the two?\n",
    "\n",
    "Probability is a set of tools for describing a **model** of how the world (or some process in the world) behaves.\n",
    "\n",
    "Statistics gives us a set of tools for *estimating* such a model, or for *verifying* or *evaluating* a hypothesized model, given observations of how the world behave**d**."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "c6uGPSQ9QdTX"
   },
   "source": [
    "### Probability - Basics\n",
    "* Probability is actually hard to define, but it's easy to build intuition about it.\n",
    "* It's also straightforward to write down its properties (i.e., how it *behaves*).\n",
    "\n",
    "First, we need some terminology.\n",
    "* An **experiment** is a process that results in one of a set of possible outcomes.\n",
    "* The **sample space** ($S$) of the experiment is the set of all possible outcomes.\n",
    "* An **event** ($E$) is a subset of the outcomes.\n",
    "* The **probability** of an outcome $s$ is written $P(s)$ and has these properties:\n",
    "  * $P(s)$ is between 0 and 1: $0 \\le P(s) \\le 1$.\n",
    "  * The sum of probabilities of all outcomes is exactly 1: $$\\sum_{s \\in S} P(s) = 1$$\n",
    "  \n",
    "* A **random variable** $(V)$ is a function that maps an outcome to a number.\n",
    "* The **expected value** $E(V)$ of a random variable $V$ is the sum of the probability of each outcome times the random variable's value at that outcome: $$E(V) = \\sum_{s \\in S} P(s) \\cdot V(s)$$\n",
    "\n",
    "If we run an **experiment** where we toss a fair coin, the **sample space** contains the outcomes $\\{H, T\\}$ representing heads and tails. The coin is fair, so the **probability** of each outcome is 0.5, which satisfies both of the properties above.\n",
    "\n",
    "Suppose you made a deal with a friend to toss a coin, and if it comes up heads, your friend gives you a dollar. If it comes up tails, no money changes hands. The **random variable** $V$ that's relevant to your wallet is $V(H) = 1, V(T) = 0$. The **expected value** of this random variable is $E(V) = P(H) \\cdot V(H) + P(T) \\cdot V(T) = 0.5 \\cdot 1 + 0.5 \\cdot 0 = 0.5$, which you can think of as the amount of money you would expect to earn per flip, on average, if you repeated this experiment many, many times."
   ]
  },
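  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The coin-toss bet above can be checked numerically. The cell below is a minimal sketch rather than part of the original lecture code: the names `P`, `V`, and `expected_value` are chosen here purely for illustration. It verifies the two properties of probabilities and then evaluates the expected value formula term by term."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Illustrative names (assumptions, not from the lecture):\n",
    "# P maps each outcome to its probability, V maps each outcome to the payout.\n",
    "P = {\"H\": 0.5, \"T\": 0.5}\n",
    "V = {\"H\": 1, \"T\": 0}\n",
    "\n",
    "# Property 1: every probability lies between 0 and 1\n",
    "assert all(0 <= p <= 1 for p in P.values())\n",
    "# Property 2: the probabilities of all outcomes sum to 1\n",
    "assert abs(sum(P.values()) - 1) < 1e-9\n",
    "\n",
    "# E(V) = sum over outcomes s of P(s) * V(s)\n",
    "expected_value = sum(P[s] * V[s] for s in P)\n",
    "print(expected_value)"
   ]
  },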
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "5TWlulGDQ6VV"
   },
   "source": [
    "**Exercise:** describe the rolling of a six-sided die using the same terminology as above. For a random variable, use the number on the die itself; find the expected value."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "7D2Zx2pgREJZ"
   },
   "source": [
    "**Exercise for later**: Do the same as above for a roll of *two* six-sided dice, and calculate the expected value of the random variable that is the *sum* of the numbers that the two dice land on."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "3.4299999999999997"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "0.6 * 4 + 0.3 * 3 + 0.05 * 2 + 0.03 * 1 + 0.02 * 0"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "021977af"
   },
   "source": [
    "#### Probability Distributions\n",
    "\n",
    "The expected value is one important property of a random variable, but if we want the whole story, we need to look at its **probability density function** (PDF): a graph with the random variable's values on the $x$ axis and the probability of the random variable taking on each value on the $y$ axis.\n",
    "\n",
    "Here's the PDF of the coin-toss random variable $V$ described above:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 297
    },
    "executionInfo": {
     "elapsed": 579,
     "status": "ok",
     "timestamp": 1673461655902,
     "user": {
      "displayName": "Scott Wehrwein",
      "userId": "11327482518794216604"
     },
     "user_tz": 480
    },
    "id": "59df670b",
    "outputId": "fb6be4c1-ac69-4e4e-a5d8-b3f0d0d8e1d1"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Text(0, 0.5, 'P(s)')"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAjcAAAGwCAYAAABVdURTAAAAOnRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjEwLjgsIGh0dHBzOi8vbWF0cGxvdGxpYi5vcmcvwVt1zgAAAAlwSFlzAAAPYQAAD2EBqD+naQAAHBVJREFUeJzt3XtsVvX9wPFPKZYSJ2xaQFFuc4pIFZFOKI79orgacDqmm0wn6IQp8zZGWCIyRdkfuE0R3QTsvDA2xcbrP7Jod3HWYZbJMJvTLc6ANAhD0AFeRgf0l3MSGkoBUds+5cvrlZzQc55znue0yRPe+Z5bUWNjY2MAACSiU6F3AACgNYkbACAp4gYASIq4AQCSIm4AgKSIGwAgKeIGAEhK5zjI7NixI95888047LDDoqioqNC7AwDsh+y2fFu2bInevXtHp077Hps56OImC5s+ffoUejcAgI+hvr4+jjnmmH2uc9DFTTZis/OP061bt0LvDgCwHzZv3pwPTuz8f3xfDrq42XkoKgsbcQMAB5b9OaXECcUAQFLEDQCQFHEDACRF3AAASRE3AEBSxA0AkBRxAwAkRdwAAEkRNwBAUsQNAJAUcQMAJKXgcTN//vwYMGBAlJaWxrBhw6Kurm6v6z777LP5MyV2n/7xj3+06z4DAB1XQeOmpqYmpk6dGjNnzowVK1bEqFGjYsyYMbF69ep9bvfPf/4z1q5d2zQdd9xx7bbPAEDHVtC4mTt3bkyaNCkmT54cgwYNinnz5uWPM1+wYME+t+vZs2cceeSRTVNxcXG77TMA0LEVLG4aGhpi+fLlUVVV1Wx5Nr9s2bJ9bjt06NA46qijYvTo0fH73/9+n+tu3bo1Nm/e3GwCANLVuVAfvGHDhti+fXv06tWr2fJsft26dXvcJgua6urq/NycLFp++ctf5oGTnYvzxS9+cY/bzJkzJ2655ZZoL/2vf6rdPgsONKtuPSdS4HsOHfu7XrC42Sk7IXhXjY2NLZbtNHDgwHzaqbKyMurr6+O2227ba9zMmDEjpk2b1jSfjdxkh74AgDQV7LBUWVlZfq7M7qM069evbzGasy8jRoyI1157ba+vd+nSJbp169ZsAgDSVbC4KSkpyQ8v1dbWNluezY8cOXK/3ye7yio7XAUAUPDDUtnhogkTJkRFRUV+iCk7nya7DHzKlClNh5TWrFkTixcvzuezq6n69+8fgwcPzk9I/tWvfhWPPfZYPgEAFDxuxo8fHxs3bozZs2fn96spLy+PpUuXRr9+/fLXs2W73vMmC5rp06fnwdO1a9c8cp566qkYO3ZsAX8LAKAjKWrMzuA9iGQnFHfv3j02bdrUJuffuIoCOu4VFK3F9xza/7v+Uf7/LvjjFwAAWpO4AQCSIm4AgKSIGwAgKeIGAEiKuAEAkiJuAICkiBsAICniBgBIirgBAJIibgCApIgbACAp4gYASIq4AQCSIm4AgKSIGwAgKeIGAEiKuAEAkiJuAICkiBsAICniBgBIirgBAJIibgCApIgbACAp4gYASIq4AQCSIm4AgKSIGwAgKeIGAEiKuAEAkiJuAICkiBsAICniBgBIirgBAJIibgCApIgbACAp4gYASIq4AQCSIm4AgKSIGwAgKeIGAEiKuAEAkiJuAICkiBsAICniBgBIirgBAJIibgCApIgbACAp4gYASIq4AQCSIm4AgKSIGwAgKeIGAEiKuAEAkiJuAICkiBsAICniBgBIirgBAJIibgCApIgbACApBY+b+fPnx4ABA6K0tDSGDRsWdXV1+7XdH//4x+jcuXOccsopbb6PAMCBo6BxU1NTE1OnTo2ZM2fGihUrYtSoUTFmzJhYvXr1PrfbtGlTTJw4MUaPHt1u+woAHBgKGjdz586NSZMmxeTJk2PQoEExb9686NOnTyxYsGCf21155ZVx8cUXR2Vl5Yd+xtatW2Pz5s3NJgAgXQWLm4aGhli+fHlUVVU1W57N
L1u2bK/bPfDAA/H666/HrFmz9utz5syZE927d2+asngCANJVsLjZsGFDbN++PXr16tVseTa/bt26PW7z2muvxfXXXx8PPvhgfr7N/pgxY0Z+GGvnVF9f3yr7DwB0TPtXCG2oqKio2XxjY2OLZZkshLJDUbfcckscf/zx+/3+Xbp0yScA4OBQsLgpKyuL4uLiFqM069evbzGak9myZUu8+OKL+YnH11xzTb5sx44deQxlozjPPPNMnHnmme22/wBAx1Sww1IlJSX5pd+1tbXNlmfzI0eObLF+t27d4m9/+1u89NJLTdOUKVNi4MCB+c/Dhw9vx70HADqqgh6WmjZtWkyYMCEqKiryK5+qq6vzy8CzaNl5vsyaNWti8eLF0alTpygvL2+2fc+ePfP74+y+HAA4eBU0bsaPHx8bN26M2bNnx9q1a/NIWbp0afTr1y9/PVv2Yfe8AQDYVVFjdtLKQSS7z012SXh25VR2qKu19b/+qVZ/T0jFqlvPiRT4nkP7f9c/yv/fBX/8AgBAaxI3AEBSxA0AkBRxAwAkRdwAAEkRNwBAUsQNAJAUcQMAJEXcAABJETcAQFLEDQCQFHEDACRF3AAASRE3AEBSxA0AkBRxAwAkRdwAAEkRNwBAUsQNAJAUcQMAJEXcAABJETcAQFLEDQCQFHEDACRF3AAASRE3AEBSxA0AkBRxAwAkRdwAAEkRNwBAUsQNAJAUcQMAJEXcAABJETcAQFLEDQCQFHEDACRF3AAASRE3AEBSxA0AkBRxAwAkRdwAAEkRNwBAUsQNAJAUcQMAJEXcAABJETcAQFLEDQCQFHEDACRF3AAASRE3AEBSxA0AkBRxAwAkRdwAAEkRNwBAUsQNAJAUcQMAJEXcAABJETcAQFLEDQCQlILHzfz582PAgAFRWloaw4YNi7q6ur2u+/zzz8fpp58eRxxxRHTt2jVOOOGEuOOOO9p1fwGAjq1zIT+8pqYmpk6dmgdOFi333HNPjBkzJl555ZXo27dvi/UPPfTQuOaaa+Lkk0/Of85i58orr8x/vuKKKwryOwAAHUtBR27mzp0bkyZNismTJ8egQYNi3rx50adPn1iwYMEe1x86dGhcdNFFMXjw4Ojfv39ccsklcfbZZ+9ztAcAOLgULG4aGhpi+fLlUVVV1Wx5Nr9s2bL9eo8VK1bk6/7f//3fXtfZunVrbN68udkEAKSrYHGzYcOG2L59e/Tq1avZ8mx+3bp1+9z2mGOOiS5dukRFRUVcffXV+cjP3syZMye6d+/eNGUjQwBAugp+QnFRUVGz+cbGxhbLdpcdhnrxxRdj4cKF+aGsJUuW7HXdGTNmxKZNm5qm+vr6Vtt3AKDjKdgJxWVlZVFcXNxilGb9+vUtRnN2l11dlTnppJPi3//+d9x88835uTh7ko3wZBMAcHAo2MhNSUlJful3bW1ts+XZ/MiRI/f7fbKRnuy8GgCAgl8KPm3atJgwYUJ+7kxlZWVUV1fH6tWrY8qUKU2HlNasWROLFy/O5+++++78EvHs/jaZ7FLw2267La699tpC/hoAQAdS0LgZP358bNy4MWbPnh1r166N8vLyWLp0afTr1y9/PVuWxc5OO3bsyINn5cqV0blz5zj22GPj1ltvze91AwCQKWrMjuscRLJLwbOrprKTi7t169bq79//+qda/T0hFatuPSdS4HsO7f9d/yj/fxf8aikAgNYkbgCApIgbACAp4gYASIq4AQCSIm4AgKSIGwAgKeIGAEiKuAEAkiJuAICkfKJnS9XX18eqVavi/fffjx49esTgwYOjS5curbd3AABtHTdvvPFGLFy4MJYsWZLHza6PpiopKYlRo0bFFVdcERdccEF06mRgCABoXx+pPr773e/GSSedFK+99lr+JO+///3v+QOsGhoaYt26dfkTvb/whS/EjTfeGCeffHL8+c9/brs9BwD4pCM32cjM66+/nh+C2l3Pnj3jzDPPzKdZs2bloZON8nz+85//KB8B
ANB+cfOTn/xkv9cdO3bsx9kfAIBP5GOfFPPBBx/kJxLvlI3SzJs3L55++ulPtkcAAIWIm6985SuxePHi/Of//Oc/MXz48Lj99ttj3LhxsWDBgk+yTwAA7R83f/nLX/IrozKPPvpo9OrVKx+9yYLnrrvu+vh7BABQiLjJDkkddthh+c/PPPNMnH/++fml3yNGjMgjBwDggIqbz33uc/Hkk0/m97rJzrOpqqrKl69fvz66devWmvsIAND2cXPTTTfF9OnTo3///vn5NpWVlU2jOEOHDv24bwsAUJjHL3zta1/Lb9i3du3aGDJkSNPy0aNHx1e/+tVPtlcAAIV4ttSRRx6ZT7s67bTTPslbAgC032GpKVOm5OfY7I+ampp48MEHP+5+AQC0/chN9tiF8vLyGDlyZJx33nlRUVERvXv3jtLS0njnnXfilVdeieeffz4efvjhOProo6O6uvrj7RUAQHvEzQ9/+MO49tpr4957782fDP7yyy83ez27NPyss87KX9959RQAQIc+5yZ7QOYNN9yQT9mdibN72mSPYigrK4tjjz02ioqK2mZPAQDa4lLw7OZ9V199dX7Y6fjjj48f/ehH+T1vsknYAAAHXNzMmjUrFi1aFOecc0584xvfiNra2vjOd77TNnsHANDWh6Uef/zxuO+++/KwyVxyySVx+umnx/bt26O4uPijvh0AQGFHbrJLwXc+MHPnfW06d+4cb775ZuvuGQBAe8RNNkJTUlLSbFkWN9u2bfs4nw8AUNjDUo2NjXHZZZdFly5dmpb997//zW/wd+ihhzY7fAUA0OHj5tJLL22xLDvvBgDggIybBx54oG32BACgEOfcAAB0ZOIGAEiKuAEAkiJuAICkiBsAICniBgBIirgBAJIibgCApIgbACAp4gYASIq4AQCSIm4AgKSIGwAgKeIGAEiKuAEAkiJuAICkiBsAICniBgBIirgBAJIibgCApIgbACAp4gYASIq4AQCSIm4AgKQUPG7mz58fAwYMiNLS0hg2bFjU1dXtdd3HH388vvSlL0WPHj2iW7duUVlZGU8//XS77i8A0LEVNG5qampi6tSpMXPmzFixYkWMGjUqxowZE6tXr97j+s8991weN0uXLo3ly5fHGWecEeeee26+LQBApqixsbGxUH+K4cOHx6mnnhoLFixoWjZo0KAYN25czJkzZ7/eY/DgwTF+/Pi46aab9mv9zZs3R/fu3WPTpk356E9r63/9U63+npCKVbeeEynwPYf2/65/lP+/CzZy09DQkI++VFVVNVuezS9btmy/3mPHjh2xZcuWOPzww/e6ztatW/M/yK4TAJCugsXNhg0bYvv27dGrV69my7P5devW7dd73H777fHee+/FhRdeuNd1shGgrPR2Tn369PnE+w4AdFwFP6G4qKio2Xx2lGz3ZXuyZMmSuPnmm/Pzdnr27LnX9WbMmJEPYe2c6uvrW2W/AYCOqXOhPrisrCyKi4tbjNKsX7++xWjO7rKgmTRpUjzyyCNx1lln7XPdLl265BMAcHAo2MhNSUlJful3bW1ts+XZ/MiRI/c5YnPZZZfFQw89FOeck8bJiQBAAiM3mWnTpsWECROioqIiv2dNdXV1fhn4lClTmg4prVmzJhYvXtwUNhMnTow777wzRowY0TTq07Vr1/x8GgCAgsZNdgn3xo0bY/bs2bF27dooLy/P72HTr1+//PVs2a73vLnnnnti27ZtcfXVV+fTTpdeemksWrSoIL8DANCxFDRuMldddVU+7cnuwfLss8+2014BAAeqgl8tBQDQmsQNAJAUcQMAJEXcAABJETcAQFLEDQCQFHEDACRF3AAASRE3AEBSxA0AkBRxAwAkRdwAAEkRNwBAUsQNAJAUcQMAJEXcAABJETcAQFLEDQCQFHEDACRF3AAASRE3AEBSxA0AkBRxAwAkRdwAAEkRNwBAUsQNAJAUcQMAJEXcAABJETcAQFLEDQCQFHEDACRF3AAASRE3AEBS
xA0AkBRxAwAkRdwAAEkRNwBAUsQNAJAUcQMAJEXcAABJETcAQFLEDQCQFHEDACRF3AAASRE3AEBSxA0AkBRxAwAkRdwAAEkRNwBAUsQNAJAUcQMAJEXcAABJETcAQFLEDQCQFHEDACRF3AAASRE3AEBSxA0AkBRxAwAkRdwAAEkpeNzMnz8/BgwYEKWlpTFs2LCoq6vb67pr166Niy++OAYOHBidOnWKqVOntuu+AgAdX0HjpqamJg+UmTNnxooVK2LUqFExZsyYWL169R7X37p1a/To0SNff8iQIe2+vwBAx1fQuJk7d25MmjQpJk+eHIMGDYp58+ZFnz59YsGCBXtcv3///nHnnXfGxIkTo3v37vv1GVkQbd68udkEAKSrYHHT0NAQy5cvj6qqqmbLs/lly5a12ufMmTMnD6GdUxZPAEC6ChY3GzZsiO3bt0evXr2aLc/m161b12qfM2PGjNi0aVPTVF9f32rvDQB0PJ0LvQNFRUXN5hsbG1ss+yS6dOmSTwDAwaFgIzdlZWVRXFzcYpRm/fr1LUZzAAA6fNyUlJTkl37X1tY2W57Njxw5slC7BQAc4Ap6WGratGkxYcKEqKioiMrKyqiurs4vA58yZUrT+TJr1qyJxYsXN23z0ksv5f++++678dZbb+XzWSideOKJBfs9AICOo6BxM378+Ni4cWPMnj07v0FfeXl5LF26NPr165e/ni3b/Z43Q4cObfo5u9rqoYceytdftWpVu+8/ANDxFPyE4quuuiqf9mTRokUtlmUnHAMAdNjHLwAAtCZxAwAkRdwAAEkRNwBAUsQNAJAUcQMAJEXcAABJETcAQFLEDQCQFHEDACRF3AAASRE3AEBSxA0AkBRxAwAkRdwAAEkRNwBAUsQNAJAUcQMAJEXcAABJETcAQFLEDQCQFHEDACRF3AAASRE3AEBSxA0AkBRxAwAkRdwAAEkRNwBAUsQNAJAUcQMAJEXcAABJETcAQFLEDQCQFHEDACRF3AAASRE3AEBSxA0AkBRxAwAkRdwAAEkRNwBAUsQNAJAUcQMAJEXcAABJETcAQFLEDQCQFHEDACRF3AAASRE3AEBSxA0AkBRxAwAkRdwAAEkRNwBAUsQNAJAUcQMAJEXcAABJETcAQFLEDQCQFHEDACRF3AAASSl43MyfPz8GDBgQpaWlMWzYsKirq9vn+n/4wx/y9bL1P/vZz8bChQvbbV8BgI6voHFTU1MTU6dOjZkzZ8aKFSti1KhRMWbMmFi9evUe11+5cmWMHTs2Xy9b/4YbbojrrrsuHnvssXbfdwCgYypo3MydOzcmTZoUkydPjkGDBsW8efOiT58+sWDBgj2un43S9O3bN18vWz/b7vLLL4/bbrut3fcdAOiYOhfqgxsaGmL58uVx/fXXN1teVVUVy5Yt2+M2L7zwQv76rs4+++y477774n//+18ccsghLbbZunVrPu20adOm/N/NmzdHW9ix9f02eV9IQVt979qb7zm0/3d953s2NjZ23LjZsGFDbN++PXr16tVseTa/bt26PW6TLd/T+tu2bcvf76ijjmqxzZw5c+KWW25psTwbIQLaV/d5/uJwMOjeht/1LVu2RPfu3Ttm3OxUVFTUbD4rst2Xfdj6e1q+04wZM2LatGlN8zt27Ii33347jjjiiH1+Dge+rPKziK2vr49u3boVeneANuK7fnBobGzMw6Z3794fum7B4qasrCyKi4tbjNKsX7++xejMTkceeeQe1+/cuXMeK3vSpUuXfNrVpz/96U+8/xw4srARN5A+3/X0df+QEZuCn1BcUlKSX9JdW1vbbHk2P3LkyD1uU1lZ2WL9Z555JioqKvZ4vg0AcPAp6NVS2eGie++9N+6///549dVX43vf+15+GfiUKVOaDilNnDixaf1s+RtvvJFvl62fbZedTDx9+vQC/hYAQEdS0HNuxo8fHxs3bozZs2fH2rVro7y8PJYuXRr9+vXLX8+W7XrPm+xmf9nrWQTd
fffd+XG3u+66Ky644IIC/hZ0VNnhyFmzZrU4LAmkxXed3RU17s81VQAAB4iCP34BAKA1iRsAICniBgBIirgBAJIibkjW/Pnz8yvsSktL83sq1dXVFXqXgFb03HPPxbnnnptfOZvdcf7JJ5/09yUnbkhSTU1NTJ06NWbOnBkrVqyIUaNGxZgxY5rdWgA4sL333nsxZMiQ+NnPflboXaGDcSk4SRo+fHiceuqpsWDBgqZlgwYNinHjxuUPUwXSko3cPPHEE/l3HIzckJyGhoZYvnx5VFVVNVuezS9btqxg+wVA+xA3JGfDhg2xffv2Fg9gzeZ3f/AqAOkRNyQ9TL2r7Gbcuy8DID3ihuSUlZVFcXFxi1Ga9evXtxjNASA94obklJSU5Jd+19bWNluezY8cObJg+wXAQfBUcGgr06ZNiwkTJkRFRUVUVlZGdXV1fhn4lClT/NEhEe+++27861//appfuXJlvPTSS3H44YdH3759C7pvFJZLwUn6Jn4//vGPY+3atVFeXh533HFHfPGLXyz0bgGt5Nlnn40zzjijxfJLL700Fi1a5O98EBM3AEBSnHMDACRF3AAASRE3AEBSxA0AkBRxAwAkRdwAAEkRNwBAUsQNAJAUcQMk4cYbb4wrrrhiv9adPn16XHfddW2+T0BhuEMx0GGde+658cEHH8RvfvObFq+98MIL+YNQly9fHkcffXQcd9xx8de//jX69+//oe+bPSH+2GOPzdcfMGBAG+09UChGboAOa9KkSfG73/0u3njjjRav3X///XHKKafEqaeeGvfdd1/+gNT9CZtMz549o6qqKhYuXNgGew0UmrgBOqwvf/nLeYjs/hDE999/P2pqavL4yTz88MNx3nnnNVvn0UcfjZNOOim6du0aRxxxRJx11lnx3nvvNb2erb9kyZJ2+k2A9iRugA6rc+fOMXHixDxuGhsbm5Y/8sgj0dDQEN/85jfjnXfeiZdffjkqKiqaXs+eBH/RRRfF5ZdfHq+++mr+9Ojzzz+/2XucdtppUV9fv8dRIeDAJm6ADi0LlFWrVuWBsushqSxWPvOZz+RxkkVL7969m8XNtm3b8nWyQ1XZCM5VV10Vn/rUp5rWyc7TyWTvDaRF3AAd2gknnJCfOJwFTeb111+Purq6PHoy2QnHmdLS0qZthgwZEqNHj86j5utf/3r8/Oc/z0d4dpUdrtp5iAtIi7gBOrzs3JrHHnssNm/eHA888ED069cvj5dMWVlZ/u+u8VJcXBy1tbXx61//Ok488cT46U9/GgMHDoyVK1c2rfP222/n//bo0aPdfx+gbYkboMO78MIL82B56KGH4he/+EV861vfiqKiovy17JLubt26xSuvvNJsm+z1008/PW655ZZYsWJFlJSUxBNPPNH0enaeziGHHBKDBw9u998HaFud2/j9AT6x7FyZ8ePHxw033BCbNm2Kyy67rOm1Tp065VdCPf/88zFu3Lh82Z/+9Kf47W9/m1/unV1tlc2/9dZbMWjQoKbtskNbo0aNajo8BaTDyA1wwByayg49ZSHTt2/fZq9ldybOLgffsWNHPp+N5Dz33HMxduzYOP744+MHP/hB3H777TFmzJimbbLLwL/97W+3++8BtD13KAYOeNnVUiNGjIipU6fml4B/mKeeeiq+//3v53cozi43B9Ji5AY44GXn11RXV+eXf++P7GZ+2YnJwgbSZOQGAEiKkRsAICniBgBIirgBAJIibgCApIgbACAp4gYASIq4AQCSIm4AgKSIGwAgUvL/4b5it8dx5xQAAAAASUVORK5CYII=",
      "text/plain": [
       "<Figure size 640x480 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "plt.bar([\"0\", \"1\"], [0.5, 0.5])\n",
    "plt.xlabel(\"V(s)\")\n",
    "plt.ylabel(\"P(s)\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "2cafcded"
   },
   "source": [
    "**Exercise:** Draw the PDF for a loaded **five**-sided die that comes up 1 with probability 0.6 and has an equal chance of each of the remaining four faces."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "59d1dd17"
   },
   "source": [
    "### Statistics\n",
    "\n",
    "We can think of a set of data as the outcome of one or more experiments; statistics gives us tools to describe the properties of the data and, eventually, to estimate how the underlying experiment behaves. We'll start with **summary statistics**, which provide aggregate descriptions of a data set. For now, let's assume we have a single numerical column of a table - say, the Height column of a dataset of measurements of people.\n",
    "\n",
    "#### Histograms\n",
    "**Histograms** are the statistical equivalent of probability density functions. They show the observed frequency of each outcome. For example, here's the histogram describing the result of flipping a coin 10,000 times."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 283
    },
    "executionInfo": {
     "elapsed": 327,
     "status": "ok",
     "timestamp": 1673461780118,
     "user": {
      "displayName": "Scott Wehrwein",
      "userId": "11327482518794216604"
     },
     "user_tz": 480
    },
    "id": "d2cc3081",
    "outputId": "f3ff3668-06ef-4a5b-ab06-bf63df5c16ca",
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<BarContainer object of 2 artists>"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAjEAAAGdCAYAAADjWSL8AAAAOnRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjEwLjgsIGh0dHBzOi8vbWF0cGxvdGxpYi5vcmcvwVt1zgAAAAlwSFlzAAAPYQAAD2EBqD+naQAAG+pJREFUeJzt3X2slnX9wPHPiSeBgHgQiIlGizEMdAWNh6VgPE8kZxsWjdEiwDCIAUOJP8LWQGkBNRZDc2EI4T9hLY3AlSjjUZIlhCwXFkyeNB6NAeH57Xtt9z3OgR96ED18z3m9tmvnvq/7e+5zHdwt732v63tRUVlZWRkAAJn5RG0fAADA1RAxAECWRAwAkCURAwBkScQAAFkSMQBAlkQMAJAlEQMAZKlh1FHvvfdevPXWW9GiRYuoqKio7cMBAD6AdA/eU6dORadOneITn/hE/YyYFDCdO3eu7cMAAK7C/v3746abbqqfEZNmYEp/CC1btqztwwEAPoCTJ08WkxClv8frZcSUTiGlgBExAJCXD3IpiAt7AYAsiRgAIEsiBgDIkogBALIkYgCALIkYACBLIgYAyJKIAQCyJGIAgCyJGAAgSyIGAMiSiAEAsiRiAIAsiRgAIEsNa/sAcvWZh5+r7UOA69abj95d24cA1AM1momZO3duVFRUVNk6duxYfr2ysrIY06lTp2jatGkMHDgwdu/eXeU9zp49G1OmTIl27dpF8+bNY9SoUXHgwIEqY44dOxZjx46NVq1aFVt6fPz48Q/7uwIA9fl00uc///k4ePBgeXvttdfKry1YsCAWLlwYS5Ysie3btxeBM2TIkDh16lR5zLRp02LNmjWxevXq2LhxY5w+fTpGjhwZFy5cKI8ZM2ZM7Ny5M9auXVts6XEKGQCAqz6d1LBhwyqzLxfPwixevDjmzJkT9913X7Hvqaeeig4dOsSqVati0qRJceLEiXjyySdjxYoVMXjw4GLM008/HZ07d44XXnghhg0bFnv27CnCZcuWLdGnT59izBNPPBH9+vWLvXv3Rrdu3Wp6yABAHVTjmZh//OMfxemiLl26xNe//vX45z//Wezft29fHDp0KIYOHVoe26RJkxgwYEBs2rSpeL5jx444f/58lTHpvXr06FEes3nz5uIUUilgkr59+xb7SmMuJ52mOnnyZJUNAKi7ahQxKSx+/etfx5/+9KdidiRFS//+/eOdd94pHidp5uVi6XnptfS1cePG0bp16yuOad++/SU/O+0rjbmc+fPnl6+hSVua3QEA6q4aRcyIESPia1/7WvTs2bM4HfTcc8+VTxuVpIt9q59mqr6vuupjLjf+/d5n9uzZxemq0rZ///6a/GoAQH26T0xaXZSCJp1iKl0nU3225MiRI+XZmTTm3LlzxeqjK405fPjwJT/r6NGjl8zyXCydumrZsmWVDQCouz5UxKTrUNKFuJ/+9KeLa2RSgKxfv778egqWDRs2FKeckl69ekWjRo2qjEkrnHbt2lUeky7gTTMp27ZtK4/ZunVrsa80BgCgRquTZs6cGffcc0/cfPPNxezJj3/84+IC2nHjxhWnetLy6Xnz5kXXrl2LLT1u1qxZsWQ6SdeqjB8/PmbMmBFt27aNNm3aFO9ZOj2VdO/ePYYPHx4TJkyIZcuWFfsmTpxYLMO2MgkAuKqISTel+8Y3vhFvv/123HjjjcWqobQU+pZbbilenzVrVpw5cyYmT55cnDJKFwKvW7cuWrRoUX6PRYsWFcu0R48eXYwdNGhQLF++PBo0aFAes3Llypg6dWp5FVO6IV669wzAx8mdueH6vjt3RWW6YrYOSjNEaeYnnYb6KK6P8T83uH7/x3at+JzDx/9Zr8nf3/4BSAAgSyIGAMiSiAEAsiRiAIAsiRgAIEsiBgDIkogBALIkYgCALIkYACBLIgYAyJKIAQCyJGIAgCyJGAAgSyIGAMiSiAEAsiRiAIAsiRgAIEsiBgDIkogB
ALIkYgCALIkYACBLIgYAyJKIAQCyJGIAgCyJGAAgSyIGAMiSiAEAsiRiAIAsiRgAIEsiBgDIkogBALIkYgCALIkYACBLIgYAyJKIAQCyJGIAgCyJGAAgSyIGAMiSiAEAsiRiAIAsiRgAIEsiBgDIkogBALIkYgCALIkYACBLIgYAyJKIAQCyJGIAgCyJGAAgSyIGAMiSiAEAsiRiAIAsiRgAIEsiBgDIkogBALIkYgCALIkYACBLIgYAqH8RM3/+/KioqIhp06aV91VWVsbcuXOjU6dO0bRp0xg4cGDs3r27yvedPXs2pkyZEu3atYvmzZvHqFGj4sCBA1XGHDt2LMaOHRutWrUqtvT4+PHjH+ZwAYA65KojZvv27fH444/HbbfdVmX/ggULYuHChbFkyZJiTMeOHWPIkCFx6tSp8pgUPWvWrInVq1fHxo0b4/Tp0zFy5Mi4cOFCecyYMWNi586dsXbt2mJLj1PIAABcdcSk6PjmN78ZTzzxRLRu3brKLMzixYtjzpw5cd9990WPHj3iqaeeiv/+97+xatWqYsyJEyfiySefjJ/+9KcxePDg+MIXvhBPP/10vPbaa/HCCy8UY/bs2VOEyy9/+cvo169fsaWf9Yc//CH27t3rvxwAcHUR8+CDD8bdd99dRMjF9u3bF4cOHYqhQ4eW9zVp0iQGDBgQmzZtKp7v2LEjzp8/X2VMOvWUgqc0ZvPmzcUppD59+pTH9O3bt9hXGlNdOkV18uTJKhsAUHc1rOk3pFNAf/3rX4tTRdWlgEk6dOhQZX96/q9//as8pnHjxlVmcEpjSt+fvrZv3/6S90/7SmMud33OI488UtNfBwCoDzMx+/fvj+9///vF6Z8bbrjh/x2XLva9WDrNVH1fddXHXG78ld5n9uzZxamq0paOFQCou2oUMelU0JEjR6JXr17RsGHDYtuwYUP8/Oc/Lx6XZmCqz5ak7ym9li70PXfuXLH66EpjDh8+fMnPP3r06CWzPBeftmrZsmWVDQCou2oUMYMGDSouwE0rhUpb7969i4t80+PPfvazRYCsX7++/D0pWFLo9O/fv3ieAqhRo0ZVxhw8eDB27dpVHpMu5E2zKdu2bSuP2bp1a7GvNAYAqN9qdE1MixYtigtwL5bu89K2bdvy/rR8et68edG1a9diS4+bNWtWLJlO0sW548ePjxkzZhTf16ZNm5g5c2b07NmzfKFw9+7dY/jw4TFhwoRYtmxZsW/ixInFMuxu3bpdq98dAKhPF/a+n1mzZsWZM2di8uTJxSmjtMJo3bp1RQCVLFq0qDj9NHr06GJsmuFZvnx5NGjQoDxm5cqVMXXq1PIqpnRDvHTvGQCApKIyXS1bB6Ul1mnWJ52C+iiuj/nMw89d8/eEuuLNR++OusDnHD7+z3pN/v72bycBAFkSMQBAlkQMAJAlEQMAZEnEAABZEjEAQJZEDACQJREDAGRJxAAAWRIxAECWRAwAkCURAwBkScQAAFkSMQBAlkQMAJAlEQMAZEnEAABZEjEAQJZEDACQJREDAGRJxAAAWRIxAECWRAwAkCURAwBkScQAAFkSMQBAlkQMAJAlEQMAZEnEAABZEjEAQJZEDACQJREDAGRJxAAAWRIxAECWRAwAkCURAwBkScQAAFkSMQBAlkQMAJAlEQMAZEnEAABZEjEAQJZEDACQJREDAGRJxAAAWRIxAECWRAwAkCURAwBkScQAAFkSMQBAlkQMAJAlEQMAZEnEAABZEjEAQJZEDACQJREDAGRJxAAAWRIxAECWRAwAUPcjZunSpXHbbbdFy5Yti61fv37xxz/+sfx6ZWVlzJ07Nzp16hRNmzaNgQMHxu7du6u8x9mzZ2PKlCnRrl27aN68eYwaNSoOHDhQZcyxY8di7Nix0apVq2JLj48fP/5hf1cAoL5GzE033RSPPvpovPLKK8X2la98Jb761a+WQ2XBggWxcOHCWLJkSWzfvj06duwYQ4YMiVOnTpXfY9q0abFm
zZpYvXp1bNy4MU6fPh0jR46MCxculMeMGTMmdu7cGWvXri229DiFDABASUVlmj75ENq0aRM/+clP4tvf/nYxA5Mi5aGHHirPunTo0CEee+yxmDRpUpw4cSJuvPHGWLFiRdx///3FmLfeeis6d+4czz//fAwbNiz27NkTt956a2zZsiX69OlTjEmP06zP66+/Ht26dftAx3Xy5MliFif9zDRrdK195uHnrvl7Ql3x5qN3R13gcw4f/2e9Jn9/X/U1MWnmJM2mvPvuu0Vg7Nu3Lw4dOhRDhw4tj2nSpEkMGDAgNm3aVDzfsWNHnD9/vsqYFD49evQoj9m8eXNx8KWASfr27VvsK425nBRM6Re/eAMA6q4aR8xrr70Wn/zkJ4tAeeCBB4pTQ2nmJAVMkmZeLpael15LXxs3bhytW7e+4pj27dtf8nPTvtKYy5k/f375Gpq0pdkdAKDuqnHEpNM56RqVdIrnu9/9bowbNy7+/ve/l1+vqKioMj6draq+r7rqYy43/v3eZ/bs2cXUU2nbv39/DX8zAKBOR0yaSfnc5z4XvXv3LmY/br/99vjZz35WXMSbVJ8tOXLkSHl2Jo05d+5csfroSmMOHz58yc89evToJbM8F0szQ6VVU6UNAKi7PvR9YtIMSboepUuXLkWArF+/vvxaCpYNGzZE//79i+e9evWKRo0aVRlz8ODB2LVrV3lMur4mzaRs27atPGbr1q3FvtIYAICGNfkj+MEPfhAjRoworjdJy6bThb0vvvhisQw6nepJK5PmzZsXXbt2Lbb0uFmzZsWS6SRdqzJ+/PiYMWNGtG3btljZNHPmzOjZs2cMHjy4GNO9e/cYPnx4TJgwIZYtW1bsmzhxYrEM+4OuTAIA6r4aRUw6zZPu15JmT1KQpBvfpYBJ94JJZs2aFWfOnInJkycXp4zSCqN169ZFixYtyu+xaNGiaNiwYYwePboYO2jQoFi+fHk0aNCgPGblypUxderU8iqmdEO8dO8ZAIBrdp+Y65X7xEDtcZ8YqB/ezPU+MQAAtUnEAABZEjEAQJZEDACQJREDAGRJxAAAWRIxAECWRAwAkCURAwBkScQAAFkSMQBAlkQMAJAlEQMAZEnEAABZEjEAQJZEDACQJREDAGRJxAAAWRIxAECWRAwAkCURAwBkScQAAFkSMQBAlkQMAJAlEQMAZEnEAABZEjEAQJZEDACQJREDAGRJxAAAWRIxAECWRAwAkCURAwBkScQAAFkSMQBAlkQMAJAlEQMAZEnEAABZEjEAQJZEDACQJREDAGRJxAAAWRIxAECWRAwAkCURAwBkScQAAFkSMQBAlkQMAJAlEQMAZEnEAABZEjEAQJZEDACQJREDAGRJxAAAWRIxAECWRAwAkCURAwBkScQAAHU/YubPnx9f+tKXokWLFtG+ffu49957Y+/evVXGVFZWxty5c6NTp07RtGnTGDhwYOzevbvKmLNnz8aUKVOiXbt20bx58xg1alQcOHCgyphjx47F2LFjo1WrVsWWHh8/fvzD/K4AQH2NmA0bNsSDDz4YW7ZsifXr18f//ve/GDp0aLz77rvlMQsWLIiFCxfGkiVLYvv27dGxY8cYMmRInDp1qjxm2rRpsWbNmli9enVs3LgxTp8+HSNHjowLFy6Ux4wZMyZ27twZa9euLbb0OIUMAEBSUZmmTq7S0aNHixmZFDd33nlnMQuTZmBSpDz00EPlWZcOHTrEY489FpMmTYoTJ07EjTfeGCtWrIj777+/GPPWW29F586d4/nnn49hw4bFnj174tZbby1iqU+fPsWY9Lhfv37x+uuvR7du3d732E6ePFnM4KSf17Jly2v+X/szDz93zd8T6oo3H7076gKfc/j4P+s1+fv7Q10Tk35A0qZNm+Lrvn374tChQ8XsTEmTJk1iwIABsWnTpuL5jh074vz581XGpPDp0aNHeczmzZuLX6AUMEnfvn2LfaUx1aVYSr/4xRsAUHdddcSkWZfp06fH
l7/85SJAkhQwSZp5uVh6XnotfW3cuHG0bt36imPSDE91aV9pzOWu1yldP5O2NLMDANRdVx0x3/ve9+Jvf/tb/OY3v7nktYqKikuCp/q+6qqPudz4K73P7Nmzi5mh0rZ///4a/DYAQL2ImLSy6Pe//3385S9/iZtuuqm8P13Em1SfLTly5Eh5diaNOXfuXLH66EpjDh8+fNlrcKrP8lx82iqdO7t4AwDqrhpFTJoJSTMwv/3tb+PPf/5zdOnSpcrr6XkKkLRyqSQFS7rwt3///sXzXr16RaNGjaqMOXjwYOzatas8Jl3Am2ZTtm3bVh6zdevWYl9pDABQvzWsyeC0vHrVqlXxu9/9rrhXTGnGJV2Dku4Jk071pJVJ8+bNi65duxZbetysWbNiyXRp7Pjx42PGjBnRtm3b4qLgmTNnRs+ePWPw4MHFmO7du8fw4cNjwoQJsWzZsmLfxIkTi2XYH2RlEgBQ99UoYpYuXVp8TTewu9ivfvWr+Na3vlU8njVrVpw5cyYmT55cnDJKK4zWrVtXRE/JokWLomHDhjF69Ohi7KBBg2L58uXRoEGD8piVK1fG1KlTy6uY0g3x0r1nAAA+9H1irmfuEwO1x31ioH54M+f7xAAA1BYRAwBkScQAAFkSMQBAlkQMAJAlEQMAZEnEAABZEjEAQJZEDACQJREDAGRJxAAAWRIxAECWRAwAkCURAwBkScQAAFkSMQBAlkQMAJAlEQMAZEnEAABZEjEAQJZEDACQJREDAGRJxAAAWRIxAECWRAwAkCURAwBkScQAAFkSMQBAlkQMAJAlEQMAZEnEAABZEjEAQJZEDACQJREDAGRJxAAAWRIxAECWRAwAkCURAwBkScQAAFkSMQBAlkQMAJAlEQMAZEnEAABZEjEAQJZEDACQJREDAGRJxAAAWRIxAECWRAwAkCURAwBkScQAAFkSMQBAlkQMAJAlEQMAZEnEAABZEjEAQJZEDACQJREDAGRJxAAA9SNiXnrppbjnnnuiU6dOUVFREc8++2yV1ysrK2Pu3LnF602bNo2BAwfG7t27q4w5e/ZsTJkyJdq1axfNmzePUaNGxYEDB6qMOXbsWIwdOzZatWpVbOnx8ePHr/b3BADqe8S8++67cfvtt8eSJUsu+/qCBQti4cKFxevbt2+Pjh07xpAhQ+LUqVPlMdOmTYs1a9bE6tWrY+PGjXH69OkYOXJkXLhwoTxmzJgxsXPnzli7dm2xpccpZAAAkoY1/WMYMWJEsV1OmoVZvHhxzJkzJ+67775i31NPPRUdOnSIVatWxaRJk+LEiRPx5JNPxooVK2Lw4MHFmKeffjo6d+4cL7zwQgwbNiz27NlThMuWLVuiT58+xZgnnngi+vXrF3v37o1u3br5rwcA9dw1vSZm3759cejQoRg6dGh5X5MmTWLAgAGxadOm4vmOHTvi/PnzVcakU089evQoj9m8eXNxCqkUMEnfvn2LfaUx1aVTVCdPnqyyAQB11zWNmBQwSZp5uVh6XnotfW3cuHG0bt36imPat29/yfunfaUx1c2fP798/Uza0swOAFB3fSSrk9IFv9VPM1XfV131MZcbf6X3mT17dnGqqrTt37//qo8fAKhnEZMu4k2qz5YcOXKkPDuTxpw7d65YfXSlMYcPH77k/Y8ePXrJLM/Fp61atmxZZQMA6q5rGjFdunQpAmT9+vXlfSlYNmzYEP379y+e9+rVKxo1alRlzMGDB2PXrl3lMekC3jSbsm3btvKYrVu3FvtKYwCA+q3Gq5PScug33nijysW8aflzmzZt4uabby6WT8+bNy+6du1abOlxs2bNiiXTSbpeZfz48TFjxoxo27Zt8X0zZ86Mnj17llcrde/ePYYPHx4TJkyIZcuWFfsmTpxYLMO2MgkAuKqIeeWVV+Kuu+4qP58+fXrxddy4cbF8+fKYNWtWnDlzJiZPnlycMkorjNatWxctWrQof8+iRYuiYcOGMXr06GLsoEGDiu9t0KBB
eczKlStj6tSp5VVM6YZ4/9+9aQCA+qeiMl0tWwelJdZp1iedgvooro/5zMPPXfP3hLrizUfvjrrA5xw+/s96Tf7+9m8nAQBZEjEAQJZEDACQJREDAGRJxAAAWRIxAECWRAwAkCURAwBkScQAAFkSMQBAlkQMAJAlEQMAZEnEAABZEjEAQJZEDACQJREDAGRJxAAAWRIxAECWRAwAkCURAwBkScQAAFkSMQBAlkQMAJAlEQMAZEnEAABZEjEAQJZEDACQJREDAGRJxAAAWRIxAECWRAwAkCURAwBkScQAAFkSMQBAlkQMAJAlEQMAZEnEAABZEjEAQJZEDACQJREDAGRJxAAAWRIxAECWRAwAkCURAwBkScQAAFkSMQBAlkQMAJAlEQMAZEnEAABZEjEAQJZEDACQJREDAGRJxAAAWRIxAECWRAwAkCURAwBkScQAAFkSMQBAlq77iPnFL34RXbp0iRtuuCF69eoVL7/8cm0fEgBwHbiuI+aZZ56JadOmxZw5c+LVV1+NO+64I0aMGBH//ve/a/vQAIBadl1HzMKFC2P8+PHxne98J7p37x6LFy+Ozp07x9KlS2v70ACAWtYwrlPnzp2LHTt2xMMPP1xl/9ChQ2PTpk2XjD979myxlZw4caL4evLkyY/k+N47+9+P5H2hLvioPncfN59z+Pg/66X3rKyszDdi3n777bhw4UJ06NChyv70/NChQ5eMnz9/fjzyyCOX7E8zN8DHq9Vif+JQH7T6CD/rp06dilatWuUZMSUVFRVVnqcyq74vmT17dkyfPr38/L333ov//Oc/0bZt28uOp+5I1Z5idf/+/dGyZcvaPhzgI+BzXn9UVlYWAdOpU6f3HXvdRky7du2iQYMGl8y6HDly5JLZmaRJkybFdrFPfepTH/lxcv1IASNioG7zOa8fWr3PDMx1f2Fv48aNiyXV69evr7I/Pe/fv3+tHRcAcH24bmdiknR6aOzYsdG7d+/o169fPP7448Xy6gceeKC2Dw0AqGXXdcTcf//98c4778SPfvSjOHjwYPTo0SOef/75uOWWW2r70LiOpNOIP/zhDy85nQjUHT7nXE5F5QdZwwQAcJ25bq+JAQC4EhEDAGRJxAAAWRIxAECWRAxZ+8UvfhFdunSJG264obiv0Msvv1zbhwRcYy+99FLcc889xR1c0x3Yn332WX/GFEQM2XrmmWdi2rRpMWfOnHj11VfjjjvuiBEjRhT3EgLqjnfffTduv/32WLJkSW0fCtcZS6zJVp8+feKLX/xiLF26tLyve/fuce+99xb/IChQ96SZmDVr1hSfczATQ5bOnTsXO3bsiKFDh1bZn55v2rSp1o4LgI+PiCFLb7/9dly4cOGSfww0Pa/+j4YCUDeJGLKfWr5YugF19X0A1E0ihiy1a9cuGjRocMmsy5EjRy6ZnQGgbhIxZKlx48bFkur169dX2Z+e9+/fv9aOC4CPz3X9r1jDlUyfPj3Gjh0bvXv3jn79+sXjjz9eLK9+4IEH/MFBHXL69Ol44403ys/37dsXO3fujDZt2sTNN99cq8dG7bLEmuxvdrdgwYI4ePBg9OjRIxYtWhR33nlnbR8WcA29+OKLcdddd12yf9y4cbF8+XJ/1vWYiAEAsuSaGAAgSyIGAMiSiAEAsiRiAIAsiRgAIEsiBgDIkogBALIkYgCALIkYACBLIgYAyJKIAQCyJGIAgMjR/wEnFWKpJBNPMgAAAABJRU5ErkJggg==",
      "text/plain": [
       "<Figure size 640x480 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "import random\n",
    "\n",
    "# Simulate N flips of a fair coin\n",
    "N = 10000\n",
    "outcomes = []\n",
    "for i in range(N):\n",
    "    outcomes.append(random.choice((\"H\", \"T\")))\n",
    "#print(outcomes)\n",
    "\n",
    "# Count how many times each outcome was observed\n",
    "n_heads = 0\n",
    "n_tails = 0\n",
    "for out in outcomes:\n",
    "    if out == \"H\":\n",
    "        n_heads += 1\n",
    "    elif out == \"T\":\n",
    "        n_tails += 1\n",
    "\n",
    "# Bars are labeled by the value of V: V(T) = 0, V(H) = 1\n",
    "plt.bar([\"0\", \"1\"], [n_tails, n_heads])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "b8f45dc6"
   },
   "source": [
    "The histogram is a direct analogue of the probability density function. In fact, we can convert it to an **empirical** PDF by dividing each count by the number of trials:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 283
    },
    "executionInfo": {
     "elapsed": 348,
     "status": "ok",
     "timestamp": 1673461881702,
     "user": {
      "displayName": "Scott Wehrwein",
      "userId": "11327482518794216604"
     },
     "user_tz": 480
    },
    "id": "fb8d6419",
    "outputId": "0936c7b8-0a83-4f22-a7da-194c3b11df15"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<BarContainer object of 2 artists>"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAiMAAAGdCAYAAADAAnMpAAAAOnRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjEwLjgsIGh0dHBzOi8vbWF0cGxvdGxpYi5vcmcvwVt1zgAAAAlwSFlzAAAPYQAAD2EBqD+naQAAFZdJREFUeJzt3W1oXvX9+PFPmi5NGWuGxlbF2HZjc5nZXG2gJqMO5xaJIgiDlclax1pmUDey4IN2hal9Uhmu1jETLbuRsilh6PbEwsyDObNlTwwRBrvBDV1Dly5LB031B8lM8+cc/g2mSbumjX7aK68XHNrzzTlXTgIXefM9N1fV9PT0dAAAJFmW9Y0BAApiBABIJUYAgFRiBABIJUYAgFRiBABIJUYAgFRiBABItTwuASdPnox//vOf8aEPfSiqqqqyDwcAOAfFc1VPnDgRV199dSxbtuzSjpEiRBoaGrIPAwA4D8PDw3HNNddc2jFSzIic+mFWrVqVfTgAwDkYHx8vJxNO/R2/pGPk1KmZIkTECABcWv7XJRYuYAUAUokRACCVGAEAUokRACCVGAEAUokRACCVGAEAUokRACCVGAEAUokRACCVGAEAUokRACCVGAEAUokRACDV8txvD/D+WLfzRb9qOIM3H70jLrmZke7u7li/fn3U1tbGxo0bo7+//4zbvvzyy1FVVTVn+ctf/nIhxw0AVIgFx0hvb290dnbG7t27Y2hoKDZv3hzt7e1x+PDhs+7317/+NUZGRmaWj33sYxdy3ADAUo2Rffv2xfbt22PHjh3R2NgY+/fvj4aGhujp6TnrfqtXr44rr7xyZqmurr6Q4wYAlmKMTE5OxuDgYLS1tc0aL9YHBgbOuu+GDRviqquuiltvvTV+85vfnHXbiYmJGB8fn7UAAJVpQTEyNjYWU1NTsWbNmlnjxfrRo0fn3acIkAMHDsTzzz8fL7zwQlx33XVlkLzyyitn/D579+6Nurq6maWYeQEAKtN53U1TXID6btPT03PGTinio1hOaWlpieHh4Xjsscfi5ptvnnefXbt2RVdX18x6MTMiSACgMi1oZqS+vr681uP0WZDR0dE5syVnc9NNN8Xrr79+xq+vWLEiVq1aNWsBACrTgmKkpqamvJW3r69v1nix3traes6vU9yFU5y+AQBY8Gma4vTJ1q1bo7m5uTzlUlwPUtzW29HRMXOK5ciRI3Hw4MFyvbjbZt26dXH99deXF8D+7Gc/K68fKRYAgAXHyJYtW+LYsWOxZ8+e8nkhTU1NcejQoVi7dm359WLs3c8cKQLkwQcfLANl5cqVZZS8+OKLcfvtt18Uv31PZYSL+8mMQOWrmi6uPr3IFRewFnfVHD9+fNGvHxEjsDRixHsd3v/3+bn+/fZBeQBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAFx6MdLd3R3r16+P2tra2LhxY/T395/Tfr///e9j+fLl8ZnPfOZ8vi0AUIEWHCO9vb3R2dkZu3fvjqGhodi8eXO0t7fH4cOHz7rf8ePHY9u2bXHrrbdeyPECAEs9Rvbt2xfbt2+PHTt2RGNjY+zfvz8aGhqip6fnrPvde++9cffdd0dLS8uFHC8AsJRjZHJyMgYHB6OtrW3WeLE+MDBwxv1++tOfxt///vd46KGHzun7TExMxPj4
+KwFAKhMC4qRsbGxmJqaijVr1swaL9aPHj067z6vv/567Ny5M37+85+X14uci71790ZdXd3MUsy8AACV6bwuYK2qqpq1Pj09PWesUIRLcWrmkUceiY9//OPn/Pq7du0qrzE5tQwPD5/PYQIAl4Bzm6r4/+rr66O6unrOLMjo6Oic2ZLCiRMn4tVXXy0vdH3ggQfKsZMnT5bxUsySvPTSS/H5z39+zn4rVqwoFwCg8i1oZqSmpqa8lbevr2/WeLHe2to6Z/tVq1bFH//4x3jttddmlo6OjrjuuuvK/2/atOnCfwIAYOnMjBS6urpi69at0dzcXN4Zc+DAgfK23iIyTp1iOXLkSBw8eDCWLVsWTU1Ns/ZfvXp1+XyS08cBgKVpwTGyZcuWOHbsWOzZsydGRkbKqDh06FCsXbu2/Hox9r+eOQIAcErVdHEBx0WuuLW3uKumuJi1OPWzmNbtfHFRXw8qzZuP3hGVwHsd3v/3+bn+/fbZNABAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwBAKjECAKQSIwDApRcj3d3dsX79+qitrY2NGzdGf3//Gbf93e9+F5/97Gfj8ssvj5UrV8YnPvGJePzxxy/kmAGACrJ8oTv09vZGZ2dnGSRFZDz99NPR3t4ef/rTn+Laa6+ds/0HP/jBeOCBB+LTn/50+f8iTu69997y/9/4xjcW6+cAAJbKzMi+ffti+/btsWPHjmhsbIz9+/dHQ0ND9PT0zLv9hg0b4itf+Upcf/31sW7duvjqV78at91221lnUwCApWNBMTI5ORmDg4PR1tY2a7xYHxgYOKfXGBoaKrf93Oc+d8ZtJiYmYnx8fNYCAFSmBcXI2NhYTE1NxZo1a2aNF+tHjx49677XXHNNrFixIpqbm+P+++8vZ1bOZO/evVFXVzezFDMvAEBlOq8LWKuqqmatT09Pzxk7XXFa5tVXX42nnnqqPLXz3HPPnXHbXbt2xfHjx2eW4eHh8zlMAKDSLmCtr6+P6urqObMgo6Ojc2ZLTlfcfVP41Kc+Ff/617/i4YcfLq8lmU8xg1IsAEDlW9DMSE1NTXkrb19f36zxYr21tfWcX6eYSSmuCwEAWPCtvV1dXbF169by2o+WlpY4cOBAHD58ODo6OmZOsRw5ciQOHjxYrj/55JPlLb/F80UKxa29jz32WHzzm9/02wcAFh4jW7ZsiWPHjsWePXtiZGQkmpqa4tChQ7F27dry68VYESennDx5sgyUN954I5YvXx4f/ehH49FHHy2fNQIAUDVdnDO5yBW39hZ31RQXs65atWpRX3vdzhcX9fWg0rz56B1RCbzX4f1/n5/r32+fTQMApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAMClFyPd3d2x
fv36qK2tjY0bN0Z/f/8Zt33hhRfii1/8YlxxxRWxatWqaGlpiV//+tcXcswAwFKOkd7e3ujs7Izdu3fH0NBQbN68Odrb2+Pw4cPzbv/KK6+UMXLo0KEYHByMW265Je68885yXwCAqunp6emF/Bo2bdoUN954Y/T09MyMNTY2xl133RV79+49p9e4/vrrY8uWLfHd7373nLYfHx+Purq6OH78eDm7spjW7XxxUV8PKs2bj94RlcB7Hd7/9/m5/v1e0MzI5ORkObvR1tY2a7xYHxgYOKfXOHnyZJw4cSIuu+yyM24zMTFR/gDvXgCAyrSgGBkbG4upqalYs2bNrPFi/ejRo+f0Gt///vfj7bffji9/+ctn3KaYYSlK6tTS0NCwkMMEACr9AtaqqqpZ68WZntPH5vPcc8/Fww8/XF53snr16jNut2vXrnJK59QyPDx8PocJAFwCli9k4/r6+qiurp4zCzI6OjpntuR0RYBs3749fvGLX8QXvvCFs267YsWKcgEAKt+CZkZqamrKW3n7+vpmjRfrra2tZ50R+drXvhbPPvts3HFHZVwMBwAkzIwUurq6YuvWrdHc3Fw+M+TAgQPlbb0dHR0zp1iOHDkSBw8enAmRbdu2xRNPPBE33XTTzKzKypUry+tBAIClbcExUtySe+zYsdizZ0+MjIxEU1NT+QyRtWvXll8vxt79zJGnn3463nnnnbj//vvL5ZR77rknnnnmmcX6OQCApRIjhfvuu69c5nN6YLz88svnd2QAwJLgs2kAgFRiBABIJUYAgFRiBABIJUYAgFRiBABIJUYAgFRiBABIJUYAgFRiBABIJUYAgFRiBABIJUYAgFRiBABIJUYAgFRiBABIJUYAgFRiBABIJUYAgFRiBABIJUYAgFRiBABIJUYAgFRiBABIJUYAgFRiBABIJUYAgFRiBABIJUYAgFRiBABIJUYAgFRiBABIJUYAgFRiBABIJUYAgFRiBABIJUYAgFRiBABIJUYAgFRiBABIJUYAgFRiBABIJUYAgFRiBABIJUYAgFRiBABIJUYAgFRiBABIJUYAgFRiBABIJUYAgFRiBABIJUYAgFRiBABIJUYAgFRiBABIJUYAgFRiBABIJUYAgEsvRrq7u2P9+vVRW1sbGzdujP7+/jNuOzIyEnfffXdcd911sWzZsujs7LyQ4wUAlnqM9Pb2lkGxe/fuGBoais2bN0d7e3scPnx43u0nJibiiiuuKLe/4YYbFuOYAYClHCP79u2L7du3x44dO6KxsTH2798fDQ0N0dPTM+/269atiyeeeCK2bdsWdXV1i3HMAMBSjZHJyckYHByMtra2WePF+sDAwKIdVDGbMj4+PmsBACrTgmJkbGwspqamYs2aNbPGi/WjR48u2kHt3bu3nEU5tRQzLwBAZTqvC1irqqpmrU9PT88ZuxC7du2K48ePzyzDw8OL9toAwMVl+UI2rq+vj+rq6jmzIKOjo3NmSy7EihUrygUAqHwLmhmpqakpb+Xt6+ubNV6st7a2LvaxAQBLwIJmRgpdXV2xdevWaG5ujpaWljhw4EB5W29HR8fMKZYjR47EwYMHZ/Z57bXXyn/feuut+Pe//12uF2HzyU9+cjF/FgBgKcTIli1b4tixY7Fnz57ygWZNTU1x6NChWLt2bfn1Yuz0Z45s2LBh5v/F3TjPPvtsuf2bb765GD8DALCUYqRw3333lct8nnnmmTljxQWuAADz8dk0AEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIj
AEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIApBIjAEAqMQIAXHox0t3dHevXr4/a2trYuHFj9Pf3n3X73/72t+V2xfYf+chH4qmnnjrf4wUAlnqM9Pb2RmdnZ+zevTuGhoZi8+bN0d7eHocPH553+zfeeCNuv/32crti++985zvxrW99K55//vnFOH4AYKnFyL59+2L79u2xY8eOaGxsjP3790dDQ0P09PTMu30xC3LttdeW2xXbF/t9/etfj8cee2wxjh8AuMQtX8jGk5OTMTg4GDt37pw13tbWFgMDA/Pu84c//KH8+rvddttt8eMf/zj++9//xgc+8IE5+0xMTJTLKcePHy//HR8fj8V2cuL/Fv01oZK8F++7DN7r8P6/z0+97vT09OLFyNjYWExNTcWaNWtmjRfrR48enXefYny+7d95553y9a666qo5++zduzceeeSROePFDAzw/qrb7zcOla7uPX6fnzhxIurq6hYnRk6pqqqatV4Uz+lj/2v7+cZP2bVrV3R1dc2snzx5Mv7zn//E5Zdfftbvw6WvqOgiOoeHh2PVqlXZhwO8B7zPl47p6ekyRK6++uqzbregGKmvr4/q6uo5syCjo6NzZj9OufLKK+fdfvny5WVczGfFihXl8m4f/vCHF3KoXOKKEBEjUNm8z5eGurPMiJzXBaw1NTXlLbp9fX2zxov11tbWefdpaWmZs/1LL70Uzc3N814vAgAsLQu+m6Y4ffKjH/0ofvKTn8Sf//zn+Pa3v13e1tvR0TFzimXbtm0z2xfj//jHP8r9iu2L/YqLVx988MHF/UkAgEvSgq8Z2bJlSxw7diz27NkTIyMj0dTUFIcOHYq1a9eWXy/G3v3MkeLhaMXXi2h58skny/NGP/jBD+JLX/rS4v4kVITi9NxDDz005zQdUDm8zzld1fT/ut8GAOA95LNpAIBUYgQASCVGAIBUYgQASCVGuGh0d3eXd1/V1taWz7Pp7+/PPiRgEb3yyitx5513lndVFk/T/tWvfuX3S0mMcFHo7e2Nzs7O2L17dwwNDcXmzZujvb191m3iwKXt7bffjhtuuCF++MMfZh8KFxm39nJR2LRpU9x4443R09MzM9bY2Bh33XVX+cGJQGUpZkZ++ctflu9xMDNCusnJyRgcHIy2trZZ48X6wMBA2nEB8P4QI6QbGxuLqampOR+2WKyf/iGLAFQeMcJFNW37bsXDgU8fA6DyiBHS1dfXR3V19ZxZkNHR0TmzJQBUHjFCupqamvJW3r6+vlnjxXpra2vacQFwkX5qL7wXurq6YuvWrdHc3BwtLS1x4MCB8rbejo4Ov3CoEG+99Vb87W9/m1l/44034rXXXovLLrssrr322tRjI5dbe7moHnr2ve99L0ZGRqKpqSkef/zxuPnmm7MPC1gkL7/8ctxyyy1zxu+555545pln/J6XMDECAKRyzQgAkEqMAACpxAgAkEqMAACpxAgAkEqMAACpxAgAkEqMAACpxAgAkEqMAACpxAgAkEqMAACR6f8BVx1lvUqsMO0AAAAASUVORK5CYII=",
      "text/plain": [
       "<Figure size 640x480 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "plt.bar([\"0\", \"1\"], [n_heads / N, n_tails / N])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "ef643c1b"
   },
   "source": [
    "Notice that only the $y$ axis scale changed. This is an estimate of the PDF based on the data we observed."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "a21ec033"
   },
   "source": [
    "### Summary Statistics\n",
    "\n",
    "Real-world experiments of interest are more complicated - more complicated processes, more complicated sets of outcomes, etc. It's often useful to summarize the salient properties of an observed distribution. For this, we use **summary statistics**.\n",
    "\n",
    "#### Central Tendency Measures\n",
    "These tell you something about where the data is \"centered\".\n",
    "\n",
    "**(Arithmetic) Mean**, aka \"average\": The sum of the values divided by the number of values: $$\\bar{a} = \\frac{1}{n} \\sum_{i=1}^n a_i$$\n",
    "\n",
    "This works well for data sets where there aren't many outliers; for example: the average height of a female American is 5 feet 4 inches.\n",
    "\n",
    "**Geometric Mean**: The $n$th root of the product of $n$ values: $$\\left(\\prod_{i=1}^n a_i\\right)^\\frac{1}{n}$$\n",
    "\n",
    "This is a weird one, and not as often applicable. If you have a single zero, the geometric mean is zero. But it's useful for measuring the central tendency of a collection of **ratios**.\n",
    "\n",
    "**Median**: The middle value - the element appearing exactly in the middle if the data were sorted. This is useful in the presence of **outliers** or more generally when the distribution is weirdly-shaped.\n",
    "\n",
    "**\\*-iles**\n",
    "\n",
    "These generalize the median to fractions other than one half. For example, the quartiles split the sorted data into four equal parts; together with the extremes they give five values: the minimum, the value that is larger than one quarter of the data (first quartile), the median, the value that is larger than three quarters of the data (third quartile), and the maximum.\n",
    "\n",
    "Common examples aside from quartiles include percentiles (divide the data into 100ths), deciles (10ths), and quintiles (fifths).\n",
    "\n",
    "#### Variability Measures\n",
    "These tell you something about the *spread* of the data, i.e., how far measurements tend to be from the center.\n",
    "\n",
    "**Standard Deviation** ($\\sigma$): The square root of the sum of squared differences between the elements and the mean, where the sum is divided by $n-1$ before taking the root: $$\\sqrt{\\frac{\\sum_{i=1}^n (a_i - \\bar{a})^2}{n-1}}$$\n",
    "\n",
    "**Variance**: the square of the Standard Deviation (i.e., same thing without the square root).\n",
    "\n",
    "Variance is easier to intuit: it's the average squared distance from the mean, with a small caveat that the sum is divided by $n-1$ rather than by $n$.\n",
    "\n"
   ]
  },
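  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A minimal sketch of the measures above, computed with NumPy on a small made-up sample. The values in `x` and `ratios` are illustrative assumptions, not course data. Note that `np.var` and `np.std` divide by $n$ unless `ddof=1` is passed, so `ddof=1` is needed to match the $n-1$ formulas above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "# Made-up sample with one large outlier (illustrative values only)\n",
    "x = np.array([2.0, 3.0, 3.0, 4.0, 5.0, 40.0])\n",
    "\n",
    "print('mean:', x.mean())          # pulled upward by the outlier\n",
    "print('median:', np.median(x))    # much less affected by the outlier\n",
    "print('min, Q1, median, Q3, max:', np.percentile(x, [0, 25, 50, 75, 100]))\n",
    "\n",
    "# ddof=1 divides by n-1, matching the formulas above\n",
    "print('variance:', x.var(ddof=1))\n",
    "print('std dev:', x.std(ddof=1))\n",
    "\n",
    "# Geometric mean of made-up growth ratios, computed via logs:\n",
    "# exp(mean(log(a_i))) equals the n-th root of the product\n",
    "ratios = np.array([1.10, 0.90, 1.25])\n",
    "print('geometric mean:', np.exp(np.log(ratios).mean()))"
   ]
  },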
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "58BsHyFIW1hV"
   },
   "source": [
    "## Summary Statistics in Pandas\n",
    "\n",
    "There are built-in functions that do all of the above for us. To demo, we'll use a dataset of body measurements from a sample of humans."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "executionInfo": {
     "elapsed": 372,
     "status": "ok",
     "timestamp": 1673462763291,
     "user": {
      "displayName": "Scott Wehrwein",
      "userId": "11327482518794216604"
     },
     "user_tz": 480
    },
    "id": "yrLC-Fy9XIUM"
   },
   "outputs": [],
   "source": [
    "import pandas as pd"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "zbAL8dxhXKnd"
   },
   "source": [
    "Load the data and do a little tidying:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "url = '/cluster/academic/DATA311/202620/NHANES/NHANES.csv'\n",
    "df = pd.read_csv(url).rename(\n",
    "   columns={\"SEQN\": \"SEQN\",\n",
    "            \"RIAGENDR\": \"Gender\", # 1 = M, 2 = F\n",
    "            \"RIDAGEYR\": \"Age\", # years\n",
    "            \"BMXWT\": \"Weight\", # kg\n",
    "            \"BMXHT\": \"Height\", # cm\n",
    "            \"BMXLEG\": \"Leg\", # cm\n",
    "            \"BMXARML\": \"Arm\", # cm\n",
    "            \"BMXARMC\": \"Arm Cir\", # cm\n",
    "            \"BMXWAIST\": \"Waist Cir\"} # cm\n",
    ")\n",
    "df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df[\"Height\"].plot.hist()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can see the names and datatypes of all the columns with `info`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df.info()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To compute a useful collection of summary statistics on each column, we can use `describe`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df.describe()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's poke around! Some ideas:\n",
    "* `mean`, `median`, `min`, `max` per column\n",
    "* `.plot.hist` per column\n",
    "* Perform some unit conversions\n",
    "* Extract a subset of rows meeting a criterion"
   ]
  },
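  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a starting point, here is a rough sketch of a few of these ideas on a tiny stand-in table. The frame `toy`, its values, and the `Height_in` column are made up for illustration; only the column names follow the renaming in the load cell above. The same calls should carry over to `df`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "# Toy stand-in for df (made-up values; column names match the renamed NHANES columns)\n",
    "toy = pd.DataFrame({\n",
    "    'Gender': [1, 2, 2, 1],                   # 1 = M, 2 = F\n",
    "    'Age': [34, 51, 8, 67],                   # years\n",
    "    'Height': [178.2, 162.5, 128.0, 170.1],   # cm\n",
    "})\n",
    "\n",
    "# Several summary statistics for several columns at once\n",
    "print(toy[['Age', 'Height']].agg(['mean', 'median', 'min', 'max']))\n",
    "\n",
    "# Unit conversion: centimeters to inches ('Height_in' is a hypothetical new column)\n",
    "toy['Height_in'] = toy['Height'] / 2.54\n",
    "\n",
    "# Extract the rows meeting a criterion using a boolean mask: adults only\n",
    "adults = toy[toy['Age'] >= 18]\n",
    "print(adults)"
   ]
  },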
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "colab": {
   "authorship_tag": "ABX9TyOVugDhab4OOBNTMJmjz+ao",
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
