DATA 311 - Lecture 20: Baselines Worksheet

Names:

Baselines

For each of the following prediction scenarios, come up with the strongest baseline you can think of that does not require any machine learning.

  1. Predict whether an email message is spam. The training data contains equal numbers of spam (positive) and non-spam (negative) examples.

  2. Your task is to predict whether an MRI scan shows a tumor or not. The training data contains 90% non-tumor images (negative examples) and 10% tumor images (positive examples).

  3. Given all weather measurements up to and including today, predict whether it will rain tomorrow.

  4. For the NHANES body measurement dataset, predict a person’s leg length given their height.

Regression Metrics

Suppose you are evaluating regression results on a validation set. Your model produces a prediction \(y_i^\mathrm{pred}\) for each datapoint \(i\), and the corresponding ground-truth label is \(y_i^\mathrm{true}\).

  1. Computing the average error over the whole validation set would look like \(\frac{1}{n}\sum_i \left(y_i^\mathrm{true} - y_i^\mathrm{pred}\right)\), where \(n\) is the number of datapoints. Why wouldn’t this be a good idea, and how would you fix it?

  2. What is the tradeoff between mean squared error (MSE) and mean absolute error (MAE) as a measure of regression performance on a dataset?

  3. The coefficient of determination is defined as \(1 - \frac{SS_\mathrm{res}}{SS_\mathrm{tot}}\), where:

    • the numerator is the “sum of squared residuals”: \(SS_\mathrm{res} = \sum_i \left(y_i^\mathrm{true} - y_i^\mathrm{pred}\right)^2\).
    • the denominator is the “total sum of squares”: \(SS_\mathrm{tot} = \sum_i \left(y_i^\mathrm{true} - \bar{y}\right)^2\), where \(\bar{y}\) is the mean of the true labels.
    1. What is the coefficient of determination if the predictions are perfect?

    2. What is the coefficient of determination if you use a regressor that predicts the mean label?

    3. What happens to the coefficient of determination if your predictions are worse than the mean?
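
To experiment with the coefficient of determination, the definition in question 3 can be sketched in a few lines of Python. This is only an illustrative helper, not part of the worksheet: the function name `r_squared` and the example numbers are made up, and both sums use squared differences, as the names “sum of squared residuals” and “total sum of squares” indicate.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (hypothetical helper)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # Mean of the true labels (y-bar in the definition).
    mean_true = sum(y_true) / len(y_true)
    # Sum of squared residuals: squared gaps between labels and predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: squared gaps between labels and their mean.
    # Note: this is zero when all labels are equal, and the ratio is then undefined.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Made-up validation labels and predictions, for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 1.5, 3.5, 3.5]
score = r_squared(y_true, y_pred)
print(score)
```

Swapping in other prediction lists is one way to check your answers to the sub-questions above.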

Classification Metrics

As a reminder, we can sort the predictions of a binary classifier into four categories: true positives, false positives, true negatives, and false negatives.

  1. Let TP be the number of true positives, and likewise FP, TN, and FN for the other three categories. Define accuracy in terms of these quantities.

For each of the following questions, your task is to game the metric: describe either a classification task or a classification strategy for which the given metric would not be a good measure of the model’s true performance. For the sake of example, imagine the classifier is a test that predicts whether a patient has cancer.

  1. Game it: when is accuracy not a good measure?

  2. Precision is how often you’re right when you say it’s positive: \(\frac{TP}{TP + FP}\). Game it.

  3. Recall is the fraction of the positive examples that you are right about: \(\frac{TP}{TP + FN}\). Game it.

  4. The precision for class \(c\) is \(\frac{\textrm{\# correctly labeled } c}{\textrm{\# labeled class } c}\), while the recall for class \(c\) is \(\frac{\textrm{\# correctly labeled } c}{\textrm{\# with true label } c}\). Given a confusion matrix, how would you calculate:

    1. The precision for a certain class?

    2. The recall for a certain class?
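
The binary precision and recall formulas given earlier in this section can be checked numerically with a short Python sketch. The function names and the counts below are invented for illustration and do not come from any real test or dataset.

```python
def precision(tp, fp):
    """Fraction of positive predictions that are correct: TP / (TP + FP)."""
    # Raises ZeroDivisionError if the classifier never predicts positive.
    return tp / (tp + fp)


def recall(tp, fn):
    """Fraction of truly positive examples that are found: TP / (TP + FN)."""
    # Raises ZeroDivisionError if the data contains no positive examples.
    return tp / (tp + fn)


# Made-up counts for a hypothetical cancer test on 1000 patients.
tp, fp, fn, tn = 30, 10, 20, 940
p = precision(tp, fp)
r = recall(tp, fn)
print(p, r)
```

Changing the counts to mimic a particular classification strategy is a quick way to see how each metric responds.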