Announcements
- New seats, new friends, as of Monday:
In [7]:
import random
random.seed(518)
datafolk = "Alli Keira Malik Erika Narina Sebastian Josh Dylan Haden Zach Maven Marcus Finnley".split()
random.shuffle(datafolk)
print(datafolk[:4])
print(datafolk[4:9])
print(datafolk[9:])
['Malik', 'Josh', 'Alli', 'Erika']
['Marcus', 'Keira', 'Maven', 'Dylan', 'Zach']
['Narina', 'Finnley', 'Haden', 'Sebastian']
Goals
- Know why and how to subdivide datasets into training, validation, and test sets
- Understand what hyperparameters are and how to tune them using a validation set
- Know how cross-validation works and why you might want to use it
Notes
Lecture notes for today's content:
- See the notebook from last lecture (L18)
- See the whiteboard notes from L18 and today (L19)
Hyperparameter Tuning and Regularization: Activity
0. Imports and Data Splits
In [8]:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_openml
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
data = fetch_openml('vehicle', version=1, as_frame=False, parser='auto')
X, y = data.data, data.target
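# Convert each feature to z-scores (zero mean, unit variance).
# Note: fitting the scaler on all of X, before splitting, lets validation and
# test statistics influence the scaling; fitting on X_train only avoids this.
X = StandardScaler().fit_transform(X)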
train_frac = 0.6
X_train, X_temp, y_train, y_temp = train_test_split(X, y, test_size=1-train_frac, random_state=311)
X_val, X_test, y_val, y_test = train_test_split(X_temp, y_temp, test_size=0.5, random_state=311)
print(f"Train: {len(X_train)}\n Val: {len(X_val)}\n Test: {len(X_test)}")
Train: 507
 Val: 169
 Test: 170
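A stricter variant of the preprocessing step fits the scaler on the training split only and then applies that same transformation to the validation and test splits. The sketch below is an illustration, not part of the original activity: it uses synthetic data from `make_classification` as a stand-in for the vehicle dataset, and the names `X_raw`, `y_demo`, `X_tr`, `X_va`, `X_te`, and `scaler` are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (an assumption; the activity uses the OpenML 'vehicle' dataset)
X_raw, y_demo = make_classification(
    n_samples=846, n_features=18, n_informative=8, random_state=0
)

# Split first: 60% train, then the remaining 40% in half for validation and test
X_tr, X_tmp, y_tr, y_tmp = train_test_split(X_raw, y_demo, test_size=0.4, random_state=311)
X_va, X_te, y_va, y_te = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=311)

# Fit the scaler on the training split only ...
scaler = StandardScaler().fit(X_tr)

# ... then reuse the training means and standard deviations everywhere
X_tr_s = scaler.transform(X_tr)
X_va_s = scaler.transform(X_va)
X_te_s = scaler.transform(X_te)

print(f"Train: {len(X_tr_s)}, Val: {len(X_va_s)}, Test: {len(X_te_s)}")
```

With this ordering, only the training features are guaranteed to have exactly zero mean and unit variance; the validation and test features are scaled by the training statistics, which is what a deployed model would see on new data.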
1. Train a family of models with different values of $C$
In [9]:
C_values = [0.001, 0.01, 0.1, 1, 10, 100, 1000]
train_accs, val_accs = [], []
for C in C_values:
    svm = SVC(kernel='linear', C=C)                 # make the classifier
    svm.fit(X_train, y_train)                       # train the classifier
    train_accs.append(svm.score(X_train, y_train))  # evaluate on the training set
    val_accs.append(svm.score(X_val, y_val))        # evaluate on the validation set
best_C = C_values[np.argmax(val_accs)]
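The loop above scores each value of $C$ on a single validation split. Cross-validation, mentioned in the goals, instead averages the score over several train/validation folds, which can make the choice of $C$ less sensitive to one particular split. The following is a minimal sketch under stated assumptions: it uses synthetic data rather than the vehicle dataset, a shorter hypothetical grid `cv_C_values`, and 5 folds; in practice the data passed to `cross_val_score` would be everything except the held-out test set.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the non-test portion of the data (an assumption)
X_cv, y_cv = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)

cv_C_values = [0.01, 0.1, 1, 10]  # hypothetical grid, shorter than the activity's
cv_means = []
for C in cv_C_values:
    # The pipeline refits the scaler inside each fold, so no fold sees
    # scaling statistics computed from its own validation portion
    model = make_pipeline(StandardScaler(), SVC(kernel='linear', C=C))
    scores = cross_val_score(model, X_cv, y_cv, cv=5)  # one accuracy per fold
    cv_means.append(scores.mean())

cv_best_C = cv_C_values[int(np.argmax(cv_means))]
print(f"Best C by 5-fold cross-validation: {cv_best_C}")
```

The trade-off is cost: each candidate value of $C$ is trained five times here instead of once.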
2. Plot the training and validation accuracy for each model
In [10]:
fig, ax = plt.subplots(figsize=(9, 5))
ax.plot(C_values, train_accs, marker='o', label='Training')
ax.plot(C_values, val_accs, marker='o', label='Validation')
ax.legend()
ax.set_xscale('log')
ax.set_xlabel('C', fontsize=12)
ax.set_ylabel('Accuracy', fontsize=12)
ax.set_title('SVM Hyperparameter Tuning', fontsize=13)
plt.tight_layout()
plt.show()
3. Evaluate the best model on the held-out test set
Use the best C chosen from the validation curve and compute the performance on the held-out test set.
Important: we should run this only when we are finished tweaking our modeling assumptions and hyperparameters. After we evaluate on the test set, it's no longer unseen!
In [11]:
final_model = SVC(kernel='linear', C=best_C)
final_model.fit(X_train, y_train)
test_acc = final_model.score(X_test, y_test)
val_acc = max(val_accs)
print(f"Validation accuracy (C={best_C}): {val_acc:.1%}")
print(f"Test accuracy (C={best_C}): {test_acc:.1%}")
Validation accuracy (C=10): 78.7%
Test accuracy (C=10): 83.5%