06_Overfitting and Regularization
Carpe Tu Black Whistle

Overfitting

The model can fit the training dataset perfectly, but it doesn't generalize well to unseen test examples.

Underfitting

or high bias, is when the form of our hypothesis function h maps poorly to the trend of the data.

It's usually caused by a hypothesis that is too simple or uses too few features.

Overfitting

or high variance, is caused by a hypothesis function that fits the available data but doesn’t generalize well to predict new data.

Two main options to address the issue of overfitting

Reduce the number of features

  • Manually select which features to keep.
  • Use a model selection algorithm

Regularization

  • Keep all the features, but reduce the magnitude of parameters.
  • Regularization works well when we have a lot of slightly useful features.

Regularization

Regularization can "shrink" some of the $\theta_j$ parameters in the hypothesis function.

$$J(\theta) = \frac{1}{2m}\left[\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2 + \lambda\sum_{j=1}^{n}\theta_j^2\right]$$

The $\lambda$, or lambda, is the regularization parameter.
It determines how much the costs of our theta parameters are inflated.

Note that using the above cost function with the extra summation, we can smooth the output of our hypothesis function to reduce overfitting. If lambda is chosen to be too large, it may smooth out the function too much and cause underfitting.
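The regularized cost can be sketched in NumPy. This is a minimal sketch (the function name `regularized_cost` and the variable layout are illustrative assumptions): the bias term $\theta_0$ is excluded from the penalty, so a larger `lam` inflates the cost of every other parameter.

```python
import numpy as np

def regularized_cost(theta, X, y, lam):
    """Regularized linear-regression cost J(theta).

    X is an (m, n+1) design matrix whose first column is all ones;
    theta is an (n+1,) parameter vector. The bias theta[0] is not penalized.
    """
    m = len(y)
    residuals = X @ theta - y
    penalty = lam * np.sum(theta[1:] ** 2)  # skip theta_0
    return (residuals @ residuals + penalty) / (2 * m)
```

With `lam = 0` this reduces to the ordinary squared-error cost; increasing `lam` raises the cost of large parameters and so pushes gradient descent toward smaller $\theta_j$.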

Regularized Linear Regression

Note: [8:43] It is said that $X$ is non-invertible if $m \le n$. The correct statement should be that $X$ is non-invertible if $m < n$, and may be non-invertible if $m = n$.

Gradient Descent

Repeat {

$$\theta_0 := \theta_0 - \alpha\,\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_0^{(i)}$$

$$\theta_j := \theta_j - \alpha\left[\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)} + \frac{\lambda}{m}\theta_j\right] \qquad j \in \{1, 2, \dots, n\}$$

}

The term $\frac{\lambda}{m}\theta_j$ performs regularization. With some manipulation, the update rule can also be represented as:

$$\theta_j := \theta_j\left(1 - \alpha\frac{\lambda}{m}\right) - \alpha\,\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)}$$

The term $1 - \alpha\frac{\lambda}{m}$ will always be less than 1.
Intuitively, you can see it as reducing the value of $\theta_j$ by some amount on every update. Notice that the second term is now exactly the same as it was before.
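One such update step can be sketched in NumPy (the function name `gradient_step` is an illustrative assumption, not from the course): the regularization term is added to the gradient for every parameter except $\theta_0$.

```python
import numpy as np

def gradient_step(theta, X, y, alpha, lam):
    """One regularized gradient-descent update for linear regression.

    X is an (m, n+1) design matrix with a leading column of ones.
    theta_0 is updated without the (lambda/m) * theta_j shrinkage term.
    """
    m = len(y)
    grad = X.T @ (X @ theta - y) / m  # unregularized gradient
    reg = (lam / m) * theta
    reg[0] = 0.0                      # do not regularize theta_0
    return theta - alpha * (grad + reg)
```

Calling this in a loop until convergence implements the "Repeat { ... }" block above; zeroing `reg[0]` is exactly the special-cased $\theta_0$ update.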

Normal Equation

To add in regularization, the equation is the same as our original, except that we add another term inside the parentheses:

$$\theta = \left(X^TX + \lambda \cdot L\right)^{-1} X^T y$$

where $L$ is an $(n+1)\times(n+1)$ matrix with a 0 in the top-left entry and 1s along the rest of the diagonal.

Remark

Recall that if $m < n$, then $X^TX$ is non-invertible. However, when we add the term $\lambda \cdot L$, then $X^TX + \lambda \cdot L$ becomes invertible,

as long as the parameter $\lambda$ is greater than 0.
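The regularized normal equation is a one-liner in NumPy. This sketch (the name `normal_equation` is an assumption) builds $L$ as an identity matrix with the top-left entry zeroed so the bias term is left unpenalized:

```python
import numpy as np

def normal_equation(X, y, lam):
    """Solve theta = (X^T X + lambda * L)^{-1} X^T y.

    L = diag(0, 1, ..., 1), so the bias theta_0 is not penalized.
    """
    n_plus_1 = X.shape[1]
    L = np.eye(n_plus_1)
    L[0, 0] = 0.0  # leave the bias term unpenalized
    return np.linalg.solve(X.T @ X + lam * L, X.T @ y)
```

Using `np.linalg.solve` rather than explicitly inverting the matrix is the standard, numerically safer way to compute this expression.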

Regularized Logistic Regression


Cost Function

cost function for logistic regression:

$$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log\left(h_\theta(x^{(i)})\right) + \left(1 - y^{(i)}\right)\log\left(1 - h_\theta(x^{(i)})\right)\right]$$

cost function for logistic regression with regularization:

$$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log\left(h_\theta(x^{(i)})\right) + \left(1 - y^{(i)}\right)\log\left(1 - h_\theta(x^{(i)})\right)\right] + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2$$

The second sum, $\sum_{j=1}^{n}\theta_j^2$, means to explicitly exclude the bias term, $\theta_0$. I.e. the $\theta$ vector is indexed from 0 to $n$ (holding $n+1$ values, $\theta_0$ through $\theta_n$), and this sum explicitly skips $\theta_0$ by running from 1 to $n$. Thus, when computing the equation, we should continuously update the two following equations:

$$\theta_0 := \theta_0 - \alpha\,\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_0^{(i)}$$

$$\theta_j := \theta_j - \alpha\left[\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)} + \frac{\lambda}{m}\theta_j\right] \qquad j \in \{1, 2, \dots, n\}$$
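The regularized logistic-regression update has the same shape as the linear one, only with the sigmoid hypothesis. A minimal sketch (the names `sigmoid` and `logistic_gradient_step` are illustrative assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_gradient_step(theta, X, y, alpha, lam):
    """One regularized gradient-descent update for logistic regression.

    Same form as the linear-regression update, but with
    h_theta(x) = sigmoid(theta^T x); theta_0 is not shrunk.
    """
    m = len(y)
    grad = X.T @ (sigmoid(X @ theta) - y) / m
    reg = (lam / m) * theta
    reg[0] = 0.0  # skip the bias term, matching the sum from j = 1 to n
    return theta - alpha * (grad + reg)
```

Note that only the hypothesis changed: because the regularized cost adds the same $\frac{\lambda}{2m}\sum\theta_j^2$ penalty, the gradient picks up the same $\frac{\lambda}{m}\theta_j$ term as in linear regression.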