Overfitting
The model can fit the training dataset perfectly, but cannot generalize well to test examples.
Underfitting
or high bias, is when the form of our hypothesis function h maps poorly to the trend of the data.
It is usually caused by a hypothesis function that is too simple or uses too few features.
Overfitting
or high variance, is caused by a hypothesis function that fits the available data but doesn’t generalize well to predict new data.
Two main options to address the issue of overfitting
Reduce the number of features
- Manually select which features to keep.
- Use a model selection algorithm.
Regularization
- Keep all the features, but reduce the magnitude of the parameters $\theta_j$.
- Regularization works well when we have a lot of slightly useful features.
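As a rough illustration of both options, here is a self-contained NumPy sketch (the data, the polynomial degrees, and the $\lambda$ value are all fabricated for the demo) that fits noisy samples of a sine curve with polynomials:

```python
import numpy as np

rng = np.random.default_rng(0)
x_train = np.linspace(0, 1, 12)
y_train = np.sin(2 * np.pi * x_train) + 0.1 * rng.standard_normal(x_train.size)
x_test = np.linspace(0, 1, 100)
y_test = np.sin(2 * np.pi * x_test)

def design(x, degree):
    # Polynomial features: column j holds x**j, so column 0 is the bias term.
    return np.vander(x, degree + 1, increasing=True)

def fit(X, y, lam=0.0):
    if lam == 0.0:
        # Plain least squares (SVD-based, tolerates the ill-conditioned design).
        return np.linalg.lstsq(X, y, rcond=None)[0]
    L = np.eye(X.shape[1])
    L[0, 0] = 0.0  # do not penalize the bias column
    return np.linalg.solve(X.T @ X + lam * L, X.T @ y)

def test_mse(theta, degree):
    return np.mean((design(x_test, degree) @ theta - y_test) ** 2)

# No remedy: degree 11 passes through all 12 training points, generalizes poorly.
print("degree 11, lam=0:   ", test_mse(fit(design(x_train, 11), y_train), 11))
# Option 1: reduce the number of features (a degree-3 polynomial).
print("degree 3,  lam=0:   ", test_mse(fit(design(x_train, 3), y_train), 3))
# Option 2: keep all the features, but shrink the parameters with lam > 0.
print("degree 11, lam=1e-3:", test_mse(fit(design(x_train, 11), y_train, 1e-3), 11))
```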
Regularization
Regularization can "shrink" some of the $\theta_j$ parameters in the hypothesis function. We regularize all of our theta parameters in a single summation in the cost function:

$$\min_\theta\ \frac{1}{2m} \left[ \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right)^2 + \lambda \sum_{j=1}^{n} \theta_j^2 \right]$$

The $\lambda$, or lambda, is the regularization parameter. It determines how much the costs of our theta parameters are inflated.
Note that using the above cost function with the extra summation, we can smooth the output of our hypothesis function to reduce overfitting. If lambda is chosen to be too large, it may smooth out the function too much and cause underfitting.
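To make this concrete, here is a minimal NumPy sketch (the function name and arguments are my own) of the regularized cost above; note that $\theta_0$ is excluded from the penalty:

```python
import numpy as np

def regularized_cost(theta, X, y, lam):
    # J(theta) = 1/(2m) * [sum of squared errors + lam * sum(theta_j^2, j >= 1)]
    m = y.size
    errors = X @ theta - y                  # h_theta(x^(i)) - y^(i) for every example
    penalty = lam * np.sum(theta[1:] ** 2)  # skip theta_0, the bias term
    return (errors @ errors + penalty) / (2 * m)
```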
Regularized Linear Regression
Note: [8:43] In the video it is said that $X$ is non-invertible if $m \le n$. The correct statement should be that $X$ is non-invertible if $m < n$, and may be non-invertible if $m = n$.
Gradient Descent
We modify our gradient descent function to separate out $\theta_0$ from the rest of the parameters, because we do not want to penalize $\theta_0$:

$$\text{Repeat}\ \lbrace$$

$$\theta_0 := \theta_0 - \alpha\, \frac{1}{m}\, \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right) x_0^{(i)}$$

$$\theta_j := \theta_j - \alpha \left[ \left(\frac{1}{m}\, \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right) x_j^{(i)}\right) + \frac{\lambda}{m}\theta_j \right] \qquad j \in \lbrace 1, 2, \ldots, n\rbrace$$

$$\rbrace$$
The term $\frac{\lambda}{m}\theta_j$ performs regularization. With some manipulation, our update rule can also be represented as:

$$\theta_j := \theta_j\left(1 - \alpha\frac{\lambda}{m}\right) - \alpha\, \frac{1}{m}\, \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right) x_j^{(i)}$$

The first term in the above equation, $1 - \alpha\frac{\lambda}{m}$, will always be less than 1. Intuitively you can see it as reducing the value of $\theta_j$ by some amount on every update. Notice that the second term is now exactly the same as it was before.
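As a sanity check, here is a minimal NumPy sketch of one regularized update step (function and variable names are my own; `alpha` and `lam` are placeholders to tune):

```python
import numpy as np

def gradient_descent_step(theta, X, y, alpha, lam):
    # One regularized update: theta_0 is left unpenalized, theta_1..theta_n shrink.
    m = y.size
    grad = X.T @ (X @ theta - y) / m      # unregularized gradient for every theta_j
    reg = (lam / m) * theta               # regularization term (lambda/m) * theta_j
    reg[0] = 0.0                          # ...but exclude the bias term theta_0
    # Equivalent to theta_j * (1 - alpha*lam/m) - alpha * grad_j for j >= 1.
    return theta - alpha * (grad + reg)
```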
Normal Equation
To add in regularization, the equation is the same as our original, except that we add another term inside the parentheses:

$$\theta = \left(X^T X + \lambda \cdot L\right)^{-1} X^T y$$

$$\text{where}\quad L = \begin{bmatrix} 0 & & & & \\ & 1 & & & \\ & & 1 & & \\ & & & \ddots & \\ & & & & 1 \end{bmatrix}$$

$L$ is a $(n+1) \times (n+1)$ matrix with a 0 at the top left and 1's down the rest of the diagonal, so the penalty skips $\theta_0$.
Remark
Recall that if $m < n$, then $X^T X$ is non-invertible. However, when we add the term $\lambda \cdot L$, then $X^T X + \lambda \cdot L$ becomes invertible, as long as the parameter $\lambda$ is strictly greater than zero.
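Translated directly into NumPy (a sketch under the same conventions; `X` is assumed to already carry a leading column of ones):

```python
import numpy as np

def normal_equation_regularized(X, y, lam):
    # theta = (X^T X + lam * L)^{-1} X^T y, with L the identity except L[0, 0] = 0.
    L = np.eye(X.shape[1])
    L[0, 0] = 0.0                          # theta_0 is not regularized
    # For lam > 0 the matrix is invertible even when m < n, so solve() is safe.
    return np.linalg.solve(X.T @ X + lam * L, X.T @ y)
```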
Regularized Logistic Regression
Cost Function
Recall the (unregularized) cost function for logistic regression:

$$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log\left(h_\theta(x^{(i)})\right) + \left(1 - y^{(i)}\right) \log\left(1 - h_\theta(x^{(i)})\right) \right]$$

The cost function for logistic regression with regularization adds a term at the end:

$$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log\left(h_\theta(x^{(i)})\right) + \left(1 - y^{(i)}\right) \log\left(1 - h_\theta(x^{(i)})\right) \right] + \frac{\lambda}{2m} \sum_{j=1}^{n} \theta_j^2$$
The second sum, $\sum_{j=1}^{n} \theta_j^2$, means to explicitly exclude the bias term $\theta_0$. I.e. the $\theta$ vector is indexed from 0 to $n$ (holding $n+1$ values, $\theta_0$ through $\theta_n$), and this sum explicitly skips $\theta_0$ by running from 1 to $n$. Thus, when computing the equation, we should continuously update the two following equations:

$$\theta_0 := \theta_0 - \alpha\, \frac{1}{m}\, \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right) x_0^{(i)}$$

$$\theta_j := \theta_j - \alpha \left[ \left(\frac{1}{m}\, \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right) x_j^{(i)}\right) + \frac{\lambda}{m}\theta_j \right] \qquad j \in \lbrace 1, 2, \ldots, n\rbrace$$

where $h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}}$ is the sigmoid hypothesis.
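Putting the pieces together, a minimal NumPy sketch (my own naming) of the regularized logistic cost and its gradient:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_cost_and_grad(theta, X, y, lam):
    # Regularized logistic regression; the penalty and its gradient skip theta_0.
    m = y.size
    h = sigmoid(X @ theta)                              # h_theta(x) for every example
    cost = -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))
    cost += (lam / (2 * m)) * np.sum(theta[1:] ** 2)    # penalty over theta_1..theta_n
    grad = X.T @ (h - y) / m
    grad[1:] += (lam / m) * theta[1:]                   # theta_0 gets no penalty term
    return cost, grad
```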
- Post title: 06_Overfitting and Regularization
- Create time: 2022-01-05 16:03:29
- Post link: Machine-Learning/06-overfitting-and-regularization/
- Copyright notice: All articles in this blog are licensed under BY-NC-SA unless stated otherwise.