Stochastic Gradient Descent
When we have a very large training set, (batch) gradient descent becomes a computationally very expensive procedure.
Algorithm
- Randomly shuffle the dataset.
- Repeat {
    for $i := 1, \dots, m$ {
        $\theta_j := \theta_j - \alpha \, (h_\theta(x^{(i)}) - y^{(i)}) \, x_j^{(i)}$  (for every $j = 0, \dots, n$)
    }
  }
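As a concrete sketch of the loop above, here is a minimal NumPy implementation for linear regression (the function name, toy data, and hyperparameters are made up for illustration):

```python
import numpy as np

def sgd(X, y, alpha=0.01, epochs=10):
    """Stochastic gradient descent for linear regression (illustrative sketch)."""
    m, n = X.shape
    theta = np.zeros(n)
    rng = np.random.default_rng(0)
    for _ in range(epochs):
        # 1. Randomly shuffle the dataset.
        for i in rng.permutation(m):
            # 2. Update theta using this single example.
            error = X[i] @ theta - y[i]       # h_theta(x^(i)) - y^(i)
            theta -= alpha * error * X[i]     # updates every theta_j at once
    return theta

# Tiny usage example; the first column of ones is the intercept feature x_0.
X = np.c_[np.ones(5), np.arange(5.0)]
y = 2.0 + 3.0 * np.arange(5.0)
print(sgd(X, y, alpha=0.05, epochs=200))      # approaches [2, 3]
```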
Tips
The SGD algorithm is much faster than batch gradient descent when we have a large-scale dataset, because each update uses only one example instead of all m.
Learning rate: $\alpha$ is typically held constant, but it can be slowly decreased over time (e.g. $\alpha = \frac{\text{const1}}{\text{iterationNumber} + \text{const2}}$) if we want $\theta$ to converge rather than oscillate around the minimum.
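A minimal sketch of such a decay schedule (the constants are made-up values that would have to be tuned per problem):

```python
# Hypothetical decay schedule: alpha shrinks as training progresses, so the
# parameter vector theta settles down instead of oscillating around the minimum.
const1, const2 = 1.0, 50.0

def learning_rate(iteration_number):
    return const1 / (iteration_number + const2)

print(learning_rate(0), learning_rate(1000))   # 0.02 vs roughly 0.00095
```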
Mini-Batch Gradient Descent
- Batch gradient descent: Use all m examples in each iteration
- Stochastic gradient descent: Use 1 example in each iteration
- Mini-Batch gradient descent: Use b examples in each iteration
Algorithm
Say $b = 10$, $m = 1000$.
Repeat {
    for $i := 1, 11, 21, 31, \dots, 991$ {
        $\theta_j := \theta_j - \alpha \, \frac{1}{10} \sum_{k=i}^{i+9} (h_\theta(x^{(k)}) - y^{(k)}) \, x_j^{(k)}$  (for every $j = 0, \dots, n$)
    }
}
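As a sketch, the linear-regression example from earlier can be rewritten with mini-batches; the point is that each update is a single vectorized matrix product over b examples, which a good linear algebra library can execute efficiently (names and defaults are illustrative):

```python
import numpy as np

def mini_batch_gd(X, y, b=10, alpha=0.01, epochs=10):
    """Mini-batch gradient descent for linear regression (illustrative sketch)."""
    m, n = X.shape
    theta = np.zeros(n)
    rng = np.random.default_rng(0)
    for _ in range(epochs):
        order = rng.permutation(m)
        # Walk through the shuffled data b examples at a time (i = 1, 1+b, 1+2b, ...).
        for start in range(0, m, b):
            batch = order[start:start + b]
            errors = X[batch] @ theta - y[batch]                  # shape (b,)
            theta -= alpha * (X[batch].T @ errors) / len(batch)   # averaged gradient
    return theta
```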
Tips
Mini-batch gradient descent sits between batch gradient descent and stochastic gradient descent.
Checking for convergence
Plot $\text{cost}(\theta, (x^{(i)}, y^{(i)})) = \frac{1}{2}(h_\theta(x^{(i)}) - y^{(i)})^2$, averaged over the last 1000 (or so) examples processed, while the algorithm is learning.
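One way to produce that plot, assuming the per-example costs were appended to a list during training (the helper name and window size are illustrative):

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_convergence(costs, window=1000):
    """Plot the cost averaged over each successive block of `window` examples."""
    costs = np.asarray(costs)
    blocks = range(window, len(costs) + 1, window)
    averaged = [costs[i - window:i].mean() for i in blocks]
    plt.plot(averaged)
    plt.xlabel(f"number of examples processed (x{window})")
    plt.ylabel(f"cost averaged over last {window} examples")
    plt.show()
```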
Advanced Topics
Online Learning
Repeat forever {
- get an example
- learn from the example
- discard the example
}
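A sketch of that loop for logistic regression (the click-stream data below is invented; on a real website each (x, y) pair would come from one user interaction and be thrown away after the update):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def online_update(theta, x, y, alpha=0.1):
    """One online logistic-regression update from a single (x, y) pair."""
    error = sigmoid(x @ theta) - y        # h_theta(x) - y
    return theta - alpha * error * x      # update every theta_j, then drop the example

theta = np.zeros(3)
stream = [(np.array([1.0, 0.2, 0.9]), 1), (np.array([1.0, 0.8, 0.1]), 0)]
for x, y in stream:                       # "repeat forever" over incoming users
    theta = online_update(theta, x, y)
```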
Product search (learning to search)
User searches for "Android phone 1080p camera"
Have 100 phones in stores. Will return 10 results.
x = features of the phone: how many words in the user's query match the name of the phone, how many words match its description, etc.
Learn $p(y = 1 \mid x; \theta)$, where $y = 1$ means the user clicks on the phone's link.
Use this to show the user the 10 phones they're most likely to click on.
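A sketch of the ranking step, assuming a logistic-regression model is already trained and `phone_features` is a (100, n) matrix holding one feature vector x per phone (all names here are made up):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def top_phones(theta, phone_features, k=10):
    """Return the indices and scores of the k phones most likely to be clicked."""
    scores = sigmoid(phone_features @ theta)   # predicted p(y=1 | x; theta) per phone
    best = np.argsort(scores)[::-1][:k]        # indices of the k highest scores
    return best, scores[best]
```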
Other examples: choosing special offers to show the user; customized selection of news articles; product recommendation.
MapReduce
Many learning algorithms can be expressed as computing sums of functions over the training set.
E.g. for advanced optimization with logistic regression, we need the gradient
$$\frac{\partial}{\partial \theta_j} J(\theta) = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right) x_j^{(i)},$$
and the sum over the m examples can be split across several machines, with a central server combining the partial sums.
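A toy sketch of that split, simulating the machines by slicing the arrays; in a real MapReduce deployment each partial sum would be computed on a different machine and sent to a central server (function names are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def partial_gradient(theta, X_part, y_part):
    """'Map' step: gradient sum over one machine's slice of the data."""
    errors = sigmoid(X_part @ theta) - y_part
    return X_part.T @ errors

def mapreduce_gradient(theta, X, y, machines=4):
    """'Reduce' step: add the partial sums and divide by m."""
    parts = zip(np.array_split(X, machines), np.array_split(y, machines))
    partial_sums = [partial_gradient(theta, Xp, yp) for Xp, yp in parts]
    return sum(partial_sums) / len(y)
```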