The content of this chapter is best conveyed through figures.
Recommended CSDN link: https://blog.csdn.net/CodingRae/article/details/103985629
Photo OCR
Photo OCR means Photo Optical Character Recognition.
Photo OCR pipeline
- Text detection: go through the image and find the regions that contain text.
- Character segmentation: given the rectangle around a text region, segment it into individual characters.
- Character classification: having segmented the text into individual characters, classify each one.
Pipelines like this are common in complex machine learning systems. A pipeline consists of multiple modules, each of which may or may not be a machine learning component, that act one after another on a piece of data to produce the output you want.
Design the pipeline: given a problem, break it down into a sequence of different modules.
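The three-stage pipeline above can be sketched as a chain of functions. This is only an illustration: the "image" is a plain string and the three stage functions (`detect_text`, `segment_characters`, `classify_character`) are hypothetical stand-ins for what would normally be trained models.

```python
from typing import List


def detect_text(image: str) -> List[str]:
    """Stand-in for text detection: each non-empty line is one text region."""
    return [line for line in image.splitlines() if line.strip()]


def segment_characters(region: str) -> List[str]:
    """Stand-in for character segmentation: split a region into characters."""
    return [ch for ch in region if not ch.isspace()]


def classify_character(patch: str) -> str:
    """Stand-in for character classification: map a patch to a label."""
    return patch.upper()


def photo_ocr(image: str) -> List[str]:
    """Run the modules one after another, each consuming the previous output."""
    results = []
    for region in detect_text(image):
        chars = segment_characters(region)
        results.append("".join(classify_character(c) for c in chars))
    return results
```

Keeping each stage behind its own function makes it possible to swap in a better module, or a hand-labeled ground truth, without touching the rest of the pipeline.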
Sliding windows
Pedestrian Detection
- Train a classifier to recognize pedestrians in an image patch.
- Slide windows of different sizes across the image to find the regions that contain pedestrians.
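A minimal sketch of how the window positions could be enumerated, assuming a fixed stride and a list of scale factors. The function names, the stride and the scale values are assumptions for illustration; in practice each larger window would be resized to the classifier's input size before being classified.

```python
from typing import Iterator, Sequence, Tuple

Window = Tuple[int, int, int, int]  # (left, top, width, height)


def sliding_windows(img_w: int, img_h: int,
                    win_w: int, win_h: int, stride: int) -> Iterator[Window]:
    """Yield every window of one size that fits fully inside the image."""
    for top in range(0, img_h - win_h + 1, stride):
        for left in range(0, img_w - win_w + 1, stride):
            yield (left, top, win_w, win_h)


def multi_scale_windows(img_w: int, img_h: int,
                        base_w: int, base_h: int,
                        scales: Sequence[float], stride: int) -> Iterator[Window]:
    """Repeat the scan with the base window enlarged by each scale factor."""
    for scale in scales:
        win_w, win_h = int(base_w * scale), int(base_h * scale)
        yield from sliding_windows(img_w, img_h, win_w, win_h, stride)
```

A smaller stride examines more patches and tends to localize better, at a higher computational cost.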
Text Detection
Train a character classifier to recognize characters.
After finding the regions of single characters, apply an expansion operator to merge them into regions of whole strings.
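One way to picture the expansion step is as a dilation of a binary mask: every cell close to a detected character is also marked, so neighboring detections merge into one region. This is a sketch under that assumption; the `expand` function and its `radius` parameter are hypothetical names, not taken from the course.

```python
from typing import List

Mask = List[List[int]]  # 1 = classifier says "text here", 0 = no text


def expand(mask: Mask, radius: int = 1) -> Mask:
    """Mark every cell within `radius` (square neighborhood) of a positive cell."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            if mask[r][c]:
                for rr in range(max(0, r - radius), min(h, r + radius + 1)):
                    for cc in range(max(0, c - radius), min(w, c + radius + 1)):
                        out[rr][cc] = 1
    return out
```

After expansion, connected regions can be boxed with rectangles, and boxes whose aspect ratio looks unlike a line of text can be discarded.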
Summary
Getting Lots of Data and Artificial Data
There is a fascinating idea called artificial data synthesis.
Artificial data synthesis for photo OCR
Take characters from different fonts.
Paste these characters against different random backgrounds.
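A toy sketch of this idea, assuming images are small grayscale grids (lists of integers from 0 to 255) and a "font" is just a binary glyph. All names here (`random_background`, `paste_glyph`, `synthesize`) are hypothetical; a real system would render actual fonts onto real background patches.

```python
import random
from typing import List, Tuple

Image = List[List[int]]


def random_background(h: int, w: int, rng: random.Random) -> Image:
    """Create a noisy grayscale background."""
    return [[rng.randint(0, 255) for _ in range(w)] for _ in range(h)]


def paste_glyph(background: Image, glyph: Image,
                top: int, left: int, ink: int = 0) -> Image:
    """Copy the background and draw the glyph's 'on' pixels in the ink color."""
    out = [row[:] for row in background]
    for r, row in enumerate(glyph):
        for c, on in enumerate(row):
            if on:
                out[top + r][left + c] = ink
    return out


def synthesize(glyph: Image, label: str, h: int, w: int,
               rng: random.Random) -> Tuple[Image, str]:
    """Produce one labeled training example at a random position."""
    top = rng.randint(0, h - len(glyph))
    left = rng.randint(0, w - len(glyph[0]))
    return paste_glyph(random_background(h, w, rng), glyph, top, left), label
```

Because the label is known when the example is generated, every synthesized image comes with a free, correct label.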
Synthesizing data by introducing distortions
- The grid lines overlaid on the picture are only for the purpose of illustration.
- Take the image and introduce artificial warpings (artificial distortions).
The distortions introduced should be representative of the type of noise/distortions in the test set.
Usually does not help to add purely random/meaningless noise to your data.
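As a rough illustration of a structured distortion, the sketch below shifts each row of an image sideways by a smoothly varying amount. This is a simple stand-in chosen for brevity, not the warping used in the course; `warp_rows`, `amplitude`, `period` and `fill` are assumed names.

```python
import math
from typing import List

Image = List[List[int]]


def warp_rows(image: Image, amplitude: float = 1.0,
              period: float = 8.0, fill: int = 0) -> Image:
    """Shift row r horizontally by round(amplitude * sin(2*pi*r/period))."""
    w = len(image[0])
    out = []
    for r, row in enumerate(image):
        shift = round(amplitude * math.sin(2 * math.pi * r / period))
        new_row = [fill] * w  # pixels shifted in from outside get `fill`
        for c, value in enumerate(row):
            if 0 <= c + shift < w:
                new_row[c + shift] = value
        out.append(new_row)
    return out
```

The label of the warped image stays the same as the original, which is what makes this a cheap way to multiply a labeled training set.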
Discussion on getting more data
- Make sure you have a low-bias classifier before expending the effort (plot learning curves). E.g. keep increasing the number of features or the number of hidden units in the neural network until you have a low-bias classifier.
- “How much work would it be to get 10x as much data as we currently have?”
- Artificial data synthesis
- Collect/label it yourself
- “Crowd source” (e.g. Amazon Mechanical Turk)
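The "10x" question can be answered with a quick back-of-the-envelope estimate. The sketch below assumes labeling time scales linearly with the number of examples; the function name and the numbers in the usage note are hypothetical.

```python
def labeling_hours(current_examples: int, multiplier: int,
                   seconds_per_example: float) -> float:
    """Hours needed to hand-label enough new examples to reach multiplier x data."""
    extra_examples = current_examples * (multiplier - 1)
    return extra_examples * seconds_per_example / 3600
```

For example, with 1,000 existing examples and an assumed 10 seconds per label, `labeling_hours(1000, 10, 10)` gives 25 hours, which is often far less work than it first sounds.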
Ceiling Analysis
What Part of the Pipeline to Work on Next
Ceiling analysis can sometimes give you a very strong signal about which parts of the pipeline would be the best use of your time to work on.
Estimating the errors due to each component (ceiling analysis)
Through ceiling analysis we can find out which components are the most promising to improve.
The table above is the core of the method.
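The calculation behind such a table can be sketched as follows: replace each component's output with the ground truth, one component at a time and cumulatively, and record how much the overall accuracy rises. The function name and the accuracy figures in the usage note are illustrative assumptions, since the table itself is not reproduced in these notes.

```python
from typing import List, Tuple


def ceiling_gains(baseline: int,
                  ceilings: List[Tuple[str, int]]) -> List[Tuple[str, int]]:
    """Return the accuracy gain (in percentage points) from perfecting each stage.

    `ceilings` lists, in pipeline order, the overall accuracy measured after
    that stage and all earlier stages are replaced by ground-truth output.
    """
    gains = []
    previous = baseline
    for name, accuracy in ceilings:
        gains.append((name, accuracy - previous))
        previous = accuracy
    return gains
```

With an assumed baseline of 72% and ceilings of 89%, 90% and 100% for text detection, character segmentation and character recognition, the gains are 17, 1 and 10 points, which would suggest working on text detection first and not spending much time on segmentation.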
Coursera Summary
Main topics
You are an expert
Thank you, Andrew Ng
- Post title: 17_Problem Description and Pipeline
- Create time: 2022-02-28 19:48:49
- Post link: Machine-Learning/17-problem-description-and-pipeline/
- Copyright notice: All articles in this blog are licensed under BY-NC-SA unless stating additionally.