This chapter's content is best conveyed through figures.
Recommended CSDN link: https://blog.csdn.net/CodingRae/article/details/103985629

Photo OCR

Photo OCR means Photo Optical Character Recognition.

Photo OCR pipeline

Text detection: go through the image and find the regions where there is text.
Character segmentation: given the rectangle around a text region, segment it into individual characters.
Character classification: having segmented the text into individual characters, classify each one.

In many complex machine learning systems, these sorts of pipelines are common: the system consists of multiple modules, each of which may or may not be a machine learning component, that act one after another on some piece of data to produce the output you want.

Designing the pipeline: given a problem, break it down into a sequence of different modules.
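The idea of modules acting one after another can be sketched in a few lines. This is only an illustration: `run_pipeline` and the three stage functions are hypothetical names, and the stages are toy stand-ins that work on a string rather than on an image.

```python
from typing import Any, Callable, List


def run_pipeline(stages: List[Callable[[Any], Any]], data: Any) -> Any:
    """Pass data through each stage in order; each output feeds the next stage."""
    for stage in stages:
        data = stage(data)
    return data


# Toy stand-ins (assumptions, not real detectors): the "image" is a string.
def detect_text(image: str) -> List[str]:
    # Pretend every whitespace-separated token is a detected text region.
    return image.split()


def segment_characters(regions: List[str]) -> List[List[str]]:
    # Split each region into its individual characters.
    return [list(region) for region in regions]


def classify_characters(segments: List[List[str]]) -> List[str]:
    # Pretend classification: map each character to its upper-case label.
    return ["".join(ch.upper() for ch in chars) for chars in segments]


result = run_pipeline(
    [detect_text, segment_characters, classify_characters], "photo ocr"
)
```

Keeping each stage behind the same call shape makes it easy to swap one module out, which is also what ceiling analysis relies on.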

Sliding windows

Pedestrian Detection

Train a classifier to recognize the pedestrians in the image.
Slide windows of different sizes across the image to find the regions that contain pedestrians.
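The two steps above can be sketched as follows. This is a minimal sketch under stated assumptions: the image is a list of rows of grayscale values, `mostly_bright` is a toy stand-in for a trained classifier, and all names are hypothetical. In a real system, larger patches would typically be resized to the classifier's input size before being classified.

```python
from typing import Callable, Iterator, List, Tuple

Image = List[List[float]]  # grayscale image as rows of pixel values


def sliding_windows(
    height: int, width: int, win_h: int, win_w: int, stride: int
) -> Iterator[Tuple[int, int]]:
    """Yield the (top, left) corner of every window that fits in the image."""
    for top in range(0, height - win_h + 1, stride):
        for left in range(0, width - win_w + 1, stride):
            yield top, left


def detect(
    image: Image,
    window_sizes: List[Tuple[int, int]],
    stride: int,
    classifier: Callable[[Image], bool],
) -> List[Tuple[int, int, int, int]]:
    """Return (top, left, win_h, win_w) for each window the classifier accepts."""
    height, width = len(image), len(image[0])
    hits = []
    for win_h, win_w in window_sizes:
        for top, left in sliding_windows(height, width, win_h, win_w, stride):
            patch = [row[left:left + win_w] for row in image[top:top + win_h]]
            if classifier(patch):
                hits.append((top, left, win_h, win_w))
    return hits


def mostly_bright(patch: Image) -> bool:
    # Toy classifier (assumption): fires when the mean pixel value exceeds 0.9.
    values = [v for row in patch for v in row]
    return sum(values) / len(values) > 0.9


# A 6x6 dark image with a bright 2x2 square standing in for a pedestrian.
image = [[0.0] * 6 for _ in range(6)]
for r in range(2, 4):
    for c in range(2, 4):
        image[r][c] = 1.0

hits = detect(image, window_sizes=[(2, 2), (4, 4)], stride=1,
              classifier=mostly_bright)
```

A smaller stride checks more positions at a higher computational cost; a larger stride is cheaper but may step over the object.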

Text Detection

Train a character classifier to recognize characters.

After finding the regions of single characters, apply an expansion step to merge them into regions that cover whole strings.
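One way to picture the expansion step is as growing every detected pixel into its neighborhood so that nearby character detections fuse into one region. The sketch below assumes a binary mask (1 where the character classifier fired) and a hypothetical `expand` helper; it illustrates the idea and is not necessarily the exact operator used in the course.

```python
from typing import List

Mask = List[List[int]]  # 1 where the character classifier fired, else 0


def expand(mask: Mask, radius: int) -> Mask:
    """Mark every pixel that lies within `radius` rows and columns of a marked pixel."""
    height, width = len(mask), len(mask[0])
    out = [[0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            if mask[r][c]:
                # Fill the square neighborhood around this detection,
                # clipped to the image borders.
                for rr in range(max(0, r - radius), min(height, r + radius + 1)):
                    for cc in range(max(0, c - radius), min(width, c + radius + 1)):
                        out[rr][cc] = 1
    return out


# Two isolated character detections on the middle row, three pixels apart.
mask = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
]
expanded = expand(mask, radius=1)
```

After expansion the two detections touch and form a single contiguous region, which can then be boxed as one candidate text string.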

Summary

Getting Lots of Data and Artificial Data

There is a fascinating idea called artificial data synthesis.

Artificial data synthesis for photo OCR

Take characters from different fonts.
Paste these characters against different random backgrounds.
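The two steps above can be sketched without any imaging library. Everything here is an assumption made for illustration: `GLYPH_T` is a hand-written 3x3 bitmap standing in for a character rendered from a real font, and the background is uniform random noise rather than a crop from a real photo.

```python
import random
from typing import List

Image = List[List[float]]

# Hypothetical 3x3 bitmap of the letter "T" (stand-in for a font rendering).
GLYPH_T = [
    [1, 1, 1],
    [0, 1, 0],
    [0, 1, 0],
]


def synthesize(glyph: List[List[int]], size: int, rng: random.Random) -> Image:
    """Paste the glyph at a random position on a random size x size background."""
    # Background values stay below 0.5 so the glyph (value 1.0) stands out.
    image = [[rng.random() * 0.5 for _ in range(size)] for _ in range(size)]
    glyph_h, glyph_w = len(glyph), len(glyph[0])
    top = rng.randrange(size - glyph_h + 1)
    left = rng.randrange(size - glyph_w + 1)
    for r in range(glyph_h):
        for c in range(glyph_w):
            if glyph[r][c]:
                image[top + r][left + c] = 1.0
    return image


rng = random.Random(0)  # fixed seed so the run is reproducible
# Each synthetic example is an (image, label) pair.
examples = [(synthesize(GLYPH_T, 8, rng), "T") for _ in range(5)]
```

Because the label is known at generation time, every synthetic image comes with a correct label for free.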

Synthesizing data by introducing distortions

The picture is overlaid with grid lines just for the purpose of illustration.
Take the image and introduce artificial warpings (artificial distortions).

The distortions introduced should be representative of the type of noise/distortions in the test set.

It usually does not help to add purely random/meaningless noise to your data.
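As an example of a structured distortion, here is a minimal sketch of a horizontal shear that slants a character. It assumes that slanted characters are plausible in the test set; `shear_right` is a hypothetical helper, not a function from any library.

```python
from typing import List

Image = List[List[float]]


def shear_right(image: Image, max_shift: int) -> Image:
    """Shift each row right by an amount growing linearly from 0 (top row)
    to max_shift (bottom row), padding with zeros on the left."""
    height, width = len(image), len(image[0])
    sheared = []
    for r, row in enumerate(image):
        shift = round(max_shift * r / (height - 1)) if height > 1 else 0
        shift = min(shift, width)  # never shift a row past the image width
        sheared.append([0.0] * shift + row[: width - shift])
    return sheared


# A vertical stroke in the first column becomes a diagonal stroke.
stroke = [
    [1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
]
slanted = shear_right(stroke, max_shift=2)
```

The distorted image keeps the original label, so each real example can yield several training examples.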

Discussion on getting more data

Make sure you have a low-bias classifier before expending the effort (plot learning curves). E.g. keep increasing the number of features or the number of hidden units in the neural network until you have a low-bias classifier.
“How much work would it be to get 10x as much data as we currently have?”

Artificial data synthesis
Collect/label it yourself
“Crowd source” (e.g. Amazon Mechanical Turk)
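The "10x" question often comes down to simple arithmetic. The sketch below uses made-up numbers (1,000 existing examples, 10 seconds to collect and label each one) purely for illustration; `extra_labeling_hours` is a hypothetical helper.

```python
def extra_labeling_hours(
    current_examples: int, factor: int, seconds_per_example: float
) -> float:
    """Hours needed to label enough new examples to reach factor x the current set."""
    extra_examples = current_examples * (factor - 1)
    return extra_examples * seconds_per_example / 3600


# Hypothetical numbers: 1,000 examples today, 10 seconds per new example.
hours = extra_labeling_hours(
    current_examples=1000, factor=10, seconds_per_example=10
)
```

Under these assumed numbers the answer is 25 hours, which is a few days of work for one person. An estimate like this is often smaller than expected.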

Ceiling Analysis

What Part of the Pipeline to Work on Next

Ceiling analysis can sometimes give you a very strong signal about which parts of the pipeline would be the best use of your time to work on.

Estimating the errors due to each component (ceiling analysis)

Through ceiling analysis we can identify the most promising components to work on.
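A common way to run a ceiling analysis is to replace each component's output with the ground truth, one component at a time and cumulatively along the pipeline, and record the overall accuracy after each substitution. The sketch below assumes that procedure; the accuracy figures are illustrative placeholders, not measurements, and `ceiling_gains` is a hypothetical helper.

```python
from typing import Dict, List, Tuple


def ceiling_gains(
    baseline: int, stages: List[Tuple[str, int]]
) -> Dict[str, int]:
    """Given overall accuracy (in percent) after each stage is made perfect,
    cumulatively in pipeline order, return the gain attributable to each stage."""
    gains = {}
    previous = baseline
    for name, accuracy in stages:
        gains[name] = accuracy - previous
        previous = accuracy
    return gains


# Illustrative numbers only: overall accuracy in percent.
baseline = 72  # the full system as it stands
stages = [
    ("text detection", 89),           # with perfect text detection
    ("character segmentation", 90),   # plus perfect segmentation
    ("character classification", 100),  # plus perfect classification
]

gains = ceiling_gains(baseline, stages)
most_promising = max(gains, key=gains.get)
```

With these placeholder figures, perfecting text detection would add 17 points, segmentation only 1, and classification 10, so text detection would be the component most worth improving.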