In this article, you will learn about Optical Character Recognition (OCR).
Data often arrives in a variety of file types and document formats such as PDF, JPEG, and PNG. OCR lets us extract the text contained in these documents.
What is Optical Character Recognition?
Optical Character Recognition is a widely used technology for recognizing text inside images, such as scanned documents and photos. OCR converts virtually any kind of image containing written text (typed, handwritten, or printed) into machine-readable text data.
Several other OCR tools are available. For higher accuracy and faster processing, it can be worth purchasing a commercial SDK.
Using Keras-OCR in Python
To install keras-ocr in Python, run:
pip install keras-ocr
The example below shows how to use the pre-trained models.
# Import the libraries
import matplotlib.pyplot as plt
import keras_ocr

# keras-ocr will automatically download pretrained
# weights for the detector and recognizer.
pipeline = keras_ocr.pipeline.Pipeline()

# Get a set of three example images
images = [
    keras_ocr.tools.read(url) for url in [
        'https://upload.wikimedia.org/wikipedia/commons/b/bd/Army_Reserves_Recruitment_Banner_MOD_45156284.jpg',
        'https://upload.wikimedia.org/wikipedia/commons/e/e8/FseeG2QeLXo.jpg',
        'https://upload.wikimedia.org/wikipedia/commons/b/b4/EUBanana-500x112.jpg'
    ]
]

# Each list of predictions in prediction_groups is a list of
# (word, box) tuples.
prediction_groups = pipeline.recognize(images)

# Plot the predictions
fig, axs = plt.subplots(nrows=len(images), figsize=(20, 20))
for ax, image, predictions in zip(axs, images, prediction_groups):
    keras_ocr.tools.drawAnnotations(image=image, predictions=predictions, ax=ax)
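Each entry in prediction_groups corresponds to one input image and is a list of (word, box) tuples, so you can also work with the recognized text directly. The small snippet below is an illustrative addition (not part of the keras-ocr example above) that prints just the recognized words:

# Print the recognized words for each image
for i, predictions in enumerate(prediction_groups):
    words = [word for word, box in predictions]
    print('Image', i, ':', words)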
Best Practices for Using OCR
- Use a resolution of at least 300 DPI; 300–600 DPI works well.
- Apply pre-processing techniques such as binarization, de-noising, deskewing (rotating the image to straighten it), and sharpening (see the sketch after this list).
- Grayscale images generally produce better output.
- Crop away document borders to improve accuracy.
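To illustrate these pre-processing steps, here is a minimal sketch using OpenCV and NumPy. These libraries, the file names, and the parameter values are assumptions for illustration, not part of keras-ocr; tune them for your own documents.

import cv2
import numpy as np

# Load the scanned document (placeholder file name)
image = cv2.imread('document.jpg')

# Convert to grayscale
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# De-noise the grayscale image
denoised = cv2.fastNlMeansDenoising(gray, h=10)

# Increase sharpness with a simple convolution kernel
kernel = np.array([[0, -1, 0],
                   [-1, 5, -1],
                   [0, -1, 0]], dtype=np.float32)
sharpened = cv2.filter2D(denoised, -1, kernel)

# Binarize using Otsu's threshold
_, binary = cv2.threshold(sharpened, 0, 255,
                          cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Deskewing (rotating to straighten the text) can be added here by
# estimating the text angle and applying cv2.warpAffine.

# Save the pre-processed image for the OCR step
cv2.imwrite('preprocessed.jpg', binary)

The saved preprocessed.jpg can then be loaded with keras_ocr.tools.read and passed to the pipeline exactly as in the earlier example.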
Conclusion:
OCR results depend on the quality of the input data. Cleanly segmented text on a noise-free background gives better results. In the real world this is not always possible, so we need to apply multiple pre-processing techniques for OCR to produce better results.