pytesseract_basic_ocr_text_extraction_from_image.py

python

A basic demonstration of how to extract text from an image file using Tesser

15d ago | 13 lines | pypi.org
from PIL import Image
import pytesseract

# If you don't have tesseract executable in your PATH, include the following:
# pytesseract.pytesseract.tesseract_cmd = r'<full_path_to_your_tesseract_executable>'
# Example tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'

# Simple image to string
print(pytesseract.image_to_string(Image.open('test.png')))

# In order to bypass the image conversions of pytesseract, just use relative or absolute image path
# NOTE: In this case you should ensure binary contains appropriate extensions or use Image.open from PIL
print(pytesseract.image_to_string('test.png'))
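
The comments in the snippet say to set `tesseract_cmd` by hand when the executable is not on your PATH. A minimal sketch of automating that check follows. The helper names `find_tesseract` and `configure_pytesseract` and the `DEFAULT_CANDIDATES` tuple are hypothetical and not part of pytesseract; the only pytesseract attribute used is `pytesseract.pytesseract.tesseract_cmd`, as in the snippet.

```python
import os
import shutil
from typing import Optional, Sequence

# Hypothetical fallback locations (assumption); adjust for your own system.
DEFAULT_CANDIDATES = (
    r"C:\Program Files\Tesseract-OCR\tesseract.exe",
)


def find_tesseract(candidates: Sequence[str] = DEFAULT_CANDIDATES) -> Optional[str]:
    """Return a path to the tesseract executable, or None if none is found."""
    # Prefer whatever the PATH lookup resolves to.
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    # Otherwise fall back to the first candidate that exists as a file.
    for candidate in candidates:
        if os.path.isfile(candidate):
            return candidate
    return None


def configure_pytesseract() -> bool:
    """Point pytesseract at the located executable; return True on success."""
    path = find_tesseract()
    if path is None:
        return False
    # Imported lazily so find_tesseract() works even without pytesseract installed.
    import pytesseract
    pytesseract.pytesseract.tesseract_cmd = path
    return True
```

Calling `configure_pytesseract()` before `image_to_string` would then let the script report a missing executable up front (a `False` return) instead of failing inside the OCR call.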