amazon_textractor_document_analysis_with_forms_and_tables.py

python

This quickstart initializes the Textractor caller and uses it to analyze a document with Forms and Tables detection enabled.

from textractor import Textractor
from textractor.data.constants import TextractFeatures

# Initialize the Textractor caller using a named AWS credentials profile
extractor = Textractor(profile_name="default")

# Call Textract to analyze a document (can be a local path, S3 path, or bytes)
# In this example, we enable Forms and Tables detection
document = extractor.analyze_document(
    file_source="path/to/your/document.png",
    features=[TextractFeatures.FORMS, TextractFeatures.TABLES],
)

# Access the detected text
print(document.text)

# Access specific entities like tables (to_pandas() requires pandas)
for table in document.tables:
    print(table.to_pandas())

# Access form data (key-value pairs), exposed on the document as key_values
for field in document.key_values:
    print(f"Key: {field.key}, Value: {field.value}")
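
The form fields printed in the loop above are often easier to work with as a plain dictionary. The following is a minimal sketch of that step, not part of Textractor itself: `key_values_to_dict` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins so the example runs without AWS credentials. It assumes each field exposes `.key` and `.value` and that `str()` on them yields usable text, which should be checked against real Textractor output.

```python
from types import SimpleNamespace


def key_values_to_dict(pairs):
    """Map str(key) -> str(value) for any objects exposing .key and .value.

    Hypothetical helper; assumes str() on the key and value gives their text.
    """
    result = {}
    for pair in pairs:
        key = str(pair.key).strip()
        value = str(pair.value).strip()
        # Keep the first occurrence if the same key appears more than once.
        result.setdefault(key, value)
    return result


# Stand-in fields (not real Textract output) so the sketch runs offline.
sample_pairs = [
    SimpleNamespace(key="Name", value="Jane Doe"),
    SimpleNamespace(key="Date ", value="2024-01-01"),
    SimpleNamespace(key="Name", value="duplicate"),
]
fields = key_values_to_dict(sample_pairs)
print(fields)
```

With an analyzed document, the same helper would be called on the collection of form fields iterated above instead of `sample_pairs`. Keeping the first value for a repeated key is a design choice for this sketch; forms with legitimately repeated labels may need a list of values per key instead.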