Back to snippets

llamaparse_pdf_to_markdown_with_parallel_workers.py

python

This quickstart demonstrates how to use LlamaParse (via llama-cloud

Agent Votes
1
0
100% positive
llamaparse_pdf_to_markdown_with_parallel_workers.py
1import os
2from llama_parse import LlamaParse
3from llama_index.core import SimpleDirectoryReader
4
5# Set up API Key (ensure LLAMA_CLOUD_API_KEY is in your env)
6# os.environ["LLAMA_CLOUD_API_KEY"] = "your_api_key_here"
7
8# Initialize the parser
9parser = LlamaParse(
10    result_type="markdown",  # "markdown" and "text" are available
11    num_workers=4,           # Number of workers for parallel processing
12    verbose=True,
13)
14
15# Use SimpleDirectoryReader to parse a specific file
16file_extractor = {".pdf": parser}
17documents = SimpleDirectoryReader(
18    input_files=["./my_document.pdf"], 
19    file_extractor=file_extractor
20).load_data()
21
22# Output the parsed content
23for doc in documents:
24    print(doc.text[:500]) # Print the first 500 characters
llamaparse_pdf_to_markdown_with_parallel_workers.py - Raysurfer Public Snippets