Skip to main content

Documentation Index

Fetch the complete documentation index at: https://spacesail.mintlify.app/llms.txt

Use this file to discover all available pages before exploring further.

The PDF Reader with asynchronous processing allows you to handle PDF files efficiently and integrate them with knowledge bases.

Code

examples/concepts/knowledge/readers/pdf_reader_async.py
import asyncio

from agno.agent import Agent
from agno.knowledge.knowledge import Knowledge
from agno.vectordb.pgvector import PgVector

db_url = "postgresql+psycopg://ai:ai@localhost:5532/ai"

# Create a knowledge base with the PDFs from the data/pdfs directory
knowledge = Knowledge(
    vector_db=PgVector(
        table_name="pdf_documents",
        db_url=db_url,
    )
)

# Create an agent with the knowledge base
agent = Agent(
    knowledge=knowledge,
    search_knowledge=True,
)

if __name__ == "__main__":
    asyncio.run(
        knowledge.add_content_async(
            path="cookbook/knowledge/testing_resources/cv_1.pdf",
        )
    )
    # Create and use the agent
    asyncio.run(
        agent.aprint_response(
            "What skills does an applicant require to apply for the Software Engineer position?",
            markdown=True,
        )
    )

Usage

1

Create a virtual environment

Open the Terminal and create a python virtual environment.
python3 -m venv .venv
source .venv/bin/activate
2

Install libraries

pip install -U pypdf sqlalchemy psycopg pgvector agno openai  
3

Set environment variables

export OPENAI_API_KEY=xxx
4

Run PgVector

docker run -d \
  -e POSTGRES_DB=ai \
  -e POSTGRES_USER=ai \
  -e POSTGRES_PASSWORD=ai \
  -e PGDATA=/var/lib/postgresql/data/pgdata \
  -v pgvolume:/var/lib/postgresql/data \
  -p 5532:5432 \
  --name pgvector \
  agno/pgvector:16
5

Run Agent

python examples/concepts/knowledge/readers/pdf_reader_async.py

Params

ParameterTypeDefaultDescription
pathPathRequiredPath to PDF file or URL
split_on_pagesboolTrueSplit the PDF into pages
page_start_numbering_formatOptional[str]NoneFormat for page numbering
page_end_numbering_formatOptional[str]NoneFormat for page numbering
passwordOptional[str]NonePassword to unlock the PDF