The Website Reader crawls a website asynchronously: starting from a given URL, it follows links up to a configurable depth and link count, then loads the pages it reads into a knowledge base.

Code

examples/concepts/knowledge/readers/website_reader_async.py
import asyncio

from agno.agent import Agent
from agno.knowledge.knowledge import Knowledge
from agno.knowledge.reader.website_reader import WebsiteReader
from agno.vectordb.pgvector import PgVector

db_url = "postgresql+psycopg://ai:ai@localhost:5532/ai"

knowledge = Knowledge(
    vector_db=PgVector(
        table_name="website_documents",
        db_url=db_url,
    ),
)

agent = Agent(
    knowledge=knowledge,
    search_knowledge=True,
)

async def main():
    # Crawl and add website content to knowledge base
    await knowledge.add_content_async(
        url="https://docs.agno.com/introduction",
        reader=WebsiteReader(max_depth=2, max_links=20),
    )

    # Query the knowledge base
    await agent.aprint_response(
        "What are the main features of Agno?",
        markdown=True,
    )

if __name__ == "__main__":
    asyncio.run(main())
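
The example above awaits a single crawl. The main benefit of the async API is that several slow operations can overlap, and the sketch below shows that pattern with `asyncio.gather`. The `ingest` coroutine is a hypothetical stand-in for a call such as `knowledge.add_content_async(...)`, and the second URL is a placeholder. Whether concurrent ingestion is safe for a given vector database is an assumption to verify before relying on it.

```python
import asyncio


async def ingest(url: str) -> str:
    # Hypothetical stand-in for:
    #   await knowledge.add_content_async(url=url, reader=WebsiteReader(...))
    await asyncio.sleep(0.1)  # simulates network and database latency
    return url


async def ingest_all(urls: list[str]) -> list[str]:
    # gather() schedules all coroutines concurrently and returns
    # their results in the same order as the inputs.
    return await asyncio.gather(*(ingest(u) for u in urls))


if __name__ == "__main__":
    urls = [
        "https://docs.agno.com/introduction",
        "https://example.com/docs",  # placeholder URL
    ]
    print(asyncio.run(ingest_all(urls)))
```

Because the waits overlap, the total time is close to that of the slowest crawl rather than the sum of all of them.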

Usage

1. Create a virtual environment

Open a terminal and create a Python virtual environment.
python3 -m venv .venv
source .venv/bin/activate

2. Install libraries

pip install -U requests beautifulsoup4 sqlalchemy psycopg pgvector agno openai

3. Set environment variables

export OPENAI_API_KEY=xxx

4. Run PgVector

docker run -d \
  -e POSTGRES_DB=ai \
  -e POSTGRES_USER=ai \
  -e POSTGRES_PASSWORD=ai \
  -e PGDATA=/var/lib/postgresql/data/pgdata \
  -v pgvolume:/var/lib/postgresql/data \
  -p 5532:5432 \
  --name pgvector \
  agno/pgvector:16

5. Run Agent

python examples/concepts/knowledge/readers/website_reader_async.py
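
If the agent fails to connect to the database, it can help to confirm that the port mapped in the Run PgVector step (`5532` on the host) is accepting connections. The helper below is a minimal sketch using only the standard library. `port_open` is a name introduced here for illustration, and a successful TCP connection shows only that something is listening, not that Postgres has finished starting up.

```python
import socket


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 5532 is the host port from the docker run command (-p 5532:5432).
    print("PgVector port reachable:", port_open("localhost", 5532))
```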

Params

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| url | str | Required | URL of the website to crawl and read |
| max_depth | int | 3 | Maximum depth level for crawling links |
| max_links | int | 10 | Maximum number of links to crawl |
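
To build intuition for how `max_depth` and `max_links` bound a crawl, here is a small breadth-first sketch over a made-up, in-memory link graph. It is not the `WebsiteReader` implementation. The `SITE` dictionary and the `crawl` function are hypothetical, and counting the start page as depth 1 is an assumption made for this illustration.

```python
from collections import deque

# Hypothetical site: each page maps to the pages it links to.
SITE = {
    "/": ["/intro", "/agents"],
    "/intro": ["/agents", "/teams"],
    "/agents": ["/tools"],
    "/teams": [],
    "/tools": ["/deep"],
    "/deep": [],
}


def crawl(start, max_depth=3, max_links=10):
    """Breadth-first crawl bounded by depth and by total pages visited."""
    visited = []
    seen = {start}
    queue = deque([(start, 1)])  # assumption: the start page is depth 1
    while queue and len(visited) < max_links:
        page, depth = queue.popleft()
        visited.append(page)
        if depth >= max_depth:
            continue  # at the depth limit: read the page, follow no links
        for link in SITE.get(page, []):
            if link not in seen:  # skip pages already queued or visited
                seen.add(link)
                queue.append((link, depth + 1))
    return visited


if __name__ == "__main__":
    print(crawl("/", max_depth=2, max_links=20))
    # → ['/', '/intro', '/agents']
    print(crawl("/", max_depth=10, max_links=2))
    # → ['/', '/intro']
```

In this sketch a lower `max_depth` limits how far from the start page the crawl goes, while a lower `max_links` caps the total number of pages regardless of depth.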