Agno agents accept text, image, audio, and video inputs and can generate text, image, audio, and video outputs. For a complete overview, see the compatibility matrix.
To get started, take a look at the multimodal examples.

Image

Image As Input

Learn how to use images as input with Agno agents.

Image As Output

Learn how to use images as output with Agno agents.

Image Generation

Learn how to use image generation with Agno agents.

Audio

Audio As Input

Learn how to use audio as input with Agno agents.

Audio As Output

Learn how to use audio as output with Agno agents.

Speech-to-Text

Learn how to use speech-to-text with Agno agents.

Audio Generation

Learn how to use audio generation with Agno agents.

Video

Video As Input

Learn how to use videos as input with Agno agents.

Video Generation

Learn how to use video generation with Agno agents.

Files

Files As Input

Learn how to use files as input with Agno agents.

Files Generation

Learn how to use file generation with Agno agents.