LLM Backends

audia requires an LLM for the curation step. Two providers are supported.

OpenAI

AUDIA_LLM_PROVIDER=openai
AUDIA_OPENAI_API_KEY=sk-...
AUDIA_LLM_MODEL=gpt-4o-mini       # default

Recommended models:

Model	Notes
`gpt-4o-mini`	Default — fast, cheap, good quality
`gpt-4o`	Higher quality, higher cost

Custom endpoint

For Azure OpenAI, corporate proxies, or any OpenAI-compatible API:

AUDIA_OPENAI_API_BASE=https://your-org.openai.azure.com/

When set, all OpenAI calls (LLM curation and OpenAI TTS) are routed through this URL.

Anthropic

AUDIA_LLM_PROVIDER=anthropic
AUDIA_ANTHROPIC_API_KEY=sk-ant-...
AUDIA_LLM_MODEL=claude-3-5-haiku-20241022

Recommended models:

Model	Notes
`claude-3-5-haiku-20241022`	Fast and cost-effective
`claude-3-5-sonnet-20241022`	Higher quality

Custom endpoint

AUDIA_ANTHROPIC_API_BASE=https://your-proxy.example.com/

Shared settings

Variable	Default	Description
`AUDIA_LLM_MAX_CHUNK_CHARS`	`8000`	Characters per curation chunk
`AUDIA_LLM_TEMPERATURE`	`0.1`	0 = deterministic, 1 = creative

Temperature recommendation

Keep AUDIA_LLM_TEMPERATURE low (0.0 – 0.2) for curation tasks. Higher values can introduce hallucinations into the audio script.

How the LLM is used

The LLM receives each chunk with a system prompt instructing it to:

Rewrite all mathematical expressions in plain English
Replace tables with concise narrative sentences
Condense acknowledgements to a single sentence
Remove citation markers and reference lists entirely
Maintain a natural, engaging tone suitable for listening

For multi-chunk documents, each chunk after the first also receives the tail of the previous curated output as context, ensuring smooth spoken transitions across chunk boundaries.