Web UI
The audia web UI is a FastAPI backend serving a React + Tailwind SPA.
Starting the server
audia serve
# → http://127.0.0.1:8000
Or with custom options:
audia serve --port 8080 --no-browser
Tabs
Convert
Upload a PDF (drag-and-drop or file browser). The PDF opens immediately in the preview panel. Click Convert to start the pipeline; progress streams in real time with per-stage and per-chunk updates. Cancel any running job with the Cancel button.
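The progress updates described above can also be consumed by a script that polls the job status. The following is a minimal sketch, not audia's actual client: the `state` and `stage` keys, the terminal state names, and the `fetch_status` callable are all assumptions made for illustration. In practice `fetch_status` would wrap an HTTP request to the job status endpoint.

```python
import time
from typing import Callable, Dict, Iterator

# Hypothetical terminal states; the real API may use different names.
TERMINAL_STATES = {"done", "failed", "cancelled"}


def poll_job(
    fetch_status: Callable[[], Dict],
    interval: float = 1.0,
    max_polls: int = 600,
) -> Iterator[Dict]:
    """Yield status snapshots until the job reaches a terminal state."""
    for _ in range(max_polls):
        status = fetch_status()  # e.g. one HTTP GET against the status endpoint
        yield status
        if status.get("state") in TERMINAL_STATES:
            return
        time.sleep(interval)
    raise TimeoutError("job did not finish within the polling budget")
```

Passing the fetcher in as a callable keeps the loop independent of any particular HTTP library and makes it easy to exercise with canned responses.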
Research
Enter an ArXiv query (text or voice). Use Normalize query to let the LLM distil your raw query into concise search terms. Select papers from the results table, then click Convert selected. Each paper runs as an independent background job.
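To illustrate what "independent background job" means in practice, here is a small sketch that submits one task per selected paper and records each outcome separately, so that a failure in one conversion does not affect the others. It uses a thread pool from the standard library purely for illustration; how audia actually schedules jobs is not stated here, and `convert_one` is a hypothetical stand-in for the real pipeline.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable


def convert_selected(
    paper_ids: Iterable[str],
    convert_one: Callable[[str], str],
    max_workers: int = 4,
) -> Dict[str, str]:
    """Run one job per paper and collect a per-paper result or error."""
    results: Dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # One future per paper: each job succeeds or fails on its own.
        futures = {pool.submit(convert_one, pid): pid for pid in paper_ids}
        for fut in as_completed(futures):
            pid = futures[fut]
            try:
                results[pid] = fut.result()
            except Exception as exc:
                results[pid] = f"failed: {exc}"
    return results
```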
Configuration
Set the LLM provider and model, and the TTS backend and voice. Settings are saved to the database and persist across server restarts. The animated pipeline diagram illustrates where each component fits in the flow.
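As a rough picture of how settings can survive a restart, the sketch below stores them as name/value rows in a SQLite table and reads them back after reopening the file. The `user_settings` table name appears in the Library tab, but the column layout, the setting names, and the use of SQLite are assumptions for illustration only.

```python
import os
import sqlite3
import tempfile
from typing import Dict


def open_db(path: str) -> sqlite3.Connection:
    """Open the database and make sure the (assumed) settings table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS user_settings "
        "(name TEXT PRIMARY KEY, value TEXT NOT NULL)"
    )
    return conn


def save_setting(conn: sqlite3.Connection, name: str, value: str) -> None:
    """Insert the setting, overwriting any previous value for the same name."""
    conn.execute(
        "INSERT OR REPLACE INTO user_settings (name, value) VALUES (?, ?)",
        (name, value),
    )
    conn.commit()


def load_settings(conn: sqlite3.Connection) -> Dict[str, str]:
    """Return all stored settings as a plain dictionary."""
    return dict(conn.execute("SELECT name, value FROM user_settings"))


# Demo: write settings, close the connection (a "restart"), then reload.
db_path = os.path.join(tempfile.mkdtemp(), "settings-demo.db")
conn = open_db(db_path)
save_setting(conn, "llm_model", "model-a")
save_setting(conn, "llm_model", "model-b")  # overwrites the earlier value
save_setting(conn, "tts_voice", "voice-1")
conn.close()

conn = open_db(db_path)
restored = load_settings(conn)
conn.close()
```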
Library (Database)
Browse all tables (papers, audio_files, research_sessions, user_settings).
All displayed fields are inline-editable: click a cell to edit, press Enter to commit,
or press Escape to cancel. Hide or show columns with the eye icon. Clicking a paper ID
opens its PDF in the preview panel.
API endpoints (summary)
| Method | Path | Description |
|---|---|---|
| | | Enqueue a PDF conversion job |
| | | Poll job status and log stream |
| | | Cancel a running job |
| | | Enqueue ArXiv research jobs |
| | | Transcribe uploaded audio |
| | | LLM-distil a search query |
| | | List all papers |
| | | Update paper fields |
| | | List all audio files |
| | | Update audio file fields |
| | | List research sessions |
| | | List user settings |
| | | Load persisted configuration |
| | | Save configuration |
Full interactive API docs are available at http://127.0.0.1:8000/docs (Swagger UI) while
the server is running.
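The same endpoint information can be read programmatically. FastAPI applications publish an OpenAPI schema at `/openapi.json` by default; the sketch below assumes audia keeps that default and lists each method, path, and summary from the schema. The function names are illustrative, not part of audia.

```python
import json
from typing import Dict, List, Tuple
from urllib.request import urlopen

HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options"}


def fetch_schema(base_url: str = "http://127.0.0.1:8000") -> Dict:
    """Download the OpenAPI schema (assumes the default /openapi.json URL)."""
    with urlopen(f"{base_url}/openapi.json", timeout=5) as resp:
        return json.load(resp)


def list_endpoints(schema: Dict) -> List[Tuple[str, str, str]]:
    """Return (METHOD, path, summary) rows, sorted by path."""
    rows = []
    for path, item in sorted(schema.get("paths", {}).items()):
        for method, operation in item.items():
            # Path items may also hold non-method keys such as "parameters".
            if method in HTTP_METHODS:
                rows.append((method.upper(), path, operation.get("summary", "")))
    return rows
```

With the server running, `for row in list_endpoints(fetch_schema()): print(*row)` prints one line per endpoint.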