Document Reading
muxd extracts text from PDF, Office, HTML, and structured-text files and passes the content directly to the agent. No plugins or external tools are required: extraction is implemented in pure Go and works on any platform.
Supported Formats
| Format | Extension(s) | Notes |
|---|---|---|
| PDF | .pdf | Text extracted page by page |
| Word | .docx | Paragraphs and tables extracted |
| Excel | .xlsx | Each sheet extracted with its header row |
| PowerPoint | .pptx | Slide text and speaker notes |
| HTML | .html, .htm | Tags stripped, text content only |
| CSV | .csv | Passed through as-is |
| JSON | .json | Passed through as-is |
| XML | .xml | Passed through as-is |
Entry Points
file_read Tool
The file_read tool detects the file format from the extension and applies the matching extractor. You can ask the agent to read any supported file the same way you would a plain text file:
"Read the requirements doc at docs/spec.pdf and summarise the key constraints."
The agent calls file_read with the path; muxd extracts the text and returns it as the tool result.
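Extension-based detection can be sketched as a lookup table. This is a minimal illustration, not muxd's actual code: the names detectFormat, describe, and formatByExt, the format labels, the case-insensitive match, and the plain-text fallback are all assumptions. Only the extensions come from the Supported Formats table.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// formatByExt maps lowercase extensions to a format label.
// The labels are hypothetical; the extensions are from the table above.
var formatByExt = map[string]string{
	".pdf":  "pdf",
	".docx": "word",
	".xlsx": "excel",
	".pptx": "powerpoint",
	".html": "html",
	".htm":  "html",
	".csv":  "passthrough",
	".json": "passthrough",
	".xml":  "passthrough",
}

// detectFormat returns the label for path, or "text" when the extension
// is not listed (an assumed fallback to plain-text reading).
func detectFormat(path string) string {
	ext := strings.ToLower(filepath.Ext(path)) // assumed case-insensitive
	if f, ok := formatByExt[ext]; ok {
		return f
	}
	return "text"
}

// describe renders the decision for one path, for display only.
func describe(path string) string {
	return fmt.Sprintf("%s -> %s", path, detectFormat(path))
}

func main() {
	for _, p := range []string{"docs/spec.pdf", "Report.XLSX", "notes.txt"} {
		fmt.Println(describe(p))
	}
}
```

A table keeps the mapping in one place, so supporting a new extension is a one-line change.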
Chat Attachments
You can attach a document directly in the TUI using the attach shortcut. The file is extracted at attachment time and its text is included in the message you send. This works for all supported formats.
Limits
| Limit | Value |
|---|---|
| Maximum file size | 10 MB |
| Maximum extracted text | 100 000 characters |
A file larger than 10 MB is rejected before extraction begins. If the extracted text exceeds 100 000 characters, it is truncated and a notice is appended to the result.
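The two limits could be enforced roughly as follows. This is a sketch under stated assumptions: it treats 10 MB as 10 × 1024 × 1024 bytes, counts characters as Unicode code points, and uses a made-up notice string. The names checkSize, truncateText, and truncationNotice are hypothetical.

```go
package main

import "fmt"

const (
	maxFileBytes = 10 * 1024 * 1024 // assumes "10 MB" means 10 MiB
	maxTextChars = 100_000
	// Hypothetical wording; the real notice text is not documented here.
	truncationNotice = "\n[truncated: output exceeded 100000 characters]"
)

// checkSize rejects an oversized file before any extraction work starts.
func checkSize(size int64) error {
	if size > maxFileBytes {
		return fmt.Errorf("file is %d bytes; limit is %d", size, maxFileBytes)
	}
	return nil
}

// truncateText caps text at maxTextChars runes and appends the notice
// only when something was actually cut.
func truncateText(text string) string {
	runes := []rune(text)
	if len(runes) <= maxTextChars {
		return text
	}
	return string(runes[:maxTextChars]) + truncationNotice
}

func main() {
	fmt.Println(checkSize(11 * 1024 * 1024))
	fmt.Println(truncateText("short text is returned unchanged"))
}
```

Slicing by rune rather than by byte avoids cutting a multi-byte character in half at the boundary.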
Output Format by Type
PDF pages are separated by a page-break marker, shown here as ---:
Page 1 content here...
---
Page 2 content here...
Excel — each sheet is introduced by a header line, followed by rows as tab-separated values:
Sheet: Q1 Revenue
Product Jan Feb Mar
Widget A 1200 1350 1100
Widget B 800 920 870
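The sheet layout above can be reproduced with a small formatter. This is an illustrative sketch only: formatSheet is a hypothetical name, and the rows are assumed to have already been read into string slices by whatever library parses the workbook.

```go
package main

import (
	"fmt"
	"strings"
)

// formatSheet writes a "Sheet: <name>" header line, then one
// tab-separated line per row. The first row is treated as the header row.
func formatSheet(name string, rows [][]string) string {
	var b strings.Builder
	fmt.Fprintf(&b, "Sheet: %s\n", name)
	for _, row := range rows {
		b.WriteString(strings.Join(row, "\t"))
		b.WriteString("\n")
	}
	return b.String()
}

func main() {
	rows := [][]string{
		{"Product", "Jan", "Feb", "Mar"},
		{"Widget A", "1200", "1350", "1100"},
		{"Widget B", "800", "920", "870"},
	}
	fmt.Print(formatSheet("Q1 Revenue", rows))
}
```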
PowerPoint — each slide is numbered and speaker notes are appended below the slide text:
Slide 1: Introduction
Our roadmap for 2026.
Notes: Mention the acquisition timeline here.
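The slide layout can be sketched the same way. The slide struct and formatSlides function are hypothetical, and omitting the Notes line for slides without speaker notes is an assumption, not documented behaviour.

```go
package main

import (
	"fmt"
	"strings"
)

// slide holds the text already pulled out of one slide (hypothetical type).
type slide struct {
	Title string
	Body  string
	Notes string
}

// formatSlides numbers slides from 1 and appends speaker notes, when
// present, on a "Notes:" line below the slide text.
func formatSlides(slides []slide) string {
	var b strings.Builder
	for i, s := range slides {
		fmt.Fprintf(&b, "Slide %d: %s\n", i+1, s.Title)
		if s.Body != "" {
			b.WriteString(s.Body + "\n")
		}
		if s.Notes != "" {
			b.WriteString("Notes: " + s.Notes + "\n")
		}
	}
	return b.String()
}

func main() {
	fmt.Print(formatSlides([]slide{
		{
			Title: "Introduction",
			Body:  "Our roadmap for 2026.",
			Notes: "Mention the acquisition timeline here.",
		},
	}))
}
```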
HTML — all tags are stripped; only the visible text content is returned.
CSV / JSON / XML — returned verbatim, as plain text.
Platform Notes
Because the extractors are written in pure Go, they work identically on Linux, macOS, and Windows. There is no dependency on LibreOffice, Ghostscript, or any system-level tool.