HTML to Markdown API
Convert any URL into clean, LLM-ready Markdown — perfect for RAG pipelines, AI agents, and documentation migrations.
The Agenty Markdown API converts any URL into clean, LLM-ready Markdown. It strips navigation, ads, and scripts, then preserves headings, lists, tables, code blocks (with language hints), and image alt text. The result is compact, token-efficient Markdown that you can drop straight into a vector store, RAG pipeline, or AI agent context window.
Use it to feed live web content into LLMs without writing custom scrapers, to migrate legacy HTML docs into static site generators, or to archive articles as portable Markdown files.
Features
- LLM-ready output: Compact, token-efficient Markdown for RAG and agents.
- GitHub-flavored Markdown: Tables, task lists, strikethrough, fenced code.
- Code language hints: Detected language is added to fenced code blocks.
- Images with alt text: Image URLs preserved with alt text for accessibility.
- Heading hierarchy: H1–H6 structure preserved for chunking.
- Frontmatter support: Optional YAML frontmatter with title, author, date.
- JS rendering: Renders single-page apps (SPAs) before conversion.
- Bulk conversion: Pass an array of URLs in one request.
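Because the heading hierarchy is preserved, the returned Markdown can be split into chunks along its headings. The following is a minimal sketch of one way to do that locally; the function name `chunk_by_heading` and the `max_level` parameter are illustrative choices, not part of the Agenty API.

```python
from __future__ import annotations

import re

# Matches ATX headings ("# Title" through "###### Title") at the start of a line.
HEADING_RE = re.compile(r"^(#{1,6})\s+(.*)$")
FENCE = "`" * 3  # marker that opens and closes a fenced code block


def chunk_by_heading(markdown: str, max_level: int = 2) -> list[dict]:
    """Start a new chunk at every heading of level <= max_level.

    Deeper headings stay inside their parent chunk, and lines inside
    fenced code blocks are never treated as headings.
    """
    chunks = []
    current = {"heading": None, "level": 0, "lines": []}
    in_fence = False

    def has_content(chunk: dict) -> bool:
        return chunk["heading"] is not None or any(l.strip() for l in chunk["lines"])

    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
        match = None if in_fence else HEADING_RE.match(line)
        if match and len(match.group(1)) <= max_level:
            if has_content(current):
                chunks.append(current)
            current = {
                "heading": match.group(2).strip(),
                "level": len(match.group(1)),
                "lines": [],
            }
        else:
            current["lines"].append(line)

    if has_content(current):
        chunks.append(current)

    return [
        {"heading": c["heading"], "level": c["level"], "text": "\n".join(c["lines"]).strip()}
        for c in chunks
    ]
```

Text that appears before the first heading is kept as a chunk whose `heading` is `None`, so nothing is silently dropped before embedding.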
Use cases
- Feed live web content into LLMs for RAG and AI agents
- Build token-efficient context for ChatGPT, Claude, and Gemini
- Migrate legacy HTML documentation to static site generators
- Archive articles as portable Markdown for offline reading
- Import web content into headless CMS platforms that use Markdown
API examples
Convert a page (cURL):

curl -X POST https://api.agenty.ai/v1/markdown \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/docs"
  }'

Include YAML frontmatter (cURL):

curl -X POST https://api.agenty.ai/v1/markdown \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/blog/post",
    "frontmatter": ["title", "author", "date"]
  }'

Feed the Markdown to an LLM (JavaScript):

import OpenAI from 'openai';

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

// 1. Convert the page to Markdown
const res = await fetch('https://api.agenty.ai/v1/markdown', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({ url: 'https://example.com/docs' }),
});
if (!res.ok) throw new Error(`Markdown API request failed: ${res.status}`);
const { markdown } = await res.json();

// 2. Feed it straight into an LLM as context
const answer = await openai.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [
    { role: 'system', content: 'Answer questions using the provided docs.' },
    { role: 'user', content: `Docs:\n\n${markdown}\n\nQuestion: What does this page describe?` },
  ],
});
console.log(answer.choices[0].message.content);

Chunk on H2 headings for a vector store (Python):

import requests

res = requests.post(
    "https://api.agenty.ai/v1/markdown",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={"url": "https://example.com/docs"},
)
res.raise_for_status()  # stop early on HTTP errors
markdown = res.json()["markdown"]

# Split on H2 headings; each chunk can then be embedded for your vector store
chunks = markdown.split("\n## ")
print(f"{len(chunks)} chunks ready for embedding")

How Agenty compares
| Feature | Agenty | MarkDrop | Html2Markdown | Turndown |
|---|---|---|---|---|
| URL to Markdown (hosted API) | Yes | No | Library only | Library only |
| LLM-ready output | Yes | Partial | No | No |
| Table support | Yes (GFM) | Yes | Limited | Plugin |
| Code language detection | Yes | No | No | No |
| Frontmatter support | Yes | No | No | No |
| Free tier | Yes | Yes | Open source | Open source |
Frequently asked questions
What is the HTML to Markdown API?
The Agenty Markdown API converts any web page into clean, LLM-ready Markdown. It preserves structure (headings, lists, tables), keeps code blocks with language hints, and retains image references — so the output works in both RAG pipelines and static site generators.
Why is this better than raw HTML for LLMs?
Markdown is far more token-efficient than raw HTML — typically 3–5x smaller for the same article — and strips noise like <script>, ads, and inline styles. That means your LLM context window is spent on real content, not markup, which improves both cost and answer quality in RAG and agent workflows.
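To check the size difference on your own pages, you can compare the raw HTML you would otherwise send with the Markdown the API returns. The sketch below is only a measuring aid: `size_ratio` and `rough_token_estimate` are hypothetical helper names, and the default of about four characters per token is a rough rule of thumb for English text, not a real tokenizer.

```python
import math


def size_ratio(html: str, markdown: str) -> float:
    """Return how many times larger the HTML is than the Markdown (UTF-8 bytes)."""
    md_bytes = len(markdown.encode("utf-8"))
    if md_bytes == 0:
        raise ValueError("markdown is empty")
    return len(html.encode("utf-8")) / md_bytes


def rough_token_estimate(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token count; swap in your model's tokenizer for real numbers."""
    return math.ceil(len(text) / chars_per_token)
```

For billing or context-window planning, replace the character heuristic with the tokenizer that matches your model.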
Does it support tables?
Yes. HTML tables are converted to GitHub-flavored Markdown tables. Complex tables with colspan or rowspan are flattened into simple aligned tables.
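The page does not spell out how spanned cells are flattened, so the following is only an illustration of one plausible strategy, not Agenty's actual implementation: a cell with `colspan` N keeps its text in the first column and leaves the remaining N-1 columns empty. The function name `flatten_to_gfm` and the `(text, colspan)` input shape are assumptions made for the example, and `rowspan` is not handled.

```python
from __future__ import annotations


def flatten_to_gfm(rows: list[list[tuple[str, int]]]) -> str:
    """Render rows of (text, colspan) cells as a GFM table.

    The first row is used as the header. Spanned cells are expanded into
    one filled cell followed by empty cells, and short rows are padded.
    """
    if not rows:
        return ""

    expanded = []
    for row in rows:
        cells = []
        for text, colspan in row:
            cells.append(text.replace("|", "\\|"))  # escape pipes inside cells
            cells.extend([""] * (max(colspan, 1) - 1))
        expanded.append(cells)

    # Pad every row to the widest row so the table stays rectangular.
    width = max(len(cells) for cells in expanded)
    for cells in expanded:
        cells.extend([""] * (width - len(cells)))

    def render(cells: list[str]) -> str:
        return "| " + " | ".join(cells) + " |"

    lines = [render(expanded[0]), "|" + "---|" * width]
    lines += [render(cells) for cells in expanded[1:]]
    return "\n".join(lines)
```

Leaving the extra columns empty keeps every row the same width, which GFM tables require, at the cost of losing the visual grouping the span expressed.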
Can I include YAML frontmatter?
Yes. Pass a frontmatter array with field names like "title", "author", "date", and the API prepends YAML frontmatter to the Markdown output — ready for Hugo, Astro, Next.js, or Jekyll.
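If you need the metadata as data rather than as text, you can split the frontmatter off the returned Markdown. This is a minimal sketch that assumes the frontmatter is a block of flat `key: value` lines between two `---` delimiters at the very top of the output; `split_frontmatter` is a hypothetical helper, and nested or multi-line YAML would need a real YAML parser.

```python
from __future__ import annotations


def split_frontmatter(markdown: str) -> tuple[dict[str, str], str]:
    """Return (metadata, body) for Markdown that starts with a '---' block.

    Only flat 'key: value' lines are understood. If no complete
    frontmatter block is found, the input is returned unchanged.
    """
    lines = markdown.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, markdown

    for end in range(1, len(lines)):
        if lines[end].strip() == "---":
            break
    else:
        return {}, markdown  # opening delimiter without a closing one

    meta = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")  # split on the first colon only
        if sep and key.strip():
            meta[key.strip()] = value.strip().strip("\"'")

    body = "\n".join(lines[end + 1:]).lstrip("\n")
    return meta, body
```

A typical use is to store `meta` alongside each embedded chunk so search results can show the title, author, and date of the source page.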
Is there a free tier?
Yes. All new accounts include a free tier. See our pricing page for limits.