HTML to Markdown API

Features

LLM-ready output
Compact, token-efficient Markdown for RAG and agents.
GitHub-flavored Markdown
Tables, task lists, strikethrough, fenced code.
Code language hints
Detected language is added to fenced code blocks.
Images with alt text
Image URLs preserved with alt text for accessibility.
Heading hierarchy
H1–H6 structure preserved for chunking.
Frontmatter support
Optional YAML frontmatter with title, author, date.
JS rendering
Renders SPAs before conversion.
Bulk conversion
Pass an array of URLs in one request.
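
As an illustration of why the code language hints listed above are useful downstream, here is a minimal sketch that pulls fenced code blocks and their language tags out of a Markdown string. The helper name `extract_code_blocks` and the sample text are hypothetical and not part of the API; the sketch only assumes the output uses standard triple-backtick fences.

```python
import re

FENCE = "`" * 3  # built this way so the example can itself sit inside a fenced block

# Opening fence with an optional language tag, the body, then a closing fence.
FENCE_RE = re.compile(
    rf"^{FENCE}([\w+-]*)[ \t]*\n(.*?)^{FENCE}[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def extract_code_blocks(markdown: str) -> list[tuple[str, str]]:
    """Return (language, code) pairs for each fenced block; untagged blocks get ''."""
    return [(lang, code.rstrip("\n")) for lang, code in FENCE_RE.findall(markdown)]


# Hypothetical Markdown, standing in for a converted documentation page.
sample = (
    f"# Setup\n\n{FENCE}bash\nnpm install\n{FENCE}\n\n"
    f"Then run:\n\n{FENCE}python\nprint('hi')\n{FENCE}\n"
)
blocks = extract_code_blocks(sample)
for lang, code in blocks:
    print(f"[{lang}] {code}")
```

With the language tag available, an indexing pipeline could, for example, route code blocks to a different embedding model or store them separately from prose.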

Use cases

Feed live web content into LLMs for RAG and AI agents
Build token-efficient context for ChatGPT, Claude, and Gemini
Migrate legacy HTML documentation to static site generators
Archive articles as portable Markdown for offline reading
Import web content into headless CMS platforms that use Markdown

API examples

Convert a web page to LLM-ready Markdown with cURL

curl -X POST https://api.agenty.ai/v1/markdown \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/docs"
  }'

Convert to Markdown with YAML frontmatter for static sites

curl -X POST https://api.agenty.ai/v1/markdown \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/blog/post",
    "frontmatter": ["title", "author", "date"]
  }'

Fetch LLM-ready Markdown in Node.js and send it to an LLM

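// Assumes the official `openai` npm package; the client reads OPENAI_API_KEY from the environment
import OpenAI from 'openai';
const openai = new OpenAI();

// 1. Convert the page to Markdown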
const res = await fetch('https://api.agenty.ai/v1/markdown', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({ url: 'https://example.com/docs' }),
});
if (!res.ok) throw new Error(`Markdown API request failed: ${res.status}`);
const { markdown } = await res.json();

// 2. Feed it straight into an LLM as context
const answer = await openai.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [
    { role: 'system', content: 'Answer questions using the provided docs.' },
    { role: 'user', content: `Docs:\n\n${markdown}\n\nQuestion: What does this page describe?` },
  ],
});

Convert a URL to Markdown in Python for a RAG pipeline

import requests

res = requests.post(
    "https://api.agenty.ai/v1/markdown",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={"url": "https://example.com/docs"},
    timeout=30,
)
res.raise_for_status()  # fail on 4xx/5xx instead of parsing an error body
markdown = res.json()["markdown"]

# Chunk at H2 headings, then embed for your vector store.
# Note: split() removes the "## " prefix from every chunk after the first.
chunks = markdown.split("\n## ")
print(f"{len(chunks)} chunks ready for embedding")
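
The simple split above discards the heading marker from each chunk. A slightly more careful sketch, assuming the output uses ATX-style headings (`#`, `##`, ...), keeps every heading attached to its section. The helper name `chunk_by_heading` and the sample text are illustrative only.

```python
import re


def chunk_by_heading(markdown: str, level: int = 2) -> list[str]:
    """Split Markdown at headings of the given level, keeping each heading with its body."""
    # Zero-width split point before any line starting with exactly `level` hashes and a space.
    pattern = re.compile(rf"(?m)^(?=#{{{level}}} )")
    return [part.strip() for part in pattern.split(markdown) if part.strip()]


# Hypothetical Markdown, standing in for the API response body.
sample = (
    "# Title\n\nIntro.\n\n"
    "## Install\n\nRun it.\n\n"
    "## Usage\n\n### CLI\n\nCall it.\n"
)
chunks = chunk_by_heading(sample)
for chunk in chunks:
    print(repr(chunk.splitlines()[0]))
```

Deeper headings (H3 and below) stay inside their parent H2 chunk. One known limitation of this sketch: a line inside a fenced code block that happens to start with `## ` would also be treated as a split point.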

How Agenty compares

Feature	Agenty	MarkDrop	Html2Markdown	Turndown
URL to Markdown (hosted API)	Yes	No	Library only	Library only
LLM-ready output	Yes	Partial	No	No
Table support	Yes (GFM)	Yes	Limited	Plugin
Code language detection	Yes	No	No	No
Frontmatter support	Yes	No	No	No
Free tier	Yes	Yes	Open source	Open source

Frequently asked questions

What is the HTML to Markdown API?

The Agenty Markdown API converts any web page into clean, LLM-ready Markdown. It preserves structure (headings, lists, tables), keeps code blocks with their language hints, and retains image references, so the output works in both RAG pipelines and static site generators.

Why is this better than raw HTML for LLMs?

Markdown is far more token-efficient than raw HTML, typically 3–5x smaller for the same article, and the conversion strips noise such as <script> tags, ads, and inline styles. Your LLM's context window is then spent on real content rather than markup, which lowers cost and improves answer quality in RAG and agent workflows.
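
To check the size difference on your own pages, a rough sketch like the following compares byte counts. Byte count is only a proxy for tokens (actual token counts depend on the model's tokenizer), and the helper name `size_reduction` and both sample strings are made up for illustration.

```python
def size_reduction(html: str, markdown: str) -> float:
    """Return how many times smaller the Markdown is than the HTML, by UTF-8 byte count."""
    md_bytes = len(markdown.encode("utf-8"))
    if md_bytes == 0:
        raise ValueError("markdown is empty")
    return len(html.encode("utf-8")) / md_bytes


# Hypothetical inputs: the raw page source and the Markdown returned for it.
html = (
    '<div class="post"><h1 style="color:red">Hello</h1>'
    "<p>World</p><script>track()</script></div>"
)
markdown = "# Hello\n\nWorld"

ratio = size_reduction(html, markdown)
print(f"Markdown is {ratio:.1f}x smaller than the HTML by bytes")
```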

Does it support tables?

Yes. HTML tables are converted to GitHub-flavored Markdown tables. Complex tables with colspan or rowspan are flattened into simple aligned tables.

Can I include YAML frontmatter?

Yes. Pass a frontmatter array of field names, such as "title", "author", and "date", and the API prepends a YAML frontmatter block to the Markdown output, ready for Hugo, Astro, Next.js, or Jekyll.
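
If you need the metadata separately from the body, for example to store it as vector-store metadata, a minimal sketch is shown here. It assumes the frontmatter is delimited by `---` lines and contains only flat `key: value` pairs; the exact format the API emits is not specified on this page, and the helper name `split_frontmatter` and the sample document are hypothetical. For anything beyond flat scalars, a real YAML parser is the safer choice.

```python
def split_frontmatter(markdown: str) -> tuple[dict[str, str], str]:
    """Separate a leading frontmatter block (flat key: value pairs) from the Markdown body."""
    lines = markdown.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, markdown  # no frontmatter present
    for end, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            break
    else:
        return {}, markdown  # opening delimiter was never closed
    meta: dict[str, str] = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:
            # Drop surrounding whitespace and optional quotes around the value.
            meta[key.strip()] = value.strip().strip("\"'")
    body = "\n".join(lines[end + 1:]).lstrip("\n")
    return meta, body


# Hypothetical output, standing in for a response requested with frontmatter fields.
doc = (
    "---\n"
    'title: "Getting Started"\n'
    "author: Jane Doe\n"
    "date: 2024-05-01\n"
    "---\n\n"
    "# Getting Started\n\nWelcome."
)
meta, body = split_frontmatter(doc)
print(meta)
print(body)
```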

Is there a free tier?

Yes. All new accounts include a free tier. See our pricing page for limits.