API Documentation

One endpoint, two free tiers. Everything you need to convert PDFs to markdown.

Quickstart

Convert a PDF in one command. No signup.

$ curl -X POST https://pdftomarkdown.dev/v1/convert \
  -H "Authorization: Bearer demo_public_key" \
  -H "Content-Type: application/json" \
  -d '{"input":{"pdf_url":"https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf"}}'

The demo key demo_public_key works instantly and runs on the Hacker tier: multi-page PDFs are accepted, but only page 1 is processed, and requests are limited to 1 per minute per IP.

Tiers

Both are free. Pick the one that fits.

             Hacker          Developer
Auth         Public key      GitHub login
Pages        Page 1 only     100/month
Rate limit   1/min per IP    None
Watermark    Yes             No

Tier 1 — Hacker

Public demo key, rate-limited to 1 req/min per IP. If you send a multi-page PDF, only page 1 is processed.

curl

$ curl -X POST https://pdftomarkdown.dev/v1/convert \
  -H "Authorization: Bearer demo_public_key" \
  -H "Content-Type: application/json" \
  -d '{"input":{"pdf_url":"https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf"}}'
{
  "markdown": "# Invoice\n\nDate: 2024-01-15\n...",
  "pages": 1,
  "request_id": "req_abc123"
}

Tier 1 responses append a watermark: > Processed by pdfToMarkdown.dev

Successful Hacker-tier responses also include X-PdfToMarkdown-Page-Cap: 1 so you can detect the enforced page cap.
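
A minimal sketch of checking for that header in Python, assuming your HTTP client exposes response headers as a plain mapping of strings. The helper name page_cap is hypothetical and not part of the SDK; names are compared case-insensitively because HTTP header names are case-insensitive.

```python
from typing import Mapping, Optional


def page_cap(headers: Mapping[str, str]) -> Optional[int]:
    """Return the enforced page cap, or None if the header is absent."""
    for name, value in headers.items():
        # HTTP header names are case-insensitive, so normalise before comparing.
        if name.lower() == "x-pdftomarkdown-page-cap":
            return int(value)
    return None


# Example: a Hacker-tier response reports a cap of 1.
cap = page_cap({"X-PdfToMarkdown-Page-Cap": "1"})
if cap is not None:
    print(f"Only the first {cap} page(s) were converted")
```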

Python

from pdftomarkdown import convert

# No API key needed for Tier 1
result = convert("document.pdf")
print(result.markdown)

Without an API key, the SDK uses the public demo key automatically.

Tier 2 — Developer

Sign in with GitHub for a personal key. 100 pages/month, no watermark, multi-page PDFs.

curl

$ curl -X POST https://pdftomarkdown.dev/v1/convert \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"input":{"pdf_url":"https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf"}}'
{
  "markdown": "# Invoice\n\nDate: 2024-01-15\nInvoice #: INV-2024-0042\n\n| Item | Qty | Price |\n|---|---|---|\n| API Pro Plan | 1 | $49.00 |\n\n**Total: $49.00**",
  "pages": 3,
  "request_id": "req_def456"
}

Replace YOUR_API_KEY with the personal key you receive after signing in with GitHub.

Python

from pdftomarkdown import convert

# Uses PDFTOMARKDOWN_API_KEY env var, or pass directly:
result = convert("document.pdf", api_key="YOUR_API_KEY")
print(result.markdown)
print(f"Processed {result.pages} pages")

Python SDK

Official Python client.

Install

$ pip install pdftomarkdown

Set your API key

$ export PDFTOMARKDOWN_API_KEY="YOUR_API_KEY"

Convert a local file

from pdftomarkdown import convert

# Uses PDFTOMARKDOWN_API_KEY env var, or pass directly:
result = convert("document.pdf", api_key="YOUR_API_KEY")
print(result.markdown)
print(f"Processed {result.pages} pages")

Convert from URL

from pdftomarkdown import convert

# Convert a PDF from a URL
result = convert(url="https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf")
print(result.markdown)

API Reference

POST /v1/convert

Request body · application/json

Field              Type     Required  Description
input.pdf_url      string   Yes*      Public URL of a PDF to fetch and convert
input.pdf_base64   string   Yes*      Base64-encoded PDF bytes
input.include_raw  boolean  No        Add one extra raw field for debugging without changing the standard success fields
input.max_pages    integer  No        Optional page cap. Hacker tier always overrides this to 1, so only page 1 is processed.

* Provide exactly one of input.pdf_url or input.pdf_base64.
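
The "exactly one of" rule can be enforced on the client before sending anything. The sketch below builds the request body from the fields in the table; build_payload is a hypothetical helper, not an SDK function, and it assumes the standard Base64 alphabet since the reference does not name a variant.

```python
import base64
from typing import Optional


def build_payload(pdf_url: Optional[str] = None,
                  pdf_bytes: Optional[bytes] = None,
                  max_pages: Optional[int] = None,
                  include_raw: bool = False) -> dict:
    """Build the JSON body for POST /v1/convert (hypothetical helper)."""
    # Exactly one source must be given: reject both-missing and both-present.
    if (pdf_url is None) == (pdf_bytes is None):
        raise ValueError("Provide exactly one of pdf_url or pdf_bytes")

    inner: dict = {}
    if pdf_url is not None:
        inner["pdf_url"] = pdf_url
    else:
        # Assumption: standard Base64 alphabet, sent as an ASCII string.
        inner["pdf_base64"] = base64.b64encode(pdf_bytes).decode("ascii")

    # Optional fields are only sent when explicitly requested.
    if max_pages is not None:
        inner["max_pages"] = max_pages
    if include_raw:
        inner["include_raw"] = True
    return {"input": inner}
```

Passing both a URL and bytes, or neither, raises ValueError locally instead of costing a request.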

Headers

Authorization  Bearer <api_key>
Content-Type   application/json
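
If you would rather not use the SDK, the same request can be assembled with the Python standard library. This is a sketch only: build_request is a hypothetical name, and it constructs the request without sending it.

```python
import json
import urllib.request

API_URL = "https://pdftomarkdown.dev/v1/convert"


def build_request(pdf_url: str,
                  api_key: str = "demo_public_key") -> urllib.request.Request:
    """Assemble a POST /v1/convert request with both required headers."""
    body = json.dumps({"input": {"pdf_url": pdf_url}}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
```

To send it, pass the result to urllib.request.urlopen and decode the response body with json.load.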

Response · application/json

Field       Type     Description
markdown    string   Converted markdown text
pages       integer  Pages processed
request_id  string   Unique ID for debugging

Successful responses always include markdown, pages, and request_id. Set input.include_raw to true only if you also want a fourth raw field for debugging.

Hacker-tier successes also return X-PdfToMarkdown-Page-Cap: 1; if the source PDF has multiple pages, only page 1 is converted and pages returns 1.

Errors

Error responses always include error, message, and request_id. Any 429 also includes retry_after_seconds and the same value in the Retry-After header.

422 Bad input

{
  "error": "unprocessable_document",
  "message": "The PDF could not be opened or parsed.",
  "request_id": "req_pdf321"
}

401 Unauthorized

{
  "error": "missing_api_key",
  "message": "Send Authorization: Bearer <api_key>.",
  "request_id": "req_auth789"
}

429 Rate limited

{
  "error": "rate_limited",
  "message": "The public Hacker tier allows 1 request per IP every 60 seconds.",
  "request_id": "req_rate123",
  "retry_after_seconds": 60
}

429 Quota exceeded

{
  "error": "quota_exceeded",
  "message": "Your monthly page quota is exhausted. Retry after the monthly reset.",
  "request_id": "req_quota456",
  "retry_after_seconds": 2678400,
  "reset_at": "2026-04-01T00:00:00.000Z"
}

Quota exhaustion also returns reset_at so you can surface the exact monthly reset time.
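
A sketch of reading those fields from a parsed error body. It assumes the JSON shapes shown in the 429 examples; retry_delay and reset_time are hypothetical helper names.

```python
from datetime import datetime, timezone
from typing import Optional


def retry_delay(error_body: dict) -> Optional[int]:
    """Seconds to wait before retrying, or None if this is not a 429 error body."""
    if error_body.get("error") not in ("rate_limited", "quota_exceeded"):
        return None
    return int(error_body["retry_after_seconds"])


def reset_time(error_body: dict) -> Optional[datetime]:
    """Parse reset_at into an aware datetime; None when the field is absent."""
    raw = error_body.get("reset_at")
    if raw is None:
        return None
    # datetime.fromisoformat rejects a trailing "Z" before Python 3.11,
    # so rewrite it as an explicit UTC offset first.
    return datetime.fromisoformat(raw.replace("Z", "+00:00"))


quota_error = {
    "error": "quota_exceeded",
    "request_id": "req_quota456",
    "retry_after_seconds": 2678400,
    "reset_at": "2026-04-01T00:00:00.000Z",
}
print(retry_delay(quota_error), reset_time(quota_error))
```

For rate_limited errors only retry_delay returns a value, since reset_at is documented for quota exhaustion alone.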

Code  Meaning               When
200   Success               PDF converted
400   Bad Request           Request payload rejected upstream
401   Unauthorized          Missing or invalid API key
422   Unprocessable Entity  Invalid source URL, TLS failure, unreachable source PDF, or unreadable PDF
429   Too Many Requests     rate_limited or quota_exceeded; both include Retry-After
502   Bad Gateway           Upstream worker unreachable, invalid, or failed while processing
504   Gateway Timeout       Upstream worker timed out
500   Server Error          Internal — include request_id when reporting
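
One way a client might act on these codes is sketched below. The action labels are hypothetical, and treating 502 and 504 as possibly transient is an assumption based on common HTTP practice, not something the API states.

```python
def classify(status: int) -> str:
    """Map a status code from the table to a suggested client action."""
    if status == 200:
        return "ok"
    if status == 429:
        # Both 429 variants carry Retry-After, so wait that long first.
        return "wait_and_retry"
    if status in (502, 504):
        # Assumption: upstream worker failures may be transient.
        return "maybe_retry"
    if status in (400, 401, 422):
        # The request, key, or source PDF needs changing before resending.
        return "fix_request"
    # 500 and any code not listed: report it along with request_id.
    return "report"


for code in (200, 401, 429, 504, 500):
    print(code, classify(code))
```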

Ready to start?

Try the demo key or sign in with GitHub for 100 pages/month.