PDF to JSON API
Convert business PDFs into structured JSON. Send a PDF to the API and receive extracted fields ready for dashboards, databases, automations, and internal tools. Built for receipts, invoices, and business documents.
Start with 20 documents/month and full API access. No credit card required.
PDF in, JSON out
One API call replaces an entire document processing pipeline. Send your PDF file. Receive structured JSON with the fields your application needs — merchants, totals, dates, currencies, IDs, and line items.
Supported document data
Fields extracted from receipts, invoices, and business PDFs
Request
curl -X POST https://docuparseapi.com/api/v1/extract \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@document.pdf"
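The same request can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the endpoint, the `file` form field, and the Bearer header come from the cURL example above, while the helper names (`build_multipart`, `build_extract_request`, `extract_pdf`) and the `DOCUPARSE_API_KEY` environment variable are hypothetical.

```python
import json
import os
import urllib.request
import uuid

# Endpoint taken from the cURL example above.
EXTRACT_URL = "https://docuparseapi.com/api/v1/extract"


def build_multipart(field_name, filename, payload, content_type="application/pdf"):
    """Encode one file as multipart/form-data; returns (body, Content-Type header)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def build_extract_request(pdf_bytes, filename, api_key):
    """Build (but do not send) the POST request for the extract endpoint."""
    body, content_type = build_multipart("file", filename, pdf_bytes)
    return urllib.request.Request(
        EXTRACT_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": content_type,
        },
    )


def extract_pdf(path, api_key, timeout=30):
    """Upload the PDF at `path` and return the decoded JSON response."""
    with open(path, "rb") as fh:
        request = build_extract_request(fh.read(), os.path.basename(path), api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example usage (performs a real network call, so it is left commented out).
# DOCUPARSE_API_KEY is a hypothetical environment variable name.
# result = extract_pdf("document.pdf", os.environ["DOCUPARSE_API_KEY"])
# print(result["merchant"], result["total"], result["currency"])
```

Keeping request construction separate from sending makes it possible to inspect the headers and body without touching the network.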
Response
{
  "success": true,
  "document_type": "receipt",
  "merchant": "Example Supplies Co.",
  "date": "2026-05-10",
  "total": "198.40",
  "subtotal": "180.00",
  "tax": "18.40",
  "currency": "USD",
  "receipt_id": "REC-88821",
  "line_items": [
    {
      "description": "Office supplies bundle",
      "quantity": 2,
      "amount": "90.00"
    }
  ],
  "extraction_source": "rule"
}
Common PDF extraction use cases
Built for real workflows, not demo apps. Use extracted JSON inside your product, dashboard, automation, or backend system.
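As a rough sketch of what using the extracted JSON might look like in a backend, the snippet below turns a response shaped like the sample above into a typed record. The `Receipt` class and helper names are hypothetical, and raising on `"success": false` is an assumption, since the shape of a failed response is not shown here. Amounts arrive as strings in the sample, so they are parsed with `Decimal` rather than `float` to avoid rounding drift.

```python
import datetime
import json
from dataclasses import dataclass
from decimal import Decimal

# Trimmed copy of the sample response shown above.
SAMPLE_RESPONSE = """
{
  "success": true,
  "document_type": "receipt",
  "merchant": "Example Supplies Co.",
  "date": "2026-05-10",
  "total": "198.40",
  "subtotal": "180.00",
  "tax": "18.40",
  "currency": "USD"
}
"""


@dataclass
class Receipt:
    """Hypothetical typed view of the fields an application might need."""
    merchant: str
    date: datetime.date
    currency: str
    subtotal: Decimal
    tax: Decimal
    total: Decimal


def parse_receipt(payload):
    """Convert a decoded response dict into a Receipt."""
    # Assumption: an unsuccessful extraction is signalled by a falsy "success".
    if not payload.get("success"):
        raise ValueError("extraction did not succeed")
    return Receipt(
        merchant=payload["merchant"],
        date=datetime.date.fromisoformat(payload["date"]),
        currency=payload["currency"],
        subtotal=Decimal(payload["subtotal"]),
        tax=Decimal(payload["tax"]),
        total=Decimal(payload["total"]),
    )


def totals_consistent(receipt):
    """Sanity check: subtotal plus tax should equal the stated total."""
    return receipt.subtotal + receipt.tax == receipt.total


receipt = parse_receipt(json.loads(SAMPLE_RESPONSE))
```

A check like `totals_consistent` is one inexpensive way to flag documents for manual review before they reach downstream systems.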
Expense tracking
Parse receipt PDFs into expense entries without manual data entry.
Invoice automation
Extract invoice data from PDF files to drive approval and payment workflows.
Internal admin dashboards
Process business document uploads and display structured records.
SaaS document ingestion
Accept PDF uploads from your users and store extracted JSON in your database.
Bookkeeping workflows
Map extracted document fields to accounting software entries automatically.
Receipt and payment records
Archive payment records as structured data indexed by merchant, date, and amount.
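Several of the use cases above come down to storing the extracted JSON and looking it up later. The following is a minimal sketch using SQLite from the Python standard library; the table name, column layout, and function names are all hypothetical choices for illustration, and the field names come from the sample response on this page.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a small document store indexed by merchant, date, and total."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS documents (
            id INTEGER PRIMARY KEY,
            document_type TEXT,
            merchant TEXT,
            date TEXT,
            total TEXT,      -- kept as text to preserve the exact decimal string
            currency TEXT,
            raw_json TEXT NOT NULL
        )
        """
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_documents_lookup "
        "ON documents (merchant, date, total)"
    )
    return conn


def save_document(conn, payload):
    """Insert one extracted document; the full JSON is kept alongside the indexed fields."""
    cursor = conn.execute(
        "INSERT INTO documents "
        "(document_type, merchant, date, total, currency, raw_json) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (
            payload.get("document_type"),
            payload.get("merchant"),
            payload.get("date"),
            payload.get("total"),
            payload.get("currency"),
            json.dumps(payload),
        ),
    )
    conn.commit()
    return cursor.lastrowid


def find_by_merchant(conn, merchant):
    """Return (date, total, currency) rows for one merchant, oldest first."""
    rows = conn.execute(
        "SELECT date, total, currency FROM documents "
        "WHERE merchant = ? ORDER BY date",
        (merchant,),
    )
    return rows.fetchall()
```

Storing the raw JSON next to the indexed columns means fields such as `line_items` remain available without changing the schema.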
Why structured extraction beats raw text extraction
Raw OCR gives you text. DocuParse API returns named fields your application can use directly, with no parser to write.
| Feature | Raw OCR | Manual entry | DocuParse API |
|---|---|---|---|
| Returns structured JSON | |||
| Extracts totals and dates | |||
| Extracts merchant / vendor | |||
| Supports API workflows | |||
| Useful for automation | |||
| Reduces manual review | |||
Frequently asked questions
What is a PDF to JSON API?
A PDF to JSON API accepts PDF files and returns structured JSON with named fields extracted from the document content — totals, dates, merchants, IDs, currencies, and line items — rather than raw text.
Can it parse receipts and invoices?
Yes. DocuParse API is built for receipts, invoices, and business documents. It extracts named fields specific to these document types.
Does it return raw text or structured fields?
Structured fields. You receive a JSON object with named properties like merchant, total, date, currency, and line_items — not a wall of raw text to parse yourself.
Can I use it in Node.js or Python?
Yes. The API is a standard REST endpoint. Any HTTP client works. The docs include copy-paste examples for cURL, Node.js, and Python.
Can it process scanned PDFs?
Yes. DocuParse API handles scanned PDFs. The extraction runs automatically regardless of scan quality and returns the same named fields.
How do I authenticate requests?
Use an API key in the Authorization header as a Bearer token. Generate your API key from the dashboard after signing up.
Related pages
Receipt OCR API
Purpose-built receipt parsing with structured JSON output.
Invoice Parser API
Extract invoice IDs, due dates, vendors, and line items.
Receipt to JSON
Convert receipt files directly to structured JSON.
Invoice to JSON
Convert invoice PDFs to structured JSON data.
Security & Data Handling
How DocuParse API handles your uploaded PDFs and documents.
Developer Resources
Quickstart, code examples, and API reference.
Start extracting from PDFs today
Start with 20 documents/month and full API access. No credit card required.