PDF Extraction & Parsing

PDF to JSON API

Convert business PDFs into structured JSON. Send a PDF to the API and receive extracted fields ready for dashboards, databases, automations, and internal tools. Built for receipts, invoices, and business documents.

Start with 20 documents/month and full API access. No credit card required.

Simple by design

PDF in, JSON out

One API call replaces an entire document processing pipeline. Send your PDF file. Receive structured JSON with the fields your application needs — merchants, totals, dates, currencies, IDs, and line items.

Standard REST API — works with any HTTP client
Structured JSON — ready for your database
Handles difficult or scanned documents automatically

Flow

1. Upload PDF file: POST /api/v1/extract with your file
2. Extraction runs: fields detected and normalized
3. Receive JSON: structured response with named fields

Supported document data

Fields extracted from receipts, invoices, and business PDFs

Totals and subtotals: Final and pre-tax amounts
Tax amount: Tax charged on the document
Dates: Transaction and due dates
Merchant / vendor: Company or store name
Invoice ID: Invoice reference number
Receipt ID: Receipt identifier
Currency: ISO 3-letter currency code
Line items: Individual line entries
Field coverage: Per-field coverage

Request

curl -X POST https://docuparseapi.com/api/v1/extract \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@document.pdf"
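
The same request can be prepared from Python. The sketch below uses only the standard library and mirrors the cURL call above: one multipart form field named file, plus the Bearer header. The function name build_extract_request and the explicit application/pdf part type are illustrative assumptions, not part of the documented API.

```python
import uuid
import urllib.request

# Endpoint taken from the cURL example above.
API_URL = "https://docuparseapi.com/api/v1/extract"


def build_extract_request(pdf_bytes: bytes, api_key: str,
                          filename: str = "document.pdf") -> urllib.request.Request:
    """Prepare (but do not send) a multipart/form-data POST carrying one PDF.

    Hypothetical helper: it reproduces `-F "file=@document.pdf"` by hand.
    """
    boundary = uuid.uuid4().hex  # random separator between form parts
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/pdf\r\n\r\n",
        pdf_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

To send it, pass the prepared request to urllib.request.urlopen(request, timeout=30) and decode the response body with json.load. Building the request separately from sending it keeps the upload logic easy to test without network access.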

Response

{
  "success": true,
  "document_type": "receipt",
  "merchant": "Example Supplies Co.",
  "date": "2026-05-10",
  "total": "198.40",
  "subtotal": "180.00",
  "tax": "18.40",
  "currency": "USD",
  "receipt_id": "REC-88821",
  "line_items": [
    {
      "description": "Office supplies bundle",
      "quantity": 2,
      "amount": "90.00"
    }
  ],
  "extraction_source": "rule"
}
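
Amounts in the sample response arrive as strings, so a client will usually want to convert them before doing arithmetic. The following is a minimal sketch, assuming every field in the sample is present; real responses may omit fields, and the names Extraction, LineItem, parse_extraction, and totals_consistent are hypothetical.

```python
import json
from dataclasses import dataclass
from datetime import date
from decimal import Decimal

# The sample response shown above, trimmed to the fields used here.
SAMPLE_RESPONSE = """
{
  "success": true,
  "document_type": "receipt",
  "merchant": "Example Supplies Co.",
  "date": "2026-05-10",
  "total": "198.40",
  "subtotal": "180.00",
  "tax": "18.40",
  "currency": "USD",
  "line_items": [
    {"description": "Office supplies bundle", "quantity": 2, "amount": "90.00"}
  ]
}
"""


@dataclass
class LineItem:
    description: str
    quantity: int
    amount: Decimal


@dataclass
class Extraction:
    document_type: str
    merchant: str
    date: date
    total: Decimal
    subtotal: Decimal
    tax: Decimal
    currency: str
    line_items: list


def parse_extraction(payload: dict) -> Extraction:
    """Convert a decoded response into typed values (assumes all fields exist)."""
    if not payload.get("success"):
        raise ValueError("extraction did not succeed")
    return Extraction(
        document_type=payload["document_type"],
        merchant=payload["merchant"],
        date=date.fromisoformat(payload["date"]),
        # Decimal keeps "198.40" exact; float would introduce rounding error.
        total=Decimal(payload["total"]),
        subtotal=Decimal(payload["subtotal"]),
        tax=Decimal(payload["tax"]),
        currency=payload["currency"],
        line_items=[
            LineItem(item["description"], item["quantity"], Decimal(item["amount"]))
            for item in payload["line_items"]
        ],
    )


def totals_consistent(doc: Extraction) -> bool:
    """Sanity check: subtotal plus tax should equal the total."""
    return doc.subtotal + doc.tax == doc.total
```

Calling parse_extraction(json.loads(SAMPLE_RESPONSE)) yields typed values, and totals_consistent gives a cheap check that can route mismatched documents to manual review.
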
Common use cases

Common PDF extraction use cases

Built for real workflows, not demo apps. Use extracted JSON inside your product, dashboard, automation, or backend system.

Expense tracking

Parse receipt PDFs into expense entries without manual data entry.

Invoice automation

Extract invoice data from PDF files to drive approval and payment workflows.

Internal admin dashboards

Process business document uploads and display structured records.

SaaS document ingestion

Accept PDF uploads from your users and store extracted JSON in your database.

Bookkeeping workflows

Map extracted document fields to accounting software entries automatically.

Receipt and payment records

Archive payment records as structured data indexed by merchant, date, and amount.
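
As an illustration of the storage use cases above, here is one way extracted JSON might be archived and looked up by merchant and date. It is a sketch using SQLite from the Python standard library; the table name, column names, and helper functions are all assumptions made for this example.

```python
import json
import sqlite3


def open_archive(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a small archive database. Schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS documents (
               id INTEGER PRIMARY KEY,
               merchant TEXT,
               doc_date TEXT,          -- ISO date string, sorts chronologically
               total TEXT,             -- kept as the exact string from the API
               currency TEXT,
               raw_json TEXT NOT NULL  -- full response for later reprocessing
           )"""
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_documents_merchant_date "
        "ON documents (merchant, doc_date)"
    )
    return conn


def archive(conn: sqlite3.Connection, payload: dict) -> int:
    """Store one extraction result and return its row id."""
    cur = conn.execute(
        "INSERT INTO documents (merchant, doc_date, total, currency, raw_json) "
        "VALUES (?, ?, ?, ?, ?)",
        (
            payload.get("merchant"),
            payload.get("date"),
            payload.get("total"),
            payload.get("currency"),
            json.dumps(payload),
        ),
    )
    conn.commit()
    return cur.lastrowid


def find_by_merchant(conn: sqlite3.Connection, merchant: str, since: str) -> list:
    """Return (date, total, currency) rows for a merchant on or after `since`."""
    return conn.execute(
        "SELECT doc_date, total, currency FROM documents "
        "WHERE merchant = ? AND doc_date >= ? ORDER BY doc_date",
        (merchant, since),
    ).fetchall()
```

Keeping the raw JSON next to the indexed columns means new fields can be backfilled later without re-uploading the PDF. Range queries on amount would need a numeric column, which this sketch leaves out because the right precision depends on the currency.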

Why structured extraction beats raw text extraction

Raw OCR gives you text. DocuParse API returns named fields your application can use directly, without writing a parser.

Feature | Raw OCR | Manual entry | DocuParse API
Returns structured JSON
Extracts totals and dates
Extracts merchant / vendor
Supports API workflows
Useful for automation
Reduces manual review

Frequently asked questions

What is a PDF to JSON API?

A PDF to JSON API accepts PDF files and returns structured JSON with named fields extracted from the document content — totals, dates, merchants, IDs, currencies, and line items — rather than raw text.

Can it parse receipts and invoices?

Yes. DocuParse API is built for receipts, invoices, and business documents. It extracts named fields specific to these document types.

Does it return raw text or structured fields?

Structured fields. You receive a JSON object with named properties like merchant, total, date, currency, and line_items — not a wall of raw text to parse yourself.

Can I use it in Node.js or Python?

Yes. The API is a standard REST endpoint. Any HTTP client works. The docs include copy-paste examples for cURL, Node.js, and Python.

Can it process scanned PDFs?

Yes. DocuParse API handles scanned PDFs. The extraction runs automatically regardless of scan quality and returns the same named fields.

How do I authenticate requests?

Use an API key in the Authorization header as a Bearer token. Generate your API key from the dashboard after signing up.
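
One common way to keep the key out of source code is to read it from the environment. A minimal sketch follows, assuming a variable named DOCUPARSE_API_KEY; that name and the auth_headers helper are hypothetical choices, not something the API requires.

```python
import os


def auth_headers(env_var: str = "DOCUPARSE_API_KEY") -> dict:
    """Build the Authorization header from an environment variable.

    The variable name is an assumption made for this example.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set {env_var} to the API key from your dashboard")
    return {"Authorization": f"Bearer {api_key}"}
```

The returned dictionary can be passed as the headers argument of whichever HTTP client you use.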

Start extracting from PDFs today

Start with 20 documents/month and full API access. No credit card required.