PDF Extraction & Parsing

PDF to JSON API

Convert business PDFs into structured JSON. Send a PDF to the API and receive extracted fields ready for dashboards, databases, automations, and internal tools. Built for receipts, invoices, and business documents.


Start with 20 documents/month and full API access. No credit card required.

Simple by design

PDF in, JSON out

One API call replaces an entire document processing pipeline. Send your PDF file. Receive structured JSON with the fields your application needs — merchants, totals, dates, currencies, IDs, and line items.

Standard REST API — works with any HTTP client

Structured JSON — ready for your database

Handles difficult or scanned documents automatically

Flow

Upload PDF file

POST /api/v1/extract with your file

Extraction runs

Fields detected and normalized

Receive JSON

Structured response with named fields

Supported document data

Fields extracted from receipts, invoices, and business PDFs

Totals and subtotals

Final and pre-tax amounts

Tax amount

Tax charged on the document

Dates

Transaction and due dates

Merchant / vendor

Company or store name

Invoice ID

Invoice reference number

Receipt ID

Receipt identifier

Currency

ISO 3-letter currency code

Line items

Individual line entries

Field coverage

Per-field coverage

Request

curl -X POST https://docuparseapi.com/api/v1/extract \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@document.pdf"
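
The same request can be sketched in Python using only the standard library. This is a minimal illustration, not official client code: the endpoint, the `file` form field, and the Bearer header come from the cURL example above, while the helper names `build_multipart` and `extract_pdf` and the 60-second timeout are assumptions made for this sketch.

```python
import json
import os
import urllib.request
import uuid

# Endpoint taken from the cURL example above.
API_URL = "https://docuparseapi.com/api/v1/extract"


def build_multipart(field_name, filename, file_bytes, content_type="application/pdf"):
    """Encode a single file as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + file_bytes + tail, f"multipart/form-data; boundary={boundary}"


def extract_pdf(path, api_key, timeout=60):
    """POST a PDF to the extract endpoint and return the decoded JSON response.

    Hypothetical helper: the timeout value is an assumption, not a documented limit.
    """
    with open(path, "rb") as fh:
        body, content_type = build_multipart("file", os.path.basename(path), fh.read())
    request = urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": content_type,
        },
    )
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `extract_pdf("document.pdf", "YOUR_API_KEY")` would send the file and return the parsed JSON as a dictionary. Any HTTP library with multipart support can replace the manual encoding shown here.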

Response

{
  "success": true,
  "document_type": "receipt",
  "merchant": "Example Supplies Co.",
  "date": "2026-05-10",
  "total": "198.40",
  "subtotal": "180.00",
  "tax": "18.40",
  "currency": "USD",
  "receipt_id": "REC-88821",
  "line_items": [
    {
      "description": "Office supplies bundle",
      "quantity": 2,
      "amount": "90.00"
    }
  ],
  "extraction_source": "rule"
}
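
Monetary fields in the sample response are strings, so a consumer may want to parse them as decimals rather than floats. The sketch below checks that subtotal plus tax equals the total. The field names come from the sample response; the helper names `parse_amounts` and `totals_consistent`, and the idea that a field might be missing, are assumptions for illustration.

```python
import json
from decimal import Decimal

# Trimmed copy of the sample response shown above.
SAMPLE_RESPONSE = """
{
  "success": true,
  "document_type": "receipt",
  "merchant": "Example Supplies Co.",
  "date": "2026-05-10",
  "total": "198.40",
  "subtotal": "180.00",
  "tax": "18.40",
  "currency": "USD"
}
"""

AMOUNT_FIELDS = ("total", "subtotal", "tax")


def parse_amounts(doc):
    """Convert the string amounts that are present into Decimal values."""
    return {
        key: Decimal(doc[key])
        for key in AMOUNT_FIELDS
        if doc.get(key) is not None
    }


def totals_consistent(doc):
    """Return True/False when all three amounts exist, None when any is missing."""
    amounts = parse_amounts(doc)
    if len(amounts) < len(AMOUNT_FIELDS):
        return None  # assumption: absent fields mean the check cannot run
    return amounts["subtotal"] + amounts["tax"] == amounts["total"]


doc = json.loads(SAMPLE_RESPONSE)
print(totals_consistent(doc))
```

Using `Decimal` avoids binary floating-point rounding, which matters when extracted totals feed accounting or payment records.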

Common use cases

Common PDF extraction use cases

Built for real workflows, not demo apps. Use extracted JSON inside your product, dashboard, automation, or backend system.

Expense tracking

Parse receipt PDFs into expense entries without manual data entry.

Invoice automation

Extract invoice data from PDF files to drive approval and payment workflows.

Internal admin dashboards

Process business document uploads and display structured records.

SaaS document ingestion

Accept PDF uploads from your users and store extracted JSON in your database.

Bookkeeping workflows

Map extracted document fields to accounting software entries automatically.

Receipt and payment records

Archive payment records as structured data indexed by merchant, date, and amount.
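
As one way to picture the storage use cases above, the following sketch saves an extracted document in SQLite and indexes it by merchant, date, and amount. The table layout, column names, and helper functions are all hypothetical; only the JSON field names (`merchant`, `date`, `total`, `currency`) come from the sample response.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a SQLite database with a hypothetical documents table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS documents (
            id INTEGER PRIMARY KEY,
            merchant TEXT,
            doc_date TEXT,
            total TEXT,          -- kept as text to preserve the exact decimal string
            currency TEXT,
            raw_json TEXT NOT NULL
        )
        """
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_documents_lookup "
        "ON documents (merchant, doc_date, total)"
    )
    return conn


def save_document(conn, doc):
    """Insert one extracted document and return its row id."""
    cursor = conn.execute(
        "INSERT INTO documents (merchant, doc_date, total, currency, raw_json) "
        "VALUES (?, ?, ?, ?, ?)",
        (
            doc.get("merchant"),
            doc.get("date"),
            doc.get("total"),
            doc.get("currency"),
            json.dumps(doc),
        ),
    )
    conn.commit()
    return cursor.lastrowid


def find_by_merchant(conn, merchant):
    """Return the stored JSON documents for a merchant, oldest first."""
    rows = conn.execute(
        "SELECT raw_json FROM documents WHERE merchant = ? ORDER BY doc_date",
        (merchant,),
    ).fetchall()
    return [json.loads(row[0]) for row in rows]
```

Keeping the full response in `raw_json` alongside a few indexed columns means fields such as `line_items` remain available without a schema change.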

Why structured extraction beats raw text extraction

Raw OCR gives you text. DocuParse API returns named fields your application can use directly, without writing a parser.

Feature	Raw OCR	Manual entry	DocuParse API
Returns structured JSON
Extracts totals and dates
Extracts merchant / vendor
Supports API workflows
Useful for automation
Reduces manual review

Frequently asked questions

What is a PDF to JSON API?

A PDF to JSON API accepts PDF files and returns structured JSON with named fields extracted from the document content — totals, dates, merchants, IDs, currencies, and line items — rather than raw text.

Can it parse receipts and invoices?

Yes. DocuParse API is built for receipts, invoices, and business documents. It extracts named fields specific to these document types.

Does it return raw text or structured fields?

Structured fields. You receive a JSON object with named properties like merchant, total, date, currency, and line_items — not a wall of raw text to parse yourself.

Can I use it in Node.js or Python?

Yes. The API is a standard REST endpoint. Any HTTP client works. The docs include copy-paste examples for cURL, Node.js, and Python.

Can it process scanned PDFs?

Yes. DocuParse API handles scanned PDFs. The extraction runs automatically regardless of scan quality and returns the same named fields.

How do I authenticate requests?

Use an API key in the Authorization header as a Bearer token. Generate your API key from the dashboard after signing up.
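
A small sketch of building that header in Python, reading the key from an environment variable so it stays out of source code. The variable name `DOCUPARSE_API_KEY` and the helper `auth_headers` are assumptions for this example, not names defined by the API.

```python
import os


def auth_headers(env_var="DOCUPARSE_API_KEY"):
    """Build the Authorization header from an environment variable.

    The variable name is hypothetical; use whatever your deployment defines.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set {env_var} to your API key")
    return {"Authorization": f"Bearer {api_key}"}
```

The returned dictionary can be passed as the headers argument of most HTTP clients.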

Start extracting from PDFs today