DocuParse API Python Integration

Use the DocuParse API from Python to extract structured JSON from receipts, invoices, and PDFs. The API is a standard REST endpoint, so no SDK is required. Use the requests library to upload files and receive named fields.

Prerequisites

Python 3.8 or later
A DocuParse API key (sign up for a free account to generate one)
The requests library

Install requests:

pip install requests

API key security: Store your API key in an environment variable (e.g. DOCUPARSE_API_KEY), not in source code or version control.
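
A minimal sketch of reading the key at startup and failing early when it is missing. The helper name load_api_key is hypothetical and not part of the API; only the DOCUPARSE_API_KEY variable name comes from the note above.

```python
import os


def load_api_key(env_var: str = "DOCUPARSE_API_KEY") -> str:
    """Return the API key from the environment, or raise if it is unset or blank."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable before calling the API")
    return key
```

Failing at startup avoids sending a request with an empty or "None" bearer token and getting a 401 back later.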

Basic usage

Upload a receipt, invoice, or PDF file using a multipart POST request and receive structured JSON with named extraction fields.

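import os

import requests

API_KEY = os.environ["DOCUPARSE_API_KEY"]  # raises KeyError if unset; keep the key out of source
API_URL = "https://docuparseapi.com/api/v1/extract"

def extract_document(file_path: str) -> dict:
    """Upload one file as multipart form data and return the parsed JSON body."""
    with open(file_path, "rb") as f:
        response = requests.post(
            API_URL,
            headers={"Authorization": f"Bearer {API_KEY}"},
            files={"file": f},
            timeout=30,  # requests does not time out by default
        )
    response.raise_for_status()  # raises requests.HTTPError on 4xx/5xx
    return response.json()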

# Usage
result = extract_document("receipt.pdf")
print(result["merchant"])   # "Office Depot"
print(result["total"])      # "45.50"
print(result["date"])       # "2026-04-26"

Response

The API returns a JSON object with named fields. All fields are present in the response; fields not found in the document return null.

{
  "success": true,
  "document_type": "receipt",
  "merchant": "Office Depot",
  "date": "2026-04-26",
  "total": "45.50",
  "subtotal": "42.00",
  "tax": "3.50",
  "currency": "USD",
  "receipt_id": "R-10492",
  "payment_method": "Card",
  "line_items": [
    { "description": "Notebook", "quantity": 2, "amount": "20.00" }
  ]
}
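
Because every field is present and missing values come back as null, a key-existence check will not tell you what the document lacked; the value itself has to be tested. A minimal sketch of that check, using a hypothetical helper named missing_fields and a shortened sample in which two fields are set to null for illustration:

```python
import json


def missing_fields(data: dict) -> list:
    """Return the names of top-level fields whose value is null (None in Python)."""
    return sorted(name for name, value in data.items() if value is None)


# Shortened sample; receipt_id and payment_method are nulled for illustration
sample = json.loads(
    '{"merchant": "Office Depot", "total": "45.50", '
    '"receipt_id": null, "payment_method": null}'
)
print(missing_fields(sample))  # ['payment_method', 'receipt_id']
```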

Error handling

The API uses standard HTTP status codes. Handle common errors in your Python code:

import os

import requests
from requests.exceptions import HTTPError

API_KEY = os.environ["DOCUPARSE_API_KEY"]  # raises KeyError if unset
API_URL = "https://docuparseapi.com/api/v1/extract"

def extract_document(file_path: str) -> dict:
    try:
        with open(file_path, "rb") as f:
            response = requests.post(
                API_URL,
                headers={"Authorization": f"Bearer {API_KEY}"},
                files={"file": f},
                timeout=30,
            )
        response.raise_for_status()  # turn 4xx/5xx responses into HTTPError
        return response.json()

    except HTTPError as e:
        # Map documented status codes to clearer exceptions
        status = e.response.status_code
        if status == 401:
            raise ValueError("Invalid API key") from e
        elif status == 422:
            raise ValueError("Unsupported file type or empty file") from e
        elif status == 429:
            raise RuntimeError("Rate limit exceeded") from e
        else:
            raise RuntimeError(f"API error {status}") from e

    except requests.Timeout as e:
        raise RuntimeError("Request timed out") from e

Status  Meaning
401     Invalid or missing API key
422     Unsupported file type or empty file
429     Rate limit exceeded
500     Server-side processing error
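
Rate-limit (429) and server (500) errors are often transient, while 401 and 422 will not succeed on a retry. A minimal sketch of retrying with exponential backoff under that assumption. The names with_retries and status_of, the delay values, and the choice of retryable codes are illustrative; the source does not say whether the API sends a Retry-After header, so none is read here.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")

RETRYABLE_STATUSES = {429, 500}  # assumption: treated as transient


def status_of(exc: Exception) -> Optional[int]:
    """Return the HTTP status attached to an exception, if any.

    requests.HTTPError carries the response on its .response attribute.
    """
    response = getattr(exc, "response", None)
    return getattr(response, "status_code", None)


def with_retries(
    call: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying retryable HTTP errors with delays of 1x, 2x, 4x..."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return call()
        except Exception as exc:
            out_of_attempts = attempt == attempts - 1
            if status_of(exc) not in RETRYABLE_STATUSES or out_of_attempts:
                raise  # not transient, or no attempts left
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

This wrapper only sees the status code when the wrapped function lets requests.HTTPError propagate, as the version of extract_document under Basic usage does, for example with_retries(lambda: extract_document("receipt.pdf")). A function that converts the error to ValueError or RuntimeError first would not be retried.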

Parsing the response

Map the extracted JSON fields to your application's data model:

def process_receipt(file_path: str) -> dict:
    data = extract_document(file_path)

    if not data.get("success"):
        raise ValueError("Extraction was not successful")

    # Access structured fields directly
    return {
        "merchant": data.get("merchant"),
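        # Missing fields are present with a null value, so a .get() default
        # is never used; "or" substitutes the fallback for None instead.
        "total": float(data.get("total") or 0),
        "tax": float(data.get("tax") or 0),
        "date": data.get("date"),
        "currency": data.get("currency"),
        "line_items": data.get("line_items") or [],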
    }
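
If the data model needs exact currency arithmetic, one option is to map the response into typed objects and keep amounts as Decimal, since binary floats can introduce rounding error. A minimal sketch, assuming amounts arrive as decimal strings like those in the sample response; the Receipt and LineItem classes and both helper functions are hypothetical names, not part of the API.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import List, Optional


@dataclass
class LineItem:
    description: Optional[str]
    quantity: Optional[int]
    amount: Optional[Decimal]


@dataclass
class Receipt:
    merchant: Optional[str]
    date: Optional[str]
    currency: Optional[str]
    total: Optional[Decimal]
    tax: Optional[Decimal]
    line_items: List[LineItem] = field(default_factory=list)


def to_decimal(value) -> Optional[Decimal]:
    """Convert an amount string to Decimal, passing null (None) through."""
    return None if value is None else Decimal(str(value))


def receipt_from_response(data: dict) -> Receipt:
    """Build a Receipt from the extraction JSON; null fields stay None."""
    items = [
        LineItem(
            description=item.get("description"),
            quantity=item.get("quantity"),
            amount=to_decimal(item.get("amount")),
        )
        for item in (data.get("line_items") or [])
    ]
    return Receipt(
        merchant=data.get("merchant"),
        date=data.get("date"),
        currency=data.get("currency"),
        total=to_decimal(data.get("total")),
        tax=to_decimal(data.get("tax")),
        line_items=items,
    )


# Values copied from the sample response, with tax set to null for illustration
sample = {
    "merchant": "Office Depot",
    "date": "2026-04-26",
    "total": "45.50",
    "tax": None,
    "currency": "USD",
    "line_items": [{"description": "Notebook", "quantity": 2, "amount": "20.00"}],
}
receipt = receipt_from_response(sample)
print(receipt.total, receipt.tax, receipt.line_items[0].amount)  # 45.50 None 20.00
```

Keeping None for missing amounts, rather than substituting 0, lets downstream code distinguish "no tax found" from "tax of zero".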

Start with Python today

Start with 20 documents/month and full API access. No credit card required.