Event-Driven Invoice Processing with Webhooks

2026-06-02 · 7 min read

Event-driven flow, no polling loop required: your app POSTs a file to the API, DocuParseAPI extracts the data (~3s), the webhook fires with a POST to your endpoint, and your server receives the JSON payload. Deliveries are signed with HMAC. Compared with polling, your server only wakes up when results are ready, so no requests are wasted.

There are two ways to get extraction results from a document processing API. The first is polling: you submit a document, receive an ID, then periodically call a status endpoint until the result is ready. The second is webhooks: you submit a document, and the API calls your server when the result is ready — you don't ask, you get told.

For anything beyond simple synchronous requests, webhooks are the better architecture. Here's how to use them.

Why Webhooks

Polling vs Webhooks

Polling:

text
Your server → POST /extract          → receives document_id
Your server → GET /documents/{id}    → status: "processing"
Your server → GET /documents/{id}    → status: "processing"
Your server → GET /documents/{id}    → status: "completed" + result

Problems: unnecessary requests, added latency, and wasted compute. The polling interval also forces a trade-off between responsiveness and efficiency.
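
To make the cost concrete, here is a minimal sketch of a polling loop. `fetch_status` is a hypothetical stand-in for a call to GET /documents/{id}; it is injected so the sketch runs without a network. The interval and timeout values are assumptions, not DocuParseAPI recommendations.

```python
import time
from typing import Callable


def poll_until_done(
    fetch_status: Callable[[], dict],
    interval_s: float = 1.0,
    timeout_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[dict, int]:
    """Call fetch_status until it stops reporting "processing".

    Returns the final response and the number of requests made.
    fetch_status is a hypothetical stand-in for GET /documents/{id}.
    """
    deadline = time.monotonic() + timeout_s
    requests_made = 0
    while True:
        response = fetch_status()
        requests_made += 1
        if response.get("status") != "processing":
            return response, requests_made
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still processing after {requests_made} requests")
        sleep(interval_s)


# Simulated API: two "processing" responses, then the result,
# mirroring the request sequence in the polling diagram.
_responses = iter([
    {"status": "processing"},
    {"status": "processing"},
    {"status": "completed", "result": {"total": "3850.00"}},
])
result, count = poll_until_done(lambda: next(_responses), sleep=lambda _: None)
print(count, result["status"])
```

A shorter interval lowers latency but multiplies requests, and a longer one does the reverse. Every request before the last one returns nothing useful.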

Webhooks:

text
Your server  → POST /extract                        → receives document_id
[processing happens]
DocuParseAPI → POST your-server.com/webhooks/docuparse → result delivered

Your server does nothing until the result arrives. No polling loop. No wasted requests.

Webhooks are particularly valuable for:

  • Processing high volumes of documents in background queues
  • Mobile or web applications where the user doesn't stay on the page
  • Event-driven pipelines: extract → validate → write to ERP
  • Batch operations where many documents are submitted at once

The Webhook Payload

Each webhook is a POST to your server, sent roughly 3 seconds after upload. Verify the signature header before processing, and reject anything that doesn't match.

document.completed

Fired when a document finishes processing successfully:

json
{
  "event": "document.completed",
  "document_id": "doc_clx7abc123",
  "timestamp": "2026-05-19T14:32:00Z",
  "data": {
    "document_type": "invoice",
    "merchant": "Riverside Consulting LLC",
    "invoice_id": "INV-2026-0091",
    "date": "2026-05-01",
    "due_date": "2026-05-31",
    "currency": "USD",
    "subtotal": "3500.00",
    "tax": "350.00",
    "total": "3850.00",
    "payment_method": null,
    "line_items": [
      {
        "description": "Strategy Consulting — May",
        "quantity": 35,
        "unit_price": "100.00",
        "total": "3500.00"
      }
    ]
  }
}
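
Note that monetary amounts arrive as strings, not JSON numbers. Below is a minimal sketch of reading them, assuming you want exact decimal arithmetic rather than floats. The helper names `parse_amounts` and `totals_are_consistent` are illustrative and not part of the API, and the consistency check is an optional sanity test rather than something the API requires.

```python
import json
from decimal import Decimal

payload = json.loads("""
{
  "event": "document.completed",
  "document_id": "doc_clx7abc123",
  "data": {
    "currency": "USD",
    "subtotal": "3500.00",
    "tax": "350.00",
    "total": "3850.00",
    "line_items": [
      {"description": "Strategy Consulting", "quantity": 35,
       "unit_price": "100.00", "total": "3500.00"}
    ]
  }
}
""")


def parse_amounts(data: dict) -> dict:
    """Convert the string amounts in a document.completed payload to Decimal."""
    return {key: Decimal(data[key]) for key in ("subtotal", "tax", "total")}


def totals_are_consistent(data: dict) -> bool:
    """Check subtotal + tax == total and that line items sum to the subtotal."""
    amounts = parse_amounts(data)
    line_sum = sum(
        (Decimal(item["total"]) for item in data.get("line_items", [])),
        Decimal("0"),
    )
    return (
        amounts["subtotal"] + amounts["tax"] == amounts["total"]
        and line_sum == amounts["subtotal"]
    )


amounts = parse_amounts(payload["data"])
print(amounts["total"], totals_are_consistent(payload["data"]))
```

Fields can be null (payment_method in the payload above is one), so production code should handle missing or null amounts before converting them.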

document.failed

Fired when extraction was attempted but could not produce a result:

json
{
  "event": "document.failed",
  "document_id": "doc_clx7abc456",
  "timestamp": "2026-05-19T14:32:05Z",
  "error": {
    "code": "EXTRACTION_FAILED",
    "message": "Document extraction failed for this file."
  }
}

batch.completed

Fired after all documents in a batch upload have finished processing:

json
{
  "event": "batch.completed",
  "batch_id": "batch_abc789",
  "timestamp": "2026-05-19T14:33:00Z",
  "summary": {
    "total": 12,
    "succeeded": 11,
    "failed": 1
  }
}
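
A FastAPI handler that verifies the signature, acknowledges the delivery immediately, and hands the event to a background task:

python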
import os
import hmac
import hashlib
import json
from fastapi import FastAPI, Request, HTTPException, BackgroundTasks
from fastapi.responses import JSONResponse

app = FastAPI()


# Placeholder integrations so the example runs as written; replace with your own.
async def save_to_database(document_id: str, data: dict) -> None: ...
async def mark_for_review(document_id: str, error_code: str) -> None: ...
async def send_alert(message: str) -> None: ...


def verify_webhook_signature(payload: bytes, signature_header: str, secret: str) -> bool:
    """Verify that the webhook came from DocuParseAPI."""
    expected = "sha256=" + hmac.new(
        secret.encode("utf-8"),
        payload,
        hashlib.sha256,
    ).hexdigest()
    # Use constant-time comparison to prevent timing attacks
    return hmac.compare_digest(expected, signature_header)


@app.post("/webhooks/docuparse")
async def docuparse_webhook(request: Request, background_tasks: BackgroundTasks):
    body = await request.body()
    signature = request.headers.get("x-docuparse-signature", "")
    secret = os.environ.get("DOCUPARSE_WEBHOOK_SECRET", "")

    # Verify signature if secret is configured
    if secret and not verify_webhook_signature(body, signature, secret):
        raise HTTPException(status_code=401, detail="Invalid webhook signature")

    try:
        event = json.loads(body)
    except json.JSONDecodeError:
        raise HTTPException(status_code=400, detail="Invalid JSON payload")

    # Respond immediately — hand off processing to background task
    background_tasks.add_task(handle_event, event)
    return JSONResponse({"received": True})


async def handle_event(event: dict):
    event_type = event.get("event")

    if event_type == "document.completed":
        await handle_document_completed(
            event["document_id"],
            event["data"]
        )
    elif event_type == "document.failed":
        await handle_document_failed(
            event["document_id"],
            event.get("error", {})
        )
    elif event_type == "batch.completed":
        await handle_batch_completed(
            event["batch_id"],
            event.get("summary", {})
        )


async def handle_document_completed(document_id: str, data: dict):
    merchant = data.get("merchant", "Unknown")
    total = data.get("total", "0")
    currency = data.get("currency", "USD")
    
    print(f"✓ {merchant}: {currency} {total}")
    
    # Write to your database, call QuickBooks, send Slack notification, etc.
    await save_to_database(document_id, data)


async def handle_document_failed(document_id: str, error: dict):
    code = error.get("code", "UNKNOWN")
    print(f"✗ Document {document_id} failed: {code}")
    
    # Mark for manual review, notify user, log the failure
    await mark_for_review(document_id, error_code=code)


async def handle_batch_completed(batch_id: str, summary: dict):
    succeeded = summary.get("succeeded", 0)
    total = summary.get("total", 0)
    failed = summary.get("failed", 0)
    
    print(f"Batch {batch_id}: {succeeded}/{total} processed")
    
    if failed > 0:
        await send_alert(f"Batch {batch_id} had {failed} failed extractions")
Security

Signature Verification

DocuParseAPI signs each webhook delivery with HMAC-SHA256, computed over the raw request body using your webhook secret. The signature is sent in the X-DocuParse-Signature header as sha256=<hex_digest>.

Why signature verification matters: Without it, anyone who knows your webhook URL can send fake events to your server. A malicious request could trigger your invoice processing pipeline with fabricated data.

How to get your webhook secret:

  1. Go to Dashboard → Settings → Webhooks
  2. Add your endpoint URL
  3. Copy the generated secret — store it as DOCUPARSE_WEBHOOK_SECRET in your environment variables

The verification logic:

python
import hmac, hashlib

def is_valid_signature(payload_bytes: bytes, header: str, secret: str) -> bool:
    expected = "sha256=" + hmac.new(
        secret.encode(), payload_bytes, hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected, header)

Use hmac.compare_digest (Python) or crypto.timingSafeEqual (Node.js) — not == — to prevent timing attacks.
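
One practical consequence: the signature covers the exact bytes of the request body, so verify before parsing and never against re-serialized JSON. The sketch below signs a sample body with a made-up secret (both are assumptions for illustration). It shows that the untouched bytes verify, while a parsed-and-re-dumped copy and a missing header do not.

```python
import hashlib
import hmac
import json


def is_valid_signature(payload_bytes: bytes, header: str, secret: str) -> bool:
    expected = "sha256=" + hmac.new(
        secret.encode(), payload_bytes, hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected, header)


secret = "whsec_example"  # made-up secret for illustration only
# Note the double space after the comma: whitespace is part of the signed bytes.
raw_body = b'{"event": "document.completed",  "document_id": "doc_test123"}'
header = "sha256=" + hmac.new(secret.encode(), raw_body, hashlib.sha256).hexdigest()

# The untouched bytes verify.
ok_raw = is_valid_signature(raw_body, header, secret)

# Parsing and re-dumping normalizes whitespace, so the bytes and digest differ.
reserialized = json.dumps(json.loads(raw_body)).encode()
ok_reserialized = is_valid_signature(reserialized, header, secret)

# An empty header (the default when the header is absent) is rejected.
ok_missing = is_valid_signature(raw_body, "", secret)

print(ok_raw, ok_reserialized, ok_missing)
```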

Setup

Testing Without a Live Server

Option 1 — webhook.site (fastest):

  1. Go to webhook.site
  2. Copy your unique URL
  3. Set it as your webhook endpoint in the DocuParseAPI dashboard
  4. Upload a test document
  5. Watch the payload arrive in real time in your browser

Option 2 — ngrok (for local development):

bash
# Start your local server on port 3000
node server.js
# (for the FastAPI example, assuming it lives in main.py: uvicorn main:app --port 3000)

# In another terminal, expose it publicly
ngrok http 3000
# Gives you: https://abc123.ngrok.io
# Use https://abc123.ngrok.io/webhooks/docuparse as your endpoint

Option 3 — simulate locally:

bash
# Simulate a webhook delivery to your local server
curl -X POST http://localhost:3000/webhooks/docuparse \
  -H "Content-Type: application/json" \
  -H "X-DocuParse-Signature: sha256=YOUR_COMPUTED_SIGNATURE" \
  -d '{
    "event": "document.completed",
    "document_id": "doc_test123",
    "timestamp": "2026-05-19T14:00:00Z",
    "data": {
      "merchant": "Test Vendor",
      "total": "100.00",
      "currency": "USD",
      "date": "2026-05-19"
    }
  }'
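
To fill in YOUR_COMPUTED_SIGNATURE, sign the exact body you send. This is a minimal sketch, assuming the header format described under Signature Verification. `sign_body` and the fallback secret `test_secret` are hypothetical names used only for local experiments.

```python
import hashlib
import hmac
import json
import os


def sign_body(body: bytes, secret: str) -> str:
    """Return the X-DocuParse-Signature value for a request body."""
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    return "sha256=" + digest


# Serialize once and send these exact bytes; any change invalidates the signature.
body = json.dumps({
    "event": "document.completed",
    "document_id": "doc_test123",
    "timestamp": "2026-05-19T14:00:00Z",
    "data": {"merchant": "Test Vendor", "total": "100.00",
             "currency": "USD", "date": "2026-05-19"},
}, separators=(",", ":")).encode("utf-8")

# "test_secret" is a fallback for local experiments only.
secret = os.environ.get("DOCUPARSE_WEBHOOK_SECRET", "test_secret")
signature = sign_body(body, secret)

print(f"X-DocuParse-Signature: {signature}")
print(body.decode("utf-8"))
```

Pass the printed body to curl unchanged (for example with --data-binary) and the printed value in the signature header. Your local server must use the same secret for the check to pass.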
Errors

Common Mistakes

Doing slow work before responding: Database writes, external API calls, and email sends should all happen after you've sent the 200 response. If your handler takes 10 seconds before responding, the delivery times out and gets retried — you may process the event twice.

Not handling retries: If your server returns a non-200 status or times out, DocuParseAPI will retry the delivery. Your handlers should be idempotent: processing the same event twice should have the same result as processing it once. Use the document_id field (batch_id for batch.completed events) as your deduplication key.
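
Here is a minimal sketch of deduplication, assuming an in-memory set for illustration. A real deployment would more likely use a database unique constraint or another shared store. The key combines the event type with document_id, falling back to batch_id for batch events. `dedup_key` and `process_once` are hypothetical names.

```python
from typing import Callable

_seen: set[str] = set()  # in-memory only: lost on restart, not shared across workers


def dedup_key(event: dict) -> str:
    """Build a stable key from the event type and its document or batch ID."""
    identifier = event.get("document_id") or event.get("batch_id")
    if identifier is None:
        raise ValueError("event has neither document_id nor batch_id")
    return f'{event.get("event")}:{identifier}'


def process_once(event: dict, handler: Callable[[dict], None]) -> bool:
    """Run handler only the first time an event is seen. Returns True if it ran."""
    key = dedup_key(event)
    if key in _seen:
        return False  # retry of an event that was already handled
    handler(event)
    _seen.add(key)  # record only after success, so a failed attempt can be retried
    return True


processed: list[str] = []
event = {"event": "document.completed", "document_id": "doc_clx7abc123", "data": {}}

first = process_once(event, lambda e: processed.append(e["document_id"]))
second = process_once(event, lambda e: processed.append(e["document_id"]))  # simulated retry
print(first, second, processed)
```

Recording the key only after the handler succeeds means a crash mid-handler leads to a retry rather than a silently dropped event. The handler itself should still tolerate partial earlier work.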

Ignoring document.failed events: Every submission that fails extraction fires a document.failed event. If you only handle document.completed, failed documents disappear silently. Always handle both.


Next Steps

POST /webhook → structured JSON → your server

No more polling loops. Just events.

Set up webhooks in 5 minutes. 20 documents/month — free forever, no credit card.
