Webhook Infrastructure for SaaS Apps 2026
TL;DR
Inngest for most SaaS builders — serverless-friendly, great DX, handles fan-out and retries without Redis. Trigger.dev for long-running jobs that need durable execution (AI pipelines, data processing, multi-step workflows). BullMQ for high-throughput queues where you already have Redis and need battle-tested reliability. Build your own only if your use case is simple and you can't justify another dependency. The main thing boilerplates miss: they handle webhooks in the route handler itself — that's the wrong pattern. Webhook processing always belongs in a background queue.
Key Takeaways
- Inngest: serverless-native, no Redis needed, TypeScript-first, free tier generous (50K steps/month)
- Trigger.dev: durable execution (survives deploys), ideal for multi-hour AI jobs, self-hostable
- BullMQ: Redis-backed, battle-tested, 1M+ jobs/day capable, requires Redis infrastructure
- Custom (Postgres queue): zero dependencies, works at modest scale (<10K jobs/day)
- Never process webhooks synchronously — always enqueue and return 200 immediately
- Boilerplate gap: ShipFast/T3/Supastarter ship no background job infrastructure
Why Webhooks Need Background Processing
The wrong pattern that most boilerplates ship:
// ❌ WRONG: Processing Stripe webhook synchronously
export async function POST(request: Request) {
  const body = await request.text();
  const sig = request.headers.get('stripe-signature')!;
  const event = stripe.webhooks.constructEvent(body, sig, secret);
  // This all runs in the HTTP handler — if it fails or times out,
  // Stripe will retry the webhook, causing duplicate processing:
  const user = await db.user.update({
    where: { stripeCustomerId: event.data.object.customer },
    data: { plan: 'pro' },
  });
  await sendWelcomeEmail(user.email); // Slow! Email API call in request handler
  await updateMetrics(user); // Multiple DB writes
  await notifySlack(user); // External API call
  return new Response('OK');
}
The right pattern:
// ✅ RIGHT: Enqueue immediately, process in background
export async function POST(request: Request) {
  const body = await request.text();
  const sig = request.headers.get('stripe-signature')!;
  const event = stripe.webhooks.constructEvent(body, sig, secret);
  // Just enqueue — return 200 in under 100ms:
  await inngest.send({ name: 'stripe/webhook', data: { event } });
  return new Response('OK'); // Stripe treats any 2xx as successful delivery
}
// Background function handles the actual work:
inngest.createFunction(
  { id: 'stripe-webhook-handler' },
  { event: 'stripe/webhook' },
  async ({ event, step }) => {
    // Retried automatically on failure, never blocks the HTTP response
    await step.run('update-user', async () => {
      await db.user.update({ ... });
    });
    await step.run('send-email', async () => {
      await resend.emails.send({ ... });
    });
  }
);
Inngest: Serverless-Native Background Jobs
Best for: Next.js apps on Vercel/serverless, teams that don't want to manage Redis, fan-out patterns
npm install inngest
// lib/inngest.ts
import { Inngest } from 'inngest';
export const inngest = new Inngest({ id: 'my-saas' });
// app/api/inngest/route.ts — single endpoint for all functions
import { serve } from 'inngest/next';
import { inngest } from '@/lib/inngest';
import { handleStripeWebhook } from '@/inngest/stripe';
import { sendWelcomeEmail } from '@/inngest/emails';
import { processAiJob } from '@/inngest/ai';
export const { GET, POST, PUT } = serve({
  client: inngest,
  functions: [handleStripeWebhook, sendWelcomeEmail, processAiJob],
});
Multi-Step Functions
Inngest's killer feature: steps run durably, retrying independently on failure:
// inngest/stripe.ts
import { inngest } from '@/lib/inngest';
export const handleStripeWebhook = inngest.createFunction(
  {
    id: 'stripe-webhook',
    retries: 5,
    throttle: { limit: 100, period: '1m' },
  },
  { event: 'stripe/webhook' },
  async ({ event, step }) => {
    const stripeEvent = event.data.event;
    if (stripeEvent.type !== 'checkout.session.completed') return;
    // Each step retries independently — if step 2 fails, step 1 doesn't re-run:
    const user = await step.run('update-subscription', async () => {
      const session = stripeEvent.data.object;
      return db.user.update({
        where: { id: session.metadata.userId },
        data: {
          plan: 'pro',
          stripeCustomerId: session.customer,
          subscriptionStatus: 'active',
        },
      });
    });
    await step.run('send-welcome-email', async () => {
      await resend.emails.send({
        to: user.email,
        subject: 'Welcome to Pro!',
        html: render(ProWelcomeEmail({ name: user.name })),
      });
    });
    await step.run('notify-slack', async () => {
      await fetch(process.env.SLACK_WEBHOOK_URL!, {
        method: 'POST',
        body: JSON.stringify({ text: `New Pro subscription: ${user.email}` }),
      });
    });
  }
);
Fan-Out Pattern (Parallel Steps)
export const weeklyReportJob = inngest.createFunction(
  { id: 'weekly-reports', concurrency: { limit: 10 } },
  { cron: '0 9 * * MON' }, // Every Monday 9am UTC
  async ({ step }) => {
    // Fetch all users who need reports:
    const users = await step.run('get-users', async () => {
      return db.user.findMany({
        where: { plan: 'pro', weeklyReports: true },
      });
    });
    // Fan out — send one event per user, processed in parallel:
    await step.sendEvent('send-reports', users.map((user) => ({
      name: 'report/weekly',
      data: { userId: user.id },
    })));
  }
);
export const sendWeeklyReport = inngest.createFunction(
  { id: 'send-weekly-report', concurrency: { limit: 20 } },
  { event: 'report/weekly' },
  async ({ event, step }) => {
    const userId = event.data.userId;
    // Process each user's report independently:
    const stats = await step.run('get-stats', async () => {
      return getWeeklyStats(userId);
    });
    await step.run('send-email', async () => {
      await sendReportEmail(userId, stats);
    });
  }
);
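One practical wrinkle with fan-out: a single step.sendEvent call carrying tens of thousands of events can run into payload or batch limits (the exact cap varies by plan and is an assumption here, not a documented number). A small generic helper for batching the event array first:

```typescript
// Split a large array into fixed-size batches before fan-out.
// The batch size used at the call site (e.g. 512) is an illustrative
// assumption, not a documented Inngest limit — check your plan's limits.
export function chunk<T>(items: T[], size: number): T[][] {
  if (size <= 0) throw new Error('size must be positive');
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
```

Each batch can then be sent with its own step.sendEvent call, using a distinct step ID per batch so retries stay independent.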
Inngest Pricing
Free: 50,000 steps/month
Starter ($25/month): 500,000 steps/month
Growth ($100/month): 5M steps/month
A "step" is one step.run() execution; a function without explicit steps bills as a single step. A typical Stripe webhook handler with 3 step.run() calls consumes 3 steps per event.
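A quick way to check which tier a given webhook volume lands in (tier boundaries copied from the pricing above; verify against current pricing before budgeting):

```typescript
// Rough monthly step estimate from daily webhook volume.
export function monthlySteps(webhooksPerDay: number, stepsPerWebhook: number): number {
  return webhooksPerDay * stepsPerWebhook * 30;
}

// Tier thresholds mirror the pricing listed above (point-in-time numbers).
export function inngestTier(steps: number): string {
  if (steps <= 50_000) return 'Free';
  if (steps <= 500_000) return 'Starter';
  if (steps <= 5_000_000) return 'Growth';
  return 'Enterprise';
}
```

At 500 webhooks/day with 3 steps each, you stay comfortably inside the free tier; at 10,000/day you are already in Growth territory.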
Trigger.dev: Durable Long-Running Jobs
Best for: AI pipelines, multi-minute/hour jobs, jobs that must survive deploys
npm install @trigger.dev/sdk@v3
// trigger/ai-pipeline.ts
import { task } from '@trigger.dev/sdk/v3';
export const processDocumentTask = task({
  id: 'process-document',
  maxDuration: 300, // 5 minutes max
  retry: {
    maxAttempts: 3,
    minTimeoutInMs: 1000,
    maxTimeoutInMs: 10000,
    factor: 2,
  },
  run: async (payload: { documentId: string; userId: string }) => {
    const { documentId, userId } = payload;
    // Step 1: Download and extract text
    const document = await fetchDocument(documentId);
    const text = await extractText(document);
    // Step 2: Generate embeddings (slow API call)
    const embeddings = await generateEmbeddings(text);
    // Step 3: Store in vector DB
    await storeEmbeddings(documentId, embeddings);
    // Step 4: Notify user
    await notifyUser(userId, 'Document processed successfully');
    return { documentId, chunkCount: embeddings.length };
  },
});
// Trigger from your API route:
import { tasks } from '@trigger.dev/sdk/v3';
import type { processDocumentTask } from '@/trigger/ai-pipeline';
export async function POST(req: Request) {
  const { documentId } = await req.json();
  const session = await auth();
  // Enqueue the job by task id — returns immediately with a handle.
  // The type-only import keeps task code out of the route bundle:
  const handle = await tasks.trigger<typeof processDocumentTask>('process-document', {
    documentId,
    userId: session!.user.id,
  });
  return Response.json({ runId: handle.id });
}
// Poll job status from client:
import { runs } from '@trigger.dev/sdk/v3';
export async function GET(req: Request) {
  const { searchParams } = new URL(req.url);
  const runId = searchParams.get('runId')!;
  const run = await runs.retrieve(runId);
  return Response.json({
    status: run.status, // e.g. 'QUEUED' | 'EXECUTING' | 'COMPLETED' | 'FAILED'
    output: run.output,
  });
}
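On the client side, that status route is typically polled until the run reaches a terminal state. A generic helper with a capped, doubling delay between polls (the status names mirror the comment above; the delay defaults are arbitrary choices, not SDK values):

```typescript
type RunStatus = 'QUEUED' | 'EXECUTING' | 'COMPLETED' | 'FAILED';

export function isTerminal(status: RunStatus): boolean {
  return status === 'COMPLETED' || status === 'FAILED';
}

// Poll a status-returning function until terminal, doubling the wait
// between polls up to a cap, and giving up after maxPolls attempts.
export async function pollRun(
  fetchStatus: () => Promise<RunStatus>,
  opts: { initialDelayMs?: number; maxDelayMs?: number; maxPolls?: number } = {}
): Promise<RunStatus> {
  const { initialDelayMs = 500, maxDelayMs = 5000, maxPolls = 120 } = opts;
  let delay = initialDelayMs;
  for (let i = 0; i < maxPolls; i++) {
    const status = await fetchStatus();
    if (isTerminal(status)) return status;
    await new Promise((resolve) => setTimeout(resolve, delay));
    delay = Math.min(delay * 2, maxDelayMs);
  }
  throw new Error('run did not finish within the polling budget');
}
```

The capped backoff keeps the client from hammering the status endpoint on long runs while staying responsive for short ones.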
When Trigger.dev Wins Over Inngest
Use Trigger.dev when:
→ Jobs take minutes or hours (AI processing, batch imports)
→ You need to cancel running jobs mid-execution
→ Jobs must survive deploys (CI/CD shouldn't kill in-flight work)
→ You want full self-hosting control
→ Real-time job monitoring is required (streaming logs)
BullMQ: Redis-Backed High-Throughput Queues
Best for: high-volume jobs (1M+/day), teams already using Redis, existing Node.js workers
npm install bullmq ioredis
// lib/queue.ts
import { Queue, Worker } from 'bullmq';
import Redis from 'ioredis';
const connection = new Redis(process.env.REDIS_URL!, {
  maxRetriesPerRequest: null, // Required by BullMQ's blocking connections
});
// Define queues:
export const emailQueue = new Queue('email', { connection });
export const webhookQueue = new Queue('webhooks', { connection });
// Worker processes jobs:
const emailWorker = new Worker(
  'email',
  async (job) => {
    switch (job.name) {
      case 'welcome':
        await sendWelcomeEmail(job.data.userId);
        break;
      case 'weekly-report':
        await sendWeeklyReport(job.data.userId);
        break;
      default:
        throw new Error(`Unknown job type: ${job.name}`);
    }
  },
  {
    connection,
    concurrency: 5,
  }
);
// Error handling:
emailWorker.on('failed', (job, error) => {
  console.error(`Job ${job?.id} failed:`, error);
  // Alert Sentry, update DB status, etc.
});
// Enqueue from API route:
export async function POST(req: Request) {
  const { userId } = await req.json();
  await emailQueue.add('welcome', { userId }, {
    attempts: 3,
    backoff: { type: 'exponential', delay: 5000 },
    removeOnComplete: { count: 1000 },
    removeOnFail: { count: 5000 },
  });
  return new Response('OK');
}
// Scheduled jobs with BullMQ:
await emailQueue.add(
  'weekly-report',
  { batchId: Date.now() },
  {
    repeat: {
      pattern: '0 9 * * MON', // Every Monday 9am
    },
  }
);
BullMQ requires running a worker process — this is the biggest operational difference from Inngest/Trigger.dev. You need a separate Node.js process (or container) running your workers. On Vercel, you'd need a separate worker service.
Custom: Postgres-Backed Job Queue
Best for: simple use cases, no Redis, modest volume (<10K jobs/day)
// lib/job-queue.ts — minimal Postgres queue
import { db } from './db';
export type JobStatus = 'pending' | 'processing' | 'completed' | 'failed';
export async function enqueueJob(
  type: string,
  data: Record<string, unknown>,
  options?: { scheduledFor?: Date; maxAttempts?: number }
) {
  return db.backgroundJob.create({
    data: {
      type,
      data,
      status: 'pending',
      scheduledFor: options?.scheduledFor ?? new Date(),
      maxAttempts: options?.maxAttempts ?? 3,
    },
  });
}
// Polling worker (run via cron or long-running process):
export async function processNextJob() {
  // Claim the next job inside a transaction. This is fine for a single
  // worker; with multiple concurrent workers, claim via raw SQL using
  // SELECT ... FOR UPDATE SKIP LOCKED to rule out double-claims:
  const job = await db.$transaction(async (tx) => {
    const candidate = await tx.backgroundJob.findFirst({
      where: {
        status: 'pending',
        scheduledFor: { lte: new Date() },
        attempts: { lt: tx.backgroundJob.fields.maxAttempts },
      },
      orderBy: { createdAt: 'asc' },
    });
    if (!candidate) return null;
    return tx.backgroundJob.update({
      where: { id: candidate.id },
      data: { status: 'processing', startedAt: new Date() },
    });
  });
  if (!job) return false;
  try {
    await executeJob(job);
    await db.backgroundJob.update({
      where: { id: job.id },
      data: { status: 'completed', completedAt: new Date() },
    });
  } catch (error) {
    const nextAttempt = job.attempts + 1;
    const isExhausted = nextAttempt >= job.maxAttempts;
    await db.backgroundJob.update({
      where: { id: job.id },
      data: {
        status: isExhausted ? 'failed' : 'pending',
        attempts: nextAttempt,
        lastError: (error as Error).message,
        // Exponential backoff: 10s, 20s, 40s, ...
        scheduledFor: isExhausted
          ? undefined
          : new Date(Date.now() + Math.pow(2, nextAttempt) * 5000),
      },
    });
  }
  return true;
}
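The backoff arithmetic buried in that catch block is easier to reason about in isolation: each retry doubles the delay from a 5-second base, so the default three attempts wait 10s, 20s, and 40s.

```typescript
// Delay before retry N (1-based), matching Math.pow(2, attempt) * 5000 above.
export function retryDelayMs(attempt: number, baseMs = 5000): number {
  return Math.pow(2, attempt) * baseMs;
}

// Full schedule for a job's retries: [10000, 20000, 40000] for the defaults.
export function retrySchedule(maxAttempts: number, baseMs = 5000): number[] {
  return Array.from({ length: maxAttempts }, (_, i) => retryDelayMs(i + 1, baseMs));
}
```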
Which to Choose
| | Inngest | Trigger.dev | BullMQ | Custom |
|---|---|---|---|---|
| Setup complexity | Low | Medium | High | Low |
| Redis required | ❌ | ❌ | ✅ | ❌ |
| Serverless compatible | ✅ | ✅ | ❌ | ✅ |
| Long-running jobs | Limited | ✅ (hours) | ✅ | Limited |
| Self-hostable | Partial | ✅ | ✅ | ✅ |
| Volume ceiling | 5M steps/mo | Custom | Unlimited | ~10K/day |
| Monitoring dashboard | ✅ | ✅ | Limited | ❌ |
| Free tier | 50K steps | 25K runs | - | - |
Choose Inngest if:
→ Vercel/serverless deployment
→ Stripe webhooks, email sending, fan-out patterns
→ Want job monitoring without running infrastructure
Choose Trigger.dev if:
→ AI pipelines or multi-minute jobs
→ Need to cancel in-flight jobs
→ Want full self-hosting control
→ Durable execution across deploys required
Choose BullMQ if:
→ Already have Redis
→ Need 1M+ jobs/day
→ Building on non-serverless infra (Railway, Fly.io, AWS)
Choose Custom (Postgres) if:
→ Simple needs: <10K jobs/day, basic retries
→ Can't add another dependency
→ Already have Postgres (which you do if using Prisma)
Adding Background Jobs to Any Boilerplate
Most SaaS boilerplates (T3 Stack, ShipFast, Open SaaS) ship without background job infrastructure. Adding Inngest takes about 2 hours:
- `npm install inngest` — install the package
- Create `lib/inngest.ts` with the Inngest client
- Create `app/api/inngest/route.ts` to serve functions
- Move your first webhook handler from the HTTP route to an Inngest function
- Test locally with the Inngest Dev Server (run via `npx inngest-cli@latest dev`)
The Inngest Dev Server runs at localhost:8288 and shows every function invocation, step execution, and retry. It's one of the better developer experience tools in the ecosystem.
# Start Inngest Dev Server alongside your Next.js dev server
npx inngest-cli@latest dev -u http://localhost:3000/api/inngest
Your Stripe webhook handler, email sending logic, and any operation that shouldn't block an HTTP response are all candidates for immediate migration to Inngest.
Idempotency: The Critical Requirement
Every background job system must handle the same job being triggered more than once. Stripe's documentation explicitly states that webhooks may be delivered more than once. Inngest, Trigger.dev, and BullMQ retry failed steps — but if a step succeeds and then the overall job fails, that step will run again on retry.
The solution: idempotency keys and check-before-write patterns:
// ✅ Idempotent subscription handler
export const handleSubscriptionCreated = inngest.createFunction(
  { id: 'subscription-created', idempotency: 'event.data.event.id' },
  // ^^ Inngest deduplicates based on Stripe event ID
  { event: 'stripe/webhook' },
  async ({ event, step }) => {
    const stripeEvent = event.data.event;
    if (stripeEvent.type !== 'customer.subscription.created') return;
    const subscription = stripeEvent.data.object;
    await step.run('update-user-plan', async () => {
      // This update writes the same values every time, so re-running it
      // after a partial failure is harmless:
      await db.user.update({
        where: { stripeCustomerId: subscription.customer as string },
        data: {
          plan: 'pro',
          subscriptionId: subscription.id,
          subscriptionStatus: subscription.status,
        },
      });
    });
    await step.run('send-welcome-email', async () => {
      const user = await db.user.findUnique({
        where: { stripeCustomerId: subscription.customer as string },
      });
      // Check if email was already sent before sending
      if (!user?.welcomeEmailSent) {
        await resend.emails.send({ to: user!.email, subject: 'Welcome to Pro!' });
        await db.user.update({
          where: { id: user!.id },
          data: { welcomeEmailSent: true },
        });
      }
    });
  }
);
The idempotency field in Inngest causes the platform to skip duplicate events with the same key. The welcomeEmailSent flag in the database catches cases where the email step ran but the flag update failed.
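Stripped to its skeleton, the check-before-write pattern is: record a processed-event key, and skip the work when the key already exists. A minimal in-memory sketch (a real implementation would back the Set with a unique-constrained DB column or a Redis SET NX; both are assumptions here):

```typescript
// In-memory sketch of event-level deduplication.
export function makeIdempotent<T>(
  handler: (payload: T) => void
): (eventId: string, payload: T) => boolean {
  const processed = new Set<string>();
  return (eventId, payload) => {
    if (processed.has(eventId)) return false; // duplicate delivery: skip
    processed.add(eventId); // record first, mirroring a unique-key insert
    handler(payload);
    return true;
  };
}
```

Recording the key before the work guarantees no duplicates but can lose work if the handler throws; recording after (like the welcomeEmailSent flag above) makes the opposite trade.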
Monitoring and Observability for Background Jobs
Background jobs fail silently if you don't instrument them. Inngest and Trigger.dev provide dashboards for job visibility. For BullMQ, Bull Board is the standard open-source monitoring UI:
// Add Bull Board to your Express/Next.js API
import { createBullBoard } from '@bull-board/api';
import { BullMQAdapter } from '@bull-board/api/bullMQAdapter';
import { ExpressAdapter } from '@bull-board/express';
const serverAdapter = new ExpressAdapter();
createBullBoard({
  queues: [new BullMQAdapter(emailQueue), new BullMQAdapter(webhookQueue)],
  serverAdapter,
});
serverAdapter.setBasePath('/admin/queues');
app.use('/admin/queues', serverAdapter.getRouter());
// Protect with auth middleware before exposing this
Key metrics to monitor for any job queue:
- Failed job rate: > 1% warrants investigation
- Queue depth: growing queue depth indicates workers can't keep up
- P95 job completion time: latency creep indicates performance issues
- Retry rate: high retry rates mean unreliable dependencies
Set up alerting when failed job rate exceeds your threshold. A silent failure in email sending or subscription activation directly affects user experience and revenue.
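The 1% threshold from the list above is straightforward to encode wherever you emit metrics; a sketch:

```typescript
export interface QueueStats {
  completed: number;
  failed: number;
}

// Failed-job rate over a window; returns 0 when nothing ran.
export function failedRate(stats: QueueStats): number {
  const total = stats.completed + stats.failed;
  return total === 0 ? 0 : stats.failed / total;
}

// Fire an alert when the failure rate exceeds the threshold (default 1%).
export function shouldAlert(stats: QueueStats, threshold = 0.01): boolean {
  return failedRate(stats) > threshold;
}
```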
Local Development with the Inngest Dev Server
One of Inngest's advantages over webhook-based approaches is the local development experience. The Inngest CLI runs a local dev server that captures function invocations, lets you replay events, and shows step-by-step execution traces without needing a public URL or ngrok tunnel.
npx inngest-cli@latest dev
# Dev server starts at http://localhost:8288
# Functions auto-discovered from your app
Point your Next.js app at the local dev server by setting INNGEST_DEV=1 in your .env.local. When you trigger events from your app code, they appear in the dev server UI immediately. You can replay any event, inspect the step-by-step execution, and see exactly what each step returned.
For Stripe webhook testing locally, the Stripe CLI handles forwarding: stripe listen --forward-to localhost:3000/api/webhooks/stripe. This sends test events to your local webhook handler, which enqueues them to the Inngest dev server. The full flow is testable without deploying anything.
BullMQ local development requires Redis. The standard path is Docker: docker run -p 6379:6379 redis:alpine. Bull Board gives you a visual queue inspector at localhost:3000/admin/queues. The DX is workable but requires more setup than Inngest's zero-infrastructure local server.
Retry Strategies and Exponential Backoff
All production job queues need retry logic. External dependencies fail: email providers return 429s, database connections time out, third-party APIs go down. Without retries, these transient failures become permanent job failures that require manual intervention.
The right retry strategy depends on the job type. Email sending can retry aggressively — 10 retries over 24 hours is reasonable, since the main failure mode is temporary rate limiting or provider downtime. Stripe webhook processing should retry fewer times because Stripe already retries on its end; your processor retrying independently can cause double-processing if idempotency isn't implemented correctly.
// Inngest retry configuration — exponential backoff
export const sendWelcomeEmail = inngest.createFunction(
  {
    id: 'send-welcome-email',
    retries: 5, // Default is 4; raise for flaky external providers
  },
  { event: 'user/signed-up' },
  async ({ event, step }) => {
    // Inngest automatically retries steps that throw,
    // with exponential backoff (plus jitter) between attempts
    await step.run('send-email', async () => {
      await resend.emails.send({ to: event.data.email, subject: 'Welcome!' });
    });
  }
);
BullMQ gives granular control over backoff:
const emailQueue = new Queue('emails', { connection: redis });
// Add job with custom retry configuration
await emailQueue.add('send-welcome', { userId, email }, {
  attempts: 5,
  backoff: { type: 'exponential', delay: 60_000 }, // retries after ~1m, 2m, 4m, 8m
  removeOnComplete: 100, // Keep last 100 completed jobs for debugging
  removeOnFail: 500, // Keep last 500 failed jobs for investigation
});
The removeOnComplete and removeOnFail settings matter for Redis memory usage in production. Without them, completed and failed jobs accumulate indefinitely.
Choosing the Right Tool for Your Use Case
The decision between Inngest, Trigger.dev, and BullMQ is primarily driven by infrastructure constraints and job characteristics, not developer preference.
Inngest is the correct choice for serverless-deployed Next.js apps. It requires no Redis, no additional infrastructure, and integrates with Vercel's serverless functions without the connection limits that come from long-lived Redis connections in serverless contexts. The free tier (50K steps/month) covers most indie SaaS indefinitely. Choose Inngest when you're on Vercel, value DX, and your jobs complete within minutes.
Trigger.dev is the correct choice when jobs run for minutes to hours. AI pipeline jobs that call multiple LLMs, video processing, bulk data exports, nightly report generation — these require durable execution that survives function timeouts and deploys. Trigger.dev's runs persist across restarts. If a deploy happens mid-job, the job continues from where it left off. Choose Trigger.dev when job duration exceeds 10 minutes or reliability is critical for revenue-affecting workflows.
BullMQ is the correct choice for high-throughput processing. If you're running 100K+ jobs per day, need to process items in parallel across multiple workers, or need fine-grained queue prioritization, BullMQ's Redis-backed architecture handles this more cost-effectively than managed services. Choose BullMQ when you already have Redis infrastructure and throughput or cost are primary concerns.
For a new SaaS in 2026 deploying to Vercel with under 50K monthly events: Inngest. The DX advantage and infrastructure savings justify the choice over the alternatives until you hit a concrete limitation.
The switching cost between these tools is lower than it appears. All three use the same underlying concept: enqueue events, process asynchronously, retry on failure. The business logic inside each job function is portable. If you start with Inngest and later need BullMQ's throughput, migrating the job definitions takes hours rather than days. Choose based on today's requirements, not hypothetical future scale.
The one area where switching cost is real: job monitoring and observability tooling. Teams build operational habits around their job dashboards — knowing where to look when a job fails, how to replay stuck jobs, where to see the execution history. Migrating tool means retraining those habits. This is a soft cost worth factoring in when evaluating BullMQ's Bull Board versus Inngest's or Trigger.dev's managed dashboards. Both Inngest and Trigger.dev shipped improved dashboards and failure replay tooling in 2025, narrowing the observability gap that previously favored self-hosted BullMQ setups with custom monitoring.
The background job tool you choose becomes part of your production operations — monitoring, alerting, and debugging workflows all depend on what observability the tool provides. Choose based on your deployment architecture first, and your observability requirements second.
Find SaaS boilerplates with pre-built background job infrastructure in our best open-source SaaS boilerplates guide.
See our guide to Redis caching strategies for BullMQ's Redis infrastructure requirements.
See how background jobs connect to webhook processing in our production deployment guide.