Redis + BullMQ: Fault-Tolerant Job Queues in NestJS

A deep-dive into designing background job systems that handle payment webhooks, wallet updates, and email notifications at scale without data loss.

Roni Sarkar
Apr 2026 · 6 min read

Background job processing is the backbone of any serious backend. If you're using NestJS, BullMQ (with Redis as the broker) is a strong default choice. Here's a practical guide to setting it up properly: not just the happy path, but the failure cases too.

Why BullMQ over Other Solutions?

  • Built on Redis: fast and battle-tested, and durable as long as Redis persistence (AOF or RDB) is enabled
  • First-class TypeScript support
  • Delayed jobs, priority queues, rate limiting out of the box
  • Job events and progress tracking
  • Integrates with the NestJS DI system through @nestjs/bullmq

Setup with @nestjs/bullmq

typescript
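// app.module.ts
import { Module } from '@nestjs/common';
import { BullModule } from '@nestjs/bullmq';

@Module({
  imports: [
    BullModule.forRoot({
      connection: {
        host: process.env.REDIS_HOST,
        // Fall back to the default Redis port when REDIS_PORT is unset
        port: parseInt(process.env.REDIS_PORT ?? '6379', 10),
        password: process.env.REDIS_PASSWORD,
      },
    }),
    BullModule.registerQueue({
      name: 'payment-events',
      defaultJobOptions: {
        attempts: 5, // 1 initial run + up to 4 retries
        backoff: { type: 'exponential', delay: 2000 }, // base delay in ms
        removeOnComplete: 100, // keep only the latest 100 completed jobs
        removeOnFail: 500, // keep only the latest 500 failed jobs
      },
    }),
  ],
})
export class AppModule {}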
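
To see what `attempts: 5` with an exponential backoff of 2000 ms means in practice, here is a small sketch of the retry timeline. It assumes BullMQ's documented built-in formula of 2^(attemptsMade - 1) × delay; the helper name `exponentialRetryDelays` is made up for this post and is not part of BullMQ.

```typescript
// Sketch of the waits between attempts for an exponential backoff.
// Assumes BullMQ's built-in formula: 2^(attemptsMade - 1) * delay.
function exponentialRetryDelays(attempts: number, baseDelayMs: number): number[] {
  const delays: number[] = [];
  // A job with N attempts can be retried at most N - 1 times.
  for (let attemptsMade = 1; attemptsMade < attempts; attemptsMade++) {
    delays.push(Math.round(Math.pow(2, attemptsMade - 1) * baseDelayMs));
  }
  return delays;
}

const retryDelays = exponentialRetryDelays(5, 2000);
const totalWaitMs = retryDelays.reduce((sum, d) => sum + d, 0);
console.log(retryDelays, totalWaitMs);
```

With the config above, the waits come out to 2 s, 4 s, 8 s and 16 s, so a job that keeps failing is given up on after roughly 30 seconds of backoff plus processing time. If your downstream outages tend to last longer than that, raise `attempts` or the base `delay`.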

Structuring Your Queue Service

Separate your producer (the service that enqueues jobs) from your consumer (the worker that processes them). This keeps slow work out of the request cycle: the HTTP handler only enqueues the job and returns.

typescript
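// payment-queue.service.ts (producer)
import { Injectable } from '@nestjs/common';
import { InjectQueue } from '@nestjs/bullmq';
import { Job, Queue } from 'bullmq';

// Assumed minimal shape; extend it to match your payment provider's payload
interface PaymentWebhookEvent {
  id: string;
  type: string;
}

@Injectable()
export class PaymentQueueService {
  constructor(
    @InjectQueue('payment-events')
    private readonly paymentQueue: Queue,
  ) {}

  async enqueueWebhookEvent(event: PaymentWebhookEvent): Promise<Job> {
    return this.paymentQueue.add('webhook', event, {
      // Idempotency key: BullMQ ignores an add whose jobId is still in the queue.
      // Once the job is cleaned up by removeOnComplete, the same ID can be added again.
      jobId: event.id,
      // Lower number = higher priority
      priority: event.type === 'payment_intent.succeeded' ? 1 : 10,
    });
  }

  async scheduleReconciliation(userId: string): Promise<Job> {
    return this.paymentQueue.add('reconcile', { userId }, {
      delay: 5 * 60 * 1000, // 5 minutes
    });
  }
}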
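
One way to keep the priority and idempotency rules easy to unit test is to pull them out of the service into a pure function. The sketch below is only an illustration: `webhookJobOptions` and `WebhookJobOptions` are hypothetical names, and the event shape (`id`, `type`) is an assumption based on the producer above.

```typescript
// Subset of BullMQ's job options that the producer sets per webhook.
interface WebhookJobOptions {
  jobId: string;
  priority: number;
}

// Hypothetical helper (not part of BullMQ): derives job options from the event.
function webhookJobOptions(event: { id: string; type: string }): WebhookJobOptions {
  return {
    jobId: event.id, // the same event ID always maps to the same job ID
    priority: event.type === 'payment_intent.succeeded' ? 1 : 10, // lower number runs first
  };
}

console.log(webhookJobOptions({ id: 'evt_123', type: 'payment_intent.succeeded' }));
console.log(webhookJobOptions({ id: 'evt_124', type: 'charge.refunded' }));
```

The producer could then call `this.paymentQueue.add('webhook', event, webhookJobOptions(event))`, and the rules can be tested without a Redis connection.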

The Worker: Handling Failures Gracefully

The worker is where most bugs hide. Always handle partial failures — if step 3 of 5 fails, the job should retry from the beginning safely. That means idempotent operations.

typescript
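import { OnWorkerEvent, Processor, WorkerHost } from '@nestjs/bullmq';
import { Logger } from '@nestjs/common';
import { Job } from 'bullmq';

@Processor('payment-events')
export class PaymentEventProcessor extends WorkerHost {
  private readonly logger = new Logger(PaymentEventProcessor.name);

  // AlertService is the app's own alerting provider (not shown here)
  constructor(private readonly alertService: AlertService) {
    super();
  }

  async process(job: Job): Promise<void> {
    this.logger.log(`Processing job ${job.id} (attempt ${job.attemptsMade + 1})`);

    try {
      switch (job.name) {
        case 'webhook':
          await this.processWebhook(job.data);
          break;
        case 'reconcile':
          await this.reconcile(job.data.userId);
          break;
        default:
          // Fail loudly instead of silently completing an unknown job type
          throw new Error(`Unknown job name: ${job.name}`);
      }
    } catch (error) {
      const message = error instanceof Error ? error.message : String(error);
      this.logger.error(`Job ${job.id} failed: ${message}`);
      throw error; // Rethrow so BullMQ retries based on the attempts config
    }
  }

  @OnWorkerEvent('failed')
  onFailed(job: Job, error: Error) {
    // attempts is optional in job options; a missing value means a single attempt
    if (job.attemptsMade >= (job.opts.attempts ?? 1)) {
      // Send to dead letter queue or alert on-call
      this.alertService.critical(`Job ${job.id} exhausted retries`, error);
    }
  }

  // processWebhook() and reconcile() hold the domain logic and are not shown
}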
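
What "idempotent" means for a wallet update can be sketched without any queue at all. The example below uses a hypothetical in-memory `WalletLedger` as a stand-in; in a real system the "already applied" check would more likely be a unique constraint on the event ID, written in the same database transaction as the balance change.

```typescript
// Hypothetical in-memory stand-in for a wallet table plus an applied-events table.
class WalletLedger {
  private readonly appliedEvents = new Set<string>();
  private readonly balances = new Map<string, number>();

  // Credits the wallet at most once per eventId; returns false on a replay.
  credit(eventId: string, userId: string, amountCents: number): boolean {
    if (this.appliedEvents.has(eventId)) {
      return false; // a previous attempt already applied this event
    }
    this.balances.set(userId, this.balance(userId) + amountCents);
    this.appliedEvents.add(eventId);
    return true;
  }

  balance(userId: string): number {
    return this.balances.get(userId) ?? 0;
  }
}

const ledger = new WalletLedger();
ledger.credit('evt_1', 'user_42', 500); // first attempt: credit applied
ledger.credit('evt_1', 'user_42', 500); // retry after a later step failed: no-op
console.log(ledger.balance('user_42')); // 500
```

Because the credit is keyed by the event ID, the job can fail at a later step (say, sending the receipt email) and rerun from the top without double-crediting the user.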

Monitoring with Bull Board

Install @bull-board/nestjs (together with @bull-board/api and a server adapter such as @bull-board/express) for a visual dashboard to monitor queues, retry failed jobs, and inspect job data. It is hard to debug production queues without it.
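
A minimal wiring sketch, assuming an Express-based Nest app and the module options described in the @bull-board/nestjs README. The `/queues` route is an arbitrary choice for this example, and since the dashboard exposes job payloads, it should sit behind authentication.

```typescript
// app.module.ts (dashboard additions only). Assumes @bull-board/api,
// @bull-board/express and @bull-board/nestjs are installed.
import { Module } from '@nestjs/common';
import { BullBoardModule } from '@bull-board/nestjs';
import { ExpressAdapter } from '@bull-board/express';
import { BullMQAdapter } from '@bull-board/api/bullMQAdapter';

@Module({
  imports: [
    // ...BullModule.forRoot() and BullModule.registerQueue() from the setup section
    BullBoardModule.forRoot({
      route: '/queues', // assumed path; protect it with auth in production
      adapter: ExpressAdapter,
    }),
    BullBoardModule.forFeature({
      name: 'payment-events', // must match the registered queue name
      adapter: BullMQAdapter,
    }),
  ],
})
export class AppModule {}
```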

💡 Production tip: removeOnComplete: 100 and removeOnFail: 500 keep only the most recent 100 completed and 500 failed jobs, which prevents Redis memory bloat. Keep enough failed jobs for debugging, but not forever.

Patterns for Common Use Cases

  • Webhook processing: Use the webhook event ID as the job ID for idempotency
  • Email sending: Low-priority jobs, high retry count, exponential backoff
  • Report generation: Use job progress updates to show status to the client
  • Scheduled reconciliation: Delayed jobs for one-off follow-ups, combined with repeatable (cron) jobs for periodic sweeps
  • Rate limiting: BullMQ's built-in rate limiter for third-party API calls
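
A few of these patterns, written out as plain option objects. The field names follow BullMQ's job options (`priority`, `attempts`, `backoff`, `repeat.pattern`) and worker options (`limiter`); every value here is an illustrative assumption, and `progressPercent` is a hypothetical helper, not a library function.

```typescript
// Email sending: low priority, many retries, exponential backoff.
const emailJobOptions = {
  priority: 10,
  attempts: 8,
  backoff: { type: 'exponential', delay: 5000 },
};

// Scheduled reconciliation: a repeatable job driven by a cron pattern.
const nightlyReconciliationOptions = {
  repeat: { pattern: '0 2 * * *' }, // every day at 02:00
};

// Rate limiting: worker option that caps throughput for a third-party API.
const thirdPartyApiWorkerOptions = {
  limiter: { max: 10, duration: 1000 }, // at most 10 jobs per 1000 ms
};

// Report generation: percentage to report via job.updateProgress().
function progressPercent(done: number, total: number): number {
  if (total <= 0) return 0;
  return Math.min(100, Math.round((done / total) * 100));
}

console.log(emailJobOptions, nightlyReconciliationOptions, thirdPartyApiWorkerOptions);
console.log(progressPercent(3, 12));
```

Job options go in the third argument of `queue.add()`, while worker options such as `limiter` can be passed as the second argument of the `@Processor()` decorator.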
