SYSTEM_BLOG
TIME: 00:00:00
STATUS: ONLINE
~/blog/drupal-scale-cron-queue-workers
$ cat drupal-scale-cron-queue-workers.md _
| 2025-11-20 | 14 min

Scaling Drupal: Cron Jobs, Queue Workers, and Flat Tables

# Scaling Drupal: Cron Jobs, Queue Workers, and Flat Tables

When Drupal needs to handle 50,000+ records efficiently, standard patterns break down. This guide covers the architectural decisions that enable enterprise-scale data processing.

## The Flat Table Philosophy

Drupal's Entity API is powerful but slow at scale. The solution: flat tables that mirror entity data in query-optimized structures.

PHP
/
  Flat table schema for order processing.
 /
function mymoduleschema() {
  $schema['orderflat'] = [
    'fields' => [
      'orderid' => ['type' => 'int', 'unsigned' => TRUE],
      'customerid' => ['type' => 'int', 'unsigned' => TRUE],
      'status' => ['type' => 'varchar', 'length' => 32],
      'total' => ['type' => 'numeric', 'precision' => 10, 'scale' => 2],
      'created' => ['type' => 'int'],
      'processed' => ['type' => 'int', 'default' => 0],
    ],
    'primary key' => ['orderid'],
    'indexes' => [
      'statuscreated' => ['status', 'created'],
      'customerstatus' => ['customerid', 'status'],
    ],
  ];
  return $schema;
}

### Why Flat Tables Win

ApproachQuery Time (50K records)Memory Usage
Entity API45+ seconds512MB+
Views12-15 seconds256MB
Flat Table0.3 seconds8MB

"Every JOIN you eliminate is a gift to your database."

## Queue Worker Architecture

Drupal's Queue API processes items asynchronously. Here's an optimized worker:

PHP
/
  Processes orders in batches.
 
  @QueueWorker(
    id = "orderprocessor",
    title = @Translation("Order Processor"),
    cron = {"time" = 60}
  )
 /
class OrderProcessor extends QueueWorkerBase {

public function processItem($data) {
$order
id = $data['orderid'];

// Use direct database queries for speed
$connection = \Drupal::database();

$connection->update('orderflat')
->fields(['status' => 'processing', 'processed' => time()])
->condition('orderid', $orderid)
->execute();

// Perform actual processing
$this->executeOrderLogic($orderid);

$connection->update('orderflat')
->fields(['status' => 'completed'])
->condition('orderid', $orderid)
->execute();
}
}

## The Object Catalog Pattern

When entities reference complex objects, the Object Catalog provides efficient lookups:

PHP
class ObjectCatalog {

protected $cache = [];

public function get($type, $id) {
$key = "{$type}:{$id}";

if (!isset($this->cache[$key])) {
$this->cache[$key] = $this->load($type, $id);
}

return $this->cache[$key];
}

public function preload($type, array $ids) {
// Batch load to minimize queries
$missing = arraydiff($ids, arraykeys($this->cache));

if (!empty($missing)) {
$items = $this->loadMultiple($type, $missing);
foreach ($items as $id => $item) {
$this->cache["{$type}:{$id}"] = $item;
}
}
}
}

## Bypass Flags for Bulk Operations

Hooks and event subscribers add overhead. For bulk operations, bypass them:

PHP
function processbulkorders(array $orderids) {
  // Disable entity hooks temporarily
  $state = \Drupal::state();
  $state->set('mymodule.bulkprocessing', TRUE);

try {
foreach (arraychunk($orderids, 100) as $batch) {
processorderbatch($batch);
}
} finally {
$state->set('mymodule.bulkprocessing', FALSE);
}
}

// In hook implementations:
function mymodule
entityupdate($entity) {
if (\Drupal::state()->get('mymodule.bulk
processing')) {
return; // Skip during bulk operations
}
// Normal processing
}

## Cron Job Optimization

Default Drupal cron runs everything sequentially. Split heavy tasks:

PHP
/
  Implements hookcron().
 /
function mymodulecron() {
  // Light tasks only - heavy processing goes to queues
  $queue = \Drupal::queue('orderprocessor');

// Find unprocessed orders
$order
ids = \Drupal::database()
->select('orderflat', 'o')
->fields('o', ['order
id'])
->condition('status', 'pending')
->range(0, 1000) // Limit per cron run
->execute()
->fetchCol();

foreach ($orderids as $orderid) {
$queue->createItem(['orderid' => $orderid]);
}
}

## Performance Monitoring

Track these metrics in production:

PHP
function logqueuemetrics() {
  $queues = ['orderprocessor', 'syncworker', 'emailsender'];

foreach ($queues as $queuename) {
$queue = \Drupal::queue($queuename);
\Drupal::logger('metrics')->info('Queue @name: @count items', [
'@name' => $queue
name,
'@count' => $queue->numberOfItems(),
]);
}
}

## Conclusion

Scaling Drupal requires breaking free from the Entity API's convenience for raw performance. Flat tables, optimized queues, and smart caching transform a content management system into an enterprise data processor.

The patterns here handle 50,000+ records without breaking a sweat. Your mileage will vary based on hardware, but the architecture remains the same.