Background Processing: Use Einhorn to Spawn and Manage Worker Processes

Most modern web applications have a background processing subsystem. Whenever something does not need to happen during the web request (e.g. sending a welcome email when a new user signs up), it can be queued up and processed later, outside of the request-response cycle. This keeps web requests fast and responsive.

Anatomy of a Background Processing System

A background processing system usually consists of the following:

The queue is where the "jobs" are stored until they're processed. A job contains the data necessary for the task at hand (e.g. we need the user_id to look up the user's email in our database before sending a welcome email), as well as some metadata (e.g. the time the job entered the queue).

Producers put jobs into the queue. For example, the request handler for user sign-ups can produce a new job for each successful sign-up, so that a welcome email goes out to the new user (a producer sketch follows below).

Consumers know how to process one or more kinds of jobs. For example, the consumer responsible for sending welcome emails would know how to look up the user, build the welcome message, and deliver it to the user's email address. In other words, the consumers implement your business logic.

Workers poll the queue for new jobs. When a job becomes available, they pop it off the queue, and hand it off to the consumer that knows how to process that job.
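
To make the producer side concrete, here is a minimal sketch of a producer that enqueues a job onto a beanstalkd queue using the beaneater gem (the same client the worker later in this post uses). The tube name and payload are illustrative:

require 'json'
require 'beaneater'

queue = Beaneater::Pool.new(['localhost:11300'])

# Jobs for the device_registrar consumer go onto a tube of the same name
tube = queue.tubes['device_registrar']

# The body carries the data the consumer needs, plus some metadata
tube.put({ 'device_id' => 42, 'enqueued_at' => Time.now.to_i }.to_json)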

Putting the Pieces Together

In most languages, you can find a library that addresses all of the pieces above and presents you with a complete solution. In Ruby, several such libraries exist, including resque, backburner, and delayed_job.

Suppose you want to put together a custom background processing system instead. The most important pieces to worry about are the queue and the workers: what you choose as a queue and how you implement your workers will determine the performance and scalability of the system. How to choose a queue is outside the scope of this blog post, so let's move on to the workers.

The Workers

As described earlier, workers sit in a loop and wait for new jobs. When a job becomes available, they pop it off the queue and process it by delegating to the appropriate consumer. Here is a sample worker in Ruby that uses beanstalkd as the queue:

#!/usr/bin/env ruby

require 'json'
require 'beaneater'

class Worker
  def self.run(consumers)
    # REF https://github.com/beanstalkd/beaneater
    queue = Beaneater::Pool.new(['localhost:11300'])

    # Assume the consumer keys map to beanstalk tubes
    queue.tubes.watch!(*consumers.keys)

    loop do
      begin
        # Wait for up to 10 seconds to get a job
        job = queue.tubes.reserve(10)
      rescue Beaneater::TimedOutError
        # No job arrived within the timeout; poll again
        next
      end

      # Assume that we originally enqueued the job with a JSON body
      job_data = JSON.parse(job.body)

      # Hand the job off to the consumer registered for its tube
      consumers.fetch(job.tube).perform(job_data)

      # Delete the job so beanstalk doesn't hand it out again
      job.delete
    end
  end
end

# Assume that we have AlertDetector and DeviceRegistrar defined
# in our app as below:
#
#   class AlertDetector
#     def self.perform(job_data)
#       # Detect an alert (i.e. new goal is scored)
#     end
#   end
#
#   class DeviceRegistrar
#     def self.perform(job_data)
#       # Register the device for push notifications
#     end
#   end
Worker.run(
  'alert_detector' => AlertDetector,
  'device_registrar' => DeviceRegistrar
)

If you put the code above in bin/work inside your application and make it executable, you can run it as ./bin/work from the root of the application.

The worker above can process only one job at a time. This won't scale as the number of jobs increases; you need to be able to process many jobs in parallel. You can use threads for this, but threads are difficult to use and even more difficult to debug. Besides, if your worker is written in Ruby or Python, the global interpreter lock in the mainstream implementations means you don't get truly parallel threads anyway.

The alternative is to run many copies of the worker process, which is a reliable and popular way to achieve concurrency on unix. But now you need to worry about how you spin up and manage multiple worker processes. This is where most people reinvent the wheel and write custom process-management code. Don't reinvent the wheel.

Enter Einhorn

Einhorn bills itself as the language-independent shared socket manager. But shared socket management is just one of its many features:

  • Einhorn lets you spin up any number of worker processes (the number can be adjusted on the fly).

  • Einhorn can spawn a new pool of workers and gracefully kill off the old ones, allowing seamless upgrades to new versions of your code.

  • Einhorn gets out of your application's way.

This means we can take the worker we wrote earlier and run 20 copies of it without modifying the original code in any way: einhorn -n 20 ./bin/work. That's it. Einhorn does indeed get out of your application's way.
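
Because the worker count can be adjusted on the fly, you can talk to a running Einhorn through its admin shell, einhornsh, which ships with Einhorn. A sketch of a session (command names per the Einhorn README; how einhornsh locates the command socket is also covered there):

$ einhornsh
> inc      # raise the worker count from 20 to 21
> dec      # lower it back to 20
> upgrade  # gracefully replace every worker with a freshly spawned one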

If you use Ruby, Einhorn can also preload your application: it loads your code once, prior to forking, so that the workers share that memory instead of each loading a separate copy.
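
Preloading needs one small accommodation in your code: per the Einhorn README, the preloaded file must define an einhorn_main method, which Einhorn calls in each forked worker. A sketch of what bin/work might look like under this scheme (the require path is illustrative):

#!/usr/bin/env ruby

# Einhorn requires this file once in the master process, then forks
# and calls einhorn_main in each worker
require_relative '../lib/worker' # illustrative path to the Worker class

def einhorn_main
  Worker.run(
    'alert_detector' => AlertDetector,
    'device_registrar' => DeviceRegistrar
  )
end

# Fall back to running directly when not started via Einhorn's preload
einhorn_main if __FILE__ == $0

You would then start it with the preload flag, e.g. einhorn -n 20 -p ./bin/work ./bin/work.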

Conclusion

At theScore, we've started using Einhorn to spin up and manage worker processes in our background processing system that is responsible for generating and delivering push alerts. We process millions of messages through that system every day. Einhorn turned out to be simpler and much more resilient than the custom code we used to have.