There is a much better way, and it involves Hudson (now Jenkins). I introduced “Hudson for cron” as a sidebar at the Drupal Scalability and Performance Workshop a few weeks ago. To my surprise, several of the attendees remarked on their feedback questionnaires that it was one of the most valuable things they picked up that day. So, I’ve decided to write this up for everyone.
Why not cron?
First, I have the burden of explaining why you should drop most use of the tried-and-true cron. To be honest, I don’t think cron is even a “good enough” solution for most of today’s systems:
- You either get email from every run’s output, a dumb log to disk, or no reporting at all. When you do log to disk, you have to worry about segmenting and rotating logs.
- Jobs have to be manually staggered to avoid massive slowdowns at the top of every hour (or at midnight, every ten minutes, or any other common interval).
- Even if the previous job hasn’t finished, cron happily starts up a new one on top of it
- It doesn’t integrate with any other job kickoff or monitoring system
- There’s no built-in ability to run remote jobs, let alone move a remote job from one machine to another
- Most of the web-based tools for configuring cron aren’t very nice
- There’s no built-in logging of job execution time, even though a cron job taking excessive time is one of the most common failure cases.
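To make the overlap and timing complaints concrete, here is a minimal sketch of the wrapper you end up writing by hand around a cron job. It assumes a Unix system (it relies on `fcntl`), and the function name `run_exclusive` and the lock-file path are my own inventions for illustration, not part of cron or any standard tool.

```python
import fcntl
import subprocess
import time


def run_exclusive(argv, lock_path):
    """Run argv unless another run holds lock_path; return (status, seconds).

    status is None when the lock was already held and nothing was run.
    """
    with open(lock_path, "w") as lock_file:
        try:
            # Non-blocking exclusive lock: give up at once if a previous
            # run is still active instead of piling a new one on top.
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            return None, 0.0
        started = time.monotonic()
        status = subprocess.call(argv)
        # The lock is released when lock_file is closed on leaving the block.
        return status, time.monotonic() - started
```

A crontab entry would then call a small script built on this function, for example `run_exclusive(["/usr/local/bin/nightly-report"], "/tmp/nightly-report.lock")` (both paths hypothetical), and you would still have to decide where the status and the elapsed time get reported.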
Why Hudson is better
Here’s how using Hudson with periodic “builds” beats cron:
- Among the myriad ways Hudson can measure success of a “build,” it can verify a zero return status from each “execute shell” build step. If a job simply returns anything but zero, Hudson considers the build a failure and can notify you however you like. It can email you (on first failure only or every time), you can subscribe to build feeds via RSS, or you can simply use the Hudson interface as a dashboard that shows failures in a convenient, summarized way.
- Hudson logs the output of “execute shell” build steps. Success or failure, Hudson archives the build output without filling your inbox or local disk. If the console output isn’t enough, Hudson can archive per-run “build artifacts,” which are files on disk matching a defined pattern. There’s also no-hassle “log rotation” by specifying a cap on the number of builds or a set number of days to keep results; this is configurable per-job. If a particular run had output (say, for troubleshooting) you want to keep around, you can tell Hudson to “keep this build” indefinitely.
- Hudson runs each build on “build executors,” which are effectively process slots. A system can have any number of them, and that number caps how much Hudson tries to do at once, systemwide. This means 50 jobs can be scheduled to run every hour with four “build executors,” and Hudson will queue them all every hour and run four at a time until they’ve all finished.
- If a job is still running when the “periodic build” time comes around, Hudson can either run the job immediately (like cron) or queue the job to run when the one in progress finishes.
- Hudson isn’t limited to time-based scheduling. Sometimes, it’s useful to take a job that used to run periodically (say, a database refresh) and make it only available for manual kickoff. Of course, as a CI tool, Hudson can kick off jobs based on polling a version-control system.
- For remote jobs, Hudson can sign onto systems with SSH, copy over its own runtime, and run whatever you’d like on the remote system. This means that, no matter how many servers in a cluster need scheduled jobs, Hudson can schedule, run, and log them from one server. Hudson can distribute the jobs dynamically based on which machines are already busy, or it can bind jobs to specific boxes.
- Hudson has a solid web interface that can integrate with your Unix shadow file, LDAP, or other authentication methods. For people who prefer operating from the command line, Hudson has a CLI.
- Every job’s running time is logged. Hudson even provides estimates for how long it will take the system to get to any particular job when there’s a queue.
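The “build executor” idea from the list above is easy to model. The sketch below is not Hudson’s implementation; it is a toy stand-in that uses a Python thread pool to show 50 queued jobs draining through four slots. The names `run_with_executors` and `make_job` are made up for this example, and the sleep stands in for real work.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_with_executors(jobs, executor_count):
    """Run every callable in jobs, at most executor_count at a time.

    Returns (results, peak), where peak is the highest number of jobs
    observed running at the same moment.
    """
    lock = threading.Lock()
    running = 0
    peak = 0

    def tracked(job):
        nonlocal running, peak
        with lock:
            running += 1
            peak = max(peak, running)
        try:
            return job()
        finally:
            with lock:
                running -= 1

    # Everything is queued up front; the pool drains the queue
    # executor_count jobs at a time.
    with ThreadPoolExecutor(max_workers=executor_count) as pool:
        results = list(pool.map(tracked, jobs))
    return results, peak


def make_job(number):
    def job():
        time.sleep(0.01)  # stand-in for real work
        return number
    return job


results, peak = run_with_executors([make_job(n) for n in range(50)], 4)
print(len(results), "jobs finished; peak concurrency", peak)
```

The point of the model is that the peak never exceeds the number of slots, no matter how many jobs are scheduled for the same minute. That is the staggering cron makes you do by hand.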
Hudson isn’t perfect
To be fair, there are still a couple of reasons to continue using cron:
- As a Java-based web application, Hudson is heavyweight. A low-memory or embedded system is better off with cron. Even Hudson’s remote job invocation installs and starts a Java-based runtime. You can, however, use the SSH plugin if even one box can run the main Hudson instance.
- Cron’s scheduling is more precise if things have to happen at exact times or intervals. Hudson assumes your periodic builds can tolerate starting a minute or two off schedule.
Adding a cron-style job to Hudson
Moving jobs from cron to Hudson is easy:
- Install Hudson. From the front page of Hudson’s site, there are repositories for Red Hat Enterprise Linux, CentOS, Ubuntu, Debian, and a few others.
- Open Hudson in a browser (on port 8080 by default).
- Click “New Job” and choose the type “Build a free-style software project.”
- Check “Build periodically” and put in a cron-like schedule.
- Click “Add build step” and “Execute shell.” The Hudson wiki has a page explaining this.
- Configure access control from within Hudson.
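What goes in the “Execute shell” step is up to you. Since Hudson judges the step by its exit status, the one thing a job script has to get right is passing a failure back out. Here is a minimal sketch, assuming the build step simply invokes a Python script. The function name `run_steps` is hypothetical.

```python
import subprocess
import sys


def run_steps(steps):
    """Run each argv in order; return the first non-zero exit status, else 0."""
    for argv in steps:
        # Echo the command so it shows up in the archived console output.
        print("+ " + " ".join(argv), flush=True)
        status = subprocess.call(argv)
        if status != 0:
            return status  # stop here so the failure is not masked
    return 0
```

A script built on this would end with `sys.exit(run_steps(steps))`, where `steps` is your own list of commands. Any non-zero status then marks the “build” as failed, and everything the commands print lands in the build’s console log.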
Drupal and Pressflow best practices
Why Hudson with Drush?
- You can configure PHP’s CLI mode to be liberal in error reporting, giving you far more data on failure than a WSOD from wget or cURL. Hudson will also fail the “build” if PHP runs into a fatal error.
- You can block access to cron.php entirely. (This advantage isn’t unique to Hudson integration.)
Because Drush requires local shell execution, there’s a bit more overhead to having one Hudson box run Drupal’s cron on remote servers in a cluster. It’s not that hard, though. Just configure a Hudson “slave” on each box that needs to run Drush and configure each job to run on the “build executor” that hosts the site. If using a Hudson slave is overkill, use the SSH plugin.
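As a rough sketch of what each site’s job could run, the snippet below builds and runs a `drush cron` call using Drush’s `--root` and `--uri` options. The helper names are my own, and the site root and URI in the usage note are placeholders. It also assumes `drush` is on the `PATH` of whatever user Hudson runs the job as.

```python
import subprocess


def drush_cron_argv(site_root, site_uri, drush="drush"):
    """Build the argv for running Drupal cron through Drush for one site."""
    return [drush, "--root=" + site_root, "--uri=" + site_uri, "cron"]


def run_drush_cron(site_root, site_uri):
    """Run Drupal cron via Drush and return Drush's exit status."""
    return subprocess.call(drush_cron_argv(site_root, site_uri))
```

For a hypothetical site, `run_drush_cron("/var/www/example", "http://example.com")` returns the status to hand to `sys.exit()`, so a Drush failure fails the Hudson “build.”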
There are even better reasons to use Drush with Hudson for things like database schema updates, but that’s outside the scope of this blog entry.
In the wild
Four Kitchens is widely using Hudson for cron automation on client sites. We’ve also deployed Hudson to Drupal.org infrastructure for multiple non-testing purposes, including deploying updates to Drupal.org (to be discussed in a future blog post).