My program has a data import process, which will need to be refreshed from time to time. I started out focusing on the code, and for a long time the tests were the only thing that ran it. I prepared fairly high-level operations, and then wrapped a Rakefile around them. There was still a lot of setup code, so I extracted an even higher-level set of operations from the Rake tasks, so they could be driven just as easily from code. Now I just need some way to trigger those operations.
The application is hosted on Heroku at present. Heroku provides a scheduler add-on that runs jobs at intervals of up to a day (though a week would probably be sufficient for my purposes). However, Heroku wants large jobs done by worker processes, and the policy is to cut off other processes that run too long.
This naturally leads to the use of some sort of communication system to instruct the background workers to do something. Commonly, some sort of queue is used, and Resque in particular has been getting some buzz lately. There is a free Redis add-on at Heroku, which is limited by size – not a concern for a communication structure.
Resque queues a class name and a set of parameters. Class names are handy because they are global constants, which can be looked up at the other end – if you want to get really technical, it doesn’t even have to be the same class. Resque calls a perform method on the class, passing the parameters.
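To make that mechanism concrete, here is a minimal sketch of the idea. It does not use Resque itself: ToyQueue is a made-up in-memory stand-in for the Redis-backed queue, and ImportRefresh and its parameters are hypothetical names chosen for illustration. Only the shape (a class name plus arguments on the way in, a constant lookup and a call to perform on the way out) is meant to mirror what is described above.

```ruby
require 'json'

# Hypothetical job class; the name and arguments are illustrative only.
class ImportRefresh
  # Resque reads the queue name from this class-level instance variable.
  @queue = :imports

  # Records calls so the sketch can show that perform really ran.
  def self.performed
    @performed ||= []
  end

  def self.perform(source, limit)
    performed << [source, limit]
  end
end

# Toy in-memory stand-in for the queue (an assumption, not Resque's code).
module ToyQueue
  def self.jobs
    @jobs ||= []
  end

  # Only the class *name* and the arguments are stored, as JSON text.
  def self.enqueue(klass, *args)
    jobs << JSON.generate('class' => klass.name, 'args' => args)
  end

  # The worker side: look the constant up by name and call perform.
  def self.work_one
    raw = jobs.shift
    return false unless raw

    payload = JSON.parse(raw)
    Object.const_get(payload['class']).perform(*payload['args'])
    true
  end
end

ToyQueue.enqueue(ImportRefresh, 'catalog', 100)
ToyQueue.work_one
```

Because only the name travels through the queue, whatever class answers to that name on the worker side is the one that gets called, which is why it does not strictly have to be the same class.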
This sounds pretty handy, at first blush. Just define a perform method (and which queue to put it into) and pretty much anything can become a background job, without having to inherit from some special class or anything of the sort. The problem, of course, is that you can put it everywhere.
This became especially evident when I added Heroku worker autoscaling to avoid having an expensive process running all day to support what will often be a few seconds of work each day. (IronWorker looks interesting, but can’t access Heroku’s shared databases yet.) The autoscaling module has to be mixed into each class which is acting as a worker. As I started out, this involved a fair bit of repetition. Now, there are of course methods to handle the repetition, but between that and sprinkling queue around, I was getting queue all over the application.
So, time for a new gem. The background gem has a module to wrap up the autoscaling and other common configuration. It also becomes the only piece of the application which has a direct dependency on the underlying queuing system (should Resque prove inappropriate at a later time). The background module depends on any other gems that it needs in order for the background jobs to get their work done.
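As a rough sketch of what such a wrapper might look like: every name below (Background, Background::Job, InlineAdapter, RefreshImport) is invented for illustration and is not the actual gem's API. The inline adapter simply runs the job on the spot; the assumption is that a Resque-backed adapter with the same enqueue method would be swapped in, and that the autoscaling hooks would sit in the same module.

```ruby
# Hypothetical sketch; none of these names come from the real gem.
module Background
  # The adapter is the single seam to the underlying queuing system.
  def self.adapter
    @adapter ||= InlineAdapter.new
  end

  def self.adapter=(adapter)
    @adapter = adapter
  end

  # Runs the job immediately. A queue-backed adapter (assumed, not shown)
  # would hand the class and arguments to the queuing library instead.
  class InlineAdapter
    def enqueue(klass, *args)
      klass.perform(*args)
    end
  end

  # Extended by each worker class, so the shared configuration lives in
  # one place. Autoscaling hooks would be added here as well.
  module Job
    def self.extended(base)
      base.instance_variable_set(:@queue, :background)
    end

    def enqueue(*args)
      Background.adapter.enqueue(self, *args)
    end
  end
end

# Hypothetical worker class: one line of setup plus its perform method.
class RefreshImport
  extend Background::Job

  def self.perform(source)
    "refreshed #{source}"
  end
end
```

With this shape, application code calls RefreshImport.enqueue('catalog') and never mentions the queuing library by name, so replacing it later means writing one new adapter.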