
Scheduling operations in Python

I was recently helping a friend build a small project that would run at regular intervals to fetch data from an API provided by an IoT device. In the Linux world, cron is a popular job scheduler, but its syntax isn't the most user-friendly and I've always had a bit of trouble with it.

In this project, I wanted to look into other options, preferably something from the Python ecosystem. Over 10 years ago, Adam Wiggins wrote Rethinking Cron, a really nice article that highlights many of the issues I've also had with cron:

Cron problems are difficult to debug. The arcane syntax of crontab is terse to the point of near inscrutability, making it easy to accidentally schedule jobs at the wrong time. And the subtle differences between a cronjob’s shell environment and your command prompt’s shell environment can be maddening. Lack of feedback makes these or any other problem with your cronjobs difficult to diagnose.

After doing some research, I ran into the schedule module, which works on Python 3.6 onwards and also references Wiggins' article as inspiration, so I decided to test it out.

How to schedule with schedule

Once you have installed the package with pip install schedule and imported it with import schedule, it provides an API that is easy to read and understand:

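import time

import schedule

def job():
  print("I'm working...")

# Register the job to run once every hour.
schedule.every().hour.do(job)

# Check once per second whether any job is due and run it.
while True:
  schedule.run_pending()
  time.sleep(1)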
(example simplified from documentation)

I would argue that anyone reading that has a pretty good idea of what it's gonna do, even if they have never seen the schedule package. That's especially true if we compare it to the equivalent cronjob:

0 * * * * python job.py
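To make the comparison concrete, here is a small sketch that labels the five schedule fields of that crontab entry. The helper name label_cron_entry is made up for this illustration and it is not part of cron or of schedule; it only splits the line, it does not validate or evaluate the expression.

```python
# Names of the five schedule fields in a standard crontab entry, in order.
CRON_FIELDS = ("minute", "hour", "day of month", "month", "day of week")


def label_cron_entry(entry):
    """Split a crontab entry into labeled schedule fields and the command.

    Hypothetical helper for illustration only: no validation is done.
    """
    # The first five whitespace-separated tokens are the schedule,
    # everything after them is the command to run.
    *schedule_fields, command = entry.split(maxsplit=5)
    return dict(zip(CRON_FIELDS, schedule_fields)), command


fields, command = label_cron_entry("0 * * * * python job.py")
for name, value in fields.items():
    print(f"{name:>12}: {value}")
print(f"{'command':>12}: {command}")
```

Read this way, the entry says "at minute 0 of every hour, every day", which is the same thing every().hour spells out directly.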

If you're dealing with cronjobs, I highly recommend crontab.guru to help debug and build new schedules.

I publish my blog every Wednesday morning (I have a manual process so I do it when I wake up). If I had my blog publish process as a Python function, I could do

schedule.every().wednesday.at("08:00").do(publish_blog_post)

The examples page in the documentation gives a good overview of the API it exposes. While it reads well, one thing I don't particularly like about it is the fact that every().hour and every(5).hours use different attributes (singular vs plural) instead of hour always being in one form.

@property
def hour(self):
  if self.interval != 1:
    raise IntervalError("Use hours instead of hour")
  return self.hours
Implementation of hour in schedule

You even get an error if you try to use every(5).hour instead of every(5).hours. For my personal taste, that puts a bit too much emphasis on natural language over technical implementation. It's still a nice library that does its job.

For a library like this, one that you don't use and develop with every day, I think readability is a key asset, and that's where schedule shines.