Developer Area/Cron API

From Mahara Wiki

Revision as of 16:59, 21 January 2014

Mahara provides a Cron API to allow for scheduled tasks. It uses an internal table of schedules to determine how frequently the tasks run, and it uses a lock to prevent two instances of the same task from running at the same time.

How it works

Here's the underlying architecture of the Mahara cron job.

The One and Only cron script, htdocs/lib/cron.php

All of Mahara's cron tasks are handled by one script, htdocs/lib/cron.php. This script is meant to be [[System_Administrator%27s_Guide/Cron_Job|scheduled by the System Administrator]] to run once per minute, either from the command line or through an HTTP request.

It then checks a series of internal tables in Mahara to see which internal cron tasks need to be executed, and carries them out.

Note that the cron script sets the [[Developer_Area/Pagetop_Constants|pagetop constant]] 'CRON', which allows scripts to detect that they're being executed by the cron and to behave accordingly. For instance, some checks for a current logged-in user are skipped when CRON is defined.
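As a minimal sketch of how a script might react to the constant: the helper requires_logged_in_user() below is hypothetical (it is not a Mahara function), and the value 1 passed to define() is an assumption, since only the constant's name is stated above.

```php
<?php
// Hypothetical helper, not part of Mahara: decides whether a logged-in
// user should be required for the current request.
function requires_logged_in_user() {
    // When running under cron there is no browser session,
    // so the login check is skipped.
    return !defined('CRON');
}

// Before the constant exists, the check applies.
$before = requires_logged_in_user();

// The cron script defines CRON; the value 1 is assumed for illustration.
define('CRON', 1);

// After the constant is defined, the check is skipped.
$after = requires_logged_in_user();
```

Using defined() rather than reading the constant's value means the check works regardless of what value the cron script assigns.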

The cron tables

The individual cron tasks that Mahara should execute are stored in a series of tables in Mahara's database.

Cron tasks pertaining to Mahara core (as opposed to a plugin) are stored in the aptly named cron table. Its most important field is cron.callfunction, which holds the name of a PHP function that will be called to carry out that task. The other fields store scheduling information about how often it should run. The legal values for the scheduling section are essentially the same as those for the standard Unix crontab.
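To illustrate what crontab-style scheduling fields mean in practice, here is a simplified sketch of matching a schedule against a point in time. It is not Mahara's actual parser: the function names cron_field_matches() and cron_job_is_due() are hypothetical, only a subset of crontab syntax is handled, all five fields must match (standard cron treats day and day-of-week slightly differently), and UTC is used so the example is deterministic.

```php
<?php
// Returns true if one crontab-style field matches an integer value.
// Handles '*', step values such as '*/5', ranges such as '1-5',
// single numbers, and comma-separated lists of those.
function cron_field_matches($field, $value) {
    foreach (explode(',', (string) $field) as $part) {
        if ($part === '*') {
            return true;
        }
        if (preg_match('#^\*/(\d+)$#', $part, $m)) {
            // Step value: matches when the value is a multiple of the step.
            if ((int) $m[1] > 0 && $value % (int) $m[1] === 0) {
                return true;
            }
        }
        else if (preg_match('/^(\d+)-(\d+)$/', $part, $m)) {
            // Inclusive range.
            if ($value >= (int) $m[1] && $value <= (int) $m[2]) {
                return true;
            }
        }
        else if (preg_match('/^\d+$/', $part) && (int) $part === $value) {
            return true;
        }
    }
    return false;
}

// Returns true if every scheduling field of $job matches $timestamp.
// gmdate() formats: i = minute, G = hour, j = day of month,
// n = month, w = day of week (0 = Sunday).
function cron_job_is_due($job, $timestamp) {
    return cron_field_matches($job->minute, (int) gmdate('i', $timestamp))
        && cron_field_matches($job->hour, (int) gmdate('G', $timestamp))
        && cron_field_matches($job->day, (int) gmdate('j', $timestamp))
        && cron_field_matches($job->month, (int) gmdate('n', $timestamp))
        && cron_field_matches($job->dayofweek, (int) gmdate('w', $timestamp));
}

// Example: a task scheduled for minute 0 of hours 3 and 15.
$job = new stdClass();
$job->callfunction = 'example_task';
$job->minute = '0';
$job->hour = '3,15';
$job->day = '*';
$job->month = '*';
$job->dayofweek = '*';

$due = cron_job_is_due($job, gmmktime(15, 0, 0, 1, 21, 2014));
$notdue = cron_job_is_due($job, gmmktime(15, 1, 0, 1, 21, 2014));
```

Because the cron script runs once per minute, a check of this kind only needs minute-level resolution.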

Each Mahara plugin type has its own separate cron table: blocktype_cron, artefact_cron, etc. These tables are much the same as the core cron table, except they additionally indicate the name of the plugin the task belongs to. The callfunction in these tables should be a static method of the plugin's main class.
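The difference between the two kinds of callfunction can be sketched as follows. This is an illustration only: PluginBlocktypeExample, example_core_task() and run_cron_callfunction() are hypothetical names, and the real cron script may resolve class names and report errors differently.

```php
<?php
// Hypothetical plugin class standing in for a real plugin's main class.
class PluginBlocktypeExample {
    public static $runs = 0;

    // A plugin cron task: a public static method with no required parameters.
    public static function refresh() {
        self::$runs++;
    }
}

// Hypothetical core cron task: a plain function.
function example_core_task() {
    // do stuff
}

// Calls a cron task. Core tasks are plain function names; plugin tasks
// are static methods, addressed as array(classname, methodname).
function run_cron_callfunction($callfunction, $classname = null) {
    $callable = ($classname === null)
        ? $callfunction
        : array($classname, $callfunction);
    if (!is_callable($callable)) {
        // Unknown function or method: nothing to run.
        return false;
    }
    call_user_func($callable);
    return true;
}

$core_ok = run_cron_callfunction('example_core_task');
$plugin_ok = run_cron_callfunction('refresh', 'PluginBlocktypeExample');
$missing_ok = run_cron_callfunction('no_such_task');
```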

Cron locks

In order to avoid concurrency problems, Mahara uses a system of cron locks to prevent multiple copies of the same cron task from running at the same time.

The system is quite simple. Before executing a task, Mahara looks in the database for a lock record for that task. Specifically, it checks for a config record called "_cron_lock_core_{$callfunction}" (for a core cron task) or "_cron_lock_{$plugintype}{$pluginname}_{$callfunction}" (for a plugin cron task). If it finds one, it knows that another copy of the cron task has already claimed the lock and is still executing, so it skips that task. If it finds no lock, it sets the lock itself and begins executing the task. When the task has finished, it deletes the lock record.

But what happens if the cron task crashes before it can delete the lock record? Well, the lock record is a config record with a particular name, but every config also has a value. In this case, the cron script sets the value to be the time the lock was claimed. Using that, we can tell how long a particular task has been running. Each time the cron job finds a lock already present, it checks the timestamp stored in its value, and if it's more than 24 hours old it assumes the lock belonged to a task that crashed, so it clears it and begins executing the task again.
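The locking rules described above can be sketched as follows. This is a simplified model, not Mahara's code: a plain PHP array stands in for the config records in the database, and the names example_cron_lock(), example_cron_free() and EXAMPLE_LOCK_MAX_AGE are hypothetical.

```php
<?php
// A lock older than 24 hours is assumed to belong to a crashed task.
define('EXAMPLE_LOCK_MAX_AGE', 24 * 60 * 60);

// Tries to claim the lock. Returns true if the caller may run the task.
// $config maps lock names to the time at which the lock was claimed.
function example_cron_lock(array &$config, $lockname, $now) {
    if (isset($config[$lockname])) {
        if ($now - $config[$lockname] <= EXAMPLE_LOCK_MAX_AGE) {
            // Another copy of the task holds a fresh lock: skip this run.
            return false;
        }
        // Stale lock left behind by a crashed task: clear it.
        unset($config[$lockname]);
    }
    // Claim the lock, storing the claim time as the record's value.
    $config[$lockname] = $now;
    return true;
}

// Releases the lock once the task has finished.
function example_cron_free(array &$config, $lockname) {
    unset($config[$lockname]);
}

$config = array();
$lock = '_cron_lock_core_example_task';
$start = 1390323540; // an arbitrary fixed timestamp

$first = example_cron_lock($config, $lock, $start);             // lock claimed
$second = example_cron_lock($config, $lock, $start + 60);       // still held
$stale = example_cron_lock($config, $lock, $start + 86401);     // over 24 hours old
example_cron_free($config, $lock);
$again = example_cron_lock($config, $lock, $start + 86500);     // free again
```

Note that a check followed by a separate write is not atomic in this sketch; it only models the decision logic.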

How to set up new cron tasks

Plugin cron tasks

It's quite easy to create cron tasks for a plugin. In the plugin's lib.php file, you simply add a public static function get_cron() to the plugin's "Plugin" subclass. For instance, for the cron task that updates the RSS feeds in external feed blocks, we added a get_cron() method to the class PluginBlocktypeExternalfeed, in the file htdocs/blocktype/externalfeed/lib.php.

The get_cron() method should take no arguments and should return an array of objects, one per cron task. Each object defines a callfunction, as well as any scheduling fields needed (minute, hour, day, month, dayofweek). Any scheduling fields left out will default to '*'. The "callfunction" should be the name of a public static method of the plugin's main class, which can be executed with no required parameters.

Example:

// in htdocs/blocktype/externalfeed/lib.php
class PluginBlocktypeExternalfeed {

   // ... lots of other methods also in this class

   // Returns one object per cron task; omitted scheduling fields default to '*'
   public static function get_cron() {
       // Runs at minute 0 of every hour
       $refresh = new stdClass();
       $refresh->callfunction = 'refresh_feeds';
       $refresh->hour = '*';
       $refresh->minute = '0';

       // Runs once a day at 03:30
       $cleanup = new stdClass();
       $cleanup->callfunction = 'cleanup_feeds';
       $cleanup->hour = '3';
       $cleanup->minute = '30';

       return array($refresh, $cleanup);
   }

   public static function refresh_feeds() {
       // do stuff
   }

   public static function cleanup_feeds() {
       // do stuff
   }
}

And lastly, in order to make sure that existing installations of the plugin get upgraded to include the cron task, increment the plugin's version number in its version.php file.
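The rule that omitted scheduling fields default to '*' can be sketched like this. The function example_normalise_cron_job() is hypothetical and only illustrates the rule; it is not how Mahara's installer is necessarily written.

```php
<?php
// Builds a complete schedule record from one object returned by get_cron(),
// filling in '*' for any scheduling field that was left out.
function example_normalise_cron_job($job) {
    $record = new stdClass();
    $record->callfunction = $job->callfunction;
    foreach (array('minute', 'hour', 'day', 'month', 'dayofweek') as $field) {
        $record->$field = isset($job->$field) ? (string) $job->$field : '*';
    }
    return $record;
}

// Same shape as the cleanup task in the example above.
$cleanup = new stdClass();
$cleanup->callfunction = 'cleanup_feeds';
$cleanup->hour = '3';
$cleanup->minute = '30';

$record = example_normalise_cron_job($cleanup);
// $record now has minute '30', hour '3', and '*' for day, month and dayofweek.
```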

Core cron tasks

The system for adding core cron tasks is not as graceful.

To cover new installations, you add a record to the $cronjobs array in the method core_install_firstcoredata_defaults() in htdocs/lib/upgrade.php. A few cron tasks schedule their times using rand(). These are cron tasks that "phone home" back to mahara.org to check for updates, etc. They're scheduled randomly in order to distribute the load on our servers.

Example:

// in htdocs/lib/upgrade.php
function core_install_firstcoredata_defaults() {
    // ...lots of other code too

    // install the cronjobs...
    // each value lists: minute, hour, day, month, dayofweek
    $cronjobs = array(
        'rebuild_artefact_parent_cache_dirty'       => array('*', '*', '*', '*', '*'),
        'rebuild_artefact_parent_cache_complete'    => array('0', '4', '*', '*', '*'),
        'activity_process_queue'                    => array('*/5', '*', '*', '*', '*'),
        'cron_send_registration_data'               => array(rand(0, 59), rand(0, 23), '*', '*', rand(0, 6)),
        'export_cleanup_old_exports'                => array('0', '3,15', '*', '*', '*'),
        // etc...
    );

    // ... and more stuff after
}
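As a rough sketch of how such an array maps onto rows of the cron table: the function example_cronjobs_to_records() is hypothetical, and the positional order (minute, hour, day, month, dayofweek) is an assumption based on the field list given earlier. It returns the records rather than writing them, where the installer would insert each one into the "cron" table.

```php
<?php
// Converts a $cronjobs-style array into one record object per task.
// Assumed order of each value: minute, hour, day, month, dayofweek.
function example_cronjobs_to_records(array $cronjobs) {
    $fields = array('minute', 'hour', 'day', 'month', 'dayofweek');
    $records = array();
    foreach ($cronjobs as $callfunction => $times) {
        $cron = new stdClass();
        $cron->callfunction = $callfunction;
        foreach ($fields as $i => $field) {
            $cron->$field = $times[$i];
        }
        $records[] = $cron;
    }
    return $records;
}

$records = example_cronjobs_to_records(array(
    'activity_process_queue'     => array('*/5', '*', '*', '*', '*'),
    'export_cleanup_old_exports' => array('0', '3,15', '*', '*', '*'),
));
// $records[1] describes export_cleanup_old_exports at minute 0 of hours 3 and 15.
```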

And then, to cover upgrades, you add an insert_record() call in the main htdocs/lib/db/upgrade.php file, to insert the proper record directly into the "cron" table.

// in htdocs/lib/db/upgrade.php
if ($oldversion < 2012062902) {

    // Insert cron job to save institution data
    $cron = new stdClass();
    $cron->callfunction = 'cron_institution_data_weekly';
    $cron->minute       = 55;
    $cron->hour         = 23;
    $cron->day          = '*';
    $cron->month        = '*';
    $cron->dayofweek    = 6;
    insert_record('cron', $cron);
}


And of course, you should increment the version number in htdocs/lib/version.php.