Developer Area/File uploads API

From Mahara Wiki

Revision as of 15:00, 5 May 2016 by Aaronw (talk | contribs)

This page discusses how to use Mahara's APIs for uploading files, storing them on the server, and retrieving them for later use.

Basic principles

Dataroot, not webroot

The most basic thing to understand about Mahara is that you should never store uploaded files in the Mahara code directory itself. There are two main reasons for this. First, the uploaded files may get clobbered during a later upgrade. More importantly, it's insecure, because it can easily create a remote code execution vulnerability. For instance, if you stored file uploads in htdocs/artefact/file/uploads, then an attacker might upload a malicious PHP file, work out that it will be stored as htdocs/artefact/file/uploads/myscript.php, and run it by requesting a URL such as http://www.example.com/mahara/artefact/file/uploads/myscript.php in their browser.

To avoid this whole category of vulnerabilities, Mahara instead stores files under a separate dataroot directory, which should not be directly accessible by URL. The path of the dataroot directory is specified in the config.php file, as $cfg->dataroot.
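
As a rough illustration of the idea, the sketch below builds a storage path under dataroot and reduces the uploader's file name to its last component so it cannot point anywhere else. The function name dataroot_path_for and the "myplugin" directory are hypothetical and not part of Mahara; in real code the dataroot would come from $cfg->dataroot rather than a function argument.

```php
<?php
/**
 * Build an absolute path for a file stored under the dataroot.
 * Hypothetical helper for illustration only; not a Mahara API.
 *
 * @param string $dataroot Absolute path of the dataroot directory
 * @param string $subdir   Directory under dataroot reserved for your files
 * @param string $filename Name supplied by the uploader (untrusted)
 * @return string
 * @throws InvalidArgumentException if the name is unusable
 */
function dataroot_path_for($dataroot, $subdir, $filename) {
    // Treat backslashes as separators too, then keep only the last
    // component, so "../../htdocs/evil.php" becomes "evil.php".
    $safename = basename(str_replace('\\', '/', $filename));
    if ($safename === '' || $safename === '.' || $safename === '..') {
        throw new InvalidArgumentException('Unusable file name');
    }
    return rtrim($dataroot, '/') . '/' . trim($subdir, '/') . '/' . $safename;
}
```

Because the result always sits directly inside the chosen subdirectory of dataroot, a hostile file name can't steer the file back into the web-accessible code directory.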

To avoid interfering with other parts of Mahara that store files in dataroot, you should either store your files through an existing Mahara file storage API (which will determine the storage location for you), or create your own new directory under dataroot just for the new class of files you're creating. If you're only uploading a file temporarily, though (such as a CSV file), you probably don't need to store anything in dataroot, and can just process the PHP upload temp file directly.
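
For the temporary-upload case, a minimal sketch might read the rows straight from PHP's upload temp file without copying it anywhere. The function name read_csv_rows is hypothetical, and the sketch deliberately leaves out the virus scanning and size checks described under Validation.

```php
<?php
/**
 * Read every row of a CSV file into an array of arrays.
 * Hypothetical helper for illustration only; not a Mahara API.
 *
 * @param string $path Path to the CSV file, e.g. $_FILES['csv']['tmp_name']
 * @return array
 * @throws RuntimeException if the file cannot be opened
 */
function read_csv_rows($path) {
    $handle = @fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException('Could not open uploaded file');
    }
    $rows = array();
    while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        // fgetcsv() returns array(null) for a blank line; skip those.
        if ($row !== array(null)) {
            $rows[] = $row;
        }
    }
    fclose($handle);
    return $rows;
}
```

In a form handler you would typically pass in the tmp_name value from the relevant $_FILES entry, after confirming with PHP's is_uploaded_file() that the path really is an upload. PHP deletes the temp file itself at the end of the request.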

Validation

Uploaded files must be validated in a few different ways:

  1. ClamAV virus scanning (if enabled)
  2. Upload limits on file size (which can be specified in Mahara, or via php.ini)
  3. File storage quota for the user, group, or institution that owns the file
  4. For files that will be served directly via the web server (such as images, video, audio, PDFs, fonts, etc.), it's also important for security reasons to validate that the file actually contains the type of content it claims to
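
To make the last point concrete, here is a minimal sketch of a content-type check using PHP's fileinfo extension. It is not how Mahara itself implements the check; the function name content_type_is_allowed and the allowlist are hypothetical.

```php
<?php
/**
 * Check that a file's detected content type is on an allowlist.
 * Hypothetical helper for illustration only; not a Mahara API.
 * Requires PHP's fileinfo extension.
 *
 * @param string $path    Path to the file on disk
 * @param array  $allowed MIME types the caller is willing to serve
 * @return bool
 */
function content_type_is_allowed($path, array $allowed) {
    // Inspect the file's bytes; the browser-supplied name and
    // $_FILES[...]['type'] are under the uploader's control.
    $finfo = new finfo(FILEINFO_MIME_TYPE);
    $detected = $finfo->file($path);
    return $detected !== false && in_array($detected, $allowed, true);
}
```

The point of the design is that the decision rests on what the file contains, not on its extension or on the type the browser reported.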

The easiest way to handle this validation properly is to use one of Mahara's existing file management APIs.

File upload APIs

Pieforms

There are three Pieforms elements for handling file uploads.

Pieform "filebrowser" element

The best option, where applicable, is to use the Pieform "filebrowser" element. This allows the user to upload a file, or to select a file they've already uploaded into their Content -> Files area. On the downside, it's rather complex, so it can be tricky to use, and it relies on a lot of JavaScript, which can cause subtle bugs if you load it dynamically (say, in a modal window). But there are plenty of examples of its usage in the Mahara codebase to choose from. See htdocs/artefact/blog/post.php for one of the simpler examples, where it's used for journal entry attachments.

In general usage, what you do is add a "filebrowser" element to your pieform. Then, in your pieform submit method, you simply check for the value of the filebrowser element, and it will give you the ID of the selected (or uploaded) file artefact. You don't need to write any code to handle saving the file; this is all handled automagically by the file browser when the form is submitted.

It's recommended to use this element whenever you want users to upload a file that will belong to that specific user (or to a group or institution) and be accessible to them as an artefact. This element is not usable by logged-out users (who don't have their own artefacts), and it's generally not applicable for file uploads that will only be processed temporarily, like Leap2A imports or CSV files uploaded by admins. It also can't be used to process files that have been uploaded outside of the web browser, such as files coming from web services or cURL requests.

Pieform "files" element

A simpler option than the filebrowser, the "files" Pieform element creates a dynamically expandable list of file upload buttons. It is gradually being replaced with the filebrowser in most places, but is still in use in some areas of core, such as comment attachments and résumé attachments. For an example of its usage, look at the comment artefact's form validation and submission methods, add_feedback_form_validate and add_feedback_form_submit, in htdocs/artefact/comment/lib.php.

This element returns an array in which each item is a key into the PHP $_FILES superglobal, representing one of the uploaded files. It does none of the necessary Mahara validation or file storage, so you'll need to use one of the APIs below for that.
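
A minimal sketch of walking that array might look like the following. The function name uploaded_entries is hypothetical, and the sketch assumes each key names a single-file $_FILES entry (one with a scalar "error" field). It only filters on the status PHP reported, and is no substitute for Mahara's own validation.

```php
<?php
/**
 * Look up the $_FILES entries named by a "files" element's value.
 * Hypothetical helper for illustration only; not a Mahara API.
 *
 * @param array $keys  Value returned by the "files" element
 * @param array $files The $_FILES superglobal (passed in for testability)
 * @return array Map of key => $_FILES entry, for uploads PHP accepted
 */
function uploaded_entries(array $keys, array $files) {
    $entries = array();
    foreach ($keys as $key) {
        if (isset($files[$key]) && $files[$key]['error'] === UPLOAD_ERR_OK) {
            $entries[$key] = $files[$key];
        }
    }
    return $entries;
}
```

Each key that survives can then be used as the input name when calling one of the storage APIs.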

Pieform "file" element

There is also a very simple Pieform element called "file", which only creates an <input type="file"> and does no file handling at all. See htdocs/admin/groups/uploadcsv.php for an example of its use.

This element simply returns the entry in the PHP $_FILES superglobal that represents the uploaded file. It does no Mahara validation or file storage, so if you use it you'll need to handle file validation and processing manually, using one of the APIs below.
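
Before handing such an entry to one of those APIs, it can be useful to look at the status code PHP recorded for it. The sketch below is one way to do that; the function name upload_problem and the messages are placeholders rather than Mahara language strings, and it only covers PHP's own status, not Mahara's validation.

```php
<?php
/**
 * Describe the problem PHP reported for a single $_FILES entry.
 * Hypothetical helper for illustration only; not a Mahara API.
 *
 * @param array $entry One entry from $_FILES
 * @return string|null Null if PHP accepted the upload
 */
function upload_problem(array $entry) {
    // A missing status is treated the same as "no file was sent".
    $error = isset($entry['error']) ? $entry['error'] : UPLOAD_ERR_NO_FILE;
    switch ($error) {
        case UPLOAD_ERR_OK:
            return null;
        case UPLOAD_ERR_INI_SIZE:
        case UPLOAD_ERR_FORM_SIZE:
            return 'The file is larger than the upload limit.';
        case UPLOAD_ERR_PARTIAL:
            return 'The file was only partially uploaded.';
        case UPLOAD_ERR_NO_FILE:
            return 'No file was uploaded.';
        default:
            return 'The upload failed on the server.';
    }
}
```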

ArtefactTypeFile::save_uploaded_file

Whenever a user uploads a file that goes into their "Content -> Files" storage area, that's a file artefact. If you use the filebrowser Pieform element, a File artefact will have been created for you and you won't have to worry about this. But if you're using a different file upload method, you can use this static method to save the uploaded file into a File artefact.

Unfortunately, this method relies on a poorly documented $data parameter. The $data object fills in many fields of metadata for the file artefact (it's ultimately passed to the constructor for the artefact's object). The best example of its usage is probably in the Pieform filebrowser element, in the pieform_element_filebrowser_upload function in htdocs/lib/form/elements/filebrowser.php.

Example code:

$data = new stdClass();

// The folder to put the file in. This refers to a "folder" artefact
// in the "Content->Files" area, not to a physical directory on the
// disk. This value should either be the ID of the folder artefact,
// or NULL to store the file in the home directory of the user/group
$data->parent = $parentfolderid;

// Who owns the folder. The ID of a user OR a group, or the name
// of an institution. (Or 'mahara' for site files.)
$data->owner = $ownerid;
$data->group = $groupid;
$data->institution = $institutionname;

// The title for the new file artefact. To avoid name conflicts, it's
// best to use ArtefactTypeFileBase::get_new_file_title to make sure
// you've got a unique name.
$originalname = $_FILES[$inputname]['name'];
$originalname = $originalname ? basename($originalname) : get_string('file', 'artefact.file');
$data->title = ArtefactTypeFileBase::get_new_file_title(
    $originalname,
    $parentfolderid,
    $data->owner,
    $data->group,
    $data->institution
);

try {
    $artefactid = ArtefactTypeFile::save_uploaded_file(
        $inputname, // The name of the <input type="file"> element the file came from
        $data,
        $inputindex, // (optional) If you're using an array of file elements, the index of the file to process
        $resized // (optional) If you've processed the file on the server-side, set this to TRUE to tell Mahara to re-check the file's size
    );
    
    // $artefactid has the ID of the new artefact
}
catch (QuotaExceededException $e) {
    // The file was too big for the user/group/institution's file storage quota.
}
catch (UploadException $e) {
    // There was some other problem uploading the file.
}

upload_manager

If you need to save a file that is not a file artefact, the upload_manager class is probably the best option. This is the class that ArtefactTypeFile uses to move the physical files around, and it's the class used by most of the admin pages that need to store non-artefact files (such as skin fonts).

The preprocess_file method of this class will run the file through ClamAV and validate it for upload limits. (It does not check any user/group/institution file storage quotas.)

This class assumes that you are processing files that have been uploaded by the user and are present in the PHP $_FILES superglobal, and it's built to fit into the lifecycle of Pieforms validation and submission (typically along with a Pieform "file" or "files" element). If you're processing files that have come from somewhere else, like web services or a zip archive, you'll need to go even more basic.

For a simple example of it in action, look at the code for uploading skin fonts in htdocs/admin/site/font/add.php.

Example code:

function myform_validate(Pieform $form, $values) {
    $um = new upload_manager(
        $inputname, // Name of the <input type="file"> form element
        $handlecollisions, // (Optional, default false) Rename upload instead of replacing existing file with same name
                           // Use with care, because the upload_manager won't tell you if it changed the upload's name!
        $inputindex, // (Optional) If you've used an array of file inputs, which index to handle
        $optional // (Optional) Set to true if the file upload is NOT a required form field
    );

    $error = $um->preprocess_file();
    if ($error) {
        $form->set_error($inputname, $error);
    }
}

function myform_submit(Pieform $form, $values) {
    $um = new upload_manager(
        $inputname,
        $handlecollisions,
        $inputindex,
        $optional
    );

    $error = $um->save_file(
        $directory, // The directory under $cfg->dataroot to store the file in. Will be created if necessary.
        $filename // The name to save the file as (if you use $handlecollisions this may be changed without notice!)
    );

    if ($error) {
        // Handle the error during upload.
    }
}