GDPR compliance
The GDPR (General Data Protection Regulation) takes effect in the EU on 25 May 2018. As a regulation, it is directly binding, so organizations need to adhere to it.
There are some changes necessary in Mahara while others need to be dealt with by institutions themselves, e.g. via the Terms and Conditions they set out on Mahara.
For definitions of terms, see Article 4 of the GDPR.
GDPR applies to all content that potentially contains data identifying a specific person, including group content (vague as that definition is). The institution is responsible for the final decision on what this entails.
GDPR also affects instances that are not hosted in the EU but are used by people in the EU.
Disclaimer: The information provided here does not constitute legal advice but is provided for informational purposes and discussion only. If you need legal advice, please consult your own lawyer.
Non-technical considerations for the T&C template
The following items need to be reviewed for an updated Terms and Conditions template, and sample content may need to be written for some of them.
- Institutions will need institution-specific wording that covers how they see and implement data protection compliance; not every institution has the same requirements.
- Time limits for keeping data: The T&C should state how long backups are kept, how long assessment data is kept if portfolios are submitted for evaluation and the institution needs to retain them, and how long log files used for reporting are kept.
- Data being processed for analytics purposes and internal reporting: Students should be made aware that their data will be included in analytics. Whether this requires separate consent depends mostly on what exactly is done with the data (kept with the controller or given to third parties, and whether it is anonymised on some level or not).
- Keeping logs of user activity for reporting purposes (currently, we can't exclude individuals from logging, and an institution could argue that students shouldn't be able to opt out because the institution couldn't otherwise do its work of supporting students properly).
- Integration points, so users know where other data of theirs comes from or goes to. The simplest and most frequent cases here are SSO and LTI.
- Purpose(s) of the data processing and who's doing it (see Par. 39 and others).
- Where the data is hosted.
- Recital 48 - this is something that would be sorted out between the institution (as controller) and the provider (as processor) and should be declared as part of the consent declaration. In theory a UK university using Catalyst EU to manage hosting on AWS should declare all three. Whether this will happen in practice is not clear, but it's not really Mahara's problem provided the administrator can write an appropriate declaration/T&Cs.
- Contact person in the EU.
- Plain language in the T&C rather than legalese.
Must-do technical changes
- The possibility for institutions to have their own T&C and privacy statements needs to be rethought: institutions are still part of the wider site, site admins can also run reports, and institutions can't decide which reports are made available. Rather than allowing institutions to fully overwrite the T&C and privacy statement, allow them to add information to the existing statement. That way they can add more institution-specific information without removing site information.
- Explicit consent checkbox / Yes/No switch.
- It needs to be possible for anybody to delete their account, no matter whether registration is allowed on the site or not.
- Par. 32:
- Re-order where T&C are stated during registration. Currently, it's below the “Register” button, which encourages people not to read them. We should have registration details, T&C text, then the agreement radio button and only then the “Register” button.
- We also need a separate consent button besides the T&C agreement checkbox / Yes-No switch.
- People who come in via external authentication or whose accounts are created for them will also need to be able to accept the T&C.
- It should be possible to re-trigger the T&C acceptance, e.g. when the T&C have changed. Initially, this could be done via a DB command, but an admin setting would be more convenient. If institutions can have their own T&C, they would need to be able to re-trigger acceptance just for their institution.
- Keep previous versions of the consented T&C and make these available to both user and admin.
- Nice to have: Inform people a week or more in advance of the upcoming changes. This can be done via announcement on homepage and dashboard. -> No change needed at this stage.
- Theoretically, a user should be able to consent to some kinds of data being kept and opt out of others, e.g. accept that data is displayed in Mahara but not consent to log files being kept for reporting purposes. However, that is rather complicated and also conflicts with an institution's need to gather data in order to support students in their learning. For the time being, users won't be able to opt out of certain data types. Our technical implementation should allow for it, though, i.e. the consent section should be built in such a way that more fields can be added later if needed. We might need that anyway for multi-tenancy:
- Agree to the site T&C.
- Agree to additional institution T&C (where applicable).
- Consent to the data processing on the site.
- Consent to an institution utilizing the data and having it processed.
- Par. 32: Have an admin report (column in the user overview report) that shows when a user last accepted the T&C (applicable to their institution) and link to the relevant T&C that they agreed to. Initially, it could be sufficient to require everyone (new and existing users) to accept the T&C or they'll not be allowed into their account.
- Par. 59: Need to check what data we keep when a user account is deleted. We know that the username and email address get “deleted” and a hash appended, but it's still possible to deduce the previous information. This needs to be fully pseudonymised, i.e. a complete hash rather than keeping anything of the previous username and email address. We can't completely delete the record as it is still needed for group content (which still needs decisions).
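To make the pseudonymisation requirement for deleted accounts more concrete, here is a minimal sketch in Python (not Mahara's own PHP code). `UserRecord` and `pseudonymise_deleted_user` are hypothetical names invented for illustration, and the record is a simplified stand-in for the real user table. Note that the sketch swaps in a random token instead of a hash of the old values; this is an assumption on our part, chosen because a token that is not derived from the username or email address cannot be used to deduce them.

```python
import secrets
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class UserRecord:
    # Hypothetical, simplified stand-in for a row in the user table.
    id: int
    username: str
    email: str
    deleted: bool = False


def pseudonymise_deleted_user(user: UserRecord) -> UserRecord:
    """Return a copy of the record that keeps nothing of the old username or email.

    The record itself (and its id) is kept so that group content can still
    refer to it.
    """
    # Random token, not derived from the previous values (assumption: a
    # random value is acceptable in place of the "complete hash").
    token = secrets.token_hex(16)
    return replace(
        user,
        username=f"deleted-{token}",
        # ".invalid" is a reserved top-level domain, so no mail can be sent here.
        email=f"{token}@deleted.invalid",
        deleted=True,
    )
```

Because the id is unchanged, forum posts and other group content that reference the account would still resolve to a record, while the record itself no longer reveals who the person was.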
Nice-to-have technical changes
GDPR leaves open how information is shared with a user. This does not have to be an automated technical process but can be handled more manually. It is important, though, that the information is provided in a timely fashion.
- Keep a log of which admin (site and institution) made what changes in the site and institution settings.
- If deleting forum posts and other group artefacts created by a user is allowed (institution setting), we'd need a placeholder like the one we have right now that says “Deleted by the owner” or, if an account is deleted entirely, “Account was deleted” (when an artefact / portfolio is missing). YouTube, for example, does this when media has been deleted: it doesn't remove it quietly but shows you that it was deleted. The same should be done with files and journal entries that appear in pages so it's clear that there used to be content at some point.
- Before group content is deleted, though, it might be good to put it in a pending state so that the group admin can review the request. This prevents students from deleting content that they shouldn't, for example because other group members still need it or because the content has a license that allows re-use. Comment: The definition of 'personal data to be deleted' is quite vague because, in practice, what counts as 'personal data' is defined by the data controller. There are also arguments for preserving group content on the basis of not infringing on the rights of others, but collaborative works are not really covered in the GDPR on any level. Giving an administrator the choice to delete or not delete would provide suitable functionality from Mahara's perspective.
- A page where a user can see information about all of their group activities, i.e. forum posts, file uploads, journal entries, group pages / collections, comments and annotation feedback, with direct links to these. Info should include:
- Artefact type
- Creation date
- Maybe even possibility for bulk delete
- Par. 39 / 63: Display for how long log events are kept when advanced reporting is used rather than relying on site admin to keep that info up to date in the T&C manually.
- Par. 61:
- When a user logs in for the first time, they should probably get a pop-up telling them that their profile page is visible either to their institution only or to the entire site, and what that means. That would be more explicit than having it in the T&C. Initially, though, we could probably state it explicitly in the T&C that they accept.
- We have the possibility to retain access to a copy of a portfolio when that option is ticked in the sharing options. Right now, we don't inform the user who copies the portfolio about that. We should put a notice on the page when the copying is done to let the user know, so they are not surprised when they go to the sharing screen and see it. Something like: Your portfolio has been shared automatically with “...” as per the copying options. You can change the access permissions. [link “change the access permissions” to the sharing screen specifically for this portfolio rather than the “Shared by me” overview screen]
- Remove gender, marital status and visa as resume fields as they are not needed. Rather have the possibility to set up custom resume / profile fields.
- Make it possible for fields to be hidden entirely in the profile rather than just greying them out when they are locked.
- Big change, as it involves a connection to another provider (needs more analysis into feasibility and ease of use, or whether a simple Leap2A export would be sufficient): Transfer a portfolio from one Mahara site to another directly (Par. 68).
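The pending state and placeholder ideas from the list above could fit together roughly as follows. This is only a sketch under stated assumptions, written in Python rather than Mahara's PHP: `GroupArtefact`, `DeletionState`, `request_deletion`, `review_deletion` and `render` are hypothetical names, and the model ignores everything about real artefacts except what is needed to show the flow. The two placeholder strings are the ones suggested in the text.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class DeletionState(Enum):
    NONE = "none"        # no deletion requested
    PENDING = "pending"  # owner asked for deletion, group admin has not decided
    DELETED = "deleted"  # content removed, placeholder is shown instead


@dataclass
class GroupArtefact:
    # Hypothetical, simplified model of a forum post, file or journal entry.
    body: Optional[str]
    owner_account_exists: bool = True
    state: DeletionState = DeletionState.NONE


def request_deletion(artefact: GroupArtefact) -> None:
    """Owner asks for deletion; the content stays visible until reviewed."""
    if artefact.state is DeletionState.NONE:
        artefact.state = DeletionState.PENDING


def review_deletion(artefact: GroupArtefact, approve: bool) -> None:
    """Group admin approves or rejects a pending deletion request."""
    if artefact.state is not DeletionState.PENDING:
        raise ValueError("no pending deletion request")
    if approve:
        artefact.body = None  # drop the content, keep the record
        artefact.state = DeletionState.DELETED
    else:
        artefact.state = DeletionState.NONE


def render(artefact: GroupArtefact) -> str:
    """Show the content, or a visible placeholder once it has been deleted."""
    if artefact.state is DeletionState.DELETED:
        if artefact.owner_account_exists:
            return "Deleted by the owner"
        return "Account was deleted"
    return artefact.body or ""
```

Keeping the record and only clearing its body is what allows a page to show that content used to be there rather than dropping it quietly.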
Items to look into and decide on going forward
- Q: Do we keep the IP address only in the access and error logs or also in the database? A: Only in the access / error logs unless someone reveals their own IP address in an artefact or forum post.
- Recording of user agents (UA) could be an issue: User-agent is a complicated one, because it looks like it isn't personal data, but it's possible that it could become so. The definition is essentially, 'can you identify a person from this data' - and while it's far less common than it used to be, browsers do sometimes insert extensions' names into the user agent. Combinations of these could potentially be unique to an individual's account on a computer. This is firmly in theoretical territory though because nothing that could be found suggests UA is considered personal anywhere, but a lot of the discussions about it qualify that with 'yet'.
- Underage people registering: Initially, this can be handled by an institution using an external auth method or creating accounts manually rather than allowing self-registration. Since a Mahara instance can be set up for one institution only, only its members could gain access. We have no legal way to verify the age of a person who self-registers and would need to trust the data they enter.