Proposals/Management of third party libraries

From Mahara Wiki

Revision as of 13:30, 25 November 2021 by Gold (talk | contribs) (New third-party library proposal)

tl;dr

  • Remove third-party libraries from the Mahara repo.
  • Use Composer and Composer Patches to manage PHP libraries.
  • Use npm and Patch Package to manage JavaScript libraries.

The issue

Mahara currently maintains third-party libraries in its own repo. Many of these are patched with Mahara-specific customisations. As a result, Mahara maintains a lot more code than it needs to.

A resolution

Modern practice for library management is to pull libraries in from their official sources. Mahara would then either extend these libraries, overriding the methods that need Mahara-specific functionality, or maintain a small number of patch files that the management system applies when the library is pulled down.

For example, ADOdb currently has two patched files, one overridden method, and a large number of files removed. Under this approach we would maintain one patch file and a single class file that extends ADOdb. This would remove 776K of code from our project and massively reduce the time required to upgrade the library. The patch file itself could eventually be removed as well: for each change we make, we would create an issue on the source project, upload the patch to it, and reference that online patch. All we would then maintain is a reference to the patches in the management system's config file.

Dwoo is 972K, chrome-php is 12M, etc. All these are in our repo.

Options

PHP (composer)

For PHP, the dominant dependency manager is Composer. It has been in development since 2011 and was first released in 2012. Others like Maven and Pyrus seem to have suffered from bit-rot over the years. There is also PEAR, but its dependency management has never been that good.

The Composer Patches package allows third-party code to be patched when it is installed. It works with local patch files and with patches hosted at URLs. If we need to patch files for Mahara-specific things, we can maintain those patch files ourselves. If we push fixes upstream, we can reference the URI of the patch until it is accepted.
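
As a rough sketch of how this could look, the composer.json fragment below declares one local patch and one patch hosted at a URL. It assumes that the Composer Patches package meant here is cweagans/composer-patches and that ADOdb is pulled in as adodb/adodb-php. The version constraints, patch descriptions, patch file path, and URL are hypothetical placeholders, not decisions made in this proposal.

```json
{
    "require": {
        "adodb/adodb-php": "^5.21",
        "cweagans/composer-patches": "^1.7"
    },
    "config": {
        "allow-plugins": {
            "cweagans/composer-patches": true
        }
    },
    "extra": {
        "_comment": "Illustrative only: package versions, patch names, path and URL are assumptions.",
        "patches": {
            "adodb/adodb-php": {
                "Mahara-specific customisation (local file)": "patches/adodb-mahara.patch",
                "Fix submitted upstream, pending acceptance": "https://example.org/adodb-upstream-fix.patch"
            }
        }
    }
}
```

With a setup along these lines, the listed patches would be applied whenever Composer installs or updates the package. The allow-plugins entry is only needed on Composer 2.2 and later.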

Javascript (npm)

In the JavaScript space we have npm, Yarn, Bower, webpack, and others. The four listed are the ones I'm familiar with, as a user.

As I understand it, npm fills the same package manager space as Composer. Yarn is an alternative to the npm client that adds project management features. They share the same package registry.

Patch Package looks to be the Composer Patches of the npm world. It also generates the patch files and, if a patch is to be pushed upstream, it can create issues for GitHub-hosted libraries as well.
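
A minimal sketch of the equivalent npm setup, assuming the patch-package tool from the npm registry: the package.json fragment below runs it after every install so that any stored patches are reapplied. The library name example-lib and all version numbers are hypothetical.

```json
{
    "//": "Illustrative only: example-lib and all version numbers are hypothetical.",
    "scripts": {
        "postinstall": "patch-package"
    },
    "dependencies": {
        "example-lib": "1.2.3"
    },
    "devDependencies": {
        "patch-package": "^6.4.7"
    }
}
```

In this sketch, after editing files under node_modules/example-lib, running `npx patch-package example-lib` would write a patch such as patches/example-lib+1.2.3.patch, which is then committed to the repo. Adding the `--create-issue` option opens a draft issue against the library's GitHub repository.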

Bower is still in use, but it is no longer actively developed and is only being maintained. Its homepage recommends other tools as suitable alternatives.

Webpack is a module bundler rather than a package manager: it bundles code that has already been installed by a tool such as npm or Yarn.

Reliance on third party library management systems

npm has been in the news lately regarding security problems.

npm is open source and consists of 2 major components. The CLI tool and the registry. If the extra security is needed we could stand up our own registry and only include the packages/dependencies we need.

Note: if we're going that far, though, we should be doing the same for Composer and APT (OS package management) as well.
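
On the Composer side, a sketch of what restricting installs to a self-hosted repository might look like is shown below. The repository URL is a hypothetical placeholder. The second entry turns off the default Packagist repository so that only the listed source is consulted.

```json
{
    "repositories": [
        {
            "type": "composer",
            "url": "https://packages.example.org"
        },
        {
            "packagist.org": false
        }
    ],
    "extra": {
        "_comment": "Illustrative only: packages.example.org is a hypothetical self-hosted repository."
    }
}
```

For npm, the equivalent is a registry setting in npm's own configuration rather than in package.json.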

It is worth noting that every package management system can be at risk like this. More often than not, though, it is the packages rather than the package manager that are the risk.

It is also worth noting that we already use npm.

What if Composer, npm, or any other package manager we choose folds and is no longer available?

If this happens, most of the Internet has probably collapsed and there are more urgent issues in the world. It would be on the scale of APT or YUM going away.

To answer the question though, every installation archive we have will have an historical record of the state of the libraries at the time they were built. These could easily be added to the repo should they be needed and the libraries maintained manually as we currently do. The only change would be where they are located.

Effort

Switching to the new system

  • Each library would be removed from the repo. (quick)
  • Each library would be added to the appropriate manifest file: composer.json or package.json. (very quick)
  • Current patches/customisations would be extracted as a .patch file and added to the repo, the manifest, and tested. (moderate work)
  • In some cases how we implement the library may change. For example, I would suggest we create a MaharaADODB class that extends ADODB. With the current upstream patches, and this approach to overriding the one function we currently customise, there would be no need to patch ADODB at all. (moderate work in the case of ADODB)
  • npm packages, being JavaScript, would only need a path update, I believe.
  • Libraries added via Composer would be included via the use of vendor/autoload.php. This would be added to init.php making it available everywhere.

Maintaining the new system

  • Effort is reduced.
  • When a package has an update, security or otherwise, we would use the package manager to pull down the new version. Any patches we maintain or track would be applied as part of that step.
    • If that succeeds we're done and the composer.json/composer.lock/package.json are committed to the repo.
    • If not, we check the patch, and if still needed we rework it just as we would currently. Then composer.json/composer.lock/package.json and the updated patch file are committed to the repo.

Can we still track these in the 'Components library'?

I believe so. Composer has a vendor directory and npm has the node_modules directory. These can be traversed to find the currently installed version of every library in use. Through the available APIs we should also be able to fetch the latest released version of each.

Common installation scenarios

At the moment there are 2 dominant installation scenarios. Both come with upgrading issues but neither of these would, in my opinion, be a serious concern worth addressing here. They are just the natural consequences of maintaining software and are often easily mitigated by custom scripts the site admin would take care of as they would be specific to their site.

Grab the archive

A lot of people in the community install Mahara via the provided zip packages rather than from Git and compiling code (or even the theme) themselves. From the point of view of these people the process would not change at all. They still grab a fully self contained archive.

Currently the archive is built through an automated process. This is already using npm to pull down the SASS packages to compile the CSS. The addition of a call to pull down PHP libraries with composer would not be a major impact to the process.

Checkout from git

As with the archive approach, those using git will already be pulling down the npm packages to compile their SASS into CSS. The addition of composer will not be a huge imposition.


A new scenario for future consideration

Once we're comfortable with composer we will be able to open up a new installation process.

$ composer require mahara/mahara