Difference between revisions of "Gold/Elasticsearch"

From Mahara Wiki

Revision as of 12:14, 14 May 2021

tl;dr;

Upgrading has turned out to be... involved. We're creating a new Elasticsearch 7 search plugin.

Where are we?

The state of play

Upgrading the existing Elasticsearch search plugin has turned out to be far more involved than anticipated. The differences between how ES6 and ES7 work mean that massaging the old code to match how ES now expects data to be formatted causes issues to cascade throughout the system, revealing more and more places that need to be touched. This leaves me with the feeling that we are likely to miss things, which risks the work appearing shoddy.

State of the art

Elasticsearch has been moving towards an increasingly simplified structure for ingesting and managing data. Things are currently quite 'flat' when it comes to the data being stored. Although the data structure changes from version to version, the trend is towards a less complicated system. Because of this trend, it is still desirable to stick with Elasticsearch.
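
One well-known example of this flattening is the removal of mapping types: an ES6 index body nests its field definitions under a single type name, while ES7 expects them directly under "mappings". The sketch below shows that one change only, and is not a description of what the Mahara plugin does. The helper name flatten_es6_mapping, the type name "doc" and the field names are all hypothetical.

```python
import copy


def flatten_es6_mapping(es6_body):
    """Return an ES7-style index body with the single mapping type removed.

    Assumes the input is either already typeless ("properties" sits directly
    under "mappings") or is an ES6-style body with exactly one mapping type.
    """
    body = copy.deepcopy(es6_body)  # leave the caller's dict untouched
    mappings = body.get("mappings", {})
    if not mappings or "properties" in mappings:
        return body  # nothing to flatten
    if len(mappings) != 1:
        raise ValueError("expected exactly one mapping type")
    # Lift the contents of the single type up one level.
    (type_body,) = mappings.values()
    body["mappings"] = type_body
    return body


es6_body = {
    "mappings": {
        "doc": {  # hypothetical ES6 type name
            "properties": {
                "title": {"type": "text"},
                "owner": {"type": "integer"},
            }
        }
    }
}
es7_body = flatten_es6_mapping(es6_body)
```

Here es7_body has "properties" directly under "mappings", which is the typeless shape ES7 uses by default.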

Where to from here?

The current plan is to leave the existing Elasticsearch search plugin in place and create a new Elasticsearch 7 plugin.

This has multiple advantages:

  • Sites that are unable/unwilling to move from their existing ES server can continue with that.
  • We don't need to work with existing code to try and bend it into shape for the new way things are done.
  • We can take a "clean canvas" approach and not be hobbled by previous decisions.
  • I get to build a plugin from scratch. << I am quite pleased that this is a thing :)

The plan

  • Investigate what Elasticsearch-PHP gives us. ES7 is fairly straightforward to use. Do we still even need the library?
  • Start a clean plugin
  • Add config form support
  • Get data into ES7
  • Get data out of ES7
    • Get the top level search to return results
    • Ensure results take into account whether the user has access to see them
  • Add reporting support
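
As a rough illustration of the "get data out" step, the sketch below builds an ES7 search body that combines a text match with an access filter, so a document is only returned when the user is allowed to see it. It is only one possible shape, not the plugin's actual design. The function name build_search_body and every field name in it (title, description, access.public, access.users, access.groups) are assumptions made for the example.

```python
def build_search_body(query_text, user_id, group_ids, size=10):
    """Build a search body that only matches documents the user may see."""
    # A document passes the filter if it is public, shared with this user,
    # or shared with one of the user's groups (hypothetical field names).
    access_filter = {
        "bool": {
            "should": [
                {"term": {"access.public": True}},
                {"term": {"access.users": user_id}},
                {"terms": {"access.groups": list(group_ids)}},
            ],
            "minimum_should_match": 1,
        }
    }
    return {
        "size": size,
        "query": {
            "bool": {
                # Scored full-text match on the searchable fields.
                "must": [
                    {
                        "multi_match": {
                            "query": query_text,
                            "fields": ["title", "description"],
                        }
                    }
                ],
                # Filter clauses restrict the results without affecting scoring.
                "filter": [access_filter],
            }
        },
    }


body = build_search_body("portfolio", user_id=42, group_ids=[3, 7])
```

Putting the access check in the filter clause keeps it out of relevance scoring, so permissions decide whether a result appears but not where it ranks.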

The Elasticsearch-PHP library

Just reading the Overview (https://www.elastic.co/guide/en/elasticsearch/client/php-api/current/overview.html) has already made it clear that we should stick with it. The key points are that it is a low-level client and that it adds "cluster state sniffing, round-robin requests, and so on". That last part is something we would have had to build ourselves if we didn't use the library. With it being a low-level client, my concern that some things might be abstracted away has been alleviated.
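
To give a sense of what "round-robin requests" means, here is a minimal sketch of rotating through a list of nodes. It is not how Elasticsearch-PHP implements it, just the kind of code we would otherwise have to write and maintain ourselves. The class name RoundRobinHosts and the host URLs are hypothetical.

```python
import itertools


class RoundRobinHosts:
    """Hand out hosts in a fixed rotation, one per request."""

    def __init__(self, hosts):
        hosts = list(hosts)
        if not hosts:
            raise ValueError("at least one host is required")
        # cycle() repeats the list forever, in order.
        self._cycle = itertools.cycle(hosts)

    def next_host(self):
        return next(self._cycle)


pool = RoundRobinHosts(["http://es1:9200", "http://es2:9200"])  # hypothetical nodes
picked = [pool.next_host() for _ in range(3)]
```

Three calls with two hosts give es1, es2, then es1 again. A real client also has to cope with nodes going down and coming back, which is part of the "and so on" the library already handles.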