What if you want to add features to the Datashare backend?
Unlike plugins, which modify the Datashare frontend, extensions extend the backend's functionality. Two extension points are defined:
NLP pipelines: you can add a new Java NLP pipeline to Datashare
HTTP API: you can add HTTP endpoints to Datashare and call the Java API you need in those endpoints
Since version 7.5.0, instead of modifying Datashare directly, you can isolate your code as a specific set of features and then configure Datashare to use it. Each Datashare user can pick the extensions they need or want, and have a fully customized installation of our search platform.
When starting, Datashare can receive an extensionsDir option pointing to your extensions directory. In this example, let's call it /home/user/extensions.
You can list official Datashare extensions with the --extensionList option. If you know what you are looking for, you can pass a regular expression to --extensionList to filter the list.
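To make the filtering idea concrete, here is a minimal sketch of narrowing a list of extension ids with a regular expression. It is not Datashare's own code: the class name `ExtensionFilterSketch` is hypothetical, and it assumes the expression may match anywhere in the id, which may differ from how --extensionList actually applies it.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Datashare.
class ExtensionFilterSketch {
    // Keeps the ids in which the regular expression matches somewhere
    // (assumption: a partial match is enough).
    static List<String> filter(List<String> ids, String regex) {
        Pattern pattern = Pattern.compile(regex);
        return ids.stream()
                .filter(id -> pattern.matcher(id).find())
                .collect(Collectors.toList());
    }
}
```

For instance, filtering a list that contains `datashare-extension-nlp-mitie` and a made-up id such as `my-http-extension` with the expression `nlp` keeps only the first one.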
You can install an extension by providing its id and the directory where the Datashare extensions are stored.
Then if you launch Datashare with the same extension location, the extension will be loaded.
When you want to stop using an extension, you can either remove the jar from the extensions folder by hand or remove it with datashare --extensionDelete.
You can create a "simple" Java project like https://github.com/ICIJ/datashare-extension-nlp-mitie (as simple as a Java project can be, right?), with your preferred build tool.
You will have to add a dependency on the latest version of datashare-api.jar to be able to implement your NLP pipeline.
With the datashare-api dependency, you can then create a class implementing Pipeline or extending AbstractPipeline. When Datashare loads the jar, it will look for an implementation of the Pipeline interface.
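The shape of such a class, and the kind of check a loader can run to find it, can be sketched as follows. This is only an illustration: `Pipeline` and `AbstractPipeline` below are simplified stand-ins declared locally, with hypothetical methods (`name`, `process`), not the real datashare-api types. `CapitalizedWordPipeline` and `PipelineDiscoverySketch` are made-up names as well, and the actual loading logic in Datashare may differ.

```java
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Stand-in for the datashare-api interface (hypothetical, simplified methods).
interface Pipeline {
    String name();
    List<String> process(String text);
}

// Stand-in for the datashare-api base class (hypothetical).
abstract class AbstractPipeline implements Pipeline {
    @Override
    public String name() {
        return getClass().getSimpleName();
    }
}

// Toy pipeline: treats every capitalized word as an entity.
class CapitalizedWordPipeline extends AbstractPipeline {
    @Override
    public List<String> process(String text) {
        List<String> entities = new ArrayList<>();
        for (String word : text.split("\\s+")) {
            if (!word.isEmpty() && Character.isUpperCase(word.charAt(0))) {
                entities.add(word);
            }
        }
        return entities;
    }
}

class PipelineDiscoverySketch {
    // True for concrete classes that implement Pipeline,
    // directly or through a base class such as AbstractPipeline.
    static boolean isLoadablePipeline(Class<?> candidate) {
        return Pipeline.class.isAssignableFrom(candidate)
                && !candidate.isInterface()
                && !Modifier.isAbstract(candidate.getModifiers());
    }
}
```

Extending the abstract base class keeps the concrete pipeline short, while the interface check accepts both styles mentioned above.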
Unfortunately, you will also have to make a pull request to datashare-api to add a new type of pipeline. We will remove this step in the future.
Build the jar with its dependencies and install it in /home/user/extensions, then start Datashare with extensionsDir set to /home/user/extensions. Your extension will be loaded by Datashare.
Finally, your pipeline will be listed among the available pipelines in the UI when running NER.
Making an HTTP extension works the same way as for NLP: you create a Java project that builds a jar. The only dependency you need is fluent-http, because Datashare looks for fluent-http annotations (@Get, @Post, @Put...).
For example, we can create a small class that exposes a new endpoint.
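A minimal sketch of such a class, together with the kind of annotation scan that makes it discoverable, could look like the code below. The `@Get` annotation is a local stand-in so that the sketch compiles without fluent-http on the classpath; in a real extension it would come from fluent-http. The `FooResource` class, the `/myorg/foo` route and the `RouteScanSketch` helper are hypothetical.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Local stand-in for the fluent-http @Get annotation (assumption: a single route value).
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Get {
    String value();
}

// Hypothetical resource class: one method bound to one GET route.
class FooResource {
    @Get("/myorg/foo")
    public String getFoo() {
        return "hello from foo extension";
    }
}

class RouteScanSketch {
    // Lists the GET routes declared on a class by reading its method annotations.
    static List<String> getRoutes(Class<?> resource) {
        List<String> routes = new ArrayList<>();
        for (Method method : resource.getDeclaredMethods()) {
            Get get = method.getAnnotation(Get.class);
            if (get != null) {
                routes.add("GET " + get.value());
            }
        }
        return routes;
    }
}
```

The resource class itself contains no server code: the annotation carries the route, and the host application decides how to serve it.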
Build the jar, copy it to /home/user/extensions, then start Datashare with extensionsDir set to that directory.
Et voilà 🔮! You can query your new endpoint. Easy, right?
You can also install and remove extensions with the Datashare CLI: install an extension by providing its id and the extensions directory, and remove it with datashare --extensionDelete.