CLI with Tarentula
Datashare Tarentula is a powerful command-line toolbelt designed to streamline bulk operations against any Datashare instance.
Whether you need to count indexed files, download large datasets, batch-tag records, or run complex Elasticsearch aggregations, Tarentula provides a consistent, scriptable interface with flexible query support and Docker compatibility.
It also exposes a Python API for embedding automated workflows directly into your data pipelines.
With commands like count, download, aggregate, and tagging-by-query, you can handle millions of records in a single invocation, or integrate Tarentula into CI/CD pipelines for reproducible data tasks.
You can install Tarentula with pip:
pip3 install --user tarentula
Or alternatively with Docker:
docker run icij/datashare-tarentula
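After a pip install, a quick check that the executable is reachable can save some confusion. The snippet below is a minimal sketch: it assumes the package installs a command named tarentula and that each command accepts the usual --help flag. The tarentula_status variable is only an illustrative name, and the ~/.local/bin location is the common default for --user installs on Linux, not something guaranteed on every system.

```shell
# Check whether the `tarentula` executable is on PATH (assumed command name).
if command -v tarentula >/dev/null 2>&1; then
  tarentula_status="installed"
  # Print the top-level help, then the options of the `count` command.
  tarentula --help
  tarentula count --help
else
  tarentula_status="missing"
  # pip's --user installs often place scripts in ~/.local/bin on Linux.
  echo "tarentula not found on PATH; check your user scripts directory" >&2
fi
echo "tarentula is ${tarentula_status}"
```

The help output is the safest place to confirm the exact options your installed version supports before scripting a bulk operation.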
For the complete list of commands, options, and examples, read the documentation or the project's GitHub repository.