Install on Linux

These pages will help you set up and install Datashare on your computer.

Install Datashare

Currently, only a .deb package for Debian/Ubuntu is provided.

If you want to run it with another Linux distribution, you can download the latest version of the Datashare jar here: https://github.com/ICIJ/datashare/releases/latest

And adapt the following launch script to your environment: https://github.com/ICIJ/datashare/blob/master/datashare-dist/src/main/deb/bin/datashare
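
To adapt the launch script you first need a local copy to edit. The sketch below derives a raw-download URL from the GitHub page URL given above and fetches it with curl. It assumes GitHub's usual mapping from github.com/<owner>/<repo>/blob/<ref>/<path> to raw.githubusercontent.com/<owner>/<repo>/<ref>/<path>. The function names and the output file name 'datashare-launch.sh' are made up for illustration and are not part of Datashare.

```shell
#!/bin/sh
# Page URL of the launch script, as given in the text above.
SCRIPT_PAGE_URL="https://github.com/ICIJ/datashare/blob/master/datashare-dist/src/main/deb/bin/datashare"

# Turn a GitHub "blob" page URL into the matching raw-file URL.
# Assumption: GitHub's usual github.com/.../blob/... to raw.githubusercontent.com/... mapping.
raw_url_for() {
  printf '%s\n' "$1" |
    sed -e 's#^https://github\.com/#https://raw.githubusercontent.com/#' -e 's#/blob/#/#'
}

# Download the script to a local file you can edit.
# $1: output file name (hypothetical default: datashare-launch.sh)
fetch_launch_script() {
  curl --fail --silent --show-error --location \
    --output "${1:-datashare-launch.sh}" \
    "$(raw_url_for "$SCRIPT_PAGE_URL")"
}
```

Calling 'fetch_launch_script' saves the script in the current directory. Paths inside it will probably still need to be changed to match where you placed the jar.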

1. Download Datashare

Go to datashare.icij.org and click 'Download for Linux':

Screenshot of the homepage of datashare.icij.org highlighting the 'Download for Linux' button

Save the Debian package as a file:

Screenshot of a Linux window saying 'What should Firefox do with this file?' with two radio buttons, 'Open with Archive Manager' and 'Save File' (selected), and two buttons, 'Cancel' and 'OK'

2. Install the package

Install the downloaded package with apt, replacing the path and file name with those of the file you saved:

$ sudo apt install /dir/to/debian/package/datashare-dist_7.2.0_all.deb

3. Run Datashare

You can now start Datashare.

Start Datashare

Find the application on your computer and run it locally in your browser.

Start Datashare by launching it from the command line:

$ datashare

Datashare should now open automatically in your default internet browser. If it doesn't, type 'localhost:8080' in your browser's address bar.

Datashare must be accessed from your internet browser (Firefox, Chrome, etc.), even though it works offline, without an Internet connection (see: Can I use Datashare with no internet connection?).

Screenshot of the homepage of Datashare, the projects' page with one project called 'Default'

It's now time to add documents to Datashare.
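
If the browser does not open on its own after launching Datashare, a quick check is whether anything answers at localhost:8080. The sketch below polls that address with curl. The function names and the retry count are illustrative assumptions, not part of Datashare, and any HTTP response at all counts as "up".

```shell
#!/bin/sh
# Succeed if something answers HTTP at the given URL.
# Default is the address mentioned above; pass another URL if your setup differs.
datashare_is_up() {
  curl --silent --output /dev/null --max-time 5 "${1:-http://localhost:8080}"
}

# Poll once per second until the interface answers or the attempts run out.
# $1: URL to check, $2: number of attempts (hypothetical default: 30)
wait_for_datashare() {
  url="${1:-http://localhost:8080}"
  attempts="${2:-30}"
  while [ "$attempts" -gt 0 ]; do
    if datashare_is_up "$url"; then
      return 0
    fi
    attempts=$((attempts - 1))
    sleep 1
  done
  return 1
}
```

Once Datashare has been started in another terminal, 'wait_for_datashare' returns successfully as soon as the interface responds, at which point you can open 'localhost:8080' in your browser by hand.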

Add documents to Datashare

Datashare provides a folder on your computer where you collect the documents you want to index.

1. Add documents to your 'Datashare' folder

You can find a folder called 'Datashare' in your home directory.

Move the documents you want to add to Datashare into this folder.
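
From the command line, this step can be sketched as follows. The example copies rather than moves, so the originals stay in place; this is a choice made for the sketch, not something Datashare requires. The function name and the source folder are hypothetical. The target defaults to the 'Datashare' folder in your home directory described above.

```shell
#!/bin/sh
# Copy the contents of a source folder into the Datashare folder.
# $1: folder holding your documents (hypothetical, you choose it)
# $2: target folder, defaulting to the 'Datashare' folder in your home directory
add_to_datashare() {
  src_dir="$1"
  dest_dir="${2:-$HOME/Datashare}"
  if [ ! -d "$src_dir" ]; then
    echo "No such folder: $src_dir" >&2
    return 1
  fi
  mkdir -p "$dest_dir"
  # "src/." copies the folder's contents, sub-folders and hidden files included.
  cp -R "$src_dir"/. "$dest_dir"/
  # Report how many files are now in the target folder.
  find "$dest_dir" -type f | wc -l
}
```

For example, 'add_to_datashare "$HOME/Downloads/my-documents"' copies everything in that folder into your 'Datashare' folder and prints the number of files now waiting there.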

2. Launch Datashare

Launch Datashare; the interface opens in your default browser.

3. In the menu, in 'Tasks', open 'Documents'

Expand the menu on the left:

Screenshot of Datashare's homepage highlighting the top icon in the left menu to expand it

In 'Tasks', open 'Documents':

Screenshot of Datashare's homepage with the left menu open highlighting the 'Documents' entry in the 'Tasks' category

On the top right, click the 'Plus' button:

Screenshot of Datashare's Documents page highlighting the 'Plus' button at the top right corner

4. Choose your options

  • Select the project in Datashare where you want to add your documents. The Default project, which is automatically created, is selected by default.

  • Select the folder or sub-folder in your 'Datashare' directory containing the documents you want to add. The entire 'Datashare' directory is added by default.

  • Choose the language of your documents if you don't want Datashare to guess it automatically. Note: If you also choose to extract text from images (the next option), you might need to install the appropriate language package on your system. Datashare will tell you if the language package is missing. Refer to the documentation to know how to install language packages.

  • Extract text from images/PDFs with Optical Character Recognition (OCR). Be aware that indexing can take up to 10 times longer.

  • Skip already indexed documents if you'd like.

  • Click 'Add'.

Screenshot of Datashare's 'Add Documents' page with the form showing 5 options, a 'Reset' and an 'Add' button

5. Watch the progress of your document addition

Two extraction tasks are now running:

  • The first is the scanning of your Datashare folder, which checks whether there are documents to analyze. It is called 'ScanTask'.

  • The second is the indexing of these files. It is called 'IndexTask'.

  • Note: It is not possible to 'Find entities' while these two tasks are still running, so you won't have the entities (names of people, organizations, locations and e-mail addresses) yet. To get them, once your document addition is finished, follow the steps to 'Find entities'.

    But you can start searching in your documents without having to wait for all tasks to be done.

Screenshot of Datashare's Documents page highlighting two lines in a table, one for 'Scan folders' and another one for 'Index documents'

You can now search documents in Datashare.
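
For the OCR option, the text above only says that a language package may be missing. The sketch below assumes OCR is provided by Tesseract, whose Debian/Ubuntu language packages are named 'tesseract-ocr-<code>' with a three-letter language code. This is an assumption, so check Datashare's own documentation on language packages before relying on it. The function names are hypothetical, and the script prints the install command rather than running it.

```shell
#!/bin/sh
# Assumption: OCR languages come from Tesseract, packaged on Debian/Ubuntu
# as tesseract-ocr-<code> with a three-letter code (fra, spa, deu, ...).
ocr_package_for() {
  printf 'tesseract-ocr-%s\n' "$1"
}

# Succeed if Tesseract already lists the language as installed.
# 2>&1 because older Tesseract versions print the list on stderr.
ocr_language_installed() {
  tesseract --list-langs 2>&1 | grep -qx "$1"
}

# Print the install command for a missing language instead of running it.
suggest_ocr_install() {
  if ocr_language_installed "$1"; then
    echo "OCR language '$1' is already installed"
  else
    echo "sudo apt install $(ocr_package_for "$1")"
  fi
}
```

For example, 'suggest_ocr_install fra' prints the apt command for the French package if Tesseract does not already list 'fra'.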