Add documents to Datashare
Datashare provides a folder on your computer where you collect the documents you want to index.
Select the project in Datashare where you want to add your documents. The Default project, which is automatically created, is selected by default.
Select the folder or sub-folder inside the 'Datashare' directory on your computer that contains the documents you want to add. By default, the entire 'Datashare' directory is added.
Choose the language of your documents if you don't want Datashare to guess it automatically. Note: If you also choose to extract text from images (the next option), you might need to install the appropriate language package on your system. Datashare will tell you if the language package is missing. Refer to the documentation to learn how to install language packages.
Extract text from images/PDFs with Optical Character Recognition (OCR). Be aware that indexing can take up to 10 times longer with this option enabled.
Optionally, skip documents that have already been indexed.
Click 'Add'.
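To make the choices above easier to reason about, here is a minimal sketch that models them as a plain Python data structure. It is not Datashare's API: the class name `AddDocumentsOptions`, its field names and the `worst_case_minutes` helper are all hypothetical. The defaults for OCR and for skipping indexed documents are assumptions, since the steps above do not state them. Only the Default project, the entire 'Datashare' directory, automatic language guessing and the "up to 10 times longer" figure come from the text.

```python
from dataclasses import dataclass
from typing import Optional

# From the text: with OCR, indexing can take up to 10 times longer.
OCR_MAX_SLOWDOWN = 10


@dataclass(frozen=True)
class AddDocumentsOptions:
    """Hypothetical model of the form choices; not part of Datashare."""

    project: str = "Default"        # the Default project is selected by default
    folder: str = "Datashare"       # the entire 'Datashare' directory by default
    language: Optional[str] = None  # None means "let Datashare guess"
    ocr: bool = False               # assumption: default not stated in the text
    skip_indexed: bool = False      # assumption: default not stated in the text

    def worst_case_minutes(self, baseline_minutes: float) -> float:
        """Upper bound on duration, given a baseline measured without OCR."""
        factor = OCR_MAX_SLOWDOWN if self.ocr else 1
        return baseline_minutes * factor


# Usage: a run that takes 3 minutes without OCR could take up to 30 with it.
options = AddDocumentsOptions(language="fr", ocr=True)
worst_case = options.worst_case_minutes(3)
```

The 10x figure is an upper bound, so treat the result as a planning estimate rather than a prediction.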
Two extraction tasks are now running:
The first scans your Datashare folder to check whether there are documents to analyze. It is called 'ScanTask'.
The second indexes these files. It is called 'IndexTask'.
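The two tasks can be pictured as a scan stage feeding an index stage. The sketch below illustrates that idea only and is not Datashare's actual implementation: the functions `scan_task` and `index_task`, the in-memory queue and the dictionary used as an "index" are all assumptions made for the example, and the text extraction is reduced to reading each file as plain text.

```python
from pathlib import Path
from queue import Queue


def scan_task(root: Path, queue: Queue) -> int:
    """Walk the folder and enqueue every regular file; return how many were found."""
    found = 0
    for path in sorted(root.rglob("*")):
        if path.is_file():
            queue.put(path)
            found += 1
    return found


def index_task(queue: Queue, index: dict, skip_indexed: bool = True) -> int:
    """Drain the queue and store each file's text; return how many were indexed."""
    indexed = 0
    while not queue.empty():
        path = queue.get()
        key = str(path)
        if skip_indexed and key in index:
            continue  # mirrors the 'skip already indexed documents' option
        # Stand-in for real text extraction: read the file as plain text.
        index[key] = path.read_text(errors="replace")
        indexed += 1
    return indexed
```

Running `scan_task` and then `index_task` a second time on an unchanged folder with `skip_indexed=True` indexes nothing new, which is the effect the skip option is meant to have.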
Note: You cannot 'Find entities' while these two tasks are still running, so the entities (names of people, organizations, locations and e-mail addresses) are not available yet. To get them, wait until your documents have been added, then follow the steps to 'Find entities'.
However, you can start searching your documents without waiting for all tasks to finish.
You can now search documents in Datashare.