Datashare is an open source project by the International Consortium of Investigative Journalists


Explore a document

Once you have opened a document, you can explore its data through different tabs.


Extracted text

In 'Extracted Text', you can read the text of a document as extracted by Datashare:

Be aware that Datashare shows named entities by default. This can overwrite some of the original text with incorrect named entities. It is therefore important to always verify the original text by deactivating named entity overwriting. To do so:

  • Turn off the toggle button ‘Show named entities’ and read the extracted text

  • Check the ‘Preview’ of the original document if available

  • Check the original document at its original location or by clicking the pink button ‘Download’


Search for attachments

If the document has attachments (technically called 'children documents'), you can find them at the end of the document. Click their pink button to open them:

To open all the attachments in Datashare, click 'See children documents' in 'Tags & Details':


Search for terms within this document

  • Press Command(⌘) + F (on Mac) or Control + F (on Windows and Linux) or click on the search bar above your Extracted Text

  • Type what you are searching for

  • Press ENTER to go from one occurrence to the next one

  • Press SHIFT + ENTER to go from one occurrence to the previous one

This also counts the number of occurrences of your searched terms in this document:
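As a rough illustration of what such an in-document search does, here is a minimal Python sketch. It is not Datashare's implementation: the function names `find_occurrences` and `step`, the case-insensitive matching and the wrap-around behaviour at the first and last occurrence are all assumptions made for the example.

```python
import re


def find_occurrences(text, term):
    """Return the start offset of every match of `term` in `text`.

    Assumption: matching is case-insensitive and literal (no operators).
    """
    if not term:
        return []
    return [m.start() for m in re.finditer(re.escape(term), text, re.IGNORECASE)]


def step(current, total, backwards=False):
    """Return the index of the next (or previous) occurrence.

    Assumption: navigation wraps around at both ends.
    """
    if total == 0:
        return -1
    return (current + (-1 if backwards else 1)) % total


# Hypothetical extracted text, used only for this example.
text = "The offshore company paid another company. Company records followed."
offsets = find_occurrences(text, "company")
print(len(offsets))  # 3 occurrences

position = 0
position = step(position, len(offsets))                  # like pressing ENTER
position = step(position, len(offsets), backwards=True)  # like pressing SHIFT + ENTER
```

The count shown next to the search bar corresponds to `len(offsets)` in this sketch, and the two `step` calls mirror moving forward and then back by one occurrence.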

Tags & Details

In 'Tags & Details', you can read the document's details. These are all the metadata as they appear in the original file. Click 'Show more details' to see all the metadata:

You can also read the tags you previously wrote for this document, like 'test1', 'test2' and 'test3' in the example below:

You can then search for the documents you tagged:

  • Type the tag(s) in the main search bar

  • Click 'All fields' and select 'Tags'

  • Click 'Search' or press 'Enter'

Named Entities

In 'Named Entities', you can read the names of people, organizations and locations, as well as the number of their occurrences in the document:

Be aware that there can still be some errors due to the Named Entity Recognition (NER) technology on which Datashare relies.

If you have run email extraction, you will see a list of the extracted email addresses:

Preview

In 'Preview', you can read the original document.

'Preview' is available for some formats only.

(To learn all the shortcuts in Datashare, please read 'Use keyboard shortcuts'.)

If you have run email extraction and searched for one or several email addresses, and those addresses appear in the email's metadata (recipient, sender or another field), an 'in metadata' label is attached to the email addresses:

To learn more about tags, please read 'Tag a document'.
