Uploading documents
Learn how to add files to Aleph, and how Aleph can help you extract meaningful information from documents.
The most common type of data to import into an investigation in Aleph is files, such as PDFs, email archives, or Word documents. Uploading these documents to Aleph allows you to keep track of evidence gathered over the course of an investigation and to share it as needed with your colleagues.
Most importantly, uploading files to Aleph makes it easier to search their contents, even if text is hidden in images or other unstructured formats.
Why upload documents to Aleph?
Collaboration
Uploading documents to Aleph provides an easy way to share access to large sets of documents with your colleagues and collaborators.
Text Extraction
Aleph extracts text from images and PDF files, allowing you to search for key terms of interest in files that would otherwise prove difficult to search.
Names, phone numbers, addresses…
Aleph is trained to recognize and extract mentions of people, companies, phone numbers, addresses, and IBANs contained in the documents you upload, allowing you to easily find other documents containing matching mentions.
Email archives
When you upload a set of emails, Aleph automatically extracts structured information such as the sender, recipients, CC'd parties, and attachments, making it easy to filter the uploaded data for messages sent from or to a specific person.
Organized File Structure
Aleph preserves the folder structure and hierarchy of uploaded documents, and allows you to create additional folders once files are uploaded into the system. This allows you to keep your files organized as the size of an investigation grows.
Getting started
To upload a set of documents, you must first have an investigation to import them into. Once your investigation is ready, proceed as follows:
- From the homepage of your investigation, click the Documents button in the sidebar.
Advanced notes on uploads
- To upload a large trove of documents (such as a leak), use the alephclient command-line tool to import the documents in an automated fashion.
- If you plan to import data on a recurring basis from a public source, such as a government website, you may want to create a web crawler that runs automatically, collects the data, and submits it to Aleph.
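For the bulk import with alephclient mentioned in the first note, the following is a minimal Python sketch of how such a run might be scripted. It assumes that alephclient is installed and on your PATH, that its `crawldir` command accepts a `--foreign-id` option naming the target investigation, and that it reads connection settings from the `ALEPHCLIENT_HOST` and `ALEPHCLIENT_API_KEY` environment variables. Check these against the alephclient version you have installed. The function names and the foreign ID `leak_2024` are hypothetical and used only for illustration.

```python
import os
import subprocess
from pathlib import Path


def build_crawldir_command(directory, foreign_id):
    """Return the argument list for an `alephclient crawldir` run.

    `foreign_id` is the foreign ID of the investigation that should
    receive the files (assumed to be how alephclient selects the target).
    """
    path = Path(directory)
    if not path.is_dir():
        raise ValueError(f"not a directory: {directory}")
    return ["alephclient", "crawldir", "--foreign-id", foreign_id, str(path)]


def upload_directory(directory, foreign_id, host, api_key):
    """Run the import, passing connection settings via the environment.

    Assumption: alephclient reads ALEPHCLIENT_HOST and ALEPHCLIENT_API_KEY.
    Raises subprocess.CalledProcessError if the command exits with an error.
    """
    env = dict(os.environ, ALEPHCLIENT_HOST=host, ALEPHCLIENT_API_KEY=api_key)
    command = build_crawldir_command(directory, foreign_id)
    return subprocess.run(command, env=env, check=True)
```

A call such as `upload_directory("/data/leak", "leak_2024", "https://aleph.example.org", "YOUR_API_KEY")` would then start the import. The host, path, and key shown here are placeholders. Building the argument list in a separate function makes it possible to inspect or log the command before anything is sent.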