Habitus Application Overview

Keith Alcock edited this page Feb 4, 2025 · 5 revisions

Introduction

In addition to the software contained in this repo, there is another repo, heuristics, that contains considerably more code. It is intended to implement a product based on the ideas explored in this habitus repo. Because the heuristics repo is presently private, some of its documentation lives here, where it can reach a wider audience. Habitus software primarily addresses the acquisition of documents and their preparation for further investigation. It includes code to aid in acquisition and preparation, but it does not include "finished" programs or anything approaching a product. Heuristics software takes document preparation from the program level to the product level and additionally implements an investigation tool, a GUI frontend, that is similarly advanced. This overview describes the document preparation (middleend) and investigation (frontend) components.

Components

The software for heuristics can be organized into four different deployable artifacts. Three are included in the repo and a fourth, Elasticsearch, is provided by a third party. Here are some of the details:

| Name | Interface | Local Packaging | Remote Packaging | Database Role | Process Role |
|---|---|---|---|---|---|
| Elasticsearch | REST | Docker image | jar | editor & viewer | preparer & investigator |
| Backend-NLP | REST | Docker image | jar | viewer | investigator |
| Middleend | CLI | Docker image | jar | editor | preparer |
| Frontend | GUI | jar/zip file | - | viewer | investigator |

Roles

From the organization of the roles above, one can see that tasks can be divided between a preparer of data, who is able to edit (write to) the database, and an investigator of data, who can only read from it. These may be separate people or separate groups of people with different skills and possibly different equipment. The preparer runs software with a CLI (command line interface) and should be familiar with working that way. The preparer's software also needs more memory and benefits from a more powerful processor than the frontend software does. The investigator works with the frontend GUI (graphical user interface) and some fairly lightweight backend components that don't demand as powerful a computer.
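The division of roles can be sketched as a small lookup over the component table. This is only an illustration in Python: the `Component` class and the `components_for` function are hypothetical names introduced here and are not part of either repo.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """One row of the component table (hypothetical structure for illustration)."""
    name: str
    interface: str
    database_roles: frozenset
    process_roles: frozenset


# Values copied from the component table in this overview.
COMPONENTS = [
    Component("Elasticsearch", "REST",
              frozenset({"editor", "viewer"}),
              frozenset({"preparer", "investigator"})),
    Component("Backend-NLP", "REST",
              frozenset({"viewer"}),
              frozenset({"investigator"})),
    Component("Middleend", "CLI",
              frozenset({"editor"}),
              frozenset({"preparer"})),
    Component("Frontend", "GUI",
              frozenset({"viewer"}),
              frozenset({"investigator"})),
]


def components_for(process_role):
    """Return the names of the components listed for a process role, in table order."""
    return [c.name for c in COMPONENTS if process_role in c.process_roles]


if __name__ == "__main__":
    for role in ("preparer", "investigator"):
        print(role, components_for(role))
```

Read this way, the preparer works with Elasticsearch and the Middleend, while the investigator works with Elasticsearch, Backend-NLP, and the Frontend.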

Deployment

There are two basic deployment scenarios, local and remote, and the standard packaging differs between the two. In the local scenario, people run the software on their own computer. This suits a solo researcher or someone preparing data locally for others to investigate later, after the data has been transferred to an Elasticsearch instance set up for sharing. In the remote scenario, the software runs from a server with as little local overhead as possible. This works well if there are multiple investigators. These ideas lead to this table, where ➖ means installed, but remotely, probably by someone else.

| Role | Location | Elasticsearch | Backend-NLP | Middleend | Frontend |
|---|---|---|---|---|---|
| preparer | local | | | | |
| investigator | local | | | | |
| investigator | remote | | | | |
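The packaging columns of the component table can be read as a lookup from component and scenario to the standard packaging. The Python sketch below only restates that table. The names `PACKAGING`, `packaging`, and `remotely_packaged` are hypothetical, and treating the "-" entry for the Frontend as "no remote packaging" is an assumption.

```python
# Standard packaging per component and scenario, copied from the component table.
# None stands for the "-" entry, assumed here to mean "no remote packaging".
PACKAGING = {
    "Elasticsearch": {"local": "Docker image", "remote": "jar"},
    "Backend-NLP": {"local": "Docker image", "remote": "jar"},
    "Middleend": {"local": "Docker image", "remote": "jar"},
    "Frontend": {"local": "jar/zip file", "remote": None},
}


def packaging(component, scenario):
    """Return the standard packaging for a component in a deployment scenario."""
    if scenario not in ("local", "remote"):
        raise ValueError(f"unknown scenario: {scenario!r}")
    return PACKAGING[component][scenario]


def remotely_packaged():
    """Return the components that have a remote packaging, in table order."""
    return [name for name, forms in PACKAGING.items() if forms["remote"] is not None]


if __name__ == "__main__":
    print(packaging("Middleend", "local"))
    print(remotely_packaged())
```

Under that assumption, the Frontend is the one component that is always installed on the investigator's own computer, even in the remote scenario.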