MOBDA 2.0

This repo contains a prototype for MOBDA (Materialised Ontology-Based Data Access), developed by the CEIR (Center for Enterprise Information Research).

The prototype builds on the technology of (materialised) ontology-based data access ((M)OBDA) (Cysneiros, 2016). OBDA uses an ontology as a semantic layer to harmonise data from different source systems and make them accessible in a single database through the ontology's vocabulary. The user therefore needs no knowledge of the source systems' data structures, only of the ontology's language. This MOBDA datastore uses the Collaborative Actions on Documents Ontology (ColActDOnt).
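The harmonisation idea can be illustrated with a minimal sketch. It assumes two hypothetical source systems ("wiki" and "blog") whose records use different field names, and a shared vocabulary of `id`, `title` and `creator`. None of these names are taken from ColActDOnt or from the actual source systems; they only show how a semantic layer hides source-specific structure.

```python
# Hypothetical mapping from source-specific field names to shared ontology terms.
# The sources and field names are illustrative assumptions, not part of MOBDA.
FIELD_MAPPINGS = {
    "wiki": {"page_id": "id", "page_title": "title", "author_mail": "creator"},
    "blog": {"post_uuid": "id", "headline": "title", "written_by": "creator"},
}


def harmonise(source: str, record: dict) -> dict:
    """Rename the fields of one source record to the shared vocabulary."""
    mapping = FIELD_MAPPINGS[source]
    # Fields without a mapping are dropped from the harmonised record.
    return {mapping[key]: value for key, value in record.items() if key in mapping}


wiki_row = {"page_id": "w1", "page_title": "Roadmap", "author_mail": "a@example.org"}
blog_row = {"post_uuid": "b7", "headline": "Release notes", "written_by": "b@example.org"}

# After harmonisation both records share the keys "id", "title" and "creator".
harmonised = [harmonise("wiki", wiki_row), harmonise("blog", blog_row)]
```

A consumer of `harmonised` only needs to know the shared keys, which mirrors the claim that the user needs the ontology's language but not the source structures.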

The MOBDA datastore builds on several components:

- Neo4j as the target database, running on a Linux server
- A Dremio instance used as a federated database, with data from an HCL Connections instance as the source system
- The MOBDA application, documented in this repository, which imports the data into the target database. The application uses:
  - pandas DataFrames for data preprocessing
  - pyneoinstance for the data import into Neo4j
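The preprocessing step can be sketched as follows. This is a minimal example under stated assumptions: the column names, the `Document` label and the `$rows` parameter are hypothetical, and the sketch stops before the actual import, so it makes no pyneoinstance calls. It only shows how a DataFrame might be cleaned and turned into batches of row dictionaries that a parameterised Cypher query can consume.

```python
import pandas as pd

# Hypothetical rows as they might arrive from a Dremio view.
raw = pd.DataFrame(
    {
        "id": ["d1", "d2", "d2", None],
        "title": [" Roadmap ", "Notes", "Notes", "Orphan"],
    }
)


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Drop rows without a key, remove duplicate keys and trim whitespace."""
    cleaned = df.dropna(subset=["id"]).drop_duplicates(subset=["id"]).copy()
    cleaned["title"] = cleaned["title"].str.strip()
    return cleaned.reset_index(drop=True)


# Parameterised Cypher: a driver would bind the list of row dicts to $rows.
# The label "Document" is illustrative, not a ColActDOnt class.
NODE_QUERY = """
UNWIND $rows AS row
MERGE (d:Document {id: row.id})
SET d.title = row.title
"""


def to_batches(df: pd.DataFrame, batch_size: int) -> list[list[dict]]:
    """Split the DataFrame into lists of row dicts of at most batch_size rows."""
    rows = df.to_dict(orient="records")
    return [rows[i : i + batch_size] for i in range(0, len(rows), batch_size)]


batches = to_batches(preprocess(raw), batch_size=1)
```

Each batch would then be sent together with `NODE_QUERY` by the import library. Batching keeps individual transactions small, which is one common reason to split the rows before the import.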

The figure 'Architecture MOBDAv2.png' gives an overview of the architecture: the MOBDA datastore depends on all three components.

The repository has five folders:

- DremioViews contains all Dremio views and their SQL scripts
- Index contains a file to create indexes for the Neo4j database
- Import contains all import scripts
  - SoloScripts contains a standalone, executable import script for each element
    - ImportNodes20 contains scripts for all data nodes
    - OntologyLayer20 contains scripts for the ontology layer
    - Relation20 contains all scripts for relations
  - Execution contains all scripts and runs the complete import at once
    - 'ExecutionImport.py' is a script that executes all other import scripts
    - The folder also contains all subscripts, although they are not executable on their own
- Cleanup contains a file 'Delete' that deletes all data from the database
- PowerBIExport contains a blueprint for importing a queried dataset from Neo4j into Power BI

Additionally, a YAML file ('example.yaml') is used for the credentials to Neo4j.
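The idea of a single script that runs all import steps can be sketched like this. The step names and their order (indexes, data nodes, ontology layer, relations) are assumptions made for illustration; the sketch does not reproduce 'ExecutionImport.py'. Each step is a plain callable, so the runner stays independent of what a step actually does.

```python
from typing import Callable


def run_all(steps: list[tuple[str, Callable[[], None]]]) -> list[str]:
    """Run each step in order and return the names of the completed steps."""
    completed = []
    for name, step in steps:
        step()  # an exception raised here aborts all remaining steps
        completed.append(name)
    return completed


# Placeholder steps that only record what they would do.
log: list[str] = []
steps = [
    ("indexes", lambda: log.append("create indexes")),
    ("nodes", lambda: log.append("import data nodes")),
    ("ontology layer", lambda: log.append("import ontology layer")),
    ("relations", lambda: log.append("create relations")),
]

completed = run_all(steps)
```

Running relations last is a deliberate choice in this sketch: a relation can only be created once both of its end nodes exist.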

Major changes from version 1.0 to 2.0

- Use of Dremio as the data source for the import
- Implementation of the majority of the data logic in Dremio views
- Use of a new import library, pyneoinstance, which can execute Cypher scripts with data from DataFrames
- Relations are created from DataFrames and Cypher queries and no longer use mapping IDs
- Multiprocessing, enabled by pyneoinstance
- A single import script that executes all other scripts
- Use of indexes to speed up import and querying
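Two of these changes, relations built from row data and the use of indexes, can be sketched as Cypher generated from Python. The labels `Person` and `Document`, the relationship type `CREATED`, the property `id` and the row keys `start_id` and `end_id` are hypothetical and not taken from the repository. The index statement uses the `CREATE INDEX ... IF NOT EXISTS` form available in recent Neo4j versions.

```python
def index_query(label: str, prop: str) -> str:
    """Build a Cypher statement that creates an index on label.prop if missing."""
    return (
        f"CREATE INDEX {label.lower()}_{prop} IF NOT EXISTS "
        f"FOR (n:{label}) ON (n.{prop})"
    )


def relation_query(start_label: str, rel_type: str, end_label: str) -> str:
    """Build a Cypher query that links existing nodes by their key properties."""
    return (
        "UNWIND $rows AS row\n"
        f"MATCH (a:{start_label} {{id: row.start_id}})\n"
        f"MATCH (b:{end_label} {{id: row.end_id}})\n"
        f"MERGE (a)-[:{rel_type}]->(b)"
    )


# Rows as they might come from a DataFrame: one dict per relation.
rows = [{"start_id": "p1", "end_id": "d1"}, {"start_id": "p2", "end_id": "d1"}]

idx = index_query("Document", "id")
query = relation_query("Person", "CREATED", "Document")
```

Because the query matches both end nodes by a key property, an index on that property lets each `MATCH` use a lookup instead of a label scan, which is why indexes help the import as well as later querying.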

More information can be found in:

- Working Paper (German): Schlömer, L., Just, M., & Schubert, P. (2024). Integration eines ontologiebasierten Datastores für Enterprise Collaboration Systems. In CEIR Report (Issue 01/2024, p. 27).
- Paper: Schlömer, L., Just, M., & Schubert, P. (2024). Using Materialised Ontology-Based Data Access (MOBDA) for the Harmonisation of Trace Data from Enterprise Collaboration Systems. 1–16.
- Just, M., & Schubert, P. (2023). Collaborative Actions on Documents Ontology (ColActDOnt). Procedia Computer Science, 219, 294–302. https://doi.org/10.1016/j.procs.2023.01.293
- Importing Neo4j Graph Data with Power BI

This work is licensed under a Creative Commons Attribution 4.0 International License.
