DatAasee (0.2)
Repository: github.com/ulbmuenster/dataasee
Maintainer: Christian Himpe
Licenses: MIT (additionally: CC-BY for openapi.yaml)
Function: Metadata-Lake, Metadata Catalog, Metadata Aggregator
Audience: University Libraries, Research Libraries, Academic Libraries, Scientific Libraries
Tech Stack Outline
- Setting: Many distributed data and metadata sources
- Goals:
  - Centralize metadata
  - Interlinked metadata catalog
  - Super-index for bibliographic and research data
- Features:
  - Interact through HTTP-API (JSON)
  - Search by filter or full-text
  - Custom query via: SQL, Gremlin, Cypher, MQL, GraphQL
- Frontend: Lowdefy
- Backend: Connect (Benthos)
- Data Storage: ArcadeDB
- Infrastructure: Compose (via Docker or Podman)
- Deployment: via Harbor (at Uni Münster)
- Monitoring: Prometheus
- Integrations:
  - Protocols: OAI-PMH (HTTP), S3 (HTTP)
  - Encodings: DataCite (XML), DC (XML), MARC (XML), MODS (XML)
- Security: Privileged endpoints (CQRS)
- Testing: check-jsonschema
- Development: GitHub
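To illustrate the OAI-PMH integration listed above, here is a minimal sketch of how a harvesting request URL is formed under the OAI-PMH 2.0 protocol. It is not DatAasee's ingest code (the backend is built on Connect/Benthos pipelines). The base URL `https://example.org/oai` and the function name `list_records_url` are hypothetical. Only the `oai_dc` prefix is guaranteed by the protocol; prefixes for DataCite, MARC or MODS depend on the source repository.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(
    base_url: str,
    metadata_prefix: str = "oai_dc",
    resumption_token: Optional[str] = None,
) -> str:
    """Build an OAI-PMH ListRecords request URL (illustrative helper)."""
    if resumption_token is not None:
        # resumptionToken is an exclusive argument: it is sent with the verb only.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        # The first request of a harvest names the desired metadata format.
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint, used only for demonstration.
print(list_records_url("https://example.org/oai"))
# prints: https://example.org/oai?verb=ListRecords&metadataPrefix=oai_dc
```

A harvester would repeat the request with each returned resumption token until the source reports no further token.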
Getting Started (Deployment)
- Depends on docker-compose or alternatively podman-compose.
- To deploy, there is no need to clone the repository; just use the compose.yaml file.
- See the Deploy Documentation for details.
Quick Start:
$ mkdir -p backup
$ wget https://raw.githubusercontent.com/ulbmuenster/dataasee/0.2/compose.yaml
$ echo -n 'password1' > dl_pass && echo -n 'password2' > db_pass && docker compose up -d; rm -f dl_pass db_pass; history -d $(history 1)
Documentation
- Dependencies Overview
- Software Documentation
- Architecture Documentation
- Database Schema
- OpenAPI Schema
- DatAasee: A Metadata-Lake as Metadata Catalog for a Virtual Data-Lake (Companion Paper, Open Access)
Default Ports
- 8343: DatAasee API
- 9999: Database JMX (Development Only)
- 8000: Web Frontend (Development Only)
- 80: Web Frontend (Deployment Only)
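After starting the services, the default ports listed above can be probed to see which ones accept connections. The following is a minimal sketch using only the Python standard library. The names `DEFAULT_PORTS`, `port_open` and `report` are illustrative and not part of DatAasee, and the host `localhost` is an assumption about where the containers are published. An open port only shows that something is listening, not that the service is healthy.

```python
import socket

# Default ports as listed in this README.
DEFAULT_PORTS = {
    8343: "DatAasee API",
    9999: "Database JMX (Development Only)",
    8000: "Web Frontend (Development Only)",
    80: "Web Frontend (Deployment Only)",
}


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def report(host: str = "localhost") -> dict:
    """Map each default port to whether it currently accepts connections."""
    return {port: port_open(host, port) for port in DEFAULT_PORTS}


if __name__ == "__main__":
    for port, is_open in report().items():
        state = "open" if is_open else "closed"
        print(f"{port:>5}  {DEFAULT_PORTS[port]:<35} {state}")
```

In a deployment setup only ports 8343 and 80 are expected to be open; the development-only ports should report as closed.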
Repository Contents
- api/: API definition and message schemas
- assets/: Project images and styles
- backend/: Processor pipeline and component definitions
- container/: Dockerfiles
- database/: Database initialization, schemas and enumerated data
- docs/: Software, data and architecture documentation
- frontend/: Prototype frontend definitions
- tests/: Test definitions and data
Getting Started (Development)
- make: List available targets
- make setup: Build server images
- make start: Start servers
- make stop: Stop servers (after attempting a backup)
- make logs: Show logs
- make state: Report container status (requires bash)
- make watch: Monitor container status (requires bash)
- make peak: Report peak database memory usage (requires pgrep)
- make sbom: Create per container SBOMs (requires syft)
- make test: Run HTTP API tests (requires check-jsonschema)
- make tidy: List violations of StrictYAML (requires yamllint)
- make todo: List inline TODOs in repo