DatAasee (0.3)
Repository: github.com/ulbmuenster/dataasee (nb sources backup)
Maintainer: Christian Himpe (at University and State Library of Münster)
Licenses: MIT (add. CC-BY for openapi.yaml)
Function: Metadata-Lake, Metadata Catalog, Metadata Aggregator, Union Catalog
Audience: University Libraries, Research Libraries, Academic Libraries, Scientific Libraries
Tech Stack Canvas
- Setting: Many distributed data and metadata sources
- Goals:
  - Centralize metadata
  - Interlinked metadata catalog
  - Super-index for bibliographic and research data
- Features:
  - Interact through HTTP-API (JSON)
  - Search by filter or full-text
  - Custom query via: SQL, Gremlin, Cypher, MQL, GraphQL
- Frontend: Lowdefy
- Backend: Connect (Benthos)
- Data Storage: ArcadeDB
- Infrastructure: Compose (via Docker or Podman)
- Deployment: via Harbor (at Uni Münster)
- Monitoring: Prometheus
- Integrations:
  - Protocols: OAI-PMH (HTTP), S3 (HTTP), GET (HTTP), DatAasee (HTTP)
  - Encodings: XML (Plain-Text)
  - Formats: DataCite (XML), DC (XML), LIDO (XML), MARC (XML), MODS (XML)
- Security: Privileged endpoints (CQRS)
- Testing: check-jsonschema
- Development: GitHub
Documentation
- Dependencies Overview
- Software Documentation
- Architecture Documentation
- Database Schema
- OpenAPI Schema
DatAasee: A Metadata-Lake as Metadata Catalog for a Virtual Data-Lake (Companion Paper, Open Access)
Getting Started (Deployment)
- Depends on docker-compose (compatible with docker and podman).
- To deploy, there is no need to clone the repository; just use the compose.yaml file.
- See the Deploy Documentation for details.
Quick Start:
$ wget https://raw.githubusercontent.com/ulbmuenster/dataasee/0.3/compose.yaml
$ mkdir -p backup
$ DB_PASS=password1 DL_PASS=password2 docker compose up -d
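The literal passwords in the Quick Start are placeholders. As a minimal sketch, assuming a POSIX shell with `head`, `od` and `tr` available, the two variables `DB_PASS` and `DL_PASS` could be filled with random values instead. The helper names `gen_pass` and `deploy` are hypothetical and not part of DatAasee.

```shell
#!/bin/sh
# Sketch only: generate random passwords for the two variables from the Quick Start.

# Print a random 32-character hex string (16 bytes from the kernel RNG).
gen_pass() {
  head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n'
}

DB_PASS="$(gen_pass)"
DL_PASS="$(gen_pass)"
export DB_PASS DL_PASS

# Show the values once so they can be stored somewhere safe.
printf 'DB_PASS=%s\nDL_PASS=%s\n' "$DB_PASS" "$DL_PASS"

# Hypothetical helper (not part of DatAasee): start the stack with the
# exported passwords; run it in the directory holding compose.yaml.
deploy() {
  docker compose up -d
}
```

Calling `deploy` afterwards in the directory that contains `compose.yaml` and the `backup` folder corresponds to the last Quick Start command, with generated passwords in place of `password1` and `password2`.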
Default Ports
- 8343: DatAasee API
- 2480: Database API (Development Only)
- 9999: Database JMX (Development Only)
- 8000: Web Frontend (Development Only)
- 80: Web Frontend (Deployment Only)
Repository Contents
- api/ - API definition and message schemas
- assets/ - Logos and style definition
- backend/ - Processor pipeline and component definitions
- container/ - Dockerfiles
- database/ - Database initialization, schemas and enumerated data
- docs/ - Documentation of software, data and architecture
- frontend/ - Prototype frontend definition
- tests/ - Test definitions and data
Getting Started (Development)
- Available make targets:
  - make setup: Build server images
  - make start: Start servers
  - make stop: Stop servers
  - make reset: Stop and start servers
  - make empty: Delete database backups (requires privileges)
  - make logs: Show logs (requires grep)
  - make peak: Report peak database memory usage (requires grep)
  - make test: Run tests (requires check-jsonschema, busybox, wget)
  - make tidy: List violations of StrictYAML (requires yamllint)
  - make todo: List inline TODOs in repo (requires grep)
- Custom make variable: COMPOSE
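The list above does not say what `COMPOSE` controls. Assuming it selects the compose command (which would fit the Docker or Podman choice named in the tech stack), the following sketch shows how a `make` variable is overridden on the command line. It uses a toy Makefile written to a temporary directory, not the repository's Makefile, and the default value `docker compose` is an assumption made only for this demonstration.

```shell
#!/bin/sh
# Toy Makefile (NOT the repository's): COMPOSE defaults to "docker compose"
# and the "start" target only echoes the command it would run.
demo_dir="$(mktemp -d)"
printf 'COMPOSE ?= docker compose\nstart:\n\t@echo $(COMPOSE) up -d\n' > "$demo_dir/Makefile"

# Without an override, the default from the Makefile is used.
default_cmd="$(make -s -f "$demo_dir/Makefile" start)"

# A VAR=value argument on the command line takes precedence over the Makefile.
override_cmd="$(make -s -f "$demo_dir/Makefile" start COMPOSE='podman compose')"

echo "$default_cmd"
echo "$override_cmd"
rm -rf "$demo_dir"
```

If the assumption holds, the same pattern against the real Makefile would look like `make start COMPOSE="podman compose"`; the repository's Makefile should be checked for the values it actually accepts.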