Introduction
What is CorrLang?
TL;DR: A domain-specific language (DSL) and tool for managing semantic interoperability through the declaration of relationships between concepts.
History
The idea for CorrLang took shape sometime during the late stages of my PhD (Stünkel 2022). At that time, I was mostly working with quite theoretical, formal mathematical stuff: category theory and algebraic graph transformation. I wanted to complement this rather abstract work with something more tangible that I could show to my more applied software peers1.
The Beginning: graphqlintegrator
The starting point for CorrLang was a “side project” called graphqlintegrator, developed by the very first master's student I supervised: Ole von Bargen. This tool was also presented in a paper at ECMFA (Stünkel et al. 2020) and can be considered the technical and conceptual ancestor of CorrLang: It facilitated a federation of multiple GraphQL endpoints via query rewriting. A federation is a conceptual system made up of multiple physical systems that appears as one from the outside. A common use case for federations is in databases. The rewriting algorithm used in the tool was theoretically inspired by the colimit concept (i.e., first creating a common “global” schema by collecting all schema elements into one and then identifying elements that are considered to be “the same”). The tool also featured a DSL, which was used to specify which GraphQL schema elements should be identified and when. Hence, this DSL was a first draft of the semantic interoperability language that CorrLang is today. Graphqlintegrator was useful for sketching out first ideas and playing around with the query rewriting concept. However, it was more of a prototype than something you would run in production. Moreover, it was “locked in” on the colimit concept and therefore not well suited to expressing all kinds of semantic interoperability problems (as I was discovering on a more theoretical level throughout my PhD).
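The colimit-inspired idea described above (collect all schema elements, then identify the ones considered “the same”) can be illustrated with a small sketch. This is not graphqlintegrator's actual algorithm: the function name `merge_schemas`, the schema names, and the element names are all hypothetical, and the sketch only merges element names (nodes), ignoring fields, references, and query rewriting.

```python
from collections import defaultdict


def merge_schemas(schemas, identifications):
    """Build a "global" schema: disjoint union of all elements, then glue.

    schemas: dict mapping schema name -> iterable of element names
    identifications: iterable of groups, each a list of (schema, element)
                     pairs that are considered "the same"
    """
    # Step 1: collect all elements, qualified by schema name (disjoint union).
    parent = {(s, e): (s, e) for s, elems in schemas.items() for e in elems}

    def find(x):
        # Follow parent pointers to the representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Step 2: identify the elements declared to be "the same" (union).
    for group in identifications:
        first, *rest = group
        for other in rest:
            parent[find(other)] = find(first)

    # Each equivalence class becomes one element of the global schema.
    classes = defaultdict(set)
    for x in parent:
        classes[find(x)].add(x)
    return sorted(sorted(c) for c in classes.values())


# Hypothetical example: two schemas that both know about customers.
schemas = {
    "sales": ["Customer", "Order"],
    "support": ["Client", "Ticket"],
}
identifications = [[("sales", "Customer"), ("support", "Client")]]
global_schema = merge_schemas(schemas, identifications)
```

Here `global_schema` contains three elements: one merged class holding `sales.Customer` and `support.Client`, plus `sales.Order` and `support.Ticket` on their own. A union-find structure is just one convenient way to compute the identification step.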
Consolidation: CorrLang v0.9
Therefore, in 2021, I started a re-implementation of the concepts behind graphqlintegrator, this time from a more foundational perspective, incorporating our idea of comprehensive systems (Stünkel et al. 2021). The latter is an abstract framework, built heavily on concepts from category theory, which might make reading the paper quite cumbersome for those who are not already familiar with this abstract branch of mathematics. The general idea, however, can be sketched intuitively:
- software models (these can be schemas, data models, interface description languages, but also instances, a.k.a. data sets) can abstractly be described as graphs, i.e. classes/concepts/objects are considered as nodes while references/associations/links are considered as edges.
- relationships (e.g. typing/instanceOf) between such models (think graphs) can be expressed by something called graph (homo-)morphisms, i.e. a mapping that respects the edge-node incidence.
- graph morphisms are generally directed and binary. Therefore, they are not directly adequate for expressing semantic interoperability relationships among concepts from separate models because such relationships (e.g. “same-ness”) are generally undirected and multi-ary.
- semantic interoperability relationships can therefore be expressed via spans (formally: a star-shaped structure of multiple partial morphisms), which can be thought of as generalized relations: They can be drawn on a whiteboard as lines (or “tentacles”) with many ends that put nodes or edges from multiple graphs in relationship with each other.
- What we have shown in our paper is that this span structure can actually be “internalized” (flattened) such that the resulting structure is “basically” a graph again. Hence, we only need to work with graphs, with the small caveat that we have to pay special attention to some edges.
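To make the list above a bit more concrete, here is a minimal sketch of graphs, a graph homomorphism check, and a span represented as partial “legs”. It does not reflect CorrLang's internal data structures: the `Graph` class, `is_homomorphism`, `related`, and all element names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Graph:
    nodes: set
    edges: dict  # edge name -> (source node, target node)


def is_homomorphism(g, h, node_map, edge_map):
    """Check that the mapping from g to h respects edge-node incidence."""
    # Every node of g needs an image that is a node of h.
    if any(n not in node_map or node_map[n] not in h.nodes for n in g.nodes):
        return False
    for e, (src, tgt) in g.edges.items():
        # Every edge of g needs an image that is an edge of h.
        if e not in edge_map or edge_map[e] not in h.edges:
            return False
        # The image edge must run between the images of source and target.
        if (node_map[src], node_map[tgt]) != h.edges[edge_map[e]]:
            return False
    return True


# Typing as a morphism: an instance graph mapped onto a schema graph.
schema = Graph({"Customer", "Order"}, {"orders": ("Customer", "Order")})
instance = Graph({"alice", "order1"}, {"placed": ("alice", "order1")})
typing_nodes = {"alice": "Customer", "order1": "Order"}
typing_edges = {"placed": "orders"}
well_typed = is_homomorphism(instance, schema, typing_nodes, typing_edges)

# A span as a star: one apex element per relationship and one partial
# "leg" per graph. A leg may omit an apex element (partiality).
span_legs = {
    "sales": {"same_person": "Customer"},
    "support": {"same_person": "Client"},
    "billing": {},  # the apex element has no image in this graph
}


def related(apex_element):
    """Collect the ends of one "tentacle": graph name -> related element."""
    return {g: leg[apex_element] for g, leg in span_legs.items()
            if apex_element in leg}


same_person = related("same_person")
```

In this example `well_typed` is `True`, while swapping the node images would break the incidence condition. The `same_person` relationship is undirected and could have any number of ends, which is what a single binary morphism cannot express directly.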
CorrLang was meant as a showcase to put these ideas into action. With the delivery of my thesis sometime in September 2021, version 0.9 of CorrLang was ready. Until recently, this was the “official” showcase version linked on the website. It is also the version that is described in chapter 6 of my thesis. This version re-implemented the functionality of graphqlintegrator in the more general framework of comprehensive systems and came with a more “polished” DSL. The main concepts of the DSL remain valid in the newest version and are depicted in Figure 1.
The basic building blocks are endpoints. These can be servers, databases or simple files (treated as “black boxes”). Each endpoint must have a schema, which describes it by listing the “entities”, “operations”, “data types” etc. In CorrLang, these are called elements and, as we have just learned, they are abstractly considered to be nodes and edges in a graph. Among a set of endpoints (at least two), one may define a correspondence, which means that these endpoints share some semantic commonalities. These commonalities reify the correspondence via concrete relationships among the schema elements. Commonalities can be of different types: one is identity (i.e. two or more elements in disparate schemas representing the same concept), which is a very strong form of commonality; another is some sort of abstract relation.
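The concepts just described (endpoints with schema elements, correspondences, and typed commonalities) could be modelled roughly as follows. This is a sketch under assumptions and is neither the CorrLang DSL nor its API: all class names, fields, and the `validate` method are hypothetical, and for brevity a commonality relates at most one element per endpoint.

```python
from dataclasses import dataclass, field
from enum import Enum


class CommonalityKind(Enum):
    IDENTITY = "identity"  # elements represent the same concept
    RELATION = "relation"  # some weaker, abstract relation


@dataclass
class Endpoint:
    name: str
    elements: set  # schema elements: entities, operations, data types, ...


@dataclass
class Commonality:
    name: str
    kind: CommonalityKind
    members: dict  # endpoint name -> element name


@dataclass
class Correspondence:
    name: str
    endpoints: list
    commonalities: list = field(default_factory=list)

    def validate(self):
        """Return a list of problems; an empty list means consistent."""
        problems = []
        if len(self.endpoints) < 2:
            problems.append("a correspondence needs at least two endpoints")
        by_name = {ep.name: ep for ep in self.endpoints}
        for c in self.commonalities:
            for ep_name, element in c.members.items():
                if ep_name not in by_name:
                    problems.append(f"{c.name}: unknown endpoint {ep_name}")
                elif element not in by_name[ep_name].elements:
                    problems.append(
                        f"{c.name}: {ep_name} has no element {element}")
        return problems


# Hypothetical endpoints and a correspondence between them.
sales = Endpoint("sales", {"Customer", "Order"})
support = Endpoint("support", {"Client", "Ticket"})
same_person = Commonality(
    "SamePerson", CommonalityKind.IDENTITY,
    {"sales": "Customer", "support": "Client"},
)
crm = Correspondence("CRM", [sales, support], [same_person])
problems = crm.validate()
```

For this declaration `problems` is empty. A correspondence over a single endpoint, or a commonality naming an element that is missing from an endpoint's schema, would be reported by `validate`.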
The road to v1.0
After defending my PhD in February 2022, the usual thing happened: I got a new position somewhere else and thus development halted. At this point, CorrLang would have gone the same way as all academic software prototypes that are developed as part of master's or PhD theses: They disappear. Thankfully, I got the opportunity to continue in academia. Even though I have a lot less time than before, I can still work on CorrLang in between. With the prospect of development support through student projects and of increasingly powerful LLM coding assistants, I am motivated to turn CorrLang into a somewhat more stable product. Since 2025, I have been working on a major revision of the codebase and am finally approaching something I would like to call v1.0.
The major changes compared to the previous version are as follows:
- The concept of goals was dropped and replaced with the more general concept of a view.
- The DSL was revised once more to account for the above change and to improve editing ergonomics.
- The internal software architecture was completely reworked to make CorrLang less dependent on a single programming language. To this end, the central codebase with the custom comprehensive system logic was encapsulated in something called the core-service (written in Java). The service offers a public API through gRPC.
References
Footnotes
Also, it was a nice “side hustle” while wading through the tough phase of my Ph.D. thesis writeup.