Comprehensive Systems: A formal foundation for Multi-Model Consistency Management

Model management is a central activity in Software Engineering. The most challenging aspect of model management is to keep inter-related models consistent with each other while they evolve. As a consequence, there is a lot of scientific activity in this area, which has produced an extensive body of knowledge, methods, results and tools. The majority of these approaches, however, are limited to binary inter-model relations; i.e. the synchronisation of exactly two models. Yet, not every multi-ary relation can be factored into a family of binary relations. In this paper, we propose and investigate a novel comprehensive system construction, which is able to represent multi-ary relations among multiple models in an integrated manner and thus serves as a formal foundation for artefacts used in consistency management activities involving multiple models. The construction is based on the definition of partial commonalities among a set of models using the same language, which is used to denote the (local) models. The main theoretical results of this paper are proofs of the facts that comprehensive systems are an admissible environment for (i) applying formal means of consistency verification (diagrammatic predicate framework), (ii) performing algebraic graph transformation (weak adhesive HLR category), and (iii) that they generalise the underlying setting of graph diagrams and triple graph grammars.


Introduction
Conceptual models, i.e. abstract specifications of the system under development, are recognised to be of major importance in software engineering [WHR14]. Representing the whole system in a single (global) model is generally infeasible [CCP19]. Hence, different teams design and maintain several models (views), which focus on different aspects of the system. This collection of inter-related models is often referred to as a multi-model [BKMW09,SKLR18,DKL19]. A major issue of multi-models is comprehensive consistency management [Ste20, FKWVH19, CCP19, SZ01], i.e. keeping the collection of models consistent w.r.t. each other under the ongoing development process to avoid conflicting interpretations of what is being developed. Comprehensive systems are a more expressive alternative to the (colimit-based) model merging approach and they are able to serve as the formal underpinning for model weaving. Furthermore, we will prove (i) that existing means of consistency verification (diagrammatic predicates) can be applied to comprehensive systems, (ii) that the algebraic graph transformation (GT) framework [EEPT06] can be applied to comprehensive systems, and (iii) that comprehensive systems generalise the underlying categories of triple graphs and graph diagrams [TA15], the latter being a multi-ary generalisation of the former.
If Multi-Model Consistency Management is considered as a three-step process comprising the activities alignment, verification and restoration, comprehensive systems are located in the alignment phase. Verification and restoration are enabled by showing that comprehensive systems admit all formal properties that are required to apply mature model management frameworks for verification and restoration that already exist for local models. The results in this paper are foremost of a formal, analytical nature. A practical evaluation is left for future work.
Changes compared to the conference version. This article is an extended version of the paper "Towards Multiple Model Synchronization with Comprehensive Systems" [SKLR20], published in the proceedings of the 2020 edition of the Fundamental Approaches to Software Engineering conference. A major change to the conference version is a completely rewritten "state of the art" (Section 3), which provides a more detailed overview of Multi-Model Consistency Management and contemporary tool support. This allows us to set the contribution of comprehensive systems in a bigger context and to motivate their use case. Furthermore, the theory part is extended substantially: We prove that comprehensive systems are organized into a category that has the weak adhesive HLR property w.r.t. a suitable class of reflective monomorphisms M (Corollary 1 in Section 4.4). This opens the door for the application of the well-established GT-framework and represents a substantial extension compared to the conference version.
Outline. Section 2 introduces a Multi-Model Consistency Management scenario, which will be used as a running example throughout the paper. Section 3 gives an overview of the state of the art of Multi-Model Consistency Management. Section 4 introduces comprehensive systems and their formal properties. Finally, Section 5 concludes the paper with references to related work and future work plans. Moreover, to make this paper self-contained, there is an Appendix, which is divided into two parts. Appendix A contains background on category theory that is required for the proofs in Section 4. Appendix B contains the detailed proofs of the theorems in Section 4.

Use case
Our running example stems from the healthcare domain and models a patient referral process. A referral is "the act of sending a patient to another physician for ongoing management of a specific problem with the expectation that the patient will continue seeing the original physician for co-ordination of total care" [Seg92]. It is an important and recurring process in the healthcare domain. Hence, ICT-support is desirable [WK19]. Furthermore, its design is far from trivial as it involves multiple actors (software vendors, government officials, hospitals and physicians) and aspects (data structures, behavior, interfaces, policies, etc.). A small excerpt of models involved in this design is shown in Fig. 2 (ignore the dashed lines for the moment).
There is a process model A_1 denoted in Business Process Model and Notation (BPMN) [Obj14], a data model A_2 denoted as a Unified Modelling Language (UML) class diagram [Obj15], and a decision model A_3 denoted in Decision Model and Notation (DMN) [Obj19]. The model A_1 represents a simplified version of the one in [WK19] and specifies the behavioural aspect from the viewpoint of the referring physician: The process is triggered by a patient's appeal and begins with an introductory consultation. Afterwards, information about the patient and their medical history is extracted while, in parallel, a consultant is selected via a business rule. The patient information is then sent to the consultant. The consultant can either approve the referral or reject it. In the latter case, another consultant has to be found. If a consultant accepts the referral, the process is finished.
This process model is related to the other models: The domain-specific behaviour of the "Select Consultant" activity in A_1 is specified in the decision table model A_3, which, for a given combination of values in the input side columns, assigns combinations of values in the output side columns. The data objects (represented by file symbols) in A_1 are implemented by respective classes or attributes in A_2. There are many more examples of such relations in practice [FKWVH19,TvdBS20]: identity, usage, dependency, refinement, and so on. Hence, there is a plethora of names for this concept, for example traces [DKPF09], corrs [Sch94], morphisms [Ber03], mappings, and cross-reference links [dLGKH18].

State of the art
The problem of inconsistency among inter-related software models has been a major concern of the software engineering community since the late eighties [SZ01]. A prominent study from these early times is the ViewPoints framework [FKN+92, FGH+93]: A complex system is described by a set of loosely coupled viewpoints, where each viewpoint may use its own notation. Viewpoints pioneered the usage of logic to define consistency. Each viewpoint has an internal consistency specification and the framework can check consistency both internally to a viewpoint and externally between multiple viewpoints. When inconsistencies are discovered, the framework automatically tries to resolve them using so-called meta-level axioms stated in temporal logic, which specify how inconsistencies shall be addressed. A successor in this line of conception is Xlinkit [NEFE03,NEF03], a tool for consistency management of XML documents. Consistency rules are defined by a combination of First-Order-Logic (FOL) and XML Path expressions. When the tool discovers inconsistencies, it generates repair actions based on the structure of formulas and the XML document.
With the advent of MDE, the issue of consistency among multiple models has become even more significant [CCP19,Ste20]. This issue is featured in the following contemporary research domains, which mark the related research areas of this work.
• Multi-View Modeling (MVM) [BBCW19,CCP19] can be seen as the continuation of the viewpoints idea within MDE. The specification of a complex system requires a multitude of views, i.e. (partial) specifications focusing on a certain aspect of the system (data types, behaviour, components). A prominent example of this principle is UML: It comprises 14 different diagram types for modelling structural and behavioural aspects of a system. The need for view-based specification has also been identified for domain specific modelling languages [GBB12]. A major issue is overlaps between views, i.e. situations where they refer to the "same" concepts. When a view is changed, it must be ensured that all occurrences of overlaps are changed accordingly in other views so that global consistency is not violated.
• The latter, known as the view-update problem, is the origin of the cross-disciplinary research area BX [CFH+09, ASCG+18, ABW+19]. This area comprises researchers from databases, pure mathematics, functional programming, graph transformation and (model-driven) software engineering. The solutions produced in BX are called synchronisers, i.e. propagation functions that translate updates from one data source to another and vice versa. BX represents consistency as a correspondence relation [Ste08] between synchronised data sources. The update propagation is considered correct when the propagation functions always return a result that satisfies the correspondence relation.
• Megamodelling stands for a fundamental idea in MDE, where every artefact in the software development process is a model [BJV04,Bé05,FN05]. Models are transformed (refined, translated, migrated) to eventually yield a running system via model execution or code generation. The definition of a model transformation can be seen as a model itself and its execution produces a trace-model. Thus, model transformations can again be transformed by higher-order transformations. The fact that these artefacts depend on other models raises the question of megamodel consistency [Ste20].
We group the research areas mentioned above under the term Multi-Model Consistency Management. To avoid confusion by different terminology, we clarify the concepts of Multi-Model Consistency Management here. Fig. 3 gives an overview of both the artefacts and activities in Multi-Model Consistency Management.
Multi-models are built from (local) models (abstract representations of certain parts of a system). Models contain elements and are denoted in a graphical or textual modelling language. A collection of models denoted in the same modelling language is called a model space and is defined by a metamodel. A metamodel comprises a definition of the language's abstract concepts (compare the terms denoted in teletype font in Section 2) together with their relationships and structural integrity rules. A multi-model (global) is a reification of a correspondence relation among several models, called components of the multi-model. The definition of a correspondence relation is based on consistency rules (e.g. CR1-CR5), which can be evaluated on a multi-model resulting in either true or false. Validity of consistency rules is witnessed by commonalities, which establish structural relationships between elements from disparate models (the dashed elements in Fig. 2). Thus, a multi-model is given by a collection of models and commonalities among their elements. We can distinguish between consistency rules that only refer to elements within the same model and those involving multiple models and commonalities. Following the terminology from [UNKC08] the former are called intra-model consistency rules, also known as constraints. The latter are called inter-model consistency rules.
The early literature [SZ01,FST96] identified the following list of activities of the Multi-Model Consistency Management process: Detection of overlaps, Detection of inconsistencies, Diagnosis of inconsistencies, Handling of inconsistencies, Tracking of inconsistencies, Specification and application of an inconsistency management policy. This list still applies today but to simplify presentation, we will merge them into a three-stage process comprising (I) Alignment, (II) Verification, and (III) Reconciliation. Providing an extensive overview of the state of the art for each of these stages would go beyond the scope of this paper. Thus, we only give a brief description of each stage and provide references to existing surveys for further details.

Alignment
Alignment involves the preparatory actions of detecting and representing commonalities. In addition, consistency rules and policies [FKN+92] are defined. Policies are meta-rules that specify how inconsistencies shall be addressed. Spanoudakis and Zisman [SZ01] distinguish between preventive (not allowing certain actions in the first place), remedial (immediate reaction to inconsistencies) and tolerating (doing nothing) policies.
This paper is about a novel formalism for representing multi-models to support consistency management. Thus commonalities play a prominent role, which is why we provide a detailed treatment of their detection and representation.

Commonality detection
The ISO 42010 standard [ISO11] considers the architecture description of a system to comprise multiple views (i.e. models in our terminology). It further distinguishes between projective and synthetic approaches. In the former case, every model is merely a projection of an underlying all-encompassing system model. The most popular representative of this approach is UML [Obj15] itself: Every diagram displays a certain part of one comprehensive underlying UML model. For example, the same method-instance may appear in a class diagram and a sequence diagram. Thus, commonalities are already implicitly known and do not need to be discovered any more. There can still be conflicts between views, e.g. when a method name is changed in one view and not the other. The synthetic approach considers all models to be independent entities. Eventually, they have to be composed [BCE+06] to yield the resulting system. There are also proposals for combining both projective and synthetic approaches. Orthogonal software modelling [ASB10] is such a representative, where one first has to create a single underlying model (SUM) (synthetic) based on existing independent local models. The SUM is then used to derive (projective) views from it, i.e. the composition of the synthetic approach is antedated. The construction of a SUM [MWK+20] may be difficult and therefore some researchers proposed to only construct it virtually [KKL+21].
Both synthetic and hybrid approaches require discovery of commonalities, which is also known as model matching [KDRPP09]. Spanoudakis and Zisman [SZ01] identify four primary approaches for model matching. The simplest (and arguably most naive) approach is to establish a commonality when two or more elements in disparate models share the same name. Another variant is to rely on a shared ontology, which requires that all model elements are annotated with a term from this ontology. In many cases, model matching via human inspection is required, i.e. users have to manually define commonalities. Commonalities come in many different kinds and identifying them is far from obvious. In [BEEH+19], the authors describe how a collaborative decision process can be used for this. Finally, automated similarity analysis can be an option. However, being a special case of the weighted bipartite graph matching problem, it is an NP-complete problem and may therefore run into complexity issues.
Model weaving is a different approach, which was originally introduced to trace the execution of model transformations [BBDF+06]. It is closely related to model traceability [ARNRSG06]. Commonalities are stored in a separate trace model, i.e. a collection of cross-reference links. The trace model can be queried and modified independently of the local models.
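The name-based matching strategy and the idea of storing the resulting commonalities in a separate, independently queryable trace model can be illustrated with a small sketch. It is not taken from any of the cited tools; the names Element, match_by_name and links_of as well as the sample elements are hypothetical and only serve as an illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    model: str   # name of the local model the element lives in
    name: str    # element name used for matching


def match_by_name(elements):
    """Group elements from different models that share a (case-insensitive) name.

    Returns a list of trace links; each link is a frozenset of elements
    drawn from at least two distinct models.
    """
    by_name = defaultdict(set)
    for element in elements:
        by_name[element.name.lower()].add(element)
    links = []
    for key in sorted(by_name):
        group = by_name[key]
        if len({e.model for e in group}) >= 2:
            links.append(frozenset(group))
    return links


def links_of(trace_model, element):
    """Query the trace model without touching the local models."""
    return [link for link in trace_model if element in link]


# Hypothetical elements, loosely inspired by the running example.
elements = [
    Element("A1", "Patient"),
    Element("A2", "patient"),
    Element("A3", "Consultant"),
    Element("A2", "Consultant"),
    Element("A1", "Referral"),
]
trace_model = match_by_name(elements)
```

In this sketch, the trace model is simply a list of links, so it can be inspected or extended (e.g. by manually defined commonalities) without changing the local models.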
The Heterogeneous transformations approach does not represent commonalities explicitly. Instead, they are implicitly encoded in the definition of transformation functions, which are established between every pair of models. This approach is common in BX, where these functions are called Put and Get [FGM+07]. Also, the Query/View/Transformation (QVT) [Obj16a,Ste08] standard specifies commonalities in this way.
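To illustrate how a pair of Get and Put functions encodes a commonality implicitly, consider the following minimal sketch. The record types and field names are hypothetical; the two assertions at the end state the round-trip laws (often called GetPut and PutGet) that are commonly required of well-behaved lenses in the BX literature.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PatientRecord:      # hypothetical source model
    name: str
    birth_year: int
    notes: str


@dataclass(frozen=True)
class PatientView:        # hypothetical view exposing part of the source
    name: str
    birth_year: int


def get(source: PatientRecord) -> PatientView:
    """Forward direction: derive the view from the source."""
    return PatientView(source.name, source.birth_year)


def put(source: PatientRecord, view: PatientView) -> PatientRecord:
    """Backward direction: push a view update into the source,
    keeping the source parts that the view does not show."""
    return replace(source, name=view.name, birth_year=view.birth_year)


source = PatientRecord("Ada", 1980, "allergic to penicillin")
updated_view = PatientView("Ada L.", 1980)
new_source = put(source, updated_view)

# Round-trip laws commonly required of well-behaved lenses.
assert put(source, get(source)) == source               # GetPut
assert get(put(source, updated_view)) == updated_view   # PutGet
```

Note that the correspondence between the two name fields is nowhere stored as data; it exists only in the bodies of get and put, which is what "implicitly encoded" means here.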
Dynamic extension was pioneered in [EHHS00] and is nowadays often implemented with more lightweight dynamic modelling techniques such as facets [dLGKH18]. The idea is related to aspect-oriented programming. Local models are enhanced with commonality meta-data when needed. For instance, we may add a boolean flag to every business rule activity in Fig. 2 to check whether the activity has an associated decision table, compare CR1.
These four approaches can be classified along a two-dimensional grid, shown in Tab. 1 with the two dimensions global/local and intra-model/inter-model. Both merging and weaving store commonality information globally (as a merged model or a set of all cross-reference links). Heterogeneous transformations and dynamic extension do not need to consider all models at once since commonality information is stored locally (usually pairwise). Merging and dynamic extension represent the commonality information within models while weaving and heterogeneous transformations represent it outside of the models.

Verification
Verification involves the activities centred around finding and tracking (storing and reporting) inconsistencies. The four approaches for finding inconsistencies, according to [SZ01], are logic-based (using a resolution procedure), model-checking (enumerating all possible instances), specialized automated analysis and human-centred exploration (inconsistencies are reported manually).
The first two approaches are generic: When models and consistency rules have an encoding as predicates and formulas in a logic, one can use a resolution or model checking procedure that exists for the respective logic to verify consistency. The limitation of both approaches is complexity (state explosion, non-terminating resolution). Thus, several means for specialized automated analysis have been developed. An example in the MDE domain is given by the Object Constraint Language (OCL) [WK99], which can be used to define and verify consistency rules defined on UML models. The work by Egyed and his collaborators [Egy07,RE12] comprises powerful and field-tested tools that implement consistency verification and restoration in the context of UML/OCL. Human-centred exploration requires the most effort; however, it is the only way to discover inconsistencies for informally given models and consistency rules. A comprehensive survey on consistency verification in the context of UML is given in [KM18]. A more recent survey is [TvdBS20], which also includes domains other than software engineering, e.g. electrical and mechanical engineering.

Reconciliation
When inconsistencies arise, they have to be analysed and addressed according to the predefined policies. In general, consistency violations trigger a semi-automatic consistency restoration procedure. The latter is also known as update propagation, synchronisation or model repair. It is a vast research field and we can only sketch the primary approaches and concepts here. Surveys on the field are given in [MJC17,ABW+19,SKRL21].
Approaches have been classified into constraint-, search- and propagation-based [OPN20, FKM+20, WAF+19]. However, we want to use the classification from [SKRL21] and propose search-based and rule-based as the two top level classifiers: Constraint-based can be seen as a special case of search-based model repair and the term rule-based is used to include hand-crafted imperative approaches into the picture.
Search-based approaches are declarative. The repair problem is conceived as a search problem, where the state space is given by the model space, state transitions are given by possible model modifications, and goal states are those models that satisfy all user-defined consistency rules. A naive atomic search implementation treats models as black boxes. Due to the sheer size of the state space, atomic search has to be combined with additional techniques such as heuristics [SMBB10] or machine learning [BMdlC+20] to cope with complexity. Alternatively, a prevalent implementation strategy is to translate the problem into a logical representation such that efficient off-the-shelf solvers can be used to perform the search. Examples are given by Echo [MC16] or JTL [EMM+12]. This type of search-based repair is well-aligned with verification using resolution or model-checking as it requires a logical representation of models and consistency rules as well.
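The search formulation described above (states are models, transitions are modifications, goal states are consistent models) can be made concrete with a naive breadth-first search. This is only a sketch under simplifying assumptions: models are encoded as sets of hypothetical facts, the single consistency rule is loosely inspired by the decision-table example, and none of the cited tools is claimed to work this way.

```python
from collections import deque

# A model is a frozenset of facts; facts are hypothetical tuples such as
# ("activity", "SelectConsultant") or ("table", "SelectConsultant").


def is_consistent(model):
    """Goal test: every activity has a decision table of the same name."""
    activities = {n for kind, n in model if kind == "activity"}
    tables = {n for kind, n in model if kind == "table"}
    return activities <= tables


def modifications(model):
    """State transitions: add a missing table or delete an activity."""
    activities = sorted(n for kind, n in model if kind == "activity")
    for name in activities:
        if ("table", name) not in model:
            yield ("add table " + name, model | {("table", name)})
    for name in activities:
        yield ("delete activity " + name, model - {("activity", name)})


def repair(model, max_depth=5):
    """Breadth-first search for a shortest sequence of modifications
    that leads to a consistent model; returns (steps, repaired_model)
    or None if no repair is found within max_depth steps."""
    queue = deque([(model, [])])
    seen = {model}
    while queue:
        current, steps = queue.popleft()
        if is_consistent(current):
            return steps, current
        if len(steps) == max_depth:
            continue
        for label, successor in modifications(current):
            if successor not in seen:
                seen.add(successor)
                queue.append((successor, steps + [label]))
    return None


broken = frozenset({("activity", "SelectConsultant"),
                    ("activity", "Triage"),
                    ("table", "Triage")})
steps, repaired = repair(broken)
```

In this example, deleting the activity would be an equally short repair; the search merely returns the first one it encounters, which hints at why non-unique repair results call for user interaction or an optimality criterion.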
Rule-based approaches require more concrete user guidance on how to react to inconsistencies. We distinguish further between imperative and grammar-based approaches. In the former case, the developer has to write a procedure, which will be executed in the event of a consistency violation [SDZKR18]. Imperative approaches give no further guarantee about correctness. Grammar-based approaches represent a more declarative approach to rule-based repair. The grammar defines repair rules on a higher level of abstraction, which are then operationalised to produce concrete repairs. An example is the Model/Analyzer tool by Egyed et al. [RE12], which derives possible repairs from the structural rules defined by the UML metamodel. Another prominent formal representative is given by the (algebraic) graph transformations framework [EEPT06]. The latter represents rules by means of graph homomorphisms, and rule application is defined via a Double Pushout (DPO) construction [EPS73]. Consistency rules are defined by means of a graph grammar [Roz97]. This framework offers means for the analysis of internal properties (concurrency, confluence, termination) [EEPT06] and correctness properties (compliance between rules and static conditions) [HP09]. Model repair approaches that utilize this framework exist both for local models [KR17, OPKK18, SLO19] and multiple models [HEO+11, WAF+19, FKM+20]. The latter are represented under the umbrella of triple graph grammars (TGGs).
Orthogonally, model repair approaches must take into account cross-cutting concerns such as user interaction, incrementality, concurrency and optimality. The fact that repair results are not always unique necessitates user interaction, for example by letting the user choose a preferred solution among several possible solutions. The Model/Analyzer tool [RE12] relies heavily on user interaction. Incrementality was highlighted by Giese [GW09]: The complexity of model repairs should not depend on the size of the models but on the size of the modification, i.e. re-computing the whole correspondence relation from scratch should be avoided. Support for concurrent model synchronisation (repairing inconsistencies in the aftermath of parallel and independent model modifications) was rather limited for a long time [OBE+13], but recently there have been some interesting new results in this direction related to rule-based approaches [OPN20, FKM+20, WFA20]. Finally, if there are multiple correct solutions for a repair, the question arises of what should be considered the "best" solution. Both quantitative (i.e. metrics) [MC16] and qualitative [CGMS15] measures have been proposed. However, it may be noted that some changes can only be ameliorated instead of rectified completely [SZ01]. In fact, toleration of inconsistencies may be an adequate reaction [NER01] as well.

Existing tools
In accordance with the phases previously described, there is a large number of existing tools. Hence, it is not possible to cover a broad selection here. We pick a small selection of contemporary tools to illustrate existing issues in Multi-Model Consistency Management that we want to address by comprehensive systems.

Epsilon (matching, merging, verification)
Model-driven design and development is supported by model management tools providing facilities for common tasks such as querying and modifying a model's contents, verifying the consistency of a model, merging two models or translating a model into a different representation. Epsilon [PKR+09] is a well-established representative of such a model management tool. It is organized as a set of Domain Specific Languages (DSLs), one for each model management task, including DSLs for matching (ECL), merging (EML) and verifying models (EVL) [KPP06]. Listing 1 shows the Epsilon code needed to implement consistency verification for CR1 and CR2. In the first step, elements from disparate models have to be matched (lines 2-8). Epsilon performs automatic pairwise model matching, which is controlled by user-defined rules. A rule defines the model element types that should be matched with each other (keywords match and with), when a commonality should be established (compare), and optionally a filter-criterion (guard). The matching engine compares all pairs of model elements with the respective types and creates commonalities when guard and match criterion are fulfilled. In Epsilon vernacular, commonalities are called match traces. They are processed further to create a merged model A^+, compare Fig. 1d. This step is controlled via respective merge-rules (lines 10-16) and copy-rules (lines 17-21). Merge rules are invoked for all match traces with the respective types and produce an element in the merged model. Afterwards, copy rules are invoked, which copy unmatched elements into the merged model. It is important to note that our example in Section 2 is heterogeneous, i.e. the models are denoted in different modelling languages. In order to create a merge in this case, we have to create a single underlying metamodel (SUMM) beforehand, which encompasses concepts from BPMN, DMN and UML (i.e. another instance of model matching and merging on the metamodel level). For more details about this issue, we refer to [DXC11].
When the merged model is created, we can check whether it fulfils the global consistency rules (lines 23-28), which are formulated in an OCL-like language called Epsilon Validation Language (EVL). These rules can be augmented with a fix statement (lines 25-26). The latter defines an imperative program to restore consistency, e.g. creating the missing decision table.
Concerning the Epsilon solution in relation to our presentation of Multi-Model Consistency Management: Matching is performed via automatic model comparison, which is controlled by user-defined rules, commonalities are reified in a merged model, consistency verification is implemented via specialized verification means (EVL) and repair is performed in an imperative rule-based manner.
In Section 1, we mentioned that merging is a forgetful operation: The origin of elements and the information whether an element was merged with another is lost after the merge. Thus, in order to verify CR1, we had to augment the merge-rule (line 13) and the copy-rule (line 21) with this meta-information to be available in the resulting merged model, where consistency verification is performed.
While information loss can be overcome in the aforementioned way, Epsilon suffers from a major limitation: In its present version, it only supports pairwise matching. Therefore, while CR1-CR4 are implementable, CR5 cannot be realised with this tool since it requires a ternary relation. It is not enough to only look at the pairs (A_1, A_2), (A_2, A_3) and (A_1, A_3). Consider the situation in Fig. 4a: Each model pairing is apparently consistent since there are binary one-to-one correspondences, but taken altogether the ternary "to-one" correspondence is violated. We conclude with two requirements for a formal multi-modelling foundation.
Requirement 1 Comprehensive Systems must not forget the origin of elements from the original models.
Requirement 2 Comprehensive Systems must be able to handle arbitrary n-ary correspondences.
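Why pairwise checks are insufficient can be illustrated with a small, purely hypothetical configuration (it is not a reproduction of Fig. 4a, and the rule checked below is only a simplified stand-in for a ternary "to-one" constraint). Three binary links are each one-to-one for their pair of models, yet their transitive combination relates two different elements of the same model.

```python
from itertools import combinations

# Hypothetical elements, written as (model, element) pairs.
a, a2 = ("A1", "a"), ("A1", "a'")
b = ("A2", "b")
c = ("A3", "c")

# Binary correspondences, one per pair of models.
links = [(a, b), (b, c), (a2, c)]


def pairwise_one_to_one(links, m1, m2):
    """Check that the links between models m1 and m2 are one-to-one."""
    pairs = [(x, y) for x, y in links if {x[0], y[0]} == {m1, m2}]
    ends = [e for pair in pairs for e in pair]
    return len(ends) == len(set(ends))


def clusters(links):
    """Merge transitively connected elements (simple union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            x = parent[x]
        return x

    for x, y in links:
        parent[find(x)] = find(y)
    groups = {}
    for x in parent:
        groups.setdefault(find(x), set()).add(x)
    return list(groups.values())


def globally_to_one(links):
    """Each cluster may contain at most one element per model."""
    return all(len({m for m, _ in g}) == len(g) for g in clusters(links))


pairwise_ok = all(pairwise_one_to_one(links, m1, m2)
                  for m1, m2 in combinations(["A1", "A2", "A3"], 2))
global_ok = globally_to_one(links)
```

Here pairwise_ok evaluates to True while global_ok evaluates to False: the violation only becomes visible once all three models are considered together.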

Generic and domain-specific trace models
Instead of turning the match traces into a merged model, we can turn them into a trace model. As pointed out in [DKPF09], commonalities represent entities in their own right. Augmenting a set of models with a trace-model is known as model weaving [BBDF+06]. In its most generic form, a trace model is a (hyper-) graph. Elements are either trace links (edges) or trace link ends (nodes). The latter serve as proxies [GHJV95] of elements in another model. Working with such generic trace models is cumbersome and the definition of consistency rules over such trace models becomes rather involved. In practice, one distinguishes between different types of commonalities that are established only among elements having a specific type. For example, columns in A_3 can only be related to data objects in A_1 and attributes in A_2.
The default approach to capture this notion is the definition of a domain-specific trace metamodel [FKWVH19,SDZKR18], which contains domain-specific refinements of trace links and trace link ends. An overview of generic and domain specific trace-models, their metamodels and instantiation relationships is shown in Fig. 4b. Metamodels are depicted as packages and models as files at the bottom. Notice the double nature of the generic trace-metamodel: It can either be instantiated directly by a generic trace-model (1) or serve as the metamodel for a domain-specific trace-metamodel, which is instantiated by a domain-specific trace model (2). The domain specific trace metamodel is a suitable carrier for the definition of consistency rules. One may use Epsilon or other model management tools to implement verification and repair [SDZKR18].
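The difference between a generic trace model and its domain-specific refinement can be sketched as follows. The class names, the link type ColumnImplementation and the element identifiers are hypothetical; the typing table merely encodes the example restriction that columns in A_3 are related to data objects in A_1 and attributes in A_2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkEnd:
    """Proxy for an element that lives in another model."""
    model: str
    element_type: str
    element_id: str


@dataclass(frozen=True)
class TraceLink:
    """A (hyper-)edge connecting any number of link ends."""
    link_type: str
    ends: tuple


# Hypothetical domain-specific refinement: each link type lists the
# (model, element type) combinations it is allowed to connect.
ALLOWED_ENDS = {
    "ColumnImplementation": {
        ("A1", "DataObject"), ("A2", "Attribute"), ("A3", "Column"),
    },
}


def conforms(link: TraceLink) -> bool:
    """Check a trace link against the domain-specific typing."""
    allowed = ALLOWED_ENDS.get(link.link_type)
    if allowed is None:
        return False
    return all((e.model, e.element_type) in allowed for e in link.ends)


good = TraceLink("ColumnImplementation", (
    LinkEnd("A1", "DataObject", "patientInfo"),
    LinkEnd("A2", "Attribute", "Patient.history"),
    LinkEnd("A3", "Column", "history"),
))
bad = TraceLink("ColumnImplementation", (
    LinkEnd("A2", "Class", "Patient"),
    LinkEnd("A3", "Column", "history"),
))
```

A generic trace model would accept both links, whereas the domain-specific typing accepts good and rejects bad. The sketch also makes the proxy problem visible: each LinkEnd duplicates an identifier that must be kept in sync with the local model.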
Note that the existence of a separate trace model induces a new challenge: Because trace link ends are proxies of elements in other models, we have created a situation that requires n binary synchronisations, i.e. when an element in a local model changes, its proxy in the trace model must change as well. Compared to the solution in Section 3.4.1, the only major difference is the way of representing commonalities. The solution of using domain-specific trace models is very common in practice, see [FKWVH19,SDZKR18]. However, Drivalos et al. [DKPF09] reported that interoperability between solutions using trace-models is lacking because many implementations are created in an ad-hoc manner.

Requirement 3
Comprehensive Systems shall provide a formal foundation for domain-specific trace-models in model weaving.

Triple graph grammars
Finally, we investigate a formal approach that is based on an auxiliary commonality structure. TGGs [Sch94] are means for defining consistency rules between two structures (e.g. models) represented as graphs in a declarative manner. A TGG is a graph grammar [Roz97], which trades "ordinary" directed graphs for triple graphs. The latter is formally given by a pair of graphs (S, T) connected by a "correspondence graph" C that relates S and T via graph homomorphisms, resulting in a span S ← C → T. A single TGG (tg_0, R) comprises a start triple graph tg_0, e.g. one that is empty in all components, and a set of production rules R. In the default case, production rules are monotonic production rules, which are formally given by inclusion-morphisms r : L → R (L and R being triple graphs) that specify how the two structures evolve simultaneously. Intuitively, monotonic rules add elements to an existing context. In Fig. 5, we depict three exemplary production rules in an integrated presentation: Elements in R \ L are added to an existing context. These are highlighted in Fig. 5 by a shaded background and a ++-annotation. The remaining elements are members of L. The TGG induces a language: the set of all triple graphs producible by applying a sequence of production rules to the start triple graph. This language is a subset of all possible triple graphs and hence defines a consistency relation on the collection of all triple graphs or, equivalently, a relation between S and T if the middle part C is ignored. The language generated by the three rules in Fig. 5 (with an empty start triple graph) models the semantics of CR1 and CR2.
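The span S ← C → T underlying a triple graph can be written down directly as a data structure. The following sketch is an illustration only: the class names and the sample graphs are hypothetical, and it covers just the triple graph itself and the relation it induces between S and T, not rule application.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    nodes: set
    edges: dict = field(default_factory=dict)  # edge id -> (source, target)


@dataclass
class Morphism:
    node_map: dict
    edge_map: dict = field(default_factory=dict)


def is_homomorphism(m, dom, cod):
    """Total maps on nodes and edges that preserve sources and targets."""
    if set(m.node_map) != dom.nodes or set(m.edge_map) != set(dom.edges):
        return False
    if not set(m.node_map.values()) <= cod.nodes:
        return False
    for e, (src, tgt) in dom.edges.items():
        image = m.edge_map[e]
        if image not in cod.edges:
            return False
        if cod.edges[image] != (m.node_map[src], m.node_map[tgt]):
            return False
    return True


@dataclass
class TripleGraph:
    S: Graph
    C: Graph
    T: Graph
    s: Morphism   # C -> S
    t: Morphism   # C -> T

    def is_valid(self):
        return (is_homomorphism(self.s, self.C, self.S)
                and is_homomorphism(self.t, self.C, self.T))

    def induced_relation(self):
        """Pairs of S- and T-nodes related via the correspondence graph."""
        return {(self.s.node_map[c], self.t.node_map[c])
                for c in self.C.nodes}


# Hypothetical instance: one activity corresponds to one decision table.
tg = TripleGraph(
    S=Graph({"SelectConsultant", "SendInfo"},
            {"flow": ("SelectConsultant", "SendInfo")}),
    C=Graph({"c1"}),
    T=Graph({"ConsultantTable"}),
    s=Morphism({"c1": "SelectConsultant"}),
    t=Morphism({"c1": "ConsultantTable"}),
)
```

Because the structure has exactly two legs, s and t, it can relate elements of two graphs only; relating three or more models would require a different shape.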
TGGs are a declarative approach. The grammar rules are used to automatically derive programs for (incremental) model transformation [EEE+07, GW09], model matching [EEH08], consistency verification [LAS17] and update synchronisation [HEO+11, HEEO12]. This is done via a so-called "operationalisation" of the grammar rules [GHL10]. Finally, practitioners have used TGGs as a graphical language for the definition of consistency rules, which were used to generate an implementation in the Epsilon framework [FKWVH19,GdLKP10].
Classifying TGGs within Multi-Model Consistency Management, they come with automatic model matching, specialized verification and a rule-based repair mechanism, which are based on a declarative grammar specification. The commonality representation lies somewhere between model weaving and heterogeneous transformations: There is no explicit trace-model but commonalities appear implicitly as correspondence graphs in rule definitions.
In conclusion, TGGs combine an intuitive visual language with a powerful theoretical framework and tool support. However, triple graphs are by definition limited to binary situations and thus fail to capture the semantics of CR5. A generalisation of triple graphs for multi-ary situations has been introduced in the form of graph diagrams [TA15,TA16]. Graph diagrams allow the definition of multi-ary relations but require that the arity of all relations is known beforehand because their underlying schema is fixed. We doubt that the set of all necessary relations in a concrete use case can be known beforehand. Thus, we want to develop a formalism that can deal with these inter-relations more flexibly. In particular, we want to support the introduction of new relations at runtime.

Requirement 4
Comprehensive Systems must support the flexible expression of multi-ary correspondence relations and also support relations that may change their arity over time.

Comprehensive systems
In this section, we define comprehensive systems. We begin with reviewing a formalisation of (local) models and constraints imposed on them in Section 4.1. Afterwards, we develop the idea behind comprehensive systems intuitively along our running example in Section 4.2 before providing the formal definition of comprehensive systems using algebra and category theory in Section 4.3. In Section 4.4, we explore the theoretical properties of comprehensive systems. Moreover, we show that they generalise graph diagrams and triple graphs in Section 4.5. We conclude with a short discussion about the application of comprehensive systems in practice (Section 4.6), their current limitations and how they satisfy the requirements from Section 3.4.

Software model formalisation
In our example use case (Section 2), we employ three modelling languages: BPMN, UML and DMN. Each language is defined by a metamodel (syntactical representation of a class of models). These metamodels are themselves defined using the meta-metamodelling language Meta Object Facility (MOF) [Obj16b]. MOF is essentially a subset of the UML class diagram language comprising classes, attributes and references. A simplified version of the BPMN metamodel, termed M 1 , is depicted in Fig. 6a (clouds allude to concrete syntax). Metamodels M 2 and M 3 for UML class diagrams and DMN decision tables can be defined accordingly (excerpts of them are shown in Fig. 8). A metamodel defines the concepts of the language together with structural relations between concepts (signature) and structural integrity rules (formulas) over them. Structural integrity rules, e.g. multiplicities (0..1, 1..1), are used to enforce common domain-specific requirements. Often, the built-in mechanisms are not enough to encode all domain-specific requirements. Therefore, constraint languages such as EVL [KPP08] or OCL [WK99] exist, which allow the designer to attach arbitrary user-defined constraints to metamodels. Fig. 6a features an attached constraint φ : control flow, which is expressed as an OCL invariant defined in Listing 2.
Listing 2: Constraint φ := control flow formulated in OCL
    context Event inv control_flow:
        (self.type = EventType::START implies self.incoming->count() = 0) and
        (self.type = EventType::END implies self.outgoing->count() = 0)
MOF and its derivations, such as Ecore [SBMP08], are widespread. However, we do not endorse one particular modelling language and instead seek a technology-independent (= mathematical) formulation of models, metamodels and constraints. E-graphs [EEPT06] (see Fig. 6b) are one suitable formal interpretation of the MOF/class-diagram syntax and thus an appropriate base modelling language B (linguistic (meta-) metamodel in [Kü06]) to encode the abstract syntax graph defined by a metamodel. The E-graph language comprises the concepts Graph Nodes GN (complex types), Data Nodes DN (primitive types), as well as Graph Edges GE (associations) and Node Attribute Edges NAE (attributes) together with appropriate owner and target functions. For the sake of simplicity, we omit edge attribute edges, which are usually included in E-graphs.
It must be mentioned that our formalism is not "tied" to E-graphs: The formal definitions in Section 4.3 are based on arbitrary graph-like structures (see Definition 1 in Section 4.3), of which E-graphs are one concrete example. Hence, in the following, we use the term graph to refer to any kind of graph-like structure. Requiring that the content of a (meta-) model has a graph-like structure is not a major limitation since the majority of graphical and textual modelling languages admit such a representation.
Metamodels are instantiated by models, which are object graphs typed over the abstract syntax graph defined by the metamodel. Thus, ignoring their concrete syntax, the three models A 1 , A 2 and A 3 in Fig. 2 each form an object graph w.r.t. M 1 , M 2 , and M 3 , respectively. The instantiation relationship between a model A (object graph) and its metamodel M (class diagram) is formally represented by a graph homomorphism t : A → M . Let, for example, a be the element named "Diagnosis" in A 1 ; then t(a) = DataObject ∈ M 1 . Given a fixed metamodel M , we call the collection of all typing morphisms into M the model space Mod(M ) := {t : A → M | A is a graph}.
Graphs and graph homomorphisms alone are not enough since metamodels also comprise built-in (e.g. 1..1, 0..1) and attached (e.g. control flow) constraints. Generalised sketches [DW07,Dis97] and the Diagram Predicate Framework [RRLW12,RRLW09] (an MDE oriented adaptation of the former) provide an elegant, yet flexible formalism to express different kinds of constraints in a uniform way and generalise the four approaches presented in Section 3.2. The idea is to express constraints as diagrams (in the category theoretical sense) that are bound to graph elements. Constraint semantics are kept abstract, i.e. delegated to an arbitrary predicate library. This allows designers to implement constraint semantics using the formalism or tool of their choice: EVL/OCL [KPP08,WK99], (nested) graph conditions [HP09], First-Order Logic (FOL) [NEFE03], or arbitrary programming languages.

Fig. 7. Diagrammatic Constraints in a Nutshell
The idea behind diagrammatic constraints is sketched in Fig. 7. A constraint φ (similar to a formula in FOL) is formally given by a pair (p, b), which consists of a binding morphism b and a predicate p.
Predicates p are organised in an abstract but fixed predicate signature. It can be thought of as a library, which may contain the UML/MOF-constraints [RRLW09], functions on some base data types as in OCL [WK99], and logical connectives [Wol21]. Each predicate has a fixed arity graph $ar(p)$ and a semantic interpretation $[\![p]\!] \subseteq \mathrm{Mod}(ar(p))$, i.e. a chosen subset of graphs typed over the arity graph closed under renaming. Semantics are commonly defined by a boolean function $\mathrm{check}_p : \mathrm{Mod}(ar(p)) \to \{\mathrm{true}, \mathrm{false}\}$, which is equivalent to a subset: Given a model $i \in \mathrm{Mod}(ar(p))$, the check function applied to $i$ returns true if and only if $i$ belongs to the semantics ($\mathrm{check}_p(i) = \mathrm{true} \Leftrightarrow i \in [\![p]\!]$). We then say that $i$ satisfies $p$, written $i \models p$. As an example, consider the predicate target[1..1], which represents the UML multiplicity exactly one at the target side of an edge. The arity $ar(\mathrm{target}[1..1])$ of this predicate is a graph containing a single edge, and its semantics may be given as follows:
$[\ldots] \;\wedge\; \forall n \in GN_I,\ e, e' \in GE_I : \mathrm{owner}_I(e) = n = \mathrm{owner}_I(e') \Rightarrow e = e'$
The metamodel in Fig. 6a comprises several unnamed constraints, e.g. the edges src and trg both have the predicate target[1..1] attached to them. The named constraint φ : control flow exemplifies the binding of a more complicated predicate. Its semantics are defined by the OCL-code in Listing 2 and the arity (scope) is the subgraph of M 1 highlighted in Fig. 6a.
To check whether an M -model satisfies a constraint φ = (p, b), one first has to "pull" the respective typing homomorphism t : A → M "back" along the binding homomorphism b (i.e. querying the scope) and then verify membership of the result w.r.t. the semantics of p (i.e. invoking the check-function). When the binding homomorphism is injective, this "pulling-back" or querying operation simply means forgetting all parts of A which t maps to an element outside of the scope of φ. The result of querying a typed graph t : A → M is again a typed graph, now typed over the arity graph ar (p). If a set of diagrammatic constraints (a set of formulas given in a specific logic) is imposed on M , then the space Mod(M ) is reduced to the subset of all consistent models typed over M , i.e. those satisfying all constraints. The fact that instances are interpreted as typed graphs induces a default multiplicity for edges: If there is no additional multiplicity bound on the respective edge, this edge implicitly has multiplicity 0..* at both ends. This actually differs from the default multiplicity in UML [Obj15], which is 1..1 at the target side (0..1 at the source). Thus, for every metamodel in this paper it holds that, if there is no explicitly specified multiplicity, there is an implicit 0..* multiplicity at both ends.
To summarise the essence of generalised sketches/diagrammatic predicates: Software models and metamodels have a graph-like structure, models are typed graph-like structures, and the definition and verification of constraints requires the existence of chosen subsets and a "pulling-back" (query) operation.
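The two ingredients named in this summary, the query along an injective binding and the check function of a predicate, can be sketched as follows. The encoding of typed graphs as dictionaries, the function names and the element names (`Flow`, `f1`, `a1`, ...) are assumptions made for illustration only; the check implements the "exactly one owned edge" reading of target[1..1] for a single source type.

```python
def query(nodes, edges, scope_node_types, scope_edge_types):
    """Pull a typed graph back along an injective binding: keep only the
    elements whose type lies in the scope of the constraint.
    nodes: {node: type}; edges: {edge: (owner, target, type)}."""
    q_nodes = {n: t for n, t in nodes.items() if t in scope_node_types}
    q_edges = {e: v for e, v in edges.items() if v[2] in scope_edge_types}
    return q_nodes, q_edges


def check_target_exactly_one(nodes, edges, source_type):
    """Check function for target[1..1]: every node of the source type owns
    exactly one edge of the (already queried) graph."""
    owned = {n: 0 for n, t in nodes.items() if t == source_type}
    for owner, _target, _edge_type in edges.values():
        if owner in owned:
            owned[owner] += 1
    return all(count == 1 for count in owned.values())


def satisfies(nodes, edges, scope_node_types, scope_edge_types, source_type):
    """Constraint satisfaction = query the scope, then invoke the check."""
    return check_target_exactly_one(
        *query(nodes, edges, scope_node_types, scope_edge_types), source_type)


# Hypothetical BPMN-like instance: one flow with a source and a target.
nodes = {"f1": "Flow", "a1": "Activity", "a2": "Activity"}
edges = {"e1": ("f1", "a1", "src"), "e2": ("f1", "a2", "trg")}
ok = satisfies(nodes, edges, {"Flow", "Activity"}, {"src"}, "Flow")

# A second src edge on the same flow violates the multiplicity.
broken_edges = {**edges, "e3": ("f1", "a2", "src")}
broken = satisfies(nodes, broken_edges, {"Flow", "Activity"}, {"src"}, "Flow")
```

Skipping the query step would count the `trg` edge as well and wrongly report a violation, which is why the scope has to be queried before the check function is invoked.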

Intuition behind comprehensive systems
To align multiple models with each other in a multi-model, one needs a language to express commonalities between these models. As discussed in Section 3, the majority of contemporary mapping languages are bound to binary situations. A notable exception, which allows multi-ary correspondences, is the commonalities language by Klare and Gleitze [KG19].
    […] has input referencing DataObjectCorrespondence {
    10    = BPMN:Activity:consumes = DMN:Table:inputSideColumns }
    11  has output referencing DataObjectCorrespondence {
    12    = BPMN:Activity:produces = DMN:Table:outputSideColumns }
    13 }
Listing 3 demonstrates how the commonalities from our running example in Fig. 2 are expressed in this language. The keyword commonality initiates the definition of commonalities between instances of respective elements, which stem from disparate metamodels (referenced via the with keyword). Additionally, commonalities may be linked with each other (keywords has and referencing). In [KG19], commonalities are used to define expressions on them, which encode consistency rules. These expressions are translated into a so-called Reactions language, which provides event-based model modification facilities to perform consistency restoration. Their approach is part of the Vitruvius framework [KKL+21].
We do not want to go further into concrete details of this practical approach, but instead analyse the formal semantics of the code in Listing 3. Commonalities together with their attributes and references, again, form a graph. Consequently, it is reasonable to use the same graph-like language B for it. In such a way, the content of Listing 3 induces an E-graph, shown in Fig. 8. The elements of this graph are depicted using dashed lines and we call them commonality witnesses. Commonality witnesses reify a "tupling" of terms from disparate (meta-) models. They are defined via the with keyword in Listing 3, visualised as dashed arrows (p 1 , p 2 , p 3 ) in Fig. 8. These arrows are called projections and represent the fundamental innovation compared to the situation of only local models from Section 4.1. For example, lines 3-5 specify a commonality of the triple DataObject (M 1 ), Attribute (M 2 ), and Column (M 3 ) reified under the name DataObjectCorrespondence in M 0 . However, not only the nodes (of the graphs) are related: In Listing 3, we see how the keyword has defines how inherent features, i.e. edges, are related as well, e.g. line 4 specifies a commonality between the type attributes in M 2 and in M 3 . The same goes for the consumes/inputSideColumns, produces/outputSideColumns and name features of Activity (M 1 ) and DecisionTable (M 3 ). Common edges require that their respective source and target nodes are also related, e.g. the type-commonality depends on a commonality between Attribute and Column, which is already given by the surrounding commonality-statement, as well as commonality between Type and DataType (see line 2). Hence, commonality specifications must preserve edge-node-incidences. 
Since the commonality tuples can be of arbitrary arity, these mappings may be partial (highlighted by denoting them as arrows headed with a half-tick: ⇀):
$p_1(\mathrm{BaseType}) = \bot, \quad p_2(\mathrm{BaseType}) = \mathrm{DataType}, \quad p_3(\mathrm{BaseType}) = \mathrm{Type}$
The above required edge-node-incidence means that defined-ness of $p_j(e)$ entails defined-ness of $p_j(v)$, where $v$ is the owner of $e$ in $M_0$, and
$p_j(v) = \mathrm{owner}_{M_j}(p_j(e)) \qquad (4)$
for all edges $e$ in $M_0$ (and likewise for targets).
Hence, Listing 3 defines a comprehensive metamodel M in which commonalities are accurately specified with the help of (a graph of) commonality representatives. Formally, we obtain a new graph M 0 and partial projections p j : M 0 ⇀ M j for all j ∈ {1, . . . , 3}. Turning to the models typed over these metamodels, it becomes apparent that we can establish a corresponding construction relating elements in the domains of the typing homomorphisms t j : A j → M j . Such relations may be defined manually as in Listing 3 or using (semi-)automatic matching procedures, see Section 3.4.1, based on keys, metrics or ontological equivalence. Independently of how they are established, their formal representation is again a graph of commonality representatives A 0 and partial projections p A j : A 0 ⇀ A j for all j ∈ {1, . . . , 3}. This alignment of models is implicitly shown in Fig. 2. Each dashed circle (1a,1b,1c,1d,2,3) represents a commonality representative and each line ends at the value under the respective projection. Some of the lines are binary, while others are ternary. The complete content of Fig. 2 is called a comprehensive system where the dashed part represents the commonalities and the models A 1 , . . . , A 3 are the components.
Models A i are typed over their metamodels, i.e. there are typing morphisms t i : A i → M i which can be combined to one comprehensive typing of all components. This typing extends to A 0 as well because elements a j and a k (j ≠ k ) of model components A j and A k are relatable only if their types t j (a j ) and t k (a k ) are related via a representative w ∈ M 0 . Thus, the specification in Listing 3 defines the possible types of commonalities. This is the formal equivalent to domain-specific trace models, compare Section 3.4.2.
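Formally, the typing t 0 : A 0 → M 0 of commonality representatives has to be compatible with the projections: whenever $p^A_j(x)$ is defined for some $x \in A_0$, then $p_j(t_0(x))$ is defined as well and
$t_j(p^A_j(x)) = p_j(t_0(x)) \qquad (5)$
Condition (5) expresses that the typing extension t 0 integrates smoothly (respecting commonalities) into a typing of all parts of the comprehensive model, such that we end up with a single typed comprehensive system: t : A → M . Conditions (4) (compatibility of projections with owner/target) and (5) (compatibility of typing and projections) are visualized in Fig. 9, which shows an excerpt of the complete typed comprehensive system.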

Formal definition of Comprehensive Systems
In this section, we want to develop a precise formal definition of the structures described so far. For this, we resort to the mathematical language of category theory. This has several reasons. First of all, category theory allows for very concise definitions due to its abstract nature. Secondly, category theory offers a built-in mechanism called functor, which allows to compare two seemingly different formal structures. Finally, triple graphs and graph diagrams, which represent the most directly related formal approach, are formulated in terms of category theory as well. Thus, we can refer to them more easily using the same "language". This and the following Section 4.4 rely on the categorical concepts Category, Functor, Natural Transformation, Universal Construction (Pushouts and Pullbacks), and Partial Arrow Classifier. To make this paper self-contained, Appendix A contains a short overview over each of them. For a more detailed presentation, we refer to the introductory textbooks [BW90,Pie91,Wal92].
Intuitively speaking, a category (Definition 8; Appendix A) can be seen as a generalised pre-order or alternatively as a directed graph equipped with a (path) monoid. A category C comprises objects and morphisms a.k.a. arrows. We write |C| to denote the class of objects in C and Arr C for the class of all morphisms in C. If the class of objects is a set, the respective category C is called small. By convention, objects are denoted in capital letters (A, B , . . .) and morphisms in small letters (f , g, . . .). A morphism f is an abstract means to compare two objects A, B ∈ |C|, which are called domain (dom(f ) = A) and codomain (codom(f ) = B ) of f . One may think of it as an edge where domain and codomain represent source and target. Hence we will often denote them in an integrated arrow-notation f : A → B . A hom-set C(A, B ) is a subclass of Arr C and contains all morphisms that have A as domain and B as codomain. In addition to that, there is a unique identity morphism id A : A → A for each object A, and morphisms f : A → B , g : B → C with incident domain/codomain (B ) can be composed to yield a morphism g • f : A → C (spoken "g after f "). Composition is associative and neutral w.r.t. identities.
The most important example of a category is the category of sets and mappings SET. In this category, objects are given by sets and morphisms are given by mappings between sets, i.e. total functions. Functors (Definition 10; Appendix A.1) are means to compare different categories. They comprise two mappings: One for objects and one for morphisms. They must also ensure that identities and composition are preserved. A functor G : B → SET from a small category B into the category of sets and mappings SET is called a presheaf. Furthermore, there is a functor category $\mathrm{SET}^{B}$ (Fact 2; Appendix A.1), which has such functors as objects and whose morphisms are given by natural transformations (Definition 11; Appendix A.1) between them (think homomorphisms). Presheaves have some interesting properties: From a theoretical point of view they behave similarly to objects in SET [Gol06]. From a more practical point of view they are sufficiently concrete such that one can talk about elements: Saying x ∈ G means that there is some object s ∈ |B| such that x ∈ G(s). They have been called graph structures in [Lö93] and are closely related to algebras. A small category B can be interpreted as a signature with unary operation symbols only. A presheaf G "interprets" every (sort) object s ∈ |B| as a set G(s) and every (unary operation) morphism op : s → s' ∈ Arr B as a mapping G(op) : G(s) → G(s'). This is also called functorial or indexed semantics, and $\mathrm{SET}^{B}$ corresponds to the class of algebras for the signature B with unary operations only (think instance worlds of a metamodel). This also allows us to consider substructures F ⊆ G, given by sort-wise subset relations. Categorically, this is represented by an inclusion morphism F → G, which is a special monomorphism (Definition 16; Appendix A.2.2).
Finally, interpreting the diagram in Fig. 6b as the category B and setting G := M 1 (from Fig. 6a), G has one carrier set per sort of B, i.e. G(GN ), G(DN ), G(GE ) and G(NAE ), together with the respective owner and target mappings.
Definition 1 (Base Language B and graph-like structures G). Let B be a small category called base (modelling) language. The base language gives rise to a category of graph-like structures $G := \mathrm{SET}^{B}$ (presheaves).
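To make Definition 1 more tangible, the following minimal sketch encodes a graph-like structure as a family of carrier sets with total maps, and a homomorphism as a family of maps satisfying the naturality condition. It assumes a base language with only two sorts (`GE`, `GN`) and two operations (`owner`, `target`), i.e. plain directed graphs instead of full E-graphs; the function names and the tiny model are hypothetical.

```python
# Base language B: operation name -> (domain sort, codomain sort).
OPS = {"owner": ("GE", "GN"), "target": ("GE", "GN")}


def is_structure(carrier, maps):
    """carrier: {sort: set}; maps: {op: dict}. Every operation must be
    interpreted as a total function between the right carrier sets."""
    return all(set(maps[op]) == carrier[s] and set(maps[op].values()) <= carrier[s2]
               for op, (s, s2) in OPS.items())


def is_homomorphism(f, G, G_maps, H_maps):
    """f: {sort: dict} from G to H is natural iff
    f_{s'}(G(op)(x)) == H(op)(f_s(x)) for all op: s -> s' and x in G(s)."""
    return all(f[s2][G_maps[op][x]] == H_maps[op][f[s][x]]
               for op, (s, s2) in OPS.items() for x in G[s])


# A tiny model A and an excerpt of a metamodel M (names are illustrative).
A = {"GN": {"act", "diag"}, "GE": {"e"}}
A_maps = {"owner": {"e": "act"}, "target": {"e": "diag"}}
M = {"GN": {"Activity", "DataObject"}, "GE": {"produces"}}
M_maps = {"owner": {"produces": "Activity"}, "target": {"produces": "DataObject"}}

# Typing morphism t : A -> M and an ill-formed variant that swaps node types.
t = {"GN": {"act": "Activity", "diag": "DataObject"}, "GE": {"e": "produces"}}
t_bad = {"GN": {"act": "DataObject", "diag": "Activity"}, "GE": {"e": "produces"}}
```

In this reading, a typing morphism is simply a homomorphism into the metamodel, so `is_homomorphism(t, A, A_maps, M_maps)` holds while the swapped variant `t_bad` is rejected.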
We will now introduce two formal definitions to express our linguistic extension. The first definition is closer to practical implementations, while the second is closer to existing categorical frameworks. Both formulations will turn out to be equivalent.

Set-based definition
Let us fix a sufficiently large natural number n, the degree of the multi-model, and consider a synchronisation scenario with model spaces (Mod(M j )) j ∈{1,...,n} , e.g. BPMN, UML, DMN and so on. As a consequence, we will regularly be working with indices. By convention we will use i and j as index variables, where i ranges over 0 ≤ i ≤ n and j ranges over 1 ≤ j ≤ n, if not specified otherwise.
The build-up of a comprehensive system is similar to a graph-like structure (Definition 1) and encompasses local models (components) together with their commonalities (witnesses + projections):
Definition 2 (Comprehensive Systems, Components, Commonalities). A comprehensive system C consists of the following data:
1. For every $s \in |B|$ and $0 \le i \le n$, there is a set $C_i(s)$.
2. For every $op : s \to s' \in \mathrm{Arr}_B$ and $0 \le i \le n$, there is a total function $C_i(op) : C_i(s) \to C_i(s')$.
3. For every $s \in |B|$ and $1 \le j \le n$, there is a partial function $p^C_{j,s} : C_0(s) \rightharpoonup C_j(s)$,
such that for all $op : s \to s' \in \mathrm{Arr}_B$, $1 \le j \le n$ and $x \in C_0(s)$ the following statement holds:
$p^C_{j,s}(x) \text{ is defined} \;\Rightarrow\; p^C_{j,s'}(C_0(op)(x)) \text{ is defined} \qquad (6)$
and
$p^C_{j,s'}(C_0(op)(x)) = C_j(op)(p^C_{j,s}(x)). \qquad (7)$
The sets C j (s) together with the total maps C j (op) constitute the components, the sets C 0 (s) and total maps C 0 (op) constitute the commonality witnesses, and the partial functions p C j ,s represent the projections. Note that (6) and (7) generalise the edge-node-incidences mentioned in Section 4.2, compare (4).
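Conditions (6) and (7) can be checked mechanically. The sketch below assumes a base language with the two operations `owner` and `target` only, encodes partial functions as dictionaries that omit undefined arguments, and uses hypothetical witness names (`w_attr`, `w_dt`, `w_type`); it is meant as an illustration of the definition, not as a reference implementation.

```python
# Base language B: operation name -> (domain sort, codomain sort).
OPS = {"owner": ("GE", "GN"), "target": ("GE", "GN")}


def satisfies_incidence(maps, projections):
    """maps[i][op] is the total function C_i(op) as a dict (0 <= i <= n);
    projections[j][sort] is p_{j,sort} as a dict omitting undefined arguments."""
    for j, p in projections.items():
        for op, (s, s2) in OPS.items():
            for x, y in p[s].items():           # p_{j,s}(x) = y is defined
                x2 = maps[0][op][x]             # C_0(op)(x)
                if x2 not in p[s2]:             # (6): defined-ness is inherited
                    return False
                if p[s2][x2] != maps[j][op][y]:  # (7): projection commutes with op
                    return False
    return True


maps = {
    0: {"owner": {"w_type": "w_attr"}, "target": {"w_type": "w_dt"}},
    1: {"owner": {"type": "Attribute"}, "target": {"type": "DataType"}},
    2: {"owner": {"type": "Column"}, "target": {"type": "Type"}},
}
projections = {
    1: {"GN": {"w_attr": "Attribute", "w_dt": "DataType"}, "GE": {"w_type": "type"}},
    2: {"GN": {"w_attr": "Column"}, "GE": {}},  # partial: w_dt, w_type undefined
}
valid = satisfies_incidence(maps, projections)

# Projecting the edge witness without its target witness violates (6).
dangling = {**projections, 2: {"GN": {"w_attr": "Column"}, "GE": {"w_type": "type"}}}
invalid = satisfies_incidence(maps, dangling)
```

The second example shows why partiality cannot be chosen independently per sort: an edge witness may only be projected into a component if its owner and target witnesses are projected there as well.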

Definition 3 (Homomorphisms between Comprehensive Systems).
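Let C , D be comprehensive systems as defined in Definition 2. A homomorphism $f : C \to D$ between comprehensive systems is a family of mappings
$(f_{i,s} : C_i(s) \to D_i(s))_{0 \le i \le n,\ s \in |B|}$
compatible with (operation) arrows, i.e. $\forall i \in \{0, \ldots, n\},\ \forall op : s \to s' \in \mathrm{Arr}_B$:
$f_{i,s'} \circ C_i(op) = D_i(op) \circ f_{i,s} \qquad (8)$
and compatible with partial (projection) mappings: For all $j \in \{1, \ldots, n\}$, $s \in |B|$ and $x \in C_0(s)$:
$p^C_{j,s}(x) \text{ is defined} \;\Rightarrow\; p^D_{j,s}(f_{0,s}(x)) \text{ is defined} \qquad (9)$
$f_{j,s}(p^C_{j,s}(x)) = p^D_{j,s}(f_{0,s}(x)) \qquad (10)$
where we write f instead of f j ,s , if the indexing becomes clear from the context.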
Alternatively, we can visualize Definition 3 by a family of commutative cubes in SET, shown in (11) and indexed by all op : s → s ∈ B and 1 ≤ j ≤ n. Commutativity of the top and bottom faces encode that the projections in the comprehensive systems C and D fulfil (6)+(7), while left and right faces encode compatibility of f with operation arrows (8), and back and front faces encode compatibility of f with projections (9)+(10). Compare also this formal cube with the example in Fig. 9.
Definition 3 provides the material for formalising multi-models. A multi-model is a morphism t : A → M between two comprehensive systems A and M , where M is the correspondence definition, see Fig. 3. In our example M is the alignment of metamodels M 1 , M 2 , M 3 augmented with type commonalities defined in Listing 3 and partly visualized in Fig. 8. The comprehensive system A typed over M is shown in Fig. 2. Members of A 0 are all dashed circles and p A j ,s assigns to each circle a line end in model A j , where s is the respective element type (node or edge). The mapping definition of the typing homomorphism t is implicitly given by the concrete syntax and the legend in Fig. 2. See also Fig. 9.
Equations (9) and (10) (f substituted by t) reflect the demanded property (5), i.e. compatibility of commonalities and typing. This can be seen in Fig. 2: the commonality 2 must connect a class with a data object for instance.
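The compatibility of typing and commonalities can be illustrated for a single sort and a single component. The following sketch assumes that partial projections are dictionaries omitting undefined arguments; the witness name `c2` and the function name are hypothetical, while `Diagnosis`, `DataObject` and `DataObjectCorrespondence` are taken from the running example.

```python
def respects_projections(f0, fj, p_C, p_D):
    """f0: C_0 -> D_0 and fj: C_j -> D_j are total dicts; p_C and p_D are
    partial projections given as dicts that omit undefined arguments."""
    for x, y in p_C.items():          # p^C_j(x) = y is defined
        if f0[x] not in p_D:          # (9): the image must be projected as well
            return False
        if p_D[f0[x]] != fj[y]:       # (10): projecting and mapping commute
            return False
    return True


# Typing of one commonality in component 1 (the BPMN side).
p_A1 = {"c2": "Diagnosis"}                          # projection in the model
p_M1 = {"DataObjectCorrespondence": "DataObject"}   # projection in the metamodel
t0 = {"c2": "DataObjectCorrespondence"}
t1 = {"Diagnosis": "DataObject", "Select Consultant": "Activity"}

well_typed = respects_projections(t0, t1, p_A1, p_M1)
# Linking the same witness to an activity contradicts its declared type.
ill_typed = respects_projections(t0, t1, {"c2": "Select Consultant"}, p_M1)
```

The rejected variant corresponds to a commonality line in Fig. 2 ending at an element whose type is not covered by the respective type commonality.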
Proposition 1 Comprehensive Systems together with their homomorphisms constitute a category CS.
Proof. An identity is a family of identities, composition is composition of mappings f j ,s . This yields neutrality and associativity. Moreover, composed homomorphisms are still compatible with the inner structure (op,p i,s ). Whereas this follows in the usual way for op : s → s , transitivity of the defined-ness implication in (9) also yields compatibility with partial functions.

Span-based definition
An alternative approach for encoding commonality relations in a multi-model is to use spans. This approach was used by the present authors in previous works [KD17,SKLR18]. Its formulation avoids SET-based concepts and is based on the categorical concept of a diagram. Recall that the semantic interpretation of Listing 3 is a family of n partial G-morphisms (m j : M 0 ⇀ M j ) 1≤j ≤n . The latter can formally be expressed by a special diagram functor M : I → G, where the schema category I has the star-shape defined in (12) (identity arrows of I are omitted). Additionally, these diagram functors are subject to the condition that M maps the inner edges (10, . . . , n0) to monomorphisms. This condition is due to a well-known categorical construction [RR88], which expresses partial morphisms as classes of binary spans (Definition 21; Appendix A.3).
We call these functors multi-span relations because spans are the categorical counterpart of relations.

Definition 4 (Multi-Span Relation).
A functor M : I → G where the image of M (j 0) for all 1 ≤ j ≤ n is a monomorphism is called a multi-span relation.
Multi-span relations are functors, hence we can relate them by natural transformations (families of G-morphisms). The latter are called multi-span relation morphisms.

Definition 5 (Multi-Span Relation Morphism).
Let M and N be two multi-span relations. A multi-span relation morphism f : M → N is a family of G-morphisms, depicted by the j -indexed family of diagrams (1 ≤ j ≤ n) in (13) with the condition that squares (i ) and (ii ) commute.
Proposition 2 Multi-span relations together with their morphisms establish a category M.
Proof. Follows immediately from the fact that M ⊆ G I is a full subcategory of the functor category G I .

Equivalence of definitions
The following theorem shows the useful fact that the set-based definition of comprehensive systems in Section 4.3.1 and the span-based definition of multi-span relations in Section 4.3.2 are equivalent. The span-based definition depicts commonalities externally while comprehensive systems internalise them. Thus, we may use M as a drop-in replacement for CS and vice versa. The external notion M turns out to be easier to handle in the theoretical considerations in Section 4.4, while the internal notion CS is more closely aligned with the definition of local models (functors into SET) and therefore easier to implement in concrete tools.

Theorem 1 (Equivalence of Categories). CS ∼ M.
Proof. See Appendix B.1. A part of the proof relies on the fact that the category of (small) categories is cartesian closed, i.e. there is an equivalence between functor categories $\mathrm{SET}^{B \times I} \sim (\mathrm{SET}^{B})^{I}$.
In the following, we speak only of comprehensive systems, bearing the above equivalence in mind.
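The core idea behind this equivalence, at the level of a single sort and a single projection, is that a partial function and a span with an injective left leg carry the same information. The following sketch illustrates this translation; the function names `to_span` and `from_span` are hypothetical, and the sketch deliberately ignores operations and morphisms, which the actual proof in Appendix B.1 has to handle.

```python
def to_span(p, witnesses):
    """Split a partial map p (a dict omitting undefined arguments) into a span
    witnesses <-incl- dom -total-> component with an injective left leg."""
    dom = {x for x in witnesses if x in p}  # domain of definition
    incl = {x: x for x in dom}              # inclusion (monomorphism)
    total = {x: p[x] for x in dom}          # total right leg
    return incl, total


def from_span(incl, total):
    """Recover the partial map: x |-> total(d) for the unique d with incl(d) = x.
    Uniqueness of d relies on incl being injective."""
    return {incl[d]: total[d] for d in incl}


witnesses = {"BaseType", "DataObjectCorrespondence"}
p1 = {"DataObjectCorrespondence": "DataObject"}  # p_1(BaseType) is undefined
incl, total = to_span(p1, witnesses)
round_trip = from_span(incl, total)
```

The round trip returns the original partial map, which is the element-level reason why the internal (set-based) and external (span-based) views can be exchanged for one another.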

Formal properties
In the following, we investigate the formal properties of comprehensive systems, which demonstrates their theoretical utility as a foundation for multi-modelling. They fulfil all formal requirements for applying existing frameworks for model verification and model transformation.

Consistency verification
Arguably the most important feature in Multi-Model Consistency Management is a means for consistency verification. The diagrammatic constraint framework [RRLW12, RRLW09, DW07, Dis97] demonstrated in Section 4.1 generalises many established verification tools and approaches. To be applicable on a certain class of formal structures, the latter must form a category, which possesses all pullbacks (Definition 15; Appendix A.2.2).
Theorem 2. The category CS has all pullbacks.
Proof. See Appendix B.2. The proof is carried out component-wise and involves some diagram chasing using the universal property of pullbacks.
Theorem 2 guarantees that we (theoretically) can apply mature consistency verification methods. We will now demonstrate how to use multiplicities and OCL invariants for implementing CR1-CR5 from Section 2. Here, we will also utilize Theorem 1. The latter allows us to "internalize" projections and commonalities, i.e. "flattening" the linguistic extension by interpreting projections and commonalities as edges and nodes. Thus, they can equally be carriers for diagrammatic constraints. Reconsider Fig. 8, this time paying special attention to multiplicities and OCL constraints on the dashed part: The elements of M 0 become regular nodes with edges and attributes. The projections p j become edges that come with an implicit 0..1 multiplicity at the target side (= partial function). To navigate these elements, the comprehensive systems framework may enhance the OCL library with some helper methods, shown in Listing 4, which allow navigating projections in a forward (projection) and backward (commonalities) direction. The following list explains the consistency rule implementations (CRIs) of the rules from Section 2.
CRI1 Is implemented with an OCL-invariant attached to Activity, which requires existence of the respective commonality if the activity is a BUSINESS RULE:
    context Activity inv:
        self.type = ActivityType::BUSINESS_RULE implies self->commonalities()->count() = 1
CRI2 Is implemented via a 1..1-multiplicity at the end of projection p 1 on output and input together with a 1..1-multiplicity at the source of p 3 on the same elements. The implicit source-edge-incidence guarantees that owner/target relationships are also respected. Furthermore, it is important to note that multiplicities on projections of edges are conditional because they depend on other commonalities, i.e. they are only enforced if the respective owner-commonality exists.
CRI3 Is implemented by an OCL-invariant that checks that exactly one type of commonality exists.
This list exemplifies that multiplicities alone are already "enough" to model many common consistency rules by intelligently imposing them on projections. To make this mechanism more "user-friendly", one may think of a catalogue of frequent commonality constraints. One example is the ForAll [DKPF09] constraint: "For every element of type X there exists a related element of type Y ." This translates into a 1..1-multiplicity at the source of the projection going into X and a 1..1-multiplicity at the target of the projection going into Y . Another common case is the PropertyConsistency constraint: "For two R-related elements x and y the values of the properties x .p and y.q must be equal". The comprehensive system representation of this constraint will encompass a commonality R having a property, which projects to p and q. Then, the implicit node-edge-incidence performs the necessary check. An empirical investigation of such common constraints is an interesting future research direction.
However, not all consistency rules can be implemented this way, as seen above. In these cases, one can resort to the expressive power of a constraint language such as OCL to define arbitrary user-defined constraints. Given that one can resort to arbitrary OCL-invariants makes this framework very expressive [MC99], but it lacks a reasoning system. The latter is useful for automatic analysis of inconsistencies [SLO18] and/or automatic consistency restoration [SLO19], which is another interesting direction for future investigations.
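The backward navigation used by CRI1 can be mimicked outside of OCL. The following sketch assumes that the projection p 1 is given as a dictionary from commonality witnesses to BPMN elements; the function names, the witness name `w1`, the activity `Register` and the activity type `USER` are hypothetical and only serve as an illustration of the invariant.

```python
def commonalities(element, projection):
    """Backward navigation: all witnesses whose projection is the element."""
    return [w for w, image in projection.items() if image == element]


def check_cri1(activities, projection):
    """activities: {name: activity type}. Every BUSINESS_RULE activity must
    be the projection of exactly one commonality witness."""
    return all(len(commonalities(name, projection)) == 1
               for name, kind in activities.items() if kind == "BUSINESS_RULE")


activities = {"Select Consultant": "BUSINESS_RULE", "Register": "USER"}
p1 = {"w1": "Select Consultant"}  # partial projection into the BPMN model

consistent = check_cri1(activities, p1)
inconsistent = check_cri1(activities, {})  # the commonality is missing
```

Activities of other types are ignored by the check, which reflects the `implies` in the OCL invariant.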

Advantages over model merge
Alternatively, we could have tried to formulate CR1-CR5 utilizing model merge [SNL + 07]. The latter is often considered to be the standard approach for verifying consistency of multiple related models [KM18,KMCD19]. Formally, model merging can be defined by calculating a colimit object [DXC11, KD17, Gog73]: Every object in M represents a diagram in G and the colimit object of this diagram is the merged model, a graph A + ∈ G. Intuitively, this result can be described as the union of all components wherein elements related by commonalities are identified. For example, in the merge of models A 1 , A 2 , A 3 in Fig. 2 the data object "Diagnosis" ∈ A 1 , the attribute "shortDesc" ∈ A 2 and the column "Diagnosis" ∈ A 3 will be merged into the same element, say Diag/descr of type DataObjectCorrespondence.
There are, however, global consistency rules that cannot be realised as a constraint on a merged model. This holds especially for rules, which depend on the knowledge of the membership in local models, because the latter information is lost in the merge.
This can be demonstrated with consistency rule CR1, which relies on the containment of elements (in this case containment in $A_1$ and $A_3$). After merging $A_1$ with $A_3$ there is only a single node representing "Select Consultant" and there is no way of telling whether this node had a representation in $A_1$ and $A_3$. We only know that it was present somewhere. In contrast, we do not lose this differentiation in comprehensive systems and can successfully check the validity of CR1.
Simultaneously, comprehensive systems can express everything that is expressible with constraints on a merged model, by including the respective computations in the verification procedure. Let us reconsider the example from Fig. 1. In the introduction, we mentioned that the trace model (Fig. 1e) is able to uncover the inconsistency just as the merge model (Fig. 1d) does. An OCL implementation is shown in Listing 5. The central ingredient is the definition of the derived property globalSuper, which aggregates the super-class information for every class over all models $A_1$, $A_2$, $A_3$. This is done by iterating over all commonalities. This principle of aggregating a property over all related elements can be applied universally. A generic algorithm is described in [KD17,SKLR18]. Finally, the absence of cycles is checked in the invariant noCycles, which is based on this derived property.
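The aggregation principle can also be illustrated outside of OCL. The following Python fragment is a minimal sketch and not a transcription of Listing 5: the data layout (`local_super`, `commonalities`), the function names and the class names in the example are assumptions made for this illustration only, and commonalities are assumed to be pairwise disjoint. Classes are identified by (model, name) pairs; the derived relation collects the superclasses of all related classes and is then checked for cycles.

```python
from collections import defaultdict

def global_super(local_super, commonalities):
    """Aggregate the local superclass relation over all commonalities.

    local_super:   {class: set of direct superclasses in the same model}
    commonalities: iterable of disjoint sets of classes regarded as related
    Returns the superclass relation between commonality representatives.
    """
    rep = {}
    for group in commonalities:
        r = min(group)                      # deterministic representative
        for cls in group:
            rep[cls] = r
    merged = defaultdict(set)
    for cls, supers in local_super.items():
        for sup in supers:
            merged[rep.get(cls, cls)].add(rep.get(sup, sup))
    return dict(merged)

def no_cycles(graph):
    """Depth-first search for a cycle in a relation given as {node: successors}."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = defaultdict(int)

    def visit(n):
        colour[n] = GREY
        for m in graph.get(n, ()):
            if colour[m] == GREY:
                return False                # back edge: cycle found
            if colour[m] == WHITE and not visit(m):
                return False
        colour[n] = BLACK
        return True

    return all(colour[n] != WHITE or visit(n) for n in list(graph))

# Hypothetical example: each model is acyclic on its own, but the two
# inheritance edges contradict each other once related classes are identified.
local_super = {
    ("A1", "Employee"): {("A1", "Person")},
    ("A2", "Person"): {("A2", "Employee")},
}
commonalities = [
    {("A1", "Person"), ("A2", "Person")},
    {("A1", "Employee"), ("A2", "Employee")},
]
```

In this sketch the local relation passes the cycle check while the aggregated relation does not, which mirrors the situation in which an inconsistency only becomes visible globally.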

Transformations
"Model transformations are the heart and soul of MDE" [SK03]. A mature, widespread and declarative (rule-based) approach to model transformations is given by the graph transformation framework, see Section 3.3. The framework is heavily based on the categorical universal construction of a pushout (Definition 17; Appendix A.2.3). To apply graph transformation to a certain class of structures of interest, one first has to show that they form a so-called weak adhesive HLR category [EP06] w.r.t. M, where M is a special sub-class of admissible monomorphisms in the respective category.
Corollary 1 CS is a weak adhesive HLR category w.r.t. M.
Proving this corollary requires verifying the existence of pushouts (where some morphisms of the pushout diagram belong to the special class M) in our category CS (or, equivalently, M) and checking whether pushouts have the so-called (weak) van Kampen property (Definition 18; Appendix A.2.3) [LS04,EP06]. The latter enforces a well-behaved interplay between pushouts and pullbacks. Yet, Tobias Heindel showed in his PhD thesis [Hei10a] that it equivalently suffices to show the existence of (i) pushouts along M-morphisms and (ii) M-partial-arrow classifiers (Definition 24; Appendix A.3), and that (iii) pushouts are preserved by pullbacks. This is the strategy we are going to use to prove Corollary 1. First, we have to define the class of admissible monos for our category of comprehensive systems. It turns out that we cannot choose all monomorphisms: For example, let (m : A → B, f : A → C) be a span of CS-morphisms. If there is an incomplete commonality specification in A containing a commonality representative that relates fewer elements than its images in B and C, the pushout construction may produce a commonality specification D in which the projection is no longer well-defined. This effect has been studied in [KFST19, Ex.6.] as well. Thus, we cannot expect the existence of pushouts in general.
However, we claim that for M being the class of reflective monomorphisms, CS becomes a weak adhesive HLR category; in particular, pushouts along M-morphisms exist. A reflective monomorphism is a comprehensive system morphism m as defined in Definition 3 where every $m_{s,i}$ is injective and, additionally, the implication in (9) is turned into an equivalence. Thus, "defined-ness" of a projection is not only preserved but also reflected.
Since CS ∼ M, there is an equivalent formulation of this condition in M: an M-monomorphism where, additionally, the squares (i) in (13) are pullbacks is called a reflective monomorphism.
Think of monomorphisms m : C → D as models of insertion: elements that are not in the image of m are thought of as being added by D to the existing context C. Reflective monomorphisms are then not allowed to make projections "defined" in the target D for witnesses that already exist in C.
Example 1 (Non-reflective CS-Monomorphisms). Let $G := SET$ and $n = 2$. Further, let L and R be two comprehensive systems with $L_1 = R_1 = \{A\}$, $L_2 = R_2 = \{B\}$, and $L_0 = R_0 = \{C\}$. The projections are defined such that $p^R_1(C) = A$ is defined, whereas $p^L_1(C)$ is undefined. Now let $m : L \to R$ be the comprehensive system morphism that is component-wise the identity. This morphism is monic but not reflective, since defined-ness of $p^R_1$ is not reflected.
Example 1 illustrates a non-reflective monomorphism. From a practical point of view, this property prevents the dynamic changes of the arity of a commonality. Next, we have to show that M is admissible [RR88].
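The difference between preserving and reflecting defined-ness can be checked mechanically for finite set-based systems. The following Python sketch is only an illustration under stated assumptions: the dictionary layout (`carrier`, `proj`) and all function names are hypothetical, partial projections are modelled as dictionaries with missing keys, and the second projection is assumed to map C to B in both systems, since the example text does not fix it.

```python
def preserves_projections(L, R, m):
    """Morphism condition: wherever p^L_j is defined, p^R_j is defined on the
    image and both projections agree."""
    return all(
        R["proj"][j].get(m[0][c]) == m[j][x]
        for j, p in L["proj"].items()
        for c, x in p.items()
    )

def is_injective(mapping):
    return len(set(mapping.values())) == len(mapping)

def reflects_definedness(L, R, m):
    """If p^R_j is defined on m(c), then p^L_j must already be defined on c."""
    return all(
        c in L["proj"][j]
        for j in R["proj"]
        for c in L["carrier"][0]
        if m[0][c] in R["proj"][j]
    )

def is_reflective_mono(L, R, m):
    return (preserves_projections(L, R, m)
            and all(is_injective(mi) for mi in m.values())
            and reflects_definedness(L, R, m))

# Data in the spirit of Example 1 (n = 2); index 0 holds the commonalities.
carrier = {0: {"C"}, 1: {"A"}, 2: {"B"}}
L = {"carrier": carrier, "proj": {1: {}, 2: {"C": "B"}}}          # p^L_1(C) undefined
R = {"carrier": carrier, "proj": {1: {"C": "A"}, 2: {"C": "B"}}}  # p^R_1(C) = A
m = {i: {x: x for x in carrier[i]} for i in carrier}              # component-wise identity
```

Here `m` satisfies the morphism condition and is injective in every component, but `is_reflective_mono(L, R, m)` fails because the projection onto component 1 becomes defined only in the target.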

Proposition 3
The class of all reflective monomorphisms is an admissible class M of monos, i.e.
• it contains all isomorphisms,
• it is closed under composition,
• it is stable under pullback.
Proof. See Appendix B.3. The proof is carried out by diagram chasing and using the universal property of pullbacks. Now, one can show the existence of pushouts along M-morphisms, i.e. for spans where one of the legs is a reflective CS-monomorphism.

Theorem 3 CS has pushouts along M morphisms.
Proof. See Appendix B.4. The proof is largely carried out by component-wise considerations. The last part, however, requires a set-wise consideration to ensure that projections are well-defined.
The next part of the proof, following Heindel's approach, concerns partial arrow-classifiers (Definition 24; Appendix A.3). Intuitively, a partial-arrow classifier adds a substructure to a given object that represents "error" (failed computations or unmappable elements). It is similar to the java.util.Optional data type in Java or the Maybe-monad in Haskell. In SET, the partial arrow-classifier adds a ⊥-element to a given set. In the context of van Kampen squares, this construction becomes relevant because it turns out to represent a right-adjoint  (f , m ).
Proof. It is straightforward to prove this property from the fact that pushouts in G are stable under pullbacks [LS04]. Thus, we can apply the fact that pushouts and pullbacks are constructed component-wise, compare proofs of Theorem 2 and Theorem 3, and that stability of pushouts under pullbacks holds for each component in G.
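The intuition behind the partial-arrow classifier in SET, namely adding a ⊥-element and turning partial maps into total ones, can be made concrete as follows. This Python fragment is a minimal sketch; the sentinel `BOTTOM` and the function names are assumptions of this illustration, and partial maps are represented as dictionaries that are defined only on some keys.

```python
BOTTOM = object()   # hypothetical stand-in for the added ⊥-element

def lift(carrier):
    """The carrier extended by the ⊥-element."""
    return set(carrier) | {BOTTOM}

def classify(partial, domain):
    """Turn a partial map on `domain` into a total map into the lifted codomain."""
    return {a: partial.get(a, BOTTOM) for a in domain}

def unclassify(total):
    """Recover the partial map by forgetting everything sent to ⊥."""
    return {a: b for a, b in total.items() if b is not BOTTOM}

# Example: a partial map {1, 2} -> {"a", "b"} that is undefined on 2.
partial = {1: "a"}
total = classify(partial, {1, 2})
```

The two directions are mutually inverse, which is the set-level content of the analogy with Optional and Maybe: a partial map into B corresponds to exactly one total map into B extended by ⊥.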

Comparison with triple graphs and graph diagrams
In Section 3.4.3, we briefly introduced triple graphs, which are similar to comprehensive systems; both of them are based on graph-like structures and their formulation is given in categorical terms. The original formulation by Schürr [Sch94] was based on directed multi-graphs. It was later reformulated by Ehrig et al. [EEE+07] in terms of a functor category $G^X$ and abstracted into the framework of weak adhesive HLR categories [EP06], i.e. G being an arbitrary weak adhesive HLR category. The schema category $X_{TGG}$ has the shape of a span, depicted in (15): $1 \leftarrow 0 \rightarrow 2$. Thus, the solution space is limited to binary scenarios. Trollmann and Albayrak [TA15,TA16] generalised the TGG framework to cope with multiple models within a graph diagram (GD) framework. The idea is to allow for different types of schema categories X, which must satisfy the condition that the set of objects can be divided into two disjoint sets of models N and relations R, i.e. $|X| = R \uplus N$. All non-identity morphisms are required to have a domain in R (relations) and codomain in N (models). Further, there is at most one arrow in $Arr_X(r, m)$ for fixed r ∈ R and m ∈ N. In such a way, graph diagrams, i.e. functors D : X → G, can specify relations of different arities. Graph diagrams (GD) subsume TGGs, with $R = \{0\}$ and $N = \{1, 2\}$.
They are, however, static: If r ∈ R has k outgoing morphisms with targets m 1 , ..., m k ∈ N , D(r ) is a k -ary correspondence relation with representatives which relate to exactly one element in each of the k models D(m j ). Consequently, the schema category has to change each time a new relation is added!
In the remainder of this section, we show that our framework is more general than graph diagrams $G^X$ for the case that G is a presheaf ($G = SET^B$) in that there is an embedding functor $T : G^X \to CS$. The latter further preserves pushouts, which model derivations in Graph Diagram Grammars (GDG). Hence, we are able to replay all TGG/GDG-computations in our framework, yet being able to cope with new relations without changing the schema category, compare Requirement 4 in Section 3.4.3.
In the following, we write $\coprod_{i \in I} D_i$ to denote the coproduct (Definition 3; Appendix A.2.1) of a collection $(D_i)_{i \in I}$ of G-objects. Note that a collection $(f_i : D_i \to D)_{i \in I}$ of morphisms yields a mediating morphism $\coprod_{i \in I} D_i \to D$ by the universal property of coproducts, i.e. the morphism which acts as $f_i$ on each $D_i$. Further, we introduce a shorthand notation:
By Theorem 1, it suffices to define a functor from $G^X$ to M. The composition of this functor with the equivalence yields the desired result. This functor will also be called T.

Definition 7 (Translation Functor T). Let a schema category X for graph diagrams be given with $|X| = R \uplus N$ and let n be the cardinality of N. Without loss of generality, we assume $N = \{1, \ldots, n\}$. Let D be a graph diagram, then we define a multi-span relation $M := T(D)$ intuitively as follows (recall the schema in (12)):
Morphisms M (jj ) are the unions of the domains of those morphisms that have target D(j), and inclusions arise from the fact that coproducts in the above definition of $M(-j)$ (taken over some relations) are always subgraphs of the complete coproduct $M(0)$ (which is taken over all relations).
The definition of T on arrows is straightforward and we give it only informally: If $n : D \Rightarrow D'$ is an arrow (natural transformation) between graph diagrams, then (1) $T(n)_i$ is a morphism which acts in the same way as $n_i$ on D(i), if i > 0, (2) it amalgamates the actions of n on relations, if i = 0, which (3) naturally restricts to the respective actions, if i < 0. It is then easy to see that T(n) is a natural transformation.
We illustrate the construction in Definition 7 using the graph diagram production rule depicted in the left half of Fig. 10. The figure shows a production rule r : B → A in an integrated way, where B (before) and A (after) are graph diagrams ($A, B \in G^X$). A contains all elements shown in Fig. 10 and B contains only those elements that are not shaded and lack the ++-annotation, compare Fig. 5 in Section 3.4.3. The set of models in X is a three-element set $N = \{1, 2, 3\}$, representing the three model spaces for BPMN, UML and DMN. The relation set in X contains four elements, $R = \{(1, 2), (2, 3), (1, 3), (1, 2, 3)\}$, representing all binary relations between the three model spaces and the ternary relation between all of them. Elements of R are tuples and morphisms in X are projections $\pi^R_N : R \to N$. This schema is visualised by compartments in Fig. 10, where each compartment depicts a graph (object in G), i.e. the image A(x) (B(x)) for an x ∈ |X|. We introduce the notation $G_x := G(x)$, and if x is a tuple we may omit parentheses. The application of the translation functor T on A produces a comprehensive system M with degree $n = 3$, which is depicted in the right half of Fig. 10. The graphs M(j) are identical with $A_j$ (1 ≤ j ≤ 3). The commonalities graph M(0) is the coproduct (disjoint union) of $A_{1,2}$, $A_{2,3}$, $A_{1,3}$ and $A_{1,2,3}$, i.e. the nodes {bt, dca, dc}. The domain of definition M(−1) for the projection on component 1 is the coproduct of $A_{1,3}$, $A_{1,2}$ and $A_{1,2,3}$.
Proof. See Appendix 6.
We obtain as a consequence:
Corollary 2 Every sequence of rule applications in $G^X$ has a unique representation of corresponding rule applications in CS and hence can be replayed in the general framework of comprehensive systems.
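For the special case G = SET, the object part of the translation described above can be sketched in a few lines. The following Python fragment is an illustration under assumptions, not the construction from Definition 7 itself: the input layout (`models`, `relations` with `elems` and `legs`), the function name `translate` and the element names in the example are all hypothetical. Commonalities are formed as the disjoint union of all relation carriers, and each projection is a partial map defined exactly on the representatives of relations that have a leg into the respective model.

```python
def translate(models, relations):
    """Set-based sketch of the translation of a graph diagram.

    models:    {j: set of elements of model j}
    relations: {r: {"elems": set of representatives,
                    "legs": {j: {representative: element of model j}}}}
    """
    # M(0): disjoint union over all relations, tagged by the relation name.
    commonalities = {(r, e) for r, rel in relations.items() for e in rel["elems"]}
    # p_j: partial projection; its key set plays the role of M(-j).
    projections = {j: {} for j in models}
    for r, rel in relations.items():
        for j, leg in rel["legs"].items():
            for e, target in leg.items():
                projections[j][(r, e)] = target
    return {"models": dict(models),
            "commonalities": commonalities,
            "projections": projections}

# Hypothetical diagram with one binary and one ternary relation.
models = {1: {"t1"}, 2: {"c1"}, 3: {"d1"}}
relations = {
    (1, 2): {"elems": {"x"}, "legs": {1: {"x": "t1"}, 2: {"x": "c1"}}},
    (1, 2, 3): {"elems": {"y"},
                "legs": {1: {"y": "t1"}, 2: {"y": "c1"}, 3: {"y": "d1"}}},
}
M = translate(models, relations)
```

Adding a further relation to `relations` changes only the data and not the shape of the result, which is the point of the comparison with a fixed schema category.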

Comprehensive systems for consistency management
Finally, we discuss the role of comprehensive systems in the conceptual Multi-Model Consistency Management process introduced in Section 3 and visualised in Fig. 3. Note that the artefacts Correspondence Definition and Multi-Model as well as the activities Metamodel Alignment and Model Alignment have a shaded background to highlight the activities that are concerned with the creation of comprehensive systems. A correspondence definition is built from given metamodels and consistency rules and is formally represented by a comprehensive system M, defined using a suitable DSL such as the one in Listing 3. A multi-model is built from local models and commonalities and is formally represented as a comprehensive system morphism t : A → M, see Section 4.2. The added value of using these artefacts instead of simply working with a collection of models and commonalities (trace model) is that they provide a global view (like model merging), where one can reuse existing means for verification and repair.
Comprehensive systems have a structure similar to that of local models and, theoretically, they allow existing methods for consistency verification to be applied, see Section 4.4.1. In particular, we can use established technologies, such as MOF-based modelling languages to encode comprehensive systems and OCL/EVL to encode consistency rules. A prototype implementation based on EMF has been started.
Moreover, comprehensive systems can be used in different approaches for model repair. On the one hand, using the translation pioneered by Courcelle [Cou97], every graph, typed graph, E-graph and thus also comprehensive system can be translated into first order logic (monadic second order logic). Let C be a comprehensive system and recall Definitions 2 and 3: Every membership $x \in C_i(s)$ becomes a unary predicate $\mathit{in}_{C_i,s}(x)$; operation mappings $C_i(op)(x) = y$ become binary predicates $op_{C_i}(x, y)$. The same principle applies for projections $p^C_{j,s}$ and homomorphism components $f_{i,s}$. Additionally, we have to add axioms that force $C_i(op)$ and $f_{i,s}$ to be total functions (left total and right unique), $p^C_{j,s}$ to be partial functions (right unique), as well as the conditions in (6)+(7), (10), and (9)+(8). When consistency rules are encodable in FOL, we can utilise optimized off-the-shelf SAT/SMT-solvers, e.g. the popular model finder Alloy [Jac16], or resolution procedures such as Prolog [CR96] to perform consistency verification and search-based model repair (finding a model satisfying the formulas). However, it must be noted that this naive translation will most likely run into complexity issues.
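The functional axioms mentioned above can be made tangible on a finite structure. The following Python sketch is a brute-force check and not a solver encoding: binary predicates are represented as sets of pairs, and the function names as well as the example relations are assumptions of this illustration.

```python
def is_right_unique(rel):
    """No argument is related to two different values."""
    seen = {}
    return all(seen.setdefault(x, y) == y for x, y in rel)

def is_left_total(rel, domain):
    """Every element of the domain is related to some value."""
    return set(domain) <= {x for x, _ in rel}

def is_total_function(rel, domain):
    return is_left_total(rel, domain) and is_right_unique(rel)

def is_partial_function(rel):
    return is_right_unique(rel)

# Hypothetical binary predicates: an operation "src" on edges and a
# projection that is defined for only one of two commonalities.
src = {("e1", "n1"), ("e2", "n1")}
proj = {("c1", "n1")}
broken = {("e1", "n1"), ("e1", "n2")}
```

In this example `src` satisfies the axioms of a total function on the edge set, `proj` is right unique but not left total on the commonalities, and `broken` violates right uniqueness.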
On the other hand, with Corollary 1 we have opened the door for graph transformation, i.e. rule-based repair. Thus, we can build on existing results w.r.t. verification [HP09] and repair [KR17,SLO19,OPKK18]. Consistency rules may be encoded as a set of consistency-preserving grammar rules. Upon the detection of a consistency-violating model modification, the detected edit rule application may be "completed" to an application of a consistency-preserving rule [KKT13,TOLR17,OPKK18] based on the idea of match-consistent splitting [EEE+07]. Furthermore, these rules can be analysed w.r.t. nested graph conditions supported by specialized reasoning tools [LO14,SLO18], which have been shown to outperform generic solutions using off-the-shelf solvers [Pen08]. Reasoning facilities enable various possibilities for investigating model repair in our formal framework, see also [SLO19,HS18], and will therefore play an important role for future work.
As a conclusion, comprehensive systems are not "opinionated" in terms of what means for consistency verification and model repair should be applied on them and we can re-use existing tools and methods.

Summary and limitations
Comprehensive Systems can be summarized by the slogan "from many models to one model": The issue of dealing with multiple models is addressed by a construction that yields a single artefact, on which existing means for consistency verification and model repair can be reused, see Section 4.4. This includes technologies such as MOF/EMF (model representation) and OCL/EVL (model verification) as well as theory and methods such as algebraic graph transformation. In the past, the construction of global artefacts was often equated with model merging [SNL+07, BCE+06, RC13, DXC11]. Merging, however, poses some difficulties, especially if the verification of a global constraint depends on knowledge about the membership of model elements. In terms of the four requirements stated in Section 3.4, comprehensive systems represent an alternative approach to merging, providing a formal construction for expressing multi-models and consistency rules on them, which does not forget the original membership of model elements (Requirement 1), see Section 4.4.2. Comprehensive Systems support general multi-ary (n ≥ 2) scenarios by definition (Requirement 2) and they formally capture the practical workflow concerning trace models (Requirement 3). The workflow of constructing a domain-specific trace model is mapped to the well-known (meta-)model-instance pattern, see Section 4.2. Finally, comprehensive systems generalise graph diagrams and triple graphs and allow a flexible introduction and removal of correspondences of different arities (Requirement 4), see Section 4.5. Thus, comprehensive systems represent a formal foundation for Multi-Model Consistency Management that combines the practicality of a single artefact from model merging with the flexibility and expressiveness of model weaving, see Section 3.1.2. The construction stresses the utility of partial mappings in commonality specifications, which have been promoted in [SKLR18] and were also picked up in [KFST19].
Regarding current limitations of our approach, we first have to state the conceptual restriction that we require the existence of a graph-like universal meta-language. From our experience, this is often the case. However, this requirement might hamper applicability and may require implementing translators or adapters to integrate heterogeneous modelling tools. Still, the fact that MOF and EMF/Ecore [SBMP08] are widespread graph-like languages allows diverse applications. The main limitation of our approach is its current lack of practical evidence. A prototype implementation has been started, but empirical data w.r.t. performance and scalability is missing. Furthermore, comprehensive systems do not provide their own model repair concept and rely on existing solutions. We want to address these challenges in the future.

Related and future work
Comprehensive Systems are located in the field of Multi-Model Consistency Management, which was briefly overviewed in Section 3. We highlight the most tightly related studies here: Triple graphs [Sch94] and its multi-ary variant, graph diagrams [TA15,TA16], are a mature formal framework for multi-model consistency management comprising industry-proven methods for consistency verification and restoration [HEO + 11, WFA20, FKM + 20]. In Section 4.5, we showed that comprehensive systems are a strict generalisation of triple graphs and graph diagrams.
Model weaving, i.e. using trace models (= commonalities), is often applied in practice. Samimi-Dehkordi et al. [SDZKR18] use trace models in their implementation of a model synchronisation framework based on Epsilon. Their approach does not encompass a formalisation and does not provide any guarantee for the correctness of the model repair. Feldmann et al. [FKWVH19] use a similar approach in a multi-domain scenario. They use TGGs as a specification formalism to generate Epsilon code. Thus, the respective approach can only express binary consistency rules. Vitruvius [KKL+21] is a framework for view-based modelling based on a virtual SUM and allows view synchronisation via user-defined expressions. The virtual SUM is created by defining mappings (= type commonalities) between different metamodels. While their expression language is analysed from a theoretical point of view, their proposal of a multi-ary mapping language [KG19], which was also featured in Section 4.2, is missing such a formalisation.
Multi-ary delta lenses (MX-lenses) [DKL19] are a formal framework for describing multi-ary model synchronisation. It is defined on a more abstract categorical level than comprehensive systems, and comprehensive systems can be considered as a more concrete instantiation of the former. However, MX-lenses also comprise propagation-based means for model repair as a built-in feature. Comprehensive systems are not directly tied to a specific model repair approach. It is left open whether model repair should be implemented using a rule-based or a search-based approach.
Stevens [Ste20] proposes another approach to Multi-Model Consistency Management. Her approach takes a workflow-oriented point of view and considers a network of correspondence relations, which are implemented by abstract builders that implement consistency verification and restoration. In [Ste20], the correct and optimal scheduling of these builders is analysed. This approach can be considered as a meta-approach that may be combined with other approaches, including comprehensive systems.
The biggest limitation of our approach is the current lack of practical evidence. Therefore, our immediate next goal is to provide this missing evidence. We will also have to address the challenge of model repair. Here, the goal must be to re-use as much of existing approaches as possible. With the validity of Corollary 1, we are able to use the algebraic graph transformation framework [EEPT06] and related approaches [HP09, SLO19, KR17]. We aim to build our repair approach on existing rule-based frameworks [KKT13, TOLR17, OPKK18], where consistency is inductively defined via consistency-preserving rules and repair is performed by completing applications of arbitrary edit rules to consistency-preserving rules. For this, we have to further investigate the comprehensive system equivalent of match-consistent rule splitting [EEE+07] in triple graph grammars. Theoretical research on the admissibility of other graph transformation approaches has already begun: In [KS20], we studied the possibility of using a subclass of comprehensive systems for single pushout rewriting.
Synchronising multiple behavioural models with comprehensive systems is another open issue. In this paper we focused mainly on (more or less static) structural models. Including behavioural semantics into the picture requires investigating commonalities between the dynamics of behavioural models as well.
Finally, analysing the nature of the most common multi-ary consistency rules is an interesting research direction, see Section 4.4.1. An example of such a consistency rule is given by the ForAll-constraint [DKPF09], which requires the simultaneous existence of a tuple of elements in disparate models. An empirical investigation resulting in a catalogue of such rules is another possible future direction.

A. Categorical background
A structural overview of the contents of the Appendix and the dependencies between its individual sections is given in Figure 11. This first appendix section A briefly summarizes the categorical background that is required for this paper. A more in-depth introduction can be found in textbooks such as [BW90,Pie91,Wal92]. The second appendix section B contains detailed proofs of the Theorems and Propositions in this paper.
A category is a collection of similarly-structured mathematical objects equipped with means to "compare" these objects:

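Definition 8 (Category).
A category C consists of the following:
• a class |C| of objects,
• for every pair of objects A, B ∈ |C| a collection C(A, B) of morphisms f : A → B; the class of all morphisms is denoted Arr C,
• for every object A ∈ |C| an identity morphism $id_A \in C(A, A)$,
• a composition $\circ$ that assigns to f ∈ C(A, B) and g ∈ C(B, C) a morphism $g \circ f \in C(A, C)$,
such that
• identities are neutral w.r.t. composition, i.e. for all f ∈ C(A, B): $f \circ id_A = f = id_B \circ f$,
• composition $\circ$ is associative, i.e. for all f ∈ C(A, B), g ∈ C(B, C), and h ∈ C(C, D): $h \circ (g \circ f) = (h \circ g) \circ f$.
Due to the abstract nature of categories, it is often not possible to check whether two objects represent the same thing because we cannot look into the internal structure of objects. However, we can compare them via morphisms, and if two objects are related by invertible morphisms, they are called isomorphic, i.e. identical modulo internal renaming.
Definition 9 (Isomorphism). Let C be a category and A, B ∈ |C| two objects in this category. A and B are isomorphic, written A ≈ B, if there exist two morphisms i : A → B ∈ Arr C and $i^{-1} : B \to A \in$ Arr C such that $i^{-1} \circ i = id_A$ and $i \circ i^{-1} = id_B$. Further, i and $i^{-1}$ are then called isomorphisms.
Thus, in category theory many constructions are only unique "up to isomorphism". Arguably, the most important category is the category of sets and mappings.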

Fact 1 (Category SET).
There is a category SET, whose class of objects is the class of all sets. The class of morphisms is given by the class of all total mappings between sets. Identities are given by identical mappings and composition is given by function composition.

A.1. Functors, natural transformations, adjunctions
A functor represents a means to "compare" two categories.
Definition 10 (Functor). Let C and D be two categories. A functor F : C → D comprises
• an object mapping, i.e. to every object A ∈ |C| in the source category, F assigns an object F(A) ∈ |D| in the target category,
• and a morphism mapping, i.e. to every morphism f : A → B, F assigns a morphism F(f) : F(A) → F(B) in the target category,
such that
• identities are mapped to identities, i.e. for all A ∈ |C|: $F(id_A) = id_{F(A)}$,
• and composition is preserved, i.e. for all f ∈ C(A, B) and g ∈ C(B, C): $F(g \circ f) = F(g) \circ F(f)$.
F is called an embedding if it is injective on objects of C and injective on C(A, B) for all A, B ∈ |C|.
A natural transformation is a means to "compare" functors.
Definition 11 (Natural Transformation). Let F : C → D and G : C → D be two functors between the same categories. A natural transformation α : F ⇒ G is given by a |C|-indexed family of D-morphisms $(\alpha_A : F(A) \to G(A))_{A \in |C|}$ such that for every f ∈ C(A, B) the following diagram commutes, i.e. $G(f) \circ \alpha_A = \alpha_B \circ F(f)$. These diagrams are also known as naturality squares.

Functors and natural transformations organise themselves into a category:
Fact 2 (Functor Category). For every pair of categories C and D, there exists a functor category $D^C$, whose objects are the functors from C to D and whose morphisms are given by the natural transformations between these functors.
Functors and natural transformations allow us to check whether two classes of mathematical structures are essentially "the same":
Definition 12 (Equivalence of Categories). Let C and D be two categories. They are said to be equivalent, written C ∼ D, if there exists a pair of functors R : C → D and L : D → C together with two natural transformations $\approx_C : L \circ R \Rightarrow 1_C$ and $\approx_D : R \circ L \Rightarrow 1_D$, where $1_C$ and $1_D$ denote identity functors (= identity in all components) and all members of $\approx_C$ ($\approx_D$) are isomorphisms in C (D).
If these families of isomorphisms are actually identities, then C and D are said to be isomorphic.
Moreover, in category theory there is a weaker notion than equivalence, called adjunction. Intuitively speaking, it means that two classes of structures are equivalent modulo some free construction that can be universally applied. An example of such a construction is the free monoid $A^*$ (Kleene star) over a set A.

Definition 13 (Adjunctions, (Co)-Free constructions).
Let C and D be two categories and R : C → D, L : D → C be two functors between them. L and R are said to be adjoint, written L ⊣ R, if there exist two natural transformations $\eta : 1_D \Rightarrow R \circ L$ (called unit) and $\varepsilon : L \circ R \Rightarrow 1_C$ (called co-unit).
Equivalently, an adjunction can be defined as a co-free construction w.r.t. a functor L : D → C. A co-free construction assigns to every C-object B a D-object R(B) and a C-morphism $\varepsilon_B : L(R(B)) \to B$ such that for every D-object A and C-morphism f : L(A) → B there exists a unique D-morphism $f^* : A \to R(B)$ with $\varepsilon_B \circ L(f^*) = f$. This is summarized in the diagram in (19).

A.2. Universal constructions
Universal constructions have proven to be important for many software theoretical methods. Intuitively universal constructions can be described as a generalisation of meets and joins in a pre-order. Some well known examples for universal constructions in SET are cartesian products or disjoint unions (coproduct). It is important to note that SET possesses all these universal constructions and thus every category SET B does as well [Gol06]. The construction of universal constructions in those categories is carried out "pointwise". We say that a universal object is constructed "pointwise" in SET B , if it is constructed separately for each B-object, e.g. in the case of E-Graphs separately for the set of graph nodes, the set of attribute nodes, the set of graph edges, and the set of node attribute edges. The universal properties of the universal constructions then guarantee, that the resulting object is a well-defined object in SET B . Examples are given in the proofs of Lemma 1, Lemma 2 and Lemma 3. For more details on this idea, we refer to [Gol06].

A.2.1. Coproducts
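Coproducts, a.k.a. sums, provide means to collect a set of objects and work with them uniformly, similar to type abstraction in programming.

Definition 14 (Binary Coproduct). Let C be a category and A, B ∈ |C| two objects. The coproduct of A and B is an object A + B together with two morphisms $\iota_A : A \to A + B$ and $\iota_B : B \to A + B$ such that for every object X and every pair of morphisms f : A → X and g : B → X there exists a unique morphism [f ; g] : A + B → X with $[f ; g] \circ \iota_A = f$ and $[f ; g] \circ \iota_B = g$, see (20).
The mediating morphism [f ; g] acts like f and g via case distinction. If a category C has coproducts of arbitrary arity, then there is a special nullary coproduct, the initial object 0, that has unique morphisms $0_A : 0 \to A$ into every object A ∈ |C| and is neutral w.r.t. binary coproducts, i.e. A + 0 ∼ A. A multi-ary coproduct is then given by multiple applications of the binary coproduct operator, because the latter is associative ($(A_1 + A_2) + A_3 \sim A_1 + (A_2 + A_3)$) and commutative ($A_1 + A_2 \sim A_2 + A_1$) up to isomorphism. The (multi-ary) coproduct over an I-indexed family of C-objects $(A_i)_{i \in I}$ is denoted $(\coprod_{i \in I} A_i, (\iota_i : A_i \to \coprod_{i \in I} A_i)_{i \in I})$, and a family of morphisms $(f_i : A_i \to X)_{i \in I}$ induces a mediating morphism $\coprod_{i \in I} A_i \to X$ in the same way.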

Fact 3 (Coproducts in SET). SET has all coproducts. A binary coproduct in SET is given by the disjoint union $A \uplus B := \{(0, a) \mid a \in A\} \cup \{(1, b) \mid b \in B\}$ for A and B being sets. The initial object 0 in SET is the empty set ∅.
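The disjoint union and its mediating morphism can be written down directly. The following Python fragment is a minimal sketch; the tagging with 0 and 1 and the function names are choices made for this illustration.

```python
def coproduct(a, b):
    """Disjoint union of two sets: elements are tagged with their origin."""
    return {(0, x) for x in a} | {(1, y) for y in b}

def inj_left(x):
    return (0, x)

def inj_right(y):
    return (1, y)

def mediate(f, g):
    """The mediating map [f ; g]: acts like f on the left summand and like g
    on the right summand (case distinction on the tag)."""
    def h(tagged):
        tag, value = tagged
        return f(value) if tag == 0 else g(value)
    return h

# Example: the shared element 2 occurs twice in the disjoint union.
A, B = {1, 2}, {2, 3}
double = lambda x: 2 * x
negate = lambda y: -y
h = mediate(double, negate)
```

Because of the tags, the coproduct of `A` and `B` has four elements although the plain union has only three, and `h` composed with either injection gives back the corresponding component map.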
Lemma 1 (Coproducts in $SET^B$). Every functor category $SET^B$ has coproducts due to the fact that SET has all coproducts and we can construct them pointwise.
Proof. Let F and G be two functor objects in $SET^B$ and consider the family of diagrams in (21), which is indexed by the objects B ∈ |B| and contains the pointwise coproducts F(B) + G(B).

A.2.2. Pullbacks
A pullback can be seen as the categorical version of an inner join: two structures A and B are combined where they coincide on a common structure C .
Fact 4 (Pullbacks in SET). SET has all pullbacks: Given two mappings f : A → C and g : B → C with the same codomain, the pullback $A \times_{(f,g)} B$ is given by the fibred product $A \times_{(f,g)} B := \{(a, b) \in A \times B \mid f(a) = g(b)\}$.
Lemma 2 (Pullbacks in $SET^B$). Every functor category $SET^B$ has pullbacks due to the fact that SET has all pullbacks and we can construct them pointwise.
Proof. Let F, G and H be objects in $SET^B$ and ν : F ⇒ H and μ : G ⇒ H morphisms in $SET^B$. Consider the following cube for some f : A → B ∈ Arr B: The pullback of μ and ν for objects A and B is given by constructing the respective pullbacks in SET.
In this case m has the left cancellation property, which is a consequence of the pullback property in (23): $m \circ x = m \circ y$ implies $x = y$. We sometimes highlight the special property of m by denoting it with a special arrow $m : A \rightarrowtail B$.

Fact 5 (Monomorphism in SET).
In SET the class of monomorphisms is exactly the class of injective mappings.
Fact 6 (Pullbacks preserve Monos). If a is a monomorphism in the diagram of Def. 15, then π B is a monomorphism, as well.
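The inner-join reading of pullbacks in SET, together with the preservation of monomorphisms, can be observed on a small example. The following Python sketch makes its own assumptions about names (`pullback`, `pi_A`, `pi_B`) and example data.

```python
def pullback(A, B, f, g):
    """Fibred product: all pairs on which f and g agree (an inner join on C)."""
    return {(a, b) for a in A for b in B if f(a) == g(b)}

def pi_A(pair):
    return pair[0]

def pi_B(pair):
    return pair[1]

def is_injective_on(func, domain):
    return len({func(x) for x in domain}) == len(set(domain))

# Example: f is injective, g is not.
A = {1, 2, 3}
B = {"x", "y", "z"}
f = lambda a: a * 10
g = {"x": 10, "y": 10, "z": 40}.__getitem__
P = pullback(A, B, f, g)
```

In this example only the value 10 is hit by both maps, so `P` pairs 1 with both "x" and "y". Since `f` is injective, the projection `pi_B` is injective on `P`, whereas `pi_A` is not, because `g` is not injective.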

A.2.3. Pushouts
A pushout can intuitively be described as gluing of two structures at a defined interface.

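Definition 17 (Pushout). Let C be a category and a span of C-morphisms $A \xleftarrow{a} C \xrightarrow{b} B$ be given. The pushout of a and b is given by a co-span $A \xrightarrow{a'} D \xleftarrow{b'} B$ with $a' \circ a = b' \circ b$ that is universal with this property: for every co-span $A \xrightarrow{x} X \xleftarrow{y} B$ with $x \circ a = y \circ b$ there is a unique morphism $u : D \to X$ such that $u \circ a' = x$ and $u \circ b' = y$.
Fact 7 (Pushouts in SET). SET has all pushouts: Given two mappings f : C → A and g : C → B with the same domain, consider a relation ∼ on $A \uplus B$, defined as follows ($\iota_A$ and $\iota_B$ are the embeddings into the disjoint union): $\iota_A(f(c)) \sim \iota_B(g(c))$ for all c ∈ C. Let ≡ be the least equivalence relation containing ∼; then the pushout of f and g is given by the quotient $(A \uplus B)/_{\equiv}$ together with the mappings that send each element of A and B to its equivalence class.
Lemma 3 (Pushouts in $SET^B$). Every functor category $SET^B$ has pushouts due to the fact that SET has all pushouts and we can construct them pointwise.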
Proof. Dual to the proof of Lemma 2.
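The gluing construction for pushouts in SET can be illustrated on finite sets. The following is a minimal sketch, assuming dict-based mappings and a small union-find structure for computing the least equivalence relation; all names and the example data are hypothetical.

```python
def pushout(C, A, B, f, g):
    """Pushout of f: C -> A and g: C -> B as a quotient of the disjoint union.

    Returns the set of equivalence classes, each class being a frozenset
    of tagged elements ("A", a) or ("B", b).
    """
    # Tagging makes the union disjoint even if A and B share elements.
    elems = [("A", a) for a in A] + [("B", b) for b in B]
    parent = {e: e for e in elems}

    def find(e):
        # Find the class representative, halving paths on the way.
        while parent[e] != e:
            parent[e] = parent[parent[e]]
            e = parent[e]
        return e

    # Glue the image of c under f to the image of c under g.
    for c in C:
        ra, rb = find(("A", f[c])), find(("B", g[c]))
        if ra != rb:
            parent[ra] = rb

    classes = {}
    for e in elems:
        classes.setdefault(find(e), set()).add(e)
    return {frozenset(block) for block in classes.values()}


# Hypothetical example: the interface C identifies x with both u and v.
C = {1, 2}
A = {"x", "y"}
B = {"u", "v", "w"}
f = {1: "x", 2: "x"}
g = {1: "u", 2: "v"}

Q = pushout(C, A, B, f, g)
```

Since f is not injective here, the elements u and v of B are merged in the result although they are distinct in B.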
Pushouts play an integral role in the algebraic graph transformation framework [EEPT06], i.e. rule-based rewriting is represented by pushout-diagrams in a suitable category. These categories are referred to as adhesive categories and their definition is based on the so-called Van Kampen property [LS04]:
Definition 18 (Van Kampen square). A pushout square (f , m, n, g) as shown in the bottom of (25) is called a Van Kampen square iff, for every commutative cube as in (25) that has this square as its bottom face, the following holds:
back faces are pullbacks ⇒ (front faces are pullbacks ⇔ top face is pushout) (26)

Definition 19 (Adhesive Category). A category C is called adhesive iff
• C has all pullbacks,
• C has pushouts along monomorphisms (i.e. for spans where at least one leg is a monomorphism), which also have the Van Kampen property (Definition 18).
A more general and more widespread notion than adhesive categories is given by so-called weak adhesive HLR categories, which also include practically relevant structures such as attributed graphs. This definition weakens the notion of Van Kampen squares and requires the existence of weak Van Kampen pushouts only along an admissible subclass M of all monomorphisms. The latter is explicated further in Section A.3.

A.2.4. Universal constructions and adjunctions
Fact 8. If L ⊣ R is a pair of adjoint functors (L left adjoint to R), then L preserves coproducts and pushouts and R preserves pullbacks.

A.3. Partial morphisms and partial arrow classifiers
The category SET denotes the category of sets and total functions. There is also a strict super-category SET ⊂ PSET of sets and partial functions (every total function is a special partial function). Here, we present a generic and category-independent approach to construct a category Par (C) of partial morphisms over a given category C, whose morphisms are called total. This well-known approach only requires C to have all pullbacks and was introduced in [RR88]. The category Par (C) is a subcategory of the span category Span(C) over C, where the inner legs of these spans are required to be monomorphisms: [...] i.e. the composition [n, g⟩ • [m, f ⟩ is given by [m • n′, g • f ′⟩, where n′ and f ′ arise from the pullback of f and n, and it can be shown that the choice of representative for this span is unique up to isomorphism. Neutrality w.r.t. identities and associativity of composition result from the fact that pullbacks preserve isomorphisms and from the universal property of pullbacks.
The construction of partial map categories can further be restricted by replacing the class of all monomorphisms with an admissible subclass M of all monomorphisms, called dominion in [RR88]. To be considered admissible, this class M must allow the constructions shown in (27) and (28), i.e. it is closed under isomorphisms, closed under composition, and stable under pullback, see Proposition 3. In this case, we call the respective category Par M (C) an M-partial map category.
The category C is embedded into Par (C) by the so-called graphing functor, which is the identity on objects and maps every morphism f : A → B to the span [id A , f ⟩, where the identity id A on A is trivially a monomorphism.
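For C = SET, the span-based composition and the graphing functor can be made concrete. The following is a minimal sketch, assuming that a partial function is represented by a dict whose key set plays the role of the apex of the span (the domain of definition); the function names and example data are hypothetical.

```python
def compose_partial(g, f):
    """Composite g . f of two partial functions given as dicts.

    The keys of the result are exactly those x in dom(f) with f(x) in dom(g),
    i.e. the pullback of f along the inclusion dom(g) -> B.
    """
    return {x: g[f[x]] for x in f if f[x] in g}


def graph_of(total, A):
    """Embed a total function on the finite set A as a partial function."""
    return {a: total(a) for a in A}


# Hypothetical example data: f is undefined on 3, g is undefined on "b".
A = {1, 2, 3}
B = {"a", "b", "c"}
f = {1: "a", 2: "b"}
g = {"a": "X", "c": "Z"}
h = {"X": True}

gf = compose_partial(g, f)

# Identities embedded via graph_of are neutral for the composition.
id_A = graph_of(lambda a: a, A)
id_B = graph_of(lambda b: b, B)
neutral = compose_partial(f, id_A) == f and compose_partial(id_B, f) == f

# Associativity on the example data.
associative = (
    compose_partial(h, compose_partial(g, f))
    == compose_partial(compose_partial(h, g), f)
)
```

The composite is defined only on 1, because 2 is mapped to an element outside the domain of definition of g and 3 is outside the domain of definition of f.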

Definition 22 (Graphing Functor M ). M : C → Par M (C) is the identity on objects and maps every morphism f : A → B to [id A , f ⟩.
Pushouts in C that remain pushouts in Par M (C) after embedding them via M are called hereditary [Ken91]. They are closely related to Van Kampen squares [Hei10b].
Hereditariness is immediately given when the graphing functor M has a right adjoint and therefore preserves colimits. The foundation for this right adjoint are (M)-partial arrow classifiers, which "totalize" partial morphisms: [...]
Fact 9. SET and SET C have partial arrow classifiers. In SET, L adds a new ⊥-element to every set and turns a partial function into a total function by mapping all non-mapped elements to ⊥. In SET B , the construction becomes more involved, see [Gol06, ...].
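The totalization in SET can be sketched as follows. This is a minimal illustration, assuming that partial functions are dicts defined on a subset of a finite set A and that a fresh Python object stands in for the ⊥-element; all names are hypothetical.

```python
BOTTOM = object()  # hypothetical stand-in for the fresh element ⊥


def lift(B):
    """The set LB: B extended by the fresh element BOTTOM."""
    return set(B) | {BOTTOM}


def totalize(A, partial):
    """Turn a partial function A -> B into a total function A -> LB."""
    return {a: partial.get(a, BOTTOM) for a in A}


def detotalize(total):
    """Recover the partial function by restricting to the preimage of B.

    This preimage is the pullback of the inclusion B -> LB along the
    total function.
    """
    return {a: b for a, b in total.items() if b is not BOTTOM}


# Hypothetical example data: the partial function is undefined on 2.
A = {1, 2, 3}
B = {"p", "q"}
partial = {1: "p", 3: "q"}

total = totalize(A, partial)
```

Totalizing and then restricting again yields the original partial function, which is the one-to-one correspondence between partial functions A → B and total functions A → LB in this finite setting.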
Fact 10. In a category with M-partial arrow classifiers, L extends to a functor that is right adjoint to the graphing functor M and defined as follows: L : [...] where the morphism-mapping [η A • m, f ⟩ is explained in further detail by the diagram in (33): [...] The underlying co-free construction of the adjunction M ⊣ L is shown in (34).
Finally, there is an important lesser known fact about partial arrow classifiers: [...]
[...] for each j ∈ {1, . . . , n} and for each s ∈ |B|.
As an example, we have drawn the category B × I for a star-shape I with degree n = 2 and B consisting of two parallel arrows s, t : E → V (the signature for directed multi-graphs, where identities are omitted) in (36).
[Diagram (36): the category B × I for n = 2.]
Hence, objects in SET B×I are functors N : B × I → SET, which simultaneously act as 2n + 1 functors from B to SET, augmented with spans of total functions for each s ∈ |B| and all j ∈ {1, . . . , n}.
We define N to be the subcategory of SET B×I whose objects N map every arrow (id s , j 0) to a monomorphism (injective function) in SET, for all s ∈ |B|.
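The shape of B × I for the graph signature and n = 2 can be enumerated mechanically. The following is a minimal sketch, assuming that the objects of I are numbered −n, ..., n with arrows j0 : −j → 0 and jj : −j → j, as used in Sect. B.1.3; the encoding of arrows as dict entries and all function names are hypothetical. Neither B nor I contains composable pairs of non-identity arrows, so listing identities and generators covers all arrows.

```python
from itertools import product


def star_shape(n):
    """Star-shaped category I: objects -n..n, arrows j0: -j -> 0 and jj: -j -> j."""
    objects = list(range(-n, n + 1))
    arrows = {("id", i): (i, i) for i in objects}
    for j in range(1, n + 1):
        arrows[("j0", j)] = (-j, 0)
        arrows[("jj", j)] = (-j, j)
    return objects, arrows


def graph_signature():
    """Category B for directed multi-graphs: s, t : E -> V plus identities."""
    objects = ["E", "V"]
    arrows = {
        "id_E": ("E", "E"),
        "id_V": ("V", "V"),
        "s": ("E", "V"),
        "t": ("E", "V"),
    }
    return objects, arrows


def product_category(cat1, cat2):
    """Objects and arrows of the product category (composition is component-wise)."""
    objs1, arrs1 = cat1
    objs2, arrs2 = cat2
    objects = list(product(objs1, objs2))
    arrows = {
        (a1, a2): ((arrs1[a1][0], arrs2[a2][0]), (arrs1[a1][1], arrs2[a2][1]))
        for a1 in arrs1
        for a2 in arrs2
    }
    return objects, arrows


n = 2
I_objects, I_arrows = star_shape(n)
BI_objects, BI_arrows = product_category(graph_signature(), star_shape(n))
```

For n = 2, the shape I has 2n + 1 = 5 objects, which matches the 2n + 1 functors from B to SET mentioned above, and the arrow (s, j0) for j = 1 runs from (E, −1) to (V, 0), one of the diagonals of the diagram.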

B.1.3. Equivalence of CS and N
Let N ∈ |N| and C ∈ |CS|. We will show that every comprehensive system C has an equivalent representation as an N-object. First, we define a one-to-one correspondence between the components of N and C . We identify
• N ((s, i )) and C i (s) for all s ∈ |B|, 0 ≤ i ≤ n (1. in Definition 2),
• and N ((op, id i )) : N ((s, i )) → N ((s′, i )) and C i (op) : C i (s) → C i (s′) for all op : s → s′ ∈ Arr B , 0 ≤ i ≤ n (2. in Definition 2).
Furthermore, in Appendix A.3 it was demonstrated how a partial morphism in some category can be expressed by an equivalence class of spans with the inner leg being a monomorphism. Therefore,
• N ((s, k )) for all s ∈ |B|, −n ≤ k < 0 (the apex of the span),
• N ((id s , j 0)) : N ((s, −j )) → N ((s, 0)) for all s ∈ |B|, 0 < j ≤ n (the domain embedding),
• and N ((id s , jj )) : N ((s, −j )) → N ((s, j )) for all s ∈ |B|, 0 < j ≤ n (the concrete assignment)
form a concrete representative of the projection p C j ,s : C 0 (s) ⇀ C j (s) in C . The remaining constituents of N :
• N ((op, j 0)) for all 0 < j ≤ n and non-identity morphisms op : s → s′ ∈ Arr B ,
• N ((op, jj )) for all 0 < j ≤ n and non-identity morphisms op : s → s′ ∈ Arr B
are subject to the following equations, which are a consequence of the definition of composition in the product category B × I (compare with the diagonals in (36)) and of N being a functor (which must preserve these compositions): [...]
These conditions correspond to the generalised edge-node incidence (6)+(7). Thus, N ((op, j 0)) and N ((op, jj )) can be seen as reifications (witnesses) of this condition.
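As a sketch of what functoriality of N demands for the diagonal arrows, and assuming only the component-wise composition in B × I, each diagonal factors in two ways, and N must respect both factorisations. For op : s → s′ ∈ Arr B and 0 < j ≤ n this reads as follows (the notation is ours and may differ from the original equations):

```latex
\begin{align*}
N((op, j0)) &= N((op, id_0)) \circ N((id_s, j0))
             = N((id_{s'}, j0)) \circ N((op, id_{-j})), \\
N((op, jj)) &= N((op, id_j)) \circ N((id_s, jj))
             = N((id_{s'}, jj)) \circ N((op, id_{-j})).
\end{align*}
```

Both lines describe commuting squares over the arrows (s, −j) → (s′, 0) and (s, −j) → (s′, j), respectively.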
Hence CS ∼ M, as claimed.

B.2. Proof of Theorem 2
Let [...]
[Diagram (37): component-wise pullback construction in M.]
The spans (g 0 , n 0 ), (dg j , dn j ), and (g j , n j ) are constructed component-wise as pullbacks in M. The universal pullback property of the top face w.r.t. the horizontal inner face in the middle provides the morphism A(j 0) that makes the upper rear faces commute. Likewise, the universal pullback property of the bottom face w.r.t. the horizontal inner face in the middle provides the morphism a j that makes the lower rear faces commute. It remains to show that A ∈ |M|. For this we have to show that A(j 0) is a monomorphism. Assume two morphisms x : X → dom(a j ) and y : X → dom(a j ) such that A(j 0) • x = A(j 0) • y. Postcomposing this equation simultaneously with n 0 and g 0 yields
n 0 • A(j 0) • x = n 0 • A(j 0) • y and g 0 • A(j 0) • x = g 0 • A(j 0) • y.
Using the commutativity of the left and back faces, followed by the monomorphism property of B (j 0) and C (j 0), we get
dn j • x = dn j • y and dg j • x = dg j • y.
Recall that the horizontal inner face is a pullback, i.e. dn j and dg j are jointly monic, and therefore x = y as required.

B.3. Proof of Proposition 3
Isomorphisms trivially yield naturality squares that are pullbacks. The composition of two reflective monomorphisms is a reflective monomorphism as well because pullbacks compose. To see that reflective monomorphisms are stable under pullback, consider again our pullback construction in M from the proof of Theorem 2, depicted in (37). This time, the upper front face is a pullback and all components of m are monic, i.e. m is a reflective mono. We have to show that all components of n are monic and that the upper back face is a pullback, i.e. n is a reflective mono. The former, however, is easy, since pullbacks in M preserve monomorphisms (recall that A was constructed via taking pullbacks component-wise in Theorem 2). For the pullback property consider the diagram in (38).
[Diagram (38): rectangle composed of the squares (a) and (b).]
We use the fact that the upper front face in (37) is a pullback because m is a reflective monomorphism by assumption. We compose it with the horizontal inner face in (37), which is a pullback by construction, resulting in a pullback that forms the outer rectangle in (38). The right square (b) in (38) is the top face in (37) and therefore also a pullback by construction. Now we know that the upper and lower triangles in (38) commute because they represent the upper left and upper right faces in (37). Therefore, we can use the pullback-decomposition lemma to conclude that (a) is a pullback, as desired.

B.4. Proof of Theorem 3
[...] Since G is adhesive [LS04], pushouts preserve monomorphisms such that n 0 and dn j are monomorphisms. Next, we show that the front face is a pullback; for this consider the diagram in (40).
[Diagram (40): squares (a) and (b).]
The square (a) is a pushout by construction (bottom face in (39)) and the outer square is a pullback composed out of the back face (a pullback due to the reflective property of m) and the top face (pushouts along monomorphisms in adhesive categories are pullbacks as well) in (39). Using the special pullback-pushout property [LS06, Lemma 6], the square (b) becomes a pullback.
It remains to show that D(j 0) is a monomorphism. For this consider the following SET-theoretic argument, which is stable under sort-wise construction (lifts to SET B ): Assume D(j 0) is not monic, then there are two distinct elements x , y ∈ dom(d j ) that D(j 0) maps to the same element z ∈ D 0 . Now, we know that dom(d j ) is the apex of a pushout, therefore dn j and dg j are jointly surjective and thus x , y must have pre-images x′, y′ under (dn j , dg j ) in dom(c j ) or dom(b j ).
Note that the cases x′, y′ ∈ dom(c j ) or x′ ∈ dom(c j ) ∧ y′ ∈ dom(b j ) disqualify immediately since (a) is a pullback. Therefore consider the case x′, y′ ∈ dom(b j ) further. Then x′, y′ must have distinct images under the inclusion B (j 0) in B 0 that must be mapped to z via g 0 . Now D 0 is also constructed as the apex of a pushout, and for these two images in B 0 being mapped to the same element in D 0 , there must be pre-images of them in A 0 that are mapped to the same element z′ ∈ C 0 . But dom(c j ) is the pullback object of n 0 and D(j 0) and therefore z′ ∈ C 0 must have two pre-images in dom(c j ), which violates the monomorphism property of C (j 0) .
Hence, we must conclude that D(j 0) is a monomorphism.

B.5. Proof of Theorem 4
Let B be a comprehensive system. For the existence of M-partial arrow classifiers, we have to show the existence of an M-morphism η B : B ↣ LB such that for every span (f : X → B , m : X ↣ A) there exists a unique morphism [m, f ⟩ : A → LB such that the resulting square is a pullback. Again we perform the construction in M and focus on the image of the arrows j 0. The case for the arrows jj works analogously.
[Diagram: the construction at the image of j 0, involving f 0 on X 0 .]
[...]
An immediate consequence of the definitions of M in terms of coproducts is that T is injective on objects and on morphism sets, hence an embedding, such that it remains to show preservation of pushouts. As seen in Theorem 3, the pointwise pushout construction of a span in M may fail to belong to M. This obstacle can be overcome because we use coproducts in the construction of T. [...] are pullbacks, such that it suffices to show that two pullback squares in G always add up to a pullback square of their coproducts, see (44).
This can be demonstrated as follows: G is known to be extensive, i.e. the functor + : G ↓ B 1 × G ↓ B 2 → G ↓ (B 1 + B 2 ) between comma categories is an equivalence of categories, whose inverse is given by taking pullbacks along coproduct injections [CLW93]. This adds pullbacks adjacent on the right of the two left pullbacks in (44) and, by pullback composition [BW90], we obtain two pullbacks with the arrow k 1 + k 2 as right vertical arrow. Since G is a topos [Gol06], it can be shown that these two then add to the right pullback in (44), see § 5.3 in [Gol06]. Now, consider the cube from (39) in the proof of Theorem 3. This time left and back faces are pullbacks. Using the fact that pushouts in G are mono-hereditary, cf. Definition 23 in Appendix A.3, we conclude that front and right faces are pullbacks and that D(j 0) is a monomorphism, i.e. the result is actually a comprehensive system. Hence, we have to show that all components are pushouts, i.e. the right squares in (45) are pushouts in G for all i ∈ Arr I .
This is, however, clear from the definition of T for i > 0 (because models are untouched and the left square is a pushout by assumption). For i ≤ 0, all four objects in the right square are coproducts over a certain indexing set I (I = Arr X for i = 0 and I = Arr X (_, j ) for i = −j < 0), where the coproduct amalgamates relation graphs of the graph diagrams (index r ∈ R).
Finally, since the coproduct ∐ is a functor from G I to G, which is left-adjoint to the diagonal functor Δ I (cf. [BW90, Ex.13.2.4]), it preserves colimits, hence all squares are pushouts, because in the left square there are pointwise pushouts separately for each relation index r ∈ R.
Funding Open access funding provided by Western Norway University Of Applied Sciences.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.