The data compatibility blog

Direct Dataset Interoperability Science

March 12, 2025

Many data administration methods are developed without any scientific basis and, therefore, often fail to deliver. In contrast, Direct Dataset Interoperability Science was developed from scientific investigation and proof of functionality. We define Direct Dataset Interoperability (DDI) as multiple compatible datasets collectively functioning as a single consistent dataset.

We initially developed Direct Dataset Interoperability by experimenting with data models and their instantiated datasets. Our Data Integration Experiment, which set out to improve the efficiency of data integration, eventually led to the development of DDI. We conducted our experiments on three levels: the data modeling level, the physical dataset structural level, and the physical data content level. Anyone who typically works with data models and datasets should be able to repeat these experiments and validate the results for themselves.

With the Data Integration Experiment, we proved that three factors contribute to DDI: the data context, the structural commonality of the data context, and the data content commonality of the data context. DDI results when we control all three factors for each dataset.

Factor 1: The Data Context

As with all good dataset designs, we used existing data modeling methods for our experiments. The breakthrough for DDI came about by splitting an existing data model into two data models since, at the time, we did not know how to integrate multiple data models.

In step 1 of our experiment, splitting a data model into two data models, we needed to duplicate several data entities. We refer to these duplicated data entities as the data context entities because they always encapsulate the resulting data models. The data context entities always represent the master data domains defined for the data models. We believe the data context of a data model has not been described previously. A data model’s data context entities are the anchor points on which the other data entities of the data model depend.

We proved that we can consolidate any group of data models, where each model has the same data context entities, into a single data model. This process is the first known example of data model integration.
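The consolidation described above can be sketched in a few lines, assuming a data model is represented as a mapping from entity names to attribute lists. This representation, the function name `consolidate_models`, and the sample entities are hypothetical illustrations rather than the tooling used in the experiment.

```python
def consolidate_models(models, context_entities):
    """Merge data models that share identical data context entities.

    models: list of dicts mapping entity name -> list of attribute names.
    context_entities: set of entity names that form the data context.
    """
    merged = {}
    for model in models:
        # Every model must carry the full data context.
        for name in context_entities:
            if name not in model:
                raise ValueError(f"model is missing context entity {name!r}")
        for name, attrs in model.items():
            # A shared entity must be defined identically in every model.
            if name in merged and merged[name] != attrs:
                raise ValueError(f"entity {name!r} differs between models")
            merged[name] = attrs
    return merged


# Hypothetical split models that both contain the "Address" context entity.
orders = {
    "Address": ["address_id", "street", "city"],
    "Order": ["order_id", "address_id"],
}
billing = {
    "Address": ["address_id", "street", "city"],
    "Invoice": ["invoice_id", "address_id"],
}

combined = consolidate_models([orders, billing], {"Address"})
print(sorted(combined))  # ['Address', 'Invoice', 'Order']
```

If either model defined `Address` differently, the sketch would raise an error instead of merging, which mirrors the requirement that the data context entities be the same in each model.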

Factor 2: Data Context Structural Commonality

Step 2 of the experiment involved instantiating each data structure from the split data models in a database. When instantiated, the duplicated data context entities in each data model form a data context structure common to each database. It became evident that the two instantiated data structures could be combined to create a single dataset structure matching the original. However, we could not integrate the datasets if we altered any of the data context tables. Therefore, we now understand that the structural commonality of the data context is essential for integrating datasets or creating DDI datasets.
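One way to check structural commonality is to compare the column definitions of a data context table in each database. The sketch below uses Python's built-in `sqlite3` module and SQLite's `PRAGMA table_info`; the `address` table, its columns, and the helper name `table_structure` are assumptions made for illustration.

```python
import sqlite3


def table_structure(conn, table):
    """Return (name, declared type, not-null flag, pk position) for each column."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    return [(r[1], r[2], r[3], r[5]) for r in rows]


# Hypothetical data context table, instantiated identically in two databases.
ddl = (
    "CREATE TABLE address ("
    "address_id INTEGER PRIMARY KEY, street TEXT NOT NULL, city TEXT NOT NULL)"
)
db_a = sqlite3.connect(":memory:")
db_b = sqlite3.connect(":memory:")
db_a.execute(ddl)
db_b.execute(ddl)

same_structure = table_structure(db_a, "address") == table_structure(db_b, "address")

# Altering the context table in one database breaks the commonality.
db_b.execute("ALTER TABLE address ADD COLUMN postcode TEXT")
altered_structure = table_structure(db_a, "address") == table_structure(db_b, "address")

print(same_structure, altered_structure)  # True False
```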

Factor 3: Data Context Data Content Commonality

Step 3 of the experiment involved populating the two instantiated data structures with data values. It quickly became apparent that if we wished to integrate the datasets into a single dataset, the data values in each copy of the data context tables had to be exact duplicates.
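Data content commonality can be checked by comparing the rows of a data context table across datasets. This is a minimal sketch using `sqlite3`; the `address` table, the sample rows, and the helper names are hypothetical.

```python
import sqlite3


def make_dataset(rows):
    """Create an in-memory dataset holding one hypothetical address context table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE address (address_id INTEGER PRIMARY KEY, city TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO address VALUES (?, ?)", rows)
    return conn


def context_rows(conn, table):
    """Return every row of a context table, sorted so row order does not matter."""
    return sorted(conn.execute(f"SELECT * FROM {table}").fetchall())


ds_a = make_dataset([(1, "Leeds"), (2, "York")])
ds_b = make_dataset([(2, "York"), (1, "Leeds")])  # same content, different insert order
ds_c = make_dataset([(1, "Leeds"), (2, "Hull")])  # one data value differs

match_ab = context_rows(ds_a, "address") == context_rows(ds_b, "address")
match_ac = context_rows(ds_a, "address") == context_rows(ds_c, "address")
print(match_ab, match_ac)  # True False
```

A single differing value in the context table is enough for the comparison to fail, which is the condition under which the datasets could not be integrated.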

Siloed Physical Data Models

Figure 1: Two Data Context Tables highlighting both Structural Commonality and Data Content Commonality

Figure 1 shows an example of two data context database tables for the address master data domain. These data context database tables are compatible because they share both structural and data content commonality. Since these two data context database tables are compatible, we can either consolidate them into a single database table or merely relate them for DDI. Any disparity in the structural metadata or the data content causes the two tables to be incompatible.

Compatible Data Modeling arose as a result of the Data Integration Experiment. We concluded that DDI was a better solution than data integration. Data integration creates a bottleneck when we consolidate multiple datasets into one dataset. With DDI, we enrich each dataset to be compatible without impacting other datasets. Datasets can be distributed but related when the data context tables are compatible. To relate compatible datasets, referential data integrity must be enforced between the datasets.
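To illustrate relating distributed datasets, the sketch below attaches two SQLite databases to one connection and looks for rows in one dataset that reference an address missing from the other dataset's data context table. The schema name `ds_b`, the table names, and the sample rows are hypothetical. Here referential integrity is verified with a query, which is a simplification chosen for this sketch and not the enforcement mechanism described in the post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # the "main" schema plays the role of dataset A
conn.execute("ATTACH DATABASE ':memory:' AS ds_b")  # dataset B

# Both datasets carry the same address data context table (structure and content).
for schema in ("main", "ds_b"):
    conn.execute(
        f"CREATE TABLE {schema}.address "
        "(address_id INTEGER PRIMARY KEY, city TEXT NOT NULL)"
    )
    conn.executemany(
        f"INSERT INTO {schema}.address VALUES (?, ?)", [(1, "Leeds"), (2, "York")]
    )

# Each dataset has its own non-context table that depends on the address context.
conn.execute(
    "CREATE TABLE main.orders (order_id INTEGER PRIMARY KEY, address_id INTEGER NOT NULL)"
)
conn.execute(
    "CREATE TABLE ds_b.invoice (invoice_id INTEGER PRIMARY KEY, address_id INTEGER NOT NULL)"
)
conn.executemany("INSERT INTO main.orders VALUES (?, ?)", [(10, 1), (11, 2)])
conn.executemany("INSERT INTO ds_b.invoice VALUES (?, ?)", [(500, 2), (501, 3)])

# Invoices in dataset B whose address is absent from dataset A's context table.
orphans = conn.execute(
    "SELECT i.invoice_id FROM ds_b.invoice AS i "
    "LEFT JOIN main.address AS a ON a.address_id = i.address_id "
    "WHERE a.address_id IS NULL"
).fetchall()
print(orphans)  # [(501,)]
```

Invoice 501 points at an address that exists in neither context table, so it is reported as a violation; the remaining rows can be related across the two datasets through `address_id`.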

Direct Dataset Interoperability in Summary

The data context shell of a dataset provides the gateway to the data of the dataset. By governing the data context’s structural metadata and data content, we can make datasets compatible and directly interoperable.

In this investigation of Direct Dataset Interoperability Science, we were limited by the number of data models available. In our next blog, we expand our investigation beyond the data model to develop Universal Dataset Interoperability Science.
