Designing Models

Creating "good" data model designs is an art rather than a science. Within the context of the IVOA, reusing existing recommended models is a prerequisite for a model to be considered good.

The first test of whether a model is good is to run validation, which points out outright errors as well as areas of the design that need careful consideration.

Purpose of the Model

The main intended purpose of the model will affect the overall design:

  • Data Discovery
  • Data Labelling
  • Data Serialization (e.g. for transport or storage)

Historically, most of the models in the IVOA have been designed for data labelling, and consequently they typically have a complex design with many levels of inheritance and a large number of classes; an example of such a data model is Coords. However, if the main purpose of the model is data discovery, then a much simpler design with fewer classes and less inheritance might be more appropriate, because it reduces the number of joins needed in queries. An example of such a data model is ObsCore, which has only 12 classes and no inheritance. If the main purpose is data serialization, then the design might sit somewhere between the two in the complexity of the inheritance hierarchy and the number of classes. This sort of intermediate design might also be appropriate for all three use cases, as it is a compromise between the two extremes.
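The effect of inheritance depth on joins can be illustrated with a minimal sketch. It assumes a table-per-class mapping of an inheritance chain, which is only one possible mapping. All table and column names (`coordinate`, `point`, `flat_obs`, `ra_deg`, `dec_deg`) are hypothetical and are not taken from Coords, ObsCore, or the output of the vo-dml tools.

```python
import sqlite3

# In-memory database holding two hypothetical layouts of the same data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Deep design: one table per class in an inheritance chain (assumed mapping).
CREATE TABLE coordinate (id INTEGER PRIMARY KEY, frame TEXT);
CREATE TABLE point (
    id INTEGER PRIMARY KEY REFERENCES coordinate(id),
    ra_deg REAL,
    dec_deg REAL
);
-- Flat design: a single table with no inheritance.
CREATE TABLE flat_obs (
    id INTEGER PRIMARY KEY, frame TEXT, ra_deg REAL, dec_deg REAL
);
INSERT INTO coordinate VALUES (1, 'ICRS');
INSERT INTO point VALUES (1, 10.5, -20.25);
INSERT INTO flat_obs VALUES (1, 'ICRS', 10.5, -20.25);
""")

# The deep design needs one join per inheritance level to rebuild an object.
joined_query = """
SELECT c.frame, p.ra_deg, p.dec_deg
FROM coordinate AS c
JOIN point AS p ON p.id = c.id
WHERE p.ra_deg > 10
"""

# The flat design answers the same discovery question from one table.
flat_query = "SELECT frame, ra_deg, dec_deg FROM flat_obs WHERE ra_deg > 10"

joined_rows = conn.execute(joined_query).fetchall()
flat_rows = conn.execute(flat_query).fetchall()
print(joined_rows == flat_rows)
```

Both queries return the same row, but the first grows by one join for every additional level of inheritance, which is the cost a discovery-oriented design tries to avoid.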

Data models that are designed for data labelling might be used most often in combination with MIVOT to create a "semantic layer" that labels data in a way that is more meaningful to users than the raw data labels, but that is still machine readable.

Data models that are designed for data discovery might be used most often in combination with TAP to create a "data discovery layer" that can be quite complex. The vo-dml tools can generate a TAP schema from the model, which can then be used to create a TAP service that allows users to query for data that matches the model.

Data models that are designed for data serialization might be used in combination with the various serializations that the vo-dml tools can generate to create a "data serialization layer" for transporting data between different systems and services. The vo-dml tools can generate an OpenAPI schema from the model, which can be used to create a REST API that allows users to query for data that matches the model, as well as JSON and XML schema that can be used to serialize data according to the model.

Testing Serialization

VO-DML was designed as a modelling language to promote interoperability of textual representations of instances of models. Therefore, when designing a model, it is important to make sure that the serialization "looks good", i.e. is reasonably "human understandable", in addition to meeting the primary aim of being machine readable, which should be guaranteed when using schema generated by the tools.

The Java runtime has some functionality for round-trip testing the various serializations, which can be a good first-level test of whether your model is a "good design".
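The idea behind a round-trip test can be sketched independently of the tooling. The example below does not use the vo-dml Java runtime or its API. It is a minimal illustration with the Python standard library, and the `Observation` and `Target` classes, their fields, and the helper functions are all hypothetical stand-ins for classes that would be generated from a model.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Target:
    # Hypothetical model class; not part of any IVOA model.
    name: str
    ra_deg: float
    dec_deg: float


@dataclass
class Observation:
    # Hypothetical model class that composes a Target.
    obs_id: str
    target: Target


def to_json(obs: Observation) -> str:
    """Serialize an instance to indented JSON with a stable key order."""
    return json.dumps(asdict(obs), indent=2, sort_keys=True)


def from_json(text: str) -> Observation:
    """Rebuild an instance from the JSON produced by to_json."""
    raw = json.loads(text)
    return Observation(obs_id=raw["obs_id"], target=Target(**raw["target"]))


def roundtrips(obs: Observation) -> bool:
    """Return True if serializing then deserializing yields an equal instance."""
    return from_json(to_json(obs)) == obs


original = Observation("obs-001", Target("M31", 10.6847, 41.2687))
text = to_json(original)
restored = from_json(text)

print(text)                  # inspect by eye: is it "human understandable"?
print(roundtrips(original))  # machine check: nothing lost or altered
```

Printing the serialized text alongside the equality check covers both aspects: the output can be read by eye to judge whether it "looks good", while the comparison confirms that no information is lost in transit.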