SysMLv2 API vs. version-controlled textual notation
In recent months I have been asked the following question quite often:
"Why should I use the SysML v2 API & Services when I now have the textual notation, which I can simply version-control?"
I have given it some thought, and maybe this is helpful for some people. I'm also happy to discuss it further, although many points are heavily implementation-dependent, and we (at least I) haven't seen a commercial implementation of the API & Services yet.
Thinking it through, I realized there are use cases where I simply "do" something with the model, and others where I "want" or "need" to achieve something with it.
For version-controlled textual notation, I have git-based tools in mind. Sure, there are several options. So let's go into it:
Work alone:
If you work alone on your SysML v2 model, it makes almost no difference how you store and version-control your model. I see the version-controlled textual notation (dear lord, what a term; let's call it VCTN from now on) slightly ahead, because it's easy to achieve at almost no cost.
Work with other SysMLv2 experts:
Assuming you're working with other SysML v2 experts, it is most likely that all of you have a tool, and that the tool is capable of working with the API & Services. There are further reasons related to changes in the model (see below) that make me think the API is slightly ahead of VCTN, although the advantages of VCTN remain valid.
Work with other engineers:
If you're working not only with other SysML v2 experts but also with engineers from other disciplines, VCTN might be favorable. Why? I doubt that each and every engineering tool in the world will support SysML v2 in the near future. In these cases, the (non-systems) engineer must get the information manually from the repository. VCTN has the advantage that what gets stored is already a model notation; the user finds the same notation in the versioning tool as at the input and output of the repository. With the API & Services, the individual model elements get wrapped into a large JSON format, which brings us to the next point.
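To make this tangible, here is a minimal sketch of the difference: one line of textual notation versus the JSON object the API returns for the same element. The field names follow the style of the pilot implementation; commercial implementations may differ, and the UUIDs are made-up placeholders.

```python
import json

# Textual notation (one line a human reads directly in the repository):
#     attribute mass : Real = 42.0;
#
# The same element retrieved via the API & Services is a JSON object
# (field names illustrative, UUIDs are placeholders):
element = {
    "@type": "AttributeUsage",
    "@id": "00000000-0000-0000-0000-000000000001",  # stable element identity
    "name": "mass",
    "owner": {"@id": "00000000-0000-0000-0000-000000000002"},
}

print(json.dumps(element, indent=2))
```

A non-systems engineer browsing the repository sees the compact notation in the VCTN case, but would have to read (or tool-process) objects like the one above in the API case.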
Branching / merging:
Branching and merging are possible in both variants. Maybe I see a small advantage for VCTN, because merge conflicts can be resolved directly in the version-control tools.
Human readability:
If you want native human readability for the stored data, VCTN is your friend. But the transport format and the way data is accessed in the API & Services is not a bug, it's a feature. The API & Services were never designed to be human-accessible or human-readable; they are a machine-to-machine interface! It is also conceivable that commercial implementations of the API will offer endpoints that return the textual notation.
Downstream tool chain, using single values as input or output:
If your development process is already model-based in the sense that your tools can use SysML v2 elements as input or output, the API has huge advantages. Imagine a simulation tool that takes a float/real value as input. With the API you can address exactly this one element. In the VCTN case, you would have to retrieve the whole model, parse it, and hope your attribute wasn't renamed or moved (see also "Stable IDs").
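As a sketch of what "addressing exactly this one element" looks like: the API & Services REST binding organizes data as projects, commits, and elements, so a single element can be fetched by its ID. The base URL and IDs below are placeholders, and a real client would also handle authentication.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Placeholder repository host; replace with your implementation's base URL.
BASE = "https://sysml-repo.example.com"

def element_url(project_id: str, commit_id: str, element_id: str) -> str:
    """Build the URL that addresses exactly one model element in one commit,
    following the projects -> commits -> elements layout of the API spec."""
    return (f"{BASE}/projects/{quote(project_id)}"
            f"/commits/{quote(commit_id)}"
            f"/elements/{quote(element_id)}")

# A simulation tool would then fetch just the single attribute it needs:
# with urlopen(element_url(proj, commit, elem)) as resp:
#     attribute = json.load(resp)
```

Compare this single GET request with checking out, parsing, and searching an entire VCTN repository to find one value.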
Release processes:
Release processes, in the sense of creating a tag, can be achieved with both solutions. Maybe the VCTN tools offer more than the API standard, e.g. review tasks before a tag is created. Nonetheless, neither reaches the level that ALM/PLM solutions can offer, which is why I decided on only a "+".
Access control:
Access control can also be achieved with both solutions. I gave the API & Services a "++" because, at least in theory, access control is possible at the element level, not only at the file level as in VCTN.
The API & Services specification does not standardize an authentication mechanism, but I can't imagine any commercial implementation without one.
Stable IDs:
Maybe the most important difference, and one where the API is far ahead of VCTN. In the meta-model, each element is supposed to get a UUID; in fact, not only one but two UUIDs: one for the identity (stable across all versions of the element) and one for the specific version. Why? See the next examples.
Full traceability between commits:
Imagine a model version that is loaded into the modeler. Now the user moves a part into another package, renames the part, and saves (commits) the model. In the VCTN case, it is impossible for either a machine or a human to identify this change. It rather looks as if one part was deleted from the model and another one was added.
With the API, the identity UUID would remain, so that a machine or a human could track this change. Even if a modeling tool uses UUIDs when loading a textual-notation model, these UUIDs are randomly generated on load and not stored in the VCTN repository, so they are different the next time you load the model.
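The two-UUID idea from the scenario above can be sketched in a few lines. This models only the concept (a stable identity UUID plus a fresh version UUID per saved state); real implementations differ in detail.

```python
import uuid

class ElementVersion:
    """One saved state of a model element."""
    def __init__(self, identity_id: uuid.UUID, name: str, package: str):
        self.identity_id = identity_id   # stable across all versions
        self.version_id = uuid.uuid4()   # unique per saved state
        self.name = name
        self.package = package

# Commit 1: a part "pump" lives in package "Drivetrain".
identity = uuid.uuid4()
v1 = ElementVersion(identity, "pump", "Drivetrain")

# Commit 2: the user renames the part and moves it to another package.
v2 = ElementVersion(identity, "coolant_pump", "Cooling")

# A machine can still recognize both states as the same element,
# which a pure text diff of the notation cannot do:
same_element = v1.identity_id == v2.identity_id      # True
distinct_states = v1.version_id != v2.version_id     # True
```

In the VCTN case, only `name` and `package` survive in the repository, so the rename-plus-move is indistinguishable from a delete-plus-add.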
(Interdisciplinary) change processes:
The point above leads directly to another important aspect, if you're creating your SysML v2 model not only for documentation purposes: in most cases you want to use the information generated in the SysML v2 model downstream in your development processes, e.g. in discipline-specific engineering processes. Therefore, you want automated triggers that run when specific things have changed in your model.
Yes, it is possible with both, but at what effort with VCTN? Note, however, that the stand-alone API is only an enabler; the real value comes with integrations into ALM or PLM systems.
CI/CD like staging after committing:
A rather new thought is to handle model commits or tags in a CI/CD pipeline, performing different actions after a new model version is committed. Think, e.g., of model transformations, code generation, simulation tasks, etc.
In this case, the API & Services specify nothing of the kind. Nonetheless, the Services could be extended by the implementation, or by later versions of the standard, to achieve similar behavior.
For the VCTN case, this is often a standard offering of the tools (e.g. GitLab, GitHub, etc.). The big disadvantage is that, for software-defined behavior, I have to parse or even instantiate the textual notation to do something with it. Especially in those cases where full traceability is needed, the missing stable IDs hurt the process.
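As a sketch of why stable IDs matter for such a pipeline step: with identity UUIDs, classifying the changes between two commits becomes a simple set operation instead of a text diff. The element data below is illustrative.

```python
# Hypothetical element snapshots of two commits, keyed by identity UUID
# (shortened to "id-N" for readability).
old = {"id-1": {"name": "pump", "package": "Drivetrain"},
       "id-2": {"name": "mass", "package": "Drivetrain"}}
new = {"id-1": {"name": "coolant_pump", "package": "Cooling"},
       "id-3": {"name": "flow_rate", "package": "Cooling"}}

# Set arithmetic on the identity UUIDs classifies every change:
added = new.keys() - old.keys()
deleted = old.keys() - new.keys()
modified = {i for i in old.keys() & new.keys() if old[i] != new[i]}

# A pipeline trigger could now start, e.g., a simulation run only
# for the modified elements instead of re-processing the whole model.
print(added, deleted, modified)
```

Without stable IDs, the renamed and moved element "id-1" would instead appear as one deletion plus one addition, and the trigger logic would have to guess.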
In conclusion: there is no golden way. It depends, and it will depend heavily on the tool implementations we will see in the future. What I can say is that the more you want to use your model data in downstream processes, the more you should consider using the API rather than VCTN. Of course, mixed scenarios depending on the maturity level of your product/mission are conceivable.
If you want to discuss this further for your case and company, I'm offering consultancy in this area. Write me a message on LinkedIn or contact me at christian@mbse.consulting
One topic that doesn't seem to be explored here, but which I have grown to appreciate from v1, is the ability to make a change once and see that change propagate throughout my model. This won't happen in textual editors without augmentation by the tool. Maybe this has been solved by SW IDEs already and I am not aware. This is especially true as we reference names across projects/files.
Wonderful article, appreciate the analysis. Have you considered dependency and package management, in terms of API & Services vs. VCTN? It is still not very well defined in either approach, but I could imagine leveraging existing infrastructure (e.g. a v2-specific POM, a custom repo type for Nexus, etc.) for VCTN quite effectively.
Nice article. Working with just the textual notation has its limits, especially if you want to be able to capture the metadata related to changes made while creating and editing it. This doesn't happen when using the Jupyter pilot implementation. I wonder if the Eclipse pilot implementation will handle this. I do know that, working as an individual or with a small team, the Jupyter pilot implementation works well if the system model isn't too large. This approach breaks down with multiple modelers and multiple models. At this point, model version control needs to be built into the modeling environment to help streamline the process and reduce errors due to manual techniques.
Hmmm, graphs are hard to merge in their textual representation, even harder when they are split across multiple files... CI/CD-style triggers are key to supporting modern collaboration scenarios.
I'd like to add one more use case to the list: versioning together with other artifacts (everything-as-code). As I understood your comparison, the drawback of the textual notation is the missing unique ID that persists across refactoring. Having an _id attribute, which is created initially by your editor and used by the parser instead of generating its own IDs, could solve this. For connecting with other tools, spin up an API server that serves the text-based data and you're fine. The prototype implementation might be a good starting base for such an approach.