Model-Centric Engineering and The Rise of the Language Workbenches
In March 2015, I wrote an article for Xootic magazine (the magazine of the alumni association of the Software Technology PDEng program). Unfortunately, production of the magazine was halted and this particular edition never saw the light of day.
Since I see the things I wrote (somewhat simplified, but in essence not wrong) increasingly becoming reality, I decided to post the original text of the article. So here goes:
The Rise of the Language Workbenches
Nowadays, more and more people no longer talk about coding, but about modeling. People may write models at higher levels of abstraction instead of, or in addition to, computer programs. So-called Domain Specific Languages (DSLs) are making a foray into mainstream software and systems engineering. And while we almost never stop to wonder what we mean when we say "model", or what the difference between programming and modeling is, theoretical foundations for an entirely new paradigm called Language Oriented Programming (and with them, a new class of development tools called Language Workbenches) are crystallizing in the field of Model-Centric Engineering. Before we describe some basic terminology around Model-Centric Engineering and Language Workbenches, let's look at a utopia-like situation in engineering that may shed some light on why you may want to care.
Imagine that all the documents we edit and review are completely coherent, consistent, and coupled with each other. The documents are structured in such a way that their contents (text, diagrams, graphs, tables, etc.), which represent requirements, design, or detailed design, can easily be cross-checked for consistency and machine-processed for various purposes, such as:
- analyzing the contents for total system aspects or quality attributes, e.g. productivity, costs, or security
- aggregating information to configure or construct (part of) the system we are building or maintaining
- exporting relevant details for auditing, e.g. by the FDA or the FCC
There is no clear boundary between code and specification document, because every new piece of a system design document is accompanied by a way to transform it into configuration or construction of part of the system itself, as well as a way to validate it against the system as actually built. Architects, designers, and developers can query the rich interrelations between elements in all system design documents to extract relevant information for decisions (such as an overview of the current system architecture). They don't need to write documents and then hope that the realization will conform to their design. Instead, they can focus on how to deal with changes in the system or company environment and in the requirements on the system. This is not limited to the software discipline, but extends to all disciplines that have to work together to build the system (physicists, mecha(tro)nics engineers, chemists, psychologists, software engineers, etc.).
Now substitute the word "model" for the words "document" and "code". That is what Model-Centric Engineering should be about, in my view.
Coming back to terminology: there are two ways in which the term modeling can be understood [1]: descriptive and prescriptive. A descriptive model represents an existing system, abstracting away some aspects and emphasizing others. Typically, such models are used for discussion, communication, and analysis. A prescriptive model is one that can be used to (automatically) construct an envisioned system. It must be much more rigorous (sometimes even formal), complete, and consistent. A development approach centered on descriptive models is something I would call Model Based (Software or Systems) Engineering, while working with prescriptive models is in my opinion the essence of Model Driven (Software or Systems) Engineering. In practice, a combination of both approaches is used.
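To make the prescriptive case concrete, here is a minimal sketch in Python, not taken from the original article. It assumes a model is a plain dictionary describing a record type; the names Sensor, validate, and generate are hypothetical and chosen only for illustration. The model is first checked for completeness and consistency, and only then used to construct (here: generate and load) a piece of the system.

```python
# Hypothetical prescriptive model of a record type (illustrative only).
model = {
    "name": "Sensor",
    "fields": [
        {"name": "id", "type": "int"},
        {"name": "label", "type": "str"},
    ],
}

ALLOWED_TYPES = {"int", "float", "str", "bool"}


def validate(model):
    """Return a list of problems; an empty list means the model is fit for generation."""
    problems = []
    if not str(model.get("name", "")).isidentifier():
        problems.append("model needs a valid name")
    fields = model.get("fields", [])
    if not fields:
        problems.append("model needs at least one field")  # completeness
    seen = set()
    for field in fields:
        name, type_ = str(field.get("name", "")), field.get("type")
        if not name.isidentifier():
            problems.append(f"invalid field name: {name!r}")
        if name in seen:
            problems.append(f"duplicate field: {name}")  # consistency
        seen.add(name)
        if type_ not in ALLOWED_TYPES:
            problems.append(f"field {name}: unknown type {type_!r}")
    return problems


def generate(model):
    """Model-to-text transformation: turn a valid model into Python source code."""
    problems = validate(model)
    if problems:
        raise ValueError("; ".join(problems))
    params = ", ".join(f"{f['name']}: {f['type']}" for f in model["fields"])
    lines = [f"class {model['name']}:", f"    def __init__(self, {params}):"]
    lines += [f"        self.{f['name']} = {f['name']}" for f in model["fields"]]
    return "\n".join(lines) + "\n"


# Construct part of the "system" directly from the model.
source = generate(model)
namespace = {}
exec(source, namespace)
sensor = namespace["Sensor"](id=1, label="inlet")
```

A descriptive model could get away with a vague or partial field list, since it only needs to support discussion. The prescriptive one cannot, because the generator refuses to run until validate reports no problems.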
To be able to create models for a certain domain (be it a branch of data science, physics, insurance, printing, microscopy, etc.), we need a language (sometimes also called a metamodel) in which we can describe these models. We know this from software development, where we have many General Purpose Programming Languages (GPLs). They are all Turing complete, but each is optimized for the tasks in its own domain of programming. Thus, in C one can directly influence memory layout (for communication with low-level memory-mapped devices) and use pointers (for creating efficient data structures). In Ruby, on the other hand, closures can be used to implement postponed behavior (useful for, e.g., asynchronous web applications), and powerful string manipulation features exist (to handle input received from a website). The more specific a programming task (or modeling task, since the terms prescriptive model and program are pretty much interchangeable) becomes, the more reason there is for a specialized language for it. An example of such specialization is querying relational databases, where tables, rows, columns, and joins are the core abstractions. SQL is a domain-specific language (DSL) that accommodates these abstractions.
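As a small illustration of the SQL point, the sketch below answers the same question twice in Python: once through SQL (using the standard sqlite3 module with an in-memory database) and once with hand-written loops. The tables, column names, and sample data are made up for this example.

```python
import sqlite3

# Hypothetical sample data: printers and the print jobs they processed.
printers = [("alpha", 40), ("beta", 75), ("gamma", 120)]   # (name, pages per minute)
jobs = [("alpha", 200), ("gamma", 600), ("gamma", 300)]    # (printer name, pages)


def pages_per_printer_sql():
    """Total pages per printer, expressed in the domain's own abstractions."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE printer (name TEXT, speed INTEGER)")
    conn.execute("CREATE TABLE job (printer TEXT, pages INTEGER)")
    conn.executemany("INSERT INTO printer VALUES (?, ?)", printers)
    conn.executemany("INSERT INTO job VALUES (?, ?)", jobs)
    rows = conn.execute(
        "SELECT p.name, SUM(j.pages) "
        "FROM printer p JOIN job j ON j.printer = p.name "
        "GROUP BY p.name ORDER BY p.name"
    ).fetchall()
    conn.close()
    return rows


def pages_per_printer_loops():
    """The same result in general-purpose code: the join and grouping are spelled out by hand."""
    totals = {}
    for name, _speed in printers:
        for printer, pages in jobs:
            if printer == name:
                totals[name] = totals.get(name, 0) + pages
    return sorted(totals.items())
```

Both functions return the same rows, but only the SQL version states the intent (join, group, sum) in terms of tables and columns. How the result is computed is left to the database engine.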
So a DSL is simply a language that is optimized for a given class of problems, called a domain. It is based on abstractions that are closely aligned with the domain for which the language is built. Specialized languages also come with a syntax suitable for expressing these abstractions concisely. Such a syntax can be textual, tabular, symbolic (mathematical), or graphical. The concept of a DSL is not new at all; one of the earlier DSLs is awk (1977). What is new, however, is the huge decrease in the effort needed to implement DSLs and to integrate them effectively so that they can interact with each other. In the past, most DSLs that could be created within a feasible budget were implemented as so-called internal DSLs inside a host language (e.g. using preprocessor macros in C, templates in C++, or metaprogramming techniques in dynamic languages like Ruby). However, internal DSLs have a very limited notation (restricted to a subset of the host language) and effectively no IDE support (so no error reporting at the abstraction level of the DSL, no dedicated editor, and usually no type safety). Building an external DSL (which can exist outside a host language) required writing parsers for textual languages and graphical editors for graphical DSLs, which is simply too much development effort in most specialized contexts.
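The difference between internal and external DSLs can be sketched in a few lines of Python. The example is hypothetical and not from the original article: a tiny state-machine language with made-up state and event names. The internal variant uses method chaining inside the host language. The external variant has its own textual notation, which costs us a (very small) hand-written parser but lets us report errors in terms of the DSL itself.

```python
import re


class Machine:
    """A tiny state-machine model: maps (state, event) to the next state."""

    def __init__(self):
        self.transitions = {}

    def transition(self, source, target, on):
        self.transitions[(source, on)] = target
        return self  # returning self enables the chained, DSL-like style below

    def run(self, start, events):
        state = start
        for event in events:
            state = self.transitions[(state, event)]
        return state


# Internal DSL: the notation is limited to what Python's syntax allows.
internal = (
    Machine()
    .transition("idle", "heating", on="start")
    .transition("heating", "idle", on="stop")
)

# External DSL: a notation of our own, one rule per line.
RULE = re.compile(r"^(\w+)\s*->\s*(\w+)\s+on\s+(\w+)$")


def parse(text):
    """Parse '<state> -> <state> on <event>' rules into a Machine."""
    machine = Machine()
    for number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        match = RULE.match(line)
        if match is None:
            # Error reported at the abstraction level of the DSL, not of Python.
            raise SyntaxError(
                f"line {number}: expected '<state> -> <state> on <event>', got {line!r}"
            )
        source, target, event = match.groups()
        machine.transition(source, target, on=event)
    return machine


external = parse("""
idle -> heating on start
heating -> idle on stop
""")
```

Both variants build the same model. Even for this toy language the external variant needs a grammar, a parser, and error handling. Editor support, type checking, and integration with other languages would all come on top of that.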
In 2005, Martin Fowler coined the term Language Workbench [2]: an application in which one can define languages and express models in these languages (just as classes are first-class citizens in object orientation, languages are first-class citizens in language orientation). Nowadays, such tools are no longer just prototypes but real, usable products, with varying feature coverage (textual only, graphical, integration between DSLs, analysis of DSLs, and transformation of models to models and of models to source code) [3]. While tools continue to provide more features for DSL realization, books like Markus Völter's DSL Engineering [1] and Martin Fowler's Domain-Specific Languages [4] describe the concepts behind language orientation.
And while we already know that software is eating the world [5], we will see the generic "computer programmer" developer gradually disappear and make way for more specialized "domain user" or "domain expert" developers. Some of these will be software engineers, while others will be biologists, physicists, insurance experts, etc. And, of course, there will also be language developers.
References:
[1] DSL Book by Markus Völter - http://voelter.de/dslbook/markusvoelter-dslengineering-1.0.pdf
[2] http://martinfowler.com/articles/languageWorkbench.html
[3] The State of the Art in Language Workbenches: Conclusions from the Language Workbench Challenge - http://citeseerx.ist.psu.edu/viewdoc/citations;jsessionid=07D20A3994EA996DD08C5A92D586663A?doi=10.1.1.725.1088
[4] http://martinfowler.com/books/dsl.html
[5] http://www.forbes.com/sites/stevedenning/2014/04/11/why-software-is-eating-the-world/