DSL development: 7 recommendations for Domain Specific Language design based on Domain-Driven Design

The term Domain-Specific Language (DSL) is heard a lot nowadays. A DSL is a language developed to address the needs of a given domain. This domain can be a problem domain (e.g. insurance, healthcare, transportation) or a system aspect (e.g. data, presentation, business logic, workflow). The idea is to have a language with a limited set of concepts, all focused on a specific domain. This leads to higher-level languages that improve developer productivity and communication with domain experts. In a lot of cases it is even possible to let domain experts use the DSL and develop applications themselves.

The question for this article is: how do you develop a Domain-Specific Language? I'll first explain the DSL lifecycle, consisting of the phases decision, analysis, design, implementation, deployment, and maintenance. Afterwards I'll give 7 recommendations for DSL development based on my experiences with developing non-trivial DSLs.

The DSL lifecycle

Let's look at each phase in more detail.

1. Decision

The development of a DSL starts with the decision to develop a DSL, to reuse an existing one, or to use a general-purpose language (GPL). If a domain is very fresh and little knowledge is available, it doesn't make sense to start developing a DSL. In order to determine the basic concepts of the field, the regular software engineering process should be applied first and a code base supported with libraries should be developed [2]. In other words: if you have never developed an application for a certain domain by hand and you have no existing code base, it isn't smart to start implementing a DSL and its associated code generators or execution engine.

The situation differs of course for non-executable DSLs. However, just as you need experience with existing code for executable DSLs, you need a deep understanding of the domain you are modeling for non-executable DSLs.

2. Analysis

In the analysis phase the problem domain is identified and domain knowledge is gathered. The output of formal domain analysis is a domain model consisting of [1]:
The information gathered in this phase can be used to develop the actual DSL. Variabilities indicate which elements should be specified in the DSL, while commonalities are used to define the execution engine or domain framework. If you, for example, analyze a couple of existing code bases in a certain domain, you can split the elements of this code in two parts: the parts that differ and the parts that are the same for each code base. The static parts (the commonalities) can, depending on your implementation approach, be part of the execution engine interpreting the DSL or can be put in a domain framework which is used by the generated code. The parts that differ (the variabilities) should be specified in the DSL; these are the parts which a user of the DSL needs to 'configure'.

Eelco Visser [2] recommends an inductive approach which, in contrast to designing the complete DSL before implementation, incrementally introduces abstractions that capture a set of common programming patterns in software development for a particular domain. He also states that developing the DSL in iterations can mitigate the risk of failure. Instead of a big project that produces a functional DSL only at the end, an iterative process produces useful DSLs for sub-domains early on.

In the second part of this article I will give 7 additional recommendations for the analysis and design phases of a DSL, based on my own experiences.

3. Design

Approaches to DSL design can be characterized along two orthogonal dimensions: the relationship between the DSL and existing languages, and the formal nature of the design description [1]. A DSL can be designed from scratch, or it can be easier to base it on an existing language. Mernik et al. [1] identify three different patterns of design based on existing languages:
Besides the relationship with existing languages, the formal nature of the design description can range between:
It is important to decide which approach to take. However, it is perhaps even more important to keep this lesson in mind [3]:
Lesson T2: You are almost never designing a programming language.
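The split between variabilities and commonalities described in the analysis phase can be made concrete with a small sketch. It is only an illustration under stated assumptions: the pricing domain, the `DiscountRule` concept, and the `price_order` function are hypothetical and not taken from any of the cited work. The rules play the role of the model a DSL user would 'configure'; the function plays the role of the execution engine that stays the same for every application.

```python
from dataclasses import dataclass


# Variability: the part that differs per application and that a DSL user
# would specify. DiscountRule is a hypothetical concept for illustration.
@dataclass(frozen=True)
class DiscountRule:
    min_quantity: int   # rule applies from this quantity upwards
    percentage: float   # discount in percent


# Commonality: the behaviour shared by every application in the domain.
# In an interpreter-based implementation it lives in the engine.
def price_order(quantity, unit_price, rules):
    """Price an order, applying the best discount among the matching rules."""
    total = quantity * unit_price
    applicable = [r.percentage for r in rules if quantity >= r.min_quantity]
    best = max(applicable, default=0.0)
    return total * (1 - best / 100)


# A 'model' expressed in the (here purely in-memory) DSL.
EXAMPLE_RULES = [DiscountRule(5, 10.0), DiscountRule(20, 25.0)]
```

Changing the behaviour of an application then means changing `EXAMPLE_RULES`, not the engine, which is the property the commonality/variability analysis aims for.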
4. Implementation

For executable DSLs the most suitable implementation approach should be chosen. Mernik et al. [1] identify seven different implementation patterns, all with different characteristics:
Because the different approaches can make a big difference in the total effort to be invested in DSL development, the choice for a particular approach is very important.

5. Deployment

In the deployment phase the DSLs and the applications constructed with them are used. Developers and/or domain experts use the DSLs to specify models. These models are implemented with one of the implementation patterns presented in the previous section (e.g. the models are interpreted by an engine). Such an implementation results in working software which is used by end-users.

6. Maintenance

Because domain experts themselves can understand, validate, and modify the software by adapting the models expressed in DSLs, modifications are easier to make and their impact is easier to understand. However, more substantial changes in the software may involve altering the DSL implementation. So, like any other piece of software, a DSL will evolve over time. Therefore having a DSL migration strategy is very important. Besides migration strategies, I have two recommendations which alleviate the maintenance risks of DSLs:
Seven Domain-Driven Design based recommendations for DSL Development
Before going into the details of DSL design, let's try to understand the context of these experiences. First of all, they are focused on creating multiple connected DSLs, i.e. you can create models expressed with different DSLs that refer to each other. For example, in a Form model you can refer to elements from your Data model. More specifically, we are talking about a set of DSLs covering all system aspects of a Service-Oriented Business Application.

Another important point about the DSLs we're talking about is that they are all aimed at non-programmer domain experts. In most cases this means domain experts can create models expressed in these DSLs; in a few cases it means they can at least read them. This of course always requires finding a balance between flexibility and complexity.

Based on my experiences, influenced by the concepts of Domain-Driven Design [4], I have the following 7 recommendations for DSL development:

1. Capture domain knowledge in a metamodel

If you talk about models for DSLs you will stumble upon the term metamodel. For a lot of people this sounds scary enough to stop reading. However, it's just a model of the abstract structure of the language. In other words: a metamodel models the concepts of a language and their relationships, just as you model the concepts 'Order', 'Product', and 'Customer' if you are building software like an order entry portal.

A metamodel is essential for constructing a DSL. It captures the knowledge of the domain the DSL is aimed at. The model reflects how the team developing the DSL structures the domain knowledge and what they see as the most important elements. The binding of model and implementation ensures that the experiences with earlier versions of the DSL can be used as feedback in the modeling process.

2. Communicate using a ubiquitous language

The metamodel is also important for communication purposes.
When designing a DSL a lot of communication is needed between the users of the language (domain experts) and the developers. The metamodel is the backbone of a language used by all team members. Because the model is bound to the implementation, developers can talk about the DSL in this language. They can communicate with domain experts without translation.

You should play with the model when talking about the DSL. If you can't talk about a scenario in terms of the model, the model should be adapted until you can. If the domain experts don't understand the model, there is something wrong with the model. Domain experts should object to terms or structures that are awkward or inadequate to convey domain understanding. Developers should watch for ambiguity or inconsistency that will trip up the design.

3. Let the metamodel drive the implementation

Don't forget that a language definition is more than just a metamodel (abstract syntax). A language definition also contains a concrete syntax and semantics. When designing and implementing a DSL, the concrete syntax is captured in the solution workbench, i.e. an environment in which you can specify models using the DSL with either a textual or a graphical concrete syntax. The semantics of the language are captured in the transformation rules or model interpreter (depending on the implementation pattern used, see above).

It is important that the metamodel drives the implementation of the solution workbench and interpreter, i.e. the metamodel should drive the implementation of the DSL. If the implementation doesn't map to the metamodel, the metamodel is of little value. At the same time, complex mappings between metamodel and implementation are difficult to understand and in practice difficult to maintain as the design changes. A deadly gap between metamodel and implementation opens, so that insight gained in each of those activities does not feed into the other.

Therefore, design the metamodel in such a way that it reflects the implementation in a very literal way. However, demand at the same time that a single metamodel also serves the purpose of supporting the ubiquitous language. The implementation must become an expression of the metamodel, so a change to the code may be a change to the metamodel and the other way around. Tying the DSL implementation and metamodel together in such a way usually requires DSL tools that let you generate big parts of the DSL implementation from the metamodel. Figure 1 exhibits such a scenario in which part of the solution workbench and interpreter are generated from the metamodel.
Figure 1 - Metamodel-driven DSL implementation
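As a rough illustration of a metamodel driving part of the implementation, the sketch below derives model validation directly from a metamodel instead of hand-coding it per concept. Everything in it is an assumption made for the example: the `Concept` class, the two concepts `Entity` and `Form` (loosely following the Form and Data models mentioned earlier), and the dictionary representation of models are hypothetical and do not describe any particular DSL tool.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One concept of the language: its attributes and its references."""
    name: str
    attributes: dict                                  # attribute name -> Python type
    references: dict = field(default_factory=dict)    # reference name -> target concept name


# Hypothetical metamodel: a Form refers to an Entity from the data model.
METAMODEL = {
    "Entity": Concept("Entity", {"name": str}),
    "Form": Concept("Form", {"title": str}, {"entity": "Entity"}),
}


def validate(model, metamodel=METAMODEL):
    """Check a model (element id -> element dict) against the metamodel.

    Returns a list of error messages; an empty list means the model conforms.
    """
    errors = []
    for elem_id, elem in model.items():
        concept = metamodel.get(elem.get("concept"))
        if concept is None:
            errors.append(f"{elem_id}: unknown concept {elem.get('concept')!r}")
            continue
        # Attribute checks are derived from the metamodel, not hand-written.
        for attr, attr_type in concept.attributes.items():
            if not isinstance(elem.get(attr), attr_type):
                errors.append(f"{elem_id}: attribute {attr!r} must be {attr_type.__name__}")
        # References must point to an existing element of the right concept.
        for ref, target in concept.references.items():
            target_elem = model.get(elem.get(ref))
            if target_elem is None or target_elem.get("concept") != target:
                errors.append(f"{elem_id}: reference {ref!r} must point to a {target}")
    return errors


EXAMPLE_MODEL = {
    "customer": {"concept": "Entity", "name": "Customer"},
    "customer_form": {"concept": "Form", "title": "Edit customer", "entity": "customer"},
}
```

Adding an attribute or a concept to `METAMODEL` changes what `validate` accepts without touching the function itself, which is a small-scale version of keeping implementation and metamodel tied together.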
4. Isolate the domain

As said before, DSLs will evolve over time. In the previous recommendation we've seen that it is important to tie model and implementation together: the model should drive the implementation. However, to do so you need to isolate the domain. If the domain code representing the metamodel is diffused through the rest of the code, it is very difficult to make changes to it. Changes in the GUI of your modeling environment or the infrastructure of your interpreter can actually change your domain code.

In principle the recommendations for 'normal' software also hold for DSL implementations. Divide your code into layers and concentrate all the code related to the domain model in one layer which is isolated from GUI and infrastructure code. The domain objects should be free of the responsibility of displaying themselves, storing themselves, managing application tasks, etc. They should focus on expressing the domain model.

In the previous recommendation I stated that the model should drive the implementation, and I meant that as literally as possible. This is possible if you isolate the domain! Using the Generation Gap Pattern you can generate all the domain code while isolating it from your other code. So, isolate the domain to ensure that the model can evolve to be rich enough to express the domain and to keep track of the changes in that domain.

5. Refactor continuously

Along the same lines, you should refactor all the time. You should refactor while you're knowledge crunching. You should refactor while you're communicating using the metamodel. You should refactor while you're busy implementing the DSL. You should refactor while you're generating code from your metamodel. To say it with Eric Evans [4], you should especially refactor if:
It goes without saying that such an approach needs the close involvement of all team members, including the domain experts.

6. Maintain metamodel integrity

To effectively abstract a complex domain with domain-specific models, you need more than one DSL. In complex projects multiple DSLs are usually necessary in order to cope with different concerns. In other words: multiple domain-specific models (DSMs), specified in different DSLs, are needed to accurately abstract complex systems.

Total unification of the metamodel (remember: the metamodel describes the concepts of the DSL we are designing) for a large domain will not be feasible or cost-effective. The most important reason for this is that attempting to satisfy everyone with a single metamodel (and thus a single language) will lead to complex options that make the language difficult to use. Avoiding that kind of complexity is the reason we are designing a DSL at all! Different domain experts will need their own domain-specific language to define their aspect of the system.

So, we need multiple domain-specific languages, hence we also need multiple metamodels. However, the boundaries and relationships between different metamodels need to be marked consciously. Some recommendations on multi-DSL development:
7. Use a people-oriented approach

Executing a DSL implementation process, especially in the way recommended in the previous points, is not easy. It requires an effective team of developers and domain experts. My last, and most important, recommendation is to use a people-first approach in DSL development.

DSL development is highly creative and professional work. Developers need to make the technical decisions; they are the best people to decide how to conduct their technical work. Domain experts live the domain, hence they are best suited to decide on the applicability of the concepts of the language. Although I strongly recommend the way of working reflected in the previous six points, the team has to decide on the process. Accepting a process requires commitment, and as such needs the active involvement of the whole team.

Key takeaways for DSL development
----------------------
[1] Marjan Mernik, Jan Heering, and Anthony M. Sloane. When and how to develop domain-specific languages. ACM Comput. Surv., 37(4):316-344, 2005.
[2] Eelco Visser. WebDSL: A case study in domain-specific language engineering. In R. Lämmel, J. Saraiva, and J. Visser, editors, Generative and Transformational Techniques in Software Engineering (GTTSE 2007), Lecture Notes in Computer Science. Springer, 2008.
[3] D. S. Wile. Lessons learned from real DSL experiments. Sci. Comput. Program., 51:265-290, 2004.
[4] Eric Evans. Domain-Driven Design: Tackling Complexity in the Heart of Software. Addison-Wesley, 2004.

Photos by Gail S and Hélio Costa