On the Quality Benefits of Formal Domain Specific Languages

This entry is part of the series:
Software Quality driven by Formal DSLs

One of mgm's strengths is a dedicated focus on software quality, especially in portal technology for applications with high safety and reliability demands. In the first blog within this series, “Using Domain Specific Languages to Implement Interactive Frontends”, we described an approach that uses a family of specification languages (DSLs) at the customer level to specify valid inputs and frontend computations for form-based interactive or batch systems. Let us continue and focus on the quality benefits of this approach.

The presented specification language family supports modeling of frontends, such as form-based user interfaces, or field-related batch programs that e.g. process XML input.

In a nutshell, the approach is characterized by formal specifications of well-defined and valid field inputs, of the computation of values for calculated fields, and of relationships between these fields. For an example, see the summary at the end of the part “Using Domain Specific Languages to Implement Interactive Frontends”.

The specification means are as follows:

  • All fields are typed, e.g. as numbers, currencies, percentages, plain text, or enumerations. For some of these types, additional constraints can be expressed, such as minimum and maximum values, field lengths, or regular expression restrictions.
  • Calculated fields have functional rules that describe how their values depend on the values of other fields.
  • Constraint rules describe cross-field relationships. Error messages describing the non-compliance of inputs with these rules are part of the specification.

Both constraint rules and functional rules can, aside from referring to field values, express different behavior depending on whether values for input fields are present or missing.

Such specifications automatically determine the behavior of the specified system:

  • Field values are computed according to functional rules.
  • Constraints on and between fields lead to automatic validation:
      • Type errors are flagged with corresponding messages.
      • For constraint rules, the specified error messages are issued if the constraints do not hold.
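To make the idea concrete, here is a minimal Python sketch of such a specification and the validation it implies. It is not the DSL itself: the classes `NumberField` and `ConstraintRule`, the range 0 to 100, and the reading of the VAT rule (the value must be zero, normal, or half normal) are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Record = Dict[str, Optional[float]]


@dataclass
class NumberField:
    """A typed field with a minimum and maximum value (hypothetical model)."""
    name: str
    minimum: float
    maximum: float


@dataclass
class ConstraintRule:
    """A cross-field rule together with its specified error message."""
    holds: Callable[[Record], bool]
    message: str


def vat_rule_holds(values: Record) -> bool:
    alt, normal = values.get("AlternativeVat"), values.get("NormalVat")
    if alt is None or normal is None:
        return True  # nothing to check while a value is missing
    return alt in (0, normal, normal / 2)


FIELDS = [NumberField("NormalVat", 0, 100), NumberField("AlternativeVat", 0, 100)]
RULES = [ConstraintRule(vat_rule_holds, "VAT can only be normal, half normal or zero")]


def validate(values: Record) -> List[str]:
    """Return all error messages for one input record (empty list = valid)."""
    errors = []
    for field in FIELDS:
        value = values.get(field.name)
        if value is not None and not field.minimum <= value <= field.maximum:
            errors.append(f"{field.name} must be between {field.minimum} and {field.maximum}")
    if not errors:  # cross-field rules are only checked on well-typed input
        errors += [rule.message for rule in RULES if not rule.holds(values)]
    return errors
```

Calling `validate({"NormalVat": 20, "AlternativeVat": 7})` returns the error message attached to the constraint rule, while a record with `AlternativeVat` set to 10 passes.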

The specification languages are tailored to the application domain (DSL). Therefore specifications can be (and indeed are) written by customers rather than by programmers.

Practical Quality Benefits for Specifications

Let us discuss how the formal language approach supports development, quality, and testing.

Avoiding implementation errors by code generation

Similarly to programming languages, specification languages substantially simplify the software development process. Code generated from specifications eliminates a great deal of complexity and leads to less error-prone systems. Here’s how code can be generated:

  • Functional rules are simply compiled to fully operational code.
  • Field type definitions and cross-field constraints are translated to validation code performing checks and delivering adequate error messages.

At mgm tp, code generators have been implemented for a variety of languages, including Java, C++, and JavaScript, and for different runtime environments. The generator approach yields a substantial quality gain: once the code generators are well tested, there is little need to test the validation software again for each case, because the generated code conforms to the specification.
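The following toy generator illustrates the principle in Python. It is a sketch under assumptions, not the mgm generator: the dictionary format of `FIELD_SPECS` and the function `generate_validator` are invented here, and only simple range checks are generated.

```python
FIELD_SPECS = {
    "NormalVat": {"min": 0, "max": 100},
    "Quantity": {"min": 1, "max": 999},
}


def generate_validator(specs):
    """Emit Python source code for a range validator of the given fields."""
    lines = ["def validate(values):", "    errors = []"]
    for name, spec in specs.items():
        low, high = spec["min"], spec["max"]
        message = f"{name} must be between {low} and {high}"
        lines.append(f"    value = values.get({name!r})")
        lines.append(f"    if value is not None and not ({low} <= value <= {high}):")
        lines.append(f"        errors.append({message!r})")
    lines.append("    return errors")
    return "\n".join(lines) + "\n"


source = generate_validator(FIELD_SPECS)
namespace = {}
exec(source, namespace)  # compile the generated code into a callable
validate = namespace["validate"]
```

Because every validator comes from the same generator, testing the generator once covers the structure of all generated checks; only the specification changes from case to case.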

Generating Test Data from Specifications

Formal specifications substantially facilitate and improve the quality-assurance process, especially testing. Since constraints and dependencies define the set of correct inputs, they are an ideal prerequisite for the generation of test suites. This includes both consistent data (valid inputs) and inconsistent data (deliberately invalid inputs). Generation of test data is discussed in more detail later in this blog series.

Finding Specification Errors

Translating specifications to code, and especially to test data, improves the quality of the specifications because it exposes flaws and thus gives early feedback to the specification writers. Inconsistent and sometimes even incomplete specifications can easily be discovered:

  • Inconsistent specifications lead to contradictions and thus inhibit valid test data generation. This can be detected already at test data generation time.
  • Incomplete specifications give too much freedom to test data generation. Since “too much freedom” cannot be decided upon statically and automatically, this will show up only later in the process. In most cases such specification errors are reported in the testing phase.
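The first case can be illustrated with a small Python sketch. A plain exhaustive search over tiny candidate domains stands in for whatever solving technique a real generator uses; the field names, domains, and rules are hypothetical.

```python
from itertools import product


def find_valid_record(domains, constraints):
    """Return the first record satisfying all constraints, or None."""
    names = list(domains)
    for combination in product(*(domains[name] for name in names)):
        record = dict(zip(names, combination))
        if all(constraint(record) for constraint in constraints):
            return record
    return None


domains = {"Quantity": range(0, 5), "Discount": [0, 10, 50]}
consistent = [lambda r: r["Quantity"] >= 1, lambda r: r["Discount"] <= 10]
# Adding a rule that contradicts the first one makes the rule set unsatisfiable.
contradictory = consistent + [lambda r: r["Quantity"] < 1]
```

For the consistent rule set a valid record is found; for the contradictory one `find_valid_record` returns `None`, which is exactly the signal that can be reported to the specification writer at generation time.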

Improving the Processes

In addition to the above technical benefits, this formalized approach improves the requirements, development, and quality processes as follows:

  • Formalization of requirements in the user’s domain enforces the analysis of well-known and less well-known informal requirements. In other words, the likelihood that the system does what the customer explicitly or implicitly expected increases.
  • The differentiation between input validity and business logic modularizes the requirement process and thus substantially improves the latter.
  • The early availability of validity checks allows early feedback to specification writers.
  • Early availability of code and test data generation improves testing for front- and backends.

Quality Assurance Methods for Formal Specifications

The aspects mentioned above already substantially improve software quality without any dedicated quality measure. Still, quality assurance, especially testing, is needed to deliver software with the behavior the customer expects. Fortunately, even testing becomes simpler, more reliable, more adaptable, and measurable thanks to the use of formal specifications. How this is accomplished is described in the rest of this blog.

Formal specifications facilitate well-defined structuring of tests and a focus on more interesting testing aspects. To explain these aspects, we refer to typical system aspects (for both single-tier and multi-tier architectures) in the following table:

| Aspect | How is it done in our framework? |
|---|---|
| 1.1 Validation of inputs, valid cases | By generated software that has to be called by the application. After the check, the input data is known to be consistent. |
| 1.2 Validation of inputs, invalid cases | By generated software that has to be called by the application. The input data has to be flagged by the system with differentiated error messages. |
| 2 Frontend computations | By generated software that has to be called by the application. |
| 3 Generation of artifacts for backend processing | By generated software that has to be called by a web application. This is not discussed further in this blog. |
| 4 Backend behavior | Implemented by other means. Subject to extensive testing triggered by the frontend using valid input data controlled by aspects 1 and 2. |

The aspects 1.1, 1.2, 2 and 3 can be tested using automatically generated data: valid (and deliberately invalid) test data for functional tests can be generated using methods described below.

Extensive testing is needed for backend behavior (see 4). This needs to be done for valid data only, and the backend data needed can be automatically generated from the formal frontend specifications.

Automated testing

Automated functional testing, both for newly developed software and for regression purposes, is a must for the kind of systems considered here. It not only reduces the amount of manual testing for each delivery, but is also less prone to testing errors and makes it possible to increase the test coverage.

It is beyond the scope of this blog to describe the overall test process. We merely mention two aspects for automated testing: test data, and test paths – the sequence of input processing. These aspects are dealt with separately.

  • Test data: Test data generation makes it possible to pre-compute test cases by defining various values for input fields. Formal specifications allow the definition of test coverage measures for test cases related to these field values.
  • Test paths: The same test data sets are fed into the interactive system in different orders, including multiple inputs, i.e. editing of fields. The term used here is path. A path describes the order of processing, including multiple visits of forms and editing of fields. Within a path, pre-computed test data (as described above) can be used as input to the system and thus to test its behavior.

Test paths and test data are orthogonal aspects which complement each other as described below.

Field value based test data generation

Is there a way to generate all valid test cases for field values? For formal systems described above, the number of input fields, their types, and constraints are well known. Moreover, the number of values per field is finite and so is the number of fields. Thus, at least in principle, the set of all valid test cases is enumerable by enumerating field values and combinations of all values for all fields. Invalid field value combinations – according to constraints – can be automatically excluded from this enumeration.

In theory, this means that all possible input-driven test cases can be generated for the purpose of testing. In practice, however, this finite number is far too large. Without constraints between fields, the overall number of test cases equals a product of n factors, where n is the number of fields and the factor for each field is its number of valid values. For instance, 20 fields with 10 valid values each already yield 10^20 combinations.

Constraints reduce the number of legal inputs but the data variety is still too high. Thus, we must substantially reduce the set of test cases while trying to keep “good” test coverage.

There are two independent and compatible ingredients to reduce the number of test cases:

  1. We reduce the number of different values considered for fields, and focus on fewer but preferably more “interesting” values.
  2. We reduce the number of combinations of different values for different fields: rather than considering all combinations, we only require that each value of each field occurs in at least one test case. Without constraints, the number of test cases is thus reduced to the number of different values of the field with the highest number of different values. Constraints sometimes decrease or increase this number.

Both techniques, considered together, substantially reduce the number of test cases.
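The effect of the second technique can be sketched in a few lines of Python. The field names and value lists are made up, constraints are ignored, and cycling through the shorter lists is just one simple way to fill the remaining positions.

```python
import math


def each_value_cases(values_per_field):
    """Build test cases in which every value of every field occurs at least once."""
    size = max(len(values) for values in values_per_field.values())
    return [
        {name: values[i % len(values)] for name, values in values_per_field.items()}
        for i in range(size)
    ]


interesting = {
    "Country": ["DE", "FR", "IT", "ES"],
    "Quantity": [1, 999],
    "Discount": [0, 50, 100],
}
# Full enumeration: the product of the number of values per field.
all_combinations = math.prod(len(values) for values in interesting.values())
# Each-value coverage: as many cases as the largest value list.
cases = each_value_cases(interesting)
```

Here the full enumeration would need 4 × 2 × 3 = 24 test cases, whereas covering every value once needs only 4.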

Let us now analyze what “interesting” means. The criterion is the likelihood of finding errors, and of demonstrating correctness, with the resulting test data. We describe this in two steps.

Special and interesting values for types

The first step is to define special (“interesting”) values for specific field types. Test cases using these shall primarily be considered:

  • Special values such as min or max for numbers, currencies and several related types.
  • Special values such as zero, non-zero, hazardous characters, empty or very large strings, etc.
  • For enumeration fields, potentially all values are important, since enumeration fields often control business logic ramifications (e.g. in interactive systems, enumerations are often represented by drop down boxes for selecting important choices).

Obviously, this method has limitations. No knowledge about rules is used here, so business logic is taken into account only incompletely. Moreover, some types (even enumerations) are sometimes too “large”, i.e. they have many values, and not all of them might really be “interesting” in the application domain. Nevertheless, considering special values depending on field types is a good basis for the techniques described below.
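A type-driven selection of this kind might look as follows in Python. The dictionary-based field description and the concrete candidate values (such as the `<script>` string as a hazardous input) are assumptions for illustration, not the values used by the actual tooling.

```python
def interesting_values(field):
    """Return special values for a field, based on its type only."""
    kind = field["type"]
    if kind == "number":
        low, high = field["min"], field["max"]
        candidates = [low, high, 0, 1]  # bounds plus zero and a non-zero value
        return sorted({v for v in candidates if low <= v <= high})
    if kind == "text":
        limit = field["max_length"]
        # empty, short, maximal-length, and hazardous strings
        candidates = ["", "a", "x" * limit, "<script>", "O'Brien"]
        fitting = [s for s in candidates if len(s) <= limit]
        return list(dict.fromkeys(fitting))  # drop duplicates, keep order
    if kind == "enum":
        return list(field["values"])  # potentially all values matter
    raise ValueError(f"unknown field type: {kind}")
```

For a number field with bounds -5 and 100 this yields the values -5, 0, 1, and 100; for an enumeration it simply returns every declared value.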

Special and interesting values for specific fields

Considering individual fields rather than types, interesting values can be selected more specifically. The most important information stems from rules (both functional and constraints):

  • Each comparison with respect to equality of a field value with a specific constant generates an interesting value equal to the constant. These values come on top of the interesting values defined by the field types. This reflects the fact that they have an impact on business logic.
  • Each comparison with respect to equality of a field with another field implies that the interesting value sets of these fields are shared.
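Both bullet points can be sketched in Python as follows. The representation of comparisons as triples and the field names are hypothetical; a real implementation would extract the comparisons from the parsed rules.

```python
from collections import defaultdict

# (field, "const", constant) or (field, "field", other_field); hypothetical format
COMPARISONS = [
    ("Country", "const", "DE"),
    ("Discount", "const", 0),
    ("BillingCountry", "field", "Country"),
]


def rule_based_values(comparisons):
    """Collect interesting values per field from equality comparisons."""
    values = defaultdict(set)
    for field, kind, other in comparisons:
        if kind == "const":
            values[field].add(other)  # the constant itself is interesting
    changed = True
    while changed:  # share value sets between fields compared with each other
        changed = False
        for field, kind, other in comparisons:
            if kind == "field":
                merged = values[field] | values[other]
                if merged != values[field] or merged != values[other]:
                    values[field], values[other] = set(merged), set(merged)
                    changed = True
    return dict(values)


shared = rule_based_values(COMPARISONS)
```

After the call, `BillingCountry` has inherited the value `"DE"` from `Country`, although no rule compares it with that constant directly.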

Field value coverage of test data generation

The main idea with respect to test coverage for test data generation algorithms is this:

  1. For each interesting value of each field generate at least one occurrence within the test data.
  2. Extend data sets applying all constraints to obtain fields with mutually consistent values.
  3. Extend data sets by generating random or interesting values for unconstrained fields.
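A strongly simplified Python sketch of this idea is shown below. It is not the algorithm used at mgm tp: the extension steps are merged into one exhaustive search over small candidate lists, and the domains, the rule, and the extra candidate value 7 are assumptions chosen for illustration.

```python
from itertools import product


def complete(seed, domains, constraints):
    """Extend a partial record so that all constraints hold, or return None."""
    free = [name for name in domains if name not in seed]
    for combination in product(*(domains[name] for name in free)):
        record = {**seed, **dict(zip(free, combination))}
        if all(constraint(record) for constraint in constraints):
            return record
    return None


def generate(domains, constraints):
    """Cover each interesting value at least once where the constraints allow it."""
    records, uncovered = [], []
    for name, values in domains.items():
        for value in values:
            if any(record[name] == value for record in records):
                continue  # already covered by an earlier record
            record = complete({name: value}, domains, constraints)
            if record is None:
                uncovered.append((name, value))
            else:
                records.append(record)
    return records, uncovered


# None stands for "no value specified"; 7 is a deliberately unreachable value.
domains = {"NormalVat": [20], "AlternativeVat": [None, 0, 10, 20, 7]}
constraints = [
    lambda r: r["AlternativeVat"] is None
    or r["AlternativeVat"] in (0, r["NormalVat"], r["NormalVat"] / 2)
]
records, uncovered = generate(domains, constraints)
```

The run produces four consistent records and reports `("AlternativeVat", 7)` as a value that no valid record can contain.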

Note that in most cases the constraints do not allow the full variety required by the first step, since constraint rules usually exclude some values for specific fields. As an example, consider the following constraint rule, which is taken from the previous blog article “Using Domain Specific Languages to Implement Interactive Frontends”:

AlternativeVat == 0
or AlternativeVat == NormalVat
or AlternativeVat == NormalVat/2
=> failed: "VAT can only be normal, half normal or zero"

The interesting values for AlternativeVat are NormalVat, zero, NormalVat/2, and “undefined” (meaning that no value has been specified). These interesting values are propagated to all dependent fields, thus producing more interesting values for test data. This can be demonstrated with the following specification snippet.

AllVat = If FieldValueSpecified(AlternativeVat)
then  AlternativeVat/100*NetAmount
else  NormalVat/100*NetAmount

For the field AllVat, the interesting values are NormalVat/100*NetAmount, zero, and NormalVat/2/100*NetAmount (from the then clause) and NormalVat/100*NetAmount (from the else clause).
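The propagation can be replayed in Python by evaluating the AllVat rule for each interesting value of AlternativeVat. The concrete numbers (a NormalVat of 20 and a NetAmount of 200) are arbitrary assumptions, and `None` models a missing value.

```python
from fractions import Fraction


def all_vat(alternative_vat, normal_vat, net_amount):
    """Python rendering of the AllVat rule; None means 'no value specified'."""
    if alternative_vat is not None:
        return Fraction(alternative_vat) / 100 * net_amount
    return Fraction(normal_vat) / 100 * net_amount


normal_vat, net_amount = 20, 200
# Interesting values of AlternativeVat: normal, zero, half normal, undefined.
alternative_values = [normal_vat, 0, Fraction(normal_vat, 2), None]
all_vat_values = sorted(
    {all_vat(value, normal_vat, net_amount) for value in alternative_values}
)
```

With these numbers the four inputs collapse to three distinct AllVat values (0, 20, and 40), because the normal rate and the undefined case lead to the same result.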

This description of test data generation is by no means complete; here we merely intended to show the relationship to interesting values. For more information, especially on algorithms (in an earlier language setting), see the blog articles “Producing High-Quality Test Data” and “Form Validation with Rule Bases”.

Multiplicity of Fields

A specific testing aim refers to multiplicity, i.e. to multiple occurrences of fields in forms (such as personal data for several people). This concept was explained and exemplified in the first blog. The following snippet shows rules for the multiplicity fields PosFullPrice, UnitPrice, and Quantity.

NetAmount        = Sum(PosFullPrice.all)
PosFullPrice.each = UnitPrice.each * Quantity.each

The number of instances of fields with multiplicity can be defined in advance in order to cover implementation errors occurring in this context. In our experience, generating test data for a multiplicity of 3 suffices in most cases to find hidden bugs. In the example above, three instances of PosFullPrice, UnitPrice, and Quantity are generated. They are fed into the algorithm for enumeration of test data. If a specific multiplicity index is referred to in constraint rules, this multiplicity automatically delivers interesting values (similarly to fields with no multiplicity, as shown above). For scaling tests and performance tests, specific fields can selectively be set to a very high multiplicity.
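As an illustration, the two multiplicity rules can be evaluated in Python for three generated instances. The helper names and the chosen values are assumptions; amounts are kept as integers to avoid rounding effects.

```python
def make_instances(multiplicity, unit_prices, quantities):
    """Create `multiplicity` instances, cycling through the given values."""
    return [
        {
            "UnitPrice": unit_prices[i % len(unit_prices)],
            "Quantity": quantities[i % len(quantities)],
        }
        for i in range(multiplicity)
    ]


def evaluate(instances):
    """Apply the two functional rules to all instances."""
    # PosFullPrice.each = UnitPrice.each * Quantity.each
    pos_full_prices = [inst["UnitPrice"] * inst["Quantity"] for inst in instances]
    # NetAmount = Sum(PosFullPrice.all)
    net_amount = sum(pos_full_prices)
    return pos_full_prices, net_amount


instances = make_instances(3, unit_prices=[0, 1, 999], quantities=[1, 2])
pos_full_prices, net_amount = evaluate(instances)
```

Raising the first argument of `make_instances` is all that is needed to turn the same data into a scaling test.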

Path Aspects: Considering Order of Field Editing in Tests

At mgm tp, the second automation aspect, the order of editing fields, is pursued with two approaches. Both are independent of test data: data is simply provided from pre-computed test data sets as described in the previous sections.

Controlled random testing

One method for covering path dependencies is to use randomly generated contexts. More specifically, controlled random testing (see, e.g., the presentation “Testing for the Unexpected”) increases the likelihood that “interesting” paths and value combinations are found by chance, since both values and paths are addressed in parallel. Once interesting contexts have been found through test failures, further testing can focus on these contexts.

Explicit testing paths

In contrast to controlled random testing, explicit testing paths specify evaluation orders (interactive form visits and field assignments). Several strategies, such as multiple visits with value reassignments or with cancellations, can be configured. The visiting strategy is specified in advance, depending on the quality aims of the customer. Random testing is just a special case which can be specified in our test driver infrastructure.
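The separation of paths and data can be sketched in Python as follows. This is not the mgm test driver: a path is modeled as a list of (field, value) edits, `replay` uses a plain dictionary as a stand-in for the system under test, and all function names are hypothetical.

```python
import random


def path_with_reedits(record, order, reedit, scratch_value=0):
    """Explicit path: visit fields in order, editing some of them twice."""
    steps = []
    for field in order:
        # Fields to be re-edited first receive a provisional value.
        value = scratch_value if field in reedit else record[field]
        steps.append((field, value))
    for field in order:
        if field in reedit:
            steps.append((field, record[field]))  # second visit: final value
    return steps


def random_path(record, seed):
    """Random path as a special case; the seed keeps the order reproducible."""
    order = list(record)
    random.Random(seed).shuffle(order)
    return [(field, record[field]) for field in order]


def replay(steps):
    """Stand-in for driving the system: later edits overwrite earlier ones."""
    form = {}
    for field, value in steps:
        form[field] = value
    return form


record = {"UnitPrice": 999, "Quantity": 2, "NormalVat": 20}
explicit = path_with_reedits(record, ["NormalVat", "Quantity", "UnitPrice"], {"Quantity"})
```

Whatever path is chosen, the same pre-computed record is the intended end state, so any difference in system behavior between two paths points to an order-dependent defect.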

Semantic Test Coverage using adaptable Testing Aims

From the testing perspective, the greatest advantage of using formal specifications is the fact that the set of valid fields and field combinations is very well defined, and it is defined at the customer level. This enables the techniques discussed above. Furthermore, one can profit from the fact that customers possess knowledge about important and less important domain-related aspects.

This permits a deviation from the pre-computed aims in terms related to the domain rather than to technology. It bridges from the customer domain to technology (semantics beats automatics), for example in the following cases:

  1. Sometimes fields, such as enumerations, have no impact on program branching and, in addition, have many values with similar semantics, with the exception of some specific ones, which can be identified from domain knowledge.
  2. Important cases may neither be modeled by enumerations nor be explicitly visible in constraint rules. In these cases, non-formalized domain knowledge might be fed into the set of interesting values for specific fields.
  3. By analogy, the need for multiplicity tests for specific fields might better be decided upon by persons with domain knowledge rather than by taking defaults or algorithmic artifacts.
  4. In some cases invalid data are of interest in order to test reactions to invalid inputs. If this is requested, it can be achieved simply by generating test data based on rule sets which contain deliberate negations of specific constraints.
  5. Explicit testing paths are set to useful defaults, sometimes even including controlled random testing.
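The deliberate negation of a constraint mentioned in item 4 can be sketched in Python like this. The exhaustive search, the named constraints, and the candidate domains are assumptions for illustration; the point is only that one rule is inverted while all others stay in force.

```python
from itertools import product


def find_record(domains, predicates):
    """Return the first record for which all predicates hold, or None."""
    names = list(domains)
    for combination in product(*(domains[name] for name in names)):
        record = dict(zip(names, combination))
        if all(predicate(record) for predicate in predicates):
            return record
    return None


def invalid_record_for(domains, constraints, target):
    """Find a record that violates `target` but satisfies every other constraint."""
    predicates = [
        (lambda record, rule=rule: not rule(record)) if name == target else rule
        for name, rule in constraints.items()
    ]
    return find_record(domains, predicates)


domains = {"NormalVat": [20], "AlternativeVat": [0, 10, 20, 7], "Quantity": [0, 1]}
constraints = {
    "vat": lambda r: r["AlternativeVat"] in (0, r["NormalVat"], r["NormalVat"] / 2),
    "quantity": lambda r: r["Quantity"] >= 1,
}
invalid_vat = invalid_record_for(domains, constraints, "vat")
```

The resulting record breaks exactly one rule, so the expected error message is known in advance and the system's reaction can be checked precisely.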

The overall method starts from formal specifications and combines domain expert knowledge with automated testing (see the illustration for an overview):

  1. Automatically derive testing aims from specifications and store them in a default configuration.
  2. Let these aims be analyzed by experts with domain knowledge, and adapt the testing aim configuration. This leads to an increase or a reduction of the size of the data set, as described in items 1 to 5 above.
  3. Generate test data starting from the adapted testing aim configuration for field values. Within this step, specification flaws (e.g. contradicting rules) are reported.
  4. Perform tests with the generated test data while considering path configurations proven useful in former tests. Within this step, any insufficient test coverage (in most cases due to incomplete specifications) is reported.
Illustration of the method for adaptation of testing aims.

In conclusion, the adaptation of testing aims is a tool-supported manual process which fits well with test data generation and automated testing.


In this blog series we have shown that, in several respects, formal specification languages can be used to deliver high-quality software systems. Since specifications are provided at the comprehension level of the customer, we obtain a high likelihood that the system does what the customer intended. In particular, we gain high quality due to…

  • automatic code generation,
  • well defined automatic test data generation,
  • measurable test coverage and adaptable testing aims,
  • fast turnarounds in the presence of requirement changes,
  • backends guarded by benign frontends, and
  • backends with well-defined test coverage based on frontend testing aims.

In addition to these implementation and quality assurance benefits, the interaction with the customer is improved. Early feedback with respect to errors, inconsistencies, and incompleteness in specifications is possible, thus improving the requirements analysis process as well.
