Managing Process Model Complexity via Concrete Syntax Modifications

Marcello La Rosa, Arthur H.M. ter Hofstede, Petia Wohed, Hajo A. Reijers, Jan Mendling and Wil M.P. van der Aalst

Abstract

While Business Process Management (BPM) is an established discipline, the increased adoption of BPM technology in recent years has introduced new challenges. One challenge concerns dealing with the ever-growing complexity of business process models. Mechanisms for dealing with this complexity can be classified into two categories: i) those that are solely concerned with the visual representation of the model, and ii) those that change its inner structure. While significant attention is paid to the latter category in the BPM literature, this paper focuses on the former category. It presents a collection of patterns that generalize and conceptualize various existing mechanisms to change the visual representation of a process model. Next, it provides a detailed analysis of the degree of support for these patterns in a number of state-of-the-art languages and tools. The paper concludes with the results of a usability evaluation of the patterns conducted with BPM practitioners.

Index Terms—Process model, pattern, complexity, presentation, secondary notation.

I. INTRODUCTION

Business Process Management (BPM) deals with the life-cycle of business process models which includes their design, execution and analysis [60]. Through the application of BPM technology, businesses may realize cost reductions, time savings, and an increased agility to deal with change. Many organizations have been investing in this technology and the interest in BPM surged in recent years. Despite advancements in the field of BPM — both in academia and in industry — important challenges still remain. These need to be dealt with in order to fully realize the potential of BPM technology.

One of these challenges concerns the management of complex process models. Business process models may contain many elements which may have numerous and intricate dependencies among them. The more complex a business process model is, the harder it is to determine if it properly captures the right business practices, to use it to communicate with stakeholders, and to evolve it over time (e.g. due to unforeseen circumstances or changing business requirements).

There is a substantial body of literature on process model complexity and understandability on the one hand (e.g. [12], [32], [37], [40], [4], [53], [49]) as well as on proposed mechanisms to deal with managing this complexity on the other hand (e.g. [56], [59], [21]). However, what is lacking is a language-independent overview of, and a motivation for, the various features that exist to manage complexity in process models. Such an overview could ultimately pave the way for more comprehensive support for complexity management in process modeling languages, standards and tools. A variety of process model stakeholders may benefit from this, e.g., those designing and standardizing process modeling languages such as BPMN, those developing modeling tools to support such languages, and those currently using a specific language/tool in order to evaluate its strengths and weaknesses. These observations triggered the development of a language-independent overview of the various features that exist to reduce the complexity of a process model.

In this paper we follow the established approach of capturing a comprehensive range of desired capabilities through a collection of patterns (e.g. the workflow patterns [61] provide a language-independent description of expressiveness requirements in process modeling languages). The patterns capture features to manage process model complexity as they are found in the literature, in process modeling languages or in their tool implementations.

In line with the field of programming languages [41], we distinguish between concrete syntax and abstract syntax of a model. The concrete syntax of a process modeling language deals only with representational aspects such as symbols, colors and position, of the various types of nodes in a process model (e.g. tasks, events, gateways, roles). In cognitive sciences [46], this corresponds to the cognitive dimension of secondary notation. The abstract syntax of a process modeling language is not concerned with representational aspects but captures the various types of process elements and the structural relationships between them. Hence, changing the graphical appearance of a process model (e.g. by rearranging nodes or modifying the symbol of a certain node type) should not have any consequences for its abstract syntax representation. Modularizing a process model in a hierarchy of nested sub-processes can be used to simplify a model without changing its behavior. However, modularization affects the abstract syntax. Abstracting from certain elements of the model (e.g. in order to create a process view for a specific class of users) also affects the abstract syntax of a process model—in fact certain model elements are replaced or removed altogether. Accordingly, we classify features to reduce the complexity of a process model into two categories: those that affect the concrete syntax of a model only, and those that primarily affect its abstract syntax, and, as a consequence, may also impact on its concrete syntax. In this paper, we focus on complexity reduction features that only affect the concrete syntax of a process model.
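The distinction between the two kinds of syntax can be illustrated with a minimal sketch. The class and function names below (`AbstractSyntax`, `ConcreteSyntax`, `move_node`, `highlight`) are hypothetical and chosen for illustration only; they are not taken from any process modeling language or tool. The sketch assumes a model whose structure is stored separately from its visual attributes, so that rearranging or recoloring a node cannot alter the structure.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AbstractSyntax:
    """Process elements and their structural relationships (no visuals)."""
    nodes: frozenset  # (node_id, node_type) pairs
    flows: frozenset  # (source_id, target_id) pairs


@dataclass
class ConcreteSyntax:
    """Purely representational attributes, keyed by node id."""
    position: dict = field(default_factory=dict)  # node_id -> (x, y)
    color: dict = field(default_factory=dict)     # node_id -> color name


@dataclass
class ProcessModel:
    abstract: AbstractSyntax
    concrete: ConcreteSyntax


def move_node(model, node_id, x, y):
    """Concrete-syntax-only change: reposition a node on the canvas."""
    model.concrete.position[node_id] = (x, y)


def highlight(model, node_id, color):
    """Concrete-syntax-only change: recolor a node."""
    model.concrete.color[node_id] = color


# A tiny illustrative model: task a -> gateway g -> task b.
abstract = AbstractSyntax(
    nodes=frozenset({("a", "task"), ("g", "gateway"), ("b", "task")}),
    flows=frozenset({("a", "g"), ("g", "b")}),
)
model = ProcessModel(abstract, ConcreteSyntax())

before = model.abstract
move_node(model, "a", 10, 20)
highlight(model, "g", "red")

# The structure is untouched by purely visual modifications.
assert model.abstract == before
```

By contrast, collapsing part of the model into a sub-process or removing elements to create a view would require constructing a different `AbstractSyntax` value, which is why such features fall outside the scope of this paper.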

The patterns description is language-independent, but typically the realization of these patterns in one or more existing approaches is discussed to reinforce understanding and to demonstrate relevance. This description is complemented by an overview of the degree of patterns support in a number of well-known process modeling languages and language implementations, which provides insights into comparative strengths and weaknesses. Moreover, we report on the results of a usability evaluation of the patterns that we conducted with BPM practitioners.

The remainder of this paper is structured as follows. Section II describes and justifies the approach used. Section III presents the various patterns while Section IV evaluates the support offered by mainstream BPM languages and tools for the identified patterns. Section V presents findings from the usability evaluation. Next, Section VI discusses related work and Section VII concludes the paper.

II. METHODOLOGY

In this paper we identify patterns to reduce the perceived model complexity without changing the abstract syntax, i.e., the goal is to simplify the representation of the process model.

The most well-known patterns collection in the IT domain is the set of design patterns documented by Gamma, Helm, Johnson, and Vlissides [20]. This collection describes a set of problems and solutions frequently encountered in object-oriented software design. The success of the patterns described in [20] triggered many patterns initiatives in the IT field, including the Workflow Patterns Initiative [61]. The idea to use a patterns-based approach originates from the work of the architect Christopher Alexander. In [5], he provided rules and diagrams describing methods for constructing buildings. The goal of the patterns documented by Alexander is to provide generic solutions for recurrent problems in architectural design. The idea to use patterns for design problems in the IT domain is appealing as is reflected by the many patterns collections that emerged in the last decade.

The patterns described in this paper have been collected by extensively analyzing existing BPM literature. In addition, we analyzed existing or proposed standards governed by standardization bodies such as OASIS, OMG, W3C, and WfMC. We also analyzed the features of existing tools (business process modeling tools, WFM systems, and BPM systems). Finally, we asked BPM experts and practitioners to comment on the patterns we identified. This resulted in the eight patterns presented in this paper. All operate on the concrete syntax of a process model and aim to reduce the perceived complexity of the model. We could have documented more patterns. However, an important requirement is that a pattern should recur frequently. For each pattern in this paper, we found more than five languages, research approaches or tools that use it.

As is usual in this domain, we used a fixed pattern format to document each pattern and used this format to evaluate existing languages and tools in a systematic manner. The format lists five elements for each pattern: (a) description, (b) purpose, (c) rationale, (d) realization, and (e) example.
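As a rough illustration, the fixed format can be thought of as a record with one field per element. The sketch below is an assumption made for illustration: the `PatternEntry` class and the `missing_elements` helper are hypothetical names, and the draft entry deliberately leaves most elements empty rather than inventing their content.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class PatternEntry:
    """One pattern documented in the five-element format, plus its name."""
    name: str
    description: str
    purpose: str
    rationale: str
    realization: str
    example: str


def missing_elements(entry):
    """Return the names of format elements that are still empty."""
    return [f.name for f in fields(entry) if not getattr(entry, f.name).strip()]


# Draft entry: only the name and a one-line description are filled in.
draft = PatternEntry(
    name="Layout Guidance",
    description="Features to modify the process model layout.",
    purpose="",
    rationale="",
    realization="",
    example="",
)
```

Calling `missing_elements(draft)` lists the elements that remain to be written, which is one simple way a fixed format supports systematic documentation.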

III. PATTERNS FOR CONCRETE SYNTAX MODIFICATION

As a result of our study, we identified eight patterns operating exclusively on the concrete syntax of a process model and classified them according to the hierarchy shown in Figure 1. The first pattern, namely Layout Guidance, describes features to modify the process model layout. Four patterns outline visual mechanisms to emphasize certain aspects or parts of a process model (node Highlight in Figure 1). These are Enclosure Highlight, Graphical Highlight, and two annotation patterns: Pictorial Annotation and Textual Annotation. Two representation patterns, Explicit Representation and Alternative Representation, refer to the availability of explicit and alternative visual representations for process modeling constructs. The last pattern, Naming Guidance, refers to naming conventions or advice to be used in a process model.

In the following we provide a detailed description of each pattern. For illustration purposes, we use the BPMN (Business Process Modeling Notation) standard [44]. An overview of the graphical representation of the main concepts of this notation can be found in Figure 2. Detailed knowledge of this standard is not required to understand the various examples in this paper.

Pattern 1 (Layout Guidance)

Pattern 2 (Enclosure Highlight)

Pattern 3 (Graphical Highlight)

Pattern 4 (Pictorial Annotation)

Pattern 5 (Textual Annotation)

Pattern 6 (Explicit Representation)

Pattern 7 (Alternative Representation)

Pattern 8 (Naming Guidance)

IV. BENCHMARKING

In this section, we report the results of evaluating a number of languages and tools against their support for the identified patterns. The languages selected for this evaluation are mainstream process modeling languages deriving from standardization efforts, large-scale adoptions or established research initiatives. We selected four languages for conceptual process modeling (UML ADs 2.1.1, eEPCs, BPMN 1.2 and BPMN 2.0) and four languages for executable process modeling (BPMN 2.0, BPEL 1.2/2.0, YAWL 2.0 and Protos 8.0.2). For each language, we also evaluated at least one supporting modeling editor. For UML ADs 2.1.1 we evaluated Sparx Enterprise Architect 7.1; for eEPCs and BPMN 1.2 we evaluated ARIS 7.1 from Software AG and Oryx 2.0 beta; for BPMN 2.0 we evaluated Oryx 2.0 beta; for BPEL 1.2 Oracle JDeveloper 11.1.1.1.0; for YAWL 2.0 the YAWL Editor 2.0 and for Protos the Protos Editor 8.0.2 from Pallas Athena. Table I shows the results of this analysis, where tool evaluations are shown next to the evaluations of their languages, except for Protos, where the language cannot be separated from its implementation because it is vendor specific (although based on Petri nets). In particular, for a tool the rationale was to measure the extent to which it facilitates the support for a pattern, as it is offered by a language. Accordingly, for Layout Guidance, we decided to rate as ‘-’ all tools providing limited layout support (e.g. alignment only). For Graphical Highlight, we rated JDeveloper ‘+/-’ as the appearance of model elements cannot be customized. For Pictorial Annotation, JDeveloper and Protos received a ‘+/-’ as default icons or images are automatically assigned to model elements and cannot be customized. For Alternative Representation, we rated tools with a ‘+’ only if they provide a means to replace standard symbols with custom-made ones. For Naming Guidance, ARIS received a ‘+/-’ as it provides general semantic guidelines for eEPC labels but does not offer renaming capabilities to enforce these guidelines. Regarding the evaluation of languages, we rated UML ADs 2.1.1, BPMN 1.2/2.0, YAWL 2.0 and Protos 8.0.2 ‘+/-’ for Explicit Representation, as they do not have a graphical representation for some concepts.

From the table we can make the following observations. First, as expected, the selected tools generally offer wider pattern support than the respective languages. Consider, for example, the differences between UML ADs and Enterprise Architect, between eEPCs and ARIS for eEPCs, and between YAWL and the YAWL Editor. A possible reason is that languages typically focus on defining the process syntax and semantics, but not on visualization features that are convenient in a modeling environment. When implementing support for these languages in a modeling editor, visualization features become a major concern. Clearly, language support being equal, the more sophisticated visualization features an editor can offer, the more competitive it is on the market. Second, the tools that are primarily developed for conceptual process modeling provide better patterns support than those developed for executable process modeling. For example, ARIS fully supports six patterns, while JDeveloper offers full support for two patterns only, and partial support for two other patterns. This can be explained by the fact that the visualization features in the second class of tools are not the main focus, as opposed to other features such as data specification, role allocation and application integration. Third, we observe an increase in patterns support from UML ADs to eEPCs, BPMN 1.2 and finally to BPMN 2.0. This clearly reflects the evolution of process modeling languages. On the other hand, BPEL is the only language that does not support any pattern. This is justified by the fact that BPEL does not define an official graphical notation. Nonetheless, we included BPEL in our analysis for completeness. Fourth, the limited support for Pictorial Annotation can be explained by the fact that, until recently, pictorial annotations could not be supported easily. Recent advances in computer graphics make it possible to use them. Moreover, there is a growing need for decorating process models with attributes familiar to business users (e.g., icons). Finally, the even more limited support for Naming Guidance derives from the fact that traditionally the development of modeling languages has not been concerned with the use of linguistic support such as ontologies. However, we can observe a growing academic interest in this pattern, as evidenced by recent publications on the topic.

V. USABILITY EVALUATION

In order to evaluate the patterns from a usability perspective, we turned to the technology acceptance model [14] and its adaptation to conceptual modeling [35]. In essence, this theory postulates that actual usage of an information technology artifact (patterns, in the case of this paper) is mainly influenced by the perceptions of potential users regarding usefulness and ease of use. Accordingly, a potential user who perceives a pattern to be useful and easy to use would be likely to actually adopt it.

We conducted a series of focus group sessions with professionals to discuss the patterns. Altogether, 15 process modeling experts participated in these sessions, which took place in Eindhoven and in Berlin. While the five persons in Eindhoven had a consulting background, the ten persons in Berlin worked for vendors of business process modeling tools. On average, the participants had seven years of experience with process modeling. They estimated that they had studied about 300 models in the last 12 months, each having on average 20 activities. Due to this extensive exposure to process modeling, they can be considered to be experts.

We used a questionnaire with seven-point scale items adopted from [35] to measure the participants’ perceptions of usefulness and ease of use for each of the patterns. Cronbach’s α was used as an ex post reliability check. The values determined were 0.92 for usefulness and 0.84 for ease of use, both of which point to a high internal consistency of the questionnaire items. The boxplots in Figures 9 and 10 display the outcomes of the data analysis for the patterns’ usefulness and ease of use respectively. In a boxplot, the median is shown as a horizontal line in a box representing the interval between lower and upper quartile.
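For readers unfamiliar with these statistics, the following sketch shows how Cronbach’s α and the boxplot summary values can be computed. It uses the standard textbook formula α = k/(k−1) · (1 − Σ item variances / variance of the total scores), where k is the number of items. The function names and the sample answers are hypothetical; the data are made up for illustration and are not the responses collected in our study. Quartile conventions also differ between statistical packages, so the quartiles below follow the default of Python's `statistics.quantiles`.

```python
from statistics import quantiles, variance


def cronbach_alpha(responses):
    """Cronbach's alpha for rows of participants and columns of items."""
    k = len(responses[0])
    if k < 2:
        raise ValueError("need at least two items")
    items = list(zip(*responses))  # transpose: one tuple per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_var_sum / total_var)


def box_summary(values):
    """Lower quartile, median and upper quartile, as drawn in a boxplot."""
    q1, med, q3 = quantiles(values, n=4)
    return q1, med, q3


# Hypothetical seven-point answers of five participants to three items
# measuring the same construct (NOT data from the study).
scores = [
    [6, 5, 6],
    [4, 4, 5],
    [7, 6, 7],
    [3, 4, 3],
    [5, 5, 6],
]
alpha = cronbach_alpha(scores)

# Per-participant mean score, summarized as a boxplot would show it.
summary = box_summary([sum(row) / len(row) for row in scores])
```

Values of α close to 1 indicate that the items of a scale vary together across participants, which is the sense in which the reported values point to high internal consistency.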

In Figure 9, it can be seen that all patterns are perceived to be useful (median equals 4 or greater). The patterns that receive the highest scores in this respect are Patterns 1, 2, and 8. Figure 10 shows that ease of use is overall considered even more positively, with median values of 5 or more for all but Patterns 6 and 7. We also recorded statements about the various patterns in a group discussion, which complemented the quantitative assessment. Highly useful patterns were emphasized in this discussion, e.g. a participant from Eindhoven commented on Pattern 2 that it is “very often used”. In addition, there were interesting comments revealing a trade-off that seems to exist with respect to individual pattern usage. For instance, an Eindhoven participant remarked with respect to Pattern 5: “I am very much in favor of keeping all models clean. Adding text makes the view very complex and reduces transparency.” In a similar vein, a participant from Berlin stated that using this pattern “may lead to semantical problems, since it is easier to create a note than to model correctly”, hinting at likely misuse of this pattern. Also, Pattern 3 appears to bear the risk of being used excessively, since “overstressing [this pattern] leads potentially to unreadable models.” While the benefits of Pattern 7 were noted, a participant from Berlin stated that it is “very difficult to explain to beginners. A language should only provide one way to model a scenario.”

Altogether, the focus group sessions reveal that industry experts found the patterns to be useful and in general easy to use. Many of them, however, have to be applied in a parsimonious way to live up to their full potential. The experts stressed that novice modelers need additional training to utilize the patterns in an efficient and effective way, and that there is a risk of misuse. Finally, the assessments on ease of use might also reflect the experts’ perception of lacking tool support for applying the patterns at this stage. In this regard, there seems to be a need for innovations; in particular for the support of patterns 6 and 7. Indeed, these areas are not yet heavily investigated and require future research.

VI. RELATED WORK

Three main benchmarking frameworks have been used to evaluate process modeling languages: the workflow patterns framework [61], Bunge, Wand and Weber’s (BWW) framework [58], and the Semiotic Quality Framework (SEQUAL) [30]. The workflow patterns provide a language-independent description of control-flow, resource, data and exception handling aspects in workflow languages. Their development started as a bottom-up, comparative analysis of process modeling languages and tools, with the purpose to evaluate their suitability and determine similarities and differences between them. To date, the workflow patterns have been used to examine the capabilities of numerous process modeling/workflow languages, standards and tools [61]. Our work is complementary to this framework since it describes recurring features (patterns) to reduce the complexity of process models, and can be used for the evaluation of process modeling languages and tools.

The BWW framework refers to Wand and Weber’s tailoring and application of Bunge’s ontology [11] to information systems. It was initially used for analysis and comparison of conceptual modeling languages, and later also used for the analysis of process modeling languages [52], [48]. However, the BWW framework lacks conceptual structures central to process modeling such as various types of splits and joins, iteration and cancelation constructs, and different forms of concurrency restrictions. Thus, despite its utilization in practice, its suitability for evaluating process modeling languages can be questioned. Also, its application as a theoretical foundation for conceptual modeling has been criticized [62].

pragmatic quality, which deals with the understanding of a model by its audience. SEQUAL has been used for the evaluation of process modeling languages [57], and has later been extended to deal specifically with quality of process models [31]. Nonetheless, the authors themselves acknowledge SEQUAL’s “disability (sic) to facilitate precise, quantitative evaluations of models” [31, p. 101]. In contrast, our patterns collection provides a concrete means to evaluate the pragmatic and empirical quality of process modeling languages, and can also be applied to evaluate supporting tools.

Our work can be related to [42], where a theory of general principles for designing cognitively effective visual notations is proposed. Specifically, our patterns can be seen as an implementation of the Complexity Management principle, which recommends that a visual notation should include explicit mechanisms to simplify a model’s appearance. Moreover, the Textual Annotation pattern can be seen as an implementation of the Dual Coding principle, which prescribes the use of text to complement graphics, while the Graphical Highlight pattern can be seen as an implementation of the Semantic Transparency principle, which prescribes the use of visual representations whose appearance suggests their meaning.

Our work also has commonalities with cartography. The first geographical maps date back to the 7th millennium BC. Since then cartographers have improved their skills and techniques to create maps, thereby addressing problems such as clearly representing desired traits, eliminating irrelevant details, reducing complexity, and improving understandability. Cartographers use colors to highlight important cities and roads. The thickness of a line representing a road reflects its importance. This corresponds to Pattern 3. Also, annotations are used to point out important things, e.g., a road that is closed at night or a tunnel where users need to pay a toll (cf. Patterns 4 and 5).

VII. CONCLUSION

The main contribution of this paper is a systematic analysis of concrete syntax modifications for reducing process model complexity, as they occur in the literature, in process modeling languages and tool implementations. The result of this analysis took the form of a collection of patterns, complemented by an evaluation of state-of-the-art languages and language implementations in terms of these patterns, and a usability test with practitioners. The results of the usability test demonstrate that all identified patterns are indeed perceived as useful and easy to use, while the evaluation of languages and tools shows that there is generally good support for the majority of these patterns. This collection can be useful for different process model stakeholders, including those designing and standardizing process modeling languages (such as BPMN and UML ADs), those developing modeling tools to support such languages, those currently using a specific language/tool in order to evaluate its strengths and weaknesses, and those who plan to do so.

While one cannot prove that the patterns collection is complete (as there is no reference framework that could be used for this purpose), confidence in the comprehensiveness of this patterns collection is derived from a careful survey of the relevant literature, standards and tools for process modeling. Although some of these patterns may be applied to other types of models, e.g. data models, our focus was solely on process models. This pattern-based analysis of the state of the art in process modeling identified relative strengths and weaknesses among the languages and tools considered. This analysis may provide a basis for further language and tool development. For example, contemporary tools could support naming conventions or guidelines, from both a syntactic and a semantic perspective, or allow modelers to easily switch between shorthand notations and their full expansions depending on user preferences.

The evaluation reported in this paper shows whether or not a given pattern is supported by the various tools. However, one may be interested in a more accurate evaluation of the degree of support for each pattern, especially for tool selection purposes. For example, the majority of tools offer layout algorithms. Still, some of these algorithms (e.g. the one in ARIS) provide better results in terms of element alignment, distribution and spacing than others. In order to determine finer-grained pattern support, we plan to conduct experiments with end users to compare the perceived model understandability after using different tool features (e.g. different layout algorithms). Another aspect worth investigating through experimentation is which subsets of patterns can be combined to increase process model understanding. In fact, there might be cases in which the application of particular combinations of patterns may actually decrease the understanding of a process model, e.g. using Pictorial Annotation together with Graphical Highlight.

Our patterns refer to features that affect the concrete syntax of a process model only, i.e. its visual appearance. The automatic generation of models using process mining techniques [2] provides new insights on the importance of a model’s concrete syntax. In fact, process models discovered from event logs typically tend to be overly complex and thus difficult to read [22]. In these cases, it is of utmost importance to simplify the visual representation of such models for the end user. On the other hand, information extracted from event logs can be used to enrich the visual appearance of existing process models (e.g., by highlighting frequent activities and paths). This is yet another application of the patterns identified in this paper.

There are, however, other equally important complexity reduction features, e.g. structuring a model, modularizing it into sub-processes or abstracting from certain modeling elements, which we left out. Although these features also affect the visual appearance of a process model, they primarily operate on its abstract syntax and, as a result, may lead to changes in its appearance. A complementary patterns collection that systematically describes such features is under development.