International Journal of Swarm Intelligence and Evolutionary Computation

This paper reviews our research on integrating qualitative methodologies into agent-based social simulation. The integration concerns both the development of behavioural rules for software agents and the interpretation of simulation results. Specifically, we rely on Grounded Theory, a well-established methodology in qualitative social research. The development of agent rules relies on open coding in a Grounded Theory approach, whereas interpreting simulation results as a coherent story line relies on theoretical coding. This is demonstrated with two examples: the first shows how qualitative textual data is transformed into a conceptual model for agent-based simulation; the second shows how numerical simulation results reveal the story line of a case.


Introduction
In recent years the need to integrate qualitative data into agent-based simulation models has been increasingly recognized. In particular, a demand has been articulated to include qualitative data in ICT-based policy modelling. The complexity of policy issues typically involves fuzzy information which cannot be reduced to numerical values but nonetheless cannot be neglected [1][2][3][4][5][6]. The objective of this research note is to demonstrate how we handled the problem of combining qualitative data and numerical simulation in recent research. The research was partly undertaken in the course of the project GLODERS on the global dynamics of extortion racket systems (www.gloders.eu). The project provided a set of computational tools for analysis and simulation to study a specific aspect of organized crime, namely extortion racket systems (ERS). Because quantitative data on organized crime groups is rarely available, a considerable part of the development of the simulation models had to rely on qualitative evidence such as police interrogations and court files. We summarize the findings of our methodological account, which are dispersed across papers that each focus on selected aspects of the overall research process [7][8][9][10]. Here we attempt to draw the attention of modelling practitioners to the potential of methodological approaches from qualitative social research, both for the development of a model and for the interpretation of simulation results. Integrating qualitative evidence into the behavioural rules of agents in agent-based models can be informed by the research methodologies available in the tradition of qualitative social research. First, we show how methods from qualitative social research can be utilized for the development of behavioral rules of software agents, i.e., how they can inform software engineering.
Second, we focus on the interpretation of simulation results in the later research stage of analyzing simulation models. As a "good analyst weave data into narrative without doing injustice to either" [2: 3, retrieved from www.leastsquares.com/papers/cork2005], we sketch how numerical simulation results reveal qualitative evidence that provides a narrative of the case. For this purpose we rely on the concept of stylized facts. In sum, we sketch a process of utilizing interpretative research in agent-based modeling.

Methodology of Qualitative Research: The Example of Grounded Theory
While numerous methodological approaches and recommendations exist for qualitative research [11], here we highlight in particular the methodology of Grounded Theory. Grounded Theory is a well-established methodology of qualitative research going back to the 1960s [12]. The term Grounded Theory denotes a structured approach for the generation of theories from raw empirical data [11,12]. Thus Grounded Theory is an inductive research process. The data basis typically consists of qualitative interviews and narratives. First, a detailed description of the case is undertaken, which should lead to a new perspective on the data. In turn, this generates new questions that call for the collection of new data [9]. This theoretical sampling is iterated until a stage is reached at which a full picture of the field emerges. This stage is denoted as theoretical saturation [13][14][15]. The research process is ultimately intended to develop rather abstract categories. This is achieved by a method denoted as coding. While they cannot be executed mechanically, the different stages of coding are often called line coding, focused coding, axial coding and selective coding [11], each reaching a higher level of abstraction. For more details the reader might consult, e.g., [12][13][14][15]. Here we limit ourselves to differentiating between open coding and theoretical coding. The former stays close to the data, the latter aims at theoretical insights [12]. Open coding in a Grounded Theory approach is particularly well known for in-vivo codes, which are short, particularly characteristic excerpts from the original 'raw' data. For instance, in our current research on criminal organizations, a criminal described the conditions in which he had found himself as a 'rule of terror', which provides a vivid picture of a war in the underworld [9]. In the following example we make intensive use of these textual annotations.
The qualitative research provided the basis for the development of a conceptual model that is ready to be implemented as an agent-based simulation model. We aim to show how Grounded Theory assists formal knowledge representation in software engineering. Coding can be assisted by a number of software tools such as Atlas.ti or MaxQDA. We analyzed the data using MaxQDA and then imported the results into a tool called CCD [4,6].
CCD is a tool for developing conceptual models. A feature particularly relevant for software engineering is that it generates program code templates for the simulation model [6]. The conceptual model consists of a web of so-called event-action sequences, i.e., a situational condition triggers an action by the agents, which in turn generates new situational conditions. These sequences represent the dimensions of the codes and the relations between them. Moreover, CCD preserves traceability to the data basis through annotations, which in terms of Grounded Theory are in-vivo codes that justify the sequences. Conceptual modeling is a software engineering equivalent to open coding, whereby the condition-action sequences already represent generalized abstractions.
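The idea of a web of event-action sequences with traceable annotations can be illustrated with a minimal sketch. This is not the CCD data format or its generated code; the class names `Annotation` and `Action`, the functions `run` and `trace_back`, and the placeholder labels `c1` to `c4` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Annotation:
    """An in-vivo code: a text passage that justifies a model element."""
    source: str   # hypothetical document identifier
    passage: str  # excerpt from the empirical document


@dataclass
class Action:
    """An action triggered by conditions that generates new conditions."""
    name: str
    requires: set   # conditions that must all hold to trigger the action
    produces: set   # conditions that hold after the action was executed
    annotations: list = field(default_factory=list)


def run(initial_conditions, actions):
    """Fire triggered actions repeatedly until no new condition appears."""
    conditions = set(initial_conditions)
    changed = True
    while changed:
        changed = False
        for action in actions:
            triggered = action.requires <= conditions
            if triggered and not action.produces <= conditions:
                conditions |= action.produces
                changed = True
    return conditions


def trace_back(condition, actions):
    """Return the annotations of every action that can produce a condition."""
    return [note
            for action in actions if condition in action.produces
            for note in action.annotations]


# Placeholder web: c1 triggers a1, which generates c2; c2 and c3 trigger a2.
note = Annotation(source="document-1", passage="(excerpt)")
web = [
    Action("a1", requires={"c1"}, produces={"c2"}, annotations=[note]),
    Action("a2", requires={"c2", "c3"}, produces={"c4"}),
]
print(sorted(run({"c1", "c3"}, web)))
```

Keeping the annotations on the actions means that any condition reached during a run can be followed back to the text passages that motivated the rule, which is the traceability property described above.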

Integrating Qualitative Data in Agent Rules: Open Coding and Conceptual Modeling - An Example
Next, we provide a brief example, following the description in [8]. The data basis consisted of documents from police investigations, analyzed in the GLODERS project. These provided a textual data basis for developing categories, which were derived from annotations of relevant text passages, the in-vivo codes [8,10]. Whereas the data of the police files describes an individual case, the condition-action sequences represent event classes. Thus they provide a theoretical abstraction from the original textual documents. However, the annotations make it possible to refer back to the in-vivo codes from which the sequences are derived.
In the following, the conceptual model is exemplified by one part of the overall model, the conceptual model of the process of money laundering. Figure 1 shows the event-action sequences of this process in CCD format. Boxes with a red flag represent conditions which trigger actions, symbolized by a yellow flag (Figure 1). There are two starting conditions for money laundering. The first condition is that black money exists which is in need of laundering. This can be found in the box denoted as 'illegal money available'. This condition can be traced back to the following in-vivo code, i.e., a text passage from the empirical document.

In vivo code (Illegal money available)
In the period between 1990 and 1992 police investigations had been undertaken. These revealed a criminal organization concerned with drug trafficking. The report from June 1992 estimated the income and the costs. It is estimated a transaction volume of nearly 300 million.
Since illegal money cannot be claimed back in court, the second condition is that the criminal investor needs to trust the person to whom he is handing over the money as the link between the illegal and the legal world. This is the condition in the box at the bottom left, denoted as 'level of trust above threshold'. This condition can be traced back to the following in-vivo code.

In vivo code (Invest in legal market)
O1 and V01 seem to be friends for me.
If these two conditions are fulfilled, the simulation model can trigger the action of giving money to the trustee, which generates a situation of money laundering. Next, the conceptual model simulates investment in the legal market, as justified by the following annotation of the empirical document.

In vivo code (Legal money available)
At the moment author have paid 800 000 in the firm which are now several millions worth through legal trade.
Moreover, the empirical data suggests that so-called 'strawmen' are involved in the process of redistributing the money back to the criminal investor. These are persons who are not directly associated with the criminal group but who nevertheless have a relationship of trust with the group, as in the following annotation.

In vivo code (Straw man received money)
The funding went from V1 to B3 and then to the father of M.O.
The final step of the condition-action sequence is the condition that the money is available for the criminal investor as legal money, denoted as 'return of investment is available'. This is justified by the following in-vivo code:

In vivo code (Return of investment available)
On the basis of witnesses and financial investigations it is suspected that O1 and all persons directly or indirectly associated with him received considerable boni for transactions in which V01 and his companies had been involved.
This is a brief example of how the process of coding in Grounded Theory research and conceptual modeling in software engineering are tightly interwoven. Condition-action sequences provide a means for increasing abstraction from the concrete data of a case. As CCD provides templates of program code, the final programming becomes easier the more effort is put into a close integration of Grounded Theory research and conceptual modeling.
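The money laundering sequence described in this section can be written down as a small rule chain. The following is only an illustrative sketch, not the code template generated by CCD: the ordering of the steps is read from the prose rather than from Figure 1, the numeric value of `TRUST_THRESHOLD` is an assumption (the text gives no value), and the function names `run_chain` and `justify` are hypothetical.

```python
TRUST_THRESHOLD = 0.5  # assumed value; the source does not state one

# Each model element is mapped to the in-vivo code heading that justifies it
# in the text. None means the text quotes no passage for that element.
JUSTIFICATION = {
    "illegal money available": "In vivo code (Illegal money available)",
    "level of trust above threshold": "In vivo code (Invest in legal market)",
    "give money to trustee": None,
    "legal money available": "In vivo code (Legal money available)",
    "straw man received money": "In vivo code (Straw man received money)",
    "return of investment is available":
        "In vivo code (Return of investment available)",
}

# Steps that follow once both starting conditions are fulfilled.
FOLLOW_UP = [
    "give money to trustee",
    "legal money available",
    "straw man received money",
    "return of investment is available",
]


def run_chain(illegal_money, trust, threshold=TRUST_THRESHOLD):
    """Return the model elements reached for one criminal investor."""
    reached = []
    if illegal_money > 0:
        reached.append("illegal money available")
    if trust > threshold:
        reached.append("level of trust above threshold")
    if len(reached) < 2:
        # A starting condition is missing, so no action is triggered.
        return reached
    return reached + FOLLOW_UP


def justify(element):
    """Trace a model element back to its in-vivo code heading, if any."""
    return JUSTIFICATION[element]


for element in run_chain(illegal_money=300.0, trust=0.9):
    print(element, "<-", justify(element))
```

With insufficient trust, for example `run_chain(300.0, 0.1)`, the chain stops after the first condition, which mirrors the requirement that both starting conditions must hold before money is handed to the trustee.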

Interpreting Simulation Results: Final Integration - An Example
Next we highlight how simulation results enable the development of a story line of a case, as characterized by the notion of theoretical coding in a Grounded Theory approach. Thus we show how the analysis of simulation results can be put in terms of a narrative of a case. Rather than aiming at numerical accuracy, we attempt to show that this provides a final integration into a coherent picture in terms of Grounded Theory. This can be facilitated by utilizing the concept of stylized facts [9].
The concept of stylized facts was introduced by Kaldor in the 1960s in macroeconomic growth theory [16]. The central tenet of stylized facts is "to offer a way to identify and communicate key observations that demanded scientific explanation" [17][18][19][20][21][22]. For this purpose empirical details are deliberately left out. Stylized facts are instead broad patterns that are not exactly identical to empirical data but highlight the central mechanisms at work in a certain class of phenomena, and which should be valid beyond a single phenomenon [9].
How stylized facts describe the explanatory narrative of the field will now be demonstrated using the example of prior research on simulating ethnonationalist radicalization [7]. The model is based on the much-studied case of the ethnic conflicts in the former Yugoslavia. The rich body of existing literature makes it possible to include a thick description of details of the radicalization process in the mechanisms of the model. The evidence draws mostly on [18][19][20][21][22][23][24] and occasionally also on other sources. The literature suggests that, while awareness of ethnic identities had been historically deep-seated in principle, the escalation started in the mid-1980s as a power struggle in the Yugoslavian Communist Party. However, it is well known that civilians became paramilitary fighters and that war crimes were also committed by civilians. The wars caused massive refugee movements which further destabilized the state and interethnic relations [25]. The target of the model is the process of nationalist radicalization in public political attitudes. The basic mechanism is a recursive feedback relation between dynamics on the political level and socio-cultural dynamics on the population level in the process of attitude formation (Figure 2).
Here we highlight only the basic idea of the model. For more details see [7,26]. The agents have two value orientations: civil values or national identities. Political actors enforce value orientations in the population through political campaigns, e.g., by holding speeches [7]. This can be described as the top-down mechanism. On the other hand, political actors get selected according to their popularity in the population. The popularity reflects the support for the political campaigns, i.e., the extent to which a political actor appeals to the value orientations in public opinion. The selection of political actors by the population is the bottom-up mechanism of the model. Together these form a positive feedback cycle, which is known to produce unstable behavior. However, politicians compete over various value orientations, which dampens the escalation. The model was calibrated against the population census of 1991 in Serbia, Croatia, and Bosnia-Herzegovina. Whereas Serbia and Croatia had an ethnically rather homogeneous population, the population of Bosnia-Herzegovina was highly ethnically mixed. Thus the model consists of three political units: two with a homogeneous population and one with a heterogeneous population. Next, selected simulation results are displayed. Figure 3 shows attitude dynamics in a homogeneous political unit, Figure 4 attitude dynamics in a heterogeneous unit. While both political units face radicalization, the dynamics differ. The homogeneous population faces an early radicalization of political attitudes, whereas in the heterogeneous political unit radicalization is delayed. This suggests an investigation of the agenda of the political actors. Table 1 displays the most popular political actors in a mixed population. Table 2 shows the most popular political actors in a homogeneous population. Thus, in the case of a heterogeneous population the feedback cycle between political actors and public attitudes remains largely balanced.
In contrast, the homogeneous population is more biased towards a nationalist agenda. Interpreting these simulation results reveals two stylized facts of the escalation dynamics. The first mechanism concerns processes on a political macro level. The second mechanism concerns micro processes of neighborhood relations. The interpenetration of the processes reveals a sequential ordering [7]. First, an initial radicalization in a political unit A fosters counter-radicalization in a political unit B. This implies that the political agenda in A is monitored in B and that radicalization is perceived as a threat. Examples of this stylized fact are Serbia and Croatia, but also North and South Korea or, in times of the Cold War, the USA and the Soviet Union. Successful appeals to national identities in A provide an opportunity for political appeals to identities in B to gain support as well. However, the success condition for counter-radicalization is a rather homogeneous population. Populations with mixed national identities provide more power of resistance since national identities remain ambiguous. Second, once conflicts have emerged in homogeneous political units, refugees from outside play an essential role in the later radicalization of the ethnically mixed political unit. This process is driven by neighborhood relations from the bottom up. This is an example of how simulation results tell a story. The narrative is a stylized fact of a broad pattern characteristic of a certain class of phenomena. The final insight of the simulation results is not the numerical figures but rather a qualitative insight that enables the integration of the evidence into a coherent pattern [12]. In terms of Grounded Theory this is related to the objective of theoretical coding, namely to reveal the narrative of the field. For instance, in the example that we discussed here, it questions a so-called 'diversity-breeds-conflict' theory which is often consulted to explain ethnic conflicts [27,28].
On the contrary, a qualitative interpretation of the simulation results indicates that ethnically mixed populations provide more power of resistance against initial political radicalization. Diversity in itself is not a sufficient cause of conflict. The 'diversity-breeds-conflict' theory lacks a specification of the mechanisms of conflict escalation [7]. Thus the interpretation of simulation results by means of stylized facts provides a means for specifying the mechanisms at work in a social system. This is the theoretical finding of an inductive research process in which simulation and interpretative, qualitative research cross-fertilize each other.
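The coupling of the top-down and bottom-up mechanisms described earlier in this section can be illustrated with a deliberately small toy loop. It is not the model of [7]: representing a value orientation as a single number between 0 (civil values) and 1 (national identity), the closeness-based `popularity` measure, the `influence` parameter and all function names are assumptions made only to show the positive feedback between campaigns and selection.

```python
def popularity(agenda, opinions):
    """Mean closeness between a political agenda and citizens' orientations.

    Both are numbers in [0, 1]: 0 stands for civil values, 1 for national
    identity (an assumed encoding, not taken from the original model).
    """
    return sum(1.0 - abs(agenda - o) for o in opinions) / len(opinions)


def select_leader(agendas, opinions):
    """Bottom-up mechanism: the population selects the most popular agenda."""
    return max(agendas, key=lambda agenda: popularity(agenda, opinions))


def campaign(opinions, leader_agenda, influence=0.1):
    """Top-down mechanism: a campaign pulls each citizen towards the agenda."""
    return [o + influence * (leader_agenda - o) for o in opinions]


def simulate(opinions, agendas, steps, influence=0.1):
    """Alternate selection and campaigning; return the opinion history."""
    history = [list(opinions)]
    for _ in range(steps):
        leader = select_leader(agendas, opinions)
        opinions = campaign(opinions, leader, influence)
        history.append(opinions)
    return history


# A population leaning towards national identity and two competing agendas.
history = simulate([0.6, 0.7, 0.65], agendas=[0.0, 1.0], steps=5)
print(history[0], history[-1])
```

In this toy setting the selected agenda gains popularity with every campaign round, which is the self-reinforcing cycle named in the text. The sketch contains neither several political units nor refugees, so it does not reproduce the delayed radicalization of the heterogeneous unit.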

Conclusion
In this research note we attempted to demonstrate the usefulness of qualitative evidence for agent-based social simulation, both in the development of rules for agent behavior and in the interpretation of numerical simulation results. For this purpose we consulted Grounded Theory, a well-established research methodology of qualitative social research. We demonstrated this with two examples from our past and current research. The first example shows how qualitative data can be transformed into a conceptual model for agent-based simulation. Software engineering guided by qualitative data analysis ensures traceability of the agents' behavioral rules to dense empirical evidence. In terms of Grounded Theory this can be denoted as open coding. The second example shows how numerical simulation results can be woven into a narrative of a case. For this purpose we rely on the concept of stylized facts. In terms of Grounded Theory this can be denoted as theoretical coding, which provides a final integration of the evidence into a coherent picture. These examples demonstrate the mutual cross-fertilization of qualitative social research and software engineering in agent-based social simulation.