Behaviour examples for synthesizing automaton models by temporal formulas

The paper studies and develops methods that make it possible to account for behaviour examples when synthesizing automaton models by temporal formulas. The terms and concepts used in the work are defined; the problem of synthesizing automaton systems from a specification given as temporal formulas and behaviour examples is formulated; a promising algorithm for reducing the synthesis of automaton systems to the Boolean formula satisfiability problem is described; and the domain and other approaches are analysed. New methods of taking behaviour examples into account in the synthesis of automaton systems from a specification given as temporal formulas are proposed. Algorithms for constructing script graphs and methods for dividing these graphs into clusters are described; they are designed to make the representation of behaviour examples more efficient when the examples are encoded as Boolean formulas. An experimental study compares the proposed methods of accounting for behaviour examples with the basic approaches to representing them. The experimental results show that the newly developed methods outperform the representation of scripts as temporal formulas. The main conclusions of the work are summarized at the end.


INTRODUCTION
The synthesis of automaton models is a common problem. Its field of application ranges from software verification and control system synthesis (Vashkevich & Biktashev, 2016; Faymonville et al., 2017; Volchikhin et al., 2013) to bioinformatics problems and the formal description of parallel algorithms and processes (Vashkevich et al., 2015). A common way to solve this problem is to reduce it to the Boolean formula satisfiability problem (Vashkevich & Vashkevich, 1996). Typically, linear temporal logic formulas are used to define the specification of the synthesized system. In this area, a promising new approach to synthesizing an automaton model with a constraint on the size of the system has recently appeared (Faymonville et al., 2017). However, temporal logic formulas alone are often not enough to specify all the features of the synthesized system, and sometimes the task is to synthesize an automaton system from behaviour examples only. The purpose of the paper is to study effective methods of accounting for behaviour examples and to compare these methods with other approaches to representing behaviour examples.

MATERIALS AND METHODS
The Boolean formula satisfiability problem is the problem of finding an assignment of the propositional variables of a Boolean formula under which the formula becomes true. In this problem, the formula may contain only variables, parentheses, and the operations AND, OR, and NOT, and all of its propositional variables are implicitly bound by an existential quantifier. The satisfiability problem for a Boolean formula with quantifiers is an extension of the Boolean formula satisfiability problem in which variables can be bound by the universal quantifier as well as by the existential one.
Synthesis with a system size constraint (bounded synthesis) is an approach to automaton model synthesis with a constraint on the size of the final system and on the number of visits to rejecting states. The problem of synthesis with the system size constraint can be represented as the problem of solving a system of constraints, even in settings where the unbounded synthesis problem is undecidable, for example, in the synthesis of asynchronous or distributed systems (Vashkevich, 2004).
Linear temporal logic (LTL) is a logic whose operators allow us to reason about time. With its help, we can specify the order of phenomena and their interactions in time. In addition to the usual logical operators, linear temporal logic formulas support the following operators, where $\varphi$ and $\psi$ denote formulas:
• $X\,\varphi$ (next): $\varphi$ must be satisfied in the next state.
• $F\,\varphi$ (finally): $\varphi$ must be satisfied in the current state or in one of the following states.
• $G\,\varphi$ (globally): $\varphi$ must be satisfied in the current state and in every following state.
• $\varphi\,U\,\psi$ (until): $\varphi$ must be satisfied at least until the moment when $\psi$ is satisfied, and $\psi$ must be satisfied now or in the future.
• $\varphi\,R\,\psi$ (release): $\psi$ must be satisfied up to and including the state in which $\varphi$ is first satisfied; if $\varphi$ is never satisfied, then $\psi$ must always be satisfied.
• $\varphi\,W\,\psi$ (weak until): $\varphi$ must be satisfied at least until the moment when $\psi$ is satisfied; if $\psi$ is never satisfied, then $\varphi$ must always be satisfied.
• $\varphi\,M\,\psi$ (strong release): $\psi$ must be satisfied up to and including the state in which $\varphi$ is first satisfied, and $\varphi$ must be satisfied now or in the future.
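For reference, the operators listed above can all be expressed through X and U by standard identities of linear temporal logic. The summary below is a reading aid rather than notation taken from the paper; $\varphi$ and $\psi$ stand for arbitrary formulas.

```latex
\begin{align*}
F\,\varphi          &\equiv \mathit{true}\; U\; \varphi \\
G\,\varphi          &\equiv \neg F\,\neg\varphi \\
\varphi\; R\; \psi  &\equiv \neg(\neg\varphi\; U\; \neg\psi) \\
\varphi\; W\; \psi  &\equiv (\varphi\; U\; \psi) \lor G\,\varphi \\
\varphi\; M\; \psi  &\equiv \psi\; U\; (\varphi \land \psi)
\end{align*}
```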
Let $\Sigma$ be a finite set of propositional variables and $\varphi$ a linear temporal logic formula defined over this set. The language of the formula $\varphi$, denoted by $\mathcal{L}(\varphi)$, consists of the infinite sequences of states $\sigma \in (2^{\Sigma})^{\omega}$ that satisfy $\varphi$. An example of a linear temporal logic specification is that of an arbiter system, which, after request 0 or request 1, must sooner or later issue the corresponding permission, while issuing both permissions simultaneously is prohibited.
A universal co-Büchi automaton $\mathcal{A}$ over a finite alphabet $\Sigma$ is given by the quadruple $\langle Q, q_0, \delta, F \rangle$, where $Q$ is a finite set of automaton states, $q_0 \in Q$ is the initial state, $\delta \subseteq Q \times 2^{\Sigma} \times Q$ is the transition relation, and $F \subseteq Q$ is the set of rejecting states. Given an infinite word $\sigma \in (2^{\Sigma})^{\omega}$, a run of the automaton on this word is an infinite sequence of states $q_0 q_1 q_2 \ldots \in Q^{\omega}$. A run is accepting if it visits rejecting states only finitely many times. The automaton accepts the word if all of its runs on this word are accepting. The language of the automaton is denoted by $\mathcal{L}(\mathcal{A})$ and is the set $\{\sigma \in (2^{\Sigma})^{\omega} \mid \mathcal{A} \text{ accepts } \sigma\}$. In what follows, universal co-Büchi automata are depicted as directed graphs with vertices $Q$ and a symbolic representation of the relation $\delta$ in the form of propositional Boolean formulas $\mathbb{B}(\Sigma)$. Rejecting states are marked with double lines.
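Returning to the arbiter example described in words above, one possible way to write such a specification is shown below. The variable names are assumptions introduced for illustration only: $r_0, r_1$ stand for the two requests and $g_0, g_1$ for the corresponding permissions.

```latex
\[
G\,(r_0 \rightarrow F\,g_0) \;\land\; G\,(r_1 \rightarrow F\,g_1) \;\land\; G\,\neg(g_0 \land g_1)
\]
```

The first two conjuncts state that every request is eventually answered, and the third forbids issuing both permissions in the same state.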
Let $\Sigma$ be a set of propositional variables; we divide it into two parts: $I$, the variables controlled by the environment, and $O$, the variables controlled by the system. Here the environment is the external environment with which the automaton system interacts, and the system is the automaton system itself. Thus, it is assumed that the variables from the set $I$ are changed only by the external environment, and the variables from the set $O$ are changed only by the automaton system. A transition system $\mathcal{T}$ is given by the triple $\langle T, t_0, \tau \rangle$, where $T$ is a finite set of states, $t_0 \in T$ is the initial state, and $\tau: T \times 2^{I} \rightarrow 2^{O} \times T$ is the transition function. If, for every state $t \in T$ and all inputs $i \neq i' \in 2^{I}$, $\tau(t, i) = (o, \_)$ and $\tau(t, i') = (o', \_)$ imply $o = o'$, then the transition system is called a Moore transition system; otherwise, it is a Mealy transition system.
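The distinction between Moore and Mealy transition systems can be illustrated with a small sketch. The dictionary-based representation of $\tau$, the helper names, and the two toy systems over $I = \{r\}$, $O = \{g\}$ are assumptions made for this example and do not come from the paper.

```python
from itertools import product

def all_inputs(input_vars):
    """Enumerate every valuation in 2^I as a frozenset of the variables set to true."""
    for bits in product([False, True], repeat=len(input_vars)):
        yield frozenset(v for v, b in zip(input_vars, bits) if b)

def is_moore(states, input_vars, tau):
    """Return True if the output in every state is independent of the input.

    tau maps (state, input valuation) to (output valuation, next state).
    """
    for t in states:
        outputs = {tau[(t, i)][0] for i in all_inputs(input_vars)}
        if len(outputs) > 1:
            return False
    return True

# Hypothetical systems over I = {r}, O = {g}.
I = ["r"]
no_r, r = frozenset(), frozenset({"r"})
no_g, g = frozenset(), frozenset({"g"})

# Mealy: the permission is issued in the same step as the request.
mealy_tau = {
    (0, no_r): (no_g, 0),
    (0, r): (g, 0),
}
# Moore: state 0 never issues the permission, state 1 always does.
moore_tau = {
    (0, no_r): (no_g, 0),
    (0, r): (no_g, 1),
    (1, no_r): (g, 0),
    (1, r): (g, 1),
}

print(is_moore([0], I, mealy_tau))     # False
print(is_moore([0, 1], I, moore_tau))  # True
```

In the Moore system the reaction to a request is delayed by one step, which is why it needs a second state.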
The problem of synthesizing reactive automaton systems is to find a minimal automaton system that implements the given specification. There are many approaches to solving this problem, but one of the most successful is the reduction to the Boolean formula satisfiability problem.
The main advantage of this approach is that, as programs for solving the Boolean formula satisfiability problem improve, the speed of the search for the target reactive system improves as well. Usually, the specification of the target reactive system is given as linear temporal logic formulas; however, it is often necessary to also indicate behaviour examples of the sought system. Within the framework of this paper, we use the term script for a sequence of pairs of vectors $i \in 2^{|I|}$ and $o \in 2^{|O|}$ that define the state of the variables controlled by the environment and of the variables controlled by the system. A set of several scripts is called behaviour examples. A single pair that makes up a script is called a script element. A reactive system is considered to implement the behaviour examples if it reproduces every script in the given set. Behaviour examples allow us to specify additional system properties that are not present in the LTL specification. In addition, the task of synthesizing a reactive system is often posed by behaviour examples only (Finkbeiner & Schewe, 2007).
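The notion of a system implementing behaviour examples can be made concrete with a short check. This is a minimal sketch under assumed representations: a script is a list of (input valuation, expected output valuation) pairs, and the transition function is a dictionary; neither the function names nor the toy system come from the paper.

```python
def reproduces(tau, t0, script):
    """Return True if the transition system reproduces the script from t0.

    tau maps (state, input valuation) to (output valuation, next state);
    a script is a list of (input valuation, expected output valuation) pairs.
    """
    t = t0
    for i, expected_o in script:
        if (t, i) not in tau:
            return False
        o, t = tau[(t, i)]
        if o != expected_o:
            return False
    return True

def implements(tau, t0, behaviour_examples):
    """A system implements the behaviour examples if it reproduces every script."""
    return all(reproduces(tau, t0, s) for s in behaviour_examples)

# Hypothetical Moore-style system over I = {r}, O = {g}.
no_r, r = frozenset(), frozenset({"r"})
no_g, g = frozenset(), frozenset({"g"})
tau = {
    (0, no_r): (no_g, 0),
    (0, r): (no_g, 1),
    (1, no_r): (g, 0),
    (1, r): (g, 1),
}

good = [(r, no_g), (no_r, g), (no_r, no_g)]  # request, delayed permission, idle
bad = [(r, g)]                               # immediate permission is not produced

print(implements(tau, 0, [good]))       # True
print(implements(tau, 0, [good, bad]))  # False
```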
An annotation function $\lambda: T \times Q \rightarrow \{\bot\} \cup \mathbb{N}$ is a function that assigns to each vertex of the run graph either a natural number $n$, or $\bot$ if the vertex is unreachable. An annotation function is considered correct if the following conditions are met: a natural number is assigned to the pair of initial states $(t_0, q_0)$; and if a natural number $n$ is assigned to the pair of states $(t, q)$, then for all $i \in 2^{I}$ and $o \in 2^{O}$ such that $\tau(t, i) = (o, t')$ and $(q, i \cup o, q') \in \delta$, the pair $(t', q')$ must be assigned a natural number that is greater than or equal to $n$, and strictly greater than $n$ if $q' \in F$. Let us describe the construction of a system of constraints for given $\mathcal{T}$, $\mathcal{A}$, and $\lambda$ that is satisfiable only if the annotation function is correct. By definition, the correctness of the annotation function can be established by checking all transitions in the run graph; for this, we encode $\mathcal{T}$, $\mathcal{A}$, and $\lambda$ with the following propositional variables:
• The transition function $\tau$ of the transition system is represented by two kinds of propositional variables: $o_{t,i}$ for each output variable $o \in O$ controlled by the system, and $\tau_{t,i,t'}$, denoting the transition from state $t$ to state $t'$. Let $(t, t') \in T \times T$ and $i \in 2^{I}$; then $\tau_{t,i,t'} = \mathit{True} \iff \tau(t, i) = (\_, t')$ and $o_{t,i} = \mathit{True} \iff \tau(t, i) = (o', \_)$, where $o \in o'$.
• The transition relation $\delta \subseteq Q \times 2^{I \cup O} \times Q$ of the co-Büchi automaton can be represented as a formula $\delta_{t,q,i,q'}$ over the variables $o_{t,i}$ such that an assignment $o$ to the variables $o_{t,i}$ satisfies $\delta_{t,q,i,q'} \iff (q, i \cup o, q') \in \delta$.
• For convenience, we divide the annotation function into two parts: $\lambda^{\mathbb{B}}: T \times Q \rightarrow \mathbb{B}$, which represents the reachability of a vertex, and $\lambda^{\#}: T \times Q \rightarrow \mathbb{N}$. For each $t \in T$ and $q \in Q$, we introduce the variable $\lambda^{\mathbb{B}}_{t,q}$, which is true if and only if the vertex $(t, q)$ of the run graph is reachable from the initial vertex, and the variable $\lambda^{\#}_{t,q}$, represented by a bit vector and equal to the value $\lambda(t, q)$.
Using the propositional variables described above, a Boolean formula is composed that checks the correctness of the annotation function.
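The two correctness conditions of the annotation function can also be checked directly on small explicit instances, which may help when reading the encoding. The sketch below is not the Boolean encoding itself: it enumerates transitions explicitly, and the data layout (dictionaries and sets), the use of `None` for $\bot$, and the toy automaton for "the permission g is never issued" are all assumptions made for illustration.

```python
def annotation_is_correct(tau, t0, delta, q0, rejecting, lam):
    """Check the correctness conditions of an annotation function.

    tau:   (t, i) -> (o, t'), with valuations given as frozensets
    delta: set of triples (q, letter, q') with letter = i | o
    lam:   (t, q) -> natural number; a missing key or None stands for bottom
    """
    # Condition 1: the initial pair must carry a natural number.
    if lam.get((t0, q0)) is None:
        return False
    # Condition 2: annotations never decrease along run-graph edges
    # and strictly increase when a rejecting state is entered.
    for (t, q), n in lam.items():
        if n is None:
            continue
        for (t_src, i), (o, t_next) in tau.items():
            if t_src != t:
                continue
            for q_src, letter, q_next in delta:
                if q_src != q or letter != (i | o):
                    continue
                m = lam.get((t_next, q_next))
                if m is None:
                    return False
                if m < n or (q_next in rejecting and m == n):
                    return False
    return True

# Hypothetical universal co-Buchi automaton for "g never holds":
# q1 is a rejecting sink entered whenever g is observed.
no_r, r = frozenset(), frozenset({"r"})
no_g, g = frozenset(), frozenset({"g"})
letters = [no_r, r, g, r | g]
delta = {("q0", a, "q0") for a in letters}
delta |= {("q0", a, "q1") for a in letters if "g" in a}
delta |= {("q1", a, "q1") for a in letters}
rejecting = {"q1"}

never_grant = {(0, no_r): (no_g, 0), (0, r): (no_g, 0)}
always_grant = {(0, no_r): (g, 0), (0, r): (g, 0)}

print(annotation_is_correct(never_grant, 0, delta, "q0", rejecting,
                            {(0, "q0"): 0}))                 # True
print(annotation_is_correct(always_grant, 0, delta, "q0", rejecting,
                            {(0, "q0"): 0, (0, "q1"): 1}))   # False
```

For the second system no choice of numbers can work, because the loop on the rejecting state would require a number strictly greater than itself.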
If, for the given $\mathcal{T}$, $\mathcal{A}$, and $\lambda$, the propositional variables satisfy the system of constraints, then the annotation function is correct (Heule & Verwer, 2013).
There are several approaches to the synthesis of automaton systems that take into account both a specification in the form of linear temporal logic formulas and behaviour examples. Among them are the following methods:
- An iterative approach based on reduction to the Boolean formula satisfiability problem, which works in several stages: the scripts are represented as a script tree, which is then encoded as a Boolean formula; an automaton system is generated from the obtained formula; the synthesized automaton system is checked for compliance with the linear temporal logic formulas using model checking (Finkbeiner & Schewe, 2013); if counterexamples are found at this stage, they are added to the script tree with a special mark and the whole process is repeated. In this way, either an automaton that implements the given specification is gradually derived, or it is established that no such system exists. An important point in this approach is the use of software tools for the Boolean formula satisfiability problem that support incremental solving (Clarke et al., 1999), which makes it unnecessary to restart the search from scratch after each counterexample is added.
- Another approach is the method of (Vashkevich & Biktashev, 2016), which is based on reduction to the satisfiability problem for a Boolean formula with quantifiers and on model checking with a bound on the length of the checked paths. This approach is similar to the previous one, but instead of adding counterexamples, the bound on the length of the checked paths is increased step by step. After the automaton is generated, it is checked, as in the previous method, against the LTL properties; if it does not satisfy them, the bound used for model checking is increased and the operation is repeated. There is also a modification of this approach in which the formula with quantifiers is translated into a quantifier-free satisfiability formula by an almost exponential expansion of the universal quantifiers.

Theoretical research
Today, perhaps the only way of representing behaviour examples that allows them to be added to the specification when using synthesis with a constraint on the size of the system is to represent them as linear temporal logic formulas. Since this method requires no modification of the algorithm for reducing to the Boolean formula satisfiability problem used in the bounded synthesis approach, we take it as the baseline for comparison with the other proposed methods. To represent behaviour examples as linear temporal logic formulas, each element in the list of elements of a script is replaced in sequence, producing a construction of the form $i_k \rightarrow o_k \land X(i_{k+1} \rightarrow o_{k+1} \land \ldots)$, where the vector $i_k$ specifies the state of the environment variables in the $k$-th element of the script, the vector $o_k$ specifies the state of the system variables, and $X$ denotes the temporal logic operator "next".
As the form of this construction shows, the disadvantage of this approach is a strong increase in the size of the formula defining the specification of the system, namely, by $(|I| + |O|) \cdot n$, where $|I|$ is the size of the set of environment variables, $|O|$ is the size of the set of system variables, and $n$ is the number of script elements in the given behaviour examples. As the formula grows, the time required to construct the co-Büchi automaton increases, and the co-Büchi automaton itself grows as well, which ultimately leads to an inefficient growth of the corresponding Boolean formula and to an increase in the time spent on synthesizing the target automaton system. In addition, this method does not take into account the specific features of behaviour examples.
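The conversion of a script into a linear temporal logic formula, and the resulting growth of the formula, can be sketched as follows. The textual syntax (`!`, `&`, `->`, `X`) and the helper names are assumptions chosen for this example; the implication is parenthesized so that $i_k \rightarrow (o_k \land X(\ldots))$ is read unambiguously.

```python
def valuation(true_vars, all_vars):
    """Conjunction fixing every variable: v if it is true, !v otherwise."""
    return " & ".join(v if v in true_vars else f"!{v}" for v in all_vars)

def script_to_ltl(script, inputs, outputs):
    """Build i_1 -> (o_1 & X(i_2 -> (o_2 & X(...)))) from a script.

    script is a list of (set of true input vars, set of true output vars).
    """
    formula = ""
    # Build from the last element outwards so that each step wraps the rest.
    for i, o in reversed(script):
        body = f"({valuation(o, outputs)})"
        if formula:
            body += f" & X({formula})"
        formula = f"({valuation(i, inputs)}) -> ({body})"
    return formula

# Hypothetical script: a request without permission, then permission.
script = [({"r"}, set()), (set(), {"g"})]
print(script_to_ltl(script, ["r"], ["g"]))
# (r) -> ((!g) & X((!r) -> ((g))))
```

Every script element contributes one literal per environment variable and one per system variable, which matches the $(|I| + |O|) \cdot n$ growth discussed above.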
Another way to represent behaviour examples is in the form of a script tree or a script graph. The original algorithm for constructing a script tree was described in (Vashkevich, 2004; Eén & Sörensson, 2003); we use a modified version of it in this work. A script tree or script graph is a tree or, respectively, a graph in which each edge is labelled with a state of the environment variables and the expected state of the system variables. By traversing the tree or graph and reading the conditions on its edges, we can obtain all the scripts from which it was constructed. Initially, the tree contains only one vertex, its root. Script elements are then added in sequence according to the following principle: if the current vertex has an outgoing edge whose condition matches the script element, this edge is followed to the next vertex; if there is no such edge, a new vertex is added, a new edge with the required condition is drawn to it from the current vertex, and the new vertex becomes the current one. The script tree constructed by this algorithm for a set of example scripts is shown in Figure 1. As the example shows, the script tree defines behaviour examples more efficiently by reducing the amount of information required to represent the scripts.
Let us develop the idea of combining behaviour examples into a script tree and move on to the script graph. A script graph differs from a script tree in that the final elements of the scripts are merged in addition to the initial ones. The construction of a script graph starts in the same way as the construction of a tree, but back edges are now also stored for each vertex. At the end of the first stage, all vertices that have no outgoing edges are merged into one final vertex. Then, starting from the final vertex, the graph is traversed recursively along the back edges: if several back edges with the same condition originate from a vertex, the vertices to which these edges lead are merged into one, after which the traversal continues from the merged vertices. The process is repeated until the root is reached.
One key point of this algorithm is worth noting: when traversing and merging vertices along back edges, it is very important not to create the possibility of obtaining new scripts that were absent from the original list. To illustrate such a situation, suppose that after the first step we have the tree shown in Figure 2. If the case described above is not taken into account, then merging vertices along back edges yields the script graph shown in Figure 3. This graph contains a script that was absent from the original list of scripts. This can lead to the synthesis of an incorrect automaton system, and it can complicate the search for a solution of the Boolean formula satisfiability problem or even make the problem unsatisfiable if the linear temporal logic formulas in the specification prohibit or contradict such transitions. To avoid these problems, we can use the approach applied in the algorithm for minimizing finite state automata: the vertices are divided into equivalence classes, and a new script graph is then constructed from these classes. As an example, Figure 4 shows the script graph for the same behaviour examples for which the script tree was constructed above. Using a script graph can further reduce the amount of information needed to represent behaviour examples. The methods described later in this section are based, in one way or another, on the search for a correspondence between the vertices of a script tree or graph and the vertices of the target transition system.
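The construction of a script tree and its compression into a script graph by equivalence classes can be sketched as follows. This is one possible realization, not necessarily the procedure used in the paper: vertices are merged only when their outgoing behaviour is identical, which is the usual bottom-up minimization of an acyclic automaton and therefore cannot introduce new scripts. Edge labels are (input, output) pairs written as short strings, and all names are hypothetical.

```python
def build_script_tree(scripts):
    """Build a script tree: children[v] maps an edge label (i, o) to a vertex."""
    children = [{}]  # vertex 0 is the root
    for script in scripts:
        v = 0
        for label in script:
            if label not in children[v]:
                children.append({})
                children[v][label] = len(children) - 1
            v = children[v][label]
    return children

def merge_equivalent(children):
    """Merge vertices with identical outgoing behaviour, bottom-up.

    Returns (graph, root), where graph[c] maps labels to class identifiers.
    """
    class_of_signature = {}
    vertex_class = {}
    graph = {}
    # In a tree built as above, a child always has a larger index than its
    # parent, so iterating in reverse index order processes children first.
    for v in range(len(children) - 1, -1, -1):
        edges = {label: vertex_class[c] for label, c in children[v].items()}
        signature = frozenset(edges.items())
        if signature not in class_of_signature:
            class_of_signature[signature] = len(class_of_signature)
            graph[class_of_signature[signature]] = edges
        vertex_class[v] = class_of_signature[signature]
    return graph, vertex_class[0]

def paths(graph, v):
    """Enumerate all label sequences from v to the final vertex."""
    if not graph[v]:
        yield ()
        return
    for label, c in graph[v].items():
        for rest in paths(graph, c):
            yield (label,) + rest

# Hypothetical behaviour examples sharing a prefix and a suffix.
scripts = [
    [("a", "x"), ("b", "y"), ("c", "z")],
    [("a", "x"), ("d", "w"), ("c", "z")],
    [("e", "v"), ("c", "z")],
]
tree = build_script_tree(scripts)
graph, root = merge_equivalent(tree)
print(len(tree), len(graph))  # 8 4
print(set(paths(graph, root)) == {tuple(s) for s in scripts})  # True
```

In this example the tree has eight vertices and the graph four, while the set of scripts readable from the root is unchanged.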
As mentioned above, the methods of accounting for behaviour examples in the synthesis of automaton models are based on the search for a mapping between the vertices of the script graph and the states of the transition system. To do this, we introduce a new type of Boolean variables, $S_{t,j}$, defining the correspondence between vertices and states. Thus, $S_{t,j}$ is true if and only if the state $t$ of the transition system corresponds to the vertex $j$ of the script graph. We also denote by $E(j)$ the set of edges outgoing from the vertex $j$, each represented as a triple consisting of the vertex that the edge enters, the condition on the variables controlled by the environment, and the condition on the variables controlled by the system. To confirm the correctness of the established correspondence between the vertices of the script graph and the states of the transition system, it is necessary to check, for each vertex, the correspondence between the edges outgoing from it and the transitions of the transition system. In other words, if the state $t$ corresponds to the vertex $j$, then for each edge outgoing from $j$ there must be a matching transition from $t$ in the transition system.
[…] which converted the behaviour examples into linear temporal logic formulas for comparison with the base implementation. The resulting data were fed to the input of the tested program. During testing, the running time of each individual step of the bounded synthesis algorithm was measured, as well as the total running time of all steps. About twenty runs were made for each input and each method, and the average time of each stage over these runs was taken; before that, five warm-up runs were performed without time measurements.
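The measurement protocol described above (unmeasured warm-up runs followed by measured runs whose times are averaged per stage) can be sketched as a small harness. The function names, the default counts of 5 and 20, and the placeholder stages are assumptions for illustration; the actual tooling used in the experiments is not described in the text.

```python
import time
from statistics import mean

def measure(step, warmup=5, runs=20):
    """Average wall-clock time of step() over `runs` measured calls,
    preceded by `warmup` calls whose time is not recorded."""
    for _ in range(warmup):
        step()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        step()
        samples.append(time.perf_counter() - start)
    return mean(samples)

def measure_pipeline(steps):
    """steps maps a stage name to a callable.

    Returns the average time of each stage and the total over all stages.
    """
    per_stage = {name: measure(fn) for name, fn in steps.items()}
    return per_stage, sum(per_stage.values())

# Placeholder stages standing in for the steps of the synthesis pipeline.
stages = {
    "build_automaton": lambda: sum(range(1000)),
    "encode": lambda: sorted(range(1000), reverse=True),
    "solve": lambda: [x * x for x in range(1000)],
}
per_stage, total = measure_pipeline(stages)
for name, seconds in per_stage.items():
    print(f"{name}: {seconds:.6f} s")
print(f"total: {total:.6f} s")
```

`time.perf_counter` is used because it is a monotonic, high-resolution clock intended for measuring short durations.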
The experimental results for the approaches based on reduction to the Boolean formula satisfiability problem and for the approaches based on reduction to the satisfiability problem of a Boolean formula with quantifiers are presented in the graphs shown in Figure 5 and Figure 6, respectively. The vertical axis of the graphs represents the time in seconds, and the horizontal axis gives the number of instances for which a solution was found. The running time for each instance was determined as the total running time of all stages of the bounded synthesis algorithm. As can be seen from the graph in Figure 5, for the variant that reduces the problem to Boolean formula satisfiability, the first version of the formula performed best, although the second version is not far behind it. Such a large difference from the baseline is explained by the fact that the encoding for the Boolean formula satisfiability problem in the bounded synthesis version enumerates $2^{|I|}$ valuations, and the formula grows very quickly as the co-Büchi automaton grows. The proposed methods have no dependence on $2^{|I|}$; therefore, the formula does not grow as much with the growth of the script graph.
The graph presented in Figure 6 shows that the leader is the third version of the encoding, which uses the completeness predicate of the transition function and vertex clustering; the performance gain is also significant, although it is not as large as for the other versions.
The experiments confirmed that, when the behaviour examples are represented as linear temporal logic formulas, the time required to construct the co-Büchi automaton increases significantly because of the more complex structure of the LTL specification.

CONCLUSION
The paper presents new methods of accounting for behaviour examples in the synthesis of automaton models by temporal formulas, based on the approach of synthesizing automaton models with a constraint on the size of the target system. New options for representing behaviour examples in the form of clustered script graphs are also proposed. In addition, an experimental study was carried out, which showed the high efficiency of the new methods and their severalfold superiority over the basic methods.