Object-Centric Replay-Based Conformance Checking: Unveiling Desire Lines and Local Deviations

Conformance checking methods diagnose to what extent a real system, whose behavior is recorded in an event log, complies with its specification model, e.g., a Petri net. Nonetheless, the majority of these methods focus on checking isolated process instances, neglecting the interaction between instances in a system. Addressing this limitation, a series of object-centric approaches have been proposed in the field of process mining. These approaches are based on the holistic analysis of the multiple process instances interacting in a system, where each instance is centered on the handling of an object. Inspired by the object-centric paradigm, this paper presents a replay-based conformance checking method which uses a class of colored Petri nets (CPNs), a Petri net extension where tokens in the model carry values of some types (colors). In particular, we consider conservative workflow CPNs, which make it possible to describe the expected behavior of a system whose components are centered on the end-to-end processing of distinguishable objects. To describe a system's real behavior, we consider event logs whose events carry the sets of objects involved in the execution of activities. For replay, we consider a jump strategy: tokens that are absent from the input places of a transition to be fired move from their current place in the model to the requested places. Token jumps make it possible to identify desire lines, i.e., object paths unforeseen in the specification. We also introduce local diagnostics based on the proportion of jumps in specific model components. These metrics indicate the severity of deviations in precise system parts. Finally, we report experiments supported by a prototype of our method. To show its practical value, we employ a case study on trading systems, where orders from users are matched to trade.


Introduction
Process mining is a discipline that focuses on the analysis of system processes on the basis of event logs and formal models [1]. Event logs consist of recorded traces, which describe the real behavior of systems. As formal models, most process mining methods use Petri nets, a formalism for the analysis of concurrent distributed systems [2]. Thus, process mining methods make it possible, for instance, to discover models describing the real processes from event logs, or to check compliance of real processes by comparing their logs with models describing expected behavior. The former kind of method is called process discovery, whereas the latter is referred to as conformance checking [3]. These methods have increasingly gained attention, resulting in a plethora of case studies from business organizations [4] and innovative research applications, e.g., see [5][6][7].
Nevertheless, the majority of process mining methods have hitherto consisted of the individual analysis of isolated process instances, thereby neglecting their interaction with other instances in the system. This assumption falls short and may lead to a wrong analysis, especially when there is a strong dependency between the life-cycles of instances (for example, "a buy order is filled only if a sell order is in the system, and both orders are in certain locations").
To overcome such limitations, object-centric approaches have emerged as a popular paradigm in the field of process mining [8,9]. The common denominator of these approaches lies in the holistic analysis of the multiple processes interacting in a system, where each process is centered on the management of a class of objects.
As for conformance checking, certain methods have been proposed to validate multiple perspectives of a process beyond the control-flow, i.e., the correct ordering of activities [14,15]. In particular, these methods make use of Petri nets with data (DPNs) to detect deviations caused by data corruptions (e.g., "a loan approval was wrongly executed due to a requested amount higher than expected"). However, the backbone of DPNs is an ordinary Petri net used to describe the individual execution of a single process instance, and data elements are only statically attached to model transitions. Thus, this model cannot describe and validate the dynamic and concurrent interaction of multiple instances within a system.
In this paper, inspired by the object-centric paradigm, we present a conformance checking method to diagnose whether systems that manage different kinds of objects comply with their specification. The method uses a class of colored Petri nets (CPNs), a Petri net extension where tokens carry values representing objects of different types (called colors) [16]. In particular, we consider conservative workflow CPNs with multiple source and sink places. In this model, tokens cannot disappear or duplicate, and they move through paths whose endpoints are specific pairs of source and sink places. In this way, the model can describe the expected behavior of systems with components centered on the end-to-end processing of distinct objects of a certain type. To describe the system's real behavior, we consider traces of event logs, where events consist of executed activities and sets of object identifiers. The latter indicate which objects were involved in an event's activity. As we will present, the characteristics proposed for both CPNs and event logs allow us not only to keep track of individual objects, but also to provide an algorithm with linear time complexity.
Our method is based on individually replaying each trace of an event log on top of a CPN model. When replaying a trace, its distinct objects are injected as tokens in source places of the model. Then, for each event of the trace, we seek to fire a transition labeled with the event's activity, selecting from the transition's input places the tokens that correspond to the event's object identifiers. If a token related to an event's object is not in the requested input place, we apply a jump strategy, where the missing token is moved from its current location in the model to the requested input place. This makes it possible to force the transition firing and to keep replaying the trace to find more deviations. The method reports all token jumps between places and their frequency. As we will present, this information can be added to the input CPN model to unveil so-called desire lines [17], i.e., actual paths of objects which are unforeseen in the specification model.
In addition, we present local conformance metrics based on the proportions of token transfers and jumps through specific components of a CPN model. By leveraging the correspondence of parts of a real system with components of a CPN model (e.g., activities with transitions and system locations with places), these metrics make it possible to diagnose local deviations and their severity in concrete parts of a system. Finally, we report an experimental evaluation supported by a prototypical implementation of our method. To showcase the practical value of our approach, we make use of a case study on trading systems, where orders from users are matched to trade.
The remainder of this paper is structured as follows. Section 2 introduces a motivating example illustrating the use of our method in trading systems. Sections 3 and 4 present the class of CPNs and event logs which we use in our method. Section 5 presents the conformance checking method, as well as the conformance metrics. Section 6 reports experimental evaluations supported by our prototype. Section 7 discusses the related work, and finally Section 8 presents the conclusions.

Motivating Example: Checking Conformance in Trading Systems
To give the reader an idea of how our method can be applied in a specific domain, let us consider the validation of trading systems in stock exchanges [18]. In trading systems, buy orders and sell orders from users are crossed to trade securities (e.g., stocks of a company). Orders that aim to trade the same kind of securities are placed in two-sided lists called order books (e.g., orders that buy or sell securities of "rosneft" are placed in the order book "rosneft"). In a trading system, there are as many order books as kinds of securities that can be traded. Fig. 1 presents a CPN modeling the specification of a trading system operating one order book. It describes how the system is expected to manage both object classes, buy orders and sell orders. CPNs consist of two kinds of nodes: places and transitions. Places (drawn as circles) represent locations. For example, places $p_1$ and $p_2$ are source places for orders, $p_3$ and $p_4$ model the buy and sell sides of the order book, and finally $p_5$ and $p_6$ are sink places for canceled or filled orders. Transitions (drawn as boxes), in turn, model system activities. Transitions consume tokens from input places and produce them back in output places.
Thus, two transitions model the submission of orders, two transitions represent the cancellation of orders, and one transition models a trade execution between two orders. Although this example abstracts from other activities in a trading system, it allows us to clearly illustrate our method.
Like other distributed technologies, trading systems are prone to failures, e.g., due to errors during the system's development or due to malicious users hacking the system. Thus, trading systems are susceptible to deviating from their specification. For instance, let us assume a trading system initially built upon the specification of Fig. 1, but whose real behavior is actually described in Fig. 2(a). In this real system, buy and sell orders are "silently" allowed to trade without being accepted in the order book, i.e., skipping the submission activities. Also, an undesired variant of the trade activity allows sell orders to trade an unrestricted number of times.
For system engineers, it can be hard to determine these deviations and to what extent they have violated the system. Fortunately, event logs of this system record this misbehavior (e.g., Fig. 2(b)). For instance, the first two events in trace $\sigma_1$ of Fig. 2(b) exemplify the mentioned problems of this system: a buy order b1 and a sell order s1 have traded, skipping the submission activities. Also, the sell order s1 illegally trades in the next event. Powered by this kind of log, we introduce how our conformance method unveils these deviations.
Fig. 3 illustrates the setting of our method for detecting the aforementioned deviations. As input, the method takes the CPN specification model of Fig. 1 and an event log of the real system as exemplified in Fig. 2. In this example, each trace of an event log corresponds to all events executed in a specific order book. As output, the method produces a report listing all token jumps detected (representing objects skipping activities), global and local conformance metrics, as well as other traffic statistics. These reports can be used, for example, to extend the specification model in order to clearly identify deviations and their magnitude.
In such an extended model, token jumps and their average frequencies on traces appear as dotted lines between places. Note how these jumps directly correlate with the components in Fig. 2(a) that describe undesired behavior of the real system, i.e., orders skipping the submission activities jump from places $p_1$, $p_2$ to places $p_3$ and $p_4$. Also, components of the extended CPN are labeled with traffic statistics. For instance, places indicate how many tokens were transferred from them ($k$), and how many of them actually arrived at the place by a jump ($j$). By leveraging the association between the model's transitions and places and the system's activities and locations, local conformance metrics make it possible to diagnose the severity of deviations on precise system components.
For instance, one of these metrics checks the proportion of token jumps to the input places of a concrete transition that were needed to force its firing. This makes it possible to measure the proportion of objects executing a precise activity without following the specification path. For example, one activity consumed 7 orders, but 4 of them were not ready at the location related to place $p_4$. Thus, this activity is associated with a measure of 0.42 (that is, $1 - 4/7$), i.e., 42% of the objects were at the required location when executing the activity, whereas the rest were improperly involved in it. As we will present in the next sections, similar local metrics can be associated with places (locations) and with arcs from places to transitions. Finally, as depicted in Fig. 4, the metrics can be combined with the notion of a heat map to clearly visualize which components of a system have been violated more, e.g., if the local measure of a component is close to 0, then such a component is painted in red.

Colored Petri Nets
In this section, we present formal definitions and execution semantics for a class of colored Petri nets (CPNs). As introduced in Section 1, CPNs are an extension of Petri nets where tokens carry values of some types (also referred to as colors). For example, types may account for object classes, whereas tokens carry object identifiers. Let $D = \{D_1, ..., D_n\}$ be a finite set of types. Each token in a CPN carries a value $d$ of some type $D_i \in D$. For instance, in the model of Fig. 1 two types are defined: OB for buy orders and OS for sell orders. Places are mapped to types in $D$ to indicate the kind of tokens they contain, e.g., $type(p_1) = OB$.
Arcs are labeled with expressions to describe how tokens are processed upon transition firings. We consider that each expression consists of a typed variable. Let $\mathcal{V}$ be a finite set of typed variables. We denote by $type(v)$ the type of a variable $v \in \mathcal{V}$ s.t. $type(v) \in D$. We define a function $E$ that maps each arc to a variable in $\mathcal{V}$. For example, in Fig. 1, the expressions $E(p_1, t_1) = E(t_1, p_3) = x$ specify that one buy order is transferred from place $p_1$ to $p_3$ upon the firing of $t_1$. Let $\mathcal{A}$ be a finite set of activities. To relate transitions with real system activities, we fix an activity-labeling function $\Lambda$ that maps transitions to elements in $\mathcal{A}$.
Definition 1 (Colored Petri Net). Let $D$ be a finite set of types, let $\mathcal{V}$ be a finite set of variables typed over $D$, and let $\mathcal{A}$ be a finite set of activities. A colored Petri net is a 6-tuple $CP = (P, T, F, type, E, \Lambda)$ where:
• $P$ is a finite set of places;
• $T$ is a finite set of transitions, s.t. $P \cap T = \emptyset$;
• $F \subseteq (P \times T) \cup (T \times P)$ is a finite set of directed arcs (called the flow relation);
• $type : P \to D$ is a place-typing function, mapping each place to a type in $D$;
• $E : F \to \mathcal{V}$ is an arc-labeling function, mapping each arc to a variable in $\mathcal{V}$. For every arc $f \in F$, if $f$ is adjacent to a place $p \in P$, then $type(E(f)) = type(p)$;
• $\Lambda : T \to \mathcal{A}$ is an activity-labeling function, s.t. $\forall t, t' \in T : t \neq t' \iff \Lambda(t) \neq \Lambda(t')$, i.e., transitions are mapped to distinct activities.
We now define execution semantics for the CPNs defined above. Let $CP = (P, T, F, type, E, \Lambda)$ be a CPN. A marking $M$ in a CPN is a function mapping every place $p \in P$ to a (possibly empty) set of tokens $M(p)$, such that $M(p) \subseteq type(p)$. We denote by $M_0$ an initial marking. A binding $b$ of a transition $t \in T$ is a function that assigns a value $b(v)$ to each variable $v$ occurring in arc expressions adjacent to $t$, such that $b(v) \in type(v)$. Let ${}^\bullet t$ and $t^\bullet$ be respectively the sets of input places and output places of a transition $t \in T$. Transition $t$ is enabled in marking $M$ w.r.t. a binding $b$ iff $\forall p \in {}^\bullet t : b(E(p, t)) \in M(p)$, that is, each input place of $t$ holds the token to be consumed from it. The firing of an enabled transition $t$ in a marking $M$ w.r.t. a binding $b$ yields a new marking $M'$ such that $\forall p \in P : M'(p) = M(p) \setminus \{b(E(p, t))\} \cup \{b(E(t, p))\}$, where a term is dropped if the corresponding arc does not exist.
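The enabling and firing rules above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' prototype: the `Transition` class, the variable name `y`, and the layout of the trade transition (input places p3 and p4, output places p5 and p6) are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Transition:
    inputs: dict   # variable -> input place, i.e. E(p, t) = variable
    outputs: dict  # variable -> output place, i.e. E(t, p) = variable


def enabled(marking, t, binding):
    """t is enabled iff every input place holds the token bound to its arc variable."""
    return all(binding[v] in marking[p] for v, p in t.inputs.items())


def fire(marking, t, binding):
    """Return the marking M' reached by firing t under the binding; M is left untouched."""
    if not enabled(marking, t, binding):
        raise ValueError("transition is not enabled under this binding")
    new_marking = {p: set(tokens) for p, tokens in marking.items()}
    for v, p in t.inputs.items():
        new_marking[p].remove(binding[v])   # consume from input places
    for v, p in t.outputs.items():
        new_marking[p].add(binding[v])      # produce in output places
    return new_marking


# Hypothetical trade transition: one buy order (x) and one sell order (y).
trade = Transition(inputs={"x": "p3", "y": "p4"}, outputs={"x": "p5", "y": "p6"})
m0 = {"p3": {"b1"}, "p4": {"s1"}, "p5": set(), "p6": set()}
m1 = fire(m0, trade, {"x": "b1", "y": "s1"})
```

Markings are plain dictionaries from place names to sets of tokens, which mirrors the definition of a marking as a function from places to sets.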
As introduced in Section 1, we make use of CPNs to model systems consisting of components centered on the end-to-end processing of different types of distinguishable objects. To faithfully describe these systems, CPNs shall be characterized by the following properties: on the one hand, CPNs must be conservative, that is, tokens cannot disappear or duplicate upon transition firings; on the other hand, models must be workflow-oriented with a pair of source and sink places for every type defined in the model, such that tokens move along a path between the source and the sink corresponding to their type. We thus define conservative workflow CPNs.
Definition 2 (Conservative-Workflow Colored Petri Net). Let $D = \{D_1, ..., D_n\}$ be a finite set of types such that $n \geq 1$, and let $CP = (P, T, F, type, E, \Lambda)$ be a CPN defined over $D$. We say that $CP$ is a conservative-workflow CPN if and only if:
1. $CP$ is a conservative colored Petri net, i.e., for every transition $t \in T$, each variable occurring on an input arc of $t$ occurs exactly once on an output arc of $t$, and vice versa.
2. For every $i \in \{1, ..., n\}$, there exists one distinguished pair of places in $P$, a source place and a sink place, both of type $D_i \in D$, and there exists a path in $CP$ from the source place to the sink place such that every place $p$ in the path satisfies $type(p) = D_i$. We denote by $P_0$ and $P_F$ the sets of source places and sink places in $CP$.
3. For every transition $t \in T$, the places in the set of input places of $t$ have distinct types. The same rule holds for the places in the set of output places of $t$.
As input models for our method, we shall consider conservative workflow CPNs. We briefly explain Definition 2. Firstly, a CPN is conservative iff for every variable $v$ occurring in an input arc of a transition $t$, $v$ occurs exactly once in an output arc of $t$. Similarly, each variable occurring in an output arc of $t$ shall occur exactly once in an input arc of $t$. This implies that when token values in input places are bound to variables upon transition firings, such values are transferred to output places without disappearing or being duplicated. In this way, our conformance method is able to unambiguously associate every distinct object in a trace with a token in the model. Also, according to Definition 2, CPNs shall be workflow-oriented. More precisely, a workflow CPN has $n$ distinct source places and $n$ distinct sink places, where $n$ is the number of types defined in the model. Each pair of source and sink places of the same type is connected by a path whose intermediate places are also of the same type. In our conformance method, distinct objects in a trace are injected as tokens in source places. Then, tokens move along paths according to the information about objects recorded in events. Upon termination, the method checks whether these tokens arrived at their corresponding sink places. It can be inferred that all places of the same type form a subnet within a workflow CPN, where each subnet represents a system component handling the end-to-end processing of a concrete object class.
Finally, Definition 2 states that the model does not have transitions with several input places of the same type. This makes it possible to relate every object type with exactly one input place in each transition. In Section 5, we discuss how this syntactic restriction contributes to providing an algorithm with linear time complexity.
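As a rough illustration of the two syntactic restrictions discussed above, the following Python sketch checks conservativeness and type-distinctness for a single transition. The arc encoding (lists of `(place, variable)` pairs) and the function names are assumptions of this sketch, not notation taken from the paper.

```python
from collections import Counter


def is_conservative(in_arcs, out_arcs):
    """Each variable on an input arc occurs exactly once on an output arc, and vice versa."""
    ins = Counter(var for _, var in in_arcs)
    outs = Counter(var for _, var in out_arcs)
    return (all(outs[v] == 1 for v in ins)
            and all(ins[v] == 1 for v in outs))


def has_distinct_types(arcs, place_type):
    """No two places adjacent to the transition on this side share a type."""
    types = [place_type[p] for p, _ in arcs]
    return len(types) == len(set(types))


# Hypothetical trade transition: arcs given as (place, variable) pairs.
PLACE_TYPE = {"p1": "OB", "p3": "OB", "p4": "OS", "p5": "OB", "p6": "OS"}
trade_in = [("p3", "x"), ("p4", "y")]
trade_out = [("p5", "x"), ("p6", "y")]

ok = is_conservative(trade_in, trade_out) and has_distinct_types(trade_in, PLACE_TYPE)
```

A transition that consumes a token without producing it again, or that reads two buy-order places at once, fails one of the two checks.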

Event Logs
In this section, we introduce event logs, describing how they are structured.
Let $\mathcal{A}$ be a finite set of activities, and let $D$ be a finite set of types. For every trace $\sigma$ in an event log $L$, each event $e$ in $\sigma$ is a tuple of the form $e = (a, R(e))$, where $a \in \mathcal{A}$ is an activity label, and $R(e) = \{r_1, ..., r_k\}$ is a finite set of objects. For each $i \in \{1, ..., k\}$, we say that $r_i \in R(e)$ is an object, of some type in $D$, involved in the execution of activity $a$.
Table 1 presents an event log with two traces related to end-to-end runs in a trading system. Events indicate activities executed and objects involved, e.g., event $e_4$ in trace $\sigma_1$ indicates an execution of activity trade with two objects involved: buy order b1 and sell order s1. We consider that all objects in a trace are distinguishable by having distinct identifiers. We denote by $R(\sigma)$ the set of distinct objects in a trace, e.g., $R(\sigma_1) = \{b1, s1, s2\}$. With slight abuse of notation, we denote the type of an object $r$ by $type(r)$.
To guarantee the proper execution of our method, event logs must comply with a criterion of syntactical correctness with respect to the CPN used in the method. Let $L$ be an event log, and let $CP = (P, T, F, type, E, \Lambda)$ be a conservative workflow CPN. We say that $L$ is syntactically correct w.r.t. $CP$ iff, for every trace $\sigma \in L$, each event $e$ in $\sigma$ is syntactically correct. An event $e = (a, R(e))$ is syntactically correct w.r.t. $CP$ iff $\exists t \in T : \Lambda(t) = a \wedge \forall p \in {}^\bullet t \; \exists! r \in R(e) : type(p) = type(r) \wedge \forall r \in R(e) \; \exists! p \in {}^\bullet t : type(r) = type(p)$. That is, for every event $e = (a, R(e))$, there exists a transition $t$ labeled with activity $a$, each input place of $t$ is associated with exactly one of the event's objects, and similarly each of the event's objects is associated with exactly one input place of $t$.
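The correctness criterion can be sketched as a small Python check. The dictionaries below (activity labels, input places per activity, place and object types) are hypothetical stand-ins for the net of Fig. 1 and the log of Table 1, which are not reproduced here; because activities and transitions are in one-to-one correspondence, the sketch indexes input places directly by activity label.

```python
def event_is_correct(event, input_places, place_type, obj_type):
    """Check syntactic correctness of one event (activity, objects) w.r.t. a net.

    input_places maps an activity label to the input places of its transition.
    """
    activity, objects = event
    if activity not in input_places:          # no transition carries this label
        return False
    place_types = [place_type[p] for p in input_places[activity]]
    object_types = [obj_type[o] for o in objects]
    # Input places must have pairwise distinct types (Definition 2); then equal
    # sorted type lists mean a one-to-one match between places and objects.
    distinct = len(set(place_types)) == len(place_types)
    return distinct and sorted(place_types) == sorted(object_types)


# Hypothetical labels and types for illustration only.
INPUT_PLACES = {"submit_buy": ["p1"], "trade": ["p3", "p4"]}
PLACE_TYPE = {"p1": "OB", "p3": "OB", "p4": "OS"}
OBJ_TYPE = {"b1": "OB", "b2": "OB", "s1": "OS"}

good = event_is_correct(("trade", {"b1", "s1"}), INPUT_PLACES, PLACE_TYPE, OBJ_TYPE)
```

An event with a missing object, two objects of the same type, or an unknown activity label is rejected.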

The Algorithm
In this section, we present our conformance checking method. The method is based on the replay strategy described in [1] with some adaptations for the class of CPNs and event logs described in Sections 3 and 4. In particular, we assume that input models are conservative workflow CPNs (see Definition 2), and that event logs are syntactically correct w.r.t. these models. The method individually replays each trace of an event log on a CPN. We consider that the input CPN has an empty initial marking. When replaying a trace $\sigma$, the distinct objects in $\sigma$ are first inserted as tokens in source places of the CPN, according to their type. Then, for each event $e = (a, R(e))$ in $\sigma$, the method seeks to fire a transition $t$ labeled with activity $a$, consuming the tokens that correspond to the event's objects, i.e., the elements in $R(e)$. If a token corresponding to an event's object is not in an input place of $t$, then we apply a jump strategy where the missing token is moved from its current location in the model to the requested input place. This makes it possible to force transition firings and to keep replaying a trace to find more deviations.
Algorithm 1 describes the method. As output, the method returns two integer counters $j$ and $k$. The value $j$ is the total number of token jumps, whereas the value $k$ is the total number of tokens transferred from input places to output places upon transition firings. A ratio between these values $j$ and $k$ measures the discrepancy between a trace and a CPN. In the following, we illustrate the use of the algorithm; at the end of this part, we introduce conformance measures based on these counters.
To illustrate how the algorithm works, we consider the example depicted in Fig. 5, which describes step by step the replay of trace $\sigma_2$ in Table 1 on the CPN of Fig. 1. For compactness, transition names are used instead of activity labels. Firstly, our method extracts all distinct objects of the trace, inserting them in source places according to their type (function populateSourcePlaces). For example, four objects are extracted from $\sigma_2$: buy orders b1, b2 and sell orders s1, s2. Thus, the source place $p_1$ for buy orders ($type(p_1) = OB$) is populated with tokens b1 and b2, and the source place $p_2$ for sell orders ($type(p_2) = OS$) is populated with tokens s1 and s2.
After populating source places with the distinct objects, we start to replay the trace on the CPN. As described in lines 3-13 of Algorithm 1, for each event $e = (a, R(e))$, the following steps are performed. We select the transition $t$ in the CPN labeled with activity $a$. Then, for every object $r \in R(e)$, we check whether the input place $p$ of $t$ whose type matches $r$ contains object $r$. If it does not, then $r$ is moved from its current location in the model to place $p$. Each token jump is counted by incrementing the value of counter $j$. Afterwards, when all observed objects in $R(e)$ are located in the input places of transition $t$, then $t$ fires. The transition consumes such objects from its input places, transferring them to its output places. The counter $k$ is incremented by the number of tokens transferred.
[Figure 5: Replay of trace $\sigma_2$ of Table 1 on top of the CPN of Fig. 1 using Algorithm 1.]
As an example, let us focus on the replay of event $e_2$, which involves the objects $\{b1, s1\}$, as depicted in Fig. 5. The sell order s1 is not in place $p_4$. This event corresponds to a trade between b1 and s1, although s1 was not yet allowed to trade. However, to continue the replay, the object s1 jumps from its current location (place $p_2$) to place $p_4$. As observed later in Fig. 5, the same situation occurs on event $e_3$, which involves the objects $\{b2, s1\}$, where both tokens are absent from the input places of the transition to fire. Thus, note how forcing the replay of $e_2$ allows us to detect further deviating events.
After replaying all events in a trace, we check whether all distinct objects arrived at their corresponding sink places. This final step makes it possible to validate, for example, whether the real system completely processed all objects. Lines 14-21 of Algorithm 1 describe this final step. For instance, in Fig. 5, object s2 was left in an intermediate place. This can be interpreted as a corrupt order that should have traded or been canceled at the end of a day. Hence, we force this token to jump to its sink place, which is $p_6$ since $s2 \in OS$ and $type(p_6) = OS$. When all tokens are in the sink places of the CPN, they are consumed by the "environment". Note that the counter of transfers $k$ is incremented by the number of all distinct objects consumed in this final step.
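The replay with the jump strategy described above can be sketched in Python as follows. This is an illustrative reading of the description, not a transcription of Algorithm 1: the dictionary layout of the net, the activity labels, and the two traces are assumptions, chosen so that the counters match the values reported for $\sigma_1$ and $\sigma_2$ later in this section.

```python
def replay(trace, cpn, obj_type):
    """Replay one trace; return (j, k) = (token jumps, token transfers)."""
    marking = {p: set() for p in cpn["places"]}   # place -> tokens it holds
    location = {}                                  # object -> current place
    j = k = 0

    # Inject each distinct object into the source place of its type.
    for _, objects in trace:
        for obj in objects:
            if obj not in location:
                src = cpn["source"][obj_type[obj]]
                marking[src].add(obj)
                location[obj] = src

    def move(obj, target):
        marking[location[obj]].discard(obj)
        marking[target].add(obj)
        location[obj] = target

    for activity, objects in trace:
        t = cpn["label"][activity]
        for obj in objects:                        # force missing tokens to jump
            required = cpn["input_place"][t][obj_type[obj]]
            if location[obj] != required:
                move(obj, required)
                j += 1
        for obj in objects:                        # fire t: transfer the tokens
            move(obj, cpn["output_place"][t][obj_type[obj]])
            k += 1

    for obj in list(location):                     # final step: reach the sinks
        sink = cpn["sink"][obj_type[obj]]
        if location[obj] != sink:
            move(obj, sink)
            j += 1
        marking[sink].discard(obj)                 # consumed by the environment
        k += 1
    return j, k


# Hypothetical encoding of a net shaped like the one described for Fig. 1.
FIG1 = {
    "places": ["p1", "p2", "p3", "p4", "p5", "p6"],
    "source": {"OB": "p1", "OS": "p2"},
    "sink": {"OB": "p5", "OS": "p6"},
    "label": {"submit_buy": "t1", "submit_sell": "t2", "cancel_buy": "t3",
              "cancel_sell": "t4", "trade": "t5"},
    "input_place": {"t1": {"OB": "p1"}, "t2": {"OS": "p2"}, "t3": {"OB": "p3"},
                    "t4": {"OS": "p4"}, "t5": {"OB": "p3", "OS": "p4"}},
    "output_place": {"t1": {"OB": "p3"}, "t2": {"OS": "p4"}, "t3": {"OB": "p5"},
                     "t4": {"OS": "p6"}, "t5": {"OB": "p5", "OS": "p6"}},
}
OBJ_TYPE = {"b1": "OB", "b2": "OB", "s1": "OS", "s2": "OS"}

# Assumed traces: a conforming one and a deviating one.
trace_ok = [("submit_buy", {"b1"}), ("submit_sell", {"s1"}), ("submit_sell", {"s2"}),
            ("trade", {"b1", "s1"}), ("cancel_sell", {"s2"})]
trace_dev = [("submit_buy", {"b1"}), ("trade", {"b1", "s1"}),
             ("trade", {"b2", "s1"}), ("submit_sell", {"s2"})]

print(replay(trace_ok, FIG1, OBJ_TYPE), replay(trace_dev, FIG1, OBJ_TYPE))
```

In the deviating trace, s1 jumps once for the first trade, b2 and s1 jump for the second trade, and s2 jumps to its sink at the end, which gives four jumps over ten transfers.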
Time Complexity. We briefly analyze the time complexity of our method. Let $n = \sum_{e \in \sigma} |R(e)|$ be the number of objects recorded in all events of a trace $\sigma$. Let $T(n)$ be the time taken to execute Algorithm 1. We sketch that $T(n)$ is $O(n)$, where $O(\cdot)$ is the standard asymptotic notation, meaning that the execution time of the method grows linearly with the number of objects in all events of a trace. This bound can be guaranteed under the assumption that access to elements of a CPN model only requires constant time, i.e., the time taken to visit a transition or a place given its name is negligible. In Algorithm 1, we consider that the CPN is stored in the following associative arrays: location tracks the position of each object in the CPN; marking stores the tokens contained in every place; and inputPlace and outputPlace indicate the input/output places of a transition given a type. Notably, as shown in Line 3, each object is directly related by its type to exactly one input place, as we consider CPNs whose transitions have input places of distinct types (Definition 2) and events that are syntactically correct w.r.t. the CPN. Then, if the associative arrays representing the CPN guarantee constant time to access, remove and insert elements, it follows that the operations for every object in each event are performed in constant time.
Thus, the execution time of the trace replay only depends on the number of objects in the trace, and it follows that $T_{rep}(n)$ is $O(n)$. The time $T_{end}(n)$ taken in Lines 14-20 of Algorithm 1 is also $O(n)$, as $|R(\sigma)| \leq n$ distinct objects are consumed from the sink places of the model. Finally, since $T_{init}(n)$, $T_{rep}(n)$, and $T_{end}(n)$ are $O(n)$, the execution time $T(n) = T_{init}(n) + T_{rep}(n) + T_{end}(n)$ of Algorithm 1 is also $O(n)$.
Fitness Metric. We introduce a global metric, namely fitness, to measure the overall degree of conformance between a trace of an event log and a CPN. It quantifies to what extent the behavior seen in the trace complies with the CPN model. The metric is based on the proportion between the total number of token jumps $j$ and the total number of tokens transferred $k$.
Definition 4 (Fitness). Let $\sigma$ be a trace, and let $CP$ be a colored Petri net. Let $j$ be the total number of token jumps, and let $k$ be the total number of tokens transferred, computed by Algorithm 1 with $\sigma$ and $CP$ as input. Then, the (global) fitness metric $fit(\sigma, CP)$ is defined as:
$$fit(\sigma, CP) = 1 - \frac{j}{k}$$
We now show that $0 \leq fit(\sigma, CP) \leq 1$. Let us focus on counter $k$. In each event $e$, we transfer $|R(e)|$ tokens, as we force the replay of all the event's objects. Also, all distinct objects are consumed at the end of the method. Thus, we have that $k = \left(\sum_{e \in \sigma} |R(e)|\right) + |R(\sigma)|$ with $|R(\sigma)| > 0$. Regarding the counter $j$, let $j_e$ be the number of token jumps made in an event $e$ of a trace $\sigma$. In every event $e$, at most $|R(e)|$ jumps can be made, so $0 \leq j_e \leq |R(e)|$. Let $j_{end}$ be the number of token jumps of the distinct objects to sink places when they remained in intermediate places after the replay (e.g., see Lines 14-21 in Algorithm 1). We know that $0 \leq j_{end} \leq |R(\sigma)|$. Now, let us formulate the total number of jumps as $j = \left(\sum_{e \in \sigma} j_e\right) + j_{end}$. Then, it follows that $0 \leq j \leq k$. Therefore, $0 \leq fit(\sigma, CP) \leq 1$.
We now extend the definition of fitness to an event log as the average of the fitness values, which are computed upon the replay of each trace in the event log on top of a colored Petri net.
Definition 5 (Fitness of an event log). Let $L$ be an event log, and let $CP$ be a colored Petri net. We denote by $fit(L, CP)$ the average fitness value obtained upon individually replaying every trace $\sigma \in L$ on top of $CP$ using Algorithm 1, where:
$$fit(L, CP) = \frac{1}{|L|} \sum_{\sigma \in L} fit(\sigma, CP)$$
For example, let us consider the replay of traces $\sigma_1$ and $\sigma_2$ of Table 1 on top of the CPN of Fig. 1. After the execution of Algorithm 1 with $\sigma_1$, the obtained fitness value is $fit(\sigma_1, CP) = 1 - \frac{0}{9} = 1$, whereas with $\sigma_2$, we have that $fit(\sigma_2, CP) = 1 - \frac{4}{10} = 0.6$. These global measures may be interpreted as follows: the overall system behavior observed in $\sigma_1$ complies completely with the specification model. Conversely, $fit(\sigma_2, CP) = 0.6$ indicates that only 60% of the overall behavior observed in $\sigma_2$ complied with the specification model. Then, by considering both traces of the log using Definition 5, $fit(L, CP) = 0.8$ gives an estimation of how, on average, the system as observed in the logs complies with the specification model.
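The two fitness definitions reduce to a few lines of arithmetic. The following Python sketch recomputes the values of the example above from the counters $(j, k)$ of each trace; the function names are illustrative and not part of the paper.

```python
def fitness(j, k):
    """Trace fitness 1 - j/k, given the jump counter j and the transfer counter k."""
    if k <= 0:
        raise ValueError("k must be positive: every trace has at least one object")
    return 1 - j / k


def log_fitness(counters):
    """Average trace fitness over a log, given one (j, k) pair per trace."""
    values = [fitness(j, k) for j, k in counters]
    return sum(values) / len(values)


# Counters reported in the text for the two traces of Table 1.
fit_1 = fitness(0, 9)
fit_2 = fitness(4, 10)
fit_log = log_fitness([(0, 9), (4, 10)])
```

Since $0 \leq j \leq k$, both functions return values between 0 and 1.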

Local Conformance Diagnostics
In the previous part of this section, we introduced a global conformance measure, namely fitness (Definitions 4 and 5). This measure quantifies to what extent the real system, as observed in logs, complies with the specification. Whilst such a metric provides an overall compliance estimation for the whole system, in many applications it is required to provide local diagnostics, i.e., in which precise system components deviations are occurring, and with what magnitude.
In this part, we present local conformance metrics that are related to precise components of a system, and which can be computed upon the execution of our conformance checking method. Our approach is based on the direct association between real components of a system and components of a CPN input model. Recall that activities correspond to transitions, real locations to places, and the relation between a location and an activity is represented by an arc. Thus, by keeping track of the proportion of token transfers and jumps flowing through a model component, we can precisely indicate the number and magnitude of deviations that occurred in the part of the real system that such a model component represents. In what follows, we introduce these metrics and illustrate their usage in an example.
Definition 6 (Place-conformance). Let $\sigma$ be a trace, and let $p \in P$ be a place in a CPN. Let $k_\sigma(p)$ be the number of tokens consumed from $p$ upon the replay of $\sigma$, and let $j_\sigma(p)$ denote the number of token jumps to place $p$ upon the replay of $\sigma$. The place-conformance $fit_\sigma(p)$ is defined as:
$$fit_\sigma(p) = 1 - \frac{j_\sigma(p)}{k_\sigma(p)}$$
The place-conformance $fit_\sigma(p)$ compares the number of tokens consumed from place $p$, when replaying trace $\sigma$, with how many of them actually jumped to $p$ to force transition firings. Hence, this metric can be seen as the proportion of objects that were properly at the location represented by place $p$ upon the execution of any activity that requires objects from such a location. If $fit_\sigma(p)$ is close to 1, then almost no tokens jumped to $p$, e.g., objects are respecting the path established in the specification. Conversely, if $fit_\sigma(p)$ is close to 0, then most of the tokens are jumping to place $p$, e.g., the majority of the objects skip the activities that precede the location represented by $p$. Note that if $k_\sigma(p) = 0$, then $fit_\sigma(p)$ is not defined, i.e., no assessment can be computed since no traffic flowed through place $p$ during the replay of $\sigma$.
Definition 7 (Flow-conformance). Let $\sigma$ be a trace, and let $(p, t) \in F$ be an input arc in a CPN, s.t. $p \in P \wedge t \in T$. Let $k_\sigma(p, t)$ be the number of tokens consumed from $p$ to fire $t$ when replaying trace $\sigma$, and let $j_\sigma(p, t)$ denote the number of tokens that jumped to place $p$ to force the firing of $t$. The flow-conformance $fit_\sigma(p, t)$ is defined as:
$$fit_\sigma(p, t) = 1 - \frac{j_\sigma(p, t)}{k_\sigma(p, t)}$$
Definition 8 (Transition-conformance). Let $\sigma$ be a trace, and let $t \in T$ be a transition in a CPN. Let us define the active pre-set of transition $t$ in trace $\sigma$ as ${}^\bullet t_\sigma = \{p \mid p \in {}^\bullet t \wedge k_\sigma(p, t) > 0\}$. The transition-conformance $fit_\sigma(t)$ is defined as:
$$fit_\sigma(t) = \frac{1}{|{}^\bullet t_\sigma|} \sum_{p \in {}^\bullet t_\sigma} fit_\sigma(p, t)$$
Definitions 7 and 8 follow the same principle as place-conformance. Given the replay of a trace $\sigma$, the flow-conformance $fit_\sigma(p, t)$ compares the number of tokens transferred through the arc from place $p$ to transition $t$ with how many of them jumped to $p$ specifically to force the firing of $t$. This can be interpreted as the proportion of objects that were properly at the location related to place $p$ when executing activity $\Lambda(t)$.
The transition-conformance $fit_\sigma(t)$ is the mean value of the flow-conformance between $t$ and all the input places from which $t$ consumes tokens. Thus, $fit_\sigma(t)$ diagnoses how many of the objects consumed by activity $\Lambda(t)$, from all its required locations, correspond to outliers.
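The three local measures can be sketched in a few lines of Python. This is a minimal illustration, not the prototype's code: it assumes each measure is one minus the ratio of jumps to consumed tokens, and that the counts are already available as dictionaries. The function names, dictionary layout, and the counts in the usage lines are hypothetical.

```python
from typing import Dict, Optional, Tuple

Arc = Tuple[str, str]  # (place, transition); representation assumed for this sketch


def ratio_fit(jumps: int, consumed: int) -> Optional[float]:
    """Return 1 - jumps/consumed, or None when no token was consumed (undefined)."""
    if consumed == 0:
        return None
    return 1 - jumps / consumed


def place_fit(k_place: Dict[str, int], j_place: Dict[str, int], place: str) -> Optional[float]:
    """Place-conformance from per-place consumption and jump counts."""
    return ratio_fit(j_place.get(place, 0), k_place.get(place, 0))


def flow_fit(k_arc: Dict[Arc, int], j_arc: Dict[Arc, int], arc: Arc) -> Optional[float]:
    """Flow-conformance from per-arc consumption and jump counts."""
    return ratio_fit(j_arc.get(arc, 0), k_arc.get(arc, 0))


def transition_fit(k_arc: Dict[Arc, int], j_arc: Dict[Arc, int], transition: str) -> Optional[float]:
    """Mean flow-conformance over the input arcs that actually carried tokens."""
    active = [arc for arc, k in k_arc.items() if arc[1] == transition and k > 0]
    if not active:
        return None
    return sum(flow_fit(k_arc, j_arc, arc) for arc in active) / len(active)


# Hypothetical counts for one transition "t" with two input places.
k_arc = {("p_a", "t"): 4, ("p_b", "t"): 4}
j_arc = {("p_b", "t"): 1}
print(flow_fit(k_arc, j_arc, ("p_b", "t")))
print(transition_fit(k_arc, j_arc, "t"))
```

Returning `None` for components without traffic keeps the undefined case explicit instead of reporting a misleading 0 or 1.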
We now exemplify the usage of the local conformance metrics. Let us recall our motivating example of trading systems shown in Section 2. Fig. 6(a) shows a model representing a real trading system 0 whose behavior deviates slightly from the specification model of Fig. 1. In particular, some sell orders in system 0 can be submitted to the sell side of an order book (place $p_4$) by skipping activity . Now, let us consider a trace $\sigma$ from system 0. Let us assume that $\sigma$ corresponds to the observed interaction of 20 distinct objects (i.e., 10 buy orders and 10 sell orders). We then run our method to check conformance between $\sigma$ and the model of Fig. 1. Let us suppose $j = 5$ and $k = 55$ as the total number of jumps and token transfers computed by Algorithm 1, thereby obtaining a fitness value of $1 - \frac{5}{55} = 0.909$ (Definition 4). Nevertheless, it is more insightful to diagnose conformance in precise system components. To this aim, we calculate the local conformance metrics previously introduced and extend the specification model of Fig. 1 with them. The model is also extended with relevant traffic statistics of token jumps and transfers in the model components (see Fig. 6(b)).

We briefly discuss the specification model of Fig. 6(b), which has been enriched with conformance diagnostics. First, the number of transferred tokens $k_\sigma(p)$ and token jumps $j_\sigma(p)$ are shown beside each place $p$, e.g., $k_\sigma(p_2) = 10$ and $j_\sigma(p_2) = 5$. Jumps between places are displayed as dotted lines labeled with their frequency. In this example, the jumps correspond to 5 sell orders that in system 0 moved to the sell side of the order book (place $p_4$) using the silent action . This action is neither recorded in trace $\sigma$ nor allowed by the specification model shown in Fig. 1. Then, upon the replay of activities  and , these sell orders were not in place $p_4$, but in place $p_2$. Hence, to force replay, these 5 tokens jumped from place $p_2$ to place $p_4$.
Input arcs are labeled using the notation $j_\sigma(p, t) \mid k_\sigma(p, t)$, where $k_\sigma(p, t)$ is the number of tokens consumed by transition $t$ from place $p$, and $j_\sigma(p, t)$ indicates the number of jumps to place $p$ to force the firing of $t$. As an example, the label of input arc ($p_4$, ) indicates that 7 sell orders were consumed by activity , but 3 of them were not ready in place $p_4$ before the firing. Output arcs are simply labeled with the number of resources transferred from a transition to a place.
Local conformance diagnostics are displayed on components of the specification model. For example, for place $p_4$, the place-conformance is $fit_\sigma(p_4) = 0.5$. This means that a sell order was ready in the sell side of the order book only half of the time upon the execution of activity  or . The flow-conformance $fit_\sigma(p_4, ) = 0.33$, i.e., activity  consumed a sell order 3 times, but two of these orders were not ready in place $p_4$. The transition-conformance $fit_\sigma( ) = 0.66$ is the mean value between $fit_\sigma(p_3, ) = 1$ and $fit_\sigma(p_4, ) = 0.33$. This means that buy orders were always available in place $p_3$, whereas sell orders were available in place $p_4$ only a third of the time. As a result, 66% of the time no deviations were observed upon the execution of activity .
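The arithmetic behind these two values can be written out as follows. The transition is denoted generically by $t$, which is a notational assumption, and the flow-conformance is taken as one minus the ratio of jumps to consumed tokens, mirroring the form of the trace fitness $1 - \frac{5}{55}$ used earlier.

$$fit_\sigma(p_4, t) = 1 - \frac{2}{3} \approx 0.33, \qquad fit_\sigma(t) = \frac{fit_\sigma(p_3, t) + fit_\sigma(p_4, t)}{2} = \frac{1 + 0.33}{2} = 0.665$$

The second value is the one reported above as 0.66.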
Notably, a practical benefit of local conformance metrics is that they combine naturally with the notion of a heat map. For example, in Fig. 6(b) a heat bar is displayed on the right side of the model, denoting that the lower the conformance measure of a component, the more red that component is painted.
This allows us to quickly identify which components experienced more deviations. For instance, in Fig. 6(b), it can easily be seen that deviations are localized in the component of the system related to sell orders.
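One possible way to turn a conformance value into a heat-map color is a linear blend toward red. The sketch below is only an illustration: the paper does not state the exact color scale, so the white end of the scale, the gray used for undefined values, and the function name `heat_color` are all assumptions.

```python
from typing import Optional


def heat_color(fit: Optional[float]) -> str:
    """Map a conformance value in [0, 1] to a hex color: 0 -> red, 1 -> white (assumed scale)."""
    if fit is None:
        return "#cccccc"  # assumed neutral gray for components with no traffic
    fit = min(1.0, max(0.0, fit))  # clamp defensively
    level = round(255 * fit)  # green and blue channels grow with conformance
    return f"#ff{level:02x}{level:02x}"


# Hypothetical component values; lower conformance yields a more saturated red.
for name, value in [("place A", 1.0), ("place B", 0.5), ("arc C", 0.0), ("arc D", None)]:
    print(name, heat_color(value))
```

A single-hue scale keeps the reading of the picture simple: any visibly red component is a candidate for closer inspection.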
Finally, note that the introduced local measures are computed based on the information provided by a single trace $\sigma$. We close this section by extending the definitions of these measures to event logs. By considering all traces in a log, the extended measures allow us to diagnose the average magnitude of deviations in precise components of the system.

Implementation and Experimental Evaluation
We have developed a prototypical implementation of our method in the Python programming language. Our solution is supported by SNAKES [19], a library which facilitates the prototyping of high-level classes of Petri nets, including CPNs. In the following, we describe the functioning of our prototype and report an experimental evaluation of our method. The prototype and all the material of our experiment are freely available in our project repository [20]. Fig. 7 illustrates the organization of our prototypical implementation. Users of our solution simply need to invoke a program called "conformance checker". The program receives three input arguments: an option indicating the conformance method to use (e.g., replay with CPN using jumps), an event log formatted as a comma-separated value (CSV) file, and a CPN model. CPN models are built as Python scripts. This generic organization allows us to seamlessly extend our solution by incorporating other methods from our research. In addition, as depicted in Fig. 7, our prototype has an independent routine to generate artificial event logs (as defined in Definition 5) by running CPN models. When running a CPN to generate a trace, if two or more transitions are enabled in a given marking, then the routine randomly selects one of these transitions to fire.
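The random selection among enabled transitions can be illustrated with a small, self-contained sketch. It deliberately does not use SNAKES and is not the prototype's routine: the toy net `NET`, its place and activity names, and the first-in-first-out choice of tokens are assumptions made only for this example.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical toy net: transition -> {input place: output place}.
NET: Dict[str, Dict[str, str]] = {
    "submit_buy": {"p1": "p3"},
    "submit_sell": {"p2": "p4"},
    "trade": {"p3": "p5", "p4": "p6"},
}


def enabled(net: Dict[str, Dict[str, str]], marking: Dict[str, List[str]]) -> List[str]:
    """A transition is enabled when every one of its input places holds a token."""
    return [t for t, arcs in net.items() if all(marking.get(p) for p in arcs)]


def generate_trace(net, initial_marking, rng: random.Random) -> List[Tuple[str, Tuple[str, ...]]]:
    """Fire randomly chosen enabled transitions until none is enabled."""
    marking = {p: list(objs) for p, objs in initial_marking.items()}
    trace = []
    while True:
        candidates = enabled(net, marking)
        if not candidates:
            break
        t = rng.choice(candidates)  # random pick among enabled transitions
        moved = []
        for src, dst in net[t].items():
            obj = marking[src].pop(0)  # oldest token first (assumption)
            marking.setdefault(dst, []).append(obj)
            moved.append(obj)
        trace.append((t, tuple(moved)))  # event = activity plus involved objects
    return trace


start = {"p1": ["b1", "b2"], "p2": ["s1", "s2"]}
for event in generate_trace(NET, start, random.Random(0)):
    print(event)
```

Passing an explicit `random.Random` instance makes a generated log reproducible from its seed, which is convenient when comparing system variants.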
As an example, Fig. 8 depicts the execution of our prototype in a command-line interface with the aforementioned input arguments (option 1 stands for the conformance method presented in this paper). Upon successful execution, the program generates an output folder with the following CSV files: jumps.csv reports the jumps detected, their frequency per trace, and their average; arcs.csv, transitions.csv, and places.csv report the traffic statistics and the local conformance metrics presented in Section 5 for each component type; and traceFitness.csv reports the total number of token jumps, token transfers, and the resulting fitness per trace, together with their average (Definitions 4 and 5). As we motivated in Section 2, the information provided in these reports can be used to extend the specification model, so as to quickly identify observed deviations and their magnitude in concrete components of the system.

Experimental Evaluation. Using our prototype, we conducted an experiment with three event logs, artificially generated from different trading system models 1, 2, and 3. Each event log was replayed on top of the specification model of Fig. 1. These models represent replicas of a trading system that follow the specification, but with undesired behavior increasingly added to each of them. For instance, system 2 is a variation of system 1, but with a subtle modification that slightly increases its difference from the specification model. Table 2 describes each replica and the event log generated from it. Each event log consists of 100 traces and 20 resources (i.e., 10 buy orders and 10 sell orders). The aim of this experiment is two-fold: to showcase the use of the local conformance measures presented in Section 5, computed with all traces of an event log, and to study the stability of the proposed measures, i.e., how much the metrics are affected by subtle increases of undesired behavior.

[Table 2, surviving fragment: (Fig. 9(a)); (2610, 26); sell orders and buy orders skip activity  and .]
Before discussing the conformance results, let us review the specification model in Fig. 1. According to the specification, in a system with no deviations each object must be transferred exactly 3 times: an order is submitted (activity  or ), then it trades or is canceled (activities ,  or ), and finally the order is consumed from a sink place.
This implies that the replay of each trace on the specification model, with 20 objects each, must count 60 token transfers (that is, 6000 transfers for an event log with 100 traces). However, as shown in Table 2, each system variant presents certain undesired behavior, so objects can move between certain locations in ways that disobey the specification model. Hence, when replaying event logs of these system variants on top of the specification, the observed deviations will induce token jumps between places, and fewer token transfers than expected through the model structure.
The latter is evidenced in the columns "resources transferred" and "jumps detected" of Table 3. For ease of presentation, the averages of transfers and jumps have been rounded. Table 3 summarizes the results of replaying each event log on top of the specification model. For each variant, it can be observed how the introduction of a single subtle deviation induces more token jumps during replay, i.e., more objects flowing through paths unforeseen in the specification. This causes a monotonic and slight decrease in the average trace fitness (see Definition 5). However, it is more interesting to identify where deviations have occurred and what their magnitude is. To this aim, items (d), (e), and (f) in Fig. 9 present the specification model of Fig. 1 extended with local conformance diagnostics, which are computed when checking conformance for each variant and are associated with model components (Definitions 9-11). Input arcs and dotted lines representing jumps indicate the rounded average number of transferred or jumped tokens that flowed through them, considering all traces of a log variant. The undesired behavior introduced in each variant is unveiled by our method as a jump, showing how such behavior induces more deviations in a precise component. This in turn affects the conformance-related measures, which are used to paint the model so as to clearly identify where deviations occurred most.

Related Work
Conformance checking relies to a great extent on the expressive power of the models used to describe the expected behavior of systems under evaluation. In this regard, state-of-the-art conformance methods use workflow nets (WF-nets), a class of ordinary Petri nets employed to describe the control-flow of a process executed in isolation, that is, one single process instance after another. Thus, methods using WF-nets do not lend themselves to validating multiple instances concurrently interacting in a system. Real-world applications, in contrast, demand more expressive models that overcome this limitation.
For example, papers [21,22] present case studies where multi-instance modeling notations are needed to diagnose the behavior of objects interacting within modern computational environments. In what follows, we review how different works address this challenge by using various kinds of Petri net extensions, focusing on their application to conformance checking and similar techniques.
Proclets are one of the earliest proposals in process mining to study and validate the interaction of process instances in a system [23]. Each instance runs in an independent workflow, and semantics are provided to describe communication between workflows. Among its different applications, the concept of Proclets evolved into artifact-centric processes for conformance checking [24]. Also, the decomposition of the conformance checking problem has been studied for artifact-centric processes [25] in order to perform replay between each artifact and its corresponding sub-log. Another variant of this model is reported in [26], where the authors check conformance of artifacts which are modeled using UML state and activity diagrams.
Object-centric Petri nets are a notation recently proposed to describe in a single model the interaction of multiple cases (process instances) [8,27]. Object-centric Petri nets can be seen as another subclass of CPNs with certain characteristics. For instance, arcs that transfer an arbitrary number of objects are introduced to describe one-to-many or many-to-many interactions. In [27], the authors present a method to discover an object-centric Petri net from event logs. In particular, these event logs are built in the extensible object-centric (XOC) format [28]. The usage of object-centric Petri nets for conformance checking presents open challenges. For instance, the model does not provide a direct assignment of objects to concrete variables, so multiple bindings may be chosen. In this light, the fact that one event can be associated with multiple valid bindings implies the need for recursive strategies of non-linear time complexity, e.g., backtracking.
Another notable research direction in conformance checking is the validation of additional behavioral dimensions (perspectives) of a single process [14,15], such as time or data constraints. In [14] the authors present an alignment-based conformance method using Petri nets with data (DPNs). On the one hand, alignments find differences between a model and a trace by computing the shortest path in the state space of a synchronous product net, i.e., a Petri net composed of the input model and the trace [29]. On the other hand, a DPN is a WF-net whose transitions are equipped with data variables which can be read or written upon transition firings. In this way, additional process perspectives to analyze are encoded within these variables. However, DPNs are not appropriate for the analysis of multiple process instances, since tokens are black dots and data values do not flow through the model. This is why we have opted for a model based on CPNs whose tokens carry object identifiers.
CPNs have already been considered in the process mining field. For instance, Rozinat et al. considered the discovery of CPN-based models for simulation [30,31]. Notably, in paper [32], we proposed a conformance checking method with a class of CPNs whose tokens are tuples carrying object identifiers and attributes, thereby allowing the detection of various kinds of deviations. For instance, using logs whose events record the state of object attributes after the event's activity execution, the method detects whether such attributes were transformed as specified by the CPN. Also, the method was applied to check compliance of a real-world trading system w.r.t. its specification. In this regard, we refer the reader to papers [33][34][35][36], which present our studies on the extraction of event logs and Petri net-based modeling of real-world trading systems. However, in the method presented in [32], the replay of a trace is stopped upon the occurrence of the first deviating event, and local diagnostics on system components are not provided. Thus, the approach of token jumps and local conformance diagnostics presented in this paper can be used to extend such methods.
Nested Petri nets are an extension where tokens can be Petri nets themselves, which allows describing the inner behavior of objects [37]. The model becomes useful when it is crucial not only to analyze the flow of objects within a system, but also to validate the inner behavior of these objects. For instance, in paper [38], we studied the conformance problem between a nested Petri net and an event log of a multi-agent system. We proposed a compositional approach where the behavior of each agent can be checked separately. Nested Petri nets have also been used in the related fields of adaptive process modeling and verification [39,40].

Finally, another recent research direction on modeling and validation of object-centric systems focuses on the use of models that combine Petri nets with data persistence models such as relational databases. The Information Systems Modeling Language (ISML) [41] and catalog-nets [42] are examples of this research direction. For instance, in these works, methods are proposed for verifying the integrity of objects in the Petri net and their representation in databases.

Conclusion
In this paper, we have presented a replay-based conformance checking method to validate whether a system, whose components manage interacting objects of different classes, complies with its specification. As the modeling language for the specification, we considered a subclass of colored Petri nets, whereas for describing real behavior we considered event logs where events are equipped with sets of involved objects. These objects refer to the individual resources involved in the execution of an activity. Regarding the subclass of CPNs, we particularly considered conservative workflow CPNs, which faithfully characterize systems handling the end-to-end processing of distinguishable objects. Among the syntactic constraints, we required that transitions do not have input places of the same type. In our setting, this constraint ensures that there is only one binding for an event to replay. Consequently, we can provide an algorithm that, unlike alignments, has linear time complexity. It is worth mentioning that the constraint of distinct types for input places can be dropped at the price of losing linear time complexity. In that case, multiple bindings can be associated with an event, and thus the algorithm must look for the correct one using, e.g., recursive strategies such as backtracking.
The replay strategy of our method has been based on populating tokens in the model that correspond to distinct objects observed in the trace. For an event to replay, if objects are not located as tokens in the specified input places of the transition to fire, we proposed a jump strategy to move tokens from their current location in the model to the required places. Interestingly, this approach not only allows us to force transition firings to find more deviations in a trace, but the recorded jumps between places also unveil the so-called desire lines, i.e., paths of objects which are unforeseen in the specification [17].
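The jump strategy summarized above can be sketched as follows. This is a simplified illustration rather than Algorithm 1 itself: the toy specification `SPEC`, the object and place names, and the data layout are hypothetical, consumption from sink places is omitted, and the fitness is taken as one minus the ratio of jumps to token transfers. Having one input place per object type reflects the constraint that yields a single binding per event.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical spec: transition -> {object type: (input place, output place)}.
SPEC: Dict[str, Dict[str, Tuple[str, str]]] = {
    "submit_buy": {"buy": ("p1", "p3")},
    "submit_sell": {"sell": ("p2", "p4")},
    "trade": {"buy": ("p3", "p5"), "sell": ("p4", "p6")},
}


def replay(spec, start: Dict[str, Tuple[str, str]], trace: List[Tuple[str, List[str]]]):
    """Replay a trace, forcing firings with jumps.

    start maps each object to (type, initial place); trace is a list of
    (activity, involved objects). Returns (jumps per place pair, transfers, fitness).
    """
    location = {obj: place for obj, (_, place) in start.items()}
    jumps: Dict[Tuple[str, str], int] = {}
    transfers = 0
    for activity, objects in trace:
        for obj in objects:
            src, dst = spec[activity][start[obj][0]]
            if location[obj] != src:
                # Token is not where the specification expects it: record a jump.
                pair = (location[obj], src)
                jumps[pair] = jumps.get(pair, 0) + 1
                location[obj] = src
            location[obj] = dst  # the forced firing consumes and moves the token
            transfers += 1
    total = sum(jumps.values())
    fitness: Optional[float] = 1 - total / transfers if transfers else None
    return jumps, transfers, fitness


# A sell order s1 trades without having been submitted first.
START = {"b1": ("buy", "p1"), "s1": ("sell", "p2")}
TRACE = [("submit_buy", ["b1"]), ("trade", ["b1", "s1"])]
jumps, transfers, fitness = replay(SPEC, START, TRACE)
print(jumps, transfers, fitness)
```

The keys of `jumps` are the desire lines: each pair names the place an object actually occupied and the place the replayed activity required.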
Leveraging the fact that real locations and activities directly correspond to concrete components of a CPN, we proposed local conformance diagnostics that can be used to clearly identify the severity of deviations in precise parts of a system. In addition, we presented a prototypical implementation of our method and illustrated its usage with a case study on trading systems. The prototype is freely available for use and extension. In this regard, the prototype may be upgraded to provide automatic enhancement of an input CPN model with conformance diagnostics. The work presented in this paper may lay the ground for different research directions. For instance, previous works on conformance checking based on Petri net models whose tokens carry object attributes or object inner behavior (e.g., [32,38]) can be extended with the strategies presented in this paper, i.e., the use of jumps and local conformance diagnostics. Also, it may be of interest to study other variants of this method, e.g., where tokens represent not only distinct objects, but also the relationships between them. This would imply, for instance, that the state of an object does not correspond to a single location, but is in some sense distributed among places, similar to the approaches of ISML and catalog-nets. Such an extension would make the approach presented in this paper more interesting and challenging.