Probabilistic Analysis of Tournament Organization Systems

This paper proposes a criterion for comparing different tournament organization systems in sporting contests; the criterion is the probability that the actually strongest player wins. Two probabilistic models are analyzed. Formulas are obtained for the probability and the probability density of the points scored by each player. Several tournament systems used in practice are analyzed by stochastic modeling. The results also suggest the order in which objects should be presented to experts when an examination is organized by paired comparison. An analytical estimate of the probability of tournament (or paired-comparison) results is obtained; in many cases it avoids a time-consuming enumeration of possible variants.


INTRODUCTION
The emergence and development of probability theory is largely due to the need to analyze games [1]. Under certain conditions imposed on the random element of a game, the methods of probability theory make it possible to predict the average values that characterize the results of the game over many repetitions. An extensive literature is devoted to probabilistic methods for assessing solutions of combinatorial problems [2][3][4][5][6][7][8][9][10][11][12][13]. Below we discuss one such problem, posed as an extremal problem whose criterion is defined on the set of possible systems for processing the results of paired comparisons.
The goal of many types of games is to identify the relative strength of the players. The organization of tournaments, however, differs between competitions, sports, etc. In some cases the tournament is held as a round-robin, in others as a cup system or a "Swiss" system. In some cases the players are first divided into groups, and the group winners then play a championship among themselves. In every form of tournament organization, the results of paired comparisons are the basis for deciding the winner [14][15][16][17][18][19].
This raises the following questions:
- How should a tournament organization be evaluated?
- What is the minimum number of games needed to reveal the strongest player with a given probability for a given number of players?
- What is the probability that the final order of the players coincides with their actual "force"?
- How does the scoring rule affect this probability?
Let the number of players and the total number of games be specified. Since the goal of a tournament is to identify the strongest player, it is intuitively clear that the tournament should be organized so that more games are played between players of similar force, and fewer games are played whose result can be predicted with a probability very close to one.
To analyze a tournament scheme, we use a probabilistic model of the result of a game between two players, in which the probability of each outcome depends on the players' "force." This indicator can be introduced in different ways: as the probability that a player is in one of several possible states, as the mean of some random quantity whose comparison for two players determines the result of a game, etc.
Below we attempt an analytical and numerical analysis of various tournament systems. We start with the simplest probabilistic model of a player and the simplest scoring rule, and then generalize the analysis to a model that is closer to reality.
A tournament structure is considered better the higher the probability that, for a given total number of games, the player with the greatest "force" is the winner. Here the "winner" means a player who scores at least as many points as any other player in the tournament.

DISCRETE DISTRIBUTION OF PLAYER STATUS AND PROBABILITY DENSITY OF THE NUMBER OF SCORED POINTS
Let the state $x_i$ of the $i$-th player be a discrete random variable that takes the value one with probability $p_i$ and the value zero with probability $1 - p_i$. The value $p_i$ will be called the force of the $i$-th player. The states of different players are independent of each other. Let $M$ denote the total number of players.
A tournament is a series of paired comparisons (games) in which the states of the players are compared. If the state of the $i$-th player is greater than the state of the $j$-th one, the $i$-th player gets two points and his opponent gets zero. If their states are equal, each of them gets one point. The result of a game, i.e. the number of points scored by each participant, is a random variable characterized by a probability density defined on the set of values $\{0, 1, 2\}$. The probabilistic nature of the result of each game leads to random errors. A tournament can be regarded as a filter that separates the useful signal (the a priori distribution of forces) from the interference. The filter is the better, the closer the final placement of the players is to the a priori order of their forces.
Let us order the players by force so that $p_1 \ge p_2 \ge \dots \ge p_M$. The result of a tournament will be called ideal if the numbers of points $S_i$ scored by the players satisfy $S_1 \ge S_2 \ge \dots \ge S_M$, i.e. if the order of places determined by the points scored and the playing system corresponds to the distribution of forces. The result of a tournament will be called correct if the player with force $p_1$ won first place or shared it with other players, i.e. $S_1 \ge S_j$ for all $j = 2, \dots, M$.
Since the number of points scored by a player at the end of a tournament is random, the ideal or correct result can be expected only with some probability, $P_I(N)$ and $P_C(N)$ respectively, where $N$ is the total number of games played in the tournament. We now focus on determining the probability $P_C(N)$ that the tournament is correct.
We will consider playing system $A$ better than playing system $B$ if
$$P_C^{A}(N) > P_C^{B}(N). \tag{1}$$
In this work the following playing systems are compared: round-robin, round-robin with preliminary division into groups and elimination, a cup with elimination after each game, etc.

The Density of the Number of Points Scored by Results of a Tournament
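Single game. The state $x_i$ of the $i$-th player is a random variable with expected value
$$\mathrm{E}[x_i] = p_i \tag{2}$$
and dispersion $\mathrm{D}[x_i] = p_i (1 - p_i)$, where $p_i$ is the force of the player. The number of points scored by the $i$-th player in a game with the $j$-th one can be written as $s_{ij} = 1 + x_i - x_j$. Thus the average number of points scored by the $i$-th player in the game with the $j$-th one and its dispersion are equal to
$$\mathrm{E}[s_{ij}] = 1 + p_i - p_j, \qquad \mathrm{D}[s_{ij}] = p_i (1 - p_i) + p_j (1 - p_j).$$
The number of points $s_{ij}$ takes the values 0, 1 and 2 with the probabilities
$$f_{ij}(0) = (1 - p_i)\, p_j, \qquad f_{ij}(1) = p_i p_j + (1 - p_i)(1 - p_j), \qquad f_{ij}(2) = p_i (1 - p_j). \tag{4}$$
The sum of the points scored by both players in each game is 2.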
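The single-game model can be checked numerically. The following is a minimal sketch, assuming the two-state model described above (state one with probability equal to the player's force, two points for a win, one for a draw); the function names `game_density` and `moments` and the sample forces 0.7 and 0.4 are illustrative and do not come from the paper.

```python
def game_density(p_i, p_j):
    """Return [P(0 points), P(1 point), P(2 points)] for player i against player j."""
    lose = (1 - p_i) * p_j   # state 0 against state 1
    win = p_i * (1 - p_j)    # state 1 against state 0
    draw = 1 - win - lose    # both states equal
    return [lose, draw, win]


def moments(density):
    """Mean and variance of a density given as a list indexed by the number of points."""
    mean = sum(s * q for s, q in enumerate(density))
    var = sum((s - mean) ** 2 * q for s, q in enumerate(density))
    return mean, var


# Illustrative forces (assumed values, not taken from the paper).
f = game_density(0.7, 0.4)
mean, var = moments(f)
print(f, mean, var)
```

For any pair of forces the computed mean should equal one plus the difference of the forces, and the variance should equal the sum of the variances of the two states.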
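When calculating the number of points scored in a tournament by the $i$-th player, it should be taken into account that a player does not play with himself and therefore gets no points from such a game. The easiest way to express this is to set
$$f_{ii}(0) = 1, \qquad f_{ii}(1) = f_{ii}(2) = 0. \tag{6}$$
From now on we assume that the results of the games are independent of each other. Then the number of points scored by the $i$-th player in two games, say with the $j$-th and the $k$-th players, is the sum $s_{ij} + s_{ik}$; it takes values from 0 to 4, and its density is the convolution
$$g(s) = \sum_{r=0}^{2} f_{ij}(r)\, f_{ik}(s - r), \tag{7}$$
where terms with $s - r$ outside the range from 0 to 2 are zero. So, for $s = 0$ we have $g(0) = f_{ij}(0) f_{ik}(0)$, for $s = 1$ we have $g(1) = f_{ij}(0) f_{ik}(1) + f_{ij}(1) f_{ik}(0)$, etc.
By analogy with (7), for $k$ games we have the following recursive equation, in which $g_k$ is the density of the number of points after $k$ games and $f_k$ is the density of the points scored in the $k$-th game:
$$g_k = g_{k-1} * f_k, \tag{8}$$
that is,
$$g_k(s) = \sum_{r=0}^{2} g_{k-1}(s - r)\, f_k(r). \tag{9}$$
The larger $k$ is, the closer the distribution of the total number of points is to a discrete normal distribution law defined on a set of natural numbers.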
Since the convolution operation (9) will be used repeatedly below, let us recall some of its properties [20,21]:
(1) The convolution is commutative, i.e. the result does not depend on the order in which the convolved functions are placed under the sum sign (see (9)).
(2) The domain of the convolution is the sum of the domains of the convolved functions. For example, if one of the functions is defined for $0 \le s \le a$ and the other for $0 \le s \le b$, then their convolution has the domain $0 \le s \le a + b$.
(3) Since each of the convolved densities is non-negative and the sum of its values over all points of its domain is equal to one, the result of the convolution satisfies the same conditions.
(4) The expected value and the dispersion of a sum of independent random variables are equal to the sums of the expected values and of the dispersions of the summands. This gives the expected value and the dispersion of the distribution that results from the convolution.
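The four properties listed above can be verified numerically on small examples. The sketch below is illustrative only: `convolve` is a plain discrete convolution of two densities stored as lists indexed by the number of points, and the two sample densities are assumed values.

```python
def convolve(a, b):
    """Discrete convolution of two densities given as lists indexed from zero."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def moments(density):
    """Mean and variance of a density given as a list indexed from zero."""
    mean = sum(s * q for s, q in enumerate(density))
    var = sum((s - mean) ** 2 * q for s, q in enumerate(density))
    return mean, var


# Two single-game densities over 0, 1, 2 points (assumed sample values).
a = [0.12, 0.46, 0.42]
b = [0.25, 0.50, 0.25]
c = convolve(a, b)

print(len(c))                      # domain 0..4, i.e. five points
print(sum(c))                      # still sums to one (up to rounding)
print(moments(c))                  # mean and variance of the sum
print(moments(a), moments(b))      # their sums match the line above
```

The pure-Python loop keeps the example free of external dependencies; for long tournaments a library routine for discrete convolution would be the more practical choice.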

Round-Robin Tournament
Let $M$ be the number of players. The total number of games is $N = M(M-1)/2$, and the total number of points scored by all players is $M(M-1)$. The number of games played by each player is $M - 1$.
Since the order of the games does not affect the number of points scored, let us assume that each $i$-th player plays sequentially with the first, the second, etc., up to the $M$-th player.
Let us denote by $S_i$ the number of points received by the $i$-th player in all games of the tournament. The number of points in each meeting is a discrete random variable with density (4). The density $g_i^{(k)}$ of the number of points after $k$ games is related to the density of the number of points after the $(k-1)$-th game by a relation similar to (8):
$$g_i^{(k)} = g_i^{(k-1)} * f_{ik}, \tag{10}$$
where $*$ is the convolution sign. It is assumed that in the $k$-th game the $i$-th player plays with the $k$-th one.
In accordance with (4),
$$g_i^{(1)}(s) = f_{i1}(s). \tag{11}$$
The random variable $S_i$ takes integer values in the range from zero to $2(M-1)$, and its density is found recursively from the initial condition (11) by formula (10):
$$g_i^{(k)}(s) = \sum_{r=0}^{2} g_i^{(k-1)}(s - r)\, f_{ik}(r). \tag{12}$$
The summation limits are determined by the fact that the argument of the function $f_{ik}$ takes values from zero to two.
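The recursion described above can be sketched in a few lines. This is a minimal illustration under the two-state model, not the authors' implementation: the helper names and the sample forces are assumptions, and the game of a player with himself is simply skipped, which has the same effect as assigning it zero points with probability one.

```python
def game_density(p_i, p_j):
    """[P(0), P(1), P(2)] points for player i in one game against player j."""
    lose = (1 - p_i) * p_j
    win = p_i * (1 - p_j)
    return [lose, 1 - win - lose, win]


def convolve(a, b):
    """Discrete convolution of two densities given as lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def round_robin_densities(p):
    """Density of the total score of every player in a single round-robin."""
    densities = []
    for i in range(len(p)):
        g = [1.0]  # before any game the score is zero with probability one
        for j in range(len(p)):
            if j != i:
                g = convolve(g, game_density(p[i], p[j]))
        densities.append(g)
    return densities


# Assumed sample forces, ordered from the strongest to the weakest player.
forces = [0.8, 0.6, 0.5, 0.3]
dens = round_robin_densities(forces)
mean_first = sum(s * q for s, q in enumerate(dens[0]))
print(len(dens[0]), mean_first)
```

Each density has one entry for every score from zero to twice the number of games played, and the mean score of a player equals the sum of his single-game means.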
The average number of points $S_i$ scored in a single-round tournament by the $i$-th player is equal to the sum over $j$ of the single-game averages, and the dispersion is equal to the same sum of the single-game dispersions. Thus
$$\mathrm{E}[S_i] = \sum_{j \ne i} (1 + p_i - p_j), \qquad \mathrm{D}[S_i] = \sum_{j \ne i} \bigl[ p_i (1 - p_i) + p_j (1 - p_j) \bigr]. \tag{13}$$
Calculation of the probability of tournament correctness. Knowing the densities of $S_i$ for all players, we need to find the probability that one player (for definiteness, the first player) scored no fewer points than the $j$-th one. The domain of the probability densities satisfies the conditions
$$0 \le s_i \le 2(M - 1), \qquad i = 1, \dots, M. \tag{14}$$
If the first player scored the maximal number of points $2(M-1)$, then with probability one he scored more than any $j$-th player. A tournament is certainly correct if at the end of the tournament $S_1 = 2(M-1)$. Otherwise, when calculating the probability that the first player scores $s_1$ points and at the same time the $j$-th player scores $s_j$, we need to take the result of their personal game into account separately, since these events are not independent.
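Let us denote by $\tilde g_{1j}(s)$ the density of the probability that in all games of the first player, except for his game with the $j$-th player, the total number of points he scored is $s$. This density for $j = 2, \dots, M$ is as follows:
$$\tilde g_{1j} = \mathop{*}_{k \ne 1,\, j} f_{1k}, \tag{15}$$
where $f_{1k}$ is the density of the probability of scoring points by the first player in a game with the $k$-th player. The symbol $\mathop{*}$ with an index range denotes the repeated convolution:
$$\mathop{*}_{k \ne 1,\, j} f_{1k} = f_{12} * \dots * f_{1,j-1} * f_{1,j+1} * \dots * f_{1M}. \tag{16}$$
Similarly, $\tilde g_{j1}$ is the density of the points scored by the $j$-th player in all games except his game with the first player:
$$\tilde g_{j1} = \mathop{*}_{k \ne 1,\, j} f_{jk}. \tag{17}$$
Let us consider the 3 possible outcomes of the game between the 1st and the $j$-th players.
(A) Player 1 lost to player $j$. In this case the probability that the first player scored $s$ points in the whole tournament is equal to $\tilde g_{1j}(s)$, and the probability that the $j$-th player scored $s$ points is $\tilde g_{j1}(s - 2)$.
(B) The game between the 1st and the $j$-th players was a draw. The corresponding probabilities are $\tilde g_{1j}(s - 1)$ and $\tilde g_{j1}(s - 1)$.
(C) Player 1 beat player $j$. The corresponding probabilities are $\tilde g_{1j}(s - 2)$ and $\tilde g_{j1}(s)$.
Let us find the probability that the number of points scored by the 1st player is no less than the number of points scored by player $j$. We write expressions for this probability depending on the result of the personal game of the 1st and the $j$-th players:
$$P_A = \sum_{s} \tilde g_{1j}(s) \sum_{t \le s - 2} \tilde g_{j1}(t), \qquad P_B = \sum_{s} \tilde g_{1j}(s) \sum_{t \le s} \tilde g_{j1}(t), \qquad P_C = \sum_{s} \tilde g_{1j}(s) \sum_{t \le s + 2} \tilde g_{j1}(t),$$
where $s$ and $t$ are the points scored by the 1st and the $j$-th players in their other games. The probability that the first player is not lower than the $j$-th player in the final results table is calculated as the weighted average of these values, with the probabilities of the personal game results as weights:
$$P_{1j} = f_{1j}(0)\, P_A + f_{1j}(1)\, P_B + f_{1j}(2)\, P_C, \tag{18}$$
where $f_{1j}(s)$ is the probability that the 1st player scores $s$ points in the game with the $j$-th player.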
However, the product of the probabilities that the first player scores no fewer points than the $j$-th one, where $j = 2, \dots, M$, cannot be used to calculate the probability that the tournament is correct, since each of these probabilities depends on whether the first player scored more points than the other players. That is, in general the probability of correctness differs from the product $\prod_{j=2}^{M} P_{1j}$. The tournament is correct if, for any possible number of points scored by the first player, none of the other players scored more points. The minimum number of points that the first player must score in order to share first place with a probability greater than zero is $M - 1$. In this case all the games end in a draw, and he shares first place with the rest of the participants. If, as mentioned above, he gets $2(M - 1)$ points, then he obviously takes first place, so in this case the probability that the tournament is correct is equal to one.
It should be noted that the total number of points scored by all participants in the tournament is fixed and equal to $M(M-1)$. This means that if the first player scored $s_1$ points, then the remaining players score $M(M-1) - s_1$ points in total. Let us denote by $Q(s^*)$ the probability that all the other players in total score $s^* = M(M-1) - s_1$ points. The probability of tournament correctness in the general case can be expressed by formula (19). In this case the set of admissible values of the arguments is defined by the constraints that no player scores more points than the first one and that the scores of the remaining players sum to $s^*$. The practical use of formula (19) requires a cumbersome enumeration, and as the number of players grows it becomes too expensive in computation time. Below are formulas that give an estimate of the probability that the tournament is correct with much less effort.

Since $s^* = M(M-1) - s_1$, the probability $Q(s^*)$ is determined, for a fixed number of players, by the value $s_1$. Therefore, with this substitution, we will write $Q(s_1)$. To evaluate the probability of tournament correctness, we get expression (20). Here the random events "no player except the first one scores more than $s_1$ points" and "all players except the first one score $M(M-1) - s_1$ points in total" are assumed to be independent. Therefore, the estimate of tournament correctness found in this way is higher than the value found by formula (19); however, its calculation does not require an enumeration.
To assess the accuracy of the last formula, numerical experiments were carried out for various variants of the initial distribution of forces. Table 1 shows the results of these calculations for various a priori distributions of forces. It shows the probability $P_C$ obtained by formula (19), the estimate $\hat P_C$ of this probability found by formula (20), and the relative error Δ, %, obtained from the formula
$$\Delta = \frac{|\hat P_C - P_C|}{P_C} \cdot 100\%.$$
These calculations have shown that the estimate (20) has an error of about 4%, which makes it suitable for comparing different tournament systems.
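For a small number of players the exact probability of correctness can be obtained by enumerating every combination of game outcomes directly. The sketch below is not formula (19) itself but a brute-force illustration of the kind of enumeration it requires, under the two-state model; the function names are assumptions, and the player at index 0 is assumed to be the strongest.

```python
from itertools import combinations, product


def game_density(p_i, p_j):
    """[P(0), P(1), P(2)] points for player i in one game against player j."""
    lose = (1 - p_i) * p_j
    win = p_i * (1 - p_j)
    return [lose, 1 - win - lose, win]


def exact_correctness(p):
    """Probability that player 0 takes or shares first place in a round-robin."""
    players = range(len(p))
    pairs = list(combinations(players, 2))
    dens = {pair: game_density(p[pair[0]], p[pair[1]]) for pair in pairs}
    total = 0.0
    # Each game has three outcomes, so the loop visits 3 ** len(pairs) cases.
    for outcome in product((0, 1, 2), repeat=len(pairs)):
        prob = 1.0
        score = [0] * len(p)
        for (i, j), s in zip(pairs, outcome):
            prob *= dens[(i, j)][s]
            score[i] += s
            score[j] += 2 - s   # both players' points in a game sum to two
        if score[0] == max(score):
            total += prob
    return total


print(exact_correctness([0.7, 0.4]))
print(exact_correctness([0.8, 0.6, 0.5, 0.3]))
```

The number of cases grows as three to the power of the number of games, which is 729 for four players and already about 14 million for six; this is the cost that motivates an approximate estimate.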

PLAYING WITH PRELIMINARY SPLIT INTO GROUPS AND KNOCK-OUT
Let us evaluate the structure of the tournament by the probability of its correctness. Let the number of participants be divisible by 4, and let the teams be divided into two subgroups by the rule: the first subgroup consists of the odd-numbered teams, and the second of the even-numbered ones. Within each subgroup the games are played on a round-robin system. The half of the teams that took the last places in each subgroup is eliminated, and the remaining teams play the championship on a round-robin system. The total number of games is given by (21); it is less than for a round-robin system.
Let us compare the probability of correctness for the two playing systems at $M = 4$, when the system with selection becomes a Cup system and the number of games in a round-robin system is twice the number of games in the Cup system. For the round-robin system, the probability can be found by the formulas obtained in the previous section. For the system with selection, the subgroups include the teams with the forces $p_1$, $p_3$ and $p_2$, $p_4$. To win the tournament, the player with the force $p_1$ must win in his subgroup and then win the final. Since these events are independent, the probability of victory is equal to the product of their probabilities (22). In turn, the probability of winning the final game is given by (23). After substituting this expression into (22), we get (24).
Since the number of games in a round-robin system is twice as large as in a Cup system, one can play not one but two games between each pair of players and consider the one who scored more points in the two games as the winner. Then the probability that the first player wins in his subgroup (the probability of scoring more than two points) is determined by formula (7), which gives (25). The probabilities that the second and the fourth players reach the final are given, respectively, by (26). The probability that the first player wins the tournament is given by (27), where the overbar denotes the probability of winning in two games.
Numerical calculations, which were performed according to formula (27) with an equal distribution of the players' forces, showed that with such an organization the probability of the tournament's correctness is equal to . At the same time, the probability of a correct tournament in a round-robin system, found using formula (20), is equal to .

PLAYER'S STATE IS A RANDOM VALUE WITH A CONTINUOUS DISTRIBUTION
The case considered above is the one in which the force of each $i$-th player is the probability that his state is equal to one. This circumstance simplified the calculation of the results of paired comparisons. Let us consider what changes if the state $x_i$ has a continuous density $f_i(x)$ with expected value $m_i$ and dispersion $\sigma_i^2$. Player $i$ beats player $j$ and gets two points when $x_i > x_j$ in the paired comparison. Since the densities are continuous, $x_i = x_j$ occurs with probability zero, and thus a tie is excluded. The formulas for calculating the density of the number of points scored in the tournament remain correct because they use only the results of paired comparisons. The probability that the $i$-th player wins a single game with the $j$-th one (see (4)) is equal to
$$P_{ij} = \int_{-\infty}^{\infty} f_i(x)\, F_j(x)\, dx, \qquad F_j(x) = \int_{-\infty}^{x} f_j(y)\, dy,$$
and the probability that he loses is $1 - P_{ij}$. These expressions make it possible to analyze the outcome of tournaments and to find the probability of their "correctness" when the number of games is fixed.

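If $f_i(x)$ is the normal distribution law with expected value $m_i$ and dispersion $\sigma_i^2$, then the probability that the $i$-th player beats the $j$-th one (i.e. that the $j$-th player loses to the $i$-th) is equal to
$$P_{ij} = \Phi\!\left( \frac{m_i - m_j}{\sqrt{\sigma_i^2 + \sigma_j^2}} \right),$$
where $\Phi$ is the Laplace function (the standard normal distribution function), whose values are tabulated [21].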
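For normally distributed states the win probability follows from the fact that the difference of two independent normal variables is again normal. The sketch below uses the standard normal distribution function from Python's standard library; tables of the Laplace function may use a different normalization, so the function name `win_probability` and its parameterization by means and standard deviations are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def win_probability(m_i, sigma_i, m_j, sigma_j):
    """P(x_i > x_j) for independent normal states with the given means and standard deviations."""
    z = (m_i - m_j) / sqrt(sigma_i ** 2 + sigma_j ** 2)
    return NormalDist().cdf(z)


# Assumed sample parameters: a slightly stronger player i with the same spread.
p_ij = win_probability(1.2, 0.5, 1.0, 0.5)
p_ji = win_probability(1.0, 0.5, 1.2, 0.5)
print(p_ij, p_ji)
```

Since ties have probability zero in the continuous model, the two printed values sum to one, and equal means give exactly one half.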
In the case of the uniform distribution law, the density of the state of each player is zero outside the segment $[a_i, b_i]$ and is equal to $1/(b_i - a_i)$ within that segment.

COMPARISON OF DIFFERENT TYPES OF TOURNAMENTS BY STOCHASTIC MODELING
Let us compare several systems in which the total number of games is the same. In addition to the round-robin system, three more playing systems were considered.
System 1. All players are divided into 2 subgroups that are identical in average strength. A round-robin tournament is held in each subgroup. Then the half of the players of each subgroup that took the last places forms a new subgroup, in which another round-robin tournament is held and its winner is determined. This winner and the remaining best players from each subgroup play a fourth round-robin tournament. The total number of points scored in this tournament determines the 2 players who took first and second places. Several games are played between them, so that the total number of games played is equal to the number of games in a round-robin system, i.e. $M(M-1)/2$ for $M$ players. The winner is the player who scored the maximum number of points in these games.
System 2. All players are divided into 2 identical subgroups, and a round-robin tournament is held in each subgroup. The one player who took last place in each subgroup is removed from the tournament. The remaining players play a round-robin tournament, after which the two best players hold a series of final games, the number of which is determined from the condition that the total number of games should equal the number of games in a round-robin system.
System 3. All players are divided into 2 identical subgroups, and a round-robin tournament is held in each of them. The half of the players of each subgroup who took the last places is removed from the tournament. The remaining players hold another two round-robin tournaments, then the two best players hold a series of final meetings, the number of which is chosen so that the total number of games is equal to the number of games in the round-robin system.
In addition, two variants of the initial division of all players into subgroups were considered. In the first case (method A), the split took the force of the players into account: all players were ordered by their force, after which the first player in the list went to subgroup 1, the second to subgroup 2, the third to subgroup 1, the fourth to subgroup 2, and so on. The second method (B) simulates the division into subgroups by drawing lots that is widely used in sports competitions, so the players are divided into the groups randomly.
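The two ways of forming subgroups can be sketched as follows. The function names are illustrative, and for the random draw the sketch assumes two subgroups of equal size, which the text does not state explicitly.

```python
import random


def split_by_force(forces):
    """Method A: order players by force and deal them alternately into two subgroups."""
    order = sorted(range(len(forces)), key=lambda i: forces[i], reverse=True)
    return order[0::2], order[1::2]


def split_by_lot(forces, rng=None):
    """Method B: random draw into two subgroups (equal sizes are an assumption)."""
    rng = rng or random.Random()
    players = list(range(len(forces)))
    rng.shuffle(players)
    half = len(players) // 2
    return players[:half], players[half:]


# Assumed sample forces; players are identified by their index in the list.
forces = [0.9, 0.5, 0.7, 0.3]
print(split_by_force(forces))
print(split_by_lot(forces, random.Random(1)))
```

With method A the two strongest players always land in different subgroups, whereas with method B they may meet in the same subgroup.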
The distribution of the players' forces was specified in two ways: an equal distribution of forces, in which the difference between the forces of adjacent players is the same, and a random distribution of forces. For the random distribution two variants were used: "Random 1", in which the forces of the strongest players are close to each other, and "Random 2", in which the force of the strongest player is significantly greater than the force of the second player.
The following results were obtained by stochastic modeling. For the given values of the players' forces, a number of iterations were performed. At each iteration a table of the results of all games was built and the player who took first place was determined. The random variable $\xi$ takes the value 1 if the event (in this case, the strongest player took first place) occurred and 0 otherwise. The estimate of the probability of tournament correctness was defined as the estimate of the expected value of $\xi$.
The number of iterations $n$ was determined from the condition that, with a given probability $\beta$, the deviation of the estimate from the actual probability of tournament correctness is less than a given accuracy $\varepsilon$.
For these conditions, $n$ is defined as follows [22]:
$$n = \left( \frac{\hat\sigma\, t_\beta}{\varepsilon} \right)^2, \qquad t_\beta = \Phi^{-1}(\beta),$$
where $\hat\sigma$ is an estimate of the standard deviation of the random variable $\xi$ and $\Phi^{-1}$ is the inverse Laplace function.
To obtain the first estimate of the standard deviation, an initial number $n_0$ of values of $\xi$ was first generated, after which the estimate $\hat\sigma$ and the value of $n$ were calculated. These values were then refined after each iteration.
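The sequential procedure described above can be illustrated for a plain round-robin tournament under the two-state model. This is a hedged sketch rather than the authors' program: the parameter names `beta`, `eps` and `n0`, their default values, and the use of the two-sided normal quantile for the confidence level are assumptions, and the player at index 0 is assumed to be the strongest.

```python
import random
from math import ceil, sqrt
from statistics import NormalDist


def strongest_is_first(p, rng):
    """Simulate one round-robin; True if player 0 takes or shares first place."""
    score = [0] * len(p)
    for i in range(len(p)):
        for j in range(i + 1, len(p)):
            x_i = rng.random() < p[i]   # state of player i in this game
            x_j = rng.random() < p[j]   # state of player j in this game
            score[i] += 1 + x_i - x_j   # 2 for a win, 1 for a draw, 0 for a loss
            score[j] += 1 + x_j - x_i
    return score[0] == max(score)


def estimate_correctness(p, beta=0.95, eps=0.01, n0=100, seed=0):
    """Return (estimate, iterations used); the sample size is refined as sampling proceeds."""
    rng = random.Random(seed)
    t = NormalDist().inv_cdf((1 + beta) / 2)   # two-sided quantile (assumed convention)
    hits, n, needed = 0, 0, n0
    while n < needed:
        hits += strongest_is_first(p, rng)
        n += 1
        if n >= n0:
            est = hits / n
            sigma = sqrt(est * (1 - est))      # standard deviation of a 0/1 variable
            needed = max(n0, ceil((t * sigma / eps) ** 2))
    return hits / n, n


print(estimate_correctness([0.8, 0.6, 0.5, 0.3], eps=0.05))
```

The required number of iterations is largest when the estimated probability is near one half and shrinks to the initial sample size when the outcome is almost certain.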
The results of the comparative experiments are provided in Table 2. It shows the probabilities that the strongest player wins first place. In the experiments:
- number of players ;
- total number of games in the tournament ;
- level of trust ;
- accuracy ;
- initial number of iterations .
Analyzing the results of the experiments, the following conclusions can be drawn:
(1) The selection of a playing system can have a significant impact on the effectiveness of a tournament.
(2) The third system of organizing tournaments is the best among the systems considered for all variants of the a priori distribution of forces.

CONCLUSIONS
A criterion was proposed that makes it possible to assess the way in which tournaments are organized for a given number of participants and a limited number of games. Expressions for the analytical estimation of the quality of tournament organization were obtained. Statistical modeling was carried out to calculate the criterion of correctness for different systems of tournament organization.
All the recommendations obtained carry over, in different terms, to the problem of determining the best product by expert examination, in which an expert chooses the better item in a series of paired comparisons.