内容简介:So, an exploration of how tennis tournament seeding and bracketing impacts on the end result has percolated to the top of my to-do list, inspired by the Melbourne Open currently in play.This is the first of what will probably be two posts on this topic. To
So, an exploration of how tennis tournament seeding and bracketing impacts on the end result has percolated to the top of my to-do list, inspired by the Melbourne Open currently in play.
This is the first of what will probably be two posts on this topic. Today I wanted to look at the impact of seeding on the chance of the best players finishing first, or in the top 2, 4 or 8 players (ie the grand, semi or quarter finals). I compared the results of simulated single-elimination tournaments between these players in two different tournament types – completely random allocation of the draw, or seeding for the top 32 players as was standard in Grand Slam tournaments, apparently until 2019 .
There’s a material impact from the approach to the draw, which can be seen in these results:
This means, for example, that with 32 top players given special “seed” allocations in the draw, there is a 22% chance that the semi finals will include the four best players; but only a 2% chance if the draw is allocated completely at random. The chance of seeing the eight best players in the quarter finals under random allocation is effectively zero, whereas with a seeded draw this will happen 3% of the time (still not very often – which should be no surprise for tennis fans). On the other hand, the chance of seeing the top two seeds playing out the final is 42% in the seeded contest and only 20% when seeded.
Note that these results will depend upon the distribution of players’ strengths in a given tournament. At an extreme, if the top player completely dominates all others, seeding will make no difference to the winner. We can calculate some other theoretical values as limits. For example, if the top two players were effectively invincible when playing anyone except eachother, they would meet in the final 100% of the time in a seeded competition, and 50% of the time if the draw was unseeded. So anything below those levels (42% and 20% in our case) indicates the gap between the two players being “really good” compared to the rest of the field and “infinitely good”.
Here is a blog post on a closely related point – fellow-Melbournian Stefanie Kovalchik on the impact of a suggested move from 32 to 16 seeds for grand slam tournaments . I think that this proposal didn’t go ahead. Kovalchik shows that it would have lead to somewhat less fair results than the 32 seed system, where “fair” is defined as players reaching the round that would be expected based on their ranking.
Historical tennis results data
To perform my analysis I used data on the actual relative strength of players from Stephanie Kovalchik’s {deuce} R package . I wanted a realistic range of strengths that could go head-to-head in a real life tournament, so I chose a single point of time rather than a hypothetical cross-history match up that (for example) would pitch Margaret Court versus Serena Williams. To avoid confusion with contemporary reality, I chose 1990 as my year. Here are the top 10 womens’ tennis players at the end of 1990, as judged by their Elo ratings:
# A tibble: 128 x 3 player_id player_name elo <int> <chr> <dbl> 1 200414 Steffi Graf 2608. 2 200293 Martina Navratilova 2474. 3 200652 Monica Seles 2388. 4 200572 Gabriela Sabatini 2312. 5 200597 Mary Joe Fernandez 2236. 6 200017 Arantxa Sanchez Vicario 2195. 7 200049 Conchita Martinez 2194. 8 200077 Jennifer Capriati 2169. 9 200401 Manuela Maleeva Fragniere 2151. 10 200404 Zina Garrison 2139. </dbl> </chr> </int>
This is the top 10 rows of a 128 row dataframe women128
created with the chunk of R code below:
library(foreach) library(doParallel) library(tidyverse) library(scales) library(kableExtra) library(clipr) # deuce package by Stephanie Kovalchik to get Elo ratings of real tennis players # devtools::install_github("skoval/deuce") library(deuce) data(wta_elo) # Identify the top 128 women playing in 1990 and their last Elo rating that year women128 <- wta_elo %>% filter(year(tourney_start_date) == 1990) %>% # pick just the latest Elo rating group_by(player_name) %>% arrange(desc(tourney_start_date)) %>% slice(1) %>% ungroup() %>% arrange(desc(overall_elo)) %>% dplyr::select(player_id, player_name, elo = overall_elo) %>% slice(1:128)
Kovalchik provides Elo ratings at point in time of each player, ultimately derived from analysis of data collected by Jeff Sackman . I’ve written about Elo ratings earlier in this blog in the context of backgammon and Australian rules football. They are a powerful method of calculating ratings based on actual performance, with the great advantage of being convertable into probabilities for any hypothetical match up. Incremental adjustments are made to the rating based on how actuality relates to the probabilities derived from Elo ratings going in to the match. This makes them a useful and self-correcting metric that is readily incorporated into a statistical model.
To be sure I am using the correct basis for converting Kovalchik’s Elo ratings into probabilities, I’m going to use her elo_prediction()
function for estimating the chance of either of a pairing winning. To illustrate, here is a chart showing the chance of a selection of the players ranked 2 to 128 by Elo rating of beating Steffi Graf, the highest rated player in our subset of the data (and in fact, the highest rated player by this method in the data available, beginning in 1968 with the start of the open era).
Here’s the code for that illustration:
# Chance of beating Steffi Graf steffis_opponents <- c(2:5, 10, 20, 50, 128) y <- elo_prediction(women128[steffis_opponents, ]$elo, women128[1, ]$elo) the_caption <- "Source: http://freerangestats.info analysis based on 1990 WTA ratings in Stephanie Kovalchik's {deuce} R package." cbind(women128[steffis_opponents, ], y, rank = steffis_opponents) %>% mutate(lab = paste0(player_name, " (", rank, ")")) %>% ggplot(aes(x = elo, y = y, label = lab)) + geom_point() + geom_text_repel(size = 3, colour = "steelblue") + scale_y_continuous(label = percent) + labs(title = "Probability of beating Steffi Graf at the end of 1990", subtitle = "Various players from the top 128 WTA players, probability based only on Elo rating", caption = the_caption, x = "Elo rating of Ms Graf's hypothetical opponent (ranking shown in brackets)", y = "Ms Graf's opponent's probability of winning")
Simulating tournament draws and results
For the grunt work of simulating tournaments between these 128 players, I first write a function simulate_tournament()
which takes as its main argument a data frame of 128 rows in sequence to represent position in the draw. The input to this function is going to look like this:
> brackets # A tibble: 896 x 5 player_id round match player_name elo <int> <dbl> <int> <chr> <dbl> 1 200494 128 1 Dianne Van Rensburg 1855. 2 200423 128 1 Carling Bassett Seguso 1822. 3 200481 128 2 Elna Reinach 1777. 4 200506 128 2 Wiltrud Probst 1768. 5 200699 128 3 Meredith Mcgrath 1857. 6 200086 128 3 Magdalena Maleeva 1825. 7 200419 128 4 Kathy Rinaldi Stunkel 1804. 8 200624 128 4 Tami Whitlinger Jones 1647. 9 200395 128 5 Catherine Tanvier 1732. 10 200360 128 5 Pam Shriver 2047. ... </dbl> </chr> </int> </dbl> </int>
This indicates (for example) that in the “round of 128” – the first round played – Van Rensburg will play Seguso in match one. If we filter the object to match one of round two (the “round of 64) we see:
> filter(brackets, round == 64 & match ==1) # A tibble: 4 x 5 player_id round match player_name elo <int> <dbl> <int> <chr> <dbl> 1 200494 64 1 Dianne Van Rensburg 1855. 2 200423 64 1 Carling Bassett Seguso 1822. 3 200481 64 1 Elna Reinach 1777. 4 200506 64 1 Wiltrud Probst 1768. </dbl> </chr> </int> </dbl> </int>
There are now four players in match 1. However, one of Van Resnburg or Seguso will have lost in the round of 128; and so will one of Reinach and Probst. With a bit of care, this object brackets
contains the full draw of the elimination tournament, and can be constructed so the top 32 seeds are allocated to the brackets required in a 32 seed draw
. The remainder of the code in the chunk below does this for my two different methods of draws. It’s a bit clunky but it seems to work.
#' Function to simulate the result of a tournament of 128 players #' #' @param brackets a data frame or tibble with columns for player_id (player id), round (referring to #' the number of players left in that round of the tournament - 2,4,8,16,32,64 and 128), #' match (match id within that round) simulate_tournament <- function(brackets){ one_sim_results <- tibble(winner = numeric(), loser = numeric(), round = numeric(), match = numeric()) all_rounds <- unique(brackets$round) remaining_players <- brackets for(this_round in all_rounds){ all_matchups <- remaining_players %>% filter(round == this_round) %>% pull(match) %>% unique() for(this_match in all_matchups){ these_players <- remaining_players %>% filter(round == this_round) %>% filter(match == this_match) %>% select(player_id, elo) prob_win <- deuce::elo_prediction(these_players[1, ]$elo, these_players[2, ]$elo) a_wins <- rbinom(1, 1, prob = prob_win) == 1 if(a_wins){ update <- tibble(winner = these_players[1, ]$player_id, loser = these_players[2, ]$player_id, round = this_round, match = this_match) } else { update <- tibble(winner = these_players[2, ]$player_id, loser = these_players[1, ]$player_id, round = this_round, match = this_match) } one_sim_results <- rbind(one_sim_results, update) } remaining_players <- brackets %>% filter(!player_id %in% one_sim_results$loser) } return(one_sim_results) } #' Assign bracket slots to 128 players #' #' @param d a data frame or tibble with 128 rows and one column player_id #' @return a long data frame with columns for player_id, round and match indicating which #' match each player is in in each round (if they survive to that round). "round" is indicated #' by the convention of number of players left ie "Round of 128" is the first round, "Round of 2" #' is the grand final. #' @details The original order is retained so this can be used (with caution) to allocate #' matches to a seeded tournament. For example, with 32 seeds if the original data frame #' is in the order of position32 from the round_of_32_seeding object, with each position #' followed by 7 unseeded players. assign_rounds <- function(d){ d %>% mutate(position128 = rep(1:64, each = 2), position64 = rep(1:32, each = 4), position32 = rep(1:16, each = 8), position16 = rep(1:8, each = 16), position8 = rep(1:4, each = 32), position4 = rep(1:2, each = 64), position2 = rep(1:1, each = 128)) %>% gather(round, match, -player_id) %>% mutate(round = as.numeric(gsub("position", "", round))) } n_sims <- 10000 #-----------set up parallel processing----------- cluster <- makeCluster(7) # only any good if you have at least 7 processors :) registerDoParallel(cluster) clusterEvalQ(cluster, { library(foreach) library(tidyverse) library(deuce) }) clusterExport(cluster, c("women128", "assign_rounds", "simulate_tournament")) #-------------------unseeded tournament------------------- r1 <- foreach(this_sim = 1:n_sims, .combine = rbind) %dopar% { # new random positionining of all 128 players brackets <- women128 %>% sample_n(128) %>% select(player_id) %>% assign_rounds() %>% left_join(women128, by = "player_id") one_sim_results <- simulate_tournament(brackets) one_sim_results %>% mutate(sim = this_sim, method = "No seeding") } #------------------------seeded tournament----------- # https://en.wikipedia.org/wiki/Template:32TeamBracket round_of_32_seeding <- tibble( position32 = rep(1:16, each = 2), seed = c(1, 32, 17, 16, 9, 24, 25, 8, 5, 28, 21, 12, 13, 20, 29, 4, 3, 30, 19, 14, 11, 22, 27, 6, 7, 26, 23, 10, 15, 18, 31, 2) ) stopifnot(all(1:32 %in% round_of_32_seeding$seed)) # players and their position in the round of 32 bracket - doesn't change in each sim players <- women128 %>% arrange(desc(elo)) %>% mutate(ranking = 1:128, seed = ifelse(ranking <= 32, ranking, NA)) %>% left_join(round_of_32_seeding, by = "seed") clusterExport(cluster, "players") r2 <- foreach(this_sim = 1:n_sims, .combine = rbind) %dopar% { unseeded <- players[-(1:32), ] %>% # in random order: sample_n(n()) %>% mutate(position32 = 1:n() %% 16 + 1) %>% arrange(position32) # actual brackets change each sim for players who aren't one of the top 32 seeds brackets <- rbind(filter(players[, c("player_id", "position32")], !is.na(position32)), unseeded[, c("player_id", "position32")]) %>% arrange(position32) %>% assign_rounds() %>% left_join(select(players, -position32), by = "player_id") %>% arrange(desc(round), match) one_sim_results <- simulate_tournament(brackets) one_sim_results %>% mutate(sim = this_sim, method = "Seeded (32 seeds)") } results <- rbind(r1, r2) %>% left_join(players, by = c("winner" = "player_id")) %>% rename(winner_name = player_name)
Which gets us to our results.
The distribution of the winner of the whole tournament is summarised in the table below. Unsurprisingly, Ms Graf wins more of our simulated tournaments than any other player; 5,454/10,000 when the draw is at random and 5,913/10,000 when we use the 32-seed draw method. But even in this period of Graf’s dominance in women’s tennis, other players (and more than just the top few) have a non-zero chance of winning. Which is why people watch, of course. Interestingly, once we get to Monica Seles (ranked 3 by Elo at this point in time) and lower, players are more likely to win the overall tournament with an unseeded rather than a seeded draw.
winner_name | Actual ranking | No seeding | Seeded (32 seeds) |
---|---|---|---|
Steffi Graf | 1 | 5454 | 5913 |
Martina Navratilova | 2 | 2089 | 2219 |
Monica Seles | 3 | 1064 | 1018 |
Gabriela Sabatini | 4 | 510 | 418 |
Mary Joe Fernandez | 5 | 213 | 132 |
Conchita Martinez | 7 | 110 | 66 |
Arantxa Sanchez Vicario | 6 | 113 | 63 |
Jennifer Capriati | 8 | 89 | 42 |
Manuela Maleeva Fragniere | 9 | 66 | 34 |
Zina Garrison | 10 | 58 | 26 |
This is also the point at which we can interrogate our simulation results to get the image I started the blog with:
This analysis of the simulation results was made with this chunk of code
# Summarise the individual winners of the overall tournament: results %>% filter(round == 2) %>% group_by(winner_name, method, ranking) %>% summarise(wins = n()) %>% ungroup() %>% select(winner_name, method, wins, `Actual ranking` = ranking) %>% complete(winner_name, method, fill = list(wins = 0)) %>% pivot_wider(names_from = method, values_from = wins) %>% arrange(desc(`Seeded (32 seeds)`)) %>% slice(1:10) %>% kable() %>% kable_styling(full_width = FALSE) #' Present the success rate of different simulated draw methods #' #' @param results data frame of results as created earlier in this script #' @param women128 data frame with extra details on the 128 players in the tourname #' @param how many of the top players you want to compare the draw methods for. Must be a power of 2. #' @details Very specific to this particular bit of analysis success_rate <- function(results, women128, top_x){ if(log(top_x, 2) != as.integer(log(top_x, 2))){ stop("top_x must be a power of 2") } top_players <- women128[1:top_x, ]$player_name # we're going to count this by counting the winners in the previous round. eg to find if the top 2 # players are the final 2 in the competition, we see if they are in the winners from the round of 4: tmp <- results %>% filter(round == (top_x * 2)) %>% arrange(sim, method, match) %>% group_by(sim, method) %>% summarise(best = all(winner_name %in% top_players)) %>% group_by(method) %>% summarise(best = mean(best)) %>% mutate(top_x = top_x) return(tmp) } # Summary graphic rbind( success_rate(results, women128, top_x = 8), success_rate(results, women128, top_x = 4), success_rate(results, women128, top_x = 2), success_rate(results, women128, top_x = 1) ) %>% ggplot(aes(x = as.ordered(top_x), y = best, colour = method)) + geom_point() + geom_line(aes(x = as.numeric(as.ordered(top_x)))) + scale_y_continuous(label = percent) + expand_limits(y = 0) + labs(x = "Number of top players of interest", y = "Chance of those players being the\nlast ones in the competition", colour = "Method of allocating draw:", title = "Impact of seeding on the better players' end results in a tennis tournament", subtitle = "Seeding materially improves the chance of the best player winning or the best two players being in the grand final,and impacts the semi- and quarter- finals even more.", caption = the_caption)
That’s all for today. Some time soon I hope to come back to this and compare both methods to the alternative proposed by Charles Dodgson Lawn Tennis Tournaments. The True Method of Assigning Prizes with a Proof of the Fallacy of the Present Method . Dodgson, who as well as being a mathematician and logician found time to invent one of the most recognised characters in English literature, was writing at a time before seeding the draw was common and proposed an alternative to the single elimination tournament that in his view was guaranteed to give the correct first three prizes to the three best players. However, he did depend on a non-probabilistic view of what “best” means which is worth probing. But that’s for another day.
以上就是本文的全部内容,希望对大家的学习有所帮助,也希望大家多多支持 码农网
猜你喜欢:本站部分资源来源于网络,本站转载出于传递更多信息之目的,版权归原作者或者来源机构所有,如转载稿涉及版权问题,请联系我们。
软件人才管理的艺术
Michael Lopp / 罗小平 / 人民邮电出版社 / 201008 / 35.00元
本书作者具有15年的硅谷人才管理经验,他在博客上发表了大量探讨软件人才的管理之道的文章,深受读者欢迎。本书素材取自他的博客文章,用深入浅出的语言,讲述发人深思的道理,具有很强的现实操作性。 本书分为三大部分:“管理的箭袋”、“过程就是产品”、“你的其他版本”。前两部分分别讲述了人员与产品的管理,第三部分除了讨论管理之外,还讲述了如何有针对性地准备简历和电话面试,来提高自己面试成功的几率。书中......一起来看看 《软件人才管理的艺术》 这本书的介绍吧!