Visualizing "The Best"

(This article was first published on max humber, and kindly contributed to R-bloggers)

How do you measure “The Best”?

It’s not immediately clear, because “The Best” is incredibly vague and subjective. My “Best” is not the same as your “Best”. And our “Best”s can converge and diverge depending on what we are measuring and how we measure it.

I think “The Best” is often the wrong question. Usually when we’re looking for “The Best” (are you sick of me saying “The Best” yet?) we’re really just trying to find “The Better”.

Answers to questions like “Who is the best player in the NBA?” or “What is the best city in the world?” or “Which Pokemon is the best?” are fraught with caveats and asterisks and clarifications. And they have to be! What do you mean “Best”? Best Shooter? Best of All Time? Best Last Year? Best in terms of Quality of Living? Best on measures of Entertainment? Best Speeds? Best Attacks? Best Best Best Best Best! Aggh!

To answer these “Best” questions we have to narrow down the problem and convert them into “Better” questions. By slimming down the pool of possible options we can actually start to make some progress!

“Who is the better basketball player: Steph Curry, Kawhi Leonard, or Larry Bird?” “Which city is better: Toronto, London, or San Francisco?” “Which starter Pokemon (from Gen 1) is better for my team?”

These are questions, I think, we can actually answer! But only if we’re explicit about the measures we’re using in our calculations and the weights that we assign to them.

For instance, if we want to find the better basketball player we could come up with some formula that includes assists, shooting, usage, defense, and rebounds. Maybe my formula is y = 2 * Shooting + 1.2 * Usage - 3 * Defense + 1.1 * Assists^2 + 1.3 * Rebounds. But I think reducing five incredibly rich measurements down to one value is absurd. It’s lossy compression! And you might not agree with my formula. You could have a better one. You might really value rebounds. Or you might want to replace usage with steals! Blah Blah Blah Blah Blah.
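To see how much a single number hides, here is a small sketch that plugs percentile values into the formula above. The numbers are the ones used in the basketball example later in this post; the function name `composite_score` is made up for illustration.

```r
# Percentile values taken from the basketball example later in this post
players <- data.frame(
  player   = c("Larry Bird", "Kawhi Leonard", "Stephen Curry"),
  assists  = c(88, 71, 95),
  shooting = c(89, 94, 92),
  usage    = c(93, 92, 87),
  defense  = c(92, 93, 43),
  rebounds = c(87, 62, 32)
)

# Hypothetical helper: the (deliberately arbitrary) formula from the text
composite_score <- function(d) {
  2 * d$shooting + 1.2 * d$usage - 3 * d$defense +
    1.1 * d$assists^2 + 1.3 * d$rebounds
}

players$y <- composite_score(players)

# Sort from highest to lowest composite score
players[order(-players$y), c("player", "y")]
```

With these inputs the squared assists term swamps everything else, so the ranking simply tracks assists (Curry, then Bird, then Kawhi), and the negative defense weight rewards weak defenders. Change a weight or two and the order can move, which is the problem with collapsing everything to one value.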

Worry not! We’ve finally reached the part where I present a method for finding “The Best” (or “The Better”, at least). A visualization for “The Best”. It is inspired by (more like entirely ripped off from) this FiveThirtyEight article.

All that is required is the tidyverse and forcats.

Basketball

Step 1: Spin up the data

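    library(tidyverse)
    library(forcats)  # attached by recent tidyverse releases; explicit here to be safe

    # Percentile ranks for each player, plus an "Average" baseline row
    df <- tribble(
      ~player,         ~assists, ~shooting, ~usage, ~defense, ~rebounds,
      "Larry Bird",    88, 89, 93, 92, 87,
      "Kawhi Leonard", 71, 94, 92, 93, 62,
      "Stephen Curry", 95, 92, 87, 43, 32,
      "Average",       72, 85, 32, 34, 30) %>%
      # Wide to long: one row per player/stat pair
      # (gather() still works; tidyr 1.0+ also offers pivot_longer())
      gather(stat, percentile, -player) %>%
      # Bin each percentile into one of four bands
      mutate(outof4 = percentile %/% 25 + 1) %>%
      # "A" for the baseline, "B" for real players (used for fill colour)
      mutate(col = ifelse(player == "Average", "A", "B")) %>%
      # Position of each stat around the circle
      mutate(order = recode(stat,
        assists = 5, shooting = 1, usage = 2, defense = 3, rebounds = 4)) %>%
      # Display labels for each stat
      mutate(stat = recode(stat,
        assists = "ASSIST\nRATE",
        shooting = "TRUE\nSHOOTING",
        usage = "USAGE\nRATE",
        defense = "DEFENSIVE\nBPM",
        rebounds = "REBOUND\nRATE")) %>%
      mutate(stat = factor(stat))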
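The `percentile %/% 25 + 1` step deserves a second look, since it is what turns a 0 to 100 percentile into a ring count for the plot. The sketch below isolates it in base R. The helper name `to_band` and the clamping with `pmin()` are additions for illustration, not part of the original pipeline: without the clamp, a percentile of exactly 100 would land in a fifth band, outside the 0 to 4 axis limits used for the plot. None of the values in this post hit 100, so the original rule is fine here.

```r
# Hypothetical helper: integer-divide into bands of equal width (1-based)
# and clamp the top edge so the maximum value stays in the last band
to_band <- function(x, width, bands) {
  pmin(x %/% width + 1, bands)
}

percentiles <- c(0, 24, 25, 88, 100)

percentiles %/% 25 + 1        # original rule: 100 falls into band 5
to_band(percentiles, 25, 4)   # clamped rule: 100 stays in band 4
```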

Step 2: Graph!

    df %>%
      ggplot(aes(x = fct_reorder(stat, order), y = outof4)) +
      # One wedge per stat; grey for "Average", blue for the players
      geom_col(alpha = 0.5, aes(fill = col), width = 1,
               show.legend = FALSE, color = "white") +
      # Dotted rings at each band
      # (ggplot2 3.4+ prefers linewidth over size for lines and will warn)
      geom_hline(yintercept = seq(0, 4, by = 1),
                 colour = "#949494", size = 0.5, lty = 3) +
      # Spokes between adjacent stats
      geom_vline(xintercept = seq(0.5, 5.5, 1),
                 colour = "#949494", size = 0.4, lty = 1) +
      facet_wrap(~player) +
      coord_polar() +
      scale_fill_manual(values = c("#e4e4e4", "#00a9e0")) +
      scale_y_continuous(limits = c(0, 4), breaks = c(1, 2, 3, 4)) +
      labs(x = "", y = "") +
      theme(
        panel.background = element_rect(fill = "#FFFFFF"),
        plot.background = element_rect(fill = "#FFFFFF"),
        strip.background = element_rect(fill = "#FFFFFF"),
        strip.text = element_text(size = 10),
        panel.grid = element_blank(),
        axis.ticks = element_blank(),
        axis.text.y = element_blank(),
        axis.text.x = element_text(size = 7),
        panel.spacing = grid::unit(2, "lines"))

I’m not sure what to call this new type of visualization, but I love it! I love it because you can immediately see that Larry Bird is “The Best” basketball player, at least when benchmarked against Steph and Kawhi. And I love it because it’s not just exotic fluff. The bubbles actually help with the interpretation of the data. The bubbles are simply better than columns, in this instance.

Just compare… These two graphs literally contain the same data:

    # Stacked columns
    df %>%
      ggplot(aes(x = stat, y = percentile, fill = player)) +
      geom_col(alpha = 0.5, color = "white")

    # Side-by-side (dodged) columns
    df %>%
      ggplot(aes(x = stat, y = percentile, fill = player)) +
      geom_col(alpha = 0.5, color = "white", position = position_dodge())

Like, I can sort of tell that all the blue bars are really tall, but I can’t see much beyond that. It’s hard to jump back and forth between the stats and the players and tease out patterns.

Pokemon

Moving to the Pokemon question, we can grab data from http://pokemondb.net/

    # Base stats for the fully evolved Gen 1 starters
    df <- tribble(
      ~Pokemon,    ~HP, ~Attack, ~Defense, ~SpAtt, ~SpDef, ~Speed,
      "Charizard", 78, 84,  78, 109,  85, 100,
      "Blastoise", 79, 83, 100,  85, 105,  78,
      "Venusaur",  80, 82,  83, 100, 100,  80) %>%
      # Wide to long: one row per Pokemon/stat pair
      gather(stat, value, -Pokemon) %>%
      # Bin each stat into one of six bands
      mutate(outof6 = value %/% 20 + 1) %>%
      # Position of each stat around the circle
      mutate(order = recode(stat,
        HP = 6, Attack = 1, SpAtt = 2, Defense = 3, SpDef = 4, Speed = 5))

Display it in the same way:

    df %>%
      ggplot(aes(x = fct_reorder(stat, order), y = outof6, fill = Pokemon)) +
      geom_col(alpha = 3/4, width = 1, show.legend = FALSE, color = "white") +
      # Dotted rings at each band
      # (ggplot2 3.4+ prefers linewidth over size for lines and will warn)
      geom_hline(yintercept = seq(0, 6, by = 1),
                 colour = "#949494", size = 0.5, lty = 3) +
      # Spokes between adjacent stats
      geom_vline(xintercept = seq(0.5, 5.5, 1),
                 colour = "#949494", size = 0.4, lty = 1) +
      facet_wrap(~Pokemon) +
      coord_polar() +
      # Colours are assigned in alphabetical order of Pokemon
      scale_fill_manual(values = c("#06AED5", "#ED933C", "#65B54F")) +
      scale_y_continuous(limits = c(0, 6), breaks = c(1, 2, 3, 4, 5, 6)) +
      labs(x = "", y = "") +
      theme(
        panel.background = element_rect(fill = "#FFFFFF"),
        plot.background = element_rect(fill = "#FFFFFF"),
        strip.background = element_rect(fill = "#FFFFFF"),
        strip.text = element_text(size = 10),
        panel.grid = element_blank(),
        axis.ticks = element_blank(),
        axis.text.y = element_blank(),
        axis.text.x = element_text(size = 7),
        panel.spacing = grid::unit(2, "lines"))

And we can see that there really is no objective “Best” this time. It totally depends on what measures are important to us. Perhaps I really value Speed and Attack in my Pokemon. Well, looking at the graph I can see that I ought to grab Charizard. And you might decide to go with Blastoise because you like big tanky defense Pokemon. It totally depends! And a single point value would be incredibly misleading here.
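To make the “it depends” point concrete, the sketch below totals the raw base stats under two different sets of priorities. The stat values are the ones from the table above; the helper `best_by` and the plain unweighted sum are assumptions for illustration, not a recommended scoring rule.

```r
stats <- data.frame(
  Pokemon = c("Charizard", "Blastoise", "Venusaur"),
  HP      = c(78, 79, 80),
  Attack  = c(84, 83, 82),
  Defense = c(78, 100, 83),
  SpAtt   = c(109, 85, 100),
  SpDef   = c(85, 105, 100),
  Speed   = c(100, 78, 80)
)

# Hypothetical helper: sum the chosen stats and return the top Pokemon
best_by <- function(d, cols) {
  totals <- rowSums(d[, cols, drop = FALSE])
  as.character(d$Pokemon[which.max(totals)])
}

best_by(stats, c("Speed", "Attack"))    # priorities of a fast attacker
best_by(stats, c("Defense", "SpDef"))   # priorities of a tank
```

The first call picks Charizard and the second picks Blastoise, which matches what the bubbles suggest at a glance.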

Cities

Just one last example to wrap it all up. Using the PWC Cities of Opportunity Index I can grab the measures that are important to me, like Broadband Quality (need that fast internet!), Entertainment, Quality of Living, Ease of Starting a Business, and Cost of Living, to generate similar comparisons.

PWC actually has a visualization tool of its own, but I think my bubbles are better!

    # Scores for the five measures I care about (QOL = Quality of Living,
    # Startup = Ease of Starting a Business, COL = Cost of Living)
    df <- tribble(
      ~city,           ~Broadband, ~Entertainment, ~`QOL`, ~Startup, ~`COL`,
      "Toronto",       17, 16, 30, 30, 12,
      "London",        19, 30, 16, 20,  1,
      "San Francisco", 20, 13, 18, 18,  6) %>%
      # Wide to long: one row per city/measure pair
      gather(stat, value, -city) %>%
      # Bin each score into one of six bands
      mutate(score = value %/% 6 + 1) %>%
      # Position of each measure around the circle
      mutate(order = recode(stat,
        Broadband = 5, Entertainment = 1, COL = 2, QOL = 3, Startup = 4))

    df %>%
      ggplot(aes(x = fct_reorder(stat, order), y = score, fill = city)) +
      geom_col(alpha = 1, width = 1, show.legend = FALSE, color = "white") +
      # Dotted rings at each band
      # (ggplot2 3.4+ prefers linewidth over size for lines and will warn)
      geom_hline(yintercept = seq(0, 6, by = 1),
                 colour = "#949494", size = 0.5, lty = 3) +
      # Spokes between adjacent measures
      geom_vline(xintercept = seq(0.5, 5.5, 1),
                 colour = "#949494", size = 0.4, lty = 1) +
      facet_wrap(~city) +
      coord_polar() +
      # Colours are assigned in alphabetical order of city
      scale_fill_manual(values = c("#0C2238", "#BF4E22", "#AF0023")) +
      scale_y_continuous(limits = c(0, 6), breaks = c(1, 2, 3, 4, 5, 6)) +
      labs(x = "", y = "") +
      theme(
        panel.background = element_rect(fill = "#FFFFFF"),
        plot.background = element_rect(fill = "#FFFFFF"),
        strip.background = element_rect(fill = "#FFFFFF"),
        strip.text = element_text(size = 10),
        panel.grid = element_blank(),
        axis.ticks = element_blank(),
        axis.text.y = element_blank(),
        axis.text.x = element_text(size = 7),
        panel.spacing = grid::unit(2, "lines"))

There you have it. “The Best” Visualization. Or at least a visualization for “The Best”.
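One practical footnote: since the three plots differ only in their column names, colours, and maximum score, the shared parts can be pulled into one function. This is a rough sketch rather than the code used for the figures above. The function name `bubble_plot` and its arguments are made up here, it assumes a long data frame with a numeric ordering column like the ones built earlier, and it leaves fill colours at the ggplot2 defaults.

```r
library(ggplot2)
library(forcats)

# Hypothetical wrapper around the plotting code repeated in this post.
# stat, score, order and panel are column names passed as strings.
bubble_plot <- function(data, stat, score, order, panel, max_score) {
  ggplot(data, aes(x = fct_reorder(.data[[stat]], .data[[order]]),
                   y = .data[[score]],
                   fill = .data[[panel]])) +
    geom_col(width = 1, colour = "white", show.legend = FALSE) +
    # Dotted rings at each band
    geom_hline(yintercept = seq(0, max_score, by = 1),
               colour = "#949494", lty = 3) +
    facet_wrap(panel) +
    coord_polar() +
    scale_y_continuous(limits = c(0, max_score),
                       breaks = seq_len(max_score)) +
    labs(x = "", y = "") +
    theme(
      panel.background = element_rect(fill = "#FFFFFF"),
      strip.background = element_rect(fill = "#FFFFFF"),
      panel.grid = element_blank(),
      axis.ticks = element_blank(),
      axis.text.y = element_blank())
}

# Tiny made-up data set, just to show the call
demo <- data.frame(
  thing = rep(c("A", "B"), each = 3),
  stat  = rep(c("x", "y", "z"), times = 2),
  score = c(1, 2, 3, 3, 2, 1),
  order = rep(1:3, times = 2)
)

p <- bubble_plot(demo, "stat", "score", "order", "thing", max_score = 3)
# print(p) draws the two panels
```

Passing column names as strings keeps the call simple; adding a `scale_fill_manual()` to the returned plot would restore custom colours.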
