Difference between revisions of "3262: Sports Commentary"

(Explanation)
(Explanation: Thinko?)
 
(24 intermediate revisions by 11 users not shown)
Line 10: Line 10:
  
 
==Explanation==
 
==Explanation==
{{incomplete|This page was created at a statistically insignificant time. Don't remove this notice too soon.}}
+
{{incomplete|This page was created at a statistically insignificant time, but it is the FIRST PAGE TO START WITH 3262. Don't remove this notice too soon.}}
  
This comic was published 11 days into the {{w|2026 FIFA World Cup}}.
+
{{w|P-hacking}} is the academically problematic practice of attempting to come up with a question for which the data offers a significant ''p''-value (probability value), a subject [[882: Significant|previously covered]] in comic form. This is in contrast to correct scientific analysis, in which a realistic question is formulated clearly and then answered (or shown to be unjustified) with data.
 +
 
 +
A common way of doing ''p''-hacking is analyzing subgroups to attempt to find significance when the full dataset does not yield statistically significant results; for instance, arbitrarily restricting the analysis of medical data to male subjects to derive a significant ''p''-value when the inclusion of female subjects would have changed the conclusion. There are actual biological reasons why treatments may work differently between the groups, and other reasons why female subjects may be less suitable participants in a trial, but a ''post facto'' decision to present only the 'male data' would be problematic. The same goes for examining many other retrospective distinctions and then presenting only the possibly random patterns that stood out, while ignoring all those that did not.
 +
 
 +
Sports commentators are known to do a form of ''p''-hacking in which they recall facts regarding past performance, sometimes made to sound more significant by choosing only those 'facts' that coincide closely with the situation developing in front of them. Using {{tvtropes|OverlyNarrowSuperlative|overly narrow superlatives}}, a severe narrowing of applicability also [[2901: Geographic Qualifiers|previously covered]], realistically reduces any confidence that such a dwindling number of precedents is a useful predictor of how the upcoming event will turn out.
 +
 
 +
Randall satirizes this with an example in which the restriction uses very specific criteria, largely irrelevant to gameplay patterns, to narrow the subgroup down to a measly two games. The 0-2 record (two situations were considered comparable, and neither produced the result hoped for in the current case) reflects random noise much more than any significant insight. As well as being irrelevant to gameplay, the ''p''-hacking makes the commentary sound like jargon, which can be confusing and difficult to understand. This is ironic given that a sports commentator's job is supposed to be to explain the situation in front of them, rather than to make it harder to follow and more vague. However, this may be the inevitable response to being left in front of the camera during breaks in play, or even during periods of gameplay that are nominally unremarkable: feeling the pressure to say ''something'', they will draw upon ever more obscure and irrelevant details to justify their (or their off-screen advisors') efforts and expertise to entertain and inform the viewing public.
 +
 
 +
{{cot|A breakdown of the commentary's statement}}
 +
The comparison being made is that "Over the last 36 years, they've gone 0 for 2 when they've scored in the 37th minute to lead 2-1 against a team whose country comes before theirs alphabetically." This contains the following basic stipulations:
 +
;"Over the last 36 years, ...":Counting just the full FIFA World Cup competitions, and assuming they qualified for every one, an international team will have played a minimum of twenty-seven matches prior to this year's competition (i.e. the First Round group stage of each tournament, playing once against each of the other three teams in their particular group of four). ''If'' they are ever successful enough in the group stage, they then progress through the Second Round knockout competition for as many matches as they avoid being knocked out, and the losing semi-finalists play one more match to decide third place overall. On top of that, there are the various regional qualifying matches they will usually have had to play to even enter the main competition, plus any other international matches (e.g. '{{w|Exhibition game|friendlies}}', or other region-based inter-nation competitions) in which they may have taken part.
 +
;"... when they've scored in the 37th minute...":A football game has a nominal 90 minutes of game-time, plus possible extra time. No team in the World Cup has scored more than {{w|Hungary v El Salvador (1982 FIFA World Cup)|ten goals}} in a single game, and it is ''far'' more common for even winning teams to have scored just two or three times per game. Statistically, the exact minute in which a goal is scored is an insignificant detail. There is effectively no useful analysis of a goal being in the 37th minute, as opposed to the 36th or 38th, and hardly any even in its falling within the larger block between 30 and 40 minutes. The psychology of goal-timings usually gravitates towards whether they were in the first or second ''half'' of the event (or, beyond that, in Extra Time), with most useful attention paid to those that occur right at the start of either half (one team immediately seizing the initiative on the field) or right at the end (when desperation, increased chance-taking or just player exhaustion can lead to much needed/feared game-changing goals once any attempt at mutually defensive play breaks down and possible goal-droughts are ended).
 +
;"... to lead 2-1 ...":As an equivalent example, in the 2022 World Cup, 14 Group Stage games (out of 48) and 9 Knockout Stage games (out of 16) may have at some point reached a 2-1 scoreline for one or other team, depending upon the order in which the respective teams' goals occurred<!-- which I didn't look into - feel free to do that legwork for me! -->, making this a relatively rare situation to be in. For additional context, and most relevant to the full statement, that year's competition also saw just six Group games whose scores ''might'' have included<!-- could also be checked, as I didn't dig into those enough --> a temporary 2-1 lead for the team that went on to lose, whereas ''every'' team with a 2-1 lead in the Knockouts went on to win that match<!-- For those editors interested in my limited research on this matter: Argentina were 2-1 in two cases, then fought back to a draw by the end of Extra Time, but then triumphed due to out-scoring their opponents in the necessary Penalty Shootout -->.
 +
;"... against a team whose country comes before theirs alphabetically.":In ''every'' international match (and others, excepting perhaps games used to train the team's players against each other), there will inevitably be one national team whose name is alphabetically prior to that of their opponents, even if the match features very similar names (such as a match between the two Koreas, using the most similar manner of naming, where {{w|North Korea national football team|Korea DPR}} would precede {{w|South Korea national football team|Korea Republic}}). There is also no clear reason why a naming issue (alone) would have any significant bearing upon match outcomes.
 +
;"... they've gone 0 for 2 ...":(As the stated past consequence of all these specifically combined conditions.) Just ''two'' occasions satisfied all these conditions, out of possibly many tens of matches, and we are told that neither of them ended in a victory. Not only are the comic's precedents ''very'' rare compared to all possible games (and they nevertheless seem to be even rarer in real life<!-- unless and until someone finds such historically matching matches, then please edit this!-->), but this mini-'streak' of results is also only a matter of history. In [[1122: Electoral Precedent]], increasingly convoluted situations may have previously been entirely predictive in possibly even several dozen instances... ''until they weren't''.
 +
{{cob}}
 +
 
 +
This comic was published 11 days into the {{w|2026 FIFA World Cup}}. The World Cup was also the subject of [[3260: Messi]], published the previous Wednesday. Sports commentary was also the subject of [[904: Sports]].
  
 
==Transcript==
 
==Transcript==
 
{{incomplete transcript|Don't remove this notice too soon.}}
 
{{incomplete transcript|Don't remove this notice too soon.}}
[Cueball and Ponytail are sitting at a table. On the wall behind them is a screen showing a soccer field with some unreadable score information above it.]
+
:[Cueball and Ponytail are sitting at a table, looking at the wall behind them. On the wall is a screen showing a soccer field with some mostly unreadable score information above it. The only readable information is that the score is 2-1.]
 +
:Cueball: They could be in trouble. Over the last 36 years, they've gone 0 for 2 when they've scored in the 37th minute to lead 2-1 against a team whose country comes before theirs alphabetically.
 +
:[Caption below comic:]
 +
:I wish sports commentators hadn't discovered p-hacking.
  
Cueball: They could be in trouble. Over the last 36 years, they've gone 0 for 2 when they've scored in the 37th minute to lead 2-1 against a team whose country comes before theirs alphabetically.
+
{{comic discussion}}<noinclude>
 
 
[Caption below comic:] I wish sports commentators hadn't discovered P-hacking.
 
  
{{comic discussion}}<noinclude>
+
[[Category:Comics featuring Cueball]]
 +
[[Category:Comics featuring Ponytail]]
 +
[[Category:Soccer]]
 +
[[Category:Statistics]]

Latest revision as of 23:22, 23 June 2026

Sports Commentary
Title text: The plural of anecdote may not be data, but the singular of data is anecdote.

Explanation

This is one of 45 incomplete explanations:
This page was created at a statistically insignificant time, but it is the FIRST PAGE TO START WITH 3262. Don't remove this notice too soon. If you can fix this issue, edit the page!

P-hacking is the academically problematic practice of attempting to come up with a question for which the data offers a significant p-value (probability value), a subject previously covered in comic form. This is in contrast to correct scientific analysis, in which a realistic question is formulated clearly and then answered (or shown to be unjustified) with data.

A common way of doing p-hacking is analyzing subgroups to attempt to find significance when the full dataset does not yield statistically significant results; for instance, arbitrarily restricting the analysis of medical data to male subjects to derive a significant p-value when the inclusion of female subjects would have changed the conclusion. There are actual biological reasons why treatments may work differently between the groups, and other reasons why female subjects may be less suitable participants in a trial, but a post facto decision to present only the 'male data' would be problematic. The same goes for examining many other retrospective distinctions and then presenting only the possibly random patterns that stood out, while ignoring all those that did not.
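
As a rough illustration of why hunting through subgroups is risky, the sketch below computes how likely it is that at least one of several tests on pure noise comes out 'significant'. It assumes the tests are independent and each uses the conventional 0.05 threshold; neither assumption comes from the comic, and the function name is hypothetical.

```python
def chance_of_spurious_hit(num_tests: int, alpha: float = 0.05) -> float:
    """Probability that at least one of num_tests independent tests on pure
    noise falls below the significance threshold alpha.

    Hypothetical helper for illustration: real subgroup tests usually
    overlap, so independence is a simplifying assumption.
    """
    if num_tests < 0:
        raise ValueError("num_tests must be non-negative")
    # Each test avoids a false positive with probability (1 - alpha);
    # all of them avoid it together with probability (1 - alpha) ** num_tests.
    return 1.0 - (1.0 - alpha) ** num_tests


if __name__ == "__main__":
    for k in (1, 5, 20, 50):
        print(f"{k:>2} subgroups tested: "
              f"{chance_of_spurious_hit(k):.0%} chance of a false 'finding'")
```

Under these assumptions, checking twenty subgroups gives roughly a 64% chance of at least one spurious result, which is the situation depicted in 882: Significant.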

Sports commentators are known to do a form of p-hacking in which they recall facts regarding past performance, sometimes made to sound more significant by choosing only those 'facts' that coincide closely with the situation developing in front of them. Using overly narrow superlatives, a severe narrowing of applicability also previously covered, realistically reduces any confidence that such a dwindling number of precedents is a useful predictor of how the upcoming event will turn out.

Randall satirizes this with an example in which the restriction uses very specific criteria, largely irrelevant to gameplay patterns, to narrow the subgroup down to a measly two games. The 0-2 record (two situations were considered comparable, and neither produced the result hoped for in the current case) reflects random noise much more than any significant insight. As well as being irrelevant to gameplay, the p-hacking makes the commentary sound like jargon, which can be confusing and difficult to understand. This is ironic given that a sports commentator's job is supposed to be to explain the situation in front of them, rather than to make it harder to follow and more vague. However, this may be the inevitable response to being left in front of the camera during breaks in play, or even during periods of gameplay that are nominally unremarkable: feeling the pressure to say something, they will draw upon ever more obscure and irrelevant details to justify their (or their off-screen advisors') efforts and expertise to entertain and inform the viewing public.
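
To put a number on how little a 0-for-2 record says, the sketch below computes the chance of winning none of a handful of games for a team with a given win probability. The win probabilities are made-up illustrative values, the games are assumed independent, and the function name is hypothetical.

```python
def prob_no_wins(win_prob: float, games: int = 2) -> float:
    """Chance of winning none of `games` independent matches when each is
    won with probability win_prob (draws count as 'not won').

    Hypothetical helper; the probabilities fed in below are illustrative
    assumptions, not real football statistics.
    """
    if not 0.0 <= win_prob <= 1.0:
        raise ValueError("win_prob must be between 0 and 1")
    if games < 0:
        raise ValueError("games must be non-negative")
    return (1.0 - win_prob) ** games


if __name__ == "__main__":
    for p in (0.5, 0.7, 0.9):
        print(f"win probability {p:.0%}: "
              f"goes 0 for 2 with probability {prob_no_wins(p):.1%}")
```

Under these assumptions, even a side that wins 70% of such games would go 0 for 2 about 9% of the time, so two matching precedents cannot distinguish a real weakness from chance.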

This comic was published 11 days into the 2026 FIFA World Cup. The World Cup was also the subject of 3260: Messi, published the previous Wednesday. Sports commentary was also the subject of 904: Sports.

Transcript

This is one of 28 incomplete transcripts:
Don't remove this notice too soon. If you can fix this issue, edit the page!
[Cueball and Ponytail are sitting at a table, looking at the wall behind them. On the wall is a screen showing a soccer field with some mostly unreadable score information above it. The only readable information is that the score is 2-1.]
Cueball: They could be in trouble. Over the last 36 years, they've gone 0 for 2 when they've scored in the 37th minute to lead 2-1 against a team whose country comes before theirs alphabetically.
[Caption below comic:]
I wish sports commentators hadn't discovered p-hacking.


Discussion

F1rst p0st! I'll do this explanation. 185.36.194.22 04:32, 23 June 2026 (UTC)

Did this example actually happen? 47.151.65.120 04:33, 23 June 2026 (UTC)

This comic reminds me of 1122: Electoral Precedent and 2383: Electoral Precedent 2020. Generalizing coincidences.

I am not a native English speaker. What does " they've gone 0 for 2" mean? Obviously it cannot be the score, since they are already leading 2-1? Or does this refer to a previous match? And on a more general note, I am really surprised to discover the second football themed comic strip in a few days. OK it's the World Cup, but I always thought that Randall doesn't really care about sports? --92.209.171.90 08:37, 23 June 2026 (UTC)

I am a native English speaker, but it was also a bit impenetrable to me. In part, perhaps, because it was intended to sound impenetrable (as part of the joke). But, even if not, it may be because it's using Americanized sports-talk phrasing that just isn't (yet!) used so much in my more native Anglicised commentaries that I'm used to.
However, I think they're saying that "in the two specific occasions in which all those other conditions occur, they won in neither of them".
A simpler version being perhaps to state that a given team/player has gone nought-for-two in previous matches with their current opponent(s). The results of those contests might have been anything (the winner having gone to 3-2 after penalties, 6-love/6-love/6-love, a par-4 advantage or getting them all out for 178, depending upon the sport), it's just the win/lose (or win/not-win) count that's "0 for 2".
But this is a case of Overly Narrow Superlative (overlapping with P-Hacking), making it a dubious analysis. Starting with ignoring all the games there are in which a given scoreline was not achieved in a particular minute of play. I think part of this set-up is the difference between Gridiron 'football'/"hand-egg" having tons of points scored, whereas this football (Soccer) often turns on comparatively low scores (one-nil can be a worthy and entertaining win/loss, and even a no-score-draw might have been fun to watch if your side isn't in desperate need for a win). These commentators, or at least the US audience they're commentating to, are used to spieling things about "the last time they were down on the forty-yard line in the fifth quarter, with two home runs and a shot from the free-throw line in hand..." (look, I know I don't know what they'd really say, to any accuracy, there was no point even trying!), at least to fill in the copious down-time/time-out pauses. (Which isn't actually as easy with low-scoring but more ever-moving 'soccer', where there's often much to be said about current player and ball movements almost all the time; although a five-day international cricket test match(!) commentary on the radio does rather famously lapse into 'filler' like discussing the nice cake that was sent to them by a listener, in the gaps between balls being bowled...)
Sorry, that was a long and convoluted paragraph. (But then, so was the Explanation, before I decided to say this down here. I hope it's been tweaked since then. I'm only really guessing about the Leftpondian commentator-speak being parodied here, and ball-sports aren't really my main interest in the sporting sphere itself. (But, regarding balls that aren't themselves spheres, I'd happily discuss Rugby League or Rugby Union, and why they're 'better'... though I would totally acknowledge Aussie Rules as a class of its own as far as such contact-sports go.))
HTH, HAND. 82.132.236.84 10:08, 23 June 2026 (UTC)
I'm also English, and it's totally alien to me too. GSLikesCats307 (talk) 11:53, 23 June 2026 (UTC)
I don't know what prompted the rant above, but if you don't care to read it, "going 0 for 2" means having 0 successes out of 2 chances. In the context of this commentary, it's referring to winning 0 games out of the 2 games that meet the criteria. It's not intended to sound impenetrable; it's a common phrasing.163.116.145.33 13:37, 23 June 2026 (UTC)
Worth noting I think that it's a common AMERICAN phrasing. It's hardly ever used in England.
I don't know what prompted you to think it was a rant. It's certainly quite lengthy (in the context of discussion comments here - not in the grand scheme of things), but that's not really the definition of a rant. 82.13.184.33 14:57, 23 June 2026 (UTC)
Because it was lengthy, but it almost entirely ignored the question, focusing instead on your opinion on sports and their commentary styles. Going off on one vs. ranting...it's a blurry distinction. You absolutely did the former; possibly the latter. 0 for 2 (pronounced as a letter "O") simply means zero victories for two games played. It's not obscure terminology. Yorkshire Pudding (talk) 15:50, 23 June 2026 (UTC)
Yes, it is obscure. The only context in which “0 for 2” makes sense to me is in cricket: 0 runs scored, 2 batsmen out. The 0 of / out of / from 2 described above is just not something which would occur to me, and it's largely because of that ‘for’. If it's common, why have I never heard it on (for example) Match of the Day? Randomnonsense (talk) 16:44, 23 June 2026 (UTC)
It's hard to argue that a usage common to the US - the largest English-speaking country in the world - Canada (https://www.google.com/search?q=%22went+0+for%22+site%253Athestar.com), and Australia (https://www.google.com/search?q=%22went+0+for%22+site%3Aabc.net.au, including their coverage of American sports leagues, local sports leagues, international tennis, and international cricket) is obscure. Your personal knowledge is not a good metric for obscurity. 163.116.145.44 18:50, 23 June 2026 (UTC)
Precisely what Yorkshire Pudding said; you could have stopped after your third paragraph, or really even after your second, and you wouldn't have lost any relevant information. I'll add that the rantiness is enhanced by saying things like Gridiron 'football'/"hand-egg", and the very "sportsball"-coded "the last time they were down on the forty-yard line in the fifth quarter, with two home runs and a shot from the free-throw line in hand...," which, apart from their irrelevance to the topic, have a very superior air. 163.116.145.44 16:51, 23 June 2026 (UTC)

Closest match I can find is Germany - Curacao but there Germany took the lead in the 38th minute (not the 37th). I leave the deep dive on Germany's record against teams alphabetically before them when they have taken the lead 2-1 in the 37th/38th minute to someone else...

And, of course, Germany destroyed Curaçao 7-1, just like they did to Brazil (which is also alphabetically before Germany!) 12 years prior Wilh3lm (talk) 12:38, 23 June 2026 (UTC)
As I recall, the Germans had already scored 7 before the Brazilians scored, and quite a few people independently came up with Ger-many Braz-nil… Randomnonsense (talk) 16:48, 23 June 2026 (UTC)
I did a limited look into possible edge-cases, but didn't get anywhere so far as to confirm all minutes-of-goals for those. Concentrated on (fairly recent) games with more than a 2-2 final score, as the requirements are that they must have at least been 2-1 and then at least got levelled up by the lagging side, before the possible tie-breaking penalty shootouts.
I suppose I could have first narrowed it down to every game with a reported 37th minute goal (given the rarity of that exact event, by apparently common agreement) then tallying up the precise game-state at that time, plus the final result. But I was looking into the other fine details first. 81.179.200.152 23:14, 23 June 2026 (UTC)

Don't assume "bad" p-hacking without looking closer at the data: Sometimes what looks like P-hacking is really finding previously-unseen patterns. If you have a drug trial on a drug that you have no reason to think will show gender differences and you are asking "is this drug better than existing drugs" and the results are inconclusive, then you do "p-hack" subgroups and find that in males between the ages of 18 and 50 it demonstrates superior results, you MAY be cherry-picking results or you MAY have found a hidden pattern. Assuming your sub-group size isn't ridiculously small, you can legitimately claim that you need more funding for a follow-up study or at least a follow-up analysis of this subgroup in previous studies of the same drug. 150.221.155.241 13:35, 23 June 2026 (UTC)

Yes, more data is usually the solution. The comic deliberately uses an extremely small dataset. You can make up almost any hypothesis and find 2-3 datapoints that fit it. Barmar (talk) 14:24, 23 June 2026 (UTC)
Or, perhaps more commonly, if you have a sufficiently large dataset you can mine through it and come up with two or three interesting-looking 'hypotheses' that it'll appear to support, even if you didn't have any to start with. 82.13.184.33 15:02, 23 June 2026 (UTC)
The difference is that taking a lot of data and looking at many possible patterns to that data is likely to reveal artefacts of mere chance.
Considering a single possible pattern and looking to see if it is justified is (even if, as an individual pattern, exactly as much a coincidental artefact or not) useful. So one might legitimately be able to suggest that the male subjects may show a legitimate long-term improvement to a treatment (possibly either because it works better with male hormones, or because there aren't the same underlying hormonal variations across each month, or just because it works better for different body-mass/fat distributions that are more typically male than female), or perhaps vice-versa (that it actually works more usefully for females, again for such reasons).
But filtering inconclusive results through many possible sorting algorithms and extracting 'nuggets' of apparent significance devalues those nuggets. Especially if it gets more complexly combinatorial. Imagine the number of criteria you might have considered, together and individually, to establish anything like "the treatment was twice as effective as the control treatment in males less than 30 with an older brother and females over 38 whose job is in education". How you'd even rationalise/explain such a complicated cause/effect relationship is one thing. That you've probably discarded so many other datums you had (out of a hundred subjects, you've decided to ignore maybe half the people because of being under-/over-age, in this gendered thing, cutting back far more than that for the family/employment requirements and then of the remainder (probably somewhat <10, I'd guess, depending on where you actually got the initial subject-pool) and then, for such a comparison to stand up, the non-discards must be further split between on-treatment and whatever off-treatment/old-treatment you're comparing to) is an issue of practicality which is the other issue. (At least until you come up with a decent reason to recruit these subsets explicitly for a follow-up study, assuming the relationship doesn't just vanish/revert-to-the-mean when you do as it was pure chance after all.)
Realistically, you should weight the apparent significance of 'results' according to how many results you actively looked for. (e.g. had fifty possible things, divide the significance of any positive 'match' by 50. Although I can see reasons why division by 50**2 or ln(50) would be better, depending upon the scenario relevant to this kind of sifting.) But better just to avoid that.
Which might be a big problem in AI-derived research results. You're asking an algorithm to make an uncounted large number of separate interpretations (including of exactly what question it is that you're asking) and then it's returning the 'nicest' results that seemingly fulfil its mysterious 'training' insofar as they aim to be the single/several results from the largely randomised treatments that are judged to be the least incomprehensible. 81.179.200.152 20:00, 23 June 2026 (UTC)

The saddest part of all this is that Randall, in the title text of this comic, might be one of the last persons in the English-speaking world to recognize that "data" is a plural noun, the singular of which (icymi) is "datum". 147.81.27.244 15:38, 23 June 2026 (UTC)

I can assure you that Randall knows full well how "data" operates and that's part of the joke. If all you have is a single data point, you can extrapolate whatever you want from it; which makes it no better than noise or anecdote. 74.202.210.170 19:05, 23 June 2026 (UTC)