# Editing 2118: Normal Distribution



Latest revision | Your text | ||

Line 13: | Line 13: | ||

Many statistical samplings resemble a pattern called a "{{w|normal distribution}}". A theoretically perfect normal distribution would have an infinite sample size and infinitely small bins. That would produce a bar chart matching the shape of the curve in the comic. | Many statistical samplings resemble a pattern called a "{{w|normal distribution}}". A theoretically perfect normal distribution would have an infinite sample size and infinitely small bins. That would produce a bar chart matching the shape of the curve in the comic. | ||

− | The area between two vertical lines of the distribution represents the probability that | + | The area between two vertical lines of the distribution represents the probability that the value is between the x-values of the lines, and the total area is 1. Randall finds the area between two ''horizontal'' lines instead, which is statistically meaningless, because the y-axis of a probability distribution represents {{w|probability density function|probability density}}, not a quantity that is itself being sampled (although half of the area under the curve does lie between the two lines). The items counted at any given horizontal position are indistinguishable, unordered, and interchangeable; the idea that one could be above another is meaningless, and the fact that two items happen to fall at the same position on the y-axis doesn't mean they have anything in common. So, the comic explores the humor of annoying people by deliberately misunderstanding their work. |

− | + | An alternative explanation is that Randall has invented a new probability distribution, that we could call the ''tangent distribution'' (from the title text), the ''Munroe distribution'', or something of the sort. This distribution is defined as follows: consider the area between the curve in the comic and the horizontal axis, and consider a random point (X, Y) uniformly distributed in that region. Then X has the normal distribution and Y has the tangent distribution. Areas between vertical lines in the comic give probabilities about X, and areas between horizontal lines in the comic give probabilities about Y. So the comic gives a correct statement that the interval of Y values that is 52.682% of the range of Y centered at the midpoint of the range has probability 1/2. Great! Except this distribution has never been discussed before because it has no known application. Moreover, it makes no sense to talk about intervals centered at the midpoint of the range because the distribution of Y is not symmetric: the midpoint of the range is neither the mean, the median, nor the mode. So even if this distribution were interesting, the probability in the comic is not a good way to describe it! We do use such intervals for the normal distribution because the normal distribution is symmetric, and the center of symmetry is the mean, median, and mode. | |
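The 52.682% figure can be checked numerically. The following is a minimal sketch, not anything taken from the comic: it assumes a standard normal curve, measures heights as a fraction `t` of the peak, and uses hypothetical helper names (`area_below_height`, `band_probability`, `bisect`). The area below a horizontal line is the two tails outside the points where the line meets the curve plus the rectangle under the line between them.

```python
import math

def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def area_below_height(t):
    """P(Y <= t * peak) for a point (X, Y) uniform under the curve, 0 <= t <= 1."""
    if t <= 0.0:
        return 0.0
    if t >= 1.0:
        return 1.0
    x = math.sqrt(-2.0 * math.log(t))    # the curve has height t * peak at +x and -x
    tails = 2.0 * (1.0 - normal_cdf(x))  # under the curve, outside [-x, x]
    rectangle = 2.0 * x * normal_pdf(x)  # under the horizontal line, inside [-x, x]
    return tails + rectangle

def band_probability(w):
    """Area between two horizontal lines centered on half the peak height,
    separated by a fraction w of the peak height."""
    return area_below_height(0.5 + w / 2.0) - area_below_height(0.5 - w / 2.0)

def bisect(func, target, lo=0.0, hi=1.0, tol=1e-12):
    """Solve func(v) == target for an increasing func on [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if func(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

band_width = bisect(band_probability, 0.5)      # band holding half of the area
median_height = bisect(area_below_height, 0.5)  # median of Y as a fraction of the peak

print(f"band width as a fraction of the peak height: {band_width:.5f}")
print(f"median of Y as a fraction of the peak height: {median_height:.5f}")
```

Under these assumptions the band width should come out close to the 52.682% quoted above, and the median of Y should fall well below the midpoint of the range, which illustrates the asymmetry. Both fractions are unchanged by the mean and standard deviation of the normal curve, since shifting or rescaling the x-axis stretches every horizontal slice by the same factor.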

− | + | The title text refers to the {{w|Normal (geometry)|normal line}}, which is perpendicular to the {{w|tangent}} line at a given point. Given a shape of interest, a normal line points perpendicularly away from it at a point, making a 90-degree angle with it in all directions, while a tangent line touches it at a point and runs exactly parallel to it there. The normal line is not at all related to the normal distribution, as the former is a geometry concept and the latter is a probability/statistics one. Saying this to a statistician would only annoy the statistician further. The title text plays on the fact that the diagram divides the graph with horizontal lines when such a division would usually be done with perpendicular vertical lines. | |

− | + | This is annoying to a probabilist or statistician because the terms ''normal'' and ''tangent'' come from differential geometry and have no established meaning in probability theory. Even the word ''perpendicular'' has no established meaning in probability theory. Of course, the x and y coordinates in the comic are perpendicular (orthogonal) coordinates, but that says nothing about X and Y as random variables. The most obvious probabilistic meanings for "perpendicular" or "orthogonal" are {{w|Independence (probability theory)|independent}}, which even uses a symbol related to the geometric symbol for perpendicularity, and {{w|Uncorrelatedness (probability theory)|uncorrelated}}, which makes centered random variables orthogonal vectors in the Hilbert space of random variables that are square integrable with respect to the underlying probability measure. X and Y are not independent, since the range of possible X values depends on Y, although by the left-right symmetry of the curve they do happen to be uncorrelated. | |

− | |||

− | This is annoying to a statistician because the terms ''normal'' and ''tangent'' come from differential geometry and have no established meaning in probability theory. Even the word ''perpendicular'' has no established meaning in probability theory. Of course, the x and y coordinates in the comic are perpendicular (orthogonal) coordinates, but that says nothing about X and Y as random variables. The most obvious probabilistic meanings for "perpendicular" or "orthogonal" are {{w|Independence (probability theory)|independent}}, which even uses a symbol related to the geometric symbol for perpendicularity, and {{w|Uncorrelatedness (probability theory)|uncorrelated}}, which makes centered random variables orthogonal vectors in the Hilbert space of random variables that are square integrable with respect to the underlying probability measure. X and Y are not independent, since the range of possible X values depends on Y, although by the left-right symmetry of the curve they do happen to be uncorrelated. | ||

So the more probability and statistics you know, the more annoying this comic becomes. It is not just about confusing novices. | So the more probability and statistics you know, the more annoying this comic becomes. It is not just about confusing novices. | ||

==Transcript== | ==Transcript== | ||

+ | {{incomplete transcript|Do NOT delete this tag too soon.}} | ||

+ | |||

:[A bell curve of a normal distribution, with the area between two horizontal lines shaded.] | :[A bell curve of a normal distribution, with the area between two horizontal lines shaded.] | ||

+ | |||

:[The center of the chart is marked between the two lines:] | :[The center of the chart is marked between the two lines:] | ||

Line 36: | Line 37: | ||

:[A label on the outside of the graph, describing the distance between the two lines:] | :[A label on the outside of the graph, describing the distance between the two lines:] | ||

:"Remember, 50% of the distribution falls between these two lines!" | :"Remember, 50% of the distribution falls between these two lines!" | ||

+ | |||

:[Caption below the panel:] | :[Caption below the panel:] |