3026: Linear Sort

Revision as of 14:25, 21 December 2024

Linear Sort
Title text: The best case is O(n), and the worst case is that someone checks why.

Explanation

This explanation may be incomplete or incorrect: Created in Θ(N) TIME by an iterative Insertion Sorter working on a multidimensional array - Please change this comment when editing this page. Do NOT delete this tag too soon.
If you can address this issue, please edit the page! Thanks.

A common task to have computers do is to sort a list, such as from A to Z or from the smallest to the largest number. People have created dozens of algorithms for this task, from simple to complex, each with its own merits in ease of implementation, memory usage, and efficiency. To compare the last of these, for sorting and for algorithms in general, computer scientists use Big O Notation. There are two aspects of Big O notation to know:

1. It is asymptotic, meaning that it expresses only the basic relationship between the size of the list and the time needed to sort it. O(2n) is written simply as O(n), because Big O is concerned with indicating that the sort time scales linearly, not with giving an exact formula for how long it will take. Likewise, O(2ⁿ + n²) would be written as O(2ⁿ), because as n grows the n² term eventually becomes negligible relative to 2ⁿ (see the short sketch after point 2).

2. Big O notation describes how the runtime grows, not how long it actually takes. The actual figure depends on the hardware, the software, and the initial state of the list: if you needed to sort a pile of books by title, you might be lucky and find most books already in the correct position, needing to move only one or two, or they might be hopelessly jumbled and require moving most, or even all, of them. Big O notation is mainly used to compare different methods of accomplishing the same task on the same hardware, rather than as a real-world benchmark.
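As a quick illustration of the first point (not from the comic, purely illustrative), the small Python snippet below prints n², 2ⁿ and their sum for a few values of n, showing how the n² term quickly becomes a negligible share of the total:

 # Illustrative only: why O(2^n + n^2) is written as just O(2^n).
 for n in (5, 10, 20, 30):
     quadratic = n ** 2
     exponential = 2 ** n
     total = exponential + quadratic
     print(f"n={n:>2}  n^2={quadratic:>4}  2^n={exponential:>11}  n^2 share = {quadratic / total:.6%}")
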

A few common Big O notations are listed below, from smallest to largest, with a short illustrative code sketch after the list:

  • O(1) - Constant time, which means the operation takes roughly the same amount of time no matter how big the data is
  • O(n) - Linear time, which means the execution time grows in direct proportion to the size of the data
  • O(n log(n)) - The execution time grows proportionally to n times the logarithm of n; this grows faster than linear time, but far more slowly than quadratic time
  • O(n²) - Quadratic time, meaning the execution time grows proportionally to the square of the size of the data.
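The snippet below is a rough sketch, with toy Python functions chosen purely for illustration, giving one concrete example of each class:

 # One toy function per growth class (illustrative, not a benchmark).
 def constant_time(items):        # O(1): one step, regardless of list size
     return items[0]
 def linear_time(items):          # O(n): touches every element once
     return max(items)
 def linearithmic_time(items):    # O(n log n): a general-purpose comparison sort
     return sorted(items)
 def quadratic_time(items):       # O(n^2): compares every pair of elements
     return [(a, b) for a in items for b in items if a < b]
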

As one can imagine, in most contexts one would wish for sorting to be done as fast as possible, so O(1) is better than O(n), which is in turn better than O(n log(n)), and so on (at least assuming the lists are reasonably large). The code in the comic describes a 'linear' sort that first sorts the list using merge sort, which is known to take O(n log(n)) time (provably the best growth rate a general-purpose comparison sort can achieve), and then `sleep()`s (pauses with no activity) so that the total time comes to one million (1e6) seconds, more than eleven days, per element: it sleeps for 1e6 times the number of elements, minus the time the sort actually took. The joke is that, rather than creating an algorithm that actually takes O(n) time, it simply disguises an O(n log(n)) algorithm so that it appears to be O(n): the runtime is vastly longer than merge sort's, but it grows linearly with the length of the list. This is obviously not ideal, as waiting around doing nothing is the opposite of optimization.[citation needed]

Strictly speaking, for sufficiently large lists the merge sort itself would take longer than a million seconds per element, at which point the algorithm stops being linear; but this only happens for impossibly huge inputs. If merge sort took n log(n) microseconds (slow by today's standards), it would exceed the million-second-per-element budget only for lists longer than about 2^1,000,000,000,000 ≈ 10^300,000,000,000 elements, a number far larger than the number of atoms in the universe.
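Below is a minimal, runnable Python sketch of the comic's idea. It is not the comic's code: Python's built-in sort stands in for MergeSort, and the one-million-second constant is shrunk to a hypothetical 0.01 seconds per element so the demonstration actually finishes (the sleep is also clamped at zero in case the sort ever exceeds its budget):

 import time
 
 SECONDS_PER_ELEMENT = 0.01   # the comic uses 1e6 seconds; shrunk so this demo finishes
 
 def linear_sort(lst):
     # Sort lst in place, then idle until SECONDS_PER_ELEMENT * len(lst) has elapsed in total.
     start_time = time.monotonic()
     lst.sort()               # stand-in for the comic's MergeSort(list)
     elapsed = time.monotonic() - start_time
     time.sleep(max(0.0, SECONDS_PER_ELEMENT * len(lst) - elapsed))
     return lst
 
 print(linear_sort([3, 1, 4, 1, 5, 9, 2, 6]))   # takes ~0.08 s regardless of initial order
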

This joke is carried on by the title text, which refers to the best and worst case of a sort, additional measures of runtime that describe its shortest and longest potential times. Some sorts can tell how much of a list still needs to be looked at after an initial pass: scanning an already-sorted list and finding nothing to do gives a best case of O(n). Conversely, presenting a list in whatever ordering challenges the algorithm the most (for some algorithms, exactly reversed) may force even an 'average O(n log n)' process into a worst case of around O(n²). It can be very useful to know that a given sorting method may take the average order of time but could also run much faster or slower, especially when the method is expected to be far, far worse than others and only particular, idealized input lets it approach its faster average or best-case behaviour.
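For a concrete (if hedged) illustration of best versus worst case, the sketch below counts the comparisons made by a plain insertion sort: roughly n-1 on an already-sorted list (its O(n) best case) and roughly n²/2 on a reverse-sorted list (its O(n²) worst case). Insertion sort is used here only because the gap is easy to show; it is not the algorithm in the comic.

 def insertion_sort_comparisons(items):
     # Insertion-sort a copy of items and return how many comparisons were made.
     a = list(items)
     comparisons = 0
     for i in range(1, len(a)):
         j = i
         while j > 0:
             comparisons += 1
             if a[j - 1] <= a[j]:
                 break
             a[j - 1], a[j] = a[j], a[j - 1]
             j -= 1
     return comparisons
 
 n = 1000
 print(insertion_sort_comparisons(range(n)))          # ~n-1 comparisons: best case, O(n)
 print(insertion_sort_comparisons(range(n, 0, -1)))   # ~n*(n-1)/2 comparisons: worst case, O(n^2)
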

By forcing every practical sort to take O(n) time, regardless of how the data happens to be pre-ordered, the best case (and the worst case, for that matter) is also O(n). The last part of the title text then plays on another meaning of best case and worst case, the everyday sense of best- and worst-case scenarios: the worst outcome for the code's author is that someone decides to investigate how the code works (perhaps because of its absurd runtime, or out of justified skepticism about the claimed performance), discovers the deception, and ruins the author's reputation.

Transcript

This transcript is incomplete. Please help editing it! Thanks.
[The panel shows five lines of code:]
 function LinearSort(list):
   StartTime=Time()
   MergeSort(list)
   Sleep(1e6*length(list)-(Time()-StartTime))
   return
[Caption below the panel:]
How to sort a list in linear time



Discussion

First in linear time! Mr. I (talk) 13:28, 18 December 2024 (UTC)

Due to the fact that O(nlog(n)) outgrows O(n), the Linear Sort is not actually linear. 162.158.174.227 14:21, 18 December 2024 (UTC)

If your sleep() function can handle negative arguments "correctly", then I guess it could work. 162.158.91.91 16:27, 18 December 2024 (UTC)
It relies on 1 second being long enough to outcompete the maximum input length provided. The joke is that most sort operations that take an entire second or more are considered too slow to be worth doing. 02:30, 22 December 2024 (UTC)

That was fast... Caliban (talk) 15:35, 18 December 2024 (UTC)

Do I even want to know what Randall's thinking nowadays? ⯅A dream demon⯅ (talk) 16:02, 18 December 2024 (UTC)

Does anyone ever want to know what Randall is thinking nowadays? :P 198.41.227.177 22:02, 19 December 2024 (UTC)

The title text would be more correct if Randall used e.g. Timsort instead of Mergesort. They both have the same worst-case complexity O(n*log(n)), but the former is linear if the list was already in order, so best-case complexity is O(n). Mergesort COULD also be implemented this way, but its standard version is never linear. Bebidek (talk) 16:35, 18 December 2024 (UTC)

According to my estimates extrapolated from timing the sorting of 10 million random numbers on my computer, the break-even point where the algorithm becomes worse than linear is beyond the expected heat death of the universe. I did neglect the question of where to store the input array. --162.158.154.35 16:37, 18 December 2024 (UTC)

If the numbers being sorted are unique, each would need a fair number of bits to store. (Fair meaning that the time to do the comparison would be non-negligible.) If they aren't, you can just bucket-sort them in linear time. Since we're assuming absurdly large memory capacity. 162.158.186.253 17:14, 18 December 2024 (UTC)

What system was the person writing the description using where Sleep(n) takes a parameter in whole seconds rather than the usual milliseconds? 172.70.216.162 17:20, 18 December 2024 (UTC)

First, I don't recognize the language, but sleep() takes seconds for python, C (et. al.), and no doubt many others. Second, the units don't have to be seconds, they just have to be whatever `TIME()` returns, and multiplicable by 1e6 to yield a "big enough" delay. Of course, no coefficient is big enough for this to actually be linear in theory for any size list, so who cares? To be truly accurate, sleep for `e^LENGTH(LIST)`, and it really won't much matter what the units are, as long as they're big enough for `SLEEP(e)` to exceed the difference in the time it takes to sort two items versus one item. Use a language-dependent coefficient as needed. Jlearman (talk) 18:02, 18 December 2024 (UTC)
Usual where, is that the Windows API? The sleep function in the POSIX standard takes seconds. See https://man7.org/linux/man-pages/man3/sleep.3.html . 162.158.62.194 18:57, 18 December 2024 (UTC)

If I had a nickel for every time I saw an O(n) sorting algorithm using "sleep"… But this one is actually different. The one I usually see feeds the to-be-sorted value into the sleep function, so it schedules "10" to be printed in 10 seconds, then schedules "3" to be printed in 3 seconds, etc., which would theoretically be linear time, if the sleep function was magic. Fabian42 (talk) 17:25, 18 December 2024 (UTC)
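(A minimal Python sketch of that 'sleep sort' idea, purely for illustration; it leans on the thread scheduler and only behaves for small, non-negative numbers:)

 import threading, time
 
 def sleep_sort(values):
     # Each value waits a time proportional to itself, so smaller values print first.
     def worker(v):
         time.sleep(v)
         print(v)
     threads = [threading.Thread(target=worker, args=(v,)) for v in values]
     for t in threads:
         t.start()
     for t in threads:
         t.join()
 
 sleep_sort([10, 3, 7, 1])   # prints 1, 3, 7, 10 after about 10 seconds
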

This comic also critiques/points out the pitfalls of measuring time complexity using Big-O notation, such as an algorithm or solution that runs in linear time still being too slow for its intended use case. Sophon (talk) 17:46, 18 December 2024 (UTC)

Current text is incorrect, but I'm not sure how best to express the correction -- there do exist O(n) sorting algorithms, they're just not general-purpose, since they don't work with an arbitrary comparison function. See counting sort. 172.69.134.151 18:25, 18 December 2024 (UTC)

Hi! I'm just gonna say this before everyone leaves and goes on their merry way. Significant comic numbers coming soon: Comics 3100, 3200, 3300, etc, Comic 3094 (The total number of frames in 'time'), Comic 4000, Comic Whatever the next April fools day comic will be, and Comic 4096. Wait for it...DollarStoreBa'al (talk) 20:42, 18 December 2024 (UTC)

Comic 3141.592654 172.70.163.144 09:16, 19 December 2024 (UTC)

As everyone observed, the stated algorithm is not theoretically linear, but only practically linear (in that the time and space to detect O(n log n) exceeds reasonable (time, space) bounds for this universe). Munroe's solution is much deeper than that though - it trivially generalises to a _constant_ O(1) bound. [run a sort algorithm, wait 20 years, give the answer]. That's the preferred way of repaying loans, too. 172.69.195.27 (talk) 21:46, 18 December 2024 (UTC) (please sign your comments with ~~~~)

Continues comic 3017's theme of worst-case optimization. 172.70.207.115 00:32, 19 December 2024 (UTC)

It looks as though this function does not actually do the sort in Linear Time, it only returns in Linear Time. The MERGESORT Function itself looks to only take one parameter and does not have an obvious return value indicating that it performs an in-place sort on the input mutable list. This means that the list is sorted at the speed of the MERGESORT function, but flow control is only returned after Linear Time. For a single threaded program calling this function there is no practical difference, but it would make a difference if some other thread was concurrently querying the list. A clearer linear time sort might look like this:

 function LinearSort(list):
   StartTime=Time()
   SortedList=MergeSort(list)
   Sleep(1e6*length(list)-(Time()-StartTime))
   return SortedList

Leon 172.70.162.70 (talk) 17:31, 19 December 2024 (please sign your comments with ~~~~)

There's such a thing as pass-by-reference, variously implemented depending upon the actual programming language used. It's even possible to accept both list (non-reference, to force a return of sorted_list) and listRef (returns nothing, or perhaps a result such as number_of_shuffles made), for added usefulness, though of course that'd need even more pseudocode to describe. For the above/comic pseudocode, it's not so arbitrary that a programmer shouldn't know how to implement it in their instance.
I might even set about to do something like use a SetStartTime() and CheckElapsedTime() function, if there's possible use; the former making a persistent (private variable) note of what =Time() it is, perhaps to an arbitrary record scoped to any parameterID it is supplied, and the latter returning the 'now' time minus the stored (default or explicitly IDed) moment of record. I could then have freely pseudocoded the extant outline in even briefer format, on the understanding what these two poke/peek functions are. Which is already left open to the imagination for MergeSort(). 172.69.43.182 18:04, 19 December 2024 (UTC)

There are situations where you want to return in O(1) time or some other time that is not dependent on the input data to prevent side-channel data leaks. While the run-time of Randall's "O(n)" algorithm has obvious dependencies on the input data, using the "Randall Algorithm" to obscure a different algorithm can reduce the side-channel opportunities. A more sure-fire way would be to have the algorithm return in precisely i seconds, where i is the number of seconds between now and the heat death of the universe. 172.71.167.89 17:49, 19 December 2024 (UTC)
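(A hedged sketch of that padding idea in Python; the fixed two-second budget below is an arbitrary illustrative constant, and the approach only hides timing if the wrapped work always finishes within it:)

 import time
 
 FIXED_BUDGET = 2.0   # seconds; arbitrary illustrative constant
 
 def run_with_fixed_time(fn, *args):
     # Run fn, then sleep so the total wall time is always FIXED_BUDGET.
     start = time.monotonic()
     result = fn(*args)
     time.sleep(max(0.0, FIXED_BUDGET - (time.monotonic() - start)))
     return result
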

Please write an explanation for non-programmers!

I don't understand this explainxkcd. The comic itself was less confusing. Can please someone who really gets this stuff write a section of the explanation that explains the joke to people like me who do not have a theoretical programming degree? I know that is a tall task but right now it reads as rambling and a bunch of 0(n) that makes no sense to me. I can cut and paste a bash script together and make it work. I can understand that putting a sleep for a million seconds in a loop somewhere makes it slow. But a layperson explanation of what makes a sort linear, what is linear, what is funny about that approach, would be better than all the arguing about 0(n) because we don't get it. Thanks in advance! You folks are awesome! 172.71.147.210 20:51, 19 December 2024 (UTC)

Maybe this would be a good start:
--cut here--
An algorithm is a step-by-step way of doing things.
A sorting algorithm is a step-by-step way to sort things.
There are several commonly used sorting algorithms. Some have very little "overhead" (think: set-up time or requiring lots of extra memory) or what I call "molassas" (yes, I just made that up) (think "taking a long time or lots of extra memory for each step") but they really bog down if you have a lot of things that need sorting. These are better if you have a small list of items to sort.
Others have more "overhead" or "molasses" but don't bog down as much when you have a lot of things that need sorting. These are better if you have a lot of things to sort.
A linear sorting algorithm would take twice as long to sort twice as many unsorted items. If it took 100 seconds to sort 100 items, then it would take 200 seconds to sort 200, 300 seconds to sort 300, and so on. Algorithms that take "twice as long to do twice as much" are said to run in "Order(n)" or "O(n)" time, where "n" is the number of items they are working on, or in the case of a sorting algorithm, the number of items to be sorted.
For traditional sorting algorithms that don't use "parallel processing" (that is, they don't do more than one thing in any given moment), a linear sorting algorithm with very little "overhead" or "molasses" would be the "holy grail" of sorting algorithms. For example, a hypothetical linear sorting algorithm that took 1/1000th of a second to "set things up" (low "overhead") and an additional 1 second to sort 1,000,000 numbers (not much "molasses") would be able to sort 2,000,000 numbers in just over 2 seconds, 10,000,000 numbers in just over 10 seconds, and 3,600,000,000 numbers in a hair over an hour.
The reality is that there is no such thing as a general-purpose linear sorting algorithm that has very little overhead (in both time and memory) and very little "molasses." All practical general-purpose sorting algorithms either use parallel processing, they have a lot of overhead (set-up time or uses lots of memory), a lot of "molasses" (takes a long time or uses lots of memory for EACH item in the list) or they are "slower than linear," which means they bog down when you give them a huge list of things to sort. For example, let's say the "mergesort" in Randall's algorithm doesn't have much "overhead" or "molasses" and it sorts 1,000,000 items in 1 second. Its time is "O(nlog(n))" which is a fancy way of saying if you double the number, you'll more than double the time. This means sorting 2,000,000 items will take more than 2 seconds, and sorting 4,000,000 items will take more than twice as long as it takes to sort 2,000,000. Eventually all of those "more than's" add up and things slow to a crawl.
The joke is that Randall "pretends" to be the "holy grail" by being a linear sorting algorithm, but he has lots of "molasses" because his linear sorting algorithm takes 1 million seconds for each item in the list, compared to the 1,000,000 items per second in the hypothetical "linear sorting algorithm" I proposed.
As others in the discussion point out, Randall's "algorithm" is "busted" (breaks, doesn't work, gives undefined results) if the mergesort (which is a very fast sort if you have a large list of items) is sorting a list so big that it takes over 1 million seconds per item to sort anyways. I'll spare you the math, but if the mergesort part of Randall's "algorithm" could do 1,000,000 numbers in 1 second with a 1/1000th of a second to "set things up," it would take a huge list to get it to "bust" Randall's "algorithm."
--cut here--
162.158.174.202 21:44, 19 December 2024 (UTC)
Layman's guide to O(n) time, second try:
--cut here--
First, "O" is "Order of" as in "order of magnitude." It's far from exact.
O(1) is "constant time" - the time it takes me to give you a bag that contains 5000 $1 bills doesn't depend on how many bills there are in the bag. It would take the same amount of time if the bag had only 500, 50, or even 5 bills in it.
O(log(n)) is "logarithmic time" - the time is the time it takes me to write down how many bills are in the bag. If it's 5000, I have to write down 4 digits, if it's 500, 3, if it's 50, 2, if it's 5, only 1.
O(n) is "linear time" - the time it takes me to count out each bill in the bag depends on how many bills there are. It takes a fixed amount of time to count each bill. If there's 5000 $1 bills it may take me 5000 seconds to count them. If there's 500 $1 bills, it will take me only 500 seconds.
O(nlog(n)) is "linear times logarithmic time" - the time it takes me to sort a pre-filled bag of money by serial number using a good general-purpose sorting algorithm (most good general-purpose sorting algorithms are O(nlog(n)) time). If it takes me 2 seconds to sort two $1 bills, it will take me about 3 or 4 times 5000 seconds to sort 5000 $1 bills. The "3 or 4" is very approximate, the important thing is that "logarithm of n" (in this case, logarithm of 5000) is big enough to make a difference (by a factor of 3 or 4 in this case) but far less than "n" (in this case, 5000).
O(n2) is "n squared" time, which is a special case of "polynomial time." "Polynomial time" includes things like O(n3) and O(n1,000,000). Many algorithms including many "naive" sorting algorithms are in this category. If I used a "naive" sorting algorithm to sort 5000 $1 bills by serial number, instead of it taking about 15,000-20,000 seconds, it would take about 5,000 times 5,000 seconds. I don't know about you, but I've got better things to do with 25,000,000 seconds than sort paper money.
It gets worse (O(2n) anyone? No thanks!), but you wanted to keep it simple.
198.41.227.177 23:30, 19 December 2024 (UTC)
Personally, I've got better things to do than sort dollar bills, full stop.172.70.91.130 09:37, 20 December 2024 (UTC)
O() notation is about behavior with large values, not small values. Try the "handing a bag of bills" algorithm with a few million dollar bills. You're going to need a forklift. Getting a forklift is not, in practice, instantaneous. Big O notation is almost always a joke for people trying to solve real problems. It only works on an abstract machine with some really weird (not physically achievable) properties. 162.158.155.141 20:54, 20 December 2024 (UTC)

Friendly reminder that some users of this site are just here to learn what the joke is, and not to read the entire Wikipedia article on Big O Notation. Perhaps the actual explanation could be moved up a bit, and some of the fiddly Big-O stuff could be moved down? I'd do it myself, but I'm not really sure which is which. 172.70.176.28 06:42, 20 December 2024 (UTC)

I mean, it is fairly fundamental to the joke, and therefore to the explanation. It might be possible to slim it down a bit, but I don't think you can explain the joke without some explanation of Big O.172.70.91.130 09:37, 20 December 2024 (UTC)

I've just come to the conclusion that I will never 100% understand 3026. Dogman15 (talk) 10:14, 20 December 2024 (UTC)

Tell me that again when you've actually tried the official process...
 function Understand(comic):
   StartTime=Time()
   ReadExplanation(comic)
   Sleep(1e12*length(comic)-(Time()-StartTime))
   return
172.70.162.56 11:10, 20 December 2024 (UTC)
The article should start off "This is a joke about Big-O notation and sorting algorithms, a topic in introductory computer science education." then continue with something like "An algorithm is computer code for solving a general problem. Big-O notation is a method for describing the efficiency of algorithms." and maybe something like "Randall has designed an algorithm that appears more efficient than commonly considered possible, claiming to solve a popular challenge of many decades, by trying to game how the Big-O approach to analysis ignores the real speed of an algorithm, instead considering how it changes when the data is changed." 172.68.54.209 02:43, 22 December 2024 (UTC)