Difference between revisions of "Talk:2239: Data Error"

Latest revision as of 18:40, 22 January 2020


Randall's comics are usually relevant to recent events on or near the day comics are posted. I was wondering if this Data Error comic might be referencing some recent event, some data error at NASA or something. Does anyone know what it might be in reference to? 108.162.219.40 21:13, 9 December 2019 (UTC) ... Sorry, forgot to sign in. Saibot84 21:14, 9 December 2019 (UTC)

I'm not aware of anything in the news. However, this is not the first time Randall has commented on research publication in a comic, so I suspect it's just another in that series. It seems obvious that he feels the first option is the appropriate choice, and the second option is the joke. Ianrbibtitlht (talk) 21:22, 9 December 2019 (UTC)
I believe there was a relatively recent issue where a Python script used for processing datasets assumed that the host operating system would return data files in a particular order. That assumption turned out not to hold on every platform, which threw off the results of several analyses. Could he be referring to that? The scripts in question were used to produce results in cyanobacteria studies... https://arstechnica.com/information-technology/2019/10/chemists-discover-cross-platform-python-scripts-not-so-cross-platform/ 162.158.34.222 15:03, 13 December 2019 (UTC)
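As an illustration of the file-ordering pitfall described above, here is a minimal sketch, not the actual scripts from the article. The file names, the `.out` extension, and the helper `list_data_files` are hypothetical. It relies only on the documented fact that `os.listdir` returns entries in arbitrary order, so any analysis that depends on file order should sort explicitly.

```python
import os
import tempfile


def list_data_files(directory):
    """Return the .out file names in `directory` in a deterministic order.

    os.listdir() makes no promise about ordering, so the names are
    sorted lexicographically before being returned.
    """
    return sorted(name for name in os.listdir(directory) if name.endswith(".out"))


# Hypothetical data files, created in a deliberately non-alphabetical order.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("run_b.out", "run_a.out", "run_c.out", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()
    files = list_data_files(tmp)

print(files)
```

Sorting by name is only one possible choice. The point is that the order is fixed by the script rather than left to the operating system.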

I think the stickwoman is not "excited" but sarcastic, although you can't be sure in text. It is a joke based on the discrepancy in capabilities between real scientists and fictional mad scientists. 108.162.238.119 22:23, 9 December 2019 (UTC)

I agree, Megan is being a smart-ass. 108.162.245.202 15:46, 10 December 2019 (UTC)
For a start, "mad scientists" are usually more like mad engineers ... you can't get world domination by researching something and writing a paper about it, you need to USE that research, usually by building something. -- Hkmaly (talk) 23:10, 9 December 2019 (UTC)
Are you suggesting scientists can't build things? I don't actually know, since I'm an engineer! Ianrbibtitlht (talk) 23:43, 9 December 2019 (UTC)

What is a data error in general? Please explain the term :) 172.69.22.74 02:39, 10 December 2019 (UTC)

The discovery that the data you used was sampled below the Nyquist rate (less than twice the highest frequency present in the signal) pretty much kills your thesis until you can get data that was properly acquired. All your results will be contaminated with artifacts produced by the sampling rate, rather than by variations in the quantity that you imagined you were observing. 173.245.52.209 12:37, 10 December 2019 (UTC)
I thought I knew what a data error is, but after that reply I'm not sure - although I'm almost sure that it did not help the one asking the question ;-) --Kynde (talk) 15:55, 10 December 2019 (UTC)
Well, that is a type of data error (bad sampling technique), but not the only type. The data itself could have had corruption problems, such as maybe some rogue second species of algae contaminated the samples, etc. 172.69.62.46 21:39, 10 December 2019 (UTC)
Also, malfunctioning or miscalibrated measuring equipment (transducers, cabling, etc.) would be another type of data error. Ianrbibtitlht (talk) 22:17, 10 December 2019 (UTC)
More about data errors. Yes, I listed just one kind, and a fellow I knew had to re-do his thesis because of that particular error. The careful researcher investigates many possible sources of error. The poor researcher simply throws away the data points that do not match his preconceptions. HERE WE GO, enumerating some errors: (1) Noise from physically sloppy equipment. (2) Lack of calibration of measuring device. (3) Device loses calibration over time. (4) Manually recorded data errors, such as transposed digits. (5) Incorrect assumptions of linearity in the design of measurement. (6) Failure to record crucial environmental parameters. [That's just six minutes of thinking. Surely there are others.]
Yes, I omitted an important source of error: Sabotage! You're not paranoid, someone really is messing with your data. 162.158.79.113 01:34, 11 December 2019 (UTC)
So, a data error is an error in your data, instead of in your analysis? 172.68.132.107 11:35, 11 December 2019 (UTC)
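To make the undersampling comment at the top of this thread concrete, here is a small sketch with made-up numbers (a 9 Hz signal sampled at 10 Hz; neither figure comes from the discussion). At that rate the samples of the 9 Hz sine are exactly the negated samples of a 1 Hz sine, so the recorded data cannot tell the two apart. That is the kind of sampling artifact the comment describes.

```python
import math


def sample_sine(freq_hz, rate_hz, n_samples):
    """Sample sin(2*pi*f*t) at t = n / rate_hz for n = 0 .. n_samples - 1."""
    return [math.sin(2 * math.pi * freq_hz * n / rate_hz) for n in range(n_samples)]


rate_hz = 10.0                          # assumed sampling rate, too low for a 9 Hz signal
fast = sample_sine(9.0, rate_hz, 20)    # the signal actually present
slow = sample_sine(1.0, rate_hz, 20)    # the alias it masquerades as

# Every 9 Hz sample equals the negated 1 Hz sample (up to rounding error).
indistinguishable = all(
    math.isclose(f, -s, abs_tol=1e-9) for f, s in zip(fast, slow)
)
print(indistinguishable)
```

Sampling at more than 18 Hz, twice the signal frequency, would remove this ambiguity for the 9 Hz component.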

If it were merely an error in analysis (see the recent mess with Python, https://arstechnica.com/information-technology/2019/10/chemists-discover-cross-platform-python-scripts-not-so-cross-platform/ ), then you simply fix your analysis code and re-run. So, yes, a "data error" means the original data values were flawed or invalid or whatever. Most likely sabotage inflicted by sophons. Cellocgw (talk) 12:29, 11 December 2019 (UTC)

I'm happy that he said "two options" instead of "two choices", which of course would involve around four options. Watching the horrific Star Trek: Discovery for completist purposes, I was annoyed when someone said "you have only one alternative" when they meant "you have only one option". — Kazvorpal (talk) 18:39, 22 January 2020 (UTC)