Editing Talk:2298: Coronavirus Genome

 
:...{{w|Epigenetics}} is a real thing—the study of how changes in things other than the genome itself can be passed down between generations. An example is conditioning a mouse to be scared of the smell of oranges/cherries/almonds by having them associate the scent of acetophenone with an electric shock, then testing whether its pups also have the same fear of that smell: they do, but this obviously can't be by the genome itself changing (no component of this has a lot of ionizing radiation{{Citation needed}}). Whatever causes this is the topic of actual epigenetics. --[[User:Volleo6144|Volleo6144]] ([[User talk:Volleo6144|talk]]) 00:12, 25 April 2020 (UTC)
 
 
::I know that, I added the link to the article. But afaik that has nothing to do with how the genome is formatted in Word, and I think it's a pun. [[User:Jacky720|That's right, Jacky720 just signed this]] ([[User talk:Jacky720|talk]] | [[Special:Contributions/Jacky720|contribs]]) 00:31, 25 April 2020 (UTC)
 
 
since when does notepad have spellcheck? [[Special:Contributions/172.68.226.46|172.68.226.46]] 23:05, 24 April 2020 (UTC)
 
:Neither Notepad nor WordPad has spellcheck. I suspect he combined two jokes, and the link between spellcheck and Word was not well established. [[User:Quinoje|Quinoje]] ([[User talk:Quinoje|talk]]) 19:35, 27 April 2020 (UTC)
 
 
: Word does, so maybe she is using Word instead? Kind of contradictory. [[Special:Contributions/172.69.34.46|172.69.34.46]] 23:14, 24 April 2020 (UTC)
 
:I assumed Randall meant Wordpad, which iirc is an upgrade from notepad but has a really thinned out set of Word's features. Maybe there's a spellcheck in there? (haven't used it in ~10 years) [[User:Xseo|Xseo]] ([[User talk:Xseo|talk]]) 07:47, 27 April 2020 (UTC)
 
  
Very disappointed that she's using Notepad and not Notepad++. I mean, really... [[User:Cellocgw|Cellocgw]] ([[User talk:Cellocgw|talk]]) 15:38, 27 April 2020 (UTC)
 
  
When Dr. Theall first scanned Finnegans Wake, he had to tell Microsoft the language was Old Icelandic.
True Story: In the 1980s, as part of the Work Experience initiative at my school, I was assigned to one of my local council's offices (I'd applied for their computer department, but someone else got that). I don't ''think'' the word-processor I used at home (Psion Exchange) had spellcheck, but the one the office used (Lotus? Can't actually recall, but it, like most things, was DOS-based) definitely had, and it was very easy to edit in new words. Inspired by the chemistry lessons I'd recently had, and some 'reports' I was asked to write (keeping the kid busy, more like!) that dealt with chemical degradation of concrete under the action of salt and suchlike, I of course added "NaCl" then absolutely any other chemical formulae I could think of. "H2SO4" was an early one (partial subscript formatting wasn't relevant to the spill-chucker) but I eventually got round to CH4, C2H6, C3H8, etc, and then as many of the derived alcohols, alkenes, alkynes, etc that I could be bothered to type in. Which were a lot. By the end I was 'confident' that nobody would ever type ''any'' correct chemical formula into that machine (no network-shared resources!) and have to worry about false-positive typo alerts. Yeah, well, I was still at school and thought I knew ''everything''. [[Special:Contributions/162.158.159.70|162.158.159.70]] 23:37, 24 April 2020 (UTC)
 
 
The OCR kept trying to spellcheck Finnegans Wake. 15:11, 26 April 2020 (UTC)
 
 
  
 
Can confirm: virus genomes are looked at in notepad. I worked at one of the national laboratories for a summer, experimenting with ways to check for the length of a gene and strength of genetic expression in various circumstances in E. coli. We used notepad because even old computers can open very large files without difficulty, and all our scripts were in Perl, which can easily output to .rtf or .txt file formats. These files are huge, by the way. If you hold down on the scroll bar so it's zooming to the bottom, you could be waiting 20 minutes to reach the end depending on the number of kilobase pairs in your microbe. And epigenetics is not a pun. It's a real word. [[Special:Contributions/172.68.143.192|172.68.143.192]] 00:15, 25 April 2020 (UTC)
 
:''even old computers can open very large files without difficulty'' - Depending on what you mean by "old" and "very large" that may well not be true. In Windows 3.x, Notepad could open files as large as 54Kb, increasing to 64Kb in Windows95, 512Mb in Windows 8 and 1Gb in Windows 10. I don't know which of those would fit a typical virus genome, but I'm guessing it's not all of them. [[Special:Contributions/162.158.187.151|162.158.187.151]] 13:43, 27 April 2020 (UTC)
 
:: Well, SARS-CoV-2 has around 30 kb, and that's considered big already. Since a base is a letter and thus a byte, a viral genome usually fits in the old notepad. But here is the catch: when people align things you get the number multiplied by however many genomes they are looking at. And don't even talk about the {{w|Nucleocytoviricota}}-whatsoever-twats.--[[Special:Contributions/162.158.179.12|162.158.179.12]] 06:11, 5 May 2020 (UTC)
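
::: A rough sketch of that arithmetic in Python, for anyone who wants to play with the numbers. The Notepad limits are simply the figures quoted in the comment above (read here as binary kilobytes, megabytes and gigabytes, which is an assumption), the genome length is the "around 30 kb" estimate, and headers and line breaks are ignored. None of these values were measured.

```python
# All sizes are assumptions taken from this thread, not measured values.
GENOME_BASES = 30_000  # "around 30 kb" for SARS-CoV-2, one byte per base

# Notepad file-size limits as quoted earlier in the discussion.
NOTEPAD_LIMITS = {
    "Windows 3.x": 54 * 1024,
    "Windows 95": 64 * 1024,
    "Windows 8": 512 * 1024**2,
    "Windows 10": 1024**3,
}


def genomes_that_fit(limit_bytes, genome_bases=GENOME_BASES):
    """Whole genomes that fit under the limit, ignoring headers and newlines."""
    return limit_bytes // genome_bases


for version, limit in NOTEPAD_LIMITS.items():
    print(f"{version}: {genomes_that_fit(limit)} genome(s)")
```

::: Under these assumptions a single genome fits even in the oldest Notepad, but an alignment of more than a couple of genomes would not.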
 
  
 
Concurrent to the work in the medical community, work is underway in various open source software communities to fix bugs and other issues with software (eg genome analysis tools) that is useful to the scientists combatting COVID-19. These include the Debian "biohackathon" (https://lwn.net/Articles/816280/) as well as support from Mozilla (https://lwn.net/Articles/816386/). Parallel to these efforts, the FSF (Free Software Foundation) has focused on the shortage of medical equipment: https://lwn.net/Articles/816392/ [[Special:Contributions/108.162.242.5|108.162.242.5]] 00:34, 25 April 2020 (UTC)
 
  
 
I’m suddenly inspired to write a DNA-edit-mode for Emacs (if it doesn’t have it already) which would allow for the virus spell check as described in this comic. [[Special:Contributions/172.69.63.153|172.69.63.153]] 04:16, 25 April 2020 (UTC)
 
: the dna-mode for emacs does exist. Google for it. It is not very useful for real work, though. [[User:Heikkil|Heikkil]] ([[User talk:Heikkil|talk]]) 04:40, 26 April 2020 (UTC)
  
 
Derek Lowe has some insights about actual coronavirus mutations [https://blogs.sciencemag.org/pipeline/archives/2020/04/21/watching-for-mutations-in-the-coronavirus here], if you are interested.
 
  
 
Given coronavirus has an RNA genome, shouldn't all the 'T's be replaced by 'U's?
 
: It is standard practice not to use U's in public sequence databases. It simplifies things. [[User:Heikkil|Heikkil]] ([[User talk:Heikkil|talk]]) 04:40, 26 April 2020 (UTC)
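
:: For illustration, converting between the two spellings is a one-line substitution. This is only a minimal sketch; the function names are made up for this example and do not come from any particular library.

```python
def rna_to_database_alphabet(rna):
    """Write an RNA sequence with T in place of U, as database entries do."""
    return rna.upper().replace("U", "T")


def database_alphabet_to_rna(seq):
    """Turn a T-spelled database sequence back into RNA letters."""
    return seq.upper().replace("T", "U")


print(rna_to_database_alphabet("uacuagcgug"))
print(database_alphabet_to_rna("TACTAGCGTG"))
```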
 
  
The sequence in the transcript does not actually appear on the [https://www.ebi.ac.uk/ena/data/view/MT344963&display=text site] mentioned in the explanation. In fact, when I google for 'TACTAGCGTGCCTTTGTAAGCACAAGCTGATTAGTACGAACTTATGTACTCATTCGTTTCGGAAGAGACAGGTACGTTA' I only get this particular site. {{unsigned ip|141.101.104.221|07:00, April 25, 2020}}
:To find this (or any) sequence, go to [https://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastn&PAGE_TYPE=BlastSearch&LINK_LOC=blasthome Nucleotide Blast] and paste the query into the box. You will receive a list of best matches (10, 50 or 100 in a standard search); the result should look like [https://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web&PAGE_TYPE=BlastSearch&VIEW_SEARCH=on&UNIQ_SEARCH_NAME=A_SearchOptions_1jST3G_gRB_dgzLunnk2EC_23turP_1HUFpP this].

:Interestingly, this is a US-specific strain of the virus (top result currently is "Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/NC_0025/2020"). [[User:Tier666|Tier666]] ([[User talk:Tier666|talk]]) 23:21, 25 April 2020 (UTC)
 
:Well, ''obviously'' it's a new variant, yet unknown to other clinical studies. Of RNA that has switched to looking like DNA, so this is a hot discovery! [[Special:Contributions/162.158.159.142|162.158.159.142]] 12:05, 25 April 2020 (UTC)
 
  
: The site shows several views into the public database entry that are easier for humans to understand than the raw sequence. Click the link at 'View: TEXT' and scroll down. The relevant lines look like this:
{{#tag:pre|
     aatccagtaa tggaaccaat ttatgatgaa ccgacgacga ctactagcgt gcctttgtaa    26220
     gcacaagctg attagtacga acttatgtac tcattcgttt cggaagagac aggtacgtta    26280}}
 
:As you can see, these are not meant to be searched for and compared in "a notepad". For the same reason, Google does not index DNA sequence database entries. There are specialised tools for that.
 
:The sequences were published this month, so they are available only in the most recent sequence database updates. [[User:Heikkil|Heikkil]] ([[User talk:Heikkil|talk]]) 04:40, 26 April 2020 (UTC)
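
::To make the formatting point concrete, here is a small Python sketch that strips the spacing and position numbers from the two lines quoted above and then looks for the sequence from the transcript. The helper names are invented for this example, and the clean-up rule (drop whitespace and digits, lowercase the rest) is an assumption that suits these two lines rather than a full flat-file parser.

```python
import re

# The two lines quoted from the database entry's text view.
FLAT_FILE_LINES = """\
     aatccagtaa tggaaccaat ttatgatgaa ccgacgacga ctactagcgt gcctttgtaa    26220
     gcacaagctg attagtacga acttatgtac tcattcgttt cggaagagac aggtacgtta    26280
"""

# The sequence from the transcript, as posted in this discussion.
QUERY = ("TACTAGCGTGCCTTTGTAAGCACAAGCTGATTAGTACGAACTTATGTACTCATTCGTTT"
         "CGGAAGAGACAGGTACGTTA")


def flatten_sequence(text):
    """Remove whitespace and position numbers, leaving one lowercase string."""
    return re.sub(r"[\s\d]", "", text).lower()


def find_in_flat_file(text, query):
    """Return the 0-based offset of query in the flattened text, or -1."""
    return flatten_sequence(text).find(query.lower())


# A plain text search fails because of the spaces and line breaks...
print(QUERY.lower() in FLAT_FILE_LINES)
# ...but the flattened sequence does contain the query.
print(find_in_flat_file(FLAT_FILE_LINES, QUERY))
```

::The query spans the end of the first line and the whole of the second, which is why a literal search of the page text cannot find it.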
 
 
 
I have had trouble opening .txt files of even a hundred KB in Notepad! Sometimes it even crashes... It's one of the reasons I started using Notepad++. Notepad++ also happens to have a very extensible spellcheck, & language-specific formatting options. Since I often need to use Windows machines, it's one of my most frequently installed apps, after 7Zip.
 
[[User:ProphetZarquon|ProphetZarquon]] ([[User talk:ProphetZarquon|talk]]) 18:03, 25 April 2020 (UTC)
 
  
The Grammar Checker concept only has a {{w|Colorless_green_ideas_sleep_furiously|limited analytical sophistication}}, though I don't doubt it'd still be enough to get a Nobel given the complexity of the task of deriving trivially feasible sequences from total codswallop. I also added the "next step" (probably much more than a single step), when I revised things, but that might actually be overstepping the explanation of the comic and removable. [[Special:Contributions/162.158.155.122|162.158.155.122]] 20:32, 25 April 2020 (UTC)
:Thanks for mentioning this in the discussion area, as I wondered what that "next step" line meant when I read it a little while ago, let alone how it related to the comic.  I'll go ahead and trim that last "next step" sentence off the end, as I think it is unnecessary. [[User:Ianrbibtitlht|Ianrbibtitlht]] ([[User talk:Ianrbibtitlht|talk]]) 03:36, 26 April 2020 (UTC)
 
  
Is using Notepad to analyse RNA sequences more or less sane than using a spam filter to play chess? - [[User:Angel|Angel]] ([[User talk:Angel|talk]]) 00:43, 27 April 2020 (UTC)
: Is that filter used to prevent emails pretending to be from Czech mates looking to give you a knight to remember in a message full of pawn images? [[Special:Contributions/162.158.158.211|162.158.158.211]] 15:10, 27 April 2020 (UTC)
 
  
Just stumbled on this. I wonder if Japanese spell checker tech (like many [https://en.wikipedia.org/wiki/Logogram logographic scripts], words aren't separated by whitespace) would work for strings of nucleotide letters. Normally, you try to match the longest possible strings with algorithms like BLAST, but maybe the spellcheckers get so much optimization that they're more efficient. Or maybe spellcheckers should use BLAST. [[User:Ericprud|Ericprud]] ([[User talk:Ericprud|talk]]) 18:04, 23 November 2022 (UTC)
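
: A toy sketch of the longest-match idea, in Python: greedy dictionary segmentation of a string that has no whitespace. This is not how BLAST or any real spellchecker works, and the "dictionary" of motifs below is entirely hypothetical; it only shows what matching the longest possible known string at each position looks like.

```python
def segment_longest_match(text, dictionary):
    """Greedily split text into the longest known words, left to right.

    Returns (piece, known) pairs; characters that start no known word
    are emitted one at a time with known=False.
    """
    max_len = max((len(word) for word in dictionary), default=0)
    pieces = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shorter ones.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in dictionary:
                pieces.append((candidate, True))
                i += size
                break
        else:
            # No dictionary word starts here: flag a single character.
            pieces.append((text[i], False))
            i += 1
    return pieces


# Hypothetical "vocabulary" of motifs, made up for this example.
motifs = {"ATG", "TAA", "ATGAAA"}
print(segment_longest_match("ATGAAACTAA", motifs))
```

: Pieces flagged as unknown play the role of the red squiggle; a real tool would need a far better dictionary and a tolerance for mismatches.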
