<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
		<id>https://www.explainxkcd.com/wiki/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Heikkil</id>
		<title>explain xkcd - User contributions [en]</title>
		<link rel="self" type="application/atom+xml" href="https://www.explainxkcd.com/wiki/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Heikkil"/>
		<link rel="alternate" type="text/html" href="https://www.explainxkcd.com/wiki/index.php/Special:Contributions/Heikkil"/>
		<updated>2026-04-19T13:31:44Z</updated>
		<subtitle>User contributions</subtitle>
		<generator>MediaWiki 1.30.0</generator>

	<entry>
		<id>https://www.explainxkcd.com/wiki/index.php?title=Talk:2298:_Coronavirus_Genome&amp;diff=191244</id>
		<title>Talk:2298: Coronavirus Genome</title>
		<link rel="alternate" type="text/html" href="https://www.explainxkcd.com/wiki/index.php?title=Talk:2298:_Coronavirus_Genome&amp;diff=191244"/>
				<updated>2020-04-26T04:40:21Z</updated>
		
		<summary type="html">&lt;p&gt;Heikkil: added signatures&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!--Please sign your posts with ~~~~ and don't delete this text. New comments should be added at the bottom.--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Epigenetics is a pun, right? I think it's a pun but I don't know what and it's maddening. [[User:Jacky720|That's right, Jacky720 just signed this]] ([[User talk:Jacky720|talk]] | [[Special:Contributions/Jacky720|contribs]]) 23:03, 24 April 2020 (UTC)&lt;br /&gt;
:...{{w|Epigenetics}} is a real thing&amp;amp;mdash;the study of how changes in things other than the genome itself can be passed down between generations. An example is conditioning a mouse to be scared of the smell of oranges/cherries/almonds by having them associate the scent of acetophenone with an electric shock, then testing whether its pups also have the same fear of that smell: they do, but this obviously can't be by the genome itself changing (no component of this has a lot of ionizing radiation{{Citation needed}}). Whatever causes this is the topic of actual epigenetics. --[[User:Volleo6144|Volleo6144]] ([[User talk:Volleo6144|talk]]) 00:12, 25 April 2020 (UTC)&lt;br /&gt;
::I know that, I added the link to the article. But afaik that has nothing to do with how the genome is formatted in Word, and I think it's a pun. [[User:Jacky720|That's right, Jacky720 just signed this]] ([[User talk:Jacky720|talk]] | [[Special:Contributions/Jacky720|contribs]]) 00:31, 25 April 2020 (UTC)&lt;br /&gt;
&lt;br /&gt;
since when does notepad have spellcheck? [[Special:Contributions/172.68.226.46|172.68.226.46]] 23:05, 24 April 2020 (UTC)&lt;br /&gt;
: Word does, so maybe she is using Word instead? Kind of contradictory. [[Special:Contributions/172.69.34.46|172.69.34.46]] 23:14, 24 April 2020 (UTC)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
True Story: In the 1980s, as part of the Work Experience initiative at my school, I was assigned to one of my local council's offices (I'd applied for their computer department, but someone else got that). I don't ''think'' the word-processor I used at home (Psion Exchange) had spellcheck, but the one the office used (Lotus? Can't actually recall, but it, like most things, was DOS-based) definitely had, and it was very easy to edit in new words. Inspired by the chemistry lessons I'd recently had, and some 'reports' I was asked to write (keeping the kid busy, more like!) that dealt with chemical degradation of concrete under the action of salt and suchlike, I of course added &amp;quot;NaCl&amp;quot; then absolutely any other chemical formulae I could think of. &amp;quot;H2SO4&amp;quot; was an early one (partial subscript formatting wasn't relevent to the spill-chucker) but I eventually got round to CH4, C2H6, C3H8, etc, and then as many of the derived alcohols, alkenes, alkynes, etc that I could be bothered to type in. Which were a lot. By the end I was 'confident' that nobody would ever type ''any'' correct chemical formula into that machine (no network-shared resources!) and have to worry about false-positive typo alerts. Yeah, well, I was still at school and thought I knew ''everything''. [[Special:Contributions/162.158.159.70|162.158.159.70]] 23:37, 24 April 2020 (UTC)&lt;br /&gt;
&lt;br /&gt;
Can confirm: virus genomes are looked at in notepad. I worked at one of the national laboratories for a summer, experimenting with ways to check for the length of a gene and strength of genetic expression in various circumstances in E. coli. We used notepad because even old computers can open very large files without difficulty, and all our scripts were in Perl, which can easily output to .rtf or .txt file formats. These files are huge, by the way. If you hold down on the scroll bar so it's zooming to the bottom, you could be waiting 20 minutes to reach the end depending on the number of kilobase pairs in your microbe. And epigenetics is not a pun. It's a real word. [[Special:Contributions/172.68.143.192|172.68.143.192]] 00:15, 25 April 2020 (UTC)&lt;br /&gt;
&lt;br /&gt;
Concurrent to the work in the medical community, work is underway in various open source software communities to fix bugs and other issues with software (eg genome analysis tools) that is useful to the scientists combatting COVID-19. These include the Debian &amp;quot;biohackathon&amp;quot; (https://lwn.net/Articles/816280/) as well as support from Mozilla (https://lwn.net/Articles/816386/). Parallel to these efforts, the FSF (Free Software Foundation) has focused on the shortage of medical equipment: https://lwn.net/Articles/816392/ [[Special:Contributions/108.162.242.5|108.162.242.5]] 00:34, 25 April 2020 (UTC)&lt;br /&gt;
&lt;br /&gt;
I’m suddenly inspired to write a DNA-edit-mode for Emacs (if it doesn’t have it already) which would allow for the virus spell check as described in this comic. [[Special:Contributions/172.69.63.153|172.69.63.153]] 04:16, 25 April 2020 (UTC)&lt;br /&gt;
&lt;br /&gt;
- The dna-mode for Emacs does exist. Google for it. It is not very useful for real work, though. [[User:Heikkil|Heikkil]] ([[User talk:Heikkil|talk]]) 04:40, 26 April 2020 (UTC)&lt;br /&gt;
&lt;br /&gt;
Derek Lowe has some insights about actual coronavirus mutations [https://blogs.sciencemag.org/pipeline/archives/2020/04/21/watching-for-mutations-in-the-coronavirus here], if you are interested.&lt;br /&gt;
&lt;br /&gt;
Given coronavirus has an RNA genome, shouldn't all the 'T's be replaced by 'U's?&lt;br /&gt;
&lt;br /&gt;
- It is standard practice not to use U's in public sequence databases. It simplifies things. [[User:Heikkil|Heikkil]] ([[User talk:Heikkil|talk]]) 04:40, 26 April 2020 (UTC)&lt;br /&gt;
&lt;br /&gt;
The sequence in the transcript does not actually appear on the [https://www.ebi.ac.uk/ena/data/view/MT344963&amp;amp;display=text site] mentioned in the explanation. In fact, when I google for 'TACTAGCGTGCCTTTGTAAGCACAAGCTGATTAGTACGAACTTATGTACTCATTCGTTTCGGAAGAGACAGGTACGTTA' I only get this particular site.&lt;br /&gt;
:&lt;br /&gt;
UNSIGNED COMMENT: PLEASE SIGN WITH &amp;quot;&amp;lt;nowiki&amp;gt;~~~~&amp;lt;/nowiki&amp;gt;&amp;quot; &lt;br /&gt;
:To find this (or any) sequence, go to [https://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastn&amp;amp;PAGE_TYPE=BlastSearch&amp;amp;LINK_LOC=blasthome Nucleotide BLAST] and paste the query into the box. You will receive a list of the best matches (10, 50 or 100 in a standard search), which should look like [https://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web&amp;amp;PAGE_TYPE=BlastSearch&amp;amp;VIEW_SEARCH=on&amp;amp;UNIQ_SEARCH_NAME=A_SearchOptions_1jST3G_gRB_dgzLunnk2EC_23turP_1HUFpP this]. &lt;br /&gt;
Interestingly, this is a US-specific strain of the virus (top result currently is &amp;quot;Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/NC_0025/2020&amp;quot;). [[User:Tier666|Tier666]] ([[User talk:Tier666|talk]]) 23:21, 25 April 2020 (UTC)&lt;br /&gt;
:Well, ''obviously'' it's a new variant, yet unknown to other clinical studies. Of RNA that has switched to looking like DNA, so this is a hot discovery! [[Special:Contributions/162.158.159.142|162.158.159.142]] 12:05, 25 April 2020 (UTC)&lt;br /&gt;
&lt;br /&gt;
: The site shows several views into the public database entry that are easier for humans to understand than the raw sequence. Click the link at 'View: TEXT' and scroll down. The relevant lines look like this:&lt;br /&gt;
&lt;br /&gt;
     aatccagtaa tggaaccaat ttatgatgaa ccgacgacga ctactagcgt gcctttgtaa     26220&lt;br /&gt;
&lt;br /&gt;
     gcacaagctg attagtacga acttatgtac tcattcgttt cggaagagac aggtacgtta     26280&lt;br /&gt;
&lt;br /&gt;
As you can see, these are not meant to be searched for and compared in &amp;quot;a notepad&amp;quot;. For the same reason, Google does not index DNA sequence database entries. There are specialised tools for that.&lt;br /&gt;
&lt;br /&gt;
The sequences were published this month, so they are available only in the most recent sequence database updates. [[User:Heikkil|Heikkil]] ([[User talk:Heikkil|talk]]) 04:40, 26 April 2020 (UTC)&lt;br /&gt;
 &lt;br /&gt;
&lt;br /&gt;
I have had trouble opening .txt files of even a hundred KB in Notepad! Sometimes it even crashes... It's one of the reasons I started using Notepad++. Notepad++ also happens to have a very extensible spellcheck, &amp;amp; language-specific formatting options. Since I often need to use Windows machines, it's one of my most frequently installed apps, after 7Zip.&lt;br /&gt;
[[User:ProphetZarquon|ProphetZarquon]] ([[User talk:ProphetZarquon|talk]]) 18:03, 25 April 2020 (UTC)&lt;br /&gt;
&lt;br /&gt;
The Grammar Checker concept only has a {{w|Colorless_green_ideas_sleep_furiously|limited analytical sophistication}}, though I don't doubt it'd still be enough to get a Nobel given the complexity of the task of deriving trivially feasible sequences from total codswallop. I also added the &amp;quot;next step&amp;quot; (probably much more than a single step), when I revised things, but that might actually be overstepping the explanation of the comic and removable. [[Special:Contributions/162.158.155.122|162.158.155.122]] 20:32, 25 April 2020 (UTC)&lt;br /&gt;
:Thanks for mentioning this in the discussion area, as I wondered what that &amp;quot;next step&amp;quot; line meant when I read it a little while ago, let alone how it related to the comic.  I'll go ahead and trim that last &amp;quot;next step&amp;quot; sentence off the end, as I think it is unnecessary. [[User:Ianrbibtitlht|Ianrbibtitlht]] ([[User talk:Ianrbibtitlht|talk]]) 03:36, 26 April 2020 (UTC)&lt;/div&gt;</summary>
		<author><name>Heikkil</name></author>	</entry>

	<entry>
		<id>https://www.explainxkcd.com/wiki/index.php?title=2298:_Coronavirus_Genome&amp;diff=191225</id>
		<title>2298: Coronavirus Genome</title>
		<link rel="alternate" type="text/html" href="https://www.explainxkcd.com/wiki/index.php?title=2298:_Coronavirus_Genome&amp;diff=191225"/>
				<updated>2020-04-25T14:39:42Z</updated>
		
		<summary type="html">&lt;p&gt;Heikkil: /* Explanation */  Clarify the analogy of grammar checking&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{comic&lt;br /&gt;
| number    = 2298&lt;br /&gt;
| date      = April 24, 2020&lt;br /&gt;
| title     = Coronavirus Genome&lt;br /&gt;
| image     = coronavirus_genome.png&lt;br /&gt;
| titletext = Spellcheck has been great, but whoever figures out how to get grammar check to work is guaranteed a Nobel.&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==Explanation==&lt;br /&gt;
{{incomplete|Created by a NOBEL IN SPELLCHECKING. Do NOT delete this tag too soon.}}&lt;br /&gt;
This comic is another in a [[:Category:COVID-19|series of comics]] related to the {{w|2019–20 coronavirus outbreak|2020 pandemic}} of the {{w|coronavirus}} {{w|SARS-CoV-2}}, which causes {{w|COVID-19}}.&lt;br /&gt;
&lt;br /&gt;
[[Megan]] is a {{w|Genetics|geneticist}} doing research on the SARS-CoV-2 virus. She is analyzing the virus's {{w|genome}}, its genetic material composed of {{w|RNA}}. The genomic sequence can be represented as a list of {{w|nucleotide}} bases ({{w|guanine}}, {{w|adenine}}, {{w|cytosine}}, {{w|thymine}} and {{w|uracil}} - often abbreviated as G, A, C, T, and U).&lt;br /&gt;
&lt;br /&gt;
The nucleotide sequence displayed currently finds a 100% match to six SARS-CoV-2 sequences in public databases, all of them originating from the US East Coast. The sequence is from nucleotides 26202-26280 of the virus genome and overlaps an unknown open reading frame/gene named ORF3a. One of the matching sequences is [https://www.ebi.ac.uk/ena/data/view/MT344963].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Cueball]] is surprised that she and her colleagues actually use {{w|Microsoft Notepad}}, a simple {{w|text editor}}, to look at the genome, instead of more modern technology.  She explains that better research institutions use {{w|Microsoft Word}}, a more advanced editor, to allow additional formatting (such as '''bolding''' and ''italics''), and humorously calls this &amp;quot;{{w|epigenetics}}&amp;quot;.  In the real world, epigenetics is the study of changes that are not caused by changes to the nucleotides themselves, but by other chemical modifications to DNA and chromosomes that alter patterns of gene expression and activation, often for many generations.  This might be considered analogous to altering the meaning of a text by changing its formatting rather than the content; for example, content can be moved into parentheses or footnotes to be de-emphasized, or placed in bold and made large to attract attention and emphasize key points.  Much as text can be wrapped in HTML tags or similar markup to change its formatting, nucleotides can be {{w|DNA methylation|methylated}} to prevent transcription, and the {{w|histone}}s around which DNA is wound can also be modified to promote or repress gene expression.&lt;br /&gt;
&lt;br /&gt;
The real punchline comes when Megan uses {{w|Spell checker|spellcheck}} to detect mutations in the genome by adding the previous genome to spellcheck and comparing them. Overall, Megan uses ridiculously and humorously crude methods to analyze a major genetic item.  The genome of SARS-CoV-2 is almost 30,000 base-pairs long, which far exceeds the {{w|longest words}} of any natural language and may exceed the capabilities of any available spell-checking program.&lt;br /&gt;
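As a rough illustration of the comparison that the spellcheck joke stands in for, the following Python sketch checks a sample against a known reference and reports single-base substitutions. The function name find_substitutions and the mutated sample are made up for this example; real work relies on alignment tools such as BLAST, which also cope with insertions and deletions.

```python
def find_substitutions(reference, sample):
    """Return (position, reference_base, sample_base) for each mismatch.

    Positions are 1-based. Both sequences must have the same length,
    so this sketch covers substitutions only, not insertions or deletions.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must have equal length")
    return [
        (position, ref_base, new_base)
        for position, (ref_base, new_base) in enumerate(zip(reference, sample), start=1)
        if ref_base != new_base
    ]


# First 20 bases of the sequence shown in the comic.
reference = "TACTAGCGTGCCTTTGTAAG"
# Hypothetical sample with one base changed by hand (position 5: A to G).
sample = "TACTGGCGTGCCTTTGTAAG"

for position, ref_base, new_base in find_substitutions(reference, sample):
    print(f"position {position}: {ref_base} to {new_base}")
```

A position-by-position comparison like this only works when the two sequences are already lined up, which is one reason dedicated alignment software is used instead.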
&lt;br /&gt;
The title text mentions {{w|Grammar checker|grammar checking}} and claims that whoever discovers how to use that to compare genomic material should be awarded a {{w|Nobel Prize}}. Spell-checking is analogous to comparing sequences to see their differences and similarities, which is the bread and butter of bioinformatics nowadays. Grammar checking would be analogous to being able to understand the chemical and biological function of a sequence straight from its nucleotides, something we are unable to do at the moment except in a very limited way and in a few simple cases.&lt;br /&gt;
&lt;br /&gt;
==Transcript==&lt;br /&gt;
{{incomplete transcript|Do NOT delete this tag too soon.}}&lt;br /&gt;
&lt;br /&gt;
:[Megan sits at a desk, working on a laptop. A genome sequence is displayed on her laptop screen, shown with a jagged line in a text bubble.]&lt;br /&gt;
:Cueball (off-screen): So that's the coronavirus genome, huh?&lt;br /&gt;
:Megan: It is!&lt;br /&gt;
:Laptop: TACTAGCGTGCCTTTGTAAGCACAAGCTGATTAGTACGAACTTATGTACTCATTCGTTTCGGAAGAGACAGGTACGTTA&lt;br /&gt;
&lt;br /&gt;
:[Cueball walks up and stands behind Megan, still working on the laptop.]&lt;br /&gt;
:Cueball: It's weird that you can just look at it in a text editor.&lt;br /&gt;
:Megan: It's essential!&lt;br /&gt;
:Megan: We geneticists do most of our work in Notepad.&lt;br /&gt;
&lt;br /&gt;
:[A frameless panel, Cueball still standing behind Megan.]&lt;br /&gt;
:Cueball: Notepad?&lt;br /&gt;
:Megan: Yup! Nicer labs use Word, which lets you change the genome font size and make nucleotides bold or italic.&lt;br /&gt;
:Cueball: Ah, okay.&lt;br /&gt;
:Megan: That extra formatting is called &amp;quot;epigenetics&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
:[A regular panel, Cueball still stands behind Megan. He has his hand on his chin.]&lt;br /&gt;
:Cueball: Hey, why does that one have a red underline?&lt;br /&gt;
:Megan: When we identify a virus, we add its genome to spellcheck. That's how we spot mutations.&lt;br /&gt;
:Cueball: ''Clever!''&lt;br /&gt;
&lt;br /&gt;
{{comic discussion}}&lt;br /&gt;
[[Category: Comics featuring Cueball]]&lt;br /&gt;
[[Category: Comics featuring Megan]]&lt;br /&gt;
[[Category: Biology]]&lt;br /&gt;
[[Category:COVID-19]]&lt;/div&gt;</summary>
		<author><name>Heikkil</name></author>	</entry>

	<entry>
		<id>https://www.explainxkcd.com/wiki/index.php?title=2298:_Coronavirus_Genome&amp;diff=191223</id>
		<title>2298: Coronavirus Genome</title>
		<link rel="alternate" type="text/html" href="https://www.explainxkcd.com/wiki/index.php?title=2298:_Coronavirus_Genome&amp;diff=191223"/>
				<updated>2020-04-25T13:45:00Z</updated>
		
		<summary type="html">&lt;p&gt;Heikkil: Corrected explanation of epigenetics&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{comic&lt;br /&gt;
| number    = 2298&lt;br /&gt;
| date      = April 24, 2020&lt;br /&gt;
| title     = Coronavirus Genome&lt;br /&gt;
| image     = coronavirus_genome.png&lt;br /&gt;
| titletext = Spellcheck has been great, but whoever figures out how to get grammar check to work is guaranteed a Nobel.&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==Explanation==&lt;br /&gt;
{{incomplete|Created by a NOBEL IN SPELLCHECKING. Do NOT delete this tag too soon.}}&lt;br /&gt;
This comic is another in a [[:Category:COVID-19|series of comics]] related to the {{w|2019–20 coronavirus outbreak|2020 pandemic}} of the {{w|coronavirus}} {{w|SARS-CoV-2}}, which causes {{w|COVID-19}}.&lt;br /&gt;
&lt;br /&gt;
[[Megan]] is a {{w|Genetics|geneticist}} doing research on the SARS-CoV-2 virus. She is analyzing the virus's {{w|genome}}, its genetic material composed of {{w|RNA}}. The genomic sequence can be represented as a list of {{w|nucleotide}} bases ({{w|guanine}}, {{w|adenine}}, {{w|cytosine}}, {{w|thymine}} and {{w|uracil}} - often abbreviated as G, A, C, T, and U).&lt;br /&gt;
&lt;br /&gt;
The nucleotide sequence displayed currently finds a 100% match to six SARS-CoV-2 sequences in public databases, all of them originating from the US East Coast. The sequence is from nucleotides 26202-26280 of the virus genome and overlaps an unknown open reading frame/gene named ORF3a. One of the matching sequences is [https://www.ebi.ac.uk/ena/data/view/MT344963].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Cueball]] is surprised that she and her colleagues actually use {{w|Microsoft Notepad}}, a simple {{w|text editor}}, to look at the genome, instead of more modern technology.  She explains that better research institutions use {{w|Microsoft Word}}, a more advanced editor, to allow additional formatting (such as '''bolding''' and ''italics''), and humorously calls this &amp;quot;{{w|epigenetics}}&amp;quot;.  In the real world, epigenetics is the study of changes that are not caused by changes to the nucleotides themselves, but by other chemical modifications to DNA and chromosomes that alter patterns of gene expression and activation, often for many generations.  This might be considered analogous to altering the meaning of a text by changing its formatting rather than the content; for example, content can be moved into parentheses or footnotes to be de-emphasized, or placed in bold and made large to attract attention and emphasize key points.  Much as text can be wrapped in HTML tags or similar markup to change its formatting, nucleotides can be {{w|DNA methylation|methylated}} to prevent transcription, and the {{w|histone}}s around which DNA is wound can also be modified to promote or repress gene expression.&lt;br /&gt;
&lt;br /&gt;
Is this a pun on &amp;quot;gene editing&amp;quot;, as with CRISPR-Cas9?&lt;br /&gt;
&lt;br /&gt;
The real punchline comes when Megan uses {{w|Spell checker|spellcheck}} to detect mutations in the genome by adding the previous genome to spellcheck and comparing them. Overall, Megan uses ridiculously and humorously crude methods to analyze a major genetic item.  The genome of SARS-CoV-2 is almost 30,000 base-pairs long, which far exceeds the {{w|longest words}} of any natural language and may exceed the capabilities of any available spell-checking program.&lt;br /&gt;
&lt;br /&gt;
The title text mentions {{w|Grammar checker|grammar checking}} and claims that whoever discovers how to use that to compare genomic material should be awarded a {{w|Nobel Prize}}. Spell-checking could identify (space-delimited) lengths of genetic code that have never been seen before, but grammar checking could be used to identify whether known sequences of bases make no sense as a larger sequence (a gene, or even a whole organism), which is potentially a very big question among geneticists.&lt;br /&gt;
&lt;br /&gt;
==Transcript==&lt;br /&gt;
{{incomplete transcript|Do NOT delete this tag too soon.}}&lt;br /&gt;
&lt;br /&gt;
:[Megan sits at a desk, working on a laptop. A genome sequence is displayed on her laptop screen, shown with a jagged line in a text bubble.]&lt;br /&gt;
:Cueball (off-screen): So that's the coronavirus genome, huh?&lt;br /&gt;
:Megan: It is!&lt;br /&gt;
:Laptop: TACTAGCGTGCCTTTGTAAGCACAAGCTGATTAGTACGAACTTATGTACTCATTCGTTTCGGAAGAGACAGGTACGTTA&lt;br /&gt;
&lt;br /&gt;
:[Cueball walks up and stands behind Megan, still working on the laptop.]&lt;br /&gt;
:Cueball: It's weird that you can just look at it in a text editor.&lt;br /&gt;
:Megan: It's essential!&lt;br /&gt;
:Megan: We geneticists do most of our work in Notepad.&lt;br /&gt;
&lt;br /&gt;
:[A frameless panel, Cueball still standing behind Megan.]&lt;br /&gt;
:Cueball: Notepad?&lt;br /&gt;
:Megan: Yup! Nicer labs use Word, which lets you change the genome font size and make nucleotides bold or italic.&lt;br /&gt;
:Cueball: Ah, okay.&lt;br /&gt;
:Megan: That extra formatting is called &amp;quot;epigenetics&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
:[A regular panel, Cueball still stands behind Megan. He has his hand on his chin.]&lt;br /&gt;
:Cueball: Hey, why does that one have a red underline?&lt;br /&gt;
:Megan: When we identify a virus, we add its genome to spellcheck. That's how we spot mutations.&lt;br /&gt;
:Cueball: ''Clever!''&lt;br /&gt;
&lt;br /&gt;
{{comic discussion}}&lt;br /&gt;
[[Category: Comics featuring Cueball]]&lt;br /&gt;
[[Category: Comics featuring Megan]]&lt;br /&gt;
[[Category: Biology]]&lt;br /&gt;
[[Category:COVID-19]]&lt;/div&gt;</summary>
		<author><name>Heikkil</name></author>	</entry>

	<entry>
		<id>https://www.explainxkcd.com/wiki/index.php?title=Talk:2298:_Coronavirus_Genome&amp;diff=191222</id>
		<title>Talk:2298: Coronavirus Genome</title>
		<link rel="alternate" type="text/html" href="https://www.explainxkcd.com/wiki/index.php?title=Talk:2298:_Coronavirus_Genome&amp;diff=191222"/>
				<updated>2020-04-25T13:30:31Z</updated>
		
		<summary type="html">&lt;p&gt;Heikkil: Some bioinformatics facts added&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!--Please sign your posts with ~~~~ and don't delete this text. New comments should be added at the bottom.--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Epigenetics is a pun, right? I think it's a pun but I don't know what and it's maddening. [[User:Jacky720|That's right, Jacky720 just signed this]] ([[User talk:Jacky720|talk]] | [[Special:Contributions/Jacky720|contribs]]) 23:03, 24 April 2020 (UTC)&lt;br /&gt;
:...{{w|Epigenetics}} is a real thing&amp;amp;mdash;the study of how changes in things other than the genome itself can be passed down between generations. An example is conditioning a mouse to be scared of the smell of oranges/cherries/almonds by having them associate the scent of acetophenone with an electric shock, then testing whether its pups also have the same fear of that smell: they do, but this obviously can't be by the genome itself changing (no component of this has a lot of ionizing radiation{{Citation needed}}). Whatever causes this is the topic of actual epigenetics. --[[User:Volleo6144|Volleo6144]] ([[User talk:Volleo6144|talk]]) 00:12, 25 April 2020 (UTC)&lt;br /&gt;
::I know that, I added the link to the article. But afaik that has nothing to do with how the genome is formatted in Word, and I think it's a pun. [[User:Jacky720|That's right, Jacky720 just signed this]] ([[User talk:Jacky720|talk]] | [[Special:Contributions/Jacky720|contribs]]) 00:31, 25 April 2020 (UTC)&lt;br /&gt;
&lt;br /&gt;
since when does notepad have spellcheck? [[Special:Contributions/172.68.226.46|172.68.226.46]] 23:05, 24 April 2020 (UTC)&lt;br /&gt;
: Word does, so maybe she is using Word instead? Kind of contradictory. [[Special:Contributions/172.69.34.46|172.69.34.46]] 23:14, 24 April 2020 (UTC)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
True Story: In the 1980s, as part of the Work Experience initiative at my school, I was assigned to one of my local council's offices (I'd applied for their computer department, but someone else got that). I don't ''think'' the word-processor I used at home (Psion Exchange) had spellcheck, but the one the office used (Lotus? Can't actually recall, but it, like most things, was DOS-based) definitely had, and it was very easy to edit in new words. Inspired by the chemistry lessons I'd recently had, and some 'reports' I was asked to write (keeping the kid busy, more like!) that dealt with chemical degradation of concrete under the action of salt and suchlike, I of course added &amp;quot;NaCl&amp;quot; then absolutely any other chemical formulae I could think of. &amp;quot;H2SO4&amp;quot; was an early one (partial subscript formatting wasn't relevent to the spill-chucker) but I eventually got round to CH4, C2H6, C3H8, etc, and then as many of the derived alcohols, alkenes, alkynes, etc that I could be bothered to type in. Which were a lot. By the end I was 'confident' that nobody would ever type ''any'' correct chemical formula into that machine (no network-shared resources!) and have to worry about false-positive typo alerts. Yeah, well, I was still at school and thought I knew ''everything''. [[Special:Contributions/162.158.159.70|162.158.159.70]] 23:37, 24 April 2020 (UTC)&lt;br /&gt;
&lt;br /&gt;
Can confirm: virus genomes are looked at in notepad. I worked at one of the national laboratories for a summer, experimenting with ways to check for the length of a gene and strength of genetic expression in various circumstances in E. coli. We used notepad because even old computers can open very large files without difficulty, and all our scripts were in Perl, which can easily output to .rtf or .txt file formats. These files are huge, by the way. If you hold down on the scroll bar so it's zooming to the bottom, you could be waiting 20 minutes to reach the end depending on the number of kilobase pairs in your microbe. And epigenetics is not a pun. It's a real word. [[Special:Contributions/172.68.143.192|172.68.143.192]] 00:15, 25 April 2020 (UTC)&lt;br /&gt;
&lt;br /&gt;
Concurrent to the work in the medical community, work is underway in various open source software communities to fix bugs and other issues with software (eg genome analysis tools) that is useful to the scientists combatting COVID-19. These include the Debian &amp;quot;biohackathon&amp;quot; (https://lwn.net/Articles/816280/) as well as support from Mozilla (https://lwn.net/Articles/816386/). Parallel to these efforts, the FSF (Free Software Foundation) has focused on the shortage of medical equipment: https://lwn.net/Articles/816392/ [[Special:Contributions/108.162.242.5|108.162.242.5]] 00:34, 25 April 2020 (UTC)&lt;br /&gt;
&lt;br /&gt;
I’m suddenly inspired to write a DNA-edit-mode for Emacs (if it doesn’t have it already) which would allow for the virus spell check as described in this comic. [[Special:Contributions/172.69.63.153|172.69.63.153]] 04:16, 25 April 2020 (UTC)&lt;br /&gt;
&lt;br /&gt;
- The dna-mode for Emacs does exist. Google for it. It is not very useful for real work, though.&lt;br /&gt;
&lt;br /&gt;
Derek Lowe has some insights about actual coronavirus mutations [https://blogs.sciencemag.org/pipeline/archives/2020/04/21/watching-for-mutations-in-the-coronavirus here], if you are interested.&lt;br /&gt;
&lt;br /&gt;
Given coronavirus has an RNA genome, shouldn't all the 'T's be replaced by 'U's?&lt;br /&gt;
&lt;br /&gt;
- It is standard practice not to use U's in public sequence databases. It simplifies things.&lt;br /&gt;
&lt;br /&gt;
The sequence in the transcript does not actually appear on the [https://www.ebi.ac.uk/ena/data/view/MT344963&amp;amp;display=text site] mentioned in the explanation. In fact, when I google for 'TACTAGCGTGCCTTTGTAAGCACAAGCTGATTAGTACGAACTTATGTACTCATTCGTTTCGGAAGAGACAGGTACGTTA' I only get this particular site.&lt;br /&gt;
:Well, ''obviously'' it's a new variant, yet unknown to other clinical studies. Of RNA that has switched to looking like DNA, so this is a hot discovery! [[Special:Contributions/162.158.159.142|162.158.159.142]] 12:05, 25 April 2020 (UTC)&lt;br /&gt;
&lt;br /&gt;
- The site shows several views into the public database entry that are easier for humans to understand than the raw sequence. Click the link at 'View: TEXT' and scroll down. The relevant lines look like this:&lt;br /&gt;
&lt;br /&gt;
     aatccagtaa tggaaccaat ttatgatgaa ccgacgacga ctactagcgt gcctttgtaa     26220&lt;br /&gt;
&lt;br /&gt;
     gcacaagctg attagtacga acttatgtac tcattcgttt cggaagagac aggtacgtta     26280&lt;br /&gt;
&lt;br /&gt;
As you can see, these are not meant to be searched for and compared in &amp;quot;a notepad&amp;quot;. For the same reason, Google does not index DNA sequence database entries. There are specialised tools for that.&lt;br /&gt;
&lt;br /&gt;
The sequences were published this month, so they are available only in the most recent sequence database updates.&lt;/div&gt;</summary>
		<author><name>Heikkil</name></author>	</entry>

	<entry>
		<id>https://www.explainxkcd.com/wiki/index.php?title=2298:_Coronavirus_Genome&amp;diff=191219</id>
		<title>2298: Coronavirus Genome</title>
		<link rel="alternate" type="text/html" href="https://www.explainxkcd.com/wiki/index.php?title=2298:_Coronavirus_Genome&amp;diff=191219"/>
				<updated>2020-04-25T09:52:40Z</updated>
		
		<summary type="html">&lt;p&gt;Heikkil: The origin of the nucleotide sequence shown explained&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{comic&lt;br /&gt;
| number    = 2298&lt;br /&gt;
| date      = April 24, 2020&lt;br /&gt;
| title     = Coronavirus Genome&lt;br /&gt;
| image     = coronavirus_genome.png&lt;br /&gt;
| titletext = Spellcheck has been great, but whoever figures out how to get grammar check to work is guaranteed a Nobel.&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==Explanation==&lt;br /&gt;
{{incomplete|Created by a NOBEL IN SPELLCHECKING. Do NOT delete this tag too soon.}}&lt;br /&gt;
This comic is another in a [[:Category:COVID-19|series of comics]] related to the {{w|2019–20 coronavirus outbreak|2020 pandemic}} of the {{w|coronavirus}} {{w|SARS-CoV-2}}, which causes {{w|COVID-19}}.&lt;br /&gt;
&lt;br /&gt;
[[Megan]] is a {{w|Genetics|geneticist}} doing research on the SARS-CoV-2 virus. She is analyzing the virus's {{w|genome}}, its genetic material composed of {{w|RNA}}. The genomic sequence can be represented as a list of {{w|nucleotide}} bases ({{w|guanine}}, {{w|adenine}}, {{w|cytosine}}, {{w|thymine}} and {{w|uracil}} - often abbreviated as G, A, C, T, and U).&lt;br /&gt;
&lt;br /&gt;
The nucleotide sequence displayed currently finds a 100% match to six SARS-CoV-2 sequences in public databases, all of them originating from the US East Coast. The sequence is from nucleotides 26202-26280 of the virus genome and overlaps an unknown open reading frame/gene named ORF3a. One of the matching sequences is [https://www.ebi.ac.uk/ena/data/view/MT344963].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Cueball]] is surprised that she and her colleagues actually use {{w|Microsoft Notepad}}, a simple {{w|text editor}}, to look at the genome, instead of more modern technology.  She explains that better research institutions use {{w|Microsoft Word}}, a more advanced editor, to allow additional formatting (such as '''bolding''' and ''italics''), and humorously calls this &amp;quot;{{w|epigenetics}}&amp;quot;.  In the real world, epigenetics is the study of changes that are not caused by direct changes to the genome itself, but in patterns of gene expression and activation.  This might be considered analogous to altering the meaning of a text by changing its formatting rather than the content; for example, content can be moved into parentheses or footnotes to be de-emphasized, or placed in bold and made large to attract attention and emphasize key points.  Much as text can be wrapped in HTML tags or similar markup to change its formatting, nucleotides can be {{w|DNA methylation|methylated}} to prevent transcription, and the {{w|histone}}s around which DNA is wound can also be modified to promote or repress gene expression.&lt;br /&gt;
&lt;br /&gt;
Is this a pun on &amp;quot;gene editing&amp;quot;, as with CRISPR-Cas9?&lt;br /&gt;
&lt;br /&gt;
The real punchline comes when Megan uses {{w|Spell checker|spellcheck}} to detect mutations in the genome by adding the previous genome to spellcheck and comparing them. Overall, Megan uses ridiculously and humorously crude methods to analyze a major genetic item.  The genome of SARS-CoV-2 is almost 30,000 base-pairs long, which far exceeds the {{w|longest words}} of any natural language and may exceed the capabilities of any available spell-checking program.&lt;br /&gt;
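&lt;br /&gt;
The comparison that Megan hands off to spellcheck can be sketched in a few lines of code. The example below is only an illustration under simple assumptions: the function name spot_mutations and the ten-letter toy sequences are made up for this sketch, both sequences are assumed to be the same length, and only single-base substitutions are reported.&lt;br /&gt;

```python
def spot_mutations(reference, sample):
    """Return (position, reference_base, sample_base) for each mismatch.

    Positions are 1-based, as in genome coordinates. Both sequences must
    have the same length; insertions and deletions are not handled.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must have equal length")
    return [
        (position, ref_base, new_base)
        for position, (ref_base, new_base) in enumerate(
            zip(reference, sample), start=1
        )
        if ref_base != new_base
    ]


# Toy data for illustration only, not a real viral sequence.
reference = "TACTAGCGTG"
sample = "TACTAGCATG"
print(spot_mutations(reference, sample))  # [(8, 'G', 'A')]
```

Real sequence comparison tools first align the two sequences, so that an insertion or deletion does not shift every later position out of step; this sketch skips that step.&lt;br /&gt;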
&lt;br /&gt;
The title text mentions {{w|Grammar checker|grammar checking}} and claims that whoever discovers how to use that to compare genomic material should be awarded a {{w|Nobel Prize}}. Spell-checking could identify (space-delimited) lengths of genetic code that have never been seen before, but grammar checking could be used to identify whether known sequences of bases make no sense as a larger sequence (a gene, or even a whole organism), which is potentially a very big question among geneticists.&lt;br /&gt;
&lt;br /&gt;
==Transcript==&lt;br /&gt;
{{incomplete transcript|Do NOT delete this tag too soon.}}&lt;br /&gt;
&lt;br /&gt;
:[Megan sits at a desk, working on a laptop. A genome sequence is displayed on her laptop screen, shown with a jagged line in a text bubble.]&lt;br /&gt;
:Cueball (off-screen): So that's the coronavirus genome, huh?&lt;br /&gt;
:Megan: It is!&lt;br /&gt;
:Laptop: TACTAGCGTGCCTTTGTAAGCACAAGCTGATTAGTACGAACTTATGTACTCATTCGTTTCGGAAGAGACAGGTACGTTA&lt;br /&gt;
&lt;br /&gt;
:[Cueball walks up and stands behind Megan, still working on the laptop.]&lt;br /&gt;
:Cueball: It's weird that you can just look at it in a text editor.&lt;br /&gt;
:Megan: It's essential!&lt;br /&gt;
:Megan: We geneticists do most of our work in Notepad.&lt;br /&gt;
&lt;br /&gt;
:[A frameless panel, Cueball still standing behind Megan.]&lt;br /&gt;
:Cueball: Notepad?&lt;br /&gt;
:Megan: Yup! Nicer labs use Word, which lets you change the genome font size and make nucleotides bold or italic.&lt;br /&gt;
:Cueball: Ah, okay.&lt;br /&gt;
:Megan: That extra formatting is called &amp;quot;epigenetics&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
:[A regular panel, Cueball still stands behind Megan. He has his hand on his chin.]&lt;br /&gt;
:Cueball: Hey, why does that one have a red underline?&lt;br /&gt;
:Megan: When we identify a virus, we add its genome to spellcheck. That's how we spot mutations.&lt;br /&gt;
:Cueball: ''Clever!''&lt;br /&gt;
&lt;br /&gt;
{{comic discussion}}&lt;br /&gt;
[[Category: Comics featuring Cueball]]&lt;br /&gt;
[[Category: Comics featuring Megan]]&lt;br /&gt;
[[Category: Biology]]&lt;br /&gt;
[[Category:COVID-19]]&lt;/div&gt;</summary>
		<author><name>Heikkil</name></author>	</entry>

	</feed>