How Excel Autocorrect Creates Genomic Headaches

According to a recent report, many scientists and researchers have complained about the default autocorrect behavior in Microsoft Excel. Geneticists in particular have been struggling with it, because the feature has been linked to errors in published data. Around one in five genetics journal papers contains errors introduced by the program.

They found that Excel's autocorrect wrongly converts certain gene symbols or gene names into dates or numerical values. For example, SEPT2 (Septin-2) is converted into a date such as “September 2”. The same happens with MARCH1 (short for Membrane Associated Ring Finger (C3HC4) 1, E3 Ubiquitin Protein Ligase).
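To make the pattern concrete, here is a minimal Python sketch that flags gene symbols likely to be read as dates. The list of month-like prefixes and the function names are assumptions made for illustration. They are not Excel's actual parsing rules and will not catch every affected symbol.

```python
import re

# Month-like prefixes a spreadsheet date parser may pick up.
# Assumption for illustration only; not an exhaustive rule set.
MONTH_PREFIXES = (
    "JAN", "FEB", "MAR", "MARCH", "APR", "MAY", "JUN",
    "JUL", "AUG", "SEP", "SEPT", "OCT", "NOV", "DEC",
)

# A month-like prefix followed by a one- or two-digit number, e.g. SEPT2 or MARCH1.
AT_RISK = re.compile(
    r"^(?:%s)[ -]?\d{1,2}$" % "|".join(MONTH_PREFIXES),
    re.IGNORECASE,
)


def is_at_risk(symbol: str) -> bool:
    """Return True if the symbol looks like something a spreadsheet may turn into a date."""
    return bool(AT_RISK.match(symbol.strip()))


def flag_at_risk(symbols):
    """Return the subset of symbols that match the at-risk pattern, in input order."""
    return [s for s in symbols if is_at_risk(s)]
```

Running `flag_at_risk` over a gene list before it goes into a spreadsheet gives a short list of symbols that deserve a manual look afterwards.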

Researchers who published their findings in Genome Biology say that the issue can be avoided by formatting the Excel columns as text, or by using Google Sheets, where the gene names are stored exactly as they were entered.

Quantifying Excel Autocorrect

In 2016, Mark Ziemann and his colleagues in Australia set out to quantify the Excel autocorrect problem. They found that around one in five papers in top genomics journals contained gene name conversion errors in Excel spreadsheets.

Although the issue has been acknowledged and steps have been taken to fix it, the errors are still rife. A follow-up analysis by Ziemann and his team, covering around 11,000 articles published between 2014 and 2020, found that gene name errors remained common.

Ziemann, who researches computational reproducibility in genetics in Australia, says that even a simple cross-check can detect autocorrect errors. Without such checks, the errors pile up as the volume of data in spreadsheets grows.
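A cross-check of this kind could look like the Python sketch below. It scans a gene column for values that look like already-converted entries. It assumes the converted cells appear as short dates such as "2-Sep" or "Sep-02", or in scientific notation such as "2.31E+13". These patterns and the function name are illustrative assumptions, not the method Ziemann's team used.

```python
import re

MONTHS = "Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec"

# Short date forms such as "2-Sep" or "Sep-02" (assumed export formats).
DATE_LIKE = re.compile(
    rf"^(?:\d{{1,2}}-(?:{MONTHS})|(?:{MONTHS})-\d{{1,2}})$",
    re.IGNORECASE,
)

# Scientific notation such as "2.31E+13", which an identifier may be turned into.
SCI_LIKE = re.compile(r"^\d+(?:\.\d+)?E\+\d+$", re.IGNORECASE)


def find_suspect_cells(values):
    """Return (row_index, value) pairs that look like autocorrected entries."""
    suspects = []
    for index, value in enumerate(values):
        text = str(value).strip()
        if DATE_LIKE.match(text) or SCI_LIKE.match(text):
            suspects.append((index, text))
    return suspects
```

An empty result does not prove the column is clean, but any hit points to a row worth comparing against the original data.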

How to Avoid Excel Autocorrect Mistakes

One way to avoid autocorrect mistakes is to stop using Excel for gene data. Other spreadsheet tools, such as LibreOffice and Gnumeric, are suggested as alternatives that do not have this issue, although spreadsheets in general are hard to audit.

Many computational biologists prefer scripting languages such as Python and R, because they do not autocorrect gene symbols. A script also records every step of the analysis, which makes it easier to trace the source of an error. However, users need to know the language well enough to write the code and analyze the data.
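As a small illustration of the point about scripting languages, the Python sketch below reads a gene table with the standard `csv` module, which returns every field as plain text and leaves gene symbols untouched. The sample data and the `read_gene_table` helper are hypothetical. In practice the handle would come from opening a real file.

```python
import csv
import io

# Hypothetical gene table used only for this example.
SAMPLE = "gene,log2fc\nSEPT2,1.8\nMARCH1,-0.7\nTP53,2.1\n"


def read_gene_table(handle):
    """Read rows as dictionaries; the csv module returns every field as a string."""
    return list(csv.DictReader(handle))


rows = read_gene_table(io.StringIO(SAMPLE))
genes = [row["gene"] for row in rows]
print(genes)  # ['SEPT2', 'MARCH1', 'TP53']
```

Numeric columns also arrive as strings here, so any conversion to numbers is an explicit step in the script rather than something the tool decides silently.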

If you are not familiar with such computer programs and languages, you can do a quick check before publishing or sharing the data. 

Published by Julia
