Human Genes Renamed To Stop Excel From Reading Them As Dates

Scientists build charts, lists and calculations with tools such as SQL, R, Python, Hadoop, LaTeX and, of course, Excel, which is a handy way to track work and even to help conduct clinical trials. Geneticists, however, long faced one problem that made Excel difficult to use: by default, the program automatically converts certain gene symbols into dates. A prime example is MARCH1 (membrane associated ring-CH-type finger 1). When scientists typed it into a cell, Excel automatically converted it to the date 1-Mar. Thankfully, scientists have come up with a solution: renaming a total of 27 genes to avoid the incompatibility with Excel.

How Big Was The Problem?

The difficulty is that official gene names cannot simply be deleted or altered in a spreadsheet, because scientists need them intact in order to search by official name or abbreviation. Researchers did their best to work around the conversion, for instance by adding a note to a cell to indicate that the apparent date was in fact a gene, then deleting the note once it was no longer required (right-click the cell and select Delete Note). They could also adjust their searches, looking for the date instead of the correct gene symbol. It is easy to see, however, how such workarounds would fail in documents that are shared or in continual use. Renaming the genes whose symbols resemble month names was therefore seen as a simple way to avoid arbitrary workarounds.

Help At Hand From The HUGO Gene Nomenclature Committee

A 2016 study by Ziemann et al. found that approximately one-fifth of all papers with Excel gene lists contain erroneous gene name conversions, because Excel inadvertently converts gene symbols to dates. The researchers said, “the kind of errors we describe can be spotted by copying the column of gene names and pasting it into a new sheet, and then sorting the column. Any gene symbols converted to dates will appear as numbers at the top of the column.” The HUGO Gene Nomenclature Committee (HGNC) has since renamed the 27 affected genes and published new guidelines for naming genes, which address symbols that affect data handling and retrieval.
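The same kind of check can be approximated outside Excel. The Python sketch below is only an illustration, not the method used in the study: it assumes the gene column has already been read into a list of strings, and the function name find_suspect_entries, the sample values and the two patterns (text such as 1-Mar, or a purely numeric cell such as a date stored as a number) are hypothetical choices made for this example.

```python
import re

# Month abbreviations as they appear in converted cells such as "1-Mar".
MONTHS = "Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec"

# Assumption: a converted symbol shows up either as day-month text ("1-Mar")
# or as a purely numeric cell (a date stored as a number).
DATE_LIKE = re.compile(rf"^\d{{1,2}}-({MONTHS})$", re.IGNORECASE)
NUMERIC = re.compile(r"^\d+(\.\d+)?$")


def find_suspect_entries(gene_column):
    """Return (row_index, value) pairs that look like dates or numbers
    rather than gene symbols. Hypothetical helper for illustration."""
    suspects = []
    for index, value in enumerate(gene_column):
        text = str(value).strip()
        if DATE_LIKE.match(text) or NUMERIC.match(text):
            suspects.append((index, text))
    return suspects


# Illustrative sample column; the values are made up for this example.
genes = ["TP53", "1-Mar", "BRCA1", "43160", "2-Sep"]
print(find_suspect_entries(genes))
# prints [(1, '1-Mar'), (3, '43160'), (4, '2-Sep')]
```

A flagged row only shows that a value no longer looks like a gene symbol. The original symbol still has to be recovered from the source data.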

Genetic scientists need no longer fear that their data will be erroneously converted into dates. Potentially problematic gene names have been changed, and new guidelines have been published. This will enable data lists to remain unaffected by conversion, easing searches for the relevant genes.