How many genes do humans have? Geneticists publish a new count

Author: Zong Hua Release Date: 2018-06-26

More than a decade after the completion of the Human Genome Project, identifying genes remains a challenge.

The earliest attempts to estimate the number of genes in the human genome involved drunk geneticists, a bar in Cold Spring Harbor, New York, and pure speculation.

That was in 2000. At the time, the draft sequence of the human genome was still being assembled. Geneticists were betting on how many genes humans have, with guesses ranging from tens of thousands to hundreds of thousands. Nearly 20 years later, scientists armed with real data still cannot agree on the number. In their view, this knowledge gap has hampered efforts to find disease-related mutations.

The latest effort to fill this gap used data from hundreds of human tissue samples and was posted on the preprint server bioRxiv a few days ago. It includes nearly 5,000 previously undiscovered genes, of which nearly 1,200 carry instructions for making proteins. Its total of more than 21,000 protein-coding genes is significantly higher than previous estimates, which put the number at around 20,000.

However, many geneticists still do not believe that all the newly proposed genes can stand up to careful scrutiny. Their criticism emphasizes the difficulty of identifying new genes or even defining a gene.

"People have been working on this for 20 years, but we still don't have an answer," said Steven Salzberg, a computational biologist at Johns Hopkins University, whose team led the latest count.

In 2000, with the genomics community preoccupied by the question of how many human genes would be found, Ewan Birney launched the GeneSweep competition. Birney, now co-director of the European Bioinformatics Institute (EBI), took the first bets in a bar during an annual genetics conference.

The competition eventually attracted more than 1,000 participants and a $3,000 jackpot. Bets on the number of genes ranged from more than 312,000 to just under 26,000, with an average of around 40,000. Since then, the range of estimates has narrowed, but disagreement remains.

Gene counts vary depending on the data analyzed, the tools used, and the criteria for screening out false hits. The latest count draws on a larger data set, a computational method that differs from previous efforts, and broader criteria for defining a gene.

The Salzberg team used data from the Genotype-Tissue Expression (GTEx) project, which sequenced RNA from more than 30 different tissues collected from hundreds of cadavers. RNA is the intermediary between DNA and protein. The researchers wanted to identify both genes that encode proteins and genes that do not but still play an important role in cells. To this end, they assembled GTEx's 900 billion short RNA fragments and aligned them with the human genome.

However, the fact that a stretch of DNA is expressed as RNA does not mean that it is a gene. The team therefore tried to filter out noise using a variety of criteria. For example, they compared their results with the genomes of other species, reasoning that sequences shared with distant relatives were probably preserved during evolution because they are useful, and so are likely to be genes.

The researchers arrived at 21,306 protein-coding genes and 21,856 non-coding genes, far more than are listed in the two most widely used human gene databases. The GENCODE gene set, maintained by EBI, includes 19,901 protein-coding genes and 15,779 non-coding genes. The RefSeq database, managed by the National Center for Biotechnology Information (NCBI), has 20,203 protein-coding genes and 17,871 non-coding genes.
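To make the size of the gap concrete, the counts quoted above can be compared directly. The short Python sketch below is only an illustration: the figures come from this article, while the dictionary layout and the `difference` helper are assumptions made for the example and are not taken from any of the three projects.

```python
# Gene counts as quoted in this article: (protein-coding, non-coding).
# The data structure and helper below are illustrative assumptions.
catalogs = {
    "Salzberg team": (21306, 21856),
    "GENCODE": (19901, 15779),
    "RefSeq": (20203, 17871),
}


def difference(counts, a, b):
    """Return (protein-coding, non-coding) counts of catalog a minus catalog b."""
    return (counts[a][0] - counts[b][0], counts[a][1] - counts[b][1])


if __name__ == "__main__":
    for name in ("GENCODE", "RefSeq"):
        coding_gap, noncoding_gap = difference(catalogs, "Salzberg team", name)
        print(f"Salzberg team vs {name}: "
              f"{coding_gap:+,} protein-coding, {noncoding_gap:+,} non-coding")
```

By these figures, the new catalog lists 1,405 more protein-coding genes than GENCODE and 1,103 more than RefSeq. These are simple differences in totals; they say nothing about which individual genes the catalogs share.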

Kim Pruitt, a genomics researcher at NCBI and former head of RefSeq, said that part of the difference may be due to the amount of data analyzed by the Salzberg team. But there is another important difference. Both GENCODE and RefSeq rely on manual curation: a person reviews the evidence for each gene and makes a final judgment. The Salzberg team relies entirely on computer programs to filter the data.

"If people like our genetic catalog, then maybe we will be an arbiter of human genes in a few years," Salzberg said.

However, many scientists say they need more evidence before they can be sure that the latest catalog is accurate. Adam Frankish, an EBI computational biologist who coordinates GENCODE's manual annotation work, says that he and his team have examined about 100 of the protein-coding genes identified by the Salzberg team. By their estimate, only one appears to be a true protein-coding gene.

Meanwhile, the Pruitt team analyzed about a dozen of the new protein-coding genes reported by the Salzberg team and found none that met the RefSeq criteria. Some overlap with genomic regions that appear to derive from retroviruses that invaded the genomes of human ancestors; the rest lie in other repetitive stretches that are rarely translated into proteins.

However, Salzberg believes that some repetitive sequences can be considered genes. One example is ERV3-1, which appears in RefSeq and encodes a protein that is overexpressed in colorectal cancer. Salzberg also acknowledged that the new genes in his team's catalog still need to be confirmed by his team and by others.

Chinese Journal of Science and Technology (2018-06-26 3rd Edition International)

Source: Chinese Journal of Science
