Genome projects are scientific undertakings that attempt to identify an organism’s whole genome sequence and the location and function of the genes present in the genome.
Bioinformatics has been a key enabler of genome projects, allowing the biological data collected by scientists worldwide to be read, stored, and organised far faster than before.
Bioinformatics is the science of gathering and analysing large amounts of complex biological data, such as genetic codes.
Human Genome Project
A genome project typically involves collecting and sequencing DNA samples from several donors of the same species. The DNA sequences obtained are combined into a reference genome. Genome sequencing has greatly helped the scientific community to understand the functions and interactions of genes in different organisms. Whole-genome projects are usually carried out using the whole-genome shotgun (WGS) approach. This approach involves sequencing many overlapping DNA fragments separately and then assembling them computationally into chromosomes, using algorithms that identify the overlaps between fragment sequences.
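To make the idea of assembling overlapping fragments concrete, here is a minimal Python sketch of a greedy overlap merge. It is only an illustration under simplifying assumptions: the reads contain no errors, all come from the same strand, and the sequence has no repeats. The function names `overlap_length` and `greedy_assemble` and the example reads are hypothetical, and real assemblers use far more sophisticated methods.

```python
def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Return the length of the longest suffix of `left` that is a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(fragments: list[str], min_overlap: int = 3) -> list[str]:
    """Repeatedly merge the two fragments that share the longest overlap."""
    contigs = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_size, best_i, best_j = 0, -1, -1
        for i, left in enumerate(contigs):
            for j, right in enumerate(contigs):
                if i == j:
                    continue
                size = overlap_length(left, right, min_overlap)
                if size > best_size:
                    best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # no overlaps left, so the remaining contigs stay separate
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


# Three made-up overlapping reads taken from the sequence ATGGCGTACGTTAGC
reads = ["GTACGTTA", "ATGGCGTAC", "CGTTAGC"]
print(greedy_assemble(reads))  # ['ATGGCGTACGTTAGC']
```

The sketch compares every pair of fragments on each pass, which is fine for a handful of short reads but far too slow for the millions of reads produced by a real genome project.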
The Human Genome Project (HGP) was an international scientific research project aimed at sequencing the whole human DNA and identifying the location and function of all the genes in the human genome. The HGP was and still is the largest collaborative biological project globally. It was initiated on October 1, 1990, and was declared complete on April 14, 2003.
The double helix structure of DNA was discovered in 1953, the HGP was completed in 2003, and CRISPR/Cas9 (an efficient method for editing the DNA in cells) was first demonstrated as a gene-editing tool in 2012. In less than 60 years, we went from not knowing much about DNA to sequencing and mapping the genes within the human genome and knowing how to edit genes within cells! What do you think scientists will be able to do in the future?
Genome sequencing projects
DNA sequencing methods are constantly evolving and becoming faster and cheaper, but many of them build on the principle of Sanger sequencing, a chain-termination method developed by Frederick Sanger in 1977 and later automated.
Frederick Sanger received his second Nobel Prize in Chemistry, in 1980, for his work on DNA sequencing!
The Sanger DNA sequencing process can be broken down into three steps:
- Polymerase Chain Reaction (PCR): Automated DNA sequencing methods require large quantities of DNA. This is achieved by first amplifying the DNA samples using the polymerase chain reaction (PCR).
You can learn more about PCR in our Polymerase Chain Reaction article.
- Fluorescently labelled dideoxyribonucleotide triphosphates (ddNTPs): Normal deoxyribonucleotides (dNTPs), fluorescently labelled dideoxyribonucleotides (ddNTPs) and DNA polymerase are added to the amplified DNA sample. DNA polymerase uses dNTPs to synthesise new strands of DNA complementary to the existing strands in the sample, starting from the primer. A ddNTP differs from a normal deoxyribonucleotide because it has a hydrogen atom instead of a hydroxyl group on carbon number 3 (the 3′ carbon) of its sugar. Without that hydroxyl group the next nucleotide cannot be attached, so a ddNTP, once incorporated, terminates chain elongation. The four different ddNTPs (A, G, T, and C) are tagged with different fluorescent labels, giving each a distinct colour. Since ddNTPs are incorporated at random into the growing DNA strands, the result is a set of new DNA fragments of various lengths that share the same point of origin (all start from the primer) but each end with a fluorescently labelled ddNTP.
- Gel electrophoresis: The fragments obtained in the previous step are pulled through a gel with small pores by an electric field. This process separates the strands according to their length, with shorter strands moving through the gel faster. Because ddNTPs are incorporated at random, the mixture contains strands that are 1 nucleotide longer than the primer, 2 nucleotides longer, 3 nucleotides longer and so on, and they all end with a fluorescently labelled ddNTP. Therefore, reading the colours of the fluorescent tags in order of fragment length gives the DNA sequence.
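The logic of chain termination followed by separation by length can be sketched in a few lines of Python. This is a toy simulation rather than a model of a real sequencing run: it assumes we track the newly synthesised strand directly (ignoring the primer and the template strand), that every position has the same chance of receiving a ddNTP, and that the "colour" of each fragment is simply its last base. The names `sanger_fragments`, `read_gel` and `ddntp_fraction` are hypothetical.

```python
import random


def sanger_fragments(new_strand: str, copies: int = 500,
                     ddntp_fraction: float = 0.2, seed: int = 0) -> list[str]:
    """Simulate chain termination: each copy stops where a ddNTP is incorporated."""
    rng = random.Random(seed)
    fragments = []
    for _ in range(copies):
        for length in range(1, len(new_strand) + 1):
            if rng.random() < ddntp_fraction:
                # A labelled ddNTP was added at this position, so synthesis stops.
                fragments.append(new_strand[:length])
                break
        # Copies that never incorporate a ddNTP carry no label and are ignored.
    return fragments


def read_gel(fragments: list[str]) -> str:
    """Order fragments by length and read the labelled last base of each length."""
    if not fragments:
        return ""
    last_base_by_length = {len(f): f[-1] for f in fragments}
    longest = max(last_base_by_length)
    # "N" marks a length for which no fragment happened to be produced.
    return "".join(last_base_by_length.get(n, "N") for n in range(1, longest + 1))


fragments = sanger_fragments("ATGCGTAC")
print(read_gel(fragments))
```

With enough copies, every fragment length is likely to be represented, and the read matches the strand that was synthesised. With too few copies, some lengths go missing and appear as `N`, which mirrors why the method needs a large amount of DNA.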
Determining the proteome
Cells in organisms use DNA and the sequence in the genes to produce proteins.
A proteome is the complete set of proteins expressed by an organism or a cell at a given time and under specific conditions.
The field that studies proteins and the proteomes of different organisms is called proteomics. Proteins can be detected and sequenced with different techniques. However, protein composition changes depending on the specific conditions of the cell or organism, so the proteome is much more variable than the genome of a particular species.
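Because the amino acid sequence of a protein is specified by the base sequence of its gene, a known gene sequence can be translated into a protein sequence computationally. The following Python sketch does this with the standard genetic code. It assumes the input is the coding strand of a gene without introns (as is typical in prokaryotes), starts at the first `ATG`, and stops at the first in-frame stop codon. The function name `translate` and the example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_sequence: str) -> str:
    """Translate a DNA coding strand into one-letter amino acid codes."""
    sequence = coding_sequence.upper()
    start = sequence.find("ATG")  # first start codon
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(sequence) - 2, 3):
        amino_acid = CODON_TABLE.get(sequence[i:i + 3], "X")  # "X" = unknown codon
        if amino_acid == "*":
            break  # stop codon reached
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGGCCATTGTAATGGGCCGCTGA"))  # MAIVMGR
```

This only predicts which protein a gene could produce. Whether that protein is actually present in a cell at a given moment depends on which genes are being expressed, which is why the proteome has to be studied in its own right.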
The genome and proteome of simple organisms
It is relatively straightforward to determine the genome and proteome of basic organisms such as prokaryotes because:
- The size of prokaryotic DNA is substantially less than that of eukaryotic DNA.
- Bacterial DNA is generally not associated with histone proteins.
- Prokaryotic genomes contain very little non-coding DNA, and their genes generally lack introns. Eukaryotic DNA, on the other hand, contains a large number of non-coding sequences that make determining the proteome challenging.
Benefits of knowing the proteome of simple organisms
The proteome of prokaryotes has many medical and non-medical applications.
Medical Applications
Identifying antigenic proteins on the surface of harmful bacteria can be exploited to develop vaccines against illnesses caused by certain microbes. Once the sequence of these antigens is known, they may be mass-produced and supplied to humans in the form of a vaccine. The immune system would then respond to the antigen by producing antibodies and memory cells against it. When confronted with a microbe that possesses the same antigen, memory cells would then be able to develop a secondary immune response to protect the host against infection.
Non-medical Applications
The proteome of simple organisms provides information on the biochemistry of the processes within them. Some of these microorganisms are employed in the production of biofuels. Moreover, organisms that can resist harsh and toxic environments can remove toxins from the environment.
The genome and proteome of complex organisms
The genomes of complex organisms such as humans and plants are challenging to sequence because eukaryotic genomes are much larger and contain more genes than prokaryotic genomes. Recent advances in DNA sequencing technology have overcome this challenge, leading to the completion of the HGP in 2003 and of various plant genome projects.
The major difficulty in studying complex organisms is determining the proteome. This difficulty is due to significant amounts of non-coding DNA in eukaryotic DNA. In humans, for example, it is estimated that 98.5% of the genome is non-coding and does not contribute to the proteome.
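To get a feel for what this percentage means, the short Python calculation below converts it into a number of base pairs. The genome size of roughly 3.2 billion base pairs is an assumption (a commonly quoted approximate figure for one copy of the human genome) and is not taken from this article. The function name `coding_base_pairs` is hypothetical.

```python
def coding_base_pairs(genome_size_bp: int, non_coding_percent: float) -> int:
    """Estimate how many base pairs of a genome code for proteins."""
    coding_fraction = (100 - non_coding_percent) / 100
    return round(genome_size_bp * coding_fraction)


# Assumed approximate size of one copy of the human genome (not from the article).
HUMAN_GENOME_BP = 3_200_000_000

print(coding_base_pairs(HUMAN_GENOME_BP, 98.5))  # 48000000
```

Under these assumptions, only about 48 million of roughly 3.2 billion base pairs code for proteins, so most of the sequence has to be ruled out before the proteome can be predicted.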
Another issue is deciding whose genome should be used for sequencing, because all individuals, except for identical twins, have different genomes.
Whole-genome sequencing provides critical information for identifying congenital disorders caused by mutations, identifying oncogenes and tumour suppressor genes whose mutations lead to cancer, tracking disease outbreaks, and more.
Genome Projects - Key takeaways
- Genome projects aim to determine the entire base sequence of an organism’s total DNA content.
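- Automated DNA sequencing can be broken down into three steps:
  - PCR: automated amplification of the DNA sample.
  - Chain termination: fluorescently labelled dideoxyribonucleotide triphosphates (ddNTPs) are added to the amplified DNA and terminate new strands at random positions.
  - Gel electrophoresis: the fragments are separated by length, and the fluorescent tags are read in order.
- The proteome can be predicted by decoding the DNA base sequence of the active genes within a cell into amino acid sequences using the universal genetic code.
- Applications of genome and proteome projects include vaccine development, biofuel production, removal of toxins from the environment, and the identification of mutations that cause congenital disorders and cancer.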
Frequently Asked Questions about Genome Projects
What is the human genome project?
The Human Genome Project (HGP) was an international scientific research project aimed at determining the base-pair sequence of human DNA, as well as identifying and mapping all of the genes in the human genome with respect to both their position and their function.
What can genome sequencing be used for?
Whole-genome sequencing provides critical information for identifying congenital disorders caused by mutations, identifying oncogenes and tumour suppressor genes whose mutations lead to cancer, tracking disease outbreaks, and more. Sequencing the genomes of pathogens such as viruses and prokaryotes can provide valuable information for manufacturing vaccines against them.
Why is the genome project so important?
Genome projects are important because the information they provide about DNA can be used to develop novel treatments. They also help to identify the genetic mutations that cause congenital disorders and cancers, which can guide the search for cures. The Human Genome Project (HGP) provides invaluable information about the genes within human cells, and this information is accessible free of charge to researchers around the world.
What is a genome sequencing project?
Genome projects are scientific initiatives whose ultimate goal is to uncover the entire genome sequence of an organism and to identify protein-coding genes and other essential genome-encoded properties.
Which techniques were used in the Human Genome Project?
The Human Genome Project relied mainly on the Sanger technique, one of two classical DNA sequencing techniques:
- The Maxam-Gilbert technique, which entails chemically breaking the DNA strands at specific bases. This technique is not commonly used.
- The Sanger technique, or chain-termination technique. This method is more commonly used. It involves the random incorporation of fluorescently labelled dideoxyribonucleotides into the growing DNA strand. Dideoxyribonucleotides terminate further extension of the strands, resulting in DNA fragments of various lengths that end in fluorescently tagged dideoxyribonucleotides. The tags are used to identify the base sequence of the DNA.