Problem 32

You have the following sequence reads from a genomic clone of the Homo sapiens genome:

  • Read 1: ATGCGATCTGTGAGCCGAGTCTTTA
  • Read 2: AACAAAAATGTTGTTATTTTTATTTCAGATG
  • Read 3: TTCAGATGCGATCTGTGAGCCGAG
  • Read 4: TGTCTGCCATTCTTAAAAACAAAAATGT
  • Read 5: TGTTATTTTTATTTCAGATGCGA
  • Read 6: AACAAAAATGTTGTTATT

a. Use these six sequence reads to create a sequence contig of this part of the H. sapiens genome.
b. Translate the sequence contig in all possible reading frames.
c. Go to the BLAST page of the National Center for Biotechnology Information, or NCBI (https://www.ncbi.nlm.nih.gov/BLAST/, Appendix B), and see if you can identify the gene of which this sequence is a part by using each of the reading frames as a query for protein-protein comparison (BLASTp).

Short Answer

Assemble the six reads into a single contig by their overlaps, translate the contig in all six reading frames, and use each translation as a BLASTp query to identify the gene.

Step by step solution

01

Align the Sequence Reads

First, identify overlaps between the sequence reads so that they can be assembled into a longer contiguous sequence, or contig. Look for cases where the end of one read matches the beginning of another. For instance, the last 19 bases of Read 3 ("ATGCGATCTGTGAGCCGAG") are identical to the first 19 bases of Read 1, so Read 1 follows Read 3 and extends it by "TCTTTA".
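The overlap search can be sketched in a few lines of Python. This is a minimal illustration, not part of the original solution: the function name `overlap` and the `min_len` cutoff are assumptions chosen for the example, and the search only considers exact suffix-to-prefix matches on the given strand.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 when the longest match is shorter than `min_len`.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


reads = {
    "Read 1": "ATGCGATCTGTGAGCCGAGTCTTTA",
    "Read 2": "AACAAAAATGTTGTTATTTTTATTTCAGATG",
    "Read 3": "TTCAGATGCGATCTGTGAGCCGAG",
    "Read 4": "TGTCTGCCATTCTTAAAAACAAAAATGT",
    "Read 5": "TGTTATTTTTATTTCAGATGCGA",
    "Read 6": "AACAAAAATGTTGTTATT",
}

# Report every ordered pair whose overlap is at least 6 bases long.
for name_a, seq_a in reads.items():
    for name_b, seq_b in reads.items():
        if name_a != name_b:
            k = overlap(seq_a, seq_b, min_len=6)
            if k:
                print(f"{name_a} -> {name_b}: {k} bases ({seq_b[:k]})")
```

Among the pairs reported, Read 3 -> Read 1 shows the 19-base overlap described above, and Read 4 -> Read 2 shows an 11-base overlap ("AACAAAAATGT").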
02

Construct the Sequence Contig

Following the overlaps found in Step 1, chain the reads into a contig. Read 4 comes first: its last 11 bases ("AACAAAAATGT") match the start of Reads 2 and 6. Read 6 lies entirely within Read 2, so it adds no new sequence. Read 5 overlaps the end of Read 2 and extends it by "CGA", Read 3 overlaps the end of Read 5, and Read 1 overlaps the last 19 bases of Read 3 and supplies the 3' end. The order is therefore Read 4, Reads 2 and 6, Read 5, Read 3, Read 1, and the final 70-base contig is: "TGTCTGCCATTCTTAAAAACAAAAATGTTGTTATTTTTATTTCAGATGCGATCTGTGAGCCGAGTCTTTA".
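As a cross-check on the manual assembly, the sketch below uses a simple greedy strategy: drop reads contained in other reads, then repeatedly merge the pair with the longest suffix-to-prefix overlap. It is an illustrative assumption rather than the method used by real assemblers, and the names `overlap` and `greedy_assemble` are hypothetical. For these six reads it produces one 70-base contig that begins with the 5' end of Read 4 and contains every read as a substring.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Longest suffix of `a` matching a prefix of `b` (0 if below min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Merge reads greedily by longest overlap; returns the remaining contigs."""
    unique = list(dict.fromkeys(reads))
    # Reads fully contained in another read add no new sequence.
    contigs = [r for r in unique if not any(r != s and r in s for s in unique)]
    while len(contigs) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing left to join; keep the contigs separate
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for n, c in enumerate(contigs) if n not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


reads = [
    "ATGCGATCTGTGAGCCGAGTCTTTA",        # Read 1
    "AACAAAAATGTTGTTATTTTTATTTCAGATG",  # Read 2
    "TTCAGATGCGATCTGTGAGCCGAG",         # Read 3
    "TGTCTGCCATTCTTAAAAACAAAAATGT",     # Read 4
    "TGTTATTTTTATTTCAGATGCGA",          # Read 5
    "AACAAAAATGTTGTTATT",               # Read 6
]

contigs = greedy_assemble(reads)
for contig in contigs:
    print(len(contig), contig)
```

Greedy merging can be misled by repeats in larger data sets, so checking that each read appears in the result is a useful sanity test.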
03

Determine All Reading Frames

A double-stranded DNA sequence has six possible reading frames: three on the given strand and three on its reverse complement, each read 5' to 3'. Frame 1: Start translation from the first base. Frame 2: Start translation from the second base. Frame 3: Start translation from the third base. Frames 4, 5, and 6 follow the same rule on the reverse complement, starting from its first, second, and third base.
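The frame definitions can be made concrete with a short sketch that splits a sequence into codons for all six frames. The helper names `reverse_complement` and `six_frames` are assumptions for illustration, and Read 6 is used as a short demonstration input; the same call works on the full contig.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse, so the result reads 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def six_frames(seq: str) -> dict[int, list[str]]:
    """Map frame number (1-6) to its list of complete codons."""
    frames = {}
    for strand_index, strand in enumerate((seq, reverse_complement(seq))):
        for offset in range(3):
            # Stop 2 bases before the end so that only full codons are kept.
            codons = [strand[i:i + 3] for i in range(offset, len(strand) - 2, 3)]
            frames[3 * strand_index + offset + 1] = codons
    return frames


read6 = "AACAAAAATGTTGTTATT"
frames = six_frames(read6)
for number, codons in frames.items():
    print(f"Frame {number}: {' '.join(codons)}")
```

Frames 1 to 3 come from the sequence as given, and frames 4 to 6 from its reverse complement; trailing bases that do not fill a codon are dropped.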
04

Translate the Sequence in All Frames

Using a standard genetic code table, translate the contig sequence in each of the six reading frames identified in Step 3. Read the sequence three bases at a time and look up each codon: for instance, "AAC" encodes asparagine (N) and "AAA" encodes lysine (K). The stop codons TAA, TAG, and TGA are usually written as "*". The result is one amino acid sequence per reading frame.
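A codon lookup like this is easy to sketch in code. The example below builds the standard genetic code from the conventional compact 64-letter string (bases ordered T, C, A, G) and translates one frame at a time. The names `CODON_TABLE` and `translate` are assumptions for illustration, and Read 6 serves as a short test input in place of the full contig.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code; codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq: str, offset: int = 0) -> str:
    """Translate `seq` starting at `offset` (0, 1, or 2); '*' marks a stop.

    Codons containing characters other than A, C, G, T become 'X'.
    """
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")
        for i in range(offset, len(seq) - 2, 3)
    )


read6 = "AACAAAAATGTTGTTATT"
for offset in range(3):
    print(f"Frame {offset + 1}: {translate(read6, offset)}")
```

Frames 4 to 6 are obtained the same way by passing the reverse complement of the sequence to `translate`.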
05

Use BLAST to Find Gene Match

Go to the protein BLAST (BLASTp) page on the NCBI website. Paste the amino acid sequence from each reading frame in Step 4 into the query box, choose a protein database, optionally restrict the organism to Homo sapiens, and submit the search. A reading frame that returns a strong match to a known protein indicates which gene the contig belongs to.
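Before pasting queries into the BLASTp form, it can help to put the translations into FASTA format, one record per frame. The sketch below is only a formatting aid under stated assumptions: the helper names `stop_free_segments` and `to_fasta` and the record labels are hypothetical, and splitting a frame at stop symbols is one possible choice, not a BLAST requirement. The three example peptides are the forward-frame translations of Read 6.

```python
def stop_free_segments(protein: str, min_len: int = 1) -> list[str]:
    """Split a translated frame at '*' and keep segments of at least min_len."""
    return [seg for seg in protein.split("*") if len(seg) >= min_len]


def to_fasta(records: dict[str, str], width: int = 60) -> str:
    """Format name -> sequence pairs as FASTA text wrapped at `width`."""
    lines = []
    for name, seq in records.items():
        lines.append(f">{name}")
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


# Forward-frame translations of Read 6, used here as small example queries.
queries = {
    "read6_frame1": "NKNVVI",
    "read6_frame2": "TKMLL",
    "read6_frame3": "QKCCY",
}
print(to_fasta(queries), end="")
```

In practice the translations of the full contig would be used, since very short peptides rarely give significant matches.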

Key Concepts

These are the key concepts you need to understand to accurately answer the question.

Reading Frames
When analyzing a DNA sequence, understanding reading frames is crucial. In simple terms, a reading frame refers to one of the three possible ways nucleotides in a DNA sequence can be read as codons (groups of three nucleotides) to translate into proteins.
There are three reading frames on a single strand because you can start reading your sequence from the first nucleotide, the second, or the third.
This changes the groupings of the codons, thus potentially changing the resulting protein sequence.

For instance:
  • In the sequence "AACAAA...", starting from the first 'A' gives the codons 'AAC', 'AAA', and so on.
  • Starting from the second 'A' gives 'ACA', 'AAA', and so on.
Remember, the reverse complement strand also has its own set of three reading frames, often referred to as frames 4, 5, and 6. All six frames must be considered when translating a sequence of unknown orientation.

Sequence Alignment
Sequence alignment involves matching several sequence reads to create the longest possible contiguous sequence, known as a contig. This process is similar to assembling a jigsaw puzzle.
You look for overlapping areas between reads to stitch them together into a coherent sequence.

In the exercise, recognizing overlaps such as the 19-base sequence "ATGCGATCTGTGAGCCGAG", which ends Read 3 and begins Read 1, shows how the reads fit together. Chaining such overlaps builds a longer, more complete view of this part of the genome.

Perseverance and precision in recognizing overlaps and aligning sequences are crucial skills when building accurate DNA sequences from the reads.

BLAST Search
After assembling and translating your sequence, the next step is often to figure out which gene or protein is represented.
BLAST (Basic Local Alignment Search Tool) is invaluable here. It compares your protein sequences against a database to find matches and suggest what function your sequence might have.

To do this:
  • Visit the BLAST page on the NCBI website.
  • Select relevant options, such as the type of BLAST (BLASTp for protein sequences) and choose the appropriate database and organism.
  • Input your translated sequences and search for matches.
This step helps identify the gene your contig most likely belongs to by returning the closest matches from the protein database.

Translation of DNA Sequence
Translating a DNA sequence into a protein sequence involves interpreting the genetic code to determine the amino acids in a protein. Each group of three nucleotides, or codon, in the DNA sequence corresponds to a specific amino acid or to a stop signal.
Using a genetic code table, you can match codons to their respective amino acids.

Here's how it works:
  • Convert codons like 'AAC' to amino acids using the genetic code (e.g., 'AAC' translates to asparagine).
  • Repeat this for each codon in the reading frame to build a complete protein sequence.

This translation process must be done for all reading frames, as different frames could result in different amino acid sequences, potentially representing different proteins. Therefore, understanding the nuances of translation is key for accurate biochemical insights.
