BIOL450 Introduction to Bioinformatics Name:_____________________
Computer Lab #2 02/06/14
NCBI Entrez and Searching Biological Databases
Part 1. The Entrez system at NCBI
Point your browser to the NCBI website.
1. What is the URL?
_____________________________________________________________________
You can search all databases in the NCBI system at the same time. To get an idea of how many records are in each database, search all databases for the following term:
all [filter]
2. Which database has the most records?
_____________________________________________________________________
3. How many records does it have?
_____________________________________________________________________
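The same kind of record count can also be requested programmatically through NCBI's E-utilities. The sketch below only builds an ESearch URL for one database at a time and parses the count out of a response; it does not contact NCBI. The helper names (build_count_url, parse_count) and the sample XML with its count are made up for illustration, and the search term is the worksheet's term written without the space.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Base address of the NCBI ESearch utility.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_count_url(db, term="all[filter]"):
    """Return an ESearch URL that asks only for the number of matching records."""
    params = {"db": db, "term": term, "rettype": "count"}
    return EUTILS_ESEARCH + "?" + urlencode(params)


def parse_count(xml_text):
    """Extract the integer inside <Count> from an ESearch XML response."""
    root = ET.fromstring(xml_text)
    return int(root.findtext("Count"))


# Illustrative response with a made-up count (not real data).
sample_response = "<eSearchResult><Count>12345</Count></eSearchResult>"

url = build_count_url("nucleotide")
count = parse_count(sample_response)
```

Pasting the resulting URL into a browser, once per database name (for example nucleotide, protein, or pubmed), is one way to cross-check the numbers shown on the website.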
Part 2. Triose phosphate isomerase
We’re going to investigate the human triose phosphate isomerase 1 gene. This gene is responsible for a critical reaction in the glycolysis pathway, the series of reactions that converts one molecule of a simple sugar (glucose) into 2 pyruvate molecules. The pyruvate molecules are then used to generate energy (ATP) for the cell. Much is known about this gene, and when it is deficient, severe problems can occur. A complete loss-of-function mutation in this gene would be fatal.
First, a note about the difference between GenBank and RefSeq accession numbers. GenBank is an NCBI database that serves as an archive for all publicly available DNA sequences from more than 100,000 different organisms. Submitting scientists retain complete editorial control over their sequences, so they decide on gene symbols (which may not be the official ones) and what additional information to include. Scientists contact NCBI if they wish to make any modifications to their sequence records. As an archival database, GenBank can include redundant entries, even hundreds of records for the same gene, and some entries may contain errors in their sequence data.
To address some of the problems associated with this archival database, NCBI developed RefSeq, a curated, nonredundant source of sequence data for genomic DNA, mRNA transcripts, and proteins of major research organisms. Unlike GenBank records, RefSeq records are created, reviewed, and updated by NCBI staff. Each RefSeq entry has a distinct accession number: two letters, an underscore, and then a number, where the two letters describe the sequence type. For more information about RefSeq, visit the RefSeq FAQs.
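As a small illustration of the prefix convention, the sketch below maps common RefSeq prefixes to the sequence type they denote. The function name refseq_type is hypothetical, the table covers only a few well-known prefixes, and the accessions used are placeholders rather than real records.

```python
# A few common RefSeq prefixes and the molecule type each one denotes.
REFSEQ_PREFIXES = {
    "NC": "complete genomic molecule",
    "NG": "genomic region",
    "NM": "mRNA",
    "NR": "non-coding RNA",
    "NP": "protein",
    "XM": "predicted mRNA",
    "XR": "predicted non-coding RNA",
    "XP": "predicted protein",
}


def refseq_type(accession):
    """Return the sequence type for a RefSeq-style accession, or None otherwise."""
    prefix, sep, _rest = accession.partition("_")
    if sep != "_" or len(prefix) != 2:
        return None  # no "XX_" pattern, so not RefSeq-style (e.g. a GenBank accession)
    return REFSEQ_PREFIXES.get(prefix.upper())


# Placeholder accessions for illustration only.
mrna_kind = refseq_type("NM_000000.1")
protein_kind = refseq_type("NP_000000.1")
genbank_kind = refseq_type("U00001")
```

A quick check of the prefix like this is a handy way to tell whether a search hit is a curated RefSeq record or an archival GenBank submission.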
Use what you know about the NCBI databases to answer the following questions about the human version of this gene.
4. What is the RefSeq accession number for this gene in the mRNA form? (For more information about RefSeq, visit www.ncbi.nlm.nih.gov/RefSeq/ and click on the Accessions link.)
_____________________________________________________________________
5. What is the RefSeq accession number in the protein form?
_____________________________________________________________________
6. On what chromosome is this gene found?
_____________________________________________________________________
7. How many amino acids are in the protein chain?
_____________________________________________________________________
8. What are the first 5 amino acids?
_____________________________________________________________________
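Questions 7 and 8 can also be checked from the protein record's FASTA text. The sketch below joins the sequence lines of a FASTA record, then takes the length and the first five residues. The function name read_fasta_sequence and the toy record are invented for illustration; paste in the real FASTA text from the protein report to use it.

```python
def read_fasta_sequence(fasta_text):
    """Join all sequence lines of a single-record FASTA string, skipping the header."""
    lines = fasta_text.strip().splitlines()
    return "".join(line.strip() for line in lines if not line.startswith(">"))


# Toy record (hypothetical sequence, not the triose phosphate isomerase protein).
toy_record = """>toy_protein hypothetical example
MKTAYIAKQR
QISFVKSH
"""

sequence = read_fasta_sequence(toy_record)
chain_length = len(sequence)   # number of amino acids in the chain
first_five = sequence[:5]      # one-letter codes of the first 5 residues
```

The first five characters are one-letter amino acid codes, so they still need to be translated to residue names (M is methionine, and so on) for the answer.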
One very useful characteristic of the Entrez system is that it often provides cross-reference links among its many databases. In this case, the report for the human protein has links to recent publications. A recent paper was titled “Triosephosphate isomerase deficiency: consequences of an inherited mutation at mRNA, protein and metabolic levels.”
9. Where are the authors of this paper from?
_____________________________________________________________________
Part 3. UniGene
The UniGene project at NCBI was initiated to collect all expressed protein-coding sequences (expressed sequence tags, or ESTs) from various organisms and organize them into groups, one for each single unique gene (UniGene) that they are most likely transcribed from. Thus, NCBI describes it as “An organized view of the transcriptome.” Visit the homepage for UniGene from the NCBI website. In 2003 there were 26 organisms that were included in the UniGene project.
10. How many organisms are currently represented in UniGene?
_____________________________________________________________________
Point your browser to the UniGene project for Homo sapiens. Each unique gene in the database is represented by a list of ESTs that correspond to that gene. By collecting all the sequences that have been expressed from a gene, we can not only obtain a view of the entire coding sequence for that gene, but also obtain insight into how genes are processed and regulated after they get transcribed. At the end of the page, a histogram is given that counts the number of ESTs (left column) per gene (right column).
11. According to the histogram, the vast majority of genes have _______ coding sequences (ESTs) associated with them.
A. Very few
B. Very many
12. How many EST sequences (not counting mRNA sequences) are there for the human triose phosphate isomerase gene in UniGene?
_____________________________________________________________________
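The ESTs-per-gene histogram described above is a tally of how many genes fall into each range of EST counts. The sketch below shows one way such a tally could be computed. The gene names, EST counts, and bin edges are all made up and do not reflect the real UniGene distribution; bin_label and est_histogram are hypothetical helper names.

```python
from collections import Counter


def bin_label(n, edges):
    """Return the label of the bin holding n; the last bin is open-ended."""
    if n < edges[0]:
        raise ValueError(f"count {n} is below the first bin edge {edges[0]}")
    for lo, hi in zip(edges, edges[1:]):
        if lo <= n < hi:
            return f"{lo}-{hi - 1}"
    return f"{edges[-1]}+"


def est_histogram(ests_per_gene, edges=(1, 5, 17, 65)):
    """Count how many genes fall into each EST-count bin."""
    return Counter(bin_label(n, edges) for n in ests_per_gene.values())


# Made-up EST counts per gene, for illustration only.
toy_counts = {"geneA": 2, "geneB": 3, "geneC": 40, "geneD": 50, "geneE": 900}
histogram = est_histogram(toy_counts)
```

Each key of the result is a range of ESTs per gene and each value is the number of genes in that range, matching the two columns of the histogram on the UniGene page.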