Posts

Nucleic Acid databases

Nucleotide Sequence Databases

Nucleotide sequence databases are data repositories that accept nucleic acid sequence data and make it freely available to the public. The data in these repositories are heterogeneous with respect to the source of the material, quality, annotation, and intended completeness of the sequence relative to its biological target.

Sequence databases are commonly grouped into two types:
1) Primary sequence databases: GenBank, EMBL, DDBJ, TrEMBL
2) Secondary sequence databases: Swiss-Prot, PROSITE, PDB

The International Nucleotide Sequence Database Collaboration (INSDC) consists mainly of three databases: GenBank, EMBL, and DDBJ. These three databases exchange and update data daily to stay synchronized.

GenBank

GenBank is the NIH genetic sequence database, an annotated collection of all publicly available DNA sequences. It is a primary nucleotide database. GenBank is accessed and searched through the Entrez gateway at NCBI. User can ...
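Besides the Entrez web pages, GenBank records can also be requested programmatically through NCBI's E-utilities. The sketch below only builds the request URL for a FASTA record and defines (but does not call) a small download helper. The accession `NM_000518` is just an example value, and the function names are illustrative rather than part of any NCBI library.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base address of the E-utilities "efetch" service at NCBI.
EUTILS_EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def genbank_fasta_url(accession: str) -> str:
    """Build an efetch URL that asks for one nucleotide record in FASTA format."""
    params = {
        "db": "nuccore",      # the Entrez nucleotide database
        "id": accession,      # accession number of the record
        "rettype": "fasta",   # return the sequence as FASTA
        "retmode": "text",    # plain text rather than XML
    }
    return f"{EUTILS_EFETCH}?{urlencode(params)}"


def fetch_fasta(accession: str) -> str:
    """Download the FASTA text for an accession (requires network access)."""
    with urlopen(genbank_fasta_url(accession)) as response:
        return response.read().decode("utf-8")


# Example accession, chosen only for illustration.
example_url = genbank_fasta_url("NM_000518")
print(example_url)
```

Calling `fetch_fasta("NM_000518")` would then return the record as text, which can be saved to a `.fasta` file for later exercises.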

SWISS-PROT

SWISS-PROT is a secondary database that provides detailed sequence annotation, including structure, function, and protein family assignment. It was established in 1986 and is maintained collaboratively by the SIB (Swiss Institute of Bioinformatics) and EBI/EMBL. It provides high-level annotations, including descriptions of protein function, the structure of protein domains, post-translational modifications, and variants, and it aims to be minimally redundant. Swiss-Prot is linked to many other resources, including other sequence databases.

The sequence data are mainly derived from TrEMBL, a database of translated nucleic acid sequences stored in the EMBL database. The annotation of each entry is carefully curated by human experts and is therefore of good quality. The protein annotation covers function, domain structure, catalytic sites, cofactor binding, post-translational modification, metabolic pathway information, disease association, and similarity with other sequences.

Multiple Sequence Alignment using CLUSTAL W

Aim
To show the phylogenetic relationships of sequences by creating a tree.

Description
A multiple sequence alignment is an alignment that contains more than two sequences. It is important for finding similar domains in a set of sequences and for subsequent phylogenetic analysis. There are two main methods of multiple sequence alignment: progressive and iterative. CLUSTAL W is an example of the progressive method. It produces multiple sequence alignments of divergent sequences. Evolutionary relationships are shown through a cladogram.

Procedure
STEP 1: Obtain a sequence from NCBI for multiple sequence alignment. Go to the NCBI homepage, select the nucleotide/protein database, and type the query. Select the hit in FASTA format for similarity search.
STEP 2: Select the BLASTN option in NCBI BLAST.
STEP 3: Run BLAST.
STEP 4: Select three or four sequences similar to the query and download them in FASTA format.
STE...
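Progressive methods such as CLUSTAL W begin by comparing the sequences pairwise, and those pairwise scores decide which sequences are joined first. The sketch below is not CLUSTAL W itself. It assumes the sequences are already aligned (equal length, `-` for gaps) and shows only how a percent-identity score between pairs could be computed and the closest pair picked. The sequences and names are made-up toy data.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # skip columns that contain a gap
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def closest_pair(aligned: dict) -> tuple:
    """Return the names of the two sequences with the highest percent identity."""
    return max(
        combinations(sorted(aligned), 2),
        key=lambda pair: percent_identity(aligned[pair[0]], aligned[pair[1]]),
    )


# Toy aligned sequences, invented for illustration only.
aligned = {
    "seq1": "ATGCTAGC-A",
    "seq2": "ATGCTAGCTA",
    "seq3": "ATGATCGC-A",
}

for name_a, name_b in combinations(sorted(aligned), 2):
    score = percent_identity(aligned[name_a], aligned[name_b])
    print(f"{name_a} vs {name_b}: {score:.1f}% identity")

print("Closest pair:", closest_pair(aligned))
```

In a progressive alignment the closest pair would be aligned first and the remaining sequences added in order of decreasing similarity.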

RASMOL/RASWIN

Show Information and Background

Aim
To display information about the selected protein and to change the background color.

Description
RasMol is a computer program written for molecular graphics visualization, intended and used mainly to depict and explore the structures of biological macromolecules, such as those found in the Protein Data Bank. It was originally developed by Roger Sayle in the early 1990s. RasMol includes a scripting language for performing many functions, such as selecting certain protein chains and changing colours. Software such as Jmol and Sirius has incorporated this language into its commands.

Procedure
STEP 1: Open RasMol.
STEP 2: Open a new PDB file of a protein.

Command
RasMol> show information
RasMol> background white

Output (take a printout)

Result
Information about the selected protein was displayed and the background color was changed to white.

Show Sequenc...
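The two commands from the "Show information and Background" exercise can also be kept in a script file and replayed from the RasMol prompt with the `script` command. The sketch below only generates the text of such a script. The file name `protein.pdb` is a placeholder, and the helper name `rasmol_script` is hypothetical.

```python
def rasmol_script(pdb_path: str, background: str = "white") -> str:
    """Return RasMol commands that load a PDB file, show its information,
    and set the background colour."""
    lines = [
        f"load pdb {pdb_path}",       # open the structure file
        "show information",           # print details about the loaded molecule
        f"background {background}",   # change the background colour
    ]
    return "\n".join(lines) + "\n"


# Placeholder file name; replace it with the PDB file used in the exercise.
script_text = rasmol_script("protein.pdb")
print(script_text)
```

Saving `script_text` to a file (for example `show_info.scr`, a name chosen here only for illustration) lets the same steps be repeated for other structures.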

SNP

Aim
To retrieve single nucleotide polymorphisms (SNPs) of the given gene.

Description
Single nucleotide polymorphisms, frequently called SNPs, are the most common type of genetic variation among people. A SNP is a variation in a single nucleotide that occurs at a specific position in the genome. SNPs occur normally throughout a person's DNA. They can act as biological markers, helping scientists locate genes that are associated with disease.

Procedure
STEP 1: Open https://www.ncbi.nlm.nih.gov/snp/
STEP 2: Select SNP from the dropdown list, type the gene name in the search box, and click Go.
STEP 3: Select three hits from the displayed gene SNPs.
STEP 4: Note down the accession number, chromosome number, allele number, and clinical significance.
STEP 5: Save the page.
STEP 6: Close the window.

Result
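To make the definition in the description concrete, the sketch below compares two equal-length DNA strings and reports every position where a single nucleotide differs. This is only an illustration of the idea: a real SNP is a variant observed across a population and catalogued in a database such as dbSNP, and the two sequences here are invented.

```python
def single_nucleotide_differences(reference: str, sample: str) -> list:
    """Return (position, reference_base, sample_base) for each mismatch.

    Positions are 1-based, as is usual when reporting sequence coordinates.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must have the same length")
    return [
        (index + 1, ref_base, sample_base)
        for index, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


# Invented toy sequences that differ at one position.
reference = "ATGGCCTTA"
sample = "ATGGCTTTA"

for position, ref_base, sample_base in single_nucleotide_differences(reference, sample):
    print(f"Position {position}: {ref_base} > {sample_base}")
```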

KEGG

Aim
To retrieve "cysteine metabolism" from Oryza sativa.

Description
KEGG (Kyoto Encyclopedia of Genes and Genomes) is a database resource for understanding high-level functions and utilities of the biological system, such as the cell, the organism, and the ecosystem, from genomic and molecular-level information. The most distinctive data objects in KEGG are the molecular networks: molecular interaction, reaction, and relation networks representing systemic functions of the cell and the organism. The KEGG database has been in development by Kanehisa Laboratories since 1995 and is now a prominent reference knowledge base for the integration and interpretation of large-scale molecular datasets generated by genome sequencing and other high-throughput experimental technologies.

Procedure
STEP 1: Access https://www.genome.jp/kegg/
STEP 2: Select KEGG pathway.
STEP 3: Enter the organism name and "cysteine metabolism" as the keyword.
STEP 4: Click g...
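The same lookup can be approximated through the KEGG REST service instead of the web form. The sketch below builds the URL that lists pathways for one organism and filters a tab-separated listing by keyword. It makes several assumptions: `osa` is taken as the KEGG organism code for Oryza sativa japonica, the sample listing lines are illustrative rather than downloaded output, and the function names are hypothetical.

```python
KEGG_REST = "https://rest.kegg.jp"


def pathway_list_url(organism_code: str) -> str:
    """URL that lists all pathways for one organism (tab-separated text)."""
    return f"{KEGG_REST}/list/pathway/{organism_code}"


def filter_pathways(listing: str, keyword: str) -> list:
    """Return (pathway_id, name) pairs whose name contains the keyword."""
    hits = []
    for line in listing.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        pathway_id, _, name = line.partition("\t")
        if keyword.lower() in name.lower():
            hits.append((pathway_id, name))
    return hits


# Illustrative listing in the assumed "id<TAB>name" layout, not real output.
SAMPLE_LISTING = (
    "osa00010\tGlycolysis / Gluconeogenesis - Oryza sativa japonica\n"
    "osa00270\tCysteine and methionine metabolism - Oryza sativa japonica\n"
)

print(pathway_list_url("osa"))
for pathway_id, name in filter_pathways(SAMPLE_LISTING, "cysteine"):
    print(pathway_id, name)
```

With network access, the text returned from `pathway_list_url("osa")` could be passed to `filter_pathways` in place of the sample listing.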

PIR

Aim
To retrieve the amino acid sequence for heat shock protein HSP70 in tomato.

Description
The Protein Information Resource (PIR), located at Georgetown University Medical Center (GUMC), is an integrated public bioinformatics resource to support genomic and proteomic research and scientific studies. PIR was established in 1984 by the National Biomedical Research Foundation (NBRF) as a resource to assist researchers and consumers in the identification and interpretation of protein sequence information. Prior to that, the NBRF compiled the first comprehensive collection of macromolecular sequences in the Atlas of Protein Sequence and Structure, published from 1965 to 1978 under the editorship of Margaret Dayhoff. Dr. Dayhoff and her research group pioneered the development of computer methods for the comparison of protein sequences, for the detection of distantly related sequences and duplications within sequences, and for the inference of evolutionary histories from al...