Project titles for: 2010-12
- Multiple Sequence Alignment: Comparison and Evaluation
- Prediction of Codon Bias in Rice (Oryza sativa) genes
- Subunit-Subunit interactions in Heterotetrameric structure of rice (Oryza sativa) ADP Glucose pyrophosphorylase
- Identification of microRNA in rice (Oryza sativa)
- Virtual High-Throughput Screening of Peroxisome proliferator-activated receptor Inhibitors
- In silico identification and analysis of Simple Sequence Repeats (SSRs) in rice (Oryza sativa) ESTs
- Effect on virulence in Streptococcus pneumoniae due to the presence of rlrA gene
Simple sequence repeats (SSRs), or microsatellites, have proven to be the markers of choice in plant genetics research because of their hypervariability and ease of detection. However, developing these markers by conventional methods is expensive, labour-intensive and time-consuming. SSRs developed from expressed sequence tags (ESTs), known as EST-SSRs, are a widely used and potentially valuable source of gene-based markers because of their high cross-taxon portability and their rapid, less expensive development. Emerging computational approaches offer a better alternative to conventional methods for developing SSR markers from ESTs. In the present study, 64,999 EST sequences of Oryza sativa L., downloaded from CleanEST, a novel database that classifies dbEST libraries, were mined for microsatellites. Pre-processing of the ESTs with Cross_match (for vector screening) and EST_trimmer (for removal of empty vectors, low-quality sequences and poly-A/T tails) yielded 57,327 trimmed sequences. The trimmed sequences were assembled with the contig assembly program CAP3 into 5,234 contigs and 23,267 singlets. SSR detection and analysis were carried out using MISA and SSR Locator. In total, 5,331 SSRs were detected, of which 1,142 were in contigs and 4,189 in singlets. The percent frequencies of di-, tri-, tetra-, penta- and hexanucleotide SSRs were 25.57%, 73.37%, 1.61%, 0.77% and 0.20%, respectively. Trinucleotide SSRs correspond to triplet codons, each coding for a particular amino acid. Among these EST-SSRs, arginine (CGC) had the highest percent frequency (24.2%), followed by proline (CCG) (16.8%), whereas asparagine (0.35%) and methionine (0.154%) occurred at low frequency. GC3 bias in synonymous codon use was observed in the SSR repeats: for most amino acids, G/C was more frequent at the third codon position.
Among the 5,331 SSRs, 1,298 were hypervariable and 4,033 were potentially variable; 613 were compound and 4,718 were perfect SSRs. Using Primer3, 4,620 flanking primer pairs were designed for the SSR-containing sequences.
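The SSR detection step described above can be illustrated with a minimal Python sketch. This is not the MISA or SSR Locator implementation used in the study. It is a simple regular-expression search for perfect di- to hexanucleotide repeats. The minimum repeat counts in `MIN_REPEATS` and the function names `find_ssrs` and `class_percentages` are assumptions chosen for illustration, since the abstract does not state the thresholds that were applied.

```python
import re
from collections import Counter

# Minimum number of repeat units per motif length.
# ASSUMPTION: illustrative thresholds only; the study's settings are not given.
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for size, min_rep in sorted(min_repeats.items()):
        # A motif of `size` bases followed by at least (min_rep - 1) exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "ATAT" or "AA"),
            # so a dinucleotide run is not also reported as a tetranucleotide SSR.
            if any(motif == motif[:k] * (size // k)
                   for k in range(1, size) if size % k == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // size))
    return sorted(hits)


def class_percentages(hits):
    """Percent frequency of SSRs by motif length (2 = di-, 3 = tri-, ...)."""
    counts = Counter(len(motif) for _, motif, _ in hits)
    total = sum(counts.values())
    return {size: 100.0 * n / total for size, n in sorted(counts.items())}


if __name__ == "__main__":
    # Toy sequence (not from the study): a (CGC)6 and an (AG)7 repeat.
    example = "TTG" + "CGC" * 6 + "AAT" + "AG" * 7 + "TCA"
    ssrs = find_ssrs(example)
    print(ssrs)
    print(class_percentages(ssrs))
```

Applied to each contig and singlet, a tally of this kind gives the sort of per-class percent frequencies reported above. Compound SSRs, imperfect repeats and primer design would need the dedicated tools named in the abstract.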