Attenuators (cont’d), Pseudoknot application.
Matching attenuator sequences has turned out to be a challenging problem. The simplest approach I’ve found is to write the grammar in two parts: the first matches the first loop, and the second checks for direct repeats in the correct positions. This specific configuration is also complex, so it should be rare in random data, and it seems rare in genomic sequences as well.
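The two-part check can be sketched in Python roughly as follows. This is only an illustration, not the grammar itself. It assumes the layout implied by the bracket notation used for the hits in this post: a 20-nt window made of a 4-nt stem, a 4-nt loop, the 4-nt reverse complement of the stem, a 4-nt spacer, and a 4-nt direct repeat of the first stem. The function names and default lengths are hypothetical, and only Watson-Crick pairs are considered.

```python
# Illustrative sketch only; names and default lengths are assumptions.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def is_attenuator_like(window, stem=4, loop=4, spacer=4):
    """Check one window: stem-loop first, then the direct repeat."""
    s2_start = stem + loop
    rep_start = s2_start + stem + spacer
    first = window[:stem]
    second = window[s2_start:s2_start + stem]
    repeat = window[rep_start:rep_start + stem]
    if len(repeat) < stem:
        return False  # window too short for this layout
    # Part one: the second segment must pair with the first (first loop).
    if second != reverse_complement(first):
        return False
    # Part two: the last segment must be a direct repeat of the first.
    return repeat == first


def find_attenuator_like(genome, stem=4, loop=4, spacer=4):
    """Slide over one strand and return (1-based start, window) hits."""
    size = 3 * stem + loop + spacer
    genome = genome.upper()
    hits = []
    for i in range(len(genome) - size + 1):
        window = genome[i:i + size]
        # Skip windows with ambiguity codes such as N.
        if set(window) <= set("ACGT") and is_attenuator_like(
                window, stem, loop, spacer):
            hits.append((i + 1, window))
    return hits
```

Because the repeat equals the first stem, the middle segment can pair with either end, which gives the two alternative hairpins. Under a uniform, independent random-base model (an assumption), a given window passes both checks with probability 4^-8, roughly 1 in 65,536.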
I’ve located the following attenuator-like structures in two related genomes:
GTGGTCGGCCACAGGCGTGG
((((::::))))::::::::
::::::::((((::::))))

>emb|V01174.1| Avian myelocytomatosis virus 5' LTR and gene for p96 polyprotein, proviral DNA in Gallus gallus genomic DNA
Length=3780
GENE ID: 1491913 Amvgp1 | p110 [Avian myelocytomatosis virus]
459 GTGGTCGGCCACAGGCGTGG 478

>gb|AF033809.1|AF033809 Avian myelocytomatosis virus, complete genome
Length=3392
GENE ID: 1491913 Amvgp1 | p110 [Avian myelocytomatosis virus]
72 GTGGTCGGCCACAGGCGTGG 91

It turns out this configuration is more common in bacterial genomes.

MTB DS016976 et al. (note: several repeats throughout)

((((::::))))::::::::
::::::::((((::::))))
ACGCTGTCGCGTGCCGACGC
GCCGCAGACGGCTAAAGCCG
GCTGGCCGCAGCCAGCGCTG
GTGCACGCGCACAACGGTGC
CGGTGGTCACCGTCGGCGGT
GGTGTCGGCACCGGCCGGTG
GCTCGTCGGAGCCAAAGCTC
GCTCACCCGAGCGGCAGCTC
GGCGATTCCGCCGTCGGGCG
CCGCCCGGGCGGGGCGCCGC
CCGGCAATCCGGCGTGCCGG
TGTTCGGCAACAAGTATGTT
CGCAAACATGCGGGAGCGCA
GCGCGTCGGCGCGGGAGCGC
GCGGACAGCCGCGGGAGCGG
CGCGCGGCCGCGCTGCCGCG
CCCGGAGCCGGGCATCCCCG
CGCCCGGCGGCGCGCTCGCC
GGCCCCACGGCCACCAGGCC
CCGGCGTCCCGGGAGGCCGG
CGGGAAAGCCCGCCATCGGG
CGTGCTGGCACGAAGTCGTG
CGGCTCTGGCCGACATCGGC
GCCGCATCCGGCGGAGGCCG
GGCCTGATGGCCAATCGGCC
GGCGTGATCGCCAAGAGGCG
ACCGGCGCCGGTGATTACCG
TATATAGATATATAGATATA
TATATACATATACAAATATA
TATATATATATATAGATATA
ATATATAAATATAAATATAT
ATATATATATATAGATATAT
GCCGTTGCCGGCCTGGGCCG
GCCGAGGGCGGCTTCGGCCG
GCGCATAAGCGCGAGAGCGC
GCACTCCGGTGCTGCTGCAC
GCGCGTGAGCGCCCGGGCGC
CGGCGCGGGCCGGTTCCGGC
CGGCAGGAGCCGGCGCCGGC
TGGTACCGACCAGATTTGGT
GGCGCCAGCGCCTGGCGGCG
ACCGCATACGGTCCCAACCG
CCGCGGTTGCGGTAGGCCGC
GGCGCCGGCGCCGTCTGGCG
CGGTGGCAACCGTCTGCGGT
CGGCGATGGCCGGTCTCGGC
CCTTCGGTAAGGCAACCCTT
GCCGGGTGCGGCTGCTGCCG
CCGCGGTAGCGGATCACCGC
CCACCGGAGTGGTTGGCCAC
GGCTGGCTAGCCAGCCGGCT
CGACCGTTGTCGGGGCCGAC
CGGGACCACCCGAAGGCGGG
GCCGGAGTCGGCAGTGGCCG
GCGCCGCTGCGCAGCAGCGC
GCGCGGCGGCGCGTTGGCGC
GCGCCGGTGCGCCGCGGCGC
CATCACTTGATGAAATCATC
GTGCTGTTGCACGTTGGTGC
CCCGCCGCCGGGTGATCCCG
GGGAGCTATCCCCCGGGGGA
CGGTGGGCACCGCCCCCGGT
CGGTGGAAACCGAACCCGGT
GCGGGGACCCGCCGAGGCGG
GCGGCCAACCGCAACAGCGG
GCGCGGCAGCGCTGCTGCGC
CCGCCGGTGCGGTGTCCCGC
GCCGGATGCGGCTCCCGCCG
CCCGCACACGGGAGAGCCCG
CGCCCGTAGGCGAATCCGCC
CGGACCAGTCCGACCACGGA
CTGGATGTCCAGCGCGCTGG
GGCGCGGCCGCCTGCTGGCG
CGGCTGTGGCCGCGGTCGGC
ACCCTGGTGGGTACCAACCC
GTGCGTTAGCACCTCGGTGC
CGGCGCGCGCCGTTACCGGC
CGGCCGCAGCCGGGACCGGC
GCCGTGTGCGGCCAGTGCCG
CGGGCGTGCCCGCTAACGGG
CGACGGTGGTCGACACCGAC
GCGAGCGGTCGCGGCCGCGA
CGCGGAGTCGCGGGTGCGCG
CGGCCCAAGCCGAGCGCGGC
CGGTTCCGACCGGATCCGGT
CAGGCCGTCCTGGCGCCAGG
CCGGCGCACCGGCGAACCGG
CATCGTCGGATGGTTTCATC
CCACCAGGGTGGCCGTCCAC

//*************************

I've also developed a pseudoknot grammar application that lets the investigator vary the lengths of stems 1 and 2 and loops 1, 2, and 3. It will be available as grantware for public download within the next 2-3 months. If you would like a beta copy, contact me.
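To give a rough idea of what varying those five lengths involves, here is a minimal Python sketch. It is not the application described above. It assumes the common H-type pseudoknot layout, read 5' to 3' as stem 1 (5' strand), loop 1, stem 2 (5' strand), loop 2, stem 1 (3' strand), loop 3, stem 2 (3' strand). It also assumes exact Watson-Crick pairing with no mismatches or G-U pairs. The function name `find_pseudoknots` and its parameters are hypothetical.

```python
# Illustrative sketch only; layout and names are assumptions.
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_pseudoknots(seq, stem1, stem2, loop1, loop2, loop3):
    """Scan one strand for H-type pseudoknot candidates.

    Each length argument is an iterable of allowed lengths.
    Returns (1-based start, matched text, (s1, s2, l1, l2, l3)) tuples.
    """
    seq = seq.upper()
    hits = []
    for s1, s2, l1, l2, l3 in product(stem1, stem2, loop1, loop2, loop3):
        total = 2 * s1 + 2 * s2 + l1 + l2 + l3
        for i in range(len(seq) - total + 1):
            a = i              # stem 1, 5' strand
            b = a + s1 + l1    # stem 2, 5' strand
            c = b + s2 + l2    # stem 1, 3' strand
            d = c + s1 + l3    # stem 2, 3' strand
            region = seq[i:i + total]
            if not set(region) <= set("ACGT"):
                continue  # skip ambiguity codes such as N
            stem1_pairs = seq[c:c + s1] == reverse_complement(seq[a:a + s1])
            stem2_pairs = seq[d:d + s2] == reverse_complement(seq[b:b + s2])
            if stem1_pairs and stem2_pairs:
                hits.append((i + 1, region, (s1, s2, l1, l2, l3)))
    return hits
```

For example, `find_pseudoknots(genome, stem1=range(4, 7), stem2=range(4, 7), loop1=range(1, 4), loop2=range(0, 3), loop3=range(2, 6))` would try every combination of lengths in those ranges. The running time grows with the number of combinations, so narrow ranges are advisable on whole genomes.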