Attenuators (cont’d), pseudoknot application.

July 16, 2011

Matching attenuation sequences has turned out to be a challenging problem. The simplest way I’ve discovered is to write a grammar in two parts: the first part matches the first loop, and the second checks for direct repeats in the correct positions. Furthermore, this specific configuration is complex (and thus rare in random data) and seems rare in genomic sequences.

I’ve located the following attenuator-like structures in two related genomes:

GTGGTCGGCCACAGGCGTGG
((((::::))))::::::::
::::::::((((::::))))
>emb|V01174.1|  Avian myelocytomatosis virus 5' LTR and gene for p96 polyprotein,
proviral DNA in Gallus gallus genomic DNA
Length=3780 GENE ID: 1491913 Amvgp1 | p110 [Avian myelocytomatosis virus]
459  GTGGTCGGCCACAGGCGTGG  478

>gb|AF033809.1|AF033809  Avian myelocytomatosis virus, complete genome
Length=3392 GENE ID: 1491913 Amvgp1 | p110 [Avian myelocytomatosis virus]
72  GTGGTCGGCCACAGGCGTGG  91
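
The two-part check described above can be sketched in a few lines of Python. This is only an illustration, not the grammar I actually use: the function names, the fixed stem and loop lengths of 4 (read off the dot-bracket diagram above), and the restriction to strict Watson-Crick pairs are all assumptions made for the example.

```python
# Hypothetical sketch: names and default lengths are assumptions for illustration.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def find_attenuator_like(seq, stem=4, loop=4):
    """Yield (offset, window) for windows laid out as
    stem | loop | stem' | loop | stem, i.e. two alternative hairpins
    that share the middle stem segment. Offsets are 0-based."""
    seq = seq.upper()
    width = 3 * stem + 2 * loop
    for i in range(len(seq) - width + 1):
        left = seq[i:i + stem]
        middle = seq[i + stem + loop:i + 2 * stem + loop]
        right = seq[i + 2 * stem + 2 * loop:i + width]
        # Skip windows with ambiguity codes such as N.
        if set(left) - set("ACGT"):
            continue
        # Part one: the first hairpin closes only if the middle segment
        # is the reverse complement of the left segment.
        if middle != reverse_complement(left):
            continue
        # Part two: a direct repeat of the left segment at the right-hand
        # position lets the middle segment pair with it as well.
        if right == left:
            yield i, seq[i:i + width]


if __name__ == "__main__":
    for offset, window in find_attenuator_like("AAGTGGTCGGCCACAGGCGTGGTT"):
        print(offset, window)
```

Offsets here are 0-based, whereas the coordinates in the hits above are 1-based. A real scan would probably also need to allow G-T (G-U) wobble pairs and variable loop lengths, which this sketch leaves out.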

It turns out this configuration is more common in bacterial genomes.
MTB DS016976 et al. (note: several repeats throughout)

((((::::))))::::::::
::::::::((((::::))))
ACGCTGTCGCGTGCCGACGC
GCCGCAGACGGCTAAAGCCG
GCTGGCCGCAGCCAGCGCTG
GTGCACGCGCACAACGGTGC
CGGTGGTCACCGTCGGCGGT
GGTGTCGGCACCGGCCGGTG
GCTCGTCGGAGCCAAAGCTC
GCTCACCCGAGCGGCAGCTC
GGCGATTCCGCCGTCGGGCG
CCGCCCGGGCGGGGCGCCGC
CCGGCAATCCGGCGTGCCGG
TGTTCGGCAACAAGTATGTT
CGCAAACATGCGGGAGCGCA
GCGCGTCGGCGCGGGAGCGC
GCGGACAGCCGCGGGAGCGG
CGCGCGGCCGCGCTGCCGCG
CCCGGAGCCGGGCATCCCCG
CGCCCGGCGGCGCGCTCGCC
GGCCCCACGGCCACCAGGCC
CCGGCGTCCCGGGAGGCCGG
CGGGAAAGCCCGCCATCGGG
CGTGCTGGCACGAAGTCGTG
CGGCTCTGGCCGACATCGGC
GCCGCATCCGGCGGAGGCCG
GGCCTGATGGCCAATCGGCC
GGCGTGATCGCCAAGAGGCG
ACCGGCGCCGGTGATTACCG
TATATAGATATATAGATATA
TATATACATATACAAATATA
TATATATATATATAGATATA
ATATATAAATATAAATATAT
ATATATATATATAGATATAT
GCCGTTGCCGGCCTGGGCCG
GCCGAGGGCGGCTTCGGCCG
GCGCATAAGCGCGAGAGCGC
GCACTCCGGTGCTGCTGCAC
GCGCGTGAGCGCCCGGGCGC
CGGCGCGGGCCGGTTCCGGC
CGGCAGGAGCCGGCGCCGGC
TGGTACCGACCAGATTTGGT
GGCGCCAGCGCCTGGCGGCG
ACCGCATACGGTCCCAACCG
CCGCGGTTGCGGTAGGCCGC
GGCGCCGGCGCCGTCTGGCG
CGGTGGCAACCGTCTGCGGT
CGGCGATGGCCGGTCTCGGC
CCTTCGGTAAGGCAACCCTT
GCCGGGTGCGGCTGCTGCCG
CCGCGGTAGCGGATCACCGC
CCACCGGAGTGGTTGGCCAC
GGCTGGCTAGCCAGCCGGCT
CGACCGTTGTCGGGGCCGAC
CGGGACCACCCGAAGGCGGG
GCCGGAGTCGGCAGTGGCCG
GCGCCGCTGCGCAGCAGCGC
GCGCGGCGGCGCGTTGGCGC
GCGCCGGTGCGCCGCGGCGC
CATCACTTGATGAAATCATC
GTGCTGTTGCACGTTGGTGC
CCCGCCGCCGGGTGATCCCG
GGGAGCTATCCCCCGGGGGA
CGGTGGGCACCGCCCCCGGT
CGGTGGAAACCGAACCCGGT
GCGGGGACCCGCCGAGGCGG
GCGGCCAACCGCAACAGCGG
GCGCGGCAGCGCTGCTGCGC
CCGCCGGTGCGGTGTCCCGC
GCCGGATGCGGCTCCCGCCG
CCCGCACACGGGAGAGCCCG
CGCCCGTAGGCGAATCCGCC
CGGACCAGTCCGACCACGGA
CTGGATGTCCAGCGCGCTGG
GGCGCGGCCGCCTGCTGGCG
CGGCTGTGGCCGCGGTCGGC
ACCCTGGTGGGTACCAACCC
GTGCGTTAGCACCTCGGTGC
CGGCGCGCGCCGTTACCGGC
CGGCCGCAGCCGGGACCGGC
GCCGTGTGCGGCCAGTGCCG
CGGGCGTGCCCGCTAACGGG
CGACGGTGGTCGACACCGAC
GCGAGCGGTCGCGGCCGCGA
CGCGGAGTCGCGGGTGCGCG
CGGCCCAAGCCGAGCGCGGC
CGGTTCCGACCGGATCCGGT
CAGGCCGTCCTGGCGCCAGG
CCGGCGCACCGGCGAACCGG
CATCGTCGGATGGTTTCATC
CCACCAGGGTGGCCGTCCAC
//*************************
I've also developed a pseudoknot grammar application that allows the investigator to vary the lengths of stems 1 and 2 and loops 1, 2, and 3. This will be available as grantware for public download within the next 2-3 months. If you would like a beta copy, contact me.
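
To give a rough idea of what varying those five lengths means, here is a minimal Python sketch. It is not the application itself: it assumes the common H-type layout (stem 1 5' strand, loop 1, stem 2 5' strand, loop 2, stem 1 3' strand, loop 3, stem 2 3' strand), strict Watson-Crick pairing on a DNA alphabet, and hypothetical function and parameter names with arbitrary default lengths.

```python
# Hypothetical sketch: the layout, names, and defaults are assumptions.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def find_pseudoknots(seq, stem1=4, stem2=4, loop1=2, loop2=1, loop3=4):
    """Yield (offset, window) for windows laid out as
    S1 | L1 | S2 | L2 | S1' | L3 | S2', where S1' and S2' are the
    reverse complements of S1 and S2. Offsets are 0-based."""
    seq = seq.upper()
    # Segment boundaries relative to the start of the window.
    a = stem1        # end of S1
    b = a + loop1    # start of S2
    c = b + stem2    # end of S2
    d = c + loop2    # start of S1'
    e = d + stem1    # end of S1'
    f = e + loop3    # start of S2'
    width = f + stem2
    for i in range(len(seq) - width + 1):
        w = seq[i:i + width]
        s1_5, s2_5 = w[:a], w[b:c]
        s1_3, s2_3 = w[d:e], w[f:]
        # Skip windows whose stems contain ambiguity codes such as N.
        if set(s1_5 + s2_5) - set("ACGT"):
            continue
        # Both stems must close for the window to count as a candidate.
        if s1_3 == reverse_complement(s1_5) and s2_3 == reverse_complement(s2_5):
            yield i, w
```

Calling `find_pseudoknots(genome, stem1=5, loop3=6)` would scan with a longer first stem and third loop; each length is an independent keyword argument.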