
Mirror Repeats – content, non-structural parsing

October 21, 2010

My work has centered on writing grammars that match all possible structural configurations of a string. Having achieved that goal, I came across something new to me called mirror repeats, e.g. ATCG-GCTA, where the second half is the first half reversed. These turn out to be described by a very simple context-free language. A few examples I found are:

>gi|9629900|ref|NC_001866.1| Avian myelocytomatosis virus, complete genome 3392 bp
GAGGAGGAGGAG
TTTTAAAATTTT

>gi|2801466|gb|AF033810.1| Fujinami sarcoma virus, complete genome 4788 bp
CTCCTCCTCCTC.
CAGTGGGGTGAC.
AAAAAGGAAAAA. (occurs 2 times)
AAAAAGGAAAAA.

>gi|493011|gb|M10455.1|ACSUR2CG UR2 sarcoma virus, complete genome 3166 bp
CTCTTCCTTCTC.
ATAGCTTCGATA.
GGCGGAAGGCGG.

>gi|108737103|ref|NC_008094.1| Y73 sarcoma virus, complete genome 5188 bp ssrna
GAGGAAAAGGAG.
GTTATTTTATTG.
GGCGAAAAGCGG.

>gi|9626914|ref|NC_001494.1| Jaagsiekte sheep retrovirus, complete genome 7462 bp ss-RNA
TCCTCCTCCTCCTCCT.
TTAGAACAACAAGATT.

>gi|62294|emb|X13063.1| Turnip yellows virus (BWYV-FL1) genomic RNA
GAAATTTTAAAG.
TCTCCGAAGCCTCT.
CGGGAAAAGGGC.

>gi|8895506|gb|AF218039.1|AF218039 Cricket paralysis virus AF218039
AACAACAACAACAA.
AACAACAGTTTTGACAACAA.
CTAAACCAAATC.
TTCCCAACCCTT.
TTATCCCCTATT.
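
The mirror-repeat pattern in the examples above can be sketched in a few lines of Python. This is only an illustration, not the code used to find the sequences listed here. It assumes the context-free grammar S → A S A | C S C | G S G | T S T | ε, which generates exactly the even-length strings that read the same in both directions. The function names `is_mirror_repeat` and `find_mirror_repeats`, and the `min_len=12` default (chosen because the shortest example above is 12 bases long), are hypothetical.

```python
ALPHABET = "ACGT"  # assumption: DNA-style alphabet, as in the listed examples


def is_mirror_repeat(seq: str) -> bool:
    """Recognize strings generated by S -> x S x | empty, with x in ALPHABET."""
    seq = seq.upper()
    if seq == "":
        return True  # S -> empty
    if len(seq) < 2 or seq[0] != seq[-1] or seq[0] not in ALPHABET:
        return False  # outer symbols must match: S -> x S x
    return is_mirror_repeat(seq[1:-1])


def find_mirror_repeats(seq: str, min_len: int = 12):
    """Return (start, repeat) for each maximal mirror repeat of at least min_len bases.

    Each gap between two adjacent bases is tried as the mirror centre and
    the match is extended outwards for as long as the two sides agree.
    """
    seq = seq.upper()
    hits = []
    for center in range(1, len(seq)):
        left, right = center - 1, center
        while left >= 0 and right < len(seq) and seq[left] == seq[right]:
            left -= 1
            right += 1
        length = right - left - 1
        if length >= min_len:
            hits.append((left + 1, seq[left + 1:right]))
    return hits


if __name__ == "__main__":
    print(is_mirror_repeat("ATCGGCTA"))
    print(find_mirror_repeats("CCGAGGAAAAGGAGTT"))
```

The recursive recognizer follows the grammar rule for rule, which keeps the link to the context-free description visible. The scanner uses a plain centre-expansion loop instead, since that is simpler for searching a whole genome than running a parser at every position.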

These are examples of content parsing rather than structural parsing. Rules of this kind may be contained within C-S grammars to make them even more powerful in their ability to predict structure.
