D-ORB

The Overrepresented RNA Blocks

M.J. Dupont, F. Major
D-ORB: A Web Server to Extract Structural Features of Related But Unaligned RNA Sequences
J. Mol. Biol., 435 (2023)

Gammacoronavirus 3'UTR   (Rfam: RF03123)


Options
Negative sequences: Random sequences of similar lengths
ORB positions sequence: Best consensus structure-matching sequence
Motifs: 162,498 motifs on the alphabet ACGURYN
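
The motif alphabet ACGURYN combines the four RNA nucleotides with three IUPAC ambiguity codes: R (purine, A or G), Y (pyrimidine, C or U) and N (any nucleotide). As a minimal sketch of how a motif over this alphabet could be compared with a sequence, the snippet below expands each symbol into a regular-expression character class. The function names are hypothetical, and dropping the parentheses to compare nucleotides only is a simplifying assumption that ignores the structural part of an ORB; it is not presented as D-ORB's own matching procedure.

```python
import re

# IUPAC codes of the ACGURYN alphabet (RNA form).
IUPAC = {"A": "A", "C": "C", "G": "G", "U": "U",
         "R": "[AG]", "Y": "[CU]", "N": "[ACGU]"}

def motif_to_regex(motif: str) -> str:
    """Expand a motif over ACGURYN into a regex.

    Assumption: parentheses only carry structure, so they are dropped here.
    """
    symbols = [c for c in motif if c not in "()"]
    return "".join(IUPAC[c] for c in symbols)

def find_motif(motif: str, sequence: str) -> list[int]:
    """Return 0-based start indices of all matches, overlapping ones included."""
    # A lookahead makes every match zero-width, so overlapping hits are kept.
    pattern = re.compile("(?=" + motif_to_regex(motif) + ")")
    return [m.start() for m in pattern.finditer(sequence)]

# Toy sequences, not taken from the Rfam family.
print(motif_to_regex("G(AGN)A"))
print(find_motif("G(AGN)A", "CGGAGUAC"))
print(find_motif("Y(UA)G", "ACUAGUUAG"))
```

A sequence-only match of this kind can over-report, since it does not check that the flanking nucleotides are actually paired in the predicted structure.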
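Structure panels: Rfam, D-ORB, Augmented

[Figure: bar chart, % of positive sequences with ORB. ORBs in page order: G(AGN)A, Y(UA)G, U(AN)C, N(UUA)G, ((G)RU)C. Values in page order: 100.0, 100.0, 66.7, 66.7, 66.7. ORB labels on the structure panel: G(AGN)A, Y(UA)G, Y(UA)G, ((G)RU)C.]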

Deep neural network:

3-fold cross-validation: 100 %
(± 1.12e-05 %)

Decision tree:

3-fold cross-validation: 66.7 %
(± 23.6 %)
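
The "±" value reported with each cross-validation score reads like a spread over the three folds. As a hedged illustration only, the sketch below summarizes per-fold accuracies as a mean and a population standard deviation. The fold values are hypothetical and the choice of population (rather than sample) standard deviation is an assumption; the page does not state how D-ORB computes the figure. With folds of 100 %, 50 % and 50 %, this convention happens to give 66.7 % (± 23.6 %).

```python
from statistics import mean, pstdev

def summarize_folds(accuracies):
    """Return (mean, population standard deviation) of fold accuracies, in percent."""
    return 100 * mean(accuracies), 100 * pstdev(accuracies)

# Hypothetical per-fold accuracies; not taken from the server output.
folds = [1.0, 0.5, 0.5]
avg, spread = summarize_folds(folds)
print(f"3-fold cross-validation: {avg:.1f} % (± {spread:.1f} %)")
```

With only three folds, a single misclassified fold moves the mean by many points, which is why the spread is so wide.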

D-ORB structure decision tree:

3-fold cross-validation: 83.3 %
(± 23.6 %)

D-ORB structure of NC_001451.1/27333-27607
AUUACCUACAUGUCUAUCGCCAGGGAAAUGUCUAAUCUGUCUACUUAGUAGCCUGGAAACGAACGGUAGACCCUUAGAUUUUAAUUUAGUUUAAUUUUUAGUUUAGUUUAAGUUAGUUUAGAGUAGGUAUAAAGAUGCCAGUGCCGGGGCCACGCGGAGUACGAUCGAGGGUACAGCACUAGGACGCCCAUUAGGGGAAGAGCUAAAUUUUAGUUUAAGUUAAGUUUAAUUGGCUAAGUAUAGUUAAAAUUUAUAGGCUAGUAUAGAGUUAGAGC
(((((((...(((((..))))))))))))(((((((((((((((...))(((((((((.((((((((..))))))))((((((((((((((((((((((((((((((((((((..)))))))))))).))))))))(((((((((((((((((((((...)))..))))))))))))))).)).)(((((((..(((((((((((((((...))))))))))))))))))))))))))))))))))))))))))))))))).))))).))))).)
...................................................................................................1...................................................................................................2...........................................................................
.........1.........2.........3.........4.........5.........6.........7.........8.........9.........0.........1.........2.........3.........4.........5.........6.........7.........8.........9.........0.........1.........2.........3.........4.........5.........6.........7.....
12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345

abstract shape: ()(()(()()()))
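
The structure above is written in dot-bracket notation, where each "(" pairs with a later ")" and "." marks an unpaired nucleotide. A minimal sketch of turning such a string into a list of base pairs is shown below, using a stack. The function name is hypothetical and the example runs on a short toy string rather than on the structure from this page.

```python
def parse_dot_bracket(structure: str) -> list[tuple[int, int]]:
    """Return sorted 0-based (i, j) index pairs for each matched '(' and ')'."""
    stack, pairs = [], []
    for j, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(j)              # remember the opening position
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at index {j}")
            pairs.append((stack.pop(), j))  # pair with the latest open bracket
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at index {j}")
    if stack:
        raise ValueError(f"unmatched '(' at index {stack[-1]}")
    return sorted(pairs)

# Toy structure: one two-pair stem followed by a small hairpin.
print(parse_dot_bracket("((..)).(...)"))
```

Adding 1 to each index gives positions that line up with the 1-based ruler printed under the sequence.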

Structure ORBs: number of sequences (out of 3) and percentage containing each ORB set

Set 1: G(AGN)A
Found in 3 of 3 sequences (100.0 %)
>NC_010800.1/27357-27632
UAGCACCUACAUGUCUAUCGCCAGGGAAAUGUCUAAUCUGUCUACUUAGUAGCCUGGAAACGAACGGUAGACCCUUAGAUUUUAAUUUAGUUUAAUUUUUAGUUUAGUUUAAGUUAGUUUAGAGUAGGUGUAAAGAUGCCAGUGCCGGGGCCACGCGGAGUACGAUCGAGGGUACAGCACAAGGACGCCCAUUAGGGGAAGAGCUAAAUUUUAGUUUAAGUUAAGUUUAAUUGGCUAAGUAUAGUUAAAAUUUAUAGGCUAGUAUAGAGUUAGAGC
.............................................................................................................................................................(...)..................................................................................................................
>NC_010646.1/31386-31654
AUACGUUCGCGUCUGUAGGUUUUGCGUUGUCUACUGCUUGGAGAAUCAGCAAUUUGUCAUCUUAUAGCCAAGAGUACGAAGGAUGACACGGUUUAUUAUGAAAGAAUUUCACCAAAAAUUGAUUACGCCUUAGGCUAGACUAGGUCCAAAGAAUCCAGUGAGAGAAGCCCUGCAAUGUAAAUCCAUUGGGGAAAGAGUUAGAAAAGAUUGUAAUUAUUCUAGGUGAUUGUGAAAAGUAGUUUUAAAUUUGACUAUAGGUAAUUGUUAGC
........................................(...)................................................................................................................................................................................................................................
>NC_001451.1/27333-27607
AUUACCUACAUGUCUAUCGCCAGGGAAAUGUCUAAUCUGUCUACUUAGUAGCCUGGAAACGAACGGUAGACCCUUAGAUUUUAAUUUAGUUUAAUUUUUAGUUUAGUUUAAGUUAGUUUAGAGUAGGUAUAAAGAUGCCAGUGCCGGGGCCACGCGGAGUACGAUCGAGGGUACAGCACUAGGACGCCCAUUAGGGGAAGAGCUAAAUUUUAGUUUAAGUUAAGUUUAAUUGGCUAAGUAUAGUUAAAAUUUAUAGGCUAGUAUAGAGUUAGAGC
............................................................................................................................................................(...)..................................................................................................................
Set 2: G(AGN)A, Y(UA)G
Found in 2 of 3 sequences (66.7 %)
>NC_010800.1/27357-27632
UAGCACCUACAUGUCUAUCGCCAGGGAAAUGUCUAAUCUGUCUACUUAGUAGCCUGGAAACGAACGGUAGACCCUUAGAUUUUAAUUUAGUUUAAUUUUUAGUUUAGUUUAAGUUAGUUUAGAGUAGGUGUAAAGAUGCCAGUGCCGGGGCCACGCGGAGUACGAUCGAGGGUACAGCACAAGGACGCCCAUUAGGGGAAGAGCUAAAUUUUAGUUUAAGUUAAGUUUAAUUGGCUAAGUAUAGUUAAAAUUUAUAGGCUAGUAUAGAGUUAGAGC
.................................................................................................................(..)........................................(...)..................................................................................................................
>NC_001451.1/27333-27607
AUUACCUACAUGUCUAUCGCCAGGGAAAUGUCUAAUCUGUCUACUUAGUAGCCUGGAAACGAACGGUAGACCCUUAGAUUUUAAUUUAGUUUAAUUUUUAGUUUAGUUUAAGUUAGUUUAGAGUAGGUAUAAAGAUGCCAGUGCCGGGGCCACGCGGAGUACGAUCGAGGGUACAGCACUAGGACGCCCAUUAGGGGAAGAGCUAAAUUUUAGUUUAAGUUAAGUUUAAUUGGCUAAGUAUAGUUAAAAUUUAUAGGCUAGUAUAGAGUUAGAGC
................................................................................................................(..)........................................(...)..................................................................................................................
Set 3: G(AGN)A, Y(UA)G, U(AN)C
Found in 2 of 3 sequences (66.7 %)
>NC_001451.1/27333-27607
AUUACCUACAUGUCUAUCGCCAGGGAAAUGUCUAAUCUGUCUACUUAGUAGCCUGGAAACGAACGGUAGACCCUUAGAUUUUAAUUUAGUUUAAUUUUUAGUUUAGUUUAAGUUAGUUUAGAGUAGGUAUAAAGAUGCCAGUGCCGGGGCCACGCGGAGUACGAUCGAGGGUACAGCACUAGGACGCCCAUUAGGGGAAGAGCUAAAUUUUAGUUUAAGUUAAGUUUAAUUGGCUAAGUAUAGUUAAAAUUUAUAGGCUAGUAUAGAGUUAGAGC
..............(..)..........................................................................((..................(..)...................)(...................(...).......................)(...........))............................................................................
>NC_010800.1/27357-27632
UAGCACCUACAUGUCUAUCGCCAGGGAAAUGUCUAAUCUGUCUACUUAGUAGCCUGGAAACGAACGGUAGACCCUUAGAUUUUAAUUUAGUUUAAUUUUUAGUUUAGUUUAAGUUAGUUUAGAGUAGGUGUAAAGAUGCCAGUGCCGGGGCCACGCGGAGUACGAUCGAGGGUACAGCACAAGGACGCCCAUUAGGGGAAGAGCUAAAUUUUAGUUUAAGUUAAGUUUAAUUGGCUAAGUAUAGUUAAAAUUUAUAGGCUAGUAUAGAGUUAGAGC
...............(..)..........................................................................((..................(..)...................)(...................(...).......................)(...........))............................................................................
Set 4: G(AGN)A, Y(UA)G, U(AN)C, N(UUA)G
Found in 2 of 3 sequences (66.7 %)
>NC_010800.1/27357-27632
UAGCACCUACAUGUCUAUCGCCAGGGAAAUGUCUAAUCUGUCUACUUAGUAGCCUGGAAACGAACGGUAGACCCUUAGAUUUUAAUUUAGUUUAAUUUUUAGUUUAGUUUAAGUUAGUUUAGAGUAGGUGUAAAGAUGCCAGUGCCGGGGCCACGCGGAGUACGAUCGAGGGUACAGCACAAGGACGCCCAUUAGGGGAAGAGCUAAAUUUUAGUUUAAGUUAAGUUUAAUUGGCUAAGUAUAGUUAAAAUUUAUAGGCUAGUAUAGAGUUAGAGC
...............(..)..........................................................................((..................(..)...................)(...................(...).......................)(...........)).........(...)..............................................................
>NC_001451.1/27333-27607
AUUACCUACAUGUCUAUCGCCAGGGAAAUGUCUAAUCUGUCUACUUAGUAGCCUGGAAACGAACGGUAGACCCUUAGAUUUUAAUUUAGUUUAAUUUUUAGUUUAGUUUAAGUUAGUUUAGAGUAGGUAUAAAGAUGCCAGUGCCGGGGCCACGCGGAGUACGAUCGAGGGUACAGCACUAGGACGCCCAUUAGGGGAAGAGCUAAAUUUUAGUUUAAGUUAAGUUUAAUUGGCUAAGUAUAGUUAAAAUUUAUAGGCUAGUAUAGAGUUAGAGC
..............(..)..........................................................................((..................(..)...................)(...................(...).......................)(...........)).........(...)..............................................................
Set 5: G(AGN)A, Y(UA)G, U(AN)C, N(UUA)G, ((G)RU)C
Found in 2 of 3 sequences (66.7 %)
>NC_001451.1/27333-27607
AUUACCUACAUGUCUAUCGCCAGGGAAAUGUCUAAUCUGUCUACUUAGUAGCCUGGAAACGAACGGUAGACCCUUAGAUUUUAAUUUAGUUUAAUUUUUAGUUUAGUUUAAGUUAGUUUAGAGUAGGUAUAAAGAUGCCAGUGCCGGGGCCACGCGGAGUACGAUCGAGGGUACAGCACUAGGACGCCCAUUAGGGGAAGAGCUAAAUUUUAGUUUAAGUUAAGUUUAAUUGGCUAAGUAUAGUUAAAAUUUAUAGGCUAGUAUAGAGUUAGAGC
..............(..).................................................................((.......((..................(..)...................)(................((.(...).)..)..................)(...........)).......)((...)))............................................................
>NC_010800.1/27357-27632
UAGCACCUACAUGUCUAUCGCCAGGGAAAUGUCUAAUCUGUCUACUUAGUAGCCUGGAAACGAACGGUAGACCCUUAGAUUUUAAUUUAGUUUAAUUUUUAGUUUAGUUUAAGUUAGUUUAGAGUAGGUGUAAAGAUGCCAGUGCCGGGGCCACGCGGAGUACGAUCGAGGGUACAGCACAAGGACGCCCAUUAGGGGAAGAGCUAAAUUUUAGUUUAAGUUAAGUUUAAUUGGCUAAGUAUAGUUAAAAUUUAUAGGCUAGUAUAGAGUUAGAGC
...............(..).................................................................((.......((..................(..)...................)(................((.(...).)..)..................)(...........)).......)((...)))............................................................

>NC_010800.1/27357-27632
UAGCACCUACAUGUCUAUCGCCAGGGAAAUGUCUAAUCUGUCUACUUAGUAGCCUGGAAACGAACGGUAGACCCUUAGAUUUUAAUUUAGUUUAAUUUUUAGUUUAGUUUAAGUUAGUUUAGAGUAGGUGUAAAGAUGCCAGUGCCGGGGCCACGCGGAGUACGAUCGAGGGUACAGCACAAGGACGCCCAUUAGGGGAAGAGCUAAAUUUUAGUUUAAGUUAAGUUUAAUUGGCUAAGUAUAGUUAAAAUUUAUAGGCUAGUAUAGAGUUAGAGC
.............................................................................................................................................................(...)..................................................................................................................
.................................................................................................................(..)........................................(...)..................................................................................................................
...............(..)..........................................................................((..................(..)...................)(...................(...).......................)(...........))............................................................................
...............(..)..........................................................................((..................(..)...................)(...................(...).......................)(...........)).........(...)..............................................................
...............(..).................................................................((.......((..................(..)...................)(................((.(...).)..)..................)(...........)).......)((...)))............................................................
>NC_001451.1/27333-27607
AUUACCUACAUGUCUAUCGCCAGGGAAAUGUCUAAUCUGUCUACUUAGUAGCCUGGAAACGAACGGUAGACCCUUAGAUUUUAAUUUAGUUUAAUUUUUAGUUUAGUUUAAGUUAGUUUAGAGUAGGUAUAAAGAUGCCAGUGCCGGGGCCACGCGGAGUACGAUCGAGGGUACAGCACUAGGACGCCCAUUAGGGGAAGAGCUAAAUUUUAGUUUAAGUUAAGUUUAAUUGGCUAAGUAUAGUUAAAAUUUAUAGGCUAGUAUAGAGUUAGAGC
............................................................................................................................................................(...)..................................................................................................................
................................................................................................................(..)........................................(...)..................................................................................................................
..............(..)..........................................................................((..................(..)...................)(...................(...).......................)(...........))............................................................................
..............(..)..........................................................................((..................(..)...................)(...................(...).......................)(...........)).........(...)..............................................................
..............(..).................................................................((.......((..................(..)...................)(................((.(...).)..)..................)(...........)).......)((...)))............................................................
>NC_010646.1/31386-31654
AUACGUUCGCGUCUGUAGGUUUUGCGUUGUCUACUGCUUGGAGAAUCAGCAAUUUGUCAUCUUAUAGCCAAGAGUACGAAGGAUGACACGGUUUAUUAUGAAAGAAUUUCACCAAAAAUUGAUUACGCCUUAGGCUAGACUAGGUCCAAAGAAUCCAGUGAGAGAAGCCCUGCAAUGUAAAUCCAUUGGGGAAAGAGUUAGAAAAGAUUGUAAUUAUUCUAGGUGAUUGUGAAAAGUAGUUUUAAAUUUGACUAUAGGUAAUUGUUAGC
........................................(...)................................................................................................................................................................................................................................
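
Each line of dots and parentheses under a sequence is aligned to it character by character, so the nucleotides covered by a marked block can be read off by position. The sketch below, offered as an illustration on toy data, pulls out the nucleotides under every innermost block (an opening parenthesis, a run of dots, a closing parenthesis). The function name is hypothetical, and treating only innermost blocks is a simplification that does not resolve the nested marks seen in the larger ORB sets.

```python
import re

def innermost_blocks(sequence: str, mask: str) -> list[tuple[int, str]]:
    """Return (1-based start, nucleotides) for each '(' + dots + ')' block."""
    if len(sequence) != len(mask):
        raise ValueError("sequence and mask must have the same length")
    return [(m.start() + 1, sequence[m.start():m.end()])
            for m in re.finditer(r"\(\.*\)", mask)]

# Toy sequence and mask, not taken from the report.
toy_sequence = "CGGAGUACCUAG"
toy_mask     = "..(...).(..)"
print(innermost_blocks(toy_sequence, toy_mask))
```

The extracted substrings can then be checked by eye against the ORB they are meant to represent, for example GAGUA against G(AGN)A.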

ORB p-values and ORB frequency means ratios (per structure per nucleotide)

#  ORB        p-value   Means ratio
1  G(AGN)A    0.00614   38.7
2  Y(UA)G     0.0067    16.2
3  U(AN)C     0.00741   10.8
4  N(UUA)G    0.00776   12.9
5  ((G)RU)C   0.00902   32.8

NCM p-values and NCM frequency means ratios (per structure per nucleotide)

#  NCM         p-value   Means ratio
1  (.UAGYY.)   0.00592   7.16
2  R(AGN)R     0.00675   28.3

Structure is visualized using R2R 1.0.6 (Zasha Weinberg).
Logos are generated with ggseqlogo (Omar Wagih).