D-ORB

The Overrepresented RNA Blocks

M.J. Dupont, F. Major
D-ORB: A Web Server to Extract Structural Features of Related But Unaligned RNA Sequences
J. Mol. Biol., 435 (2023)

HOTAIR conserved region 1   (Rfam: RF01904)


Options
Negative sequences: Random sequences of similar lengths
ORB positions sequence: Best consensus structure-matching sequence
Motifs: 162,498 motifs on the alphabet ACGURYN
Rfam
D-ORB
Augmented
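
The motif alphabet ACGURYN combines the four RNA nucleotides with the IUPAC ambiguity codes R (A or G), Y (C or U) and N (any nucleotide). The sketch below shows one way such a pattern could be matched against a sequence. It handles nucleotide letters only: the structure symbols that appear in ORB strings on this page (parentheses and dots) are ignored, and the helper names `motif_to_regex` and `find_motif` are hypothetical, not part of D-ORB.

```python
import re

# IUPAC RNA codes restricted to the ACGURYN alphabet listed in the options.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "[AG]",    # purine
    "Y": "[CU]",    # pyrimidine
    "N": "[ACGU]",  # any nucleotide
}

def motif_to_regex(motif):
    """Translate a sequence-only motif into a regular expression string."""
    return "".join(IUPAC[ch] for ch in motif)

def find_motif(motif, sequence):
    """Return 0-based start positions of all matches, overlapping ones included."""
    # A lookahead matches without consuming characters, so overlaps are reported.
    pattern = re.compile("(?=" + motif_to_regex(motif) + ")")
    return [m.start() for m in pattern.finditer(sequence)]

# Sequence AAIY01412966.1/1772-1808 from this page.
seq = "UAUUUUACUGUCCAAAGGAGUCAAUUAAUUAGCGCCU"
print(find_motif("UYAA", seq))  # [20, 24]
```

The second hit (0-based 24, i.e. nucleotide 25) falls inside the hairpin drawn at positions 24 to 29 in the structure listings further down; whether D-ORB locates ORBs this way is an assumption.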

ORB          % of positive sequences with ORB
U(AU(U))     100.0
(UYAA)       100.0
(.UUACN.)    100.0
((C)C)U      100.0

Deep neural network:

5-fold cross-validation: 99.9 % (± 0.0366 %)

Decision tree:

5-fold cross-validation: 89.9 % (± 12.2 %)

D-ORB structure decision tree:

5-fold cross-validation: 95 % (± 10 %)

D-ORB structure of AAIY01412966.1/1772-1808
UAUUUUACUGUCCAAAGGAGUCAAUUAAUUAGCGCCU
(..(.........)).((..))((....))((..).)
.........1.........2.........3.......
1234567890123456789012345678901234567

abstract shape: ()()
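
The structure line above can be read position by position against the sequence. The following is a minimal sketch that assumes the parentheses follow the usual dot-bracket convention (each opening parenthesis pairs with its matching closing one, dots are unpaired); the page itself does not state this, and the function name `dot_bracket_pairs` is hypothetical.

```python
def dot_bracket_pairs(structure):
    """Return sorted 0-based (open, close) index pairs of a dot-bracket string."""
    stack = []
    pairs = []
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i + 1}")
            pairs.append((stack.pop(), i))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {i + 1}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1] + 1}")
    return sorted(pairs)

# Sequence and D-ORB structure of AAIY01412966.1/1772-1808 from this page.
seq = "UAUUUUACUGUCCAAAGGAGUCAAUUAAUUAGCGCCU"
struct = "(..(.........)).((..))((....))((..).)"

# Print 1-based positions so they line up with the ruler under the structure.
for i, j in dot_bracket_pairs(struct):
    print(f"{i + 1:2d}-{j + 1:2d}  {seq[i]}-{seq[j]}")
```

Under that reading, the first block pairs nucleotides 1 with 15 and 4 with 14, both U-A.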

Structure   ORBs   Number of sequences (out of 9)   (%)
1   U(AU(U))   9   100.0
>ABRQ01001830.1/1444-1408
UAUUUUACUGUCCAAAGGAAUCAAUUAAUUAACACCU
(..(.........))......................
>ABVD01271340.1/4487-4451
UAUUUUACCUUCCAAAGGAAUCAAUUAAUUAACGCCU
(..(.........))......................
>DQ926657.1/1046-1010
UAUUUUACAGUCCAAAGGAAUCAAUUAAUUAGCGCCU
(..(.........))......................
>AACN010527833.1/198-162
UAUUUUACUGUCCAAAGGAAUCAAUUAAUUAGAGCCU
(..(.........))......................
>AAQR03076666.1/20712-20748
UAUUUUACAAUCCAAAGGAAUCAAUUAAUUAGAGCCU
(..(.........))......................
>AALT01140141.1/1488-1452
UAUUUUACUGUCCAAAGGAACCAAUUAAUUAGUGCCU
(..(.........))......................
>AC160979.2/198368-198403
UAUUUUACAGUCCAAGGAAUCAAUUAAUUAGUGCCU
(..(.........)).....................
>AAIY01412966.1/1772-1808
UAUUUUACUGUCCAAAGGAGUCAAUUAAUUAGCGCCU
(..(.........))......................
>ACBE01235903.1/1845-1881
UAUUUUACUGUCCAGAGGAAUCAAUUAAUUAGUGCCU
(..(.........))......................
2   U(AU(U)), (UYAA)   8   88.9
>DQ926657.1/1046-1010
UAUUUUACAGUCCAAAGGAAUCAAUUAAUUAGCGCCU
(..(.........))........(....)........
>AAQR03076666.1/20712-20748
UAUUUUACAAUCCAAAGGAAUCAAUUAAUUAGAGCCU
(..(.........))........(....)........
>AACN010527833.1/198-162
UAUUUUACUGUCCAAAGGAAUCAAUUAAUUAGAGCCU
(..(.........))........(....)........
>ABVD01271340.1/4487-4451
UAUUUUACCUUCCAAAGGAAUCAAUUAAUUAACGCCU
(..(.........))........(....)........
>AAIY01412966.1/1772-1808
UAUUUUACUGUCCAAAGGAGUCAAUUAAUUAGCGCCU
(..(.........))........(....)........
>ABRQ01001830.1/1444-1408
UAUUUUACUGUCCAAAGGAAUCAAUUAAUUAACACCU
(..(.........))........(....)........
>ACBE01235903.1/1845-1881
UAUUUUACUGUCCAGAGGAAUCAAUUAAUUAGUGCCU
(..(.........))........(....)........
>AALT01140141.1/1488-1452
UAUUUUACUGUCCAAAGGAACCAAUUAAUUAGUGCCU
(..(.........))........(....)........
3   U(AU(U)), (UYAA), (.UUACN.)   8   88.9
>ACBE01235903.1/1845-1881
UAUUUUACUGUCCAGAGGAAUCAAUUAAUUAGUGCCU
(..(.........))........(....)........
>AACN010527833.1/198-162
UAUUUUACUGUCCAAAGGAAUCAAUUAAUUAGAGCCU
(..(.........))........(....)........
>ABRQ01001830.1/1444-1408
UAUUUUACUGUCCAAAGGAAUCAAUUAAUUAACACCU
(..(.........))........(....)........
>AAQR03076666.1/20712-20748
UAUUUUACAAUCCAAAGGAAUCAAUUAAUUAGAGCCU
(..(.........))........(....)........
>DQ926657.1/1046-1010
UAUUUUACAGUCCAAAGGAAUCAAUUAAUUAGCGCCU
(..(.........))........(....)........
>AALT01140141.1/1488-1452
UAUUUUACUGUCCAAAGGAACCAAUUAAUUAGUGCCU
(..(.........))........(....)........
>ABVD01271340.1/4487-4451
UAUUUUACCUUCCAAAGGAAUCAAUUAAUUAACGCCU
(..(.........))........(....)........
>AAIY01412966.1/1772-1808
UAUUUUACUGUCCAAAGGAGUCAAUUAAUUAGCGCCU
(..(.........))........(....)........
4   U(AU(U)), (UYAA), (.UUACN.), ((C)C)U   8   88.9
>AAIY01412966.1/1772-1808
UAUUUUACUGUCCAAAGGAGUCAAUUAAUUAGCGCCU
(..(.........))........(....).((..).)
>AAQR03076666.1/20712-20748
UAUUUUACAAUCCAAAGGAAUCAAUUAAUUAGAGCCU
(..(.........))........(....).((..).)
>AALT01140141.1/1488-1452
UAUUUUACUGUCCAAAGGAACCAAUUAAUUAGUGCCU
(..(.........))........(....).((..).)
>AACN010527833.1/198-162
UAUUUUACUGUCCAAAGGAAUCAAUUAAUUAGAGCCU
(..(.........))........(....).((..).)
>ABVD01271340.1/4487-4451
UAUUUUACCUUCCAAAGGAAUCAAUUAAUUAACGCCU
(..(.........))........(....).((..).)
>DQ926657.1/1046-1010
UAUUUUACAGUCCAAAGGAAUCAAUUAAUUAGCGCCU
(..(.........))........(....).((..).)
>ACBE01235903.1/1845-1881
UAUUUUACUGUCCAGAGGAAUCAAUUAAUUAGUGCCU
(..(.........))........(....).((..).)
>ABRQ01001830.1/1444-1408
UAUUUUACUGUCCAAAGGAAUCAAUUAAUUAACACCU
(..(.........))........(....).((..).)
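
The percentages reported for the four structures are consistent with the number of listed sequences divided by the nine positive sequences, rounded to one decimal. A short check, using only the counts shown on this page (the function name `percent_of_sequences` is hypothetical):

```python
def percent_of_sequences(n_with_structure, n_total):
    """Share of sequences carrying a structure, as a percentage with one decimal."""
    return round(100.0 * n_with_structure / n_total, 1)

# Structure index -> number of sequences listed for it on this page (out of 9).
counts = {1: 9, 2: 8, 3: 8, 4: 8}

for structure, n in counts.items():
    print(structure, n, percent_of_sequences(n, 9))
```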

>ABRQ01001830.1/1444-1408
UAUUUUACUGUCCAAAGGAAUCAAUUAAUUAACACCU
(..(.........))......................
(..(.........))........(....)........
(..(.........))........(....)........
(..(.........))........(....).((..).)
>ABVD01271340.1/4487-4451
UAUUUUACCUUCCAAAGGAAUCAAUUAAUUAACGCCU
(..(.........))......................
(..(.........))........(....)........
(..(.........))........(....)........
(..(.........))........(....).((..).)
>DQ926657.1/1046-1010
UAUUUUACAGUCCAAAGGAAUCAAUUAAUUAGCGCCU
(..(.........))......................
(..(.........))........(....)........
(..(.........))........(....)........
(..(.........))........(....).((..).)
>AACN010527833.1/198-162
UAUUUUACUGUCCAAAGGAAUCAAUUAAUUAGAGCCU
(..(.........))......................
(..(.........))........(....)........
(..(.........))........(....)........
(..(.........))........(....).((..).)
>AAQR03076666.1/20712-20748
UAUUUUACAAUCCAAAGGAAUCAAUUAAUUAGAGCCU
(..(.........))......................
(..(.........))........(....)........
(..(.........))........(....)........
(..(.........))........(....).((..).)
>AALT01140141.1/1488-1452
UAUUUUACUGUCCAAAGGAACCAAUUAAUUAGUGCCU
(..(.........))......................
(..(.........))........(....)........
(..(.........))........(....)........
(..(.........))........(....).((..).)
>AAIY01412966.1/1772-1808
UAUUUUACUGUCCAAAGGAGUCAAUUAAUUAGCGCCU
(..(.........))......................
(..(.........))........(....)........
(..(.........))........(....)........
(..(.........))........(....).((..).)
>ACBE01235903.1/1845-1881
UAUUUUACUGUCCAGAGGAAUCAAUUAAUUAGUGCCU
(..(.........))......................
(..(.........))........(....)........
(..(.........))........(....)........
(..(.........))........(....).((..).)
>AC160979.2/198368-198403
UAUUUUACAGUCCAAGGAAUCAAUUAAUUAGUGCCU
(..(.........)).....................

ORB            p-value    ORB frequency (per structure per nucleotide), means ratio
1 U(AU(U))     < 0.001    134
2 (UYAA)       < 0.001    70.9
3 (.UUACN.)    < 0.001    162
4 ((C)C)U      < 0.001    21.8

AAIY01412966.1/1772-1808
UAUUUUACUGUCCAAAGGAGUCAAUUAAUUAGCGCCU
.........1.........2.........3.......
1234567890123456789012345678901234567

NCM            p-value    NCM frequency (per structure per nucleotide), means ratio
1 .UUUAC.      < 0.001    190
2 U(CNA)N      < 0.001    16.9

AAIY01412966.1/1772-1808
UAUUUUACUGUCCAAAGGAGUCAAUUAAUUAGCGCCU
.........1.........2.........3.......
1234567890123456789012345678901234567

Structure is visualized using R2R 1.0.6 (Zasha Weinberg).
Logos are generated with ggseqlogo (Omar Wagih).