D-ORB

The Overrepresented RNA Blocks

M.J. Dupont, F. Major
D-ORB: A Web Server to Extract Structural Features of Related But Unaligned RNA Sequences
J. Mol. Biol., 435 (2023)

CRISPR RNA direct repeat element   (Rfam: RF01315)


Options
Negative sequences: Random sequences of similar lengths
ORB positions sequence: Best consensus structure-matching sequence
Motifs: 162,498 motifs on the alphabet ACGURYN
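
The motif alphabet ACGURYN mixes the four RNA nucleotides with degenerate symbols. The sketch below shows one way such symbols can be compared with a concrete sequence, assuming the standard IUPAC meanings (R = A or G, Y = C or U, N = any nucleotide). The helper names are hypothetical and this is not D-ORB's own matching code; the dots and parentheses of the ORB notation are deliberately left uninterpreted here.

```python
# Assumption: standard IUPAC meanings for the symbols of the ACGURYN alphabet.
IUPAC_RNA = {
    "A": {"A"}, "C": {"C"}, "G": {"G"}, "U": {"U"},
    "R": {"A", "G"},            # purine
    "Y": {"C", "U"},            # pyrimidine
    "N": {"A", "C", "G", "U"},  # any nucleotide
}


def pattern_matches(pattern, fragment):
    """Hypothetical helper: True if every symbol of `pattern` accepts the
    nucleotide at the same position of `fragment` (equal lengths required)."""
    if len(pattern) != len(fragment):
        return False
    return all(nt in IUPAC_RNA[sym] for sym, nt in zip(pattern, fragment))


def find_pattern(pattern, sequence):
    """Hypothetical helper: 1-based start positions where `pattern` matches."""
    width = len(pattern)
    return [
        start + 1
        for start in range(len(sequence) - width + 1)
        if pattern_matches(pattern, sequence[start:start + width])
    ]


if __name__ == "__main__":
    reference = "GUUUCAAUCCCAGAAUGGUUCGAUUAAAAC"  # CP000141.1/1929243-1929272
    print(find_pattern("UAR", reference))
```

The pattern "UAR" is only an illustration of degenerate matching; it is not one of the ORBs reported on this page.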

% of positive sequences with ORB
(.Y.().UA.)   100.0
(.C.().GNN.)  100.0
(.A.)         100.0
(U()A)        100.0

(.Y.().UA.)
(.A.)

Deep neural network:
5-fold cross-validation: 99.9 % (± 0.0299 %)

Decision tree:
5-fold cross-validation: 89.2 % (± 14.4 %)

D-ORB structure decision tree:
5-fold cross-validation: 97.2 % (± 5.08 %)

D-ORB structure of CP000141.1/1929243-1929272
CP000141.1/1929243-1929272
GUUUCAAUCCCAGAAUGGUUCGAUUAAAAC
(.((.(..(.((...))...))....)).)
.........1.........2.........3
123456789012345678901234567890

abstract shape: ()
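
The structure line above uses parentheses for paired positions and dots for unpaired ones. The sketch below reads that notation into a list of paired positions and derives a coarse shape string. The shape rule used here (drop unpaired positions, then merge any pair whose only content is one other pair) is an assumption chosen for illustration, not the server's documented definition of "abstract shape". Pairing is read purely from the notation; no base-pairing rule is checked.

```python
def base_pairs(dot_bracket):
    """Return sorted 1-based (open, close) positions of matched parentheses."""
    stack, pairs = [], []
    for pos, char in enumerate(dot_bracket, start=1):
        if char == "(":
            stack.append(pos)
        elif char == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            pairs.append((stack.pop(), pos))
        elif char != ".":
            raise ValueError(f"unexpected character {char!r} at position {pos}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


def coarse_shape(dot_bracket):
    """Assumed abstraction: ignore dots and collapse chains of nested pairs."""
    base_pairs(dot_bracket)  # validates that the parentheses are balanced
    root = []
    stack = [root]
    for char in dot_bracket:
        if char == "(":
            node = []
            stack[-1].append(node)
            stack.append(node)
        elif char == ")":
            stack.pop()

    def render(node):
        # A pair that encloses exactly one other pair is merged with it.
        while len(node) == 1:
            node = node[0]
        return "(" + "".join(render(child) for child in node) + ")"

    return "".join(render(child) for child in root)


if __name__ == "__main__":
    sequence = "GUUUCAAUCCCAGAAUGGUUCGAUUAAAAC"
    structure = "(.((.(..(.((...))...))....)).)"
    for i, j in base_pairs(structure):
        print(i, sequence[i - 1], j, sequence[j - 1])
    print(coarse_shape(structure))
```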

Structure   ORBs   Number of sequences (out of 19)   (%)
1   (.Y.().UA.)   17   89.5
>CP000771.1/1024868-1024839
GUUUCAAUCCCUAAUAGGUAUGCUAAAAGC
...(.(...............)....)...
>CP000716.1/360504-360533
AUUUCAAUUCCUCCAAGGUAAGGUAAAAAC
...(.(...............)....)...
>CP000557.1/317462-317433
GUUUCAAUCUCUCAUAGGUACGAUACAAAC
...(.(...............)....)...
>CP000679.1/62025-62054
GUUUCAAUCCCCAAAGGGAAGGCUACAAAC
...(.(...............)....)...
>BA000043.1/355427-355398
GUUUCAAUCCCUCAUAGGUACGAUAAAAAC
...(.(...............)....)...
>CP000141.1/1929243-1929272
GUUUCAAUCCCAGAAUGGUUCGAUUAAAAC
...(.(...............)....)...
>CP000612.1/587612-587583
GUUUCAAUUCCUUAUAGGUAGGCUGAUAAC
.(...(...............)......).
>CP000879.1/2085036-2085007
GUUUCCAUUCCUUAUAGGUAGGAUAAAAAG
...(.(...............)....)...
>CP000679.1/93158-93187
GUUUCAAUCCCCAAAGGGCAGGCUACAAAC
...(.(...............)....)...
>CP000141.1/1949543-1949572
GUUUCAAUCCCAGAUUGGUUCGAUUAAAAC
...(.(...............)....)...
>AP009389.1/2004616-2004645
GUUUCAAUUCCUCAUAGGCAGGCUAAAAAC
...(.(...............)....)...
>CP000679.1/93287-93316
GUUUCAAUCCCCAAAGGGUAGGCUACAAAC
...(.(...............)....)...
>AP008230.1/3154402-3154431
GUUUCAAUCCCUCAUAGGUAAGCUAACAAC
...(.(...............)....)...
>CP000568.1/717027-716998
GUUUCAAUUCCUCAUAGGUACGAUACAAAC
...(.(...............)....)...
>AARF01000280.1/1425-1454
GUUUCAAUUCCUUAUAGGUACGUUAGAAAC
...(..(...............)...)...
>CP000673.1/1722241-1722270
GUUUCAAUUCCUUAUAGGUAGUCUAAAAUC
...(.(...............)...)....
>CP001336.1/4172990-4173019
AUUUCAAUUCCUUAUAGGUAAGCUAACAAC
...(.(...............).....)..
2   (.Y.().UA.) (.C.().GNN.)   15   78.9
>CP001336.1/4172990-4173019
AUUUCAAUUCCUUAUAGGUAAGCUAACAAC
...(..(.(.(.....)...))...)....
>AP008230.1/3154402-3154431
GUUUCAAUCCCUCAUAGGUAAGCUAACAAC
..(.(..(..(.....)...)).....)..
>CP000716.1/360504-360533
AUUUCAAUUCCUCCAAGGUAAGGUAAAAAC
...(.(..(.(.....)....))..)....
>CP000141.1/1949543-1949572
GUUUCAAUCCCAGAUUGGUUCGAUUAAAAC
...(.(..(.(.....)...))....)...
>CP000771.1/1024868-1024839
GUUUCAAUCCCUAAUAGGUAUGCUAAAAGC
...(.(...(.(....)...))....)...
>CP000673.1/1722241-1722270
GUUUCAAUUCCUUAUAGGUAGUCUAAAAUC
...(..(.(.(.....)...))...)....
>CP000679.1/93158-93187
GUUUCAAUCCCCAAAGGGCAGGCUACAAAC
...(.(..(.(.....)...))....)...
>AP009389.1/2004616-2004645
GUUUCAAUUCCUCAUAGGCAGGCUAAAAAC
...(.(..(.(.....)...))....)...
>CP000679.1/93287-93316
GUUUCAAUCCCCAAAGGGUAGGCUACAAAC
...(.(..(.(.....)...))....)...
>CP000557.1/317462-317433
GUUUCAAUCUCUCAUAGGUACGAUACAAAC
...(...(.(.(....)....))...)...
>AARF01000280.1/1425-1454
GUUUCAAUUCCUUAUAGGUACGUUAGAAAC
...(..(.(.(.....)....))...)...
>CP000141.1/1929243-1929272
GUUUCAAUCCCAGAAUGGUUCGAUUAAAAC
...(.(..(.(.....)...))....)...
>CP000568.1/717027-716998
GUUUCAAUUCCUCAUAGGUACGAUACAAAC
...(.(..(.(.....)...))....)...
>CP000679.1/62025-62054
GUUUCAAUCCCCAAAGGGAAGGCUACAAAC
...(.(..(.(.....)...))....)...
>BA000043.1/355427-355398
GUUUCAAUCCCUCAUAGGUACGAUAAAAAC
...(.(..(.(.....)...))....)...
3   (.Y.().UA.) (.C.().GNN.) (.A.)   13   68.4
>AP008230.1/3154402-3154431
GUUUCAAUCCCUCAUAGGUAAGCUAACAAC
..(.(..(..((...))...)).....)..
>CP000679.1/62025-62054
GUUUCAAUCCCCAAAGGGAAGGCUACAAAC
...(.(..(.((...))...))....)...
>AP009389.1/2004616-2004645
GUUUCAAUUCCUCAUAGGCAGGCUAAAAAC
...(.(..(.((...))...))....)...
>CP000716.1/360504-360533
AUUUCAAUUCCUCCAAGGUAAGGUAAAAAC
...(.(..(.((...))....))..)....
>CP000679.1/93158-93187
GUUUCAAUCCCCAAAGGGCAGGCUACAAAC
...(.(..(.((...))...))....)...
>CP000673.1/1722241-1722270
GUUUCAAUUCCUUAUAGGUAGUCUAAAAUC
...(..(.(.((...))...))...)....
>AARF01000280.1/1425-1454
GUUUCAAUUCCUUAUAGGUACGUUAGAAAC
...(..(.(.((...))....))...)...
>CP000679.1/93287-93316
GUUUCAAUCCCCAAAGGGUAGGCUACAAAC
...(.(..(.((...))...))....)...
>CP000557.1/317462-317433
GUUUCAAUCUCUCAUAGGUACGAUACAAAC
...(...(.(.((..))....))...)...
>BA000043.1/355427-355398
GUUUCAAUCCCUCAUAGGUACGAUAAAAAC
...(.(..(.((...))...))....)...
>CP000568.1/717027-716998
GUUUCAAUUCCUCAUAGGUACGAUACAAAC
...(.(..(.((...))...))....)...
>CP000141.1/1949543-1949572
GUUUCAAUCCCAGAUUGGUUCGAUUAAAAC
...(.(..(.((...))...))....)...
>CP000141.1/1929243-1929272
GUUUCAAUCCCAGAAUGGUUCGAUUAAAAC
...(.(..(.((...))...))....)...
4   (.Y.().UA.) (.C.().GNN.) (.A.) (U()A)   12   63.2
>CP000141.1/1929243-1929272
GUUUCAAUCCCAGAAUGGUUCGAUUAAAAC
(.((.(..(.((...))...))....)).)
>AP009389.1/2004616-2004645
GUUUCAAUUCCUCAUAGGCAGGCUAAAAAC
(.((.(..(.((...))...))....)).)
>CP000679.1/93287-93316
GUUUCAAUCCCCAAAGGGUAGGCUACAAAC
(.((.(..(.((...))...))....)).)
>CP000568.1/717027-716998
GUUUCAAUUCCUCAUAGGUACGAUACAAAC
(.((.(..(.((...))...))....)).)
>AARF01000280.1/1425-1454
GUUUCAAUUCCUUAUAGGUACGUUAGAAAC
(.((..(.(.((...))....))...)).)
>CP000673.1/1722241-1722270
GUUUCAAUUCCUUAUAGGUAGUCUAAAAUC
(.((..(.(.((...))...))...)).).
>CP000557.1/317462-317433
GUUUCAAUCUCUCAUAGGUACGAUACAAAC
(.((...(.(.((..))....))...)).)
>CP000716.1/360504-360533
AUUUCAAUUCCUCCAAGGUAAGGUAAAAAC
(.((.(..(.((...))....))..)).).
>CP000679.1/93158-93187
GUUUCAAUCCCCAAAGGGCAGGCUACAAAC
(.((.(..(.((...))...))....)).)
>CP000679.1/62025-62054
GUUUCAAUCCCCAAAGGGAAGGCUACAAAC
(.((.(..(.((...))...))....)).)
>CP000141.1/1949543-1949572
GUUUCAAUCCCAGAUUGGUUCGAUUAAAAC
(.((.(..(.((...))...))....)).)
>BA000043.1/355427-355398
GUUUCAAUCCCUCAUAGGUACGAUAAAAAC
(.((.(..(.((...))...))....)).)
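
The percentages in the table above follow from the sequence counts and the total of 19 sequences. A minimal check, using only the counts reported on this page (the variable and function names are illustrative):

```python
TOTAL_SEQUENCES = 19  # "out of 19" in the table header

# Structure number -> number of sequences listed for that structure.
SEQUENCE_COUNTS = {1: 17, 2: 15, 3: 13, 4: 12}


def percent_of_total(count, total=TOTAL_SEQUENCES):
    """Format count/total as a percentage with one decimal place."""
    return f"{100 * count / total:.1f}"


if __name__ == "__main__":
    for structure, count in SEQUENCE_COUNTS.items():
        print(structure, count, percent_of_total(count))
```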

>CP000716.1/360504-360533
AUUUCAAUUCCUCCAAGGUAAGGUAAAAAC
...(.(...............)....)...
...(.(..(.(.....)....))..)....
...(.(..(.((...))....))..)....
(.((.(..(.((...))....))..)).).
>CP000557.1/317462-317433
GUUUCAAUCUCUCAUAGGUACGAUACAAAC
...(.(...............)....)...
...(...(.(.(....)....))...)...
...(...(.(.((..))....))...)...
(.((...(.(.((..))....))...)).)
>CP000679.1/62025-62054
GUUUCAAUCCCCAAAGGGAAGGCUACAAAC
...(.(...............)....)...
...(.(..(.(.....)...))....)...
...(.(..(.((...))...))....)...
(.((.(..(.((...))...))....)).)
>BA000043.1/355427-355398
GUUUCAAUCCCUCAUAGGUACGAUAAAAAC
...(.(...............)....)...
...(.(..(.(.....)...))....)...
...(.(..(.((...))...))....)...
(.((.(..(.((...))...))....)).)
>CP000141.1/1929243-1929272
GUUUCAAUCCCAGAAUGGUUCGAUUAAAAC
...(.(...............)....)...
...(.(..(.(.....)...))....)...
...(.(..(.((...))...))....)...
(.((.(..(.((...))...))....)).)
>CP000679.1/93158-93187
GUUUCAAUCCCCAAAGGGCAGGCUACAAAC
...(.(...............)....)...
...(.(..(.(.....)...))....)...
...(.(..(.((...))...))....)...
(.((.(..(.((...))...))....)).)
>CP000141.1/1949543-1949572
GUUUCAAUCCCAGAUUGGUUCGAUUAAAAC
...(.(...............)....)...
...(.(..(.(.....)...))....)...
...(.(..(.((...))...))....)...
(.((.(..(.((...))...))....)).)
>AP009389.1/2004616-2004645
GUUUCAAUUCCUCAUAGGCAGGCUAAAAAC
...(.(...............)....)...
...(.(..(.(.....)...))....)...
...(.(..(.((...))...))....)...
(.((.(..(.((...))...))....)).)
>CP000679.1/93287-93316
GUUUCAAUCCCCAAAGGGUAGGCUACAAAC
...(.(...............)....)...
...(.(..(.(.....)...))....)...
...(.(..(.((...))...))....)...
(.((.(..(.((...))...))....)).)
>CP000568.1/717027-716998
GUUUCAAUUCCUCAUAGGUACGAUACAAAC
...(.(...............)....)...
...(.(..(.(.....)...))....)...
...(.(..(.((...))...))....)...
(.((.(..(.((...))...))....)).)
>AARF01000280.1/1425-1454
GUUUCAAUUCCUUAUAGGUACGUUAGAAAC
...(..(...............)...)...
...(..(.(.(.....)....))...)...
...(..(.(.((...))....))...)...
(.((..(.(.((...))....))...)).)
>CP000673.1/1722241-1722270
GUUUCAAUUCCUUAUAGGUAGUCUAAAAUC
...(.(...............)...)....
...(..(.(.(.....)...))...)....
...(..(.(.((...))...))...)....
(.((..(.(.((...))...))...)).).
>AP008230.1/3154402-3154431
GUUUCAAUCCCUCAUAGGUAAGCUAACAAC
...(.(...............)....)...
..(.(..(..(.....)...)).....)..
..(.(..(..((...))...)).....)..
>CP000771.1/1024868-1024839
GUUUCAAUCCCUAAUAGGUAUGCUAAAAGC
...(.(...............)....)...
...(.(...(.(....)...))....)...
>CP001336.1/4172990-4173019
AUUUCAAUUCCUUAUAGGUAAGCUAACAAC
...(.(...............).....)..
...(..(.(.(.....)...))...)....
>CP000612.1/587612-587583
GUUUCAAUUCCUUAUAGGUAGGCUGAUAAC
.(...(...............)......).
>CP000879.1/2085036-2085007
GUUUCCAUUCCUUAUAGGUAGGAUAAAAAG
...(.(...............)....)...
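
The listing above uses a FASTA-like layout: a ">" header, one sequence line, then one or more structure lines. The sketch below reads that layout and checks that every structure line has the same length as its sequence. The parser and its names are hypothetical and assume exactly this layout; the two sample records are copied from the listing.

```python
SAMPLE = """\
>CP000716.1/360504-360533
AUUUCAAUUCCUCCAAGGUAAGGUAAAAAC
...(.(...............)....)...
...(.(..(.(.....)....))..)....
...(.(..(.((...))....))..)....
(.((.(..(.((...))....))..)).).
>CP000612.1/587612-587583
GUUUCAAUUCCUUAUAGGUAGGCUGAUAAC
.(...(...............)......).
"""


def parse_records(text):
    """Map each header (without '>') to its sequence and structure lines."""
    records = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = {"sequence": None, "structures": []}
            records[line[1:]] = current
        elif current is None:
            raise ValueError("content found before the first '>' header")
        elif current["sequence"] is None:
            current["sequence"] = line  # first line after the header
        else:
            current["structures"].append(line)  # remaining lines
    return records


def lengths_consistent(record):
    """True if every structure line is as long as the sequence."""
    return all(len(s) == len(record["sequence"]) for s in record["structures"])


if __name__ == "__main__":
    for name, record in parse_records(SAMPLE).items():
        print(name, len(record["structures"]), lengths_consistent(record))
```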

ORB   p-value   ORB frequency (per structure per nucleotide): means ratio
1 (.Y.().UA.) < 0.001 31.7
CP000141.1/1929243-1929272
GUUUCAAUCCCAGAAUGGUUCGAUUAAAAC
.........1.........2.........3
123456789012345678901234567890
[ORB positions on the sequence: 17 rows in the original display; highlighting not preserved]
2 (.C.().GNN.) < 0.001 38.7
CP000141.1/1929243-1929272
GUUUCAAUCCCAGAAUGGUUCGAUUAAAAC
.........1.........2.........3
123456789012345678901234567890
[ORB positions on the sequence: 14 rows in the original display; highlighting not preserved]
3 (.A.) < 0.001 3.26
CP000141.1/1929243-1929272
GUUUCAAUCCCAGAAUGGUUCGAUUAAAAC
.........1.........2.........3
123456789012345678901234567890
[ORB positions on the sequence: 29 rows in the original display; highlighting not preserved]
4 (U()A) < 0.001 17.8
CP000141.1/1929243-1929272
GUUUCAAUCCCAGAAUGGUUCGAUUAAAAC
.........1.........2.........3
123456789012345678901234567890
[ORB positions on the sequence: 5 rows in the original display; highlighting not preserved]

NCM   p-value   NCM frequency (per structure per nucleotide): means ratio
1 (.Y.().UA.) < 0.001 31.7
CP000141.1/1929243-1929272
GUUUCAAUCCCAGAAUGGUUCGAUUAAAAC
.........1.........2.........3
123456789012345678901234567890
[NCM positions on the sequence: 17 rows in the original display; highlighting not preserved]

Structure is visualized using R2R 1.0.6 (Zasha Weinberg).
Logos are generated with ggseqlogo (Omar Wagih).