The swBuildDb module generates SHAPEwarp-compliant databases from RNA reactivity profiles. Reactivity profiles must be provided in the RNAframework XML format.
Usage
To list the required parameters, simply type:
$ swBuildDb --help
Parameter | Type | Description |
---|---|---|
-o or --output | string | Output database folder (Default: sw_db/) |
-ow or --overwrite | | Overwrites the output database folder if it already exists |
--threads | int | Number of processors to use (Default: 1) |
--blockSize | int | Size (in nt) of the blocks for shuffling (Default: 10) |
--inBlockShuffle | int | Besides shuffling blocks, residues within each block will be shuffled as well |
--chunkSize | int | For each shuffling, only a chunk of this size will be extracted and used to build the shuffled database (Default: 1000). Note: this setting works well for short queries (<1000 nt). If you plan to search with longer queries, it is advisable to increase the value of chunkSize |
--shufflings | int | Number of shufflings to perform for each database entry (Default: 100) |
--foldDb | | Provided SHAPE profiles are first used to calculate base-pairing probability profiles, which are then used to generate the database. Note: query searches must then be performed with the foldQuery option of SHAPEwarp |
Probability profile database construction options | | |
--maxBPspan | int | Maximum allowed base-pairing distance (Default: 600) |
--noLonelyPairs | | Disallows lonely pairs (helices of 1 bp) |
--noClosingGU | | Disallows G:U wobbles at the end of helices |
--slope | float | Slope for SHAPE reactivities conversion into pseudo-free energy contributions (Default: 1.8) |
--intercept | float | Intercept for SHAPE reactivities conversion into pseudo-free energy contributions (Default: -0.6) |
--temperature | float | Folding temperature in °C (Default: 37.0) |
--winSize | int | Size (in nt) of the sliding window for partition function calculation (Default: 800) |
--offset | int | Offset (in nt) for partition function window sliding (Default: 200) |
--winTrim | int | Number of bases to trim from both ends of partition function windows to avoid terminal biases (Default: 50) |
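To give a feel for what the slope and intercept parameters control, the sketch below converts reactivities into pseudo-free energy terms. It assumes the widely used log-linear form, slope × ln(reactivity + 1) + intercept, which the defaults of 1.8 and -0.6 suggest but which this page does not state explicitly. The function name `shape_pseudo_energy`, the clipping of negative reactivities to zero, and the zero contribution for missing values are illustrative assumptions, not swBuildDb internals.

```python
import math


def shape_pseudo_energy(reactivity, slope=1.8, intercept=-0.6):
    """Return a pseudo-free energy term (kcal/mol) for one residue.

    Assumed form: slope * ln(reactivity + 1) + intercept.
    Missing values (None or NaN) contribute nothing (assumption).
    """
    if reactivity is None or math.isnan(reactivity):
        return 0.0
    # Negative reactivities are clipped to zero so the logarithm is defined (assumption).
    clipped = max(reactivity, 0.0)
    return slope * math.log(clipped + 1.0) + intercept


# Hypothetical reactivity profile, including one missing value.
profile = [0.0, 0.25, 1.0, float("nan"), 2.5]
energies = [shape_pseudo_energy(r) for r in profile]
for reactivity, energy in zip(profile, energies):
    print(f"{reactivity:>5} -> {energy:+.3f} kcal/mol")
```

Under this form, unreactive residues receive a negative term (a bonus of `intercept` for pairing) and highly reactive residues a positive one (a penalty), so a larger slope makes reactive positions less likely to be paired.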
Note
Shuffling SHAPE data in 10 nucleotide-long blocks (blockSize = 10) yields more realistic profiles, as it preserves the relationship between neighboring residues. Although enabling inBlockShuffle might produce hits with lower E-values, hence increasing the chance of recovering more distant matches, it also increases the chance of recovering false positive matches.
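The difference between the two shuffling modes can be illustrated with a minimal sketch. This is not the swBuildDb implementation: the function `block_shuffle` and its arguments are hypothetical names that mirror the blockSize and inBlockShuffle options, and the toy profile is made up for illustration.

```python
import random


def block_shuffle(profile, block_size=10, in_block_shuffle=False, rng=None):
    """Shuffle a reactivity profile block by block.

    The profile is cut into consecutive blocks of block_size values and
    the order of the blocks is randomized. With in_block_shuffle=True,
    the values inside each block are shuffled too.
    """
    if rng is None:
        rng = random.Random()
    blocks = [list(profile[i:i + block_size])
              for i in range(0, len(profile), block_size)]
    rng.shuffle(blocks)  # reorder whole blocks, keeping neighbors together
    if in_block_shuffle:
        for block in blocks:
            rng.shuffle(block)  # also break up local neighborhoods
    return [value for block in blocks for value in block]


# Toy profile of 30 values, so it splits evenly into three blocks of 10.
toy_profile = [i / 10 for i in range(30)]
rng = random.Random(42)  # fixed seed for reproducibility
blocks_only = block_shuffle(toy_profile, block_size=10, rng=rng)
fully_mixed = block_shuffle(toy_profile, block_size=10,
                            in_block_shuffle=True, rng=rng)
print(blocks_only[:10])
print(fully_mixed[:10])
```

In the first output, every run of 10 values is a contiguous stretch of the original profile, which is what preserves the relationship between neighboring residues. In the second, that local structure is lost.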