The swBuildDb module generates SHAPEwarp-compliant databases from RNA reactivity profiles. Reactivity profiles must be provided in the RNAframework's XML format.

Usage

To list the required parameters, simply type:

$ swBuildDb --help

Parameter	Type	Description
-o or --output	string	Output database folder (Default: sw_db/)
-ow or --overwrite		Overwrites the output database folder if it already exists
--threads	int	Number of processors to use (Default: 1)
--blockSize	int	Size (in nt) of the blocks for shuffling (Default: 10)
--inBlockShuffle	int	Besides shuffling blocks, residues within each block will be shuffled as well
--chunkSize	int	For each shuffling, only a chunk of this size will be extracted and used to build the shuffled database (Default: 1000). Note: this setting works well for short queries (<1000 nt). If you plan to search longer queries, it is advisable to increase the value of `chunkSize`.
--shufflings	int	Number of shufflings to perform for each database entry (Default: 100)
--foldDb		Provided SHAPE profiles are first used to calculate base-pairing probability profiles, which are then used to generate the database. Note: query searches must be performed with the `foldQuery` option of `SHAPEwarp`.
		Probability profile database construction options
--maxBPspan	int	Maximum allowed base-pairing distance (Default: 600)
--noLonelyPairs		Disallows lonely pairs (helices of 1 bp)
--noClosingGU		Disallows G:U wobbles at the end of helices
--slope	float	Slope for SHAPE reactivities conversion into pseudo-free energy contributions (Default: 1.8)
--intercept	float	Intercept for SHAPE reactivities conversion into pseudo-free energy contributions (Default: -0.6)
--temperature	float	Folding temperature (Default: 37.0)
--winSize	int	Size (in nt) of the sliding window for partition function calculation (Default: 800)
--offset	int	Offset (in nt) for partition function window sliding (Default: 200)
--winTrim	int	Number of bases to trim from both ends of partition function windows to avoid terminal biases (Default: 50)
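
The `slope` and `intercept` parameters can be illustrated with a short sketch. It assumes the conversion follows the widely used form in which the pseudo-free energy for a residue is `slope * ln(reactivity + 1) + intercept` (in kcal/mol); the exact implementation inside swBuildDb may differ, and the function name `pseudo_energy` is hypothetical.

```python
import math

def pseudo_energy(reactivity, slope=1.8, intercept=-0.6):
    """Convert one SHAPE reactivity into a pseudo-free energy term (kcal/mol).

    Assumed form: slope * ln(reactivity + 1) + intercept.
    The defaults mirror the --slope and --intercept defaults listed above.
    Reactivities are assumed to be non-negative; missing values are not handled.
    """
    return slope * math.log(reactivity + 1.0) + intercept

# A reactivity of 0.0 contributes exactly the intercept (a stabilizing bonus),
# while higher reactivities give increasingly positive (destabilizing) penalties.
for r in (0.0, 0.5, 1.0, 2.0):
    print(r, round(pseudo_energy(r), 3))
```

Under this assumed form, raising `slope` penalizes pairing at reactive residues more strongly, and a more negative `intercept` rewards pairing at unreactive residues more strongly.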

Note

Shuffling SHAPE data in 10-nucleotide-long blocks (blockSize = 10) yields more realistic profiles, as it preserves the relationship between neighboring residues. Although enabling inBlockShuffle might produce hits with lower E-values, hence increasing the chance of recovering more distal matches, it also increases the chance of recovering more false-positive matches.
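
The difference between the two shuffling modes can be sketched as follows. This is a minimal illustration of block shuffling in general, not the code used by swBuildDb: the function name `block_shuffle` and its arguments are hypothetical, and details such as how a trailing partial block or missing values are treated are assumptions.

```python
import random

def block_shuffle(profile, block_size=10, in_block_shuffle=False, seed=None):
    """Shuffle a reactivity profile in consecutive fixed-size blocks.

    Illustrative sketch only. The last block may be shorter than block_size.
    """
    rng = random.Random(seed)
    # Cut the profile into consecutive blocks (copied, so the input is untouched).
    blocks = [list(profile[i:i + block_size])
              for i in range(0, len(profile), block_size)]
    # Reorder whole blocks: neighboring residues inside a block stay together.
    rng.shuffle(blocks)
    if in_block_shuffle:
        # Additionally permute residues within each block, which breaks
        # the local relationship between neighboring residues.
        for block in blocks:
            rng.shuffle(block)
    # Concatenate the blocks back into a single profile.
    return [value for block in blocks for value in block]

profile = [round(0.05 * i, 2) for i in range(30)]
print(block_shuffle(profile, block_size=10, seed=1))
print(block_shuffle(profile, block_size=10, in_block_shuffle=True, seed=1))
```

With `in_block_shuffle=False`, every run of `block_size` values in the output is an intact stretch of the original profile; with `in_block_shuffle=True`, only block membership is preserved.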