
## matchPWM is now part of Biostrings 2.7.47 in…

matchPWM() is now part of Biostrings 2.7.47 (in BioC devel, so you need R-2.7).

library(Biostrings)

Suppose 'pwm' contains a Position Weight Matrix, let's say:

pwm <- rbind(A=c( 1,  0, 19, 20, 18,  1, 20,  7),
C=c( 1,  0,  1,  0,  1, 18,  0,  2),
G=c(17,  0,  0,  0,  1,  0,  0,  3),
T=c( 1, 20,  0,  0,  0,  1,  0,  8))

Note that this is just a standard integer matrix with the 4 DNA base letters
as row names (having these row names is mandatory).
Some low-level utility functions are available for manipulating this kind of
matrix:

> maxWeights(pwm)  # the max weight in each column
[1] 17 20 19 20 18 18 20  8

> maxScore(pwm)  # the max possible score
[1] 140
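
To make those two numbers less magic, here is a minimal base-R sketch of what they appear to compute: the largest weight in each column, and the sum of those maxima. It also scores a single 8-letter word against the matrix by adding up the weight of the observed base at each position. The names `max_weights`, `max_score` and `score_kmer` are made up for this illustration and are not part of Biostrings; the real matchPWM() may well work differently internally.

```r
# Base-R sketch only (no Biostrings needed); all helper names are hypothetical.
pwm <- rbind(A = c( 1,  0, 19, 20, 18,  1, 20,  7),
             C = c( 1,  0,  1,  0,  1, 18,  0,  2),
             G = c(17,  0,  0,  0,  1,  0,  0,  3),
             T = c( 1, 20,  0,  0,  0,  1,  0,  8))

# Largest weight in each column (the values maxWeights(pwm) printed above).
max_weights <- apply(pwm, 2, max)

# Sum of the column maxima (the value maxScore(pwm) printed above).
max_score <- sum(max_weights)

# Score one word of the same width as the PWM: for each position, look up
# the weight of the base found there, then add the weights together.
score_kmer <- function(pwm, kmer) {
  bases <- strsplit(kmer, "")[[1]]
  stopifnot(length(bases) == ncol(pwm), all(bases %in% rownames(pwm)))
  sum(pwm[cbind(match(bases, rownames(pwm)), seq_along(bases))])
}

score_kmer(pwm, "GTAAACAT")  # 140, i.e. equal to max_score
score_kmer(pwm, "GTAAACAA")  # 139, one point short of the maximum
```

Only a word that picks the heaviest base at every position reaches the maximum score; any other word scores lower by the sum of the weight differences at the positions where it deviates.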

Let's match 'pwm' against Human chr1:

library(BSgenome.Hsapiens.UCSC.hg18)
chr1 <- Hsapiens$chr1

Number of "best" matches:

> countPWM(pwm, chr1, min.score="100%")  # takes about 5 seconds on my system
[1] 5152

With a lower cut-off value:

m <- matchPWM(pwm, chr1, min.score="90%")

See the 10 first matches ("first" means "smallest chromosome location", NOT "best" matches):

> m[1:10]
Views on a 247249719-letter DNAString subject
subject: TAACCCTAACCCTAACCCTAACCCTAACCCTAAC...NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
views:
     start   end width
 [1] 31931 31938     8 [GTAAACAA]
 [2] 33324 33331     8 [GTAAACAT]
 [3] 38425 38432     8 [GTAAACAG]
 [4] 39177 39184     8 [GTAAACAC]
 [5] 46971 46978     8 [GTAAACAT]
 [6] 49952 49959     8 [GTAAACAT]
 [7] 70381 70388     8 [GTAAACAG]
 [8] 74359 74366     8 [GTAAACAC]
 [9] 90714 90721     8 [GTAAACAT]
[10] 96544 96551     8 [GTAAACAC]

The speed could be improved, maybe by a factor of 2 (or more for longer PWMs).
Also, maybe an additional argument could be added to let the user control how
the returned matches are sorted ("left-to-right" or "best-first")?
Unlike MatInspector or the transfac-tool, there is no facility yet to suggest
individual cut-off values depending on the length of the PWMs.

See '?matchPWM' for more information (e.g. how to search the minus strand of
the chromosome).

https://stat.ethz.ch/pipermail/bioconductor/2008-April/022029.html

## Save R function to a text file latex…

Save R function to a text file

latex.code <- function(){
  cat("\\begin{align}\n")
  cat("[X'X]^{-1}X'y\n")
  cat("\\end{align}\n")
}
sink(file='ols.txt')
latex.code()
sink()

http://stackoverflow.com/questions/6222565/how-can-i-save-text-to-a-file-in-r

## Sumatra is a tool for managing and tracking…

Sumatra is a tool for managing and tracking projects based on numerical simulation and/or analysis, with the aim of supporting reproducible research. It can be thought of as an automated electronic lab notebook for computational projects.
It consists of:

- a command-line interface, smt, for launching simulations/analyses with automatic recording of information about the experiment, annotating these records, linking to data files, etc.
- a web interface with a built-in web-server, smtweb, for browsing and annotating simulation/analysis results.
- a Python API, on which smt and smtweb are based, that can be used in your own scripts in place of using smt, or could be integrated into a GUI-based application.

Sumatra is currently alpha code, and should be used with caution and frequent backups of your records.

http://packages.python.org/Sumatra/index.html

## Installing Python without root http stackoverflow com questions…

Installing Python without root.

http://stackoverflow.com/questions/622744/unable-to-install-python-without-sudo-access

## Quick and easy submission of R on Sun…

Quick and easy submission of R on Sun Grid Engine

Easiest way to submit R jobs

Here are two scripts and a symlink I created to make it as easy as possible to submit R jobs to your Grid:

qsub-R

If you normally do something along the lines of:

user@exec:~$ nohup nice R CMD BATCH toodles.R

Now all you need to do is:

user@submit:~$ qsub-R toodles.R
Your job 3540 (“toodles.R”) has been submitted

qsub-R is linked to submit-R, a script I wrote. It calls qsub and submits a simple shell wrapper with the R file as an argument. It ends up in the queue and eventually your output arrives in the current directory: toodles.R.o3540

Download it and install it. You’ll need to make the ‘qsub-R’ symlink to ‘3rd_party/uoa-dos/submit-R’ yourself, although there is one in the package already for lx24-x86: qsub-R.tar (10 KiB, tar)

http://www.stat.auckland.ac.nz/~kimihia/sun-grid#submitting

## When Mendeley bibliography duplicates instead of refreshing Check…

When Mendeley bibliography duplicates instead of refreshing. Check whether the following option is set in Word 2007:

Office button->Word Options->Advanced->General->Confirm file format conversion on open

http://feedback.mendeley.com/forums/4941-mendeley-feedback/suggestions/1384943-bug-refresh-bibliography-in-word-creates-duplicat

## Remove the first line in Perl $first line…

Remove the first line in Perl

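my $first_line = <>;  # read the first line so the loop below never sees it

while (<>) {
    do_something();  # placeholder: handle each remaining line (in $_)
}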