matchPWM() is now part of Biostrings 2.7.47 (in BioC devel, so you need R-2.7).

Load Biostrings:

   library(Biostrings)

Suppose 'pwm' contains a Position Weight Matrix, let's say:

   pwm <- rbind(A=c( 1,  0, 19, 20, 18,  1, 20,  7),
                C=c( 1,  0,  1,  0,  1, 18,  0,  2),
                G=c(17,  0,  0,  0,  1,  0,  0,  3),
                T=c( 1, 20,  0,  0,  0,  1,  0,  8))

Note that this is just a standard integer matrix with the 4 DNA base letters
as row names (having these row names is mandatory).
Some low-level utility functions are available for manipulating this kind of
matrix:

   > maxWeights(pwm)  # the max weight in each column
   [1] 17 20 19 20 18 18 20  8

   > maxScore(pwm)  # the max possible score
   [1] 140
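For intuition, both values can be reproduced in base R. This is only a sketch of what the two utilities compute for the example matrix, not the Biostrings implementation; the names max_weights and max_score are made up for the illustration.

```r
# The example PWM from above (rows named A, C, G, T).
pwm <- rbind(A=c( 1,  0, 19, 20, 18,  1, 20,  7),
             C=c( 1,  0,  1,  0,  1, 18,  0,  2),
             G=c(17,  0,  0,  0,  1,  0,  0,  3),
             T=c( 1, 20,  0,  0,  0,  1,  0,  8))

# Largest weight in each column (the quantity maxWeights() reports).
max_weights <- apply(pwm, 2, max)

# The best possible score is the sum of the column maxima
# (the quantity maxScore() reports).
max_score <- sum(max_weights)

print(max_weights)
print(max_score)
```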

Let's match 'pwm' against Human chr1:

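   # Assumed genome package: the 247249719-letter chr1 shown below matches hg18.
   library(BSgenome.Hsapiens.UCSC.hg18)
   chr1 <- Hsapiens$chr1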

Number of "best" matches:

   > countPWM(pwm, chr1, min.score="100%")  # takes about 5 seconds on my system
   [1] 5152

With a lower cut-off value:

   m <- matchPWM(pwm, chr1, min.score="90%")

See the first 10 matches ("first" means "smallest chromosome location", NOT "best"):

   > m[1:10]
     Views on a 247249719-letter DNAString subject
        start   end width
    [1] 31931 31938     8 [GTAAACAA]
    [2] 33324 33331     8 [GTAAACAT]
    [3] 38425 38432     8 [GTAAACAG]
    [4] 39177 39184     8 [GTAAACAC]
    [5] 46971 46978     8 [GTAAACAT]
    [6] 49952 49959     8 [GTAAACAT]
    [7] 70381 70388     8 [GTAAACAG]
    [8] 74359 74366     8 [GTAAACAC]
    [9] 90714 90721     8 [GTAAACAT]
   [10] 96544 96551     8 [GTAAACAC]
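To make the cut-off concrete, here is a naive base-R sketch of the scoring, assuming that a percentage min.score is taken relative to the maximum possible score (140 here, so "90%" means a score of at least 126). The functions score_window and match_pwm_naive and the toy subject string are hypothetical, written only for this illustration; the real matchPWM() works on DNAString objects and is far faster.

```r
pwm <- rbind(A=c( 1,  0, 19, 20, 18,  1, 20,  7),
             C=c( 1,  0,  1,  0,  1, 18,  0,  2),
             G=c(17,  0,  0,  0,  1,  0,  0,  3),
             T=c( 1, 20,  0,  0,  0,  1,  0,  8))

# Score of one window: add up the weight of the observed base at each position.
score_window <- function(window, pwm) {
  bases <- strsplit(window, "")[[1]]
  sum(pwm[cbind(match(bases, rownames(pwm)), seq_along(bases))])
}

# Slide over a plain character string (A/C/G/T only) and keep every window
# whose score reaches `percent` percent of the maximum possible score.
match_pwm_naive <- function(pwm, subject, percent = 90) {
  cutoff <- sum(apply(pwm, 2, max)) * percent / 100
  w <- ncol(pwm)
  starts <- seq_len(max(0, nchar(subject) - w + 1))
  windows <- substring(subject, starts, starts + w - 1)
  scores <- vapply(windows, score_window, numeric(1), pwm = pwm,
                   USE.NAMES = FALSE)
  keep <- scores >= cutoff
  data.frame(start = starts[keep], end = starts[keep] + w - 1,
             score = scores[keep], seq = windows[keep],
             stringsAsFactors = FALSE)
}

hits <- match_pwm_naive(pwm, "CCGTAAACATTTGTAAACAGAA", percent = 90)
print(hits)
```

On this toy string the sketch reports two windows, GTAAACAT at position 3 (score 140) and GTAAACAG at position 13 (score 135). With percent = 100 only the first one survives, which is why the "100%" count above is much smaller than the number of "90%" matches.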

The speed could be improved, maybe by a factor of 2 (or more, for longer PWMs).

Also maybe an additional argument could be added to let the user control how
the returned matches should be sorted ("left-to-right" or "best-first")?
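Until such an argument exists, the two orderings are easy to get by hand once each match has a score. A minimal base-R sketch, using the first four matches listed above and a hypothetical helper score_of (not a Biostrings function):

```r
pwm <- rbind(A=c( 1,  0, 19, 20, 18,  1, 20,  7),
             C=c( 1,  0,  1,  0,  1, 18,  0,  2),
             G=c(17,  0,  0,  0,  1,  0,  0,  3),
             T=c( 1, 20,  0,  0,  0,  1,  0,  8))

# Start positions and sequences of the first four matches shown above.
m4 <- data.frame(start = c(31931, 33324, 38425, 39177),
                 seq = c("GTAAACAA", "GTAAACAT", "GTAAACAG", "GTAAACAC"),
                 stringsAsFactors = FALSE)

# Hypothetical helper: PWM score of a matched sequence.
score_of <- function(s) {
  b <- strsplit(s, "")[[1]]
  sum(pwm[cbind(match(b, rownames(pwm)), seq_along(b))])
}
m4$score <- vapply(m4$seq, score_of, numeric(1), USE.NAMES = FALSE)

left_to_right <- m4[order(m4$start), ]            # by chromosome location
best_first <- m4[order(-m4$score, m4$start), ]    # by decreasing score

print(best_first)
```

Ties in score are broken by location here; that choice is arbitrary.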

Unlike MatInspector or the transfac-tool, there is no facility yet to suggest
individual cut-off values depending on the length of the PWMs.

See '?matchPWM' for more information (e.g. how to search the minus strand of
the chromosome).
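One common way to search the minus strand, and as far as I recall the one the man page suggests, is to match the reverse complement of the PWM against the plus strand. The base-R sketch below shows what reverse-complementing a PWM means; revcomp_pwm is a hypothetical helper written for this illustration, so check '?matchPWM' for what Biostrings itself provides.

```r
pwm <- rbind(A=c( 1,  0, 19, 20, 18,  1, 20,  7),
             C=c( 1,  0,  1,  0,  1, 18,  0,  2),
             G=c(17,  0,  0,  0,  1,  0,  0,  3),
             T=c( 1, 20,  0,  0,  0,  1,  0,  8))

# Reverse complement of a PWM: reverse the column order and swap the
# A/T and C/G rows, so that rc[X, j] == pwm[complement(X), w + 1 - j].
revcomp_pwm <- function(pwm) {
  stopifnot(identical(rownames(pwm), c("A", "C", "G", "T")))
  rc <- pwm[c("T", "G", "C", "A"), rev(seq_len(ncol(pwm))), drop = FALSE]
  rownames(rc) <- c("A", "C", "G", "T")
  rc
}

rc <- revcomp_pwm(pwm)

# Highest-weight base in each column of the reverse-complemented PWM.
rc_consensus <- paste(rownames(rc)[apply(rc, 2, which.max)], collapse = "")
print(rc_consensus)
```

The best-scoring sequence for the original matrix is GTAAACAT, and the best one for the reverse-complemented matrix is ATGTTTAC, i.e. the same site read from the other strand.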



Save R function to a text file

latex.code <- function(){



Sumatra is a tool for managing and tracking projects based on numerical simulation and/or analysis, with the aim of supporting reproducible research. It can be thought of as an automated electronic lab notebook for computational projects.

It consists of:

- a command-line interface, smt, for launching simulations/analyses with automatic recording of information about the experiment, annotating these records, linking to data files, etc.
- a web interface with a built-in web-server, smtweb, for browsing and annotating simulation/analysis results.
- a Python API, on which smt and smtweb are based, that can be used in your own scripts in place of using smt, or could be integrated into a GUI-based application.

Sumatra is currently alpha code, and should be used with caution and frequent backups of your records.



Installing Python without root.



Quick and easy submission of R on Sun Grid Engine

Easiest way to submit R jobs

Here are two scripts and a symlink I created to make it as easy as possible to submit R jobs to your Grid:


If you normally do something along the lines of:

user@exec:~$ nohup nice R CMD BATCH toodles.R

Now all you need to do is:

user@submit:~$ qsub-R toodles.R
Your job 3540 (“toodles.R”) has been submitted

qsub-R is linked to submit-R, a script I wrote. It calls qsub and submits a simple shell wrapper with the R file as an argument. It ends up in the queue and eventually your output arrives in the current directory: toodles.R.o3540

Download it and install it. You’ll need to make the ‘qsub-R’ symlink to ‘3rd_party/uoa-dos/submit-R’ yourself, although there is one in the package already for lx24-x86: qsub-R.tar (10 KiB, tar)



When Mendeley bibliography duplicates instead of refreshing.

Check whether the following option is set in Word 2007:
Office button->Word Options->Advanced->General->Confirm file format conversion on open



Remove the first line in Perl

$first_line = <>;

while (<>) {