Viewing specific lines of a huge text file
$ sed -n 101,110p /var/log/cron
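On a really large file, the command above still reads every remaining line after it has printed the range. Adding a `q` command at the last wanted line makes sed stop there. A minimal sketch, assuming a POSIX-style sed; `print_lines` is a hypothetical helper name, and the generated file only stands in for a real log:

```shell
# print_lines START END FILE: print lines START..END, then stop reading.
# (print_lines is a hypothetical helper, not a standard command.)
print_lines() {
    sed -n "${1},${2}p;${2}q" "$3"
}

# Demo on a generated 1000-line file standing in for a large log.
demo_file=$(mktemp)
seq 1 1000 > "$demo_file"
print_lines 101 110 "$demo_file"
rm -f "$demo_file"
```

The early quit matters most when the wanted lines sit near the top of the file.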
Subsetting ranged data by its values
Example
Subset reads according to their width:
> paired.read.rd
RangedData with 997111 rows and 1 value column across 1 space
space ranges | strand
|
1 chr4 [ 146855, 146898] | +
2 chr4 [1322462, 1322493] | -
3 chr4 [ 135547, 135703] | +
4 chr4 [ 965138, 965228] | -
5 chr4 [ 614464, 614606] | +
6 chr4 [ 274244, 274297] | +
7 chr4 [1191851, 1191994] | -
8 chr4 [ 310251, 310393] | +
9 chr4 [ 524981, 525273] | +
... ... ... ... ...
997103 chr4 [1071785, 1071930] | -
997104 chr4 [ 819270, 819409] | -
997105 chr4 [ 951987, 952139] | +
997106 chr4 [ 327573, 327659] | -
997107 chr4 [ 343265, 343289] | -
997108 chr4 [ 615827, 615992] | +
997109 chr4 [ 615402, 615423] | -
997110 chr4 [ 128254, 128323] | +
997111 chr4 [ 659492, 659641] | -
> paired.read.rd[width(paired.read.rd) > 100, ]
RangedData with 623327 rows and 1 value column across 1 space
space ranges | strand
|
1 chr4 [ 135547, 135703] | +
2 chr4 [ 614464, 614606] | +
3 chr4 [1191851, 1191994] | -
4 chr4 [ 310251, 310393] | +
5 chr4 [ 524981, 525273] | +
6 chr4 [1174028, 1174189] | -
7 chr4 [1174480, 1174655] | +
8 chr4 [ 869049, 869191] | -
9 chr4 [ 595415, 595565] | +
... ... ... ... ...
623319 chr4 [ 646433, 646588] | +
623320 chr4 [ 227923, 228078] | -
623321 chr4 [1204606, 1204767] | -
623322 chr4 [1013562, 1013741] | -
623323 chr4 [1071785, 1071930] | -
623324 chr4 [ 819270, 819409] | -
623325 chr4 [ 951987, 952139] | +
623326 chr4 [ 615827, 615992] | +
623327 chr4 [ 659492, 659641] | -
Or subset() works, too.
In Vim, exchange the current window with the next one.
Ctrl-w x
The Minus File
Again following the lead of the standard shell utilities, Perl’s open function treats a file whose name is a single minus, “-”, in a special way. If you open minus for reading, it really means to access the standard input. If you open minus for writing, it really means to access the standard output.
UPDATE
========
The minus file does not work with 3-argument open, which treats “-” as a literal filename.
Bryan> How can I use the “safe” 3-argument open and still be able to read off a
Bryan> pipe?
You don’t. 2-arg open has to be good for something.
And 2-arg open is perfectly safe if the second arg is a literal:
open OTHER, "<-" or die;
open my $handle, "<-" or die;
Don't let anyone tell you "Always use 3-arg open" unless they also
footnote it with "unless you have no variables involved".
--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
Smalltalk/Perl/Unix consulting, Technical writing, Comedy, etc. etc.
See http://methodsandmessages.posterous.com/ for Smalltalk discussion
Handy tips for Unix command-line tools
Example file:
123 398 17359
317 19 2909
39 -399 -5789
49 33 200
255 33 -378
sort -n file
Sort the file by first column. The -n option ensures numeric (as opposed to lexicographic) sort.
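sort -k 2 -n file
Sort the file numerically by the second column. The “-k” option selects the sort key. Strictly, “-k 2” starts the key at the second field and runs to the end of the line; “-k 2,2” restricts it to the second column alone.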
grep '33' file
Extract all lines containing the string “33” (in the above example, lines 4 and 5).
grep -c '33' file
Same, but display only the number of matching lines (2 in the example), not the lines themselves. This is useful for analyzing large files of output data. For example, if a sequence of one million integers is saved as a file, one per line, “grep -c '^0$' file” will display the number of 0’s in that sequence.
grep -c '33' *.out
Same command, but applied to all files in the current directory matching “*.out”. For each file there is an output line of the form “filename:x”, where “x” is the number of matching lines in the file.
sort -n file | cat -n
Sort the file, then prepend line numbers to each line. This results in the following:
1 39 -399 -5789
2 49 33 200
3 123 398 17359
4 255 33 -378
5 317 19 2909
http://www.math.uiuc.edu/~hildebr/computer/unixtips.html
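The sort and grep tips above can be tried end to end on the five-line example file. This is only a sketch: the temporary file and the variable names (`sample`, `by_second`, `matches`) are made up for illustration, and `-k 2,2` is used instead of `-k 2` so that the key is limited to the second column.

```shell
# Recreate the five-line example file in a temporary location.
sample=$(mktemp)
printf '%s\n' '123 398 17359' '317 19 2909' '39 -399 -5789' \
              '49 33 200' '255 33 -378' > "$sample"

# Numeric sort on the second column only (-k 2,2 limits the key to that field).
by_second=$(sort -k 2,2 -n "$sample")

# Count the lines that contain the string "33".
matches=$(grep -c '33' "$sample")

printf '%s\n' "$by_second"
echo "lines containing 33: $matches"
rm -f "$sample"
```

The line with -399 in its second column sorts first, and the count is 2, matching the grep example above.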
To import a SAM file, or any other data that contains “#” characters, with read.table, set the “comment.char” option to an empty string so that “#” is not treated as the start of a comment.
test.sam <- read.table('test.sam', comment.char = "")
Import specific columns into R using read.table
test.import <- read.table(pipe("cut -f1,2,4 data.tab"))
More discussions about the coordinate systems
http://biostar.stackexchange.com/questions/6394/what-are-the-advantages-disadvantages-of-one-based-vs-zero-based-genome-coordina
http://bergmanlab.smith.man.ac.uk/?p=36
0-based, half-open coordinate system (also called the interbase coordinate system):
  | A | C | G | T | A | C | G | T |
  0   1   2   3   4   5   6   7   8
So if you wanted to describe the first base, it would be:
chr 0 1
and GTAC:
chr 2 6
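To go from these interbase (0-based, half-open) coordinates to 1-based, closed coordinates, only the start changes: add 1 to it and keep the end. A small sketch with awk, assuming whitespace-separated “chrom start end” lines; `interbase_to_one_based` is a hypothetical helper name:

```shell
# interbase_to_one_based: read "chrom start end" lines in 0-based, half-open
# coordinates and print them as 1-based, closed coordinates.
# (Hypothetical helper name; only the start changes, the end stays the same.)
interbase_to_one_based() {
    awk '{ print $1, $2 + 1, $3 }'
}

echo "chr 0 1" | interbase_to_one_based   # first base -> chr 1 1
echo "chr 2 6" | interbase_to_one_based   # GTAC       -> chr 3 6
```

In both systems the length of the feature is easy to check: end minus start for interbase, end minus start plus 1 for 1-based closed.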
http://biostar.stackexchange.com/questions/7064/bed-coordinates
Update:
Interbase coordinates (top) and base-oriented (below) from Chado Wiki
Explains the SAM flags.
http://picard.sourceforge.net/explain-flags.html