Tag: R

  • with within and transform in R I found…

    with, within, and transform in R.
    I found these functions useful for working with data sets.
    I don't fully understand the explanation on the help page and I cannot guarantee that my explanation is correct, so I provide some examples to show their behavior.
    First of all, they need a data frame or a list, not a matrix. The three examples below each add a column that is the sum of the first two columns. Note the differences among the functions.

    • with: returns the value of the expression (here, one column)
    • within: returns the whole data with the modifications applied
    • transform: returns the whole data, but the new column is given as a tag = value argument instead of an expression.

    with

    > testwith <- data.frame(x1 = 1:10, x2 = 11:20)
    > testwith$y <- with(testwith, {x1 + x2})
    > testwith
       x1 x2  y
    1   1 11 12
    2   2 12 14
    3   3 13 16
    4   4 14 18
    5   5 15 20
    6   6 16 22
    7   7 17 24
    8   8 18 26
    9   9 19 28
    10 10 20 30
    

    within

    > testwith <- data.frame(x1 = 1:10, x2 = 11:20)
    > testwith <- within(testwith, {y <- x1 + x2})
    > testwith
       x1 x2  y
    1   1 11 12
    2   2 12 14
    3   3 13 16
    4   4 14 18
    5   5 15 20
    6   6 16 22
    7   7 17 24
    8   8 18 26
    9   9 19 28
    10 10 20 30
    

    transform

    > testwith <- data.frame(x1 = 1:10, x2 = 11:20)
    > testwith <- transform(testwith, y = x1 + x2)
    > testwith
       x1 x2  y
    1   1 11 12
    2   2 12 14
    3   3 13 16
    4   4 14 18
    5   5 15 20
    6   6 16 22
    7   7 17 24
    8   8 18 26
    9   9 19 28
    10 10 20 30
    
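    The difference in return values can be checked directly. This is a minimal sketch, not taken from the help page; the data frame `df` and the `res.*` names are chosen only for illustration.

    ```r
    df <- data.frame(x1 = 1:3, x2 = 4:6)

    # with() evaluates the expression and returns its value (here a vector)
    res.with <- with(df, x1 + x2)

    # within() evaluates the expression and returns the modified data frame
    res.within <- within(df, y <- x1 + x2)

    # transform() takes tag = value arguments and returns the modified data frame
    res.transform <- transform(df, y = x1 + x2)

    is.data.frame(res.with)   # FALSE
    names(res.within)         # "x1" "x2" "y"
    names(res.transform)      # "x1" "x2" "y"
    ```

    Because with() only returns the value, its result has to be assigned to a column explicitly, as in the first example above.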

    Some more examples at
    http://stackoverflow.com/questions/1310247/in-r-do-you-use-attach-or-call-variables-by-name-or-slicing

  • Reverse search in RStudio Type part of a…

    Reverse search in RStudio
    Type the beginning of a command and press Ctrl + Up arrow to show the matching commands from the history.

  • Use t using apply to apply a non…

    Use t() with apply() to apply a non-aggregate function to a matrix row-wise, because apply() returns the result for each row as a column (column-wise is the default order of a matrix in R).

    Example.
    I’d like to reverse the order of elements in each row.

    > test.m <- matrix(1:20, nrow = 4)
    > test.m
         [,1] [,2] [,3] [,4] [,5]
    [1,]    1    5    9   13   17
    [2,]    2    6   10   14   18
    [3,]    3    7   11   15   19
    [4,]    4    8   12   16   20
    

    apply() puts the result for each row into a column, so the output is transposed.

    > apply(test.m, 1, rev)
         [,1] [,2] [,3] [,4]
    [1,]   17   18   19   20
    [2,]   13   14   15   16
    [3,]    9   10   11   12
    [4,]    5    6    7    8
    [5,]    1    2    3    4
    

    To get the right answer, I need to transpose the result.

    > t(apply(test.m, 1, rev))
         [,1] [,2] [,3] [,4] [,5]
    [1,]   17   13    9    5    1
    [2,]   18   14   10    6    2
    [3,]   19   15   11    7    3
    [4,]   20   16   12    8    4
    

    On the other hand, if I want to reverse the order of the elements in each column, t() is not necessary.

    > apply(test.m, 2, rev)
         [,1] [,2] [,3] [,4] [,5]
    [1,]    4    8   12   16   20
    [2,]    3    7   11   15   19
    [3,]    2    6   10   14   18
    [4,]    1    5    9   13   17
    
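
    As a cross-check, the row-wise result can be compared with plain column indexing. This is a small sketch added for illustration; the name `row.rev` is not part of the original example.

    ```r
    test.m <- matrix(1:20, nrow = 4)

    # apply() over rows returns one column per row, hence a 5 x 4 matrix
    dim(apply(test.m, 1, rev))                   # 5 4

    # Transposing restores the original 4 x 5 shape
    row.rev <- t(apply(test.m, 1, rev))
    dim(row.rev)                                 # 4 5

    # Reversing each row is the same as taking the columns in reverse order
    all(row.rev == test.m[, ncol(test.m):1])     # TRUE
    ```

    For this particular task the indexing form is simpler, but the t(apply(...)) pattern works for any function that returns a vector for each row.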
  • Remove NA from data in R A

    Remove NA from data in R

    A <- na.omit(A)
    
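
    For a data frame, na.omit() drops every row that contains at least one NA. A minimal sketch with made-up data (the columns `x` and `y` are only for illustration):

    ```r
    A <- data.frame(x = c(1, NA, 3), y = c("a", "b", NA))

    # Rows 2 and 3 each contain an NA, so only row 1 is kept
    A <- na.omit(A)
    nrow(A)        # 1

    # For a vector, the NA elements themselves are removed
    v <- na.omit(c(1, NA, 3))
    length(v)      # 2
    ```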
  • Difference using names or entries in a column…

    Difference between using row names and using the entries of a column to select elements from a data frame.
    Use row names to select elements from a data frame, if possible.
    When using row names, the index does not need to be the same length as the data,
    but if a column is compared with the index, the index and the data should be the same length.

    Example

    test.df <- data.frame(cbind(letters[1:3], 1:3, 4:6))
    row.names(test.df) <- letters[1:3]
    
    # When the length of the index differs from the number of rows of the data frame,
    # only the selection by row names works.
    test.idx <- c('a', 'c')
    test.df[test.idx, ]                   # by row names: rows a and c
    test.df[test.df[[1]] == test.idx, ]   # by column: test.idx is recycled (with a warning) and row c is missed
    
    # When the length of the index equals the number of rows of the data frame, both work.
    test.idx <- c('a', 'b', 'c')
    test.df[test.idx, ]
    test.df[test.df[[1]] == test.idx, ]
    
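
    If the selection has to go through a column, one way around the length problem is %in%, which tests membership instead of comparing element by element. This is a sketch that is not part of the original note; the column names `id`, `v1` and `v2` are chosen only for illustration.

    ```r
    test.df <- data.frame(id = letters[1:3], v1 = 1:3, v2 = 4:6)
    row.names(test.df) <- test.df$id

    test.idx <- c('a', 'c')

    # Selection by row names
    by.name <- test.df[test.idx, ]

    # Selection by column entries; the index may have any length
    by.column <- test.df[test.df$id %in% test.idx, ]

    by.column$v1   # 1 3
    ```

    Note that %in% returns the rows in the order of the data frame, not in the order of the index.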
  • Different behaviors between data frame and matrix in…

    Different behaviors between data frame and matrix in R

    > # Generate an artificial matrix
    > test.m <- matrix(1:6, nrow = 3)
    > row.names(test.m) <- c('x1', 'x2', 'x3')
    > col.names(test.m) <- c('a', 'b')
    Error in col.names(test.m) <- c("a", "b") :
      could not find function "col.names<-"
    >
    > # Generate a data frame from the matrix
    > test.df <- as.data.frame(test.m)
    >
    >
    > # Selecting elements
    > ## the row names can be used to select elements from a data frame or a matrix
    > test.idx <- c('x3', 'x1')
    > test.df[test.idx, ]
       V1 V2
    x3  3  6
    x1  1  4
    > test.m[test.idx, ]
       [,1] [,2]
    x3    3    6
    x1    1    4
    >
    > # Selecting elements with an index containing a name that is not in the data
    > ## data frame returns NA rows
    > ## matrix returns an error
    > test.idx <- c('x4', 'x1')
    > test.df[test.idx, ]
       V1 V2
    NA NA NA
    x1  1  4
    > test.m[test.idx, ]
    Error: subscript out of bounds
    >
    >
    > # Duplicate row names
    > ## duplicate row names are not allowed in a data frame.
    > ## duplicate row names are allowed in a matrix.
    > test.row.names <- c('x1', 'x2', 'x1')
    > row.names(test.df) <- test.row.names
    Error in `row.names<-.data.frame`(`*tmp*`, value = c("x1", "x2", "x1")) :
      duplicate 'row.names' are not allowed
    In addition: Warning message:
    non-unique value when setting 'row.names': ‘x1’
    > rownames(test.m) <- test.row.names
    >
    > # names
    > ## names() returns the column names of a data frame
    > ## names() returns NULL for a matrix
    > names(test.df)
    [1] "V1" "V2"
    > names(test.m)
    NULL
    
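
    The error at the top of the session occurs because base R has no `col.names<-` function. The column names of a matrix are set with colnames(). A short sketch of how the names then carry over to the data frame:

    ```r
    test.m <- matrix(1:6, nrow = 3)
    rownames(test.m) <- c('x1', 'x2', 'x3')
    colnames(test.m) <- c('a', 'b')

    # The data frame inherits the row and column names of the matrix
    test.df <- as.data.frame(test.m)
    names(test.df)       # "a" "b"
    row.names(test.df)   # "x1" "x2" "x3"

    # names() on the matrix is still NULL; colnames() works for both
    names(test.m)        # NULL
    colnames(test.m)     # "a" "b"
    ```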
  • Using smooth spline in stat smooth in ggplot2…

    Using smooth.spline in stat_smooth in ggplot2

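    library(ggplot2)
    
    # Wrapper so that smooth.spline() can be called with a formula and a data frame
    smooth.spline2 <- function(formula, data, ...) {
      mat <- model.frame(formula, data)
      smooth.spline(mat[, 2], mat[, 1])  # x is the second column, y is the first
    }
    
    # Prediction method used to compute the fitted curve from the model
    predictdf.smooth.spline <- function(model, xseq, se, level) {
      pred <- predict(model, xseq)
      data.frame(x = xseq, y = pred$y)
    }
    
    qplot(mpg, wt, data = mtcars) + geom_smooth(method = "smooth.spline2", se = FALSE)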
    

    From: http://groups.google.com/group/ggplot2/browse_thread/thread/149dfa0891fe383a

  • LOESS Local polynomial regression fitting

    LOESS: local polynomial regression fitting
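
    A minimal sketch of a LOESS fit with loess() from the stats package, using the built-in `cars` data set (the data set and the prediction points are chosen only for illustration):

    ```r
    # Fit stopping distance as a smooth function of speed
    fit <- loess(dist ~ speed, data = cars)

    # Predict at a few new speeds inside the observed range
    new.speed <- data.frame(speed = c(10, 15, 20))
    pred <- predict(fit, newdata = new.speed)
    pred
    ```

    The `span` argument of loess() controls the degree of smoothing; the default is used here.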

  • How to install R packages from the source…

    How to install R packages from the source

    install.packages(file_name_and_path, repos = NULL, type="source")
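
    For example, with a source tarball on disk the call might look like the sketch below. The file name is hypothetical; replace it with the actual path of the downloaded package.

    ```r
    # Hypothetical path to a source package (.tar.gz)
    pkg.file <- file.path("~", "Downloads", "mypackage_1.0.tar.gz")

    # Install only if the file is actually there
    if (file.exists(pkg.file)) {
      install.packages(pkg.file, repos = NULL, type = "source")
    } else {
      message("File not found: ", pkg.file)
    }
    ```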
    
  • Fitting distribution in R Use fitdistr in MASS…

    Fitting distribution in R

    Use fitdistr() from the MASS package.

    Example:

    library(MASS)
    fitdistr(as.vector(res.score.m), densfun = 'normal')
    
           mean             sd      
      -8.0647926561    2.9550064789 
     ( 0.0005642286) ( 0.0003989698)
    
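
    Since `res.score.m` is not shown here, the following self-contained sketch uses simulated data instead (the sample `x` and its parameters are assumptions made for illustration). It also shows how to extract the numbers from the result.

    ```r
    library(MASS)

    set.seed(1)
    x <- rnorm(1000, mean = 5, sd = 2)

    fit <- fitdistr(x, densfun = "normal")

    fit$estimate   # named vector with the estimated mean and sd
    fit$sd         # standard errors, printed in parentheses in the output above
    ```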

    http://www.statmethods.net/advgraphs/probability.html