R - Calculate differences in a dataframe

March 15, 2015

I have a dataframe that looks like this:

    set.seed(50)
    data.frame(distance=c(rep("long", 5), rep("short", 5)),
               year=rep(2002:2006),
               mean.length=rnorm(10))

       distance year mean.length
    1      long 2002  0.54966989
    2      long 2003 -0.84160374
    3      long 2004  0.03299794
    4      long 2005  0.52414971
    5      long 2006 -1.72760411
    6     short 2002 -0.27786453
    7     short 2003  0.36082844
    8     short 2004 -0.59091244
    9     short 2005  0.97559055
    10    short 2006 -1.44574995

I need to calculate the difference in mean.length between long and short for each year. What's the fastest way of doing this?

Here's one way, using plyr:

    set.seed(50)
    df <- data.frame(distance=c(rep("long", 5), rep("short", 5)),
                     year=rep(2002:2006),
                     mean.length=rnorm(10))

    library(plyr)

    # For one year's rows, return long minus short
    aggregation.fn <- function(df) {
      data.frame(year=df$year[1],
                 diff=(df$mean.length[df$distance == "long"] -
                       df$mean.length[df$distance == "short"]))
    }

    new.df <- ddply(df, "year", aggregation.fn)

This gives you:

    > new.df
      year       diff
    1 2002  0.8275344
    2 2003 -1.2024322
    3 2004  0.6239104
    4 2005 -0.4514408
    5 2006 -0.2818542
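For readers without plyr, the same split-apply-combine idea can be sketched in base R. This is only an illustration, assuming each year has exactly one "long" and one "short" row; the names pieces, diffs and base.df are made up for this example.

```r
set.seed(50)
df <- data.frame(distance = c(rep("long", 5), rep("short", 5)),
                 year = rep(2002:2006, 2),
                 mean.length = rnorm(10))

# Split: one small data frame per year
pieces <- split(df, df$year)

# Apply: long minus short within each year's piece
diffs <- lapply(pieces, function(piece) {
  data.frame(year = piece$year[1],
             diff = piece$mean.length[piece$distance == "long"] -
                    piece$mean.length[piece$distance == "short"])
})

# Combine: stack the per-year results into one data frame
base.df <- do.call(rbind, diffs)
rownames(base.df) <- NULL
```

base.df should have one row per year with the same year and diff columns as the ddply result.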

A second way, using only base R:

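    # Sort so that each year's "long" row sits directly above its "short" row
    df <- df[order(df$year, df$distance), ]
    n <- dim(df)[1]

    # Flag the first row of each year (1 = new year, 0 = same year as previous row)
    df$new.year <- c(1, df$year[2:n] != df$year[1:(n-1)])

    # Each row minus the next row; the last row has no successor
    df$diff <- c(-diff(df$mean.length), NA)

    # Keep only the long-minus-short differences within a year
    df$diff[!df$new.year] <- NA
    new.df.2 <- df[!is.na(df$diff), c("year", "diff")]

    all(new.df.2 == new.df)  # TRUE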
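Another possibility, not from the answers above, is to put the long and short values side by side with merge() and subtract the columns. This is a rough sketch assuming one row per distance per year; long, short, wide and new.df.3 are illustrative names only.

```r
set.seed(50)
df <- data.frame(distance = c(rep("long", 5), rep("short", 5)),
                 year = rep(2002:2006, 2),
                 mean.length = rnorm(10))

# Separate the two groups, keeping only the columns needed
long  <- df[df$distance == "long",  c("year", "mean.length")]
short <- df[df$distance == "short", c("year", "mean.length")]

# Align them by year; the suffixes tell the two mean.length columns apart
wide <- merge(long, short, by = "year", suffixes = c(".long", ".short"))

# Long minus short, one row per year
new.df.3 <- data.frame(year = wide$year,
                       diff = wide$mean.length.long - wide$mean.length.short)
```

Because merge() matches rows on year, this does not depend on the row order of df, and a year that is missing one of the two distances is simply dropped from the result.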
