R - Calculate differences in a data frame
I have a data frame that looks like this:
    set.seed(50)
    data.frame(distance=c(rep("long", 5), rep("short", 5)),
               year=rep(2002:2006),
               mean.length=rnorm(10))

       distance year mean.length
    1      long 2002  0.54966989
    2      long 2003 -0.84160374
    3      long 2004  0.03299794
    4      long 2005  0.52414971
    5      long 2006 -1.72760411
    6     short 2002 -0.27786453
    7     short 2003  0.36082844
    8     short 2004 -0.59091244
    9     short 2005  0.97559055
    10    short 2006 -1.44574995
I need to calculate the difference in mean.length between long and short in each year. What's the fastest way of doing this?
Here's one way using plyr:
    set.seed(50)
    df <- data.frame(distance=c(rep("long", 5), rep("short", 5)),
                     year=rep(2002:2006),
                     mean.length=rnorm(10))

    library(plyr)

    # For the rows of a single year, return the long-minus-short difference.
    aggregation.fn <- function(df) {
      data.frame(year=df$year[1],
                 diff=(df$mean.length[df$distance == "long"] -
                       df$mean.length[df$distance == "short"]))
    }

    new.df <- ddply(df, "year", aggregation.fn)
This gives you:

    > new.df
      year       diff
    1 2002  0.8275344
    2 2003 -1.2024322
    3 2004  0.6239104
    4 2005 -0.4514408
    5 2006 -0.2818542
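A second way, using only base R:

    # Sort so that each year's "long" row sits directly above its "short" row.
    df <- df[order(df$year, df$distance), ]
    n <- dim(df)[1]
    # Flag the first row of each year (1 = first row, 0 = otherwise).
    df$new.year <- c(1, df$year[2:n] != df$year[1:(n-1)])
    # -diff() gives each row minus the next row, i.e. long minus short.
    df$diff <- c(-diff(df$mean.length), NA)
    # Discard the differences that straddle two different years.
    df$diff[!df$new.year] <- NA
    new.df.2 <- df[!is.na(df$diff), c("year", "diff")]
    all(new.df.2 == new.df)  # TRUE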
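Both answers above rely on each year having exactly one long row and one short row. Under that same assumption, a third possibility is to split the data by distance and join the two halves on year with base R's merge(). This is only a rough sketch, not a benchmarked answer to "fastest"; the names long, short, wide and new.df.3 are made up for illustration.

```r
# Rebuild the example data from the question.
set.seed(50)
df <- data.frame(distance = c(rep("long", 5), rep("short", 5)),
                 year = rep(2002:2006, times = 2),
                 mean.length = rnorm(10))

# Split into the two groups, keeping only the columns needed for the join.
long  <- df[df$distance == "long",  c("year", "mean.length")]
short <- df[df$distance == "short", c("year", "mean.length")]

# Join on year; the suffixes disambiguate the two mean.length columns.
wide <- merge(long, short, by = "year", suffixes = c(".long", ".short"))

# Long minus short for each year.
new.df.3 <- data.frame(year = wide$year,
                       diff = wide$mean.length.long - wide$mean.length.short)
```

Because the join is keyed on year, this version does not depend on the row order of df, and a year that is missing one of the two distances simply drops out of the result instead of producing a misaligned difference.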