R - Subtracting groupwise means from columns using either plyr or matrix algebra
I'm trying to write parallelizable code (exploiting plyr and doMC) to calculate and subtract groupwise means from the columns of a data frame. I'm having a hard time getting the plyr syntax correct.
Here is a script that works using a for-loop:
    library(doBy)  # provides summaryBy
    data = data.frame(x = rnorm(100), y = rnorm(100), id = round(runif(100)*10))
    data = data[with(data, order(id)),]
    # one output column per non-id column of data
    dm = matrix(rep(NA, nrow(data)*(ncol(data)-1)), nrow(data), (ncol(data)-1))
    for (i in 1:(ncol(data)-1)){
      m = summaryBy(data[,i]~id, data=data, FUN=mean)  # groupwise means
      d = data.frame(data[,i], id=data$id)
      a = merge(d, m, by="id")
      dm[,i] = a[,2]-a[,3]  # value minus its group mean
    }
But when I try to break it up by the column names of the data using ddply, it gives me an error message. Here is the non-working code:
    dmf = function(i){
      m = summaryBy(data[,i]~id, data=data, FUN=mean)
      d = data.frame(data[,i], id=data$id)
      a = merge(d, m, by="id")
      dm = a[,2]-a[,3]
      as.data.frame(dm)
    }
    dm = ddply(.data=data, .fun = dmf, .variables = colnames(data))
    > Error in .subset(x, j) : invalid subscript type 'list'
Does anybody have a solution for this?
Alternatively, if this is doable with matrices, I'd appreciate that sort of solution, since it would improve my matrix intuition.
To take full advantage of plyr, you can combine colwise with the base function scale. Also, if needed, let ddply handle the parallelization at the highest level (this requires a registered parallel backend such as doMC):
    dm <- ddply(data, "id", colwise(scale, center = TRUE, scale = FALSE), .parallel = TRUE)
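For the matrix-algebra route the question asks about, here is a minimal sketch in base R only. It is not from the original posts: the seed and the names X, Z, group_means, dm_matrix and dm_ave are illustrative assumptions. The idea is to build an indicator matrix Z with one column per id, so that Z'Z is diagonal with the group sizes and solving Z'Z M = Z'X gives the matrix M of group means. Subtracting Z M from X then removes each row's group mean.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible
data <- data.frame(x = rnorm(100), y = rnorm(100), id = round(runif(100) * 10))
data <- data[order(data$id), ]

# Numeric columns to be demeaned
X <- as.matrix(data[, c("x", "y")])

# Indicator (dummy) matrix: one column per group, a single 1 in each row
Z <- model.matrix(~ factor(id) - 1, data = data)

# crossprod(Z) = Z'Z is diagonal with the group sizes and crossprod(Z, X) = Z'X
# holds the group sums, so this solve() returns one row of means per group
group_means <- solve(crossprod(Z), crossprod(Z, X))

# Z %*% group_means expands the means back to one row per observation
dm_matrix <- X - Z %*% group_means

# Cross-check against base R's ave(), which returns the group mean for each row
dm_ave <- sapply(data[, c("x", "y")], function(col) col - ave(col, data$id))
stopifnot(isTRUE(all.equal(unname(dm_matrix), unname(dm_ave))))
```

Because Z only encodes group membership, the same two lines work for any number of numeric columns. The rows of dm_matrix stay in the order of data, so it can be bound back to the id column directly.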