r - Finding a sensible range -

June 15, 2015

I've been struggling with this for a few days. This is my 3rd question on Stack Overflow on the same topic; I hope this time the question is better defined.

My data is distributed like this:

[Figure: histogram of true data]

The x-axis corresponds to the range of probabilities: 0 to 1.

I want to assign states, from state 1 to state 10, sensibly across the probability range.

This is what I have got:

interval <- round(quantile(datag$probability, seq(0, 1, by = 0.10)), 3)

Output:

   0%   10%   20%   30%   40%   50%   60%   70%   80%   90%  100%
0.000 0.008 0.015 0.024 0.036 0.054 0.080 0.124 0.209 0.397 1.000

Assign states from 0 to 10:

states <- data.frame(datag, state = findInterval(datag$probability, interval))
head(states)

Output of states:

probability      state
0.20585012         8
0.21202839         9
0.07087725         6
0.7109513         10
0.9641807         10

The problem is this: as you can see above, I have state 9 for probability 0.2120, and state 10 for probabilities > 0.710. I would be happy if prob=0.2120 were state 4, prob=0.710 state 7, and prob=0.96 state 10.

So how do I assign states more uniformly?

To replicate datag:
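A minimal sketch of why the quantile-based breaks behave this way: decile breaks put roughly 10% of the observations into each state, so on a right-skewed sample most of the states are squeezed into the low probabilities. The seed, the simulated sample and the names `p`, `breaks_q` and `state_q` are assumptions for illustration, not the original data.

```r
set.seed(1)                                  # arbitrary seed, for reproducibility only
p <- rgamma(10000, shape = 0.6, rate = 4.8)  # skewed sample (assumed stand-in for the real data)
p <- p[p <= 1]                               # keep valid probabilities only

breaks_q <- quantile(p, seq(0, 1, by = 0.1)) # decile breaks
# rightmost.closed = TRUE puts the maximum into state 10 instead of state 11
state_q <- findInterval(p, breaks_q, rightmost.closed = TRUE)

table(state_q)             # roughly equal counts in states 1 to 10
round(diff(breaks_q), 3)   # but very unequal interval widths
```

Equal counts per state and equal interval widths are different goals; quantile breaks deliver the first, not the second.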

datag <- data.frame(probability=rgamma(10000, shape=0.6, rate=4.8, scale=1/4.8))

Edit: @roman:

datag <- subset(datag, probability<=1)

Edit: @simon:

Yes, I'm aware of "cut":

table(cut(datag$probability, breaks = seq(0, 0.8, by = 0.1)))

Output:

  (0,0.1] (0.1,0.2] (0.2,0.3] (0.3,0.4] (0.4,0.5] (0.5,0.6] (0.6,0.7] (0.7,0.8]
   125545     26625     12795      8126      5556      4108      3227      2606

How does one define the breaks? What I'm after are the intervals (the breaks themselves), so that I can assign states corresponding to the interval each probability falls in.

You've got the answer in your OP! Don't take this the wrong way, but I think you need to spend more time reading the documentation for ?cut! If you set labels = FALSE in cut, you get the integer codes of the interval each value falls in.

# Set a seed for true reproducibility!
set.seed(1)
datag <- data.frame(probability=rgamma(10000, shape=0.6, rate=4.8, scale=1/4.8))

# labels = FALSE returns the integer code of each interval instead of a factor
int <- cut( datag$probability , breaks = seq( 0 , 1 , by = 0.1 ) , labels = FALSE )
head( cbind( prob = datag$probability , int ) )

            prob int
[1,] 0.031860645   1
[2,] 0.455054687   5
[3,] 0.134175238   2
[4,] 0.058957301   1
[5,] 0.855493999   9
[6,] 0.009144936   1
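As a rough sketch, the same equal-width breaks can be applied to the five probabilities quoted in the question. The vector `p` is copied from the question's output; `include.lowest = TRUE` is an added assumption so that an exact 0 would land in state 1 rather than becoming NA.

```r
# Probabilities quoted in the question's output
p <- c(0.20585012, 0.21202839, 0.07087725, 0.7109513, 0.9641807)

# Ten equal-width intervals on [0, 1], returned as integer codes 1 to 10
state <- cut(p, breaks = seq(0, 1, by = 0.1),
             labels = FALSE, include.lowest = TRUE)

data.frame(probability = p, state = state)
```

With these breaks 0.2120 lands in state 3, 0.7110 in state 8 and 0.9642 in state 10. Any value above 1 would get NA, which is why the subset to probability <= 1 from the question's edit is still needed.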
