python - Remove all elements which occur in less than 1% and more than 60% of the list


If I have a list of strings:

['fsuy3,fsddj4,fsdg3,hfdh6,gfdgd6,gfdf5', 'fsuy3,fsuy3,fdfs4,sdgsdj4,fhfh4,sds22,hhgj6,xfsd4a,asr3']  

(a big list)

How can I remove the words that occur in less than 1% or more than 60% of the strings?

You can use collections.Counter:

from collections import Counter
counts = Counter(mylist)

And then:

newlist = [s for s in mylist if 0.01 < counts[s]/len(mylist) < 0.60]

(In Python 2.x, use float(counts[s])/len(mylist) to avoid integer division.)
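To see the whole-element version run end to end, here is a minimal sketch on a toy list. The function name `filter_by_frequency`, its `low`/`high` parameters, and the sample data are made up for illustration; the comparison is the same strict one used above.

```python
from collections import Counter

def filter_by_frequency(items, low=0.01, high=0.60):
    """Keep items whose share of the list lies strictly between low and high.

    Hypothetical helper for illustration; low and high are fractions of len(items).
    """
    counts = Counter(items)   # number of occurrences of each element
    total = len(items)
    return [s for s in items if low < counts[s] / total < high]

# Toy data: 'a' makes up 70% of the list, 'b' 20%, 'c' 10%.
mylist = ["a"] * 7 + ["b"] * 2 + ["c"]

filtered = filter_by_frequency(mylist)             # drops 'a' (above 60%)
stricter = filter_by_frequency(mylist, low=0.15)   # also drops 'c' (below 15%)
print(filtered)
print(stricter)
```

With only ten elements nothing can fall below 1%, which is why the second call raises `low` to show the lower cutoff at work.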


If you're talking about comma-separated words, you can use a similar approach:

words = [l.split(',') for l in mylist]
counts = Counter(word for l in words for word in l)
newlist = [[s for s in l if 0.01 < counts[s]/len(mylist) < 0.60] for l in words]
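Note that the snippet above counts every occurrence of a word, so a word repeated inside one string (like fsuy3 in the second example string) is counted twice. If "occurs in X% of the strings" should mean the share of strings containing the word, one possible variant is sketched below. The name `filter_words`, the inclusive bounds, and the sample data are assumptions for illustration, not part of the original answer.

```python
from collections import Counter

def filter_words(strings, low=0.01, high=0.60):
    """Drop words found in fewer than `low` or more than `high` of the strings.

    Hypothetical helper: bounds are inclusive, i.e. a word at exactly 1% or 60%
    is kept, which is one reading of the question.
    """
    rows = [s.split(',') for s in strings]
    # set(row) makes a word count once per string, however often it repeats there
    doc_counts = Counter(word for row in rows for word in set(row))
    total = len(rows)
    return [[w for w in row if low <= doc_counts[w] / total <= high]
            for row in rows]

# Toy data: 'a' is in 3 of 4 strings (75%), 'b' and 'd' in 2 (50%), 'c' in 1 (25%).
strings = ["a,b,a", "a,c", "a,d", "b,d"]

result = filter_words(strings)                  # 'a' is removed everywhere
result_low = filter_words(strings, low=0.3)     # 'c' is removed as well
print(result)
print(result_low)
```

Each row comes back as a list of words; `','.join(row)` turns it back into a comma-separated string if that is the form you need.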
