R - How to delete rows from a data frame that contain n*NA


I have a number of large datasets with ~10 columns and ~200000 rows. Not all columns contain values for each row, although at least one column must contain a value for the row to be present. I would like to set a threshold for how many NAs are allowed in a row.

My data frame looks like this:

    ID q  r  s  t  u  v  w  x  y  z
    A  1  5  NA 3  8  9  NA 8  6  4
    B  5  NA 4  6  1  9  7  4  9  3
    C  NA 9  4  NA 4  8  4  NA 5  NA
    D  2  2  6  8  4  NA 3  7  1  32

I would like to be able to delete the rows that contain more than 2 cells containing NA, to get:

    ID q  r  s  t  u  v  w  x  y  z
    A  1  5  NA 3  8  9  NA 8  6  4
    B  5  NA 4  6  1  9  7  4  9  3
    D  2  2  6  8  4  NA 3  7  1  32

complete.cases removes all rows containing any NA, and I know one can delete rows that contain NA in certain columns, but is there a way to modify it so that it does not matter which columns contain NA, only how many of the total do?

Alternatively, this data frame is generated by merging several data frames using

    file1 <- read.delim("~/file1.txt")
    file2 <- read.delim(file=args[1])

    file1 <- merge(file1, file2, by="chr.pos", all=TRUE)

Perhaps the merge function could be altered?

Thanks

Use rowSums. To remove rows from a data frame (df) that contain precisely n NA values:

    df <- df[rowSums(is.na(df)) != n, ]

Or to remove rows that contain n or more NA values:

    df <- df[rowSums(is.na(df)) < n, ]

In both cases, of course, replace n with the number that's required.
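As a rough illustration, here is the same idea applied to the example data from the question. It is only a sketch: the data frame is retyped by hand, and the names `max_na`, `na_count` and `kept` are made up for this example. Note that "more than 2 NAs removed" means keeping rows with at most 2, which is the same as the `< n` form with n = 3.

```r
# Example data retyped from the question (assumed to match the real file layout).
df <- data.frame(
  ID = c("A", "B", "C", "D"),
  q = c(1, 5, NA, 2),
  r = c(5, NA, 9, 2),
  s = c(NA, 4, 4, 6),
  t = c(3, 6, NA, 8),
  u = c(8, 1, 4, 4),
  v = c(9, 9, 8, NA),
  w = c(NA, 7, 4, 3),
  x = c(8, 4, NA, 7),
  y = c(6, 9, 5, 1),
  z = c(4, 3, NA, 32)
)

# Hypothetical threshold: the largest number of NAs a row may have and still be kept.
max_na <- 2

# is.na(df) gives a logical matrix; rowSums counts the TRUE values in each row.
na_count <- rowSums(is.na(df))

# Keep only the rows at or below the threshold.
kept <- df[na_count <= max_na, ]
print(kept)
```

With this data, row C has four NAs and is the only row dropped. The same filter could be applied to the result of the `merge` call in the question, since it does not depend on which columns hold the NAs.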

