select - R: obtaining a subset of a column that matches certain criteria

March 15, 2015

Let's say I have a table of data on the students in a school. I want to look at the family size of the students who are male (1) and at least considered "tall". How do I do this in R?

I can figure out how to get the column of family sizes of all students, student_data$family_size, but I can't figure out how to narrow it down further.

   family_size  ...  gender ... height
1       6              1         tall
2       3              0         tall
3       5              1         tall
4       4              1         tall
5      10              0         short
6       2              1         average

So I want:

     family_size
1       6
2       5
3       4

I'm not sure how the indexing would turn out. Maybe it corresponds to the original indexing of the first table, but that's not important.

Also, I'm not sure whether I've loaded the data as a data frame or not: when I execute typeof(student_data), it returns "list".

We can use subset. It has a subset argument and a select argument: pass a logical index to subset the rows, and select columns by column index or name, respectively.

The OP wants to extract the rows that have 'male' gender, i.e. those represented by 1 in the binary column. So gender==1 gives a logical vector that is TRUE where the value is 1 and FALSE for the other values (0 here). The second condition is to check for rows that have the substring 'tall' in the 'height' column; use grepl to match it. Couple both conditions with &, and select the column 'family_size'.

subset(df1, gender==1 & grepl('tall', height), select= family_size)
#   family_size
#1           6
#3           5
#4           4
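To see what the two conditions evaluate to before they are combined, here is a small sketch. It rebuilds the example data with data.frame() so that it runs on its own; the names is_male, is_tall and keep are only illustrative.

```r
# Example data, same values as df1 in the "data" section
df1 <- data.frame(
  family_size = c(6L, 3L, 5L, 4L, 10L, 2L),
  gender = c(1L, 0L, 1L, 1L, 0L, 1L),
  height = c("very tall", "tall", "tall", "tall", "very short", "average"),
  stringsAsFactors = FALSE
)

# TRUE where gender is 1 (male)
is_male <- df1$gender == 1

# TRUE where the height string contains the substring "tall"
is_tall <- grepl("tall", df1$height)

# Element-wise AND: TRUE only for rows that satisfy both conditions
keep <- is_male & is_tall
keep
# [1]  TRUE FALSE  TRUE  TRUE FALSE FALSE
```

Note that grepl matches substrings, so "very tall" counts as tall here. If the column only ever holds exact labels, a plain comparison such as height == "tall" would be a stricter alternative.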

Or use [ instead of subset. [ is the recommended option for use inside functions. Its default is drop=TRUE, so if we are subsetting a single column, we might end up with a vector. To avoid that, we can use drop=FALSE.

df1[with(df1, gender==1 & grepl('tall', height)), 'family_size', drop=FALSE]
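The effect of drop can be checked directly. This is a minimal sketch on the same example data, rebuilt here so it runs on its own; idx, as_vector and as_df are illustrative names.

```r
# Example data, same values as df1 in the "data" section
df1 <- data.frame(
  family_size = c(6L, 3L, 5L, 4L, 10L, 2L),
  gender = c(1L, 0L, 1L, 1L, 0L, 1L),
  height = c("very tall", "tall", "tall", "tall", "very short", "average"),
  stringsAsFactors = FALSE
)

# Logical row index: male and "tall" somewhere in the height label
idx <- df1$gender == 1 & grepl("tall", df1$height)

# Default drop = TRUE: a single column collapses to a plain vector
as_vector <- df1[idx, "family_size"]
is.data.frame(as_vector)
# [1] FALSE

# drop = FALSE: the result stays a one-column data frame,
# and the original row names (1, 3, 4) are kept
as_df <- df1[idx, "family_size", drop = FALSE]
is.data.frame(as_df)
# [1] TRUE
```

Which form is preferable depends on what comes next: the vector is convenient for something like mean(), while the data frame keeps the column name and row names.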

data

df1 <- structure(list(family_size = c(6L, 3L, 5L, 4L, 10L, 2L),
    gender = c(1L, 0L, 1L, 1L, 0L, 1L),
    height = c("very tall", "tall", "tall", "tall", "very short", "average")),
    .Names = c("family_size", "gender", "height"),
    class = "data.frame",
    row.names = c("1", "2", "3", "4", "5", "6"))
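As for the side question about typeof(student_data) returning "list": that alone does not mean the data failed to load as a data frame. A data frame is stored internally as a list of columns, so typeof reports "list" for it. class or is.data.frame is the more telling check, as this short sketch on the example data shows (it assumes student_data behaves like df1).

```r
# Example data, same values as df1 in the "data" section
df1 <- data.frame(
  family_size = c(6L, 3L, 5L, 4L, 10L, 2L),
  gender = c(1L, 0L, 1L, 1L, 0L, 1L),
  height = c("very tall", "tall", "tall", "tall", "very short", "average"),
  stringsAsFactors = FALSE
)

# Internal storage type: a data frame is a list of columns
typeof(df1)
# [1] "list"

# The class attribute is what marks it as a data frame
class(df1)
# [1] "data.frame"

is.data.frame(df1)
# [1] TRUE
```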
