r - How to separate a column in dplyr based on regex

July 15, 2013

I have the following data frame:

df <- structure(
  list(x2 = c("bb_137.hvmsc", "bb_138.combined.hvmsc",
              "bb_139.combined.hvmsc", "bb_140.combined.hvmsc",
              "bb_141.hvmsc", "bb_142.combined.hmsc-bm")),
  .Names = "x2",
  row.names = c(NA, -6L),
  class = c("tbl_df", "tbl", "data.frame"))

which looks like this:

> df
# A tibble: 6 x 1
                       x2
                    <chr>
1            bb_137.hvmsc
2   bb_138.combined.hvmsc
3   bb_139.combined.hvmsc
4   bb_140.combined.hvmsc
5            bb_141.hvmsc
6 bb_142.combined.hmsc-bm

What I want to do is separate it into 2 columns (with . as the separator), keeping only the last field in the second column:

             col1    col2
           bb_137   hvmsc
  bb_138.combined   hvmsc
  bb_139.combined   hvmsc
  bb_140.combined   hvmsc
           bb_141   hvmsc
  bb_142.combined hmsc-bm

What's the right way to do it?

My attempt is this:

> df %>% separate(x2, into = c("sid", "status", "tiss"), sep = "[.]")
# A tibble: 6 x 3
     sid   status    tiss
*  <chr>    <chr>   <chr>
1 bb_137    hvmsc    <NA>
2 bb_138 combined   hvmsc
3 bb_139 combined   hvmsc
4 bb_140 combined   hvmsc
5 bb_141    hvmsc    <NA>
6 bb_142 combined hmsc-bm

Warning message:
Too few values at 2 locations: 1, 5

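We can use a regex with a negative lookahead as the separator in the separate function. The pattern matches a dot only when no other dot follows it, so each value is split at its last dot.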

library(tidyr)
separate(data = df, col = x2, into = c("col1", "col2"), sep = "(\\.)(?!.*\\.)")

#             col1    col2
#            <chr>   <chr>
#1          bb_137   hvmsc
#2 bb_138.combined   hvmsc
#3 bb_139.combined   hvmsc
#4 bb_140.combined   hvmsc
#5          bb_141   hvmsc
#6 bb_142.combined hmsc-bm

The regex is taken from this answer.
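
To see what the lookahead does without tidyr, the same pattern can be tried with base R's strsplit. This is only a minimal sketch: the vector x holds three values copied from the question, and the names last_dot, parts and res are made up for the illustration. In base R, lookaheads need perl = TRUE.

```r
# Three sample values copied from the question's data frame.
x <- c("bb_137.hvmsc", "bb_138.combined.hvmsc", "bb_142.combined.hmsc-bm")

# A dot that is not followed by any later dot, i.e. the last dot.
# Lookaheads are a PCRE feature, hence perl = TRUE below.
last_dot <- "\\.(?!.*\\.)"
parts <- strsplit(x, last_dot, perl = TRUE)

# Each element of parts has two pieces; bind them into two columns.
res <- as.data.frame(do.call(rbind, parts), stringsAsFactors = FALSE)
names(res) <- c("col1", "col2")
print(res)
```

Here bb_138.combined.hvmsc should split into bb_138.combined and hvmsc, because the first dot is still followed by another dot and is therefore skipped.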
