python - Extract integral parts from strings in dataframe columns of lists
If I have column data such as:

                              value
    1           [a_1, a_342, a_452]
    2                   [a_5, a_99]
    3  [a_482, a_342, a_452, a_888]

I need to trim the column to:

                      value
    1         [1, 342, 452]
    2               [5, 99]
    3  [482, 342, 452, 888]

Basically, I want to remove the a_ prefix and make each entry of the column a list of integers.
I tried using the replace and map functions from the pandas package, but neither of them works.
For a column with a single string per row, such as:

      value
    1   a_1
    2   a_5
    3  a_99

I can use df['value'] = df['value'].str[2:].astype(int); however, that doesn't work for the lists of strings above.
I would really appreciate any suggestions. Thanks in advance.
Option 1
To make life easy, convert the column to str, use str.replace to strip the prefix, and then apply ast.literal_eval on the result.

    import ast

    # Stringify each list, drop the "a_" prefix, then parse the text back into a list of ints
    df['value'] = df['value'].astype(str).str.replace('a_', '')\
        .apply(lambda x: [int(y) for y in ast.literal_eval(x)])
    df

                      value
    1         [1, 342, 452]
    2               [5, 99]
    3  [482, 342, 452, 888]
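To try Option 1 end to end, here is a minimal sketch. It assumes each cell holds a Python list of strings; the construction of the sample DataFrame and the intermediate name as_text are illustrative assumptions, since the question only shows the printed column.

```python
import ast

import pandas as pd

# Assumed sample data: each cell is a Python list of strings (not shown in the question).
df = pd.DataFrame(
    {"value": [["a_1", "a_342", "a_452"],
               ["a_5", "a_99"],
               ["a_482", "a_342", "a_452", "a_888"]]},
    index=[1, 2, 3],
)

# Stringify each list and strip the literal "a_" prefix,
# e.g. "['a_1', 'a_342', 'a_452']" becomes "['1', '342', '452']".
as_text = df["value"].astype(str).str.replace("a_", "", regex=False)

# Parse the text back into a list and convert every element to int.
df["value"] = as_text.apply(lambda s: [int(y) for y in ast.literal_eval(s)])

print(df["value"].tolist())
# [[1, 342, 452], [5, 99], [482, 342, 452, 888]]
```

Passing regex=False makes the literal match explicit, so the result should not depend on the default of str.replace in the installed pandas version.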
Option 2
Using str.extractall

    # Pull out every run of digits, pivot the matches into columns, then rebuild one list per row
    df['value'] = df['value'].astype(str).str.extractall(r'(\d+)').unstack()\
        .apply(lambda x: list(x.dropna().astype(int)), axis=1)
    df

                      value
    1         [1, 342, 452]
    2               [5, 99]
    3  [482, 342, 452, 888]

    df['value'].tolist()
    [[1, 342, 452], [5, 99], [482, 342, 452, 888]]
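The intermediate shapes in Option 2 can be hard to picture, so here is a minimal sketch that splits the chain into steps. The sample DataFrame and the names matches and wide are assumptions made for illustration; the question only shows the printed column.

```python
import pandas as pd

# Assumed sample data: each cell is a Python list of strings (not shown in the question).
df = pd.DataFrame(
    {"value": [["a_1", "a_342", "a_452"],
               ["a_5", "a_99"],
               ["a_482", "a_342", "a_452", "a_888"]]},
    index=[1, 2, 3],
)

# One row per regex match, indexed by (original row label, match number).
matches = df["value"].astype(str).str.extractall(r"(\d+)")

# Pivot the match number into columns; shorter rows are padded with missing values.
wide = matches[0].unstack()

# Drop the padding in each row and collect the remaining digits as ints.
df["value"] = wide.apply(lambda row: row.dropna().astype(int).tolist(), axis=1)

print(df["value"].tolist())
# [[1, 342, 452], [5, 99], [482, 342, 452, 888]]
```

One caveat worth keeping in mind: extractall omits rows that have no match at all, so a row without any digits would come back as a missing value after the assignment rather than as an empty list.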