python - Using `itertools.groupby()` to get lists of runs of strings that start with `A`?
The (abstracted) problem is this: I have a log file
    a: 1
    a: 2
    a: 3
    b: 4
    b: 5
    a: 6
    c: 7
    d: 8
    a: 9
    a: 10
    a: 11
and I want to end up with a list of lists like this:
[["1", "2", "3"], ["6"], ["9", "10", "11"]]
where the file has been broken up into "runs" of strings starting with `a`. I know that I can use `itertools.groupby` to solve this, and right now I have this solution (where `f` is a list of the lines in the file):
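    from itertools import groupby
    import re
    starts_with_a = lambda x: x.startswith("a")
    # Each group is a one-shot iterator, so store it as a list before moving on.
    coalesced = [list(g) for _, g in groupby(f, key=starts_with_a)]
    runs = [[re.sub(r'a: ', '', s) for s in g] for g in coalesced if starts_with_a(g[0])]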
So I use `groupby`, but then I have to filter out the stuff that doesn't start with "a". This is okay, and pretty terse, but is there a more elegant way to do it? I'd love a way that:
- doesn't require 2 passes
- is terser (and/or) more readable
Help me harness the might of `itertools`!
Yes, filter out the lines that don't start with `a`, but use the key produced by `groupby()` for each group returned. It's the return value of the key function, so it'll be `True` for those lines that do start with `a`. I'd use `str.partition()` here instead of a regular expression:
    coalesce = (g for key, g in groupby(f, key=lambda x: x[:1] == "a") if key)
    runs = [[res.partition(':')[-1].strip() for res in group] for group in coalesce]
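To make the two building blocks concrete, here is a minimal sketch of what `groupby()` yields with that key function, and what `str.partition()` returns for one line. The `lines` list and the `pairs` name are made up for illustration and only assumed to resemble the log in the question.

```python
from itertools import groupby

# Hypothetical sample lines, assumed to resemble the log in the question.
lines = ["a: 1", "a: 2", "b: 4", "a: 6"]

# Each item from groupby() is a (key, group) pair; the key is whatever the
# key function returned for that run of consecutive lines.
pairs = [
    (key, list(group))
    for key, group in groupby(lines, key=lambda x: x[:1] == "a")
]
print(pairs)
# [(True, ['a: 1', 'a: 2']), (False, ['b: 4']), (True, ['a: 6'])]

# str.partition() splits at the first separator into (head, sep, tail).
print("a: 1".partition(":"))
# ('a', ':', ' 1')
```

Keeping only the pairs whose key is `True`, and taking the stripped last element of each partition, gives the runs.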
Since the `str.startswith()` argument is a fixed-width string literal, you may as well use slicing; `x[:1]` slices off the first character and compares it to `'a'`, which gives you the same test as `x.startswith('a')`.
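A quick check of that equivalence, as a sketch: the `samples` list is invented for illustration and includes an empty string, where `x[:1]` simply yields `''` instead of raising the way `x[0]` would.

```python
# Hypothetical sample strings, including an empty line and an uppercase "A".
samples = ["a: 1", "b: 4", "", "apple", "A: 9"]

for s in samples:
    # Slicing never raises, even on an empty string (s[0] would).
    assert (s[:1] == "a") == s.startswith("a")

results = [s[:1] == "a" for s in samples]
print(results)
# [True, False, False, True, False]
```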
I used a generator expression to group the `groupby()` filtering; you could just inline that into the one list comprehension:
    runs = [[res.partition(':')[-1].strip() for res in group] for key, group in groupby(f, key=lambda x: x[:1] == "a") if key]
Demo:
    >>> from itertools import groupby
    >>> f = '''\
    ... a: 1
    ... a: 2
    ... a: 3
    ... b: 4
    ... b: 5
    ... a: 6
    ... c: 7
    ... d: 8
    ... a: 9
    ... a: 10
    ... a: 11
    ... '''.splitlines(True)
    >>> coalesce = (g for key, g in groupby(f, key=lambda x: x[:1] == "a") if key)
    >>> [[res.partition(':')[-1].strip() for res in group] for group in coalesce]
    [['1', '2', '3'], ['6'], ['9', '10', '11']]
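If the same pattern is needed in more than one place, it could be wrapped in a small generator function. This is only a sketch: the name `runs_with_prefix` and its `prefix` and `sep` parameters are hypothetical, not part of `itertools`. It makes a single pass over any iterable of lines, so an open file object should work as well as a list.

```python
from itertools import groupby


def runs_with_prefix(lines, prefix="a", sep=":"):
    """Yield one list of values per run of consecutive lines starting with prefix.

    Hypothetical helper for illustration; consumes `lines` in a single pass.
    """
    for key, group in groupby(lines, key=lambda line: line.startswith(prefix)):
        if key:  # key is True only for runs whose lines start with prefix
            yield [line.partition(sep)[-1].strip() for line in group]


# Sample data copied from the log in the question, with line endings kept.
log = [
    "a: 1\n", "a: 2\n", "a: 3\n", "b: 4\n", "b: 5\n", "a: 6\n",
    "c: 7\n", "d: 8\n", "a: 9\n", "a: 10\n", "a: 11\n",
]
runs = list(runs_with_prefix(log))
print(runs)
# [['1', '2', '3'], ['6'], ['9', '10', '11']]
```

Each group is consumed inside the loop body before `groupby()` advances, so nothing has to be stored beyond the current run.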