python - How to grep for unicode characters? -
how search file using grep string of unicode characters?
i'm trying count number of occurrences of string "\xfe\n\xfe". can find python doing:
open(filename).read().count('\xfe\n\xfe')
this finds few thousand matches.
however, since loads entire file memory, crash if try search file larger system's memory.
so i'm trying equivalent grep via:
grep -p -c "\xfe\n\xfe" filename
it consumes 0 memory, great, though run on same file, finds 0 matches. what's wrong syntax?
you don't need read entire file memory. can iterate on file , count occurrence of string across lines taking pair of lines @ every instant:
count = 0 open(filename) f: prev_line = next(f) line in f: if prev_line.endswith('\xfe\n') , line.startswith('\xfe'): count += 1 prev_line = line
Comments
Post a Comment