python - How to grep for unicode characters? -

April 15, 2011

how search file using grep string of unicode characters?

i'm trying count number of occurrences of string "\xfe\n\xfe". can find python doing:

open(filename).read().count('\xfe\n\xfe')

this finds few thousand matches.

however, since loads entire file memory, crash if try search file larger system's memory.

so i'm trying equivalent grep via:

grep -p -c "\xfe\n\xfe" filename

it consumes 0 memory, great, though run on same file, finds 0 matches. what's wrong syntax?

you don't need read entire file memory. can iterate on file , count occurrence of string across lines taking pair of lines @ every instant:

count = 0 open(filename) f:    prev_line = next(f)    line in f:       if prev_line.endswith('\xfe\n') , line.startswith('\xfe'):          count += 1       prev_line = line

Search This Blog

Enable

python - How to grep for unicode characters? -

Comments

Post a Comment

Popular posts from this blog

resizing Telegram inline keyboard -

javascript - How to bind ViewModel Store to View? -

javascript - Solution fails to pass one test with large inputs? -