python - Regular Expression To Find C Style Comments
I am trying to write a regular expression to find C-style headers in Java source files. At the present time I am experimenting with Python.
Here is the source code:
    import re

    text = """/*
     * copyright blah blah blha blah
     * blah blah blah blah
     * 2008 blah blah blah @ org
     */"""

    print()
    print("I guess the program printed the correct thing.")

    # Pattern being asked about: slash, one or more characters, slash.
    pattern = re.compile("^/.+/$")
    print("-----------")
    print(pattern)

    # Walk through the text and report every match with its offsets.
    pos = 0
    while True:
        match = pattern.search(text, pos)
        if not match:
            break
        s = match.start()
        e = match.end()
        print(' %2d : %2d = "%s"' % (s, e - 1, text[s:e]))
        pos = e
I am trying to write a simple expression that looks for everything between a forward slash and a forward slash. I can make the regular expression more complicated later.
Does anybody know what is going wrong? I am using a forward slash, the dot meta-character, the plus symbol for one or more things, and the dollar symbol for the end.
I don't think you should anchor the match (using '^' and '$').
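A quick way to see how the anchors and the dot interact on multi-line text is to try the pattern with and without re.DOTALL. This is only a sketch in Python 3; the names `anchored` and `dotall` are illustrative and not part of the question's code:

```python
import re

text = """/*
 * copyright blah blah blha blah
 * blah blah blah blah
 * 2008 blah blah blah @ org
 */"""

# Without flags, '.' does not match a newline, so '.+' cannot span
# the lines between the opening and the closing slash.
anchored = re.compile(r"^/.+/$")
print(anchored.search(text))  # None

# With re.DOTALL the dot also matches newlines. The anchored pattern
# matches here only because the comment fills the entire string.
dotall = re.compile(r"^/.+/$", re.DOTALL)
print(dotall.search(text) is not None)  # True

# Any text in front of the comment defeats the '^' anchor again.
print(dotall.search("foo bar baz\n" + text))  # None
```

So the dot not crossing newlines is what makes the original pattern fail on this input, and the anchors become a problem as soon as the comment is not the whole string.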
Secondly, I think your regex should be r"/[^/]*/"
It matches a (portion of the) string that starts with a slash, is followed by zero or more non-slash characters, and terminates with a slash.
To wit:

    >>> import re
    >>> text = """foo bar baz
    ... /*
    ...  * copyright blah blah blha blah
    ...  * blah blah blah blah
    ...  * 2008 blah blah blah @ org
    ...  */"""
    >>> rx = re.compile(r"/[^/]*/", re.DOTALL)
    >>> mo = rx.search(text)
    >>> text[mo.start(): mo.end()]
    '/*\n * copyright blah blah blha blah\n * blah blah blah blah\n * 2008 blah blah blah @ org\n */'
Note that the comment does not start at the start of the string, but the regex finds it nicely.
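One caveat with the slash-to-slash pattern: it stops at the first slash inside the comment, so a header containing a URL or a path gets cut short, and it can also match unrelated slashes such as `6 / 3 / 2`. When the expression is made more complicated later, one possible direction is to match the `/*` and `*/` delimiters explicitly with a non-greedy dot. The following is a minimal sketch under the assumption that string literals and `//` comments can be ignored; the helper name `find_block_comments`, the sample `source`, and the example.org URL are hypothetical:

```python
import re

# '/*', then as few characters as possible (newlines included), then '*/'.
BLOCK_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)

def find_block_comments(source):
    """Return (start, end, text) for each /* ... */ comment in source."""
    return [(m.start(), m.end(), m.group(0))
            for m in BLOCK_COMMENT.finditer(source)]

# Made-up Java-like input with a slash inside the header comment.
source = """/*
 * copyright blah, see http://example.org/license
 */
package demo;
/* second comment */
class Demo { int x = 6 / 3 / 2; }
"""

for start, end, comment in find_block_comments(source):
    print("%3d : %3d = %r" % (start, end - 1, comment))
```

Here re.DOTALL matters because the pattern contains a dot, and the `?` after `.*` keeps each match from running on to the last `*/` in the file.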