PowerShell regex to match text in a specific order and include only those matches in the results
I'm trying to find a working regex for PowerShell's Select-String cmdlet that looks for a specific text marking the start of the search, then matches other specific texts in order until the last one is found.
Example of the file text:
[begin of_header]
some.text="text"
some.text="text"
serial=0x94pa
some.text="text"
some.text="text"
timer=0
some.text="text"
some.text="text"
tag.sm=00
some.text="text"
some.text="text"
some.text="text"
some.text="text"
tag.om=00
some.text="text"
some.text="text"
some.text="text"
tag.uc=00
some.text="text"
some.text="text"
some.text="text"
events=pd_exf1
some.text="text"
some.text="text"
some.text="text"
acp="my looking dynamic text"
some.text="text"
some.text="text"
dir=6
some.text="text"
some.text="text"
wg=100
some.text="text"
some.text="text"
h=95.5
some.text="text"
some.text="text"
[begin of_header]
serial=0xzzz
timer=0
some.text="text"
some.text="text"
tag.om=00
tag.uc=00
some.text="text"
some.text="text"
events=pd_exf1
acp="my looking dynamic text"
dir=6
wg=100
h=95.5
[begin of_header]
serial=0xpppp
timer=0
tag.sm=00
some.text="text"
some.text="text"
tag.om=00
tag.uc=00
some.text="text"
some.text="text"
events=pd_exf1
acp="my looking dynamic text"
dir=6
wg=100
h=95.5
In this case the static word [begin of_header] should mark the point where the exact-order match of the dynamic values starts, beginning with serial= and ending with acp="my looking dynamic text". Both acp= and serial= can have various values. If a value is missing, for example tag.sm=00, the search should skip that group, jump to the next [begin of_header], and start analyzing again.
The result should look like this:
[begin of_header]
serial=0x94pa
timer=0
tag.sm=00
tag.om=00
tag.uc=00
events=pd_exf1
acp="my looking dynamic text"
[begin of_header]
serial=0xpppp
timer=0
tag.sm=00
tag.om=00
tag.uc=00
events=pd_exf1
acp="my looking dynamic text"
I found something similar here, but it doesn't work the way I want.
This also doesn't work as expected, because it doesn't exclude groups where the exact match order is broken:
select-string -literalpath "c:\myfile.txt" -pattern "\[begin of_header\]|serial=|timer=|tag.sm=|tag.om=|tag.uc=|events=|acp=" | select-object linenumber,line
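To illustrate why the alternation above cannot enforce order, here is a minimal sketch. It assumes a hypothetical, shortened group that lacks tag.sm= and events=; the variable names are made up for the example. Select-String tests every input line on its own, so each line that contains any one of the alternatives is returned, whether or not its group is complete.

```powershell
# Same alternation as in the command above
$pattern = '\[begin of_header\]|serial=|timer=|tag.sm=|tag.om=|tag.uc=|events=|acp='

# Hypothetical incomplete group: tag.sm= and events= are missing
$lines = '[begin of_header]', 'serial=0xzzz', 'timer=0', 'tag.om=00', 'acp="my looking dynamic text"'

# Each string is matched independently of its neighbours
$hits = $lines | Select-String -Pattern $pattern

$hits.Count   # 5, so every line of the incomplete group still gets through
```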
The regular expression is complex, but since the order of the elements is fixed I don't see a problem.
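$header = '[begin of_header]'
# One capturing group per required line, in the required order
$re = [regex]'(?smi)(^serial=.*?$).*(^timer=.+?$).*(^tag\.sm=.+?$).*(^tag\.om=.+?$).*(^tag\.uc=.+?$).*(^events=.+?$).*(^acp=.+?$)'
# Read the file as one string, split it at each header, keep only the chunks that match
(Get-Content .\myfile.txt -Raw) -split [regex]::Escape($header) |
  Select-String $re | ForEach-Object {
    $header
    # Groups 1 to 7 hold the seven captured lines
    for ($i = 1; $i -lt 8; $i++) { $_.Matches.Groups[$i].Value }
    ""
  }
Sample output: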
> q:\test\2017\09\10\so_46139332.ps1
[begin of_header]
serial=0x94pa
timer=0
tag.sm=00
tag.om=00
tag.uc=00
events=pd_exf1
acp="my looking dynamic text"

[begin of_header]
serial=0xpppp
timer=0
tag.sm=00
tag.om=00
tag.uc=00
events=pd_exf1
acp="my looking dynamic text"
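For trying the approach without a file on disk, the following is a self-contained sketch. It keeps the same header and regular expression but swaps Select-String for a direct call to the regex object's Match method. The in-memory $sample is a hypothetical, shortened version of the file: its first group is complete and its second lacks tag.sm=.

```powershell
$header = '[begin of_header]'
$re = [regex]'(?smi)(^serial=.*?$).*(^timer=.+?$).*(^tag\.sm=.+?$).*(^tag\.om=.+?$).*(^tag\.uc=.+?$).*(^events=.+?$).*(^acp=.+?$)'

# Hypothetical sample data, joined with LF so the result does not depend on line endings
$sample = @(
    '[begin of_header]'
    'some.text="text"'
    'serial=0x94pa'
    'timer=0'
    'tag.sm=00'
    'tag.om=00'
    'tag.uc=00'
    'events=pd_exf1'
    'acp="my looking dynamic text"'
    'dir=6'
    '[begin of_header]'
    'serial=0xzzz'
    'timer=0'
    'tag.om=00'
    'tag.uc=00'
    'events=pd_exf1'
    'acp="my looking dynamic text"'
) -join "`n"

# Split at each header and keep only the chunks in which all seven lines occur in order
$result = foreach ($chunk in ($sample -split [regex]::Escape($header))) {
    $m = $re.Match($chunk)
    if ($m.Success) {
        $header
        1..7 | ForEach-Object { $m.Groups[$_].Value }
    }
}

$result   # the header plus seven lines of the first group; the second group is skipped
```

One caveat if the input uses Windows line endings: $ matches just before the line feed, so each captured value would end with a carriage return. Calling .TrimEnd() on the values is one way to drop it.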
- The header is used to split the file contents into chunks, and the RE is matched against each chunk separately.
- (?smi) tells the RE to use these modifiers:
  - s modifier: single line. The dot also matches newline characters.
  - m modifier: multi line. Causes ^ and $ to match the begin/end of each line (not only the begin/end of the string).
  - i modifier: insensitive. Case-insensitive match.
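The effect of each modifier can be checked in isolation. The sketch below uses a hypothetical two-line string and the static [regex]::IsMatch method, which, unlike the -match operator, is case sensitive unless (?i) is given.

```powershell
# Hypothetical two-line sample, joined with an explicit LF
$text = "serial=0x94pa`ntimer=0"

# m: without it, ^ only matches at the very start of the string
$withoutM = [regex]::IsMatch($text, '^timer=')               # False
$withM    = [regex]::IsMatch($text, '(?m)^timer=')           # True

# s: without it, the dot stops at the newline
$withoutS = [regex]::IsMatch($text, 'serial=.*timer=')       # False
$withS    = [regex]::IsMatch($text, '(?s)serial=.*timer=')   # True

# i: without it, the comparison is case sensitive
$withoutI = [regex]::IsMatch($text, 'SERIAL=')               # False
$withI    = [regex]::IsMatch($text, '(?i)SERIAL=')           # True
```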
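- (^serial=.*?$).*
  - 1st capturing group (^serial=.*?$)
    - ^ asserts the position at the start of a line.
    - serial= matches the characters serial= literally (case insensitive).
    - .*? matches any character; the *? quantifier matches between zero and unlimited times, as few times as possible, expanding as needed (lazy).
    - $ asserts the position at the end of a line.
  - .* matches any character; the * quantifier matches between zero and unlimited times, as many times as possible, giving back as needed (greedy). With the s modifier it also crosses line breaks, which lets it skip the unwanted lines before the next captured line.
- The remaining six groups, (^timer=.+?$) through (^acp=.+?$), work the same way, except that .+? requires at least one character after the equals sign.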
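The difference between the lazy and the greedy quantifier is easy to see on a small chunk. The following sketch assumes a hypothetical three-line chunk and applies the first group's pattern once with .*? and once with .*, both under the s and m modifiers.

```powershell
# Hypothetical chunk, joined with LF
$chunk = 'serial=0x94pa', 'some.text="text"', 'timer=0' -join "`n"

# Lazy: expands one character at a time and stops at the first end of line
$lazy = [regex]::Match($chunk, '(?sm)^serial=.*?$').Value     # serial=0x94pa

# Greedy: with s the dot crosses newlines, so the match runs to the end of the chunk
$greedy = [regex]::Match($chunk, '(?sm)^serial=.*$').Value    # all three lines
```

This is why the capturing groups use lazy quantifiers, while the .* between them is greedy and bridges the gap to the next required line.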