Was trying to extract some FIX messages from log files, and more precisely filter out a number of messages like heartbeat. My initial thought was to do something like that:
That works fine, except it's a little bit wasteful. What this command above does is, replace all the null character (that are used as delimiter in FIX messages) by "|". So that got me thinking how I could filter out those without having to replace anything. My first idea was:
That didn't work because grep was interpreting \001 as a back reference. Several variants later (among which the use of sed) and some constructive goggling made me understand that the the problem wasn't coming from grep but from bash and here's the proof.
If you type the following (under linux):
The trick is to tell bash do some quoting (escaping), and you do that by prefixing the string with $:
So putting it all together the original command above becomes:
cat fixlog.log | grep '\[8=FIX.' | tr '\1' '|' | grep -v '\|35=0'
That works fine, except it's a little bit wasteful. What this command above does is, replace all the null character (that are used as delimiter in FIX messages) by "|". So that got me thinking how I could filter out those without having to replace anything. My first idea was:
cat C20120203.log | grep '\[8=FIX.' | grep -v '\00135=D'
That didn't work because grep was interpreting \001 as a back reference. Several variants later (among which the use of sed) and some constructive goggling made me understand that the the problem wasn't coming from grep but from bash and here's the proof.
If you type the following (under linux):
$ echo '\001' | hexdump -C 00000000 5c 30 30 31 0a |\001.| 00000005
The trick is to tell bash do some quoting (escaping), and you do that by prefixing the string with $:
$ echo $'\001' | hexdump -C 00000000 01 0a |..| 00000002
So putting it all together the original command above becomes:
cat fixlog.log | grep '\[8=FIX.' | grep -v $'\00135=0'The mechanism allow you to grep for any arbitrary ASCII character.