Raw data, particularly log data, is often not delimited cleanly but can be made to be so using some simple pattern
matching rules. logcnv
was developed to help with this process, with the intent that the parsed output be fed
directly into aq_pp
for further processing.
In this tutorial, we will use an Apache web log as an example, since they are among the most common type of logs we
work with. You will need the file ‘apache.log’ (click to download
),
the first line of which is:
54.248.98.72 - - [23/Nov/2014:03:07:23 -0800] "GET / HTTP/1.0" 301 - "-" "Mozilla/5.0 (compatible; monitis - premium monitoring service; http://www.monitis.com)"
logcnv
is similar to aq_pp
in that it defines the column spec using type:name notation, except here we add a
new option ‘sep’ which specifies the substring that separates a given column from the one next to it. The apache log
for example can be parsed using:
logcnv -f,eok apache.log -d ip:ip sep:' ' s:rlog sep:' ' \
s:rusr sep:' [' i,tim:time sep:'] "' s,clf,hl1:req_line1 sep:'" ' \
i:res_status sep:' ' i:res_size sep:' "' \
s,clf:referrer sep:'" "' s,clf:user_agent sep:'"'``
As in other AQ tools, attributes are used to augment the processing. Here, we use the ‘hl1’ attribute with the
column ‘req_line1’. This attribute parses HTTP request strings such as GET /index.html?query HTTP/1.0
.
Similar to aq_pp
you can specify the columns to output using the -c
option and a list of column names.
You can limit which columns are output in the final result by using the ‘-c’ option. i.e. run:
logcnv -f,eok apache.log -d ip:ip sep:' ' s:rlog sep:' ' s:rusr sep:' [' \
i,tim:time sep:'] "' s,clf,hl1:req_line1 sep:'" ' i:res_status sep:' ' \
i:res_size sep:' "' s,clf:referrer sep:'" "' s,clf:user_agent sep:'"' \
-c ip time req_line1_f2 res_status res_size``
"ip","time","req_line1_f2","res_status","res_size"
54.248.98.72,1416740843,"/",301,0
46.23.67.107,1416740853,"/",301,0
85.17.156.99,1416740863,"/",200,30003
173.193.219.173,1416740885,"/",301,0
54.248.98.72,1416740903,"/",301,0
93.174.93.117,1416740904,"/xmlrpc.php",200,370
46.23.67.107,1416740913,"/",301,0
93.174.93.117,1416740919,"/xmlrpc.php",200,370
174.34.156.130,1416740923,"/",200,30003
173.193.219.173,1416740945,"/",301,0