loginf: Log analyzer
Synopsis
loginf [-h] Global_Opt Input_Spec Output_Spec
Global_Opt:
    [-verb] [-cmax ColNumMax]
Input_Spec:
    [-f[,AtrLst] File [File ...]] [-d ColId [ColId ...]]
    [-f_raw File [File ...]]
Output_Spec:
    [-pp_eok Percent]
    [-o File [-b64]]
    [-o_raw File]
    [-o_typ File]
    [-o_pp_col File]
Description
loginf collects information about the characteristics of its input data. It can process log files and/or previously generated raw result files as input. On output, it produces a data characteristics report in JSON text format and/or a raw result format. Characteristics compiled include row counts, column counts, column min/max length/value, column type guesses, column uniqueness estimates, and so on.
Options
-verb
Verbose – print program progress to stderr while processing. Usually, a marker is printed for each 10,000,000 records processed.
-cmax ColNumMax
The maximum number of columns to process. Default is 4096. Processing will stop if this limit is exceeded.
-f[,AtrLst] File [File …] [-lim Num]
Set the input attributes and log files to analyze. If the data come from stdin, set File to ‘-’ (a single dash). If neither -f nor -f_raw is given, log input from stdin is assumed. The optional AtrLst is a list of comma separated attributes:
- +Num[b|l] – Specifies the number of bytes (b suffix) or lines (no suffix or l suffix) to skip before processing.
- bz=BufSize – Set the per-record buffer size to BufSize bytes. It must be big enough to hold the data of all the columns in a record. Default size is 64KB.
- lim=RecLimit – Set the maximum number of records to process.
- notitle – The first record from the input is not a label line.
- csv – Input is in CSV format. This is the default.
- sep=c or sep=\xHH – Input is in ‘c’ (single byte) separated value format. ‘\xHH’ is a way to specify ‘c’ via its HEX value HH.
- tab – Input is in HTML table format. Each row has the form “…<td>Column1</td>…<td>Column2</td>…</tr>”. In other words, a row begins at the first “<td …>” tag and ends at a “</tr>” tag.
- bin or aq – Input is in aq_tool’s internal binary format. The data must be generated by an aq_tool using a bin or aq output format attribute.
- auto – Determine input data format automatically. Supported formats are:
  - Delimiter-separated columns. May not work if the number of columns is not fixed.
  - Blank-padded fixed-width columns. Individual columns can be left or right adjusted (but not both on the same column).
  - Aq_tool’s internal binary format.
  - JSON, detection only, no further analysis.
  - XML, detection only, no further analysis.
  - Otherwise, defaults to a line separated format with a single column.
The -lim option sets the maximum number of records to load from each input file to Num.
Example:
$ loginf ... -f file1 file2 ...
- Load and analyze logs file1 and file2.
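The sep=\xHH attribute needs the hex value of the separator byte. A minimal sketch of deriving that spelling in a POSIX shell follows; sep_hex is a hypothetical helper name, not part of loginf.

```shell
# sep_hex is a hypothetical helper (not part of loginf).
# It prints the \xHH spelling of a single-byte separator character.
sep_hex() {
    # A leading quote makes printf use the character's numeric code (POSIX).
    printf '\\x%02X\n' "'$1"
}

sep_hex '|'    # prints \x7C
```

When passing the result on a command line, quote the attribute (for example -f,'sep=\x7C') so that the shell does not strip the backslash.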
-d ColId [ColId …]
Select the columns to analyze. Other columns will be ignored. ColId is one-based.
-f_raw File [File …]
Set the input raw result files to load. Files must be previously generated by this program via the -o_raw option. If the data come from stdin, set File to ‘-’ (a single dash).
Example:
$ loginf ... -f_raw file1.raw file2.raw ...
- Load and combine file1.raw and file2.raw.
$ loginf ... -f file3 file4 -f_raw file1.raw file2.raw ...
- Load and combine file1.raw and file2.raw, then further load and analyze logs file3 and file4 and combine all the results together.
-pp_eok Percent
Acceptable error percentage when determining column data type. Default is 0. Column data type is determined based on the column values. If more than one type is detected in a column, the most frequently detected type is chosen, provided that the percentage of all the other types combined is less than or equal to this threshold. Otherwise, the column is assigned a string type.
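The selection rule can be sketched in shell with awk. This is only an illustration of the rule as described here: pick_type is a hypothetical helper, the type names (int, string) and counts are made up, and loginf's actual internal logic and type labels may differ.

```shell
# pick_type is a hypothetical helper that mimics the documented rule.
# Usage: pick_type PERCENT TYPE=COUNT [TYPE=COUNT ...]
pick_type() {
    eok=$1
    shift
    printf '%s\n' "$@" | awk -F= -v eok="$eok" '
        # Track the total number of values and the most frequent type.
        { total += $2; if ($2 > max) { max = $2; best = $1 } }
        END {
            if (total == 0) { print "string"; exit }
            # Percentage of values belonging to all the other types combined.
            other = (total - max) * 100 / total
            result = (other <= eok) ? best : "string"
            print result
        }'
}

pick_type 0 int=98 string=2    # prints string (2% of other types exceeds 0%)
pick_type 5 int=98 string=2    # prints int (2% of other types is within 5%)
```

With the default threshold of 0, a single stray value is enough to make the column a string type, so a small non-zero Percent may suit noisy logs.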
-o File [-b64]
Output a text report of the result. The report is written in JSON format. If File is ‘-’ (a single dash), data will be written to stdout. Note that any existing content in the file will be overwritten. If no -o, -o_raw or -o_pp_col is given, a report will be written to stdout.
With the -b64 option, the strings in the JSON report will be encoded in a base64 format.
Example:
$ loginf ... -f file1 ... -o file1.report
- Save the JSON report to file1.report.
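Strings in a -b64 report need decoding before they can be read. A minimal sketch follows, assuming the standard base64 alphabet and a base64 utility that accepts -d (as in GNU coreutils); b64_decode is a hypothetical helper and the encoded value is illustrative, not real loginf output.

```shell
# b64_decode is a hypothetical helper (not part of loginf).
# It decodes one base64 string, assuming the standard alphabet.
b64_decode() {
    printf '%s' "$1" | base64 -d
}

# "aGVsbG8=" is an illustrative value, not taken from a real report.
b64_decode 'aGVsbG8='    # prints hello (without a trailing newline)
```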
-o_raw File
Output raw result. This raw result can be used in a later run using the -f_raw option. If File is ‘-’ (a single dash), data will be written to stdout.
Example:
$ loginf ... -f file1 ... -o_raw file1.raw -o file1.report
- Save raw result to file1.raw and a report of the same result to file1.report.
-o_typ File
Output the input data’s format type. If File is ‘-’ (a single dash), data will be written to stdout. The output is a single-line description in one of these forms:
- Mixed – More than one format detected. This only happens when the program is given more than one input and the inputs have differing formats.
- Fixed-Width – Columns appear to have fixed widths. The columns may be left/right adjusted with blank padding.
- byte Separated – Columns are separated by a single byte separator. byte can be a printable ASCII character, a ‘\’ escaped character or a \xHH sequence where HH is the hex value of the separator.
- byte Separated CSV – Same as the above except that some columns may be quoted in a CSV-like manner.
- CSV – Same as the above when the separator is a comma.
- Aq Tool Binary – Input is in aq_tool’s internal binary format. The data can only be interpreted by an aq_tool using the bin or aq input format attribute.
- HTML Table – A simple HTML table was detected.
- JSON – A JSON object was detected.
- XML – An XML object was detected.
If no specific type can be detected, the output defaults to “\n” Separated, meaning a single-column data set with newline separated rows.
Example:
$ loginf ... -f,auto file1 ... -o_typ -
- Determine file1’s format automatically and print the resulting description to stdout.
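A script that consumes the -o_typ line may want to branch on it. The sketch below matches the description forms listed above with a case statement; typ_class and the short labels it prints are hypothetical, and the exact spelling of real loginf output should be checked before relying on these patterns.

```shell
# typ_class is a hypothetical helper (not part of loginf).
# It maps a -o_typ description line to a short label for scripting.
typ_class() {
    case "$1" in
        Mixed)             echo mixed ;;
        Fixed-Width)       echo fixed-width ;;
        *"Separated CSV")  echo separated-csv ;;   # must precede the plain form
        *Separated)        echo separated ;;
        CSV)               echo csv ;;
        "Aq Tool Binary")  echo aq-binary ;;
        "HTML Table")      echo html ;;
        JSON)              echo json ;;
        XML)               echo xml ;;
        *)                 echo unknown ;;
    esac
}

typ_class 'HTML Table'     # prints html
typ_class '| Separated'    # prints separated
```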
-o_pp_col File
Output aq_pp column spec based on the characteristics of the processed data. The output is line oriented, with one column spec per line. If File is ‘-’ (a single dash), data will be written to stdout.
Example:
$ loginf ... -f file1 -lim 1000 ... -o_pp_col file1.col
- Analyze the first 1000 records in file1 and output aq_pp column spec to file1.col.
Exit Status
If successful, the program exits with status 0. Otherwise, the program exits with a non-zero status code along with error messages printed to stderr. Applicable exit codes are:
- 0 – Successful.
- 1 – Memory allocation error.
- 2 – Command option spec error.
- 3 – Initialization error.
- 4 – System error.
- 5 – Missing or invalid license.
- 11 – Input open error.
- 12 – Input read error.
- 13 – Input processing error.
- 21 – Output open error.
- 22 – Output write error.
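A wrapper script can translate these codes into readable messages. The following is a minimal sketch; loginf_status_msg is a hypothetical helper whose messages simply repeat the list above.

```shell
# loginf_status_msg is a hypothetical helper (not part of loginf).
# It maps a documented exit code to its description.
loginf_status_msg() {
    case "$1" in
        0)  echo "Successful" ;;
        1)  echo "Memory allocation error" ;;
        2)  echo "Command option spec error" ;;
        3)  echo "Initialization error" ;;
        4)  echo "System error" ;;
        5)  echo "Missing or invalid license" ;;
        11) echo "Input open error" ;;
        12) echo "Input read error" ;;
        13) echo "Input processing error" ;;
        21) echo "Output open error" ;;
        22) echo "Output write error" ;;
        *)  echo "Unknown status $1" ;;
    esac
}

# Typical use right after a loginf run (left commented out here):
#   loginf ... -f file1 ... -o file1.report
#   loginf_status_msg "$?"
loginf_status_msg 12    # prints Input read error
```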
See Also
- aq_pp – Record preprocessor
- udbd – Udb server
- aq_udb – Udb server interface