aq_pp module script compiler
mcc.pmod in_script [out.c|out.cpp] [out.so]
This is the aq_pp module script compiler. It converts a script written in C/C++ and module commands into a dynamic module for aq_pp.
This compiler is normally used internally by aq_pp -pmod for on-the-fly module generation. However, it can also be used to develop modules manually. Simply install the manually created module (the .so file) in the appropriate location and aq_pp -pmod will be able to use it.
in_script
in_script to ‘-’ (a single dash).
out.c|out.cpp
Save the intermediate source to an output file. This is a C/C++ source file generated from the input module script. It closely resembles the original script except for some added support/interface code.
The output file must have a .c or .cpp extension. Only one of the two can be specified.
Saving this output is optional. Use it for debugging or to help module development as needed.
out.so
.so extension.
Module commands abstract and hide most of the module API details. They resemble C macros, as in COMMAND(parameters). The commands consist of declaration statements, processing function specifications and module helpers. They tell the module compiler what code to generate before building the final dynamic module.
A module script is primarily a C/C++ source with certain embedded module commands. This is a sample script that does row filtering:
DECL_LANG(C);
DECL_COLUMN(ColName_1, S);
DECL_COLUMN(ColName_2, I);
DECL_COLUMN_DYNAMIC(Col_3, S);
DECL_END;

MOD_INIT_FUNC()
{
  if (arg_n != 1)
    return 0;
  if (!MOD_COLUMN_BIND(Col_3, arg[0]))
    return 0;
  return 1;
}

MOD_PROC_FUNC()
{
  CDAT_I_T col_i;

  col_i = $ColName_2;
  if (ModDifHStr($ColName_1, $Col_3, DIF_A_NCAS) == 0 &&
      col_i >= 100 && col_i <= 199)
    return MOD_A_TRUE;
  return MOD_A_FALSE;
}
Columns are type specific. Column types are defined in the data spec. In the module script, a C/C++ variable of the appropriate type must be used when copying or manipulating column values. These are the aq_pp column types and their corresponding module types/typedefs:
Spec type  Program typedef  Module typedef  Description
S          HStr *           CDAT_S_T        A pointer to a hash string data structure.
F          double           CDAT_F_T        A double precision floating point number.
L          u_int64_t        CDAT_L_T        An unsigned (non-negative) 64-bit integer.
LS         int64_t          CDAT_LS_T       A signed 64-bit integer.
I          u_int32_t        CDAT_I_T        An unsigned (non-negative) 32-bit integer.
IS         int32_t          CDAT_IS_T       A signed 32-bit integer.
IP         NetIp            CDAT_IP_T       An IP address data structure.
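As a rough illustration of the numeric rows in the table, the sketch below declares stand-in typedefs with the same underlying widths and copies a value into a same-typed local before testing it. The `sk_` names are hypothetical and are not taken from pmod.h; the real CDAT_*_T typedefs come from the module API, and the fixed-width types from `<stdint.h>` are assumed to be equivalent to the `u_int64_t`/`u_int32_t` spellings in the table.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the numeric module typedefs (not from pmod.h). */
typedef double   sk_cdat_f_t;   /* spec type F  */
typedef uint64_t sk_cdat_l_t;   /* spec type L  */
typedef int64_t  sk_cdat_ls_t;  /* spec type LS */
typedef uint32_t sk_cdat_i_t;   /* spec type I  */
typedef int32_t  sk_cdat_is_t;  /* spec type IS */

/* Copy an I value into a local of the matching type, then range test it. */
static int sk_in_range(sk_cdat_i_t col_value, sk_cdat_i_t lo, sk_cdat_i_t hi)
{
  sk_cdat_i_t v = col_value;    /* same width and signedness as the column */

  assert(lo <= hi);
  return v >= lo && v <= hi;
}
```

Using a local of the matching type keeps large unsigned values (for example 4000000000 in an I column) from being reinterpreted as negative numbers.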
Declaration statements are used to declare variables and options. The compiler interprets these declarations and determines what code to generate. For example, column declarations will result in column handling code, variable declarations will result in variable handling code, and so on.
DECL_LANG(Lang);
Tell the compiler what programming language is being used in the script. Lang can either be C or CPP. Default is C.
Example:
DECL_LANG(C);
C is the default, so this declaration is not strictly necessary.
DECL_BUILD_OPT(Arguments);
Supply custom command line arguments for the compiler. Use cases are:
An include directory; e.g., -Imy_include_directory.
A define; e.g., -DMY_DEF=1.
A library file; e.g., my_dir/my_lib.a.
A library link option; e.g., -lm for the math library.
Example:
DECL_BUILD_OPT(-DMY_VERSION_STRING='"1.1.1"' -lm);
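To show how a define supplied this way might be consumed, here is a minimal sketch in plain C. It assumes the define reaches the C compiler as a string literal, as the quoting in the example suggests; the `sk_` function names and the "unknown" fallback are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Fallback used only when -DMY_VERSION_STRING=... was not supplied. */
#ifndef MY_VERSION_STRING
#define MY_VERSION_STRING "unknown"
#endif

/* Return the version string that was compiled in. */
static const char *sk_module_version(void)
{
  return MY_VERSION_STRING;
}

/* Return 1 if the compiled-in version equals expected, 0 otherwise. */
static int sk_version_is(const char *expected)
{
  assert(expected != NULL);
  return strcmp(sk_module_version(), expected) == 0;
}
```

The `#ifndef` guard lets the same source build with or without the extra compiler argument.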
DECL_COLUMN(ColName, ColType);
Declare a column for use in the script. ColName is a column in the data spec. The given name and type will be verified at run time during module initialization to ensure that the spec is valid.
Example:
DECL_COLUMN(ColName_1, S);
ColName_1 is an actual column name. It is specified as-is, like a variable (not a string).
DECL_COLUMN_DYNAMIC(ColName, ColType);
Declare a column for the script just like DECL_COLUMN(), except that the actual target column name is not known until run time (hence, dynamic).
Example:
DECL_COLUMN_DYNAMIC(Col_3, S);

MOD_INIT_FUNC()
{
  if (!MOD_COLUMN_BIND(Col_3, "ColName_3"))
    return 0;
  ...
}
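The idea behind run time binding can be sketched in plain C as a name lookup performed once during initialization. This is only an illustration under stated assumptions: the `sk_` functions, the column name array and the exact, case sensitive name match are all hypothetical and say nothing about how MOD_COLUMN_BIND() is actually implemented.

```c
#include <assert.h>
#include <string.h>

/* Return the index of real_name in names[0..n-1], or -1 when absent.
 * Exact, case sensitive matching is an assumption of this sketch. */
static int sk_column_find(const char *const *names, int n,
                          const char *real_name)
{
  assert(names != NULL && real_name != NULL);
  for (int i = 0; i < n; i++)
    if (strcmp(names[i], real_name) == 0)
      return i;
  return -1;
}

/* Init-style wrapper: returns 1 on success and 0 on failure, mirroring
 * the return values used in the examples of this document. */
static int sk_init(const char *const *names, int n,
                   const char *const *arg, int arg_n, int *bound)
{
  if (arg_n != 1)
    return 0;                       /* exactly one argument expected */
  *bound = sk_column_find(names, n, arg[0]);
  return *bound >= 0;               /* fail if the column is unknown */
}
```

Resolving the name once at initialization means the per-row code can use the bound column directly instead of repeating the lookup.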
DECL_DATA(VarDecl);
Declare one or more variables as the module’s instance specific data. Unlike global variables which are shared between concurrent instances of the same module, variables declared this way are instance specific (i.e., each instance has its own copies of the variables). This is the recommended way of managing module data.
VarDecl is a variable declaration like int num1, num2. MOD_DATA(num1) and MOD_DATA(num2) will access the values of those integers.
Example:
DECL_DATA(int flag);
DECL_DATA(int num1, num2);

MOD_INIT_FUNC()
{
  if (...)
    MOD_DATA(flag) = 1;
  else
    MOD_DATA(flag) = 2;
  ...
}

MOD_ROW_FUNC(TabName_1)
{
  if (MOD_DATA(flag) == 1)
    MOD_DATA(num1) += 1;
  else if (MOD_DATA(flag) == 2)
    MOD_DATA(num2) += 1;
  ...
}
flag is conditionally initialized to 1 or 2 during module initialization. num1 and num2 are already initialized to 0 automatically.
DECL_END;
The processing functions carry out the intended task of a module. There are several predefined module functions: one optional initialization function, one or more processing functions and one optional wrap up function. If any of them are defined, the compiler will generate code that calls these functions automatically.
A module function is defined like a C function:
PREDEFINED_FUNCTION_NAME(function_dependent_argument)
{
  code_block
  ...
}
MOD_*_FUNC()) and argument (function dependent) specification.
etc/include/pmod.h”).
MOD_INIT_FUNC()
Define a function for module initialization.
ModCntx *mod - A module instance handle. Pass this to any support functions that use module helpers.
const char *const *arg, int arg_n - The parameters passed to the module when it was called on the command line are available here as a string array. Use them to set up run time parameters as necessary.
aq_pp will terminate.
Example:
MOD_INIT_FUNC()
{
  if (arg_n != 1)
    return 0;
  if (!MOD_COLUMN_BIND(Col_3, arg[0]))
    return 0;
  return 1;
}
arg and arg_n are implicit variables in the function.
MOD_PROC_FUNC()
Define a function for data row processing. This function must be defined. It is called for each data row being processed. Use it to examine and/or modify column values. It is called with this implicit argument:
ModCntx *mod - A module instance handle. Pass this to any support functions that use module helpers.
It must return an enumerated return code that tells aq_pp what to do:
MOD_A_TRUE - True. aq_pp will continue processing or take “if” statement dependent actions if the module is used as an “if” condition.
MOD_A_FALSE - False. aq_pp will skip any remaining processing on the current row or take “if” statement dependent actions if the module is used as an “if” condition.
MOD_A_QNOW - Quit now. aq_pp will stop processing immediately.
MOD_A_QAFT - Like MOD_A_TRUE, but the call will stop processing after finishing the current row.
aq_pp will call the module again with the current row until a different code is returned.
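One way to picture how a caller could act on these return codes is the plain C driver loop below. It is a sketch under stated assumptions, not a description of aq_pp internals: the `SK_A_*` enumerators, their values, the `sk_run()` loop and the choice not to count a row that returns the quit-now code are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real MOD_A_* codes come from the module API. */
enum sk_action { SK_A_TRUE, SK_A_FALSE, SK_A_QNOW, SK_A_QAFT };

typedef enum sk_action (*sk_proc_fn)(int row_value);

/* Call proc on each row and return how many rows were passed on. */
static size_t sk_run(const int *rows, size_t n, sk_proc_fn proc)
{
  size_t kept = 0;

  assert(proc != NULL);
  for (size_t i = 0; i < n; i++) {
    enum sk_action a = proc(rows[i]);

    if (a == SK_A_QNOW)
      break;              /* stop immediately (assumed: row not passed on) */
    if (a == SK_A_FALSE)
      continue;           /* skip the remaining processing for this row */
    kept++;               /* TRUE and QAFT: the row continues */
    if (a == SK_A_QAFT)
      break;              /* stop after finishing the current row */
  }
  return kept;
}

/* Sample processing function: keep 100..199, quit on special values. */
static enum sk_action sk_filter(int v)
{
  if (v < 0)
    return SK_A_QNOW;
  if (v >= 1000)
    return SK_A_QAFT;
  return (v >= 100 && v <= 199) ? SK_A_TRUE : SK_A_FALSE;
}
```

For the rows 150, 50, 120, 1000, 130 this loop passes on three rows (150, 120 and 1000) and never examines 130.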
Example:
MOD_PROC_FUNC()
{
  CDAT_I_T col_i;

  col_i = $ColName_2;
  if (ModDifHStr($ColName_1, $Col_3, DIF_A_NCAS) == 0 &&
      col_i >= 100 && col_i <= 199)
    return MOD_A_TRUE;
  return MOD_A_FALSE;
}
This returns true if ColName_1 and Col_3’s values are the same (case insensitive) and ColName_2’s value is between 100 and 199, false otherwise.
Use $ColName (or MOD_CDAT()) to address column values.
MOD_DONE_FUNC()
Define a function that performs module wrap up related tasks.
aq_pp exits.
ModCntx *mod - A module instance handle. Pass this to any support functions that use module helpers.
Example:
MOD_DONE_FUNC()
{
  ModLog("%s done\n", MOD_NAME);
}
These are helpers that are designed specifically for module processing tasks. They can be used in any processing functions or subroutines called from these functions (these subroutines must be given a ModCntx *mod argument).
int MOD_COLUMN_BIND(ColName, const char *real_name)
Dynamic column setup function.
ColName must be a column declared via DECL_COLUMN_DYNAMIC().
real_name is a C string buffer containing the actual name of the column.
CDAT_*_T MOD_CDAT(ColName), CDAT_*_T $ColName
Use either form like a program variable to address the value of a column in the current row.
CDAT_*_T type (see column datatypes) derived from ColType in the declaration.
Example:
DECL_COLUMN(InNumColumn, I);
DECL_COLUMN_DYNAMIC(OutNumColumn, I);

MOD_INIT_FUNC()
{
  MOD_COLUMN_BIND(OutNumColumn, "RealColumn");
  ...
}

MOD_PROC_FUNC()
{
  if ($InNumColumn == 4321)
    $OutNumColumn += 1;
  ...
}
void MOD_CDAT_S_NSET(ColName, const char *b, unsigned int n)
Set the value of the given column in the current row to a hash string based on string buffer b and length n.
Example:
DECL_COLUMN(StrColumn_1, S);

MOD_PROC_FUNC()
{
  MOD_CDAT_S_NSET(StrColumn_1, "abc", 3);
  ...
}
void MOD_CDAT_S_SET(ColName, CDAT_S_T hs)
Set the value of the given column in the current row to a copy of hash string hs.
hs is an existing hash string (e.g., the value of another string column).
Example:
DECL_COLUMN(StrColumn_1, S);
DECL_COLUMN(StrColumn_2, S);

MOD_PROC_FUNC()
{
  MOD_CDAT_S_SET(StrColumn_1, $StrColumn_2);
  ...
}
void MOD_CDAT_S_DEL(ColName)
const ColDefn *MOD_CDEF(ColName)
MOD_DATA(variable)
const char *MOD_NAME
MOD_LOG_ERR(const char *format, ...)
Generic programming support and convenience functions for module specific datatype handling.
int ModDifHStr(const CDAT_S_T hs1, const CDAT_S_T hs2, int dif_flag)
Compare the values of 2 hash strings.
hs1 is greater, and -1 otherwise.
dif_flag is either 0 (case sensitive comparison) or DIF_A_NCAS (case insensitive comparison).
Example:
DECL_COLUMN(StrColumn_1, S);
DECL_COLUMN(StrColumn_2, S);

MOD_PROC_FUNC()
{
  if (ModDifHStr($StrColumn_1, $StrColumn_2, 0) == 0)
    ...
  ...
}
int ModDifHStrStr(const CDAT_S_T hs, const char *b, int n, int dif_flag)
Compare the value of hash string hs to string buffer b of length n.
hs is greater, and -1 otherwise.
dif_flag is either 0 (case sensitive comparison) or DIF_A_NCAS (case insensitive comparison).
Example:
DECL_COLUMN(StrColumn_1, S);

MOD_PROC_FUNC()
{
  if (ModDifHStrStr($StrColumn_1, "abc", 3, 0) == 0)
    ...
  ...
}
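The comparison semantics described for the ModDif* string helpers can be approximated in plain C on two length-delimited buffers. This is a sketch only: `sk_dif_buf()` and `SK_DIF_NCAS` are hypothetical stand-ins, the hash string type is replaced by a raw buffer, and returning 0 for equal values and 1 when the first value is greater is an assumption based on the examples and on the surviving "is greater, and -1 otherwise" wording.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

#define SK_DIF_NCAS 1   /* hypothetical flag standing in for DIF_A_NCAS */

/* Compare buffers a (an bytes) and b (bn bytes).
 * Returns 0 if equal, 1 if a is greater, -1 otherwise. */
static int sk_dif_buf(const char *a, size_t an,
                      const char *b, size_t bn, int flag)
{
  size_t n = an < bn ? an : bn;

  assert(a != NULL && b != NULL);
  for (size_t i = 0; i < n; i++) {
    unsigned char ca = (unsigned char)a[i];
    unsigned char cb = (unsigned char)b[i];

    if (flag & SK_DIF_NCAS) {       /* fold case before comparing */
      ca = (unsigned char)tolower(ca);
      cb = (unsigned char)tolower(cb);
    }
    if (ca != cb)
      return ca > cb ? 1 : -1;
  }
  if (an == bn)
    return 0;                       /* same bytes, same length */
  return an > bn ? 1 : -1;          /* the longer buffer is greater */
}
```

Because both lengths are passed explicitly, neither buffer has to be null terminated.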
int ModDifHStrPat(const CDAT_S_T hs, const char *pat, int n, int dif_flag)
Compare the value of hash string hs to pattern buffer pat of length n.
pat may contain ‘*’ (for any number of bytes) and ‘?’ (for any 1 byte). Use a ‘\’ to escape literal ‘*’, ‘?’ and ‘\’ in the pattern. If the pattern is given as a literal, any backslashes in it must be backslash escaped one more time for the C/C++ compiler.
dif_flag can have these values:
Example:
DECL_COLUMN(StrColumn_1, S);

MOD_PROC_FUNC()
{
  if (ModDifHStrPat($StrColumn_1, "a*c", 3, 0) == 0)
    ...
  ...
}
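The wildcard rules above (‘*’ for any number of bytes, ‘?’ for any 1 byte, backslash to escape) can be illustrated with a small recursive matcher in plain C. This is a sketch, not the library's implementation: `sk_match()` and `sk_dif_pat()` are hypothetical names, the hash string is replaced by a raw buffer, dif_flag handling is left out, and returning 0 on a match is an assumption taken from the `== 0` test in the example.

```c
#include <assert.h>
#include <stddef.h>

/* Return 1 if s (sn bytes) matches pattern p (pn bytes), 0 otherwise. */
static int sk_match(const char *s, size_t sn, const char *p, size_t pn)
{
  while (pn > 0) {
    if (*p == '*') {
      while (pn > 0 && *p == '*') {   /* collapse consecutive stars */
        p++;
        pn--;
      }
      if (pn == 0)
        return 1;                     /* trailing star matches the rest */
      for (size_t i = 0; i <= sn; i++)
        if (sk_match(s + i, sn - i, p, pn))
          return 1;                   /* star absorbed i bytes */
      return 0;
    }
    if (sn == 0)
      return 0;                       /* pattern needs one more byte */
    if (*p != '?') {
      if (*p == '\\' && pn > 1) {     /* escaped byte is taken literally */
        p++;
        pn--;
      }
      if (*p != *s)
        return 0;
    }
    p++; pn--;
    s++; sn--;
  }
  return sn == 0;                     /* both must be exhausted */
}

/* Comparison-style wrapper: 0 on match, 1 otherwise (assumed convention). */
static int sk_dif_pat(const char *s, size_t sn, const char *pat, size_t pn)
{
  assert(s != NULL && pat != NULL);
  return sk_match(s, sn, pat, pn) ? 0 : 1;
}
```

Note the double backslash needed in a C literal: the four byte pattern a\*c is written "a\\*c" in source and matches only the literal text a*c.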
int ModDifIp(const CDAT_IP_T *ip1, const CDAT_IP_T *ip2)
Compare the values of 2 IP addresses. Note that the arguments are pointers to IP address structures.
ip1 is greater, and -1 otherwise.
Example:
DECL_COLUMN(IPColumn_1, IP);
DECL_COLUMN(IPColumn_2, IP);

MOD_PROC_FUNC()
{
  if (ModDifIp(&$IPColumn_1, &$IPColumn_2) == 0)
    ...
  ...
}
void ModLog(const char *format, ...)
Print a message to stderr.
Example:
MOD_INIT_FUNC()
{
  if (arg_n != 1) {
    ModLog("%s: missing module argument\n", MOD_NAME);
    return 0;
  }
  ...
}
void *ZAlloc(size_t size)
Allocate size bytes of memory. This is the same as the C function malloc() except that the returned memory is initialized to zero.
Type *ZALLOC_TYPE(Type)
Allocate an object of type Type. This is a macro based on ZAlloc().
Type *ZALLOC_TYPE_N(Type, int num)
Allocate num objects of type Type. This is a macro based on ZAlloc().
int ReAlloc(void *orig_mem, size_t new_size)
This function works like a combination of the C functions malloc() and realloc() - it allocates new_size bytes if the original memory address is NULL or reallocates to new_size otherwise.
orig_mem is the address of the original memory address (i.e., an address of an address).
char *StrNDup(const char *b, int n)
Duplicate a data buffer b of length n (i.e., allocate memory and copy data).
If b is NULL, NULL is returned regardless of the value of n.
If n is greater than or equal to 0, b need not be null terminated.
If n is less than 0, b must be null terminated. The string length of b will be used as the data length.
BUF_INIT(BufData *buf)
Initialize a BufData structure. This should be done on any uninitialized BufData structure before it is used for the first time.
BUF_CLEAR(BufData *buf)
Clear a BufData structure. Do this before destroying a BufData structure.
int BufNCat(BufData *buf, const char *b, int n)
Append data buffer b of length n to the buffer in BufData structure buf.
The buf->s string is null terminated.
If b is NULL, the size of buf->s will be increased by n (if necessary), but no data will be copied. In other words, buf->s and buf->z may change, but buf->n will not.
If n is greater than or equal to 0, b need not be null terminated.
If n is less than 0, b must be null terminated. The string length of b will be used as the data length.
void HStrNSet(const ColDefn *col, CDAT_S_T *hs, const char *b, unsigned int n)
Replace hash string hs with one based on string buffer b and length n.
hs must have a value on input - either a valid hash string or 0.
If hs is the value of a column, specify the relevant column definition as col. This is similar to what MOD_CDAT_S_NSET() does.
If hs is not the value of a column, set col to 0.
Example:
DECL_DATA(CDAT_S_T my_str);

MOD_INIT_FUNC()
{
  HStrNSet(0, &MOD_DATA(my_str), "abc", 3);
  ...
}
...
MOD_DONE_FUNC()
{
  HStrDel(0, &MOD_DATA(my_str));
  ...
}
void HStrSet(const ColDefn *col, CDAT_S_T *hs, CDAT_S_T s)
Replace hash string hs with a copy of s.
hs must have a value on input - either a valid hash string or 0.
If hs is the value of a column, specify the relevant column definition as col. This is similar to what MOD_CDAT_S_SET() does.
If hs is not the value of a column, set col to 0.
void HStrDel(const ColDefn *col, CDAT_S_T *hs)
Delete (dereference) hash string hs. hs will be set to a generic blank hash string on return.
hs must have a value on input - either a valid hash string or 0.
If hs is the value of a column, specify the relevant column definition as col. This is similar to what MOD_CDAT_S_DEL() does.
If hs is not the value of a column, set col to 0.
Additional resources can be found in the low level include file “etc/include/pmod.h”.
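As an illustration of the append rules given earlier for BufNCat() (a NULL source only grows the buffer, a negative length means "use the string length", and the result stays null terminated), here is a plain C sketch of a growable buffer. Everything prefixed `Sk`/`sk_` is hypothetical: the field types, the growth policy and the 1/0 return convention are assumptions, and only the field names s, n and z are taken from the text above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
  char  *s;   /* data, kept null terminated once allocated */
  size_t n;   /* bytes in use (terminator not counted) */
  size_t z;   /* bytes allocated */
} SkBuf;      /* hypothetical stand-in for BufData */

/* Append n bytes of b to buf. Returns 1 on success, 0 on failure
 * (assumed convention). */
static int sk_buf_ncat(SkBuf *buf, const char *b, int n)
{
  size_t len, need;

  assert(buf != NULL);
  if (n < 0) {
    if (b == NULL)
      return 0;                 /* no way to derive a length */
    len = strlen(b);            /* negative n: b is null terminated */
  } else {
    len = (size_t)n;
  }
  need = buf->n + len + 1;      /* +1 for the terminator */
  if (need > buf->z) {
    size_t z = buf->z ? buf->z : 16;
    char *s;

    while (z < need)
      z *= 2;                   /* simple doubling growth policy */
    s = realloc(buf->s, z);     /* realloc(NULL, z) acts like malloc(z) */
    if (s == NULL)
      return 0;
    buf->s = s;
    buf->z = z;
  }
  if (b != NULL) {              /* NULL source: reserve space only */
    memcpy(buf->s + buf->n, b, len);
    buf->n += len;
  }
  buf->s[buf->n] = '\0';
  return 1;
}

/* Release the storage and return the structure to its initial state. */
static void sk_buf_clear(SkBuf *buf)
{
  assert(buf != NULL);
  free(buf->s);
  buf->s = NULL;
  buf->n = 0;
  buf->z = 0;
}
```

A zero-initialized structure (`SkBuf buf = {0};`) plays the role of an initialized buffer in this sketch, and `sk_buf_clear()` should be called before the structure goes out of scope.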
The ability to address columns by their names is a key feature of the module script support. Both ColName and $ColName are designed to address columns, but they differ in these ways:
ColName (without the leading dollar sign) refers to an abstract column reference. It is only valid in module helpers.
$ColName (with the leading dollar sign) is a shorthand for MOD_CDAT(ColName). It refers to a column’s value. It acts like a program variable of type CDAT_*_T (see column datatypes). It can be used anywhere program variables are appropriate.