aq_udb

Udb server interface

Synopsis

aq_udb [-h] Global_Opt Mnt_Spec|Order_Spec|Export_Spec

Global_Opt:
    [-verb] [-stat] [-test]
    [-server AdrSpec [AdrSpec ...]]
    [-local]

Mnt_Spec:
    -crt[,AtrLst] DbName |
    -clr[,AtrLst] DbName[:TabName] |
    -probe[,AtrLst] DbName

Order_Spec:
    -ord[,AtrLst] DbName[:TabName] [ColName ...]

Export_Spec:
    -exp[,AtrLst] DbName[:TabName] |
    -cnt[,AtrLst] DbName[:TabName] |
    -scn[,AtrLst] DbName[:TabName]
    [-seed RandSeed]
    [-var ColName Val]
    [-pp TabName
      [-bvar ColName Val]
      [-eval ColName Expr]
      [-filt FilterSpec]
      [-goto DestSpec]
      [-del_row | -del_key]
    -endpp]
    [-bvar ColName Val]
    [-eval ColName Expr]
    [-filt FilterSpec]
    [-goto DestSpec]
    [-del_row | -del_key]
    [-mod ModSpec [ModSrc]]
    [-lim_key Num] [-lim_rec Num]
    [-sort[,AtrLst] [ColName ...] [-top Num]]
    [-o[,AtrLst] File] [-c ColName [ColName ...]]

Description

aq_udb is a client of the Udb server. It is used to send commands to the server (or a pool of servers) to manipulate and/or export the data held by the server. It can also instruct the server to clear a portion or all of the held data.

Data manipulation can be done using builtin options or through a custom module that is dynamically loaded on the server side.

Note: Data import to the Udb server is done by aq_pp.

Options

-test

Test command line arguments and exit. If specified twice (-test -test), a more thorough test will be attempted. For example, the program will try to connect to Udb in test mode.

  • If all specs are good, the exit code will be 0.
  • If there is an error, the exit code will be non-zero. Usually, an error message will also be printed to stderr.
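
Example (illustrative; mydb, Test and column t are placeholder names):

$ aq_udb -test -test -exp mydb:Test -filt 't > 123456789'
  • Check the command specs and attempt a test-mode connection to the mydb servers; a 0 exit code means the specs are good.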
-verb
Verbose - print program progress to stderr while processing. Usually, a marker is printed for each 10,000,000 records processed.
-stat

Print a record count summary line to stderr at the end of processing. The line has the form:

aq_udb: rec=Count
-server AdrSpec [AdrSpec ...]
Set the target servers. If given, the server spec in the Udb spec file will be ignored. AdrSpec has the form IP_or_Domain[|IP_or_Domain_Alt][:Port]. See udb.spec for details.
-local
Tell the program to connect to the local servers only. Local servers are those in the server spec (from the Udb spec file or -server option) whose IP matches the local IP of the machine the program is running on.
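
Example (illustrative; the host names and port are placeholders):

$ aq_udb -server udb1.example.com:3300 udb2.example.com:3300 -exp mydb:Test
  • Export Test from the two listed servers, ignoring the server spec in the Udb spec file.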
-crt[,AtrLst] DbName

Create a database explicitly. Normally, a database is created automatically during an import (see aq_pp). DbName is the database name (see Target Database). Note that it is not an error to create a database that already exists as long as the database definition is identical.

Optional AtrLst is a comma separated list containing:

  • spec=UdbSpec - Set the spec file directly (see Target Database).
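
Example (mydb is a placeholder database name):

$ aq_udb -crt mydb
  • Explicitly create the mydb database as defined in its Udb spec file.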

-clr[,AtrLst] DbName[:TabName]

Remove/reset the data of a table/vector. DbName is the database name (see Target Database). TabName is a table/vector name in the database. Specific clear actions are:

  • For a table, its records are removed.
  • For a vector, its columns are reset to 0/blank.
  • For the Var vector (i.e., when TabName is “var”), its columns are reset to 0/blank.
  • If TabName is not given or if it is a ”.” (a dot), everything will be cleared - all keys, tables, vectors, the “var” vector and the database definition will all be removed.

Optional AtrLst is a comma separated list containing:

  • spec=UdbSpec - Set the spec file directly (see Target Database).
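
Example (mydb and Test are placeholder names):

$ aq_udb -clr mydb:Test
  • Remove all records from table Test.
$ aq_udb -clr mydb
  • Clear everything held for mydb - keys, tables, vectors, the “var” vector and the database definition.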

-probe[,AtrLst] DbName

Check that the servers associated with DbName are healthy and that DbName has been defined on the servers.

  • If all servers respond successfully, the exit code will be 0.
  • If a connection fails or DbName is not defined, the exit code will be non-zero. Usually, an error message will be printed on stderr.
  • Use this with -verb and/or -stat to get more info if desired.

DbName is the database name (see Target Database). Optional AtrLst is a comma separated list containing:

  • spec=UdbSpec - Set the spec file directly (see Target Database).
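
Example (mydb is a placeholder database name):

$ aq_udb -verb -probe mydb
  • Check that all servers for mydb are reachable and have mydb defined; a 0 exit code means the probe succeeded.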

-ord[,AtrLst] DbName[:TabName] [ColName ...]

Sort records in a table for each key. The default sort order is ascending. The records are sorted internally; no output will be generated. DbName is the database name (see Target Database). TabName is a table name in the database. ColName sets the desired sort columns. If no ColName is given, the “TKEY” column is assumed (see udb.spec). If TabName is not given or if it is a ”.” (a dot), the behavior depends on whether any ColName is given:

  • No ColName - all tables with a “TKEY” will be sorted.
  • With ColName - sort by primary keys. ColName must belong to the key set. Note that this only sorts the keys on a per server basis. If the database is distributed over a server pool, the keys are not sorted across servers.

Optional AtrLst is a comma separated list containing:

  • spec=UdbSpec - Set the spec file directly (see Target Database).
  • dec - Sort in descending order. Default is ascending.
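
Example (illustrative; mydb, Test and column t are placeholder names):

$ aq_udb -ord,dec mydb:Test t
  • For each key, sort the rows of Test by column t in descending order.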
-exp[,AtrLst] DbName[:TabName]

Export data. DbName is the database name (see Target Database). TabName is a table/vector name in the database. If TabName is not given or if it is a ”.” (a dot), the primary keys will be exported. Optional AtrLst is a comma separated list containing:

  • spec=UdbSpec - Set the spec file directly (see Target Database).
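
Example (mydb and Test are placeholder names):

$ aq_udb -exp mydb:Test -o test_out.csv
  • Export all rows of Test to test_out.csv.
$ aq_udb -exp mydb
  • Export the primary keys of mydb to stdout.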

-cnt[,AtrLst] DbName[:TabName]

Count unique primary keys and rows. DbName is the database name (see Target Database). TabName is a table/vector name in the database. If TabName is not given or if it is a ”.” (a dot), the primary keys will be counted. Optional AtrLst is a comma separated list containing:

  • spec=UdbSpec - Set the spec file directly (see Target Database).
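
Example (mydb and Test are placeholder names):

$ aq_udb -cnt mydb:Test
  • Count the unique primary keys and the rows of Test.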

-scn[,AtrLst] DbName[:TabName]

Scan data only. No output will be produced. This option is typically used along with certain data processing rules (see Data Processing Steps) and/or a data processing module (see -mod). DbName is the database name (see Target Database). TabName is a table/vector name in the database. If TabName is not given or if it is a ”.” (a dot), the primary keys will be scanned - this is typically used with -pp rules. Optional AtrLst is a comma separated list containing:

  • spec=UdbSpec - Set the spec file directly (see Target Database).
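
Example (a sketch; mydb, Test, the bytes column and the Var column Total are placeholders):

$ aq_udb -scn mydb:Test -eval Total 'Total + bytes'
  • Scan Test without producing output, accumulating the bytes column of each row into the Var vector column Total.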

-seed RandSeed
Set the seed of the random sequence used by the -eval builtin variable $Random.
-var ColName Val

Set the value of the Var vector column ColName to Val. A Var vector must be defined in the Udb spec file and ColName must be a column in that vector. See udb.spec for details. Note that a string Val must be quoted, see String Constant spec for details.

  • Var columns can also be altered by -eval and modules (see -mod).
  • Var column values are persistent until they are cleared by a -clr operation, at which point the columns are reset to 0 or blank.

Example:

$ aq_udb ... -var Var1 0 ...
  • Initialize Var1 in the Var vector to 0 before any key is processed.
-bvar ColName Val

Same as -var except that the column is set to Val repeatedly as each key is processed before other processing rules are executed. Note that a string Val must be quoted, see String Constant spec for details.

This rule can also be used within a -pp group. In this case, ColName is set to Val as each key is processed before other pre-processing rules are executed.

See Data Processing Steps for details on these usages.

Example:

$ aq_udb ... -pp -bvar Var1 0 ...
  • Initialize Var1 in Var vector to 0 as each key is processed.
-eval ColName Expr

For each row in the table/vector being exported/counted/scanned, evaluate expression Expr and place the result in a column identified by ColName. The column can be part of the target table or the Var vector.

This rule can also be used within a -pp group. In this case, the target table becomes the -pp table. Note that -eval rules inside -pp groups are evaluated before those for the target table/vector. See Data Processing Steps for details.

Expr is the expression to evaluate. Data type of the evaluated result must be compatible with the data type of the target column. For example, string result for a string column and numeric result for a numeric column (there is no automatic type conversion; however, explicit conversion can be done using the To*() functions described below). Operands in the expression can be columns from the target table/vector, columns from other vectors, columns from the Var vector, constants, builtin variables and functions.

  • Column names are case insensitive. Do not quote the name. To address columns other than those in the target table/vector, use the VecName.ColName format. For the Var vector, VecName is optional unless ColName also exists in the target.
  • String constants must be quoted, see String Constant spec for details.
  • For a numeric type evaluation, supported operators are '*', '/', '%', '+', '-', '&', '|' and '^'.
  • Depending on the operand type, evaluation may use 64-bit floating point arithmetic or 64-bit signed integral arithmetic. For example, “1 + 1” is evaluated using integral arithmetic while “1 + 1.0” is evaluated using floating point arithmetic. Similarly, “Col1 + 1” may use either arithmetic depending on Col1’s type while “Col1 + 1.0” always uses floating point.
  • For a string type evaluation, the only supported operator is ‘+’ for concatenation.
  • Certain types can be converted to one another using the builtin functions ToIP(), ToF(), ToI() and ToS().
  • Operator precedence is NOT supported. Use '(' and ')' to group operations as appropriate.

Builtin variables:

$Random
A random number (positive integer). Its value changes every time the variable is referenced. The seed of this random sequence can be set using the -seed option.
$RowNum
Represents the per-key, per-table row index (one-based). It is generally used during a table scan to identify the current row number.

Standard functions:

See aq-emod for a list of supported functions.

Example:

$ aq_udb -exp mydb:Test
    -eval c_delta 'c1 - c2'
  • Calculate c_delta before exporting.
-filt FilterSpec

For each row in the table/vector being exported/counted/scanned, evaluate FilterSpec and use the result to determine whether to keep the data row. The result can also be used in an -if/-elif/-endif block (see Rule Execution Controls).

This rule can also be used within a -pp group. In this case, the target table becomes the -pp table. Note that -filt rules inside -pp groups are evaluated before those for the target table/vector. See Data Processing Steps for details.

FilterSpec is the filter to evaluate. It has the basic form [!] LHS [<compare> RHS] where:

  • The negation operator ! negates the result of the comparison. It is recommended that !(...) be used to clarify the intended operation even though it is not required.
  • LHS and RHS can be:
    • A column name (case insensitive). Do not quote the name. The column can be part of the target table/vector, other vectors, and/or the Var vector. To address columns other than those in the target table/vector, use the VecName.ColName format. For the Var vector, VecName is optional unless ColName also exists in the target.
    • A constant, which can be a string, a number or an IP address. A string constant must be quoted, see String Constant spec for details.
    • An expression to evaluate as defined under -eval.
  • If only the LHS is given, its value will be used as a boolean - a non-blank string or a non-zero number/IP is True; otherwise it is False.
  • Supported comparison operators are:
    • ==, >, <, >=, <= - LHS and RHS comparison.
    • ~==, ~>, ~<, ~>=, ~<= - LHS and RHS case insensitive comparison; string type only.
    • !=, !~= - Negation of the above equal operators.
    • &= - Perform a “(LHS & RHS) == RHS” check; numeric types only.
    • !&= - Negation of the above.
    • & - Perform a “(LHS & RHS) != 0” check; numeric types only.
    • !& - Negation of the above.

More complex expressions can be constructed by using (...) (grouping), ! (negation), || (or) and && (and). For example:

LHS_1 == RHS_1 && !(LHS_2 == RHS_2 || LHS_3 == RHS_3)

Example:

$ aq_udb -exp mydb:Test
    -filt 't > 123456789'
  • Export only rows of Test with ‘t > 123456789’.
$ aq_udb -exp mydb:Test
    -filt 'Eval($Random % 100) == 0'
  • Randomly select roughly 1/100th of the rows for export.
-goto DestSpec

Go to DestSpec. This is usually done conditionally within a -if/-elif/-endif block (see Rule Execution Controls for details).

DestSpec is the destination to go to. It is one of:

  • next_key - Stop processing the current key and start over on the next key.
  • next_row - Stop processing the current row and start over on the next row.

This rule can also be used within a -pp group. In this case, these additional destinations are supported:

  • proc_key - Terminate all -pp processing (i.e., stop the current -pp group and skip all pending -pp groups) and start the export/count/scan operation for the current key.
  • next_pp - Stop the current -pp group and start the next one.
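
Example (a sketch; mydb, Test, Test2 and column flag are placeholders):

$ aq_udb -exp mydb:Test
    -pp Test2
      -if -filt 'flag == "no"'
        -goto next_key
      -endif
    -endpp
  • Skip any key whose Test2 table has a row with flag equal to "no"; export Test only for the remaining keys.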
-del_row[,AtrLst]

Delete the current row in the database. No more processing on the current row will be done.

Optional AtrLst is a comma separated list containing:

  • post=DestSpec - Set the action to take after the delete. DestSpec is one of:
    • next_key - Stop processing the current key and start over on the next key.
    • proc_key - Skip all pending -pp groups and start the export/count/scan operation for the current key.
    • next_row - Start processing the next row. This is the default behavior.
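
Example (a sketch; mydb, Test and column t are placeholders):

$ aq_udb -scn mydb:Test
    -if -filt 't < 123456789'
      -del_row
    -endif
  • Scan Test and delete every row whose t value is below 123456789.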
-del_key[,AtrLst]

Delete the current key and its associated data from the database. No more processing on the current key will be done.

Optional AtrLst is a comma separated list containing:

  • post=DestSpec - Set the action to take after the delete. DestSpec is one of:
    • next_key - Start processing the next key. This is the default behavior.
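
Example (a sketch; mydb is a placeholder database name):

$ aq_udb -scn mydb
    -pp .
      -if -filt 'Eval($Random % 100) != 0'
        -del_key
      -endif
    -endpp
  • Keep a roughly 1% random sample of the keys by deleting the other keys and their associated data.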
-mod ModSpec [ModSrc]

Specify a module to be loaded on the server side during an export/count/scan operation. A module contains one or more processing functions which are called as each key is processed according to the Data Processing Steps. Only one such module can be specified.

ModSpec has the form ModName or ModName(Arg1, Arg2, ...) where ModName is the module name and Arg* are module dependent arguments. Note that the arguments must be literals - string constants (quoted), numbers or IP addresses. ModSrc is an optional module source file containing:

  • A module script source file that can be used to build the specified module. See the Udb module script compiler documentation for more information.
  • A ready-to-use module object file. It must have a .so extension.

Without ModSrc, the server will look for a preinstalled module matching ModName. Standard modules:

roi("VecName.Count_Col", "TabName.Page_Col", "Page1[,AtrLst]", ...)

Module for ROI counting. ROI spec is given in the module arguments. There are 3 or more arguments:

  • VecName.Count_Col - Column to save matched count to. It must have type I.
  • TabName.Page_Col - Column to get the match value from. It must have type S. Rows in the table must already be in the desired ROI matching order (usually ascending time order).
  • PageN[,AtrLst] - One or more pages to match against the TabName.Page_Col value. Each page is given as a separate module argument. Optional AtrLst is a comma separated list containing:
    • ncas - Do case insensitive match.
    • seq - Require that the page match occur immediately after the previous match (i.e., no unmatched page in between). Applicable to the second and subsequent pages only.

Either exact or wildcard match can be done. Exact match will either match the entire TabName.Page_Col value or up to (but not including) a ‘?’ or ‘#’ character. Wildcard match is done if Page contains ‘*’ (matches any number of bytes) and/or ‘?’ (matches any 1 byte). Literal ‘,’, ‘:’, ‘*’, ‘?’ and ‘\’ in Page must be ‘\’ escaped.
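
Example (a sketch; mydb, the Stats.roi_hits counter column, the Clicks.page column and the page patterns are placeholders):

$ aq_udb -scn mydb -mod 'roi("Stats.roi_hits", "Clicks.page", "/home*", "/cart*,seq")'
  • Run the preinstalled roi module on each key: match the Clicks.page values against /home* and then /cart* (seq requires the /cart* match to immediately follow the /home* match) and save the resulting count to Stats.roi_hits.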

-pp[,AtrLst] TabName [-bvar ... -eval ... -filt ... -goto ... -del_row ...] -endpp

-pp groups one or more -bvar, -eval, -filt, -goto, -del_row and -del_key actions together. Each group performs pre-processing on a set of key specific data (e.g., a table). It is done before the main export/count/scan action. See Data Processing Steps for details.

TabName sets the target table/vector for the rules in the -pp group. It may refer to a table/vector or the primary key set. To target a table/vector, specify its name. To target the primary key set, specify a ”.” (a dot). ”.” is a pseudo vector containing the primary key columns.

Optional AtrLst is a comma separated list containing:

  • post=DestSpec - Set the action to take after all the rows in the target table have been exhausted. DestSpec is one of:
    • next_key - Stop processing the current key and start over on the next key.
    • proc_key - Skip all pending -pp groups and start the export/count/scan operation for the current key.
    • next_pp - Start the next -pp group. This is the default behavior.

The -bvar rules in the group are always executed first. Then the list of -eval, -filt, -goto, -del_row and -del_key rules are executed in order. Rule executions can also be made conditional by adding “if-else” controls, see Rule Execution Controls for details.

-endpp marks the end of a -pp group.

Example:

$ aq_udb -exp mydb:Test1
    -pp,post=next_key 'Test2'
      -goto proc_key
  • Only export Test1 from keys whose Test2 table is not empty. If Test2 is not empty, the -goto rule will be executed on the first row, causing execution to jump to export processing; in this way, the post action is not triggered. However, if Test2 is empty, -goto is not executed and post is triggered.
$ aq_udb -exp mydb:Test
    -pp .
      -filt 'Eval($Random % 100) == 0'
    -endpp
    -filt 't > 123456789'
  • Randomly select roughly 1/100th of the keys for export. From this subset, export only rows of Test with ‘t > 123456789’. Note that -endpp is mandatory here to prevent misinterpretation of the 2nd -filt.
-lim_key Num

Limit export output to the given Num keys. Default is 0, meaning no limit.

Note: If the data is distributed over multiple servers, the result exported can be less than expected if Num is close to Total_Num_Keys / Num_Servers.

-lim_rec Num

Limit export output to the given Num records. Default is 0, meaning no limit.

Note: If the data is distributed over multiple servers, the result exported can be less than expected if Num is close to Total_Num_Records / Num_Servers.
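
Example (mydb and Test are placeholder names):

$ aq_udb -exp mydb:Test -lim_rec 1000 -o sample.csv
  • Export at most 1000 records from Test to sample.csv.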

-sort[,AtrLst] ColName ... [-top Num]

Export output post-processing option for -exp. It sets the output sort columns. Note that the sort columns must be among the output columns.

Optional AtrLst is a comma separated list containing:

  • dec - Sort in descending order. Default order is ascending.

-top limits the output to the top Num records in the result.

Note: Sort should not be used if the output contains columns other than those from the target table/vector (e.g. other vector columns).
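
Example (illustrative; mydb, Test and column t are placeholder names):

$ aq_udb -exp mydb:Test -sort,dec t -top 100 -o top100.csv
  • Export the rows of Test, sort the output by t in descending order and keep only the top 100 records.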

-o[,AtrLst] File

Export output option. Set the output attributes and file. See the aq_tool output specifications manual for details. If this option is not used with an export, data is written to stdout.

Example:

$ aq_udb -exp mydb:Test ... -o,esc,noq -
  • Output to stdout in a format suitable for Amazon Cloud.
-c ColName [ColName ...]

Select columns to output during an export.

  • For a table/vector export, columns from the target table/vector, columns from other vectors, and/or columns from the Var vector can be selected. Default output includes all target table/vector columns.
  • For a primary key export, columns from the primary key, columns from any vectors, and/or columns from the Var vector can be selected. Default output includes the primary key columns only.
  • For a Var vector export, only columns from the Var vector can be selected. Default output includes all Var vector columns.

To address columns other than those in the target table/vector, use the VecName.ColName format. For the Var vector, VecName is optional unless ColName also exists in the target.

A ColName can be preceded with a ~ (or !) negation mark. This means that the column is to be excluded.

Example:

$ aq_udb -exp mydb:Test ... -c Test_Col1 ... Test_ColN Var_Col1 ... Var_ColN
  • Output Var vector columns along with columns from Test. Even though Test_Col* are normally exported by default, they must be listed explicitly in order to include any Var_Col*.

Exit Status

If successful, the program exits with status 0. Otherwise, the program exits with a non-zero status code along with error messages printed to stderr. Applicable exit codes are:

  • 0 - Successful.
  • 1 - Memory allocation error.
  • 2 - Command option spec error.
  • 3 - Initialization error.
  • 4 - System error.
  • 5 - Missing or invalid license.
  • 11 - Input open error.
  • 12 - Input read error.
  • 13 - Input processing error.
  • 21 - Output open error.
  • 22 - Output write error.
  • 31 - Udb connect error.
  • 32 - Udb communication error.

String Constant

A string constant must be quoted between double or single quotes. With double quotes, special character sequences can be used to represent special characters. With single quotes, no special sequence is recognized; in other words, a single quote cannot occur between single quotes.

Character sequences recognized between double quotes are:

  • \\ - represents a literal backslash character.
  • \" - represents a literal double quote character.
  • \b - represents a literal backspace character.
  • \f - represents a literal form feed character.
  • \n - represents a literal new line character.
  • \r - represents a literal carriage return character.
  • \t - represents a literal horizontal tab character.
  • \v - represents a literal vertical tab character.
  • \0 - represents a NULL character.
  • \xHH - represents a character whose HEX value is HH.
  • \<newline> - represents a line continuation sequence; both the backslash and the newline will be removed.

Sequences that are not recognized will be kept as-is.

Two or more quoted strings can be used back to back to form a single string. For example,

'a "b" c'" d 'e' f" => a "b" c d 'e' f

Target Database

aq_udb obtains information about the target database from a spec file. The spec file contains server IPs (or domain names) and table/vector definitions. See udb.spec for details. aq_udb finds the relevant spec file in several ways:

  • The spec file path is taken from the spec=UdbSpec attribute of the -crt, -ord, -exp, -cnt, -scn, -clr or -probe option.
  • The spec file path is deduced implicitly from the DbName parameters of the -crt, -ord, -exp, -cnt, -scn, -clr or -probe option. This method sets the spec file to “.conf/DbName.spec” in the runtime directory of aq_udb.
  • If none of the above information is given, the spec file is assumed to be “udb.spec” in the runtime directory of aq_udb.
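
Example (a sketch; the spec file path, mydb and Test are placeholders):

$ aq_udb -exp,spec=/etc/aq/mydb.spec mydb:Test
  • Use /etc/aq/mydb.spec as the Udb spec file for the export.
$ aq_udb -exp mydb:Test
  • Without a spec attribute, the spec file is deduced from the DbName: .conf/mydb.spec in the runtime directory.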

Rule Execution Controls

Rule execution, both in the main rule list and within -pp groups, can be made conditional using the -if[not], -elif[not], -else and -endif construction:

-if[not] RuleToCheck
  RuleToRun
  ...
-elif[not] RuleToCheck
  RuleToRun
  ...
-else
  RuleToRun
  ...
-endif

Supported RuleToCheck rules are -eval and -filt. Supported RuleToRun rules are -eval, -filt, -goto, -del_row and -del_key.

Example:

$ aq_udb -exp mydb:Test
    -pp Test
      -bvar v_seq 0
      -if -filt 'flag == "yes"'
        -eval v_seq 'v_seq + 1'
        -eval c3 'v_seq'
      -else
        -eval c3 '0'
      -endif
  • Before exporting Test, assign a per key sequence number to column c3 if the “flag” column is “yes” or just 0 otherwise. Note that -bvar rules are always executed before the others regardless of their placement within a -pp group.

Data Processing Steps

For each export/count/scan operation, data is processed according to the commandline options in this way:

  • Initialize Var columns according to the -var options.
  • Scan the primary keys. For each key in the database:
    • Execute -pp groups in the order they are specified on the commandline. For each -pp group:
      • Initialize Var columns according to the -bvar rules.
      • Scan the -pp table. For each row in the table:
        • Execute the group's -eval, -filt, -goto, -del_row and -del_key rules (including any “if-else” controls) in order.
      • When all the rows are exhausted, follow the post attribute setting or start the next group by default.
    • Initialize Var columns according to the -bvar rules.
    • If a module is specified (see -mod) and it has a key-level processing function, the function is called. This function can inspect and/or modify any data associated with the key. It can also tell the server to skip the current key so that it will not be exported/counted/scanned.
    • Process the target export/count/scan table. For each data row in the target table:
      • Execute the list of -eval, -filt, -goto, -del_row and -del_key rules (including any “-if-elif-else-endif” controls) in order.
      • If a module is specified (see -mod) and it has a row processing function, the function is called. This function can inspect and/or modify the current data row. It can also tell the server to skip the current row so that it will not be exported/counted/scanned.
      • Export/count the current data row.
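
Example (a sketch; mydb, Test, Test2, column status and the Var column HasErr are placeholders):

$ aq_udb -exp mydb:Test
    -pp Test2
      -bvar HasErr 0
      -if -filt 'status == "err"'
        -eval HasErr '1'
      -endif
    -endpp
    -if -filt 'HasErr == 0'
      -goto next_key
    -endif
  • For each key, pre-scan Test2 and raise the per-key Var flag HasErr if any row has status "err"; then export the Test rows only for keys whose flag was raised.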

See Also