OpenLogReplicator.json file format

All parameters are stored in OpenLogReplicator.json for version 0.8.9. Please use the example file as a starting point. Below are described the global parameters, the parameters for the Oracle database reader module, and the parameters for the Kafka writer module.

Please keep in mind that until the release of version 1.0 the format may change between versions.

Global parameters

version (text, required) – The value must be set to the current version of the program. This is a safety check which forces updating the JSON configuration file after updating the binaries to a newer version. After updating the binaries always check the documentation for parameter changes and verify that the JSON configuration file is correct.
sources (list, required) – Oracle database sources (details below).
targets (list, required) – Output targets (details below).
dump-redo-log (number, default: 0) – This parameter is primarily used for verifying that the program works correctly. It can create a stream similar to the output of the logdump command, which can be compared as text to verify that certain parameters have been decoded correctly. Please mind, however, that logdump output can contain some inconsistencies.
Possible values:
0 – no logdump file is created;
1 – for every processed redo log file a <database>-<nnn>.logdump file is created (<database> – database name, <nnn> – redo log sequence); the file contains results similar to the logdump command; please note that not all redo OP codes are parsed and analyzed, so the result can be smaller than the original output of logdump; any difference in output compared to the original logdump command can be an indication of improper log parsing;
2 – like 1, but additional information is printed which does not appear in logdump output – for example details about supplemental log groups.
dump-raw-data (number, default: 0) – This parameter is only valid when the dump-redo-log parameter is set to 1 or 2. Possible values:
0 – no hex dump is added to the DUMP-<nnn>.trace file;
1 – before the logdump information, every vector is dumped in full in HEX format – useful for analyzing the content.
trace (number, default: 2) – This parameter defines the verbosity of output messages. All messages are sent to the stderr output stream. Possible values:
0 – silent, just basic information about startup and errors is sent to output;
1 – warning – like 0 + warnings;
2 – info – like 1 + info;
3 – full – like 2 + additional details about certain operations.
trace2 (number, default: 0) – This parameter is used mainly for debugging. The value is a logical map of various trace parameters; please refer to the source code for details.
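Taken together, the global parameters form the top level of the file. A minimal sketch with the defaults written out (the sources and targets lists are left empty here only for brevity; both are required and described below):

```json
{
  "version": "0.8.9",
  "dump-redo-log": 0,
  "dump-raw-data": 0,
  "trace": 2,
  "trace2": 0,
  "sources": [],
  "targets": []
}
```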

Element of sources list

alias (text, required) – The name of the source, referenced later by the target.
name (text, required) – This name is used for identifying the database connection. It is mentioned in logs and used when creating the checkpoint and schema files.
reader (group, required) – Configuration of the redo log reader (details below).
format (group, required) – Configuration of the format used to store transactions (details below).
checkpoint (group, optional) – Configuration of checkpoint processing (details below).
arch (text, default: path, or list for the batch reader) – This parameter defines how the list of archived redo log files is obtained. Possible values are:
online – the archived log list is read using a database connection; the connection is closed during normal work and only opened occasionally to read the archived redo log list; valid only for the online, asm and standby modes;
online-keep – like online, but the database connection is kept open;
path – the redo log list is read from disk; valid for all readers except batch;
list – like path, but the list of files is provided explicitly; the default mode for the batch reader.
tables (list) – A list of elements which defines the set of tables that should be replicated. The list can contain SQL wildcards. Every element should contain a table element and optionally a key element. The table element is used as a wildcard to define the table name. Example:
"tables": [{"table": "owner1.table1"}, {"table": "owner2.table2", "key": "col1, col2, col3"}, {"table": "sys.%"}]. If a table does not contain a primary key, a custom set of columns can be treated as the primary key. See the quick start document for further details about primary key behavior.
flags (number, default: 0) – A logical sum of various flags:
0x0001 – read only archived redo logs;
0x0002 – schemaless mode: the program can operate when there is no schema available; please refer to the details about this mode;
0x0004 – don't use direct read (O_DIRECT) for reading redo log files;
0x0008 – don't use O_NOATIME for open files;
0x0010 – ignore basic errors and continue redo log processing;
0x0020 – allow basic tracing of DDL operations (an unsupported feature for debugging DDL operations);
0x0040 – show invisible columns in the output;
0x0080 – show hidden constraint columns in the output;
0x0100 – include incomplete transactions in the output (useful when OpenLogReplicator starts reading a transaction from the middle, without having read the beginning);
0x0200 – include system transactions in the output;
0x0400 – flush output transactions on exit;
0x0800 – don't delete old checkpoint files;
0x1000 – don't delete old schema files.
memory-min-mb (number, min: 16, max: memory-max-mb, default: 32) – Amount of memory in megabytes allocated at startup and the desired amount of allocated memory during work. If memory is dynamically allocated in a greater amount, it is released as soon as it is no longer required. See the notes for memory-max-mb about memory for the Kafka buffer.
memory-max-mb (number, min: 16, default: 1024) – Maximum amount of memory in megabytes the program can allocate. Memory for sending big JSON messages to Kafka is not included here; it is allocated on demand separately.
read-buffer-max-mb (number, min: 1, max: max memory, default: 1/4 of memory) – Size of the memory buffer used for disk read verification.
redo-read-sleep-us (number, default: 10000) – Number in microseconds. Amount of time the program sleeps when all data from the online redo log has been read and the program is waiting for more transactions.
arch-read-sleep-us (number, default: 10000000) – Number in microseconds. Valid only for *-ARCH modes. Amount of time the program sleeps when the whole archived redo log has been processed and the program is waiting for a new archived redo log.
arch-read-retry (number, default: 3) – How many times reading an archived redo log should be retried before failing.
redo-verify-delay-us (number, default: 50000) – By default all redo log file read operations are repeated to prevent reading inconsistent data on some filesystems (like ext4 or btrfs); this parameter defines the time delay after which the redo log file data is read a second time for verification.
event-owner (text) – Owner for the event-table.
event-table (text) – This is a technical parameter primarily used for running test cases. It contains a table name. If any DML operation (like insert, update or delete) occurs for this table – not necessarily committed – the program stops.
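An illustrative element of the sources list combining the fields above (the alias, database name, connection details and table names are placeholders chosen for this sketch, not project defaults):

```json
{
  "alias": "S1",
  "name": "DB1",
  "memory-min-mb": 32,
  "memory-max-mb": 1024,
  "reader": {
    "type": "online",
    "user": "user1",
    "password": "secret",
    "server": "//localhost:1521/DB1"
  },
  "format": {
    "type": "json"
  },
  "tables": [
    {"table": "OWNER1.TABLE1"},
    {"table": "OWNER1.TABLE2", "key": "COL1, COL2"}
  ]
}
```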

Reader group

type (text, required) – Possible values:
online – the default mode; reads online and archived redo logs and connects to the database to read metadata, creating a <database>-schema.json file;
asm – like online, but the redo log files can be read from an ASM instance; example file: OpenLogReplicator.json.example-asm;
offline – like online, but metadata is only read from the <database>-schema.json file and no connection to the database is required; example file: OpenLogReplicator.json.example-offline;
batch – only the archived redo logs listed by the redo-logs parameter are read; example file: OpenLogReplicator.json.example-batch;
standby – reads a standby Data Guard database; example file: OpenLogReplicator.json.example-standby.
user (text, for online and asm modes) – User for connecting to the database instance.
password (text, for online and asm modes) – Password for connecting to the database instance.
server (text, for online and asm modes) – Connect string for connecting to the database instance, in the form: "//<host>:<port>/<service>".
user-asm (text, for asm mode) – User for connecting to the Oracle ASM instance.
password-asm (text, for asm mode) – Password for connecting to the ASM instance.
server-asm (text, for asm mode) – Connect string for connecting to the ASM instance, in the form: "//<host>:<port>/<service>".
disable-checks (number, for online and asm modes) – A logical sum of various flags:
0x0001 – during startup the program does not check if the database user has the appropriate grants to system tables (valid only for online and online-arch modes);
0x0002 – during startup the program does not check if the listed tables have supplemental logging for primary keys enabled;
0x0004 – disable the CRC check for read blocks.
path-mapping (list, for online and offline modes) – List of path pairs [before1, after1, before2, after2, …]. Every path (of an online or archived redo log) is compared with the list. If a prefix of the path matches beforeX, it is replaced with afterX. For example, the database may report the path /opt/fra/o1_mf_1_1991_hkb9y64l_.arc, but the file is mounted using sshfs under a different path; with
"path-mapping": ["/opt/fra", "/opt/fast-recovery-area"], the program would look for /opt/fast-recovery-area/o1_mf_1_1991_hkb9y64l_.arc instead.
redo-logs (list, for batch mode) – List of redo log files which should be processed in batch mode. The list can also contain folders; in that case all files in the folder are processed. Please refer to the example file: OpenLogReplicator.json.example-batch.
log-archive-format (text, see notes) – This parameter allows overriding the database configuration parameter log_archive_format. When FRA is configured, the format of the files is expected to be o1_mf_%t_%s_%h_.arc. When FRA is not used, the value for this parameter is read from the database configuration parameter log_archive_format.
redo-copy-path (text) – When this parameter is set, all data read from redo log files is also written to a file named path/<database>_<seq>.arc.
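For example, a reader group for online mode with a path mapping, shown here as a fragment of the source element (host, credentials and paths are placeholders):

```json
"reader": {
  "type": "online",
  "user": "user1",
  "password": "secret",
  "server": "//localhost:1521/DB1",
  "path-mapping": ["/opt/fra", "/opt/fast-recovery-area"]
}
```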

Format group

type (text, required) – Possible values:
json – transactions in the native JSON OpenLogReplicator format;
protobuf – transactions in Protocol Buffers format (not yet working, implementation in progress).
See the quick start chapter for details.
message (number, default: 0) – This parameter defines how DML operations are grouped in the output. Possible values:
0 – the whole transaction is one message;
1 – one message for every event: begin transaction, DML (insert, update, delete), commit transaction.
xid (number, default: 0) – Format of the transaction XID. Possible values:
0 – xid field: text format (x.y.z),
1 – xidn field: numeric format (combined and merged into a uint64).
timestamp (number, default: 0) – This parameter defines how the timestamp field is presented in the message. By default the tm field is used, with the timestamp in Unix epoch format. This field is a logical sum of:
1 – use the tms field instead, in ISO-8601 format (YYYY-MM-DDTHH:MI:SSZ);
2 – put the timestamp field in every message (when message is set to 0).
char (number, default: 0) – This parameter defines how (n)char and (n)varchar(2) values are sent to the output. Possible values:
0 – output written in Unicode format, using UTF-8 to encode characters;
1 – no character set transformation is applied, the characters are copied from the source "as is";
2 – like 0, but instead of characters the output is in HEX format (in "C" format – like 0xFF);
3 – like 1, but instead of characters the output is in HEX format (in "C" format – like 0xFF).
scn (number, default: 0) – By default the SCN is present only in the first message (when message = 0), in the scn field. This field is a logical sum of:
1 – print SCN values in hexadecimal format (in "C" format – like 0xFF) in the scns field;
2 – put the SCN field in every message (when message is set to 0).
unknown-type (number, default: 0) – This field determines how values which fail to map are sent to the output. Possible values:
0 – print '?' for unknown characters which fail to map;
1 – print '?' for unknown characters which fail to map, and print a warning to stderr.
schema (number, default: 0) – This field determines how the schema is sent to the output. This is a logical sum of values:
1 – print the full schema (including column descriptions), but only with the first message for every table;
2 – when the full schema is used, repeat it with every message;
4 – add the objn field to the schema description.
column (number, default: 0) – This parameter defines which columns should appear in the JSON output:
0 – default behavior: INSERT and DELETE contain only non-null values; UPDATE contains only changed columns or those which are members of the primary key;
1 – INSERT and DELETE contain all values; UPDATE contains only changed columns or those which are members of the primary key;
2 – the JSON output contains all columns which appear in the redo stream, including those which did not change.
Please mind that it is technically not possible to tell whether a column was actually mentioned in the UPDATE DML command or not: "UPDATE X SET A = A" might have the same redo log vector as "UPDATE X SET A = A, B = B" in some cases.
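A format group fragment putting the parameters above together (the non-default values here are chosen purely for illustration):

```json
"format": {
  "type": "json",
  "message": 0,
  "xid": 0,
  "timestamp": 1,
  "char": 0,
  "scn": 0,
  "column": 1
}
```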

Checkpoint group

path (text, default: checkpoint) – The path where checkpoint files are stored.
interval-s (number, default: 600) – Interval for checkpoint message writing – time in seconds.
interval-mb (number, default: 100) – Interval for checkpoint message writing – amount of processed redo log data in megabytes.
all (number, default: 0) – Send checkpoint messages as often as possible; can't be used together with the other parameters.
output-checkpoint (number, default: 1) – Should checkpoint records be sent to the output.
output-log-switch (number, default: 1) – Should a log switch event generate a checkpoint.
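A checkpoint group fragment with the documented defaults written out explicitly:

```json
"checkpoint": {
  "path": "checkpoint",
  "interval-s": 600,
  "interval-mb": 100,
  "all": 0,
  "output-checkpoint": 1,
  "output-log-switch": 1
}
```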

Target group

alias (text, required) – A logical name of the target, used in the JSON file for referencing.
source (text, required) – A logical name of the source which this target should be connected with.
writer (group, required) – Configuration of the output writer (details below).

Writer group

type (text, required) – Valid values are:
kafka – connect directly to a Kafka output stream and send transactions;
file – write output messages directly to a file;
network – stream using plain TCP/IP transmission;
zeromq – stream using ZeroMQ messaging.
topic (text, for kafka type) – Name of the Kafka topic used to send transactions as JSON messages.
brokers (text, for kafka type) – String list of Kafka brokers.
Example: "brokers": "host1:9092, host2:9092"
uri (text, for network and zeromq types) – For network: <host>:<port> – information for the network listener;
for zeromq: <protocol>://<host>:<port> – URI for the ZeroMQ connection.
name (text, for file type) – For the file type of output this option defines the name of the output file with the transactions. If no file is provided, the output is written to the stdout stream.
max-message-mb (number, min: 1, max: 953, default: 100, for kafka type) – Maximum size in MB of a message sent to Kafka. Memory for this buffer is allocated independently from the memory defined by memory-min-mb/memory-max-mb when a big message to Kafka is being constructed. If a transaction is close to this value it is divided into many parts. Every time such a situation occurs, a warning is printed to the log.
max-messages (number, min: 1, max: 10000000, default: 100000, for kafka type) – Defines the maximum number of messages handled by the client Kafka library (queue.buffering.max.messages).
enable-idempotence (number, min: 0, max: 1, default: 1, for kafka type) – The idempotent producer is enabled.
poll-interval-us (number, min: 100, max: 3600000000, default: 100000, for kafka, network and zeromq types) – Interval for polling for new messages.
checkpoint-interval-s (number, default: 10) – How often the checkpoint file should be updated.
queue-size (number, min: 1, max: 1000000, default: 65536) – Size of the message queue.
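An illustrative element of the targets list using the Kafka writer (the alias, source, topic and broker addresses are placeholders for this sketch):

```json
{
  "alias": "T1",
  "source": "S1",
  "writer": {
    "type": "kafka",
    "topic": "DB1",
    "brokers": "host1:9092, host2:9092",
    "max-message-mb": 100,
    "enable-idempotence": 1
  }
}
```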