OpenLogReplicator.json file format

All parameters are stored in OpenLogReplicator.json (format for version 0.7.15). Please use the example file as a starting point. Described below are the global parameters, the parameters for the Oracle database reader module, and the parameters for the Kafka writer module.

Please keep in mind that until the release of version 1.0 the format can change between versions.

Global parameters

version (text, required) – The value of this parameter must be set to the current version of the program. This is a safety check forcing the JSON configuration file to be reviewed after updating the binaries to a newer version. After updating the binaries, always check the documentation for parameter changes and verify that the JSON configuration file is correct.
sources (list, required) – Oracle database sources (details below).
targets (list, required) – Output targets (details below).
dump-redo-log (number, default: 0) – This parameter is primarily used for verifying that the program works correctly. It can create a stream similar to the output of the logdump command, which can be compared as text to check whether certain parameters have been decoded correctly. Please keep in mind, however, that logdump output can contain some inconsistencies.
Possible values:
0 – no logdump file is created;
1 – for every processed redo log file a <database>-<nnn>.logdump file is created (<database> – database name, <nnn> – redo log sequence); the file contains output similar to the logdump command; please note that not all redo OP codes are parsed and analyzed, so the result can be smaller than the original logdump output; any difference compared to the original logdump output can be an indication of improper log parsing;
2 – like 1, but additional information is printed which does not appear in the original logdump output – for example details about supplemental log groups.
dump-raw-data (number, default: 0) – This parameter is only valid when dump-redo-log is set to 1 or 2. Possible values:
0 – no hex dump is added to the DUMP-<nnn>.trace file;
1 – before the logdump information, every vector is dumped in full in HEX format – useful for analyzing the content.
trace (number, default: 2) – This parameter defines the amount of output messages. All messages are sent to the stderr output stream. Possible values:
0 – silent; only basic information about startup and errors is sent to the output;
1 – warning – like 0, plus warnings;
2 – info – like 1, plus informational messages;
3 – full – like 2, plus additional details about certain operations.
trace2 (number, default: 0) – This parameter is used mainly for debugging. The value is a logical sum of various trace flags; please refer to the code for details.
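Putting the global parameters together, a minimal top-level OpenLogReplicator.json skeleton might look as follows. This is an illustrative sketch only – the sources and targets lists must be filled with the elements described in the following sections, and the version string must match the binaries actually in use:

```json
{
  "version": "0.7.15",
  "trace": 2,
  "dump-redo-log": 0,
  "sources": [],
  "targets": []
}
```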

Element of sources list

alias (text, required) – The name of the source, referenced later by the Kafka target.
name (text, required) – This name identifies the database connection. It appears in logs and is used when creating the checkpoint and schema files.
reader (group, required) – Configuration of the redo log reader (details below).
format (group, required) – Configuration of the format used to store transactions (details below).
arch (text, default: path or list) – This parameter defines how the list of archived redo log files is obtained. Possible values:
online – the archived log list is read using the database connection; the connection is closed while the program works and is opened only occasionally to read the archived redo log list; valid only for the online, asm and standby modes;
online-keep – like online, but the database connection is kept open;
path – the redo log list is read from disk; valid for all readers except batch;
list – like path, but the list of files is provided explicitly; the default mode for the batch reader.
tables (list) – A list of elements defining the set of tables that should be replicated. The list can contain SQL wildcards. Every element should contain a table element and, optionally, a key element. The table element is used as a wildcard defining the table name. Example:
"tables": [{"table": "owner1.table1"}, {"table": "owner2.table2", "key": "col1, col2, col3"}, {"table": "sys.%"}]. If a table does not have a primary key, a custom set of columns can be treated as the primary key. See the quick start document for further details about primary key behavior.
flags (number, default: 0) – A logical sum of the following flags:
0x0001 – don't use direct read (O_DIRECT) for reading redo log files;
0x0002 – don't use O_NOATIME when opening files;
0x0004 – ignore basic errors and continue redo log processing;
0x0008 – allow basic tracing of DDL operations (an unsupported feature for debugging DDL operations);
0x0010 – disable read verification – can speed up read operations on certain file systems like XFS – don't use for EXT4 or BTRFS;
0x0020 – check the CRC of all redo log files before processing;
0x0040 – hide invisible columns from the output;
0x0080 – include incomplete transactions in the output (useful when OpenLogReplicator starts reading a transaction from the middle, without having read its beginning).
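As an illustration of the logical-sum convention: combining 0x0004 (ignore basic errors) with 0x0010 (disable read verification) gives 0x0014, which is written as the decimal number 20 in the JSON file:

```json
{
  "flags": 20
}
```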
memory-min-mb (number, min: 16, max: memory-max-mb, default: 32) – The amount of memory in megabytes allocated at startup and the desired amount of allocated memory during work. If more memory is dynamically allocated, it is released as soon as it is no longer required. See the notes for memory-max-mb about memory used for the Kafka buffer.
memory-max-mb (number, min: 16, default: 1024) – The maximum amount of memory in megabytes the program can allocate. Memory allocated for sending big JSON messages to Kafka is not included here; it is allocated on demand separately.
redo-read-sleep (number, default: 10000) – Value in microseconds. The amount of time the program sleeps when all data from the online redo log has been read and the program is waiting for more transactions.
arch-read-sleep (number, default: 10000000) – Value in microseconds. Valid only for *-ARCH modes. The amount of time the program sleeps when the whole archived redo log has been processed and a new archived redo log is not yet available.
checkpoint-interval (number, default: 10) – Value in seconds. Defines how often the program updates the checkpoint information in the <database>-chkpt.json file.
event-table (text) – This is a technical parameter used primarily for running test cases. It contains a schema and table name. If any DML operation (such as insert, update or delete) occurs on this table – not necessarily committed – the program stops.
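Assembled from the parameters above, a single element of the sources list could be sketched as below. All names, values and the table list are illustrative; the reader and format groups are placeholders, described in the following sections:

```json
{
  "alias": "S1",
  "name": "DB1",
  "arch": "path",
  "flags": 0,
  "memory-min-mb": 32,
  "memory-max-mb": 1024,
  "checkpoint-interval": 10,
  "reader": {},
  "format": {},
  "tables": [
    {"table": "owner1.table1"},
    {"table": "owner2.table2", "key": "col1, col2, col3"}
  ]
}
```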

Reader group

type (text, required) – Possible values:
online – the default mode; reads online and archived redo logs and connects to the database to read metadata; creates the <database>-schema.json file;
asm – like online, but the redo log files can be read from an ASM instance; example file: OpenLogReplicator.json.example-asm;
offline – like online, but metadata is read only from the <database>-schema.json file and no connection to the database is required; example file: OpenLogReplicator.json.example-offline;
batch – only the archived redo logs listed by the redo-logs parameter are read; example file: OpenLogReplicator.json.example-batch;
standby – reads a standby Data Guard database; example file: OpenLogReplicator.json.example-standby.
user (text, required for online and asm modes) – User for connecting to the database instance.
password (text, required for online and asm modes) – Password for connecting to the database instance.
server (text, required for online and asm modes) – Connect string for connecting to the database instance, in the form: //<host>:<port>/<service>
user-asm (text, required for asm mode) – User for connecting to the Oracle ASM instance.
password-asm (text, required for asm mode) – Password for connecting to the ASM instance.
server-asm (text, required for asm mode) – Connect string for connecting to the ASM instance, in the form: //<host>:<port>/<service>
disable-checks (number, for online and asm modes) – A logical sum of the following flags:
0x0001 – during startup the program does not check whether the database user has the appropriate grants to system tables (valid only for the online and online-arch modes);
0x0002 – during startup the program does not check whether the listed tables have supplemental logging enabled for primary keys.
path-mapping (list, for online and offline modes) – A list of path pairs [before1, after1, before2, after2, …]. Every path (of an online or archived redo log) is compared with the list; if a prefix of the path matches beforeX, it is replaced with afterX. For example, the database may report the path /db/fra/o1_mf_1_1991_hkb9y64l_.arc, but the file is mounted using sshfs under a different path, so with
"path-mapping": ["/db/fra", "/opt/fast-recovery-area"] the program would look for /opt/fast-recovery-area/o1_mf_1_1991_hkb9y64l_.arc instead.
redo-logs (list, for batch mode) – A list of redo log files that should be processed in batch mode. The list can also contain directories; in that case all files in the directory are processed. Please refer to the example file: OpenLogReplicator.json.example-batch.
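A reader group for the online mode, based on the parameters above, could be sketched as follows. The user, password, connect string and mapped paths are placeholders, not working values:

```json
{
  "type": "online",
  "user": "dbuser",
  "password": "dbpassword",
  "server": "//localhost:1521/ORCLPDB1",
  "path-mapping": ["/db/fra", "/opt/fast-recovery-area"]
}
```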

Format group

type (text, required) – Possible values:
json – transactions in the native OpenLogReplicator JSON format;
protobuf – transactions in Protocol Buffers format (not yet working, implementation in progress).
See the quick start chapter for details.
message (number, default: 0) – This parameter defines how DML operations are grouped in the output:
0 – the whole transaction is one message;
1 – one message for every event: begin transaction, DML (insert, update, delete), commit transaction.
xid (number, default: 0) – Format of the transaction XID. Values:
0 – xid field: text format (x.y.z);
1 – xidn field: numeric format (combined and merged into a uint64).
timestamp (number, default: 0) – This parameter defines how the timestamp field is presented in the message. By default the tm field is used, with the timestamp in Unix epoch format. The value is a logical sum of:
1 – use the tms field instead, with ISO-8601 format (YYYY-MM-DDTHH:MI:SSZ);
2 – put the timestamp field in every message (when message is set to 0).
char (number, default: 0) – This parameter defines how (n)char and (n)varchar(2) values are sent to the output. Values:
0 – output written in Unicode, using UTF-8 to encode characters;
1 – no character set transformation is applied; the characters are copied from the source as is;
2 – like 0, but instead of characters the output is in HEX format (in "C" style – like 0xFF);
3 – like 1, but instead of characters the output is in HEX format (in "C" style – like 0xFF).
scn (number, default: 0) – By default the SCN is present only in the first message (when message = 0), in the scn field. The value is a logical sum of:
1 – print SCN values in hexadecimal format (in "C" style – like 0xFF) in the scns field;
2 – put the SCN field in every message (when message is set to 0).
unknown (number, default: 0) – This field determines how unknown values are sent to the output. Possible values:
0 – print '?' for unknown characters which fail to map;
1 – print '?' for unknown characters which fail to map, and additionally write a warning to stderr.
schema (number, default: 0) – This field determines how the schema is sent to the output. The value is a logical sum of:
1 – print the full schema (including column descriptions), but only with the first message for every table;
2 – when the full schema is used – repeat it with every message;
4 – add the objn field to the schema description.
columns (number, default: 0) – This parameter defines which columns appear in the JSON output:
0 – default behavior: INSERT and DELETE contain only non-null values; UPDATE contains only changed columns and those which are members of the primary key;
1 – INSERT and DELETE contain all values; UPDATE contains only changed columns and those which are members of the primary key;
2 – the JSON output contains all columns which appear in the redo stream, including those which did not change.
Please keep in mind that it is technically not possible to determine whether a column was actually mentioned in the UPDATE DML command: in some cases "UPDATE X SET A = A" might produce the same redo log vector as "UPDATE X SET A = A, B = B".
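A format group combining several of the options described above might look like the sketch below; all values are illustrative (JSON output, whole transaction as one message, ISO-8601 timestamps, SCN in every message):

```json
{
  "type": "json",
  "message": 0,
  "xid": 0,
  "timestamp": 1,
  "char": 0,
  "scn": 2,
  "schema": 1,
  "columns": 0
}
```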

Target group

alias (text, required) – A logical name of the target, used in the JSON file for referencing.
source (text, required) – A logical name of the source that this target should be connected with.
format (group, required) – Configuration of the output writer (details below).

Writer group

type (text, required) – Valid values:
kafka – connect directly to a Kafka output stream and send transactions;
file – write output messages directly to a file.
topic (text, for the kafka type) – Name of the Kafka topic used to send transactions as JSON messages.
brokers (text, for the kafka type) – Comma-separated list of Kafka brokers.
Example: "brokers": "host1:9092, host2:9092"
name (text, for the file type) – For the file output type, this option defines the name of the output file for the transactions. If no file name is provided, the output is written to the stdout stream.
max-message-mb (number, min: 1, max: 953, default: 100, for the kafka type) – The maximum size in MB of a message sent to Kafka. When a big Kafka message is being constructed, memory for this buffer is allocated independently of the memory defined by memory-min-mb/memory-max-mb. If a transaction is close to this value, it is divided into many parts; every time this happens, a warning is printed to the log.
max-messages (number, min: 1, max: 10000000, default: 100000, for the kafka type) – Defines the maximum number of messages handled by the client Kafka library (queue.buffering.max.messages).
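A complete target entry with a Kafka writer, assembled from the parameters above, could be sketched as below. Note that, as described in the target group section, the writer group is stored under the format key; the alias, source reference, topic name and broker addresses are illustrative:

```json
{
  "alias": "T1",
  "source": "S1",
  "format": {
    "type": "kafka",
    "topic": "oracle-changes",
    "brokers": "host1:9092, host2:9092",
    "max-message-mb": 100,
    "max-messages": 100000
  }
}
```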