Preventing archive log deletion before being processed by Oracle GoldenGate

Oracle GoldenGate just like Oracle Streams has a mechanism of preventing archive logs from being deleted before they are processed.

This mechanism is supported both in Classic and Integrated Extract mode. Although the latter one offers more functionality and control. This article brings in dept analysis what is happening in the database.

The research has been made using Oracle Database 12.1.2 and GoldenGate 12.2.0.1 on Linux64 platform. There are no Restore Points set – to make the case simpler. The archive log deletion policy may just depend on backups and DataGuard.

1. When RMAN is allowed to delete archive logs.

It is easier to understand the whole mechanism when you look from the end of the process. The question is: when will RMAN allow an archive log to be deleted? RMAN has built-in special checks to prevent archive log deletion, when archive logs are still needed for further processing. Unless of course the FORCE switch is used for DELETE ARCHIVELOG command.

The way RMAN behaves is dependent on the setting ARCHIVELOG DELETION POLICY.

Let’s trace it to see what RMAN internally calls when DELETE ARCHIVELOG ALL command is run. Tracing this command show that many SQL statements are being called. I’m not focusing on the SELECT statements that appear every time. Below is a comparison for a few scenarios. I’m just extracting the difference of SQL statements which are called in the following scenarios:

1. CONFIGURE ARCHIVELOG DELETION POLICY TO NONE;

1
2
dbms_rcvman.getRequiredSCN(reqscn => :lreqscn, rlgscn => :lrlgscn, streams=> 1, 
  alldest=> :alldest_in);

2. CONFIGURE ARCHIVELOG DELETION POLICY TO SHIPPED TO ALL STANDBY;

1
2
dbms_rcvman.getRequiredSCN(reqscn => :lreqscn, rlgscn => :lrlgscn, streams=> 1, 
  alldest=> :alldest_in);

3. CONFIGURE ARCHIVELOG DELETION POLICY TO APPLIED ON ALL STANDBY;

1
2
dbms_rcvman.getAppliedSCN(appscn => :lappscn, rlgscn => :lrlgscn, alldest=> 
  :alldest_in);

4. CONFIGURE ARCHIVELOG DELETION POLICY TO BACKED UP 1 TIMES TO DEVICE TYPE DISK;

1
2
3
4
5
6
dbms_rcvman.getAlBackupHistory(backedUpDev => :devtype, FIRST => FIRST, 
  bhistoryRec => bhistoryRec, doingCmd => :backup, keepTag => :tag, toDest1 => :todest1, 
  toDest2 => :todest2, toDest3 => :todest3, toDest4 => :todest4); 
...
dbms_rcvman.getRequiredSCN(reqscn => :lreqscn, rlgscn => :lrlgscn, streams=> 1, 
  alldest=> :alldest_in);

5. CONFIGURE ARCHIVELOG DELETION POLICY TO SHIPPED TO ALL STANDBY BACKED UP 1 TIMES TO DEVICE TYPE DISK;

1
2
3
4
5
6
dbms_rcvman.getAlBackupHistory(backedUpDev => :devtype, FIRST => FIRST, 
  bhistoryRec => bhistoryRec, doingCmd => :backup, keepTag => :tag, toDest1 => :todest1,
  toDest2 => :todest2, toDest3 => :todest3, toDest4 => :todest4);
...
dbms_rcvman.getRequiredSCN(reqscn => :lreqscn, rlgscn => :lrlgscn, streams=> 1,
  alldest=> :alldest_in);

It seems that when backup policy is set to BACKED UP X TIMES then a stored procedure getAlBackupHistory is called. Also one of procedures is being called: getRequiredSCN or getAppliedSCN. The source of the getRequiredSCN procedure shows that internally procedure getAppliedSCN is called. This procedure calls a lot of SQL statements but among them two are most important:

1
2
SELECT REQUIRED_CHECKPOINT_SCN, SOURCE_RESETLOGS_SCN FROM DBA_CAPTURE 
  WHERE CAPTURE_TYPE = 'LOCAL' ORDER BY 1;

And:

1
SELECT MIN_REQUIRED_CAPTURE_CHANGE# FROM V$DATABASE;

The code will choose a minimal value among possible SCNs. The lower (and NOT NULL) value of SCN is being chosen. MOS note 1990815.1 confirms this kind of behavior. The column REQUIRED_CHECKPOINT_SCN in DBA_CAPTURE is updated only by the Integrated Extract process. When Classic Extract is used only value 0 is found here.

As we see here, RMAN tracks what GoldenGate (using Streams API) is doing and will not delete redo logs which might still be needed by GoldenGate as long as GoldenGate keeps
information in DBA_CAPTURE view (aka ALL_CAPTURE). As long as a row exists in DBA_CAPTURE or MIN_REQUIRED_CAPTURE_CHANGE# is updated we are safe. That’s good news. Let’s find out how those values are updated.

2. How GoldenGATE will inform RMAN which Archive Logs are needed

Above we have seen that Streams API is used to communicate with RMAN. According to the documentation the REGISTER EXTRACT command should be called to create such communication. Let’s trace what this command does. Apart from calling different SELECT statements I have found the following procedure calls while using the following commands:

1. REGISTER EXTACT… DATABASE

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
dbms_streams_adm.set_up_queue(queue_table => queue_table_name,
  queue_name => capture_queue_name);
...
DBMS_XSTREAM_GG.SET_GG_SESSION();
...
dbms_capture_adm.create_capture(queue_name => capture_queue_name,
  capture_name => outbound_capture_name, first_scn => firstScn, start_scn => startScn,
  source_database => source_global_name, source_root_name => source_global_name,
  capture_class => 'goldengate');
... 
DBMS_XSTREAM_GG_ADM.ADD_OUTBOUND(server_name => outbound_server_name,
  capture_name=> outbound_capture_name, source_database => source_global_name,
  source_container_name => NULL, source_root_name => source_global_name,
  committed_data_only => FALSE, wait_for_inflight_txns => 'N', synchronization => 'NONE',
  start_scn => startScn, comment => outbound_comment);
...
dbms_capture_adm.create_capture(queue_name => capture_queue_name,
  capture_name => outbound_capture_name, first_scn => firstScn, start_scn => startScn,
  source_database => container_global_name, source_root_name => source_global_name,
  capture_class => 'goldengate');
...
BMS_XSTREAM_GG_ADM.ADD_OUTBOUND(server_name => outbound_server_name,
  capture_name=> outbound_capture_name, source_database=> container_global_name,
  source_container_name=> container_short_name, source_root_name => source_global_name,
  committed_data_only => FALSE, wait_for_inflight_txns => 'N', synchronization => 'NONE',
  start_scn => startScn, comment => outbound_comment);
...
DBMS_XSTREAM_GG.SET_GG_SESSION(FALSE);
...
DBMS_XSTREAM_GG_ADM.START_OUTBOUND(server_name => outbound_server_name);

2. REGISTER EXTRACT … LOGRETENTION

1
2
3
4
dbms_streams_adm.set_up_queue(queue_table=>queue_table_name, queue_name=>queue_name);
...
dbms_capture_adm.create_capture(queue_name => queue_name, capture_name => chosen_capture_name,
  first_scn=>min_scn, checkpoint_retention_time=>log_retention_days) ;

3. UNREGISTER EXTRACT … DATABASE

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
dbms_capture_adm.alter_capture(capture_name=>outbound_capture_name,
  remove_rule_set=>TRUE, remove_negative_rule_set=>TRUE); 
...
dbms_apply_adm.alter_apply(apply_name=>outbound_server_name, remove_rule_set=>TRUE,
  remove_negative_rule_set=>TRUE);
...
DBMS_RULE_ADM.DROP_RULE_SET(rule_set_name => capture_inclusion_ruleset, delete_rules => TRUE);
...
DBMS_RULE_ADM.DROP_RULE_SET(rule_set_name => capture_exclusion_ruleset, delete_rules => TRUE);
...
DBMS_RULE_ADM.DROP_RULE_SET(rule_set_name => apply_inclusion_ruleset, delete_rules => TRUE);
...
DBMS_RULE_ADM.DROP_RULE_SET(rule_set_name => apply_exclusion_ruleset, delete_rules => TRUE);
...
DBMS_XSTREAM_ADM.DROP_OUTBOUND(server_name => outbound_server_name);
...
DBMS_CAPTURE_ADM.DROP_CAPTURE(capture_name => outbound_capture_name,
  drop_unused_rule_sets => TRUE);
...
DBMS_STREAMS_ADM.REMOVE_QUEUE(queue_name => capture_queue_name, cascade => TRUE,
  drop_unused_queue_table => TRUE);

4. UNREGISTER EXTRACT … LOGRETENTION

1
dbms_streams_adm.remove_queue(rec.owner || '.' || queue_name, TRUE);

I have omitted the conditional calls to procedures to make the above code clearer.

It seems that the create_queue is called to create a Streams API checkpoint and remove_queue procedures is called to remove the checkpoint. The checkpoint is visible in DBA_CAPTURE. The DATABASE version of the REGISTER EXTRACT additionally creates some internal Streams processes for the Integrated Extract process.

3. How long will the database keep Archive Logs for GoldenGate – CLASSIC EXTRACT

I have traced GoldenGate interaction with the database Streams API. Every 4 hours it calls the following code:

1
2
3
4
dbms_streams_adm.set_up_queue(queue_table=>queue_table_name, queue_name=>queue_name);
...
dbms_capture_adm.create_capture(queue_name => queue_name, capture_name => chosen_capture_name,
  first_scn=>min_scn, checkpoint_retention_time=>log_retention_days);

I could not found the parameter which would modify the 4 hours interval. According the documentation the TRANLOGOPTIONS LOGRETENTION should make some difference, but I could not see the effect.

Using Classic Extract a row exists in the DBA_CAPTURE but only the values START_SCN and FIRST_SCN move. Other SCN values are always 0’s. Every 4 hours The value of MIN_REQUIRED_CAPTURE_CHANGE# in V$DATABASE seems to be updated together with START_SCN and FIRST_SCN in DBA_CAPTURE.

4. How long will the database keep Archive Logs for GoldenGate – INTEGRATED EXTRACT

The only blog post that I have found about this subject I have found was an article in Андрей Севостьянов’s blog.

By default the value of REQUIRED_CHECKPOINT_SCN is updated every +-10 seconds in DBA_CAPTURE. But the value MIN_REQUIRED_CAPTURE_CHANGE# is updated every 6 hours. That is quite a long time. To modify this behavior the following setting has to be made in Extract PRM file:

1
TRANLOGOPTIONS INTEGRATEDPARAMS(_CKPT_RETENTION_CHECK_FREQ xxx)

xxx is the number of seconds that between MIN_REQUIRED_CAPTURE_CHANGE# updates. Default value is 21600. This is also mentioned in MOS note 421176.1.

The current setting of _CKPT_RETENTION_CHECK_FREQ can be read using the following query:

1
SELECT * FROM DBA_CAPTURE_PARAMETERS WHERE PARAMETER = '_CKPT_RETENTION_CHECK_FREQ';

Every time the SCN is updated you should see a note in the Oracle alert log:

1
2
3
Mon Jul 17 20:05:45 2017
GoldenGate Capture C 1: FIRST scn changed.
scn: 0x0000.00a7cc73

5. What happens when FRA gets full?

The documentation for GoldenGate says:

If the Oracle flash recovery storage area is full, RMAN will purge the archive logs even when needed by Extract. This limitation exists so that the requirements of Extract (and other Oracle replication components) do not interfere with the availability of redo to the database.

Indeed, when it gets full the database would delete the archive logs even though they have not yet been processed by Integrated Extract process. An information appears in the database alert log:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
************************************************************************
WARNING: The following archived logs needed BY Streams capture process
are being deleted TO free SPACE IN the recovery area. IF you need
TO process these logs again, restore them FROM a backup TO a destination
other than the recovery area USING the following RMAN commands:
 RUN{
 # <directory/ASM diskgroup> IS a location other than the
 # recovery area
 SET ARCHIVELOG DESTINATION TO '<directory/ASM diskgroup>';
 RESTORE ARCHIVELOG ...;
 }
 
/opt/fast_recovery_area/SID/archivelog/2017_07_17/o1_mf_1_908_dpt4p6nk_.arc
/opt/fast_recovery_area/SID/archivelog/2017_07_17/o1_mf_1_909_dpt4rgmv_.arc
************************************************************************

Later on the Extract process fails to start:

1
2
3
2017-07-17 21:49:29 ERROR OGG-00868 Oracle GoldenGate Capture FOR Oracle, extract1.prm:
  Error code 1291, error message: ORA-01291: missing logfile
 (Missing Log File WAITING FOR DICTIONARY REDO. READ POSITION SCN: 0.11033108 (11033108)).

You would need somehow to recover the archive log to get the process up and running. There is also a thread on the Oracle community referring this situation.

 

1 thought on “Preventing archive log deletion before being processed by Oracle GoldenGate”

  1. Pingback: Oracle GoldenGate Classic Extract together with RMAN recovery catalog – Bersler

Leave a Comment

Your email address will not be published. Required fields are marked *