TIP 102#: GoldenGate replicat performance

One of the nice feature of GoldenGate is BATCHSQL which enables GoldenGate replicat to organize similar SQL statements into array and apply them at once instead of normal mode to apply SQL statement one by one. Oracle document says that around 5K bytes of data change per row, BATCHSQL loses its benefits.

The following shows a sample run of BATCHSQL with INSERTAPPEND and how it saves resources and improves the performance.

Scenario 1 : Small table with massive changes

create table repuser.tbl1 (id number, name varchar2(30));
alter table repuser.tbl1 add constraint uc1 unique (id);


Initial load :

SQL> delete from repuser.tbl1;

0 rows deleted.

SQL> insert into repuser.tbl1 select object_id*-1,object_name from all_objects where object_id is not null;

56135 rows created.

SQL> c/-1/-100000
1* insert into repuser.tbl1 select object_id*-100000,object_name from all_objects where object_id is not null
SQL> /

56135 rows created.

SQL> c/-100000/1
1* insert into repuser.tbl1 select object_id*1,object_name from all_objects where object_id is not null
SQL> /

56135 rows created.

SQL> commit;

Commit complete. 

To find out the impact of BATCHSQL in terms of performance, The changes are tested on Oracle 11203 using GG 11201 and the changes are shipped to target with Oracle 11203 using GG 11201. The following shows different setting for replicat and how each setting improves the performance

Replicat setting 1 : Default setting - No BATCHSQL, No INSERTAPPEND


GGSCI (localhost.localdomain) 33> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING                                           
REPLICAT    RUNNING     RENA_IN     00:22:24      00:00:02    

 It takes more than 22 minutes to apply changes (224450 deletes and 224450 inserts)


The following trace shows that 224450 times delete and 224450 times insert is run and it took 1261.54+27.72=1289.26 seconds to complete executions of these two statements (Overheads and other internal executions are ignored).


DELETE FROM "REPUSER"."TBL1"
WHERE
"ID" = :b0 AND ROWNUM = 1


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute 224540   1261.54    1296.81       2979  567004335     238878      224540
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total   224541   1261.54    1296.81       2979  567004335     238878      224540

INSERT INTO "REPUSER"."TBL1" ("ID","NAME")
VALUES
(:a0,:a1)


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute 224540     26.91      27.72          0        787     237105      224540
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total   224541     26.91      27.72          0        787     237105      224540



Replicat setting 2 : BATCHSQL, INSERTAPPEND




GGSCI (localhost.localdomain) 23> !
info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING                                           
REPLICAT    RUNNING     RENA_IN     00:07:04      00:00:00    

The same changes only took more than 7 mins to be completed. It is more than 3 times faster setting 1.


The following shows that only 450 times delete and 450 times insert are run and it took only 452.10+0.71=452.81 seconds to complete the executions of two statements.
Also it shows that GoldenGate runs insert in DIRECT PATH with adding hints.


DELETE FROM "REPUSER"."TBL1"
WHERE
"ID" = :b0 AND ROWNUM = 1


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute    450    440.00     452.10         12  139050526     238818      224540
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      451    440.00     452.10         12  139050526     238818      224540

INSERT /*+ APPEND APPEND_VALUES */ INTO "REPUSER"."TBL1" ("ID","NAME")
VALUES
(:a0,:a1)


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute    450      0.60       0.71          0        922       5263      224540
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      451      0.60       0.71          0        922       5263      224540



Replicat setting 3 : BATCHSQL, No INSERTAPPEND




GGSCI (localhost.localdomain) 23> !
info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING                                           
REPLICAT    RUNNING     RENA_IN     00:07:54      00:00:00    

It took around 8 minutes to apply the same change, Timing is very close to BATCHSQL/INSERTAPPEND but almost 3 times faster than traditional GoldenGate.

Trace also shows that 450 executions of delete and 450 executions of inserts for the same amount of the change.Elapsed time is 518.87+0.48=519.35 seconds.



DELETE FROM "REPUSER"."TBL1"
WHERE
"ID" = :b0 AND ROWNUM = 1


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute    450    509.74     518.87          0  168600692     238673      224540
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      451    509.74     518.87          0  168600692     238673      224540


INSERT INTO "REPUSER"."TBL1" ("ID","NAME")
VALUES
(:a0,:a1)


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute    450      0.46       0.48          0       2312      11070      224540
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      451      0.46       0.48          0       2312      11070      224540

So far, BATCHSQL and INSERTAPPEND sounds promising for massive changes in small table. In the next post, I will try to test this feature with wider table to see how this feature performs for changes around 5KB per row.

TIP 101#: GoldenGate and Archivelog cleanup

In earlier version of GoldenGate, there was always a concern on not to cleanup archivelogs which are required by GoldenGate (when it runs or in case it is bounced).
The way that I do archivelog cleanup in GoldenGate environment was to have a script to extract the sequence which recovery checkpoint of all extracts is and make sure not to cleanup any archivelog after the minimum of sequences.
However, with the new version of GoldenGate, Oracle introduced integration of RMAN with GoldenGate which makes RMAN aware of GoldenGate and the required archivelogs.
This feautre works with some configurations in place for Classic capture and it should automatically works for Integrated Capture (Oracle newly introduced GoldenGate capture).
I ran a test on integrated capture in 11.2.0.1 on Linux 32bit and the following shows that although RMAN is aware of GoldenGate but because of a bug???, no archivelogs which are generated after start up of extract is cleaned up by RMAN. (Concern of archivelog dest full !!!)

Please see the following example :

Exatract checkpoints


GGSCI (localhost.localdomain) 15> info extract ext_int, showch

EXTRACT    EXT_INT   Last Started 2012-09-15 12:40   Status RUNNING
Checkpoint Lag       00:00:00 (updated 00:00:10 ago)
Log Read Checkpoint  Oracle Integrated Redo Logs
2012-09-15 13:07:26
SCN 0.6199997 (6199997)


Current Checkpoint Detail:

Read Checkpoint #1

Oracle Integrated Redo Log

Startup Checkpoint (starting position in the data source):
Timestamp: 2012-09-15 12:40:41.000000
SCN: Not available

  Recovery Checkpoint (position of oldest unprocessed transaction in the data source):
Timestamp: 2012-09-15 13:07:26.000000
SCN: 0.6199997 (6199997)

Current Checkpoint (position of last record read in the data source):
Timestamp: 2012-09-15 13:07:26.000000
SCN: 0.6199997 (6199997)

Write Checkpoint #1

GGS Log Trail

Current Checkpoint (current write position):
Sequence #: 15
RBA: 1065
Timestamp: 2012-09-15 13:07:30.085648
Extract Trail: ./dirdat/tg

CSN state information:
CRC: D5-D0-B1-2C
CSN: Not available

Header:
Version = 2
Record Source = A
Type = 12
# Input Checkpoints = 1
# Output Checkpoints = 1

File Information:
Block Size = 2048
Max Blocks = 100
Record Length = 2048
Current Offset = 0

Configuration:
Data Source = 3
Transaction Integrity = 1
Task Type = 0

Status:
Start Time = 2012-09-15 12:40:57
Last Update Time = 2012-09-15 13:07:30
Stop Status = A
Last Result = 400


Recovery checkpoint is on 6199997 scn which according the following is sequence
664


select 'arch',sequence#,first_time,next_time,RESETLOGS_TIME from v$archived_log where &your_scn between first_change# and next_change#
union all
select 'log-hist',sequence#,first_time,null,RESETLOGS_TIME from v$log_history where &your_scn between first_change# and next_change#
union all
select 'log-current',sequence#,first_time,next_time,null from v$log where &your_scn between first_change# and next_change#

arch 664 09/15/2012 13:05:41 09/15/2012 13:07:34 07/26/2012 08:03:23
log-hist 664 09/15/2012 13:05:41  07/26/2012 08:03:23
log-current 664 09/15/2012 13:05:41 09/15/2012 13:07:34 


Checking archive destinaton 

SQL> archive log list
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /u02/oracle/app/oracle/product/arch
Oldest online log sequence     663
Next log sequence to archive   665
Current log sequence           665
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
With the Partitioning, Data Mining and Real Application Testing options
[oracle@localhost ~]$ cd  /u02/oracle/app/oracle/product/arch

[oracle@localhost arch]$ ls -ltr
total 458680
-rw-rw---- 1 oracle oracle 41370624 Sep 12 18:18 1_647_789638603.dbf
-rw-rw---- 1 oracle oracle 44616192 Sep 12 18:27 1_648_789638603.dbf
-rw-rw---- 1 oracle oracle 45473792 Sep 12 18:27 1_649_789638603.dbf
-rw-rw---- 1 oracle oracle 43464192 Sep 12 18:28 1_650_789638603.dbf
-rw-rw---- 1 oracle oracle 45917696 Sep 12 18:28 1_651_789638603.dbf
-rw-rw---- 1 oracle oracle 45963776 Sep 12 18:28 1_652_789638603.dbf
-rw-rw---- 1 oracle oracle 44632576 Sep 12 18:29 1_653_789638603.dbf
-rw-rw---- 1 oracle oracle 41152512 Sep 12 18:29 1_654_789638603.dbf
-rw-rw---- 1 oracle oracle 46688256 Sep 12 18:30 1_655_789638603.dbf
-rw-rw---- 1 oracle oracle 28416512 Sep 15 12:26 1_656_789638603.dbf
-rw-rw---- 1 oracle oracle   103936 Sep 15 12:27 1_657_789638603.dbf
-rw-rw---- 1 oracle oracle 33522176 Sep 15 12:37 1_658_789638603.dbf
-rw-rw---- 1 oracle oracle  1222144 Sep 15 12:49 1_659_789638603.dbf
-rw-rw---- 1 oracle oracle  6498816 Sep 15 13:05 1_660_789638603.dbf
-rw-rw---- 1 oracle oracle     1024 Sep 15 13:05 1_661_789638603.dbf
-rw-rw---- 1 oracle oracle     7680 Sep 15 13:05 1_662_789638603.dbf
-rw-rw---- 1 oracle oracle     2560 Sep 15 13:05 1_663_789638603.dbf
-rw-rw---- 1 oracle oracle    67072 Sep 15 13:07 1_664_789638603.dbf


I expect RMAN cleans up all archivelog but 664
Running RMAN Cleanup


[oracle@localhost arch]$ rman target /
Recovery Manager: Release 11.2.0.3.0 - Production on Sat Sep 15 13:11:29 2012
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
connected to target database: ORCL1123 (DBID=3438744075)
RMAN> delete archivelog all;
.
.
.
Recovery Manager complete.
[oracle@localhost arch]$ ls -ltr
total 68316
-rw-rw---- 1 oracle oracle 28416512 Sep 15 12:26 1_656_789638603.dbf
-rw-rw---- 1 oracle oracle   103936 Sep 15 12:27 1_657_789638603.dbf
-rw-rw---- 1 oracle oracle 33522176 Sep 15 12:37 1_658_789638603.dbf
-rw-rw---- 1 oracle oracle  1222144 Sep 15 12:49 1_659_789638603.dbf
-rw-rw---- 1 oracle oracle  6498816 Sep 15 13:05 1_660_789638603.dbf
-rw-rw---- 1 oracle oracle     1024 Sep 15 13:05 1_661_789638603.dbf
-rw-rw---- 1 oracle oracle     7680 Sep 15 13:05 1_662_789638603.dbf
-rw-rw---- 1 oracle oracle     2560 Sep 15 13:05 1_663_789638603.dbf
-rw-rw---- 1 oracle oracle    67072 Sep 15 13:07 1_664_789638603.dbf



For some reasons RMAN keep archives between sequence 656 to 664.
No retention policy for archivelog is defined.


Checking the capture which is started by integrated capture.


select session_restart_scn from v$streams_capture;

6156108


this scn is scn which log miner started from.
sequence 656 based on the following query is mapped to above scn.


select 'arch',sequence#,first_time,next_time,RESETLOGS_TIME from v$archived_log where &your_scn between first_change# and next_change#
union all
select 'log-hist',sequence#,first_time,null,RESETLOGS_TIME from v$log_history where &your_scn between first_change# and next_change#
union all
select 'log-current',sequence#,first_time,next_time,null from v$log where &your_scn between first_change# and next_change#


6156108  ==> sequence 656 

'ARCH' SEQUENCE# FIRST_TIME NEXT_TIME RESETLOGS_TIME
arch 656 09/12/2012 18:30:07 09/15/2012 12:26:54 07/26/2012 08:03:23
log-hist 656 09/12/2012 18:30:07  07/26/2012 08:03:23


RMAN keeps all archivelogs after scn which integrated capture is started from
To confirm, I bounced extract and confirmed that now RMAN moved to the next scn which extract is started from.


stop extract ext_int
start extract ext_int
select session_restart_scn from v$streams_capture;

6200279


Above scn is mapped to sequence 665.
Now if I run RMAN archivelog deletion, then all archivelogs till sequence 665 is removed.


[oracle@localhost arch]$ ls -ltr
total 69468
-rw-rw---- 1 oracle oracle 28416512 Sep 15 12:26 1_656_789638603.dbf
-rw-rw---- 1 oracle oracle   103936 Sep 15 12:27 1_657_789638603.dbf
-rw-rw---- 1 oracle oracle 33522176 Sep 15 12:37 1_658_789638603.dbf
-rw-rw---- 1 oracle oracle  1222144 Sep 15 12:49 1_659_789638603.dbf
-rw-rw---- 1 oracle oracle  6498816 Sep 15 13:05 1_660_789638603.dbf
-rw-rw---- 1 oracle oracle     1024 Sep 15 13:05 1_661_789638603.dbf
-rw-rw---- 1 oracle oracle     7680 Sep 15 13:05 1_662_789638603.dbf
-rw-rw---- 1 oracle oracle     2560 Sep 15 13:05 1_663_789638603.dbf
-rw-rw---- 1 oracle oracle    67072 Sep 15 13:07 1_664_789638603.dbf
-rw-rw---- 1 oracle oracle  1161728 Sep 15 13:23 1_665_789638603.dbf
-rw-rw---- 1 oracle oracle     3584 Sep 15 13:23 1_666_789638603.dbf
-rw-rw---- 1 oracle oracle     6656 Sep 15 13:23 1_667_789638603.dbf

rman target /
delete archivelog all;


[oracle@localhost arch]$ ls -ltr
total 1152  
-rw-rw---- 1 oracle oracle 1161728 Sep 15 13:23 1_665_789638603.dbf
-rw-rw---- 1 oracle oracle    3584 Sep 15 13:23 1_666_789638603.dbf
-rw-rw---- 1 oracle oracle    6656 Sep 15 13:23 1_667_789638603.dbf



If you use integrated capture or classic Goldengate, it is safer to prepare your own script for cleanup archivelog after extracting recovery checkpoint of all extract processes. Currently, RMAN does cleanup archivelog only to the scn which extracts are started, If you want to rely on RMAN cleanup, make sure to plan to bounce extract regularly AND watch archivelog destination closely.