We can measure the impact on LMS response time by leveraging the timing information in GV$SYSSTAT and the FLUSHES statistic in GV$CR_BLOCK_SERVER. Putting the two together, we can calculate the proportion of blocks that required a flush and the proportion of LMS time spent performing the flush:
SQL> WITH sysstat AS (
  2      SELECT SUM(CASE WHEN name LIKE '%time'
  3                 THEN VALUE END) total_time,
  4             SUM(CASE WHEN name LIKE '%flush time'
  5                 THEN VALUE END) flush_time,
  6             SUM(CASE WHEN name LIKE '%served'
  7                 THEN VALUE END) blocks_served
  8        FROM gv$sysstat
  9       WHERE name IN
 10             ('gc cr block build time',
 11              'gc cr block flush time',
 12              'gc cr block send time',
 13              'gc current block pin time',
 14              'gc current block flush time',
 15              'gc current block send time',
 16              'gc cr blocks served',
 17              'gc current blocks served')),
 18      cr_block_server AS (
 19      SELECT SUM(flushes) flushes, SUM(data_requests) data_requests
 20        FROM gv$cr_block_server )
 21  SELECT ROUND(flushes*100/blocks_served,2) pct_blocks_flushed,
 22         ROUND(flush_time*100/total_time,2) pct_lms_flush_time
 23    FROM sysstat CROSS JOIN cr_block_server;
PCT_BLOCKS_FLUSHED PCT_LMS_FLUSH_TIME
------------------ ------------------
               .25              36.03
Note how even a very small proportion of block flushes can still account for a large proportion of total LMS time. Tuning the redo log IO layout might be indicated.
The LMS will sometimes need to flush redo entries to disk before returning a block. Redo log IO can therefore be a critical factor in Global Cache latency.
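Because the flush is essentially a redo log write performed by LGWR, a quick way to gauge whether redo log IO is the limiting factor is to look at the average latency of the 'log file parallel write' event in GV$SYSTEM_EVENT. The following query is a simple sketch along those lines (it is not part of the example above):

-- Sketch: average redo write ('log file parallel write') latency per
-- instance; high values suggest redo log IO is slowing LMS flushes.
SELECT inst_id,
       total_waits,
       ROUND(time_waited_micro / 1000 /
             NULLIF(total_waits, 0), 2) avg_write_ms
  FROM gv$system_event
 WHERE event = 'log file parallel write'
 ORDER BY inst_id;

If the averages are high on the instance serving the blocks, redo log placement and IO bandwidth are reasonable places to start tuning.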
Cluster balance
Achieving balance in a RAC configuration is important for scalability, manageability and performance. While some variation in workload across the cluster is to be expected, in an unbalanced cluster the following undesirable situations can arise:
Sessions on busy instances get poor service time. Even though there may be spare capacity in the cluster as a whole, sessions on busy instances will be unable to utilize that capacity and will experience poor performance.

Sessions on idle instances wait for blocks from busy instances. Because many operations result in requests to remote instances, an overloaded instance can cause performance problems across the entire cluster. A session on an “idle” instance may experience high Global Cache wait times waiting on blocks from the busy instance.

Benefits of adding new instances may not be realized. If some of the instances in the cluster are subject to a higher workload, then those instances may become bottlenecks to overall database throughput. As instances are added to the cluster, the expected performance improvements might not be attained.

Tuning is harder because each instance has different symptoms. In an unbalanced cluster, sessions on busy instances may experience high CPU waits, while sessions on less busy instances will experience high Global Cache waits. Troubleshooting performance problems on an unbalanced cluster can therefore be more challenging because of the inconsistent symptoms.

We can assess cluster balance fairly easily: the following query reports on CPU, DB time, and logical reads on each instance within the cluster since startup:
SQL> WITH sys_time AS (
2 SELECT inst_id, SUM(CASE stat_name WHEN 'DB time'
3 THEN VALUE E