Tuning by Statistics

As you know, is not good to resolve tuning issues using the old method of "ratios" only. So we will start understanding several views from Oracle that will help you to tune your database based on its use.

Parsing and Read Consistency
The view v$sysstat is the accumulated statistics for the complete system since the instance has started except for what is active now and shown in v$sesstat. The numbers in this view are held in "tens of milliseconds," (sounds like hundredths of seconds to me). So to convert to seconds, we divide the value by 100. Note that TIMED_STATISTICS must be set to true in the init.ora for these views to have timing numbers. So to see overall parsing and read consistency waits, we can use the following:

column parsing heading 'Parsing|(seconds)'
column total_cpu heading 'Total CPU|(seconds)'
column waiting heading 'Read Consistency|Wait (seconds)'
column pct_parsing heading 'Percent|Parsing'

select total_CPU, parse_CPU parsing,
parse_elapsed-parse_CPU waiting,
trunc(100*parse_elapsed/total_CPU,2) pct_parsing
from (select value/100 total_CPU
from v$sysstat where name = 'CPU used by this session'),
(select value/100 parse_CPU
from v$sysstat where name = 'parse time cpu'),
(select value/100 parse_elapsed
from v$sysstat where name = 'parse time elapsed');



Total CPU Parsing Read Consistency Percent
(seconds) (seconds) Wait (seconds) Parsing
---------- ---------- ---------------- ----------
1236263327 21072.43 4277.93 0
We can see from this output that this instance has consumed a lot of CPU seconds, but a relatively small amount of that was used for parsing. Of the parsing, about 20 percent of the time was waiting for read consistency, but since the parsing percent is so small, this is not impacting our system performance.

Finding Free Buffers
If we need to scan buffers to find a free one, then we are wasting CPU resource. This can be checked with the script below:

select inspected,requested,(inspected/requested) ratio 
from (select value inspected from v$sysstat where name = 'free buffer inspected') a,
(select value requested from v$sysstat where name = 'free buffer requested') b;


INSPECTED REQUESTED RATIO
---------- ---------- ----------
5100 12288112 .000415035

We see that, for the vast majority of the time, the first buffer we look at is available (low ratio). So there is no problem with getting free buffers as we need them. Let's look further.

Active Waiting and Latch Contention
Active waiting can be compared to waiting at the betting window with your two dollars in your hand. You are not just passively standing there, but are focused on all the windows to see when a resource becomes available.
We can see these active waits in v$system_event with the following:

select event,time_waited/100 seconds 
from v$system_event
where event not in
('rdbms ipc message'
,'SQL*Net more data to client'
,'SQL*Net message from client'
,'SQL*Net message to client'
,'pmon timer'
,'smon timer'
,'wakeup time manager')
order by seconds;


EVENT SECONDS
---------------------------------------- ----------
buffer deadlock 0
. . .
latch free 3422.98
. . .
control file parallel write 80356.43
log buffer space 106812.85

The previous code shows me that we have waited the most time for log buffer space. I will check that I have the log buffer sized correctly. This is still a small fraction of the CPU seconds used by the instance as we saw above; it was over one billion, so we are way under one percent of the time spent waiting. This also shows us that we have been waiting for latches. So let's see which latch we are not getting when needed with the following code:

Column name format a35
SELECT name, gets, misses, immediate_gets, immediate_misses
FROM v$latch l
WHERE (misses > .1*gets)
or (immediate_misses > .1*(immediate_misses+immediate_gets));



NAME GETS MISSES IMMEDIATE_GETS IMMEDIATE_MISSES
--------------------------------- ------ -------- -------------- ----------------
channel operations parent latch 86845 6 43300 78881

This is a latch that we can't tune and we also see that the number of misses is not increasing as we rerun this query, so we will leave it alone.
Given this, we have to conclude that the instance is not using too much CPU resource for its own maintenance — so let's check the users.


User Code Execution

In our instance, we have checked that the environment for the instance is not forcing us to consume more CPU resources. So we will check the code that is being executed. Most often, the CPU consumption problem is caused by what a user is doing.

Highest Consuming Sessions
We can see which user is consuming the most CPU at this time by
creating a table with the cumulative values so we can pull out the current usage:

drop table t_CPU;
create table t_CPU storage
(initial 64k next 64k pctincrease 0) pctfree 0
as select a.sid,a.value
from v$sesstat a,v$statname b
where b.name = 'CPU used by this session'
and a.statistic# = b.statistic#;

column name format a25
column username format a20
column CPU heading 'CPU|Seconds'

select a.sid, c.username,b.name,(a.value-t.value)/100 CPU
from v$session c,v$sesstat a,v$statname b,t_CPU t
where c.sid = a.sid
and c.sid=t.sid
and b.name = 'CPU used by this session'
and a.statistic# = b.statistic#
and a.value > 0
and a.value - t.value > 0
order by a.value-t.value;

drop table t_CPU;

CPU
SID USERNAME NAME Seconds
------ -------------------- ------------------------- ----------
52 CPU used by this session .01
. . .
23 SYSTEM CPU used by this session 35.82
177 ORADBA CPU used by this session 38.26
167 ORADBA CPU used by this session 52.07

You can see that SID 167 is the current largest consumer. What we are looking for, then, is code that is in a tight loop in memory, just processing over and over. If this CPU is being consumed by normal usage, then no changes are needed. So let's see what SID 167 is running:

select c.executions,trunc(a.last_call_et/60) min,b.sql_text
from v$session a, v$sqltext b, v$sqlarea c
where a.sid=167
and a.sql_address = b.address
and a.sql_address = c.address
order by b.piece;


EXECUTIONS MIN SQL_TEXT
---------- ------ ----------------------------------------------------------------
94784 0 select daily_notes_id, center_code, TO_CHAR(day,'MM/DD/YYYY')
day, notes from daily_notes where center_code = :1
and day = TO_DATE(:2,'MM/DD/YYYY') order by 1

There is nothing wrong with this code and after watching for a while we see that this user is not running any code that jumps out as bad. We conclude that this session is just doing the most work.


Highest Consuming Statements
We can also check for the statements with the most buffer gets. These are probably working the most in memory:

set linesize 140
column executions format 999,999,999 heading "Executions"
column disk_reads format 999,999,999 heading "Block Reads"
column buffer_gets format 999,999,999 heading "Buffer Reads"
column sql_text format a90 heading "Statement"
break on executions on buffer_gets skip 1

--we want the 2 largest consumers

variable max_buffer_read number
begin
select max(buffer_gets) into :max_buffer_read
from v$sqlarea
where parsing_user_id >0;

select max(buffer_gets) into :max_buffer_read
from v$sqlarea
where parsing_user_id >0
and buffer_gets < :max_buffer_read;
end;
/

select a.executions,a.buffer_gets,b.sql_text
from v$sqlarea a,v$sqltext b
where a.buffer_gets >= :max_buffer_read
and a.address = b.address
and a.parsing_user_id > 0
order by a.buffer_gets desc, piece asc;


Executions Buffer Reads Statement
---------- ------------ -------------------------------------------------------------
6,300,634 51,221,671 SELECT DISTINCT B.FIRST_NAME || ' ' || B.LAST_NAME
FROM BLOCK_PHYSICIAN A,PHYSICIAN B
WHERE A.PHYSICIAN_FID=B.HYSICIAN_ID AND A.BLOCK_TIME_FID=:b1

75,265 27,227,798 select client_id,greeting_lu,first_name,last_name
,home_phone,work_phone,current_status_lu,center_code
,start_time
,TO_CHAR(start_time, 'MM/DD/YYYY HH:MI AM') start_datetime
,appointment_type_lu, appointment_id , number_attending
,offsite, offsite_location,treatment_info, bgcolor
,(TO_NUMBER(TO_CHAR(start_time,'HH24'))*60)
+TO_NUMBER(TO_CHAR(start_time,'MI')) time,physician
,TO_CHAR(start_time, 'HH:MI AM') start_display_time,note
from appointment_view_no_phys
where center_code = :1
and start_time between TO_DATE(:2,'MM,DD,YYYY')
and TO_DATE(:3,'MM,DD,YYYY')+1
and appointment_type_lu in
(:4,:5,:6,:7,:8,:9,:10,:11,:12,:13,:14)
and (current_status_lu not in
('Cancelled','Rescheduled','Attended','NoShowed'))
order by start_time,last_name,first_name

Remember that these numbers are those that are emerging since the instance has come up.
We see that the first statement is doing less than 10 buffer reads per execution on an average. Because I know that these tables are in the keep buffers, this makes sense.
The second statement is doing about 361 buffer gets per execution. This will be one to investigate and see how this view is joining. Maybe we can make this more efficient.