Top CPU. Who is the culprit?

We DBAs know it very well: Oracle is the guilty one. Every time the UNIX server is slow, somebody comes to us complaining about Oracle, for the simple reason that an instance is running on the server.

What do we do? We connect to the db console and investigate was happened immediately after 14:00 (2:00 PM)


Was session 252 or the session 255 the cause of the problem? Was this statement responsible for the slow response and using most of the database server CPU?

Maybe yes, maybe no; the graph in the database console doesn't tell us whether any other non Oracle process was running at the same time. It would be much better to get the whole picture and list the top processes running at the time, not only the Oracle ones and, yes, if the top process is an Oracle one, know what it was doing.
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Thu Dec 27 14:01:25 MET 2007
========================================================================
28373 root 293M 256M sleep 59 0 0:01.01 11.2% coda/11
879   /opt/OV/bin/ovcd
  28373 /opt/OV/lbin/perf/coda
    root 28373   879  0   Jun 20 ?       7184:02 /opt/OV/lbin/perf/coda
========================================================================
4140 oracle 524M 445M sleep 0 0 0:00.21 4.1% oracle/1
4140  oracleCZAPPPR1 (LOCAL=NO)
  oracle  4140     1  4 14:02:55 ?        0:22 oracleCZAPPPR1 (LOCAL=NO)

 PID  SID SPID         ORA User  OS User    S_# TERMINAL          PROGRAM      STATUS   SQL
---- ---- ------------ --------- --------- ---- ----------------- ------------ -------- ------------------------------
  27  255 4140         COCOM     IUSR_ZRHC   16 UNKNOWN           oracleCZAPPP ACTIVE   SELECT dlid , dl_file , filesi
                                 LI02                             R1@zrhmed91           ze , downloads.dt_create , sta
                                                                                        t_display , get , states."ROWI
                                                                                        D" , downloads."ROWID" , direc
                                                                                        tories."ROWID"  FROM states ,
                                                                                        downloads , directories WHERE
                                                                                        dir_dirid=dirid AND dirtype=2
                                                                                        AND downloads.dt_create > sys



 SID SQL_TEXT                                                     WAIT_CLASS       WAIT_TIME SECONDS_IN_WAIT STATE
---- ------------------------------------------------------------ ---------------- ---------- --------------- ------
 255 SELECT dlid , dl_file , filesize , downloads.dt_create , sta Idle             0               0 WAITING
     t_display , get , states."ROWID" , downloads."ROWID" , direc
     tories."ROWID"  FROM states , downloads , directories WHERE
      dir_dirid=dirid AND dirtype=2 AND downloads.dt_create > sys
     date - 100     AND stat_statid= states.statid    AND dir_dir
     id IN (SELECT dirid FROM directories WHERE slrs_slrid = 71)
      ORDER BY dt_create DESC, dl_file DESC

========================================================================
So, in reality there was another process, belonging to root, that used 11.2% of the CPU. This was not apparent in the database console...

And what happened after 15:00? There was a heavy load on the server. Let's connect to the database console. But what can we say if the database console unhappily shows us an "UNKNOWN" SQL Command?



Yes, an Oracle process was using the 50% of the CPU, and our script will tell us which job (if any) started by that particular SID is doing WHAT

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Thu Dec 27 15:32:41 MET 2007
========================================================================
19659 oracle 560M 489M cpu0 59 0 0:32.05 50% oracle/11
19659 ora_j000_CZAPPPR1
  oracle 19659     1 50 14:59:19 ?       32:06 ora_j000_CZAPPPR1

 PID  SID SPID         ORA User  OS User    S_# TERMINAL             PROGRAM      STATUS   SQL
---- ---- ------------ --------- --------- ---- -------------------- ------------ -------- -----------
  16  253 19659        COCOM     oracle     107 UNKNOWN              oracle@zrhme ACTIVE
                                                                     d91 (J000)


 SID        JOB
 253        443
-- CDR 15:00
BEGIN
    cocom$process.coc_do_downloadlist();
EXCEPTION
 WHEN OTHERS THEN
   NULL;
END;



  • Our clever script lists the top processes sorted by cpu
  • Shows their parents and children
  • If the user id is oracle, it provides the statement that was running at the time
  • The active jobs are detected.
The script has been tested on Solaris, HP UNIX and Linux; we will gladly adapt it to your particular UNIX flavour if you were to have any hitch.
Oracle scripts
Home