Thursday, August 7, 2014

Excessive PGA Memory Usage by Sessions and System Hang

One day in 2006, users (developers) started complaining that development database (9i database on RHEL 4)  has become very slow. I was not even able to log into the server to check the situation as it was hung, and only solution was to hard-restart the server.
It happened again, and again, and again. It was a challenge to find out what is going wrong and where. After some "googling", I was able to guess that something is happening with RAM/SawpSpace usage which was causing system to hang and eventually needed a hard-reset. Since this issue was happening sporadically, it was not easy to catch the culprit.

I decided to configure a job which would run every 10 seconds and capture value of PGA_ALLOC_MEM column in v$process dynamic performance view along with other session details from v$session, and insert into a table with timestamp of every capture of this data. So, when it happened next time, I was able to find a session which had allocated PGA memory 4G when last time system crashed. Further digging and tracing to the session from this module (module from where this session was connected) it was found that there was a bug in a module which was causing a loop to run millions of times and opening the cursors and hence eating up all the memory and swap space.

Time by time, I have used same methods in different scenarios which made me able to find out the issues with the applications and also some Oracle related bugs (if some oracle internal process PGA memory keeps on increasing and eventually depleting all memory because of memory leak).

Query
Following query can be used to check the processes which are allocating more PGA memory

#######################################################
set lines 500
set pages 1000
col program format a20
col module format a20
col machine format a40

select a.spid OS_PROCESS_ID,b.sid,b.serial#,b.username,b.schemaname,b.osuser,b.machine,b.status,b.server,
b.program,b.module,round(a.PGA_ALLOC_MEM/1024/1024,2) "Currecn PGA Allocated",round(a.PGA_USED_MEM/1024/1024,2)"Currect PGA memory in use"
from v$process a,v$session b where  a.addr=b.PADDR
order by a.PGA_ALLOC_MEM;

#######################################################

Scheduled Job
Following steps can be used to set up and job which will connect PGA memory information on regular basis (10 minutes in this case) and put in a table "PGA_MEMORY_USAGE".

#######################################################
create table PGA_MEMORY_USAGE as
select logon_time,b.sid,round(a.pga_alloc_mem/1024/1024,2) PGA_ALLOCATED_MEMORY,b.username,b.machine,b.module from v$process a, v$session b where a.addr=b.paddr and a.pga_alloc_mem > 20000000 order by a.pga_alloc_mem;

alter table PGA_MEMORY_USAGE add(snapshot_time date);

variable j number

exec dbms_job.submit (:j,what=>'begin insert into PGA_MEMORY_USAGE select logon_time,b.sid,round(a.pga_alloc_mem/1024/1024,2),b.username,b.machine,b.module,sysdate from v$process a, v$session b where a.addr=b.paddr order by a.pga_alloc_mem; commit; end;',next_date=>sysdate,interval=>'sysdate+1/144')
commit;

SQL> print :j

         J
----------
        21

#######################################################

Now following query can be used to check PGA memory by session and when memory of some particular session started growing and how much it grew.

alter session set nls_date_format='DD-MON-YY HH24:MI:SS';

select * from PGA_MEMORY_USAGE order by PGA_ALLOCATED_MEMORY;


Remove Job
Once you have found the culprit, following commands could be used to clean up the system.

#######################################################
exec dbms_job.remove(21);
commit;

drop table  PGA_MEMORY_USAGE;

#######################################################

MOS note 822527.1 further explains about how to check, which part of PGA memory is growing so that issue could be resolved easily.


No comments: