DBA Admin: 2016

Saturday 3 September 2016

Skip tablespaces while restore and recovery of database

NAME
------------------------------
SYSTEM
SYSAUX
UNDOTBS1
USERS
TEMP
EXAMPLE
DEV_IAS_ORASDPM
DEV_MDS
DEV_SOAINFRA
DEV_IAS_TEMP
DEV_ORABAM

NAME
------------------------------
DEV_PORTAL_LOG
DEV_PORTAL_DOC
DEV_PORTAL_IDX
DEV_IAS_PORTLET
DEV_PORTAL

16 rows selected.

Rman> connect target u/p@s

Rman>
set until time

"to_date('2011-12-30:00:00:00', 'yyyy-mm-dd:hh24:mi:ss')";

Rman> restore database skip tablespace
'DEV_IAS_ORASDPM','DEV_SOAINFRA',

'DEV_IAS_TEMP','DEV_ORABAM','DEV_PORTAL_LOG',

'DEV_PORTAL_DOC','DEV_PORTAL_IDX',

'DEV_IAS_PORTLET','DEV_PORTAL';

Rman> recover database skip tablespce 'DEV_IAS_ORASDPM','DEV_SOAINFRA',

'DEV_IAS_TEMP','DEV_ORABAM','DEV_PORTAL_LOG',

'DEV_PORTAL_DOC','DEV_PORTAL_IDX','DEV_IAS_PORTLET','

DEV_PORTAL';

Sql>alter database open resetlogs;

Sql>select name, status from v$datafile;

First get metadata for schema and then take a expdp backup.

Scp ur backup to target db server.

Create a schema and then impdp.

Compile invalid objects if any.

RMAN Command cheat sheet

RMAN> list backupset;

RMAN> list backupset of database;

RMAN> list backupset of database completed before '22-NOV-00';

RMAN> list backupset of archivelog all;

RMAN> list backupset of tablespace users;

RMAN> list backupset of datafile 1;
RMAN> list backup summary;

RMAN> list backup;

RMAN> list backup of controlfile;

RMAN> list backup of tablespace SYSTEM;

RMAN> list backup by file;

RMAN> list backup of archivelog all ;
RMAN> list copy of database;

RMAN> list copy of database archivelog all;
RMAN> allocate channel for maintenance type disk;

RMAN> configure channel device type disk clear ;
RMAN> report need backup days=2 database;

RMAN> report need backup days=10 tablespace TEMP;

RMAN> report need backup days=4
datafile 'D:\ORACLE\ORADATA\OR816\TEMP01.DBF';
RMAN> report schema;

RMAN> report obsolete;
RMAN> crosscheck backup;

RMAN> crosscheck backupset;

RMAN> crosscheck copy;

RMAN> crosscheck archivelog all;

RMAN> crosscheck controlfilecopy '/ora01/oracle/admin/BACKUP/demo/
demo_controlfile_bak_03-15-11_19:59:11' ;
RMAN> delete noprompt expired backup ;

RMAN> delete noprompt obsolete;

RMAN> delete archivelog all;

RMAN> delete expired archivelog all;

RMAN> delete archivelog all completed before 'sysdate -1';

RMAN> delete noprompt archivelog until time 'sysdate - 1';

RMAN> delete archivelog all backed up 1 times to device type disk
completed before 'sysdate-1';

Monday 15 August 2016

Types of Locks Within Oracle

Oracle provides the following three main kinds of locks:

DML locks
DDL locks
Internal locks and latches

DML Locks

DML locks or data locks guarantee the integrity of data being accessed concurrently by multiple users. DML locks help to prevent damage caused by interference from simultaneous conflicting DML or DDL operations. By default, DML statements acquire both table-level locks and row-level locks.

The reference for each type of lock or lock mode is the abbreviation used in the Locks Monitor from Oracle Enterprise Manager (OEM). For example, OEM might display TM for any table lock within Oracle rather than show an indicator for the mode of table lock (RS or SRX).

Row Locks (TX)

Row-level locks serve a primary function to prevent multiple transactions from modifying the same row. Whenever a transaction needs to modify a row, a row lock is acquired by Oracle.

There is no hard limit on the exact number of row locks held by a statement or transaction. Also, unlike other database platforms, Oracle will never escalate a lock from the row level to a coarser granular level. This row locking ability provides the DBA with the finest granular level of locking possible and, as such, provides the best possible data concurrency and performance for transactions.

The mixing of multiple concurrency levels of control and row level locking means that users face contention for data only whenever the same rows are accessed at the same time. Furthermore, readers of data will never have to wait for writers of the same data rows. Writers of data are not required to wait for readers of these same data rows except in the case of when a SELECT... FOR UPDATE is used.

Writers will only wait on other writers if they try to update the same rows at the same point in time. In a few special cases, readers of data may need to wait for writers of the same data. For example, concerning certain unique issues with pending transactions in distributed database environments with Oracle.

Transactions will acquire exclusive row locks for individual rows that are using modified INSERT, UPDATE, and DELETE statements and also for the SELECT with the FOR UPDATE clause.

Modified rows are always locked in exclusive mode with Oracle so that other transactions do not modify the row until the transaction which holds the lock issues a commit or is rolled back. In the event that the Oracle database transaction does fail to complete successfully due to an instance failure, then Oracle database block level recovery will make a row available before the entire transaction is recovered. The Oracle database provides the mechanism by which row locks acquire automatically for the DML statements mentioned above.

Whenever a transaction obtains row locks for a row, it also acquires a table lock for the corresponding table. Table locks prevent conflicts with DDL operations that would cause an override of data changes in the current transaction.

Table Locks (TM)

What are table locks in Oracle? Table locks perform concurrency control for simultaneous DDL operations so that a table is not dropped in the middle of a DML operation, for example. When Oracle issues a DDL or DML statement on a table, a table lock is then acquired. As a rule, table locks do not affect concurrency of DML operations. Locks can be acquired at both the table and sub-partition level with partitioned tables in Oracle.

A transaction acquires a table lock when a table is modified in the following DML statements: INSERT, UPDATE, DELETE, SELECT with the FOR UPDATE clause, and LOCK TABLE. These DML operations require table locks for two purposes: to reserve DML access to the table on behalf of a transaction and to prevent DDL operations that would conflict with the transaction.

Any table lock prevents the acquisition of an exclusive DDL lock on the same table, and thereby prevents DDL operations that require such locks. For example, a table cannot be altered or dropped if an uncommitted transaction holds a table lock for it.

A table lock can be held in any of several modes: row share (RS), row exclusive (RX), share (S), share row exclusive (SRX), and exclusive (X). The restrictiveness of a table lock's mode determines the modes in which other table locks on the same table can be obtained and held.

Sunday 14 August 2016

Change or switch undo and Temp tablespace in Oracle database

Change or switch undo tablespace in Oracle database :
---------------------------------------------------------

The complexity comes in if there were pending transactions in your old undo tablespace and you have already switched to the new one and when you try to drop the old one it says "Tablespace is in use".

How to switch the database to a new UNDO tablespace and drop the old one
$ sqlplus / as sysdba

SQL> show parameter undo

NAME TYPE VALUE
------------------------------------ ----------- ----------------------
undo_management string AUTO
undo_retention integer 900
undo_tablespace string UNDOTBS1

The current undo tablespace as suggested by the initialization parameter undo_tablespace is UNDOTBS1. Leave this sysdba as is, open another console, log in as user SCOTT and initiate a transaction.

$ sqlplus scott/tiger

SQL>update emp set sal = sal + 1000 where empno=7839;
1 row updated.

With an update on emp table we have initiated a transaction. The undo data is written to a segment in the UNDOTBS1 tablespace. Now leave this SCOTT's session intact and go back to the sysdba console without issuing any COMMIT or ROLLBACK.

SQL>CREATE UNDO TABLESPACE undotbs2
DATAFILE '/d01/apps/oradata/oraxpo/undotbs201.dbf'
SIZE 50M AUTOEXTEND ON NEXT 5M;

Tablespace created.

-- We created a new UNDO tablespace named UNDOTBS2

SQL>ALTER SYSTEM SET UNDO_TABLESPACE=UNDOTBS2 SCOPE=BOTH;

System altered.

-- Switch the database to the new UNDO tablespace.

SQL> DROP TABLESPACE undotbs1 INCLUDING CONTENTS AND DATAFILES;
DROP TABLESPACE undotbs1 INCLUDING CONTENTS AND DATAFILES
*
ERROR at line 1:
ORA-30013: undo tablespace 'UNDOTBS1' is currently in use

-- Try to drop the tablespace but failed.

With the alter system set undo_tablespace=UNDOTBS2, the database UNDO tablespace is changed and any new transaction's undo data will go to the new tablespace i.e. UNDOTBS2.

But the undo data for already pending transaction (e.g. the one initiated by SCOTT before the database UNDO tablespace switch) is still in the old tablespace with a status of PENDING OFFLINE. As far as it is there you cannot drop the old tablespace.

set lines 10000
column name format a10

SELECT a.name,b.status
FROM v$rollname a,v$rollstat b
WHERE a.usn = b.usn
AND a.name IN (
SELECT segment_name
FROM dba_segments
WHERE tablespace_name = 'UNDOTBS1'
);

NAME STATUS
---------- ---------------
_SYSSMU8$ PENDING OFFLINE

The above query shows the name of the UNDO segment in the UNDOTBS1 tablespace and its status. Now lets see which users/sessions are running this pending transaction.

column username format a6

SELECT a.name,b.status , d.username , d.sid , d.serial#
FROM v$rollname a,v$rollstat b, v$transaction c , v$session d
WHERE a.usn = b.usn
AND a.usn = c.xidusn
AND c.ses_addr = d.saddr
AND a.name IN (
SELECT segment_name
FROM dba_segments
WHERE tablespace_name = 'UNDOTBS1'
);

NAME STATUS USERNAME SID SERIAL#
---------- -------------- ------ ---------- ----------
_SYSSMU8$ PENDING OFFLINE SCOTT 147 4

So this is SCOTT with SID=147 and SERIAL#=4. Since we know now the user, we can go to him/her and request to end the transaction gracefully i.e. issue a ROLLBACK or COMMIT. However, if this is not possible (say the user initiated the transaction and left for annual leave :) and trust me this happens) you may go ahead and kill the session to release the undo segments in the UNDOTBS1 tablespace.

SQL> alter system kill session '147,4';

System altered.

SELECT a.name,b.status , d.username , d.sid , d.serial#
FROM v$rollname a,v$rollstat b, v$transaction c , v$session d
WHERE a.usn = b.usn
AND a.usn = c.xidusn
AND c.ses_addr = d.saddr
AND a.name IN (
SELECT segment_name
FROM dba_segments
WHERE tablespace_name = 'UNDOTBS1'
);
no rows selected

As we can see once the session is kills we don't see anymore segments occupied in the UNDOTBS1 tablespace. Lets drop UNDOTBS1.

SQL> DROP TABLESPACE undotbs1 INCLUDING CONTENTS AND DATAFILES;
DROP TABLESPACE undotbs1 INCLUDING CONTENTS AND DATAFILES
*
ERROR at line 1:
ORA-30013: undo tablespace 'UNDOTBS1' is currently in use
If you are retaining undo data then you still won't be able to drop the tablespace because it is still in use by undo_retention. Let the UNDO_RETENTION time pass and then try to drop the tablespace. In my case it is 900 seconds i.e. 15 minutes.

-- After 15 minutes.
SQL> DROP TABLESPACE undotbs1 INCLUDING CONTENTS AND DATAFILES;

Tablespace dropped.

How to drop and recreate TEMP Tablespace in Oracle 9i/10g/11g:
----------------------------------------------------------------

1. Create Temporary Tablespace Temp

CREATE TEMPORARY TABLESPACE TEMP2 TEMPFILE ‘/u01/app/oradata/temp01′ SIZE 2000M, ‘/u01/app/oradata/temp02′ SIZE 2000M';

2. Move Default Database temp tablespace

ALTER DATABASE DEFAULT TEMPORARY TABLESPACE temp2;

3. Make sure No sessions are using your Old Temp tablespace

a. Find Session Number from V$SORT_USAGE:
SELECT USERNAME, SESSION_NUM, SESSION_ADDR FROM V$SORT_USAGE;

b. Find Session ID from V$SESSION:

If the resultset contains any tows then your next step will be to find the SID from the V$SESSION view. You can find session id by using SESSION_NUM or SESSION_ADDR from previous resultset.

SELECT SID, SERIAL#, STATUS FROM V$SESSION WHERE SERIAL#=SESSION_NUM;
OR
SELECT SID, SERIAL#, STATUS FROM V$SESSION WHERE SADDR=SESSION_ADDR;

c. Kill Session:

Now kill the session with IMMEDIATE.

ALTER SYSTEM KILL 'SID,SERIAL#' IMMEDIATE;

4. Drop temp tablespace

DROP TABLESPACE temp INCLUDING CONTENTS AND DATAFILES;

5. Recreate Tablespace Temp

CREATE TEMPORARY TABLESPACE TEMP TEMPFILE /u01/app/temp/temp01′ SIZE 2000M;

6 Move Tablespace Temp, back to new temp tablespace

ALTER DATABASE DEFAULT TEMPORARY TABLESPACE temp;

7. Drop temporary for tablespace temp

DROP TABLESPACE temp2 INCLUDING CONTENTS AND DATAFILES;

No need to do shutdown when drop temp tablespace and the recreate it. If something happen with temp tablespaces e.g. : crash, corrupt, etc. Oracle database will ignore the error, but DML (insert,update,delete) and SELECT Query will suffer.

Saturday 6 August 2016

ORA-38729: Not enough flashback database log data to do FLASHBACK.

I have faced an interesting issue a month before. When I try to perform a flashback database, I was getting the below error message.
I had enough flashback logs but still I have been thrown the below error message.
Error message

RMAN> FLASHBACK DATABASE TO SCN 69979801;
Starting flashback at 10-NOV-11
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=1252 instance=RMAN> FLASHBACK DATABASE TO SCN 69979801;
Starting flashback at 10-NOV-11
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=1252 instance=BHU_1 device type=DISK
allocated channel: ORA_DISK_2
channel ORA_DISK_2: SID=1377 instance=BHU_1 device type=DISK
allocated channel: ORA_DISK_3
channel ORA_DISK_3: SID=1502 instance=BHU_1 device type=DISK
allocated channel: ORA_DISK_4
channel ORA_DISK_4: SID=1628 instance=BHU_1 device type=DISK
allocated channel: ORA_DISK_5
channel ORA_DISK_5: SID=1753 instance=BHU_1 device type=DISK
allocated channel: ORA_DISK_6
channel ORA_DISK_6: SID=1876 instance=BHU_1 device type=DISK

starting media recovery
media recovery failed
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of flashback command at 11/10/2011 16:42:04
ORA-38729: Not enough flashback database log data to do FLASHBACK.
RMAN>
BELOW ERROR MESSAGE FOR THE SAME ISSUE
In the last error you had, the problem was that the archive logs weren't available, so you got an error
ORA-38754: FLASHBACK DATABASE not started; required redo log is not available
ORA-38762: redo logs needed for SCN 39221435 to SCN 39240345
ORA-38761: redo log sequence 56 in thread 1, incarnation 7 could not be accessed
This is slightly different than if the flashback log is not available
ORA-38729: Not enough flashback database log data to do FLASHBACK.

My environment è we had 2 node RAC database. When I try to flashback the database but I got the above error.
Did the flashback is enabled for the database
SQL> select flashback_on from gv$database;
FLASHBACK_ON
------------------
YES
YES

What is the size & retention set for the flashback?
SQL> show parameter db_rec
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
db_recovery_file_dest string +RECO
db_recovery_file_dest_size big integer 270G
db_recycle_cache_size big integer 0
SQL> show parameter db_flash
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
db_flash_cache_file string
db_flash_cache_size big integer 0
db_flashback_retention_target integer 1440 à 1 day
When I try to check the flashback log, do I have enough flashback logs to perform a flashback?
SQL> select * from v$flashback_database_log;
OLDEST_FLASHBACK_SCN OLDEST_FL RETENTION_TARGET FLASHBACK_SIZE ESTIMATED_FLASHBACK_SIZE
-------------------- --------- ---------------- -------------- ------------------------
84854010 09-NOV-11 1440 3698974720 935731200
We can able to check, till the second of last flashback log that is present in the flashback log.
SQL> select to_char(oldest_flashback_time,'mm/dd/yy hh24:mi:ss') from v$flashback_database_log;
TO_CHAR(OLDEST_FL
-----------------
11/09/11 22:00:12

What is the size occupied by the flashback logs
SQL> select space_used/(1024*1024),space_limit/(1024*1024) from v$recovery_file_dest;
SPACE_USED/(1024*1024) SPACE_LIMIT/(1024*1024)
---------------------- -----------------------
3709 276480

Then I come to know that flashback database command requires both flashback logs and archive logs for period which we mention in the “FLASHBACK DATABASE SCN/TIME/ SEQUENCE XXXXXXXXXX”.
DIFFERENT OPTION OF PERFORMING FLASHBACK
SQL> FLASHBACK DATABASE TO TIMESTAMP (SYSDATE - 2/24);
SQL> FLASHBACK DATABASE TO SCN n;
SQL> FLASHBACK DATABASE TO SEQUENCE=M THREAD=N;
SQL> FLASHBACK DATABASE TO TIMESTAMP timestamp '2012-01-25 13:54:00';
SQL> FLASHBACK DATABASE TO TIMESTAMP to_timestamp('2012-01-25 13:54:00', 'YYYY-MM-DD HH24:MI:SS');

Flashback database use the flashback logs to bring the database files to a SCN before the time/SCN you want to flashback. This is an approximate point based on the flashback snapshots taken to the flashback logs. Oracle then uses the archives to apply redo to get to the exact SCN or time you want to flashback.
http://docs.oracle.com/cd/E11882_01/backup.112/e10642/flashdb.htm#CFHCEIHB
"When you use Flashback Database to rewind a database to a past target time, the command determines which blocks changed after the target time and restores them from the flashback logs. The database restores the version of each block that is immediately before the target time. The database then uses redo logs to reapply changes that were made after these blocks were written to the flashback logs.
Redo logs on disk or tape must be available for the entire time period spanned by the flashback logs. For example, if the flashback retention target is 1 week, then you must ensure that online and archived redo logs that contain all changes for the past week are accessible. In practice, redo logs are typically needed much longer than the flashback retention target to support point-in-time recovery."

ORA-16737: the redo transport service for standby database 'BHU_B" has an error

ORA-16737: the redo transport service for standby database "BHU_B" has an error

Error message
DGMGRL> show database verbose 'BHU_A';
Database – BHU_A
Role:            PRIMARY
Intended State: TRANSPORT-ON
Instance(s):
    BHU_1
    BHU_2
    BHU_3
      Error: ORA-16737: the redo transport service for standby database "BHU_B" has an error
Database Warning(s):
    ORA-16629: database reports a different protection level from the protection mode

#1
CHECK: when you have the above problem, you will get the protection mode in the    Primary & standby v$database.protection_level shows as "RESYNCHRONIZATION"
SQL> select protection_mode, protection_level from v$database;

#2
CHECK: whether you are getting any in the archive location
SQL>select dest_id,status,error from v$archive_dest;

#3
CHECK: Check whether online redo log are configured in the primary & standby database properly, it includes size & accessibility
Note: primary &standby online redo log should be same
SQL>select group#,thread#,sequence#,bytes,archived,status from v$log;

#4
CHECK: Check whether standby redo log are configured in the primary &standby database properly, it includes size & accessibility

Note: primary &standby standby redo log should be same
select member from v$logfile where type='STANDBY';

#5
CHECK: Check parameters are configured properly; some times instance parameters have a different value.
Ex: some common parameter will have different value for each instance in the cluster database. You need to check on the primary &standby cluster database environment.

#6
Check whether maximum Availability is enabled, when you have LogXptMode is synchronization
SYMP:
ORA-16629: database reports a different protection level from the protection mode

In DG Broker
DGMGRL> EDIT CONFIGURATION SET PROTECTION MODE AS MAXAVAILABILITY;
Succeeded.

#7 check your password with the setting
A)
1) check for the "sec_case_sensitive_logon" parameter.
2) if the problem exist, create the password file with ignorecase option in the orapwd password creation.
3) after recreating the password, restart both the primary & standby database.

B)
DGMGRL> show database verbose 'BHU_A' LogXptStatus;
LOG TRANSPORT STATUS
PRIMARY_INSTANCE_NAME STANDBY_DATABASE_NAME               STATUS
            BHU1_1             BHU1_B
            BHU1_2             BHU1_B ORA-16191: Primary log shipping client not logged on standby

ORA-00376: file 6 cannot be read at this time

When I try to start a standby database in the read only mode, I have been thrown the below error message

$ srvctl start database -d BHU_B
PRCR-1079 : Failed to start resource ora.BHU_b.db
CRS-5017: The resource action "ora.BHU_b.db start" encountered the following error:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00604: error occurred at recursive SQL level 1
ORA-00376: file 6 cannot be read at this time
ORA-01110: data file 6: '+BHU_B_SYSTEM/BHU_b/datafile/undo.264.789559193'
Process ID: 9165
Session ID: 3497 Serial number: 1
. For details refer to "(:CLSN00107:)" in "/oracle/GRID/11203/log/bhurac01/agent/crsd/oraagent_oracle/oraagent_oracle.log"
It is an undo tablespace (datafile) which is throwing the error and I have checked the datafile it is in the recovery mode.
SQL> select file#,name,status,enabled from v$datafile where file#=6;

6 +BHU_B_SYSTEM/BHU_b/datafile/undo.264.789559193

RECOVER READ WRITE
SO I HAVE FOLLOWED BELOW PROCEDURE TO OVERCOME THIS PROBLEM

#1
1 A) Stop the apply process in the standby database
DGMGRL> edit database 'BHU_B' set state='APPLY-OFF';
Succeeded.

2 B) It is RAC Database, kept only one instance in mount stage and other instance are in the offline mode(shutdown)

3 C) It is standby database (undo tablespace-datafile). So I am not able to do offline

#2 – Take a online backup of datafile through RMAN
RMAN> copy datafile 6 to '+BHU_B_DATA1';
Starting backup at 25-JUL-12
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=1842 instance=BHU_1 device type=DISK
channel ORA_DISK_1: starting datafile copy
input datafile file number=00006 name=+BHU_B_SYSTEM/BHU_b/datafile/undo.264.789559193
output file name=+BHU_B_DATA1/BHU_b/datafile/undo.301.789569089 tag=TAG20120725T124447 RECID=69 STAMP=789569388
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:05:05
Finished backup at 25-JUL-12
Starting Control File and SPFILE Autobackup at 25-JUL-12
piece handle=/oracle/BHU/11203/dbs/c-72629545-20120725-01 comment=NONE
Finished Control File and SPFILE Autobackup at 25-JUL-12

#3 I doing a rename through RMAN itself, there is no need of using the RENAME command in the sqlplus

RMAN> switch datafile 6 to copy;
datafile 6 switched to datafile copy "+BHU_B_DATA1/BHU_b/datafile/undo.301.789569089"

#4 when I check the status of the datafile, it looks in the RECOVER MODE. So I cant open the database in the READ-ONLY MODE.

SQL> select name,status from v$datafile where file#=6;
NAME                                                 STATUS
----------------------------------------------------------------------------
+BHU_B_DATA1/BHU_b/datafile/undo.301.789569089 RECOVER

#5 Started the Recover through DG Broker
DGMGRL> edit database 'BHU_B' set state='APPLY-ON';
Succeeded.

#6 Monitor the apply Lag
I could see that the system is using the old archive log to recover the datafile.
Once the recovery is completed, you can open the database in the read only mode.

I have see the status of the datafile; it is in the online mode

SQL> select name,status from v$datafile where file#=6;
NAME                                  STATUS
----------------------------------------------------------
+BHU_B_DATA1/BHU_b/datafile/undo.301.789569089 ONLINE

index Rebuild - Progress & Index creation Progress

To identify the index rebuild progress.
i have used the oracle schema to identify the session which are active in nature.

Note: For some reason V$SESSION_LONGOPS doesn't show the progress of index Rebuild process. use the below query for index rebuild. index creation can be seen in the below query and other query with the
V$SESSION_LONGOPS

SQL> SELECT MESSAGE FROM V$SESSION_LONGOPS WHERE SID IN (SELECT SID FROM V$SESSION WHERE USERNAME='BHUVAN' AND STATUS='ACTIVE') ORDER BY START_TIME;

MESSAGE
-----------------------------------------------------------------------------------
Table Scan: (stale or locked) obj# 31940: 1426408 out of 1426408 Blocks done
Table Scan: BHUVAN.EMP: 2388523 out of 2388523 Blocks done
Sort Output: : 235280 out of 235280 Blocks done
Table Scan: BHUVAN.EMP: 154318 out of 2388523 Blocks done

-- i have merge active session with the V$SESSION_LONGOPS view to retrieve progress of index creation with the index creation syntax.

SQL> col a.sid format 9999
SQL> select a.sid, sql_text ,target, sofar, totalwork, time_remaining still, elapsed_seconds tillnow
2 from v$session a , v$sql b, v$session_longops c
3 where a.sid=c.sid
4 and a.sql_address = b.address
5 and a.sql_address = c.sql_address
  6 and status = 'ACTIVE';
SID
SQL_TEXT
TARGET                                                                SOFAR TOTALWORK      STILL    TILLNOW
---------------------------------------------------------------- ---------- ------
366
CREATE INDEX "EMP~EXT" ON "EMP" ("CLIENT", "BPEXT") PCTFREE 10 INITRANS 002 TABLESPACE PBHUVAN COMPRESS 2 STORAGE (INITIAL 0000000064 K NEXT 0000001024 K MINEXTENTS 0000000001 MAXEXTENTS UNLIMITED PCTINCREASE 0000 FREELISTS 001)
BHUVAN.EMP                                                        894046    2388523        563        337

Wednesday 3 August 2016

pt

Identify the root cause of performance issues at a precise point in the past—even when the sessions have disconnected.
In my previous two articles on the topic of performance tuning (bit.ly/SEzR6t and bit.ly/U7U5IJ), you learned how to identify the cause of performance bottlenecks occurring in real time and view the summary of what happened in the past. Although these processes provide information to help resolve performance issues in many cases, there are times when you would like to know precisely what happened at a specific point in the past. In this article, you will learn how to identify the precise sequence of events in the past in a historical—not current—session.
To run the examples in this article, execute the setup.sql script. The script creates a user called ARUP and creates three tables: TEST1, TEST2, and TEST3. In the very rare event that a user called ARUP already exists, modify the script to create another user with a different name and replace all occurrences of ARUP with that name. The sample database created by setup.sql will take up less than 200 MB of space in your database.
After running the setup.sql script, open four SQL*Plus sessions, connected as ARUP. In three of these sessions, execute the test1.sql, test2.sql, and upd1.sql scripts, respectively. Here is an example of one session running upd1.sql on a UNIX-based system:

# sqlplus arup/arup
SQL> @upd1.sql

In the fourth session, execute upd1.sql again. This last session will hang, because the third session has already executed upd1.sql, which updates one row of the TEST1 table but does not commit and therefore keeps the row locked. The fourth session executes the same script, which attempts to update the same row. But the row is locked by the third session, so the fourth session waits until the lock is gone and the row is available for locking.
The test1.sql and test2.sql scripts are designed to consume significant amounts of CPU, so the first two sessions will be slow due to CPU contention. These sessions will run for a very long time.
After all the sessions have waited for about 10 minutes, stop the execution in sessions 1 and 2 by pressing Control-C to exit each SQL*Plus session. In session 3, enter commit and press Enter. This will release the lock on the row, and you will see session 4 show “1 row updated.” Exit SQL*Plus sessions 3 and 4. Now all the sessions connected as ARUP are closed—and history.

Past Sessions

In the setup, I showed you how to simulate performance issues in three of the four sessions. If these were typical business applications, the applications would have showed signs of slowdown, inviting the ire of the respective application owners and users and bringing unwanted attention to you, the DBA. Now that you have that attention, what should you do next?
In my previous two performance tuning articles, you learned how to find the clues to the sources of performance issues in sessions. Unfortunately, those techniques will be of no help in this case. The V$SESSION view shows the reason for a performance issue in a session, but the session that caused the issue in this article is no longer present. The other important view, V$SESSION_EVENT, shows all the wait events waited for by the session, but, as with the V$SESSION view, it will show the data only if the session is still active in the database. The view V$SESSTAT shows the resource usage for a session and can offer clues to which sessions have consumed what amount of a resource such as redo or memory. However, because all the sessions that experienced performance issues are historical, looking into these views will not help resolve the issues. You need to determine the specific events that occurred at a point in time in historical sessions. If you had enabled tracing, the trace files would have shown the events, but you did not expect these issues in advance and didn’t enable tracing, because tracing would have degraded performance. So how do you now identify the cause of the performance issues suffered by these old sessions?

Active Session History

Fortunately, looking at performance issues in old sessions is easy with an Oracle Database feature called Active Session History. Note that the use of Active Session History requires Oracle Diagnostics Pack, a licensed option of Oracle Database available since Oracle Database 10g Release 1.
Every second, Active Session History polls the database to identify the active sessions and dumps relevant information about each of them—such as the user ID, state, the machine it is connected from, and the SQL it is executing—into a special area in the system global area (SGA) of the database instance called the ASH buffer. So even though a session is no longer present in the database instance, the ASH buffer has captured its information. In addition, because Active Session History records activities every second, it can show a second-by-second snapshot of the activities in a session. In other words, Active Session History can show a movie of the activities of the session instead of a single picture. (Note that when the ASH buffer is filled, the data is written to disk and the snapshots are taken every 10 seconds rather than every second.)
You can examine the contents of the ASH buffer in a view named V$ACTIVE_SESSION_HISTORY. Here are a few of the important columns of the V$ACTIVE_SESSION_HISTORY view:
SAMPLE_ID. The unique identifier of the Active Session History record.
SAMPLE_TIME. When Active Session History captured this data on all active sessions.
USER_ID. The numerical user ID (not the username) of the database user who created this session.
SESSION_ID. The session ID (SID) of the session.
SESSION_STATE. The state the session was in when Active Session History took the sample. It shows WAITING if the session was waiting for something; otherwise, it shows ON CPU to indicate that the session was doing productive work.
EVENT. If the session was in a WAITING state (in the SESSION_STATE column), this column will show the wait event the session was waiting for.
TIME_WAITED. If the session was in a WAITING state, this column will show how long it had been waiting when Active Session History took the sample.
WAIT_TIME. If the session is doing productive work—not in a WAITING state—this column will show how long the session waited for the last wait event.
SQL_ID. The ID of the SQL statement the session was executing at the time the sample was taken.
SQL_CHILD_NUMBER. The child number of the cursor. If this is the only version of the cursor, the child number will be 0.
Knowing the meaning of the V$ACTIVE_SESSION_HISTORY columns, you can identify what it was that historical sessions were waiting for. To begin the identification, you need to pose two questions to the application owners or users executing the SQL statements that experienced slow performance:

Which username was used to connect to the database?
What was the time interval (start and end times) of the period when the performance issues occurred?

Because you ran the setup script as the user ARUP, the answer to the first question is ARUP. Next, you need to find out the USER_ID of the ARUP user by issuing the following SQL:

select user_id 
from dba_users
where username = 'ARUP';
 
   USER_ID
—————————————
        92

Now suppose the user told you that the performance issues occurred between 4:55 p.m. and 5:05 p.m. on September 29. With this information, you can query the V$ACTIVE_SESSION_HISTORY view to find out the activities of the ARUP sessions (with USER_ID 92) within that period, as shown in Listing 1. (The output has been truncated to fit the space available.) Because Active Session History collects information on all active sessions, you need to order the output by SID, which identifies a session (shown under SESSION_ID), and then by the collection time (shown under SAMPLE_TIME).
Let’s examine the first row of the output. It shows that the session identified by SESSION_ID 39 was waiting for the “enq: TX - row lock contention” event on 29-SEP-12 at 04.55.02.379 PM. Because the session was in a WAITING state, the value of the WAIT_TIME column is irrelevant, so it shows up as 0. Because the session was still in a WAITING state when Active Session History captured the state, the TIME_WAITED column shows 0. When the session finally got the lock, it could do what it had to do and stopped waiting. At that point, the total time the session had waited was updated in Active Session History, shown in the first boldface line in the Listing 1 output (sample time 29-SEP-12 05.16.52.078): 1,310,761,160 microseconds (shown in the TIME_WAITED column), or about 22 minutes. This is such an important property of Active Session History that I repeat: Seeing 0 in the WAIT_TIME column does not mean that the session didn’t wait at all. It simply means that the session was waiting for some event for more than one second because the previous WAIT_TIME and TIME_WAITED column values showed 0. You should look at the last occurrence of the wait event (the EVENT column value) for that session in Active Session History to determine what the total wait time really was.
When you explain to your user that the cause of the delay was an unavailable lock during the period 04.55.02.379 PM to 05.16.52.078 PM, that person might ask you what SQL statement the session was executing at that time. That’s extremely easy to get from the Listing 1 output: the session with SESSION_ID 39 was executing the SQL statement with SQL_ID fx60htyzmz6wv and child number (CH#) 0.
You can get that SQL statement text with this query:

select SQL_TEXT 
from v$sql 
where sql_id = 'fx60htyzmz6wv';
 
SQL_TEXT
————————————————————————————————————————————————
update test1 set status = 'D' where object_id = :b1

The SQL statement includes an UPDATE that had to lock the row. Because the row had already been locked by another session, it was not possible for this session (SESSION_ID 39) to succeed and it therefore had to wait. The next logical questions from the user would perhaps be which specific row on which table the SQL statement and the session were waiting for and which session had locked the row. These are also very easy to find with the query shown in Listing 2. The BLOCKING_SESSION column shows the session that was holding the lock: the session with SESSION_ID 43. The Listing 2 output also shows the object ID (CURRENT_OBJ#) of the table whose row was locked and other information necessary to get the row information. With the data in the Listing 2 output, you can get the ROWID of the locked row by using the query shown in Listing 3. The row with ROWID AAAdvSAAHAAABGPAAw had been locked by the session with SESSION_ID 43 and was being requested to be updated by the session with SESSION_ID 39. You now have the information on why the session (with SESSION_ID 39) was slow—it was waiting for a lock for 22 minutes—what SQL statement it was executing, and what specific row it was looking to lock. You have successfully uncovered the root cause of the performance issue in the session with SESSION_ID 39.
Code Listing 3: Getting specific row information

select
    owner||'.'||object_name||':'||nvl(subobject_name,'-') obj_name,
    dbms_rowid.rowid_create (
        1,
        o.data_object_id,
        row_wait_file#,
        row_wait_block#,
        row_wait_row#
    ) row_id
from v$session s, dba_objects o
where sid = &sid
and o.data_object_id = s.row_wait_obj#

OBJ_NAME       ROW_ID
—————————————  ———————————————————
ARUP.TEST1:-   AAAdvSAAHAAABGPAAw

Resource Contention

After uncovering the root cause of the slowness in session 39, you now turn your attention to session 44. Revisiting the output in Listing 1, you can see that session 44 switched between waiting (shown under SESSION_STATE as WAITING) and doing productive work (shown as ON CPU). Note the very first line in the output for the session with SESSION_ID 44 and SAMPLE_TIME 29-SEP-12 04.55.34.419 PM. The SESSION_STATE column shows WAITING, which means that the session was waiting at that time. The EVENT and TIME_WAITED columns show “resmgr:cpu quantum” and “109984,” respectively. This means that the session had already waited for 109,984 microseconds, or about 0.11 seconds, at that time for the “resmgr:cpu quantum” event. The next line in Listing 1, sampled about a second later, shows the SESSION_STATE column value as ON CPU, which means that the session was doing productive work at that time—not waiting. You need to know why the session was intermittently waiting for this wait event and therefore slowing down.
The “resmgr:cpu quantum” event is due to Oracle Database’s Database Resource Management feature. Database Resource Management acts as a resource governor: it limits CPU consumption of individual sessions when the total CPU demand from all sessions goes up to more than 100 percent, and it enables more-important sessions to get the CPU they need. Because the output shows the session waiting, you can conclude that the CPU consumption by the session with SESSION_ID 44 was high enough at that time for Database Resource Management to limit its CPU usage. Well, the application owner counters, this application is very important and the resource usage should not have been constrained.
In that case, you may suspect that the session was under a consumer group that has a more restrictive CPU allocation than expected. So your next stop is to find out what consumer group was active for the session at that time—not now. Fortunately, the process for finding this information is straightforward. Active Session History records the consumer group that was active for a session at the time of sampling and displays that information in the CONSUMER_GROUP_ID column in the V$ACTIVE_SESSION_HISTORY view. You can see that information by using the query shown in Listing 4.
Code Listing 4: Listing consumer groups

select sample_time, session_state, event, consumer_group_id
from v$active_session_history
where user_id = 92
and sample_time between
    to_date('29-SEP-12 04.55.02 PM','dd-MON-yy hh:mi:ss PM')
    and
    to_date('29-SEP-12 05.05.02 PM','dd-MON-yy hh:mi:ss PM')
and session_id = 44
order by 1;

                           SESSION
SAMPLE_TIME                _STATE   EVENT               CONSUMER_GROUP_ID
—————————————————————————  ———————  ——————————————————  —————————————————
29-SEP-12 04.55.34.419 PM  WAITING  resmgr:cpu quantum              12166
29-SEP-12 04.55.35.419 PM  ON CPU                                   12166
29-SEP-12 04.55.36.419 PM  WAITING  resmgr:cpu quantum              12166
29-SEP-12 04.55.37.419 PM  ON CPU                                   12166
29-SEP-12 04.55.38.419 PM  WAITING  resmgr:cpu quantum              12166
29-SEP-12 04.55.39.419 PM  WAITING  resmgr:cpu quantum              12166
29-SEP-12 04.55.40.419 PM  ON CPU                                   12166
… output truncated …
29-SEP-12 04.55.37.419 PM  ON CPU                                   12162
29-SEP-12 04.55.38.419 PM  ON CPU                                   12166
29-SEP-12 04.55.39.419 PM  ON CPU                                   12162
29-SEP-12 04.55.40.419 PM  ON CPU                                   12162

Because the session may have been under different consumer groups, it’s wise to select the consumer groups for all of the sampled data in Active Session History for that session, rather than just one sample. In this case, the session was under consumer group 12166 until it changed to 12162 on 29-SEP-12 at 04.55.37.419 PM. To find out the names of the consumer groups, use the following query:

select name
from v$rsrc_consumer_group
where id in (12166,12162);     

   ID  NAME
—————— ————————————
12166  OTHER_GROUPS
12162  APP_GROUP

From the output of Listing 4, you can see that from 04.55.34.419 PM, the session was under consumer_group_id 12166, which is OTHER_GROUPS (as shown in the query against v$rsrc_consumer_group above). The CONSUMER_GROUP_ID changed from 12166 to 12162 at 04.55.37.419 PM. This could have happened due to one of the three most likely reasons: a DBA manually activated a different resource plan, a different plan was activated automatically by a scheduling mechanism, or a DBA changed the consumer group of the session from OTHER_GROUPS to APP_GROUP (also shown in the query against v$rsrc_consumer_group above) online. Whatever the reason, with the APP_GROUP consumer group in effect, the session was doing more-productive work (as shown by the SESSION_STATE value ON CPU) than waiting with the “resmgr:cpu quantum” event. This observation may lead to many conclusions, but the most obvious is perhaps that the APP_GROUP consumer group is less restrictive in terms of CPU allocation than OTHER_GROUPS. In that case, you should examine why the OTHER_GROUPS consumer group was activated earlier and, perhaps more importantly, whether this restriction was necessary or just a mistake. In any case, you have now found the root cause of the wait.
The next obvious question is why session 44 consumed so much CPU that it had to be constrained by Database Resource Management. The answer lies in the SQL statement that session 44 was executing at that time (not now). The SQL_ID column in the output of Listing 1 was fngb4y81xr57x. You can get the text of that SQL statement with the following query:

SQL> select SQL_TEXT from v$sql 
where sql_id = 'fngb4y81xr57x';
 
SQL_TEXT
———————————————————————————————————
SELECT MAX(TEST1.OWNER) FROM TEST1, 
TEST2, TEST2, TEST2, TEST2, TEST2, 
TEST2, TEST2, TEST2, TEST2, TEST2, 
TEST2, TEST2, TEST2, TEST2, TEST2, 
TEST2, TEST2, TEST2, TEST2, TEST2, 
TEST2, TEST2, TEST2, TEST2

You can immediately see from the output that the query was performing a Cartesian join (joining all rows of a table with all rows of another without a join condition) on a TEST2 table several times. Even if the TEST2 table has a relatively small number of rows, the Cartesian join will produce a lot of logical I/Os. Because logical I/O requires CPU cycles to execute, it’s hardly surprising that the session consumed so much CPU that Database Resource Management limited it. To resolve this issue, you will need to rewrite the query to eliminate or reduce Cartesian joins.

More Uses

In the previous sections, you saw how to find issues that occurred at a specific point in the past in Active Session History. Hopefully, the content gave you an idea of how powerful Active Session History is and how it can be used in many circumstances. Here is another example of the power and usage of Active Session History: suppose a user complains that things seemed to have been slow from a specific client machine—prolaps01—between 4:55 p.m. and 5:05 p.m. on September 29. Because Active Session History also records the machine name, you can use the query shown in Listing 5 to display all the different events experienced by all sessions from the prolaps01 machine and how often each event occurred. You can see from the Listing 5 output that during that time, the sessions connected from the prolaps01 client machine experienced locking and Resource Manager–related waits many times. With this information, you can dig further inside the V$ACTIVE_SESSION_HISTORY view to identify specific sessions and what they were doing earlier to have experienced these waits.
Code Listing 5: Checking all events from a machine

select event, count(1)
from v$active_session_history
where machine = 'prolaps01'
and sample_time between
    to_date('29-SEP-12 04.55.00 PM','dd-MON-yy hh:mi:ss PM')
    and
    to_date('29-SEP-12 05.05.00 PM','dd-MON-yy hh:mi:ss PM')
group by event
order by event;

EVENT                             COUNT(1)
————————————————————————————      ————————
… output truncated …
db file scattered read                  93
db file parallel write                 127
log file parallel write                134
db file sequential read                293
control file parallel write            612
control file sequential read           948
enq: TX - row lock contention         1309
resmgr:cpu quantum                    1371

Although pulling specific data from Active Session History is great for spot analysis, sometimes you may want to take a look at the entire collection of Active Session History data within a time frame. Active Session History reports for a specific time period are great for that. You can generate an Active Session History report from Oracle Enterprise Manager or from the command line. For the latter, connect to the database as a DBA user and execute the following script at the SQL*Plus prompt: @$ORACLE_HOME/rdbms/admin/ashrpt.sql.

Next Steps READ “*Beginning Performance Tuning” “Performance Tuning: Diagnose the Past” LEARN more about performance tuning* *Oracle Database 2 Day + Performance Tuning Guide 11g Release 2 (11.2)* *generating and using Active Session History reports*

Next Steps

READ

“Beginning Performance Tuning”

“Performance Tuning: Diagnose the Past”
LEARN more about performance tuning
Oracle Database 2 Day + Performance Tuning Guide 11g Release 2 (11.2)
generating and using Active Session History reports

You can find more information on this procedure in “NEXT STEPS.”

Active Session History Archive

Active Session History collects information on active sessions from the database instance every second. Depending on the database activity, that will lead to a lot of data collected inside the ASH buffer, but because the ASH buffer is a memory-resident structure, it has only a finite amount of space. In addition, when the instance goes down, the instance’s memory vanishes with it. Therefore, Oracle Database archives the information from the ASH buffer to a database table to make it persistent. This archived table data is visible in a view called DBA_HIST_ACTIVE_SESS_HISTORY. If you don’t find the data in the V$ACTIVE_SESSION_HISTORY view, check for it in the DBA_HIST_ACTIVE_SESS_HISTORY view, as shown in Listing 6. The output shows that the session was experiencing row lock waits—you can get the lock and row information from the DBA_HIST_ACTIVE_SESS_HISTORY view by using the query shown in Listing 7.
Code Listing 7: Getting row lock information from the Active Session History archive

select sample_time, session_state, blocking_session,
owner||'.'||object_name||':'||nvl(subobject_name,'-') obj_name,
    dbms_ROWID.ROWID_create (
        1,
        o.data_object_id,
        current_file#,
        current_block#,
        current_row#
    ) row_id
from dba_hist_active_sess_history s, dba_objects o
where user_id = 92
and sample_time between
    to_date('29-SEP-12 04.55.02 PM','dd-MON-yy hh:mi:ss PM')
    and
    to_date('29-SEP-12 05.05.02 PM','dd-MON-yy hh:mi:ss PM')
and event = 'enq: TX - row lock contention'
and o.data_object_id = s.current_obj#
order by 1,2;

Conclusion

Active Session History is a very powerful facility inside Oracle Database that records the information on all active sessions in the database instance in a buffer every second and exposes that data to you in a view called V$ACTIVE_SESSION_HISTORY and subsequently to a persistent table visible in the DBA_HIST_ACTIVE_SESS_HISTORY view. Active Session History offers a second-by-second record of the activities inside the session—even when that session has disconnected or ceased that activity—enabling you to go back in time and identify what was ailing a specific session at a certain point in the past.

Setup.sql REM This to for setting up the test users, tables, etc. REM REM Create the user. If this user exists, use a different name REM create user arup identified by arup / grant create session, unlimited tablespace, create table to arup / connect arup/arup REM REM create all test tables REM drop table test1 / drop table test2 / drop table test3 / create table test1 as select * from all_objects / create table test2 as select * from test1 / create table test3 as select rownum col1, created col2 from test2 / Script: test1.sql declare l_dummy_1 varchar2(2000); l_dummy_2 varchar2(2000); l_stmt varchar2(2000); begin for i in 1..71540 loop l_stmt := 'select to_char(col2,''mm/dd/yyyy hh24:mi:ss'')'\|\| ' from test3'\|\| ' where col1 = to_char('\|\|i\|\|')'; dbms_output.put_line('l_stmt='\|\|l_stmt); execute immediate l_stmt into l_dummy_1; l_stmt := 'select col1 '\|\| 'from test3 '\|\| 'where col2 = to_date('''\|\| l_dummy_1\|\| ''',''mm/dd/yyyy hh24:mi:ss'')'\|\| ' and col1 = '''\|\|to_char(i)\|\|''''; dbms_output.put_line('l_stmt='\|\|l_stmt); execute immediate l_stmt into l_dummy_2; end loop; end; / Script: test2.sql declare l_dummy varchar2(200); begin select max(test1.owner) into l_dummy from test1, test2, test2, test2, test2, test2, test2, test2, test2, test2, test2, test2, test2, test2, test2, test2, test2, test2, test2, test2, test2, test2, test2, test2, test2; dbms_lock.sleep(120); end; / Script: upd1.sql update test1 set status = 'D' where object_id = 2 /

Setup.sql

REM This to for setting up the test users, tables, etc.
REM
REM Create the user. If this user exists, use a different name
REM
create user arup identified by arup
/
grant create session, unlimited tablespace, create table to arup
/
connect arup/arup
REM
REM create all test tables
REM
drop table test1
/
drop table test2
/
drop table test3
/
create table test1
as
select * from all_objects
/
create table test2
as
select * from test1
/
create table test3
as
select rownum col1, created col2
from test2
/

Script: test1.sql

declare
    l_dummy_1   varchar2(2000);
    l_dummy_2   varchar2(2000);
    l_stmt      varchar2(2000);
begin
    for i in 1..71540 loop
        l_stmt :=
            'select to_char(col2,''mm/dd/yyyy hh24:mi:ss'')'||
            ' from test3'||
            ' where col1 = to_char('||i||')';
        dbms_output.put_line('l_stmt='||l_stmt);
        execute immediate l_stmt into l_dummy_1;
        l_stmt :=
            'select col1 '||
            'from test3 '||
            'where col2 = to_date('''||
            l_dummy_1||
            ''',''mm/dd/yyyy hh24:mi:ss'')'||
            ' and col1 = '''||to_char(i)||'''';
        dbms_output.put_line('l_stmt='||l_stmt);
        execute immediate l_stmt into l_dummy_2;
    end loop;
end;
/

Script: test2.sql

declare
    l_dummy varchar2(200); 
begin 
    select max(test1.owner) 
    into l_dummy
    from test1, test2, test2, test2, test2,
        test2, test2, test2, test2,
        test2, test2, test2, test2, 
        test2, test2, test2, test2,
        test2, test2, test2, test2, 
        test2, test2, test2, test2; 
    dbms_lock.sleep(120);  
end;     
/

Script: upd1.sql

update test1 set status = 'D' where object_id = 2
/

Response-Time Analysis Made Easy in Oracle Database 10g

by Robin Schumacher

Historically, in trying to achieve maximum database performance, Oracle DBAs and performance analysts have fought an uphill battle to obtain solid response time metrics for system as well as user session activity. The problem facing DBAs has always had two facets: first, determining exactly "where" the database or user sessions have been spending their time; and second, determining the objective nature of the user experience.
Given all the possible activities and interactions within the database, these tasks are far from trivial. The Oracle Wait Interface, which was introduced many versions ago, has been a great jump-start for admins who know how to use it, but even it has lacked the ideal ability to tell a DBA how well the system or user sessions have efficiently processed transactions or queries. Enabling and poring over trace files can garner this level of detail, but for most overworked DBAs who have to manage large database farms, this exercise is a luxury.
Fortunately, those Oracle DBAs who have upgraded to Oracle Database 10g will find major response-time enhancements that allow a much better picture to be presented about system and session-level response-time metrics. Most significantly, the Oracle Automatic Database Diagnostic Monitor (ADDM) provides insights into response times and much more by automatically analyzing collected statistics, identifying problem areas, and even providing recommendations through the Oracle Enterprise Manager Grid Control GUI.
Furthermore, and most relevant to our discussion here, historical mechanisms in Oracle Database 10g allow DBAs to look back in time to perform their own response-time trend analysis, which helps them determine peak and nonpeak transaction/system times as well as locate rogue processes and SQL that elongate batch cycles or ETL jobs.
In this article, I'll explore the use of some of these historical mechanisms at the system, session, and SQL levels. For more information about ADDM, see the Oracle documentation as well as the " ADDM and SQL Tuning Advisor" installment of Arup Nanda's "Oracle Database 10g: Top 20 Features for DBAs."
System-Level Response-Time Analysis
Starting at the global or system level, DBAs typically want answers to these questions:

In general, how well is my database running? What defines efficiency?
What average response time are my users experiencing?
Which activities affect overall response time the most?

The answers to these questions have been fairly elusive for DBAs before Oracle Database 10g, but now such metrics can be somewhat easy to capture if you happen to be using the latest and greatest Oracle database.
First of all, part of the answer to how well, in general, a database is running can be obtained by issuing this query in Oracle Database 10g:

                               
select  METRIC_NAME,
        VALUE
from    SYS.V_$SYSMETRIC
where   METRIC_NAME IN ('Database CPU Time Ratio',
                        'Database Wait Time Ratio') AND
        INTSIZE_CSEC = 
        (select max(INTSIZE_CSEC) from SYS.V_$SYSMETRIC); 

METRIC_NAME                         VALUE
------------------------------ ----------
Database Wait Time Ratio                6
Database CPU Time Ratio                94

The Oracle Database 10g V$SYSMETRIC view contains several excellent response-time metrics, two of which are the Database Wait Time Ratio and Database CPU Time Ratio. The query above shows the latest snapshot of these two statistics, which help you determine if your database is currently experiencing a high percentage of waits/bottlenecks vs. smoothly running operations. The Database CPU Time Ratio is calculated by dividing the amount of CPU expended in the database by the amount of "database time," which is defined as the time spent by the database on user-level calls (with instance background process activity being excluded). High values (90-95+ percent) are good and indicate few wait/bottleneck actions, but take this threshold only as a general rule of thumb because every system is different.

You can also take a quick look over the last hour to see if the database has experienced any dips in overall performance by using this query:

                               
select  end_time,
        value
from    sys.v_$sysmetric_history
where   metric_name = 'Database CPU Time Ratio'
order by 1;

END_TIME                  VALUE
-------------------- ----------
22-NOV-2004 10:00:38         98
22-NOV-2004 10:01:39         96
22-NOV-2004 10:02:37         99
22-NOV-2004 10:03:38        100
22-NOV-2004 10:04:37         99
22-NOV-2004 10:05:38         77
22-NOV-2004 10:06:36        100
22-NOV-2004 10:07:37         96
22-NOV-2004 10:08:39        100
.
.

And, you can get a good idea of the minimum, maximum, and average values of overall database efficiency by querying the V$SYSMETRIC_SUMMARY view with a query such as this:

                               
select  CASE METRIC_NAME
            WHEN 'SQL Service Response Time' then 'SQL Service Response Time (secs)'
            WHEN 'Response Time Per Txn' then 'Response Time Per Txn (secs)'
            ELSE METRIC_NAME
            END METRIC_NAME,
                CASE METRIC_NAME
            WHEN 'SQL Service Response Time' then ROUND((MINVAL / 100),2)
            WHEN 'Response Time Per Txn' then ROUND((MINVAL / 100),2)
            ELSE MINVAL
            END MININUM,
                CASE METRIC_NAME
            WHEN 'SQL Service Response Time' then ROUND((MAXVAL / 100),2)
            WHEN 'Response Time Per Txn' then ROUND((MAXVAL / 100),2)
            ELSE MAXVAL
            END MAXIMUM,
                CASE METRIC_NAME
            WHEN 'SQL Service Response Time' then ROUND((AVERAGE / 100),2)
            WHEN 'Response Time Per Txn' then ROUND((AVERAGE / 100),2)
            ELSE AVERAGE
            END AVERAGE
from    SYS.V_$SYSMETRIC_SUMMARY 
where   METRIC_NAME in ('CPU Usage Per Sec',
                      'CPU Usage Per Txn',
                      'Database CPU Time Ratio',
                      'Database Wait Time Ratio',
                      'Executions Per Sec',
                      'Executions Per Txn',
                      'Response Time Per Txn',
                      'SQL Service Response Time',
                      'User Transaction Per Sec')
ORDER BY 1

METRIC_NAME                       MINIMUM    MAXIMUM    AVERAGE
------------------------------ ---------- ---------- ----------
CPU Usage Per Sec                       0          7          1
CPU Usage Per Txn                       1         29          8
Database CPU Time Ratio                61        100         94
Database Wait Time Ratio                0         39          5
Executions Per Sec                      2         60          8
Executions Per Txn                     16        164         41
Response Time Per Txn (secs)            0        .28        .08
SQL Service Response Time (sec          0          0          0
User Transaction Per Sec                0          1          0

The query above contains more response-time metrics than simply the Database CPU and Wait Time Ratios (we'll cover those later), but you can see the benefit in being able to acquire this information. For this particular instance, the average Database CPU Time Ratio is 94, which is well within our acceptable limits.
The next question DBAs pose at the system level involves the average level of response time that their user community is experiencing. (Prior to Oracle Database 10g this type of data was difficult to capture, but not anymore.) The query shown above that interrogates the V$SYSMETRIC_SUMMARY view tells us what we need to know. If complaints of unacceptable response times are mounting from users, the DBA can check the Response Time Per Txn and SQL Service Response Time metrics to see if a database issue exists. For example, the statistics shown above report that the maximum response time per user transaction has been only .28 second, with the average response time being a blazing .08 second. Oracle certainly wouldn't be to blame in this case.
If, however, response times are longer than desired, the DBA will then want to know what types of user activities are responsible for making the database work so hard. Again, before Oracle Database 10g, this information was more difficult to acquire, but now the answer is only a query away:

                               
select  case db_stat_name
            when 'parse time elapsed' then 
                'soft parse time'
            else db_stat_name
            end db_stat_name,
        case db_stat_name
            when 'sql execute elapsed time' then 
                time_secs - plsql_time 
            when 'parse time elapsed' then 
                time_secs - hard_parse_time
            else time_secs
            end time_secs,
        case db_stat_name
            when 'sql execute elapsed time' then 
                round(100 * (time_secs - plsql_time) / db_time,2)
            when 'parse time elapsed' then 
                round(100 * (time_secs - hard_parse_time) / db_time,2)  
            else round(100 * time_secs / db_time,2)  
            end pct_time
from
(select stat_name db_stat_name,
        round((value / 1000000),3) time_secs
    from sys.v_$sys_time_model
    where stat_name not in('DB time','background elapsed time',
                            'background cpu time','DB CPU')),
(select round((value / 1000000),3) db_time 
    from sys.v_$sys_time_model 
    where stat_name = 'DB time'),
(select round((value / 1000000),3) plsql_time 
    from sys.v_$sys_time_model 
    where stat_name = 'PL/SQL execution elapsed time'),
(select round((value / 1000000),3) hard_parse_time 
    from sys.v_$sys_time_model 
    where stat_name = 'hard parse elapsed time')
order by 2 desc;


DB_STAT                          TIME_SECS       PCT_TIME
-----------------------------    ---------       --------
sql execute elapsed time         13263.707       45.84                                 
PL/SQL execution elapsed time    13234.738       45.74                                 
hard parse elapsed time           1943.687        6.72                                  
soft parse time                    520.584         1.8
.
.

The example output above shows a database that has spent the vast majority of its time handling SQL and PL/SQL requests. Complete descriptions of all the statistics supported by the V$SYS_TIME_MODEL view can be found here.
In addition to active time, a DBA will want to know the global wait times as well. Prior to Oracle Database 10g, a DBA had to view individual wait events to understand waits and bottlenecks, but now Oracle provides a summary/rollup mechanism for waits via wait classes:

                               
select  WAIT_CLASS,
        TOTAL_WAITS,
        round(100 * (TOTAL_WAITS / SUM_WAITS),2) PCT_WAITS,
        ROUND((TIME_WAITED / 100),2) TIME_WAITED_SECS,
        round(100 * (TIME_WAITED / SUM_TIME),2) PCT_TIME
from
(select WAIT_CLASS,
        TOTAL_WAITS,
        TIME_WAITED
from    V$SYSTEM_WAIT_CLASS
where   WAIT_CLASS != 'Idle'),
(select  sum(TOTAL_WAITS) SUM_WAITS,
        sum(TIME_WAITED) SUM_TIME
from    V$SYSTEM_WAIT_CLASS
where   WAIT_CLASS != 'Idle')
order by 5 desc;

WAIT_CLASS      TOTAL_WAITS  PCT_WAITS TIME_WAITED_SECS   PCT_TIME
--------------- ----------- ---------- ---------------- ----------
User I/O            2245204       7.48          4839.43      54.39
System I/O          2438387       8.12          2486.21      27.94
Application          920385       3.07           513.56       5.77
Other                 39962        .13           422.36       4.75
Commit               200872        .67           284.76        3.2
Network            24133213      80.38           162.26       1.82
Concurrency            6867        .02           102.63       1.15
Configuration         39377        .13            86.21        .97

It's much easier to tell now that the bulk of overall wait time is due, for example, to user I/O waits than to try to tally individual wait events to get a global picture. As with response-time metrics, you can also look back in time over the last hour with a query like this one:

                               
select  to_char(a.end_time,'DD-MON-YYYY HH:MI:SS') end_time,
        b.wait_class,
        round((a.time_waited / 100),2) time_waited 
from    sys.v_$waitclassmetric_history a,
        sys.v_$system_wait_class b
where   a.wait_class# = b.wait_class# and
        b.wait_class != 'Idle'
order by 1,2;

END_TIME             WAIT_CLASS      TIME_WAITED
-------------------- --------------- -----------
22-NOV-2004 11:28:37 Application               0
22-NOV-2004 11:28:37 Commit                  .02
22-NOV-2004 11:28:37 Concurrency               0
22-NOV-2004 11:28:37 Configuration             0
22-NOV-2004 11:28:37 Network                 .01
22-NOV-2004 11:28:37 Other                     0
22-NOV-2004 11:28:37 System I/O              .05
22-NOV-2004 11:28:37 User I/O                  0
.
.

You can, of course, just focus on a single SID with the V$SESS_TIME_MODEL view and obtain data for all statistical areas of a session. You can also view current session wait activity using the new wait classes using the following query:

                               
select  a.sid,
        b.username,
        a.wait_class,
        a.total_waits,
        round((a.time_waited / 100),2) time_waited_secs
from    sys.v_$session_wait_class a,
        sys.v_$session b
where   b.sid = a.sid and
        b.username is not null and
        a.wait_class != 'Idle'
order by 5 desc;

SID USERNAME   WAIT_CLASS      TOTAL_WAITS TIME_WAITED_SECS
--- ---------- --------------- ----------- ----------------
257 SYSMAN     Application          356104            75.22
255 SYSMAN     Commit                14508            25.76
257 SYSMAN     Commit                25026            22.02
257 SYSMAN     User I/O              11924            19.98
.
.
.

After this stage, you can check the standard individual wait events as you've been able to do in earlier versions of Oracle with V$SESSION_WAIT and V$SESSION_EVENT. You'll also find the new wait classes in these two modified views with Oracle Database 10g.
If you need to look back in time to discover what sessions were logged on and consuming the most resources, you can use the following query. In the example below, we're looking at activity from midnight to 5 a.m. on November 21, 2004, that involved user I/O waits:

                               
select  sess_id,
        username,
        program,
        wait_event,
        sess_time,
        round(100 * (sess_time / total_time),2) pct_time_waited
from
(select a.session_id sess_id,
        decode(session_type,'background',session_type,c.username) username,
        a.program program,
        b.name wait_event,
        sum(a.time_waited) sess_time
from    sys.v_$active_session_history a,
        sys.v_$event_name b,
        sys.dba_users c
where   a.event# = b.event# and
        a.user_id = c.user_id and
        sample_time > '21-NOV-04 12:00:00 AM' and 
        sample_time < '21-NOV-04 05:00:00 AM' and
        b.wait_class = 'User I/O'
group by a.session_id,
        decode(session_type,'background',session_type,c.username),
        a.program,
        b.name),
(select sum(a.time_waited) total_time
from    sys.v_$active_session_history a,
        sys.v_$event_name b
where   a.event# = b.event# and
        sample_time > '21-NOV-04 12:00:00 AM' and 
        sample_time < '21-NOV-04 05:00:00 AM' and
        b.wait_class = 'User I/O')
order by 6 desc;

SESS_ID USERNAME PROGRAM    WAIT_EVENT                SESS_TIME PCT_TIME_WAITED
------- -------- ---------- ------------------------- ---------- -------------
    242 SYS      exp@RHAT9K db file scattered read       3502978         33.49
    242 SYS      oracle@RHA db file sequential read      2368153         22.64
    242 SYS      oracle@RHA db file scattered read       1113896         10.65
    243 SYS      oracle@RHA db file sequential read       992168          9.49

The Oracle Database 10g V$ACTIVE_SESSION_HISTORY view comes into play here to provide an insightful look back in time at session experiences for a given time period. This view gives you a lot of excellent information without the need for laborious tracing functions. We'll see more use of it in the next section, which deals with analyzing the response times of SQL statements.
SQL Response-Time Analysis
Examining the response time of SQL statements became easier in Oracle9i, and with Oracle Database 10g, DBAs have many tools at their disposal to help them track inefficient database code.
Historically the applicable V$ view here has been V$SQLAREA. Starting with Oracle9i, Oracle added the ELAPSED_TIME and CPU_TIME columns, which have been a huge help in determining the actual end user experience of a SQL statement execution (at least, when dividing them by the EXECUTIONS column, which produces the average amount of time per execution).
In Oracle Database 10g, six new wait-related and timing columns have been added to V$SQLAREA:

APPLICATION_WAIT_TIME
CONCURRENCY_WAIT_TIME
CLUSTER_WAIT_TIME
USER_IO_WAIT_TIME
PLSQL_EXEC_TIME
JAVA_EXEC_TIME

The new columns are helpful in determining, for example, the amount of time that a procedure spends in PL/SQL code vs. standard SQL execution, and if a SQL statement has experienced any particular user I/O waits. For example, a query you could use to find the top five SQL statements with the highest user I/O waits would be:

                               
select *
from
(select sql_text,
        sql_id,
        elapsed_time,
        cpu_time,
        user_io_wait_time
from    sys.v_$sqlarea
order by 5 desc)
where rownum < 6;

SQL_TEXT                  SQL_ID       ELAPSED_TIME CPU_TIME  USER_IO_WAIT_TIME
------------------------- ------------ ------------ ---------- ---------------
select /*+ rule */ bucket db78fxqxwxt7     47815369   19000939            3423
SELECT :"SYS_B_0" FROM SY agdpzr94rf6v     36182205   10170226            2649
select obj#,type#,ctime,m 04xtrk7uyhkn     28815527   16768040            1345
select grantee#,privilege 2q93zsrvbdw4     28565755   19619114             803
select /*+ rule */ bucket 96g93hntrzjt      9411028    3754542             606

Of course, getting the SQL statements with the highest elapsed time or wait time is good, but you need more detail to get to the heart of the matter—which is where the V$ACTIVE_SESSION_HISTORY view again comes into play. With this view, you can find out what actual wait events delayed the SQL statement along with the actual files, objects, and object blocks that caused the waits (where applicable).
For example, let's say you've found a particular SQL statement that appears to be extremely deficient in terms of user I/O wait time. You can issue the following query to get the individual wait events associated with the query along with the corresponding wait times, files, and objects that were the source of those waits:

                               
select event,
        time_waited,
        owner,
        object_name,
        current_file#,
        current_block# 
from    sys.v_$active_session_history a,
        sys.dba_objects b 
where   sql_id = '6gvch1xu9ca3g' and
        a.current_obj# = b.object_id and
        time_waited <> 0;

EVENT                     TIME_WAITED OWNER  OBJECT_NAME           file  block
------------------------- ----------- ------ --------------------- ---- ------
db file sequential read         27665 SYSMAN MGMT_METRICS_1HOUR_PK    3  29438
db file sequential read          3985 SYSMAN SEVERITY_PRIMARY_KEY     3  52877

Of course, you can use V$ACTIVE_SESSION_HISTORY in a historical fashion to narrow down unoptimized SQL statements for a particular time period. The point is that Oracle Database 10g makes it a lot easier to conduct response-time analysis on SQL statements with simplified data dictionary views, as opposed to the time-consuming trace-and-digest method.
Conclusion
DBAs and performance analysts who manage the performance of Oracle Database 10g will find many of the response-time metrics they've yearned for over the years now at their fingertips in the latest release of Oracle's flagship database. Such statistics will help accelerate the process of finding the proverbial needle in the haystack of large and complex database performance situations.

Saturday 30 July 2016

Important srvctl and crsctl commands

Displays the configuration for the SCAN VIP.

[root@rac2 Desktop]# srvctl config scan -i 1
SCAN name: rac-scan, Network: 1/192.192.2.0/255.255.255.0/eth4
SCAN VIP name: scan1, IP: /rac-scan/192.192.2.51

Displays the configuration information for a VIP or VIP's configured on a node.

[root@rac2 Desktop]# srvctl config vip -n rac2
VIP exists: /rac2-vip/192.192.2.202/192.192.2.0/255.255.255.0/eth4, hosting node rac2

Displays configuration information for the SCAN listener.

[root@rac2 Desktop]# srvctl config scan_listener -i 1
SCAN Listener LISTENER_SCAN1 exists. Port: TCP:1521

[root@rac2 Desktop]# srvctl config nodeapps
Network exists: 1/192.192.2.0/255.255.255.0/eth4, type static
VIP exists: /rac1-vip/192.192.2.201/192.192.2.0/255.255.255.0/eth4, hosting node rac1
VIP exists: /rac2-vip/192.192.2.202/192.192.2.0/255.255.255.0/eth4, hosting node rac2
GSD exists
ONS exists: Local port 6100, remote port 6200, EM port 2016

Temporarily relocates service from one node of the cluster to another.
mandatory options are either -i, -t or -c, -n, but not both.

[root@rac2 Desktop]# srvctl relocate service -d orcl -s taf_service -i orcl1 -t orcl2, orcl3

Display OHAS autostart config on this server
[root@rac2 Desktop]# crsctl disable crs
     Disable OHAS autostart on this server

crsctl query crs activeversion
     Lists the Oracle Clusterware operating version
crsctl query crs releaseversion
     Lists the Oracle Clusterware release version
crsctl query crs softwareversion [<nodename>| -all]
     Lists the version of Oracle Clusterware software installed

Find public and private network ip addresses

[root@rac2 Desktop]# oifcfg getif
eth5 10.10.2.0 global cluster_interconnect
eth4 192.192.2.0 global public

check $CRS_HOME/cdata/<cluster_name> directory

Friday 29 July 2016

Dba_jobs view

DESC DBA_JOBS
----------------------
JOB
SEHEMA_USER
LAST_DATE
THIS_DATE
NEXT_DATE
WHAT --> Body of the anonymous PL/SQL block that the job executes

Bringing Down a DBMS_JOB

1. Find the Job You Want to Bring Down
In order to do anything you first need to find the job that is giving you a headache. Go ahead and run the running_jobs.sql. This will give you the prime information, job, sid, serial#, and spid, for the following actions in bringing down the job.

2. Mark the DBMS_JOB as Broken
Use the following command for the job that you have to deal with.

SQL> EXEC DBMS_JOB.BROKEN(job#,TRUE);

All this command does is mark the job so that if we get it to stop, it won't start again. Let's make one thing perfectly clear, after executing this command the job is still running.

As a side note, if you are trying to shut down a database with jobs that run throughout the day, they may hinder your attempts to bring down the database cleanly. This is a wonderful command to make sure no jobs are executing during the shutdown process. Just be aware that you will need to mark the jobs as unbroken when the database comes back up, more on that later.
3. Kill the Oracle Session

Since the job is still running and it isn't going to end soon, you will need to kill the Oracle session that is executing the job. Use the following command for to kill the job.

ALTER SYSTEM KILL SESSION 'sid,serial#';

4. Kill the O/S Process

More often than not the previous step will still leave the job attached to the database and still running. When this happens you will need to go out to the operating system level and get rid of the process that has spawned from the running job. In order to do this you must login to the database box and issue the following command, depending on the type of operating system you have.

For Windows, at the DOS Prompt: orakill sid spid

For UNIX at the command line> kill '9 spid

The orakill is an Oracle command, while kill is a Unix command.
5. Check if the Job is Still Running

Re-run the session_jobs.sql script to see if you have gotten rid of the job. If you have there is no reason to go further. Usually steps 1 through 4 will be sufficient to get rid of a job but when the job is running wild you will have to continue with steps 6 through 11 which describes a process for bouncing the job queue process.

6. Determine the Current Number of Job Queue Processes

SQL> col value for a10
SQL> select name,value from v$parameter where name = 'job_queue_processes';

7. Alter the Job Queue to Zero

SQL> ALTER SYSTEM SET job_queue_processes = 0;

This will bring down the entire job queue processes.

8. Validate that No Processes are Using the Job Queue
Re-run the session_jobs.sql script to see if any jobs are still running. Since we have given a hard stop to the job queue and issued the kill commands, you can now wait until no more jobs are running. After all the jobs have quit running, you can do whatever maintenance or tuning you need to do before proceeding.

9. Mark the DBMS_JOB as Not Broken
You can now reset the broken job to not broken so they can run again. Just issue the command.

SQL>EXEC DBMS_JOB.BROKEN(job#,FALSE):

10. Alter the Job Queue to Original Value
Set the job queue to its' original value so that the jobs can run again.

ALTER SYSTEM SET job_queue_processes = original_value;

11. Validate that DBMS_JOB Is Running
To make sure everything is back to normal, re-run the above scripts to validate that jobs are scheduled, not broken, and are executing with the next and last dates columns changing.

Check

dba_seheduler_jobs

dba_seheduler_jobs_running

DBA_jobs

DBA_jobs_running

v$session_longops

You need to identify the session of the job in v$session.
In v$session there is much more information.
Amongst others the sql_id of the statement it is executing
V$sql_plan will show the plan of that sql.
V$session_wait will show what it is waiting for
v$sess_io will show whether any IO is going on
v$session_longops will show what long operations are going on and how far they are.

1-the alert_log for this instance;
2- select * from dba_objects;
3- either windows or unix check what other jobs are schuelded;
4- you will need to use stats pack for thissituation.
so take astatspack before and while running this job.

5- show your sp code.
6- an Xplain plan for the select statement.

7-invoke the job and check locks with

DBA Admin