Interconnect and RAC wait events:
The main difference to keep in mind when monitoring a RAC database versus a single-instance database is the buffer cache and its operation. In a RAC environment the buffer cache is global across all instances in the cluster, so processing differs. When a process in a RAC database needs to read or modify data, Oracle first checks whether the block already exists in the local buffer cache. If it is not in the local buffer cache, the global buffer cache is checked to see whether another instance already holds the block in its buffer cache. If so, the remote instance sends the block to the local instance via the high-speed interconnect, thus avoiding a disk read.
Monitoring an Oracle RAC database often means monitoring this situation and the number of requests going back and forth over the RAC interconnect. The most common wait events related to this are gc cr request and gc buffer busy (note that in Oracle RAC 9i and earlier these were known as the “global cache cr request” and “global cache buffer busy” wait events).
gc cr request
The gc cr request wait event records the time it takes to retrieve data from the remote cache. In Oracle 9i and earlier, gc cr request was known as global cache cr request. High wait times for this event are often caused by:
RAC Traffic Using Slow Connection – RAC traffic should normally use a high-speed interconnect to transfer data between instances. Sometimes, however, Oracle may not pick the correct connection and instead route traffic over the slower public network, which significantly increases the wait time for the gc cr request event. The oradebug command can be used to verify which network is being used for RAC traffic:
SQL> oradebug setmypid
SQL> oradebug ipc
This dumps a trace file, containing information about the network and protocols being used for the RAC interconnect, to the location specified by the user_dump_dest Oracle parameter.
Inefficient Queries – poorly tuned queries will increase the amount of data blocks requested by an Oracle session. The more blocks requested typically means the more often a block will need to be read from a remote instance via the interconnect.
gc buffer busy acquire and gc buffer busy release
The gc buffer busy acquire and gc buffer busy release wait events record the time the remote instance spends accessing the requested data block locally. In Oracle 11g you will see the gc buffer busy acquire wait event when the global cache open request originated from the local instance, and gc buffer busy release when the open request originated from a remote instance. In Oracle 10g these were represented by a single gc buffer busy wait, and in Oracle 9i and earlier the “gc” was spelled out as “global cache”, giving the global cache buffer busy wait event. These wait events are all very similar to the buffer busy wait events in a single-instance database and are often the result of:
Hot Blocks – multiple sessions may be requesting a block that is either not in the buffer cache or is held in an incompatible mode. Deleting some of the hot rows and re-inserting them into the table may alleviate the problem: most of the time the rows will be placed into a different block, reducing contention on the original one. The DBA may also need to adjust the PCTFREE and/or PCTUSED parameters for the table to ensure the rows are placed into different blocks.
Inefficient Queries – as with the gc cr request wait event, the more blocks requested from the buffer cache, the greater the likelihood of a session having to wait for other sessions. Tuning queries to access fewer blocks often results in less contention for the same block.
Archive log deletion policy:
-------------------------------------------
CONFIGURE ARCHIVELOG DELETION POLICY
TO BACKED UP 2 TIMES TO 'SBT_TAPE';
This archive log deletion policy specifies that archived log files must be backed up to tape twice before they become eligible for deletion.
DBMS_SPM Package:
Disk space used by the SQL management base (SMB) is regularly checked against a limit based on the size of the SYSAUX tablespace. By default, the limit for the SMB is no more than 10% of the size of the SYSAUX tablespace. The allowable range for this limit is between 1% and 50%. A weekly background process measures the total space occupied by the SMB, and when the defined limit is exceeded the process writes a warning to the alert log. The alerts are generated weekly until either the SMB space limit is increased, the size of the SYSAUX tablespace is increased, or the disk space used by the SMB is decreased by purging SQL management objects (SQL plan baselines or SQL profiles).
To change the percentage limit, use the CONFIGURE procedure of the DBMS_SPM package:

BEGIN
  DBMS_SPM.CONFIGURE('space_budget_percent', 30);
END;
/

In this example, the space limit is changed to 30%. The current values of the parameters accepted by the CONFIGURE procedure can be checked in the DBA_SQL_MANAGEMENT_CONFIG view:

SQL> select parameter_name, parameter_value from dba_sql_management_config;

PARAMETER_NAME        PARAMETER_VALUE
--------------------- ---------------
SPACE_BUDGET_PERCENT               30
PLAN_RETENTION_WEEKS              105
Row Migration and Row Chaining:
If an UPDATE statement increases the amount of data in a row so that the row no longer fits in its data block, then Oracle Database tries to find another block with enough free space to hold the entire row. If such a block is available, then Oracle Database moves the entire row to the new block. This operation is called row migration. If the row is too large to fit into any available block, then the database splits the row into multiple pieces and stores each piece in a separate block. This operation is called row chaining. The database can also chain rows when they are inserted.
Migration and chaining are especially detrimental to performance with the following:
· UPDATE statements that cause migration and chaining, which perform poorly
· Queries that select migrated or chained rows, because these must perform additional input and output
The definition of a sample output table named CHAINED_ROWS appears in a SQL script available on your distribution medium. The common name of this script is UTLCHN1.SQL, although its exact name and location varies depending on your platform. Your output table must have the same column names, data types, and sizes as the CHAINED_ROWS table.
Increasing PCTFREE can help to avoid migrated rows. If you leave more free space available in the block, then the row has room to grow. You can also reorganize or re-create tables and indexes that have high deletion rates. If tables frequently have rows deleted, then data blocks can have partially free space in them. If rows are inserted and later expanded, then the inserted rows might land in blocks with deleted rows but still not have enough room to expand. Reorganizing the table ensures that the main free space consists of totally empty blocks.

Checkpoint not complete:
-------------------------------------------
The "checkpoint not complete" messages are generated because the logs are switching so fast that the checkpoint associated with the log switch has not yet completed.
During that time, Oracle's LGWR process has filled up the remaining redo log groups and is now waiting for the first checkpoint to complete successfully.
Remember, Oracle will stop processing until the checkpoint completes successfully!
Usually you see this "checkpoint not complete" error in your alert log:
/u01/app/oracle/mydb/redo16.log
Thu Jan 14 22:12:55 2015
Thread 1 cannot allocate new log, sequence 46352
Checkpoint not complete
Thread 1 cannot allocate new log, sequence 46352
Checkpoint not complete
Oracle suggests that you might consider setting archive_lag_target to zero to reduce "checkpoint not complete" messages:
alter system set archive_lag_target=0 scope=both;
Overall Oracle performance can be dramatically improved by increasing the log sizes so that logs switch at the recommended interval of 15 to 30 minutes.
You need to ensure that the on-line redo logs don't switch too often during periods of high activity and switch often enough during times of low processing workloads.
This should reduce the delays from the checkpoint not complete errors.
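To gauge how frequently the problem is occurring, a simple sketch is to count the messages in the alert log (the path in the usage comment is an assumption; adjust it for your environment):

```shell
# Count "Checkpoint not complete" messages in an alert log.
# Usage: count_ckpt_waits <alert-log-path>
count_ckpt_waits() {
    grep -c 'Checkpoint not complete' "$1"
}

# Example (path is illustrative -- adjust for your environment):
# count_ckpt_waits /u01/app/oracle/diag/rdbms/mydb/mydb/trace/alert_mydb.log
```

A steadily growing count during busy periods is a sign the online redo logs are undersized.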
RAC network protocol:
For the public network, each network adapter must support TCP/IP.
For the private network, the interface must support the user datagram protocol (UDP) using high-speed network adapters and switches that support TCP/IP (minimum requirement 1 Gigabit Ethernet).
Note:
UDP is the default interface protocol for Oracle RAC and Oracle Clusterware. You must use a switch for the interconnect. Oracle recommends that you use a dedicated switch.
Oracle does not support Token Ring or crossover cables for the interconnect.
How to find which asm disk maps to which linux partition :
- Issue following command to list all asm disks
# ls -ls /dev/oracleasm/disks/ASMDISK*
0 brw-rw---- 1 grid asmadmin 8, 17 Apr 12 18:43 /dev/oracleasm/disks/ASMDISK01
0 brw-rw---- 1 grid asmadmin 8, 18 Apr 12 18:43 /dev/oracleasm/disks/ASMDISK02
The two numbers after the group (asmadmin) are the device's major and minor numbers,
e.g. for ASMDISK01 major = 8, minor = 17.
- See the contents of /proc/partitions to find which partition is mapped to which major/minor:
# cat /proc/partitions
major minor #blocks name
8 0 83886080 sda
8 1 104391 sda1
8 2 5116702 sda2
8 3 5116702 sda3
8 4 1 sda4
8 5 73545538 sda5
8 16 93487104 sdb
8 17 4008186 sdb1
8 18 4008217 sdb2
8 19 4008217 sdb3
8 20 1 sdb4
…
8 29 4008186 sdb13
8 30 4008186 sdb14
8 31 4008186 sdb15
Here sdb1 is mapped to major 8 and minor 17.
Hence, partition /dev/sdb1 is mapped to ASMDISK01
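The lookup above can be scripted. A minimal sketch: given a major and minor number, find the matching entry in a /proc/partitions-style table (the function name is mine, not a standard tool):

```shell
# Print the partition name matching a given major/minor pair.
# Usage: part_for_majmin <major> <minor> [partitions-file]
# Defaults to reading the live /proc/partitions.
part_for_majmin() {
    awk -v maj="$1" -v min="$2" '$1 == maj && $2 == min { print $4 }' "${3:-/proc/partitions}"
}

# The major/minor of an ASM disk device can be read with GNU stat,
# which prints them in hexadecimal:
#   stat -c '%t %T' /dev/oracleasm/disks/ASMDISK01
```

So `part_for_majmin 8 17` on the system above would print sdb1.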
Dynamic Remastering:
When one instance departs the cluster, the GRD portion of that instance needs to be redistributed to the surviving nodes. Similarly, when a new instance enters the cluster, the GRD portions of the existing instances must be redistributed to create the GRD portion of the new instance. This is called dynamic resource reconfiguration, also known as dynamic remastering.
Finding the number of CPU cores:
Use the command cat /proc/cpuinfo, which outputs a chunk of data for each logical processor.
Each chunk starts with this info:
processor : 3
vendor_id : GenuineIntel
cpu family : 6
model : 60
model name : Intel(R) Core(TM) i5-4210M CPU @ 2.60GHz
Processors are numbered starting from 0, so if the last chunk says processor : 3, as in this case, your machine has 4 logical processors (with hyperthreading enabled this can be higher than the number of physical cores).
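Rather than reading the chunks by eye, the count can be taken directly (the function name is mine; it defaults to the live /proc/cpuinfo but accepts a file for testing):

```shell
# Count logical processors by counting "processor" entries in /proc/cpuinfo.
# Usage: cpu_count [cpuinfo-file]
cpu_count() {
    grep -c '^processor' "${1:-/proc/cpuinfo}"
}
```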
Current timestamp:
CURRENT_TIMESTAMP returns the current date and time in the session time zone, in a value of datatype TIMESTAMP WITH TIME ZONE.
SQL> SELECT SESSIONTIMEZONE, CURRENT_TIMESTAMP FROM DUAL;
SESSIONTIMEZONE
---------------------------------------------------------------------------
CURRENT_TIMESTAMP
---------------------------------------------------------------------------
+05:30
03-AUG-16 07.42.28.629657 PM +05:30
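The same idea can be seen outside the database: the shell's date command also renders the current time in a "session" time zone chosen via the TZ environment variable (the TZ value here is a POSIX offset string meaning UTC+05:30, picked only to mirror the example above):

```shell
# Render the current time in an explicitly chosen time zone, analogous to
# CURRENT_TIMESTAMP honouring SESSIONTIMEZONE.
# 'IST-5:30' is a POSIX TZ string for UTC+05:30.
TZ='IST-5:30' date '+%d-%b-%y %I.%M.%S %p %z'
```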
SQL> select * from v$active_instances;
INST_NUMBER INST_NAME
-------------------- ------------------------------------
1 rac1.oracl.com:gisdb1
What privilege has to be granted to scott so that he can query all tables in the database?
SQL> GRANT SELECT ANY TABLE TO scott;
Recreate oraInventory in oracle:
------------------------------------------------
Go to the Oracle Universal Installer (OUI) location for creating the Oracle Inventory:
$ORACLE_HOME/oui/bin
Note: when you are doing this for different homes, you must run the OUI from each respective home.
$ ./runInstaller -silent -ignoreSysPrereqs -attachHome ORACLE_HOME="/apps/oracle/product/10.2.0.2" ORACLE_HOME_NAME="orahome_102"
Starting Oracle Universal Installer...
How to find whether a dump file belongs to exp or expdp:
Use the strings command:
$ strings export_hr.DMP
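A rough, convention-based sketch (the EXPORT:V banner is a commonly observed trait of classic exp files, not a documented format guarantee): original exp dump files typically begin with an EXPORT:V<version> string, while Data Pump (expdp) files do not, so the first line of strings output often tells them apart:

```shell
# Heuristic check: classic exp dump files usually begin with an
# "EXPORT:V<version>" banner; Data Pump (expdp) files do not.
# This is a convention-based sketch, not an official format test.
dump_kind() {
    if strings "$1" | head -n 1 | grep -q '^EXPORT:V'; then
        echo "exp"
    else
        echo "probably expdp"
    fi
}

# Usage: dump_kind export_hr.DMP
```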
ASM disk Status:
Disks that were discovered but that have not yet been assigned to a disk group have a header status of either CANDIDATE or PROVISIONED.
CANDIDATE
Disk is not part of a disk group and may be added to a disk group with the ALTER DISKGROUP statement
PROVISIONED
Disk is not part of a disk group and may be added to a disk group with the ALTER DISKGROUP statement. The PROVISIONED header status is different from the CANDIDATE header status in that PROVISIONED implies that an additional platform-specific action has been taken by an administrator to make the disk available for Automatic Storage Management.
For example, on Windows, the administrator used asmtool or asmtoolg to stamp the disk with a header, or on Linux, the administrator used ASMLib to prepare the disk for ASM.
Clearing a Redo Log File:
------------------------------
A redo log file might become corrupted while the database is open, and ultimately stop database activity because archiving cannot continue. In this situation the ALTER DATABASE CLEAR LOGFILE statement can be used to reinitialize the file without shutting down the database.
The following statement clears the log files in redo log group number 3:
ALTER DATABASE CLEAR LOGFILE GROUP 3;
This statement overcomes two situations where dropping redo logs is not possible:
· There are only two log groups
· The corrupt redo log file belongs to the current group
If the corrupt redo log file has not been archived, use the UNARCHIVED keyword in the statement.
ALTER DATABASE CLEAR UNARCHIVED LOGFILE GROUP 3;
This statement clears the corrupted redo logs and avoids archiving them. The cleared redo logs are available for use even though they were not archived.
If you clear a log file that is needed for recovery of a backup, then you can no longer recover from that backup. The database writes a message in the alert log describing the backups from which you cannot recover.
Note:
If you clear an unarchived redo log file, you should make another backup of the database.
If you want to clear an unarchived redo log that is needed to bring an offline tablespace online, use the UNRECOVERABLE DATAFILE clause in the ALTER DATABASE CLEAR LOGFILE statement.
If you clear a redo log needed to bring an offline tablespace online, you will not be able to bring the tablespace online again. You will have to drop the tablespace or perform an incomplete recovery. Note that tablespaces taken offline with the NORMAL option do not require recovery.