Interconnect and RAC wait events:
The main difference to keep in mind when monitoring a RAC database versus a single-instance database is the buffer cache and its operation. In a RAC environment the buffer cache is global across all instances in the cluster, so processing differs. When a process in a RAC database needs to read or modify data, Oracle first checks whether the block already exists in the local buffer cache. If it is not in the local buffer cache, the global buffer cache is checked to see whether another instance already holds the block in its buffer cache. If so, the remote instance sends the block to the local instance via the high-speed interconnect, thus avoiding a disk read.
Monitoring an Oracle RAC database often means monitoring this situation and the number of requests going back and forth over the RAC interconnect. The most common wait events related to this are gc cr request and gc buffer busy (note that in Oracle RAC 9i and earlier these were known as the “global cache cr request” and “global cache buffer busy” wait events).
gc cr request
The gc cr request wait event records the time it takes to retrieve data from the remote cache. In Oracle 9i and earlier, gc cr request was known as global cache cr request. High wait times for this event are often caused by:
RAC Traffic Using Slow Connection – RAC traffic should normally use a high-speed interconnect to transfer data between instances. Sometimes, however, Oracle may not pick the correct connection and instead route traffic over the slower public network, which significantly increases the wait time for the gc cr request event. The oradebug command can be used to verify which network is being used for RAC traffic:
SQL> oradebug setmypid
SQL> oradebug ipc
This dumps a trace file, containing information about the network and protocols being used for the RAC interconnect, to the location specified by the user_dump_dest Oracle parameter.
Inefficient Queries – poorly tuned queries will increase the amount of data blocks requested by an Oracle session. The more blocks requested typically means the more often a block will need to be read from a remote instance via the interconnect.
gc buffer busy acquire and gc buffer busy release
The gc buffer busy acquire and gc buffer busy release wait events record the time the remote instance spends accessing the requested data block locally. In Oracle 11g you will see the gc buffer busy acquire wait event when the global cache open request originated from the local instance, and gc buffer busy release when the open request originated from a remote instance. In Oracle 10g these were represented by a single gc buffer busy wait, and in Oracle 9i and earlier the “gc” was spelled out as “global cache”, giving the global cache buffer busy wait event. These wait events are all very similar to the buffer busy wait events in a single-instance database and are often the result of:
Hot Blocks – multiple sessions may be requesting a block that is either not in the buffer cache or is held in an incompatible mode. Deleting some of the hot rows and re-inserting them into the table may alleviate the problem: most of the time the rows will be placed into a different block, reducing contention on the original one. The DBA may also need to adjust the PCTFREE and/or PCTUSED parameters for the table to ensure the rows are placed into different blocks.
Inefficient Queries – as with the gc cr request wait event, the more blocks requested from the buffer cache, the greater the likelihood of a session having to wait for other sessions. Tuning queries to access fewer blocks often results in less contention for the same block.
Archive log deletion policy:
-------------------------------------------
CONFIGURE ARCHIVELOG DELETION POLICY
TO BACKED UP 2 TIMES TO 'SBT_TAPE';
This archive log deletion policy specifies that archived log files must be backed up to tape twice before they become eligible for deletion.
DBMS_SPM Package:
Disk space used by the SQL management base (SMB) is regularly checked against a limit based on the size of the SYSAUX tablespace. By default, the limit for the SMB is no more than 10% of the size of the SYSAUX tablespace. The allowable range for this limit is between 1% and 50%. A weekly background process measures the total space occupied by the SMB, and when the defined limit is exceeded the process writes a warning to the alert log. The alerts are generated weekly until either the SMB space limit is increased, the size of the SYSAUX tablespace is increased, or the disk space used by the SMB is decreased by purging SQL management objects (SQL plan baselines or SQL profiles).
To change the percentage limit, use the CONFIGURE procedure of the DBMS_SPM package:

BEGIN
  DBMS_SPM.CONFIGURE('space_budget_percent', 30);
END;
/

In this example, the space limit is changed to 30%. The current values of the parameters accepted by the CONFIGURE procedure can be checked in the DBA_SQL_MANAGEMENT_CONFIG view:

SQL> select parameter_name, parameter_value from dba_sql_management_config;

PARAMETER_NAME        PARAMETER_VALUE
--------------------- ---------------
SPACE_BUDGET_PERCENT               30
PLAN_RETENTION_WEEKS              105
Row Migration and Row Chaining:
If an UPDATE statement increases the amount of data in a row so that the row no longer fits in its data block, then Oracle Database tries to find another block with enough free space to hold the entire row. If such a block is available, then Oracle Database moves the entire row to the new block. This operation is called row migration. If the row is too large to fit into any available block, then the database splits the row into multiple pieces and stores each piece in a separate block. This operation is called row chaining. The database can also chain rows when they are inserted.
Migration and chaining are especially detrimental to performance with the following:
· UPDATE statements that cause migration and chaining, which perform poorly
· Queries that select migrated or chained rows, because these must perform additional input and output
The definition of a sample output table named CHAINED_ROWS appears in a SQL script available on your distribution medium. The common name of this script is UTLCHN1.SQL, although its exact name and location varies depending on your platform. Your output table must have the same column names, data types, and sizes as the CHAINED_ROWS table.
Increasing PCTFREE can help to avoid migrated rows. If you leave more free space available in the block, then the row has room to grow. You can also reorganize or re-create tables and indexes that have high deletion rates. If tables frequently have rows deleted, then data blocks can have partially free space in them. If rows are inserted and later expanded, then the inserted rows might land in blocks with deleted rows but still not have enough room to expand. Reorganizing the table ensures that the main free space consists of totally empty blocks.

Checkpoint not complete:
-------------------------------------------
The "checkpoint not complete" messages are generated because the logs are switching so fast that the checkpoint associated with the log switch has not yet completed.
During that time, Oracle's LGWR process has filled up the remaining redo log groups and is now waiting for the first checkpoint to complete successfully.
Remember, Oracle will stop processing until the checkpoint completes successfully!
Usually you see this "checkpoint not complete" error in your alert log:
/u01/app/oracle/mydb/redo16.log
Thu Jan 14 22:12:55 2015
Thread 1 cannot allocate new log, sequence 46352
Checkpoint not complete
Thread 1 cannot allocate new log, sequence 46352
Checkpoint not complete
Oracle suggests that you might consider setting archive_lag_target to zero to reduce "checkpoint not complete" messages:
alter system set archive_lag_target=0 scope=both;
Overall Oracle performance can be dramatically improved by increasing the log sizes so that logs switch at the recommended interval of 15 to 30 minutes.
You need to ensure that the on-line redo logs don't switch too often during periods of high activity and switch often enough during times of low processing workloads.
This should reduce the delays from the checkpoint not complete errors.
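To gauge how frequently the problem is occurring, a simple sketch is to count the messages in the alert log (the path in the usage comment is an assumption; adjust it for your environment):

```shell
# Count "Checkpoint not complete" messages in an alert log.
# Usage: count_ckpt_waits <alert-log-path>
count_ckpt_waits() {
    grep -c 'Checkpoint not complete' "$1"
}

# Example (path is illustrative -- adjust for your environment):
# count_ckpt_waits /u01/app/oracle/diag/rdbms/mydb/mydb/trace/alert_mydb.log
```

A steadily growing count during busy periods is a sign the online redo logs are undersized.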
RAC network protocol:
For the public network, each network adapter must support TCP/IP.
For the private network, the interface must support the user datagram protocol (UDP) using high-speed network adapters and switches that support TCP/IP (minimum requirement 1 Gigabit Ethernet).
Note:
UDP is the default interface protocol for Oracle RAC and Oracle Clusterware. You must use a switch for the interconnect. Oracle recommends that you use a dedicated switch.
Oracle does not support Token Ring or crossover cables for the interconnect.
How to find which asm disk maps to which linux partition :
- Issue following command to list all asm disks
# ls -ls /dev/oracleasm/disks/ASMDISK*
0 brw-rw---- 1 grid asmadmin 8, 17 Apr 12 18:43 /dev/oracleasm/disks/ASMDISK01
0 brw-rw---- 1 grid asmadmin 8, 18 Apr 12 18:43 /dev/oracleasm/disks/ASMDISK02
The two numbers after the group (asmadmin) are the device's major and minor numbers,
e.g. for ASMDISK01 major = 8, minor = 17.
- See the contents of /proc/partitions to find which partition is mapped to which major/minor:
# cat /proc/partitions
major minor #blocks name
8 0 83886080 sda
8 1 104391 sda1
8 2 5116702 sda2
8 3 5116702 sda3
8 4 1 sda4
8 5 73545538 sda5
8 16 93487104 sdb
8 17 4008186 sdb1
8 18 4008217 sdb2
8 19 4008217 sdb3
8 20 1 sdb4
…
8 29 4008186 sdb13
8 30 4008186 sdb14
8 31 4008186 sdb15
Here sdb1 is mapped to major 8 and minor 17.
Hence, partition /dev/sdb1 is mapped to ASMDISK01
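The lookup above can be scripted. A minimal sketch: given a major and minor number, find the matching entry in a /proc/partitions-style table (the function name is mine, not a standard tool):

```shell
# Print the partition name matching a given major/minor pair.
# Usage: part_for_majmin <major> <minor> [partitions-file]
# Defaults to reading the live /proc/partitions.
part_for_majmin() {
    awk -v maj="$1" -v min="$2" '$1 == maj && $2 == min { print $4 }' "${3:-/proc/partitions}"
}

# The major/minor of an ASM disk device can be read with GNU stat,
# which prints them in hexadecimal:
#   stat -c '%t %T' /dev/oracleasm/disks/ASMDISK01
```

So `part_for_majmin 8 17` on the system above would print sdb1.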
Dynamic Remastering:
When one instance departs the cluster, the GRD portion of that instance needs to be redistributed to the surviving nodes. Similarly, when a new instance enters the cluster, the GRD portions of the existing instances must be redistributed to create the GRD portion of the new instance. This is called dynamic resource reconfiguration, also known as dynamic remastering.
Finding the number of CPU cores:
Use the command cat /proc/cpuinfo, which outputs a chunk of data for each logical processor.
Each chunk starts with this info:
processor : 3
vendor_id : GenuineIntel
cpu family : 6
model : 60
model name : Intel(R) Core(TM) i5-4210M CPU @ 2.60GHz
Processors are numbered starting from 0, so if the last chunk says processor : 3, as in this case, your machine has 4 logical processors (with hyperthreading enabled this can be higher than the number of physical cores).
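Rather than reading the chunks by eye, the count can be taken directly (the function name is mine; it defaults to the live /proc/cpuinfo but accepts a file for testing):

```shell
# Count logical processors by counting "processor" entries in /proc/cpuinfo.
# Usage: cpu_count [cpuinfo-file]
cpu_count() {
    grep -c '^processor' "${1:-/proc/cpuinfo}"
}
```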
Current timestamp:
CURRENT_TIMESTAMP returns the current date and time in the session time zone, in a value of datatype TIMESTAMP WITH TIME ZONE.
SQL> SELECT SESSIONTIMEZONE, CURRENT_TIMESTAMP FROM DUAL;
SESSIONTIMEZONE
---------------------------------------------------------------------------
CURRENT_TIMESTAMP
---------------------------------------------------------------------------
+05:30
03-AUG-16 07.42.28.629657 PM +05:30
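The same idea can be seen outside the database: the shell's date command also renders the current time in a "session" time zone chosen via the TZ environment variable (the TZ value here is a POSIX offset string meaning UTC+05:30, picked only to mirror the example above):

```shell
# Render the current time in an explicitly chosen time zone, analogous to
# CURRENT_TIMESTAMP honouring SESSIONTIMEZONE.
# 'IST-5:30' is a POSIX TZ string for UTC+05:30.
TZ='IST-5:30' date '+%d-%b-%y %I.%M.%S %p %z'
```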
SQL> select * from v$active_instances;
INST_NUMBER INST_NAME
-------------------- ------------------------------------
1 rac1.oracl.com:gisdb1
What privilege has to be granted to scott so that he can query all tables in the database?
SQL> GRANT SELECT ANY TABLE TO scott;
Recreate oraInventory in oracle:
------------------------------------------------
Go to the Oracle Universal Installer (OUI) location for creating the Oracle Inventory:
$ORACLE_HOME/oui/bin
Note: when you are doing this for different homes, you must run the OUI from each respective home.
$ ./runInstaller -silent -ignoreSysPrereqs -attachHome ORACLE_HOME="/apps/oracle/product/10.2.0.2" ORACLE_HOME_NAME="orahome_102"
Starting Oracle Universal Installer...
How to find whether a dump file belongs to exp or expdp:
Use the strings command:
$ strings export_hr.DMP
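A rough, convention-based sketch (the EXPORT:V banner is a commonly observed trait of classic exp files, not a documented format guarantee): original exp dump files typically begin with an EXPORT:V<version> string, while Data Pump (expdp) files do not, so the first line of strings output often tells them apart:

```shell
# Heuristic check: classic exp dump files usually begin with an
# "EXPORT:V<version>" banner; Data Pump (expdp) files do not.
# This is a convention-based sketch, not an official format test.
dump_kind() {
    if strings "$1" | head -n 1 | grep -q '^EXPORT:V'; then
        echo "exp"
    else
        echo "probably expdp"
    fi
}

# Usage: dump_kind export_hr.DMP
```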
ASM disk Status:
Disks that were discovered but that have not yet been assigned to a disk group have a header status of either CANDIDATE or PROVISIONED.
CANDIDATE
Disk is not part of a disk group and may be added to a disk group with the ALTER DISKGROUP statement
PROVISIONED
Disk is not part of a disk group and may be added to a disk group with the ALTER DISKGROUP statement. The PROVISIONED header status is different from the CANDIDATE header status in that PROVISIONED implies that an additional platform-specific action has been taken by an administrator to make the disk available for Automatic Storage Management.
For example, on Windows, the administrator used asmtool or asmtoolg to stamp the disk with a header, or on Linux, the administrator used ASMLib to prepare the disk for ASM.
Clearing a Redo Log File:
------------------------------
A redo log file might become corrupted while the database is open, and ultimately stop database activity because archiving cannot continue. In this situation the ALTER DATABASE CLEAR LOGFILE statement can be used to reinitialize the file without shutting down the database.
The following statement clears the log files in redo log group number 3:
ALTER DATABASE CLEAR LOGFILE GROUP 3;
This statement overcomes two situations where dropping redo logs is not possible:
· There are only two log groups
· The corrupt redo log file belongs to the current group
If the corrupt redo log file has not been archived, use the UNARCHIVED keyword in the statement.
ALTER DATABASE CLEAR UNARCHIVED LOGFILE GROUP 3;
This statement clears the corrupted redo logs and avoids archiving them. The cleared redo logs are available for use even though they were not archived.
If you clear a log file that is needed for recovery of a backup, then you can no longer recover from that backup. The database writes a message in the alert log describing the backups from which you cannot recover.
Note:
If you clear an unarchived redo log file, you should make another backup of the database.
If you want to clear an unarchived redo log that is needed to bring an offline tablespace online, use the UNRECOVERABLE DATAFILE clause in the ALTER DATABASE CLEAR LOGFILE statement.
If you clear a redo log needed to bring an offline tablespace online, you will not be able to bring the tablespace online again. You will have to drop the tablespace or perform an incomplete recovery. Note that tablespaces taken offline with the NORMAL option do not require recovery.