Thursday, May 26, 2011

I am not a genius, but I am still 11 year Oracle DBA

Becoming an Oracle DBA is not some mythical thing. Excelling in this field, sustaining in this field, covering milestones is not something out of this world. You really don't have to be an Einstein to become a world class Database Administrator in Oracle technologies.

True lots of skills are needed to be a world class DBA like having sound understanding of architecture and features of various releases of Oracle databases, dexterity in advanced technologies like RAC, ASM, Data Guard, RMAN, RAT etc. Also needed is solid foundation knowledge of various operating systems like Linux, Solaris, other flavors of Unix, Window. Networking and storage information is also needed. Besides of these hard skills, soft skills like communication, documentation, risk management, priority handling, team playing, solo flight, crisis management are also very much needed.

But they can be acquired, mastered and polished and improved. Learning is a life long experience and excelling is personal. To acquire and master all those skills, you don't have to be genius or super-person. You should just persevere, go through it, do it and keep doing it and one day you wake up with feeling that at last you know a little bit about this DBA thing, and then suddenly you realize that you know nothing.

Remember, the more you know , the more you realize how little you know.

Wednesday, May 25, 2011

The Oracle ACE Program Newsletter | May 2011

They May 2011 edition if Oracle ACE Program newsletter is out and as usual it contains nicely summarized information about the activities performed by Oracle ACEs around the globe and the new entrants in this club.

You can subscribe to the Oracle newsletters here.

Lovely Oracle has been mentioned in this newsletter too, twice :)

Monday, May 23, 2011

awrgrpt.sql in Exadata

Oracle RAC is just a part of Oracle Exadata, and if you need to have AWR reports generated from all the instances, which is often the case if you are working Oracle support, then you can use the awrgrpt.sql script. You just have to supply the starting/ending snap ids and the location, and just from one db node you can get the AWR report from all the instances. You can also get report from just one instance by using awrgrpti.sql script.

This newer awrgrpt.sql script has replaced the older server performance script the spawrrac.sql.

Thursday, May 19, 2011

Exadata Patching Funfare

Last weekend was a patching galore. Implemented one cell patch and two one-off patches to two Exadata machines. These were applied on top of Bundle Pack 9 (BP9) patch and  more or less the patching was smooth, and it was good to have the world's top database appliance patched to the latest offered by the Oracle.

Will post very soon the tidbits about the cell and compute nodes patching.

Wednesday, May 18, 2011

Million Dollar Tip for Public Speaking

Whether you are teaching a course to a class, speaking in a conference, or giving a presentation, prepare well and just remember that you are there to share your knowledge and to learn from others. Don't go with the mindset that you know more than your audience, or with the notion that you know less than your audience. With the former, you would become over-confident and likely sound arrogant to your audience and they would hate it. With the later, you would sound inept and nobody would take you seriously.

Tuesday, May 17, 2011

Some ASM Space tidbits in Exadata

Following are some of the queries, I frequently use to see the space status at the ASM diskgroups at the Exadata database machine, especially for the FLASH RECOVERY AREA diskgroup RECO.

Yes, they work the same on non-Exadata system, but you know Exadata is the buzzword :)

SQL> select * from v$flash_recovery_area_usage;

FILE_TYPE            PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES
-------------------- ------------------ ------------------------- ---------------
CONTROL FILE                          0                         0               1
REDO LOG                          6                         0              3
ARCHIVED LOG                 3                         0              5
BACKUP PIECE                          0                         0               0
IMAGE COPY                            0                         0               0
FLASHBACK LOG                         0                         0               0
FOREIGN ARCHIVED LOG                  0                         0               0

7 rows selected.

SQL> select sum(PERCENT_SPACE_USED) from v$flash_recovery_area_usage;

SUM(PERCENT_SPACE_USED)
-----------------------
                  9

SQL> show parameter recovery

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_recovery_file_dest               string                            +RECO
db_recovery_file_dest_size       big integer                     2000000M
recovery_parallelism                  integer                          0



[oracle@EXA1:~]$ export ORACLE_SID= ASMSID
[oracle@EXA1:~]$ export ORACLE_HOME= GRID HOME
[oracle@EXA1:~]$ asmcmd


ASMCMD> lsdg
State    Type    Rebal  Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  NORMAL  N         512   4096  4194304   7688704  3596368           771700         1412334              0             N  RECO/

 select name,total_mb/1024/1024,free_mb/1024/1024,usable_file_mb/1024/1024 from v$asm_diskgroup;




NAME     TOTAL_MB/1024/1024      FREE_MB/1024/1024     USABLE_FILE_MB/1024/1024
--------               ------------------      -----------------             ------------------------
DATA                    135.2734375        18.87905121                      12.55378914
RECO                    18.09545898        13.42976379                      11.34690666
SYSTEMDG          11.9909668          11.9886322                        1.903817177

Tuesday, May 10, 2011

My Webinar at BrainSurface : Administration of Automatic Diagnostic Repository

Can you spare one hour or so on this Thursday 12th May at 11 am EDT? I would like to invite you to my webinar as part of WebaThon Series @BrainSurface.

You can register for free here.

Following are details of the Webinar:

Oracle has consolidated the diagnostic information of database under one umbrella called Automatic Diagnostic Repository (ADR). This repository is XML-based opening up new vistas for future Oracle diagnostic tools to manipulate the hierarchical data. In this session, learn the fundamental features of ADR, layout of files, server and client side tracing, retention policies, a peek at ADR layout in Exadata, adrci command line tool, and where you need to manually rotate the files.
Hosted by Tariq Farooq


Date & Time 05/13/11, 11:00 AM — 12:15 PM
Registrations Available Unlimited
Registration Deadline 05/13/22, 11:00 AM

Monday, May 9, 2011

Setting ADR Base in Exadata Cell Node

Automatic Diagnostic Repository is also present at the cell node of Exadata database machine. You have to set the ADR Base in V2, and then you can show and select the ADR homes to manage the diagnostic data.


[root@cell1-n0 ~]# adrci
adrci> set base /opt/oracle/cell/log
adrci> show homes
ADR Homes:
diag/asm/cell/cell1
adrci> set home diag/asm/cell/cell1
adrci> show tracefile
diag/asm/cell/cell1/incident/incdir_15/svtrc_i15.trc
diag/asm/cell/cell1/incident/incdir_14/svtrc_i14.trc
diag/asm/cell/cell1/incident/incdir_12/svtrc_i12.trc

Wednesday, May 4, 2011

Swappiness

Swappiness, as the name suggests, affects the swapping in the Linux systems. 

Its kind of slider value. 0 means keep pages in memory as long as possible. 100 means at maximum throttle and in-between just a frequency for swapping.

The minimum value for the swappiness is 0 and the maximum is 100. The default value is 60. On the Oracle Exadata database machine it is also set to default value of 60.

Decreasing the swappiness would keep more pages in memory instead of putting them in swap space, and so those pages would occupy the memory, and in case memory is needed than on run time those pages would be swapped out, causing the overhead, but the plus side is that if those pages are accessed again, they dont have to be swapped in again, as they would already be present in the memory.

If swappiness is increased then it would swap out more aggressively, and if pages are accessed which have been swapped out, the user's interactivity will get degradation first time.

This is all relative. 

For instance, If we are observing memory shortage for the prcoess and there is lot of swapping out (so in vmstat) then we might look at increasing swappiness to swapout more to make room for more memory.

If we are observing more and more swapping in (ver frequent si in vmstat) then we could look at decreasing swappiness.

In order to play with the swappiness, you can temporarily set it by:

echo 100 > /proc/sys/vm/swappiness
and you can permanently set it in /etc/sysctl.conf

Tuesday, May 3, 2011

ORA-29786: SIHA attribute GET failed with error

If you are querying v$asm_disk or v$asm_diskgroup and the query is hanging, and tracing shows that RBAL is waiting for the "GPnP Get Item" event, then more than likely you are hitting a bug.

This don't really hamper your ASM instance working, but its prudent to restart

the instance, though the blocking of RBAL also tend to disappear after a while.

MOS is your friend in this case.

Monday, May 2, 2011

ORA-00700 soft assert Error

After the dreaded ORA-00600 (database bug), Ora-7445 (OS Exception), or ORA-04031 (memory exhuastion) errors, we have "soft errors" in 11g, the ORA-00700.

According to the MOS note ID 737878.1, ORA-700 errors are referred to as soft asserts in 11g.  Basically these are internal errors that do not cause anything fatal to occur in the process, and so the process can continue.  Soft asserts are triggered when the caller wants to make a note of the fact that something unexpected has happened, but would like to continue on because the failure is not fatal to the process or the instance.  In such cases, the process just records the error, takes the corresponding dumps and moves on as though nothing happened.