Friday, April 28, 2017

Find process using swap in linux

$top 
then press OpEnter
This will list process by swap usage. Hit C to show the full process name 

Or use this script to find. Got this script from below link. Credits to original writer

http://northernmost.org/blog/find-out-what-is-using-your-swap/


#!/bin/bash 
# Get current swap usage for all running processes
# Erik Ljungstrom 27/05/2011
# Modified by Mikko Rantalainen 2012-08-09
# Pipe the output to "sort -nk3" to get sorted output
# Modified by Marc Methot 2014-09-18
# removed the need for sudo

SUM=0
OVERALL=0
for DIR in `find /proc/ -maxdepth 1 -type d -regex "^/proc/[0-9]+"`
do
    PID=`echo $DIR | cut -d / -f 3`
    PROGNAME=`ps -p $PID -o comm --no-headers`
    for SWAP in `grep VmSwap $DIR/status 2>/dev/null | awk '{ print $2 }'`
    do
        let SUM=$SUM+$SWAP
    done
    if (( $SUM > 0 )); then
        echo "PID=$PID swapped $SUM KB ($PROGNAME)"
    fi
    let OVERALL=$OVERALL+$SUM
    SUM=0
done
echo "Overall swap used: $OVERALL KB"

View hidden parameters in oracle database 11g

select name,value from v$parameter where substr(name,1,1)='_';

Thursday, April 27, 2017

Database resource manager oracle 11g

Using create_simple_plan procedure


This below query will automatically create resource manager plan, resource manager groups if they dont exist and assign directives to each group. First group will get level2 75% usage and second group will get 25% usage under contention.

Resource_manager_plan= EDW_PLAN
DEFAULT_CONSUMER_GROUP = 75%
GG_GROUP = 25%

begin
dbms_resource_manager.create_simple_plan(simple_plan=>'EDW_PLAN',
CONSUMER_GROUP1=>'DEFAULT_CONSUMER_GROUP',GROUP1_PERCENT=>75,
CONSUMER_GROUP2=>'GG_GROUP',GROUP2_PERCENT=>25);
END;
/

After this assign the users to their respective groups

BEGIN
    dbms_resource_manager_privs.grant_switch_consumer_group(
        grantee_name => 'SRAVAN',
        consumer_group => 'GG_GROUP',
        grant_option => TRUE
    );
END;
BEGIN
    dbms_resource_manager.set_initial_consumer_group(
        user => 'SRAVAN',
        consumer_group => 'GG_GROUP'
    );
END;


After this activate the resource manager plan

alter system set resource_manager_plan="EDW_PLAN";

You can verify and monitor DBRM using below queries

SELECT name, is_top_plan FROM v$rsrc_plan;

SELECT name, active_sessions, queue_length,consumed_cpu_time, cpu_waits, cpu_wait_time FROM v$rsrc_consumer_group;

SELECT se.sid sess_id, co.name consumer_group, se.state, se.consumed_cpu_time cpu_time, se.cpu_wait_time, se.queued_time FROM v$rsrc_session_info se, v$rsrc_consumer_group co WHERE se.current_consumer_group_id = co.id;

SELECT sequence# seq, name plan_name,to_char(start_time, 'DD-MON-YY HH24:MM') start_time,to_char(end_time, 'DD-MON-YY HH24:MM') end_time, window_name FROM v$rsrc_plan_history;

select sequence# seq, name, cpu_wait_time, cpu_waits,consumed_cpu_time from V$RSRC_CONS_GROUP_HISTORY;



*****************************************************************************

Normal procedure of creating resource manager plan

begin
  sys.dbms_resource_manager.clear_pending_area();
  sys.dbms_resource_manager.create_pending_area();
  sys.dbms_resource_manager.create_plan (
      plan                         => 'EDW_PLAN'
     ,mgmt_mth                     => 'EMPHASIS'
     ,active_sess_pool_mth         => 'ACTIVE_SESS_POOL_ABSOLUTE'
     ,parallel_degree_limit_mth    => 'PARALLEL_DEGREE_LIMIT_ABSOLUTE'
     ,queueing_mth                 => 'FIFO_TIMEOUT'
     ,comment                      => 'Allocate 75% to default consumer group and 25% to golden gate user under contention
'
     ,sub_plan                      => FALSE);
  sys.dbms_resource_manager.create_plan_directive (
      plan                         => 'EDW_PLAN'
     ,group_or_subplan             => 'OTHER_GROUPS'
     ,switch_estimate              => FALSE
     ,mgmt_p2                      => 75
     ,mgmt_p3                      => 100
     ,switch_for_call              => FALSE
     ,comment                      => 'OTHER_GROUPS Level 3'  );
  sys.dbms_resource_manager.create_plan_directive (
      plan                         => 'EDW_PLAN'
     ,group_or_subplan             => 'SYS_GROUP'
     ,switch_estimate              => FALSE
     ,mgmt_p1                      => 100
     ,switch_for_call              => FALSE
     ,comment                      => 'SYS Level 1'  );
  sys.dbms_resource_manager.create_consumer_group (
      consumer_group               => 'GG_GROUP'
     ,comment                      => 'Level 2 Group 2');
  sys.dbms_resource_manager.create_plan_directive (
      plan                         => 'EDW_PLAN'
     ,group_or_subplan             => 'GG_GROUP'
     ,switch_estimate              => FALSE
     ,mgmt_p2                      => 25
     ,switch_for_call              => FALSE
     ,comment                      => 'Level 2 Group 2'  );
  sys.dbms_resource_manager.submit_pending_area();
end;



Drop all the plans, plan directives, groups

begin
  sys.dbms_resource_manager.clear_pending_area();
  sys.dbms_resource_manager.create_pending_area();
  sys.dbms_resource_manager.delete_plan_cascade ('EDW_PLAN');
  sys.dbms_resource_manager.submit_pending_area();
end;
/


Note:

You can use toad or enterprise manager to to create, alter any resource plans in easier way if you feel the above code is complicated.

Please also make sure you modify default_maintenance_plan because this will be activated during automatic maintenance windows and if you did not include your allocations it will stick to default ones.

You can limit parallelism, CPU% and other parameters using DBRM


Switching user group

BEGIN
  SYS.DBMS_RESOURCE_MANAGER.clear_pending_area();
  SYS.DBMS_RESOURCE_MANAGER.create_pending_area();
  SYS.DBMS_RESOURCE_MANAGER_PRIVS.grant_switch_consumer_group(
    grantee_name   => 'SXXXX',
    consumer_group => 'BATCH_GROUP',
    grant_option   => FALSE);
  SYS.DBMS_RESOURCE_MANAGER.submit_pending_area();
END;
/
BEGIN
  SYS.DBMS_RESOURCE_MANAGER.set_initial_consumer_group(
    user           => 'SXXXX',
    consumer_group => 'BATCH_GROUP');
END;
/

Wednesday, April 26, 2017

find inserts and updates count in oracle database

SELECT sum( stat.executions_delta ) insert_executions
  FROM dba_hist_sqlstat stat
       JOIN dba_hist_sqltext txt ON (stat.sql_id = txt.sql_id )
       JOIN dba_hist_snapshot snap ON (stat.snap_id = snap.snap_id)
 WHERE snap.begin_interval_time BETWEEN TO_DATE('April 25, 2017, 11:00 A.M.','Month dd, YYYY, HH:MI A.M.') AND TO_DATE('April 25, 2017, 11:50 A.M.','Month dd, YYYY, HH:MI A.M.')
   AND txt.command_type = 2;

SELECT sum( stat.executions_delta ) update_executions
  FROM dba_hist_sqlstat stat
       JOIN dba_hist_sqltext txt ON (stat.sql_id = txt.sql_id )
       JOIN dba_hist_snapshot snap ON (stat.snap_id = snap.snap_id)
 WHERE snap.begin_interval_time BETWEEN TO_DATE('April 25, 2017, 11:00 A.M.','Month dd, YYYY, HH:MI A.M.') AND TO_DATE('April 25, 2017, 11:50 A.M.','Month dd, YYYY, HH:MI A.M.')
   AND txt.command_type = 6;

Tuesday, April 25, 2017

Using SAR in Linux

ls -al /var/log/sa | grep "Apr 22"




Load average at particular time and date

ls -al /var/log/sa | grep "Apr 22"

sar -q -f /var/log/sa/sa22 -s 10:00:01


Swap space used at particular time and date
sar -S -f /var/log/sa/sa22 -s 10:00:01

Memory usage
sar -r -f /var/log/sa/sa22 -s 10:00:01

IO usage
sar -b -f /var/log/sa/sa22 -s 10:00:01

Check swap pages in and out
sar -W -f /var/log/sa/sa22

Thursday, April 20, 2017

Changing agent timezone in oem 12c

/opt/app/oracle/agentHome/core/12.1.0.2.0/bin>echo $TZ
US/Pacific
/opt/app/oracle/agentHome/core/12.1.0.2.0/bin>./emctl stop agent
Oracle Enterprise Manager Cloud Control 12c Release 2
Copyright (c) 1996, 2012 Oracle Corporation.  All rights reserved.
Stopping agent ..... stopped.

/opt/app/oracle/agentHome/core/12.1.0.2.0/bin>./emctl resetTZ agent
Oracle Enterprise Manager Cloud Control 12c Release 2
Copyright (c) 1996, 2012 Oracle Corporation.  All rights reserved.
Updating /opt/app/oracle/agentHome/agent_inst/sysman/config/emd.properties...
Successfully updated /opt/app/oracle/agentHome/agent_inst/sysman/config/emd.properties.
Login as the em repository user and run the  script:
exec mgmt_target.set_agent_tzrgn('xxx:3872','US/Pacific')
and commit the changes
This can be done for example by logging into sqlplus and doing
SQL> exec mgmt_target.set_agent_tzrgn('xxx:3872','US/Pacific')
SQL> commit
/opt/app/oracle/agentHome/core/12.1.0.2.0/bin>./emctl start agent

Tuesday, April 18, 2017

ssh password less connectivity exadata

[celladmin]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/celladmin/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/celladmin/.ssh/id_rsa.
Your public key has been saved in /home/celladmin/.ssh/id_rsa.pub.
The key fingerprint is:
55:xx:yy:zz:44:21:d0:29:dc:d0:cf:06:ac:7b:14:c4 celladmin@cel01.xxx.local
The key's randomart image is:
+--[ RSA 2048]----+
|   .oXo.o.  o.   |
|    o Eo   o     |
|     o =. o      |
|    . ..++       |
|     o .S +      |
|    . .  o       |
|     .           |
|                 |
|                 |
+-----------------+

Once the key is generated verify in the folder to see the new id* files..

Now create a new file with entries of all the hostnames which you want to have ssh password less connectivity with

vi cell_group

cel01
cel02
cel03
cel04

After that run the below.. give the password of the user when prompted of the other cells

dcli -g ./cell_group -k
celladmin@cel01's password:
celladmin@cel02's password:
celladmin@cel04's password:
celladmin@cel03's password:
cel01: ssh key added
cel02: ssh key added
cel03: ssh key added
cel04: ssh key added


Repeat the same process in all the storage cells...

Thursday, April 13, 2017

NTP change on  Exadata db nodes

# grep ^server /etc/ntp.conf

server 10.120.0.10 prefer iburst burst minpoll 4 maxpoll 4
server 10.120.2.1 iburst burst minpoll 4 maxpoll 4
server 10.120.2.2 iburst burst minpoll 4 maxpoll 4


Modify the servers in /etc/ntp.conf

Now we can restart the ntp server.

# /etc/init.d/ntpd restart
Shutting down ntpd:                        [ OK ]
ntpd: Synchronizing with timer server:     [ OK ]
Starting ntpd:                             [ OK ]




To force update time


# ntpd -gq
-g requests an update irrespective of the time offset, and -q requests the daemon to quit after updating the date from the ntp server.




************************************************************************************************************************************************
NTP change in storage cells

server 10.120.0.10 prefer iburst burst minpoll 4 maxpoll 4
server 10.120.2.1 iburst burst minpoll 4 maxpoll 4
server 10.120.2.2 iburst burst minpoll 4 maxpoll 4



Log in as the root user.

####Check status

cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome

####Make grid disks inactive

cellcli -e alter griddisk all inactive

####Verify if offline

cellcli -e list griddisk

cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome

####Shut down services

cellcli -e alter cell shutdown services all

service ocrvottargetd stop

####Change NTP settings as above

/usr/local/bin/ipconf


####Verify NTP settings

grep ^server /etc/ntp.conf

####Startup services

service ocrvottargetd start

cellcli -e alter cell startup services all

####Make grid disks online

cellcli -e alter griddisk all active

####Verify grid disks are online

cellcli -e list griddisk

cellcli -e list griddisk attributes name, asmmodestatus


After this if time is still wrong force update it

service ntpd stop
ntpd -gq
service ntpd start


Once everything is synced and online repeast same steps on other cells..



******************************************************************************

[root@xxx ~]# cellcli -e alter cell shutdown services all

Stopping the RS, CELLSRV, and MS services...
The SHUTDOWN of services was successful.
[root@xxx ~]# service ocrvottargetd stop
ocrvottargetd: unrecognized service
[root@xxx ~]# /usr/local/bin/ipconf
[Info]: ipconf command line: /opt/oracle.cellos/ipconf.pl -nocodes
Logging started to /var/log/cellos/ipconf.log
Interface ib0   is                      Linked.    hca: mlx4_0
Interface ib1   is                      Linked.    hca: mlx4_0
Interface eth0  is                      Linked.    driver/mac: ixgbe/00:10:e0:b4:18:ac
Interface eth1  is .................... Unlinked.  driver/mac: ixgbe/00:10:e0:b4:18:ad
Interface eth2  is .................... Unlinked.  driver/mac: ixgbe/00:10:e0:b4:18:ae
Interface eth3  is .................... Unlinked.  driver/mac: ixgbe/00:10:e0:b4:18:af

Network interfaces
Name  State
ib0   Linked
ib1   Linked
eth0  Linked
eth1  Unlinked
eth2  Unlinked
eth3  Unlinked
[Warning]: Some network interface(s) are disconnected. Check cables and swicthes and retry
Do you want to retry (y/n) [y]: n

The current nameserver(s): 10.120.11.6 10.120.11.7
Do you want to change it (y/n) [n]:
The current timezone: America/Los_Angeles
Do you want to change it (y/n) [n]:
The current NTP server(s): 10.120.0.10 216.239.32.15
Do you want to change it (y/n) [n]: y
Fully qualified hostname or ip address for NTP server. Press enter if none: 10.120.0.10
Continue adding more ntp servers (y/n) [n]: y
Fully qualified hostname or ip address for NTP server. Press enter if none: 10.120.2.1
Continue adding more ntp servers (y/n) [n]: y
Fully qualified hostname or ip address for NTP server. Press enter if none: 10.120.2.2
Continue adding more ntp servers (y/n) [n]:

Network interfaces
Name  State    Status  IP address    Netmask       Gateway     Net type   Hostname
ib0   Linked   UP      192.168.10.26 255.255.252.0             Private    xxx-priv1.hke.local
ib1   Linked   UP      192.168.10.27 255.255.252.0             Private    xxx-priv2.hke.local
eth0  Linked   UP      10.120.16.203 255.255.255.0 10.120.16.1 Management xxx.hke.local
eth1  Unlinked UNCONF
eth2  Unlinked UNCONF
eth3  Unlinked UNCONF
Select interface name to configure or press Enter to continue:

Select canonical hostname from the list below
1: xxx-priv1.hke.local
2: xxx-priv2.hke.local
3: xxx.hke.local
Canonical fully qualified domain name [3]:

Select default gateway interface from the list below
1: eth0
Default gateway interface [1]:

Canonical hostname: xxx.hke.local
Nameservers: 10.120.11.6 10.120.11.7
Timezone: America/Los_Angeles
NTP servers: 10.120.0.10 10.120.2.1 10.120.2.2
Default gateway device: eth0
Network interfaces
Name  State    Status  IP address    Netmask       Gateway     Net type   Hostname
ib0   Linked   UP      192.168.10.26 255.255.252.0             Private    xxx-priv1.hke.local
ib1   Linked   UP      192.168.10.27 255.255.252.0             Private    xxx-priv2.hke.local
eth0  Linked   UP      10.120.16.203 255.255.255.0 10.120.16.1 Management xxx.hke.local
eth1  Unlinked UNCONF
eth2  Unlinked UNCONF
eth3  Unlinked UNCONF
Is this correct (y/n) [y]:

Do you want to configure basic ILOM settings (y/n) [y]:
Loading basic configuration settings from ILOM ...
ILOM Fully qualified hostname [xxx-ilom.hke.local]:
Inet protocol (IPv4,IPv6) [IPv4]:
ILOM IP address [10.120.16.11]:
ILOM Netmask [255.255.255.0]:
ILOM Gateway or none [10.120.16.1]:
ILOM Nameserver (multiple IPs separated by a comma) or none [10.120.11.6]:
ILOM Use NTP Servers (enabled/disabled) [enabled]:
ILOM First NTP server. Fully qualified hostname or ip address or none [10.120.0.10]:
ILOM Second NTP server. Fully qualified hostname or ip address or none [216.239.32.15]: 10.120.2.1
ILOM Vlan id or zero for non-tagged VLAN (0-4079) [0]:

Basic ILOM configuration settings:
Hostname             : xxx-ilom.hke.local
IP Address           : 10.120.16.11
Netmask              : 255.255.255.0
Gateway              : 10.120.16.1
DNS servers          : 10.120.11.6
Use NTP servers      : enabled
First NTP server     : 10.120.0.10
Second NTP server    : 10.120.2.1
Timezone (read-only) : America/Los_Angeles
VLAN id              : 0
Is this correct (y/n) [y]:

[Info]: Run /opt/oracle.cellos/validations/init.d/saveconfig
[Info]: Stopping cellwall service ...
cellwall: Flushing firewall rules:                         [  OK  ]
[Info]: cellwall service stopped
[Info]: Restart ntpd service
Shutting down ntpd:                                        [  OK  ]
Starting ntpd:                                             [  OK  ]
[Info]: Starting cellwall service ...
cellwall: Checking network interfaces:                     [  OK  ]
cellwall: Checking ILOM interface:                         [  OK  ]
cellwall: Flushing firewall rules:                         [  OK  ]
cellwall: Applying firewall rules:                         [  OK  ]
[Info]: cellwall service started
[Info]: Save /etc/sysctl.conf in /etc/sysctl.conf.backupbyExadata
[Info]: Adjust settings for IB interfaces in /etc/sysctl.conf
Re-login using new IP address 10.120.16.203 if you were disconnected after following commands
ip addr show eth0
sleep 4

[Warning]: You modified NTP server.
         Ensure you also update the Infiniband Switch NTP server
         if the same NTP server was also used by the Infiniband switch.







*****************************************************************************************************
NTP change in Infiniband switches


Login to infiniband using ilom-admin user default password is welcome1


check existing ntp configuration

show -d properties /SP/clients/ntp/server/1
show -d properties /SP/clients/ntp/server/2


check ntp status

show -d properties /SP/clock

To check current time

show /SP/clock datetime


Change ntp server address

set /SP/clients/ntp/server/1 address=10.120.0.10

set /SP/clients/ntp/server/2 address=10.120.2.1


*****************************************************************************************************

On solaris

ntpq -p

this lists ntp servers

cd /etc/inet

vi ntp.conf

edit the servers to new servers

Example:

server 10.120.2.1 prefer
server 10.120.2.2


After this save the file.

Now disable and enable ntpd

svcadm disable ntp

svcadm enable ntp


verify time servers

ntpq -p

date



Tuesday, April 11, 2017

Metric collection error on storage cell Exadata




Run below on all db nodes from agent Home

emctl control agent runCollection xxxcel01.hxx.local:oracle_exadata Response
emctl control agent runCollection xxxcel02.hxx.local:oracle_exadata Response
emctl control agent runCollection xxxcel03.hxx.local:oracle_exadata Response
emctl control agent runCollection xxxcel04.hxx.local:oracle_exadata Response

Failed to execute_exadata_response.pl



Set up SSH connectivity by following Enterprise Manager Exadata Management Getting Started Guide, Chapter 8 Troubleshooting, Section 8.1 Establish SSH Connectivity.



Basically go to /home/oracle/.ssh on db node where agent home exists

see if id_dsa and id_dsa.pub exists

if not

ssh-keygen -t dsa -f id_dsa


The above commands will create those pub files

cat id_dsa.pub

copy the text for that particular node

for example

ssh-dss AAAAB3NzaC1kc3MAAACBAKoOgGt4Ewz9rbgDz6/ChbLJ+yMEmezgLvr/bbOp+UztDtpZiIP5NTKLNyn1kFdZpQCxxssdewdsdsjFyX+Qg6vWaMqnnqtvIgXMMHnv7omiMmg735PsBQYGP8xXRn8eUMv/2+gdInRiA9kysyZVJtMma7IFisZV4oTVSr8qwww0kafBAAAAFQCfDGhPXVVtZ/UEHZdhzgd9g5iJKQAAAIAZ0ZKbgXAzk4w1I5KsUEvhOiDYPh6idFp+J6Fe2lLYHmTCwkAPejSXDh3JWNEXuVU2zq58GaAJ+ta8Kb+ous819sZgRsgHNrcZ6n3aq+XrVgm3wpjdAp5R6dQ5CyDnuOIeNUFYKcQLSHlJ+LDV/UBoi0n6/UXx4+5RdBAe4y/ZWQAsssIB+pQocF+nsQjschcZThyJNNrOqM1Sj24+G/rAe/91jn/crgE3pRanvkWW/YkPE9Kpbqw5U84gH76Hv8jGwHwu55Q6iCsPdwC7iSxA0W7mU9YWuD+YIyBZ38qIarMbno56sGY4yfFRQNVgq/G0yL2h/K4qz3b/u6ihc5EZKAriV3w== oracle@xxx.hxx.local

Open the following location as root on the storage cell server

/home/cellmonitor/.ssh

see if authorized_keys file is present.

vi authorized_keys

Paste the above key from all the db servers and save it.

Once its done wait for some time and metric error will be gone on storage cells








Exadata Flash card or Damaged hard drive replacement steps

Exadata Hard disk or Flash card replacement

To turn off storage cell

first we need to take grid disk offline

Login as celladmin or root and check the status of grid disks using

cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome

It should show Online and yes

If it shows dropped then it means there is a bad drive

Replace the drive first

Let it rebalance

To check rebalance status

select INST_ID, OPERATION, STATE, POWER, SOFAR, EST_WORK, EST_RATE, EST_MINUTES from GV$ASM_OPERATION;


Once rebalance complete

run verify the status again

cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome

If everything is online and yes

Then take grid disks offline

cellcli -e alter griddisk all inactive

cellcli -e list griddisk


Now you should see offline

Now you can reboot the server


Note: See how long your maintenance takes place

If it is more than 3.6 hours please make sure you change the settings on asm diskgroup so that it will just sync back everything once it comes online

To check the configured time

select dg.name,a.value from v$asm_diskgroup
dg, v$asm_attribute a where dg.group_number=a.group_number and
a.name='disk_repair_time';


To change the time.. Do this to all the diskgroups

ALTER DISKGROUP DATA SET ATTRIBUTE 'DISK_REPAIR_TIME'='8.5H';



After maintenance

Make all grid disks active

cellcli -e alter griddisk all active

check status

cellcli -e list griddisk attributes name, asmmodestatus


You might see some of the disks are syncing.. Give it some time and re check it should be done fairly soon..


There will not be db downtime if you have asm redundancy set to normal or high..

Monday, April 3, 2017

Linux commands

Add user celladmin with su access

add below using visudo

visudo

celladmin  ALL=(ALL)       NOPASSWD: ALL


Now to  test

login as celladmin user

type

sudo -s


To set password not expire

password unexpire

chage -l celladmin

chage -I -1 -m 0 -M 99999 -E -1 celladmin


add user to sudo with no password

oracle ALL=(root)  NOPASSWD:  /bin/su - 

Featured Post

Apply Patch 22191577 latest GI PSU to RAC and DB homes using Opatch auto or manual steps

Patch 22191577: GRID INFRASTRUCTURE PATCH SET UPDATE 11.2.0.4.160119 (JAN2016) Unzip the patch 22191577 Unzip latest Opatch Version in or...