Thursday, December 24, 2015

Why did kinit throw the error "Preauthentication failed while getting initial credentials"?

Problem: 
While executing the following command:

#kinit -k -t /root/utilscripts/nsupdateuser.keytab nsupdate@example.com

it threw the error:

kinit: Preauthentication failed while getting initial credentials

Solution: The user's password may be wrong (or the keytab may be stale). Reset the password and test again.
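
A quick way to narrow down whether the password or the keytab is at fault (same principal and keytab as in the command above):

#kinit nsupdate@example.com                                 # password-based kinit; if this also fails, the account or password itself is the problem
#klist -ekt /root/utilscripts/nsupdateuser.keytab           # check the KVNO and encryption types stored in the keytab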

Why did kinit throw the error "KDC reply did not match expectations while getting initial credentials"?

Problem: 

While executing the following command:

#kinit username@MYDOMAIN.COM -k -t username.keytab

it threw the error:

kinit: KDC reply did not match expectations while getting initial credentials

Solution : 

The user does not have remote access to the machine. 

How to manage a Kerberos keytab file?

1. Use klist to display the keytab file entries:

klist -e -k -t  mykeytabfile.keytab
or klist -ekt nsupdateuser.ktab
or use ktutil:

#ktutil     # execute this command
ktutil:     # this prompt will appear
ktutil: read_kt /etc/apache2/http.keytab   # read the keytab file
ktutil: list                               # list all principals

example:

[root@customer-prod-util-101 utilscripts]# klist -e -k -t  nsupdateuser.ktab
Keytab name: FILE:nsupdateuser.ktab
KVNO Timestamp         Principal
---- ----------------- --------------------------------------------------------
   1 09/08/15 21:50:45 nsupdate@site1.example.com (aes256-cts-hmac-sha1-96)
[root@customer-prod-util-101 utilscripts]#


2. Following is an example of the keytab file creation process using MIT Kerberos ktutil:

  > ktutil
  ktutil:  addent -password -p username@example.com -k 1 -e rc4-hmac
  Password for username@example.com: [enter your password]
  ktutil:  addent -password -p username@example.com -k 1 -e aes256-cts
  Password for username@example.com: [enter your password]
  ktutil:  wkt username.keytab
  ktutil:  quit

Following is an example using Heimdal Kerberos:

> ktutil -k username.keytab add -p username@example.com -e arcfour-hmac-md5 -V 1


3. Obtain a ticket-granting ticket using the keytab for testing:

You can check that the keytab contains the appropriate encryption key by attempting to use it to obtain a ticket-granting ticket. This can be done using the kinit command:

#kinit -k -t /etc/nsupdateuser.keytab nsupdate@example.com    # here nsupdate is a username existing in AD; it has privileges to update DNS records on Windows DNS
#klist                # shows whether the ticket was created

example :

[root@customer-prod-util-101 ~]#  kinit -k -t /root/utilscripts/nsupdateuser.ktab nsupdate   # uses default domain
[root@customer-prod-util-101 ~]# klist
Ticket cache: FILE:/tmp/krb5cc_0
Default principal: nsupdate@site1.example.com

Valid starting     Expires            Service principal
12/24/15 05:29:49  12/24/15 15:29:49  krbtgt/site1.example.com@site1.example.com
        renew until 12/31/15 05:29:49
[root@customer-prod-util-101 ~]#


   Or log in to test whether the keytab file works:

Test without the keytab file:

#kinit username@MYDOMAIN.COM
password>     // pass password of username

Test with keytab file:

#kinit username@MYDOMAIN.COM -k -t username.keytab


4. Using a keytab to authenticate scripts:

To execute a script so it has valid Kerberos credentials, use:

  > kinit username@example.com -k -t mykeytab; myscript
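
For example, the same pattern can be put into a cron entry so the script always runs with fresh credentials (the keytab path and the script name myscript below are illustrative):

  # run myscript hourly with a ticket obtained from the keytab
  0 * * * * /usr/bin/kinit username@example.com -k -t /root/mykeytab && /usr/local/bin/myscript

Using && instead of ; means the script only runs if kinit succeeds.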

List the principals in a keytab:

>ktutil
ktutil:  rkt nsupdateuser.ktab
ktutil:  list

5. Merging keytab files:

> ktutil
  ktutil: read_kt mykeytab-1
  ktutil: read_kt mykeytab-2
  ktutil: read_kt mykeytab-3
  ktutil: write_kt krb5.keytab
  ktutil: quit


6. Delete a principal:

#ktutil
ktutil: rkt mykeytab
ktutil: list
ktutil: delete_entry slot-number
ktutil: wkt mykeytab
ktutil: quit


7. Destroy cached ticket:

kdestroy -A                   // all caches will be destroyed
kdestroy -c <cache-name>      // only the specified cache will be deleted

#kdestroy -c "FILE:/tmp/krb5cc_0"

Monday, September 28, 2015

Why did the df command throw a disk "Input/output error"?

Issue/Symptom  : While the DBA was starting an Oracle instance, it was failing. On checking the file systems, it was found that the arch volumes were not mounted. When we tried to remount them and checked with "df -h", it threw the error below:
[root@customer-pet-oracle-3d ~]#  df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/cciss/c0d0p3      58G   48G  7.3G  87% /
/dev/cciss/c0d0p1     494M   18M  452M   4% /boot
tmpfs                  63G  232M   63G   1% /dev/shm
tmpfs                 4.0K     0  4.0K   0% /dev/vx
df: `/cinprds1_arch00': Input/output error
df: `/cinprd1_arch00': Input/output error
df: `/customerdcdp1_arch00': Input/output error
df: `/customerrptp1_arch00': Input/output error
example-prod-sea1utilnas-1a-pet:/vol/customerpet_data
                      450G  335G  116G  75% /filers/example-prod-sea1utilnas-1a-pet/customerpet_data
[root@customer-pet-oracle-3d ~]
OS Environment : RHEL 5.5
Software/Application :
DB : oracle 11.2.0.4
VxVm : VRTSvxvm-5.1.100.000-SP1_RHEL5, Symantec License Manager vxlicrep utility version 3.02.51.010
vxfs : VRTSvxfs-5.1.100.000-SP1_GA_RHEL5
Customer Environment : ATT PET Oracle DB
Investigation :
1.
$sanlun lun show|grep -i minipet_arc

customer-pet-sea1bfiler-1a:  /vol/customer_MINIPET_ARCH/lun1                /dev/sdag        host1    FCP        500.1g (536952700928)   GOOD
customer-pet-sea1bfiler-1a:  /vol/customer_MINIPET_ARCH/lun0                /dev/sdah        host1    FCP        500.1g (536952700928)   GOOD
customer-pet-sea1bfiler-1a:  /vol/customer_MINIPET_ARCH/lun2                /dev/sdai        host1    FCP          250g (268435456000)   GOOD
customer-pet-sea1bfiler-1a:  /vol/customer_MINIPET_ARCH/lun3                /dev/sdaj        host1    FCP          250g (268435456000)   GOOD
2. Check the fstab entry:
fstab contained:
/dev/vx/dsk/minipet_arch_dg/cinprds_minipet_vol_arch00 /cinprds1_arch00 vxfs    _netdev 0 1
3. Check whether the netfs service is running:
$/etc/init.d/netfs status
4. Search for the disk group in the logs as root:
$ awk '/arch_dg/ {print $0}' /var/log/messages.*

Sep 25 19:26:08 customer-pet-oracle-3d kernel: vxfs: msgcnt 1 mesg 037: V-2-37: vx_metaioerr - vx_inode_iodone - /dev/vx/dsk/minipet_arch_dg/rpt_minipet_vol_arch00 file system meta data write error in dev/block 0/1104
Sep 25 19:26:08 customer-pet-oracle-3d vxvm:vxconfigd: V-5-1-7935 Disk group minipet_arch_dg: update failed: Disk group has no valid configuration copies
Sep 25 19:26:08 customer-pet-oracle-3d vxvm:vxconfigd: V-5-1-7934 Disk group minipet_arch_dg: Disabled by errors
[...]
Sep 25 19:30:01 customer-pet-oracle-3d kernel: VxVM vxio V-5-3-1285 voldmp_errbuf_sio_start: Failed to flush the error buffer ffff811130c6aa00 on device 0xc900130 to DMP<4>vxfs: msgcnt 5 mesg 039: V-2-39: vx_writesuper - /dev/vx/dsk/minipet_arch_dg/cinprds_minipet_vol_arch00 file system super-block write error
Sep 25 19:30:01 customer-pet-oracle-3d kernel: vxfs: msgcnt 6 mesg 037: V-2-37: vx_metaioerr - vx_dirbread - /dev/vx/dsk/minipet_arch_dg/cinprds_minipet_vol_arch00 file system meta data write error in dev/block 0/1104
[...]
Sep 25 19:40:01 customer-pet-oracle-3d kernel: vxfs: msgcnt 21 mesg 039: V-2-39: vx_writesuper - /dev/vx/dsk/minipet_arch_dg/rpt_minipet_vol_arch00 file system super-block write error
Sep 25 19:40:01 customer-pet-oracle-3d kernel: vxfs: msgcnt 22 mesg 008: V-2-8: vx_direrr: vx_readdir_int_1 - /dev/vx/dsk/minipet_arch_dg/rpt_minipet_vol_arch00 file system dir inode 5 dev/block 0/150297879 dirent inode 0 error 5
Sep 25 19:40:01 customer-pet-oracle-3d kernel: vxfs: msgcnt 23 mesg 039: V-2-39: vx_writesuper - /dev/vx/dsk/minipet_arch_dg/rpt_minipet_vol_arch00 file system super-block write error
[...]
Sep 26 04:18:13 customer-pet-oracle-3d kernel: vxfs: msgcnt 334 mesg 016: V-2-16: vx_ilisterr: vx_iread - /dev/vx/dsk/minipet_arch_dg/rpt_minipet_vol_arch00 file system error reading inode 3
Sep 26 04:18:13 customer-pet-oracle-3d kernel: vxfs: msgcnt 335 mesg 039: V-2-39: vx_writesuper - /dev/vx/dsk/minipet_arch_dg/rpt_minipet_vol_arch00 file system super-block write error
Sep 26 04:18:13 customer-pet-oracle-3d kernel: vxfs: msgcnt 336 mesg 031: V-2-31: vx_disable - /dev/vx/dsk/minipet_arch_dg/rpt_minipet_vol_arch00 file system disabled
The above confirms that some blocks are corrupted on disks under the disk group "minipet_arch_dg".

Permanent Solution :
1. Unmount the file system if it is mounted.
2. Run a file system check through fsck:
$fsck -F vxfs -o full <raw-device>
example :
$/opt/VRTS/bin/fsck -o full -y /dev/vx/rdsk/minipet_arch_dg/rpt_minipet_vol_arch00

or 

$fsck.vxfs -o full /dev/vx/dsk/minipet_arch_dg/rpt_minipet_vol_arch00
3. Or reboot the system [make sure fsck is enabled in fstab].
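
Putting the steps together for one of the affected volumes (device and mount point taken from the fstab entry and log excerpts above; a sketch, assuming the underlying disks are reachable again):

$ umount /cinprds1_arch00
$ /opt/VRTS/bin/fsck -o full -y /dev/vx/rdsk/minipet_arch_dg/cinprds_minipet_vol_arch00
$ mount /cinprds1_arch00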

Root Cause Analysis :

Error messages in the system log confirm that disk blocks are corrupted. vxiod was failing to write data, and vxconfigd informed the kernel that it was unable to update the disk group configuration.

Tuesday, August 11, 2015

Why does likewise throw an error like "Problem executing /opt/pbis/bin/ad-cache --delete-all >/dev/null 2>/dev/null"?

OS Environment :  Linux
Application : pbis aka likewise, version 8.3
Problem: While executing the following command, it threw an error:

/opt/likewise/bin/lwconfig --file /opt/pbis/bin/lwconfig.txt
Problem executing '/opt/pbis/bin/ad-cache --delete-all >/dev/null 2>/dev/null'

Error: Error returned by external program

Solution : ad-cache is linked to the lsa binary, and that binary has a bug. Copy the lsa binary from a machine where pbis 8.0 is installed to the affected machine. 

Wednesday, January 28, 2015

How to install the python boto module on Windows?

OS Environment : Windows 2007, 64 bit
Application : Python boto module
Implementation Steps : 

1. First install Python 3.
2. Then execute the following steps:

From windows command prompt :

$cd c:/
c:\>cd Python34/Scripts
c:\Python34\Scripts>pip.exe install -U boto

Output will look like below :

Downloading/unpacking boto
Installing collected packages: boto
Successfully installed boto
Cleaning up...
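
To confirm the module is importable afterwards (a quick check; prints the installed boto version):

c:\Python34>python.exe -c "import boto; print(boto.__version__)"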

Saturday, January 24, 2015

How to configure Postfix as an SMTP gateway?

■ Requirement: Configure Postfix as an SMTP gateway server
OS Environment : Linux [RHEL 5, RHEL 6]
Application: postfix
■ Assumption : 

  •       Domain name= example.com, 
  •       Internal Mail server IP = 192.168.1.3, 
  •       Gateway mail server IP = 192.168.1.2, 
  •       Internal postfix smtp is pre-configured. 

Implementation Steps :

A. DMZ Mail Server Setup (or gateway mail server):  The DMZ mail server forwards inbound mail to the internal mail server and delivers outbound mail to the internet.

1. Edit /etc/postfix/main.cf and update the lines below.

mydestination =
local_recipient_maps =
local_transport = error:local mail delivery is disabled

mynetworks = 127.0.0.0/8 192.168.1.3
relay_domains = example.com
transport_maps = hash:/etc/postfix/transport
smtpd_recipient_restrictions = permit_mynetworks
reject_unauth_destination


2. Edit the file /etc/postfix/transport and add the line below.

example.com :[192.168.1.3]

NOTE : If you would like to use multiple internal servers for multiple users/domains, then the Postfix transport map should be changed accordingly.

e.g. in main.cf

transport_maps = hash:/etc/postfix/transport

in /etc/postfix/transport:

user1/domain1 smtp:1-mailserver.example.com
user2/domain2 smtp:2-mailserver.example.com


3. Execute the following commands to apply the above:

$ postmap /etc/postfix/transport 
$ postfix reload
NOTE : That will deliver email for user1 and user2 to [1,2]-mailserver.example.com. If you don't specify anything in the transport map, the default transport (which is usually local delivery) will be used.
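
To verify that a key resolves to the expected transport after running postmap (example.com is the domain from the assumptions above), you can query the map directly; it should print the nexthop configured for that key:

$ postmap -q example.com hash:/etc/postfix/transport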

B. Configure Internal Mail Server :

The internal mail server holds the mailboxes and forwards all outbound mail to the DMZ mail server.

1. Edit /etc/postfix/main.cf and update the lines below : 

transport_maps = hash:/etc/postfix/transport

2. Edit file /etc/postfix/transport and add the lines below :

example.com :
.example.com :
* smtp:[192.168.1.2]


3. Create a transport database file :

$ postmap /etc/postfix/transport

4. Restart Postfix: 

$ service postfix restart

Friday, January 23, 2015

How to install ruby on a linux server?

■ Requirement : Install ruby on linux system
■ OS Environment : Linux(RHEL, Centos)
■ Implementation Steps : 

$ cd /usr/local/src
# download the latest ruby tarball (ruby-XXX.tar.gz) into this directory
$ tar xvzf ruby-XXX.tar.gz
$ cd ruby-XXX
$ ./configure
$ make
$ make install
$ ruby rubytest.rb

How to install FFmpeg, FFmpeg-PHP, MPlayer, MEncoder, flvtool2, LAME, and the MP3 encoder on a linux server?

■ Requirement : Install FFmpeg, FFmpeg-PHP, MPlayer, MEncoder, flvtool2, LAME, and the MP3 encoder
■ OS Environment : Linux, RHEL 5, 64 bit

■ Implementation Steps :

1. Login into server and get root access.
2. cd /usr/local/src/
3. Download the following source files from the appropriate vendor sites:

essential-20061022.tar.bz2
flvtool2_1.0.5_rc6.tgz
lame-3.97.tar.gz
ffmpeg-php-0.5.1.tbz2
libogg-1.1.3.tar.gz
libvorbis-1.1.2.tar.gz
MPlayer-1.0rc2.tar.bz2
ffmpeg-0.5.tar.bz2

4. Extract the above packages (some are gzip and some are bzip2 compressed, so let tar detect the compression):

$ for pkg in  lame-3.97.tar.gz libogg-1.1.3.tar.gz libvorbis-1.1.2.tar.gz flvtool2_1.0.5_rc6.tgz essential-20061022.tar.bz2 ffmpeg-php-0.5.1.tbz2 MPlayer-1.0rc2.tar.bz2 ffmpeg-0.5.tar.bz2; do tar xvf $pkg; done

5. Create a codecs directory :

$ mkdir /usr/local/lib/codecs/

6. Install dependent libraries :

$ yum install gcc gmake make libcpp libgcc libstdc++ gcc4 gcc4-c++ gcc4-gfortran subversion ruby ncurses-devel -y

7. Copy the essential codecs to the proper location:

$ cd /usr/local/src/
$ mv /usr/local/src/essential-20061022/* /usr/local/lib/codecs/
$ chmod -R 755 /usr/local/lib/codecs/

8. Install LAME :

$ cd /usr/local/src/lame-3.97
$ ./configure
$ make 
$ make install

9. Install LIBOGG:

$ cd /usr/local/src/
$ cd /usr/local/src/libogg-1.1.3
$ ./configure --enable-shared ; make ; make install
$ PKG_CONFIG_PATH=/usr/local/lib/pkgconfig
$ export PKG_CONFIG_PATH

Put the above PKG_CONFIG_PATH export in the root user's .bashrc file.

10. Install LIBVORBIS:

$ cd /usr/local/src/
$ cd /usr/local/src/libvorbis-1.1.2
$ ./configure; make ; make install

11. Install FLVTOOL2

$ cd /usr/local/src/
$ cd /usr/local/src/flvtool2_1.0.5_rc6/
$ ruby setup.rb config
$ ruby setup.rb setup
$ ruby setup.rb install

12. Install MPLAYER

$cd /usr/local/src/
$ cd /usr/local/src/MPlayer-1.0rc2
$ ./configure; make; make install

13. Install FFMPEG:

$ cd /usr/local/src/
$ cd /usr/local/src/ffmpeg-0.5
$ ./configure --enable-libmp3lame --enable-libvorbis --disable-mmx --enable-shared
$ make
$ make install

$ export LD_LIBRARY_PATH=/usr/local/lib/

$ ln -s /usr/local/lib/libavformat.so.50 /usr/lib/libavformat.so.50
$ ln -s /usr/local/lib/libavcodec.so.51 /usr/lib/libavcodec.so.51
$ ln -s /usr/local/lib/libavutil.so.49 /usr/lib/libavutil.so.49
$ ln -s /usr/local/lib/libmp3lame.so.0 /usr/lib/libmp3lame.so.0
$ ln -s /usr/local/lib/libavformat.so.51 /usr/lib/libavformat.so.51
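
Alternatively, instead of creating individual symlinks, the library directory can be registered with the dynamic linker (a common alternative on RHEL; run as root):

$ echo '/usr/local/lib' >> /etc/ld.so.conf
$ ldconfig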

14. Install FFMPEG-PHP:

$ cd /usr/local/src/
$ cd /usr/local/src/ffmpeg-php-0.5.1/
$ phpize
$ ./configure
$ make
$ make install

Enable the ffmpeg module by adding the line below to php.ini:

extension=ffmpeg.so

15. Now check that the binaries are available:

$ for bin in lame flvtool2 mplayer ffmpeg; do which $bin; done




Why did "ls -al" command take much time to return output?

Issue/Symptom  : "ls -al" command took much time than expected.

time ls -la /data/XXX/mboxes/tier1/01/XXX/15/67/178/XXX/ 
... (just removing the output of the ls -la command itself)
real 1m26.674s 
user 0m0.285s 
sys 0m1.627s



OS Environment : RHEL 6, storage netapp filer ONTAP 8 
Software/Application : storage - FAS3240 over NFSv3, ls command[coreutils-8.4-16.el6.x86_64]
Investigation : Directory "XXX" contains around 22481 files.
Workaround Solution : This time consumption is expected in this case. It may not be the same in other environments. Please see the RCA section.

Root Cause Analysis :


The above time is expected as the directory has around 22481 files. "ls -al" retrieves more file attribute details than a plain "ls" command. 
As per the strace analysis, no single system call took more than 1 second; the accumulation of all calls adds up to the 1m26s seen above. Compared with another user's 
directory [which has around ~4000 mails], there was no such delay; it took less than 2 seconds. Test and details :
time ls -la /data/XXX/mboxes/tier1/01/XXX/15/67/178/XXX/ << no of file 22481
[...]
real 1m26.674s 
user 0m0.285s 
sys 0m1.627s
with out "-al" option :
time ls /data/XXX/mboxes/tier1/01/XXX/15/67/178/XXX/
[....]
real 0m4.678s
user 0m0.223s
sys 0m1.008s
This user has fewer mails :
time ls -al /data/XXX/mboxes/tier1/01/XXX/15/67/47/XXX << no of file 4442
real 0m1.010s
user 0m0.067s
sys 0m0.297s
time ls -al /data/XXX2/mboxes/tier1/01/XXX/15/67/47/XXn/
real 0m0.736s
user 0m0.070s
sys 0m0.162s
ls -al /data/XXX/mboxes/tier1/01/XXX/15/67/183/XXX/|wc -l
6468
took time :
real 0m1.406s
user 0m0.090s
sys 0m0.406s

Strace analysis :

#strace -Tttvv ls -al /data/XXX/mboxes/tier1/01/XXX/15/67/178/XXX/ &> /var/log/strace_op
#cat  /var/log/strace_op|awk '{print $NF}'|egrep -v 'msg|dat|new|bin|size|\?'|sed 's/<//'|sed 's/>//'|less 
#cat  /var/log/strace_op|awk '{print $NF}'|egrep -v 'msg|dat|new|bin|size|\?'|sed 's/<//'|sed 's/>//'|awk 'BEGIN {sum=0.0} {sum+=$NF} END {print sum}' 
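
A simpler cross-check is strace's built-in per-syscall summary (-c prints call counts and total time per syscall; the listing itself is discarded):

#strace -c -f ls -al /data/XXX/mboxes/tier1/01/XXX/15/67/178/XXX/ > /dev/null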

If you see too much of a penalty in retrieval time, then please engage NetApp for performance analysis: download the perf tool, 
collect data from the filers, and hand that data over to NetApp for further analysis. 

Why are inodes almost full on the file system?

Issue/Symptom  : dfm: Warning event on filer:/(Inodes Almost Full)
OS Environment : Netapp OnTAP 8.1
Software/Application : Netapp DFM [5] sends an alert that inodes are almost full.
Environment : Applicable to all customers who use netapp filers
Investigation : A huge number of small files were put on the volumes

Workaround Solution :

Check the inode usage : 

filer> df -i 

Check the currently configured maximum :

filer> maxfiles 

Calculate the maximum number of inodes the volume can hold [assuming 4 KB per inode]

Find the maximum size of the volume :

filer> df -h  

If the maximum size is XY GB, then the maximum supported inodes are :

= (XY*1024*1024)/4

Set the new inode value :

filer>maxfiles   
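
A worked example, assuming a 100 GB volume and the 4 KB-per-inode figure above (the volume name vol1 and the maxfiles syntax shown are illustrative):

filer> df -i vol1                     # current inode usage
maximum supported inodes = (100*1024*1024)/4 = 26,214,400
filer> maxfiles vol1 26214400         # raise the inode limit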
 
Permanent solution  : Same as workaround solution. 
Root Cause Analysis : Not required; the cause is known. 

How to clean unused semaphores

Issue/Symptom  : Sometimes you'll see that semaphore usage is full.
OS Environment : Linux or RHEL
Software/Application : HP OVO sends semaphore usage alerts
Investigation : Unused semaphores are not cleared by the kernel
Workaround Solution : Use the script below to clean unused semaphores.


cat clean-unused-semaprhore.sh

#!/bin/bash
#Developed By Kamal Maiti
#check that root is running it.
if [[ $EUID -ne 0 ]]; then
   exit 1
else
  #collect all semaphore IDs
  for SEMID in `ipcs -s|egrep -v -e "Semaphore|key"|sed '/^$/d'|awk '{print $2}'|sort -u`
  do
    #get the PID recorded for this semaphore
    PID=`ipcs -s -i $SEMID|tail -2|head -1|awk '{print $NF}'`
    #ignore process ID 0 (the kernel/main process) and only test PIDs greater than 0
    if [ $PID -gt 0 ]; then
      #test whether the PID exists in the process list; if it does, don't do anything.
      if ps -p $PID > /dev/null
      then
        #owning process is still running
        echo "$SEMID   $PID" &>/dev/null
      else
        #owning process is dead; remove the corresponding semaphore.
        echo "$SEMID   $PID" &>/dev/null
        #cleaning semaphore of dead process :
        ipcrm -s $SEMID
      fi
    fi
  done
fi


RUN :chmod +x clean-unused-semaprhore.sh; ./clean-unused-semaprhore.sh
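
To inspect current semaphore usage and the kernel limits before and after running the script:

$ ipcs -s     # list current semaphore arrays
$ ipcs -ls    # show semaphore limits (max arrays, max semaphores per array, ...)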

High CPU usage, server was not accessible over ssh

■ Issue/Symptom : High load on server, not accessible over ssh
OS Environment : RHEL 5.5
■ Background Information  :
  • Infra was running test
  • Server was intermittently highly loaded
  • ssh was failing :
    • [usera@user01lxv ~]$ ssh 10.57XXX
    • Password:
    • Connection closed by 10.57.XXX
  • console showed "lockd: rejected NSM callback from 7f000001:30001" and sometimes NFS was not OK
Investigation :
  • iowait was very high and fluctuating.
  • All the CPUs were busy serving I/O-bound operations
$ mpstat -P ALL 1

Linux 2.6.18-128.el5 (xxxxxxx) 11/19/2014
10:17:29 PM CPU %user %nice %sys %iowait %irq %soft %steal %idle intr/s
10:17:30 PM all 0.00 0.00 0.00 75.00 0.00 0.00 0.00 25.00 182.18
10:17:30 PM 0 0.00 0.00 0.00 100.00 0.00 0.00 0.00 0.00 182.18
10:17:30 PM 1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00 0.00
  • top showed high load, but no single process was taking too much CPU
top - 22:18:03 up 50 days, 22:20, 4 users, load average: 25.19, 26.68, 30.74
Tasks: 235 total, 2 running, 231 sleeping, 0 stopped, 2 zombie
Cpu(s): 2.0%us, 0.8%sy, 0.0%ni, 0.0%id, 96.8%wa, 0.0%hi, 0.4%si, 0.0%st
Mem: 3866480k total, 2916884k used, 949596k free, 12440k buffers
Swap: 8385920k total, 498424k used, 7887496k free, 350200k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
20841 cw 21 0 5606m 2.1g 5040 S 4.0 56.8 291:05.56 /opt/cw/jre/bin/java -Duser.timezone=America/Mexico_City -Xms2560m -Xmx2560m -XX:MaxPermSize=128m
  • Found that there were a lot of "D" state processes, which didn't appear on nso-102 and nso-101
$ ps aux |awk '{print $1 " " $8 " " $NF }'|grep D

USER STAT COMMAND
root D< [kjournald]
root Ds 0
root Ds /var/run/vmware-guestd.pid
nobody DN /usr/bin/log2mysql-nso-tomcat-writer
nobody DN /usr/bin/log2mysql-nso-tomcat-spooler
root D
  • In the above output, the kernel thread kjournald is also in D state, which looked bad from a kernel perspective; journalling would have stopped.
■ Workaround Solution :
Shut down the VM and power it on again. [D state processes can't be killed unless the system is rebooted]

Permanent Solution :
Shut down the VM and power it on again. [D state processes can't be killed unless the system is rebooted]
Root Cause Analysis :
  • iowait was mainly taking place because there was a high number of D state processes.

Why was the amazon cloud load balancer flapping between two instances?

Issue/Symptom :
  • Why was the SMOKETEST LB in the amazon aws cloud flapping between qpass-prod-smktst-201.dub1.qpass.net & qpass-prod-smktst-101.dub1.qpass.net? States changed between "InService" & "OutofService".
OS Environment :
  • Both nodes run RHEL 6; the LB is provided by Amazon
Investigation :
  • The LB is mapped to the above two nodes. The incoming port is 443 and the destination port is 80. It was found that applications are listening on port 80 on both nodes. The server health check timeout was increased on the LB, but the issue still persisted.
Permanent Solution :

For the LB, in the Health check section, set Ping target to TCP:80, Timeout to 5 seconds, Interval to 30 seconds, Unhealthy Threshold to 2,
and Healthy Threshold to 10.
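
If you prefer the AWS CLI over the console, the same health check can be applied to a classic ELB (the load balancer name my-smoketest-lb is illustrative):

$ aws elb configure-health-check --load-balancer-name my-smoketest-lb \
    --health-check Target=TCP:80,Interval=30,Timeout=5,UnhealthyThreshold=2,HealthyThreshold=10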


Root Cause Analysis : 
  • It was found that the Ping target was HTTP:80 and the Ping path was /ping.html. Though the web-based ping returned OK [200 status code], it did not work reliably.

Why does netapp dfm send a "Clock Skewed" alert from the filer?

Issue/Symptom  : DFM sent alert like "Dfm: Error event on Clock Skewed"

OS Environment : Netapp ONTAP

Investigation : Not performed

Workaround Solution :

"options timed.enable off"
"options timed.enable on"

Permanent Solution : 
  • Unknown.
Root Cause Analysis :
Unknown.

Why did it fail to run the command ‘/bin/bash’ and throw the error "No such file or directory" while configuring chrooted ftp?

Issue/Symptom  : Receiving the following error while executing the command below:

$chroot /chroot
chroot: failed to run command ‘/bin/bash’: No such file or directory

OS Environment : RHEL 5,6
Involved Software/Application : openssh, coreutils, rssh
Investigation : Found that /bin/bash is present inside /chroot directory.
Permanent Solution :

1. Run $ldd /chroot/bin/bash to see the dependent libraries.
2. Copy the missing libraries from /lib64 or /usr/lib64 into /chroot/lib64, as shown in the sketch below.
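
A minimal sketch of step 2, driving the copy from the ldd output (GNU cp's --parents option preserves the /lib64 path under /chroot; assumes bash has already been copied to /chroot/bin/bash):

$ for lib in $(ldd /chroot/bin/bash | awk '/\//{print $(NF-1)}'); do cp -v --parents "$lib" /chroot; done
$ chroot /chroot /bin/bash -c 'echo chroot bash works'    # re-test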

Root Cause Analysis : Libraries required by /bin/bash were missing or only 
partially present inside the chroot. 

Why was vsftpd throwing the error "550 Permission denied" while uploading a file, though there was no issue downloading files?

Issue : Receiving "550 Permission denied" while uploading a file to the vsftpd server; there was no issue downloading files.

Kernel : 2.6.9-55.ELsmp
OS: RHEL 4
vsftpd : vsftpd-2.0.1-5.el4.5, vsftpd-2.0.1-9.el4.i386.rpm


Solution :

enabling "chown_uploads=YES" caused problem. Disabled it like :

#chown_uploads=YES


The following was also enabled :


anonymous_enable=YES


Troubleshooting : 

From the client, the following steps were performed :

ftp> put testfile.txt
local: testfile.txt remote: testfile.txt
---> TYPE I
200 Switching to Binary mode.
ftp: setsockopt (ignored): Permission denied
---> PASV
227 Entering Passive Mode (10,57,71,14,181,149)
---> STOR testfile.txt
550 Permission denied.
ftp> exit


On the server, we saw the following messages in /var/log/vsftpd.log :

Thu Jan 22 02:24:17 2015 [pid 9478] [test122] FTP command: Client "10.57.71.126", "PASV"
Thu Jan 22 02:24:17 2015 [pid 9478] [test122] FTP response: Client "10.57.71.126", "227 Entering Passive Mode (10,57,71,14,24,213)"
Thu Jan 22 02:24:17 2015 [pid 9478] [test122] FTP command: Client "10.57.71.126", "STOR testfile.txt"
Thu Jan 22 02:24:17 2015 [pid 9478] [test122] FTP response: Client "10.57.71.126", "550 Permission denied."

A successful upload shows the messages below [in /var/log/xferlog] : 

Thu Jan 22 04:20:26 2015 1 10.57.71.126 12 /home/test122/upload/testfile.txt b _ i r test122 ftp 0 * c
Thu Jan 22 04:21:01 2015 1 10.57.71.126 12 /home/test122/upload/testfile.txt b _ o r test122 ftp 0 * c