Monday, July 25, 2011

How to extract exception stack traces from Java EE application log files

I needed to extract the complete list of exception stack traces from Java EE application log files covering the past 6 months and analyze the root cause of every exception.

I used the following sed commands to build the initial list, which still contains duplicate exceptions.

sed -ne '/ERROR/,/WebContainer :/p' *.log.2011* >/tmp/exception_list.txt
sed -i '/WARN/d' /tmp/exception_list.txt
sed -i '/INFO/d' /tmp/exception_list.txt
sed -i -e '/com\.ibm\.ws/d' -e '/org\.apache/d' /tmp/exception_list.txt
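
The same filtering can be sketched in Python for readers who prefer a script over sed. This is only an approximation of the commands above: the function names and the sample log lines are made up for illustration, and the patterns are copied from the sed expressions.

```python
import re


def extract_error_blocks(lines, start=r"ERROR", end=r"WebContainer :"):
    """Yield every line from a start match through the next end match,
    roughly like: sed -n '/ERROR/,/WebContainer :/p'."""
    in_block = False
    for line in lines:
        if in_block:
            yield line
            if re.search(end, line):
                in_block = False
        elif re.search(start, line):
            yield line
            in_block = True


def drop_noise(lines, patterns=(r"WARN", r"INFO", r"com\.ibm\.ws", r"org\.apache")):
    """Drop lines matching any pattern, like the sed '/.../d' commands."""
    return [l for l in lines if not any(re.search(p, l) for p in patterns)]


# Hypothetical sample lines, not taken from a real log
sample = [
    "[t1] INFO start",
    "[t2] ERROR boom",
    "java.lang.NullPointerException",
    "  at com.ibm.ws.Foo.bar",
    "  at com.example.App.run",
    "[t3] WebContainer : 1 done",
    "[t4] INFO after",
]
for line in drop_noise(extract_error_blocks(sample)):
    print(line)
```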


Remove duplicate exception blocks from the log:

1) Remove the date/time information from each log entry: cut -f 2 -d']' exception_list.txt
2) Remove any additional leading space: sed 's/ ERROR/ERROR/' /tmp/exception_list.txt

The uniq command only removes adjacent duplicate lines, so it will not work unless the input is sorted first (sort). Use the command below instead to remove duplicates without changing the line order:

awk '!x[$0]++' /tmp/exception_list.txt >/tmp/exception_list_final.txt
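
For comparison, here is a minimal Python sketch of the two ideas above: cutting off the timestamp field and removing duplicates while keeping the first occurrence in place. The function names and the sample line are illustrative assumptions, not part of the original commands.

```python
def strip_timestamp(line):
    """Keep the second ']'-delimited field, like: cut -f 2 -d']'.
    Lines without a ']' are returned unchanged, as cut does."""
    parts = line.split("]")
    return parts[1] if len(parts) > 1 else line


def dedupe_keep_order(lines):
    """Keep only the first occurrence of each line, like: awk '!x[$0]++'."""
    seen = set()
    result = []
    for line in lines:
        if line not in seen:
            seen.add(line)
            result.append(line)
    return result


print(strip_timestamp("[2011-07-25 10:00:00] ERROR foo"))
print(dedupe_keep_order(["b", "a", "b", "c", "a"]))
```

The set only remembers which lines have been seen; the list preserves the original order, which is why no sort is needed.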


Wednesday, July 20, 2011

How to delete duplicate MS Communicator conversations copied to the Outlook Conversation History folder

Recently I ran into the problem below and couldn't find any solution or tool for it on the internet, so I decided to solve it myself.

I have enabled the option to save MS Communicator conversations in Outlook, and the conversation is saved to Outlook at regular intervals. If a chat session lasts longer than this interval, multiple copies of the session are saved, each holding the conversation up to the moment the interval timed out. In other words, the most recent copy of the conversation includes everything from the previous copies, so all older copies can be deleted from the Outlook folder.

The tools for deleting duplicate items didn't help here, so I wrote a VBScript to handle the task. Here is the logic:
1) Open the Conversation History folder.
2) Sort the items by subject and then by received time, in ascending order, so that items are grouped by subject with the oldest item first within each subject.
3) In the first iteration, mark the first item as the pointer, named X.
4) In each subsequent iteration, compare the subject of the current item with that of the pointer item X; if they match, check whether the body of the current item starts with the body of item X.
5) If so, delete item X.
6) At the end of each iteration, make the current item the new X and continue the loop.

VB Script:

' Open the Conversation History folder of the mailbox
Set objOutlook = CreateObject("Outlook.Application")
Set objNamespace = objOutlook.GetNamespace("MAPI")
Set objFldr = objNamespace.Folders.Item("Mailbox - Ramasamy, Yuvaraj").Folders.Item("Conversation History")
Set colItems = objFldr.Items
' Sort ascending so that the oldest copy of each subject comes first
colItems.Sort "[Subject][Received]", False
' item_comp is the pointer item X: the previous item in the sorted list
Set item_comp = Nothing
WScript.Echo colItems.Count
For Each objItem In colItems
    If Not item_comp Is Nothing Then
        If item_comp.Subject = objItem.Subject Then
            ' The newer copy starts with the older copy's body, so the older copy is redundant
            If Left(objItem.Body, Len(item_comp.Body)) = item_comp.Body Then
                item_comp.Delete
            End If
        End If
    End If
    Set item_comp = objItem
Next
WScript.Echo "all done"
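
The comparison logic can also be tried out without Outlook. The sketch below works on plain records instead of Outlook items; the Item tuple, its field names and the sample data are assumptions made for illustration, and nothing is deleted, it only reports which copies would be superseded.

```python
from collections import namedtuple

# Hypothetical stand-in for an Outlook item
Item = namedtuple("Item", "subject received body")


def find_superseded(items):
    """Return the older copies whose body is a prefix of the next copy
    with the same subject."""
    ordered = sorted(items, key=lambda it: (it.subject, it.received))
    superseded = []
    previous = None  # the pointer item X
    for current in ordered:
        if (previous is not None
                and previous.subject == current.subject
                and current.body.startswith(previous.body)):
            superseded.append(previous)
        previous = current
    return superseded


items = [
    Item("Chat with A", 2, "hi\nhello"),
    Item("Chat with A", 1, "hi"),
    Item("Chat with B", 1, "hey"),
    Item("Chat with A", 3, "other"),
]
print(find_superseded(items))
```

Collecting the superseded items first and deleting them afterwards avoids modifying a collection while it is being iterated.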

Friday, April 22, 2011

Change/delete a locked Symantec scheduled admin scan in Windows:

Most organizations install commercial antivirus software on all IT assets (laptops, desktops, etc.) and add a READ ONLY configuration for a daily quick scan and a weekly scheduled full scan. IT administrators usually schedule this on weekdays to ensure 100% completion of the weekly scan, and they don't want users to abort a scan or postpone the schedule by a number of days.

On my office laptop the scheduled weekly scan runs on Wednesdays at 11:30 AM and takes 5-6 hours to complete. Because of the heavy disk scanning and CPU contention, I'm unable to use other resource-hungry applications on my laptop (RAD, Firefox, etc.).

Though we can't edit these settings from the Symantec UI, they can be changed via the registry editor.
Here are the steps to change the Symantec scan schedule in Windows XP:

1) Open the registry editor (regedit).
2) Navigate to "HKEY_LOCAL_MACHINE\SOFTWARE\Symantec\Symantec Endpoint Protection\AV\LocalScans\{xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxx}\Schedule" and find the key based on the name of the scheduled scan.
Note: Otherwise, search the entire registry for the schedule name (weekly scan wed@11:30) using the Find (Ctrl+F) option.
3) Open the DWORD "DayOfWeek" and edit the value as appropriate (6 for a scan on Saturdays).


The schedule can even be deleted from the registry, at your own risk, which completely removes the scheduled scan.

Under "LocalScans" you should see many nodes named "ClientScheduledScan_" or something similar. The "ClientScheduledScan_" is the node in the registry tree that registers my forced Administrator system scan with Symantec AntiVirus. Yours could be different, so you'll need to look around under LocalScans to find the correct one.

Disclaimer: Disabling the admin scan on a PC provided by your employer is probably a violation of their IT security policy and could get you into trouble.


Thursday, April 14, 2011

Running Hadoop in Windows (Pseudo-Distributed Mode):

In the previous post I explained how to start Hadoop in standalone mode on Windows. In this post I will explain how to start Hadoop in pseudo-distributed mode. In pseudo-distributed mode each Hadoop daemon runs in a separate Java process on localhost.

Configuration:
Add the configuration below to start the services required to mimic distributed mode.

conf/core-site.xml:

<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>

conf/hdfs-site.xml:

<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>


conf/mapred-site.xml:
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:9001</value>
</property>
</configuration>
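
If you would rather generate these three files than type them, the sketch below renders the same property/name/value structure with Python's standard xml.etree module. The function name and the PSEUDO_DISTRIBUTED dictionary are illustrative assumptions; the property names and values are the ones listed above.

```python
import xml.etree.ElementTree as ET


def hadoop_config_xml(properties):
    """Render a dict as a Hadoop-style <configuration> document (no indentation)."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


# File name -> properties, copied from the configuration shown above
PSEUDO_DISTRIBUTED = {
    "conf/core-site.xml": {"fs.default.name": "hdfs://localhost:9000"},
    "conf/hdfs-site.xml": {"dfs.replication": 1},
    "conf/mapred-site.xml": {"mapred.job.tracker": "localhost:9001"},
}

for path, props in PSEUDO_DISTRIBUTED.items():
    print(path)
    print(hadoop_config_xml(props))
```

The sketch only prints the XML; writing it to the conf directory is left out on purpose so that nothing is overwritten by accident.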
We need to set up passphraseless ssh authentication to connect to localhost. Try $ ssh localhost; this should connect to your local machine without prompting for a password or passphrase. If it does not, read on to set up the OpenSSH server/client and passwordless authentication.

The OpenSSH client/server programs can be installed as part of the Cygwin installation by selecting the "openssh" package during Cygwin setup.
1) Open the Cygwin console and run "ssh-host-config -y". This generates the configuration files required to start the ssh server, sets up a local Windows user account and creates the Windows service (sshd).
2) The ssh service can now be started either with the Cygwin command (cygrunsrv -S sshd) or with the standard Windows command (net start sshd).


Note: Sometimes the service will not start; rebooting the system may help.

Run the following steps in the Cygwin console to set up public key authentication for local ssh connections:

$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa

$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys


Running Hadoop in pseudo-distributed mode:

Format a new distributed-filesystem:
$ bin/hadoop namenode -format

Start the Hadoop daemons (this also starts the NameNode and the JobTracker):
$ bin/start-all.sh

The hadoop daemon log output is written to the ${HADOOP_LOG_DIR} directory (defaults to ${HADOOP_HOME}/logs).

Browse the web interfaces for the NameNode and the JobTracker; their default addresses are listed in the Apache Hadoop documentation.

Stop the daemons with:
$ bin/stop-all.sh

Sounds simple, doesn't it?
However, setting up public key authentication gave me a lot of trouble because of issues with my Cygwin installation. I deleted my previous installation files (which had no ssh server), reinstalled Cygwin along with the OpenSSH client/server programs, and followed the steps to start the ssh server and set up public key authentication.

When I tried to connect to localhost via ssh I got a password prompt, which should not be the case. I deleted all the key pairs and repeated the same steps to set up the sshd service and public key authentication. This time it seemed that the ssh program picked up the correct keys, but sshd closed the connection and I got the error message "Connection closed by ::1".

I decided to do a clean removal and reinstallation of Cygwin without any customization.

Follow the steps below to remove Cygwin cleanly:

1) Stop and remove the sshd service (cygrunsrv -E sshd and cygrunsrv -R sshd).
2) Delete the Cygwin folder (default: c:\cygwin) and all its sub-folders.
3) Remove the CYGWIN environment variable and any Cygwin entries in the PATH variable, if defined.
4) Remove the following entries completely from the registry (regedit):
  • HKEY_CURRENT_USER/Software/Cygnus Solutions
  • HKEY_CURRENT_USER/Software/cygwin
  • HKEY_LOCAL_MACHINE/Software/Cygnus Solutions
  • HKEY_LOCAL_MACHINE/Software/cygwin
5) Remove the local user/group sshd (compmgmt.msc).
6) Search for all public/private key files (id_rsa/id_dsa/id_dsa.pub/id_rsa.pub/authorized_keys) and delete them.

Reinstall Cygwin in the default path, make sure to select the openssh packages during installation, and set up the ssh server and passwordless authentication as described above. This time public key authentication worked for me, and it should work for you as well.

With that important step completed, I happily ran the command to start the Hadoop daemons. Unfortunately, I ended up with the error below:

localhost: Error: JAVA_HOME is not set.


I am not sure why this error occurred even though the JAVA_HOME environment variable was defined; it did not happen when I started Hadoop in standalone mode. I simply followed the Apache documentation to fix it: edit the file conf/hadoop-env.sh to define at least JAVA_HOME to be the root of your Java installation.
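
As a rough illustration of that edit, the Python sketch below rewrites the text of a hadoop-env.sh file so that it contains an active export JAVA_HOME line. The function name, the sample file content and the Java path are assumptions for illustration only; use the root of your own Java installation.

```python
import re

# Matches an existing JAVA_HOME export, whether commented out or not
_JAVA_HOME_LINE = re.compile(r"^[ \t]*#?[ \t]*export[ \t]+JAVA_HOME=.*$", re.MULTILINE)


def set_java_home(env_text, java_home):
    """Return env_text with one active 'export JAVA_HOME=...' line."""
    line = 'export JAVA_HOME="%s"' % java_home
    if _JAVA_HOME_LINE.search(env_text):
        # Replace the first existing (possibly commented) definition
        return _JAVA_HOME_LINE.sub(lambda m: line, env_text, count=1)
    # Otherwise append the definition on its own line
    if env_text and not env_text.endswith("\n"):
        env_text += "\n"
    return env_text + line + "\n"


# Hypothetical file content and Java path
sample = "# Hadoop environment settings\n# export JAVA_HOME=/path/to/java\n"
print(set_java_home(sample, "/cygdrive/c/java/jdk1.6.0"))
```

The value is written in double quotes so that a path containing spaces survives when the script is sourced.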


Finally I was able to start Hadoop in pseudo-distributed mode on a Windows system and access the NameNode and JobTracker via their web URLs.

Reference:
1) Apache Hadoop documentation
2) Setup Hadoop

Wednesday, April 13, 2011

Running Hadoop on a Windows machine:
Though Windows is supported as a development platform by Hadoop, some tweaks are necessary to start the Hadoop services successfully. Hadoop supports the following modes:
Local (Standalone) Mode
Pseudo-Distributed Mode
Fully-Distributed Mode

Now we will see the steps to start Hadoop in standalone mode on Windows. You will most likely encounter one or more of the issues mentioned below when starting Hadoop via Cygwin.

Required Software:
1) Java™ 1.6.x, preferably from Sun
2) ssh must be installed and sshd must be running to use the Hadoop scripts that manage remote Hadoop daemons (cluster mode)
3) Cygwin

Download:
Download a stable Hadoop distribution from the Apache Download Mirrors.

Prepare to Start the Hadoop Cluster:
Unpack the downloaded Hadoop distribution. Define JAVA_HOME as an environment variable or edit the conf/hadoop-env.sh file.
Try the following command:
$ sh bin/hadoop
This should display the usage documentation for the hadoop script without any start-up errors.


However, you will get the error "C:\program command not found" if the JRE is installed in the default path ("c:\program files\jre*").

How to fix this issue?
Open the file hadoop-config.sh. Search for the text JAVA_PLATFORM=`CLASSPATH=${CLASSPATH} ${JAVA} and replace ${JAVA} with "${JAVA}" so that spaces in the file path are handled correctly when running Hadoop via Cygwin.
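
Why the quotes matter can be shown with Python's shlex module, which splits a command line roughly the way a POSIX shell does. The Java path below is a made-up example of an installation under "Program Files", not a path from the Hadoop scripts.

```python
import shlex

# Hypothetical location of the java binary under a directory with a space
JAVA = "/cygdrive/c/Program Files/Java/jre6/bin/java"

# Unquoted: the path is split at the space, so the "command" is cut short
unquoted = shlex.split(JAVA + " -version")

# Quoted: the whole path stays together as a single word
quoted = shlex.split('"' + JAVA + '" -version')

print(unquoted)
print(quoted)
```

In the unquoted case the first word ends at "Program", which is the same kind of truncation that produces the "command not found" error.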

This error will now disappear, and another error will pop up: "java.lang.NoClassDefFoundError: org/apache/hadoop/util/PlatformName".

The class org.apache.hadoop.util.PlatformName exists in hadoop-common-*.jar. The hadoop script automatically adds all the necessary files to CLASSPATH, but it seems that Cygwin-style paths (/cygdrive/c/apps/hadoop-0.21.0) on the classpath are not recognised properly when the Java runtime starts.

Note: the set -x option can be used to debug the scripts.

How to fix this issue now?

Open the hadoop-config.sh file and add the line below before the CLASSPATH variable is used to define JAVA_PLATFORM (add it after the line JAVA_LIBRARY_PATH=''):
CLASSPATH=`cygpath -p -w "$CLASSPATH"`
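
To make the effect of that conversion concrete, here is a simplified Python approximation of what cygpath -p -w does to a classpath. It is not a replacement for cygpath: it only handles /cygdrive/<letter>/ prefixes, leaves every other entry untouched, and the function names and jar file name are chosen for illustration.

```python
import re


def cygdrive_to_windows(posix_path):
    """Convert /cygdrive/c/dir/file to C:\\dir\\file; other paths are returned as-is."""
    match = re.match(r"^/cygdrive/([a-zA-Z])(/.*)?$", posix_path)
    if not match:
        return posix_path
    drive = match.group(1).upper()
    rest = match.group(2) or "/"
    return drive + ":" + rest.replace("/", "\\")


def classpath_to_windows(classpath):
    """Convert a ':'-separated Cygwin classpath to a ';'-separated Windows one."""
    entries = [p for p in classpath.split(":") if p]
    return ";".join(cygdrive_to_windows(p) for p in entries)


cp = "/cygdrive/c/apps/hadoop-0.21.0/conf:/cygdrive/c/apps/hadoop-0.21.0/hadoop-common-0.21.0.jar"
print(classpath_to_windows(cp))
```

The separator changes as well as the path style: a Windows Java runtime expects ';' between classpath entries, so converting only the individual paths would not be enough.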


Now run $ sh bin/hadoop, which will display the usage documentation as shown below:

Usage: hadoop [--config confdir] COMMAND
where COMMAND is one of:
fs run a generic filesystem user client
version print the version
jar run a jar file
distcp copy file or directories recursively
archive -archiveName NAME -p * create a hadoop archive
classpath prints the class path needed to get the

Friday, April 01, 2011

Remove deleted workspace names from the "Switch Workspace" menu in an Eclipse-based IDE

Sometimes deleted workspace names are not removed from the "Switch Workspace" list, which can make it confusing to select the correct workspace. Here is the procedure to remove the entries from the list:

Go to the Eclipse/RAD installation directory (e.g. C:\Program Files\IBM\SDP) and find the org.eclipse.ui.ide.prefs file under configuration\.settings.
Open the file org.eclipse.ui.ide.prefs in your favorite text editor and remove the unwanted values associated with the key "RECENT_WORKSPACES".

Take care: each workspace name is followed by \n, because the value is maintained like a properties (bundle) file entry.
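
The manual edit can be sketched in Python as follows. This assumes the RECENT_WORKSPACES value uses properties-style escaping, where \n separates entries and backslashes and colons are written as \\ and \:; the function names and workspace paths are made up, and only that small subset of escapes is handled.

```python
def unescape_prefs(value):
    """Decode the backslash escapes in a properties-style value."""
    out, i = [], 0
    while i < len(value):
        if value[i] == "\\" and i + 1 < len(value):
            nxt = value[i + 1]
            out.append("\n" if nxt == "n" else nxt)
            i += 2
        else:
            out.append(value[i])
            i += 1
    return "".join(out)


def escape_prefs(value):
    """Re-encode backslashes, newlines and colons (a simplified subset)."""
    return value.replace("\\", "\\\\").replace("\n", "\\n").replace(":", "\\:")


def remove_workspace(raw_value, unwanted):
    """Return the RECENT_WORKSPACES value without the unwanted workspace path."""
    workspaces = unescape_prefs(raw_value).split("\n")
    kept = [w for w in workspaces if w and w != unwanted]
    return escape_prefs("\n".join(kept))


# Hypothetical value as it might appear after "RECENT_WORKSPACES=" in the file
raw = r"C\:\\work\\new_ws\nC\:\\work\\old_ws"
print(remove_workspace(raw, r"C:\work\old_ws"))
```

Decoding the value before splitting matters: a folder name that starts with "n" is written as \\n... in the file, and a plain text search for \n would cut the path in the wrong place.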


