Friday, April 22, 2011

Change/delete a locked Symantec scheduled admin scan in Windows:

Most organizations install commercial antivirus software on all IT assets (laptops, desktops, etc.) with a read-only configuration for a daily quick scan and a weekly scheduled full scan. IT administrators typically schedule the full scan on a weekday to ensure it completes fully, and they don't allow users to abort the scan or postpone it.

On my office laptop the scheduled weekly scan runs on Wednesdays at 11:30 AM and takes 5-6 hours to complete. Due to the heavy disk scanning and CPU contention, I'm unable to use other resource-hungry applications on my laptop (RAD, Firefox, etc.).

Though we can't edit these settings from the Symantec UI, they can be changed via the registry editor.
Here are the steps to change the Symantec scan schedule on Windows XP:

1) Open the registry editor (regedit).
2) Navigate to "HKEY_LOCAL_MACHINE\SOFTWARE\Symantec\Symantec Endpoint Protection\AV\LocalScans\{xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxx}\Schedule" and find the key based on the name of the scheduled scan.
Note: Otherwise, search the entire registry for the schedule name (weekly scan wed@11:30) using the Find (Ctrl+F) option.
3) Open the DWORD "DayOfWeek" and edit the value as appropriate (6 for a scan on Saturdays). The same change can be scripted, as sketched below.
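If you prefer the command line, the built-in reg tool can make the same change from a command prompt. This is only a sketch: {GUID} is a placeholder for the node you find on your own machine, and the exact key path may differ by Symantec version.

Query to locate the scan's Schedule key (the GUID node varies per install):
reg query "HKLM\SOFTWARE\Symantec\Symantec Endpoint Protection\AV\LocalScans" /s /f DayOfWeek

Move the scan to Saturday, replacing {GUID} with the node found above:
reg add "HKLM\SOFTWARE\Symantec\Symantec Endpoint Protection\AV\LocalScans\{GUID}\Schedule" /v DayOfWeek /t REG_DWORD /d 6 /f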


The schedule can even be deleted from the registry, at your own risk; this will completely remove the scheduled scan.

Under "LocalScans" you should see many nodes named "ClientScheduledScan_" or something similar. The "ClientScheduledScan_" is the node in the registry tree that registers my forced Administrator system scan with Symantec AntiVirus. Yours could be different, so you'll need to look around under LocalScans to find the correct one.

Here's a snapshot:

[Screenshot: Registry Editor showing the scheduled scan's node under LocalScans]
Disclaimer: Disabling an admin scan on a PC managed by your employer is probably a violation of their IT security policy and could get you into trouble.


Thursday, April 14, 2011

Running Hadoop in Windows (Pseudo-Distributed Mode):

In the previous post I explained how to start Hadoop as a standalone service in Windows. In this post I will explain how to start Hadoop in pseudo-distributed mode, where each Hadoop daemon runs in a separate Java process on localhost.

Configuration:
Add the configuration below to start the services required to mimic distributed mode.

conf/core-site.xml:

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

conf/hdfs-site.xml:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>


conf/mapred-site.xml:
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
We need to set up passphraseless SSH authentication to connect to localhost. Try $ ssh localhost; this should connect to your local machine without prompting for a password/passphrase. If it doesn't, read on to set up the OpenSSH server/client and passwordless authentication.

The OpenSSH client/server programs can be installed as part of the Cygwin installation by selecting the "openssh" package.
1) Open a Cygwin console and run "ssh-host-config -y". This generates the configuration files required to start the SSH server, sets up a local Windows user account, and creates the Windows service (sshd).
2) Now the SSH service can be started either with the Cygwin command (cygrunsrv -S sshd) or the standard Windows command (net start sshd).


Note: Sometimes the service will not start; rebooting the system may help. You can also check the service status as shown below.
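Querying the service and reading its log usually reveals why it failed. cygrunsrv -Q is the standard query switch; the log path below assumes the default ssh-host-config setup:

$ cygrunsrv -Q sshd
$ tail /var/log/sshd.log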

Run the following commands in the Cygwin console to set up public-key authentication for the local SSH connection:

$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa

$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
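Before testing, note that sshd refuses public-key authentication when the key files are group- or world-accessible (the default StrictModes behavior), so tighten the permissions and then verify the login works without a prompt:

$ chmod 700 ~/.ssh
$ chmod 600 ~/.ssh/authorized_keys
$ ssh localhost echo connected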


Execution of Hadoop in pseudo-distributed mode:

Format a new distributed-filesystem:
$ bin/hadoop namenode -format

Start the Hadoop daemons (this starts the NameNode and JobTracker as well):
$ bin/start-all.sh

The hadoop daemon log output is written to the ${HADOOP_LOG_DIR} directory (defaults to ${HADOOP_HOME}/logs).
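To confirm the daemons actually came up, the JDK's jps tool lists the running Java processes. In pseudo-distributed mode you should see the five daemons below (the PIDs here are illustrative):

$ jps
2351 NameNode
2472 DataNode
2598 SecondaryNameNode
2684 JobTracker
2805 TaskTracker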

Browse the web interface for the NameNode and the JobTracker; by default they are available at:
NameNode - http://localhost:50070/
JobTracker - http://localhost:50030/

Stop the daemons with:
$ bin/stop-all.sh

Sounds simple, doesn't it?
However, setting up public-key authentication troubled me a lot due to issues with my Cygwin installation. I deleted my previous installation files (which lacked the SSH server) and re-installed Cygwin along with the OpenSSH client/server programs, then followed the steps above to start the SSH server and set up public-key authentication.

When I tried to connect to localhost via SSH, I got a password prompt, which should not happen. I deleted all the key pairs and repeated the same steps to set up the sshd service and public-key authentication. This time the correct keys seemed to be picked up by the ssh program, but the connection was closed by sshd with the error message below (Connection closed by ::1).

I decided on a clean removal and re-installation of Cygwin without any customization.

Follow the steps below to remove Cygwin cleanly:

1) Undo the sshd configuration (cygrunsrv -E sshd and cygrunsrv -R sshd).
2) Delete the folder (default: c:\cygwin) and all its sub-folders.
3) Remove the CYGWIN environment variable and any Cygwin entries from the PATH variable, if defined.
4) Remove the following entries completely from the registry (regedit):
  • HKEY_CURRENT_USER\Software\Cygnus Solutions
  • HKEY_CURRENT_USER\Software\cygwin
  • HKEY_LOCAL_MACHINE\Software\Cygnus Solutions
  • HKEY_LOCAL_MACHINE\Software\cygwin
5) Remove the local sshd user/group (compmgmt.msc).
6) Search for all public/private key files (id_rsa/id_dsa/id_dsa.pub/id_rsa.pub/authorized_keys) and delete them (see the sketch below).
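Step 6 can be scripted from the Cygwin console before you delete the tree. This is only a sketch: the profile path assumes Windows XP's "Documents and Settings" layout, so adjust it for your Windows version:

$ find /home "/cygdrive/c/Documents and Settings" \( -name 'id_?sa' -o -name 'id_?sa.pub' -o -name 'authorized_keys' \) 2>/dev/null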

Re-install Cygwin in the default path, making sure to select the openssh packages as part of the installation, then set up the SSH server and passwordless authentication as described above. This time public-key authentication worked for me, and it should work for you as well.

With that important step completed, I happily executed the command to start the Hadoop daemons. Unfortunately, I ended up with the error below:

localhost: Error: JAVA_HOME is not set.


I'm not sure why this error occurred even after defining the JAVA_HOME environment variable, and it did not happen when I started Hadoop in standalone mode. I simply followed the Apache documentation to fix it: edit the file conf/hadoop-env.sh to define at least JAVA_HOME to be the root of your Java installation.
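For example, the relevant line in conf/hadoop-env.sh ends up looking like the one below. The JDK path is only an assumption; point it at your own installation, and under Cygwin prefer a path without spaces to avoid the quoting issues described in the previous post:

# The java implementation to use. Required.
export JAVA_HOME=/cygdrive/c/Java/jdk1.6.0_24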


Finally, I'm able to start Hadoop in pseudo-distributed mode on a Windows system and to access the NameNode and JobTracker via their web URLs.

Reference:
1) Apache Hadoop documentation
2) Setup Hadoop

Wednesday, April 13, 2011

Running Hadoop on a Windows machine:
Though Windows is supported as a development platform by Hadoop, some tweaks are necessary to successfully start the Hadoop services. Hadoop supports the following modes:
Local (Standalone) Mode
Pseudo-Distributed Mode

Fully-Distributed Mode

Here we will walk through the steps to start Hadoop in standalone mode on Windows. You are likely to encounter one or more of the issues mentioned below when starting Hadoop via Cygwin.

Required Software:
1) Java 1.6.x, preferably from Sun
2) ssh must be installed and sshd must be running to use the Hadoop scripts that manage remote Hadoop daemons (cluster mode)
3) Cygwin

Download:
Download a stable Hadoop distribution from the Apache Download Mirrors.

Prepare to Start the Hadoop Cluster:
Unpack the downloaded Hadoop distribution. Define JAVA_HOME as an environment variable or edit the conf/hadoop-env.sh file.
Try the following command:
$ sh bin/hadoop
This should display the usage documentation for the hadoop script without any start-up errors.


However, you will get the error "C:\program: command not found" if the JRE is installed in the default path (c:\program files\jre*).

How do we fix this issue?
Open the file hadoop-config.sh, search for the text JAVA_PLATFORM=`CLASSPATH=${CLASSPATH} ${JAVA}, and replace ${JAVA} with "${JAVA}" to handle spaces in the file path when running Hadoop via Cygwin.
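For reference, the edit amounts to quoting one variable so the shell survives the space in "C:\Program Files". The line below approximates what ships in 0.21-era hadoop-config.sh (exact flags may differ in your release); only the quotes around ${JAVA} change:

Before: JAVA_PLATFORM=`CLASSPATH=${CLASSPATH} ${JAVA} -Xmx32m org.apache.hadoop.util.PlatformName | sed -e "s/ /_/g"`
After:  JAVA_PLATFORM=`CLASSPATH=${CLASSPATH} "${JAVA}" -Xmx32m org.apache.hadoop.util.PlatformName | sed -e "s/ /_/g"`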

Now this error will disappear, and another one pops up: "java.lang.NoClassDefFoundError: org/apache/hadoop/util/PlatformName".

The class org.apache.hadoop.util.PlatformName exists in hadoop-common-*.jar. The hadoop script automatically adds all the necessary JARs to the CLASSPATH, but it seems that Cygwin-style paths (e.g. /cygdrive/c/apps/hadoop-0.21.0) on the classpath are not recognized properly when the Java runtime starts.

Note: The shell's set -x option can be used to debug the scripts (e.g. $ sh -x bin/hadoop).

How to fix this issue now?

Open the hadoop-config.sh file and add the line below before the CLASSPATH variable is used to define JAVA_PLATFORM (i.e. after the line JAVA_LIBRARY_PATH=''):
CLASSPATH=`cygpath -p -w "$CLASSPATH"`
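For context, cygpath converts between POSIX-style and Windows-style paths: -p treats the argument as a whole path list, and -w emits the Windows form, so the colon-separated Cygwin classpath becomes a semicolon-separated Windows one that the JVM understands. A quick illustration (the directories are made up):

$ cygpath -p -w "/cygdrive/c/apps/hadoop-0.21.0:/cygdrive/c/apps/hadoop-0.21.0/lib"
C:\apps\hadoop-0.21.0;C:\apps\hadoop-0.21.0\lib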


Now run $ sh bin/hadoop, which will display usage documentation like the following:

Usage: hadoop [--config confdir] COMMAND
where COMMAND is one of:
  fs                   run a generic filesystem user client
  version              print the version
  jar <jar>            run a jar file
  distcp <srcurl> <desturl>  copy file or directories recursively
  archive -archiveName NAME -p <parent path> <src>* <dest>  create a hadoop archive
  classpath            prints the class path needed to get the
                       Hadoop jar and the required libraries
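With the usage output appearing cleanly, the standard smoke test from the Apache quickstart confirms standalone mode works end to end. The examples jar name varies by release (in 0.21 it is hadoop-mapred-examples-0.21.0.jar, which the wildcard below is assumed to match):

$ mkdir input
$ cp conf/*.xml input
$ bin/hadoop jar hadoop-*examples*.jar grep input output 'dfs[a-z.]+'
$ cat output/*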

Friday, April 01, 2011

Remove deleted workspace names from the "Switch Workspace" menu in Eclipse-based IDEs

Sometimes deleted workspace names are not removed from the "Switch Workspace" list, which can make it confusing to select the correct workspace. Here is the procedure to remove the stale entries from the list.

Go to the Eclipse/RAD installation directory (e.g. C:\Program Files\IBM\SDP) and find the org.eclipse.ui.ide.prefs file under configuration\.settings. Open org.eclipse.ui.ide.prefs in your favorite text editor and remove the unwanted values associated with the key "RECENT_WORKSPACES".

Take care: a literal \n separator is attached to every workspace name, since the file is maintained like a properties/bundle file.
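For illustration, a RECENT_WORKSPACES entry looks roughly like the line below (the workspace paths are made-up examples). To drop the second workspace from the menu, delete C\:\\workspaces\\projectB together with its leading \n:

RECENT_WORKSPACES=C\:\\workspaces\\projectA\nC\:\\workspaces\\projectB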


