Monday, May 28, 2012

Hadoop: "Format aborted" Error Message


I recently started playing around with Hadoop. I went through a bunch of tutorials and docs and installed it on a CentOS 6 box. When I tried to start the namenode, it gave me an error saying that my namenode directory was not formatted. Fair enough, but whenever I tried to format it, the format got aborted. I checked all the configs, user, group, permissions and what not. I read and reread the docs to figure out whether I was missing anything, but no luck. Every time I got the following output:

-bash-4.1$ hadoop namenode -format
12/05/28 07:33:42 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = hadoop1.staging.example.com/10.10.54.143
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 0.20.2-cdh3u4
STARTUP_MSG:   build = file:///data/1/tmp/topdir/BUILD/hadoop-0.20.2-cdh3u4 -r 214dd731e3bdb687cb55988d3f47dd9e248c5690; compiled by 'root' on Mon May  7 14:01:59 PDT 2012
************************************************************/
Re-format filesystem in /data/namenode ? (Y or N) y
Format aborted in /data/namenode
12/05/28 07:33:46 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at hadoop1.staging.example.com/10.10.54.143
************************************************************/


There were no logs or messages to analyze. Everything looked in order, so what was the issue?

It turned out that when the CLI asked me "Re-format filesystem in /data/namenode ? (Y or N)", I was supposed to enter an upper-case Y and not y. I find it quite unintuitive that the developers neither made the check case-insensitive nor printed a message about using upper case when a lower-case letter is entered. I just hope that this post saves someone some time.
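
If you script the format step, one way to avoid the trap is to feed the prompt an upper-case Y yourself. The snippet below is only a sketch: to_format_answer is a made-up helper name, and it assumes the prompt reads its answer from standard input, which is what the transcript above suggests.

```shell
# Hypothetical helper: turn a yes-like reply into the upper-case "Y"
# that the format prompt insists on; anything else becomes "N".
to_format_answer() {
    case "$1" in
        y|Y|yes|Yes|YES) printf 'Y\n' ;;
        *)               printf 'N\n' ;;
    esac
}

# Usage (assumes the hadoop command from this post is on PATH):
#   to_format_answer y | hadoop namenode -format
```

With this in place a lower-case y no longer aborts the format, because the prompt only ever sees the upper-case letter.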

Friday, May 25, 2012

How To Downgrade Or Reinstall RPM Package

I had this issue yesterday when I was playing with Fedora 17 Beta. Here is what happened:
1. I installed Fedora 17 Beta and did a yum update.
2. I installed some common things like vim and tree.
3. I tried to install rpm-build and rpmdevtools.


As soon as I did the third step, yum spat out an error saying that the version of the "rpm" package I had was newer than what was required. Now there is a problem. Had it been any other package, I could simply have uninstalled the newer version with yum erase and installed the required version, but what do I do now? If I uninstall the rpm package, how will I install it again? Yum itself uses rpm as its backend. I wasn't able to find any "force" flag for yum.

A simple solution to the problem above is to use the rpm command instead of yum. Go to any of the mirrors and download the required rpm package. Once you have the package, install it via the rpm command with the "--force" flag.


rpm -ivh --force rpm-4.9.1.3-6.fc17.x86_64.rpm

The trick worked well and I was able to resume work.
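
A slightly gentler variant, as a sketch: according to the rpm man page, "--force" is shorthand for "--replacepkgs --replacefiles --oldpackage", so for a plain downgrade "--oldpackage" alone should be enough. Using -U rather than -i should also replace the newer installed version instead of installing the older one next to it. The function name downgrade_rpm and the RPM_CMD override are my own inventions, there only to allow a dry run.

```shell
# Hypothetical wrapper around rpm for downgrading from a downloaded package file.
# RPM_CMD is an assumed override so the command can be previewed, e.g. RPM_CMD="echo rpm".
downgrade_rpm() {
    pkg_file="$1"
    # -U upgrades/replaces the installed package, --oldpackage allows going
    # to an older version, -v and -h give verbose output with a progress bar.
    ${RPM_CMD:-rpm} -Uvh --oldpackage "$pkg_file"
}

# Usage (run as root, with the package file from this post):
#   rpm -q rpm                                   # show the installed version first
#   downgrade_rpm rpm-4.9.1.3-6.fc17.x86_64.rpm
```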

Wednesday, May 2, 2012

Logrotate: The Most Basic Log Management Tool [Examples]

Logrotate is the default and easiest log management tool around. It ships by default with most major Linux distributions. Logrotate can rotate logs (in other words, it can create a separate log file per day/week/month/year or on the basis of the size of the log file). It can compress the older log files. It can run custom scripts before and after rotation. It can rename the log to reflect the date.

Logrotate scripts go in /etc/logrotate.d/. Let us look at some examples to understand it better. Here we'll rotate /var/log/anyapp.log.

1. Rotate logs daily:
$ cat /etc/logrotate.d/anyapp
/var/log/anyapp.log {
daily
rotate 7
}


The logrotate script above will rotate /var/log/anyapp.log every day and keep the last 7 rotated log files. Instead of daily you can also use weekly or monthly.
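
To make "rotate 7" a bit more concrete, here is a rough shell sketch of the renaming that happens on each rotation. It is a simplification for illustration, not logrotate's actual implementation; rotate_log is a made-up helper, and creating a fresh empty log at the end assumes behaviour like the "create" directive.

```shell
# Hypothetical helper: rotate_log <logfile> <keep>
rotate_log() {
    log="$1"
    keep="$2"

    # Drop the oldest rotated file so that at most $keep rotated files remain.
    rm -f "$log.$keep"

    # Shift the rest up by one: .6 -> .7, ..., .1 -> .2 (for keep=7).
    i=$((keep - 1))
    while [ "$i" -ge 1 ]; do
        if [ -f "$log.$i" ]; then
            mv "$log.$i" "$log.$((i + 1))"
        fi
        i=$((i - 1))
    done

    # The live log becomes .1 and a fresh, empty log takes its place.
    if [ -f "$log" ]; then
        mv "$log" "$log.1"
    fi
    : > "$log"
}

# Usage: rotate_log /var/log/anyapp.log 7
```

Note that the move in the last step only renames the file. A process that already has the log open keeps writing to the renamed file, which is why some of the later options exist.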

2. Compress the rotated logs:
$ cat /etc/logrotate.d/anyapp
/var/log/anyapp.log {
daily
rotate 7
compress
}


Now logrotate also compresses the rotated files. This is a real life saver if you want to save disk space, which is a very common need, especially in VPS or cloud environments.
By default logrotate uses gzip for compression. You can change this with compresscmd. For example, "compresscmd /bin/bzip2" will get you bzip2 compression; you will probably also want "compressext .bz2" so that the rotated files get a matching extension.

3. Compress in the next cycle:
$ cat /etc/logrotate.d/anyapp
/var/log/anyapp.log {
daily
rotate 7
compress
delaycompress
}


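This is useful when the file cannot be compressed immediately, which happens when the process keeps writing to the old file even after the rotation. With delaycompress the most recently rotated file stays uncompressed until the next rotation cycle. If that sounds strange, you might want to read about inodes: renaming a file does not change its inode, so a process that already has the file open keeps writing to it. Also note that "delaycompress" works only if "compress" is included in the script.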

4. Compressing the copy of the log:
$ cat /etc/logrotate.d/anyapp
/var/log/anyapp.log {
daily
rotate 7
compress
delaycompress
copytruncate
}


Copytruncate comes in handy when the process writes to the inode of the log and rotating the log might cause the process to go defunct, stop logging, or run into a bunch of other issues. Copytruncate copies the log, and further processing is done on the copy. It then truncates the original file to zero bytes. The inode of the file is therefore unchanged, and the process keeps writing to the log file as if nothing has happened.
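
The idea can be tried out by hand. The following is a minimal sketch of the copy-then-truncate approach, not what logrotate literally runs; copy_truncate and inode_of are hypothetical helper names.

```shell
# Hypothetical helper: copy_truncate <logfile>
copy_truncate() {
    log="$1"
    # Copy the current contents aside; the copy is a new file with a new inode.
    cp -p "$log" "$log.1"
    # Empty the original in place; it stays the same file (same inode),
    # so a process holding it open keeps writing to it.
    : > "$log"
}

# Hypothetical helper for the demo: print the inode number of a file.
inode_of() {
    ls -i "$1" | awk '{print $1}'
}

# Usage:
#   inode_of /var/log/anyapp.log      # note the number
#   copy_truncate /var/log/anyapp.log
#   inode_of /var/log/anyapp.log      # same number as before
```

One caveat mentioned in logrotate's man page: there is a small window between the copy and the truncate, so lines written in that moment can be lost.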

5. Don't rotate empty log and don't give error if there is no log:
$ cat /etc/logrotate.d/anyapp
/var/log/anyapp.log {
daily
rotate 7
compress
delaycompress
copytruncate
notifempty
missingok
}


Self-explanatory. Both "notifempty" and "missingok" have opposite twins named "ifempty" and "nomissingok", which are the defaults for logrotate.

6. Execute custom script before and/or after logrotation:
$ cat /etc/logrotate.d/anyapp
/var/log/anyapp.log {
daily
rotate 7
prerotate
    /bin/myprescript.sh
endscript
postrotate
    /bin/mypostscript.sh
endscript
}

You can run multiple scripts/commands as long as they are between (pre|post)rotate and endscript. I have removed some of the parameters from the script to keep it readable.

I have just scratched the surface of logrotate. In practice it is capable of much more. You should check out logrotate's man page for more options.