Wednesday, December 14, 2011

IP Packet Filtering: iptables Explained For Beginners

iptables is an IP filter that ships with the Linux kernel. Technically speaking, an IP filter works on the network layer of the TCP/IP stack, but iptables actually works on the data link and transport layers as well. In a broad sense, iptables consists of tables, which consist of chains, which are in turn made up of rules.
Default tables are:
  1. Raw
  2. Mangle
  3. NAT
  4. Filter
Default chains are (yes, they are written in upper case):
  1. PREROUTING: used by raw, mangle and nat tables
  2. INPUT: used by mangle and filter tables
  3. FORWARD: used by mangle and filter tables
  4. OUTPUT: used by raw, mangle, nat and filter tables
  5. POSTROUTING: used by mangle and nat tables
I'll discuss the filter table here. It is the most commonly used one, but if you are interested in the others as well, you can find a detailed tutorial at frozentux. The filter table uses three chains: INPUT, FORWARD and OUTPUT.
  • The INPUT chain is for packets destined for your local machine. The reply to an HTTP request made by your browser will come through the INPUT chain.
  • The OUTPUT chain is for packets originating from your machine. The HTTP request made by your browser goes through this chain.
  • The FORWARD chain is for packets that your machine receives but that are not meant for it; your machine is just supposed to forward them to another device. This generally happens when the machine is configured as a gateway or something similar.
Now, every iptables rule has a "target" which is executed when the packet matches the rule's "criteria". The most common targets are:
  • ACCEPT: Packet is accepted and goes to the application for processing.
  • DROP: Packet is dropped. No information regarding the drop is sent to the sender.
  • REJECT: Packet is dropped and information (error) message is sent to the sender.
  • LOG: Packet details are sent to syslogd for logging. 
  • DNAT: Rewrites the destination IP of the packet
  • SNAT: Rewrites the source IP of the packet
The first four are used a lot in the filter table. Now let us discuss some of the common criteria:
  • -p <protocol>: It matches a protocol such as tcp, udp or icmp, or the special value all
  • -s <ip_addr>: It matches source IP address
  • -d <ip_addr>: It matches destination IP address
  • --sport <port>: It matches the source port
  • --dport <port>: It matches the destination port
  • -i <interface>: It matches the interface from which the packet entered
  • -o <interface>: It matches the interface from which the packet exits
Now we know the basic things needed to start building our rules. Let us try to write some rules for a few hypothetical (or real) situations. First we'll set the default policy for the filter table using the -P flag.
iptables -t filter -P INPUT DROP
iptables -t filter -P OUTPUT DROP


Now we'll allow this machine to send only HTTP and SSH requests. Since both default policies are DROP, we must accept the outgoing requests in OUTPUT and their replies in INPUT (replies from a remote server come back with source port 80 or 22):
iptables -A OUTPUT -p tcp -o eth0 --dport 80 -j ACCEPT
iptables -A OUTPUT -p tcp -o eth0 --dport 22 -j ACCEPT
iptables -A INPUT -p tcp -i eth0 --sport 80 -j ACCEPT
iptables -A INPUT -p tcp -i eth0 --sport 22 -j ACCEPT

Note that the -A flag appends these rules to the end of the chain (use -I to insert a rule at a given position, or -R to replace an existing one). Also note the -j ACCEPT part: a rule without a target only counts matching packets and takes no action.

Also assume that my machine acts as an FTP server, so I should allow incoming connections to ports 20 and 21 along with the corresponding replies going out:
iptables -A INPUT -p tcp -i eth0 --dport 20 -j ACCEPT
iptables -A INPUT -p tcp -i eth0 --dport 21 -j ACCEPT
iptables -A OUTPUT -p tcp -o eth0 --sport 20 -j ACCEPT
iptables -A OUTPUT -p tcp -o eth0 --sport 21 -j ACCEPT


Also I would say that I trust my internal network:
iptables -A OUTPUT -j ACCEPT -p all -d 192.168.1.0/24 -o eth0
iptables -A INPUT -j ACCEPT -p all -s 192.168.1.0/24 -i eth0

So this is what my final iptables script would look like:

iptables -t filter -P INPUT DROP
iptables -t filter -P OUTPUT DROP
iptables -A OUTPUT -p tcp -o eth0 --dport 80 -j ACCEPT
iptables -A OUTPUT -p tcp -o eth0 --dport 22 -j ACCEPT
iptables -A INPUT -p tcp -i eth0 --sport 80 -j ACCEPT
iptables -A INPUT -p tcp -i eth0 --sport 22 -j ACCEPT
iptables -A INPUT -p tcp -i eth0 --dport 20 -j ACCEPT
iptables -A INPUT -p tcp -i eth0 --dport 21 -j ACCEPT
iptables -A OUTPUT -p tcp -o eth0 --sport 20 -j ACCEPT
iptables -A OUTPUT -p tcp -o eth0 --sport 21 -j ACCEPT
iptables -A OUTPUT -j ACCEPT -p all -d 192.168.1.0/24 -o eth0
iptables -A INPUT -j ACCEPT -p all -s 192.168.1.0/24 -i eth0

Note that rules are read sequentially: the first rule that matches a packet decides its fate and evaluation stops there. The chain's default policy applies only when no rule matches.
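After loading the script you can verify and persist the result; these two commands are standard on Fedora/EL (the save step writes the rules to /etc/sysconfig/iptables):

iptables -L -n -v      # list the active rules along with packet and byte counters
service iptables save  # persist the rules across reboots on Fedora/EL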

Sunday, December 4, 2011

Controlling Your Linux Server Using Twitter

Last weekend I wrote about "Using Twitter To Monitor Your Linux Server" using a command line client, "MYST". I have added a new feature to MYST that lets you tell your server to execute commands via direct messages (DMs). Let us configure it.

Step 1: Ensure that you can send direct messages to your server's Twitter account: follow the server's account, and the server's account needs to follow you back.

Step 2: Add the following to your .myst.conf file below the MYST section:

[Users]
auth_users: adityapatawari 

You can add more Twitter handles separated by spaces.

Step 3: Now you can DM any command to your server's account and it will be executed. You need to ensure that the myst.py script is run with its "getdm" option from cron.
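A crontab entry for this could look like the line below; the five-minute interval and the install path are just examples, so adjust both to taste:

*/5 * * * * cd /path/to/myst && ./myst.py getdm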

MYST uses Python's os.system to execute these commands, so they run with the privileges of the user who executes myst.py. If you want everything to be executable, run myst.py as the super user or via sudo.

Please contribute to the MYST project on Gitorious.org

Sunday, November 27, 2011

Using Twitter To Monitor Your Linux Server!

Yes, you can use Twitter to monitor your server. I won't say that it is a complete monitoring solution, nor will I ask you to throw away your existing monitoring mechanisms. In fact the script I am talking about, MYST (AGPLv3), was not created for this purpose. It was created so that I could tweet without using a browser, with inspiration from Hiemanshu, a Fedora contributor, and it uses the python-twitter API.

It is just a fun script which you can use to tweet the health of your server periodically to a private account that only a moderated set of people can follow. So here is how you do it:

Step 1: Create a Twitter account. From the settings page, mark it private.

Step 2: Open Twitter's new application page and fill the form. Put the name as 'MYST' and website as 'http://myst.adityapatawari.com'.

Step 3: Download 'MYST: Twitter for Shell' and extract it.

Step 4: Open your application 'MYST' listed at Twitter's apps page and fill .myst.conf with the relevant details.
Copy it to your home directory.

Step 5: Install python-twitter (version 0.82) on your server along with dependencies.

Step 6: Put a cron entry with an appropriate schedule (and the path to the scripts) to execute the following periodically:

./myst.py update `./monitor.sh`
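As a concrete sketch, a crontab line like the following would tweet the stats every 30 minutes; the interval and the path are illustrative:

*/30 * * * * cd /path/to/myst && ./myst.py update "`./monitor.sh`"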

This is a cool way to keep an eye on system health, and you can modify monitor.sh to add more parameters to monitor.
Please contribute to the MYST project on Gitorious.org

Wednesday, November 2, 2011

How To Install And Configure Puppet (Getting Puppet Up)

Puppet is a system management tool used by many large and small enterprises to manage their infrastructure. Off the top of my head, Twitter, Wikipedia, Digg, Nokia and Rackspace are some of the companies using it, and there is no reason you cannot use it to manage that single server you love or the entire data center you own.

Installing Puppet is not difficult, but I'll recommend installing it on Fedora/EL from the tmz repo instead of the official repo. The official repo is generally out of date, while the tmz repo has the latest builds. I don't know if a similar repo exists for Debian-like distributions. If you are interested in installing from source then you should check out this page. Puppet follows a client-server architecture, so you need to run a server and a client (which can be on the same machine). Install "puppet" on the client side and "puppet-server" on the server side.

Now let us start with building the server:

Puppet Server
Step 1: Configure tmz repo and install puppet-server

yum install puppet-server

Step 2: Puppet, by default, looks for the site.pp file in the manifests directory. Let us create one, if it is not present already.
mkdir /etc/puppet/manifests
touch /etc/puppet/manifests/site.pp
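To have something to test with, you can drop a minimal manifest into site.pp. This sketch is not from the original post; it simply ensures the ntp package is installed on every client:

# /etc/puppet/manifests/site.pp -- "node default" applies to every client
node default {
  package { 'ntp':
    ensure => installed,
  }
}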

Step 3: Start the puppet master (a.k.a. server) using:
service puppetmaster start

Puppet Client
Step 1: Tell the client where the server is by adding a server entry to the [main] section of /etc/puppet/puppet.conf:
[main]
server=puppet.aditya.pa

Step 2: Start the puppet client
service puppet start

The puppet client will request a certificate from the master. Now let us go to the master and sign the certificate.
You can check out all the pending signing requests by firing the following on the puppet master:
puppet cert --list

Sign the correct certificate by:
puppet cert --sign fed1.aditya.pa
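Back on the client, you can trigger a run immediately to confirm that the certificate exchange worked; puppet agent is the subcommand on Puppet 2.6 and later:

puppet agent --test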

Our puppet is up and running and ready to use. I'll build some manifests and modules to manage applications in the next post, or, if you want, you can catch me at the Fedora Users and Developers Conference in Pune, India on 6 November 2011, where I'll build some manifests and modules live as part of the hackfest event.

Monday, October 31, 2011

Virtualization Using Open Source Tools (qemu-kvm)

Today I am going to talk about creating virtual machines using qemu-kvm. I'll be using Scientific Linux (an Enterprise Linux rebuild, much like Red Hat and CentOS) for this. A very good guide is available for Fedora at the Fedora wiki.

Optional: Install yumex. It is a good front end for yum, much better than the default stuff shipped with the standard distro, and handy for those who prefer a GUI-based installation.

Step 1: Install following packages:
yum install qemu-kvm python-virtinst virt-manager virt-viewer libvirt

Step 2: Start the libvirtd service
service libvirtd start
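Before creating a VM, it is worth checking (this is not in the original steps) that your CPU supports hardware virtualization and that the kvm modules are loaded:

egrep '(vmx|svm)' /proc/cpuinfo   # any output means the CPU supports hardware virtualization
lsmod | grep kvm                  # kvm plus kvm_intel or kvm_amd should be listed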

Step 3:
Launch "Virtual Machine Manager" from the menu bar or just run virt-install --prompt for an interactive CLI virtual machine creation. A one line install is also possible. Read man virt-install carefully before attempting this.
Example:
virt-install --connect qemu:///system -n fed15-kvm -r 768 --vcpus=2 -f fed15-kvm -s 12 -c ../iso_files/fed15_64.iso --vnc --noautoconsole --os-type linux --os-variant fedora14 --accelerate --network=bridge:virbr0 --hvm

You are done! You can launch the machine using "Virtual Machine Manager" or by running virsh --connect qemu:///system start fed15-kvm, where fed15-kvm is the name of my VM, though the command will start it in the background.

Have fun and do comment if I have missed on something or you have any tips and tricks.

Friday, September 23, 2011

Building A Highly Available Linux Cluster Using Wackamole

Wackamole is an application which manages a bunch of IPs which should be accessible from the outside at all times. Given a set of machines and a set of IPs, Wackamole ensures that if any machine goes down, another machine takes over its IP almost instantly, so the outside world sees no impact. It tries to balance the number of IPs across the machines available. Wackamole uses the Spread messaging system.

Let us start configuring it. I am using two virtual machines running Centos 5. The concept can be extended to as many machines as you want.

Step 1: Install Wackamole on both machines; it will pull in spread as a dependency. Sadly it is not yet packaged for Enterprise Linux or Fedora (I'll upload the package I built and share the link), or you can try it out on Debian, where wackamole is available in the repo. CentOS, Red Hat and Fedora users need to install wackamole and spread manually.

Step 2: Each of the two machines has eth0 (or whatever the default interface is) configured with its own primary IP, one IP per box. We'll also need a pool of IPs which can be distributed; we'll configure those on virtual interfaces. For this tutorial, I am taking 172.16.31.10 and 172.16.31.11 as the primary IPs on eth0 of the two machines and 172.16.31.20 and 172.16.31.21 as the virtual pool. You can take as many IPs as you want for the virtual pool, but make sure that they belong to you.

Now pick your favorite machine of the two and do the following steps on that machine only.


Step 3: Configure the entire pool as virtual interfaces. That means eth0:1 will be mapped to 172.16.31.20 and eth0:2 will be mapped to 172.16.31.21. This can be done by creating files similar to /etc/sysconfig/network-scripts/ifcfg-eth0 (or you can wait for the next blog post); see the sketch below.
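A minimal sketch of such a file for the first pool IP; the netmask is an assumption, so match it to your network:

# /etc/sysconfig/network-scripts/ifcfg-eth0:1
DEVICE=eth0:1
IPADDR=172.16.31.20
NETMASK=255.255.255.0
ONBOOT=yes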

Step 4: Let us configure spread. Check out the contents of my /etc/spread.conf file below first.

Spread_Segment  172.16.31.255:4805 {
        centos1 172.16.31.10
        centos2 172.16.31.11
}
EventLogFile = /var/log/spread.log


Here the first line defines the Spread_Segment, which is basically the broadcast address of the network along with the port. Don't forget to whitelist that port in your firewall. The next two lines specify the hostnames of the machines along with their primary IP addresses, and the last line gives the location of the log file. There are a lot of other things you can configure; check out the documentation for that.
Start spread by firing spread -n centos1 -c /etc/spread.conf; the name passed to -n must match this machine's entry in spread.conf. If you use service spread restart and it fails, then you might need to tweak your init script at /etc/init.d/spread.
Copy this config file to the other machine also and start spread.

Step 5: Now configure Wackamole. See the config file below:

Spread = 4805 #Spread port
SpreadRetryInterval = 5s #How often to try to connect to spread, if it fails
Group = wack1 #Cluster group
Control = /var/run/wack.it #Name of the socket
Prefer None #Treat all the IPs as equal

VirtualInterfaces {
#IPs from the virtual pool. Can be as many as you want.
        { eth0:172.16.31.20/24 }
        { eth0:172.16.31.21/24 }
}
Arp-Cache = 20s

Notify {
# Notify this broadcast address, but not more than 8 times.
        eth0:172.16.31.1/32 throttle 8
        arp-cache
}


balance {
        AcquisitionsPerRound = all
        interval = 4s
}
mature = 5s


I have commented the config file for easier understanding. It is straightforward, so there should not be many doubts.

Start Wackamole by firing wackamole -d -c /etc/wackamole.conf. Again, you might have to tweak the init file if you use service wackamole start.

Copy this config file to the other machine also and start Wackamole.


That is it; Wackamole is configured. You'll see that one of your virtual interfaces has actually shifted to the other machine. That is Wackamole trying to balance your cluster. You can add more machines and IPs to this config.

Since I built the rpms first and then did all the configs, my config files were in /etc/. If you built from source, your config files will be at a different location; just use the correct location while firing the commands or tweaking the init scripts.

Sunday, August 28, 2011

How To Install and Configure Splunk For Log Analysis

Splunk, in simple words, is a log analyzer, and a powerful one. It understands machine data like no one else, makes it searchable, and displays it in an easy-to-understand way on a web dashboard. This data is important not only for searches but also for investigations, monitoring and making decisions regarding your infrastructure. Large organizations like Motorola and Vodafone use Splunk to keep an eye on their servers, and now I am going to show you how you can do the same.

How to install Splunk?
To install Splunk, just download the relevant installer from this page and use a suitable package manager. I'll use CentOS 5.6 for this post, so just do a "yum install splunk-4.2.3-105575.i386.rpm". The rpm will be installed in /opt/splunk. Chances are that it won't be in your path, so either add it or just use the full path. If splunk is not in /opt, run a find on it: "find / -name splunk". To start the splunk server, just issue the command "/opt/splunk/bin/splunk start" and accept the license. Don't forget to allow the port, usually 8000, through your firewall.
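If you also want Splunk to come up on every reboot, it ships a helper for that (not covered in the original post):

/opt/splunk/bin/splunk enable boot-start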

Adding logs for analysis
Let us add the Apache logs to Splunk for analysis. These logs are located in /var/log/httpd/. Log in using the web console at http://localhost:8000 with credentials id: admin and password: changeme. Click on the "Add data" button and then on the Apache logs link. Now choose "Consume Apache logs on this Splunk server", click next, specify the path to the Apache logs directory, and you are done.

Searching through the logs
From the home page, go to the welcome tab and launch the search app. Now you just need to type a search query and the relevant parts of the logs will be presented to you.

Building Reports
From the search app itself, click on the "Build Report" link, specify the criteria, and that is it. You'll get the report in no time.

And there it is. Splunk is good to go. Explore more and comment here. :)



Monday, August 22, 2011

Munin: Server Statistics, The Easy Way

Server usage statistics always play a useful role when things go wrong. Often they provide you with warnings like "oops! there is too much data, we'll need a new disk" or "wow! good traffic, maybe we should get more bandwidth". If you had to write code yourself for collecting this data and representing it graphically, you would spend a few days, if not weeks, doing so.

Well, why reinvent the wheel? Just use munin and be happy. Installing munin is easy enough: just do a yum install munin munin-node and you'll be done.
After installation you need to configure it. For that, edit the file /etc/munin/munin.conf and uncomment and edit the following lines:


[...]
dbdir   /var/lib/munin
htmldir /var/www/html/munin
logdir  /var/log/munin
rundir  /var/run/munin
[...]
# Where to look for the HTML templates
tmpldir /etc/munin/templates
[...]
# a simple host tree
[localhost]
    address 127.0.0.1
    use_node_name yes
[...]

Now start the munin node service: service munin-node start. Well, that is it. Just wait for a few minutes and then go to http://localhost/munin to get the graphs.

Here are a couple of the graphs I have obtained (and there are tonnes more):

Friday, August 19, 2011

Monit: Monitor Your Server And Ensure Service Uptime

Many of us (especially sysadmins) often have nightmares about servers going down at odd hours and clients shouting at us. Everyone, from large organizations to people with a personal server hosting just a blog, wants their servers up and running all the time. Of course, one cannot stay up 24x7 to watch over them, so we'll do some automation to make sure that the services keep running.

Enter Monit! According to their website "Monit can start a process if it does not run, restart a process if it does not respond and stop a process if it uses too much resources. You can use Monit to monitor files, directories and filesystems for changes, such as timestamp changes, checksum changes or size changes. You can also monitor remote hosts; Monit can ping a remote host and can check TCP/IP port connections and server protocols". We'll look into the "start a process if it does not run, restart a process if it does not respond and stop a process if it uses too much resources" part today.

To install monit, just do a yum install monit and you are good to go. Next, you need to configure it. I'll show you the configuration for httpd; others are similar, and you can check out the monit wiki for more how-tos.

The main config file is located at /etc/monit.conf. You don't have to fiddle with it, but give it a read if you want.
We'll be writing and saving our configs in the /etc/monit.d/ directory. So create a file /etc/monit.d/apache and write the following lines in it.

check process httpd with pidfile /var/run/httpd.pid
group apache
start program = "/usr/sbin/httpd -k start"
stop program = "/usr/sbin/httpd -k stop"
if failed host 127.0.0.1 port 80 protocol http
   then restart
if 5 restarts within 5 cycles then timeout

Now I'll explain these lines. The first two lines ask monit to check the process named httpd, whose pid is written in the /var/run/httpd.pid file, and place the service in a monit group called apache (groups let you operate on several services at once). The next two lines give the start and stop commands for the httpd service; notice that I have not used "service httpd start", because monit requires fully qualified file names to execute. The next two lines tell monit to restart the service if the HTTP check against port 80 on 127.0.0.1 fails. Lastly, we instruct monit to give up after 5 restarts within 5 cycles, because if the service keeps dying across 5 consecutive checks then the problem is something else.
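Whenever you add or change a file under /etc/monit.d/, it is worth syntax-checking the configuration before (re)starting the daemon; -t is a standard monit flag:

monit -t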

Next you need to start the monit service by firing service monit start, and then you can go to sleep with some peace of mind.

My next few posts are going to be on automation and network monitoring only. Enjoy!

Monday, July 4, 2011

Koji: Common Mistakes, Errors and Troubleshooting

Koji is a tool which can be a nightmare for you. At times, even when everything looks fine, it won't work, or it throws some of the weirdest error messages for you to resolve. Here are some of them with probable solutions.

  • I am getting "OSError: [Errno 13] Permission denied" what do I do?
    #1 Check that permission for /var/{lib,cache}/mock is 02755. You can set this permission by the following commands:
    chmod 02755 /var/lib/mock
    chmod 02755 /var/cache/mock

    #2 
    Make sure that you have the build and srpm-build groups defined in your build tag. Use this guide to define them. Also ensure that the bare minimum required packages are added to each group. I have added the following packages in my Koji instance:

    bash bzip2 coreutils cpio diffutils findutils gawk gcc grep sed gcc-c++ gzip info patch redhat-rpm-config rpm-build shadow-utils tar unzip util-linux-ng which make

  • I can build using an srpm but not using an SCM URL. How do I fix that?
    #1 You need to have some files in your SCM (git or subversion, etc.): a Makefile, the spec file and all the patches. The Makefile consists of the URL of the source and the way to download it.

    #2
     Make sure that you are giving a complete SCM build command; an example using an anonymous checkout:
    koji build --scratch mytag git://abc.com/project.git#d9e1204eddd9ae972456a9bdd2d847a

    #3 Make sure that you have added the URL of the SCM to /etc/kojid/kojid.conf and restarted kojid.

  • I am getting "ERROR: Could not find useradd in chroot, maybe the install failed?"
    #1. Checkout #2 of first bullet

  • I am getting "GenericError: failed to merge repos:" while trying to regenerate repos. What do I do?
    #1.
    Use the command "koji list-external-repos" and check out that there are no extra trailing '/' and do not hardcode the arch. Use $arch instead and don't forget to escape $arch using '\' else it will be evaluated while adding it to repos only.

    #2. Ensure that disk is not full.

  • I am getting "BuildrootError: could not init mock buildroot, mock exited with status 20; see root.log for more information" but root.log is empty.
    #1. Now Koji is trying to make a fool out of you. Well, no, actually Koji guesses what log has the relevant error message and this time it has guessed wrong. Just check out mock_output.log. Make sure that you have bare minimum packages added to your build group.
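For the external-repo bullet above, adding a repo with the arch left as a variable could look like this; the tag, repo name and URL are hypothetical, and the -t/--tag option may depend on your koji version:

koji add-external-repo -t mytag myextrepo http://mirror.example.com/el6/\$arch/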


Saturday, June 25, 2011

How to Configure Pulp: The Ultimate Repository Management Tool

Pulp is a nifty piece of Python code which I recently deployed to manage some (a lot, actually) external Linux repositories. Pulp is a great tool if you want to manage a lot of repositories and related content like packages, arches, distros and errata. It'll not only help you mirror the repositories but also do remote installs on the clients (pulp calls them consumers) and groups. So, let us get started.
Make sure that you have a good amount of disk space on your server.

There is really good documentation here on how the installation works. I'll write about some tips and tricks which are not in the documentation.

1. How do I install it on Scientific Linux and other Enterprise Linux servers, where the nss package is not the latest version?
You need to enable the rolling repo in Scientific Linux for this. It is not included in /etc/yum.repos.d/ by default, so chances are that you'll get an older version of nss if the rolling repo is not added.

2. I am getting a "SSL WrongHost" error. How do I fix that?
First, you need to pick a hostname for the server (localhost.localdomain is a bad choice). Set the hostname using command "hostname <hostname>". Now we are going to generate a certificate for this domain to get rid of ssl error.
Just do a "cd /etc/pki/tls/certs/" and there will be a localhost.crt. Just rename it to something else and run "make testcrt" to get a new certificate. Follow the said steps closely in order.

3. I installed both pulp and pulp-cds on the same server and now I am getting an httpd alias problem. How do I resolve it?
Well, I understand the enthusiasm of trying out stuff, but pulp and pulp-cds are not supposed to be installed on the same server, not unless you know the ins and outs of pulp and what you are doing. The problem occurs because pulp.conf and pulp-cds.conf in conf.d define the same alias for different targets. So comment out the Alias in pulp-cds.conf, or get rid of the pulp-cds package altogether; I would do the latter.

4. I want to use the repo over http. How do I do that?
Just find the lines mentioned below in the httpd conf directory and comment them out using "#".

SSLRequireSSL
SSLVerifyClient optional_no_ca
SSLVerifyDepth 2
SSLOptions +StdEnvVars +ExportCertData

Watch this post for more tips and hacks.

Monday, May 23, 2011

How To Capture Data Packets On A Network Using Wireshark (a. k. a. Ethereal)

Wireshark, formerly known as Ethereal, is an amazing network monitoring tool. It helps you capture the data packets being sent and received by your network interface and analyze them.
Warning: Before using Wireshark in promiscuous mode, make sure that you have the required permissions to do so. Promiscuous mode is, in a way, packet sniffing and might cost you the job you currently have. (In simpler words, if you do not own the network or you are not the network administrator, then it can get you fired!)

Now, I am going to demonstrate this using my Fedora 13 box as a client (located in New Delhi, India) which will connect to an Ubuntu 10.04 machine (located in Florida, USA) using ssh. Let us check it out step by step.

  1. Install wireshark using your package manager. You need to install wireshark as well as wireshark-gnome to get the GUI.
    yum install wireshark wireshark-gnome
  2. Launch wireshark. Do NOT start the capture yet; we will first switch off promiscuous mode.
  3. Go to "Capture" and select "Options" and uncheck the "Capture packets in promiscuous mode" check box.
  4. Select the interface you want to listen on. I will listen on eth0, which is usually the default for your first network interface. Also specify a capture filter; check out this list for the complete filters and their formats. I will write "host <ubuntu-machine-ip-address>".
  5. You are all set but again before clicking start double check that promiscuous mode is turned off. Click Start.
  6. Connect to the Ubuntu server from the Fedora box and the captured packets will be shown.
Filters are necessary if you want the capture to make some sense. Try it once without any filter and you will be amazed by the number of packets that pass through your network interface card.
While I have warned you about promiscuous mode, I encourage you to use it on a virtual machine, but for learning purposes only (or, if you happen to have a small switch or something, create a network for yourself).
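If you prefer the command line, the same capture can be done with tshark, wireshark's CLI companion; the IP below is just an example:

tshark -p -i eth0 -f "host 192.0.2.10"

Here -p disables promiscuous mode, -i picks the interface and -f sets the capture filter.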

Thursday, May 19, 2011

How To Create And Configure IPTables Firewall Using Firewall Builder Step By Step

iptables is an application that enables a system administrator to manipulate the Linux kernel firewall tables and rules. It is extensively used for packet filtering and to create firewalls. In this post I am going to introduce you to an application called Firewall Builder (also known as fwbuilder) which helps in creating firewalls easily. fwbuilder can be used to create a wide variety of firewalls, including Cisco PIX and HP ProCurve, but we'll create something simpler: an iptables based firewall. So just follow this step by step tutorial:

  1. Install fwbuilder package. It is GPLed for Linux based systems. Find the install instructions here. Don't worry, there are rpm and deb available.
  2. Once installed, launch fwbuilder as the root user (iptables needs root permissions).
  3. Choose the first icon, which says "Create new firewall".
  4. Choose the firewall software as "iptables" and a suitable OS. If you are not sure about the options then go for "Linux 2.4/2.6"; it is the kernel version. Also give the firewall an appropriate name and click next.
  5. Select "Configure Interfaces Manually" and click next.
  6. Click on the tiny green "+" sign on the left and add the IP addresses of your interfaces. The name would be the usual Linux name like "eth0". Now click finish.
  7. Now click on the green "+" sign to add rules. By default these rules are all restrictive: they'll stop ALL the traffic through your network interface, so we need to modify them. The easiest way to do so is to right-click on a rule's fields and pick from the options.
  8. Once you have modified the rules, you need to compile the firewall, which generates the iptables rules from the GUI. Just click the compile button (the one with the hammer!).
  9. Now install the firewall by clicking the install button next to the compile button.

So now you can create firewalls easily. Check out the documentation of fwbuilder if you want more detailed instructions.


Tuesday, May 17, 2011

So What Is .htaccess? Directory Level Configuration File For Apache

If you are into web development, or if you have ever hosted a website, then chances are that you have heard of a file named .htaccess. This post will introduce you to .htaccess and help you create some of its basic rules. First off, you need to know that .htaccess is the complete name of the file, not an extension like .pdf or .txt; the leading dot simply makes it a hidden file on Unix-like systems.
Let us start by enabling the use of .htaccess on your server (it is usually turned off by default). You have to set AllowOverride from None to All in the Apache configuration file (usually found in /etc/httpd/ or /etc/apache2/). Follow this post on the Ubuntu Forum if you need more help.

Redirecting URLs
Now let us start with some simple redirects. Unlike mod_rewrite's internal rewrites, the redirects done this way are visible in the client's address bar. Assuming that you would like to redirect http://abc.com/old.html to http://abc.com/new.html, write the following rule in your .htaccess file:
Redirect /old.html http://abc.com/new.html

The last part has to be the full URL of the new location.

Password Protection
.htaccess can also be used to protect your files and directories with a password. For this, you need to create a .htpasswd file which consists of usernames and passwords (stored as hashes). .htpasswd uses the following format:
username1:password1
username2:password2
username3:password3
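Rather than writing the password hashes by hand, you can generate this file with the htpasswd utility that ships with Apache; the path and username here are just examples:

htpasswd -c /var/www/.htpasswd username1

The -c flag creates the file; drop it when adding further users.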

Now create a .htaccess file and provide the required details in the proper format, as shown below:
AuthName "Restricted Area" 
AuthType Basic 
AuthUserFile /var/www/.htpasswd 
AuthGroupFile /dev/null 
require valid-user 

AuthName is the name of the restriction; you can safely change it to "Provide Password" or any other message. We are using HTTP basic authentication, hence AuthType is Basic. AuthUserFile is the place where I am keeping my .htpasswd file. It is recommended that you do not keep .htpasswd anywhere under your web root (for me, a Fedora user, that is /var/www/html/). The last line says that any valid user can see the content.
You can use this tool to create an .htpasswd and .htaccess easily.

Preventing Directory Listing
If you want to prevent users from seeing what is in a directory which has no index page, you can use this method to stop the indexing, which is on by default. Just add one line to your .htaccess file and you are good to go:
IndexIgnore *

You can also use .htaccess to produce a customized error page for every kind of error. Follow the simple syntax below:
ErrorDocument 400 /error/badrequest.html
ErrorDocument 401 /error/pwdreqd.html
ErrorDocument 403 /error/forbidden.html
ErrorDocument 404 /error/notfound.html

You can specify the HTML inside the .htaccess, but I would not recommend it.

Now that you know a little bit about .htaccess, I recommend you go through the Apache tutorial for the same.

Thursday, March 17, 2011

The Apache "mod_rewrite": Best SEO Tool, Must Have Web Development Skill

Apache mod_rewrite is a module of the Apache web server which allows server side manipulation of URLs. (If you need help with the Apache web server, read this first.) This means that you can redirect http://abc.com to http://zxy.com without changing the URL in the client's address bar. All you need to know is the syntax of mod_rewrite (which I'll tell you here) and some regular expressions (though that is not mandatory). A great cheat sheet for regular expressions is available from AddedBytes.com. If you want to know who uses mod_rewrite, then let me tell you: whenever you hear of something known as a "clean URL", most probably mod_rewrite is behind it. Wordpress, Drupal and many other blogging platforms and CMSs produce clean URLs using mod_rewrite.

To ensure that mod_rewrite is enabled on your apache, just make sure that your apache config file has the following line uncommented:
LoadModule rewrite_module modules/mod_rewrite.so

If it was commented out before and you have uncommented it manually, then you need to restart your Apache web server. To check whether mod_rewrite has been enabled, create a file (say, info.php) in your web root containing the following line:
<?php phpinfo(); ?>

Open this file in any web browser and check the apache2handler section; mod_rewrite will appear there if it has been enabled successfully.
Now let us write some basic mod_rewrite rules for real. Suppose that you want to redirect all the traffic to your website to a fixed URL (this generally happens when you take down your site for an update or maintenance). All you need to do is write the following lines in your .htaccess (more on .htaccess in some other post):
RewriteEngine On
RewriteRule .* maintenance.php

The first line turns the rewrite engine on (it is usually on, but we like to be on the safe side). The second line captures every incoming URL (.* is the regular expression here) and redirects it to maintenance.php in the web root.

Now let us create a more useful rewrite rule, something similar to what is used to create clean URLs. Consider a website which takes input in the form http://abc.com/name.php?q=aditya. This looks ugly and is not search engine friendly. We would like to have something like http://abc.com/name/aditya, which is nice and easy to remember. For this let us create a rule for mod_rewrite.
RewriteEngine On
RewriteRule ^name/(\w+)/?$ name.php?q=$1

The rewrite engine examines the incoming URL requests and (time to open the regular expression cheat sheet if you have not mastered them) converts the friendly URL into the URL the server understands.

We can also create custom error messages using mod_rewrite; a sketch for a 404 Not Found handler follows.
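A minimal sketch of one common approach, which sends requests for files and directories that do not exist to a custom page (the paths are examples):

RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /error/notfound.html [L]

Note that an internal rewrite like this serves the page with a 200 status; combine it with the ErrorDocument directive shown in the .htaccess post above if you want a true 404 response.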

Saturday, February 12, 2011

How To Compile Linux Kernel (For Beginners)


Compiling the Linux kernel, according to a popular myth, is a very tough job. Personally I find it really easy, and I'll walk you step by step through the process. I am using Fedora 13 KDE for the compilation.

Before building your kernel I would advise you to back up your data and grub.conf.
  1. Download the source from http://kernel.org. The convention is that even-numbered series like 2.2, 2.4 and 2.6 are stable, while odd-numbered ones like 2.3 and 2.5 are not suited for production environments. I am using kernel 2.6.37 for this tutorial.
  2. Assuming that you have placed the source in /usr/src/, extract the kernel source from the archive using the following command:
    tar xvjf linux-2.6.37.tar.bz2
     
  3. Now, it is a good practice to clean the build area before any compilation.
    make mrproper
     
  4. Let us start with the configuration now. The kernel source comes with several configuration tools to make your life easier. I will use xconfig, but a GNOME user might want to go for gconfig.
    make xconfig
  5. Now select the modules and features you want your kernel to have. I would recommend checking Loadable Module Support. I also got rid of all the Mac drivers since I use a Dell; likewise, you can strip down your kernel easily. Save the file once you are done.
  6. Once we have the .config file, we go to the Makefile and add a customization marker to differentiate this kernel from the default ones: assign a value to the EXTRAVERSION variable in the Makefile. For me it was EXTRAVERSION = -aditya.
  7. Now just run the make command and be patient. It took my machine around 2 hours for the compilation; I am running an Intel Core 2 Duo with 3 GB of RAM.
  8. Once the kernel is compiled install the modules.
    make modules_install
  9. Now copy the kernel and the system map to /boot.
    cp /usr/src/linux-2.6.37/arch/i386/boot/bzImage /boot/vmlinuz-2.6.37-aditya
    cp /usr/src/linux-2.6.37/System.map /boot/System.map-2.6.37-aditya
  10. Run new-kernel-pkg to create the list of module dependencies, update grub.conf, etc.
    new-kernel-pkg -v --mkinitrd --depmod --install 2.6.37-aditya
And you are done. You actually built a kernel. Now reboot and enjoy it :)

Update: Please check out the comments for some cool tips.

Tuesday, February 1, 2011

Basic System Monitoring Commands

I am going to discuss some basic system monitoring commands here. Nothing big, most of the systems will have it pre-installed.
  • top: Displays pid, user, CPU, physical memory and swap usage in real time. Interactively, you can press M to sort by memory, T to sort by time, and P to sort by CPU. You can also press u and k to filter by user and to kill a process, respectively.
  • ps: It shows a list of processes. Use option -A for the entire list, -ef to get a detailed view and -u for user processes.
  • vmstat: It provides statistics for processors, memory, swap, I/O and CPU.
  • df: This is used to get information about the file system. It'll show you the size, used space, available space, used %, and mount information.
  • du: It is used to check the size of files. It works recursively on directories if you do not specify a file name.
  • iostat: It provides information about the kernel version, average CPU usage, and per-device I/O statistics. You might need to install the sysstat package for this.
  • who and w: Provide information about the users logged in; w gives a more detailed report.
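A few usage examples for these commands; the intervals, counts and paths are only illustrative:

vmstat 5 3        # three samples, five seconds apart
df -h             # file system usage with human-readable sizes
du -sh /var/log   # summarized size of one directory
iostat -x 2       # extended device statistics every two seconds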

Tuesday, January 18, 2011

GAWK Notes

Here are some bits and pieces I have picked up about GAWK these days. This post is somewhat rough; you might need to study the topic in detail to get a good grasp of it.
Gawk is, essentially, a tool to process text data. It can act like simple commands such as cat or grep, as well as build powerful scripts and data filters. A gawk instruction generally consists of a pattern and an action. It can operate on text files as well as standard input. A gawk command looks like:
$ gawk 'pattern {action}'

Example 1: to look for the name "aditya" in a contacts file, you can run:
$ gawk '/aditya/{print}' contacts.txt

Here gawk looks for the pattern "aditya" and, if it is found, prints the line, as specified in the action.

Example 2: to check out the products for which you have to pay some amount plus .99 bucks, like 9.99, you can do:
$ gawk '/[0-9]*\.99/{print}' prices.txt

[] matches a class of characters. * is used for repeated matches, so that we can match 9.99 as well as 11.99. The period (.), plus (+) and question mark (?), none of which are used in this example, match a single character, one or more repetitions, and zero or one repetition, respectively.

Example 3: to check out home or office contacts I can do the following:
$ gawk '/home|office/ {print}' contacts.txt

Here the pipe (|) acts as an OR.

Example 4: to print a particular word of each line, or the entire line, do the following:
$ gawk '/aditya/ {print $2; print $0}' contacts.txt

$n prints the nth word of the line, while $0 prints the entire line. NR and NF are special variables which hold the number of the current record and the number of fields in the current record, respectively.

Example 5: to check out the length of each line:
$ gawk '{print length($0), $0}' contacts.txt

Some string functions:
  • length(str): returns the number of characters in the string.
  • index(str1, str2): returns the position in str1 where str2 begins (0 if str2 does not occur).
  • split(str, arr, delim): splits str into segments separated by the delimiter, copies them into the array arr, and returns the number of elements.
  • substr(str, pos, len): returns the substring of str starting at pos, of length len.
  • match(str, pattern): returns the position of the first match of pattern in str (0 if there is none).
  • sub(pattern, replacement, str) and gsub(pattern, replacement, str): perform a substitution on string str; sub replaces only the first occurrence of pattern with the replacement string, while gsub (global sub) replaces every occurrence.
  • toupper(str) and tolower(str): obvious.
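A short example tying a few of these together; the substitution is purely illustrative:
$ gawk '{ gsub(/office/, "work"); print NR": "$0 }' contacts.txt

This replaces every occurrence of "office" on each line with "work" and prints the line prefixed with its record number (NR).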