
7.12. Analyzing Web and FTP Logs

Fedora provides the Webalizer tool for analyzing Apache and vsftpd logfiles, but the default configuration works only with the default Apache virtual host. With a few minutes of configuration, Webalizer can analyze the logfiles of all of your Apache virtual hosts as well as your vsftpd server.

7.12.1. How Do I Do That?

The default configuration for Webalizer analyzes the default Apache logfile at 4:02 a.m. each day, as long as that logfile is not empty. The results can be read by using a browser on the same machine and accessing http://localhost/usage/, which displays the report page. A sample report page is shown in Figure 7-30.

Figure 7-30. Webalizer web usage report


7.12.1.1. Analyzing virtual host logfiles

This configuration assumes that your Apache virtual host logfiles are named /var/log/httpd/<virtualhostname>-access_log and are in combined format.


To configure Webalizer to analyze your virtual host logfiles each day, create the file /etc/cron.daily/00webalizer-vhosts:

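#! /bin/bash
# update access statistics for virtual hosts

CONF=/etc/httpd/conf/httpd.conf

# Extract the virtual host name from each uncommented CustomLog line
for NAME in $(sed -n "s=^[^#]*CustomLog logs/\([^ ]*\)-.*=\1=p" $CONF)
do

    # Create the report directory if it does not already exist
    mkdir -p /var/www/usage/$NAME
    chmod a+rx /var/www/usage/$NAME

    LOG=/var/log/httpd/${NAME}-access_log

    # Analyze the logfile only if it exists and is not empty.
    # Do not use exec here: it would replace the shell and end the loop
    # after the first virtual host.
    if [ -s $LOG ]
    then
        /usr/bin/webalizer -Q -o /var/www/usage/$NAME $LOG
    fi

done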

Make this file readable and executable by root:

# chmod u+rx /etc/cron.daily/00webalizer-vhosts
               

Next, edit /etc/webalizer.conf and place a pound-sign character (#) at the start of the HistoryName and IncrementalName lines to comment them out:

#HistoryName    /var/lib/webalizer/webalizer.hist
...(Lines snipped)...
#IncrementalName        /var/lib/webalizer/webalizer.current

This will ensure that a separate analysis history is maintained for each virtual host.
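If you prefer not to edit the file by hand, the same change can be made with sed. The following is a minimal sketch that works on a temporary sample file; the sample contents are a hypothetical excerpt, not the full shipped configuration, and you should back up /etc/webalizer.conf before applying the same sed expressions to it.

```shell
#!/bin/bash
# Sketch: comment out the HistoryName and IncrementalName lines.
# SAMPLE is a hypothetical excerpt standing in for /etc/webalizer.conf.

SAMPLE=$(mktemp)
cat > "$SAMPLE" <<'EOF'
LogFile        /var/log/httpd/access_log
HistoryName    /var/lib/webalizer/webalizer.hist
Incremental    yes
IncrementalName        /var/lib/webalizer/webalizer.current
EOF

# Prefix a pound sign to lines that start with either keyword.
# "Incremental    yes" is left alone because it does not start
# with the full word IncrementalName.
RESULT=$(sed -e 's/^HistoryName/#&/' -e 's/^IncrementalName/#&/' "$SAMPLE")

echo "$RESULT"
rm -f "$SAMPLE"
```

Lines that are already commented out are not matched, so running the expressions a second time does not add another pound sign.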

The virtual host logfiles will be analyzed every morning at 4:02 a.m., and the reports will be accessible at http://localhost/usage/<virtualhostname>.

7.12.1.2. Analyzing the FTP logfile

To analyze the vsftpd logfile each day, create the file /etc/cron.daily/00webalizer-ftp:

#! /bin/bash
# update access statistics for ftp

if [ -s /var/log/xferlog ]; then
   exec /usr/bin/webalizer -Q -F ftp -o /var/www/usage/ftp /var/log/xferlog 
fi

Make this file readable and executable by root:

# chmod u+rx /etc/cron.daily/00webalizer-ftp
               

Then create the directory /var/www/usage/ftp:

# mkdir /var/www/usage/ftp
# chmod a+rx /var/www/usage/ftp
               

Make sure that you have made the changes to /etc/webalizer.conf noted previously.

Your FTP usage statistics will now be analyzed each day at 4:02 a.m. along with your web statistics. The reports will be accessible at http://localhost/usage/ftp.

7.12.1.3. Accessing the usage statistics from another location

It's often inconvenient to access the usage statistics from the same machine that is running Apache. To make the statistics password-protected and accessible from any system, edit the file /etc/httpd/conf.d/webalizer.conf to look like this:

#
# This configuration file maps the Webalizer log-analysis
# results (generated daily) into the URL space. By default
# these results are only accessible from the local host.
#
Alias /usage /var/www/usage

<Location /usage>
    Order deny,allow
    Allow from ALL
    AuthType            Basic
    AuthName            "usage statistics"
    AuthUserFile        /var/lib/webalizer/passwd
    Require             valid-user
</Location>

Create the password file with the htpasswd command:

# htpasswd -c /var/lib/webalizer/passwd chris
New password: NeverGuess
Re-type new password: NeverGuess
Adding password for user chris

The SELinux context of the directory containing the password file must be changed in order for this to work:

# chcon -t httpd_sys_content_t /var/lib/webalizer/
                  


The statistics reports should now be accessible using a web browser on any computer.

7.12.2. How Does It Work?

The script /etc/cron.daily/00webalizer is started once a day (at around 4:02 a.m.) by crond. This script in turn starts up Webalizer; the default configuration file (/etc/webalizer.conf) is preset to analyze the main Apache logfile (/var/log/httpd/access_log) and place the results in /var/www/usage.

The script file 00webalizer-vhosts obtains the virtual host log filenames from /etc/httpd/conf/httpd.conf and runs Webalizer on each logfile after the main logfile has been processed. 00webalizer-ftp does the same thing for the vsftpd logfile, /var/log/xferlog.
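To see which names the sed expression in 00webalizer-vhosts picks up, you can run it against a small test file. The sketch below uses a hypothetical configuration with made-up hostnames; it is meant only to illustrate the matching behavior, not to reproduce a real httpd.conf.

```shell
#!/bin/bash
# Sketch: show the names extracted from CustomLog lines.
# The configuration and hostnames below are hypothetical.

CONF=$(mktemp)
cat > "$CONF" <<'EOF'
CustomLog logs/access_log combined
<VirtualHost *:80>
    ServerName www.example.com
    CustomLog logs/www.example.com-access_log combined
</VirtualHost>
<VirtualHost *:80>
    ServerName blog.example.com
    # CustomLog logs/old.example.com-access_log combined
    CustomLog logs/blog.example.com-access_log combined
</VirtualHost>
EOF

# Same expression as in the cron script: print the part of the logfile
# name that precedes the last hyphen, skipping lines with a # before
# the CustomLog directive.
NAMES=$(sed -n 's=^[^#]*CustomLog logs/\([^ ]*\)-.*=\1=p' "$CONF")

echo "$NAMES"
rm -f "$CONF"
```

The main logfile (logs/access_log) contains no hyphen, so it is not matched and is left to the stock 00webalizer script. The commented-out CustomLog line is skipped as well, leaving only the two active virtual hosts.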

The web directory /var/www/usage is initially protected by the file /etc/httpd/conf.d/webalizer.conf so that Apache will serve it only to a browser running on the same computer.

Webalizer analyzes web and FTP logfiles to determine usage patterns; it can process the Apache common and combined logfile formats, and the wu-ftpd logfile format (which is the same format used by vsftpd). It stores the generated statistics for the last year in the file webalizer.hist, and stores partial statistics for the current reporting period (month) in the file webalizer.current. The data from previous runs of the program is retrieved from those files and combined with data from the current logfile to generate the reports. By default, webalizer.hist and webalizer.current are stored in /var/lib/webalizer; commenting out the HistoryName and IncrementalName lines in the configuration file causes these files to be stored in the output directories instead, so that each report has its own, separate copy of these files.
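After the cron scripts have run at least once, you can confirm that each report directory is keeping its own history. The following sketch defines a hypothetical helper function, list_state, that checks a report root and each of its subdirectories for a webalizer.hist file; it assumes the directory layout used in this section.

```shell
#!/bin/bash
# Sketch: report which output directories hold their own Webalizer history.
# list_state is a hypothetical helper, not part of Webalizer.

list_state() {
    local root=$1 dir
    # Check the report root itself, then each subdirectory beneath it
    for dir in "$root" "$root"/*/
    do
        [ -d "$dir" ] || continue
        dir=${dir%/}    # strip the trailing slash left by the glob
        if [ -f "$dir/webalizer.hist" ]
        then
            echo "$dir: history present"
        else
            echo "$dir: no history yet"
        fi
    done
}
```

Run it as list_state /var/www/usage. A directory reported as having no history yet has either not been processed or had an empty logfile on every run so far.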

The generated reports are saved as HTML pages and PNG graphics.

7.12.3. Where Can I Learn More?

