topvhost - A Virtual Host Monitor

Introduction

The Apache mod_status module provides very detailed real-time server information, but it gives little sense of how that activity is spread across the virtual host configuration. I once wrote a PHP prototype that built a list of virtual host log files using glob(), scanned those files every second to collect each last-modified time, and listed the files and timestamps in descending order. It was neither pretty nor efficient, but it provided the inspiration for a better server monitor.

Linux provides a very efficient mechanism (inotify) for monitoring file system changes, so I set about creating a curses application to reproduce my prototype. To be generally useful, the file list specification had to be flexible enough to account for the various ways virtual host logging is set up. I parameterized and slightly generalized the glob() mechanism and added an explicit list mechanism. A configuration file, "~/.topvhosts", holds these and other parameters.

The efficiency of this approach allowed me to add information extracted incrementally from each log file by scanning only the records added since the file last changed. The record count is used as a proxy for "hits", and the fields from the last parsed record can be displayed to provide "almost" real-time information.
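
The incremental scan described above can be sketched roughly as follows. This is illustrative Python, not the application's source: the class name LogTail and its attributes are invented for the example, records are assumed to be newline terminated, and the sketch is driven by explicit poll() calls rather than by inotify events.

```python
import os


class LogTail:
    """Track one log file: a running record count and the last record seen."""

    def __init__(self, path, from_start=False):
        self.path = path
        self.hits = 0
        self.last_record = None
        # Start at the end of the file unless existing records should be counted.
        self.offset = 0 if from_start else os.path.getsize(path)

    def poll(self):
        """Read only the bytes appended since the previous call."""
        with open(self.path, "rb") as f:
            f.seek(self.offset)
            chunk = f.read()
        # Consume complete lines only; a partial trailing line waits for the next poll.
        end = chunk.rfind(b"\n") + 1
        lines = chunk[:end].splitlines()
        self.offset += end
        self.hits += len(lines)
        if lines:
            self.last_record = lines[-1].decode("utf-8", "replace")
        return len(lines)
```

Keeping a byte offset per file means each poll costs only as much as the data appended since the last one. The sketch does not handle log rotation or truncation, which a real monitor would need to detect.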

Installation

The application is currently provided as a source tarball released under the GPL. In order to build the application the following are required:

The build uses GNU autotools so the installation follows the typical pattern of expanding the tarball, changing your directory to the main distribution directory, then executing "./configure", "make", and "make install". You must configure your application before use - see the next section.

Configuration

The configuration file is named '.topvhosts' and is stored in the home directory. A prototype of this file is found in the main distribution directory. This file should be customized and placed in the user's home directory (it is hoped that this step may become an additional make target in the future). The general features of this "INI" style text file are:

The currently recognized configuration settings are:

default      The name of the configuration file section used if none is specified on the command line
hdr_format   A sprintf-like format string that describes the topmost line of the display. See details below.
row_format   A sprintf-like format string that describes the other lines of the display. See details below.
log_format   A sprintf-like format string that describes the format of the log records. If not specified, the Apache default will be used. See details below.
glob_src     A path containing '%s' as a placeholder for the domain name. The placeholder will be replaced by '*' and expanded by glob() to produce a file list
glob_omit    The extension of any files to be removed from the list generated by glob()
config_src   The name of a section in the configuration file whose assignments will be added to the file list. The left-hand value is presumed to be the domain name and the right-hand value is presumed to be the complete path to the log file
file_src     The full path to an external configuration file whose topmost assignments use the same assignment format as used by config_src
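
As a rough illustration of how glob_src and glob_omit could combine, the following Python sketch replaces the '%s' placeholder with '*', expands the pattern, drops files ending in the omitted extension, and recovers the domain name from the part of each path that matched the wildcard. The function name expand_glob_src is hypothetical, and treating the wildcard match as the domain name is an assumption based on the setting descriptions above.

```python
import glob


def expand_glob_src(glob_src, glob_omit=None):
    """Return {domain: path} for every file matching the glob_src pattern."""
    prefix, _, suffix = glob_src.partition("%s")
    files = {}
    # Escape the literal parts so only the placeholder acts as a wildcard.
    for path in glob.glob(glob.escape(prefix) + "*" + glob.escape(suffix)):
        if glob_omit and path.endswith(glob_omit):
            continue
        # The text matched by '*' is taken to be the domain name.
        domain = path[len(prefix):len(path) - len(suffix)]
        files[domain] = path
    return files
```

With glob_src=/var/log/httpd/domains/%s.log and glob_omit=.error.log, a file such as example.com.error.log matches the pattern but is filtered out, so only the access log for example.com remains in the list.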

The list of files monitored by the application is obtained by merging the *_src specifications. The sample configuration file included with the package is shown below:

# Configuration for topvhost
#
#
default=DA
hdr_format=%02m-%02d %02H:%02M:%02S Elapsed: %t Hosts: %3n Hits: %6o Read: %#I\n|
row_format=%02m-%02d %02H:%02M:%02S %8o %18h %v %3s %40r\n
[DA]
glob_omit=.error.log
glob_src=/var/log/httpd/domains/%s.log
log_format=%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %I %O
[Plesk]
glob_src=/var/www/vhosts/%s/statistics/logs/access_log
[Main]
config_src=MainFiles
[MainFiles]
main_log=/var/log/httpd/access_log
[Example]
glob_omit=.error.log
glob_src=/var/log/httpd/domains/%s.log
log_format=%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %I %O
row_format=%02m-%02d %02H:%02M:%02S %8o %18h %v %3s %40r\n%40{Referer}i %40{User-Agent}i\n

The sections in the sample include:

[DA]        A sample configuration for virtual hosts managed by the Direct Admin control panel
[Plesk]     A sample configuration for virtual hosts managed by the Plesk control panel
[Main]      A sample configuration with log files enumerated in the [MainFiles] section
[Example]   A sample Direct Admin section with a multi-row record layout
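
The way a section is selected and combined with the top-level assignments can be sketched in Python as below. This is not the application's parser. It assumes that top-level values such as row_format act as defaults that a section may override, which the sample suggests but the text does not state, and the function name load_settings and the placeholder section name "__top__" are invented for the example.

```python
import configparser

SAMPLE = """\
default=DA
row_format=%8o %v\\n
[DA]
glob_src=/var/log/httpd/domains/%s.log
[Example]
glob_src=/var/log/httpd/domains/%s.log
row_format=%8o %v %40r\\n
"""


def load_settings(text, select=None):
    """Merge the top-level assignments with those of one named section."""
    # Interpolation is disabled because the values are full of '%' tokens.
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",),
                                       comment_prefixes=("#",))
    parser.optionxform = str  # keep option names exactly as written
    # configparser requires a section header, so the leading assignments are
    # wrapped in a placeholder section (the name is made up for this sketch).
    parser.read_string("[__top__]\n" + text)
    settings = dict(parser["__top__"])
    section = select or settings["default"]
    settings.update(parser[section])
    return settings


print(load_settings(SAMPLE, "Example")["row_format"])
```

Calling load_settings(SAMPLE) without a section name falls back to the section named by the default setting, which mirrors the role of the -s option described under Execution.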

Log and display fields are treated as named columns, and a display row is built by transferring log columns to row columns. It makes sense to use the same nomenclature to describe both the source and the destination of this transfer. The source record is specified by Apache LogFormat syntax, so the destination (the display row) is described by a similar syntax.

The token formats are summarized by the following tabulation:

Token      Value          log  row  hdr  Note
%B         bytes          x    w         response excluding headers
%{name}C   cookie         x    w         use row entry "%<width>{name}C"
%H         hour                n    n
%I         read                     n    Bytes read since start - use '#' for human format
%M         minute              n    n
%S         second              n    n
%T         elapsed                  n    Seconds since start
%U         url            x    w
%Y         year                n    n
%b         bytes          x    w         response excluding headers (CLF)
%d         day                 n    n
%{name}e   environment    x    w         use row entry "%<width>{name}e"
%h         remote host    x    w         Output right aligned and truncated/padded to width
%{name}i   header         x    w         use row entry "%<width>{name}i"
%m         month               n    n
%n         count (hosts)            n    Total number of virtual hosts
%o         count (hits)        n    n    Record count
%r         request        x    w         Output left aligned and truncated/padded to width
%s         status         x    w
%t         elapsed                  *    "hh:mm:ss" since start
%u         user           x    w         auth only
%v         domain name         *         Output left aligned and truncated/padded to width

Legend: x = extracted from log record
        n = numeric specification (0-+# )*[\d\.]+
        w = width specification \d+
        * = left aligned, fixed width

log_format defaults to '%h %l %u %t \"%r\" %>s %b' if not provided.

Currently the fixed width requirement is not enforced for those fields marked 'n' above. For those fields marked 'w' above, the format specification is converted to an integer that determines the fixed width of the field. The distributed configuration file shown above provides sample syntax.
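
To make the token syntax concrete, here is a small Python sketch that splits a row or header format string into literal text and tokens, plus a helper that truncates or pads a value to a fixed width. The regular expression is assembled from the legend's patterns and is an assumption about the grammar, not the application's parser, and the names tokenize and fit are hypothetical. It targets display formats only, so Apache modifiers such as the '>' in '%>s' are not covered.

```python
import re

# '%', an optional numeric or width specification, an optional {name},
# then a single token letter.
TOKEN = re.compile(
    r"%(?P<spec>[0\-+# ]*[\d.]*)(?:\{(?P<name>[^}]*)\})?(?P<letter>[A-Za-z])"
)


def tokenize(fmt):
    """Split a format string into literal text and (spec, name, letter) tokens."""
    parts, pos = [], 0
    for m in TOKEN.finditer(fmt):
        if m.start() > pos:
            parts.append(fmt[pos:m.start()])  # literal text between tokens
        parts.append((m.group("spec"), m.group("name"), m.group("letter")))
        pos = m.end()
    if pos < len(fmt):
        parts.append(fmt[pos:])
    return parts


def fit(text, width, right=False):
    """Truncate or pad text to exactly width characters."""
    text = text[:width]
    return text.rjust(width) if right else text.ljust(width)
```

For a 'w' field such as %18h, the spec string "18" would be converted with int() and passed to fit() with right=True, matching the right-aligned remote host column in the table.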

Execution

Invocation arguments (courtesy of getopt) are:

-d            Diagnostic for output format
-f --file     Specify an alternate configuration file
-i --init     Initialize hit counts by reading the log files. The default is to start the hit counts at zero and only read records added after startup
-s --select   Specify the configuration section
-v --verify   Display the list of log files and the initial record displayed and then exit

A sample command line for someone running the Direct Admin panel who wants to smoke test the just-built application from the build directory would be:

src/topvhost -f./.topvhost -sExample
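
The command line above attaches each option value directly to its flag, which getopt-style parsing accepts. The following Python sketch uses the standard getopt module to show how the documented options would be recognized. It is an illustration only: the option string is inferred from the list above (with -f and -s taking a value), and the function name parse_args and the result keys are hypothetical.

```python
import getopt


def parse_args(argv):
    """Parse the documented options and return a dict of those supplied."""
    # 'f:' and 's:' take a value; 'd', 'i' and 'v' are plain switches.
    opts, _ = getopt.getopt(argv, "df:is:v",
                            ["file=", "init", "select=", "verify"])
    names = {"-d": "diagnostic",
             "-f": "file", "--file": "file",
             "-i": "init", "--init": "init",
             "-s": "select", "--select": "select",
             "-v": "verify", "--verify": "verify"}
    # Switches carry an empty value, so record them as True.
    return {names[flag]: (value or True) for flag, value in opts}


print(parse_args(["-f./.topvhost", "-sExample"]))
```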

The execution main loop consists of checking the keyboard for input, reading any log files that have changed, displaying the result, and suspending for 1 second. The following keys are recognized:

b B   Previous screen - also Page Up if keypad recognized
f F   Next screen - also Page Down if keypad recognized
h H   Order display by descending hit count
q Q   Quit the display
r R   Force a display refresh
t T   Order the display by descending access time
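
The ordering and paging keys can be pictured with a short Python sketch. The Host record and the order and page functions are hypothetical names introduced for illustration. The application's own data structures are not described here, so this only shows the sorting and slicing the keys imply.

```python
from dataclasses import dataclass


@dataclass
class Host:
    domain: str
    hits: int
    last_access: float  # seconds since the epoch


def order(hosts, key):
    """Return hosts sorted for display according to the ordering key pressed."""
    if key.lower() == "h":
        return sorted(hosts, key=lambda h: h.hits, reverse=True)
    if key.lower() == "t":
        return sorted(hosts, key=lambda h: h.last_access, reverse=True)
    return list(hosts)


def page(hosts, number, rows):
    """Return the hosts shown on screen `number` (0-based), `rows` per screen."""
    return hosts[number * rows:(number + 1) * rows]
```

Under this model the 'f' and 'b' keys would simply increment or decrement the screen number passed to page(), clamped to the available range.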

The application catches SIGINT and terminates. SIGKILL also ends it, although no process can catch that signal. Some minor display glitches can be fixed by resizing the terminal window.

Current Status

Please use the 0.5 package. Users of versions prior to 0.4 should note that there have been incompatible changes in layout syntax which require adjustment of an existing configuration file. See the distributed ChangeLog.

Obviously, this application is not suitable for servers with many busy virtual hosts. I do have some ideas on how to better deal with higher loads by better using the inotify queue but that will have to wait until I get some free time again.

Download:   topvhost-0.5.tar.gz Aug 27 11:30 100K
Add man(1) page, minor log input parsing and error handling corrections.
Download:   topvhost-0.4.tar.gz Aug 15 09:56 100K
Add pagination, configuration and other changes for DRY and Ubuntu compilation
Download:   topvhost-0.3.tar.gz Aug 14 07:10 100K
Restructure application, remove C cruft, add exception handling. Fix most of the display gremlins
Download:   topvhost-0.2.tar.gz Aug 11 16:48 100K
Correct format errors
Download:   topvhost-0.1.tar.gz Aug 10 13:06 100K
Initial release

 

gary at issiweb dot com