###################################################################
# logsplitter v0.2.2 (c)2006-2007 Tim Jackson (tim@timj.co.uk)    #
###################################################################

INTRODUCTION
This is a PHP-based splitter for Squid and Pound logs. It is intended primarily
for situations where Squid is being used as a reverse proxy (HTTP accelerator)
or where Pound is being used as a load balancer.
It takes the normal Squid/Pound access log as input and splits it into output
files, each of which looks like an Apache "combined" access log. For Squid
logs, the Squid hit/miss information is added on the end of each line, because
it is useful for other purposes such as producing hit/miss statistics.

For security purposes, and so that related logs can be grouped, the valid
hostnames which you expect to see in the access log must be defined in a
"hostname list". This is a text file with one entry per line. To group
multiple hostnames into a single output file (for example www.example.com and
subdomain.example.com), separate them with spaces on the same line. Simple
wildcards are supported using asterisks.

Example:

===== example hostname list file
www.example.com
www.example.net
www.example.org *.example.org *.otherdomain.example.com
=====

It is anticipated that this file will normally be automatically generated by
some external tool.
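
As a rough illustration of how such a list can be interpreted, here is a
minimal PHP sketch. It is not taken from Text/LogSplitter.php: the function
names parseHostnameList() and findGroup() are hypothetical, and naming each
group after the first hostname on its line is an assumption made purely for
the example.

```php
<?php
// Illustrative sketch only; hypothetical helpers, not logsplitter's own code.

/**
 * Parse hostname-list text into a map of pattern => group name.
 * Assumption: the first entry on each line names the group.
 */
function parseHostnameList(string $text): array
{
    $map = [];
    foreach (preg_split('/\R/', $text) as $line) {
        // Entries on a line are separated by whitespace.
        $patterns = preg_split('/\s+/', trim($line), -1, PREG_SPLIT_NO_EMPTY);
        if (!$patterns) {
            continue; // skip blank lines
        }
        $group = strtolower($patterns[0]);
        foreach ($patterns as $pattern) {
            $map[strtolower($pattern)] = $group;
        }
    }
    return $map;
}

/**
 * Return the group for a hostname, or null if the hostname is not listed.
 * An asterisk in a pattern matches any run of characters.
 */
function findGroup(array $map, string $host): ?string
{
    $host = strtolower($host);
    foreach ($map as $pattern => $group) {
        // Quote the pattern, then turn each escaped "*" back into ".*".
        $regex = '/^' . str_replace('\*', '.*', preg_quote($pattern, '/')) . '$/';
        if (preg_match($regex, $host)) {
            return $group;
        }
    }
    return null;
}
```

With the example list above, this sketch would place "images.example.org" in
the "www.example.org" group and return null for an unlisted hostname such as
"evil.example.net".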


INSTALLATION
There are only two files to install:
- logsplitter (which is the file you run from the command line)
- Text/LogSplitter.php (main logic in a class)

Out of the box, it will run from the directory you extracted it into.

You will need to install the PEAR package Console_Getopt for the main
logsplitter CLI program to work.


CONFIGURATION
There is a simple configuration file (logsplitter.ini) which defines the
input and output file locations. A commented example config
(logsplitter.ini.sample) is provided.

By default, logsplitter reads from the configuration file 
"logsplitter.ini".


USAGE
Simply run "logsplitter" on the command line. It will read the defined 
config file and process the defined access log file, outputting as it goes.
To get some summary statistics, use the "-v" (verbose) option.
To override which config file is to be used, use the "-c" (config file) option.

Examples:

$ logsplitter -v
5 lines in 0.00 seconds (12314 lines/sec, 2283 kbytes/sec)

$ logsplitter -v -c /path/to/myconfig.ini
5 lines in 0.00 seconds (12314 lines/sec, 2283 kbytes/sec)


NOTES ABOUT CANONICALISATION
logsplitter canonicalises all hostnames found in input files to lowercase. It
also removes any port number specifications from the hosts logged in the file.
(The assumption is that you are not going to run two completely different
sites, which need to be treated separately for logging purposes, on
http://www.example.com/ and http://www.example.com:1234/.)
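
The two canonicalisation steps described above can be sketched in PHP as
follows. This is an illustration under stated assumptions, not the code from
Text/LogSplitter.php: canonicaliseHost() is a hypothetical name, and the sketch
assumes the port, if present, is a trailing colon followed by digits.

```php
<?php
// Illustrative sketch only; the real implementation may differ.

/**
 * Canonicalise a hostname as described in the notes above:
 * strip any ":port" suffix and convert to lowercase.
 * Assumption: bare (unbracketed) IPv6 literals are not handled.
 */
function canonicaliseHost(string $host): string
{
    // Remove a trailing colon-plus-digits port specification, if any.
    $host = preg_replace('/:\d+$/', '', trim($host));
    return strtolower($host);
}
```

Under this sketch, "WWW.Example.COM:1234" and "www.example.com" both become
"www.example.com", so their requests would end up in the same output file.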