Apache Weblog Analysis
Whether you run your own blog or web server or use a hosted service, at some point you may want to
know how your server and your users are doing. Metrics such as hit frequency, geolocation of users,
and the distribution of bandwidth used are very useful for this and can be obtained in different
ways:
- by instrumenting the page running inside the client browser (e.g. Piwik)
- by analyzing the web server logs (e.g. Webalizer)
For the latter I have used Webalizer for several years; it produces nice web-based analysis
plots. More recently I moved to a more complicated server environment with several virtual web
services, and I found its configuration and data selection options a bit limiting. Hence I started,
as a toy project, to implement the same functionality with a set of simple R scripts, which I will
progressively share here.