Understanding Web Logs
Mike de Sousa, Director, AbleStable

Every time you visit a website your every move is being
recorded. Although no personally identifiable information
is gathered during this process (your name, email
address etc), your entry point, the paths you travel,
the pages you browse, the duration of your stay, and
your exit point are all saved in a web log file for
later analysis. Discover how this benefits you and
how logs can help make websites a whole lot better...
A Sea of Information

Websites everywhere produce web logs, records
of all the activity that takes place on the website
from a server's perspective. Back in the dim and distant
past of the Internet some ten years ago it cost thousands
to 'mine' these logs for the valuable information
they contained about user patterns. Today website
owners can reap the benefits of analysing their logs
using accessible, inexpensive software and scripts.
Website log files are generated by a web server, and
contain a record of website activity. Every time a
person visits a website, a log file is updated with
the visitor's information by the web server. A single
HTML page usually includes many graphics and other
associated files, and therefore results in many entries
in the log files which can be downloaded and used
to generate useful statistics. This is commercially
valuable data in that it provides information about
who is visiting a website, how they got there, where
they go, and what files they request. In turn this
gives the webmaster information about how their site
is used and can be a powerful tool to restructure
and improve a website's overall performance.
Web logs are usually saved in a 'raw' text form that
is all but impossible to read without the aid of a
log analyzer. The job of a log analyzer is to demystify
these logs and present the information in meaningful
ways. Logs present their information in a 'string'
of text. The format of the 'common' log file follows:
%S %j %u [%d/%M/%Y:%h:%n:%o%w%j] "%e%w%r%wHTTP%j" %c %b
%S %j %u [%d/%M/%Y:%h:%n:%o%w%j] "%e%w%r" %c %b
%S %j %u [%d/%M/%Y:%h:%n:%o%w%j] "%r" %c %b
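To make the idea of a 'string' of text concrete, here is a minimal Python sketch that splits one log line into its fields. It assumes the widely used NCSA Common Log Format layout (host, identity, user, timestamp, request, status code, bytes sent) rather than the analyzer-specific placeholders shown above, and the sample line, the pattern, and the function name parse_clf_line are illustrative only. It also assumes an English locale for the month abbreviation.

```python
import re
from datetime import datetime

# Assumed layout: host ident user [timestamp] "request" status bytes
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_clf_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    # The request may be "METHOD path PROTOCOL", "METHOD path", or just "path".
    parts = entry["request"].split()
    if len(parts) > 1:
        entry["method"], entry["path"] = parts[0], parts[1]
    else:
        entry["method"] = None
        entry["path"] = parts[0] if parts else None
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    # A "-" in the size field means no bytes were sent.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry


# Made-up sample line for illustration.
sample = '192.0.2.10 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326'
entry = parse_clf_line(sample)
print(entry["host"], entry["path"], entry["status"], entry["size"])
```

A real log analyzer does far more than this, but every report it produces starts from fields pulled out of each line in roughly this way.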
Client and Server

There are essentially two kinds of log analyzer
available. Those that carry out their tasks on the
web server (usually script based applications),
and those that are exclusively client-based and
used on a local computer. Some commercial software
products deliver both client-based and server-based
log analysis solutions. Some log analyzers are free,
others cost a great deal of money.
If you're settling for a client-based log analyzer,
make sure it features automatic log file format
detection and log compression support (that it can
import compressed files then recompress them after
they have been processed). The ability to export
and print any reports may also be a crucial feature
to those who need to provide evidence of their website
statistics to advertisers, sponsors, clients and
so on.
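As a rough illustration of what log compression support involves, the following Python sketch opens a log file whether or not it has been gzip-compressed. It is only a sketch: it assumes gzip is the compression used, and the function name open_log and the file name in the usage comment are hypothetical.

```python
import gzip


def open_log(path):
    """Open a log file as text, handling gzip compression transparently."""
    # gzip files start with the two magic bytes 0x1f 0x8b.
    with open(path, "rb") as probe:
        magic = probe.read(2)
    if magic == b"\x1f\x8b":
        return gzip.open(path, "rt", encoding="utf-8", errors="replace")
    return open(path, "r", encoding="utf-8", errors="replace")


# Usage (hypothetical file name):
# with open_log("access.log.gz") as log:
#     for line in log:
#         ...
```

Checking the first bytes rather than the file extension means a compressed log is still recognised if it has been renamed.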
It may well be that one analyzer provides more information
than another but delivers it in a less than elegant
way. The graphs and visual aids that deliver summaries
of log activities are particularly useful, but the
quality of these visual aids is not always of a
high standard. When choosing a log analyzer take
time to view the sample reports and consider whether
you'd be happy to use these as presentational materials
in a professional context.
Log Location

Finding your log files is usually a simple process.
Connect to your web space via FTP entering the username
and password your host provided, then browse your
folder tree until you find your log folder. This
usually sits at the root or thereabouts, but some
hosts hide this away a little. It's all down to
the way the host configures their server. If you're
uncertain exactly where your log files are located,
your web administrator or hosting company will be
happy to tell you.
Log Formats

There are a number of different log file formats
and it's crucial you use a log analyzer that can
read the appropriate format delivered by your server.
The most common formats are:
• Common Access Log Format
• W3C Extended
• Apache/NCSA Combined
• Microsoft IIS
• or a Custom format specific to a particular server
Jumping to Conclusions

Website owners may be aware of their log files but
often have little or no idea what to do with them.
Most hosts now provide a 'stats package' of some
kind but these vary greatly in quality. Many will
simply show a top ten of various hits (top ten entry
pages, top ten exit pages etc) and are of relatively
limited value as compared with a comprehensive professional
log analyzer.
Guard against false conclusions when viewing log
files for the first time. A superficial reading
may seem to indicate one thing when in fact something
else is going on. For instance, a unique user is
determined by their IP address. By default, a visit
session is terminated when a user falls inactive
for more than 30 minutes. So a single user may visit
your website twice in a day and be reported as two
visits. It is tempting to read the number of visits
as the number of unique visitors, and so assume you're
getting far more unique visitors than you actually are. As a
general rule use the logs as a guide rather than
cast iron evidence of actual website usage.
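The difference between unique users and visits can be sketched in a few lines of Python. This is an illustration under the assumptions stated above (a user is identified by IP address, and a visit ends after 30 minutes of inactivity), not the method of any particular analyzer. The function name count_visits and the sample requests are made up.

```python
from collections import defaultdict
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)  # assumed default inactivity limit


def count_visits(requests):
    """requests: iterable of (ip, datetime). Returns {ip: number_of_visits}."""
    by_ip = defaultdict(list)
    for ip, when in requests:
        by_ip[ip].append(when)
    visits = {}
    for ip, times in by_ip.items():
        times.sort()
        count = 1
        # A gap longer than the timeout between two requests starts a new visit.
        for previous, current in zip(times, times[1:]):
            if current - previous > SESSION_TIMEOUT:
                count += 1
        visits[ip] = count
    return visits


requests = [
    ("192.0.2.10", datetime(2024, 1, 1, 9, 0)),
    ("192.0.2.10", datetime(2024, 1, 1, 9, 10)),
    ("192.0.2.10", datetime(2024, 1, 1, 14, 0)),
    ("198.51.100.7", datetime(2024, 1, 1, 9, 5)),
]
visits = count_visits(requests)
print(len(visits), "unique IP addresses,", sum(visits.values()), "visits")
# prints: 2 unique IP addresses, 3 visits
```

Here two IP addresses produce three visits. Bear in mind that several people can share one IP address, and one person can appear under several, which is another reason to treat these figures as a guide.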
The most valuable questions your logs will provide
answers to are:
• Where do visitors arrive from?
• Where do visitors enter?
• Where do visitors go?
• Do visitors make it to the pages you want them to see the most?
• Do visitors get into a loop on certain pages?
• Are there too many clicks for people to get to the information they want to see?
• Where do visitors exit?
What Logs Record

The basic statistics tracked by logs are:
• Visitors and Page views - per hour/day/week
• Page Counts - the number of times a page was viewed
• Entry Pages - pages that visitors enter your site on
• Exit Pages - the last page a visitor viewed on your site
• Referrers - where your visitors came from such as Google or any other link
• Search Phrases - words used on the search engines to find your site
• Other Stats - browsers used and geographic locations
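Several of these basic statistics fall out of a simple tally once requests have been grouped into visits. The Python sketch below counts page views, entry pages, and exit pages. It assumes each visit has already been reduced to an ordered list of page paths, and the function name page_statistics and the sample data are hypothetical.

```python
from collections import Counter


def page_statistics(visits):
    """visits: list of visits, each an ordered list of page paths."""
    page_counts, entry_pages, exit_pages = Counter(), Counter(), Counter()
    for pages in visits:
        if not pages:
            continue
        page_counts.update(pages)    # every page view
        entry_pages[pages[0]] += 1   # first page of the visit
        exit_pages[pages[-1]] += 1   # last page of the visit
    return page_counts, entry_pages, exit_pages


sample_visits = [
    ["/", "/products", "/contact"],
    ["/products", "/"],
    ["/"],
]
page_counts, entry_pages, exit_pages = page_statistics(sample_visits)
print(page_counts.most_common(1))  # most viewed page
print(entry_pages.most_common(1))  # most common entry page
print(exit_pages.most_common(1))   # most common exit page
```

A visit with a single page counts that page as both its entry and its exit, which is how a page that visitors leave immediately shows up in both lists.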
Ecommerce statistics return information about tracking
revenue, advertising campaigns, and trends:
• Revenue Tracking - tracks actual sales
• Campaign Tracking - tracks and monitors the performance of ad campaigns
• Conversion Tracking - tracks conversions such as sign ups for a service
• Time Trends - tracks trends over time along with revenues
• Click Paths - tracks in real time the visitors on your site and the path they take
Every Move You Make

For reference a more detailed list of what information
can be filtered by a good log analyzer follows:
General Traffic

• Visits for a specified period of time
• Requests over a user defined period
• Requests to the server
• Information about incoming, outgoing, download traffic, and bandwidth
• Records of visits, requested pages, downloads, and images
• Spider requests to the server
• Information about client and server errors, and about visits with errors
Page Statistics

• Number of visits when a web page was accessed
• Number of visits when a web page was accessed first for the visit
• Number of visits when a web page was accessed last for the visit
• Number of visits when a web page was the only page accessed for the visit
• Number of visits when a visitor took a particular path through the website
Download Statistics

• Number of visitors who downloaded a particular file
• Number of visitors who downloaded a particular combination of one or more files
• The traffic caused by a particular download
• Number of visitors who downloaded files which came from a particular referring page, server, or search engine
• The percentage of visitors that downloaded files out of all visitors which came from a particular referring page, server, or search engine
Images Statistics

• Number of visitors who accessed a particular image resource
• Number of visitors who accessed a particular combination of one or more image resources
• Number of visitors and non-visitors who accessed a particular image resource
Referrers Statistics

• Number of visits from a particular server
• Number of visits from a particular web page
• Number of visits from a particular search engine
• Number of visits from a particular query on a search engine
• Number of visits using a particular word to find your site
• Number of visits using a particular word combination to find your site
• Number of visits using a particular search phrase to find your site
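Search phrase reports rely on the referring URL carrying the visitor's query. As a hedged sketch, the Python below pulls the referring server and search phrase out of a referrer string. It assumes the log format in use records the referrer at all (the Combined format does, the Common format does not) and that the search engine passes the query in a parameter such as q. The parameter names, the function name search_phrase, and the example URLs are assumptions for illustration.

```python
from urllib.parse import parse_qs, urlsplit


def search_phrase(referrer, query_params=("q", "query", "p")):
    """Return (referring_server, phrase); phrase is None if no query is found."""
    parts = urlsplit(referrer)
    params = parse_qs(parts.query)  # decodes "+" and %xx escapes
    for name in query_params:
        if name in params:
            return parts.netloc, params[name][0]
    return parts.netloc, None


server, phrase = search_phrase("http://www.example.com/search?q=web+log+analysis&hl=en")
print(server, "->", phrase)
# prints: www.example.com -> web log analysis
```

Counting the returned phrases, or the individual words within them, gives the word and phrase figures listed above. Referrers that carry no query simply yield no phrase.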
Audience Reports

• Number of visits from each country
• The outgoing traffic generated by visitors from each country
• Number of visitors using a particular operating system
• Number of visitors using a particular browser
• Number of visitors using particular downloading software
• Number of visitors using a particular user agent
• Number of visitors using a particular downloading user agent
• Number of times your site is added to 'favourites' or 'bookmarks'
Spiders Statistics

• Number of requests made by a particular robot or spider
• Traffic caused by a particular robot or spider
• Number of times a particular resource was requested by robots and spiders
• Number of times a particular resource was requested by Googlebot
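One simple way a report like this might separate spiders from people is by looking at the user agent string each request carries. The Python sketch below is a rough heuristic only: the marker words are an assumption, not a complete list, the user agents are made-up samples, and well-disguised robots will not be caught.

```python
from collections import Counter

# Assumed marker words; real analyzers keep much longer lists.
BOT_MARKERS = ("bot", "spider", "crawler")


def spider_requests(user_agents):
    """Count requests per user agent for agents that look like robots."""
    counts = Counter()
    for agent in user_agents:
        lowered = agent.lower()
        if any(marker in lowered for marker in BOT_MARKERS):
            counts[agent] += 1
    return counts


agents = [
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "ExampleSpider/1.0",
]
for agent, total in spider_requests(agents).items():
    print(total, agent)
```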
Error Logging

• Number of visits when a particular error occurred
• Number of visits when a particular file was not found
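Missing files are among the easiest problems to spot, because the server records a 404 status code for each request it could not satisfy. As a minimal sketch, assuming each log entry has already been reduced to a (path, status) pair, the Python below counts which paths were not found. The function name not_found_paths and the sample entries are hypothetical.

```python
from collections import Counter


def not_found_paths(entries):
    """entries: iterable of (path, status). Count requests that returned 404."""
    return Counter(path for path, status in entries if status == 404)


entries = [
    ("/index.html", 200),
    ("/old-page.html", 404),
    ("/images/logo.gif", 200),
    ("/old-page.html", 404),
    ("/missing.css", 404),
]
for path, total in not_found_paths(entries).most_common():
    print(total, path)
```

A path that appears often in this list usually points to a broken link somewhere on the site or to an old address that other sites still link to.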
Conclusion
Anyone who owns or runs a website should be looking
to their logs to improve their site. Web logs provide
a wealth of information that is always valuable
and often surprising. By implementing changes as
a result of log analysis, websites deliver their
purpose and achieve their goals more easily. For
anyone using the Internet, web logs help make the
browsing experience a whole lot better, so if you've
got a site, there's no time to lose. Download those
logs today...