Traffic Accounting System (TAS)

© Anton Voronin, 2000-2001.

The most recent version of this document can be found at:



TAS is designed to fetch and process traffic statistics at the IP level from PC or Cisco routers (or, with slight modifications, from any device capable of traffic accounting) and at the application level from specific applications.

The application level is needed because some "intermediate" services (such as HTTP proxy servers, news servers or mail relays) "hide" the actual user's traffic from the IP level. For example, suppose a client requests a large file from abroad via your HTTP proxy server. At the IP level you can see only the traffic between the client and your proxy server. So if you wish to know all traffic flows initiated by or destined for your clients (for billing, for setting traffic limits or just for estimating network usage per client), you have to account for traffic at the application level as well. TAS can deal with the following applications: squid, sendmail and MailGate.

What TAS lets you do, in a few words:

  • Do traffic accounting at the IP level and the application level
  • Store all accounting data for the desired number of months for later data extraction
  • Generate complex reports (in HTML or plain text) according to specified rules, either periodically or on demand
  • Use a convenient web interface for easy definition of report rules and quick report generation

What TAS is not intended for:

  • Time accounting
  • Accounting for traffic that is neither IP nor IP-based
  • Accounting by username (in case of password-based internet access)
  • Charging/Pricing/Billing/Client management
  • Gathering traffic accounting information from router interfaces the way trafd, bpft, ipacctd or ipa do (TAS is just a front end for programs of that kind)
  • Report generation for arbitrary time intervals (you can get reports for the current day, the current month or any of the previous months for which data is kept; reports are also generated periodically after each day and each month and are stored in the report archive).

Although there are currently many limitations, there is also a wish list below.

Design notes

TAS is written completely in Perl and consists of the following components:

Most of them use configuration files (see below).

The first four programs collect accounting data from routers or specific applications. AcctMax performs the specific processing that IP data requires before AcctLog processes it. AcctLog builds arbitrary reports according to the rules specified in its configuration. AcctJoin summarizes daily databases into current-month databases. Periodic scripts run the other TAS components, send the reports to the operator and archive them.

Accounting data is stored in Berkeley DB tables. I know it is not a very smart idea to use db for this task, because it leads to a sequential search of the full database when selecting data for building reports. But it is very simple and convenient to summarize the data in hash tables tied to db tables.


Installation

After you have unpacked the archive, you'll see the Makefile. You don't need to build or configure anything before installing. To install TAS, just type:

make install

By default all components are installed under /usr/local. If you want to use another prefix (for example, /usr/local/tas), type:

make PREFIX=/usr/local/tas install

After the files are copied you need to perform some installation steps manually (see the next chapter for each TAS component).

The TAS components

Obtaining data from Cisco routers

If you use Cisco routers you need to add the following to their configuration:

ip accounting-threshold 32768
ip rcmd remote-host root X.X.X.X root enable

where X.X.X.X is your accounting server's IP address. And for each interface:

ip accounting output-packets

Obtaining data from PC routers

If you use PC routers, each of them needs a script, by default called TrafShowAll and placed in /usr/local/bin. This script should print accounting data to stdout in the following format:

from  to  packets  bytes  router  protocol  status

where router, protocol and status are just tags which you can later use as filters when extracting traffic records. You can actually use arbitrary strings for these fields, but for uniformity with the databases of other traffic types it is advised to use the router's short domain name as router, an upper-level protocol name (or just the string "IP" if your router does not supply this information) as protocol, and a packet handling status (something like "passed", "denied", "altered", etc.) if known and needed, or just the symbol "*", as status. For example:  51  32411  Router1  IP  *
On each call it has to show the data accumulated since the previous call.
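
Before pointing AcctFetch at a new TrafShowAll script, it can help to check its output against the seven-field layout described above. The following is only a sketch: the function name check_tas_records and the sample addresses are made up for illustration, and it assumes that packets and bytes are plain non-negative integers.

```sh
#!/bin/sh
# Hypothetical helper (not part of TAS): read records on stdin and verify
# that each has seven whitespace-separated fields
#   from  to  packets  bytes  router  protocol  status
# with integer packet and byte counters. Exits non-zero on any bad line.
check_tas_records() {
    awk '
        NF != 7 || $3 !~ /^[0-9]+$/ || $4 !~ /^[0-9]+$/ {
            printf "line %d is malformed: %s\n", NR, $0
            bad = 1
        }
        END { exit bad + 0 }
    '
}

# Sample record; both addresses are placeholders.
printf '%s\n' '10.0.0.1 192.0.2.7 51 32411 Router1 IP *' | check_tas_records \
    && echo "format OK"
```

On a router you would pipe the real script through it (/usr/local/bin/TrafShowAll | check_tas_records). Keep in mind that each such call consumes the data accumulated since the previous one.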

For example, it may use utilities from trafd package:

#! /bin/sh
/usr/local/bin/trafsave ed0 ed1 ed2 ed3
/usr/local/bin/traflog -i ed0 -n -o mycustomformat 2>/dev/null
/usr/local/bin/traflog -i ed1 -n -o mycustomformat 2>/dev/null
cd /var/log && rm trafd.ed0 trafd.ed1 2>/dev/null

Be sure to describe the log format for traflog in the /usr/local/etc/traflog.format file:

mycustomformat {
	from:"%s " to:"%s 0 " psize:"%ld RouterName " proto:"%s *\n"
};

NB: Here we use "0" instead of the number of packets because trafd does not count packets.

Then write the TrafShowAll script like so:

/usr/local/bin/trafsave ed0 fxp0
/usr/local/bin/traflog -i ed0 -n -o mycustomformat
/usr/local/bin/traflog -i fxp0 -n -o mycustomformat
NB2: Run trafd/trafsave/traflog only for interfaces between which you don't have packet forwarding, otherwise traffic will be counted twice: on the input interface and on the output interface. In most cases you should take data only from a single (external) interface. With Cisco routers you won't meet this limitation because they count outbound traffic only.

As an alternative (and, I think, more correct) solution you can use a simple and very suitable tool, ipacctd. Unfortunately it is documented in Russian only. But all you have to do is compile your kernel with the IPDIVERT option and arrange a startup script for ipacctd like this:

/sbin/ipfw add 001 divert 10000 ip from any to any via ed* out
ipacctd -v -p 10000 -f /var/log/ipacct

Then your TrafShowAll script will be the following:

/bin/rm /var/log/ipacct
/usr/bin/killall -HUP ipacctd
sleep 3
/usr/bin/awk '{ print $1, $2, $4, $3, "RouterName", "IP", "*" }' </var/log/ipacct

The advantages of ipacctd over trafd are that it can count outbound-only traffic (so it is possible to gather it from all interfaces) and that it can count the number of packets.

If you like neither of these, you can build something yourself. Arrange an ipfw rule for outbound traffic that permits all packets and logs them (as in the ipacctd example but with the log option). You can redirect these log messages via syslog to a script that appends the source, destination and packet size to a file. Your TrafShowAll script should output this file and truncate it on each call. I'd appreciate it if someone sent me working scripts to place in this documentation page.
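
The "output the file and reset it on each call" half of that idea could look roughly like the sketch below. It is not a tested TAS script. The spool path and the function name show_and_reset are hypothetical, and the records are assumed to be already written in the "from to packets bytes router protocol status" layout. Instead of truncating in place it renames the file first, which is a small variation meant to narrow the window in which freshly appended records could be lost; it relies on the writer reopening the file by name for every append.

```sh
#!/bin/sh
# Hypothetical spool file that a syslog-fed script appends records to.
SPOOL=${SPOOL:-/var/log/ipfw-acct.spool}

show_and_reset() {
    # Nothing accumulated since the previous call: print nothing.
    [ -s "$SPOOL" ] || return 0
    # Move the file aside so that new records go to a fresh file,
    # then print and discard what was accumulated.
    mv "$SPOOL" "$SPOOL.tmp" || return 1
    cat "$SPOOL.tmp"
    rm -f "$SPOOL.tmp"
}

show_and_reset
```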

By default AcctFetch connects to PC routers using ssh. If you don't change this behaviour in tas.conf, then you have to arrange passwordless SSH access from your accounting server to all of your PC routers. This implies your accounting server is a trusted and highly protected host. If you don't like this scheme, try to use something else that would let you fetch the information through the network (I'll be glad if you then share your ideas with me).

Web interface

The web interface consists of an administrator's component and a client's component.

The administrator's component, AcctLog.cgi, allows you to quickly define and produce simple reports without writing a special configuration file.

It has multi-language support, so you can redefine any menu items, button labels or status text in your native language (currently only English and Russian texts are defined). See the Configuration section for details.

AcctLog.cgi has a few limitations in comparison to report rules definition through the configuration file:

  • It can generate only one table at a time
  • It does not allow category expressions to define host categories in table or column definitions as the AcctLog.conf configuration file does; only a hostname, an IP address or the name of a group list can be used. In the case of a list name, the traffic is counted separately for each host group of the list (or in even more detail, according to each group's split option) rather than for the whole list in summary. In other words, it is as if you had prefixed the list name with the '*' symbol in a table's category expression in AcctLog.conf.

However, it also has some advantages:

  • A user does not need to specify table definitions in a configuration file, which has quite a complex syntax. You can even teach your boss to use it himself ;-)
  • It lets you preview the report model (to ensure everything is correct) before building the actual report

Here are some screenshots that give you an idea of what you can do with the web interface:

The client's component, Client.cgi, is a very simplified version of AcctLog.cgi. It can be used to let client hosts look up their own traffic. Traffic is counted only for the host that makes the HTTP request. The report table contains only two columns (host name or address, and traffic data). The report parameters are passed as URL parameters and should be the following:

All except TIME_PERIOD and PROXIES correspond to the table and column definition fields of the %tables parameter in the AcctLog.conf configuration file (see below), and all of them must be present.

TIME_PERIOD should be a part of the database file name that refers to the time period of the traffic it stores (i.e., "today", "month" or "XXXXYY" where XXXX is a year, and YY is a month number).

The PROXIES parameter should contain the IP addresses of your clients' proxy servers, separated by colons. This is useful if you don't want your clients to see statistics for their proxy servers instead of their own when the end client's IP address cannot be determined. They will then just see a notice about that.
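
To make the colon-separated format concrete, here is a small sketch of a membership test against such a value. How Client.cgi matches addresses internally is not described here, so treat this purely as an illustration of the format; the function name is_listed_proxy and the addresses are placeholders.

```sh
#!/bin/sh
# Hypothetical helper: succeed if address $1 appears in the
# colon-separated list $2, e.g. "192.0.2.10:192.0.2.11".
is_listed_proxy() {
    # Wrapping both sides in colons prevents partial matches such as
    # 192.0.2.1 being found inside 192.0.2.10.
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

PROXIES="192.0.2.10:192.0.2.11"   # placeholder addresses
if is_listed_proxy "192.0.2.11" "$PROXIES"; then
    echo "192.0.2.11 is a proxy: show the notice instead of statistics"
fi
```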

For example, you can provide the following form on your web page (or several forms for different report configurations):

<FORM NAME="f1" METHOD=POST ACTION="/cgi-bin/Client.cgi">
	VALUE="Incoming traffic for current month (except current day)">
</FORM>
<A Href="javascript:document.f1.submit();">
	IP-traffic for current month
</A>

To have specific access control for AcctLog.cgi and Client.cgi, you can put something like this into the .htaccess file in the same directory:

<Files AcctLog\.cgi>
        AuthName                'HostMaster'
        require                 group hostmaster
</Files>
<Files Client\.cgi>
        Order deny,allow
        Allow from
        Allow from
        Allow from
        Deny from all
</Files>


Configuration

TAS uses the following configuration files (by default they are placed in /usr/local/etc/tas):

  • tas.conf used by AcctFetch, AcctSquid, AcctSendmail, AcctMailgate and AcctLog.cgi;
  • AcctLog.conf used by AcctLog and AcctLog.cgi;
  • accounting.conf used by periodic scripts;
  • cgi.conf - language definition for AcctLog.cgi.

tas.conf uses perl(1) syntax for variable definitions. It has the following parameters:

AcctLog.conf uses perl(1) syntax for variable definitions. It has the following complex parameters:

accounting.conf uses sh(1) syntax for variable definitions. It has the following parameters:

cgi.conf uses perl(1) syntax for variable definitions. It defines all text strings used in the cgi interface, so you can translate them into any other language. All parameters have self-explanatory names. The distribution contains files for Russian and English. Generally, cgi.conf is just a symlink pointing to one of them. I'd appreciate it if someone sent me files translated into other languages.

Upgrading from versions before 1.2

Version 1.2 made significant changes to data storage and configuration (see History of changes for details). So if you have been using an earlier version, you need to take the following steps to adapt your existing database and configuration to the new version of TAS.

  • First of all, the database format has been changed to provide faster data extraction. To convert your database files, stop all the programs that collect data (AcctFetch, AcctSendmail and AcctMailgate), back up all database files, then run AcctConvert on each db file you want to convert. If your databases are large, the conversion may take a long time, so to prevent data loss you may stop the data collectors only while the "today" databases are being converted.

  • The syntax of category expressions used in the %tables parameter of the AcctLog.conf configuration file has been extended (see its definition for details). Besides extending the syntax, this makes category expressions a bit easier to understand, because host groups are already familiar to the user (they are also used in the %lists parameter). But it also required a change in semantics. Earlier, a list name used as a category expression operand caused traffic to be counted separately for each of the host groups the list consisted of (or in even more detail, according to those host groups' prefixes, '?' or '*'). Now, to get the same behaviour you need to prefix the list name itself with '*' (as it is now a real host group); otherwise accounting is done for the list in summary and placed in a single row of the report table, with the list name used as the row header.

  • The traffic tags that were earlier used in database records for more selective accounting than just by source/destination hosts, and that had different semantics for each type of traffic, are now replaced with a set of three tags whose semantics are the same for any type of traffic: agent host, protocol and status. Accordingly, two more fields have been added to column definitions in the %tables parameter of the AcctLog.conf configuration file (see their definitions for details). These fields are optional, so AcctLog won't refuse to run if they are absent; however, selection results may differ. So if you were using tag lists in column descriptions for squid traffic (for which they defined the request status), you should now move them from the 5th to the 7th field of each column definition where they are used; if you were using tag lists in column descriptions for mailgate traffic (for which they defined a gatewayed protocol), you should now move them from the 5th to the 6th field of each column definition where they are used.

  • The next change also concerns traffic tags. The status values for squid traffic have been changed and their set has been extended: instead of just the MISS, HIT, DENIED, NOFETCH and OTHER values used before, they are now the same as in squid's access log file, except those with status code "000". So if you earlier had the "HIT" value in the tag list field of a column definition, you should now probably replace it with "TCP_HIT,TCP_IMS_HIT,TCP_MEM_HIT,TCP_REFRESH_HIT,TCP_NEGATIVE_HIT", and likewise replace the "MISS" value, if any, with "TCP_MISS,TCP_REFRESH_MISS,TCP_CLIENT_REFRESH_MISS".
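
The squid status replacement in the last step can be written down as a small lookup. This is only a sketch of the two substitutions named above; the function name map_old_squid_status is hypothetical, and since no replacements are given for DENIED, NOFETCH or OTHER, any other value is passed through unchanged.

```sh
#!/bin/sh
# Hypothetical helper: print the 1.2-style tag list that corresponds to
# an old-style squid status value.
map_old_squid_status() {
    case "$1" in
        HIT)  echo "TCP_HIT,TCP_IMS_HIT,TCP_MEM_HIT,TCP_REFRESH_HIT,TCP_NEGATIVE_HIT" ;;
        MISS) echo "TCP_MISS,TCP_REFRESH_MISS,TCP_CLIENT_REFRESH_MISS" ;;
        *)    echo "$1" ;;   # no replacement documented: leave as is
    esac
}

map_old_squid_status HIT
map_old_squid_status MISS
```

The printed strings are what goes into the 7th field of the affected column definitions; editing AcctLog.conf itself is still a manual step.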

Report example

Here you can find a report example, HTML version (205 KB). As it is very detailed, it implies quite a complex configuration. I have deliberately changed domains, IP addresses, peering neighbours' names and client names to nonexistent ones.


Building a report is quite a time-consuming operation. TAS uses the following measures to make it more efficient:

  • Since version 1.2 TAS uses b-tree databases instead of hash tables, which allows a key search instead of a full sequential search. For table category expressions that cover a limited set of hosts it now takes much less time to find and extract the needed records from a database than before (for example, in my case, with a monthly database of more than 1.5 million records, building a report for a single host takes a few seconds instead of the 30 or more minutes needed when the full sequential search was used). Of course, if you use the total or each keyword in your table's category expression, the time needed for the report is about the same as before.
  • All IP addresses are stored and compared in binary form.
  • The fast inet_aton() function from libc is used to convert addresses into binary form.
  • IP addresses are converted to binary form and resolved only once, before being checked against all category expressions.
  • Only addresses from local (as specified by user) networks are resolved.
  • Addresses are resolved using DNS only once, then the server replies (including negative ones) are stored in the internal resolution cache.
  • Before processing the data, configured group lists and table descriptions are converted into a form that provides faster processing.
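
To illustrate why the binary form helps, the sketch below turns a dotted-quad address into one integer and tests network membership with a single mask-and-compare, the kind of check that decides whether an address is local and worth resolving. It is not TAS's Perl code (TAS uses inet_aton() from libc). The function names and addresses are made up, and it assumes a shell with 64-bit arithmetic, which is usual on current systems.

```sh
#!/bin/sh
# Convert a dotted-quad IPv4 address to a single integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# in_net ADDR NET MASKLEN: succeed if ADDR lies inside NET/MASKLEN.
in_net() {
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "$2")
    # All ones minus the host bits; valid for any MASKLEN from 0 to 32.
    mask=$(( 4294967295 - ((1 << (32 - $3)) - 1) ))
    [ $(( addr & mask )) -eq $(( net & mask )) ]
}

if in_net 192.0.2.77 192.0.2.0 24; then
    echo "192.0.2.77 is local: worth resolving"
fi
```

Once addresses are integers, each comparison against a network is two bitwise operations instead of a string match.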


FAQ

Q. Why is the "TOTAL" value in my report tables less than the sum of the values in each row?

A. This is because some hosts covered by the table's category expression belong to more than one inclusive host group (or to more than one of its entries if a group is defined as a group list), so the data for these hosts was counted in more than one row, but only once in the "TOTAL" row.
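
A tiny worked example of this effect, using made-up hosts, byte counts and group names (none of them come from TAS): hostB belongs to two groups, so its 200 bytes appear in two rows but only once in the total.

```sh
#!/bin/sh
# Input columns: host, bytes, comma-separated groups the host belongs to.
rows_vs_total() {
    awk '
        {
            total += $2                      # every host counted once
            n = split($3, groups, ",")
            for (i = 1; i <= n; i++)
                row[groups[i]] += $2         # once per group it is in
        }
        END {
            for (name in row) sum += row[name]
            print "sum of rows: " sum
            print "TOTAL: " total
        }
    '
}

rows_vs_total <<'EOF'
hostA 100 staff
hostB 200 staff,servers
hostC 300 servers
EOF
```

The rows come to 300 (staff) and 500 (servers), 800 together, while the total is 600.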

Q. I am sure that all the host groups mentioned in my report table do not overlap, but the "TOTAL" value is still slightly less than the sum of values in each row...

A. The "TOTAL" value is computed in bytes and then rounded to KBytes, MBytes or GBytes, depending on what you have configured for the column. So if rounding occurs, the sum of the values in the cells of the column may differ from the "TOTAL" value by up to the sum of their rounding errors.
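
The arithmetic can be checked with a few numbers. The sketch below assumes that 1 KByte is 1024 bytes and that rounding is to the nearest unit; both are assumptions made for the example, and the byte counts are invented.

```sh
#!/bin/sh
# Round a byte count to the nearest KByte (assumed to be 1024 bytes).
to_kbytes() {
    echo $(( ($1 + 512) / 1024 ))
}

rows="1600 1600 1600"      # made-up per-row byte counts
sum_rounded=0
total_bytes=0
for b in $rows; do
    sum_rounded=$(( sum_rounded + $(to_kbytes "$b") ))   # rounded per row
    total_bytes=$(( total_bytes + b ))                   # exact bytes
done

echo "sum of rounded rows: $sum_rounded KBytes"
echo "TOTAL row: $(to_kbytes "$total_bytes") KBytes"
```

Each row rounds up from about 1.56 to 2 KBytes, so the rows add up to 6, while the exact total of 4800 bytes rounds to 5 KBytes.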

Q. What happens if the daily periodic script didn't run in time (for example, because of a power failure)?

A. The shifted files with suffixes ".yesterday.from.db" and "" (which would have been processed if the daily periodic script had run) will remain until the next midnight. At midnight they will be rotated and renamed with suffixes ".yesterday.1.from.db" and "", and new files with suffixes ".yesterday.from.db" and "" will be created. If the periodic script does not run the next day either, the files will be rotated further. When the daily periodic script finally runs, it will process all of those files. The report for each pair of rotated files will start with the line "PREVIOUSLY SHIFTED FILE ip.yesterday.1 FOUND! PROCESSING IT..." and end with the line "PROCESSING OF FILE ip.yesterday.1 DONE." (and the same for the other types of traffic you enabled processing for).

More questions are welcome ;-)


Troubleshooting

Periodic scripts don't run and I don't see any error in the "daily run output" messages.

a) Make sure you didn't redefine the local_periodic parameter in /etc/periodic.conf, and if you did, include "/usr/local/etc/periodic" in it.

b) If the problem persists, enable the daily_show_badconfig and monthly_show_badconfig options in your /etc/periodic.conf to see the error output if the script fails to run (there is a chance you just didn't create a storage directory specified in your accounting.conf, so the periodic scripts fail because they can't write to it).

The report that TAS produces contains empty tables.

a) Make sure that the list names you have specified in the table category expression in your AcctLog.conf are defined in the %lists block and are written in the same case. Remember that only the total and each keywords are predefined.

b) If the lists used in a table category contain only domain names, be sure that the IP addresses that correspond to these domain names are covered by your @local_nets parameter (otherwise they won't be resolved, and so won't match the lists you expect them to match).

a) You get messages from cron regarding AcctFetch: "Bus error. Core dumped".

b) AcctFetch hangs forever, and all its instances started afterwards are blocked.

c) Traffic databases get corrupted (AcctLog, AcctJoin or AcctMax report "Inappropriate file type", dump core or just hang).

d) Traffic databases have enormous size.

Ensure that neither the softupdates nor the async mount option is set on the filesystem where your traffic databases reside (/var/account by default).

You use the trafd package and notice that TAS reports twice as much traffic as you expected.

Ensure that you don't run trafd for more than one of the interfaces between which you have packet forwarding.

You use the Russian language file for the web interface, and pressing the submit buttons in the menu frame results in a "Bad parameters" error.

This means that data posted from the client to the server isn't recoded into the server character set. So you need to do one of the following:

a) Use the Russian Apache web server.

b) Recode the language file cgi.conf into the character set used on your client machine(s) (windows-1251 for Windows, x-mac-cyrillic for Macintosh, iso-8859-5 for Sun, etc.)

c) Translate the $menu_submit_* parameter values in cgi.conf into English, or just into a Latin transliteration of Russian.

After accessing the client web interface to look up their traffic, hosts that are behind both NAT and an HTTP proxy see an empty report, or see their intranet IP address instead of the real IP address for which the traffic is counted.

Client.cgi uses the HTTP_X_FORWARDED_FOR variable, when available, to learn the actual client's address. But if the user's host is within the intranet address space, and the real IP address for which the user wants to see traffic is on the HTTP proxy (probably the same machine as the NAT proxy), then the HTTP proxy will pass the intranet address instead of its own.

Advise users who are behind NAT either to turn off the HTTP proxy for your URL in their browsers (if their firewall permits that), so that their addresses are translated by NAT, or to just access it directly from their proxy machine.

Another solution is to require an SSL connection, in which case HTTP_X_FORWARDED_FOR won't be passed to the server (at least, the squid proxy server doesn't pass it). To do that, make your links or form targets to Client.cgi start with "https://", not "http://". Of course you need an SSL-capable web server to make this possible. If you use apache+mod_ssl, you may also wish to use the SSLRequireSSL configuration directive to prohibit plain http connections.

Using SSL is also advised for security reasons.

Planned enhancements

In the future it is planned to get rid of DNS resolution of addresses into names and grouping by names when building a report. Instead AcctLog should connect to a MySQL database that keeps all the information about clients, find out who owns the given address, and so be able to aggregate hosts by clients in the report tables rather than by ip nets or domain names. Of course, DNS resolution and grouping will be kept as an option.

When running on a periodic basis, the results of accounting data extraction for each client should be put automatically into a client database, not only into the report.

Optional support for timestamps with a 5-minute step. This may require too much disk space or a reduction algorithm.

History of changes

  • Current version

      Is now the same as 1.2.1
  • Version 1.2.1, released August 10, 2001

    • Deprecate parameter "$cisco_checkpoint_command" in tas.conf. Checkpointing and fetching are better done within the same shell command, joined with "&&".

    • AcctFetch now warns if a Cisco router's accounting data threshold was exceeded for some reason and accounting data was lost.

      Fix monthly accounting script: databases were kept 1 month less than specified by the "keep" parameter in accounting.conf.

  • Version 1.2, released July 2, 2001

    • July 2, 2001
      Bugfix in monthly accounting script - sendmail database did not rotate because of a misprint.

    • June 29, 2001
      Fix for a bug with submit buttons in the web interface (AcctLog.cgi) that was introduced during the code cleanup on the date of the previous change.

    • June 18, 2001
      Added client-oriented web interface Client.cgi, which can be used to let client hosts look up their own traffic.

    • June 17, 2001
      Fix for a bug in AcctLog that caused traffic to be counted more than once for hosts belonging to key ranges (which are compiled from tables' category expressions) that happened to be adjacent in a database. I.e., if you specified a table's category expression like "" and there are no addresses between and in your database, then traffic for hosts belonging to would be counted twice. This bug was introduced together with the new database format.

      Subject of a report sent by e-mail now contains date for which the report was built.

    • June 8, 2001
      Database format has been changed from hash to sorted B-tree for faster data extraction.

      AcctConvert utility added to convert old databases into new format.

      Syntax of category expressions has been extended and made easier for understanding, but its semantics was slightly changed.

      Info about agent host (from where the accounting data was fetched), protocol and information unit's status is now stored in database for any type of traffic instead of a single tag that had different semantics for each type of traffic.

      A user can now specify more units in which to count traffic (besides "bytes", now "kbytes", "mbytes" and "gbytes" are also supported in table column definitions).

      Web interface has been changed to support selection by agent host, protocol and status and to support new traffic units.

      Progress output of web interface made more detailed.

      Fixed an error in computing the netmask from masklen, which arose when masklen was 32, because perl supports shift operations on 4-byte integers only. A different algorithm is now used.

      Fixed an error of old files removal when "compress_rotated" parameter in accounting.conf was set to "NO".

      Multiple squid log files are now supported (parameter "squid_log" in accounting.conf was replaced with "squid_logs" and its syntax was changed; however "squid_log" is also supported for compatibility).

      Periodic scripts now have hardcoded default values for config parameters, so commenting them out is now safe.

      It is now possible to specify the rounding option for units larger than "bytes": "up", "down" or "nearest" (corresponding optional field was added to the table definition).

      FAQ section was started in the documentation page.

    • May 17, 2001
      Added web interface for AcctLog.

    • April 10, 2001
      HTML report example included into the distribution.

    • Mar 19, 2001
      AcctLog now checks the list names used in table or column category expressions in AcctLog.conf to ensure they are defined in the %lists parameter. This can prevent problems caused by user misprints.

    • Mar 15, 2001
      TAS now can generate nice reports in HTML. Added -h switch for AcctLog and "report_type" parameter to accounting.conf.

      "ip_daily_max" parameter added to accounting.conf, so a user now can turn the daily traffic maximization off.

    • Mar 9, 2001
      "Troubleshooting" section added to this documentation page.

    • Mar 2, 2001
      Shell commands issued by AcctFetch to obtain traffic statistics from routers have been made configurable via tas.conf.

      The sample TrafShowAll script mentioned in this documentation has been changed, and a bit more explanation about using TAS with trafd has been added.

  • Version 1.1, released Feb 7, 2001

    • Feb 5, 2001
      Another "Use of uninitialized value" warning has been fixed in AcctLog.

    • Feb 2, 2001
      Many typos fixed in this documentation page.

    • Feb 1, 2001
      Fixed a very stupid misprint in AcctJoin, made during the changes of Jan 5, 2001, that caused the number of packets to be stored instead of the number of bytes when adding data to the month database.

    • Jan 16, 2001
      Fixed a minor bug in AcctLog that caused "Use of uninitialized value" warning in some situations.

    • Jan 10, 2001
      Fixed typo in AcctFetch.
      SEEK_SET definition is now correctly taken from IO::Seekable module.
      Removed unused variables from AcctSendmail and AcctMailgate.
      Databases are not locked now - a stampfile is now locked instead.
      AcctFetch now opens ip-database only after the data was fetched from a router and then closes it before fetching data from another router.

    • Jan 8, 2001
      Fix for bug with renaming current-day databases that were not processed in time for some reason.

    • Jan 6, 2001
      Accounting.conf now lets a user specify which types of traffic to process (directives process_ip, process_squid, process_sendmail, process_mailgate), so there is no longer any need to modify periodic scripts if statistics for some types of traffic are not gathered.

    • Jan 5, 2001
      Russian text that accidentally remained in the periodic scripts was translated into English.
      Fix for squid's accounting database rotation failure in monthly periodic script.
      Fix for uninitialized value warning in AcctLog.
      Configuration directory is now /usr/local/etc/tas instead of /usr/local/etc/Scripts.
      New configuration file tas.conf has been added; it lets you specify a directory where accounting databases should reside.
      Two configuration parameters added to accounting.conf: "compress_rotated" to specify whether or not to compress rotated month databases and "keep" to specify number of months to keep the data.

    • Jan 4, 2001
      File permissions fixed for executable scripts so that they can be run by any user. Thanks to Andreas Klemm.

    • Dec 27, 2000
      The data fetched from a router is now accumulated in memory and committed to a database only after the fetch is completed. Committing during fetching sometimes damaged a db table (I don't know why).

  • Version 1.0, released Dec 25, 2000

    The first release.

Credits to

  • James Couzens for the mirror site.


Note: these links are relative, so if you obtained this document within the TAS package, they won't work.