Exim – checking maildir quotas at SMTP RCPT time

August 22nd, 2004

(This doc written in August 2004 but updated Jan 2009 to correct an error in the description of how Jeremy Harris’s solution works)

A thread in August 2004 on the exim-users list about checking quotas at SMTP RCPT time with Exim sparked an interesting discussion. To summarise:

  • By default, when using Exim’s inbuilt quota support, messages for users who are over-quota will be accepted at SMTP time, and a bounce message will be subsequently created and sent to the envelope-sender of the original mail.
  • Checking quotas at SMTP RCPT time and issuing a suitable rejection message would be a better way of dealing with the problem. Many people, myself included, do not like accepting then bouncing mail in today’s world of spam and viruses where envelope senders are, as often as not, fake. This often creates “collateral spam” to innocent third parties (and, incidentally, wastes bandwidth too).
  • It is not easy for Exim’s inbuilt quota management to allow checking of quotas at SMTP RCPT time, due to both architectural considerations (quotas are defined in transports and checked in delivery processes, which have little to do with message reception) and, more significantly, security/permission ones (at message reception time, Exim is typically running as an unprivileged user yet the checking of quotas normally requires root privileges).

A number of solutions were proposed: (if I mis-explain or misquote anyone, it’s not intentional – please let me know so I can put the record straight)

  • Jeremy Harris: Do some clever, but rather complicated and unreadable, stuff with ACLs and databases to check the maildirsize file directly from Exim, using the size declared by the remote SMTP client via the SIZE parameter (if supplied) to check at RCPT time, and the real (received) message size at DATA time. This relies on Jeremy’s particular configuration (though it could be modified for others), doesn’t scale well to multiple MXes, and requires usage to be calculated from the maildirsize file after each RCPT command.
  • Peter Bowyer: Use an external socket daemon (for example, Alun Jones’s Exim socket daemon) to do the checks (implementing your own logic to determine whether or not a given user is over-quota), and use some kind of ${readsocket…} call from the Exim ACLs. This is probably the most elegant and ‘perfect’ way of doing it, though it is not the simplest and it does require the quota to be calculated at each RCPT command.
  • Greg Woods: Periodically run a script which checks for over-quota users and writes an override redirect/alias file for any that are over-quota, containing lines like user: :fail:User is overquota. Greg provided a sample script for use with Cyrus which uses the ‘quota’ helper program to determine whether users are over-quota.
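
To make the shape of Greg’s override file concrete, here is a small illustrative Python sketch. It is not Greg’s script: the function name build_override_lines and the idea of passing in a ready-made list of over-quota users are assumptions made for the example; only the “user: :fail:User is overquota” line format comes from the description above.

```python
def build_override_lines(overquota_users, message="User is overquota"):
    """Return one redirect/alias line per over-quota user.

    Each line has the form 'user: :fail:message', matching the example
    line quoted in the text. The input list is assumed to come from
    whatever quota check is in use (hypothetical here).
    """
    # Sort and de-duplicate so the generated file is stable between runs.
    return [f"{user}: :fail:{message}" for user in sorted(set(overquota_users))]


if __name__ == "__main__":
    for line in build_override_lines(["mary", "bob"]):
        print(line)
```

Every over-quota user costs a full redirect line in this format, which is the overhead the plain list of names described further down avoids.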

Now, I was considering this problem too, and was quite inspired by Greg’s solution which seemed to be efficient, elegant, robust and easily scalable (to scale to multiple MXes, you would merely have to synchronise the single file containing the list of overquota users). However, I wanted a version which would work directly with maildirs (e.g. in a typical Courier-IMAP ‘virtual user’ configuration). Plus, I didn’t really see the need to write an entire redirect line per user.

So, I came up with a solution which involves:

  • a small script which iterates through a directory of (assumed) maildirs, calculates the quota usage (using the Maildir++ ‘maildirsize’ file) and writes a plain list of over-quota users to a text file.
  • an Exim router to check the above file. Used in conjunction with a ‘verify = recipient’ directive in a RCPT ACL, this will prompt rejection of mail to over-quota users at SMTP RCPT time.
  • using the Exim ‘quota_is_inclusive = false’ directive on the maildir transport which delivers to local mailboxes.

The only downside to this method is that it’s not ‘real-time’; there is an interval (according to the frequency at which you run the script to check quotas) during which users can be overquota but will not be determined as such. This means that there is a small window during which bounces might still be generated; the severity of the problem varies according to how often you can afford to run the checker script.

My solution is presented below in the hope that other people may find it useful. It’s very much a first attempt, so there may be problems I have overlooked. If so, please let me know so that I can fix them.

Step 1: The Exim router

I am assuming that you have some kind of clearly-delineated virtual mail system, where all mail to be delivered to IMAP mailboxes is ultimately addressed to one or more specific domains reserved for this purpose and listed in a domainlist referenced as +maildir_domains. I use a private namespace for deliveries: all ultimate local maildir deliveries, for example, are addressed to username@maildir, and all ‘real’ mail addresses (e.g. info@example.com) are aliased to this. Therefore, before the router which routes your maildir users’ mail, insert a router similar to the following:

maildir_overquota:
  driver = redirect
  domains = +maildir_domains
  local_parts = lsearch;/etc/exim/maildir_quota_exceeded
  data = :fail:Mailbox quota exceeded
  allow_fail

This looks up the local part of the recipient address in the linear file /etc/exim/maildir_quota_exceeded (which we will generate in the next step; it could of course easily be converted to a DBM/cdb/etc. file for performance if necessary). For any user listed in that file, it redirects to the special address “:fail: Mailbox quota exceeded” which, in a typical configuration, will cause the error “Mailbox quota exceeded” to be returned to anyone trying to send mail to an over-quota user, either locally via a generated bounce message or at SMTP time. Note that it will also fail any messages from the over-quota user (for example, where sender addresses are verified), which may or may not be desirable.
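
The matching the router relies on can be pictured with the following Python sketch. It is a simplified model of a linear (lsearch) lookup used as a membership test, not Exim’s implementation: the function name lsearch_has_key is hypothetical, and quoted keys and continuation lines are deliberately ignored.

```python
def lsearch_has_key(file_text, key):
    """Simplified model of a linear key lookup in a plain text file.

    Returns True if 'key' is the first token of any data line.
    Quoted keys and continuation lines are not modelled.
    """
    wanted = key.lower()  # linear lookups compare keys case-insensitively
    for line in file_text.splitlines():
        # Skip blank lines, '#' comments, and lines starting with white
        # space (these continue a previous entry in a real lsearch file).
        if not line or line[0] in "# \t":
            continue
        # A key ends at the first colon or white space, or at end of line.
        parts = line.replace(":", " ").split()
        if parts and parts[0].lower() == wanted:
            return True
    return False


if __name__ == "__main__":
    sample = "# List of maildir users who are over-quota\nbob\nfred\nmary\n"
    print(lsearch_has_key(sample, "fred"))   # listed, so the router would fail the address
    print(lsearch_has_key(sample, "alice"))  # not listed, so the router would decline
```

Because only the key matters for the local_parts test, a file containing nothing but one user name per line is enough.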

Step 2: The quota-checking script

The next, and most important, step is to generate the list of over-quota users. To do this, I wrote a script called maildir-check-quotas (follow the link to download). This (simple) script assumes you have all your maildir folders in a single directory (/home/vmail by default, though you can easily change that). It iterates through each folder and, if it finds a maildirsize file, works out the quota usage. If a user is over-quota, it writes that user’s name to the file /etc/exim/maildir_quota_exceeded (again, easily configurable). You should run this script periodically (e.g. from cron), for example every five minutes or so (perhaps more often, if you can support the load).

By the way, the script is in PHP. I’m sure it can be converted to other languages pretty easily if you have a preference.

Note: This script assumes you have Exim’s transport option maildir_use_size_file set for maildir deliveries, though it will fall back gracefully (assuming no quota) for mailboxes that do not have a maildirsize file.
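
As a rough illustration of the calculation involved (this is not the PHP script itself), the following Python sketch works on the contents of a Maildir++ maildirsize file, whose first line holds the quota definition (e.g. 10485760S,1000C) and whose later lines each hold a size and message-count adjustment to be summed. The function names and the full_percentage parameter are hypothetical, and only the byte (‘S’) limit is considered.

```python
def parse_maildirsize(text):
    """Return (quota_bytes, used_bytes) from the text of a maildirsize file.

    quota_bytes is None when the first line defines no byte ('S') limit.
    """
    lines = text.splitlines()
    if not lines:
        return None, 0
    quota_bytes = None
    # First line: comma-separated limits such as '10485760S,1000C'.
    for field in lines[0].split(","):
        field = field.strip()
        if field.endswith("S") and field[:-1].isdigit():
            quota_bytes = int(field[:-1])
    # Remaining lines: '<bytes> <messages>' adjustments, possibly negative.
    used_bytes = 0
    for line in lines[1:]:
        parts = line.split()
        if not parts:
            continue
        try:
            used_bytes += int(parts[0])
        except ValueError:
            continue  # ignore malformed lines in this sketch
    return quota_bytes, used_bytes


def is_over_quota(text, full_percentage=100):
    """True if usage has reached full_percentage percent of the byte quota."""
    quota_bytes, used_bytes = parse_maildirsize(text)
    if not quota_bytes:
        return False  # no quota defined (or zero): treat as unlimited
    return used_bytes * 100 >= quota_bytes * full_percentage


if __name__ == "__main__":
    example = "1000S,10C\n600 3\n500 2\n-200 -1\n"
    print(parse_maildirsize(example))
    print(is_over_quota(example))
```

A missing or empty maildirsize file simply yields “no quota”, mirroring the graceful fallback described in the note above.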

A sample file as output from this script (assuming users ‘fred’, ‘bob’ and ‘mary’ were over-quota when it was run) would be:

# List of maildir users who are over-quota
# Auto-generated by maildir-check-quotas v1.0
# Generated at Sun, 22 Aug 2004 17:40:00 +0100
bob
fred
mary

To get verbose information when running the script, pass it the -v option.
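
For anyone writing their own checker, the output step might look something like the Python sketch below. It is an assumption-laden illustration rather than a description of maildir-check-quotas: the function name write_overquota_list, the generator label, the 0644 file mode and the write-then-rename approach (so that Exim never reads a half-written list) are all choices made for this example.

```python
import os
import tempfile
from email.utils import formatdate


def write_overquota_list(path, users, generator="example-quota-checker"):
    """Write the list of over-quota users to 'path', replacing it atomically."""
    lines = [
        "# List of maildir users who are over-quota",
        f"# Auto-generated by {generator}",  # hypothetical generator label
        f"# Generated at {formatdate(localtime=True)}",
    ]
    lines.extend(sorted(set(users)))

    # Write to a temporary file in the same directory, then rename it over
    # the real file so readers see either the old list or the new one.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write("\n".join(lines) + "\n")
        os.chmod(tmp_path, 0o644)  # assumed: the Exim user must be able to read it
        os.replace(tmp_path, path)
    except Exception:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Calling write_overquota_list("/etc/exim/maildir_quota_exceeded", ["fred", "bob", "mary"]) would produce a file in the same layout as the sample shown above.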

Step 3: Making sure that users can actually go over-quota!

Now, the above is all very well, but by default the maildir-check-quotas script checks to see if a user has actually exceeded their quota (or matched it exactly, but that’s unlikely). In a typical configuration, however, Exim treats self-imposed (i.e. non-filesystem) quotas in a similar way to system quotas, and tries to prevent the user ever exceeding the quota: a mail which would send a user over-quota will be refused at delivery time. The consequence is that no users will ever exceed their quota and therefore the quota checking script will never find any over-quota users! This rather defeats the object of the exercise. There are three obvious solutions:

  • Start rejecting mail at some percentage threshold (e.g. 99%) of actual allowed quota. Not ideal, as this is an inexact science (unless all your quotas are the same and you increase quotas to compensate) and it might mean that you end up rejecting mail which wouldn’t actually have sent a user over-quota. Additionally, there is the possibility of generating a limited number of bounces for an indefinite period (if a user is under the percentage threshold, but a new mail would take them over ‘absolute’ limit). However, if you want to use this method you can with my script – just set $MAILBOX_FULL_PERCENTAGE to something less than 100%.
  • A variation of the above, but with a fixed ‘threshold’ (e.g. 1MiB): you set all quotas to be threshold bytes over the actual desired quota, and then make the script check for mailboxes exceeding (quota − threshold). This is slightly more complicated, but would give users a (maximum) ‘grace’ allocation of threshold over their allocated quota. This method also suffers from the problem of potentially generating bounces for an indefinite period. I don’t currently support this with my script.
  • Use the Exim transport option ‘quota_is_inclusive’ and set it to false (my preferred option). That way, the last mail which will cause a user to go over-quota will be accepted, and the maildir-check-quotas script will therefore ‘trip’ on the next run. The obvious downside to this is, of course, that your users can go arbitrarily over-quota depending on the size of the final mail that causes them to go over-quota (and subject to your normal message size limits). This method still suffers from the problem of possibly generating bounces, although in this case only during the period between a user exceeding their quota and the next run of the script.
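
The arithmetic behind these three options can be compared with a short Python sketch. All function names are hypothetical, and delivery_refused is only a simplified model of how a transport-level quota behaves with quota_is_inclusive set to true or false, not a copy of Exim’s logic.

```python
MIB = 1024 * 1024


def flagged_by_percentage(used, quota, full_percentage):
    """Option 1: flag the mailbox once usage reaches a percentage of quota."""
    return used * 100 >= quota * full_percentage


def flagged_by_fixed_threshold(used, configured_quota, threshold):
    """Option 2: configured_quota is the desired quota plus threshold;
    flag the mailbox once usage exceeds (configured_quota - threshold)."""
    return used > configured_quota - threshold


def delivery_refused(used, message_size, quota, inclusive=True):
    """Simplified model of the delivery-time check.

    inclusive=True : refuse if the mailbox would exceed quota after delivery.
    inclusive=False: refuse only if the mailbox is already over quota.
    """
    return used + (message_size if inclusive else 0) > quota


if __name__ == "__main__":
    quota = 10 * MIB
    used = 9 * MIB      # below a 99% threshold, so not yet flagged
    message = 2 * MIB   # but this message would not fit

    # Options 1 and 2 leave this gap open indefinitely: the user is not in
    # the over-quota list, yet delivery is refused and a bounce results.
    print(flagged_by_percentage(used, quota, 99))
    print(delivery_refused(used, message, quota, inclusive=True))

    # Option 3: the message is delivered, usage becomes 11 MiB, and the
    # next run of the checker lists the user.
    print(delivery_refused(used, message, quota, inclusive=False))
    print(flagged_by_percentage(used + message, quota, 100))
```

The example shows why only the third option closes the gap by itself: the mailbox actually crosses the limit, so a plain “used ≥ quota” check trips on the next run.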

Put all this together and you should have a system which checks quotas simply and effectively and allows SMTP time rejection. I think that a readsocket{} check and accompanying daemon is probably still the “best” way to attack this problem (though, perhaps, less efficient – especially in the face of abusive behaviour from remote hosts, though some gentle caching could probably alleviate things), and I may experiment with that at a later stage, but for now I thought this method might prove useful to some people.

ID cards consultation response

January 29th, 2003

The UK Government recently undertook a consultation on “entitlement” cards – another term for identity cards. Whilst I’m very aware of the issues of identity fraud and other problems which ID card proposals purport to reduce, I have misgivings about whether they will actually help, and serious concerns over personal privacy implications.

My response, submitted to the Home Office, is below:

I am aware of the Government’s consultation on the possibility of introducing some kind of identity or “entitlement” card to the UK.

For the purposes of aggregation of for/against responses, I am against such a move.

Nevertheless, I would like to provide some more constructive and granular feedback than a simple yes/no response. Whilst resources limit the time I am able to devote to this topic, and therefore this response is less substantial than would be ideal, I would like to make the following observations which cover a number of points raised in the consultation:

  1. The potential for abuse of such a system is worrying. Whilst I have no doubt that the strict security procedures and processes mentioned in the consultation would be implemented, the practical reality is that centralising such a large amount of data, and introducing complex links with a variety of services employing a large number of people creates a substantial risk of abuse, not only by those involved in administering and using such a system, but by police and other bodies. In particular, when combined with recent legislation such as the Regulation of Investigatory Powers Act, I can only conclude that such a system will open up new possibilities for widespread intrusion of privacy, albeit by a perhaps limited number of people. As an engineer by training and someone involved heavily in IT and database systems, I certainly appreciate the elegance and efficiency which centralised and rationalised data storage can bring, but in public systems (such as that which would be required for an identity card), the technological idealism and search for efficient delivery of services has to be balanced against the privacy and other risks to society. In this case, I am of the opinion that a system of the kind proposed may very well dangerously undermine some of the inherent safeguards in a somewhat decentralised system; primarily that the administrative burden to correlate disparate data naturally limits the potential scale of widespread surveillance and/or abuse of private information.
  2. Whilst acknowledging that there is evidence that “identity fraud” is on the increase, I am not convinced that an identity card scheme will help to significantly reduce the incidence of such fraud. Whilst it may provide some benefits, I feel these may be offset by the fact that a universal identity card will provide a “high value” target for fraudsters. History shows that the higher the potential gains from forgeries and suchlike, the more resources that potential fraudsters will invest in circumventing the security provided by a system. Although a single centralised database certainly provides some benefits in this respect, only with a compulsory card and the introduction of biometric data would there be, I believe, a significant increase in the difficulty of committing identity fraud – but I do not believe that a system encompassing these features which will be acceptable to members of society at large is likely to be available in the foreseeable future.
  3. Similarly, I am not convinced of the implicit assumption in the consultation that an identity card will necessarily reduce the incidence of illegal immigration or work. A “black market” for illegal workers is likely to always exist.
  4. As correctly identified in the consultation, the risks involved in such a large-scale IT project would be significant. Past evidence of very large public/private sector IT projects shows that such projects tend to be under-estimated and may spiral out of control. With costs already estimated at £1.5bn, this is by any measure an extremely large project, and I would voice a significant concern that the true end cost in purely financial terms may be much higher.
  5. If it were to be decided that introduction of an identity card was mandated, I would strongly prefer a “voluntary” rather than “compulsory” card (as defined in the consultation).
  6. The concepts discussed in the paper seem to follow very much “traditional” thinking on identity cards, whereas perhaps what is required here is some fresh thinking and new concepts. For example, whilst I don’t offer this by any means as a proposal (simply “food for thought”), how about a large-scale *decentralised* system, the fundamental concept of which would follow the lines of the PKI (public key infrastructure) trust/identity assurance system in use on parts of the Internet and other networks? The building of a nationwide, offline, optional, decentralised yet secure system where ‘networks’ or ‘webs’ of trust could be created would certainly be without precedent in the world and might easily be dismissed as “out of the question”, but perhaps in fact this is the kind of system which, with suitable planning (the enormity of which I do not underestimate) might actually enable some of the goals sought by an identity card system to be achieved, without introducing the centralisation and possibilities for abuse that “traditional” identity card systems may bring.
  7. I sincerely hope that you will take account of these views/opinions, and find them useful as part of the consultation process.

Tim Jackson