[rbldnsd] Re: logfile > 2G may result in DoS to rbldnsd

Anders Henke anders at schlund.de
Tue Jan 13 18:53:32 MSK 2004


Michael Tokarev wrote:
> Anders Henke wrote:
> []
> >However, I came across a minor problem: when logging to file and the 
> >logfile's size reaches 2G (exactly 2^31-1), rbldnsd stops working:
> >it keeps running, but doesn't answer queries anymore and doesn't log 
> >(e.g. to syslog) about having a problem.
> 
> Hmm.
> 
> Which OS you're using?  

The affected hosts are running Linux 2.4.2x kernels on Debian 3.0
(so libc, filesystem and tools are current enough to handle files 
larger than 2G - also verified with 'dd if=/dev/zero of=file').

> The situation you described does not look as normal.  
> It should not stall, but it should just stop logging (best
> case) or should be killed by SIGXFSZ signal (worst case).  It will be
> interesting to see some trace (strace/ktrace/truss) output of the
> situation.

When trying to pass the 2G-barrier, it gets killed with SIGXFSZ:

write(4, "1074007388 172.17.24.42 2.0.0.12"..., 4096) = 511
write(4, ".17.24.42 2.0.0.127.sbl.spamhaus"..., 3585) = -1 EFBIG (File
too large
)
--- SIGXFSZ (File size limit exceeded) ---
+++ killed by SIGXFSZ +++
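
(For reference: being killed here is just the default disposition of
SIGXFSZ. The same pattern - a short write up to the limit, then EFBIG -
can be reproduced with a small standalone program by lowering
RLIMIT_FSIZE; this is only an illustration of the signal semantics, not
rbldnsd code:)

/* sigxfsz-demo.c - illustration only, not rbldnsd code.
 * Lower RLIMIT_FSIZE and write past it: with the default signal
 * disposition the process is killed by SIGXFSZ (as in the strace
 * above); with SIG_IGN the write() simply fails with EFBIG. */
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <signal.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/resource.h>

int main(void)
{
  struct rlimit rl = { 4096, 4096 };  /* allow at most 4k per file */
  char buf[8192];
  int fd;
  ssize_t n;

  memset(buf, 'x', sizeof(buf));
  setrlimit(RLIMIT_FSIZE, &rl);
  signal(SIGXFSZ, SIG_IGN);   /* remove this line to get killed instead */

  fd = open("testfile", O_WRONLY | O_CREAT | O_TRUNC, 0644);
  if (fd < 0) { perror("open"); return 1; }

  n = write(fd, buf, sizeof(buf));  /* short write, fills up to the limit */
  printf("first write: %d\n", (int)n);
  n = write(fd, buf, sizeof(buf));  /* over the limit now */
  printf("second write: %d (%s)\n", (int)n, n < 0 ? strerror(errno) : "ok");
  return 0;
}

With SIG_IGN in place this should print "first write: 4096" and then
"second write: -1 (File too large)" on Linux, i.e. the same
short-write/EFBIG sequence as above, just without the fatal signal.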

'Stalled' has been seen in a not-too-different scenario: when rbldnsd
starts up with a logfile of 2^31-1 bytes and tries to log, it sometimes
seems to stall. This doesn't seem to be reproducible when stracing the
binary, but it occurs more often when the binary is not being traced.

> Well yes.  In fact, I added logging just as a debugging tool, it is not
> supposed to be enabled on production system, exactly due to the amount of
> data it generates.  If you want to perform logging, I highly recommend
> to rotate logs much more frequently, e.g. once per hour or so.  After
> all, huge logfiles are almost impossible to deal with anyway.. ;)

rbldnsd-logfiles do compress fairly well, so that's not such an issue :)

It's nice to sum up who's querying rbldnsd in order to detect locations
which should get a mirror of their own, or to optimize the balancing of
querying (caching, recursive) nameservers. To get accurate statistics
for such optimization, you need at least a full day of requests - so
quite large logs can occur (or you need to rotate logs quite often).

> >Of course, best solution would be to enable rbldnsd for 
> >O_LARGEFILE-support ...
> 
> It is possible and should be easy to do.  E.g., by passing
> O_LARGEFILE to log open() routine if defined.  But before
> this, I'm very interested to see what exactly it is doing
> when it reaches the 2GB logfile limit.  It should not stall like
> you described, regardless.

Ok:
-if permissions don't allow writing to the logfile, it prints
 'rbldnsd: error (re)opening logfile `logs/logfile42': Permission denied'
 and continues working. Fine.

-if it starts with a logfile of 2147483136 bytes (less than 2^31) and
 tries to write to byte 2147483648, it receives SIGXFSZ and gets killed
 (see strace above).
 However, simple monitoring scripts can take care of this.

-if it starts with a logfile of exactly 2147483647 bytes, the first
 request sometimes times out, while the following ones usually succeed;
 the behaviour is still somewhat strange, as it's not always the same
 and is not related to "wait a few seconds until the zones are fully
 available".

E.g. this morning a cronjob detected rbldnsd as dead and restarted it;
five minutes later the same five-minute cronjob found it dead again -
this occurred at least five times in a row, until logrotate rotated the
logfile and rbldnsd started up again, serving requests since then.
However, all other unexpected "crashes" of rbldnsd during the last few
days seem to have been caused by SIGXFSZ.
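
For reference, the O_LARGEFILE change mentioned above could look
roughly like the sketch below - this is only an illustration assuming
the logfile is opened with open(2), not the actual rbldnsd code
(building with -D_FILE_OFFSET_BITS=64 against a recent glibc should
have the same effect without touching the source):

/* Sketch only (not from rbldnsd): open the logfile with O_LARGEFILE
 * where the flag exists, so writes past 2^31-1 bytes succeed instead
 * of failing with EFBIG / raising SIGXFSZ. */
#define _LARGEFILE64_SOURCE   /* glibc: expose O_LARGEFILE */
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

int open_logfile(const char *path)
{
  int flags = O_WRONLY | O_APPEND | O_CREAT;
#ifdef O_LARGEFILE
  flags |= O_LARGEFILE;   /* "if defined", as suggested above */
#endif
  return open(path, flags, 0644);
}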


Regards,

Anders
-- 
Schlund + Partner AG              Security
Brauerstrasse 48                  v://49.721.91374.50
D-76135 Karlsruhe                 f://49.721.91374.225

