[rbldnsd] BL data preprocessing

Michael Tokarev rbldnsd@corpit.ru
Sun, 30 Mar 2003 04:44:40 +0400 (MSD)


> Dmitry Agaphonov wrote:
> > Hello,
> > 
> > Are there any specific actions to do on very large zones data files for
> > further rbldnsd performance (loading zones, lookup etc.)?  For example,
> > sorting in some order or anything else?

None.  Well...  With key-value data (e.g. spews-like), keep entries with
the same VALUE together, i.e.

Dmitry, I have a question: why did you ask in the first place?  Do you
feel rbldnsd is too slow at loading data?

Please try out rbldnsd-0.74pre2, released today (see
http://www.corpit.ru/mjt/rbldnsd/) - it contains some
rather significant modifications in the zone loading code.
Before, there was a worst case: when all TXT RRs were
different.  In particular, Wirehub's permblockIP.txt,
which is quite small (100000 entries), was loading much
slower than the osirus zones (350000 entries), and rbldnsd
used more memory for permblockIP.  That was because the
osirus zones contain many repeated TXT RRs, so they have
fewer unique strings than Wirehub's data.  I tested
rbldnsd primarily with osirus (in the hope that its data
is "usual"), so the worst case wasn't handled well.

Currently, both are loaded at approximately equal speed
per entry (osirus takes about 3 times longer, just as it
should with 3.5 times as many entries).  But with the
osirusoft data, rbldnsd may now use more memory, since it
no longer does the hard work of finding all repeated
strings.  The worst case is _randomized_ osirus data
(the original file in random order): there, the total
memory required is about the same as the size of the
data file.
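
To illustrate why the order of the data matters here, below is
a toy model in Python.  It is NOT rbldnsd's actual code: it
simply assumes a loader that reuses a TXT string only when it
equals the string of the entry right before it, and counts how
many strings such a loader would have to store.  The function
name and the sample values are made up for the example.

```python
import random


def stored_strings(values):
    """Count strings kept by a hypothetical loader that shares a
    value only with the immediately preceding entry."""
    count = 0
    previous = None
    for value in values:
        if value != previous:
            count += 1          # a new copy has to be stored
            previous = value
    return count


# 1000 made-up entries that use only 5 distinct TXT values.
values = ["listed, reason %d" % (i % 5) for i in range(1000)]

grouped = sorted(values)            # equal values are adjacent
shuffled = values[:]
random.Random(0).shuffle(shuffled)  # equal values are scattered

print("grouped: ", stored_strings(grouped))
print("shuffled:", stored_strings(shuffled))
```

With the grouped input the model stores one string per distinct
value; with the shuffled input it stores far more, which is the
kind of effect described above for randomized data.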

Either way, my previous recommendation (possibly
pre-sorting the data by TXT RR) still applies.
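
A minimal sketch of such pre-sorting in Python, in case it is
useful.  It rests on assumptions that may not match your files:
every entry is one line of the form "KEY VALUE" separated by
whitespace, lines starting with "#" are comments, and no line
depends on its position in the file.  The function names are
made up for this example.

```python
def split_entry(line):
    """Split 'KEY VALUE' on the first run of whitespace.
    The VALUE part may be empty."""
    parts = line.split(None, 1)
    key = parts[0]
    value = parts[1].strip() if len(parts) > 1 else ""
    return key, value


def group_by_value(lines):
    """Return the lines with comments first, followed by the
    entries ordered so that equal VALUEs are adjacent."""
    comments = []
    entries = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue                      # drop blank lines
        if line.lstrip().startswith("#"):
            comments.append(line)         # assumed comment syntax
        else:
            entries.append(line)
    # sorted() is stable, so entries with the same VALUE keep
    # their original relative order.
    entries = sorted(entries, key=lambda l: split_entry(l)[1])
    return comments + entries


if __name__ == "__main__":
    import sys
    for out_line in group_by_value(sys.stdin):
        print(out_line)
```

Saved under some name of your choice, it reads the data file on
standard input and writes the reordered file to standard output.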

Please take a look at the new snapshot and tell me
if it works better or worse for you.  If worse, I
want to look at your data... ;)

Thanks.

/mjt