[rbldnsd] [Fwd: Re: rbldnsd and CIDR heirarchy]
Michael Tokarev
rbldnsd@corpit.ru
Thu, 07 Aug 2003 15:28:03 +0400
Forwarded here with permission from Aaron Hopkins.
-------- Original Message --------
Subject: Re: rbldnsd and CIDR heirarchy
Date: Thu, 07 Aug 2003 14:02:44 +0400
From: Michael Tokarev <mjt@tls.msk.ru>
Organization: Telecom Service, JSC
To: Aaron Hopkins <aaron{atte}die{dotte}net>
References: <Pine.LNX.4.44.0308062020260.25275-100000@asherah.die.net>
Aaron Hopkins wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
>
> Using ipv4set and CIDR notation, I'd like to be able to serve only the
> most-specific match for a given IP. For example, with:
>
> $SOA 2d noupdate.mxes.org. root.mxes.org. 0 1h 15m 14d 1d
> $NS 2d a.ns.mxes.org
> $NS 2d b.ns.mxes.org
> $TTL 1d
> 0.0.0.0/0:0.0.0.0:use=unallocated
Hmm. This one is invalid for rbldnsd. If you want to list the whole 'net,
you have to use at least two entries:
0/1
128/1
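For illustration only (this uses Python's standard ipaddress module to
check the arithmetic; it is not something rbldnsd does), the two halves
of the whole address space are:

```python
import ipaddress

# Split the whole IPv4 space (0.0.0.0/0) into its two /1 halves.
whole = ipaddress.ip_network("0.0.0.0/0")
halves = [str(net) for net in whole.subnets(prefixlen_diff=1)]
print(halves)  # ['0.0.0.0/1', '128.0.0.0/1']
```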
BTW, I prefer to use a tab instead of a colon as the delimiter between
key and value, e.g.
0.0.0.0/0 0.0.0.0:use=unallocated
(invalid anyway). BTW2. 0.0.0.0 will be converted to 127.0.0.2
(it is not treated as a valid A value).
> 209.151.224.0/19:use=allocated;asn=11051
> 209.151.224.1:use=allocated;asn=11051;flags=openproxy
> I'd like the following to happen:
>
> - - TXT query for 1.224.151.209.ip.mxes.org should only return
> "use=allocated;asn=11051;flags=openproxy"
>
> - - TXT query for 8.225.151.209.ip.mxes.org should only return
> "use=allocated;asn=11051;flags=openproxy"
I assume you mean it should return just "use=allocated;asn=11051"
for 8.225.151.209.ip.mxes.org.
> Right now, it looks like rbldnsd is not set up for this. It wants to return
> all matching TXT records. Is there something I'm missing?
Well, not quite. Rbldnsd keeps 4 sets of IPs: /32, /24, /16 and /8 networks.
A /25 entry is expanded into 128 /32 entries, i.e., it "splits" the supplied
data by octets. This is "how DNS works": there's no CIDR notation inside
in-addr.arpa and similar zones. If an entry is found in the /32 array, no other
arrays are searched; if it is not found there but is found in the /24 array,
the /16 and /8 arrays are not searched, and so on.
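The splitting and the search order described above can be sketched
roughly as follows. This is a Python model for illustration, not
rbldnsd's actual code: the names expand, add and lookup and the
dict-of-lists tables are made up for this sketch.

```python
import ipaddress

# Prefix lengths of the four arrays, most specific first.
LEVELS = (32, 24, 16, 8)

def expand(cidr):
    """Split a CIDR block into entries aligned on the next octet boundary."""
    net = ipaddress.ip_network(cidr)
    if net.prefixlen == 0:
        # As noted earlier in this mail, /0 is not accepted.
        raise ValueError("use 0/1 and 128/1 instead of /0")
    level = min(l for l in LEVELS if l >= net.prefixlen)
    start = int(net.network_address)
    step = 1 << (32 - level)               # size of one entry at this level
    count = 1 << (level - net.prefixlen)   # number of entries needed
    return level, [start + i * step for i in range(count)]

def add(tables, cidr, value):
    """Store value under every expanded entry of cidr."""
    level, keys = expand(cidr)
    for key in keys:
        tables[level].setdefault(key, []).append(value)

def lookup(tables, ip):
    """Search /32 first, then /24, /16, /8; stop at the first array that hits."""
    addr = int(ipaddress.ip_address(ip))
    for level in LEVELS:
        shift = 32 - level
        key = (addr >> shift) << shift      # clear the host bits for this level
        hits = tables[level].get(key)
        if hits:
            return hits
    return []

tables = {level: {} for level in LEVELS}
add(tables, "209.151.224.0/19", "use=allocated;asn=11051")
add(tables, "209.151.224.1", "use=allocated;asn=11051;flags=openproxy")
print(lookup(tables, "209.151.224.1"))   # found in the /32 array only
print(lookup(tables, "209.151.225.8"))   # falls through to the /24 array
```

In this model the /19 becomes 32 entries in the /24 array, and a /25
would become 128 entries in the /32 array.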
This is done for several reasons. One of them is that this form (4 arrays
of netmasks, "split" by bit length) is the most compact way to represent such
data. In order to store the original netmask too, rbldnsd would require 3/2
of its current memory usage (an additional field in a structure, plus proper
alignment). Another issue with storing the netmask is that lookups would need
to be done differently (e.g. using a patricia trie), which in turn requires
different data structures (e.g. for a patricia trie, each entry would be at
least 20 bytes long, compared to the current 8).
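The 3/2 figure can be reproduced with a bit of arithmetic. The layout
below is an assumption made only for this sketch (a 32-bit address plus
a 32-bit reference to the value, then a hypothetical 1-byte prefix
length); it is not taken from the rbldnsd source.

```python
import struct

# Assumed layouts, fixed-width so the sizes do not depend on the platform.
current = struct.calcsize("<II")    # 32-bit address + 32-bit value reference
packed = struct.calcsize("<IIB")    # same, plus a 1-byte prefix length
aligned = (packed + 3) // 4 * 4     # rounded up to 4-byte alignment

print(current, packed, aligned)     # 8 9 12
print(aligned / current)            # 1.5
```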
In most cases this approach is sufficient and works well. There are
cases where it does not work the way one expects, and, in particular,
not the way DJB's rbldns works (it only returns the most specific entry).
In the example you provided, rbldnsd will return exactly what you want,
because the /19 entry will be stored inside the /24 array, and the /32 entry
will be in the /32 array, which is searched first. However, if you change /19
to e.g. /25, rbldnsd will return two TXT records, one for each entry. This
whole behaviour IS inconsistent. But in fact, this is the result of
mixing different things together, not of the design. You're trying to mix
"networks by usage" together with hosts with various problems. Note
you explicitly repeated the "use=allocated;asn=11051" part for the proxy,
which has nothing to do with a proxy per se. Following this example
further, I guess you want just the opposite: TWO TXT records (one about
"use=xxx", and another about the proxy), not ONE as you asked for. E.g.:
1.224.151.209.ip.mxes.org IN TXT "use=allocated;asn=11051"
1.224.151.209.ip.mxes.org IN TXT "flags=openproxy"
Or even 3:
1.224.151.209.ip.mxes.org IN TXT "use=allocated"
1.224.151.209.ip.mxes.org IN TXT "asn=11051"
1.224.151.209.ip.mxes.org IN TXT "flags=openproxy"
The best way (imho) is to use several datasets for this, and combine
them later. E.g., one is by-ASN, and another with proxies:
rbldnsd ... \
ip.mxes.org:ip4set:by-asn \
ip.mxes.org:ip4set:proxy \
...
and maybe make them available in subzones too, e.g.:
rbldnsd ... \
by-asn.ip.mxes.org:ip4set:by-asn \
ip.mxes.org:ip4set:by-asn \
proxy.ip.mxes.org:ip4set:proxy \
ip.mxes.org:ip4set:proxy \
...
which will not double the memory requirements anyway: the data
will be reused. (Or even place all the data inside a
`combined' dataset, in one datafile).
This is a way to guarantee that rbldnsd will return ALL matching
records for a given IP, regardless of internal representation
(exactly the opposite of your original goal).
Note that supporting this separated data format will be
MUCH simpler too.
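The effect of serving several datasets under one zone can be modelled
like this. It is a simplified Python sketch: each dataset is a plain
list of (CIDR, TXT) pairs, the names match and query are invented here,
and the internal arrays are ignored.

```python
import ipaddress

# Two separate datasets, as in the by-asn / proxy split suggested above.
by_asn = [("209.151.224.0/19", "use=allocated;asn=11051")]
proxy = [("209.151.224.1/32", "flags=openproxy")]

def match(dataset, ip):
    """Return the TXT values of all entries in one dataset that cover ip."""
    addr = ipaddress.ip_address(ip)
    return [txt for cidr, txt in dataset if addr in ipaddress.ip_network(cidr)]

def query(datasets, ip):
    """Ask every dataset attached to the zone and concatenate the answers."""
    answers = []
    for dataset in datasets:
        answers.extend(match(dataset, ip))
    return answers

print(query([by_asn, proxy], "209.151.224.1"))
# ['use=allocated;asn=11051', 'flags=openproxy']
print(query([by_asn, proxy], "209.151.225.8"))
# ['use=allocated;asn=11051']
```

Because each dataset answers independently, the proxy entry no longer
has to repeat the "use=...;asn=..." part.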
Not sure about "use=xxx" however. Depending on your goals
(I can't yet see what exactly you're trying to achieve),
this information may or may not be inside by-asn dataset.
It seems you're trying to classify the whole IPv4 space,
which is umm.. difficult.
Either way, here we go. It *seems* you're trying
to do something "unusual" (as if there are "usual" things ;).
BTW, the question you asked is interesting. May I share
this email with the rbldnsd@corpit.ru mailing list subscribers?
(I wanted to Cc the list, but thought I'd ask first).
/mjt