[rbldnsd] enhanced dnset
Sami Farin
safari-rbldnsd at safari.iki.fi
Sat Nov 26 00:55:12 MSK 2005
On Sat, Nov 26, 2005 at 12:07:17AM +0300, Michael Tokarev wrote:
....
> >if you don't want to give TXT record, use
> >dls.net :\d{1,3}-\d{1,3}-\d{1,3}-\d{1,3}\.dls\.net
> >(PCRE pattern is always the last field delimited by ':').
> >
> >Does this sound sane?
> >Free tips'n'tricks?
>
> Well, I don't think it's necessary to invent new complicated syntax
> for such stuff. Plain regexps (maybe modified a bit to be easier
> to write for domains, where a dot (.) is commonly used) are just fine, ie,
I mentioned PCRE because I have used it before...
(I added PCRE support for qmail).
One pcre_exec call takes only around 3000 CPU cycles for
patterns like those mentioned in this email, so a 2GHz CPU
could do about 666666 of them per second.
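The estimate above is simple division. A minimal sketch of the arithmetic, assuming the 3000-cycle figure holds and ignoring everything else rbldnsd does per query (the function name is made up for illustration):

```python
def matches_per_second(clock_hz: float, cycles_per_match: float) -> float:
    """Upper bound on pattern matches per second for one CPU core."""
    return clock_hz / cycles_per_match

# 2 GHz clock, roughly 3000 cycles per pcre_exec call (figure from this email)
rate = matches_per_second(2e9, 3000)
print(int(rate))
```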
> instead of
>
> dls.net :\d{1,3}-\d{1,3}-\d{1,3}-\d{1,3}\.dls\.net
>
> it's sufficient to use just
>
> \d{1,3}-\d{1,3}-\d{1,3}-\d{1,3}\.dls\.net
>
> It's trivial to parse the regexp and extract a fixed ending part
> (.dls.net in this example).
Okay, that sounds sane.
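A rough sketch of how the fixed ending part might be extracted, in Python rather than rbldnsd's C. The function name fixed_suffix is hypothetical, and the parser only understands a small regexp subset (escapes, [...] classes, {m,n} and * + ? quantifiers); it gives up on groups and alternation instead of trying to be clever:

```python
def fixed_suffix(pattern: str) -> str:
    """Return the literal domain suffix the pattern must end with, or ''."""
    if pattern.endswith("$") and not pattern.endswith("\\$"):
        pattern = pattern[:-1]          # a trailing anchor is not part of the name
    tokens = []                         # (is_literal, char) per pattern element
    i = 0
    while i < len(pattern):
        c = pattern[i]
        if c == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            # \. and \- are literals; \d, \w and friends are classes
            tokens.append((not nxt.isalnum(), nxt))
            i += 2
        elif c == "[":
            j = pattern.find("]", i + 1)
            if j == -1:
                return ""
            tokens.append((False, ""))
            i = j + 1
        elif c == "{":
            j = pattern.find("}", i + 1)
            if j == -1:
                return ""
            if tokens:
                tokens[-1] = (False, "")  # quantified element is not fixed
            i = j + 1
        elif c in "*+?":
            if tokens:
                tokens[-1] = (False, "")
            i += 1
        elif c in "|()":
            return ""                   # groups/alternation: no safe answer here
        elif c in ".^$":
            tokens.append((False, ""))
            i += 1
        else:
            tokens.append((True, c))
            i += 1
    tail = []
    for is_literal, ch in reversed(tokens):
        if not is_literal:
            break
        tail.append(ch)
    literal = "".join(reversed(tail))
    if len(tail) == len(tokens):
        return literal                  # the whole pattern is a plain name
    # Keep whole labels only: drop the partial label before the first dot.
    dot = literal.find(".")
    return literal[dot + 1:] if dot != -1 else ""

print(fixed_suffix(r"\d{1,3}-\d{1,3}-\d{1,3}-\d{1,3}\.dls\.net"))
```

A pattern with no literal tail, such as .*dsl.*, yields an empty suffix, which is exactly the "sorts into the top-level domain" case discussed below.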
> I had a working prototype of similar stuff long time ago (it will
> not compile anymore as rbldnsd changed since), using shell-style
> wildcards (?*[]) instead of regexps, with highly optimized matcher.
> But the problem was -- it wasn't deterministic in speed. I know
> which stuff people will try to use (shell-style):
>
> *dsl*
> *[0-9][0-9][0-9]*
>
> etc. Ie, everything will sort into top-level domain, without
> any suffix whatsoever.
This I do not like.
> Which, besides the speed issue, has another
> problem: what to do if a name matches *several* patterns like
> that? Do we want to invent a "weight" for a regexp/pattern
> (like, more wildcard characters = less weight etc) and try
> to match every pattern we have, choosing the "best" one, or
> pick a random (which?) one? (Well, here, another approach
> can be used: "Order Matters", ie, first match found wins.)
>
> What I'd really like to see, and I already mentioned that
> (probably even in the TODO file) is some sort of "finite
> automata" implementation, like the one used in tools like
> lex or re2c, but run-time (as opposed to compile-time)
> changeable. For some reason I wasn't able to find such
> a library anywhere on the 'net...
>
> This approach guarantees near-constant response time for
> any number of (complex) expressions, and it will solve
> "which match to choose" problem as well (longest match
> wins).
>
> Yet there are more (albeit small and probably non-real-life)
> issues. Like, what to do with domain labels containing
> some "funny" characters like dots or \0s. Note that a
> domain name isn't really a string of characters, it's a
Yes. Maybe it's best just to send NXDOMAIN (+SOA if possible)
in those cases.
> structured entity (sequence of labels), pretty similar to
> filenames; and shell-style wildcards normally do not work
> "across" directory separators: /some/where/file does not
> match against /some*file.
>
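To illustrate the point about wildcards not crossing separators, here is a small Python sketch that matches label by label, the way a shell matches path components. The function label_match is hypothetical and not anything in rbldnsd; it uses fnmatch.fnmatchcase from the standard library on each label:

```python
from fnmatch import fnmatchcase

def label_match(pattern: str, name: str) -> bool:
    """Shell-style match where a wildcard never spans a dot."""
    p_labels = pattern.lower().rstrip(".").split(".")
    n_labels = name.lower().rstrip(".").split(".")
    if len(p_labels) != len(n_labels):
        return False                    # '*' cannot absorb extra labels
    return all(fnmatchcase(n, p) for p, n in zip(p_labels, n_labels))

print(label_match("*dhcp*.example.edu", "pc-dhcp-3.example.edu"))
print(label_match("*dhcp*.edu", "pc-dhcp-3.example.edu"))
```

Under this rule *dhcp*.edu only matches two-label names, so a pattern meant to catch hosts at any depth would need a different, explicit syntax.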
> To summarize: it isn't difficult to add support for regexps
> into rbldnsd, but usually, that will be at least O(N) complexity,
> where N is the number of entries found in the data file. Which
How so? If you get a query for
66-117-164-146.dls.net
then rbldnsd has to do these lookups for our new and nice dnsetenh:
net
dls.net
that makes two.
(Maybe dnsetenh could have a configurable max number of labels,
in case someone wants to DoS it through CPU-time usage.)
I thought rbldnsd could do the lookup for each label...
so that you could match *dhcp*.edu, for example,
without having to find every edu domain.
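A sketch of the per-suffix lookup I have in mind, in Python for brevity. Everything here is an assumption for illustration: the TABLE layout, the MAX_LABELS cap and the function names are made up, and a real dnsetenh would live in rbldnsd's C code with its own data structures:

```python
import re

MAX_LABELS = 4  # hypothetical cap on suffix lookups per query

# Hypothetical table: fixed suffix -> patterns filed under that suffix.
TABLE = {
    "dls.net": [re.compile(r"\d{1,3}-\d{1,3}-\d{1,3}-\d{1,3}\.dls\.net")],
    "edu": [re.compile(r".*dhcp.*\.edu")],
}

def candidate_suffixes(name: str, max_labels: int = MAX_LABELS) -> list:
    """Proper suffixes of name, shortest first: 'net', then 'dls.net', ..."""
    labels = name.lower().rstrip(".").split(".")
    count = min(len(labels) - 1, max_labels)
    return [".".join(labels[-n:]) for n in range(1, count + 1)]

def lookup(name: str, table: dict = TABLE):
    """Return (suffix, pattern) for the first pattern that matches, else None."""
    query = name.lower().rstrip(".")
    for suffix in candidate_suffixes(query):
        for rx in table.get(suffix, ()):
            if rx.fullmatch(query):
                return suffix, rx.pattern
    return None

print(candidate_suffixes("66-117-164-146.dls.net"))
print(lookup("pc-dhcp-12.cs.example.edu"))
```

Only the patterns filed under a suffix of the query are ever tried, so the cost depends on the number of labels in the query and on how many patterns share a suffix, not on the total number of entries in the data file.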
> I don't like...
>
> /mjt