udns questions

Tue Apr 19 15:01:23 MSD 2005

Markus Koetter wrote:
> Hi,
> 
> thanks for your long reply, i got no problems if you post it to the lists.

Ok, Cc'ing udns at corpit.ru.  I really appreciate this sort of discussion,
because the library is new and I have been almost the only one using it
so far, so I don't yet know what demands users have and whether I got
them right.  I want real usage scenarios and want to know which problems
occur, to be able to correct them as early as possible.  I *know*
there will be an API change, and I want it to occur earlier, not later,
for obvious reasons...

>> And speaking of TCP sockets, there's a single problem with them:
>> supporting them isn't easy from the application point of view.
>> Because TCP sockets will be dynamic -- unlike UDP ones, which
>> gets opened at initialisation time and remains up to the end.
>> There will be a need to register/unregister new working socket
>> in the application select-loop somehow.  And regardless of how
>> I tried, I can't think of a good API here.
> 
> im not that bad in socket coding in posix, and understand the problem ;)
> as long as if you only poll one single socket, that does not change 
> during the intervall, its easy to use and stuff.
> if you start opening a tcp socket in runtime, after you put the udp 
> socket to the poll loop, there has to be a way you can poll that socket 
> too.

Well, I know at least one way: an application registers a "socket callback"
routine, which will be called by udns each time a new socket needs to be
polled or an existing one is not needed anymore.  It is the only sane way
to implement this stuff without additional scalability problems (I mean,
to avoid setting up poll/select/whatever bits at every event-loop iteration).

This approach works, and works reasonably well, for everything except
one single case: synchronous dns requests.  Currently udns implements
both async and sync interfaces, with the sync interface implemented as a
very small poll-based loop inside the udns library.  So, in case we're
in the "sync mode" right now, udns should be able to figure that out
and modify its internal poll bits as well as execute application-supplied
callbacks.

Well.. probably I know how to do it.  I think.

So, to (again!) simplify things, I will allow a single TCP connection
at most, for only one request at a time.  This means we will only have
a single extra socket to watch for.  And this means that, if an app did
not register any callbacks (default state), no TCP connections will be
performed...  And of course this means the library will be unable to
do many tcp connections in parallel, but will still be able to
parallelize udp queries exactly as it does now.
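
Roughly, in code.  This is only a sketch of the idea: every name below
(struct resolver, resolver_set_sock_cb and so on) is invented for this
mail, nothing of this exists in udns.  It shows the two rules: no
callback registered means no TCP, and at most one TCP socket at a time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: what the library reports about a socket. */
enum sock_event { SOCK_ADD, SOCK_DEL };

/* Hypothetical application callback type. */
typedef void sock_cb_fn(int fd, enum sock_event ev, void *data);

/* Hypothetical resolver context, only the fields needed here. */
struct resolver {
  sock_cb_fn *cb;   /* NULL: the application registered nothing */
  void *cb_data;
  int tcp_fd;       /* -1 when no TCP socket is in use */
};

void resolver_init(struct resolver *r) {
  assert(r != NULL);
  r->cb = NULL;
  r->cb_data = NULL;
  r->tcp_fd = -1;
}

void resolver_set_sock_cb(struct resolver *r, sock_cb_fn *cb, void *data) {
  r->cb = cb;
  r->cb_data = data;
}

/* Library side: a new TCP socket was opened.  Returns 0 if the
 * application was told to watch it, -1 if TCP has to be skipped
 * (no callback registered, or one TCP socket is already active). */
int resolver_tcp_opened(struct resolver *r, int fd) {
  if (r->cb == NULL || r->tcp_fd >= 0)
    return -1;
  r->tcp_fd = fd;
  r->cb(fd, SOCK_ADD, r->cb_data);
  return 0;
}

/* Library side: the TCP socket is about to be closed. */
void resolver_tcp_closed(struct resolver *r) {
  if (r->tcp_fd < 0)
    return;
  if (r->cb != NULL)
    r->cb(r->tcp_fd, SOCK_DEL, r->cb_data);
  r->tcp_fd = -1;
}

/* Example application side: remembers the one extra fd to poll. */
struct app_state { int watched_fd; };

void app_sock_cb(int fd, enum sock_event ev, void *data) {
  struct app_state *app = data;
  app->watched_fd = (ev == SOCK_ADD) ? fd : -1;
}
```

A real application would call epoll_ctl() or kevent() (or update its
fd_set) inside app_sock_cb instead of just storing the number.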

Does it look good? ;)

> but i suck in dns, i know how it works, why we need it, know some stuff 
> about recursive dns requests, but .. i always just read it, never 
> implemented it myself, thats why i was asking you for udns
> 
> as i dont know how many sockets a single dns requests can take at all, 
> or if using tcp, i cant say that much about it.

The thing is very simple.  Initially we send out queries over UDP.  When
some nameserver replies, the reply can indicate it is truncated (with the
"tc" bit set in the header), and in this case we should repeat the same
query to this same nameserver over TCP -- at this time the nameserver
already knows the reply so it should (in theory) be fast.  So, there may
be at most one extra TCP connection (with a lot of stages! -- initial
connection request, sending query, waiting for (parts of) reply and so on)
per request.  After the reply is received, the connection will be closed.
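
To illustrate the two protocol details involved, here is a small sketch.
The function names are made up for this mail (they are not udns
routines); the bit position and the length prefix are from RFC 1035
(sections 4.1.1 and 4.2.2).

```c
#include <assert.h>
#include <stddef.h>

#define DNS_HDR_SIZE 12    /* fixed DNS header size */
#define DNS_TC_MASK  0x02  /* TC flag inside the third header byte */

/* Returns 1 if the reply has the TC (truncated) bit set, 0 if not,
 * and -1 if the packet is too short to contain a DNS header. */
int reply_is_truncated(const unsigned char *pkt, size_t len) {
  assert(pkt != NULL);
  if (len < DNS_HDR_SIZE)
    return -1;
  return (pkt[2] & DNS_TC_MASK) ? 1 : 0;
}

/* Over TCP every DNS message is preceded by its length as two bytes
 * in network byte order.  Fills in the prefix for a query of qlen
 * bytes; returns 0 on success, -1 if the query is too large. */
int tcp_length_prefix(size_t qlen, unsigned char prefix[2]) {
  if (qlen > 0xffff)
    return -1;
  prefix[0] = (unsigned char)(qlen >> 8);
  prefix[1] = (unsigned char)(qlen & 0xff);
  return 0;
}
```

So the retry is the very same query packet, just with those two bytes
written in front of it on the TCP stream.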

Of course we may encounter "bad" nameservers, which ask for a TCP connection
and reply with something bogus, or just refuse TCP connections for
whatever reason, so we will have to repeat the whole thing querying the next
nameserver -- in this case more than a single tcp socket can be required,
but not more than one at a time anyway.

> but maybe this could be done by using
> int dns_getsock(dns_context *);
> to return the sockets udns wants to poll no matter if its udp or tcp for 
> this session?

in this case it should be

  int dns_getsocks(dns_context *, int fd[], int fdcnt)

of course.  And this is something I want to avoid for *dynamic* sockets:
this way, we're forced to re-initialize every fd[i] in our fd_set.  Note
select/poll isn't the only available interface, and this approach works
poorly with "advanced" event-loop mechanisms such as epoll and kqueue.
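
To show what I mean by re-initializing, a sketch with select().  The
struct fake_ctx and getsocks() below are stand-ins invented for this
example (they only mimic the dns_getsocks() idea above, they are not
real udns code).

```c
#include <assert.h>
#include <sys/select.h>

#define MAX_SOCKS 4

/* Stand-in for the library: the sockets it currently wants polled. */
struct fake_ctx {
  int fds[MAX_SOCKS];
  int nfds;
};

/* Copies up to fdcnt sockets into fd[], returns how many were stored. */
int getsocks(const struct fake_ctx *ctx, int fd[], int fdcnt) {
  int i, n = ctx->nfds < fdcnt ? ctx->nfds : fdcnt;
  for (i = 0; i < n; ++i)
    fd[i] = ctx->fds[i];
  return n;
}

/* Has to run before *every* select() call, because the set of sockets
 * may have changed since the last iteration.  Returns the highest fd
 * placed into the set, or -1 if there is none. */
int rebuild_fdset(const struct fake_ctx *ctx, fd_set *rfds) {
  int fd[MAX_SOCKS], i, maxfd = -1;
  int n;
  assert(ctx->nfds <= MAX_SOCKS);
  n = getsocks(ctx, fd, MAX_SOCKS);
  FD_ZERO(rfds);
  for (i = 0; i < n; ++i) {
    FD_SET(fd[i], rfds);
    if (fd[i] > maxfd)
      maxfd = fd[i];
  }
  return maxfd;
}
```

With epoll or kqueue the kernel keeps the set of watched descriptors
between calls, so the application would have to compare the old and
new lists itself just to find out what changed.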

> if one is unable to get the socket each time to poll it, one could use
> int dns_init(request_type do_tcp, int do_open)
> with do_tcp false per default.
> so one uses only udp sockets that do not change.
> 
> if there is more than one socket udns wants to be polled ... there _is_ 
> a problem, so one has to make sure udns does not want to poll more than 
> one socket :\
> i dont know if dns allows this.

There IS a problem, as I described above: "normal" queries and other
application stuff should continue while we're dealing with our next TCP
socket.

> some words why i did not like adns
> - i was unable to set a timeout for each adns request myself

Hmm.  udns does not allow this either.  The timeout is per-context, not
per-request.  Is it a problem?

Note again: there will be either a simple but not-for-everything API, or
a very generic but difficult to use one.  Dunno whether you noticed that,
but some rr-specific routines do not accept a "flags" argument, while
others do.  Eg, the one asking for PTR accepts no flags, while the
one asking for A does.  This is because the only flag of interest here
is whether the DN being requested is "fqdn" or not, ie, whether to
perform default-domain-suffix search for the request or not.  Of course
there's no point in performing a search for the 1.2.3.4.in-addr.arpa. name.
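
For illustration, this is how such a reverse name is built from an
address.  The helper name is made up here (it is not the udns routine);
the point is that the result always ends with a dot, so it is already
fully qualified and there is nothing to search for.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Builds the reverse-lookup name for an IPv4 address given as four
 * bytes in network order: the octets are reversed and the suffix
 * "in-addr.arpa." is appended.  Returns 0 on success, -1 if the
 * buffer is too small. */
int ptr_name_v4(const unsigned char addr[4], char *buf, size_t buflen) {
  int n;
  assert(addr != NULL && buf != NULL);
  n = snprintf(buf, buflen, "%u.%u.%u.%u.in-addr.arpa.",
               (unsigned)addr[3], (unsigned)addr[2],
               (unsigned)addr[1], (unsigned)addr[0]);
  return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

So the address 4.3.2.1 gives 1.2.3.4.in-addr.arpa., the name used in
the example above.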

Now imagine each query-submitting routine has another parameter,
timeout (and, number of retries, and (which is quite useful sometimes!)
even a list of nameservers!) -- how complex the whole stuff will be!

Of course it is possible to modify some query-specific stuff after it
has been submitted.  I may even delay actual query dispatching till
the next event loop iteration (currently a newly submitted query is sent
right when it is submitted), to allow setting some per-query options.
Well, now that's an idea really -- this way I can even achieve some
code cleanups!.. ;)
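
A sketch of how that delayed dispatching could look.  Again, all the
names (struct q_ctx, q_submit, q_set_timeout, q_dispatch_pending) are
invented for this mail and this is not how udns works today: submit
only queues the query, options may be changed while it is queued, and
the event loop sends everything on its next iteration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PENDING 8

struct q_query {
  const char *name;
  int timeout;   /* seconds, starts as the per-context default */
  int sent;      /* 0 while the query is still only queued */
};

struct q_ctx {
  int default_timeout;
  struct q_query *pending[MAX_PENDING];
  int npending;
};

/* Queues the query without sending anything yet. */
int q_submit(struct q_ctx *c, struct q_query *q, const char *name) {
  assert(c != NULL && q != NULL);
  if (c->npending >= MAX_PENDING)
    return -1;
  q->name = name;
  q->timeout = c->default_timeout;
  q->sent = 0;
  c->pending[c->npending++] = q;
  return 0;
}

/* Per-query option: only allowed before the query went out. */
int q_set_timeout(struct q_query *q, int seconds) {
  if (q->sent || seconds <= 0)
    return -1;
  q->timeout = seconds;
  return 0;
}

/* Called once per event-loop iteration; returns the number of
 * queries dispatched. */
int q_dispatch_pending(struct q_ctx *c) {
  int i, n = c->npending;
  for (i = 0; i < n; ++i)
    c->pending[i]->sent = 1;  /* real code would send the packet here */
  c->npending = 0;
  return n;
}
```

This keeps the submit routines as simple as they are now: the extra
knobs become separate calls instead of extra parameters.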

> - i was unable to remove requests from the context

There should be a way to cancel requests in progress.  I will be
greatly surprised if adns does not have one.

> - one context for all my resolve requests

Multiple contexts aren't really necessary in almost all cases,
except these two:

  - when you're implementing some sort of dns debugging tool, like
    dig or host, which tries to query different nameservers
    (think of host -C for example).

  - when your app is multi-threaded -- in this case it is a good idea
    to have per-thread context; but in this case it is also possible
    to dedicate single thread for name resolution and caching.

> - internal polling

adns allows you to perform external polling too.  Like (from memory),
adns_before_select() and adns_after_select() -- the first to fill in
bits in fd_sets and to modify timeouts, and the second to process
results.  It also has adns_{before,after}_poll().  This approach
works well for select() and poll(), and it is easier than using a
callback to register sockets (as I proposed above), but it isn't
that universal (it does not work for epoll and kqueue for example --
btw, that was another reason I wrote my own library).

> - no ipv6 for years, some inoffical patches, but ... nothing real handy

I don't think udns really supports IPv6.  If you know about sockets,
you should understand this (and maybe suggest something too): with
the current "single socket for all" approach used in udns, AND both
IPv4 and IPv6 nameservers specified in resolv.conf, udns will open
a single IPv6 socket and use V4MAPPED addresses to query IPv4
nameservers.  I verified it works on my linux machines without any
IPv6-related "external" stuff (I just activated the IPv6 protocol and
specified a single bogus v6 address in resolv.conf for testing --
tcpdump shows that v4 requests were sent to the real nameservers).
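
For those who have not seen V4MAPPED addresses: an IPv4 address a.b.c.d
is written as the IPv6 address ::ffff:a.b.c.d, and a packet sent to it
from an AF_INET6 socket goes out as plain IPv4.  A minimal sketch of
building such an address (the function name is made up for this mail,
it is not the code udns uses):

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Fills *sa with the IPv4-mapped IPv6 socket address for the given
 * IPv4 address and port (port in host byte order). */
void make_v4mapped_sa(const struct in_addr *in4, unsigned short port,
                      struct sockaddr_in6 *sa) {
  assert(in4 != NULL && sa != NULL);
  memset(sa, 0, sizeof *sa);
  sa->sin6_family = AF_INET6;
  sa->sin6_port = htons(port);
  /* First 10 bytes stay zero, then ff ff, then the four IPv4 bytes. */
  sa->sin6_addr.s6_addr[10] = 0xff;
  sa->sin6_addr.s6_addr[11] = 0xff;
  memcpy(&sa->sin6_addr.s6_addr[12], &in4->s_addr, 4);
}
```

The result can be passed to sendto() on the single IPv6 socket, the
same way as a "real" v6 nameserver address.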

Of course, with multiple sockets instead of only one, this problem
(and some others - like detecting dead nameservers quickly instead
of after a timeout) will go away, but others will come in.

Speaking of which (others).  I *still* (after several years of
experiments in various applications!) can't understand how to work
properly with connected udp sockets.  The problem is that a socket
operation may return an error related to a *previous* operation,
and there should be a way to tell which op *really* failed.
Right now udns uses only one unconnected socket and just ignores
all errors (in fact, it does not see those errors at all).  But
in case of a single connected socket used for multiple requests,
error handling becomes a problem (and proper error handling
improves things compared to the current approach).

> - if adns was able to set timeout values per request, remove requests 
> from context etc ... bad documentation
> - hatred with the void pointers

Well, void pointers aren't that bad really, but yes, it's too easy
to pass the wrong pointer to the wrong routine and the like...

> - ugly struct i had to get my results from, been a pain to find out how 
> to get the ips by rr domains

Are udns structs any better?  I found them pretty ugly too... ;)
Especially now when I'm thinking about adding some decoders for
some more request types, with processing of additional sections --
like, to request MX records *together* with all A/AAAA records
necessary to perform actual mail delivery, or to request *and
verify* a PTR record - get PTR, get A/AAAA and compare with
original A/AAAA -- the structs will be quite.. ugly I think.

> - really sucking naming all over the code
> 
> + i prefer more controll to more features with less controll in apis
> 
> all in all i will use udns now, as you provide debian packages, and i 
> like you and the api, even if udns doesnt support tcp for now.
> 
> if missing tcp really gets a problem to me, i will try to patch it 
> myself if you help me (as said before i suck in dns)

The main and only real problem, as I already pointed out in my
previous email, isn't the implementation, but choosing the right
interface.  Implementing it is not an issue at all.  And for
a good API, you are already helping a lot!

/mjt