lots of useless recvfrom() calls

Lennert Buytenhek buytenh at wantstofly.org
Sun Jun 3 21:13:23 MSK 2018


On Sun, Jun 03, 2018 at 12:52:06AM +0300, Lennert Buytenhek wrote:

> > > > I am using the patch below to avoid one useless recvfrom() call (that
> > > > returns EAGAIN) for every call to dns_ioevent().  Under low load, where
> > > > every call to dns_ioevent() processes one DNS reply, this saves half of
> > > > the total number of recvfrom() calls.
> > > 
> > > Is it really a problem under low load? If the load is low, it doesn't
> > > really matter how many recvfrom() system calls are made, I think.
> > 
> > 'low load' here is 'a low expected number of reply packets per POLLIN
> > cycle', but even for a large number of queries per second and multiple
> > parallel queries, I would not expect there to be more than one reply
> > packet per POLLIN cycle very often.
> > 
> > Is it a 'problem' per se?  Well, probably not, since it's not a
> > correctness issue.  But it looks ugly in strace, and the difference in
> > (system) CPU time used is visible in a microbenchmark (kPTI woohoo),
> > and it's basically just stealing some CPU cycles from more useful tasks.
> > 
> > (Unfortunately, my only benchmark is synthetic, but I've attached it
> > below.  It needs ivykis, which should be in your distro, and iv_udns
> > which I CCd you on.  Run it as:
> > 
> > 	$ ./bench <parallel_queries> <seconds_to_run_for>
> > 
> > )
> 
> I've been testing with the ugly patch below, which adds dns_ioevent_lt(),
> an alternative version of dns_ioevent() that only ever reads from the
> receive fd once per call.  (The naming is horrible, I agree.  Maybe
> something like dns_ioevent_once() would be better?)
> 
> My benchmark runs between two Skylake machines on a gigabit LAN: one
> runs bind serving an authoritative zone, and the other runs the
> benchmark program I sent earlier, doing 15 second runs of DNS queries
> with 1 concurrent outstanding query.  Alternating runs use
> dns_ioevent() ("stock") and dns_ioevent_lt() ("earlyout"), and I
> measure the user CPU time consumed by each 15 second run.  (The local
> network is slightly noisy and sometimes drops queries, so I discard
> data for runs where we had to retransmit.)
> 
> The benchmark is still running, but so far I have this:
> 
> $ ministat -c 99.5 -w 73 -q stock.user earlyout.user
> x stock.user
> + earlyout.user
>     N           Min           Max        Median           Avg        Stddev
> x 747          0.17          0.94          0.57    0.55676037    0.10852262
> + 743          0.19          0.86          0.54     0.5276716    0.10745821
> Difference at 99.5% confidence
>         -0.0290888 +/- 0.0172899
>         -5.22465% +/- 3.10545%
>         (Student's t, pooled s = 0.107993)
> $
> 
> This shows a small but statistically significant difference in user time
> between the recv-until-EAGAIN and recv-only-once approaches, where the
> latter uses on average 5.2% less user CPU time (+/- 3.1%).  This is on
> kernel 4.16.12, with kPTI enabled.

After ~1800 more runs:

$ ministat -c 99.5 -w 73 -q stock.system earlyout.system
x stock.system
+ earlyout.system
    N           Min           Max        Median           Avg        Stddev
x 2634          0.42           2.3          1.47     1.4234169    0.30632381
+ 2632          0.42          2.19           1.4     1.3647264    0.30456105
Difference at 99.5% confidence
        -0.0586904 +/- 0.0260124
        -4.12321% +/- 1.82746%
        (Student's t, pooled s = 0.305444)
$

And:

$ ministat -c 99.5 -w 73 -q stock.user earlyout.user
x stock.user
+ earlyout.user
    N           Min           Max        Median           Avg        Stddev
x 2634          0.17          0.94          0.56    0.54850797    0.11372476
+ 2632           0.1          0.86          0.52    0.51102204    0.10885037
Difference at 99.5% confidence
        -0.0374859 +/- 0.00947987
        -6.83416% +/- 1.7283%
        (Student's t, pooled s = 0.111315)
$

So system time is down by ~4% and user time by ~7%, with >= 99.5%
confidence.

