Hi Robert,
While we are at it, I think this is a good point to analyze.
So here is a brief analysis of the rt_rcu patch from the overhead
standpoint -
1. The read side has no overhead: we simply don't take the per-bucket lock.
2. For just the route cache portion of the code, RCU comes into the
   picture only when dst entries are deleted, and even that is
   infrequent for two reasons: a> expiry of dst entries is checked
   through an infrequent timer, and b> the lease on recently used dst
   entries is extended. So we don't do frequent RCU-based deletion of
   dst entries. Periodically a set of dst entries expires and, instead
   of freeing them immediately, we just put them on RCU queue(s) to be
   freed after the grace period (call_rcu() in rt_free(); see the
   sketch below this list).
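To make the deferred-free part concrete, here is a rough sketch of the
shape I have in mind. rt_free() and dst_free() are the existing route
cache functions; the rcu_head field in the dst entry and the exact
call_rcu() signature are assumptions for illustration and depend on
which version of the RCU patch you look at.

/*
 * Sketch only, not the literal patch code.  rt_free() no longer frees
 * the dst entry directly; it queues a callback so the actual free
 * happens after the grace period, when no CPU can still be walking
 * the hash chain without the bucket lock.
 */
static void dst_rcu_free(void *arg)		/* assumed callback name */
{
	struct dst_entry *dst = arg;

	/* Exactly the work the non-RCU code did under the bucket lock. */
	dst_free(dst);
}

static __inline__ void rt_free(struct rtable *rt)
{
	/* The rcu_head embedded in struct dst_entry is assumed to be
	 * added by the patch; the three-argument call_rcu() form is
	 * also an assumption here. */
	call_rcu(&rt->u.dst.rcu_head, dst_rcu_free, &rt->u.dst);
}
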
Coming to the RCU mechanism -
1. Grace period detection: Different RCU algorithms do it differently;
   however, if there is no RCU work pending, *nothing* is done for
   grace period detection. One rcu implementation uses a 10ms timer to
   check for grace period completion, and another, rcu_poll, uses a
   repeating tasklet to poll for it. The detection itself is based on
   a per-cpu context switch counter (a toy model is sketched below
   this list). I have not seen significant profile counts for the
   grace period detection scheme, but I will nevertheless put up the
   profile counts for Dave's test at the LSE website.
2. Actual update: RCU processes the batched update callbacks from
   tasklet context (see the second sketch below this list). The rt_rcu
   callbacks don't do anything other than call dst_free(), which the
   non-RCU code would have called under the lock in any case. I am not
   sure whether doing this from tasklet context adds any overhead, and
   I suspect that it doesn't.
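To illustrate what "based on a per-cpu context switch counter" means in
point 1, here is a small standalone toy model (plain userspace C, not
the kernel code, and compilable on its own): a snapshot of the counters
is taken when a batch of callbacks starts waiting, and the grace period
is over once every CPU's counter has moved past its snapshot.

#include <stdio.h>

#define NR_CPUS 4

/* Bumped on every context switch on that cpu (the quiescent state). */
static unsigned long ctxt_switches[NR_CPUS];

/* Per-cpu snapshot taken when a grace period starts. */
static unsigned long snapshot[NR_CPUS];

static void start_grace_period(void)
{
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		snapshot[cpu] = ctxt_switches[cpu];
}

/* Complete once every cpu has context switched at least once since
 * the snapshot, i.e. has passed through a quiescent state. */
static int grace_period_complete(void)
{
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (ctxt_switches[cpu] == snapshot[cpu])
			return 0;
	return 1;
}

int main(void)
{
	int cpu;

	start_grace_period();
	printf("complete? %d\n", grace_period_complete());	/* 0 */

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		ctxt_switches[cpu]++;		/* pretend each cpu switched */
	printf("complete? %d\n", grace_period_complete());	/* 1 */
	return 0;
}

The 10ms timer and the rcu_poll tasklet are just two different ways of
running this kind of completion check; when nothing is queued, neither
does any work, which is where the "*nothing* is done" above comes from.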
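For point 2, a similarly simplified sketch of what "processes the
batched update callbacks from tasklet context" amounts to. A single
global list is used below just to show the idea; the struct layout and
list handling are mine, not the patch's.

#include <stddef.h>

/* Simplified callback batching, not kernel code. */
struct rcu_head {
	struct rcu_head *next;
	void (*func)(void *arg);
	void *arg;
};

static struct rcu_head *rcu_batch;	/* callbacks waiting for the grace period */

/* What call_rcu() boils down to on the update side: queue the
 * callback, nothing more. */
static void call_rcu(struct rcu_head *head, void (*func)(void *arg), void *arg)
{
	head->func = func;
	head->arg = arg;
	head->next = rcu_batch;
	rcu_batch = head;
}

/* What the tasklet does once the grace period for the batch is over:
 * walk the list and invoke each callback.  For rt_rcu every callback
 * is just a dst_free() that the locked code would have done anyway. */
static void process_rcu_batch(void)
{
	struct rcu_head *head = rcu_batch;

	rcu_batch = NULL;
	while (head != NULL) {
		struct rcu_head *next = head->next;

		head->func(head->arg);
		head = next;
	}
}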
Comments/suggestions?
Thanks
--
Dipankar Sarma <dipankar@in.ibm.com> http://lse.sourceforge.net
Linux Technology Center, IBM Software Lab, Bangalore, India.