Re: Missed jiffies

george anzinger (george@mvista.com)
Sat, 16 Feb 2002 08:05:19 -0800


Tyson D Sawyer wrote:
>
> My system loses about 8 seconds every 20 minutes. This is reported
> by ntp and verified by comparing 'date' to 'hwclock --show' and a wall
> clock.
>
> My system is an x86 Dell laptop with HZ=1024.
>
> I am quite certain that the issue is the System Management Interrupt
> (SMI).
>
> While doing latency tests I have observed 18ms delays every 2 seconds.
> Like, as they say, clockwork. Given that with HZ==1024 roughly 18
> timer interrupts should occur in 18ms, 17 of them (I believe)
> would be lost. Looking in the kernel sources I could find nothing
> that adjusts for this.
>
> Since I have defined HZ to be 1024, I miss lots of timer interrupts.
> However, since the processor spends 18ms at a time in SMM (System
> Management Mode), even the stock 10ms timer tick will sometimes
> miss a tick. Thus the problem applies to non-hacked kernels also.
>
> I don't know that there is a solution for all systems, however, at
> least on pentium systems it seems possible to use the TSC to catch
> this. However, even if I worked up a patch to do so, do_timer()
> always increments jiffies by just 1 count and it isn't clear that it's
> safe to call it repeatedly to catch up with lost ticks. It also isn't
> clear that it would be safe to modify jiffies directly in one of the
> arch/i386/kernel/time.c functions.
>
> In general, I'd like to try a solution that looks something like:
>
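> tsc_per_jiffie = cpu_khz * 1000 / HZ;   /* TSC cycles per tick */
>
> /* cycles elapsed since the previous tick (new reading minus old) */
> tsc_remainder += tsc_low - last_tsc_low;
> last_tsc_low = tsc_low;
> jiffies_increment = 0;
> do {
>         /* always count at least the tick that raised this interrupt */
>         tsc_remainder -= tsc_per_jiffie;
>         jiffies_increment++;
> } while (tsc_remainder >= tsc_per_jiffie);
>
> do_timer(regs, jiffies_increment);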
>
> The above was created on the fly and completely untested. It needs
> bits like making sure that the arithmetic works properly on overflow
> of tsc_low. It also requires a patch to do_timer() and proper
> structuring for portability.

You might take a look at the high-res-timers patch (see URL in
signature) where the timer interrupt is separated from the wall clock
computation. In that patch, do_timer() updates jiffies as needed and
then calls the wall clock update which can handle more than one jiffie
at a time.
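
To make the "more than one jiffie at a time" idea concrete, here is a
minimal user-space sketch of counting elapsed ticks from a free-running
cycle counter. It is not code from the high-res-timers patch; the
struct tick_state and the ticks_elapsed() helper are hypothetical names
made up for illustration, and it assumes fewer than 2^32 cycles pass
between calls.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bookkeeping for one tick source; not a kernel structure. */
struct tick_state {
    uint32_t last_tsc_low;     /* low 32 bits of the counter at the previous call */
    uint32_t remainder;        /* cycles not yet accounted to a whole tick */
    uint32_t cycles_per_tick;  /* e.g. cpu_khz * 1000 / HZ */
};

/*
 * Return how many whole ticks elapsed since the previous call.
 * Unsigned subtraction yields the correct delta even when the low
 * 32 bits of the counter wrap, provided fewer than 2^32 cycles passed.
 */
static unsigned long ticks_elapsed(struct tick_state *ts, uint32_t tsc_low)
{
    uint64_t cycles;
    unsigned long ticks;

    assert(ts->cycles_per_tick > 0);

    /* new reading minus old reading, plus what was left over last time */
    cycles = (uint64_t)(uint32_t)(tsc_low - ts->last_tsc_low) + ts->remainder;
    ts->last_tsc_low = tsc_low;

    ticks = (unsigned long)(cycles / ts->cycles_per_tick);
    ts->remainder = (uint32_t)(cycles % ts->cycles_per_tick);
    return ticks;
}
```

Carrying the remainder forward means a late interrupt is charged with
every tick it covered, and an early one can report zero ticks, so the
caller has to tolerate a zero increment.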

One of the nasty problems, especially with machines such as yours (i.e.
laptops), is the fact that the TSC is NOT clocked at a fixed rate. It is
affected by throttling (reduced in 12.5% increments) and by power
management. The patch attempts to find a way through these problems by
making the ACPI pm timer one of the options for keeping wall clock time.
This timer is clocked at a constant rate regardless of power management;
indeed, it was created to address just these concerns. The down side is
that it is accessed via an I/O instruction and thus adds overhead to the
tick processing and also to all attempts to read system time to a finer
level than the jiffie (most of which are internal, i.e. not from user
land).
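
As a rough illustration of the arithmetic involved with the pm timer,
the sketch below turns two raw counter readings into elapsed
microseconds. The timer runs at 3.579545 MHz and may be only 24 bits
wide, so it wraps roughly every 4.7 seconds. The helper names are
hypothetical and the actual port read is left out; this is not how the
patch itself is written.

```c
#include <assert.h>
#include <stdint.h>

#define PMTMR_HZ      3579545u      /* ACPI PM timer frequency in Hz */
#define PMTMR_MASK24  0x00FFFFFFu   /* the counter may be only 24 bits wide */

/* Ticks between two raw readings; correct across a single 24-bit wrap. */
static uint32_t pmtmr_delta(uint32_t prev, uint32_t now)
{
    return (now - prev) & PMTMR_MASK24;
}

/* Convert a tick delta to whole microseconds (truncating). */
static uint32_t pmtmr_to_usec(uint32_t ticks)
{
    /* a delta produced by pmtmr_delta() never exceeds the 24-bit mask */
    assert(ticks <= PMTMR_MASK24);
    return (uint32_t)(((uint64_t)ticks * 1000000u) / PMTMR_HZ);
}
```

Masking the difference rather than each reading keeps the helper usable
on hardware that implements the full 32-bit counter, as long as the
readings are taken less than one 24-bit period apart.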
>
> One problem I see is that tsc_per_jiffie must be perfect or time will
> drift. I think it might work to not carry over the remainder from
> cycle to cycle under some conditions (no missed ticks) but I'd have
> to think about the effects of timing jitter on this.
>
> Have attempts to address this problem been made before?
>
> What are the problems with incrementing jiffies by more than 1?
>
> What problems have I missed?
>
> What strategies might be employed to prevent degraded system
> performance since this code is in a critical path?
>
> Have I completely missed something, the kernel already takes care of
> this and I have the problem all wrong?
>
> This problem also comes up with IDE access with dma off and I've
> seen reports of it when using frame buffers.

The IDE issue is correctly addressed by using DMA.

I think the real problem needs to be addressed, i.e. why does the SMI
(and/or other code) keep the interrupt system off so long? Most
interrupts are completed in microseconds, not milliseconds; let's fix
the real problem.
>
> Thanks!
> Ty
>
> --
> Tyson D Sawyer iRobot Corporation
> Senior Systems Engineer Military Systems Division
> tsawyer@irobot.com Robots for the Real World
> 603-532-6900 ext 206 http://www.irobot.com
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

-- 
George           george@mvista.com
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Real time sched: http://sourceforge.net/projects/rtsched/