Re: ps performance sucks

Andrew Morton (akpm@digeo.com)
Thu, 07 Nov 2002 14:21:56 -0800

Messages sorted by: [ date ][ thread ][ subject ][ author ]
Next message: David Woodhouse: "Re: swsusp: don't eat ide disks"
Previous message: jw schultz: "Re: Why are exe, cwd, and root priviledged bits of information?"

"Albert D. Cahalan" wrote:
>
> Andrew Morton writes:
> > "Albert D. Cahalan" wrote:
>
> >> In case you happen to know where they are, I'm looking for these:
> >>
> >> pages reclaimed
> >
> > /proc/vmstat:pgsteal
>
> That's a funny name for it. Sure about that? Longer description
> of what I'm looking for:
>
> reattaches from reclaim list
> Number of pages that have been faulted while on the inactive list

Ah. No, we don't account that. That would be "minor faults
against pagecache".

> To me, "pgsteal" sounds like pages grabbed from a clean list to
> be used for some new purpose.

Yes, it is. "page reclaim"

> >> minor faults
> >
> > /proc/vmstat:pgfault - /proc/vmstat:pgmajfault
> >
> >> COW faults
> >> zero-page faults
> >
> > These are not available separately
>
> They count as minor faults?

Yes.

> >> anticipated short-term memory shortfall
> >
> > hm. tricky.
>
> How about these then? (and would you want them?)
>
> a. urgency level for the need to free up memory

It's a bit hard to put one's finger on what this means
really. We have the free-pages info in /proc/meminfo,
and the page-stealing rates in /proc/meminfo.

If we see that kswapd is stealing pages like mad then we
know there's a lot of replacement pressure, but that certainly
doesn't mean that the system is under any sort of difficulty.

I guess one should step back and ask "what are we trying to
report here"?

> b. amount (or %) by which the system is overcommitted

That's approximately /proc/meminfo:Committed_AS / total memory.

> >> pages freed
> >
> > /proc/vmstat:pgfree
> >
> > This is a little broken in 2.5.46. pgfree is accumulated
> > _before_ the per-cpu LIFO queues and pgalloc is accumulated _after_
> > the per-cpu queues (or vice versa) so they're out of whack.
>
> Can I assume it will be fixed soon? Is this a value you'd like?

Yes, I have a fix for that queued.

pgfree will include freeings from programs exitting, munmapping,
truncating, etc. I think it's not a very interesting metric
for system behaviour.

/proc/vmstat:pgsteal is more interesting. It shows the rate
at which the kernel is reclaiming cache (pagecache and swapcache)
to satisfy its memory demands.

> >> pages scanned by page-replacement algorithm
> >
> > /proc/vmstat:pgscan
> >
> >> clock cycles by page replacement algorithm
> >
> > Not available. Could sum up the CPU across all kswapd instances,
> > which is a bit lame.
>
> I suspect that it's cycles of the page aging "clock" hand,
> not CPU cycles. So that would be pages scanned divided by
> the average number of pages in a full scan.

OK. /proc/vmstat:pgscan is incremented when the VM considers
a page for replacement. You can divide this by Active+Inactive
from meminfo to determine the scanning rate.

Also, pgscan/pgsteal is a metric of the efficiency of page
reclaim. If it's 1.00 then every page coming off the tail of
the inactive list is being reaped. If it gets much below 0.3
or so then the VM is having quite some difficulty.

> >> number of system calls
> >
> > Not available
>
> I though so. Bummer. I guess this is due to overhead.
>
> >> number of forks (fork, vfork, & clone) and execs
> >
> > /proc/stat: processes
>
> That's fork/vfork/clone all together, w/o execs?

Looks like it, yes.

> (good for "vmstat -f", but poor for "vmstat -s")
>
> Got one more:
>
> wired pages
> Total number of pages that are currently in use
> and cannot be used for paging

I guess this should include:

- Pages in use by the kernel (kmalloc, kernel stacks etc).
(These are mostly accounted for in /proc/meminfo:Slab)

- Pages which are mlocked (there is no accounting of these at all)

- PageReserved pages

- Not much else, really. Maybe pages which are under direct-IO.

PageReserved accounting is simple enough, but it looks like the
problem which needed that metric can be solved by other means.

mlock accounting is tricky, and without that the value of this is a
bit questionable.

> Thanks for all the help. BTW, you didn't say if you liked the
> proposed changes, so I'm assuming they don't matter to you.

I like anything which improves the observability of kernel
behaviour ;)
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Next message: David Woodhouse: "Re: swsusp: don't eat ide disks"
Previous message: jw schultz: "Re: Why are exe, cwd, and root priviledged bits of information?"