As I said earlier, much of the kernel behavior information oprofile
requires is made available by LTT. Instead of redirecting the syscall
table, for example, LTT hooks itself within the syscall path to obtain
all syscall entries and exits (all contained within #ifdef's of course).
The same goes for important events such as forks/reads/writes/scheduling
etc. The mapping of PID to the name of an executable, for instance,
is easily extracted from this information.
As with other applications/uses which require insight into the dynamics
of the kernel, it would seem to me that oprofile would greatly benefit
from the integration of the LTT patch in the mainline kernel. If
nothing else, oprofile could use the LTT collection engine to forward
its data to user-space, much like DProbes currently does.
By the same token, the LTT collection engine could replace the slew
of per-driver tracing systems already in the kernel, providing therefore
a uniform tracing system:
drivers/char/ftape/lowlevel/ftape-tracing.c
drivers/char/ip2/ip2trace.c
drivers/char/dtlk.c
drivers/char/mwavedd.h
drivers/char/n_r3964
drivers/acpi/utilities/utdebug.c
drivers/cdrom/sbpcd.c
drivers/isdn/eicon/eicon_mod.c
drivers/scsi/gdth.c
drivers/scsi/megaraid.c
drivers/scsi/qlogicfc.c
drivers/scsi/qlogicisp.c
drivers/net/wavelan.c
drivers/net/skfp/hwmtm.c
drivers/net/pcmcia/wavelan_cs.c
drivers/net/wireless/orinoco.c
drivers/net/wireless/orinoco_cs.c
drivers/video/radeonfb.c
drivers/usb/pwc.h
drivers/usb/hpusbscsi.c
include/linux/jdb.h for fs/ext3/*.c and fs/jdb/*.c
net/irda/irnet/irnet.h
etc.
The above list is but a sample of the files containing actual code
implementing tracing and/or data collection engines. There a great
deal many files that actually have trace points already in them. A
simple "grep -r TRACE *" provides an interesting insight to the
number of subsystems requiring tracing, each implementing their own
scheme.
It is time to provide a uniform tracing and high-throughput data
collection engine for all to use. LTT has already been field-tested
for these purposes and is very easily extended to include any
additional functionality required.
Any comments/thoughts are greatly appreciated.
Cheers,
Karim
John Levon wrote:
>
> On Mon, Jul 08, 2002 at 10:52:36AM -0700, Linus Torvalds wrote:
>
> > - I'd associate each profiling event with a dentry/offset pair, simply
> > because that's the highest-level thing that the kernel knows about and
> > that is "static".
>
> This makes sense, I think.
>
> > - I'd suggest that the profiler explicitly mark the dentries it wants
> > profiled, so that the kernel can throw away events that we're not
> > interested in. The marking function would return a cookie to user
> > space, and increment the dentry count (along with setting the
> > "profile" flag in the dentry)
>
> For a system-wide profiler, this needs to be /all/ dentries that get
> mapped in with executable permissions, or we lose any mappings of shared
> libraries we don't know about etc. Essentially, oprofile wants samples
> against any dentry that gets mmap()ed with PROT_EXEC, so this marking
> would really need to happen at mmap() time. Missing out on any dentry
> profiles amounts to data loss in the system profile and has the
> potential to mislead.
>
> > - the "cookie" (which would most easily just be the kernel address of the
> > dentry) would be the thing that we give to user-space (along with
> > offset) on profile read. The user app can turn it back into a filename.
> >
> > Whether it is the original "mark this file for profiling" phase that saves
> > away the cookie<->filename association, or whether we also have a system
> > call for "return the path of this cookie", I don't much care about.
> > Details, details.
> >
> > Anyway, what would be the preferred interface from user level?
>
> oprofile currently receives eip-pid pairs, along with the necessary
> syscall tracing info needed to reconstruct file offsets. The above
> scheme removes the dependency on the pid, but this also unfortunately
> throws away some useful information.
>
> It is often useful to be able to separate out shared-library samples on
> a per-process (and/or per-application) basis. Any really useful profile
> buffer facility really needs to preserve this info, but just including
> the raw pid isn't going to work when user-space can't reconstruct the
> "name" of the pid (where "name" would be something "/bin/bash") because
> the process exited in the meantime.
>
> The same goes for kernel samples that happen in process context.
>
> So this might work well in tandem with some global process-tree tracing
> scheme, but I don't know what form that might take ...
>
> (Then there are kernel modules, but that's probably best served by
> patching modutils)
>
> regards
> john
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
===================================================
Karim Yaghmour
karim@opersys.com
Embedded and Real-Time Linux Expert
===================================================
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/