This would be the easiest approach, in the sense that application authors take care of their own state and kernel developers only need to define the rules/interfaces.
One scheme is to define a new signal number (e.g., SIGCKPT). When we send that signal to a process, it checkpoints itself (saves everything it needs for a restart). Then we define another signal (e.g., SIGRSUM). When we send that signal to the process, it knows that it should resume from the last checkpoint. This is user-level checkpoint/restart, and there are already packages available for it (Condor, libckpt, etc.).
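To make the signal scheme concrete, here is a rough user-level sketch, not a definitive implementation. SIGCKPT and SIGRSUM do not exist as real signals, so the sketch maps them onto SIGUSR1 and SIGUSR2 as stand-ins. The names app_state, save_state, load_state, install_handlers and poll_requests are all made up for illustration, and a real application would have far more state to save than one counter. The handler only sets a flag; the actual file I/O happens later at a safe point in the application's main loop, since very little is safe to do inside a signal handler.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>

/* Assumption: SIGCKPT/SIGRSUM are hypothetical, so borrow the user signals. */
#define SIGCKPT SIGUSR1
#define SIGRSUM SIGUSR2

/* Hypothetical application state; a real program would save much more. */
struct app_state {
    long counter;
};

static volatile sig_atomic_t want_ckpt = 0;
static volatile sig_atomic_t want_resume = 0;

/* The handler only records the request; the work is done in poll_requests(). */
static void on_signal(int sig)
{
    if (sig == SIGCKPT)
        want_ckpt = 1;
    else if (sig == SIGRSUM)
        want_resume = 1;
}

/* Register on_signal() for both stand-in signals. Returns 0 on success. */
int install_handlers(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGCKPT, &sa, NULL) != 0)
        return -1;
    if (sigaction(SIGRSUM, &sa, NULL) != 0)
        return -1;
    return 0;
}

/* Write the state to the checkpoint file. Returns 0 on success. */
int save_state(const char *path, const struct app_state *s)
{
    FILE *f;
    size_t n;

    assert(path != NULL && s != NULL);
    f = fopen(path, "wb");
    if (f == NULL)
        return -1;
    n = fwrite(s, sizeof *s, 1, f);
    if (fclose(f) != 0 || n != 1)
        return -1;
    return 0;
}

/* Read the state back from the checkpoint file. Returns 0 on success. */
int load_state(const char *path, struct app_state *s)
{
    FILE *f;
    size_t n;

    assert(path != NULL && s != NULL);
    f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    n = fread(s, sizeof *s, 1, f);
    fclose(f);
    return n == 1 ? 0 : -1;
}

/* Call at safe points in the main loop to service pending requests. */
int poll_requests(const char *path, struct app_state *s)
{
    int rc = 0;

    if (want_ckpt) {
        want_ckpt = 0;
        rc |= save_state(path, s);
    }
    if (want_resume) {
        want_resume = 0;
        rc |= load_state(path, s);
    }
    return rc;
}
```

An application would call install_handlers() once at startup and poll_requests() once per iteration of its main loop; from the outside, "kill -USR1 <pid>" would then request a checkpoint and "kill -USR2 <pid>" a rollback to it. Note that this only covers state the application itself knows about, which is exactly the limitation of the user-level approach.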
If we want total transparency (i.e., applications don't need to be aware of checkpointing and everything is taken care of by the kernel), then the kernel needs substantial changes (I've written a kernel module to do this).