Mainly stability work:
. If pte_chain_alloc() fails to allocate GFP_ATOMIC memory, the kernel
oopses. This is a long-standing rmap problem. Present also in the 2.4
rmap patches and, as far as I know, production Red Hat kernels.
So it is clearly a very rare problem but it is not acceptable to have
an unchecked kmalloc in the core of the 2.5 VM.
The approach which I took was to change the page_add_rmap() API to
require the caller to pass in a preallocated pte_chain. And change
all callers to allocate their pte_chains with GFP_KERNEL.
This change is fairly ugly, but every other hare-brained scheme I
could come up with had holes. This one adds maybe 20 instructions
to pagefaults and works...
The swapoff path has not yet been converted - this can still oops.
The locking isn't quite right yet, if shared pagetables are enabled.
. If radix_tree_insert() fails to allocate GFP_ATOMIC memory, a system call
will return -ENOMEM, resulting in application failure.
This was fixed by implementing a reservation API within the slab allocator.
Before taking locks the caller of radix_tree_insert will ask slab to preallocate
sufficient objects in this CPU's slab head array to guarantee that the
allocation of up to seven (on ia32) radix_tree_nodes cannot fail.
This permitted the removal of the radix tree mempool. That's a 130 kbyte
saving. (260k on 64-bit).
. Some aggressive pruning of various system-wide memory reserve settings:
- The page reservation limits in the page allocator have been reduced
from ~256 pages per zone to ~4 pages per zone.
- The preallocation levels in the slab head arrays (which were ridiculously
large) have been reduced from 32k-128k to, typically, a single page.
- The per-cpu-pages head arrays in the page allocator have been reduced
from ~64 pages to 2 pages.
The net effect of these changes is to remove almost all of the kernel's
reserved memory buffers. Instead of maintaining several megabytes of
free memory the kernel will only maintain some tens of kilobytes.
And guess what? Everything still works.
I won't be submitting these changes - they are here for robustness testing.
But it certainly does indicate that the settings of these thresholds need
to be reviewed. And that there don't appear to be any low-on-memory deadlocks
in the VM (with ext2, at least..)
. An updated dcache_rcu patch which should fix a rename-related race which
Al Viro noted.
Changes since 2.5.53-mm1:
+linus.patch
Latest -BK
+aic-bounce.patch
aic7xxx highmem IO fix
+misc.patch
triviata
+devfs-fix.patch
A partial fix for a CONFIG_DEVFS=y boot problem
+copy_page_range-cleanup.patch
Small cleanups, partly to ease the maintenance of the shared pagetable
diff.
+pte_chain_alloc-fix.patch
Infrastructure for handling pte_chain_alloc() failures.
+page_add_rmap-rework.patch
Handle pte_chain_alloc() failures.
shpte-ng.patch
Lots of changes to handle pte_chain_alloc() failures.
+slab-preallocation.patch
Add an API to slab to reserve objects in the per-CPU head arrays.
+slab-export-tuning.patch
Export the slab head-array tuning functions.
+rat-preallocation.patch
Add a reservation API to the radix_tree code.
+use-rat-preallocation.patch
Use the reservation API to avoid radix_tree allocation failures.
+teeny-mem-limits.patch
Remove most of the page allocator page reserves.
+smaller-head-arrays.patch
Remove most of the slab memory reserves.
+remove-hugetlb-syscalls.patch
Remove the hugetlb system calls. hugetlbfs is suitable.
All 72 patches:
linus.patch
cset-1.951-to-1.1030.txt.gz
kgdb.patch
aic-bounce.patch
rcf.patch
run-child-first after fork
ga2.patch
don't call console drivers on non-online CPUs
misc.patch
misc fixes
devfs-fix.patch
dio-return-partial-result.patch
aio-direct-io-infrastructure.patch
AIO support for raw/O_DIRECT
deferred-bio-dirtying.patch
bio dirtying infrastructure
aio-direct-io.patch
AIO support for raw/O_DIRECT
aio-dio-debug.patch
dio-reduce-context-switch-rate.patch
Reduced wakeup rate in direct-io code
cputimes_stat.patch
Retore per-cpu time accounting, with a config option
reduce-random-context-switch-rate.patch
Reduce context switch rate due to the random driver
inlines-net.patch
rbtree-iosched.patch
rbtree-based IO scheduler
deadsched-fix.patch
deadline scheduler fix
quota-smp-locks.patch
Subject: [PATCH] Quota SMP locks
copy_page_range-cleanup.patch
copy_page_range: minor cleanup
pte_chain_alloc-fix.patch
page_add_rmap-rework.patch
shpte-ng.patch
pagetable sharing for ia32
slab-preallocation.patch
slab-export-tuning.patch
rat-preallocation.patch
use-rat-preallocation.patch
teeny-mem-limits.patch
smaller-head-arrays.patch
ptrace-flush.patch
Subject: [PATCH] ptrace on 2.5.44
buffer-debug.patch
buffer.c debugging
warn-null-wakeup.patch
pentium-II.patch
Pentium-II support bits
rcu-stats.patch
RCU statistics reporting
auto-unplug.patch
self-unplugging request queues
less-unplugging.patch
Remove most of the blk_run_queues() calls
ext3-fsync-speedup.patch
Clean up ext3_sync_file()
lockless-current_kernel_time.patch
Lockless current_kernel_timer()
scheduler-tunables.patch
scheduler tunables
dio-always-kmalloc.patch
direct-io: dynamically allocate struct dio
file-nr-doc-fix.patch
Docs: fix explanation of file-nr
set_page_dirty_lock.patch
fix set_page_dirty vs truncate&free races
remove-memshared.patch
Remove /proc/meminfo:MemShared
bin2bcd.patch
BIN_TO_BCD consolidation
log_buf_size.patch
move LOG_BUF_SIZE to header/config
semtimedop-update.patch
Enable semtimedop for ia64 32-bit emulation.
drain_local_pages.patch
add drain_local_pages() for CONFIG_SOFTWARE_SUSPEND
htlb-2.patch
hugetlb: fix MAP_FIXED handling
kmalloc_percpu.patch
kmalloc_percpu -- stripped down version
config_page_offset.patch
Configurable kenrel/user memory split
config_hz.patch
CONFIGurable HZ
dont-aligns-vmas.patch
Don't cacheline-align vm_area_struct
remove-swappable.patch
remove task_struct.swappable
remove-hugetlb-syscalls.patch
Subject: [hugetlb] remove hugetlb syscalls
wli-01_numaq_io.patch
(undescribed patch)
wli-02_do_sak.patch
(undescribed patch)
wli-03_proc_super.patch
(undescribed patch)
wli-06_uml_get_task.patch
(undescribed patch)
wli-07_numaq_mem_map.patch
(undescribed patch)
wli-08_numaq_pgdat.patch
(undescribed patch)
wli-09_has_stopped_jobs.patch
(undescribed patch)
wli-10_inode_wait.patch
(undescribed patch)
wli-11_pgd_ctor.patch
(undescribed patch)
wli-12_pidhash_size.patch
(undescribed patch)
wli-13_rmap_nrpte.patch
(undescribed patch)
dcache_rcu-2.patch
dcache_rcu-2-2.5.51.patch
dcache_rcu-3.patch
dcache_rcu-3-2.5.51.patch
page-walk-api.patch
page-walk-scsi.patch
page-walk-api-update.patch
pagewalk API update
gup-check-valid.patch
valid page test in get_user_pages()
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/