UPSTREAM: netfilter: x_tables: pack percpu counter allocations

Instead of allocating each xt_counter individually, allocate 4k chunks
and then use these for counter allocation requests.

This should speed up rule evaluation by increasing data locality. It
also speeds up ruleset loading because we reduce calls to the percpu
allocator.

As Eric points out, we can't use PAGE_SIZE: page_allocator would fail on
arches with 64k page size.
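
For illustration only, the packing idea can be sketched in user space
roughly as follows. This is not the kernel code: struct counter,
struct alloc_state, counter_alloc() and counter_free() are hypothetical
stand-ins, and aligned_alloc() is used in place of the percpu allocator.
The sketch assumes each chunk is aligned to its own size, so that the
first counter of a chunk can be recognised by its address when freeing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define BLOCK_SIZE 4096u	/* fixed chunk size, deliberately not PAGE_SIZE */

struct counter {		/* stand-in for one packet/byte counter pair */
	uint64_t pcnt, bcnt;
};

struct alloc_state {		/* hypothetical: tracks the chunk being filled */
	char *mem;		/* current chunk, NULL if none is open */
	unsigned int off;	/* offset of the next free slot in the chunk */
};

/* Hand out the next counter slot, opening a new chunk only when needed. */
struct counter *counter_alloc(struct alloc_state *s)
{
	struct counter *c;

	if (!s->mem) {
		/* One allocator call serves BLOCK_SIZE / sizeof(*c) counters. */
		s->mem = aligned_alloc(BLOCK_SIZE, BLOCK_SIZE);
		if (!s->mem)
			return NULL;
		s->off = 0;
	}

	assert(s->off + sizeof(*c) <= BLOCK_SIZE);
	c = (struct counter *)(s->mem + s->off);
	c->pcnt = c->bcnt = 0;
	s->off += sizeof(*c);

	/* No room left for another counter: the next call opens a new chunk. */
	if (s->off > BLOCK_SIZE - sizeof(*c)) {
		s->mem = NULL;
		s->off = 0;
	}
	return c;
}

/* Only the first counter of a chunk (chunk-aligned address) owns the memory. */
void counter_free(struct counter *c)
{
	if (((uintptr_t)c & (BLOCK_SIZE - 1)) == 0)
		free(c);
}
```

With 16-byte counters, one 4k chunk serves 256 consecutive requests, so
adjacent rules end up with adjacent counters and the allocator is
called once per 256 rules instead of once per rule.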

TEST=Booted an image with this patch and verified iptables-restore
performance improvement.

Suggested-by: Eric Dumazet <>
Signed-off-by: Florian Westphal <>
Acked-by: Eric Dumazet <>
Signed-off-by: Pablo Neira Ayuso <>
(cherry picked from commit ae0ac0ed6fcf ("netfilter: x_tables: pack
percpu counter allocations"))
Signed-off-by: Amey Deshpande <>

Change-Id: I9cc42750879a69ce7426458463e6d35037e2e8f8
(cherry picked from commit e53474eb22f57d4a2af222c4996e22815f054012)
Reviewed-by: Guenter Roeck <>
Commit-Queue: Amey Deshpande <>
Tested-by: Amey Deshpande <>