UPSTREAM: netfilter: x_tables: pack percpu counter allocations

Instead of allocating each xt_counter individually, allocate 4k chunks
and then use these for counter allocation requests.

This should speed up rule evaluation by increasing data locality. It
also speeds up ruleset loading because we reduce calls to the percpu
allocator.

As Eric points out, we can't use PAGE_SIZE because page_allocator would
fail on arches with 64k page size.
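
For illustration only, the packing idea can be sketched in userspace C.
This is not the code from the patch: every name below (BLOCK_SIZE,
struct counter, struct alloc_state, counter_alloc, counter_free) is
hypothetical, and aligned_alloc() stands in for the percpu allocator.
The sketch assumes a fixed 4k chunk independent of the page size.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names throughout; a userspace sketch, not the kernel code. */
#define BLOCK_SIZE 4096u	/* fixed 4k chunk, deliberately not PAGE_SIZE */

struct counter {		/* stand-in for a packet/byte counter pair */
	uint64_t pcnt;
	uint64_t bcnt;
};

struct alloc_state {
	unsigned char *mem;	/* current chunk, NULL if a new one is needed */
	size_t off;		/* offset of the next free slot in the chunk */
};

/*
 * Hand out the next counter slot. A new chunk is allocated only when
 * the current one is exhausted. Returns NULL on allocation failure.
 */
static struct counter *counter_alloc(struct alloc_state *state)
{
	struct counter *c;

	if (!state->mem) {
		/* Chunk is aligned to its own size, so slot 0 is easy to spot. */
		state->mem = aligned_alloc(BLOCK_SIZE, BLOCK_SIZE);
		if (!state->mem)
			return NULL;
		state->off = 0;
	}

	c = (struct counter *)(state->mem + state->off);
	c->pcnt = 0;
	c->bcnt = 0;
	state->off += sizeof(*c);

	/* No room for another slot: the next call starts a fresh chunk. */
	if (state->off > BLOCK_SIZE - sizeof(*c)) {
		state->mem = NULL;
		state->off = 0;
	}
	return c;
}

/*
 * Only the first slot of a chunk owns the allocation; it is recognised
 * by its chunk-aligned address. Freeing any other slot is a no-op.
 */
static void counter_free(struct counter *c)
{
	assert(c != NULL);
	if (((uintptr_t)c & (BLOCK_SIZE - 1)) == 0)
		free(c);
}
```

With 16-byte counters, one chunk serves 256 consecutive requests, so
neighbouring rules end up with neighbouring counters and the backing
allocator is called once per 256 rules in this sketch.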

TEST=Booted an image with this patch and verified iptables-restore
performance improvement.

Suggested-by: Eric Dumazet <>
Signed-off-by: Florian Westphal <>
Acked-by: Eric Dumazet <>
Signed-off-by: Pablo Neira Ayuso <>
(cherry picked from commit ae0ac0ed6fcf ("netfilter: x_tables: pack
percpu counter allocations"))
Signed-off-by: Amey Deshpande <>

Change-Id: I9cc42750879a69ce7426458463e6d35037e2e8f8
(cherry picked from commit 790430b314cd7e172bcea8cd402f4ed41e6eb40f)
Reviewed-by: Guenter Roeck <>
Commit-Queue: Amey Deshpande <>
Tested-by: Amey Deshpande <>