HOW TO TUNE AIX KERNEL MBUF SETTINGS?
ITEM: RTA000024192
Q: How do I change the AIX kernel mbuf and cluster pools? I am getting
many "requests for memory denied" when I issue a 'netstat -m' command.
---------- ---------- ---------- --------- ---------- ----------
A: This information was found in a forum. Because I feel the entire
document is of use in solving this type of problem, I have appended
it here.
AIX 3.2 Network Tuning Guide
April 27, 1992
1. Tuning the memory buffer (mbuf) pools
1.1 Why tune the mbuf pools
The network subsystem uses a memory management facility that
revolves around a data structure called an "mbuf". Mbufs
are mostly used to store data for incoming and outbound
network traffic. Having mbuf pools of the right size can
have a very positive effect on network performance. If the
mbuf pools are configured improperly, both network and
system performance can suffer. AIX offers the capability
for run-time mbuf pool configuration. With this convenience
comes the responsibility for knowing when the pools need
adjusting and how much they should be adjusted.
1.2 Overview of the mbuf management facility
The mbuf management facility controls two pools of buffers:
a pool of small buffers (256 bytes each), which are simply
called "mbufs", and a pool of large buffers (4096 bytes
each), which are usually called "mbuf-clusters" or just
"clusters". The pools are created from system memory by
making an allocation request to the Virtual Memory Manager
(VMM). The pools consist of pinned pieces of virtual memory;
this means that they must always reside in physical memory
and are never paged out. The result is that the real memory
available for paging-in application programs and data has
been decreased by the amount that the mbuf pools have been
increased. This is a non-trivial cost that must always be
taken into account when considering an increase in the size
of the mbuf pools.
The initial size of the mbuf pools is system-dependent.
There is a minimum number of (small) mbufs and clusters
allocated for each system, but these minimums are increased
by an amount that depends on the specific system
configuration. One factor affecting how much they are
increased is the number of communications adapters in the
system. The default pool sizes are initially configured to
handle small to medium size network loads (network traffic
100-500 packets/second). The pool sizes dynamically increase
as network loads increase. The cluster pool size is reduced
as network loads decrease. The mbuf pool is never reduced.
To optimize network performance, the administrator should
balance mbuf pool sizes with network loads (packets/second).
If the network load is particularly oriented towards UDP
traffic (e.g. an NFS server), the size of the mbuf pool should
be 2 times the packets/second rate, because each UDP packet
consumes an extra small mbuf. For example, an NFS server
handling roughly 1500 packets/second would want a small mbuf
pool of about 3000 (as in the sample script in section 1.4).
To provide an efficient mbuf allocation service, an attempt
is made to maintain a minimum number of free buffers in the
pools at all times. The following network options (which can
be manipulated using the no command) are used to define
these lower limits:
o lowmbuf
o lowclust
The lowmbuf option controls the minimum number of free
buffers for the mbuf pool. The lowclust option controls the
minimum number of free buffers for the cluster pool. When
the number of free buffers in either pool drops below the
lowmbuf or lowclust threshold, that pool is expanded by some
amount.
The expansion of the mbuf free pools is not done
immediately, but is scheduled to be done by a kernel process
with the process name of "netm". When netm is dispatched,
the pools will be expanded to meet the minimum requirements
of lowclust and lowmbuf. Having a kernel process do this
work is required by the structure of the VMM.
An additional function that netm provides is to limit the
growth of the cluster pool. The network option that defines
this maximum value is:
o mb_cl_hiwat
The mb_cl_hiwat option controls the maximum number of free
buffers the cluster pool can contain. When the number of
free clusters in the pool exceeds mb_cl_hiwat, netm will be
scheduled to release some of the clusters back to the VMM.
The last network option that is used by the mbuf management
facility is
o thewall
The thewall option controls the maximum RAM (in K bytes)
that the mbuf management facility can allocate from the VMM.
This option is used to prevent unbalanced VMM resources
which result in poor system performance.
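As a quick check, the current value of each of these options can
be displayed by giving the no command an option name without a
value, for example:
# display the current mbuf-related settings
no -o lowmbuf
no -o lowclust
no -o mb_cl_hiwat
no -o thewall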
1.3 When to tune the mbuf pools
When and how much to tune the mbuf pools is directly related
to the network load a given machine is being subjected to. A
server machine that is supporting many clients is a good
candidate for having the mbuf pools tuned to optimize
network performance. It is important for the system
administrator to understand the networking load for a given
system. By using the netstat command you can get a rough
idea of the network load in packets/second. For example,
the command netstat -I tr0 1 reports the input and output
traffic, at one-second intervals, both for the tr0 network
interface and for all network interfaces on the system. The
output below shows the activity caused by a large ftp
operation:
input (tr0) output input (Total) output
packets errs packets errs colls packets errs packets errs colls
183 0 349 0 0 183 0 349 0 0
183 0 353 0 0 183 0 353 0 0
203 0 380 0 0 203 0 380 0 0
189 0 363 0 0 189 0 363 0 0
158 0 293 0 0 158 0 293 0 0
191 0 365 0 0 191 0 365 0 0
179 0 339 0 0 179 0 339 0 0
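If desired, a rough packets/second figure for an interface can be
derived by summing the input and output packet columns of such a
report. A minimal ksh sketch, assuming the column layout shown
above (tr0 is just an example interface):
# print an approximate packets/second figure for tr0 by adding the
# input and output packet counts of each one-second sample; the
# first two lines of output are column headings, so skip them
netstat -I tr0 1 | awk 'NR > 2 { print $1 + $3, "packets/sec" }'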
The netstat command also has an option, -m, that gives
detailed information about the use and availability of the
mbufs and clusters:
182 mbufs in use:
17 mbufs allocated to data
2 mbufs allocated to packet headers
60 mbufs allocated to socket structures
83 mbufs allocated to protocol control blocks
11 mbufs allocated to routing table entries
6 mbufs allocated to socket names and addresses
3 mbufs allocated to interface addresses
16/54 mapped pages in use
261 Kbytes allocated to network (41% in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines
The line that begins "16/54 mapped pages..." indicates that
there are 54 pinned clusters, of which 16 are currently in
use. If the "requests for memory denied" value is nonzero,
the mbuf and/or cluster pools may need to be expanded.
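A quick way to watch these counters over time is to filter the
netstat -m report, for example:
# show only the mbuf allocation failure counters
netstat -m | grep "requests for memory"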
This report can be compared against the existing system
parameters by issuing the command no -a which reports all of
the current settings (the following report has been
abbreviated):
lowclust = 29
lowmbuf = 88
thewall = 2048
mb_cl_hiwat = 58
It is clear that on the test system the "261 Kbytes
allocated to network" figure is considerably short of the
thewall value of 2048K, and the (54 - 16 = 38) free clusters
are short of the mb_cl_hiwat limit of 58.
The "requests for memory denied" counter is maintained by
the mbuf management facility and is incremented each time a
request for an mbuf allocation cannot be satisfied.
Normally the "requests for memory denied" value will be
zero. If a system experiences a high burst of network
traffic, the default configured mbuf pools will not be
sufficient to meet the demand of the incoming burst, causing
the error counter to be incremented once for each mbuf
allocation request that fails. Usually this is in the
thousands due to the large number of packets arriving all at
once. The "requests for memory denied" statistic will
correspond to dropped packets on the network. Dropped
network packets mean re-transmissions, resulting in degraded
network performance. If the "requests for memory denied"
value is greater than zero it may be appropriate to tune the
mbuf parameters -- see "How to tune the mbuf Pools", below.
The "Kbytes allocated to the network" statistic is
maintained by the mbuf management facility and represents
the current amount of system memory that has been allocated
to both mbuf pools. The upper bound of this statistic, set
by thewall, is used to prevent the mbuf management facility
from consuming too much of a system's physical memory. The
default value for thewall limits the mbuf management
facility to 2048K bytes (as shown in the above no -a
report). If "Kbytes currently allocated to the network"
approaches thewall, it may be appropriate to tune the mbuf
parameters -- "see How to tune the mbuf Pools", below.
The netm kernel process runs at a very favored priority
(fixed 37). Because of this, excessive netm dispatching can
cause not only poor network performance but also poor system
performance because of contention with other system and user
processes. Improperly configured pools can result in netm
"thrashing" due to conflicting network traffic needs and
improperly tuned thresholds. netm dispatching can be
minimized by properly configuring the mbuf pools to match
system and networking needs.
There are cases where the above indicators suggest that the
mbuf pools may need to be expanded, when in fact there is a
system problem that should be corrected first. For example:
o mbuf memory leak
o queued data not being read from socket or other
internal queueing structure
An mbuf memory leak is a situation in which some kernel or
kernel-extension code has neglected to release an mbuf
resource and has destroyed the pointer to its memory
location, thereby losing the address of the mbuf forever. If
this occurs repeatedly, eventually all the mbuf resources
will be used up. If the netstat mbuf statistics show a
gradual increase in usage that never decreases or high mbuf
usage on a relatively idle system, there may be an mbuf
memory leak. Developers of kernel extensions that use mbufs
should always include checks for memory leaks in their
testing.
It is also possible to have a large number of mbufs queued
at the socket layer because of an application defect.
Normally an application program would read data from the
socket, causing the mbufs to be returned back to the mbuf
management facility. An administrator can monitor the
netstat -m mbuf statistics and look for high mbuf usage
while there is no expected network traffic. The
administrator can also view the current list of running
processes (ps -ef) and scan for those that use the network
subsystem and are consuming large amounts of CPU time. If
this behavior is observed, the suspected application defect
should be isolated and fixed.
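One way to watch for either condition is to sample the netstat -m
statistics periodically and look for mbuf usage that climbs and
never comes back down. A minimal ksh sketch (the one-minute
sampling interval is arbitrary):
# log mbuf usage once a minute; a steady climb on an otherwise
# idle system may indicate a leak or data queued at the socket layer
while true
do
    date
    netstat -m | grep "mbufs in use"
    sleep 60
done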
1.4 How to tune the mbuf pools
With an understanding of how the mbuf pools are organized
and managed, tuning the mbuf pools is simple in AIX and can
be done at run-time (unlike other UNIX systems in which the
kernel must be recompiled and the system rebooted).
The network options (no) command can be used by root to
modify the mbuf pool parameters. Some guidelines are:
o After expanding the pools, use the vmstat command to
ensure that paging rates have not increased (see the
vmstat sketch after this list). If you cannot expand
the pools to the necessary levels without adversely
affecting the paging rates, additional memory may be
required.
o When adjusting lowclust, lowmbuf should be adjusted by
at least the same amount, because for every cluster
there will exist an mbuf that points to it.
o mb_cl_hiwat should remain at least twice the value of
lowclust at all times. This will prevent the netm
thrashing discussed earlier.
o When adjusting lowclust and lowmbuf, thewall may need
to be increased to prevent pool expansions from hitting
thewall.
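A minimal paging check, for example (the interval and count are
arbitrary):
# sample paging activity every 5 seconds for one minute; rising
# pi and po (page-in/page-out) values after a pool expansion
# suggest the pools are now competing with applications for memory
vmstat 5 12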
The following is an example shell script that might be
placed at the end of /etc/rc.net to tune the mbuf pools for
an NFS server that experiences a network traffic load of
approximately 1500 packets/sec.
#!/bin/ksh
echo "Tuning mbuf pools..."
# set maximum amount of memory to allow for allocation
no -o thewall=10000
# set minimum number of small mbufs
no -o lowmbuf=3000
# generate network traffic to force small mbuf pool expansion
ping 127.0.0.1 1000 1 >/dev/null
# set minimum number of small mbufs back to default to prevent
# netm from running unnecessarily
no -d lowmbuf
# set maximum number of large mbufs before pool expansion
no -o mb_cl_hiwat=1500
# gradually expand large mbuf pool
N=10
while [ $N -lt 1500 ]
do
    no -o lowclust=$N
    ping 127.0.0.1 1000 1 >/dev/null
    let N=N+10
done
# set minimum number of large mbufs back to default to prevent
# netm from running unnecessarily
no -d lowclust
You can use netstat -m following the above script to verify
the size of the pool of clusters (which netstat calls
"mapped pages"). To verify the size of the pool of mbufs you
can use the crash command to examine a kernel data
structure, mbstat (see /usr/include/sys/mbuf.h). The kernel
address of mbstat can be displayed while in crash using "od
mbstat". You will then need to od the address that is
reported in order to dump the first word in the mbstat
structure, which contains the
size of the mbuf pool. The dialog would be approximately as
follows:
$ crash
> od mbstat
000e2be0: 001f7008
> od 1f7008
001f7008: 00000130
> quit
The size of the mbuf pool is therefore 130 (hex), or
304 (decimal).
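If desired, the conversion from hexadecimal can be done with the
bc command, for example:
# convert the hex word reported by crash (130) to decimal
echo "ibase=16; 130" | bc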
---------- ---------- ---------- --------- ---------- ----------
This item was created from library item Q589150 2VRDS
Additional search words:
AIX ALTERNATE COMMUNICATIO INDEX IX JUL92 KERNEL MBUF PERFORMANCE
RISCSYSTEM RISCTCP SETTING SOFTWARE TCPIP TUNE TUNING 2VRDS
WWQA: ITEM: RTA000024192
Dated: 04/1996 Category: RISCPERF