ITEM: AI8949L
NFS Performance problems
Question:
Performance problems - questions relating to NFS performance.
Answer:
The following discusses general NFS Tuning and some performance
tips.
Ethernet Queues :
To reduce timeouts and retransmits, it is very important to increase
the queues for your server's ethernet adapters to the maximums allowed.
It may also help to increase the client queues. If you see frequent
mass NFS timeout messages, the default queue values are the likely
cause. Use SMIT to set the queues; you must detach the ethernet
interface before changing the queue values.
Increase both the transmit and receive queues, but the transmit queue
is the more likely to need the increase.
AIX is shipped with lower defaults because NFS use is not mandatory.
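For example, the queue sizes can be checked and raised through SMIT or
chdev. The adapter name ent0 and the attribute names below are typical
for ethernet adapters of this era but vary by adapter type, so treat
this as a sketch only:
   lsattr -El ent0                     # list current adapter attributes
   ifconfig en0 detach                 # detach the interface first
   chdev -l ent0 -a xmt_que_size=150   # raise the transmit queue (use your adapter's maximum)
   chdev -l ent0 -a rec_que_size=150   # raise the receive queue if the adapter supports it
   smitty chgenet                      # or make the same change through SMIT,
                                       # then reconfigure the interface via SMIT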
Ethernet MTU :
Make sure the packet size (MTU) is left at the default (about 1500
bytes, the SMIT maximum).
Ethernet Current :
There were some severe retransmit, hang, and system crash problems
with certain ethernet card / driver combinations in the early part of
1992. Keep this in mind when using any old adapters. It is unlikely
to affect you, but worth checking just in case.
Token Ring :
Use the 16 Mbit ring speed if possible. This is always possible on an
all AIX/6000 network, but may not be on a heterogeneous OEM network.
With a 16 Mbit ring use an MTU size of 8500 bytes; on a 4 Mbit ring,
use an MTU of 3900 bytes. These larger MTU sizes allow more data per
packet for the 4 Kbyte / 8 Kbyte NFS read/write operations.
Try larger transmit / receive queues if significant retransmits occur.
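As a sketch, assuming the token-ring interface is tr0, the MTU can be
checked and changed with chdev (or the equivalent SMIT panel); the
interface may need to be brought down first:
   lsattr -El tr0              # show current interface attributes, including mtu
   chdev -l tr0 -a mtu=8500    # for a 16 Mbit ring
   chdev -l tr0 -a mtu=3900    # for a 4 Mbit ring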
RAM Size :
More is better, as this allows more read caching of data on the server
and may reduce disk reads, particularly for files used in common
across multiple clients or by multiple users on the same client.
Processor Speed :
More is better, as the server responds faster to client requests, up
to the point where it is limited by disk, RAM, network, etc.
A faster processor tends to compensate for the absence of a network
coprocessor product or an MP server for AIX/6000.
Processes :
We have found increased throughput by using up to 32 nfsd processes
on the server; the default is 8 nfsds. If there is idle time on the
server, you tend to benefit from more nfsds.
The client default is 6 biod processes. Consider more of these if you
have idle CPU on both client and server, and the network is not the
limit.
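A sketch of checking and raising the daemon counts; the chnfs usage
below is as we understand it for this level of AIX, and 32 / 12 are
example values, not recommendations for every configuration:
   ps -ef | grep nfsd     # see how many nfsd daemons are running now
   chnfs -n 32            # on the server: run 32 nfsd daemons
   chnfs -b 12            # on the client: run 12 biod daemons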
Disks :
More disks are better, assuming proportional usage. More disks tend
to compensate for the absence of a Prestoserve product.
Additional disks are probably more important than larger or faster
disks.
Use one SCSI adapter per 3 or 4 disks for better throughput.
A disk with a built-in write cache has an effect similar to a
Prestoserve, but only on a per-disk basis.
Volume Groups :
Each volume group gets a dedicated log logical volume, placed by
default on the first disk defined to the volume group, so there is a
significant gain in having one disk per volume group.
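For example, to see which disk holds a volume group's log logical
volume and to give a data disk its own volume group (hdisk2 and the
name datavg are hypothetical):
   lsvg -l rootvg            # the jfslog logical volume (typically loglv00) appears here
   lspv                      # shows which disks belong to which volume groups
   mkvg -y datavg hdisk2     # put a data disk into its own volume group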
Rootvg :
Avoid using rootvg for test data, as there is some degradation from
AIX system I/O. Also, adding disks to rootvg causes contention for
its single log logical volume, as noted above.
Assuming low paging activity, there is probably not much gain in
adding more disks to rootvg for system use.
System processes :
Consider killing off nonessential processes, if that is what is done
in the competitive OEM test. This may provide a small gain for AIX
in most cases, but no big gain has been seen in our tests.
Use iostat command :
This command shows the processor busy, idle, and iowait percentages,
the disk read/write percentages, and the byte counts transferred to
each disk. It is a reliable indicator of whether the disks are a
bottleneck.
A high disk % for reads may indicate a gain from more client or server
RAM; a server RAM shortage is the more likely.
A high % on a subset of the disks may indicate a gain from reallocating
files per disk / volume group to distribute the load more evenly.
A high disk % for writes probably indicates a limit with that number
of disks, and is a case where Prestoserve should help.
A high CPU % indicates a case likely limited by the absence of a
network coprocessor product for the AIX/6000.
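For example, sampling at 5-second intervals while the test runs:
   iostat 5     # per-disk activity %, Kbps, and Kb read/written,
                # plus CPU % user / % sys / % idle / % iowait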
Use vmstat command :
This command shows whether there is any significant amount of memory
paging. You should add more RAM if paging seems high (perhaps more
than 10%).
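For example:
   vmstat 5     # watch the pi / po (page-in / page-out) columns;
                # sustained nonzero values suggest a RAM shortage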
Use netstat -i and -v command :
This shows some data on network activity, in packet and byte counts.
Each ethernet has a limit of about 1.1 Mbytes per second of data plus
packet overheads. netstat -v can be used to derive how many bytes
crossed each ethernet over an interval.
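A simple way to estimate the load on one ethernet over an interval
(the interval length and file names here are only illustrative):
   netstat -v > before.out     # snapshot adapter byte / packet counts
   sleep 300                   # run the workload for a measured interval
   netstat -v > after.out
   diff before.out after.out   # byte-count deltas divided by the interval
                               # give the Mbytes per second on the adapter
   netstat -i                  # per-interface packet and error counts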
Sniffer advantage :
A network sniffer can be used to get a percentage-type measure of
network usage, if you feel it is necessary to understand the general
usage of the network. Alternatively, you can add another ethernet
network and retest to see whether the network is a limitation.
Networks :
If server disk, CPU, and RAM all show low % usage, consider adding
more networks, clients, faster clients, or more client processes to
increase the server throughput.
Load study :
It is probably better to start with a low load and increase it to
find a limit, rather than starting with a heavy overload and then
decreasing it, since it may take a while to normalize from high to
low.
Dedicated server :
Local applications and users on the server will contend for available
server resources. Ensure that the server is not being used by
workloads from networks unrelated to the clients involved in the test,
and try to shut down all unrelated networks.
Writes :
NFS writes from client to server are synchronous to disk. The server
writes to disk include data, metadata, indirect blocks, and journal
log blocks; there may be as many as 3 additional writes per data
write. Operations such as rm and rmdir also generate writes to disk.
So the Prestoserve product can be an important factor in NFS
performance.
Mounts :
Use the hard and intr mount options. The default timeout is 2.1
seconds and is probably OK.
There is a noacl option which should give a small gain, but we have
not verified this yet.
The secure mount option adds path length to every NFS call for that
mount; verify that it has not been included casually in the test.
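A sketch of a client mount using these options; the server, export,
and mount point names are placeholders:
   mount -o hard,intr server:/export /mnt
   mount -o hard,intr,noacl server:/export /mnt     # with the (unverified) noacl option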
Client caching :
If for some reason you wish to disallow the client data cache for
the benchmark, say to always read data across the network, you
will need to use the "rmss" performance tool to reduce the effective
client RAM to the point that caching is ineffective.
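For example, to simulate a smaller client memory with rmss (the 16
Mbyte figure is only an illustration):
   rmss -c 16     # change the simulated memory size to 16 Mbytes
   rmss -p        # print the current simulated memory size
   rmss -r        # reset to the real memory size when finished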
Disk space allocation :
AIX allows specific relative placement of disk space for the user
logical volumes within a volume group, and the log logical volume can
also be relocated. These allocations can be skewed to reduce or
increase seek time, within the rules / tolerances of the test.
Unspecified allocation rules may thus allow some optimization of disk
seek delays; this is of most interest if the disks show a high read %
or are moderately busy and write bound.
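As a sketch, the intra-disk placement of a logical volume can be
requested when it is created; the names datalv / edgelv / datavg and
the size of 50 logical partitions are hypothetical:
   mklv -y datalv -a c datavg 50     # allocate toward the center of the disk
   mklv -y edgelv -a e datavg 50     # or toward the edge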
Client RAM :
Increased client RAM may be a gain if there is a high read % across a
number of files. The added RAM increases the client data cache and may
thus allow local cache hits on data rather than remote rereads caused
by cache contention.
File locking :
This is VERY expensive in the NFS environment. Verify that the need
for locking is valid rather than a casual inclusion. Also, the locks
are advisory only. Low / infrequent use is probably not noticeable.
nfsstat command :
The RPC count for AIX includes the ACL calls, which do not have a
corresponding NFS call count. Thus the RPC count minus the NFS call
count gives an indication of the extent of the ACL overhead. See
"noacl" in the mount discussion.
badcalls = rejected calls due to bad header/length, should be
nonzero only if checksum off.
nullrecv = no request on queue when nfsd tried to find it. Some
occur normally.
badxid = reply from server did not match the expected type/sequence
number. Perhaps caused by a timeout and retransmit. Also, UDP does
not guarantee matching values, per protocol design.
timeout = number of times that a client did not receive a response
within the timeout value. Results in a retransmit. Some are normal
when a server is nearing saturation. Some intermittently are probably
ok unless other items create concern.
retrans = number of retransmits from the client. May be due to
timeouts, collisions, insufficient transmit queue depth, a bad
adapter, etc. This does not necessarily match the timeout count.
INFO for V3.2.5 may have a better description for nfsstat than the
above.
nfsstat -z shows the current counts and then sets them to zero.
A subsequent nfsstat -s or -c shows the counts for the interval
and is a nice way to track the NFS activity.
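For example, to look at server-side activity over a single interval:
   nfsstat -z     # zero the counters (the old values are displayed)
   sleep 300      # run the workload for the interval of interest
   nfsstat -s     # server counts for the interval; use -c on a client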
Support Line: NFS Performance problems ITEM: AI8949L
Dated: February 1996 Category: N/A