Performance bad; netstat indicates errors
ITEM: RS4000014942
Question........:
Greetings. Customer has a 6-node SP system with FDDI cards on the
nodes (there is no SP switch, so I believe this question is more
relevant to the COMMS rather than to the SP2 topic.)
When doing a netstat -i, I am getting a substantial number of Ierrs
(about 10%), and no output errors. Customer is also complaining about
performance, so I would like to know this:
1. What do these Ierrs mean? Can this be the cause of the perf.
problems?
2. What can I do to eliminate the errors?
I would appreciate your answer in simple terms; unfortunately I am
not at all experienced with FDDI (I just plugged the things in), and
I am trying to deal with this the best I can.
Thanks, George
Answer:
To help you we need some more info: what's your customer's environment
like (dual or single ring, what systems are connected to the FDDI, what
protocol are they running).....
We would also need the output of the fddistat command,
which is a kind of netstat, but specific to FDDI.
Thanks, best regards, Marina.
Question........:
Thank you for your answer. I am told that the customer uses a single
ring, connecting his 6 nodes to an 8244 FDDI hub. This hub is
connected to a CISCO Catalyst box with a FDDI card, and the rest
of the customer network is ethernet connected to the CISCO box.
The 6 nodes run AIX 3.2.5.1.
I will only be able to go onsite beginning of next week to get
fddistat output for you, but I am sending this info to you now so
that you can tell me if you need any more information. By the way,
do I just type fddistat? Do I need to supply any options, as I do
with netstat?
Thank you, George
Answer 2:
You simply run "fddistat fddi_device_name" (example: fddistat fddi0);
that will be all.
It seems that sometimes OEM hubs, routers, or concentrators send out
invalid packets, but to identify the cause of your problems
we need an overall description of what is present on the ring.
From your append we can sketch the physical layout of your customer's
environment. I would also like to receive the FDDI configuration
parameters (MTU, ...) you set on the SP nodes and on the other boxes
attached to the ring (I believe it's only the CISCO).
Thanks, best regards, Marina.
Question........:
Hello. Yesterday I visited the customer. I tried running
the fddistat command, but it could not be found on the
system. The customer has a 3.2.5 system as I have
described to you, and I found LPP fddi.obj installed.
I checked on my 4.1.4 system at the office, and although
the command was not there either, it was described
in InfoExplorer (the customer's system infoexplorer did
not have an entry on fddistat). Maybe you can tell me
if I need to install anything else on the customer system
to make the command work. Meanwhile, I got netstat -i
output on one of the nodes (it looks similar on all nodes,
so I'll just give you one:)
fi0
Ipkts:784029 Ierrs:16406 Opkts:1426708 Oerrs:0 Coll:0
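As a quick sanity check, the counters above can be turned into an error percentage. The sketch below is only illustrative: the function name ierr_rate is made up for this example, and it assumes the rate is taken as Ierrs divided by received packets. For this sample it works out to roughly 2%, lower than the "about 10%" estimated at the start of the thread, which may have come from a different sample.

```python
def ierr_rate(ipkts: int, ierrs: int) -> float:
    """Return input errors as a percentage of received packets."""
    if ipkts <= 0:
        raise ValueError("ipkts must be positive")
    return 100.0 * ierrs / ipkts


# Counters from the netstat -i line for fi0 quoted above.
rate = ierr_rate(784029, 16406)
print(f"Ierrs are {rate:.2f}% of input packets")
```

Comparing two samples taken some minutes apart, rather than the totals since boot, would show whether the errors are still accumulating.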
As I said, the 6 nodes connect to an IBM FDDI concentrator,
which connects to the CISCO Catalyst, which in turn
attaches 4 ethernet segments. The problem seems to be
on the FDDI ring however, and I recall seeing Ierrs even
when there was no traffic on the ethernet. As you requested,
I got the network options; several of those have been
changed according to guidelines for SP systems, but if you
have any recommendations I'd be happy to try them. Here's
the no -a command output:
dog_ticks = 60
lowclust = 100
lowmbuf = 163
thewall = 8192
mb_cl_hiwat = 1660
compat_43 = 1
sb_max = 1302528
detach_route = 1
subnetsarelocal = 1
maxttl = 255
ipfragttl = 60
ipsendredirects = 1
ipforwarding = 1
udp_ttl = 30
tcp_ttl = 60
arpt_killc = 20
tcp_sendspace = 651244
tcp_recvspace = 651244
udp_sendspace = 325622
udp_recvspace = 651244
loop_check_sum = 1
rfc1122addrchk = 0
nonlocsrcroute = 1
tcp_keepintvl = 150
tcp_keepidle = 14400
tcp_keepinit = 150
icmpaddressmask = 0
rfc1323 = 1
tcp_mssdflt = 512
directed_broadcast = 0
tcp_rtolow = 1
tcp_rtohigh = 64
tcp_rtolimit = 7
tcp_rtolength = 13
ipqmaxlen = 50
Please let me know if you need any more information.
Thank you, George
**************> ANSWER level 2 --> level 1 SPECIALIST <**************
====> THIS TEXT HAS BEEN ENTERED BY IBM IN USA
PMR E0301,998,758 was created on 96/11/06 at 11:33:40.
Answer..........:
TIME:1125
Response:
1) Ierrs can be caused by two different factors. Either the
transmit/receive queue for the fddi adapter is too small, or the adapter
is receiving a broadcast from a remote device/application using
different protocols which are not running on the RS/6000. Since you are
experiencing performance problems, the most likely culprit is the
transmit/receive queue for the fddi adapter.
2) To fix this problem, you will want to check the size of the
transmit/receive queues. You can check the queue size with
"lsattr -El fddi0" and change it in smit with the fast-path
"smitty chgfddi". The recommended size is 150. The maximum size is 250.
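The check in step 2 could be scripted along the following lines. This is a minimal sketch under stated assumptions, not an IBM tool: it assumes lsattr -El prints one attribute per line with the name first and the value second, the attribute name tx_que_size is an assumption that should be confirmed against the real output for fddi0, and the sample line is invented for illustration. Only the values 150 and 250 come from the answer above.

```python
RECOMMENDED = 150  # recommended queue size quoted in the answer
MAXIMUM = 250      # maximum queue size quoted in the answer


def parse_lsattr(text: str) -> dict:
    """Map attribute name -> value from `lsattr -El <device>` style output."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            attrs[fields[0]] = fields[1]
    return attrs


def check_queue(attrs: dict, name: str = "tx_que_size") -> str:
    """Compare a queue-size attribute with the recommended and maximum values."""
    size = int(attrs[name])
    if size > MAXIMUM:
        return f"{name}={size} exceeds the maximum of {MAXIMUM}"
    if size < RECOMMENDED:
        return f"{name}={size} is below the recommended {RECOMMENDED}"
    return f"{name}={size} is at or above the recommended {RECOMMENDED}"


# Made-up sample line (assumed attribute name), not captured from a real system.
sample = "tx_que_size 30 Transmit Queue Size (in mbufs) True\n"
print(check_queue(parse_lsattr(sample)))
```

In practice the text passed to parse_lsattr would be the saved output of "lsattr -El fddi0" from each node.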
I would also like you to check the level of device driver, microcode,
and fixes for the fddi adapter that are on the system.
If you are using AIX 4.1.4 check the following:
lslpp -l devices.mca.8ef4*
lslpp -l bos.net.tcp*
ls -l /etc/microcode/8ef4*
Here are the latest fixes for AIX 4.1.4:
U441908 - devices.mca.8ef4.com 4.1.4.2
U441581 - devices.mca.8ef4.diag 4.1.4.1
U444993 - devices.mca.8ef4.rte 4.1.4.5
U444402 - devices.mca.8ef4.ucode 4.1.4.1(latest fddi microcode)
U445555 - bos.net.tcp.client 4.1.4.18
U445558 - bos.net.tcp.server 4.1.4.17
If you are using AIX 3.2.5 check the following:
ls -l /etc/microcode/8ef4*
lslpp -h fddi*
These are the latest 3.2.5 fixes:
U443507 - latest fddi microcode.
U435291
U491145
U491182
The latest level of fddi microcode is 8ef4m.02.06.
Thank you for using AIX Support Services.
Question........:
Thank you for your answer. The customer had the latest PTFs, except
for the fddi microcode, which I installed. It turns out that their
performance problems were most likely due to their having a class,
and everybody was trying to do the same thing at the same time. Now
that they have finished education, performance is reasonable.
Re the Ierrs: Those are caused by the CISCO Catalyst; I disconnected
it and they went away. It appears to be doing some sort of polling
(every second or so) which is not understood by the fddi nodes (there
is no problem with the ethernet stations on the CISCO). I have
demonstrated this to the customer, who is satisfied. However they
have asked me to look into eliminating these error packets. This
probably falls outside your area, but do you have any idea if I
could do anything (on the CISCO most probably) to eliminate this?
Thank you very much, george
Answer..........:
Response:
There is a fix available that might fix this situation.
IX46964 UNSUPPORTED BROADCAST PACKETS CAUSE INPUT ERRORS
o To get this fix order PTF U444252
ERROR DESCRIPTION:
A received broadcast packet of an unsupported type will cause the IF
layer to increment the input error(Ierr) count.
PROBLEM SUMMARY:
Broadcast packets to an unsupported input type should not increment
the number of input errors. Different types of machines can be on the
same network as an RS/6000, and unless the packet is destined for the
RS/6000, it is legal to drop unsupported broadcast packets without
error.
PROBLEM CONCLUSION:
Don't increment input errors for broadcast packets destined for an
unknown input type.
If the CISCO's polling is causing the RS/6000 to increment the Ierr
count because it doesn't understand the packets, then this PTF should
stop the Ierrs from being incremented.
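The behaviour change described in the APAR text can be pictured with a toy model. This is only a sketch of the rule as worded above, not the AIX interface-layer code: the function name and its three parameters are hypothetical, and it assumes that an unsupported packet addressed directly to the host is still counted as an input error after the fix.

```python
def counts_as_input_error(supported_type: bool, is_broadcast: bool,
                          ptf_applied: bool) -> bool:
    """Toy model: does a received packet increment the Ierr counter?"""
    if supported_type:
        # A protocol the host understands is processed normally.
        return False
    if is_broadcast and ptf_applied:
        # With the fix, unsupported broadcasts are dropped silently.
        return False
    # Without the fix, or for non-broadcast packets (an assumption),
    # an unsupported type is counted as an input error.
    return True


# An unsupported broadcast, such as the once-a-second polling seen here.
print(counts_as_input_error(False, True, ptf_applied=False))
print(counts_as_input_error(False, True, ptf_applied=True))
```

Under this model the packets are still on the ring either way; the PTF only changes whether the counter moves.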
Thank you for using AIX Technical Support.
**************> ANSWER level 2 --> level 1 SPECIALIST <**************
====> THIS TEXT HAS BEEN ENTERED BY IBM IN USA
PMR E0301,998,758 was updated on 96/11/19 at 16:52:37.
WWQA: ITEM: RS4000014942
Dated: 10/1996 Category: AIXCOMMS
This HTML file was generated 99/06/24~12:43:05