Performance Management Guide
About This Book
Chapter 1. Performance Concepts
Chapter 2. Resource Management Overview
Performance Overview of the CPU Scheduler
Thread Support
Processes and Threads
Process and Thread Priority
Scheduling Policy for Threads
Scheduler Run Queue
Scheduler CPU Time Slice
Mode Switching
User Mode
Kernel Mode
Mode Switches
Performance Overview of the Virtual Memory Manager (VMM)
Real-Memory Management
Free List
Persistent versus Working Segments
Computational versus File Memory
Page Replacement
Repaging
VMM Thresholds
VMM Memory Load Control Facility
Memory Load Control Algorithm
Allocation and Reclamation of Paging Space Slots
Late Allocation Algorithm
Early Allocation Algorithm
Deferred Allocation Algorithm
Performance Overview of Fixed-Disk Storage Management
Sequential-Access Read Ahead
Write Behind
Memory Mapped Files and Write Behind
Disk-I/O Pacing
Chapter 3. Introduction to Multiprocessing
Symmetrical Multiprocessor (SMP) Concepts and Architecture
Types of Multiprocessing
Shared Nothing MP (pure cluster)
Shared Disks MP
Shared Memory Cluster (SMC)
Shared Memory MP
Symmetrical versus Asymmetrical Multiprocessors
Asymmetrical Multiprocessors
Symmetrical Multiprocessors
Multiprocessors
Parallelizing an Application
Data Serialization
Types of Locks
AIX Version 4 Simple Locks
AIX Version 4 Complex Locks
Lock Granularity
Locking Overhead
Waiting for Locks
Cache Coherency
Processor Affinity and Binding
Memory and Bus Contention
SMP Performance Issues
Workload Concurrency
Throughput
Response Time
SMP Workloads
Workload Multiprocessability
Multiprocessor Throughput Scalability
Multiprocessor Response Time
SMP Thread Scheduling
Default Scheduler Processing of Migrated Workloads
Scheduling Algorithm Variables
Thread Tuning
Thread Environment Variables
SPINLOOPTIME=n
YIELDLOOPTIME=n
AIXTHREAD_SCOPE={P|S}
AIXTHREAD_GUARDPAGES=n
MALLOCMULTIHEAP={considersize,heaps:n}
Variables for Process-Wide Contention Scope
Thread Debug Options
Thread Tuning Summary
SMP Tools
The bindprocessor Command
Considerations
The lockstat Command
The schedtune -s Command
Chapter 4. Planning and Implementing for Performance
Identifying the Components of the Workload
Documenting Performance Requirements
Estimating the Resource Requirements of the Workload
Measuring Workload Resources
Measuring a Complete Workload on a Dedicated System
Measuring a Complete Workload on a Production System
Measuring a Partial Workload on a Production System
Measuring an Individual Program
Estimating Resources Required by a New Program
Transforming Program-Level Estimates to Workload Estimates
Designing and Implementing Efficient Programs
CPU-Limited Programs
Design and Coding for Effective Use of Caches
Registers and Pipeline
Cache and TLBs
Effective Use of Preprocessors and the Compilers
Levels of Optimization
No Optimization
-O or -O2
-O3
Compiling for Specific Hardware Platforms (-qarch, -qtune)
C Options for string.h Subroutine Performance
C and C++ Coding Style for Best Performance
Compiler Execution Time
Memory-Limited Programs
Structuring of Pageable Code
Structuring of Pageable Data
Misuse of Pinned Storage
Using Performance-Related Installation Guidelines
Operating System Preinstallation Guidelines
CPU Preinstallation Guidelines
Memory Preinstallation Guidelines
Disk Preinstallation Guidelines
Placement and Sizes of Paging Spaces
Performance Implications of Disk Mirroring
Performance Implications of Mirrored Striped LVs
Communications Preinstallation Guidelines
Chapter 5. System Monitoring and Initial Performance Diagnosis
The Case for Continuous Performance Monitoring
Using the vmstat, iostat, netstat, and sar Commands
Using the topas Monitor
Using the Performance Diagnostic Tool
Using the Performance Toolbox
Recording with the Performance Agent
Determining the Kind of Performance Problem Reported
A Particular Program Runs Slowly
Everything Runs Slowly at a Particular Time of Day
Everything Runs Slowly at Unpredictable Times
Everything That an Individual User Runs is Slow
A Number of LAN-Connected Systems Slow Down Simultaneously
Everything on a Particular Service or Device Slows Down at Times
Identifying the Performance-Limiting Resource
Determining the Limiting Factor for a Single Program
Determining Whether the Problem is Related to Disk or Memory
Managing Workload
Chapter 6. Monitoring and Tuning CPU Use
Monitoring CPU Use
The vmstat Command (CPU)
The iostat Command
The sar Command
Real-time sampling and display
Display previously captured data
System activity accounting via cron daemon
The xmperf Program
Using the time Command to Measure CPU Use
time and timex Cautions
Identifying CPU-Intensive Programs
Using the ps Command
CPU Intensive
CPU Time Ratio
The THREAD Option
Using the acctcom Command
Using the tprof Program to Analyze Programs for CPU Use
A tprof Example
Offline Processing with the tprof Command
Using the pprof Command to Measure CPU Usage of Kernel Threads
Detecting Instruction Emulation with the emstat Tool
Restructuring Executable Programs with the fdpr Program
Controlling Contention for the CPU
Controlling the Priority of User Processes
Running a Command with the nice Command
Setting a Fixed Priority with the setpri Subroutine
Displaying Process Priority with the ps Command
Modifying the Priority with the renice Command
Clarification of the nice and renice Command Syntax
Tuning the Thread-Priority-Value Calculation
Priority Calculation
Tuning with the schedtune Command
Example of a Priority Calculation
Modifying the Scheduler Time Slice with the schedtune Command
CPU-Efficient User ID Administration (The mkpasswd Command)
Chapter 7. Monitoring and Tuning Memory Use
Determining How Much Memory Is Being Used
The vmstat Command (Memory)
The vmstat -I Command
The vmstat -s Command
The ps Command
The svmon Command
How Much Memory is in Use
Who is Using Memory?
Detailed Information on a Specific Segment ID
List of Top Memory Usage of Segments
Correlating svmon and vmstat Outputs
Correlating svmon and ps Outputs
Calculating the Minimum Memory Requirement of a Program
Finding Memory-Leaking Programs
Assessing Memory Requirements Through the rmss Command
Two Styles of Using rmss
Using rmss to Change the Memory Size and Exit
Using rmss to Run a Command over a Range of Memory Sizes
Interpreting rmss Results
Report Generated for the foo Program
Report for a 16 MB Remote Copy
Hints for Using the -s, -f, -d, -n, and -o Flags
Guidelines to Consider When Running the rmss Command
Tuning VMM Memory Load Control with the schedtune Command
Memory Load Control Tuning
The h Parameter
The p Parameter
The m Parameter
The w Parameter
The e Parameter
Tuning VMM Page Replacement with the vmtune Command
Choosing minfree and maxfree Settings
Tuning Memory Pools
Tuning lrubucket to Reduce Memory Scanning Overhead
Choosing minperm and maxperm Settings
Placing a Hard Limit on Persistent File Cache with strict_maxperm
Tuning Paging-Space Thresholds
Choosing npswarn and npskill Settings
Tuning the fork() Retry Interval Parameter with schedtune
Choosing a Page Space Allocation Method
Late Page Space Allocation
Early Page Space Allocation
Deferred Page Space Allocation
Choosing between LPSA and DPSA with the vmtune Command
Looking at Paging Space and Virtual Memory
Using Shared Memory
Extended Shared Memory (EXTSHM)
Chapter 8. Monitoring and Tuning Disk I/O Use
Monitoring Disk I/O
Building a Pre-Tuning Baseline
Wait I/O Time Reporting
Method Used in AIX 4.3.2 and Earlier
Method Used in AIX 4.3.3 and Later
Assessing Disk Performance with the iostat Command
TTY Report
CPU Report
Drive Report
Assessing Disk Performance with the vmstat Command
Assessing Disk Performance with the sar Command
Assessing Logical Volume Fragmentation with the lslv Command
Assessing Physical Placement of Data with the lslv Command
Assessing File Placement with the fileplace Command
Space Efficiency and Sequentiality
Assessing Paging Space I/O with the vmstat Command
Assessing Overall Disk I/O with the vmstat Command
Detailed I/O Analysis with the filemon Command
Global Reports of the filemon Command
Detailed Reports of the filemon Command
Guidelines for Using the filemon Command
Summary for Monitoring Disk I/O
Changing File System Attributes that Affect Performance
File-System Fragment Size
Compression
Changing Logical Volume Attributes That Affect Performance
Position on Physical Volume
Range of Physical Volumes
Maximum Number of Physical Volumes to Use for Allocation
Mirror Write Consistency
Allocate Each Logical Partition Copy on a Separate PV
Relocate the Logical Volume During Reorganization?
Scheduling Policy for Reading/Writing Logical Partition Copies
Enable Write Verify
Striping Size
Physical Volume Considerations
Volume Group Recommendations
Performance Impacts of Mirroring rootvg
Reorganizing Logical Volumes
Recommendations for Best Performance
Recommendations for Highest Availability
Reorganizing File Systems
Reorganizing a File System
Defragmenting a File System
Reorganizing JFS Log and Log Logical Volumes
Creating Log Logical Volumes
Tuning with vmtune
Sequential Read-Ahead
VMM Write-Behind
Sequential Write-Behind
Random Write-Behind
Tuning File Syncs
Miscellaneous I/O Tuning Parameters
numfsbufs
lvm_bufcnt
hd_pbuf_cnt
pd_npages
v_pinshm
fsbufwaitcnt and psbufwaitcnt
Using Disk-I/O Pacing
Example
Tuning Logical Volume Striping
Designing a Striped Logical Volume
Tuning for Striped Logical Volume I/O
Mirrored Striped Logical Volume Performance Implications
Tuning Asynchronous Disk I/O
Tuning Direct I/O
Performance of Direct I/O Reads
Performance of Direct I/O Writes
Performance Example
Summary
Using Raw Disk I/O
Using sync/fsync Calls
Setting SCSI-Adapter and Disk-Device Queue Limits
Non-IBM Disk Drive
Non-IBM Disk Array
Expanding the Configuration
Using RAID
RAID Levels and Their Performance Implications
RAID 0 - For Performance
RAID 1 - For Availability/Good Read Response Time
RAID 2 - Rarely Used
RAID 3 - For CAD/CAM, Sequential Access to Large Files
RAID 4 - Less Used (Parity Volume Bottleneck)
RAID 5 - High Availability and Fewer Writes Than Reads
RAID 6 - Seldom Used
RAID 7 - A Definition of 3rd Parties
RAID 10 - RAID-0+1
Summary of RAID Levels
RAID Performance Summary
Using SSA
Guidelines for Improving SSA Performance
Using Fast Write Cache
Chapter 9. Monitoring and Tuning Communications I/O Use
UDP and TCP/IP Performance Overview
Communication Subsystem Memory (mbuf) Management
Socket Layer
Send Flow
Receive Flow
Socket Creation
Ephemeral Ports
Relative Level of Function in UDP and TCP
UDP Layer
TCP Layer
IP Layer
Send Flow
Receive Flow
Demux Layer
Send Flow
Receive Flow
LAN Adapters and Device Drivers
Send Flow
Receive Flow
Analyzing Network Performance
The ping Command
The ftp Command
The netstat Command
Using the netstat Command
The netpmon Command
Using netpmon
Global Reports of netpmon
Detailed Reports of netpmon
Limitations of netpmon
The traceroute Command
Successful traceroute Examples
Failed traceroute Examples
The iptrace daemon, and the ipreport and ipfilter Commands
Adapter Statistics
The entstat Command
The tokstat Command
The fddistat Command
The atmstat Command
The no Command
Tuning TCP and UDP Performance
Overall Recommendations
Maximizing Throughput
Minimizing Memory
Adapter Transmit and Receive Queue Tuning
Transmit Queues
Receive Queues
Device-Specific Buffers
When to Increase the Receive/Transmit Queue Parameters
Commands to Query and Change the Queue Parameters
How to See the Settings
How to Change the Parameters
Tuning MCA and PCI Adapters
Enabling Thread Usage on LAN Adapters (dog threads)
Tuning TCP Maximum Segment Size
Local Network
Remote Network
UDP Socket Buffer Tuning
udp_sendspace
udp_recvspace
TCP Socket Buffer Tuning
tcp_sendspace
tcp_recvspace
rfc1323
sb_max
Interface-Specific Network Options (ISNO)
IP Protocol Performance Tuning Recommendations
Ethernet Performance Tuning Recommendations
Token-Ring (4 MB) Performance Tuning Recommendations
Token-Ring (16 MB) Performance Tuning Recommendations
FDDI Performance Tuning Recommendations
ATM Performance Tuning Recommendations
SOCC Performance Tuning Recommendations
HIPPI Performance Tuning Recommendations
Tuning mbuf Pool Performance
Overview of the mbuf Management Facility
Tuning Network Memory
Tuning Asynchronous Connections for High-Speed Transfers
Async Port Tuning Techniques
Shell Script fastport.sh for Fast File Transfers
Tuning Name Resolution
Improving telnetd/rlogind Performance
Tuning the SP Network
SP Switch Statistics
The estat Command
The vdidlxxxx Commands
SP System-Specific Tuning Recommendations
Managing Tunable SP Parameters
Initial Settings of SP Tunable Parameters
Tuning the SP Network for Specific Workloads
Chapter 10. Monitoring and Tuning NFS Use
NFS Overview
NFS Version 3
Write Throughput
Read Throughput
Reduced Requests for File Attributes
Efficient Use of High Bandwidth Network Technology
Reduced Directory "Lookup" Requests
Changes Since AIX 4.2.1
Changes in AIX 4.3
Analyzing NFS Performance
The nfsstat Command
RPC Statistics
NFS Server Information
NFS Client Information
The netpmon Command
The nfso Command
NFS References
List of Network File System (NFS) Files
List of NFS Commands
List of NFS Daemons
Tuning for NFS Performance
How Many biod and nfsd Daemons Are Needed?
Choosing Initial Numbers of nfsd and biod daemons
Tuning the Numbers of nfsd and biod daemons
Performance Implications of Hard or Soft NFS Mounts
Other mount Options That Affect Performance
The rsize and wsize Options
Disabling Unused NFS ACL Support
Tune Retransmissions
Tuning to Avoid Retransmits
Dropped Packets
Packets Dropped by the Client
Packets Dropped by the Server
Dropped Packets On the Network
Tuning the NFS File-Attribute Cache
Tuning for Maximum Caching of NFS Data
Cache File System (CacheFS)
CacheFS Benefits
What CacheFS Does Not Do
Configuring CacheFS
Tuning Other Layers to Improve NFS Performance
Increasing NFS Socket Buffer Size
NFS Server Disk Configuration
Misuses of NFS That Affect Performance
NFS Tuning Checklist
Chapter 11. Monitoring and Tuning Java
What is Java?
Why Java?
Java Performance Guidelines
Monitoring Java
Tuning Java
Chapter 12. Analyzing Performance with the Trace Facility
Understanding the Trace Facility
Implementation
Limiting the Amount of Trace Data Collected
Starting and Controlling Trace
Formatting Trace Data
Viewing Trace Data
Example of Trace Facility Use
Obtaining a Sample Trace File
Formatting the Sample Trace
Reading a Trace Report
Filtering of the Trace Report
Starting and Controlling Trace from the Command Line
Controlling Trace in Subcommand Mode
Controlling Trace by Commands
Starting and Controlling Trace from a Program
Controlling Trace with Trace Subroutine Calls
Controlling Trace with ioctl() Calls
Using the trcrpt Command to Format a Report
Formatting a Report on the Same System
Formatting a Report on a Different System
Formatting a Report from trace -C Output
Adding New Trace Events
Possible Forms of a Trace Event Record
Trace Channels
Macros for Recording Trace Events
Use of Event IDs
Examples of Coding and Formatting Events
Syntax for Stanzas in the Trace Format File
Chapter 13. Using Performance Diagnostic Tool (PDT)
Structure of PDT
Scope of PDT Analysis
Analyzing the PDT Report
Header
Alerts
Upward and Downward Trends
System Health
Summary
Installing and Enabling PDT
Customizing PDT
Changing the PDT Report Recipient and Severity Level
PDT Severity Levels
Severity 1 Problems
Severity 2 Problems
Severity 3 Messages
Obtaining a PDT Report on Demand
Modifying the List of Files Monitored by PDT
Modifying the List of Hosts That PDT Monitors
Changing the Historical-Record Retention Period
Modifying the Collection, Retention, and Reporting Times
Modifying the Thresholds
PDT Error Reporting
Uninstalling PDT
Responding to PDT Report Messages
Chapter 14. Reporting Performance Problems
Measuring the Baseline
Reporting a Performance Problem
What is a Performance Problem?
Performance Problem Description
Reporting the Problem
Using the Problem-Analysis Data Collected
Chapter 15. Application Tuning
Profiling
Timing Commands
The prof Command
The gprof Command
The gprof Implementation
The tprof Command
Source Level Profiling with tprof
Reports That the tprof Command Generates
Compiler Optimization Techniques
Compiling with Optimization (-O, -O2, -O3, -qstrict, -qhot, -qipa)
Recommendations
When to Compile without Optimization
Compiling for Specific Hardware Platforms (-qarch, -qtune)
Recommendations
Compiling for Floating-Point Performance (-qfloat)
Recommendations
Specifying Cache Sizes (-qcache)
Expanding Procedure Calls Inline (-Q)
When to Use Dynamic Linking and Static Linking
Determining If Nonshared Libraries Help Performance
Specifying the Link Order to Reduce Paging for Large Programs
Calling the BLAS and ESSL Libraries
Profile Directed Feedback (PDF)
The fdpr Command
Optimizing Preprocessors for FORTRAN and C
Code-Optimization Techniques
Mapped Files
Appendix A. Monitoring and Tuning Commands and Subroutines
Performance Reporting and Analysis Commands
Performance Tuning Commands
Performance-Related Subroutines
Appendix B. Efficient Use of the ld Command
Rebindable Executable Programs
Prebound Subroutine Libraries
Examples
Appendix C. Accessing the Processor Timer
POWER-based-Architecture-Unique Timer Access
Assembler Routines to Access the POWER Timer Registers
C Subroutine to Supply the Time in Seconds
Accessing Timer Registers in POWER-based Systems
Example of the second Subroutine
Appendix D. Determining CPU Speed
Appendix E. National Language Support: Locale versus Speed
Programming Considerations
Some Simplifying Rules
Setting the Locale
Appendix F. Performance Tuning with AIX Fast Connect For Windows and OS/2
Monitoring AIX Fast Connect
Tuning AIX Fast Connect
Appendix G. Summary of Tunable Parameters
Environment Variables
Thread Support Tunable Parameters
Miscellaneous Tunable Parameters
Kernel Tunable Parameters
Scheduler Tunable Parameters
Memory Load Control Tunable Parameters
Virtual Memory Manager and File System Tunable Parameters
Asynchronous I/O Tunable Parameters
Logical Volume Manager Tunable Parameters
Disk and Disk Adapter Tunable Parameters
Interprocess Communication Tunable Parameters
Network Tunable Parameters
Network Option Tunable Parameters
arpqsize
arpt_killc
arptab_bsiz
arptab_nb
bcastping
clean_partial_conns
delayack
delayackports
directed_broadcast
extendednetstats
fasttimo
icmpaddressmask
ie5_old_multicast_mapping
ifsize
inet_stack_size
ipforwarding
ipfragttl
ipignoreredirects
ipqmaxlen
ipsendredirects
ipsrcrouteforward
ipsrcrouterecv
ipsrcroutesend
ip6_defttl
ip6_prune
ip6forwarding
ip6srcrouteforward
llsleep_timeout
main_if6
main_site6
maxmbuf
maxnip6q
maxttl
MTU
multi_homed
nbc_limit
nbc_max_cache
nbc_min_cache
nbc_pseg (AIX 4.3.3 and later)
nbc_pseg_limit (AIX 4.3.3 and later)
ndpqsize
ndpt_down
ndpt_keep
ndpt_mmaxtries
ndpt_probe
ndpt_reachable
ndpt_retrans
ndpt_umaxtries
net_malloc_police
nonlocsrcroute
pmtu_default_age
pmtu_rediscover_interval
rec_que_size
rfc1122addrchk
rfc1323
route_expire
routerevalidate
rto_high
rto_length
rto_limit
rto_low
sack (AIX 4.3.3 and later)
sb_max
send_file_duration
site6_index
sockthresh
somaxconn
subnetsarelocal
tcp_ephemeral_high
tcp_ephemeral_low
tcp_keepidle
tcp_keepinit
tcp_keepintvl
tcp_mssdflt
tcp_ndebug
tcp_nodelay
tcp_pmtu_discover
tcp_recvspace
tcp_sendspace
tcp_timewait
tcp_ttl
thewall
threads
udp_ephemeral_high
udp_ephemeral_low
udp_pmtu_discover
udp_recvspace
udp_sendspace
udp_ttl
udpcksum
use_isno (AIX 4.3.3 and later)
xmt_que_size
NFS Option Tunable Parameters
biod Count
nfs_allow_all_signals
nfs_device_specific_bufs (AIX 4.2.1 and later)
nfs_duplicate_cache_size (AIX 4.1, Version 4.2.0)
nfs_dynamic_retrans (AIX 4.1 and later)
nfs_gather_threshold
nfs_iopace_pages (AIX 4.1)
nfs_max_connections
nfs_max_read_size (AIX 4.3.1 and later)
nfs_max_threads (AIX 4.2.1 and later)
nfs_max_write_size (AIX 4.3.1 and later)
nfs_repeat_messages (AIX Version 4)
nfs_rfc1323 (AIX 4.3)
nfs_server_base_priority (AIX 4.1 and later)
nfs_server_clread (AIX 4.2.1 and later)
nfs_setattr_error (AIX 4.2.1 and later)
nfs_socketsize (AIX 4.1 and later)
nfs_tcp_duplicate_cache_size (AIX 4.2.1 and later)
nfs_tcp_socketsize (AIX 4.2.1 and later)
nfs_udp_duplicate_cache_size (AIX 4.2.1 and later)
nfs_use_reserved_ports (AIX 4.2.1 and later)
nfsd Count
portcheck
udpchecksum
Streams Tunable Attributes
lowthresh
medthresh
nstrpush
psebufcalls
psecache
pseintrstack
psetimers
strctlsz
strmsgsz
strthresh
strturncnt
Appendix H. Notices
Index