Certain aspects of PDT can be customized. For example, any user can be designated as the regular recipient of PDT reports, and the retention period for data in PDT's historical record can be modified. All customization is performed either by modifying one of the PDT files in the directory /var/perf/cfg/diag_tool/ or by executing the /usr/sbin/perf/diag_tool/pdt_config script.
It is recommended that no changes be made until after PDT has produced several reports, and a certain familiarity with PDT has been acquired.
By default, PDT reports are generated at severity level 1, with only the most serious problems identified. More detailed information is frequently available at the other severity levels (2 and 3). Further, whenever a PDT report is produced, it is mailed to the adm user. You can choose to have the report mailed elsewhere or not mailed at all.
Both of these parameters are controlled with the /usr/sbin/perf/diag_tool/pdt_config script. The following dialog changes the user and the severity level:
# /usr/sbin/perf/diag_tool/pdt_config
________________PDT customization menu__________________
1) show current PDT report recipient and severity level
2) modify/enable PDT reporting
3) disable PDT reporting
4) modify/enable PDT collection
5) disable PDT collection
6) de-install PDT
7) exit pdt_config
Please enter a number: 1
current PDT report recipient and severity level
adm 1
________________PDT customization menu__________________
1) show current PDT report recipient and severity level
2) modify/enable PDT reporting
3) disable PDT reporting
4) modify/enable PDT collection
5) disable PDT collection
6) de-install PDT
7) exit pdt_config
Please enter a number: 2
enter id@host for recipient of report : rsmith
enter severity level for report (1-3): 2
report recipient and severity level
rsmith 2
________________PDT customization menu__________________
1) show current PDT report recipient and severity level
2) modify/enable PDT reporting
3) disable PDT reporting
4) modify/enable PDT collection
5) disable PDT collection
6) de-install PDT
7) exit pdt_config
Please enter a number: 1
current PDT report recipient and severity level
rsmith 2
________________PDT customization menu__________________
1) show current PDT report recipient and severity level
2) modify/enable PDT reporting
3) disable PDT reporting
4) modify/enable PDT collection
5) disable PDT collection
6) de-install PDT
7) exit pdt_config
Please enter a number: 7
In the preceding example, the recipient is changed to user rsmith, and the severity is changed to 2. This means that user rsmith will receive the PDT report, and that both severity 1 and 2 messages will be included. Note the use of option 1 to determine the current PDT report recipient and report severity level.
The recipient and severity level can also be changed directly in the /var/perf/cfg/diag_tool/.reporting.list file.
To terminate reporting (but allow collection to continue), select option 3, for example:
# /usr/sbin/perf/diag_tool/pdt_config
________________PDT customization menu__________________
1) show current PDT report recipient and severity level
2) modify/enable PDT reporting
3) disable PDT reporting
4) modify/enable PDT collection
5) disable PDT collection
6) de-install PDT
7) exit pdt_config
Please enter a number: 3
disable PDT reporting done
________________PDT customization menu__________________
1) show current PDT report recipient and severity level
2) modify/enable PDT reporting
3) disable PDT reporting
4) modify/enable PDT collection
5) disable PDT collection
6) de-install PDT
7) exit pdt_config
Please enter a number: 1
reporting has been disabled (file .reporting.list not found).
The following lists indicate the possible problems associated with each severity level. Remember that selecting severity n results in the reporting of all problems of severity less than or equal to n.
Severity 3 messages provide additional detail about problems identified at severity levels 1 and 2. This includes the data-collection characteristics, such as number of samples, for severity 1 and 2 messages.
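The cumulative nature of severity selection can be illustrated with a short Python sketch. The problem records below are invented placeholders, not actual PDT findings, and the function name is hypothetical; only the "severity less than or equal to n" rule comes from the text above.

```python
# Hypothetical illustration of the "severity <= n" selection rule.
# The sample records are placeholders, not real PDT output.
def problems_to_report(problems, selected_severity):
    """Return the (severity, text) records whose severity is <= the selected level."""
    if selected_severity not in (1, 2, 3):
        raise ValueError("severity level must be 1, 2, or 3")
    return [p for p in problems if p[0] <= selected_severity]

sample = [
    (1, "example severity 1 finding"),
    (2, "example severity 2 finding"),
    (3, "example severity 3 detail"),
]

# Selecting severity 2 keeps the severity 1 and severity 2 records only.
print(problems_to_report(sample, 2))
```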
As an alternative to using the periodic report, any user can request a current report from the existing data by executing /usr/sbin/perf/diag_tool/pdt_report SeverityNum. The report is produced with the given severity (if none is provided, SeverityNum defaults to 1) and written to standard output. Generating a report in this way does not change the /var/perf/tmp/PDT_REPORT or /var/perf/tmp/PDT_REPORT.last files.
PDT analyzes files and directories for systematic growth in size. It examines only those files and directories listed in the file /var/perf/cfg/diag_tool/.files. The format of the .files file is one file or directory name per line. The default content is as follows:
/usr/adm/wtmp
/var/spool/qdaemon/
/var/adm/ras/
/tmp/
You can use an editor to modify this file to track files and directories that are important to your system.
PDT tracks the average ping delay to hosts whose names are listed in the /var/perf/cfg/diag_tool/.nodes file. This file is not shipped with PDT (which means that no host analysis is performed by default), but may be created by the administrator. The format of the .nodes file is one host name per line in the file. For example, to monitor nodes chuys and hulahut, the file .nodes would be as follows:
chuys
hulahut
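Both the .files and .nodes files use the same one-name-per-line layout, which the following Python sketch reads and writes. The helper names are hypothetical, and skipping blank lines is a choice made for this sketch rather than documented PDT behavior.

```python
# Minimal sketch of the one-name-per-line layout used by .files and .nodes.
# Helper names are hypothetical; blank-line handling is an assumption.
def read_name_list(text):
    """Split file content into a list of names, one per non-blank line."""
    return [line.strip() for line in text.splitlines() if line.strip()]

def format_name_list(names):
    """Produce file content with exactly one name per line."""
    return "".join(name + "\n" for name in names)

nodes_text = format_name_list(["chuys", "hulahut"])
print(read_name_list(nodes_text))  # ['chuys', 'hulahut']
```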
Periodically, a retention shell script is run that discards entries in the PDT historical record that are older than the designated retention period. The retention of all data is governed by the same retention policy. This policy is described in the /var/perf/cfg/diag_tool/.retention.list file. The default .retention.list content is as follows:
* * * 35
This entry causes all data to be retained for no more than 35 days. The number 35 can be replaced by any unsigned integer.
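As a rough illustration of the retention rule, the Python sketch below takes the retention period from the last field of a .retention.list entry and decides whether a dated entry would be discarded. The function names are hypothetical, the meaning of the first three fields is not covered here, and measuring age in whole days from the entry date is an assumption of this sketch.

```python
from datetime import date, timedelta

# Sketch only: reads the retention period (last field) and applies an
# assumed "older than N days" test. Not the actual retention script.
def retention_days(retention_list_text):
    """Return the retention period from the first non-blank line."""
    for line in retention_list_text.splitlines():
        fields = line.split()
        if fields:
            return int(fields[-1])
    raise ValueError("no retention entry found")

def is_discarded(entry_date, today, days):
    """True if the entry is older than the retention period."""
    return (today - entry_date) > timedelta(days=days)

days = retention_days("* * * 35\n")
today = date(2024, 3, 1)
print(days)                                          # 35
print(is_discarded(date(2024, 1, 1), today, days))   # True, 60 days old
print(is_discarded(date(2024, 2, 20), today, days))  # False, 10 days old
```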
PDT uses the historical record to assess trends and identify system changes. Extending the retention period increases the scope of this analysis, but at the cost of additional disk storage and PDT processing time.
The PDT historical record is maintained in /var/perf/tmp/.SM. The retention script creates a copy of this file in /var/perf/tmp/.SM.last prior to performing the retention operation. In addition, historical data that is discarded is appended to /var/perf/tmp/.SM.discards.
The existence of /var/perf/tmp/.SM.last provides limited backup, but the administrator should ensure that the /var/perf/tmp/.SM file is regularly backed up. If the file is lost, PDT continues to function, but without the historical information. Over time, the historical record will grow again as new data is collected.
Collection, reporting, and retention are driven by three entries in the cron table of user adm. Collection occurs every weekday at 9 a.m. (Driver_ daily). Reporting occurs every weekday at 10 a.m. (Driver_ daily2). The retention analysis is performed once a week, on Saturday evening at 9 p.m. (Driver_ offweekly). The following files are used:
The cron entries (created by executing the /usr/sbin/perf/diag_tool/pdt_config script and selecting option 2) are shown below:
0 9 * * 1-5 /usr/sbin/perf/diag_tool/Driver_ daily
0 10 * * 1-5 /usr/sbin/perf/diag_tool/Driver_ daily2
0 21 * * 6 /usr/sbin/perf/diag_tool/Driver_ offweekly
The default times can be changed by altering the crontab for user adm.
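To make the schedule fields concrete, the following Python sketch splits the three default entries into their cron fields and shows what a collection entry moved to 7 a.m. would look like. It only manipulates text; the helper names are hypothetical, and the resulting line would still have to be installed in the adm crontab by the administrator.

```python
# Sketch: inspect and adjust the default PDT cron entries as plain text.
DEFAULT_ENTRIES = """\
0 9 * * 1-5 /usr/sbin/perf/diag_tool/Driver_ daily
0 10 * * 1-5 /usr/sbin/perf/diag_tool/Driver_ daily2
0 21 * * 6 /usr/sbin/perf/diag_tool/Driver_ offweekly
"""

def parse_entry(line):
    """Split a cron line into its five time fields and the command."""
    minute, hour, day_of_month, month, day_of_week, command = line.split(None, 5)
    return {
        "minute": minute,
        "hour": hour,
        "day_of_month": day_of_month,
        "month": month,
        "day_of_week": day_of_week,
        "command": command,
    }

def with_hour(line, new_hour):
    """Return the same cron line with a different hour field."""
    fields = line.split(None, 5)
    fields[1] = str(new_hour)
    return " ".join(fields)

lines = DEFAULT_ENTRIES.splitlines()
for line in lines:
    entry = parse_entry(line)
    print(entry["hour"], entry["day_of_week"], entry["command"])

print(with_hour(lines[0], 7))
```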
The file /var/perf/cfg/diag_tool/.thresholds contains the thresholds used in analysis and reporting. These thresholds, listed below, have an effect on PDT report organization and content.
The SCSI controllers having the largest and the smallest disk storage are identified. This is a static size, not the amount allocated or free. If the difference (in MB) between these two controllers exceeds DISK_STORAGE_BALANCE, a message is reported:
SCSI Controller %s has %.0lf MB more storage than %s
The default value for DISK_STORAGE_BALANCE is 800. Any integer value between 0 and 10000 is valid.
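The DISK_STORAGE_BALANCE comparison can be sketched in Python as follows. The controller names and sizes are invented sample values and the function name is hypothetical; only the threshold, the comparison, and the message wording come from the description above.

```python
# Sketch of the storage balance check. Sample controllers are hypothetical.
DISK_STORAGE_BALANCE = 800  # default threshold from the text, in MB

def storage_balance_message(storage_mb, threshold=DISK_STORAGE_BALANCE):
    """storage_mb maps a controller name to its total (static) disk storage in MB."""
    if len(storage_mb) < 2:
        return None  # nothing to compare
    largest = max(storage_mb, key=storage_mb.get)
    smallest = min(storage_mb, key=storage_mb.get)
    difference = storage_mb[largest] - storage_mb[smallest]
    if difference > threshold:
        return "SCSI Controller %s has %.0f MB more storage than %s" % (
            largest, difference, smallest)
    return None

print(storage_balance_message({"scsi0": 4000, "scsi1": 2000}))
print(storage_balance_message({"scsi0": 2000, "scsi1": 1500}))  # None, within threshold
```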
The paging spaces having the largest and the smallest areas are identified. If the difference (in MB) between these two exceeds PAGING_SPACE_BALANCE, a message is reported. The default value is 4. Any integer value between 0 and 100 is accepted. This threshold is presently not used in analysis and reporting.
The SCSI controllers having the largest and the least number of disks attached are identified. If the difference between these two counts exceeds NUMBER_OF_BALANCE, a message is reported:
SCSI Controller %s has %.0lf more disks than %s
The default value is 1. It can be set to any integer value in the range of 0 to 10000.
The same type of test is performed on the number of paging areas on each physical volume:
Physical Volume %s has %.0lf paging areas, while Physical Volume %s has only %.0lf
MIN_UTIL applies to process utilization. Changes in the top three CPU consumers are reported only if the new process has a utilization in excess of MIN_UTIL:
First appearance of %s (%s) on top-3 cpu list
The same threshold applies to changes in the top-three memory consumers list:
First appearance of %s (%s) on top-3 memory list
The default value is 3. Any integer value from 0 to 100 is valid.
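A minimal Python sketch of the MIN_UTIL filter follows. The process names and utilization figures are invented, the function name is hypothetical, and representing the previous top-three list as a plain list of names is an assumption made for illustration.

```python
# Sketch of the MIN_UTIL filter for top-3 list changes. Sample data is hypothetical.
MIN_UTIL = 3  # default threshold from the text, in percent

def new_top3_entries(previous_top3, current_top3, min_util=MIN_UTIL):
    """Return names that are new to the top-3 list and exceed min_util.

    previous_top3: list of process names from the earlier list
    current_top3:  list of (process_name, utilization_percent) pairs
    """
    known = set(previous_top3)
    return [name for name, util in current_top3
            if name not in known and util > min_util]

previous = ["db", "web", "cron"]
current = [("db", 40.0), ("backup", 12.5), ("idle_job", 1.0)]

# "backup" is new and above 3 percent; "idle_job" is new but below the threshold.
print(new_top3_entries(previous, current))  # ['backup']
```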
FS_UTIL_LIMIT applies to journaled file system utilization. If a file system has a percentage use above FS_UTIL_LIMIT, a message is reported:
File system %s (%s) is nearly full at %.0lf %%
The same threshold is applied to paging spaces:
Paging space %s is nearly full at %.0lf %%
The default value is 90 percent. Any integer value between 0 and 100 is accepted.
Special attention should be given to /, /var, and /tmp file systems. The operating system uses these areas for normal operation. If there remains no space in one of these, the behavior of the system is unpredictable. Error messages are provided when the execution of commands fails, but to detect these file system problems earlier, decrease FS_UTIL_LIMIT to 70 or 80 percent.
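The effect of lowering FS_UTIL_LIMIT can be seen in the Python sketch below. The sample values are invented, the function name is hypothetical, and treating the two string fields of the message as a device name and a mount point is an assumption; the text above does not say what they contain.

```python
# Sketch of the FS_UTIL_LIMIT check. Field meanings and sample values are assumptions.
FS_UTIL_LIMIT = 90  # default threshold from the text, in percent

def nearly_full_message(device, mount_point, percent_used, limit=FS_UTIL_LIMIT):
    """Return the report message if utilization exceeds the limit, else None."""
    if percent_used > limit:
        return "File system %s (%s) is nearly full at %.0f %%" % (
            device, mount_point, percent_used)
    return None

# At the default limit of 90, a file system at 85 percent is not reported.
print(nearly_full_message("hd3", "/tmp", 85))
# With the limit lowered to 80, the same file system is reported earlier.
print(nearly_full_message("hd3", "/tmp", 85, limit=80))
```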
The objective of MEMORY_FACTOR is to determine whether the total amount of memory is adequately backed up by paging space. If real memory is close to the amount of used paging space, then the system is likely paging and would benefit from the addition of memory.
The formula is based on experience and compares MEMORY_FACTOR * memory with the average used paging space.
The current default is 0.9. Decreasing this number causes the following warning to be produced more frequently:
System has %.0lf MB memory; may be inadequate.
Increasing this number eliminates the message altogether. It can be set anywhere between 0.001 and 100.
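The comparison can be sketched in Python as shown below. The direction of the test (warn when the average used paging space exceeds MEMORY_FACTOR * memory) is inferred from the statement that a smaller factor produces more warnings; it is an assumption, not a documented formula. The function name and sample figures are hypothetical.

```python
# Sketch of the MEMORY_FACTOR comparison. The inequality direction is an assumption.
MEMORY_FACTOR = 0.9  # default from the text

def memory_warning(memory_mb, avg_used_paging_mb, factor=MEMORY_FACTOR):
    """Return the warning if used paging space exceeds factor * memory, else None."""
    if avg_used_paging_mb > factor * memory_mb:
        return "System has %.0f MB memory; may be inadequate." % memory_mb
    return None

# 512 MB of memory with 400 MB of paging space in use on average:
print(memory_warning(512, 400))              # None at the default factor of 0.9
print(memory_warning(512, 400, factor=0.5))  # a smaller factor triggers the warning
```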
TREND_THRESHOLD is used in all trending assessments. It is applied after a linear regression is performed on all available historical data, a technique that fits the best straight line through the data points. For a trend to be reported, the slope of the fitted line must exceed last_value * TREND_THRESHOLD:
File system %s (%s) is growing, now, %.2lf %% full, and growing an avg. of %.2lf %%/day
The objective is to try to ensure that a trend, however strong its statistical significance, has some practical significance.
For example, if a file system is found to be growing at X MB a day, and the last_value for the file system size is 100 MB, X must exceed 100 MB * TREND_THRESHOLD for the growth to be reported as a trend of practical significance. The default value is 0.01, so the growth rate would have to exceed 1 MB per day to be reported. The threshold can be set anywhere between 0.00001 and 100000.
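The practical-significance test can be worked through in Python using an ordinary least-squares slope. This is a sketch under stated assumptions: the sample sizes are invented, the function names are hypothetical, and the statistical-significance test that PDT also applies is not modeled here.

```python
# Sketch of the TREND_THRESHOLD test using a least-squares slope.
# Sample data is hypothetical; statistical significance is not modeled.
TREND_THRESHOLD = 0.01  # default from the text

def fitted_slope(days, values):
    """Least-squares slope of values (MB) against days."""
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, values))
    return sxy / sxx

def is_practical_trend(days, values, threshold=TREND_THRESHOLD):
    """True if the slope exceeds last_value * threshold."""
    return fitted_slope(days, values) > values[-1] * threshold

days = [0, 1, 2, 3, 4]
growing = [92.0, 94.0, 96.0, 98.0, 100.0]  # 2 MB/day, last value 100 MB
slow = [98.0, 98.5, 99.0, 99.5, 100.0]     # 0.5 MB/day, last value 100 MB

print(fitted_slope(days, growing), is_practical_trend(days, growing))  # 2.0 True
print(fitted_slope(days, slow), is_practical_trend(days, slow))        # 0.5 False
```

With a last value of 100 MB and the default threshold, the cut-off is 1 MB per day, so only the first series qualifies.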
This threshold assessment applies to trends associated with:
EVENT_HORIZON is also used in trending assessments. For example, in the case of file systems, if there is a significant (both statistical and practical) trend, the time until the file system is 100 percent full is estimated. If this time is within EVENT_HORIZON, a message is reported:
At this rate, %s will be full in about %.0lf days
The default value is 30, and it can be any integer value between 0 and 100000.
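A simple way to picture the EVENT_HORIZON test is a linear extrapolation of the current growth rate, as in the Python sketch below. The extrapolation formula, the function names, and the sample figures are assumptions made for illustration; the text only states that the time until the file system is full is estimated and compared with the threshold.

```python
# Sketch of the EVENT_HORIZON test using linear extrapolation (an assumption).
EVENT_HORIZON = 30  # default from the text, in days

def days_until_full(percent_full, growth_percent_per_day):
    """Days until 100 percent at the current growth rate."""
    return (100.0 - percent_full) / growth_percent_per_day

def horizon_message(name, percent_full, growth_percent_per_day,
                    horizon=EVENT_HORIZON):
    """Return the report message if the estimate falls within the horizon."""
    if growth_percent_per_day <= 0:
        return None  # not growing, so no estimate
    remaining = days_until_full(percent_full, growth_percent_per_day)
    if remaining <= horizon:
        return "At this rate, %s will be full in about %.0f days" % (name, remaining)
    return None

print(horizon_message("/var", 80.0, 1.0))  # about 20 days, within the horizon
print(horizon_message("/var", 50.0, 0.5))  # about 100 days, not reported
```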
This threshold applies to trends associated with:
Errors can occur within each of the different PDT components. In general, an error does not terminate PDT. Instead, a message is output to the PDT standard error file, /var/perf/tmp/.stderr. That phase of processing then terminates.
Users experiencing unexpected behavior, such as the PDT report not being produced as expected, should examine the /var/perf/tmp/.stderr file.
It is not possible to uninstall PDT directly using the pdt_config command, but if option 6 is requested, a message describes the steps necessary to remove PDT from the system:
# /usr/sbin/perf/diag_tool/pdt_config
________________PDT customization menu__________________
1) show current PDT report recipient and severity level
2) modify/enable PDT reporting
3) disable PDT reporting
4) modify/enable PDT collection
5) disable PDT collection
6) de-install PDT
7) exit pdt_config
Please enter a number: 6
PDT is installed as package bos.perf.diag_tool in the bos lpp.
Use the installp facility to remove the package