What could cause a PSF for AIX/PSM system to hang when printing to a 3900-0W1?

ITEM: RTA000107147



Q:                                                                              
ABSTRACT:     What could cause a PSF for AIX system to hang when                
              printing to a 3900-0W1?                                           
SEARCH ARG:   psf aix hang                                                      
TOPIC THREAD: PRINT                                                             
              PSF/AIX                                                           
..                                                                              
PSF for AIX V2.1 (and PSM): The customer runs on an F30 with AIX 4.1.4
or 4.1.5 and is driving a 3900-0W1 over 10BaseT Ethernet.
The customer was 5,000 pages into a 42,000-page print job when the
printer and server both "hung".  The customer was able to recover by
powering the printer off and on and doing the same with the server.
                                                                                
They worked with the AIX Support Line and determined that they had no
out-of-space conditions in their spool or /var space.
                                                                                
Another SE told me there is a "locking semaphores" problem when
running on an F50 with AIX 4.2.1 in a POD environment.  In their case,
they can issue psfctl -kd to get going again (although they have to
restart the job, etc.).  The locking semaphore condition is basically
an out-of-sync condition that occurs when PSF makes a call using AIX
kernel services.
                                                                                
Could this have been our problem even though we are not a POD                   
environment?  Has this condition been reported in our environment?              
                                                                                
  I would appreciate any insights you may have on this problem.                 
                                                                               
P.S. Hardware was checked and ruled out as the cause of this "hang."            
                                                                                
Thank you.                                                                      
                                                                                
A:                                                                              
The semaphore problem that you describe on the F50 can occur with               
PSF for AIX in addition to POD (InfoPrint).  However, it only occurs on         
RS/6000s with multiple processors, i.e., on systems running the AIX             
MP kernel.  (That problem has been fixed, by the way; an e-fix is
available by opening a problem with Level 2.)  Since the 7025-F30 is a
uniprocessor system, your customer should not (as far as I know)
experience that particular problem with the semaphores.
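
The reasoning above (the semaphore problem applies only to
multiprocessor systems) can be sketched as a quick check.  This is a
minimal illustration, not a PSF or AIX tool: it assumes a Python 3
interpreter is available on the server, it uses the processor count as
a stand-in for "running the MP kernel", and the function name is
hypothetical.

```python
import os


def mp_semaphore_issue_possible(cpu_count):
    """Return True when more than one processor is reported.

    Assumption: a processor count above 1 is treated as a proxy for a
    system running the AIX MP kernel. An unknown count (None) is
    reported as False, so it should be verified by other means.
    """
    return cpu_count is not None and cpu_count > 1


if __name__ == "__main__":
    count = os.cpu_count()  # may be None if it cannot be determined
    if mp_semaphore_issue_possible(count):
        print(f"{count} processors: the MP semaphore problem could apply")
    else:
        print(f"processor count {count}: the MP semaphore problem is unlikely")
```

On a uniprocessor system such as the F30 described here, the check
would be expected to report that the problem is unlikely.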
                                                                                
Do you know if your customer checked to see if there is sufficient
space in /var/psf/segments?  Whether or not this is relevant depends
on how your customer is submitting jobs and how they have configured
their system, but if they are using psfin (Input Manager) to submit
jobs *and* if they've configured a separate filesystem for
/var/psf/segments, running out of space there might be the problem.
Also, your customer has PSM as well as PSF/AIX.  I'm less familiar
with that product and its areas of potential concern, and the PSM
specialist is out of the office until next week.
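
As a rough illustration of the space check suggested above, the
following sketch reports free space for the filesystem that holds a
given path.  It assumes a Python 3 interpreter is available on the
server; the function name and the 10% threshold are hypothetical
choices for the example, not PSF recommendations.  The standard
df -k command gives the same information from the command line.

```python
import shutil


def free_space_report(path, min_free_fraction=0.10):
    """Return (free_bytes, total_bytes, is_low) for the filesystem holding path.

    min_free_fraction is an illustrative threshold, not a PSF value.
    """
    usage = shutil.disk_usage(path)
    free_fraction = usage.free / usage.total if usage.total else 0.0
    return usage.free, usage.total, free_fraction < min_free_fraction


if __name__ == "__main__":
    # Paths taken from the discussion above; adjust to the local setup.
    for path in ("/var/psf/segments", "/var"):
        try:
            free, total, low = free_space_report(path)
        except OSError as exc:
            print(f"{path}: cannot check ({exc})")
            continue
        status = "LOW" if low else "ok"
        print(f"{path}: {free // 1024} KB free of {total // 1024} KB [{status}]")
```

If /var/psf/segments is a separate filesystem, its figures will differ
from those for /var, which is the case the check is meant to expose.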
                                                                                
The only other thing that I can think of is that we've had some
reports of unexpected "hangs" from users who have the Job Interval
Shutdown (JIS) timer set to 9999 (never time out).  The workaround for
these customers has been to set the JIS timer to a finite value; one
consultant has recommended 300 seconds (5 minutes).
This situation has been called to development's attention, but I have
no information on a future fix.
                                                                                
Therefore, I would suggest that the customer reopen the problem with
the AIX Support Line and ask them to route the PMR to Boulder Level 2
for further analysis.  Then, if necessary, Level 2 can work with
hardware to diagnose any problems.
                                                                                
I hope this helps.  Thanks for using WWQ&A.                                     
                                                                                
Search keywords:
semaphore MP UP multiple multiprocessors CPU processor hang PSF                 
PSF/6000 PSF/AIX PSM AIX F50 F30 job interval shutdown jis timer                
afccu tcp/ip                                                                    
                                                                                
                                                                                
                                                                               


WWQA ITEM: RTA000107147
Dated: 01/1999 Category: XPSF6000
This HTML file was generated 99/06/24~12:43:35