Notes on decommisioning reindeer as our TSM server (aka adsmsrv1). 9-15-2005 From /etc/inittab, removed autosrvr:2:once:/usr/tivoli/tsm/server/bin/rc.adsmserv >/dev/console 2>&1 #Start the Tivoli Storage Manager server ... I got distracted ... ------------------------------------------------------------------------------- Lisle's TSM server is called space, a Windows machine at 10.224.89.144. You could use the "Client Connection Manager" to connect to it (uid=administrator, pw=p0rcupine), then get under the TSM GUI (uid=admin, pw=global), but the easier way to administer the TSM server, is to use the dsmadmc command line client. First though, you need to add a stanza in your dsm.sys, defining space as a TSM server: vi /usr/tivoli/tsm/client/ba/bin/dsm.sys Then you can dsmadmc -id=jasper -password=new4now or dsmadmc -id=admin -password=global -server=space ------------------------------------------------------------------------------- To install ADSM/TSM on an AIX machine at Delphion, presuming AFS is installed, as root, - cd /afs/d/software/base/Tivoli.Storage.Manager-4.1.2.0/AIX - smitty installp - Install and Update from LATEST Available Software - INPUT device / directory for software => . - SOFTWARE to install => tivoli.tsm.client.ba.aix43.32bit - Copy over sample config files, cd .. cp sample.dsm.opt /usr/tivoli/tsm/client/ba/bin/dsm.opt cp sample.dsm.sys /usr/tivoli/tsm/client/ba/bin/dsm.sys cp sample.inclexcl.dsm /usr/tivoli/tsm/client/ba/bin/inclexcl.dsm - Edit and customize the sample config files. At a minimum, cd /usr/tivoli/tsm/client/ba/bin vi dsm.sys to change the last line (Nodename jasper) to this machine's I.P. name. You may also want to update vi inclexcl.dsm - To set the ADSM/TSM password for the first time, invoke dsm. dsmc q b / and answer the possible "user id" prompt with just and the password prompt with new4you or new4now. - To set the nightly cron job, crontab -e and add # # Do a nightly incremental backup to ADSM/TSM. 
39 1 * * * /usr/bin/dsmc incremental > /tmp/nightly.dsmc.incremental 2>&1 Vary the starting time to some random time to distribute the load. This example starts the nightly backup at 1:39 AM. - To start the first, initial full backup, yet still be able to logoff, nohup dsmc incr 1>/dev/null 2>&1 & ------------------------------------------------------------------------------- To start an ADSM Admin session, dsmadmc -id=jasper -password=new4now -itemcommit ------------------------------------------------------------------------------- To reset the password for machine, or to reset the password for a user, update node afs1 new4now update admin mike new4you ------------------------------------------------------------------------------- To define a new AIX machine to the ADSM server, reg node afs1 new4now backdel=y archdel=y us=none contact='Rick Jasper' To define a new Windows machine to the ADSM server, reg node bashful new4now backdel=y archdel=y us=none contact='Carol Thompson' or reg node tp-rickjasper new4now backdel=y archdel=y us=none contact='Rick Jasper' When defining Windows machines, you need to also associate each machine with a backup schedule, so when you register a Windows machine, also type in one of the following commands. - If this is a stationary machine, i.e. it's likely to be powered on and connected to the network overnight, define association standard daily_backup bashful - Or if this is a Thinkpad and it's usually taken home or powered off overnight, define association standard daily_thinkpad tp-rickjasper The difference is, the daily_backup schedule, schedules incremental backups starting at 8 PM, while the daily_thinkpad schedule starts at noon. ------------------------------------------------------------------------------- When the 15 tapes in the 7337 tape library get full on the TSM server, you need to "checkout" one or more of them, and checkin a new scratch tape, perhaps labelling it in the process. 
Before you begin, you may need to update the "Maximum Scratch Volumes Allowed:"
setting on your tape storage pool.
   dsmadmc -id=admin
   q stg backuptape f=d
To change,
   update stg backuptape maxscratch=50

To checkout a tape, start a TSM admin session,
   dsmadmc -id=jasper
Ensure there's at least one tape drive free, because for some reason, TSM
wants to read the tape when you check it out.  If you're having trouble
getting a drive free, perhaps because migration and/or space reclamation
processes keep starting, you can halt them temporarily.
To query the settings first,
   q stg backuppool f=d
and ensure these lines are as shown here,
   High Mig Pct: 70
   Next Storage Pool: BACKUPTAPE
To halt migration processes,
   update stg backuppool next=""
and to halt reclamations,
   update stg backuptape recl=100
(This is backwards from what you might think.  The reclaim parm is the
percentage of a tape that is free (reclaimable), not the percentage used, so
recl=100 says to do a reclaim only if a tape is 100% free.  We normally have
this set to 70%, which says reclaim when a tape is 30% used.)
To put these settings back,
   update stg backuppool next="backuptape"
   update stg backuptape recl=85

=============================================================================
| Don't confuse (like Mike and I did on 1-30-2002), the reclaim percentage, |
| with the "High Mig Pct" value you see on a "q stg" command.  They are two |
| different things.  You can only see the                                   |
|    Reclamation Threshold: 85                                              |
| with a                                                                    |
|    q stg backuptape f=d                                                   |
| command.                                                                  |
=============================================================================

To checkout a tape,
   checkout libv ibm7337 000280 checklabel=no
It reads the tape, then puts up a request (which you can see with a "q req"
command) that you must reply to,
   reply 1
Physically remove the tape from the 7337.  Don't worry if one or more of the
tape drives are in use.  You can still unlock the door and remove tapes
without affecting running processes.
Note that with checklabel=no, the tape may be inside a tape drive, not its
assigned slot.  No big deal, just remove it from the drive.
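
The pause / checkout / restore steps above can be kept together so that the
restore step is not forgotten.  What follows is only a sketch: it prints the
admin commands quoted in these notes and runs nothing.  The function name
checkout_plan and its argument order are made up for illustration; the pool
names, library name and the 85 threshold are the values used in these notes.

```shell
#!/bin/sh
# Sketch only: prints the admin commands from the notes; it does not run them.
# checkout_plan is a hypothetical helper name, not a TSM command.

checkout_plan() {
    vol=$1                # volume label to check out, e.g. 000280
    lib=${2:-ibm7337}     # library name as used in these notes
    recl=${3:-85}         # reclamation threshold to restore afterwards
    # Keep migration and reclamation from grabbing the tape drives.
    printf 'update stg backuppool next=""\n'
    printf 'update stg backuptape recl=100\n'
    # Check the volume out; TSM then posts a request to answer with "reply N".
    printf 'checkout libv %s %s checklabel=no\n' "$lib" "$vol"
    # Put the storage pool settings back.
    printf 'update stg backuppool next="backuptape"\n'
    printf 'update stg backuptape recl=%s\n' "$recl"
}

checkout_plan 000280
```

Each printed line is meant to be typed (or pasted) into a dsmadmc session; the
"reply N" step still has to be done by hand, since the request number is only
known after a "q req".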
To label and checkin a new (i.e. unlabelled) scratch tape, put the tape in the
empty slot (while you're taking out the other one is a good time), again
ensure there is a tape drive free, and enter
   label libv ibm7337 search=yes labelsource=prompt checkin=scratch
(To checkin a tape that is already labelled, use
   checkin libv ibm7337 search=yes status=scratch
)
This will scan all the slots, and for every slot that TSM believes is empty
(e.g. because you checked out the volume) but in which it finds an UNLABELLED
tape(!!), TSM will post a request that you need to reply to with the label
parameter.  E.G. a
   q req
command said
   ANR8809I 005: Please provide the label name for the volume in slot element
   0 of library IBM7337 by issuing REPLY n LABEL=xxx within 60 minutes, where
   n is the request ID and xxx is the desired label name.
to which I replied
   reply 5 label=000296

To see the result of this work, you should see with a
   q libv
command, your new volume in the library with a status of scratch, i.e.

   Library Name Volume Name Status     Owner      Last Use  Home Element
   ------------ ----------- ---------- ---------- --------- ------------
   IBM7337      000281      Private               Data      1
   IBM7337      000282      Private               Data      2
   IBM7337      000284      Private               Data      4
   IBM7337      000285      Private               Data      5
   IBM7337      000286      Private               Data      6
   IBM7337      000287      Private               Data      7
   IBM7337      000288      Private               Data      8
   IBM7337      000289      Private               Data      9
   IBM7337      000290      Private               Data      10
   IBM7337      000291      Private               Data      11
   IBM7337      000292      Private               Data      12
   IBM7337      000293      Private               Data      13
   IBM7337      000294      Private               Data      14
   IBM7337      000295      Private               Data      3
   IBM7337      000296      Scratch                         0

You don't see the new volume with a "q vol 000296" command, but the next time
some process wants a scratch tape, it'll "ask" the library for one, your new
volume will get selected to satisfy that scratch tape request, and then it
will be added as a volume.
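
To check quickly whether the library has any scratch tapes left, the q libv
listing can be filtered.  This is a rough sketch that assumes the listing has
been saved or piped as text in the column order shown above (library, volume,
status, ..., home element last); real output may wrap differently.  The name
scratch_vols is made up for this example.

```shell
#!/bin/sh
# Sketch only: filters saved "q libv" output; scratch_vols is a hypothetical name.

scratch_vols() {
    # Reads a "q libv" listing on stdin and prints "volume element" for every
    # row whose third column (Status) is Scratch.  Header and dash lines never
    # have "Scratch" in column 3, so they fall through.
    awk '$3 == "Scratch" { print $2, $NF }'
}

# Two rows copied from the listing in these notes.
scratch_vols <<'EOF'
IBM7337 000295 Private Data 3
IBM7337 000296 Scratch 0
EOF
```

If nothing is printed, there are no scratch volumes checked in and it is time
to check out a full tape and label a new one.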
If you don't see this new volume added after a minute or so, check the activity log q actlog or q actlog begint=now-00:20 or q actlog begint=16:30 and see if you got an error. Perhaps the tape write-protect tab is set. If the tape is already labelled, you'll see ANR8807W Could not write label 000305 on the volume in drive RMT0 (/dev/rmt0) of library IBM7337 because volume is already labelled 000305. in which case, you should check it in via checkin libv ibm7337 search=yes status=scratch (which is similar to my earlier checkin experiences where I learned that checkin libv ibm7337 000305 status=scratch for example, was wrong) Sure enough, this worked. After a minute or two, I saw with a q libv command, my newly-checkin-in tape in scratch status. Library Name Volume Name Status Owner Last Use Home Element ------------ ----------- ---------- ---------- --------- ------------ ... IBM7337 000305 Scratch 12 ------------------------------------------------------------------------------- If TSM ever needs one of the tapes that are currently checked out, it posts a request which you can see with a "q req" command from a TSM Admin session. E.G. >q req ANR8352I Requests outstanding: ANR8308I 003: GENERICTAPE volume 000282 is required for use in library IBM7337; CHECKIN LIBVOLUME required within 60 minutes. Since all the library slots are probably full with other tapes, you'll need to select one to checkout and check it out, e.g. >checkout libv ibm7337 000284 checklabel=no ANS8003I Process number 1591 started. In a couple of minutes, you'll see another request from this checkout command, e.g. >q req ANR8352I Requests outstanding: ANR8308I 003: GENERICTAPE volume 000282 is required for use in library IBM7337; CHECKIN LIBVOLUME required within 55 minutes. ANR8307I 004: Remove GENERICTAPE volume 000284 from slot with element number 4 of library IBM7337; issue 'REPLY' along with the request ID when ready. Reply to that second request, e.g. 
   >reply 4
and physically go out and take out the tape from the specified element (slot)
number (4 in this example - counting from left to right starting at 0, so
it's really the fifth slot from the left), and replace it with the tape you
want to check in.  The checked-out tapes are in the top overhead cabinet in
the cubicle between Rick's cubicle and the lab.

   N   N  OOO  TTTTT
   NN  N O   O   T
   N NN  O   O   T      To checkin the new tape, DO
   N   N  OOO    T      not use this command like I first did,

NO   >checkin libvolume ibm7337 000282 status=Private        NO NO NO NO !!!!
        ANS8003I Process number 1592 started.
You can see this process
     >q pro
      Process Process Description  Status
       Number
     -------- -------------------- ------------------------------------------
        1,592 CHECKIN LIBVOLUME    ANR8424I Checking in volume 000282 in
                                    library IBM7337.
But a check in the activity log after the process went away, revealed
   ANR8314E Library IBM7337 is full.
   ANR8426E CHECKIN LIBVOLUME for volume 000282 in library IBM7337 failed.
   ANR0985I Process 1592 for CHECKIN LIBVOLUME running in the BACKGROUND
            completed with completion state FAILURE at 12:28:17.
I don't understand this at all.  I then tried

NO   >checkin libvolume ibm7337 000282 search=yes status=private  NO NO NO NO !!!!
But I got
   ANR2020E CHECKIN LIBVOLUME: Invalid parameter - SEARCH.
even though the "help checkin libv" text says search is valid.  Hmmmmm.

   N   N  OOO  TTTTT
   NN  N O   O   T
   N NN  O   O   T      Also, do
   N   N  OOO    T      not use this command,

NO   >checkin libvolume ibm7337 000312 checklabel=no status=private  NO NO NO NO !!!!
That doesn't appear to work at all.  I don't understand.

   Y   Y EEEEE  SSS
    Y Y  E     S
     Y   EEE    SS
     Y   E     S  S
     Y   EEEEE  SSS

The correct commands omit the tape label parameter, i.e.
checkout libv ibm7337 000123 checklabel=no checkin libv ibm7337 search=yes status=private This starts a process in the background, >q pro Process Process Description Status Number -------- -------------------- ------------------------------------------------- 1,597 CHECKIN LIBVOLUME ANR8425I Checking in volumes in search mode in library IBM7337. After a few minutes, this process will finish and you'll see the new volume q libv Library Name Volume Name Status Owner Last Use Home Element ------------ ----------- ---------- ---------- --------- ------------ IBM7337 000282 Private Data 4 ... ------------------------------------------------------------------------------- To get DB/2 to backup to ADSM, one needs to install on the DB/2 server, the tivoli.tsm.client.api.aix43.32bit fileset. It installs stuff that's used when one tells DB/2 to dump directly to ADSM, e.g. db2 "backup database ncjapan use adsm". The most important of these, is the dsm.opt & dsm.sys files in the /usr/tivoli/tsm/client/api/bin directory. These files have to be set to talk to the TSM server. What Sandy & I did on elephant, was to simply link those two files to the normal files in the /usr/tivoli/tsm/client/ba/bin directory, I.E. lrwxrwxrwx 1 root system 20 Feb 13 08:38 dsm.opt -> ../../ba/bin/dsm.opt lrwxrwxrwx 1 root system 20 Feb 13 08:38 dsm.sys -> ../../ba/bin/dsm.sys See my aixnotes/db2 file for additional info on backing up DB/2 databases to ADSM. ------------------------------------------------------------------------------- On the Patent Server's adsmpat, dsmadmc -id=jasper -passw=new4now -server=adsmpat delete filespace ar0141e1.patent.ibm.com /images1 q libv q session ------------------------------------------------------------------------------- To do admin-y things, invoke the Admin ADSM client, dsmadmc ------------------------------------------------------------------------------- You can download the latest adsm client for AIX. 
The IBM Internal site is http://index.storsys.ibm.com/adsm/fixes/v2r1/aix/4.2/adsm.client. where you can get the base code from, as well as updates. From the Internet, you have to go to ftp://service.boulder.ibm.com/storage/tivoli-storage-management/maintenance/server/v3r7/AIX/LATEST/ where all you can get are the updates, not the base code. ------------------------------------------------------------------------------- Running dsmc from eagle, dsmc <-server=almaden> Query Backup <-INActive> file_spec Restore source_file_spec destination_file_spec e.g. restore -replace=no -subdir=yes -TAPEPrompt=Yes almaden The tapeprompt option is difficult to find info on. The default is no, which means you are not prompted for when TSM needs a tape -- your session will wait for the tape to get mounted. From an dsmadmc admin session, you can do a "q req" and see the tape mount request. TAPEPrompt=Yes does prompt you with --- Offline Media is Required --- The following object requires offline media to be mounted. Object: /home/ipnuser/cvsroot_ipsfd/... Device: TAPE Volume Label: 000312 Select an appropriate action 1. Wait for the volume to be mounted 2. Always wait for a volume to be mounted 3. Skip this object 4. Skip all objects on this volume 5. Skip all objects requiring a media to be mounted A. Abort this operation Action [1,2,3,4,5,A] : You can then get 000312 in the library or skip or abort. ------------------------------------------------------------------------------- See Rick Haeckel's hints/notes on setting up adsm on your own machine, in /afs/alm/ais/adsm (not in /afs/alm/ais/datavault, where I thought it was). ------------------------------------------------------------------------------- Some notes when Rick Haeckel set up ADSM on ssdarc01 on 10-19-95. The command to invoke ADSM is dsm & for the GUI version and dsmc for the line mode version. 
ADSM reads the files /etc/dsm.opt, which is essentially just an empty file, and /etc/dsm.sys, which points which server you're using. The interesting lines from dsm.sys are SErvername adsm TCPServeraddress adsmsrv1.almaden.ibm.com Passwordaccess generate InclExcl /some/file/name The Passwordaccess generate line will keep your ADSM password current by automatically generating a new password for you every six months. It keeps the password encrypted in the file /etc/security/adsm/ADSM. The encrypted password files will always be in the /etc/security/adsm directory, with filenames equal to whatever server you're talking to, the default being the one specified in the SErvername line in dsm.sys. The InclExcl line points to some other file that ADSM reads to further restrict and/or include other files. See Rick's documentation in the /afs/alm/ais/adsm directory. To get ADSM to run every night and do an incremental backup, you can put the following line in crontab, via a crontab -e (as root) 12 1 * * * /adsmbackup and create an adsmbackup script as follows, and of course, chmod +x it. This script by the way is in /u/haeckel/public/adsmbackup. #!/bin/ksh # Put ADSM error log in /tmp as /tmp/dsmerror.log export DSM_LOG=/tmp # annotate the output echo ------------------------------------------------- >> /tmp/adsm.log echo ADSM backup begun at $(date) >> /tmp/adsm.log echo ------------------------------------------------- >> /tmp/adsm.log # Run an incremental backup on the domains identified in /etc/dsm.opt /usr/bin/dsmc incremental -tapeprompt=no >> /tmp/adsm.log 2>&1 echo ================================================= >> /tmp/adsm.log echo ADSM backup ended at $(date) >> /tmp/adsm.log echo ================================================= >> /tmp/adsm.log ------------------------------------------------------------------------ ------------------------------------------------------------------------ A note I sent to Andras Kornai re: our afsback service and how it works. 
First of all, kornai is not currently a subscriber to our afsback service.
To subscribe, there's a quick and easy way to do it under IGOR, but since you
don't run IGOR, you can run the /afs/alm/rcf/datavault/enroll script.  That
script will ensure the ACLs under kornai's home directory are set so that
afsback can read your files (i.e. afsback rl).  It also sets ACLs for all
your subdirectories.  It then sends mail to afsback asking for kornai to be
backed up by our afsback service.  Rick Haeckel answers that mail (you don't
have to know all these details - it all happens behind the scenes), puts
kornai into the list of almost 300 other (mostly home) directories it now
backs up nightly, and starting tonight, afsback on midway backs everything up.

It's necessary to know at least some of this background 'cause to restore
from this backup, you'll need to rlogin to midway and run adsm from there.
When you do that, since you're not an IGOR user (IGOR hides this from the
user), ADSM will come up with all 300 or so directories (domains, I think
they're called) highlighted.  The first thing you'll want to do is to
deselect all but your own home directory.

The real easy way to do this is to run the IGOR script
   /afs/alm/ais/desktop/igor/rel/Tools/DataVault/DataVaultAccess
This will set up the ~/$USER.adsm (e.g. ~/jasper.adsm) file to select just
your home directory, before bringing up an xterm window to log on to midway.
>> It does this too,
   export DSM_CONFIG=$HOME/$USER.adsm
This saves you the painful task of de-selecting, one-by-one, all the other
500 some odd directories that afsback backs up nightly.

-----------------------------------------------------------------------

In order to run the dsm GUI on an AIX machine, you need these two filesets
installed, X11.Dt.lib and X11.Dt.ToolTalk.  You can get these from the
/afs/d/software/base/AIX.4.3.3 directory.
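
Before trying the dsm GUI, the two filesets can be checked with lslpp.  This
is only a sketch: it assumes "lslpp -l FILESET" returns a non-zero status when
the fileset is not installed, and the LSLPP variable is a made-up override
hook (not an AIX convention) so the function can be tried on a machine without
lslpp.

```shell
#!/bin/sh
# Sketch only.  LSLPP is a hypothetical override hook, not an AIX convention;
# missing_filesets is a made-up helper name.
LSLPP=${LSLPP:-lslpp}

missing_filesets() {
    # Prints each named fileset that "lslpp -l" does not report as installed
    # (assumes lslpp exits non-zero for a fileset that is not installed).
    for fs in "$@"; do
        "$LSLPP" -l "$fs" >/dev/null 2>&1 || printf '%s\n' "$fs"
    done
}

# Only run the real check where lslpp exists (i.e. on AIX).
if command -v lslpp >/dev/null 2>&1; then
    missing_filesets X11.Dt.lib X11.Dt.ToolTalk
fi
```

No output means both filesets are already there; anything printed still needs
to be installed from the AIX.4.3.3 directory mentioned above.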
-------------------------------------------------------------------------------

To register an admin id
   register admin gleddie new4now contact='707920 Gleddie, Darrel'
   grant auth gleddie cl=sy
or
   register admin mike new4now contact='Mike Crom - Lisle'
   grant auth mike cl=sy

To reset a locked id 'cause you typed in too many wrong passwords,
   q admin                        To look at it first
   unlock admin jasper            To unlock it
   update admin mike new4you      To change the password

-------------------------------------------------------------------------------

In February, 2002, the ADSM server was getting strange errors when using its
tape drives.  The ACTLOG had things like
   02/04/02 12:11:13  ANR8779E Unable to open drive /dev/rmt0, error
                      number=46.
Error number 46 says drive is in use, but it wasn't.  We even did a reboot and
played around with it via tar's and such, and it was working fine.
Later, we were even getting errors trying to use RMT1.
   02/04/02 05:17:45  ANR8413E UPDATE DRIVE: Drive RMT1 is currently in use.

Another way to look at the status of the drives, is
   Q DRIVE
We did this on 12-8-2003 after Friday's power outage, and saw

   Library Name Drive Name   Device Type Device       ON LINE
   ------------ ------------ ----------- ------------ -------------------
   IBM7337      RMT0         GENERICTAPE /dev/rmt0    Yes
   IBM7337      RMT1         GENERICTAPE /dev/rmt1    Unavailable Since
                                                       12/05/03 13:53:40

Mike finally fixed things by "deleting rmt0, then adding it back in again with
the same parameters".  What I think he means is, he issued the following ADSM
commands,
   DELETE DRIVE IBM7337 RMT0
   DEFINE DRIVE IBM7337 RMT0 DEVICE=/dev/rmt0 ELEMENT=116 ONLINE=YES
   AUDIT LIBRARY IBM7337 CHECKLABEL=YES
Or for rmt1, it's,
   DELETE DRIVE IBM7337 RMT1
   DEFINE DRIVE IBM7337 RMT1 DEVICE=/dev/rmt1 ELEMENT=117 ONLINE=YES
   AUDIT LIBRARY IBM7337 CHECKLABEL=YES

Or to delete & redefine the whole bloody thing.
In TSM 4.1, the sequence was DELETE DRIVE IBM7337 RMT0 DELETE DRIVE IBM7337 RMT1 DELETE LIBRARY IBM7337 DEFINE LIBRARY IBM7337 LIBTYPE=SCSI DEVICE=/dev/lb0 DEFINE DRIVE IBM7337 RMT0 DEVICE=/dev/rmt0 ELEMENT=116 ONLINE=YES DEFINE DRIVE IBM7337 RMT1 DEVICE=/dev/rmt1 ELEMENT=117 ONLINE=YES (From http://people.bu.edu/rbs/ADSM.funcdir, here's how they said to do this, ( DEFINE LIBRARY IBM7337 LIBTYPE=SCSI DEVICE=/dev/lb0 ) ( DEFINE DRIVE IBM7337 RMT0 DEVICE=/dev/mt0 ELEMENT=116 ONLINE=YES ) ( DEFINE DRIVE IBM7337 RMT1 DEVICE=/dev/mt1 ELEMENT=117 ONLINE=YES ) If no tape volume was moved around, you can do AUDIT LIBRARY IBM7337 CHECKLABEL=YES else to force the library to read all tape labels for their volids, CHECKIN LIBVOLUME IBM7337 SEARCH=YES STATUS=PRIVATE But now in TSM 5.1, the sequence is, DELETE PATH ADSMSRV1 RMT0 SRCTYPE=SERVER DESTTYPE=DRIVE LIBRARY=IBM7337 DELETE PATH ADSMSRV1 RMT1 SRCTYPE=SERVER DESTTYPE=DRIVE LIBRARY=IBM7337 DELETE DRIVE IBM7337 RMT0 DELETE DRIVE IBM7337 RMT1 DELETE PATH ADSMSRV1 IBM7337 SRCTYPE=SERVER DESTTYPE=LIBRARY DELETE LIBRARY IBM7337 DEFINE LIBRARY IBM7337 LIBTYPE=SCSI DEFINE PATH ADSMSRV1 IBM7337 SRCTYPE=SERVER DESTTYPE=LIBRARY DEVICE=/dev/lb0 DEFINE DRIVE IBM7337 RMT0 ELEMENT=116 DEFINE DRIVE IBM7337 RMT1 ELEMENT=117 DEFINE PATH ADSMSRV1 RMT0 SRCTYPE=SERVER DESTTYPE=DRIVE LIBRARY=IBM7337 DEVICE=/dev/rmt0 DEFINE PATH ADSMSRV1 RMT1 SRCTYPE=SERVER DESTTYPE=DRIVE LIBRARY=IBM7337 DEVICE=/dev/rmt1 If no tape volume was moved around, you can try AUDIT LIBRARY IBM7337 CHECKLABEL=YES but if a q libv command doesn't show any tape volumes, you'll have to force the library to read all tape labels for their volids, with this command CHECKIN LIBVOLUME IBM7337 SEARCH=YES STATUS=PRIVATE which takes about 33 minutes to load and read the volid of all 15 tapes using just 1 drive. And now you're done. 
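
The TSM 5.1 delete/redefine sequence above is long enough to mistype, so here
is a sketch that prints it from a few parameters.  It only generates text in
the same order as the commands listed above and runs nothing.  The function
name redefine_cmds and the NAME:ELEMENT:DEVICE argument format are inventions
for this sketch, not TSM syntax.

```shell
#!/bin/sh
# Sketch only: prints the TSM 5.1 sequence from these notes; runs nothing.
# redefine_cmds and its NAME:ELEMENT:DEVICE arguments are hypothetical.

redefine_cmds() {
    srv=$1; lib=$2; libdev=$3
    shift 3
    # Tear down: drive paths, drives, library path, library.
    for d in "$@"; do
        printf 'DELETE PATH %s %s SRCTYPE=SERVER DESTTYPE=DRIVE LIBRARY=%s\n' \
            "$srv" "${d%%:*}" "$lib"
    done
    for d in "$@"; do
        printf 'DELETE DRIVE %s %s\n' "$lib" "${d%%:*}"
    done
    printf 'DELETE PATH %s %s SRCTYPE=SERVER DESTTYPE=LIBRARY\n' "$srv" "$lib"
    printf 'DELETE LIBRARY %s\n' "$lib"
    # Rebuild: library, library path, drives, drive paths.
    printf 'DEFINE LIBRARY %s LIBTYPE=SCSI\n' "$lib"
    printf 'DEFINE PATH %s %s SRCTYPE=SERVER DESTTYPE=LIBRARY DEVICE=%s\n' \
        "$srv" "$lib" "$libdev"
    for d in "$@"; do
        rest=${d#*:}                      # ELEMENT:DEVICE
        printf 'DEFINE DRIVE %s %s ELEMENT=%s\n' "$lib" "${d%%:*}" "${rest%%:*}"
    done
    for d in "$@"; do
        printf 'DEFINE PATH %s %s SRCTYPE=SERVER DESTTYPE=DRIVE LIBRARY=%s DEVICE=%s\n' \
            "$srv" "${d%%:*}" "$lib" "${d##*:}"
    done
}

redefine_cmds ADSMSRV1 IBM7337 /dev/lb0 RMT0:116:/dev/rmt0 RMT1:117:/dev/rmt1
```

With the arguments shown, the output should match the twelve commands listed
above; the AUDIT LIBRARY or CHECKIN LIBVOLUME step afterwards is left to do by
hand, since which one is needed depends on what "q libv" shows.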
A Q PATH command now shows Source Name Source Type Destination Destination On-Line Name Type ----------- ----------- ----------- ----------- ------- ADSMSRV1 SERVER IBM7337 LIBRARY Yes ADSMSRV1 SERVER RMT0 DRIVE Yes ADSMSRV1 SERVER RMT1 DRIVE Yes If need be, you may have to cancel already-running processes and maybe even halt migration or space reclamation processes from starting up by update stg backuppool next="" update stg backuptape recl=100 To put these settings back, update stg backuppool next="backuptape" update stg backuptape recl=85 On 5-11-2005, the DEFINE DRIVE command gave errors complaining that 116 was not a valid element address. The problem was with the newly-replaced drive. For some reason, the library was failing to see that drive so as far as the library was concerned, there was no drive at element 116, which was confirmed by a /usr/tivoli/tsm/devices/bin/lbtest -d /dev/lb0 command, which can only be run when TSM is down so you can open the library. I think the sequence was Enter selection: 1 <-- For Manual test enter selection: 6 <-- To open enter selection: 10 <-- For ioctl return library inventory and to exit, enter selection: 99 <-- To return to main menu Enter selection: 9 <-- To Exit lbtest ------------------------------------------------------------------------------- One day, all the tape volumes got full again and I got a bit lazy by choosing a volume (000296) that had virtually no data on it, and just deleting it all. The commands to do this were delete volume 000296 DISCARDDATA=YES This caused 000296 to go into the "Empty" state, tsm: ADSMSRV1>q v 000296 Volume Name Storage Device Estimated Pct Volume Pool Name Class Name Capacity Util Status (MB) ------------------------ ----------- ---------- --------- ----- -------- 000296 BACKUPTAPE TAPE 0.0 0.0 Empty I then had a hell of a time to get TSM to actually use this tape as a scratch tape. 
I think, but I'm not sure, that what finally did it, was an AUDIT LIBRARY IBM7337 CHECKLABEL=YES ------------------------------------------------------------------------------- When a tape is in an "Error State", which you can only see when you do a q v 000201 f=d command, it will say In Error State?: Yes You can reset it by this command update vol 000201 access=readwrite For the case where the volume is in an error state but not currently in the library, e.g. when you do the q vol ... f=d command, it shows both Access: Unavailable and In Error State?: Yes doing the update vol ... access=readwrite command resets both to Access: Read/Write and In Error State?: No which is wrong, so to fix it, also do this command update vol 000201 access=unavailable and now both states are correct, namely Access: Unavailable and In Error State?: No ------------------------------------------------------------------------------- To label a new batch of tapes: Load tapes in tape library. cd /usr/tivoli/tsm/server/bin vi vol.labels and put each label in its own line ./dsmlabel -library=/dev/lb0 -drive=/dev/rmt0,116 -overwrite -search /dev/console 2>&1 & ------------------------------------------------------------------------------- Got a library problem. The ADSM "q actlog" showed it as 12/08/03 14:08:00 ANR8300E I/O error on library IBM7337 (OP=00006C03, CC=304, KEY=04, ASC=15, ASCQ=83, SENSE=70.00.04.00.00.00.00.0A.00.00.00.00.15.83.31.00.00- .00., Description=Changer failure). Refer to Appendix B in the 'Messages' manual for recommended action. 
and AIX's errpt -a said LABEL: ADSM_DD_LOG2 IDENTIFIER: 5680E405 Date/Time: Mon Dec 8 14:08:00 Sequence Number: 84472 Machine Id: 00059295A100 Node Id: reindeer Class: H Type: PERM Resource Name: lb0 Resource Class: library Resource Type: ADSM-SCSI-LB Location: X0-02-01-6,0 Description STORAGE SUBSYSTEM FAILURE Probable Causes ATTACHED SCSI TARGET DEVICE SCSI ADAPTER Failure Causes ATTACHED SCSI TARGET DEVICE SCSI ADAPTER Recommended Actions RUN DIAGNOSTICS AGAINST THE FAILING DEVICE CHECK PHYSICAL INSTALLATION CHECK FOR CORRECT MICROCODE FIX CONTACT APPROPRIATE SERVICE REPRESENTATIVE Detail Data COMMAND 0C06 0000 A500 0084 0003 0074 0000 0000 STATUS CODE 0000 0000 SENSE DATA 0102 0000 7000 0400 0000 000A 0000 0000 1583 3100 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 ERROR CODE 0 RETURN CODE 0 REFERENCE CODE 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 Our 7337 tape library is serial # 10-4576B. I called it in (reference # 31 TP LQ 3) ------------------------------------------------------------------------------- December 9, 2003 Some 7337 things I learned. See my w3/~jasper/7337Notes.html, which I stole from http://www.gruftie.net/ibm/tl/techlib/qna/sfam/html/FB/FB5197L.htm. Tape slots are 0-9 in the front, 10-14 in the back, left-to-right. The left-hand tape drive is numbered 116, is drive A, rmt0, SCSI ID 4. The right-hand tape drive is numbered 117, is drive B, rmt1, SCSI ID 5. The /dev/smc0 is the "IBM 7337 Tape Medium Changer" and is SCSI ID 6. IBM has a tapeutil command that can be used to manipulate the tape library. 
To load a tape from slot 11 to drive A,
   tapeutil -f/dev/smc0 move -s 11 -d 116
And to put it back, you need to first unload it, then move it back
   tapeutil -f/dev/rmt0 unload          <=== Note this is rmt0, not smc0!
   tapeutil -f/dev/smc0 move -s 116 -d 11

There's also another device associated with the 7337, /dev/lb0.  This is
created by going through
   "smitty devices"
   "Tivoli Storage Manager Devices"
   "Library/MediumChanger"
   "Add a Library/MediumChanger"
There's only one "Library/MediumChanger Type", "ADSM-SCSI-LB".
On reindeer, the 7337 is attached to vscsi1 (00-02-01).
The "CONNECTION address" is typically "6,0".
Under the covers, this does the following mkdev command,
   mkdev -c library -t 'ADSM-SCSI-LB' -s 'scsi' -p 'vscsi1' -w '6,0'

Today though, I was getting this error message trying to recreate /dev/lb0:
   Method error (/etc/methods/cfgadsmdd):
           0514-047 Cannot access a device.
   cfgadsmdd[get_i]: ioctl SCIO(L)START failed, errno = 22
   cfgadsmdd[inq..]: free dds.
   cfgadsmdd[main ]: error inquiry or building dds,rc=47
But I upgraded AIX to ML11 and reinstalled the latest Atape driver, and then
the mkdev command above worked, so now I had a /dev/lb0.

But then the power went out and when it came back, lb0 was gone.  Now when I
do the mkdev, I get a slightly different error message,
   Method error (/etc/methods/cfgadsmdd):
           0514-047 Cannot access a device.
   cfgadsmdd[get_i]: ioctl SCIO(L)INQU first call failed, retry if possible,
                     errno = 19
   cfgadsmdd[inq..]: free dds.
   cfgadsmdd[main ]: error inquiry or building dds,rc=47
Later it worked, but the tapeutil commands above still don't work.

-------------------------------------------------------------------------------

December 11, 2003

I tried upgrading TSM from 4.1.2.0 to 5.1.0.0.  The upgrade instructions talk
about doing backups first, so I first figured out what kind of backups Darrel
defined for us.
To see them, get under a dsmadmc session and

   q schedule type=a

It shows there are 5 Administrative tasks defined,

   * Schedule Name    Start Date/Time      Duration Period Day
   - ---------------- -------------------- -------- ------ ---
     DAILY_DB_INCR    02/05/01 06:00:00    1 H      1 D    Any
     DAILY_EXPIRE_INV 02/05/01 09:00:00    1 H      1 D    Any
   * MONTHLY_DB_FULL  02/01/01 05:59:00    1 H      1 Mo   Any
     MONTHLY_DEL_VOLH 02/01/01 10:00:00    1 H      1 W    Any
     WEEKLY_DB_BACKUP 06/10/01 05:58:59    1 H      1 W    Sun

A deeper look via q schedule type=a f=d, shows

                 Schedule Name: DAILY_DB_INCR
                   Description: backup db daily
                       Command: backup db dev=db_backup type=incr
                      Priority: 5
               Start Date/Time: 02/05/01 06:00:00
                      Duration: 1 Hour(s)
                        Period: 1 Day(s)
                   Day of Week: Any
                    Expiration:
                       Active?: Yes
Last Update by (administrator): GLEDDIE
         Last Update Date/Time: 02/05/01 16:15:37
              Managing profile:

                 Schedule Name: DAILY_EXPIRE_INV
                   Description: daily expire inventory
                       Command: expire inventory
                      Priority: 5
               Start Date/Time: 02/05/01 09:00:00
                      Duration: 1 Hour(s)
                        Period: 1 Day(s)
                   Day of Week: Any
                    Expiration:
                       Active?: Yes
Last Update by (administrator): GLEDDIE
         Last Update Date/Time: 02/05/01 17:00:22
              Managing profile:

                 Schedule Name: MONTHLY_DB_FULL
                   Description: full db backup
                       Command: backup db dev=db_backup type=full
                      Priority: 5
               Start Date/Time: 02/01/01 05:59:00
                      Duration: 1 Hour(s)
                        Period: 1 Month(s)
                   Day of Week: Any
                    Expiration: 06/09/01 23:59:59
                       Active?: No
Last Update by (administrator): KEVIN
         Last Update Date/Time: 06/14/01 09:36:49
              Managing profile:

                 Schedule Name: MONTHLY_DEL_VOLH
                   Description: deletes db backup volhistory
                       Command: delete volhistory todate=today-5 t=all
                      Priority: 5
               Start Date/Time: 02/01/01 10:00:00
                      Duration: 1 Hour(s)
                        Period: 1 Week(s)
                   Day of Week: Any
                    Expiration:
                       Active?: Yes
Last Update by (administrator): KEVIN
         Last Update Date/Time: 06/22/01 14:16:06
              Managing profile:

                 Schedule Name: WEEKLY_DB_BACKUP
                   Description: Full weekly backup of TSM database
                       Command: backup db dev=db_backup type=full
                      Priority: 5
               Start Date/Time: 06/10/01 05:58:59
                      Duration: 1 Hour(s)
                        Period: 1 Week(s)
                   Day of Week: Sunday
                    Expiration:
                       Active?: Yes
Last Update by (administrator): KEVIN
         Last Update Date/Time: 06/25/01 10:51:29
              Managing profile:

At least that tells me the format of the command to do a full backup, which
is what the book is telling me to do. Instead of

   backup db type=full devclass=tapeclass

I should type

   backup db type=full dev=db_backup

What's devclass you ask? A q devclass command says

   q devclass

   Device    Device     Storage Device      Format Est/Max  Mount
   Class     Access     Pool    Type               Capacity Limit
   Name      Strategy   Count                      (MB)
   --------- ---------- ------- ----------- ------ -------- ------
   DB_BACKUP Sequential 0       FILE               620.0    1
   DISK      Random     3
   TAPE      Sequential 1       GENERICTAPE DRIVE  30,720.0 DRIVES

And a q devclass db_backup f=d command said

             Device Class Name: DB_BACKUP
        Device Access Strategy: Sequential
            Storage Pool Count: 0
                   Device Type: FILE
                        Format:
         Est/Max Capacity (MB): 620.0
                   Mount Limit: 1
              Mount Wait (min):
         Mount Retention (min):
                  Label Prefix:
                       Library:
                     Directory: /tsmdb_backup
                   Server Name:
                  Retry Period:
                Retry Interval:
Last Update by (administrator): GLEDDIE
         Last Update Date/Time: 02/05/01 15:44:15

So that's how it finally knows about /tsmdb_backup.

The bottom line is, to fully back up TSM before the conversion, I typed these
commands from a dsmadmc session,

   backup db type=full dev=db_backup

      This wrote /tsmdb_backup/71173586.DBB (527,281,325 bytes)

   backup devconfig filenames=/tsmdb_backup/devconfig

      This wrote only 7 lines, versus 22 lines in an old dsmserv.devconv
      file I found in /tsmdb_backup. The extra lines were the 15 lines that
      look like comments, giving what tapes are in the 15 slots of the 7337
      tape library. Since TSM doesn't have access to the tape library now,
      I guess these lines are not saved.
   backup volhistory filenames=/tsmdb_backup/volhistory

      (19 lines)

I also squirreled away more stuff in the /tsmdb_backup directory,

   cp -p /usr/tivoli/tsm/server/bin/dsmserv.opt dsmserv.opt.2
   cp -p /usr/tivoli/tsm/server/bin/setup.macro .
   cp -p /usr/tivoli/tsm/server/bin/adsm-machines .
   cp -p /usr/tivoli/tsm/server/bin/adsm.cmd .
   cp -p /usr/tivoli/tsm/server/bin/rc.adsmserv rc.adsmserv.old
   cp -p /usr/tivoli/tsm/server/bin/nodelock .
   cp -p /etc/inittab .

Useful dsmadmc commands to do beforehand so you know what you've got:

   q license
   q stg
   q libv
   q drive
   q devclass f=d

and probably others I'm not remembering right now.

The conversion to the next level was pretty easy. Insert the CD and

1) Install the software via

      smitty install_update

2) To migrate to the new level,

      cd /usr/tivoli/tsm/server/bin
      ./dsmserv runfile /usr/tivoli/tsm/server/webimages/dsmserv.idl

3) I don't think this next step was really worthwhile 'cause of my
   ignorance of what these "sample command scripts" are, but I did it
   anyway.

      ./dsmserv runfile /usr/tivoli/tsm/server/webimages/scripts.smp

   These scripts they say "can be loaded into the database and run from an
   administrative client, administrative Web interface, or server console."
   They "are primarily SELECT queries, but also include scripts that define
   volumes for and extend the database and recovery log and that back up
   storage pools." Whatever.

4) Start the server via

      ./dsmserv

They say my licenses are no longer valid and I have to redefine them. I
wonder what I had before ... I should have done a Q LICENSE command.

Later ... Turns out I DID have to do something special about the licenses,
'cause the TSM server ACTLOG kept bitching that

   ANR2841W Server is NOT IN COMPLIANCE with license terms.

I needed to go back to the install CD and specifically install the
tivoli.tsm.license.cert fileset, which evidently is not installed by default.
I had to use smitty's "Install and Update from ALL Available Software", not
the more normal "Install and Update from LATEST Available Software", else you
don't see the tivoli.tsm.license.cert fileset.

Once the tivoli.tsm.license.cert fileset was installed, the *.lic files were
in the /usr/tivoli/tsm/server/bin directory, so all I had to do from a
dsmadmc session, was

   register license file=mgsyslan.lic number=50

and now a q license command says we are now in compliance.

=========================================================================
For historical records, here was my README from when I obtained and
installed TSM 4.1.2 in January, 2001.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
These files were ftp'd from index.storsys.ibm.com on 1-19-2001.

The AIX client code came from the

   /tivoli-storage-management/maintenance/client/v4r1/AIX/v412

directory.

The Windows client code came from the

   /tivoli-storage-management/maintenance/client/v4r1/Windows/i386/v412

directory.

-----------------------------------------------------------------------
The server code updates came from the

   /tivoli-storage-management/maintenance/client/v4r1/Windows/i386/v412

directory, but I had to get the base 4.1.0.0 server code from the
IBM-internal, shasta.sanjose.ibm.com server, from the

   /tivoli-storage-management/releases/server/v4r1/AIX

directory, before any of the 4.1.x.0 updates would apply.

-----------------------------------------------------------------------
Normally, all one needs to install on an AIX client machine, is

   tivoli.tsm.client.ba.aix43.32bit
   tivoli.tsm.client.ba.aix43.32bit.base and
   tivoli.tsm.client.ba.aix43.32bit.common

Not everything else.

-----------------------------------------------------------------------
In order to run the dsm GUI on an AIX machine, you need these two filesets
installed, X11.Dt.lib and X11.Dt.ToolTalk. You can get these from the
/afs/d/software/base/AIX.4.3.3 directory.
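A quick way to see which of the filesets named above are still missing on a client is to ask lslpp about each one. This is only a sketch: it assumes lslpp -l exits non-zero for a fileset that isn't installed, and the LSLPP variable and the missing_filesets name are my own inventions so the logic can be tried on a machine without lslpp. I left the bare tivoli.tsm.client.ba.aix43.32bit name out of the usage line since I'm not sure lslpp accepts it without a wildcard.

```shell
#!/bin/sh
# Sketch only: report which filesets are NOT installed.
# LSLPP is a hypothetical override hook; on a real AIX box it defaults to lslpp.
LSLPP=${LSLPP:-lslpp}

# missing_filesets FILESET...
# Prints one line for every fileset that "lslpp -l" cannot find.
# Assumes lslpp -l returns a non-zero exit status for an uninstalled fileset.
missing_filesets() {
    for fs in "$@"; do
        "$LSLPP" -l "$fs" >/dev/null 2>&1 || echo "$fs"
    done
}
```

For example, missing_filesets tivoli.tsm.client.ba.aix43.32bit.base tivoli.tsm.client.ba.aix43.32bit.common X11.Dt.lib X11.Dt.ToolTalk prints nothing when everything is in place.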
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
My sample.dsm.opt was

   SErvername adsmsrv1

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
My sample.dsm.sys

   SErvername adsmsrv1
      COMMmethod         TCPip
      TCPPort            1500
      TCPServeraddress   adsmsrv1
      TCPBuffsize        32
      TCPWindowsize      24
      Inclexcl           /usr/tivoli/tsm/client/ba/bin/inclexcl.dsm
      ERRORLOGRetention  5
      Passwordaccess     generate
      Nodename           jasper

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
And my cat sample.inclexcl.dsm

   EXclude /usr/vice/cache/*
   EXclude /usr/vice/cache/.../*
   EXclude /var/vice/cache/*
   EXclude /var/vice/cache/.../*
   EXclude core
   EXClude /.../.netscape/cache/*
   EXClude /.../.netscape/cache/.../*

-------------------------------------------------------------------------------
This came from my aixnotes/afs file ...

I've got the Delphion AFS being backed up nightly from reindeer (the
ADSM/TSM server). To restore,

   tn reindeer and login as root
   dsmc

which is set up to look at the AFS backups by default. To query a file,

   q backup /afs/delphion.com/FakeDFS/ips/converters/xml2db.sh

or if this gives you

   ANS1092E No files matching search criteria were found

then try

   q backup /afs/delphion.com/FakeDFS/ips/converters/xml2db.sh -inactive

To restore,

   restore /afs/delphion.com/FakeDFS/ips/converters/xml2db.sh /xml2db.sh

or

   restore /afs/delphion.com/FakeDFS/ips/converters/xml2db.sh /xml2db.sh -inactive

If you get messages when restoring like so,

   ** Interrupted **
   ANS1114I Waiting for mount of offline media.
   ANS4035W File '/xml2db.sh' currently unavailable on server.

the tape the file is on, must not be in the server or something else is
wrong. Contact somebody that knows what they're doing (Mike Crom or Rick
Jasper).
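The query-then-restore steps above could be wrapped up roughly like this. It is a sketch under assumptions, not something that was ever run on reindeer: it calls dsmc in batch mode instead of from the interactive prompt, it assumes Passwordaccess generate so dsmc never prompts, and it decides "no backup" purely by spotting ANS1092E in the query output. The DSMC variable and the has_backup and restore_file names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: restore a file, falling back to inactive versions.
# DSMC is a hypothetical override hook; it defaults to the real dsmc client.
DSMC=${DSMC:-dsmc}

# has_backup FILE [EXTRA_DSMC_OPTIONS...]
# Succeeds unless the query output contains ANS1092E (no files matching).
# Any other dsmc failure is NOT detected here; the restore itself would fail.
has_backup() {
    hb_file=$1
    shift
    hb_out=$("$DSMC" query backup "$hb_file" "$@" 2>&1)
    case $hb_out in
        *ANS1092E*) return 1 ;;
        *)          return 0 ;;
    esac
}

# restore_file SOURCE DESTINATION
# Tries the active backup first, then the inactive versions.
restore_file() {
    src=$1
    dest=$2
    if has_backup "$src"; then
        "$DSMC" restore "$src" "$dest"
    elif has_backup "$src" -inactive; then
        "$DSMC" restore "$src" "$dest" -inactive
    else
        echo "No active or inactive backup found for $src" >&2
        return 1
    fi
}
```

Usage would be something like restore_file /afs/delphion.com/FakeDFS/ips/converters/xml2db.sh /xml2db.sh. If the restore then stalls with ANS1114I or ANS4035W, the tape problem described above still applies and the script can't help.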