SMAP(1) smap 1.0 (December 2005) SMAP(1)
NAME
smap - graphically view information about SLURM jobs,
partitions, and set configuration parameters.
SYNOPSIS
smap [OPTIONS...]
DESCRIPTION
smap is used to graphically view job, partition and node
information for a system running SLURM. Note that
information about nodes and partitions to which a user lacks
access will always be displayed to avoid obvious gaps in the
output. This is equivalent to the --all option of the sinfo
and squeue commands.
OPTIONS
--help
Print a message describing all smap options.
--usage
Print a brief message listing the smap options.
-D , --display=
Sets the display mode for smap. Each mode shows relevant
information about a specific view and displays a
corresponding node chart. While in any display, a user
can switch views by typing a different view letter. The
exception is configure mode, where typing 'quit' exits
configure mode only and typing 'exit' ends configure
mode and exits smap. Note that unallocated nodes are
indicated by a '.' and DOWN or DRAINED nodes by a '#'.
j Displays information about jobs running
on the system.
s Displays information about SLURM
partitions on the system.
b Displays information about BGL
partitions on the system.
c Displays current node states and allows
users to configure the system.
-R , --resolve=
Returns the XYZ coords for a Rack/Midplane id or
vice-versa.
To get the XYZ coord for a Rack/Midplane id, input -R
R101, where 10 is the rack and 1 is the midplane.
To get the Rack/Midplane id from an XYZ coord, input -R
101, where X=1 Y=0 Z=1, with no leading 'R'.
-h, --noheader
Do not print a header on the output.
-c, --commandline
Print output to the command line instead of using curses.
-p, --parse
Used with the -c (--commandline) option. Do not format
the output; send only single tab-delimited output to stdout.
-i , --iterate=
Print the state on a periodic basis. Sleep for the
indicated number of seconds between reports. The user can
exit at any time by typing 'q' or pressing the return
key. In configure mode, type 'exit' to exit the
program or 'quit' to exit configure mode only.
-V, --version
Print version information and exit.
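
The two argument forms accepted by --resolve can be told apart by the
leading 'R'. The following Python sketch only classifies an argument
and splits it into its parts; it does not perform the lookup itself,
which depends on how the machine is wired. The function name
parse_resolve_arg is hypothetical, and the rule that the last digit of
a Rack/Midplane id is the midplane is an assumption drawn from the
"R101" example above.

```python
import re


def parse_resolve_arg(arg):
    """Classify a --resolve argument without resolving it.

    Returns ("rack_midplane", rack, midplane) or ("xyz", x, y, z).
    """
    match = re.fullmatch(r"R([0-9]+)([0-9])", arg)
    if match:
        # Assumption: the last digit is the midplane and the digits
        # before it name the rack, as in "R101" -> rack "10", midplane 1.
        return ("rack_midplane", match.group(1), int(match.group(2)))
    match = re.fullmatch(r"([0-9])([0-9])([0-9])", arg)
    if match:
        # One decimal digit per dimension, read positionally as X, Y, Z.
        x, y, z = (int(digit) for digit in match.groups())
        return ("xyz", x, y, z)
    raise ValueError(f"not a Rack/Midplane id or XYZ coord: {arg!r}")
```

The rack is kept as a string because this page does not say whether
its digits form one number or separate row and column indices.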
INTERACTIVE OPTIONS
When using smap in curses mode you can scroll through the
different windows using the arrow keys. The up and down
arrow keys scroll the window containing the grid, and the
left and right arrow keys scroll the window containing the
text information.
OUTPUT FIELD DESCRIPTIONS
AVAIL
Partition state: up or down.
BGL_BLOCK
BGL Block Name.
CONN Connection Type: TORUS or MESH or SMALL (for small
blocks).
ID Key to identify the nodes associated with this entity
in the node chart.
MODE Mode Type: COPROCESS or VIRTUAL.
NAME Name of the job.
NODELIST
Names of nodes associated with this
configuration/partition.
NODES
Count of nodes with this particular configuration.
PARTITION
Name of a partition. Note that the suffix "*"
identifies the default partition.
ST State of a job in compact form. Possible states
include: PD (pending), R (running), S (suspended), CG
(completing), CD (completed), F (failed), TO
(timeout), and NF (node failure). See JOB STATE CODES
section below for more information.
STATE
State of the nodes. Possible states include: down,
unknown, idle, allocated, drained, draining, completing
and their abbreviated forms: down, unk, idle, alloc,
drain, drng, and comp respectively. Note that the
suffix "*" identifies nodes that are presently not
responding. See NODE STATE CODES section below for
more information.
TIMELIMIT
Maximum time limit for any user job in
days-hours:minutes:seconds. infinite is used to
identify jobs or partitions without a job time limit.
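
When post-processing smap output, the TIMELIMIT format can be
converted to seconds. The Python sketch below is one way to do so, not
part of smap. The function name timelimit_to_seconds is hypothetical,
and it assumes that shorter values drop fields from the left (so
"43:12" means minutes:seconds).

```python
def timelimit_to_seconds(text):
    """Convert days-hours:minutes:seconds to seconds; None for 'infinite'."""
    if text == "infinite":
        return None
    days = 0
    if "-" in text:
        day_part, text = text.split("-", 1)
        days = int(day_part)
    parts = [int(part) for part in text.split(":")]
    if len(parts) not in (2, 3):
        raise ValueError(f"unsupported time value: {text!r}")
    # Assumption: missing fields are omitted from the left, so a
    # two-field value is minutes:seconds.
    hours, minutes, seconds = [0] * (3 - len(parts)) + parts
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds
```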
The node chart is designed to indicate relative locations of
the nodes. On most Linux clusters this will represent a
one-dimensional array of nodes. Larger clusters will utilize
multiple lines as needed, with the right side of one line being
logically followed by the left side of the next line.
On Blue Gene systems, the node chart will indicate the
three-dimensional topology of the system.
The X dimension will increase from left to right on a given line.
The Y dimension will increase in planes from bottom to top.
The Z dimension will increase within a plane from the back
line to the front line of a plane.
Note the example below:
a a a a b b d d
a a a a b b d d
a a a a b b c c
a a a a b b c c

a a a a b b d d
a a a a b b d d
a a a a b b c c
a a a a b b c c

a a a a . . d d
a a a a . . d d
a a a a . . e e       Y
a a a a . . e e       |
                      |
a a a a . . d d       0----X
a a a a . . d d      /
a a a a . . . .     /
a a a a . . . #    Z

ID JOBID PARTITION USER   NAME ST  TIME NODES NODELIST
a  12345 batch     joseph tst1 R  43:12    64 bgl[000x333]
b  12346 debug     chris  sim3 R  12:34    16 bgl[420x533]
c  12350 debug     danny  job3 R   0:12     8 bgl[622x733]
d  12356 debug     dan    colu R  18:05    16 bgl[600x731]
e  12378 debug     joseph asx4 R   0:34     4 bgl[612x713]
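
The relationship between the NODELIST column and the node chart can be
checked with a short Python sketch. It assumes that an expression such
as bgl[000x333] names the box of nodes between the corner XYZ
coordinates 000 and 333, one decimal digit per dimension. That reading
is consistent with the NODES counts in the example above. The function
names parse_range and render_chart are hypothetical, and DOWN nodes
('#') are not modelled.

```python
import re
from itertools import product


def parse_range(nodelist):
    """Expand a box such as 'bgl[000x333]' into a list of (x, y, z)."""
    match = re.fullmatch(
        r"[A-Za-z_]*\[(\d)(\d)(\d)x(\d)(\d)(\d)\]", nodelist)
    if match is None:
        raise ValueError(f"unsupported node list: {nodelist!r}")
    x0, y0, z0, x1, y1, z1 = (int(digit) for digit in match.groups())
    return list(product(range(x0, x1 + 1),
                        range(y0, y1 + 1),
                        range(z0, z1 + 1)))


def render_chart(jobs, dims):
    """Draw the chart for jobs, a list of (id_letter, nodelist) pairs."""
    size_x, size_y, size_z = dims
    owner = {}
    for letter, nodelist in jobs:
        for coord in parse_range(nodelist):
            owner[coord] = letter
    planes = []
    for y in reversed(range(size_y)):      # Y grows from bottom to top
        lines = []
        for z in range(size_z):            # Z grows from back line to front
            lines.append(" ".join(owner.get((x, y, z), ".")
                                  for x in range(size_x)))
        planes.append("\n".join(lines))
    return "\n\n".join(planes)


EXAMPLE_JOBS = [
    ("a", "bgl[000x333]"),
    ("b", "bgl[420x533]"),
    ("c", "bgl[622x733]"),
    ("d", "bgl[600x731]"),
    ("e", "bgl[612x713]"),
]
```

Calling render_chart(EXAMPLE_JOBS, (8, 4, 4)) draws the four planes of
the example from top (Y=3) to bottom (Y=0), with unallocated nodes
shown as '.'.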
CONFIGURATION INSTRUCTIONS
For administrator use. From this screen one can create a
configuration file that is used to partition and wire the
system into usable blocks.
OUTPUT
BGL_BLOCK BGL Block Name.
CONN Connection Type: TORUS or MESH or SMALL (for small
blocks).
ID Key to identify the nodes associated with this entity
in the node chart.
MODE Mode Type: COPROCESS or VIRTUAL.
INPUT COMMANDS
resolve
Returns the XYZ coords for a Rack/Midplane id or
vice-versa.
To get the XYZ coord for a Rack/Midplane id, input -R
R101, where 10 is the rack and 1 is the midplane.
To get the Rack/Midplane id from an XYZ coord, input -R
101, where X=1 Y=0 Z=1, with no leading 'R'.
load
Load an existing bluegene.conf file. This
will verify and map out the bluegene.conf file.
Once loaded, the configuration may be edited and
saved as a new file.
create
Submit request for partition creation. The size may be
specified either as a count of base partitions or as
specific dimensions in the X, Y and Z directions separated
by "x", for example "2x3x4". A variety of options
may be specified. Valid options are listed below.
Note that the options and their values are case
insensitive (e.g. "MESH" and "mesh" are
equivalent).
Start = XxYxZ
Identify where to start the partition. This
is primarily for testing purposes. For
convenience, one may specify only the X coord;
XxY will also work. The default value is
0x0x0.
Connection = MESH | TORUS | SMALL
Identify how the nodes should be connected in
the network. The default value is TORUS.
Small
Equivalent to "Connection=Small". If a small
connection is specified, each midplane chosen
will be divided into 4 smaller partitions,
each consisting of 128 c-nodes.
Mesh Equivalent to "Connection=Mesh".
Torus
Equivalent to "Connection=Torus".
Rotation = TRUE | FALSE
Specifies that the geometry specified in the
size parameter may be rotated in space (e.g.
the Y and Z dimensions may be switched). The
default value is FALSE.
Rotate
Equivalent to "Rotation=true".
Elongation = TRUE | FALSE
If TRUE, permit the geometry specified in the
size parameter to be altered as needed to fit
available resources. For example, an
allocation of "4x2x1" might be used to
satisfy a size specification of "2x2x2". The
default value is FALSE.
Elongate
Equivalent to "Elongation=true".
copy
Submit request for partition to be copied. You may
copy a specific partition by specifying its id. By
default, the last configured partition is copied.
You may also specify a number of copies to be
made. By default, one copy is made.
delete
Delete the specified block.
down
Set a specific node or range of nodes to the down
state, e.g. 000, 000-111 or [000x111].
up
Bring a specific node or range of nodes up, e.g.
000, 000-111 or [000x111].
alldown
Set all nodes to down state.
allup
Set all nodes to up state.
save
Save the current configuration to a file. If no
file_name is specified, the configuration is
written to a file named "bluegene.conf" in the
current working directory.
clear
Clear all partitions created.
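
The effect of the Rotation and Elongation options on a requested size
can be illustrated with a small Python sketch. This is not smap's
placement logic. It assumes that rotation means any reordering of the
three dimensions and that elongation means any geometry with the same
number of base partitions, which matches the "4x2x1" for "2x2x2"
example above. The function names parse_geometry and satisfies are
hypothetical.

```python
from itertools import permutations
from math import prod


def parse_geometry(text):
    """Parse a size such as '2x3x4' (case insensitive) into a tuple."""
    dims = tuple(int(part) for part in text.lower().split("x"))
    if len(dims) != 3 or min(dims) < 1:
        raise ValueError(f"expected XxYxZ, got {text!r}")
    return dims


def satisfies(requested, allocated, rotate=False, elongate=False):
    """Return True if an allocated geometry may satisfy a request."""
    want = parse_geometry(requested)
    have = parse_geometry(allocated)
    if have == want:
        return True
    # Assumption: rotation permits any permutation of the dimensions.
    if rotate and have in set(permutations(want)):
        return True
    # Assumption: elongation preserves the base partition count.
    if elongate and prod(have) == prod(want):
        return True
    return False
```

With both options left at their default of FALSE, only an exact match
is accepted: satisfies("2x2x2", "4x2x1") is False, while passing
elongate=True makes it True.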
NODE STATE CODES
Node state codes are shortened as required for the field
size. If the node state code is followed by "*", this
indicates the node is presently not responding and will not
be allocated any new work. If the node remains
non-responsive, it will be placed in the DOWN state (except
in the case of DRAINED, DRAINING, or COMPLETING nodes).
ALLOCATED The node has been allocated to one or more jobs.
ALLOCATED+ The node is allocated to one or more active jobs
plus one or more jobs are in the process of
COMPLETING.
COMPLETING All jobs associated with this node are in the
process of COMPLETING. This node state will be
removed when all of the job's processes have
terminated and the SLURM epilog program (if any)
has terminated. See the Epilog parameter
description in the slurm.conf(5) man page for more
information.
DOWN The node is unavailable for use. SLURM can
automatically place nodes in this state if some
failure occurs. System administrators may also
explicitly place nodes in this state. If a node
resumes normal operation, SLURM can
automatically return it to service. See the
ReturnToService and SlurmdTimeout parameter
descriptions in the slurm.conf(5) man page for
more information.
DRAINED The node is unavailable for use per system
administrator request. See the update node
command in the scontrol(1) man page or the
slurm.conf(5) man page for more information.
DRAINING The node is currently executing a job, but will
not be allocated to additional jobs. The node
state will be changed to state DRAINED when the
last job on it completes. Nodes enter this state
per system administrator request. See the update
node command in the scontrol(1) man page or the
slurm.conf(5) man page for more information.
IDLE The node is not allocated to any jobs and is
available for use.
UNKNOWN The SLURM controller has just started and the
node's state has not yet been determined.
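
A script reading the STATE column may need to expand the abbreviated
forms and detect the "*" suffix. The Python sketch below uses only the
abbreviations listed under STATE in OUTPUT FIELD DESCRIPTIONS. The
function name parse_node_state is hypothetical, and any value not in
that list (for example a state shortened further to fit the field) is
returned unchanged.

```python
# Abbreviations as listed in the STATE field description.
NODE_STATE_ABBREVIATIONS = {
    "down": "down",
    "unk": "unknown",
    "idle": "idle",
    "alloc": "allocated",
    "drain": "drained",
    "drng": "draining",
    "comp": "completing",
}


def parse_node_state(field):
    """Return (state_name, responding) for a STATE column value."""
    responding = not field.endswith("*")   # "*" marks a non-responding node
    name = field.rstrip("*").lower()
    # Unknown abbreviations fall through unchanged.
    return NODE_STATE_ABBREVIATIONS.get(name, name), responding
```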
JOB STATE CODES
Jobs typically pass through several states in the course of
their execution. The typical states are PENDING, RUNNING,
SUSPENDED, COMPLETING, and COMPLETED. An explanation of
each state follows.
CA CANCELLED Job was explicitly cancelled by the user
or system administrator. The job may or
may not have been initiated.
CD COMPLETED Job has terminated all processes on all
nodes.
CG COMPLETING Job is in the process of completing.
Some processes on some nodes may still
be active.
F FAILED Job terminated with non-zero exit code
or other failure condition.
NF NODE_FAIL Job terminated due to failure of one or
more allocated nodes.
PD PENDING Job is awaiting resource allocation.
R RUNNING Job currently has an allocation.
S SUSPENDED Job has an allocation, but execution has
been suspended.
TO TIMEOUT Job terminated upon reaching its time
limit.
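
The compact codes shown in the ST column map directly to the state
names above, which the following Python sketch captures as a lookup
table. The grouping of codes into "finished" states is an
interpretation of the descriptions above (states in which the job was
cancelled or has terminated), not something smap reports. The names
JOB_STATES, FINISHED_CODES and is_finished are hypothetical.

```python
# Compact code -> state name, as listed in JOB STATE CODES.
JOB_STATES = {
    "CA": "CANCELLED",
    "CD": "COMPLETED",
    "CG": "COMPLETING",
    "F": "FAILED",
    "NF": "NODE_FAIL",
    "PD": "PENDING",
    "R": "RUNNING",
    "S": "SUSPENDED",
    "TO": "TIMEOUT",
}

# Assumption: these are the states a job does not leave again.
FINISHED_CODES = {"CA", "CD", "F", "NF", "TO"}


def is_finished(code):
    """Return True if the ST code names a cancelled or terminated job."""
    if code not in JOB_STATES:
        raise KeyError(f"unknown job state code: {code!r}")
    return code in FINISHED_CODES
```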
ENVIRONMENT VARIABLES
The following environment variables can be used to override
settings compiled into smap.
SLURM_CONF The location of the SLURM configuration
file.
COPYING
Copyright (C) 2004 The Regents of the University of
California. Produced at Lawrence Livermore National
Laboratory (cf, DISCLAIMER). UCRL-CODE-217948.
This file is part of SLURM, a resource management program.
For details, see .
SLURM is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License
as published by the Free Software Foundation; either version
2 of the License, or (at your option) any later version.
SLURM is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
the GNU General Public License for more details.
SEE ALSO
scontrol(1), sinfo(1), squeue(1), slurm_load_ctl_conf(3),
slurm_load_jobs(3), slurm_load_node(3),
slurm_load_partitions(3), slurm_reconfigure(3),
slurm_shutdown(3), slurm_update_job(3),
slurm_update_node(3), slurm_update_partition(3),
slurm.conf(5)