Classic Cam
Wednesday, December 12, 2007
IBM System p 570 with POWER 6
* Building block architecture delivers flexible scalability and modular growth
* Advanced virtualization features facilitate highly efficient systems utilization
* Enhanced RAS features enable improved application availability
The IBM POWER6 processor-based System p™ 570 mid-range server delivers outstanding price/performance, mainframe-inspired reliability and availability features, flexible capacity upgrades and innovative virtualization technologies. This powerful 19-inch rack-mount system, which can handle up to 16 POWER6 cores, can be used for database and application serving, as well as server consolidation. The modular p570 is designed to continue the tradition of its predecessor, the IBM POWER5+™ processor-based System p5™ 570 server, for resource optimization, secure and dependable performance and the flexibility to change with business needs. Clients have the ability to upgrade their current p5-570 servers and know that their investment in IBM Power Architecture™ technology has again been rewarded.
The p570 is the first server designed with POWER6 processors, resulting in performance and price/performance advantages while ushering in a new era in the virtualization and availability of UNIX® and Linux® data centers. POWER6 processors can run 64-bit applications, while concurrently supporting 32-bit applications to enhance flexibility. They feature simultaneous multithreading, allowing two application “threads” to be run at the same time, which can significantly reduce the time to complete tasks.
The p570 system is more than an evolution of technology wrapped into a familiar package; it is the result of “thinking outside the box.” IBM’s modular symmetric multiprocessor (SMP) architecture means that the system is constructed using 4-core building blocks. This design allows clients to start with what they need and grow by adding additional building blocks, all without disruption to the base system. Optional Capacity on Demand features allow the activation of dormant processor power for times as short as one minute. Clients may start small and grow with systems designed for continuous application availability.
Specifically, the System p 570 server provides:
Common features
* 19-inch rack-mount packaging
* 2- to 16-core SMP design with building block architecture
* 64-bit 3.5, 4.2 or 4.7 GHz POWER6 processor cores
* Mainframe-inspired RAS features
* Dynamic LPAR support
* Advanced POWER Virtualization (option)
o IBM Micro-Partitioning™ (up to 160 micro-partitions)
o Shared processor pool
o Virtual I/O Server
o Partition Mobility
* Up to 32 optional I/O drawers
* IBM HACMP™ software support for near continuous operation
* Supported by AIX 5L (V5.2 or later) and Linux® distributions from Red Hat (RHEL 4 Update 5 or later) and SUSE Linux (SLES 10 SP1 or later) operating systems
Hardware summary
* 4U 19-inch rack-mount packaging
* One to four building blocks
* Two, four, eight, 12 or 16 3.5 GHz, 4.2 GHz or 4.7 GHz 64-bit POWER6 processor cores
* L2 cache: 8 MB to 64 MB (2- to 16-core)
* L3 cache: 32 MB to 256 MB (2- to 16-core)
* 2 GB to 192 GB of 667 MHz buffered DDR2 or 16 GB to 384 GB of 533 MHz buffered DDR2 or 32 GB to 768 GB of 400 MHz buffered DDR2 memory
* Four hot-plug, blind-swap PCI Express 8x and two hot-plug, blind-swap PCI-X DDR adapter slots per building block
* Six hot-swappable SAS disk bays per building block provide up to 7.2 TB of internal disk storage
* Optional I/O drawers may add up to an additional 188 PCI-X slots and up to 240 disk bays (72 TB additional)
* One SAS disk controller per building block (internal)
* One integrated dual-port Gigabit Ethernet per building block standard; One quad-port Gigabit Ethernet per building block available as optional upgrade; One dual-port 10 Gigabit Ethernet per building block available as optional upgrade
* Two GX I/O expansion adapter slots
* One dual-port USB per building block
* Two HMC ports (maximum of two), two SPCN ports per building block
* One optional hot-plug media bay per building block
* Redundant service processor for multiple building block systems
IBM System Cluster 1350
IBM HPC clustering offers significant price/performance advantages for many high-performance workloads by harnessing the advantages of low cost servers plus innovative, easily available open source software.
Today, some businesses are building their own Linux and Microsoft clusters using commodity hardware, standard interconnects and networking technology, open source software, and in-house or third-party applications. Despite the apparent cost advantages offered by these systems, the expense and complexity of assembling, integrating, testing and managing these clusters from disparate, piece-part components often outweigh any benefits gained.
IBM has designed the IBM System Cluster 1350 to help address these challenges. Now clients can benefit from IBM’s extensive experience with HPC to help minimize this complexity and risk. Using advanced Intel® Xeon®, AMD Opteron™, and IBM PowerPC® processor-based server nodes, proven cluster management software and optional high-speed interconnects, the Cluster 1350 offers the best of IBM and third-party technology. As a result, clients can speed up installation of an HPC cluster, simplify its management, and reduce mean time to payback.
The Cluster 1350 is designed to be an ideal solution for a broad range of application environments, including industrial design and manufacturing, financial services, life sciences, government and education. These environments typically require excellent price/performance for handling high performance computing (HPC) and business performance computing (BPC) workloads. It is also an excellent choice for applications that require horizontal scaling capabilities, such as Web serving and collaboration.
Common features Hardware summary
* Rack-optimized Intel Xeon dual-core and quad-core and AMD Opteron processor-based servers
* Intel Xeon, AMD and PowerPC processor-based blades
* Optional high capacity IBM System Storage™ DS3200, DS3400, DS4700, DS4800 and EXP3000 Storage Servers and IBM System Storage EXP 810 Storage Expansion
* Industry-standard Gigabit Ethernet cluster interconnect
* Optional high-performance Myrinet-2000 and Myricom 10g cluster interconnect
* Optional Cisco, Voltaire, Force10 and PathScale InfiniBand cluster interconnects
* Clearspeed Floating Point Accelerator
* Terminal server and KVM switch
* Space-saving flat panel monitor and keyboard
* Runs with RHEL 4 or SLES 10 Linux operating systems or Windows Compute Cluster Server
* Robust cluster systems management and scalable parallel file system software
* Hardware installed and integrated in 25U or 42U Enterprise racks
* Scales up to 1,024 cluster nodes (larger systems and additional configurations available—contact your IBM representative or IBM Business Partner)
* Optional Linux cluster installation and support services from IBM Global Services or an authorized partner or distributor
* Clients must obtain the version of the Linux operating system specified by IBM from IBM, the Linux Distributor or an authorized reseller
IBM System Cluster 1600
Common features
· Highly scalable AIX 5L or Linux cluster solutions for large-scale computational modeling, large databases and cost-effective data center, server and workload consolidation
· Cluster Systems Management (CSM) software for comprehensive, flexible deployment and ongoing management
· Cluster interconnect options: industry standard 1/10Gb Ethernet (AIX 5L or Linux); IBM High Performance Switch (AIX 5L and CSM); SP Switch2 (AIX 5L and PSSP); 4x/12x InfiniBand (AIX 5L or SLES 9); or Myrinet (Linux)
· Operating system options: AIX 5L Version 5.2 or 5.3, SUSE Linux Enterprise Server 8 or 9, Red Hat Enterprise Linux 4
· Complete software suite for creating, tuning and running parallel applications: Engineering & Scientific Subroutine Library (ESSL), Parallel ESSL, Parallel Environment, XL Fortran, VisualAge C++
· High-performance, high availability, highly scalable cluster file system General Parallel File System (GPFS)
· Job scheduling software to optimize resource utilization and throughput: LoadLeveler®
· High availability software for continuous access to data and applications: High Availability Cluster Multiprocessing (HACMP™)
Hardware summary
· Mix and match IBM POWER5 and POWER5+ servers:
· IBM System p5™ 595, 590, 575, 570, 560Q, 550Q, 550, 520Q, 520, 510Q, 510, 505Q and 505
· IBM eServer™ p5 595, 590, 575, 570, 550, 520, and 510
· Up to 128 servers or LPARs (AIX 5L or Linux operating system images) per cluster depending on hardware; higher scalability by special order
Directories to monitor in AIX
more to view it and rm to clean it out.
/etc/security/failedlogin    Failed logins from users. Use the who command
                             to view the information. Use "cat /dev/null >
                             /etc/security/failedlogin" to empty it.
/var/adm/wtmp                All login accounting activity. Use the who
                             command to view it. Use "cat /dev/null >
                             /var/adm/wtmp" to empty it.
/etc/utmp                    Who has logged in to the system. Use the who
                             command to view it. Use "cat /dev/null >
                             /etc/utmp" to empty it.
/var/spool/lpd/qdir/*        Leftover queue requests
/var/spool/qdaemon/*         Temporary copies of spooled files
/var/spool/*                 Spooling directory
smit.log                     smit log file of activity
smit.script                  smit log of the commands that were run
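A minimal sketch of how the files above could be checked for growth. The helper name report_big_files and the kilobyte threshold are my own assumptions for illustration; the function uses nothing beyond wc and plain shell arithmetic.

```sh
# report_big_files LIMIT_KB FILE...
# Prints the name and size of every listed file larger than LIMIT_KB kilobytes.
# (Hypothetical helper, not an AIX command.)
report_big_files() {
    limit_kb=$1
    shift
    for f in "$@"; do
        [ -f "$f" ] || continue            # skip files that do not exist
        bytes=$(wc -c < "$f")
        kb=$(( bytes / 1024 ))
        if [ "$kb" -gt "$limit_kb" ]; then
            echo "$f ${kb}KB"
        fi
    done
}

# Example:
# report_big_files 1024 /var/adm/wtmp /etc/security/failedlogin smit.log
```

Anything it reports can then be emptied with the "cat /dev/null >" method listed above.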
Storage management concepts
physical partitions, logical volumes, logical partitions, file systems, and raw
devices. Some of their characteristics are presented as follows:
Each individual disk drive is a named physical volume (PV) and has a name
such as hdisk0 or hdisk1.
One or more PVs can make up a volume group (VG). A physical volume can
belong to a maximum of one VG.
You cannot assign a fraction of a PV to one VG. A physical volume is
assigned entirely to a volume group.
Physical volumes can be assigned to the same volume group even though
they are of different types, such as SCSI or SSA.
Storage space from physical volumes is divided into physical partitions (PPs).
The size of the physical partitions is identical on all disks belonging to the
same VG.
Within each volume group, one or more logical volumes (LVs) can be defined.
Data stored on logical volumes appears to be contiguous from the user point
of view, but can be spread on different physical volumes from the same
volume group.
Logical volumes consist of one or more logical partitions (LPs). Each logical
partition has at least one corresponding physical partition. A logical partition
and a physical partition always have the same size. You can have up to three
copies of the data located on different physical partitions. Usually, physical
partitions storing identical data are located on different physical disks for
redundancy purposes.
Data from a logical volume can be stored in an organized manner, having the
form of files located in directories. This structured and hierarchical form of
organization is named a file system.
Data from a logical volume can also be seen as a sequential string of bytes.
Logical volumes of this type are called raw logical volumes. It is the
responsibility of the application that uses this data to access and interpret it
correctly.
The volume group descriptor area (VGDA) is an area on the disk that contains
information pertinent to the volume group that the physical volume belongs to.
It also includes information about the properties and status of all physical and
logical volumes that are part of the volume group. The information from VGDA
is used and updated by LVM commands. There is at least one VGDA per
physical volume. Information from VGDAs of all disks that are part of the
same volume group must be identical. The VGDA internal architecture and
location on the disk depends on the type of the volume group (original, big, or
scalable).
The volume group status area (VGSA) is used to describe the state of all
physical partitions from all physical volumes within a volume group. The
VGSA indicates if a physical partition contains accurate or stale information.
The VGSA is used to monitor and maintain the synchronization of data copies.
The VGSA is essentially a bitmap and its architecture and location on the disk
depends on the type of the volume group.
A logical volume control block (LVCB) contains important information about
the logical volume, such as the number of the logical partitions or disk
allocation policy. Its architecture and location on the disk depends on the type
of the volume group it belongs to. For standard volume groups, the LVCB
resides on the first block of user data within the LV. For big volume groups,
there is additional LVCB information in VGDA on the disk. For scalable volume
groups, all relevant logical volume control information is kept in the VGDA as
part of the LVCB information area and the LV entry area.
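The relationship between logical partitions, physical partitions and mirror copies described above comes down to simple arithmetic. Here is a small sketch; the function name lv_space_mb and the sample numbers are illustrative assumptions, not LVM output.

```sh
# lv_space_mb LPS PP_SIZE_MB COPIES
# Prints two numbers: the usable size of the logical volume and the
# physical space it consumes, both in MB. An LP is the same size as a PP,
# and each LP maps to one PP per copy (up to three copies).
lv_space_mb() {
    lps=$1; pp_mb=$2; copies=$3
    if [ "$copies" -lt 1 ] || [ "$copies" -gt 3 ]; then
        echo "copies must be 1, 2 or 3" >&2
        return 1
    fi
    usable=$(( lps * pp_mb ))
    physical=$(( usable * copies ))
    echo "$usable $physical"
}

lv_space_mb 100 64 2    # 100 LPs of 64 MB with two copies: prints "6400 12800"
```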
WSM Objectives
• Simplification of AIX administration by a single interface
• Enable AIX systems to be administered from almost any client platform with a browser
that supports Java 1.3, or by using client code downloaded from an AIX V5.3 host
• Enable AIX systems to be administered remotely
• Provide a system administration environment that provides a similar look and feel to the
Windows NT/2000/XP, Linux and AIX CDE environments
The Web-based System Manager provides a comprehensive system management
environment and covers most of the tasks in the SMIT user interface. The Web-based
System Manager can only be run from a graphics terminal so SMIT will need to be used in
the ASCII environment.
To download Web-based System Manager Client code from an AIX host use the address
http:///remote_client.html
Supported Microsoft Windows clients for AIX 5.3 are Windows 2000 Professional version,
Windows XP Professional version, or Windows Server 2003.
Supported Linux clients are PCs running: Red Hat Enterprise Version 3, SLES 8, SLES 9,
Suse 8.0, Suse 8.1, Suse 8.2, and Suse 9.0 using desktops KDE or GNOME only.
The PC Web-based System Manager Client installation needs a minimum of 300 MB free
disk space, 512 MB memory (1 GB preferred) and a 1 GHz CPU.
System management
CSM is designed to minimize the cost and complexity of administering clustered and partitioned systems by enabling comprehensive management and monitoring of the entire environment from a single point of control. CSM provides:
* Software distribution, installation and update (operating system and applications)
* Comprehensive system monitoring with customizable automated responses
* Distributed command execution
* Hardware control
* Diagnostic tools
* Management by group
* Both a graphical interface and a fully scriptable command line interface
In addition to providing all the key functions for administration and maintenance of distributed systems, CSM is designed to deliver the parallel execution required to manage clustered computing environments effectively. CSM supports homogeneous or mixed environments of IBM servers running AIX or Linux.
Parallel System Support Programs (PSSP) for AIX
PSSP is the systems management predecessor to Cluster Systems Management (CSM) and does not support IBM System p servers or AIX 5L™ V5.3 or above. New cluster deployments should use CSM and existing PSSP clients with software maintenance will be transitioned to CSM at no charge.
Very useful commands
svmon -P
Further:
You can use the svmon command to monitor memory usage as follows:
(A) # svmon -P -v -t 10 | more (gives the top ten processes)
(B) # svmon -U -v -t 10 | more (gives the top ten users)
smit install requires "inutoc ." first. It'll autogenerate a .toc for you
I believe, but if you later add more .bff's to the same directory, then
the inutoc . becomes important. It is of course, a table of contents.
dump -ov /dir/xcoff-file
topas, -P is useful # similar to top
When creating really big filesystems, this is very helpful:
chlv -x 6552 lv08
Word on the net is that this is required for filesystems over 512M.
esmf04m-root> crfs -v jfs -g'ptmpvg' -a size='884998144' -m'/ptmp2'
-A''`locale yesstr | awk -F: '{print $1}'`'' -p'rw' -t''`locale yesstr |
awk -F: '{print $1}'`'' -a frag='4096' -a nbpi='131072' -a ag='64'
Based on the parameters chosen, the new /ptmp2 JFS file system
is limited to a maximum size of 2147483648 (512 byte blocks)
New File System size is 884998144
esmf04m-root>
If you give a bad combination of parameters, the command will list
possibilities. I got something like this from smit, then seasoned
to taste.
If you need files larger than 2 gigabytes in size, this is better.
It should allow files up to 64 gigabytes:
crfs -v jfs -a bf=true -g'ptmpvg' -a size='884998144' -m'/ptmp2'
-A''`locale yesstr | awk -F: '{print $1}'`'' -p'rw' -t''`locale yesstr |
awk -F: '{print $1}'`'' -a nbpi='131072' -a ag='64'
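The size values that crfs takes and reports are counts of 512-byte blocks, which are hard to read at a glance. A quick sketch of the conversion (blocks_to_gib is a made-up helper name):

```sh
# blocks_to_gib BLOCKS
# Converts a count of 512-byte blocks to whole GiB.
# 1 GiB = 1073741824 bytes = 2097152 blocks of 512 bytes.
blocks_to_gib() {
    echo $(( $1 / 2097152 ))
}

blocks_to_gib 884998144     # the size requested above: prints 422
blocks_to_gib 2147483648    # the maximum crfs reported: prints 1024
```

So the file system above is 422 GiB, and the limit crfs reports for those parameters is 1 TiB.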
Show version of SSP (IBM SP switch) software:
lslpp -al ssp.basic
llctl -g reconfig - make loadleveler reread its config files
oslevel (sometimes lies)
oslevel -r (seems to do better)
lsdev -Cc adapter
pstat -a looks useful
vmo is for VM tuning
On 1000BaseT, you really want this:
chdev -P -l ent2 -a media_speed=Auto_Negotiation
Setting jumbo frames on en2 looks like:
ifconfig en2 down detach
chdev -l ent2 -a jumbo_frames=yes
chdev -l en2 -a mtu=9000
chdev -l en2 -a state=up
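Note that the sequence above mixes two device names: entN is the adapter and enN is the interface on top of it. A small sketch that prints the same four commands for any adapter number, so they can be reviewed before being run (jumbo_cmds is a hypothetical helper and it only prints, it changes nothing):

```sh
# jumbo_cmds N
# Prints the jumbo frame command sequence for adapter entN / interface enN.
jumbo_cmds() {
    n=$1
    echo "ifconfig en$n down detach"           # take the interface down first
    echo "chdev -l ent$n -a jumbo_frames=yes"  # adapter-level setting
    echo "chdev -l en$n -a mtu=9000"           # interface-level MTU
    echo "chdev -l en$n -a state=up"           # bring the interface back
}

jumbo_cmds 2
```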
Search for the meaning of AIX errors:
http://publib16.boulder.ibm.com/pseries/en_US/infocenter/base/eisearch.htm
nfso -a shows AIX NFS tuning parameters; good to check on if you're
getting badcalls in nfsstat. Most people don't bother to tweak these
though.
nfsstat -m shows great info about full set of NFS mount options
Turn on path mtu discovery
no -o tcp_pmtu_discover=1
no -o udp_pmtu_discover=1
TCP support is handled by the OS. UDP support requires cooperation
between OS and application.
nfsstat -c shows rpc stats
To check for software problems:
lppchk -v
lppchk -c
lppchk -l
List subsystem (my word) status:
lssrc -a
mkssys
rmssys
chssys
auditpr
refresh
startsrc
stopsrc
traceson
tracesoff
This starts sendmail:
startsrc -s sendmail -a "-bd -q30m"
This makes inetd reread its config file. Not sure if it kills and
restarts or just HUP's or what:
refresh -s inetd
lsps is used to list the characteristics of paging space.
Turning off ip forwarding:
/usr/sbin/no -o ipforwarding=0
Detailed info about a specific error:
errpt -a -jE85C5C4C
BTW, Rajiv Bendale tells me that errors are stored in NVRAM on AIX,
so you don't have to put time into replicating an error as often.
Some or all of these will list more than one number. Trust the first,
not the second.
lslpp -l ppe.poe
...should list the version of poe installed on the system
Check on compiler versions:
lslpp -l vac.C
lslpp -l vacpp.cmp.core
Check on loadleveler version:
lslpp -l LoadL.full
To check the bootlist, run bootlist -o -m normal. To update the bootlist,
run bootlist -m normal hdisk* hdisk* cd* rmt*
prtconf
Run the ssadiag against the drive and the adapter and it will tell you if it
fails or not.
AIX Control Book Creation
List the defined devices                lsdev -C -H
List the disk drives on the system      lsdev -Cc disk
List the memory on the system           lsdev -Cc memory (MCA)
List the memory on the system           lsattr -El sys0 -a realmem (PCI)
                                        lsattr -El mem0
List system resources                   lsattr -EHl sys0
List the VPD (Vital Product Data)       lscfg -v
Document the tty setup                  lscfg or smit screen capture F8
Document the print queues               qchk -A
Document disk Physical Volumes (PVs)    lspv
Document Logical Volumes (LVs)          lslv
Document Volume Groups (long list)      lsvg -l vgname
Document Physical Volumes (long list)   lspv -l pvname
Document File Systems                   lsfs fsname
                                        /etc/filesystems
Document disk allocation                df
Document mounted file systems           mount
Document paging space (70 - 30 rule)    lsps -a
Document paging space activation        /etc/swapspaces
Document users on the system            /etc/passwd
                                        lsuser -a id home ALL
Document users attributes               /etc/security/user
Document users limits                   /etc/security/limits
Document users environments             /etc/security/environ
Document login settings (login herald)  /etc/security/login.cfg
Document valid group attributes         /etc/group
                                        lsgroup ALL
Document system wide profile            /etc/profile
Document system wide environment        /etc/environment
Document cron jobs                      /var/spool/cron/crontabs/*
Document skulker changes if used        /usr/sbin/skulker
Document system startup file            /etc/inittab
Document the hostnames                  /etc/hosts
Document network printing               /etc/hosts.lpd
Document remote login host authority    /etc/hosts.equi
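One way the control book could be generated rather than typed by hand is sketched below. The function name control_book and the report layout are my own assumptions; it simply runs whatever command strings it is given and collects their output in one file.

```sh
# control_book OUTFILE CMD...
# Runs each command string and appends its output to OUTFILE under a
# header line. Programs that are not installed are noted and skipped.
control_book() {
    out=$1
    shift
    : > "$out"                              # start with an empty report
    for cmd in "$@"; do
        prog=${cmd%% *}                     # first word is the program name
        echo "===== $cmd =====" >> "$out"
        if command -v "$prog" >/dev/null 2>&1; then
            sh -c "$cmd" >> "$out" 2>&1
        else
            echo "(not available: $prog)" >> "$out"
        fi
    done
}

# Example on AIX, using commands from the list above:
# control_book /tmp/controlbook.txt "lsdev -C -H" "lspv" "lsps -a" "df" "mount"
```

The configuration files in the list (/etc/inittab, /etc/security/user and so on) can be captured the same way by passing "cat /etc/inittab" as a command string.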
What is Hot Spare
A hot spare is a disk or group of disks used to replace a failing disk. LVM marks a physical
volume missing due to write failures. It then starts the migration of data to the hot spare
disk.
Minimum hot spare requirements
The following is a list of minimal hot sparing requirements enforced by the operating
system.
- Spares are allocated and used by volume group
- Logical volumes must be mirrored
- All logical partitions on hot spare disks must be unallocated
- Hot spare disks must have at least equal capacity to the smallest disk already
in the volume group. Good practice dictates having enough hot spares to
cover your largest mirrored disk.
Hot spare policy
The chpv and the chvg commands are enhanced with a new -h argument. This allows you
to designate disks as hot spares in a volume group and to specify a policy to be used in the
case of failing disks.
The following four values are valid for the hot spare policy argument (-h):
Argument        Description
y (lower case)  Automatically migrates partitions from one failing disk to one spare
                disk. From the pool of hot spare disks, the smallest one which is big
                enough to substitute for the failing disk will be used.
Y (upper case)  Automatically migrates partitions from a failing disk, but might use
                the complete pool of hot spare disks.
n               No automatic migration will take place. This is the default value for a
                volume group.
r               Removes all disks from the pool of hot spare disks for this volume
                group.
Synchronization policy
There is a new -s argument for the chvg command that is used to specify synchronization
characteristics.
The following two values are valid for the synchronization argument (-s):
Argument        Description
y               Automatically attempts to synchronize stale partitions.
n               Does not automatically attempt to synchronize stale partitions. This is
                the default value.
Examples
The following command marks hdisk1 as a hot spare disk:
# chpv -hy hdisk1
The following command sets an automatic migration policy which uses the smallest hot
spare that is large enough to replace the failing disk, and automatically tries to synchronize
stale partitions:
# chvg -hy -sy testvg
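The selection rule of the lower-case y policy (use the smallest spare that is big enough) can be illustrated with a few lines of shell. This is only a sketch of the rule as described here, not of how LVM implements it; pick_spare and the disk sizes are made up.

```sh
# pick_spare FAILING_MB SPARE_MB...
# Prints the smallest spare size that is at least FAILING_MB,
# or prints nothing if no spare is large enough.
pick_spare() {
    need=$1
    shift
    best=""
    for s in "$@"; do
        if [ "$s" -ge "$need" ]; then
            if [ -z "$best" ] || [ "$s" -lt "$best" ]; then
                best=$s
            fi
        fi
    done
    if [ -n "$best" ]; then
        echo "$best"
    fi
}

pick_spare 36000 18000 73000 36000 146000    # prints 36000
```

When no single spare is large enough the function prints nothing, which is the situation where only the upper-case Y policy (using the complete pool) could help.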
Thursday, November 29, 2007
Implementation of Partition Load Manager
PLM Software Installation
Install the following filesets:
plm.license
plm.server.rte
plm.sysmgt.websm
Make sure SSL and OpenSSH are also installed
For setup of PLM, create .rhosts files on the server and all clients. After PLM has been set up, you can delete the .rhosts files.
Create SSH Keys
On the server, enter:
# ssh-keygen -t rsa
Copy the HMC’s secure keys to the server:
# scp hscroot@hmchostname:.ssh/authorized_keys2 \
~/.ssh/tmp_authorized_keys2
Append the server’s keys to the temporary key file and copy it back to the HMC:
# cat ~/.ssh/id_rsa.pub >> ~/.ssh/tmp_authorized_keys2
# scp ~/.ssh/tmp_authorized_keys2 \
hscroot@hmchostname:.ssh/authorized_keys2
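If the key exchange is repeated, the plain ">>" append above adds the same public key again each time. A small sketch of an append that only happens once (append_key_once is a hypothetical helper, and the file names in the example are the ones used above):

```sh
# append_key_once PUBKEY_FILE AUTH_FILE
# Appends the public key to AUTH_FILE unless an identical line is already there.
append_key_once() {
    key=$(cat "$1") || return 1
    touch "$2"
    if ! grep -qxF -- "$key" "$2"; then
        printf '%s\n' "$key" >> "$2"
    fi
}

# Example:
# append_key_once ~/.ssh/id_rsa.pub ~/.ssh/tmp_authorized_keys2
```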
Test SSH and Enable WebSM
Test SSH to the HMC. You should not be asked for a password.
# ssh hscroot@hmchostname lssyscfg -r sys
On the PLM server, make sure you can run WebSM. Run:
# /usr/websm/bin/wsmserver -enable
Configure PLM Software
On the PLM server, open WebSM and select Partition Load Manager.
Click on Create a Policy File. In the window open on the General Tab, enter a policy file name on the first line
Click on the Globals tab. Enter the fully qualified hostname of your HMC. Enter hscroot (or a user with the Systems Administration role) as the HMC user name. Enter the CEC name, which is the managed system name (not the fully qualified hostname).
Click on the Groups tab. Click the Add button. Type in a group name. Enter the maximum CPU and memory values that you are allowed to use for PLM operations.
Check both CPU and Memory management if you’re going to manage both.
Click on Tunables. These are the defaults for the entire group. If you don’t understand a value, highlight it and select Help for a detailed description.
Click on the Partitions tab. Click the Add button and add all of the running partitions in the group to the partitions list.
On the Partition Definition tab, use the partitions’ fully qualified hostnames and add them to the group you just created.
Click OK to create the policy file.
In the PLM server, view the policy file you created. It will be in /etc/plm/policies.
Perform the PLM setup step using WebSM. You must be root. Once this finishes, you’ll see “Finished: Success” in the WebSM working window.
In the server and a client partition, look at the /var/ct/cfg/ctrmc.acls file to see if these lines are at the bottom of the file:
IBM.LPAR
root@hmchostname * rw
If you need to edit this file, run this command afterward:
# refresh -s ctrmc
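The check of ctrmc.acls can be scripted so it is easy to repeat on every partition. The sketch below assumes the layout shown above (a stanza name alone on a line, followed by its entries); has_lpar_acl is a made-up name.

```sh
# has_lpar_acl FILE HMC_HOST
# Succeeds if FILE has an IBM.LPAR stanza with the entry "root@HMC_HOST * rw".
has_lpar_acl() {
    awk -v want="root@$2" '
        $1 == "IBM.LPAR" { in_stanza = 1; next }
        NF == 1          { in_stanza = 0 }   # assumption: any other single word starts a new stanza
        in_stanza && $1 == want && $2 == "*" && $3 == "rw" { found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$1"
}

# Example:
# has_lpar_acl /var/ct/cfg/ctrmc.acls hmchostname || echo "IBM.LPAR entry missing"
```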
Test RMC authentication by running this command from the PLM server, where remote_host is a PLM client:
# CT_CONTACT=remote_host lsrsrc IBM.LPAR
If successful, a lot of LPAR information will be printed out instead of “Could not authenticate user”
Start the PLM server. Look for “Finished:Success” in the WebSM working window.
Enter a configuration name. Enter your policy file name. Enter a new logfile name.
(If you have trouble with the logfile, you may need to touch the file before you can access it)
If the LPAR details window shows only zeroed-out information, then there’s probably an RMC authentication problem.
If there’s a problem, on the server partition, run:
# /usr/sbin/rsct/bin/ctsvhbal
The output should list one or more identities. Check to see that the server’s fully qualified hostname is in the output.
On each partition, run /usr/sbin/rsct/bin/ctsthl -l. At least one of the identities shown on the remote partition’s ctsvhbal output should show up on the other partitions’ ctsthl -l output. This is the RMC list of trusted hosts.
If there are any entries in the RMC trusted hosts lists which are not fully qualified hostnames, remove them with the following command:
# /usr/sbin/rsct/bin/ctsthl -d -n identity
where identity is the trusted host list identity
If one partition is missing a hostname, add it as follows:
# /usr/sbin/rsct/bin/ctsthl -a -n identity -m METHOD -p ID_VALUE
identity is the fully qualified hostname of the other partition
METHOD is rsa512
ID_VALUE is obtained by running ctsthl -l on the other partition to determine its own identifier
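To spot trusted host entries that are not fully qualified, a filter like the following can be fed a list of identities, one per line. It is a rough sketch: non_fqdn_hosts is a made-up name, and "fully qualified" is approximated as "contains a dot", so IP addresses also pass.

```sh
# non_fqdn_hosts
# Reads host identities on stdin (one per line) and prints those
# that do not contain a dot.
non_fqdn_hosts() {
    while IFS= read -r h; do
        [ -z "$h" ] && continue
        case $h in
            *.*) ;;                # has a dot: treated as fully qualified
            *)   echo "$h" ;;
        esac
    done
}

printf '%s\n' plmserver.example.com lpar1 lpar2.example.com | non_fqdn_hosts
# prints: lpar1
```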
Introduction to WPAR in AIX 6
Workload Partitioning is a virtualization technology that utilizes
software rather than firmware to isolate users and/or applications.
A Workload Partition (WPAR) is a combination of several core AIX technologies. There are differences of course, but here the emphasis is on the similarities. In this essay I shall describe the characteristics of these technologies and how workload partitions are built upon them.
There are two types of WPAR: system and application. My focus is on the system WPAR, as this more closely resembles an LPAR or a separate system. In other words, a system WPAR behaves as a complete installation of AIX. At a later time application workload partitions will be described in terms of how they differ from a system WPAR. For the rest of this document WPAR and system WPAR are to be considered synonymous.
AIX system software has three components: root, user, and shared. The root component consists of all the software and data that are unique to that system or node. The user (or usr) part consists of all the software and data that is common to all AIX systems at that particular AIX software level (e.g., oslevel AIX 5.3 TL06-01, or AIX 5.3 TL06-02, or AIX 6.1). The shared component is software and data that is common to any UNIX or Linux system.
In its default configuration a WPAR inherits its user (/usr) and shared (/usr/share, usually physically included in the /usr filesystem) components from the global system. Additionally, the WPAR inherits the /opt filesystem. The /opt filesystem is the normal installation area in the rootvg volume group for RPM and IHS packaged applications and AIX Linux affinity applications and libraries. Because multiple WPARs are intended to share these file systems (/usr and /opt), they are read-only for WPAR applications and users. This is very similar to how NIM (Network Installation Manager) diskless and dataless systems were configured and installed. Since only the unique rootvg volume group file systems need to be created (/, /tmp, /var, /home), creation of a WPAR is a quick process.
The normal AIX boot process is conducted in three phases:
1) boot IPL, or locating and loading the boot block (hd5);
2) rootvg IPL (varyonvg of rootvg),
3) rc.boot 3 or start of init process reading /etc/inittab
A WPAR activation or "booting" skips step 1. Step 2 is the global (i.e., hosting) system mounting the WPAR filesystems, either locally or from remote storage (currently only NFS is officially supported; GPFS is known to work, but is not officially supported at this time (September 2007)). The third phase is starting an init process in the global system. This init process does a chroot to the WPAR root filesystem and performs a normal AIX rc.boot 3 phase.
WPAR Management
WPAR management in its simplest form is simply starting, stopping, and monitoring resource usage. And, not to forget, creating and deleting WPARs.
Creating a WPAR is a very simple process: the one-time prerequisite is the existence of the directory /wpars with mode 700 for root. Obviously, we do not want just anyone wandering in the virtualized rootvgs of the WPARs. And, if the WPAR name you want to create resolves either in /etc/hosts or DNS (and I suspect NIS) all you need to do is enter:
# mkwpar -n
If you want to save the output you could also use:
# nohup mkwpar -n & sleep 2; tail -f nohup.out
and watch the show!
This creates all the wpar filesystems (/, /home, /tmp, /var and /proc)
and read-only entries for /opt and /usr. After these have been made, they are
mounted and "some assembly" is performed, basically installing the root part
of the filesets in /usr. The only "unfortunate" part of the default setup is
that all filesystems are created in rootvg, using generic logical volume
names (fslv00, fslv01, fslv02, fslv03). Fortunately, there is an argument
(-g) that you can use to get the logical volumes made in a different
volume group. There are many options for changing all of these and they
will be covered in my next document when I'll discuss WPAR mobility.
At this point you should just enter:
# startwpar
Wait for the prompt, and from "anywhere" you can connect to the running WPAR just
as if it were a separate system. Just do not expect to make any changes in /usr
or /opt (software installation is also a later document).
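The one-time prerequisite mentioned earlier, a /wpars directory with mode 700, is easy to verify before running mkwpar. A minimal sketch follows; wpars_dir_ok is a made-up name, and it checks only the mode, not that root owns the directory.

```sh
# wpars_dir_ok DIR
# Succeeds if DIR exists, is a directory and has mode 700 (drwx------).
wpars_dir_ok() {
    [ -d "$1" ] || return 1
    perms=$(ls -ld "$1" | cut -c1-10)   # type and permission bits only
    [ "$perms" = "drwx------" ]
}

# Example:
# wpars_dir_ok /wpars || echo "create /wpars with mode 700 first"
```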
Summary
WPAR creation is very similar to the process NIM uses for diskless and dataless installations. This method relies on AIX rootvg software consisting of three components: root, user and shared. The normal boot process is emulated by the global system "hosting" the WPAR. Phase 1 is not needed; Phase 2 is the mount of the WPAR filesystem resources; and Phase 3 is a so-called init process that is seen as the regular init in the WPAR environment. This is the process that reads and processes /sbin/rc.boot 3 and /etc/inittab just as a normal AIX system would.
AIX / HMC/VIO Tips Sheet
lshmc -n (lists dynamic IP addresses served by HMC)
lssyscfg -r sys -F name,ipaddr (lists managed system attributes)
lssysconn -r sys (lists attributes of managed systems)
lssysconn -r all (lists all known managed systems with attributes)
rmsysconn -o remove --ip {ip address} (removes a managed system from the HMC)
mkvterm -m {msys} -p {lpar} (opens a command line vterm from an ssh session)
rmvterm -m {msys} -p {lpar} (closes an open vterm for a partition)
Activate a partition
chsysstate -m managedsysname -r lpar -o on -n partitionname -f profilename -b normal
chsysstate -m managedsysname -r lpar -o on -n partitionname -f profilename -b sms
Shutdown a partition
chsysstate -m managedsysname -r lpar -o {shutdown|osshutdown} -n partitionname [--immed] [--restart]
VIO Server Commands
lsdev -virtual (list all virtual devices on VIO server partitions)
lsmap -all (lists mapping between physical and logical devices)
oem_setup_env (change to OEM [AIX] environment on VIO server)
Create Shared Ethernet Adapter (SEA) on VIO Server
mkvdev -sea {physical adapt} -vadapter {virtual eth adapt} -default {dflt virtual adapt} -defaultid {dflt vlan ID}
SEA Failover
ent0 – GigE adapter
ent1 – Virt Eth VLAN1 (Defined with a priority in the partition profile)
ent2 – Virt Eth VLAN 99 (Control)
mkvdev -sea ent0 -vadapter ent1 -default ent1 -defaultid 1 -attr ha_mode=auto ctl_chan=ent2
(Creates ent3 as the Shared Ethernet Adapter)
Create Virtual Storage Device Mapping
mkvdev -vdev {LV or hdisk} -vadapter {vhost adapt} -dev {virt dev name}
Sharing a Single SAN LUN from Two VIO Servers to a Single VIO Client LPAR
hdisk3 = SAN LUN (on vioa server)
hdisk4 = SAN LUN (on viob, same LUN as vioa)
chdev -dev hdisk3 -attr reserve_policy=no_reserve (from vioa to prevent a reserve on the disk)
chdev -dev hdisk4 -attr reserve_policy=no_reserve (from viob to prevent a reserve on the disk)
mkvdev -vdev hdisk3 -vadapter vhost0 -dev hdisk3_v (from vioa)
mkvdev -vdev hdisk4 -vadapter vhost0 -dev hdisk4_v (from viob)
The VIO client would see a single LUN with two paths.
lspath -l hdiskx (where hdiskx is the newly discovered disk)
This will show two paths, one down vscsi0 and the other down vscsi1.
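The two-path check can be scripted on the client. The sketch below is only an illustration: count_enabled_paths is a hypothetical helper, and it assumes lspath prints one "status disk parent" line per path (for example "Enabled hdisk1 vscsi0"). The sample text is made up so the snippet runs anywhere.

```shell
#!/bin/sh
# Hypothetical helper: count the paths in state "Enabled" for one disk.
# Reads lspath-style lines ("status disk parent") on stdin.
count_enabled_paths() {
    awk -v disk="$1" '$1 == "Enabled" && $2 == disk { n++ } END { print n + 0 }'
}

# Made-up sample standing in for real "lspath -l hdisk1" output.
sample='Enabled hdisk1 vscsi0
Enabled hdisk1 vscsi1'

printf '%s\n' "$sample" | count_enabled_paths hdisk1
```

On a real client the sample would be replaced by something like "lspath -l hdisk1 | count_enabled_paths hdisk1"; anything less than 2 suggests one VIO server path is down.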
AIX Performance TidBits and Starter Set of Tuneables
Current starter set of recommended AIX 5.3 Performance Parameters. Please ensure you test these first before implementing in production as your mileage may vary.
Network
no -p -o rfc1323=1
no -p -o sb_max=1310720
no -p -o tcp_sendspace=262144
no -p -o tcp_recvspace=262144
no -p -o udp_sendspace=65536
no -p -o udp_recvspace=655360
nfso -p -o nfs_rfc1323=1
NB: Network settings also need to be applied to the adapters
nfso -p -o nfs_socketsize=600000
nfso -p -o nfs_tcp_socketsize=600000
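One sanity check that is easy to script before applying such values: no socket buffer size should exceed sb_max. That rule is not stated in the tip sheet itself; it is an assumption based on sb_max being the upper limit for socket buffers. fits_sb_max is a hypothetical helper name, and the sketch does plain arithmetic without touching the running system.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if every buffer size fits under sb_max.
# Usage: fits_sb_max SB_MAX SIZE [SIZE ...]
fits_sb_max() {
    sb_max=$1
    shift
    for size in "$@"; do
        [ "$size" -le "$sb_max" ] || return 1
    done
    return 0
}

# Values from the starter set: sb_max, then the tcp/udp send and receive spaces.
if fits_sb_max 1310720 262144 262144 65536 655360; then
    echo "buffer sizes fit within sb_max"
else
    echo "raise sb_max or lower a buffer size"
fi
```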
Memory Settings
vmo -p -o minperm%=5
vmo -p -o maxperm%=80
vmo -p -o maxclient%=80
Let strict_maxperm and strict_maxclient default
vmo -p -o minfree=960
vmo -p -o maxfree=1088
vmo -p -o lru_file_repage=0
vmo -p -o lru_poll_interval=10
IO Settings
Let minpgahead and j2_minPageReadAhead default
ioo -p -o j2_maxPageReadAhead=128
ioo -p -o maxpgahead=16
ioo -p -o j2_maxRandomWrite=32
ioo -p -o maxrandwrt=32
ioo -p -o j2_nBufferPerPagerDevice=1024
ioo -p -o pv_min_pbuf=1024
ioo -p -o numfsbufs=2048
If doing lots of raw I/O you may want to change lvm_bufcnt
Default is 9
ioo -p -o lvm_bufcnt=12
Others left to default that you may want to tweak include:
ioo -p -o numclust=1
ioo -p -o j2_nRandomCluster=0
ioo -p -o j2_nPagesPerWriteBehindCluster=32
Useful Commands
vmstat -v or -l or -s    lvmo
vmo -o                   iostat (many new flags)
ioo -o                   svmon
schedo -o                filemon
lvmstat                  fileplace
Useful Links
1. Lparmon – www.alphaworks.ibm.com/tech/lparmon
2. Nmon – www.ibm.com/collaboration/wiki/display/WikiPtype/nmon
3. Nmon Analyser – www-941.ibm.com/collaboration/wiki/display/WikiPtype/nmonanalyser
4. vmo, ioo, vmstat, lvmo and other AIX commands http://publib.boulder.ibm.com/infocenter/pseries/v5r3/topic/com
Recovering a Failed VIO Disk
server. It assumes the client partitions have mirrored (virtual) disks. The
recovery involves both the VIO server and its client partitions. However,
it is non-disruptive for the client partitions (no downtime), and may be
non-disruptive on the VIO server (depending on disk configuration). This
procedure does not apply to RAID 5 or SAN disk failures.
The test system had two VIO servers and an AIX client. The AIX client had two
virtual disks (one disk from each VIO server). The two virtual disks
were mirrored in the client using AIX's mirrorvg. (The procedure would be
the same on a single VIO server with two disks.)
The software levels were:
p520: Firmware SF230_145 VIO Version 1.2.0 Client: AIX 5.3 ML3
We had simulated the disk failure by removing the client LV on one VIO server. The
padmin commands to simulate the failure were:
rmdev -dev vtscsi01 # The virtual scsi device for the LV (lsmap -all)
rmlv -f aix_client_lv # Remove the client LV
This caused "hdisk1" on the AIX client to go "missing" (see "lsvg -p rootvg"). Note that
"lspv" will not show the disk failure, only the disk status at the last boot.
The recovery steps included:
VIO Server
Fix the disk failure, and restore the VIOS operating system (if necessary)
mklv -lv aix_client_lv rootvg 10G # recreate the client LV
mkvdev -vdev aix_client_lv -vadapter vhost1 # connect the client LV to the appropriate vhost
AIX Client
cfgmgr # discover the new virtual hdisk2
replacepv hdisk1 hdisk2 # rebuild the mirror copy on hdisk2
bosboot -ad /dev/hdisk2 # add boot image to hdisk2
bootlist -m normal hdisk0 hdisk2 # add the new disk to the bootlist
rmdev -dl hdisk1 # remove failed hdisk1
The "replacepv" command assigns hdisk2 to the volume group, rebuilds the mirror, and
then removes hdisk1 from the volume group.
As always, be sure to test this procedure before using in production.
Different RAID Levels
RAID 0 - Striping. Data is striped across all the disks in the array, which improves
read and write performance. For example, reading a large file from a single disk takes
longer than reading the same file from a RAID 0 array. There is no data
redundancy in this case.
RAID 1 - Mirroring. RAID 0 provides no redundancy, i.e. if one
disk fails then the data is lost. RAID 1 overcomes that problem by mirroring the data, so
if one disk fails the data is still accessible through the other disk.
RAID 2 - A RAID level that does not use one or more of the "standard" techniques of mirroring,
striping and/or parity. It is implemented by splitting data at the bit level and spreading it
across the data disks and redundant disks. It uses an error correction code (ECC)
that is stored alongside each data block and checked
when the data is read from the disk to maintain data integrity.
RAID 3 - Data is striped across multiple disks at the byte level, with parity
maintained on a separate, dedicated disk. If that parity disk fails, the array is left
without redundancy until the disk is replaced.
RAID 4 - Similar to RAID 3; the only difference is that the data is striped across multiple disks at
the block level.
RAID 5 - Block-level striping with distributed parity. The data and parity are striped across all
disks, which provides redundancy without a dedicated parity disk. A minimum of three disks is required, and if
any one disk fails the data is still secure.
RAID 6 - Block-level striping with dual distributed parity. It stripes blocks of data and parity
across all disks in the array, and it maintains two sets of parity information for
each parcel of data, thus increasing the data redundancy. If two disks fail the data
is still intact.
RAID 7 - Asynchronous, cached striping with dedicated parity. This level is not an open industry
standard. It is based on the concepts of RAID 3 and 4, and a great deal of cache is
included across multiple levels. There is also a specialised real-time processor to
manage the array asynchronously.
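The parity used by RAID 3, 4 and 5 is commonly a bytewise XOR of the data blocks in a stripe, which is why any single lost block can be rebuilt. The sketch below assumes XOR parity and uses single bytes as stand-ins for whole blocks; the values are arbitrary.

```shell
#!/bin/sh
# Three data "blocks" (one byte each, for brevity) on three data disks.
d1=165   # 0xA5
d2=60    # 0x3C
d3=15    # 0x0F

# The parity block is the XOR of all data blocks in the stripe.
parity=$(( d1 ^ d2 ^ d3 ))

# Simulate losing the disk that held d2: XOR the survivors with the parity.
rebuilt=$(( d1 ^ d3 ^ parity ))

echo "parity=$parity rebuilt_d2=$rebuilt"
```

The rebuilt value equals the original d2 because XOR-ing a value twice cancels it out, leaving only the missing block.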
Part 2
The defragfs command can be used to improve or report the status of contiguous space within a file system. For example, to defragment the file system /home, use the following command:
#defragfs /home
Which fileset contains a particular binary?
To show bos.acct contains /usr/bin/vmstat, type:
#lslpp -w /usr/bin/vmstat
Or to show bos.perf.tools contains /usr/bin/svmon, type:
#which_fileset svmon
How do I display information about installed filesets on my system?
Type the following:
#lslpp -l
How do I determine if all filesets of maintenance levels are installed on my system?
Type the following:
#instfix -i | grep ML
How do I determine if a fix is installed on my system?
To determine if IY24043 is installed, type:
#instfix -ik IY24043
How do I install an individual fix by APAR?
To install APAR IY73748 from /dev/cd0, for example, enter the command:
#instfix -k IY73748 -d /dev/cd0
How do I verify if filesets have required prerequisites and are completely installed?
To show which filesets need to be installed or corrected, type:
#lppchk -v
How do I get a dump of the header of the loader section and the symbol entries in symbolic representation?
Type the following:
#dump -Htv
How do I determine the amount of paging space allocated and in use?
Type the following:
#lsps -a
How do I increase a paging space?
You can use the chps -s command to dynamically increase the size of a paging space. For example, if you want to increase the size of hd6 by three logical partitions, you issue the following command:
#chps -s 3 hd6
How do I reduce a paging space?
You can use the chps -d command to dynamically reduce the size of a paging space. For example, if you want to decrease the size of hd6 by four logical partitions, you issue the following command:
#chps -d 4 hd6
How would I know if my system is capable of using Simultaneous Multi-threading (SMT)?
Your system is capable of SMT if it's a POWER5-based system running AIX 5L Version 5.3.
How would I know if SMT is enabled for my system?
If you run the smtctl command without any options, it tells you if it's enabled or not.
Is SMT supported for the 32-bit kernel?
Yes, SMT is supported for both 32-bit and 64-bit kernel.
How do I enable or disable SMT?
You can enable or disable SMT by running the smtctl command. The following is the syntax:
#smtctl [ -m off | on [ -w boot | now]]
The following options are available:
-m off
Sets SMT mode to disabled.
-m on
Sets SMT mode to enabled.
-w boot
Makes the SMT mode change effective on next and subsequent reboots if you run the bosboot command before the next system reboot.
-w now
Makes the SMT mode change immediately but will not persist across reboot.
If neither the -w boot nor the -w now option is specified, then the mode change is made immediately. It persists across subsequent reboots if you run the bosboot command before the next system reboot.
How do I get partition-specific information and statistics?
The lparstat command provides a report of partition information and utilization statistics. This command also provides a display of Hypervisor information.
Volume groups and logical volumes
How do I know if my volume group is normal, big, or scalable?
Run the lsvg command on the volume group and look at the value for MAX PVs. The value is 32 for normal, 128 for big, and 1024 for scalable volume group.
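The mapping from MAX PVs to volume group type can be captured in a small helper. vg_type_from_max_pvs is a hypothetical name, and the sketch takes the number as an argument rather than parsing lsvg output, so it makes no assumption about the exact report layout.

```shell
#!/bin/sh
# Hypothetical helper: map the MAX PVs value reported by lsvg to a VG type.
vg_type_from_max_pvs() {
    case "$1" in
        32)   echo "normal" ;;
        128)  echo "big" ;;
        1024) echo "scalable" ;;
        *)    echo "unknown" ;;
    esac
}

vg_type_from_max_pvs 32
vg_type_from_max_pvs 1024
```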
How to create a volume group?
Use the following command, where -s partition_size sets the number of megabytes (MB) in each physical partition. The partition_size is expressed in units of MB from 1 through 1024 (1 through 131072 for AIX 5.3) and must be equal to a power of 2 (for example: 1, 2, 4, 8). The default value for standard and big volume groups is the lowest value to remain within the limitation of 1016 physical partitions per physical volume. The default value for scalable volume groups is the lowest value to accommodate 2040 physical partitions per physical volume.
#mkvg -y name_of_volume_group -s partition_size list_of_hard_disks
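The default partition size rule can be worked through with plain arithmetic: take the smallest power of 2 that keeps the disk within the partition limit. The sketch below is an illustration of that rule only; default_pp_size_mb is a hypothetical helper, the 70000 MB disk size is made up, and the upper bound on partition_size is ignored.

```shell
#!/bin/sh
# Hypothetical helper: smallest power-of-2 partition size (MB) such that
# a disk of DISK_MB needs no more than MAX_PPS physical partitions.
# Usage: default_pp_size_mb DISK_MB [MAX_PPS]   (MAX_PPS defaults to 1016)
default_pp_size_mb() {
    disk_mb=$1
    max_pps=${2:-1016}
    pp=1
    # Round the partition count up, since a partial partition still counts.
    while [ $(( (disk_mb + pp - 1) / pp )) -gt "$max_pps" ]; do
        pp=$(( pp * 2 ))
    done
    echo "$pp"
}

default_pp_size_mb 70000         # standard or big VG limit (1016)
default_pp_size_mb 70000 2040    # scalable VG limit (2040)
```

For the 70000 MB example this gives 128 MB under the 1016 limit and 64 MB under the 2040 limit.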
How can I change the characteristics of a volume group?
You use the following command to change the characteristics of a volume group:
#chvg
How do I create a logical volume?
Type the following:
#mklv -y name_of_logical_volume name_of_volume_group number_of_partition
How do I increase the size of a logical volume?
To increase the size of the logical volume lv05 by three logical partitions, for example, type:
#extendlv lv05 3
How do I display all logical volumes that are part of a volume group (for example, rootvg)?
You can display all logical volumes that are part of rootvg by typing the following command:
#lsvg -l rootvg
How do I list information about logical volumes?
Run the following command to display information about the logical volume lv1:
#lslv lv1
How do I remove a logical volume?
You can remove the logical volume lv7 by running the following command:
#rmlv lv7
The rmlv command removes only the logical volume, but does not remove other entities, such as file systems or paging spaces that were using the logical volume.
How do I mirror a logical volume?
1. #mklvcopy LogicalVolumeName Numberofcopies
2. #syncvg VolumeGroupName
How do I remove a copy of a logical volume?
You can use the rmlvcopy command to remove copies of logical partitions of a logical volume. To reduce the number of copies of each logical partition belonging to logical volume testlv, enter:
#rmlvcopy testlv 2
Each logical partition in the logical volume now has at most two physical partitions.
Queries about volume groups
To show volume groups in the system, type:
#lsvg
To show all the characteristics of rootvg, type:
#lsvg rootvg
To show disks used by rootvg, type:
#lsvg -p rootvg
How to add a disk to a volume group?
Type the following:
#extendvg VolumeGroupName hdisk0 hdisk1 ... hdiskn
How do I find out what the maximum supported logical track group (LTG) size of my hard disk?
You can use the lquerypv command with the -M flag. The output gives the LTG size in KB. For instance, the LTG size for hdisk0 in the following example is 256 KB.
#/usr/sbin/lquerypv -M hdisk0
256
You can also run the lspv command on the hard disk and look at the value for MAX REQUEST.
What does syncvg command do?
The syncvg command is used to synchronize stale physical partitions. It accepts names of logical volumes, physical volumes, or volume groups as parameters.
For example, to synchronize the physical partitions located on physical volumes hdisk4 and hdisk5, use:
#syncvg -p hdisk4 hdisk5
To synchronize all physical partitions from volume group testvg, use:
#syncvg -v testvg
How do I replace a disk?
1. #extendvg VolumeGroupName hdisk_new
2. #migratepv hdisk_bad hdisk_new
3. #reducevg -d VolumeGroupName hdisk_bad
How can I clone (make a copy of ) the rootvg?
You can run the alt_disk_copy command to copy the current rootvg to an alternate disk. The following example shows how to clone the rootvg to hdisk1.
#alt_disk_copy -d hdisk1
Network
How can I display or set values for network parameters?
The no command sets or displays current or next boot values for network tuning parameters.
How do I get the IP address of my machine?
Type one of the following:
#ifconfig -a
#host Fully_Qualified_Host_Name
For example, type host cyclop.austin.ibm.com.
How do I identify the network interfaces on my server?
Either of the following two commands will display the network interfaces:
#lsdev -Cc if
#ifconfig -a
To get information about one specific network interface, for example, tr0, run the command:
#ifconfig tr0
How do I activate a network interface?
To activate the network interface tr0, run the command:
#ifconfig tr0 up
How do I deactivate a network interface?
For example, to deactivate the network interface tr0, run the command:
#ifconfig tr0 down
AIX commands you cannot live without, part 1
As you know, AIX® has a vast array of commands that enable you to do a multitude of tasks. Depending on what you need to accomplish, you use only a certain subset of these commands. These subsets differ from user to user and from need to need. However, there are a few core commands that you commonly use. You need these commands either to answer your own questions or to provide answers to the queries of the support professionals.
In this article, I'll discuss some of these core commands. The intent is to provide a list that you can use as a ready reference. While the behavior of these commands should be identical in all releases of AIX, they have been only tested under AIX 5.3.
Note:
The bootinfo command discussed in the following paragraphs is NOT a user-level command and is NOT supported in AIX 4.2 or later.
Commands
Kernel
How would I know if I am running a 32-bit kernel or 64-bit kernel?
To display if the kernel is 32-bit enabled or 64-bit enabled, type:
#bootinfo -K
How do I know if I am running a uniprocessor kernel or a multiprocessor kernel?
/unix is a symbolic link to the booted kernel. To find out what kernel mode is running, enter ls -l /unix and see which file /unix links to. The following are the three possible outputs from the ls -l /unix command and their corresponding kernels:
/unix -> /usr/lib/boot/unix_up # 32 bit uniprocessor kernel
/unix -> /usr/lib/boot/unix_mp # 32 bit multiprocessor kernel
/unix -> /usr/lib/boot/unix_64 # 64 bit multiprocessor kernel
Note:
AIX 5L Version 5.3 does not support a uniprocessor kernel.
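The link-target check is easy to wrap in a helper. kernel_from_link is a hypothetical name; it takes the link target as an argument, so the sketch runs anywhere, and the comment shows one way the target might be obtained on a live system.

```shell
#!/bin/sh
# Hypothetical helper: translate the target of the /unix symlink
# into the kernel type it represents.
kernel_from_link() {
    case "$1" in
        */unix_up) echo "32-bit uniprocessor" ;;
        */unix_mp) echo "32-bit multiprocessor" ;;
        */unix_64) echo "64-bit multiprocessor" ;;
        *)         echo "unknown" ;;
    esac
}

# On a live system the target could come from: ls -l /unix | awk '{print $NF}'
kernel_from_link /usr/lib/boot/unix_64
```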
How can I change from one kernel mode to another?
During the installation process, one of the kernels, appropriate for the AIX version and the hardware in operation, is enabled by default. Let us use the method from the previous question and assume the 32-bit kernel is enabled. Let us also assume that you want to boot it up in the 64-bit kernel mode. This can be done by executing the following commands in sequence:
#ln -sf /usr/lib/boot/unix_64 /unix
#ln -sf /usr/lib/boot/unix_64 /usr/lib/boot/unix
#bosboot -ad /dev/hdiskxx
#shutdown -r
/dev/hdiskxx is the disk on which the boot logical volume /dev/hd5 is located. To find out what xx is in hdiskxx, run the following command:
#lslv -m hd5
Note:
In AIX 5.2, the 32-bit kernel is installed by default. In AIX 5.3, the 64-bit kernel is installed on 64-bit hardware and the 32-bit kernel is installed on 32-bit hardware by default.
Hardware
How would I know if my machine is capable of running AIX 5L Version 5.3?
AIX 5L Version 5.3 runs on all currently supported CHRP (Common Hardware Reference Platform)-based POWER hardware.
How would I know if my machine is CHRP-based?
Run the prtconf command. If it's a CHRP machine, the string chrp appears on the Model Architecture line.
How would I know if my System p machine (hardware) is 32-bit or 64-bit?
To display if the hardware is 32-bit or 64-bit, type:
#bootinfo -y
How much real memory does my machine have?
To display real memory in kilobytes (KB), type one of the following:
#bootinfo -r
#lsattr -El sys0 -a realmem
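Since both commands report kilobytes, a quick conversion makes the number easier to read. kb_to_mb is a hypothetical helper and the input value is made up; on a live system the argument would be the figure printed by bootinfo -r.

```shell
#!/bin/sh
# Hypothetical helper: convert a size in KB to whole MB.
kb_to_mb() {
    echo $(( $1 / 1024 ))
}

# 4194304 KB is a made-up example value.
kb_to_mb 4194304    # prints 4096
```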
Can my machine run the 64-bit kernel?
64-bit hardware is required to run the 64-bit kernel.
What are the values of attributes for devices in my system?
To list the current values of the attributes for the tape device, rmt0, type:
#lsattr -l rmt0 -E
To list the default values of the attributes for the tape device, rmt0, type:
#lsattr -l rmt0 -D
To list the possible values of the login attribute for the TTY device, tty0, type:
#lsattr -l tty0 -a login -R
To display system level attributes, type:
#lsattr -E -l sys0
How many processors does my system have?
To display the number of processors on your system, type:
#lscfg | grep proc
How many hard disks does my system have and which ones are in use?
To display the number of hard disks on your system, type:
#lspv
How do I list information about a specific physical volume?
To find details about hdisk1, for example, run the following command:
#lspv hdisk1
How do I get a detailed configuration of my system?
Type the following:
#lscfg
The following options provide specific information:
-p
Displays platform-specific device information. The flag is applicable to AIX 4.2.1 or later.
-v
Displays the VPD (Vital Product Data) found in the customized VPD object class.
For example, to display details about the tape drive, rmt0, type:
#lscfg -vl rmt0
You can obtain very similar information by running the prtconf command.
How do I find out the chip type, system name, node name, model number, and so forth?
The uname command provides details about your system.
#uname -p
Displays the chip type of the system. For example, PowerPC.
#uname -r
Displays the release number of the operating system.
#uname -s
Displays the system name. For example, AIX.
#uname -n
Displays the name of the node.
#uname -a
Displays the system name, nodename, version, machine ID.
#uname -M
Displays the system model name. For example, IBM, 9114-275.
#uname -v
Displays the operating system version.
#uname -m
Displays the machine ID number of the hardware running the system.
#uname -u
Displays the system ID number.
AIX
What version, release, and maintenance level of AIX is running on my system?
Type one of the following:
#oslevel -r
#lslpp -h bos.rte
How can I determine which fileset updates are missing from a particular AIX level?
To determine which fileset updates are missing from 5300-04, for example, run the following command:
#oslevel -rl 5300-04
What SP (Service Pack) is installed on my system?
To see which SP is currently installed on the system, run the oslevel -s command. Sample output for an AIX 5L Version 5.3 system, with TL4, and SP2 installed would be:
#oslevel -s
5300-04-02
Is a CSP (Concluding Service Pack) installed on my system?
To see if a CSP is currently installed on the system, run the oslevel -s command. Sample output for an AIX 5L Version 5.3 system, with TL3, and CSP installed would be:
#oslevel -s
5300-03-CSP
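The level string printed by oslevel -s can be split into its release, technology level and service pack parts with shell parameter expansion. parse_oslevel is a hypothetical helper; it only handles the three-part form shown in the samples above.

```shell
#!/bin/sh
# Hypothetical helper: split a level such as 5300-04-02 into its parts.
parse_oslevel() {
    level=$1
    release=${level%%-*}    # text before the first dash
    rest=${level#*-}        # text after the first dash
    tl=${rest%%-*}          # technology level
    sp=${rest#*-}           # service pack (or CSP)
    echo "release=$release tl=$tl sp=$sp"
}

parse_oslevel 5300-04-02
parse_oslevel 5300-03-CSP
```

On a live system the argument would be "$(oslevel -s)".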
How do I create a file system?
The following command will create, within volume group testvg, a jfs file system of 10MB with mounting point /fs1:
#crfs -v jfs -g testvg -a size=10M -m /fs1
The following command will create, within volume group testvg, a jfs2 file system of 10MB with mounting point /fs2 and having read only permissions:
#crfs -v jfs2 -g testvg -a size=10M -p ro -m /fs2
How do I change the size of a file system?
To increase the /usr file system size by 1000000 512-byte blocks, type:
#chfs -a size=+1000000 /usr
Note:
In AIX 5.3, the size of a JFS2 file system can be shrunk as well.
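Because the size argument here is counted in 512-byte blocks, it helps to convert between blocks and megabytes before running chfs. The two helpers below are hypothetical names and do integer arithmetic only.

```shell
#!/bin/sh
# Hypothetical helpers: convert between 512-byte blocks and whole MB.
blocks_to_mb() {
    echo $(( $1 * 512 / 1048576 ))
}
mb_to_blocks() {
    echo $(( $1 * 1048576 / 512 ))
}

blocks_to_mb 1000000    # prints 488 (the +1000000 example is roughly 488 MB)
mb_to_blocks 10         # prints 20480
```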
How do I mount a CD?
Type the following:
#mount -V cdrfs -o ro /dev/cd0 /cdrom
How do I mount a file system?
The following command will mount file system /dev/fslv02 on the /test directory:
#mount /dev/fslv02 /test
How do I mount all default file systems (all standard file systems in the /etc/filesystems file marked by the mount=true attribute)?
The following command will mount all such file systems:
#mount {-a|all}
How do I unmount a file system?
Type the following command to unmount /test file system:
#umount /test
How do I display mounted file systems?
Type the following command to display information about all currently mounted file systems:
#mount
How do I remove a file system?
Type the following command to remove the /test file system:
#rmfs /test
Saturday, November 17, 2007
Mirror Write Consistency
system crash occurs during mirrored writes. The active method achieves this by logging
when a write occurs. LVM makes an update to the MWC log that identifies what areas of
the disk are being updated before performing the write of the data. Records of the last 62
distinct logical transfer groups (LTG) written to disk are kept in memory and also written to
a separate checkpoint area on disk (MWC log). This results in a performance degradation
during random writes.
With AIX V5.1 and later, there are now two ways of handling MWC:
• Active, the existing method
• Passive, the new method
What is an LVM hot spare?
volume missing due to write failures. It then starts the migration of data to the hot spare
disk.
Minimum hot spare requirements
The following is a list of minimal hot sparing requirements enforced by the operating
system.
- Spares are allocated and used by volume group
- Logical volumes must be mirrored
- All logical partitions on hot spare disks must be unallocated
- Hot spare disks must have at least equal capacity to the smallest disk already
in the volume group. Good practice dictates having enough hot spares to
cover your largest mirrored disk.
Hot spare policy
The chpv and the chvg commands are enhanced with a new -h argument. This allows you
to designate disks as hot spares in a volume group and to specify a policy to be used in the
case of failing disks.
The following four values are valid for the hot spare policy argument (-h):
Synchronization policy
There is a new -s argument for the chvg command that is used to specify synchronization
characteristics.
The following two values are valid for the synchronization argument (-s):
Examples
The following command marks hdisk1 as a hot spare disk:
# chpv -hy hdisk1
The following command sets an automatic migration policy which uses the smallest hot
spare that is large enough to replace the failing disk, and automatically tries to synchronize
stale partitions:
# chvg -hy -sy testvg
Hot spare policy argument (-h) values:
Argument          Description
y (lower case)    Automatically migrates partitions from one failing disk to one spare
                  disk. From the pool of hot spare disks, the smallest one which is big
                  enough to substitute for the failing disk will be used.
Y (upper case)    Automatically migrates partitions from a failing disk, but might use
                  the complete pool of hot spare disks.
n                 No automatic migration will take place. This is the default value for a
                  volume group.
r                 Removes all disks from the pool of hot spare disks for this volume
                  group.
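The selection rule for the lower-case y policy (smallest spare that is still big enough) can be sketched as follows. pick_spare is a hypothetical helper and the sizes are made-up numbers in GB; the real selection is done by LVM itself, so this only illustrates the rule.

```shell
#!/bin/sh
# Hypothetical helper: from a list of spare sizes, print the smallest one
# that is at least as large as the failing disk. Prints nothing and returns
# non-zero if no spare is big enough.
# Usage: pick_spare FAILING_SIZE SPARE_SIZE [SPARE_SIZE ...]
pick_spare() {
    need=$1
    shift
    best=""
    for size in "$@"; do
        if [ "$size" -ge "$need" ]; then
            if [ -z "$best" ] || [ "$size" -lt "$best" ]; then
                best=$size
            fi
        fi
    done
    [ -n "$best" ] && echo "$best"
}

# Failing disk of 70 GB, spares of 36, 146, 73 and 300 GB.
pick_spare 70 36 146 73 300    # prints 73
```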
Directories to monitor in AIX
more to view it and rm to clean it out.
/etc/security/failedlogin   Failed logins from users. Use the who command
                            to view the information. Use "cat /dev/null >
                            /etc/security/failedlogin" to empty it.
/var/adm/wtmp               All login accounting activity. Use the who
                            command to view it. Use "cat /dev/null >
                            /var/adm/wtmp" to empty it.
/etc/utmp                   Who has logged in to the system. Use the who
                            command to view it. Use "cat /dev/null >
                            /etc/utmp" to empty it.
/var/spool/lpd/qdir/*       Leftover queue requests
/var/spool/qdaemon/*        Temporary copies of spooled files
/var/spool/*                Spooling directory
smit.log                    smit log file of activity
smit.script                 smit log
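Watching these files usually comes down to checking their size from time to time. The sketch below is one possible way to do it: is_over_limit is a hypothetical helper, and the 1 MB threshold is an arbitrary example rather than a recommendation from this list. Files that do not exist are skipped silently.

```shell
#!/bin/sh
# Hypothetical helper: succeed if FILE exists and is larger than LIMIT bytes.
# Usage: is_over_limit FILE LIMIT
is_over_limit() {
    file=$1
    limit=$2
    [ -f "$file" ] || return 1
    # The arithmetic expansion strips any padding wc may add.
    size=$(( $(wc -c < "$file" 2>/dev/null) ))
    [ "$size" -gt "$limit" ]
}

# Report any of the login accounting files that have grown past 1 MB.
for f in /etc/security/failedlogin /var/adm/wtmp /etc/utmp; do
    if is_over_limit "$f" 1048576; then
        echo "$f is larger than 1 MB"
    fi
done
```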
AIX command
List the licensed program products        lslpp -L
List the defined devices                  lsdev -C -H
List the disk drives on the system        lsdev -Cc disk
List the memory on the system             lsdev -Cc memory (MCA)
List the memory on the system             lsattr -El sys0 -a realmem (PCI)
                                          lsattr -El mem0
List system resources                     lsattr -EHl sys0
List the VPD (Vital Product Data)         lscfg -v
Document the tty setup                    lscfg or smit screen capture F8
Document the print queues                 qchk -A
Document disk Physical Volumes (PVs)      lspv
Document Logical Volumes (LVs)            lslv
Document Volume Groups (long list)        lsvg -l vgname
Document Physical Volumes (long list)     lspv -l pvname
Document File Systems                     lsfs fsname
                                          /etc/filesystems
Document disk allocation                  df
Document mounted file systems             mount
Document paging space (70 - 30 rule)      lsps -a
Document paging space activation          /etc/swapspaces
Document users on the system              /etc/passwd
                                          lsuser -a id home ALL
Document users attributes                 /etc/security/user
Document users limits                     /etc/security/limits
Document users environments               /etc/security/environ
Document login settings (login herald)    /etc/security/login.cfg
Document valid group attributes           /etc/group
                                          lsgroup ALL
Document system wide profile              /etc/profile
Document system wide environment          /etc/environment
Document cron jobs                        /var/spool/cron/crontabs/*
Document skulker changes if used          /usr/sbin/skulker
Document system startup file              /etc/inittab
Document the hostnames                    /etc/hosts
Document network printing                 /etc/hosts.lpd
Document remote login host authority      /etc/hosts.equiv
Thursday, November 1, 2007
IBM System p 570 with POWER 6
* Building block architecture delivers flexible scalability and modular growth
* Advanced virtualization features facilitate highly efficient systems utilization
* Enhanced RAS features enable improved application availability
The IBM POWER6 processor-based System p™ 570 mid-range server delivers outstanding price/performance, mainframe-inspired reliability and availability features, flexible capacity upgrades and innovative virtualization technologies. This powerful 19-inch rack-mount system, which can handle up to 16 POWER6 cores, can be used for database and application serving, as well as server consolidation. The modular p570 is designed to continue the tradition of its predecessor, the IBM POWER5+™ processor-based System p5™ 570 server, for resource optimization, secure and dependable performance and the flexibility to change with business needs. Clients have the ability to upgrade their current p5-570 servers and know that their investment in IBM Power Architecture™ technology has again been rewarded.
The p570 is the first server designed with POWER6 processors, resulting in performance and price/performance advantages while ushering in a new era in the virtualization and availability of UNIX® and Linux® data centers. POWER6 processors can run 64-bit applications, while concurrently supporting 32-bit applications to enhance flexibility. They feature simultaneous multithreading,1 allowing two application “threads” to be run at the same time, which can significantly reduce the time to complete tasks.
The p570 system is more than an evolution of technology wrapped into a familiar package; it is the result of “thinking outside the box.” IBM’s modular symmetric multiprocessor (SMP) architecture means that the system is constructed using 4-core building blocks. This design allows clients to start with what they need and grow by adding additional building blocks, all without disruption to the base system.2 Optional Capacity on Demand features allow the activation of dormant processor power for times as short as one minute. Clients may start small and grow with systems designed for continuous application availability.
Specifically, the System p 570 server provides:
Common features
* 19-inch rack-mount packaging
* 2- to 16-core SMP design with building block architecture
* 64-bit 3.5, 4.2 or 4.7 GHz POWER6 processor cores
* Mainframe-inspired RAS features
* Dynamic LPAR support
* Advanced POWER Virtualization1 (option)
o IBM Micro-Partitioning™ (up to 160 micro-partitions)
o Shared processor pool
o Virtual I/O Server
o Partition Mobility2
* Up to 32 optional I/O drawers
* IBM HACMP™ software support for near continuous operation*
* Supported by AIX 5L (V5.2 or later) and Linux® distributions from Red Hat (RHEL 4 Update 5 or later) and SUSE Linux (SLES 10 SP1 or later) operating systems
Hardware summary
* 4U 19-inch rack-mount packaging
* One to four building blocks
* Two, four, eight, 12 or 16 3.5 GHz, 4.2 GHz or 4.7 GHz 64-bit POWER6 processor cores
* L2 cache: 8 MB to 64 MB (2- to 16-core)
* L3 cache: 32 MB to 256 MB (2- to 16-core)
* 2 GB to 192 GB of 667 MHz buffered DDR2 or 16 GB to 384 GB of 533 MHz buffered DDR2 or 32 GB to 768 GB of 400 MHz buffered DDR2 memory3
* Four hot-plug, blind-swap PCI Express 8x and two hot-plug, blind-swap PCI-X DDR adapter slots per building block
* Six hot-swappable SAS disk bays per building block provide up to 7.2 TB of internal disk storage
* Optional I/O drawers may add up to an additional 188 PCI-X slots and up to 240 disk bays (72 TB additional)4
* One SAS disk controller per building block (internal)
* One integrated dual-port Gigabit Ethernet per building block standard; One quad-port Gigabit Ethernet per building block available as optional upgrade; One dual-port 10 Gigabit Ethernet per building block available as optional upgrade
* Two GX I/O expansion adapter slots
* One dual-port USB per building block
* Two HMC ports (maximum of two), two SPCN ports per building block
* One optional hot-plug media bay per building block
* Redundant service processor for multiple building block systems2
IBM System p5 570
* IBM Advanced POWER™ Virtualization features increase system utilization and reduce the number of overall systems required
* Capacity on Demand features enable quick response to spikes in processing requirements
The IBM System p5 570 mid-range server is a powerful 19-inch rack-mount system that can be used for database and application serving, as well as server consolidation. IBM’s modular symmetric multiprocessor (SMP) architecture means you can start with a 2-core system and easily add additional building blocks when needed for more processing power (up to 16 cores), I/O and storage capacity. The p5-570 includes IBM mainframe-inspired reliability, availability and serviceability features.
The System p5 570 server is designed to be a cost-effective, flexible server for the on demand environment. Innovative virtualization technologies and Capacity on Demand (CoD) options help increase the responsiveness of the server to variable computing demands. These features also help increase the systems utilization of processors and system components allowing businesses to meet their computing requirements with a smaller system. By combining IBM’s most advanced leading-edge technology for enterprise-class performance and flexible adaptation to changing market conditions, the p5-570 can deliver the key capabilities medium-sized companies need to survive in today’s highly competitive world.
Specifically, the System p5 570 server provides:
Common features
* 19-inch rack-mount packaging
* 2- to 16-core SMP design with unique building block architecture
* 64-bit 1.9 or 2.2 GHz POWER5+ processor cores
* Mainframe-inspired RAS features
* Dynamic LPAR support
* Advanced POWER Virtualization1 (option)
o IBM Micro-Partitioning™ (up to 160 micro-partitions)
o Shared processor pool
o Virtual I/O Server
o Partition Load Manager (IBM AIX 5L™ only)
* Up to 20 optional I/O drawers
* IBM HACMP™ software support for near continuous operation*
* Supported by AIX 5L (V5.2 or later) and Linux® distributions from Red Hat (RHEL AS 4 or later) and SUSE Linux (SLES 9 or later) operating systems
* System Cluster 1600 support with Cluster Systems Management software*
Hardware summary
* 4U 19-inch rack-mount packaging
* One to four building blocks
* Two, four, eight, 12 or 16 1.9 GHz or 2.2 GHz 64-bit POWER5+ processor cores
* L2 cache: 1.9MB to 15.2MB (2- to 16-core)
* L3 cache: 36MB to 288MB (2- to 16-core)
* 1.9 GHz systems: 2GB to 256GB of 533 MHz DDR2 memory; 2.2 GHz systems: 2GB to 256GB of 533 MHz or 32GB to 512GB of 400 MHz DDR2 memory
* Six hot-plug PCI-X adapter slots per building block
* Six hot-swappable disk bays per building block provide up to 7.2TB of internal disk storage
* Optional I/O drawers may add up to an additional 139 PCI-X slots (for a maximum of 163) and 240 disk bays (72TB additional)
* Dual channel Ultra320 SCSI controller per building block (internal; RAID optional)
* One integrated 2-port 10/100/1000 Ethernet per building block
* Optional 2 Gigabit Fibre Channel, 10 Gigabit Ethernet and 4x GX adapters
* One 2-port USB per building block
* Two HMC, two system ports
* Two hot-plug media bays per building block
BASH CHEAT SHEETS
Checking strings:
s1 = s2 Check if s1 equals s2.
s1 != s2 Check if s1 is not equal to s2.
-z s1 Check if s1 has size 0.
-n s1 Check if s1 has nonzero size.
s1 Check if s1 is not the empty string.
Example:
if [ "$myvar" = "hello" ] ; then
    echo "We have a match"
fi
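The -z and = checks from the table above can be combined as in the following minimal sketch. The function name describe_string is hypothetical and used only for illustration. Quoting "$1" keeps the test valid when the value is empty or contains spaces.

```shell
# describe_string is a hypothetical helper, not part of the original cheat sheet.
describe_string() {
    if [ -z "$1" ]; then
        echo "empty"            # the string has size 0
    elif [ "$1" = "hello" ]; then
        echo "match"            # the string equals hello
    else
        echo "other"
    fi
}

describe_string ""        # prints: empty
describe_string "hello"   # prints: match
describe_string "a b"     # prints: other
```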
Checking numbers:
Note that a shell variable could contain a string that represents a number. If you want to check the numerical value use one of the following:
n1 -eq n2 Check to see if n1 equals n2.
n1 -ne n2 Check to see if n1 is not equal to n2.
n1 -lt n2 Check to see if n1 < n2.
n1 -le n2 Check to see if n1 <= n2.
n1 -gt n2 Check to see if n1 > n2.
n1 -ge n2 Check to see if n1 >= n2.
Example:
if [ $# -gt 1 ]
then
    echo "ERROR: should have 0 or 1 command-line parameters"
fi
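To see why the numeric operators matter, here is a small sketch comparing the same two values as strings and as numbers. The variable names are made up for the illustration.

```shell
a="007"
b="7"

# = compares the characters, so 007 and 7 differ.
if [ "$a" = "$b" ]; then
    string_result="same string"
else
    string_result="different strings"
fi

# -eq compares the numerical values, so 007 and 7 are equal.
if [ "$a" -eq "$b" ]; then
    number_result="same number"
else
    number_result="different numbers"
fi

echo "$string_result"   # prints: different strings
echo "$number_result"   # prints: same number
```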
Boolean operators:
! not
-a and
-o or
Example:
if [ "$num" -lt 10 -o "$num" -gt 100 ]
then
    echo "Number $num is out of range"
elif [ ! -w "$filename" ]
then
    echo "Cannot write to $filename"
fi
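As an alternative to -a and -o inside a single test, separate tests can be joined with the shell's own && and || operators. The sketch below assumes a hypothetical helper named in_range that succeeds when its argument lies between 10 and 100 inclusive.

```shell
# in_range is a hypothetical helper: succeeds when 10 <= $1 <= 100.
in_range() {
    [ "$1" -ge 10 ] && [ "$1" -le 100 ]
}

for n in 5 10 100 250
do
    if in_range "$n"
    then
        echo "$n in range"
    else
        echo "$n out of range"
    fi
done
```

This prints "out of range" for 5 and 250 and "in range" for 10 and 100.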
Note that ifs can be nested. For example:
if [ "$myvar" = "y" ]
then
    echo "Enter count of number of items"
    read num
    if [ "$num" -le 0 ]
    then
        echo "Invalid count of $num was given"
    else
        : #... do whatever ... (the : no-op keeps this branch from being empty)
    fi
fi
The above example also illustrates the use of read to read a line from the keyboard and place it into a shell variable. Also note that most UNIX commands return an exit status to indicate whether they succeeded: 0 means success (true) and any nonzero value means failure (false). In bash this value is held in the special parameter $? (csh-style shells use the variable status instead). At the command line, check it with echo $?. In a shell script, use something like this:
if grep -q shell bshellref
then
    echo "true"
else
    echo "false"
fi
Note that -q is the quiet version of grep. It just checks whether it is true that the string shell occurs in the file bshellref. It does not print the matching lines like grep would otherwise do.
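The exit status can also be captured in a variable for later use. This is a minimal sketch using a temporary scratch file instead of bshellref; the file and variable names are assumptions made for the demonstration.

```shell
# Hypothetical scratch file used only for this demonstration.
tmpfile=$(mktemp)
echo "the bash shell" > "$tmpfile"

grep -q shell "$tmpfile"
found_status=$?                    # 0: the pattern was found

missing_status=0
grep -q csh "$tmpfile" || missing_status=$?   # 1: the pattern was not found

echo "found=$found_status missing=$missing_status"   # prints: found=0 missing=1
rm -f "$tmpfile"
```

Read $? immediately after the command of interest, since every later command overwrites it.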
I/O Redirection:
pgm > file Output of pgm is redirected to file.
pgm < file Program pgm reads its input from file.
pgm >> file Output of pgm is appended to file.
pgm1 | pgm2 Output of pgm1 is piped into pgm2 as the input to pgm2.
n>file Output from stream with descriptor n redirected to file (no space between n and >, or n is taken as an argument).
n>>file Output from stream with descriptor n appended to file.
n>&m Merge output from stream n with stream m.
n<&m Merge input from stream n with stream m.
Note that file descriptor 0 is normally standard input, 1 is standard output, and 2 is standard error output.
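The following sketch exercises several of these redirections on descriptors 1 and 2. The function report and the scratch directory are hypothetical and exist only for the demonstration.

```shell
workdir=$(mktemp -d)     # hypothetical scratch directory

# A function that writes one line to stdout and one line to stderr.
report() {
    echo "normal output"
    echo "error output" >&2
}

report > "$workdir/out.log" 2> "$workdir/err.log"   # split the two streams
report >> "$workdir/out.log" 2> /dev/null           # append stdout, discard stderr
report > "$workdir/both.log" 2>&1                   # merge stderr into stdout

# Arithmetic expansion strips any padding that wc adds on some systems.
out_lines=$(( $(wc -l < "$workdir/out.log") ))
both_lines=$(( $(wc -l < "$workdir/both.log") ))
echo "out.log lines: $out_lines, both.log lines: $both_lines"
rm -rf "$workdir"
```

Both files end up with two lines: out.log holds two appended copies of the normal output, while both.log holds one line from each stream. The order matters in the last command: 2>&1 must come after the redirection of stdout.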
Shell Built-in Variables:
$0 Name of this shell script itself.
$1 Value of first command line parameter (similarly $2, $3, etc)
$# In a shell script, the number of command line parameters.
$* All of the command line parameters.
$- Options given to the shell.
$? Return the exit status of the last command.
$$ Process id of script (really id of the shell running the script)
Pattern Matching:
* Matches 0 or more characters.
? Matches 1 character.
[AaBbCc] Example: matches any 1 char from the list.
[^RGB] Example: matches any 1 char not in the list.
[a-g] Example: matches any 1 char from this range.
Quoting:
\c Take character c literally.
`cmd` Run cmd and replace it in the line of code with its output.
"whatever" Take whatever literally, after first interpreting $, `...`, \
'whatever' Take whatever absolutely literally.
Example:
match=`ls *.bak` # Puts names of .bak files into shell variable match.
echo \* # Echoes * to the screen, not all filenames as echo * would.
echo '$1$2hello' # Writes literally $1$2hello on screen.
echo "$1$2hello" # Writes value of parameters 1 and 2 and string hello.
Grouping:
Parentheses may be used for grouping, but must be preceded by backslashes
since parentheses normally have a different meaning to the shell (namely
to run a command or commands in a subshell). For example, you might use:
if test \( -r "$file1" -a -r "$file2" \) -o \( -r "$1" -a -r "$2" \)
then
    : #do whatever
fi
Case statement:
Here is an example that looks for a match with one of the characters a, b, c. If $1 fails to match these, it always matches the * case. A case statement can also use more advanced pattern matching.
case "$1" in
a) cmd1 ;;
b) cmd2 ;;
c) cmd3 ;;
*) cmd4 ;;
esac
Shell Arithmetic:
In the original Bourne shell arithmetic is done using the expr command as in:
result=`expr $1 + 2`
result2=`expr $2 + $1 / 2`
result=`expr $2 \* 5` #note the \ on the * symbol
With bash, an expression can be enclosed in $[ ] (an older form; $(( )) is the current equivalent) and can use the following operators, in order of precedence:
* / % (times, divide, remainder)
+ - (add, subtract)
< > <= >= (the obvious comparison operators)
== != (equal to, not equal to)
&& (logical and)
|| (logical or)
= (assignment)
Arithmetic is done using long integers.
Example:
result=$[$1 + 3]
In this example we take the value of the first parameter, add 3, and place the sum into result.
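The same calculations can be written with $(( )), which bash documents as the replacement for the older $[ ] form. This is a sketch only; set -- is used here to simulate command-line parameters, and the values 7 and 4 are arbitrary.

```shell
set -- 7 4                 # simulate positional parameters $1=7 and $2=4

sum=$(( $1 + 3 ))          # 7 + 3 = 10
mixed=$(( $2 + $1 / 2 ))   # / binds tighter than + and truncates: 4 + 3 = 7
product=$(( $2 * 5 ))      # 20; no backslash is needed on * inside $(( ))

echo "$sum $mixed $product"   # prints: 10 7 20
```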
Order of Interpretation:
The bash shell carries out its various types of interpretation for each line in the following order:
brace expansion (see a reference book)
~ expansion (for login ids)
parameters (such as $1)
variables (such as $var)
command substitution (Example: match=`grep DNS *` )
arithmetic (from left to right)
word splitting
pathname expansion (using *, ?, and [abc] )
Other Shell Features:
$var Value of shell variable var.
${var}abc Example: value of shell variable var with string abc appended.
# At start of line, indicates a comment.
var=value Assign the string value to shell variable var.
cmd1 && cmd2 Run cmd1, then if cmd1 successful run cmd2, otherwise skip.
cmd1 || cmd2 Run cmd1, then if cmd1 not successful run cmd2, otherwise skip.
cmd1; cmd2 Do cmd1 and then cmd2.
cmd1 & cmd2 Start cmd1 in the background, then run cmd2 without waiting for cmd1 to finish.
(cmds) Run cmds (commands) in a subshell.
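A short sketch of &&, || and a subshell working together. The scratch directory and variable names are hypothetical, chosen so the demonstration leaves nothing behind.

```shell
# Hypothetical scratch directory for the demonstration.
scratch=$(mktemp -d)
start_dir=$PWD

# mkdir succeeds, so the command after && runs.
mkdir "$scratch/data" && made="yes"

# The directory already exists, so mkdir fails and the command after || runs.
mkdir "$scratch/data" 2> /dev/null || again="failed"

# The cd takes effect only inside the subshell.
(cd "$scratch/data" && touch note.txt)
[ "$PWD" = "$start_dir" ] && unchanged="yes"

echo "made=$made again=$again unchanged=$unchanged"
# prints: made=yes again=failed unchanged=yes
rm -rf "$scratch"
```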