First Created: May 04, 2019 - rajpat@us.ibm.com
Last updated - May 17, 2019 - rajpat@us.ibm.com
Last updated - Jun 07, 2019 - rajpat@us.ibm.com
Last updated - Oct 14, 2019 - rajpat@us.ibm.com
Last updated - Jun 01, 2021 - rajpat@us.ibm.com
*** NOTE: Contains dump, perf, devscan ( AIX, Linux, and IBMi Client )
*** NOTE: Highlight only the items from this complete list that the customer needs to capture.
*** If using dual HMCs, provide data from both HMCs.
High level problem description and data collection required.
1) Problem description.
2) VIO
3) Client ( AIX, Linux, IBMi )
4) HMC pedbg ( If using dual HMCs, provide data from both HMCs ) *** SEE HMC cleanup, for very old logs ***
5) RSCT
6) System Firmware Resource and Platform Dump
7) Fabric switch ( For NPIV configurations )
8) Devscan ( For NPIV configurations )
9) Additional data from HMC for NPIV config.
10) Checking network performance using open source IPERF ( executable can also be downloaded from this link )
11) In case of hang or SRC 2005
12) Requirements for PowerVC
13) FTP data. For Blue Diamond, follow the BD steps.
14) HMC Cleanup ( When there are too many files)
=============== Start ===================
1) Problem description
a) Make sure the date and time on the HMC, VIOS, client LPARs, etc. are all in sync.
b) Is this using single or dual HMC ?
c) Are you using GUI or CLI ?
d) Are these single or concurrent LPMs?
e) Are you using LPM toolkit ?
f) Is this a test or production ?
g) Is this a new configuration ?
h) When was this last working ?
i) Did you make changes to system firmware, vio level, hmc level, client lpar, network, switches etc ?
j) ** Provide complete date and time of error and any screen shots. **
k) If the LPM is hung or slow, how long has it been running, and what SRC codes are shown on the HMC for the source and destination frames ?
l) Is this using PowerVC ? If so, provide PowerVC logs ( see item 12 for PowerVC ).
2) VIO snaps from both VIOs on source and both VIOS on target. Rename to indicate source and target.
$ snap ( run from padmin. Creates file /home/padmin/snap.pax.Z)
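The renaming step above can be sketched as a small helper. This is only an illustration: the function name snap_label and the naming pattern are assumptions, not an IBM convention, and it is meant to run wherever the four snap files are gathered.

```shell
#!/bin/sh
# Hypothetical helper: build a snap file name that records the frame role
# (source or target) and the VIOS name. The naming pattern is an assumption.
snap_label() {
    role=$1      # "source" or "target"
    vios=$2      # VIOS name, for example vio1
    case "$role" in
        source|target) ;;
        *) echo "role must be source or target" >&2; return 1 ;;
    esac
    printf 'snap.%s.%s.pax.Z\n' "$role" "$vios"
}

# Example use after collecting the snap from the first source VIOS:
#   mv snap.pax.Z "$(snap_label source vio1)"
snap_label source vio1
```

With four snaps ( two source VIOS, two target VIOS ), distinct names avoid one snap.pax.Z overwriting another when the files are uploaded together.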
3a) AIX client snap
# snap -r ( clear old snap )
# snap -ac ( data in /tmp/ibmsupt/snap.pax.Z)
3b) Linux snap ( sosreport requires root permissions to run )
# sosreport ( data in /var/tmp )
3c) IBMi Client: IBMi MustGather (QMGTOOLS), which the customer should update, and then run the IBMi SYSSNAP
4) HMC pedbg:
In short CLI command:
# pedbg -c -q 4;
- Answer YES when prompted to collect ctsnap.
( Note: if there are RSCT problems, ctsnap can hang. )
- The file created is
"HSClogsXXzz2007zzzzzz.zip", in the /dump directory.
- Copy the file to the common directory for FTP to IBM.
# scp /dump/HSClogsXXzz2007zzzzzz.zip {user}@{remote_host}:.
(NOTE: the ":." means keep the same name on the remote host;
do not forget the ":." at the end of the line )
- Rename the file to include the PMR number
# mv HSClogsXXzz2007zzzzzz.zip PMRno.ZZZ.000.pedbg.zip
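Because the HSClogs file name carries a timestamp, it can help to pick the newest one and build the renamed file name in one place. The sketch below assumes the zip has already been copied to a working directory; both helper names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the most recently modified HSClogs*.zip
# in the given directory (defaults to /dump).
newest_hsclogs() {
    dir=${1:-/dump}
    # ls -t sorts by modification time, newest first
    ls -t "$dir"/HSClogs*.zip 2>/dev/null | head -n 1
}

# Hypothetical helper: build the renamed file name from a case identifier,
# following the PMRno.ZZZ.000.pedbg.zip pattern shown above.
pedbg_name() {
    printf '%s.pedbg.zip\n' "$1"
}

# Example use:
#   mv "$(newest_hsclogs .)" "$(pedbg_name PMRno.ZZZ.000)"
```

newest_hsclogs prints nothing when no matching file exists, so check for an empty result before renaming.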
5) RSCT
- From VIO Servers within oem_setup_env ( Included in snap as of VIOS 2.2.2.1, but the steps below may still be needed for RMC / RSCT )
$ oem_setup_env
# ctsnap -x runrpttr - This will create => /tmp/ctsupt/ctsnap*.tar.gz
- From AIX lpar:
# ctsnap -x runrpttr - This will create => /tmp/ctsupt/ctsnap*.tar.gz
- From Linux lpar:
# ctsnap -x runrpttr - This will create => /tmp/ctsupt/ctsnap*.tar.gz
- From HMC ( Requires pesh password ) - pedbg option 7
# ctsnap -x runrpttr
6) System Firmware non-disruptive Resource and Platform Dump FOR BOTH Source and Target Managed Systems:
If using dual HMCs, provide from both HMCs.
CLI:
a) Access HMC restricted shell
To get {managed_system}:
# lssyscfg -r sys -F name
b) startdump -t resource -m {source_managed_system} -r "system"
c) startdump -t resource -m {target_managed_system} -r "system"
d) This creates a file in /dump/SYSDUMP.SRLNMBR.DUMPNMBR.TIMESTAMP....
e) Rename the file to indicate whether it is for the source_managed_system or the target_managed_system
7) Fabric switch ( For NPIV configurations )
Switch Logs for Cisco & Brocade:
- Switch logs ("show tech-support details" for Cisco)
- 'show tech detail' if more than 1 switch.
- 'supportshow' collected via CLI using either
HyperTerm or PuTTY to collect the output for Brocade.
- 'supportsave' as that has additional debug information.
- How are the RSCN ( Registered State Change Notification )
events sent when a zoning change is made on the switch?
Switch Logs for McData:
- "data collection" from the switch management console, EFCM.
- Any other logs similar to those described above for Cisco / Brocade.
Switch Logs using iSCSI TOE:
- igroup show
- lun show -m
** For FCOE Types also include: **
- "show tech-support fc"
8) Devscan ( For NPIV configurations )
From the HMC get the Live Partition Mobility WWPNS of client lpar.
# lssyscfg -r sys -F name ( To get {managed_system} )
# lssyscfg -r prof -m {source_managed_system} -F name,virtual_fc_adapters
From target VIO1, which currently has the inactive WWPN
$ oem_setup_env
# script /tmp/devscan_vio1_inactive_wwpn_src.log
# devscan -t f -n [wwpn_inactive_lowercase] ( -t and f may be optional )
# devscan -t f -n [wwpn_inactive_lowercase] --dev=fcxx ( -t and f may be optional )
# exit
From target VIO2, which currently has the inactive WWPN
$ oem_setup_env
# script /tmp/devscan_vio2_inactive_wwpn_src.log
# devscan -t f -n [wwpn_inactive_lowercase] ( -t and f may be optional )
# devscan -t f -n [wwpn_inactive_lowercase] --dev=fcxx ( -t and f may be optional )
# exit
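The devscan commands above take the WWPN in lowercase. A minimal sketch for normalizing a WWPN before pasting it into devscan -n follows; the helper name is hypothetical, and stripping colon separators is an assumption about how the WWPN was copied.

```shell
#!/bin/sh
# Hypothetical helper: lowercase a WWPN and remove colon separators,
# then check that exactly 16 hex digits remain.
normalize_wwpn() {
    wwpn=$(printf '%s' "$1" | tr -d ':' | tr 'A-F' 'a-f')
    case "$wwpn" in
        *[!0-9a-f]*) echo "not a hex WWPN: $1" >&2; return 1 ;;
    esac
    if [ "${#wwpn}" -ne 16 ]; then
        echo "expected 16 hex digits: $1" >&2
        return 1
    fi
    printf '%s\n' "$wwpn"
}

# The WWPN below is a made-up example value.
normalize_wwpn 'C0:50:76:04:31:67:00:10'
```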
From AIX and Linux Client LPAR:
Run devscan on the moving LPAR, and compare the output with the output from the same command on the Virtual I/O Server:
# devscan --dev=fscsi0 --concise | awk -F '|' '{print $2}' | sort -n | uniq
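One way to do the comparison, assuming the output of the pipeline above has been saved to one file on the client and one file on the VIOS, is to list the entries the client sees that the VIOS does not. The helper name and the file names in the example are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print entries present in the client list but
# missing from the VIOS list. Both inputs are re-sorted lexically
# because comm requires sorted input.
only_on_client() {
    client_list=$1
    vios_list=$2
    tmp_c="${TMPDIR:-/tmp}/devscan_client.$$"
    tmp_v="${TMPDIR:-/tmp}/devscan_vios.$$"
    sort -u "$client_list" > "$tmp_c"
    sort -u "$vios_list" > "$tmp_v"
    # comm -23 keeps only lines unique to the first file
    comm -23 "$tmp_c" "$tmp_v"
    rm -f "$tmp_c" "$tmp_v"
}

# Example use:
#   only_on_client devscan_client.txt devscan_vio1_target.txt
```

An empty result suggests the destination VIOS can see every target the client sees; any line printed is a target to investigate in zoning or LUN masking.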
Complete Devscan Client to Destination VIOs Checks - Step By Step ( AIX & Linux ):
9) Additional data from HMC for NPIV config for AIX and Linux Clients. If using dual HMCs, provide data from both HMCs.
To get {managed_system}:
# lssyscfg -r sys -F name
To get lsnportlogin from HMC:
# lsnportlogin -m {managed_system} --filter lpar_names={name_of_client_lpar}
To get complete listing from HMC
# lssyscfg -r sys -F name ( to get managed_system )
# lshmc -v
# lshmc -V
# lshwres -r virtualio --rsubtype scsi -m {managed_system} --level lpar
# lshwres -r virtualio -m {managed_system} --rsubtype fc --level sys
# lshwres -r virtualio -m {managed_system} --rsubtype fc --level lpar
** To get complete listing of wwpn,vfchost,fcs mapping ** ( make sure it is entered as a single-line command )
# lshwres -r virtualio --rsubtype fc --level lpar -m Rack101-8286-42A-SN219319V -F lpar_name,lpar_id,slot_num,adapter_type,state,is_required,remote_lpar_id,remote_lpar_name,remote_slot_num,wwpns,topology
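If the listing above is saved to a file, the WWPNs can be pulled out for use with devscan. The sketch below simply extracts every run of 16 hex digits; it assumes a grep that supports -o ( GNU grep does ), the helper name is hypothetical, and the sample line is invented for illustration rather than copied from a real HMC.

```shell
#!/bin/sh
# Hypothetical helper: read lshwres output on stdin and print each
# 16-hex-digit WWPN once, in lowercase.
extract_wwpns() {
    grep -oE '[0-9A-Fa-f]{16}' | tr 'A-F' 'a-f' | sort -u
}

# Invented sample line; real output depends on the -F field list used.
sample='lpar1,5,3,client,1,1,1,vios1,10,"C050760431670010,C050760431670011",'
printf '%s\n' "$sample" | extract_wwpns
```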
10a) Performance related: IPERF, VMSTAT and LPARSTAT:
Enable stats on HMC ( Menu selection to get to this may vary depending on HMC levels )
From HMC activate Performance Information Collection
- Right click on the specific LPAR
- Properties
- Hardware
- Processors
- Allow performance information collection
10b) VMSTAT and LPARSTAT ( Collect during problem )
vmstat -It 5 200 | tee vmstat_vio1_src.log &
vmstat -It 5 200 | tee vmstat_vio2_src.log &
vmstat -It 5 200 | tee vmstat_vio1_target.log &
vmstat -It 5 200 | tee vmstat_vio2_target.log &
lparstat -ht 5 200 > lparstat_vio1_src.log &
lparstat -ht 5 200 > lparstat_vio2_src.log &
lparstat -ht 5 200 > lparstat_vio1_target.log &
lparstat -ht 5 200 > lparstat_vio2_target.log &
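Each VIOS runs only its own pair of the commands above. With a 5 second interval and 200 samples, each collection runs for 5 x 200 = 1000 seconds, about 17 minutes. A minimal sketch that prints the pair for a given label, so the log names stay consistent, follows; the helper name is hypothetical and it only prints the commands, it does not run them.

```shell
#!/bin/sh
# Hypothetical helper: print the vmstat and lparstat commands for one
# VIOS label (for example vio1_src), with optional interval and count.
stats_commands() {
    label=$1
    interval=${2:-5}
    count=${3:-200}
    printf 'vmstat -It %s %s | tee vmstat_%s.log &\n' "$interval" "$count" "$label"
    printf 'lparstat -ht %s %s > lparstat_%s.log &\n' "$interval" "$count" "$label"
}

stats_commands vio1_src
```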
10c) IPERF ( Basic info shown below; for full details refer to the attached iperf_Instructions.txt )
I) Start the iperf test at the other end as a server ( Target MSP )
./iperf -s -P 4 --> Starts the server
II) At the problem vios start the iperf as a client ( Source MSP )
./iperf -c {target_msp_ip} -P 4 -t 10 -w 240k
III) Start the iptrace at both ends as well as at the switch.
On the aix VIOS
# startsrc -s iptrace -a " -a /big_directory/iptrc.bin "
Wait for 20 seconds.
REPEAT THE ABOVE IN BOTH DIRECTIONS BY SWAPPING THE SERVER AND CLIENT, AND CAPTURE ALL STDOUT INCLUDING THE SUM AT THE END.
THIS MAY NEED TO BE RUN AT VARIOUS TIMES IN CASE OF IRREGULAR / RANDOM NETWORK BEHAVIOUR.
Repeat the above several times to make sure network throughput is good.
11) For SRC 2005 Client hangs and dumps refer to below links:
12) LPM through PowerVC: Above items apply in addition to PowerVC data.
PowerVC techdoc
13) FTP the file to IBM:
ftp testcase.software.ibm.com,
login: anonymous,
passwd: your email address,
ftp> cd /toibm/aix
ftp> bin
ftp> put (salesforce_case_number.pax.gz)
ftp> quit
For Blue Diamond: Registration Link: Blue Diamond Registration
14) HMC Cleanup ( When there are too many files)
# chhmcfs -o f -d 0
HMC cleanup
====================== End ================