This document describes the authorized program analysis reports (APARs) resolved in IBM Spectrum Scale 5.1.7.x releases.
This document was last updated on 13th April, 2023.
APAR | Severity | Description | Resolved in | Feature Tags
---|---|---|---|---
IJ45803 | Medium Importance | Spectrum Scale and the system health monitor (sysmon) start independently after a node reboot. During initialization, Spectrum Scale checks whether all declared NFS exports are available. The sysmon configuration has the flag "preventnfsstartuponmissingfs" enabled, so NFS was expected not to start if a required file system is unmounted. In fact, NFS was started anyway. | 5.1.7.1 | System Health
IJ45804 | High Importance | While online mmfsckx is in progress, if a user performs I/O on a file or directory whose inode number is larger than a 32-bit integer, the node can in some cases assert with LOGASSERT(i64 == INVALID_INODE_NUMBER || (i64 & 0xFFFFFFFF00000000ULL) == 0) | 5.1.7.1 | FSCKX
IJ45590 | High Importance | With File Audit Logging (FAL) enabled, when kxGanesha op 112 (GET_XSTAT) is handled, the NFS client IP is malloc'ed and inserted into a table by the current Ganesha thread for use by FAL. Freeing the IP is left to the close-file routine. However, that routine is called by a different thread and not immediately after the kxGanesha op 112 call. As a result, the IP remains in the table and is not freed, leading to memory leaks and eventual memory exhaustion. | 5.1.7.1 | NFS and File Audit Logging
IJ45609 | Critical | Due to an issue in offline fsck (mmfsck), it can report false-positive lost blocks and can fail to properly report genuinely incorrect blocks and duplicates. | 5.1.7.1 | FSCK
IJ46129 | High Importance | Daemon asserts when an AFM fileset is unlinked with junction path. | 5.1.7.1 | AFM
IJ45806 | High Importance | There is a peculiar case where the local bit on the .ptrash directory inside AFM filesets gets reset. The .ptrash directory is then treated like a normal directory, and in write modes the temporary files generated for the recovery/resync policy start being replicated to the remote site. In read modes, the .ptrash directory shows up as a dangling entry because a normal lookup is sent to home, and since .ptrash does not have remote attributes, the lookup fails. This also causes errors when the user tries to empty .ptrash with rm -rf, because the lookups to the remote site do not succeed. | 5.1.7.1 | AFM
IJ45805 | High Importance | The command to start SMB traces, "mmprotocoltrace start smb -c <ip address>", failed with the error message "/tmp/mmfs: No such file or directory". The corresponding log file /var/adm/ras/mmprotocoltrace.log shows error messages for the failing command but gives no details about the cause. | 5.1.7.1 | System Health
IJ46130 | Suggested | AFM recovery uses an external program to detect renames and removes that were not replicated. This external program was seen to leak a few memory blocks; the leak is now addressed. | 5.1.7.1 | AFM
IJ46208 | Suggested | Add hardware information to scheduled call home data. | 5.1.7.1 | ESS
IJ45880 | High Importance | A GPFS Windows node that has been running for a few hours may enter a state in which, even under no load, idle GPFS threads spin and cause 100% CPU utilization. This is caused by a potential error in time management and computation on Windows. | 5.1.7.1 | Windows performance
IJ45891 | High Importance | Non-POSIX operations such as SetAttr, SetXattr, and peer snapshots do not go through from Cache/Primary to Home/Secondary, because use of the AFM special control file at Home/Secondary is prevented. | 5.1.7.1 | AFM
IJ46131 | High Importance | Adding or removing gateway node roles in the cluster while active I/O is happening on an AFM fileset can cause deadlocks. Because of how the node join/leave protocol handles this, one application node can consider a certain gateway node to be the gateway for the fileset while other nodes consider different nodes to be the fileset gateway. | 5.1.7.1 | AFM
IJ46132 | High Importance | If a node went down while uploading a file to COS, then when the node came back up and tried to recover the file, it attempted the recovery from the snapshot path instead of the live file system. | 5.1.7.1 | AFM
IJ46133 | High Importance | The AFM gateway node hits an assertion when I/O is run from an application node to a dependent fileset inside an AFM independent fileset, or inside a file system with AFM filesystem-level replication enabled. | 5.1.7.1 | AFM
IJ46148 | High Importance | During a file system restripe, for example mmrestripefs -R, a file's replication setting may be changed if the file is ill-replicated, and quota is not handled correctly after the file's data blocks are replicated or un-replicated as needed to match the new replication settings. As a result, some quota accounting data becomes unreliable over time. | 5.1.7.1 | Quotas
IJ45690 | High Importance | Unknown NFS errors were hit during recovery, and there was no way to bypass them so that recovery could complete. | 5.1.7.1 | AFM
IJ46138 | High Importance | Prefetch support for skip-dirs needs to attempt removal of each skipped directory in a thread by setting a special GID at the binary level. As a result, any other thread in the same binary that attempts a lookup with the remote site is treated as local, since the GID is treated as local. | 5.1.7.1 | AFM
IJ46205 | High Importance | tsapolicy adds information about each client process (agent) to agentVctr to keep track of activity. If an agent is retrieved from agentVctr while a helper is being added, the retrieval can return a bogus agent address, which can cause tsapolicy to hang. Taking a lock while retrieving agent information avoids this problem. | 5.1.7.1 | mmapplypolicy
IJ46323 | High Importance | Systems running Scale 5.1.5.x, 5.1.6.x, and 5.1.7.0 may experience an unexpected termination of mmfsd or mmsdrserv. This is visible in log entries and may be reflected in runtime assert messages involving an invalid socket on affected quorum nodes. | 5.1.7.1 | CCR
IJ46327 | Suggested | getfacl may not display a POSIX default ACL that has been set on a directory. This occurs when all of the following apply: (1) a default ACL, but not an access ACL, is set on a directory in a Scale file system using setfacl; (2) the file system is shared using the NFS server included with the operating system; (3) the NFS client mounts the file system using NFS version 3. Functionally, things seem to work correctly even though getfacl is missing the default ACL information. | 5.1.7.1 | NFS and POSIX default ACLs
IJ45446 | High Importance | With QoS throttling configured on a subset of nodes in the cluster, I/O on the remaining client nodes without QoS throttling is unexpectedly and severely throttled. | 5.1.7.1 | QoS
IJ45706 | High Importance | In a replicated file system (-r 2), when the disks of a failure group are not available, for example when one of two failure groups is suspended, file writes succeed by allocating disk space on the available failure group, but only one replica per logical block is allocated, so the file is ill-replicated. In this scenario, quota does not correctly handle the partially successful block allocation because the GetLocalQuota and FixLocalQuota routines are out of sync. As a result, some quota shares (in-doubt) become unreclaimable, and in-doubt values increase over time. | 5.1.7.1 | Quotas
IJ46329 | Suggested | After file audit logging is enabled, immediately listing or accessing the newly created audit log directory (SpectrumScale_XYZ) inside the .audit_log directory returns the "No such file or directory" message. In addition, running ls -l on the .audit_log directory shows '?' in the output for the created audit log directory. | 5.1.7.1 | File Audit Logging
IJ46394 | High Importance | The TCT recall process could fail or report errors while deleting a non-resident (stub) file that is also in a snapshot. | 5.1.7.1 | TCT migration/LWE
IJ45626 | High Importance | When any disk is added to a file system, the ill_unbalanced flag is set to indicate that the file system can be further rebalanced. mmhealth sees this flag and downgrades the file system until mmrestripefs is run with the -b option. | 5.1.7.1 | All Scale Users
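
As an illustration of the assertion condition quoted for IJ45804, the following Python sketch evaluates the same predicate: a value passes if it equals the invalid-inode sentinel or if none of its upper 32 bits are set. The sentinel value is not given in the APAR text, so the constant below is a placeholder assumption, and the function name is hypothetical. This is only a reading aid for the condition, not the product's code.

```python
# Mask selecting the upper 32 bits of a 64-bit value, as in the quoted LOGASSERT.
HIGH_32_MASK = 0xFFFFFFFF00000000

# ASSUMPTION: placeholder sentinel; the real INVALID_INODE_NUMBER value
# is not stated in the APAR description.
INVALID_INODE_NUMBER = 0xFFFFFFFFFFFFFFFF


def passes_assert_condition(i64: int) -> bool:
    """Return True when the quoted assertion condition would hold for i64."""
    return i64 == INVALID_INODE_NUMBER or (i64 & HIGH_32_MASK) == 0


if __name__ == "__main__":
    for value in (0xFFFFFFFF, 0x100000000, INVALID_INODE_NUMBER):
        print(hex(value), passes_assert_condition(value))
```

Under these assumptions, any inode number at or above 2^32 (other than the sentinel) makes the condition false, which matches the APAR's statement that inode numbers beyond the 32-bit range could trigger the assert.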