VAST Cluster 5.0.0-SP60 Release Notes


Upgrade to this version is supported from VAST Cluster 5.0.0 up to 5.0.0-SP40 and from VAST Cluster 4.7.0 up to 4.7.0-SP28. Upgrade from any other version is not supported.

Note that a direct upgrade may not be supported from hotfix builds. Consult VAST Support regarding the upgrade if you may be running a hotfix build.
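
As an illustration of the supported upgrade paths stated above, the following minimal sketch checks whether a source version string falls inside one of the two documented ranges. It assumes that versions are written as X.Y.Z or X.Y.Z-SPn and that "up to" is inclusive; the function names are hypothetical, and hotfix builds are deliberately not handled (the parser rejects anything that does not match the assumed format).

```python
import re

# Supported source ranges for an upgrade to 5.0.0-SP60, as stated in these notes:
# base version -> highest supported service pack (0 means the base release).
SUPPORTED_SOURCES = {
    (5, 0, 0): 40,
    (4, 7, 0): 28,
}

# Assumed version format: X.Y.Z or X.Y.Z-SPn (hotfix suffixes are not covered).
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:-SP(\d+))?$")


def parse_version(text):
    """Split 'X.Y.Z' or 'X.Y.Z-SPn' into ((X, Y, Z), n); n is 0 without a service pack."""
    match = _VERSION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    major, minor, patch, sp = match.groups()
    return (int(major), int(minor), int(patch)), int(sp or 0)


def is_supported_source(text):
    """Return True if the version falls inside one of the documented source ranges."""
    base, sp = parse_version(text)
    max_sp = SUPPORTED_SOURCES.get(base)
    return max_sp is not None and sp <= max_sp
```

A result of True only means the version is inside the documented range; it does not replace the consultation with VAST Support that is required for hotfix builds.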

To obtain the download package for VAST Cluster 5.0.0-SP60, reach out to your VAST Customer Success Engineer.

Enhancements in 5.0.0-SP60

Tenant Client IP Ranges

  • ORION-195263, ORION-194742: Increased the maximum number of client IP ranges allowed per tenant to 1000. Each range can contain up to 65535 IPs.

  • ORION-178951: Provided an ability to update (PATCH) the list of client IP ranges configured for a tenant to allow or disallow access to the tenant’s data from these IPs. The following user controls have been added for this purpose:

    • In VAST CLI, the tenant alter-client-ip-ranges command

    • In VAST REST API, the /tenants/{id}/client_ip_ranges/ endpoint that can be used to PATCH the client IPs configured for the tenant.
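
The limits and the PATCH endpoint described above can be combined in a client-side pre-check. The sketch below validates a list of ranges against the documented limits (1000 ranges per tenant, 65535 IPs per range) and prepares, without sending, a PATCH request. Several details are assumptions not confirmed by these notes: each range is given as a (start, end) pair, the request body uses a field named client_ip_ranges, and the endpoint is reachable under the /api/latest prefix. Consult the VAST REST API reference for the actual request schema.

```python
import ipaddress
import json

MAX_RANGES_PER_TENANT = 1000  # limit stated in ORION-195263, ORION-194742
MAX_IPS_PER_RANGE = 65535     # limit stated in ORION-195263, ORION-194742


def validate_client_ip_ranges(ranges):
    """Check (start, end) pairs against the documented per-tenant limits."""
    if len(ranges) > MAX_RANGES_PER_TENANT:
        raise ValueError(f"{len(ranges)} ranges exceed the limit of {MAX_RANGES_PER_TENANT}")
    for start, end in ranges:
        first = ipaddress.ip_address(start)
        last = ipaddress.ip_address(end)
        if first.version != last.version:
            raise ValueError(f"mixed IP versions in range {start}-{end}")
        size = int(last) - int(first) + 1
        if size < 1:
            raise ValueError(f"range {start}-{end} ends before it starts")
        if size > MAX_IPS_PER_RANGE:
            raise ValueError(f"range {start}-{end} holds {size} IPs, limit is {MAX_IPS_PER_RANGE}")
    return True


def build_patch_request(vms_host, tenant_id, ranges):
    """Return (method, url, body) for a PATCH; the body shape is a hypothetical example."""
    validate_client_ip_ranges(ranges)
    url = f"https://{vms_host}/api/latest/tenants/{tenant_id}/client_ip_ranges/"
    body = json.dumps({"client_ip_ranges": [list(pair) for pair in ranges]})
    return "PATCH", url, body
```

Keeping the request construction separate from sending makes it possible to review the generated URL and body before applying them to a cluster.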

VMS

  • ORION-204295: Added an alarm to be raised when a NIC cable is frequently changing its state from up to down and back to up, which can be indicative of a jittering cable.

  • ORION-195986: Added an alert to be raised when a NIC encounters an increasing number of duplicate and out-of-sequence packets.

  • ORION-182516, ORION-179490: Provided event definitions to enable the cluster to generate alarms in case the cluster runs low on its storage resources, based on user-supplied thresholds. The new event definitions use the RaidMetrics,stripe_available_percent property to track the percentage of the cluster’s available stripes (data segments that can be written to the cluster).

    In addition, there are now threshold-type event definitions to monitor the percentage of used handles (Capacity,used_handles_percent) and percentage of used metadata resources (Capacity,metadata_resources_percent).

  • ORION-176774: Added an alert to be raised when the cluster encounters high latency for a very short period of time.

  • ORION-173731: Added an ability to learn the date and time when the current S3 access keys were created for a user, as follows:

    • In VAST Web UI, by opening the tooltip for the current S3 access keys when creating or modifying a user (User Management -> Users -> choose to create or edit a user).

    • In VAST CLI, by running the user query command for a specific user.

    • In VAST REST API, by sending a request to the /users/query/ endpoint.
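
The threshold-type event definitions introduced by ORION-182516 and ORION-179490 compare a monitored property against a user-supplied threshold. The following sketch illustrates that comparison only. The property names are taken from these notes; the threshold values, the comparison directions and the function name are illustrative assumptions, and the code does not call any VAST API.

```python
# Property name -> (threshold, direction). The threshold values are made-up examples;
# "below" alarms when the value drops under the threshold, "above" when it rises over it.
THRESHOLDS = {
    "RaidMetrics,stripe_available_percent": (10.0, "below"),
    "Capacity,used_handles_percent": (85.0, "above"),
    "Capacity,metadata_resources_percent": (85.0, "above"),
}


def breached(samples, thresholds=THRESHOLDS):
    """Return the sorted property names whose sampled value crosses its threshold."""
    hits = []
    for name, (limit, direction) in thresholds.items():
        if name not in samples:
            continue  # no sample collected for this property
        value = samples[name]
        if direction == "below" and value < limit:
            hits.append(name)
        elif direction == "above" and value > limit:
            hits.append(name)
    return sorted(hits)
```

Available stripes are tracked as a remaining percentage, so the sketch treats a low value as the alarm condition, while the two used-resource percentages alarm on a high value.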

Resolved Issues in 5.0.0-SP60

Install & Upgrade

  • ORION-191880: Resolved an issue that prevented showing the last upgraded CNode in the progress message produced during an upgrade procedure initiated via VAST CLI with the --cnodes-batch-size-percentage option specified.

  • ORION-190369: Enhanced upgrade routines to prevent a scenario where, after completing a frozen upgrade, the NDU portion of the process could not be started due to an INVALID_TARGET_VERSION error.

  • ORION-189672: Enhanced the logic behind the Northband option that can be selected for the management network during a VAST Easy Install process so that the option applies to CNodes only. Prior to this change, an attempt to run the installation process with the Northband option selected would fail with the northband is supported only on cnodes and only those with 2 HCAs error.

  • ORION-182412: Updated pre-upgrade validations to skip inactive SCMs when checking for locked SCMs so that this check does not cause the AttributeError("'NoneType' object has no attribute 'host'") error during an upgrade.

  • ORION-180395: Updated upgrade routines to prevent the removal of the known_hosts file during a DNode OS upgrade.

Cluster Expansion

  • ORION-187388: Resolved an issue that could cause the dbox add command to fail with the cannot create directory ‘/vast/backup’: Permission denied error.

Networking

  • ORION-179941: Enhanced IB switch monitoring functionality to prevent raising OpenSM is not enabled correctly alerts for hosts that have the OpenSM service in the masked state.

Element Store

  • ORION-204314: Introduced improvements to avoid write latency spikes after massive write operations.

  • ORION-194007: Resolved an issue that could cause the cluster to encounter an assertion failed: (eres != EStoreRes::OK) ranges_to_mark not empty! alert followed by denylisting @handle=<handle> as traversal_type=19 resulted with res=PERSISTENT_ERROR and then locally denylisting shard_type=0 shard_id=<shard> as op=DELETE_ELEMENT resulted with res=28 deny list alerts.

  • ORION-190969: Made updates to improve handling of deletions of an extremely large number of directories with a lot of subdirectories and files in them while the cluster has VAST Catalog enabled.

  • ORION-190443: Optimized the mechanism of bulk permission updates to speed up handling of very large directories.

  • ORION-190034: Enhanced VAST Catalog transaction management to prevent a scenario that could cause automatic denylisting of ESTORE BIG_CATALOG operations because of a write-write conflict.

  • ORION-188621: Fine-tuned the mechanism responsible for handling massive deletions to resolve an issue where deletion of a very large amount of data caused the VMS to report nearly running out of metadata capacity.

  • ORION-188096: Improved write buffer management to eliminate flows that could lead to CNode container restarts with the group=W_MIGRATE had a suspension timeout! error.

Quotas

  • ORION-172465: Enhanced quota update logic to eliminate an issue that could cause a defined user quota to stop being displayed in VAST Web UI and VAST CLI, although the quota was enforced as expected.

  • ORION-169675: Updated the quota deletion logic to prevent a flow where, after hitting the quota limit and removing the quota to continue write operations on a path, any write attempt to the path still failed due to the quota limit being exceeded.

SMB

  • ORION-179067: Updated SMB processing to avoid flows where the VAST cluster could respond with an information structure of an incorrect size, causing INFO LENGTH MISMATCH errors on the client.

  • ORION-173222: Fine-tuned TCP session keep-alive timeouts to avoid a scenario where increased latency could be occasionally encountered on file read attempts.

S3

  • ORION-199913: Made updates to ensure that VAST Cluster sends an S3 Not Implemented response to requests that include the If-None-Match header (conditional writes).
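
For clients that attempt conditional writes, the behavior above means the request is rejected rather than silently treated as an unconditional write. The sketch below shows one way a client might classify the outcome. It assumes that the S3 Not Implemented response is delivered as HTTP status 501 and that a server supporting the header would answer 412 when the object already exists; the function names are hypothetical.

```python
def conditional_put_headers(extra=None):
    """Headers for a create-only-if-absent PUT (If-None-Match: *)."""
    headers = dict(extra or {})
    headers["If-None-Match"] = "*"
    return headers


def interpret_put_status(status):
    """Map the HTTP status of a conditional PUT to a coarse client action."""
    if status == 501:
        return "unsupported"  # Not Implemented: the conditional write was rejected
    if status == 412:
        return "exists"       # Precondition Failed on servers that support the header
    if 200 <= status < 300:
        return "written"
    return "error"
```

A client that receives "unsupported" can fall back to an explicit existence check followed by an unconditional PUT, accepting that this fallback is not atomic.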

VAST Database

  • ORION-189222: Resolved an issue that could cause a CNode container to restart with the assertion failed: (_type_and_coding.type_info.nullable || (validity_bitmap == nullptr)) error when using VAST Database SDK to import data to a VAST database.

  • ORION-178460: Resolved an issue that could cause a Got an error: Must pass at least one table error when running queries against a VAST database using VAST Database CLI (vast-db-cli).

VAST Catalog

  • ORION-184012: Resolved an issue that was causing the Query: Please correct the form error when trying to export VAST Catalog query results to a CSV file in VAST Web UI.

Data Protection

  • ORION-177810: Enhanced handling of date and time in protection policies so that VAST Web UI always shows the local timezone date and time. Prior to this change, a UTC time could be displayed in some cases.

Replication

  • ORION-190681: Updated the logic used to cache user ID mappings (SIDs, VAIDs) to resolve an issue where a protected path could not catch up with its RPO after a NATIVE_REPLICATION deny list was cleared.

  • ORION-181017: Enhanced handling of group SIDs to resolve an issue that could cause UDB2 lookup user on sid=<SID> failed to return a token alerts during S3 replication.

  • ORION-160019: Updated the async replication certificate expiration alerts so that they can trigger notifications to admin users. Prior to this change, the alerts were visible to root and support users only.

Authorization & Authentication

  • ORION-181089: Updated LDAP caching mechanism to prevent a scenario where, after a VMS HA event, it was not possible to log in to the VMS using LDAP until the LDAP cache expired.

  • ORION-179744: Made updates to prevent CNode container restarts in case a client initiates a user refresh (which entails clearing of the internal database cache and retrieving all the user information from a provider) too frequently within a short period of time.

VMS

  • ORION-206781: Made updates to ensure that a bulk CNode activation task processes all CNodes included in the batch.

  • ORION-202338: Resolved an issue where the cluster did not raise an alarm when a DNode link went down and remained disconnected for a period of time long enough to avoid false alarming.

  • ORION-194862: Updated view path validations to allow for view paths or SMB share names that include any UTF8 characters. Prior to this change, an attempt to create a view with a path that included a non-ASCII character resulted in an error.

  • ORION-191852: Improved the Data Flow polling to avoid creating a large number of SSH connections from the VMS container to localhost when Data Flow hostname polling is enabled on the cluster (Settings -> Dataflow Settings).

  • ORION-188852: Enhanced calculation of used capacity percentage so that it does not cause bigint out of range errors when trying to view quotas via VAST CLI or VAST REST API.

  • ORION-186748: Fine-tuned the logic of VMS accessing the internal database to resolve an issue that could cause a large number of key='APPLY_LOGS_OBFUSCATION' failed retrieved from DB messages to appear in the VMS log.

  • ORION-186340: Made updates to prevent overriding of S3 access key creation time when a second access key is created for the same VMS user.

  • ORION-185215: Updated handling of preferred CNode IDs so that an attempt to run the vms modify --preferred-cnode-ids command does not fail with the cnodes_set_vms_preferred failed: CNodesSetPreferredResultCode.NOT_FOUND error.

  • ORION-180970: Resolved an issue where VMS occasionally could not return identity policies for a user group queried via VAST REST API (/api/latest/groups/?groupname=<...>&context=aggregated&tenant_id=<...>).

  • ORION-180832: Updated the event definition creation flow to always use microseconds as the unit of measurement for latency.

  • ORION-176603: Updated view path validations to allow for view paths or SMB share names that include any UTF8 characters. Prior to this change, an attempt to create a view with a path that included a non-ASCII character resulted in an error.

  • ORION-175599: Resolved an issue where a newly created QoS policy could not be seen in VAST Web UI, VAST CLI or VAST REST API although it existed in the VAST internal database.

  • ORION-172955: Improved error handling to report ??? as a CNode status in cases where the node’s NIC responds with junk data to VMS polling (due to cabling issues, for example). Prior to this change, VMS would report an empty status and log a UTF-8 decode error in such cases.

  • ORION-113520: Resolved an issue where one of the DNode ports was falsely reported as faulty during a periodic pre-upgrade check.

VAST Web UI

  • ORION-196513: Improved usability and performance when dealing with a large number of client IP ranges defined for a tenant (Element Store -> Tenants -> choose to create or edit a tenant -> Tenant Access tab).

  • ORION-194719: Updated the logic used to select tenants when creating a virtual IP pool in VAST Web UI so that the All tenants option creates a virtual IP pool for all tenants rather than for the default tenant only.

  • ORION-191852: Improved the Data Flow polling to avoid creating a large number of SSH connections from the VMS container to localhost when Data Flow hostname polling is enabled on the cluster (Settings -> Dataflow Settings).

  • ORION-182932: Updated the header of the BW (MB/s) column in the Global Snapshot Clones page (Data Protection -> Global Snapshot Clones) to read MB/s for megabytes per second.

  • ORION-176969: Updated the Settings -> Notifications navigation menu item to point to the correct notification settings page. Prior to this change, it opened the VMS settings page.

  • ORION-176945: Updated VAST Prometheus Exporter to avoid showing the text (MB/s) in metrics that refer to IOPS values.

  • ORION-175524: Made updates to avoid showing UNKNOWN port location for some ports listed in the Infrastructure -> NICs page.

VAST REST API

  • ORION-185217: Resolved an issue that caused the /users/<user ID>/ endpoint to return an empty JSON in response to an update request, although the requested updates were made as expected.

Platform & Control

  • ORION-202564: Improved handling of gratuitous ARP requests to resolve an issue where some client mount points hung when connecting to a virtual IP pool on a NIC that was changing its state from up to down and back to up.

  • ORION-202803: Resolved an issue that could cause a DNode container to restart with the spinlock lock takes too long: 1835, cannot kill locker silo error.

  • ORION-201453: Improved handling of write buffers to resolve an issue that could sometimes cause latency spikes after an upgrade from VAST Cluster 4.7 to VAST Cluster 5.0.

  • ORION-199929: Made updates to prevent a flow that could cause node containers running CentOS-based VAST OS to restart with the spinlock lock takes too long error during a switch firmware upgrade.

  • ORION-193977: Resolved an issue where a CNode container restarted due to no fibers for incoming request errors.

  • ORION-199308: Improved handling of DNode port connections so that multiple CNodes attempting to connect to a bad port do not affect attempts by other CNodes to connect to the other port.

  • ORION-193425: Resolved an issue where multiple CNode containers restarted with an Address not mapped to object error followed by ingest_handle_event deny list alerts.

  • ORION-193511: Eliminated a flow where the DTray reboot mechanism could get stuck, causing inability to restart the DTray and temporary service disruption.

  • ORION-186568: Resolved an issue where after encountering link issues, the CNode container restarted due to the failed removing IP <IP address> from interface ens1f1 error.

  • ORION-186003: Resolved an issue that could cause a CNode container to restart with the Invalid permissions for mapped object error.

  • ORION-184360: Eliminated a potential race condition that could cause a CNode container restart with the assertion failed: ((_reentrancy_level) > (0)) (0 > 0) error.

  • ORION-184028: Updated failover mechanisms to eliminate a flow that could cause multiple CNode and DNode container restarts followed by a short service disruption after one of the cluster’s switches was rebooted.

  • ORION-183745: Improved handling of CNode and DNode communication issues to prevent a scenario where, following an IB switch reboot, all CNode containers are restarted, and DNode failures are encountered with the can't deactivate dnode: can reach DU error.

  • ORION-181080: Enhanced handling of deletions to help avoid scenarios that could result in md_usage_state changed from ABUNDANT to SCARCE alerts.

  • ORION-180071: Resolved an issue that could cause periodic CNode container restarts with the failed to allocate vmsg args for GetUpdatedS3UsersParams error.

  • ORION-179428: Fine-tuned the mechanism of sorting stripes during defragmentation to resolve an issue that caused multiple the stripe is stuck alerts on the cluster.

  • ORION-177324: Resolved an issue where a cluster encountered extended high latency accompanied by a The publishers of capacity estimations are not keeping up alert after massive data deletions.

  • ORION-174205: Updated cluster’s internal database query caching to eliminate an issue that could cause a CNode container to restart with the Address not mapped to object error.

  • ORION-171916: Enhanced handling of NIC IDs to ensure that there are no duplicate entries created as a result of hardware polling operations.

  • ORION-159518: Resolved an issue that could cause a CNode container to restart with the timeout expired for life_type=0,life_gen=<...> (INGEST_READ) with 1 active jobs - timeout is 300 seconds, and diff is 301 seconds error.

VAST OS

  • ORION-190436: Resolved an issue where multiple DNode containers restarted during a short period of time due to list_del corruption kernel errors.

Call Home & Support

  • ORION-173555: Updated the logic behind CNode selection in advanced support bundle settings (Support -> Bundles -> Create Support Bundle -> Advanced tab) to ensure that all relevant nodes are included in the selection list.

Uplink

  • ORION-194400: Enhanced synchronization between VMS and Uplink to avoid situations where Uplink shows a task as running while VMS reports it as complete.

Limitations in 5.0.0-SP60

The following are limitations in VAST Cluster 5.0.0-SP60:

Quotas

  • ORION-208873: Quotas and quota accounting are not supported on subpaths of a replicated protected path on the destination peer. For example, if a protected path is replicated to a destination directory /dest-dir, you cannot set a quota on /dest-dir/mydir.

  • (RESOLVED IN 5.3.0) ORION-179496: NFS aliases are not supported with VAST Cluster's implementation of Remote Quota Protocol (rquota).

Quality of Service

  • ORION-148295: QoS should be enabled on all views to avoid performance degradation issues.

  • ORION-148206: There may be some scenarios in which minimum service levels set by QoS policies are not met. 

  • ORION-139524: Setting a minimum limit for read operations does not limit write operations on the same view.

  • QoS provisioning is not supported for S3 clients.

  • User QoS feature is supported for NFS clients only.

NFS

  • ORION-115336: If one creates an NFSv4.1-only view and mounts it, and then creates its parent view with NFSv3 only, IO operations on the NFSv4.1-only view succeed, but mounts are not allowed.

NFSv3

  • In rare cases with large numbers of files and directories, the existence of a view with Global Synchronization enabled under a protected path can block the removal of the protected path.

SMB

  • ORION-160323: After updating permissions for an SMB share in Windows Explorer, a duplicate SMB share can be displayed. The duplicate SMB share disappears upon a refresh (F5).

  • (RESOLVED IN 5.2.0) ORION-130460: VAST Cluster does not show any previous versions for a file or directory that has the same name as a file or directory that has been deleted and resides in the same directory as the deleted file or directory.

  • ORION-134730: An attempt to restore a file can fail if after the restore has started, a quota is set on the path where the file resides.

  • (RESOLVED IN 5.2.0) ORION-137905: If an application saves changes to a file by recreating the file, or when the client otherwise deletes a file or a directory and creates a new one with the same name, no previous versions can be displayed for the file or directory. To restore such a file or directory, you need to restore one of its parent directories.

S3

  • An object to be uploaded via an S3 presigned POST request must have only ASCII characters in its name.

  • A POST policy (used for S3 presigned POST requests) can be up to 4800 bytes.
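
The two presigned POST limitations above can be checked on the client before a request is generated. The following is a minimal sketch under stated assumptions: the 4800-byte limit is applied here to the UTF-8 encoded JSON policy document before any base64 encoding, which these notes do not specify, and the function name is hypothetical.

```python
import json

MAX_POST_POLICY_BYTES = 4800  # limit stated in the S3 limitations above


def check_presigned_post(object_name, policy):
    """Return a list of problems; an empty list means both checks pass."""
    problems = []
    if not object_name.isascii():
        problems.append("object name contains non-ASCII characters")
    # Assumption: the size is measured on the compact JSON form of the policy.
    policy_bytes = json.dumps(policy, separators=(",", ":")).encode("utf-8")
    if len(policy_bytes) > MAX_POST_POLICY_BYTES:
        problems.append(
            f"policy is {len(policy_bytes)} bytes, limit is {MAX_POST_POLICY_BYTES}"
        )
    return problems
```

If the limit turns out to apply to the base64-encoded policy instead, the size check needs to be adjusted accordingly, since base64 encoding increases the size by roughly a third.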

VAST Catalog

  • The maximum path length supported by VAST Catalog is 1024 characters.

  • When VAST Catalog is enabled, replication is limited to two peers (group replication is not supported with VAST Catalog). 

  • VAST Catalog must be disabled before a protected path can be deleted. 

Global Snapshot Clones

This release does not support global snapshot clones with VAST Catalog enabled.

Multi-Cluster Management

  • The Multi-Cluster Management feature requires that each cluster participating in the inter-connection is running VAST Cluster 5.0.

  • ORION-135966: The inter-connecting clusters must have connectivity to each other through the clusters' management networks.  

  • ORION-132073: When you remove a VoC cluster from a Multi-Cluster Manager cloud service instance (using the removal button on the cluster's card), the VoC cluster is terminated. There is no option to remove a VoC cluster from Multi-Cluster Manager without also terminating it. (In the Multi-Cluster Management page in the VAST Web UI, the button removes the VoC cluster from Multi-Cluster Management and does not terminate it.)

  • ORION-137875: In case of Multi-Cluster Manager failure, VoCs provisioned by the instance cannot be connected to a Multi-Cluster Management instance.

Authentication & Authorization

  • ORION-143944: When using Kerberos/NTLM Authentication to authorize SMB users from non-trusting domains, the DOMAIN\username format cannot be used to specify users of remote domains. The username@domain format must be used instead.

  • ORION-134299: When the tenant is set to use Kerberos/NTLM authentication to authorize SMB users from non-trusting domains, both NFS and SMB must use the native SMB authentication (Kerberos), and not Unix-style UID/GIDs.

  • ORION-141763: Before enabling or disabling NTLM authentication, you need to leave the cluster's joined Active Directory domain. After NTLM authentication is enabled or disabled, rejoin the domain.

  • The following limitations apply to Multi-Forest Authentication:

    • VAST Cluster does not allow adding two different Active Directory configuration records with the same domain name but different settings for multi-forest authentication and/or auto-discovery.

    • Names of users' domains are not displayed in data flow analytics.

    • If a trusted domain becomes unavailable and then recovers, SMB clients can use it to connect to the VAST cluster only after a period of time, but not immediately upon domain recovery.

    • Clients cannot establish SMB sessions immediately after a trusted domain recovers from a domain failure.

    • If a group exists on an Active Directory domain in a trusted forest and the group scope is defined as DomainLocal, VAST Cluster does not retrieve such a group when querying Active Directory, so members of such a group are denied access despite any share-level ACLs that can rule otherwise.

    • If TLS is enabled, the SSL certificate has to be a CA-signed certificate that is valid for all of the domain controllers in all trusted forests. If the certificate is not valid for a domain controller, this domain controller is not recognized.

    • ORION-156168: In a multi-forest environment, after migrating a group account from the forest of the cluster’s joined domain to another forest, information about historical group membership is not kept, so users in the migrated group might not be able to access resources to which they used to have access prior to the migration.

VAST Prometheus Exporter

With VAST Cluster 5.0 and 4.7, the Prometheus exporter script at https://github.com/vast-data/vast-exporter is no longer supported. Instead, use the following VAST API endpoints:

  • https://<VMS IP>/api/prometheusmetrics/ 

  • https://<VMS IP>/api/prometheusmetrics/all 

  • https://<VMS IP>/api/prometheusmetrics/users 

  • https://<VMS IP>/api/prometheusmetrics/defrag 

  • https://<VMS IP>/api/prometheusmetrics/views 

  • https://<VMS IP>/api/prometheusmetrics/devices 

  • https://<VMS IP>/api/prometheusmetrics/quotas 
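
As a convenience for scripting against the endpoints listed above, the sketch below builds the full URLs for a given VMS address and includes an optional fetch helper based on the Python standard library. The endpoint paths come from this list; the function names are hypothetical, and authentication is left to the caller because these notes do not describe how the endpoints are secured.

```python
import urllib.request

# Endpoint paths as listed in these release notes.
PROMETHEUS_PATHS = [
    "/api/prometheusmetrics/",
    "/api/prometheusmetrics/all",
    "/api/prometheusmetrics/users",
    "/api/prometheusmetrics/defrag",
    "/api/prometheusmetrics/views",
    "/api/prometheusmetrics/devices",
    "/api/prometheusmetrics/quotas",
]


def prometheus_urls(vms_ip):
    """Return the full metrics URLs for the given VMS address."""
    return [f"https://{vms_ip}{path}" for path in PROMETHEUS_PATHS]


def fetch_metrics(url, headers=None, timeout=30):
    """Fetch one endpoint and return the response body as text.

    Any required credentials must be supplied by the caller through `headers`.
    """
    request = urllib.request.Request(url, headers=headers or {})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

For example, prometheus_urls("192.0.2.10") returns seven URLs, starting with https://192.0.2.10/api/prometheusmetrics/.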

Call Home & Support

  • When creating a support bundle with the METADATA preset, only one CNode can be selected for the bundle. Selecting any DNode(s) or multiple CNodes together with the METADATA preset results in an error.

Known Issues in 5.0.0-SP60

The following are known issues in VAST Cluster 5.0.0-SP60.

Install & Upgrade

  • ORION-220709: An error occurs when trying to rerun an OS/FW upgrade on nodes where the firmware has already been staged on the NICs during a prior (unsuccessful) upgrade attempt.

  • ORION-200435: When running an upgrade with firmware upgrade and force options specified, the firmware does not get upgraded if the DBox is Mavericks APEX.

  • ORION-145815: In some cases, VAST Cluster does not raise an alert on a wrong NIC firmware version during a cluster upgrade.

Cluster Expansion

  • ORION-175762: In some cases, a DBox expansion procedure run on a cluster with similarity-based data reduction enabled can take longer than expected.

Networking

  • (RESOLVED IN 5.1.0-SP60) ORION-214087: On a cluster where both external and internal interfaces are InfiniBand, the VMS may report the Failed to find the current OpenSM master with error: 'NoneType' object has no attribute 'ssh_conn' error if the OpenSM master is outside the cluster.

  • ORION-205395: If, during an HA event on a cluster with InfiniBand internal networking, the OpenSM service is found unavailable on a CNode, the CNode may occasionally encounter a failed connecting to the leader's platform error.

  • ORION-155530: Sometimes, after you run the cluster networking configuration script (configure_network.py) and then reboot the CNode, the eb1 interface can still be down with the Device ib1 has different MAC address than expected, ignoring error. In this case, rerun the script after the reboot to bring the interface up.

Per-Tenant Encryption

  • (RESOLVED IN 5.1.0) ORION-114057: A tenant_create returned an error : ObjectCreateResultCode.FAILURE error occurs when attempting to create 256 tenants, each with a unique encryption group, if prior to this attempt, a tenant with per-tenant encryption enabled was created and then deleted.

Quotas

  • (RESOLVED IN 5.1.0-SP60, 5.2.0) ORION-206297: In some cases, quota capacity percentage shown in the Element Store -> Quotas page may not get updated properly to reflect the capacity consumption. If you encounter this issue, use Uplink to view the data.

  • (RESOLVED IN 5.2.0) ORION-178975: After creating a user quota with the identifier type set to UID, VMS lists this quota under the corresponding username pulled from the LDAP provider but not under the UID specified during quota creation.

Lifecycle Rules

  • (RESOLVED IN 5.1.0-SP50, 5.2.0-SP6) ORION-201538: The lifecycle rule mechanism deletes empty directories that were created through NFS and SMB protocols on the view for which a lifecycle rule is enabled, even when the empty directories are not expired according to the enabled lifecycle rule.

QoS

  • ORION-139913: When applying a QoS policy to NFSv3 access, both data and metadata are taken into account in QoS limit calculations, while with NFSv4.1, only data are considered.

  • ORION-137986: Enabling a QoS policy for a view on which a mixed (read and write) workload runs, can result in decreased performance for the workload.

Protocols

  • (RESOLVED IN 5.1.0-SP60, 5.2.0-SP10) ORION-216774: For views with the SMB and S3 protocols enabled and the Mixed Last Wins or SMB security flavor set, the owner of a child directory in a parent that has no default ACL, may in some cases be set incorrectly.

  • (RESOLVED IN 5.3.0) ORION-204972: When creating S3 objects on a multi-protocol view controlled with the NFS security flavor, in a directory for which the SGID POSIX modebit is set, the SGID modebit may get propagated to files/objects created in that directory.

NFS

  • (RESOLVED IN 5.1.0-SP50) ORION-193090: The READDIR and READDIRPLUS operations against a directory with a name longer than 255 characters may hang without returning an error.

  • (RESOLVED IN 5.1.0) ORION-135514: The word percent in the CNode <...> nfs over rdma connections is at <...> percent alert should be read as connections, since the alert shows the number of connections but not a percentage.

SMB

  • (RESOLVED IN 5.3.0) ORION-144020: When use of Kerberos/NTLM authentication to authorize SMB users from non-trusting domains is enabled for the tenant, a Windows client would let you add a new ACE only by searching for a specific user in the list of trusted forest users, instead of locating the user through the list of domains.

  • ORION-142968: If a quota is exceeded during the process of copying a file to the VAST cluster, the copying process is stopped with a misleading error message: A device attached to the system is not functioning.

S3

  • (RESOLVED IN 5.1.0-SP60) ORION-217661: If the final part of a multipart upload has a size of 0 (zero), VAST Cluster responds with a 400 Bad Request error.

  • (RESOLVED IN 5.3.0) ORION-198606: In rare cases, an IO is stuck - should close alert can be raised on a CNode caused by the cluster waiting for completion of an S3 multi-part upload.

  • (RESOLVED IN 5.1.0) ORION-136816: S3 GET of a symlink is blocked but HeadObject and GetObjectACL operations still succeed.

Protocol Auditing

  • (RESOLVED IN 5.1.0) ORION-134836: When displaying path details in the VAST Audit log dialog, the phandle field does not show the phandle.

VAST Database

  • ORION-163038: When importing data into a VAST Database table and there is a type mismatch between the column and the data being imported, VAST Cluster produces an ambiguous error message (Failed to get column) instead of pointing to the expected data type.

Data Protection

  • (RESOLVED IN 5.2.0, 5.1.0-SP50) ORION-196575: An attempt to bulk delete a large number of protected paths may result in a timeout in case an issue occurs during the deletion of one of the protected paths.

Replication

  • (RESOLVED IN 5.1.0-SP30) ORION-201982: An attempt to replicate from more than eight source clusters may result in a CNode container restart with the Buffers pool is exhausted error.

  • (RESOLVED IN 5.1.0-SP50, 5.2.0) ORION-196091: Objects created as a result of attempts to create a protected path with incorrect settings (for example, to create a path with a target directory that already exists on the destination peer), do not get automatically deleted on protected path creation failure.

  • ORION-183432: When trying to perform a failover using the protectedpath modify --modify-replication-state VAST CLI command, the replication state remains Standalone, although it is expected to change from Standalone to Source. If you encounter this issue, use VAST Web UI to perform the failover.

  • (RESOLVED IN 5.2.0) ORION-144137: User quotas for Alternate Data Stream (ADS) children might get miscalculated at the replication destination when the size and/or used attributes of an ADS child are updated due to replication.

  • ORION-140894: When attempting to delete a protected path from the destination peer after an ungraceful failover, a Failed to delete following streams or similar error occurs. The workaround is to manually change the destination peer's role to STANDALONE and retry the deletion.

Multi-Cluster Management

  • (RESOLVED IN 5.1.0) ORION-146029: When sending call home bundles from a VAST on Cloud (VoC) cluster, the Multi-Cluster Manager (MCM) sends the first bundle an hour after the cluster has been registered, and the following bundles are sent according to the user-defined interval.

Authentication & Authorization

  • ORION-196963: The owner for files and folders created on an NFSv4.1 view can be occasionally reported as nobody instead of the correct value. This issue can occur if both LDAP and Active Directory providers (with different domain names) are configured for the view’s tenant and the same group exists on both providers, but the user is part of this group on the LDAP provider only.

  • (RESOLVED IN 5.1.0) ORION-144288: Due to a caching issue, an incorrect user UID can be returned in a user query being retried immediately after the connectivity to the provider has been restored.

VMS

  • ORION-203155: The Unexpected width, actual link width is <...> alarm message may contain garbage at the end of the message.

  • (RESOLVED IN 5.2.0) ORION-172811: Some analytics properties that can be selected when creating a customized analytics report, produce a graph that does not precisely correspond to the property name. For example, selecting the NFS Write IOPS property produces a graph showing the write IOPS not only for NFS but for all protocols. In particular, this issue may occur with protocol-specific and replication-related properties that represent bandwidth, IOPS and latency.

  • (RESOLVED IN 5.1.0) ORION-147658: An attempt to add a user quota for a non-existing user does not raise an error.

  • ORION-143717: On a cluster with CNode Port Affinity configured, there is no way to expose the VAST DNS IP on a specific port (left or right).

  • (RESOLVED IN 5.1.0) ORION-134765: The Rows filtered out and Rows scanned metrics in the VAST DB Row Metrics analytics report show the total number of rows accumulated over time while other metrics in the report show the number of rows per second.

  • ORION-131386: When a parent directory has a very large number of child directories, the total of the children’s capacity values displayed in the Capacity page can exceed the capacity value shown for the parent directory.

  • ORION-89570: In some cases, capacity analytics for subdirectories cannot be reported due to an internal timeout. This issue occurs when there is an extremely large number of subdirectories to be estimated.
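
For ORION-134765 above, values exported from the cumulative metrics (Rows filtered out, Rows scanned) can be compared with the per-second metrics by converting them to rates. The following is a minimal sketch of that arithmetic only; it assumes the samples have already been exported as (timestamp in seconds, cumulative row count) pairs. The function name and the sample data are hypothetical and are not part of any VAST interface.

```python
from typing import List, Tuple


def cumulative_to_rate(samples: List[Tuple[float, int]]) -> List[Tuple[float, float]]:
    """Convert (timestamp_seconds, cumulative_rows) samples to (timestamp, rows_per_second).

    Hypothetical helper: the sample layout is an assumption, not a VAST export format.
    """
    rates = []
    # Walk consecutive pairs of samples and divide the row delta by the time delta.
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # skip duplicate or out-of-order timestamps
        delta = c1 - c0
        if delta < 0:
            continue  # skip intervals where the counter went backwards (e.g. a reset)
        rates.append((t1, delta / dt))
    return rates


# Made-up sample data: cumulative row counts taken every 10 seconds.
samples = [(0.0, 1000), (10.0, 1500), (20.0, 1500), (30.0, 4500)]
print(cumulative_to_rate(samples))
# prints [(10.0, 50.0), (20.0, 0.0), (30.0, 300.0)]
```

Each output value is attributed to the end of its interval, so the first sample yields no rate.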

VAST Web UI

  • (RESOLVED IN 5.1.0-SP30) ORION-189217: The Hardware page in VAST Web UI may display an incorrect layout image for a Mavericks DBox.

  • ORION-169645: A tip for the Atime Frequency field (Element Store -> View Policies -> choose to create or edit a view policy -> General tab) states that 3600s is the default value for this field, while the actual default is 0 (no atime updates).

  • ORION-150503: A local user cannot be found when trying to add it as a value in the Database owner field of the New Database dialog.

  • ORION-147073: The Database page does not show the actual number of rows and size of objects until the page is refreshed manually.

  • (RESOLVED IN 5.1.0) ORION-146832: After an existing VAST Web UI session has timed out, the Multi-Cluster Management page may display a prompt to enter a registration token for a cluster for which the token has already been provided. To eliminate the prompt, refresh the page.

  • (RESOLVED IN 5.1.0) ORION-146273: After deleting a cluster in the Multi-Cluster Management page, subsequent delete confirmation popups can show the Type DELETE to approve field pre-populated with the word DELETE.

  • (RESOLVED IN 5.1.0) ORION-143724: Some of the columns in the SSDs tab of the Infrastructure page opened through Multi-Cluster Management may show dm_mock or mock dev values instead of model and firmware version numbers.

  • ORION-142547: Clicking the Vast catalog policy link in the Policy column of the Snapshots page in Multi-Cluster Management opens an empty Protection Policies page instead of showing a specific policy.

  • (RESOLVED IN 5.1.0) ORION-141670: Relative file symlinks created through SMB are listed as directory symlinks and require use of rmdir to be deleted.

  • (RESOLVED IN 5.2.0) ORION-140652: Auto-completion for the Logon name of the privileged domain user field in tenant settings (Element Store -> Tenants -> choose to create or edit a tenant) is not provided.

  • (RESOLVED IN 5.1.0) ORION-139890: The QoS policy field in the Create View or Update View dialog (Element Store -> Views -> choose to create or edit a view) can list both view QoS policies and user QoS policies, although it does not let you add a user QoS policy to the view.

VAST CLI

  • (RESOLVED IN 5.1.0) ORION-146200: The auto-completion options for the role-assign command do not list all possible parameters.

VAST REST API

  • (RESOLVED IN 5.1.0-SP50) ORION-201905: When trying to retrieve the segments retransmitted metrics with an API call to /api/monitors/ad_hoc_query/, a "detail": "metrics not available" error can occur.

  • ORION-178569: The /users/names endpoint always returns only the first 50 entries, regardless of the page size parameter or the total number of entries to be returned.

Platform & Control

  • ORION-205393: After disconnecting and reconnecting an InfiniBand switch, the cluster might encounter a CNode container restart due to the assertion failed: (!has_verifier(mem_dev->dest().env_id)) Failed performing rpc call! lock_op=HAS_TEMP_REFS error.

  • ORION-203504: A finished redistribution and still not balanced alert can occur on the cluster when one of the CNode ports is disconnected and thus an even distribution of virtual IPs among the platform ports is not possible. If there are no accompanying messages indicative of any issues, this alert can be ignored.

  • ORION-202806: When handling extreme workloads, CNode containers may occasionally restart with the timeout expired for life_type=16,life_gen=<number> (TRAVIS) error. The error means that the cluster is busy processing the workload. If there are no other symptoms indicative of any issues, no human intervention is required.

  • (RESOLVED IN 5.3.0) ORION-193956: The leader hogging for <number> us message may occasionally appear in VAST logs. If there are no accompanying messages indicative of a failure, this message can be ignored.

  • (RESOLVED IN 5.1.0-SP60) ORION-158539: The back view for the CERES DBox in the Hardware Layout page shows the data ports in incorrect positions (e.g. port enp3s0f1 is shown on the right while it should be on the left). To mitigate the issue, refer to the Infrastructure -> NICs page that lists the correct locations for the ports.

Call Home & Support

  • ORION-239170: When obfuscating a support bundle, the CNode hostname may not get obfuscated in some of the logs included in the bundle.

  • (RESOLVED IN 5.1.0) ORION-143381: When the directory used to store call home bundles reaches its size cap, a FileNotFoundError: [Errno 2] No such file or directory error is reported instead of an out-of-space error.