Note
For CERES DBoxes, the complete replacement procedure includes additional and modified steps and cautions that are not covered below. Those steps are performed by VAST support personnel.
DBox replacement is a VMS-enabled procedure for replacing a faulty DBox while the cluster continues to operate.
This DBox replacement procedure is suitable for the following situations:
A DBox has failed in a cluster that has DBox HA capability, so the cluster is still running.
A DBox is faulty but still running, for example because it has a failed slot. The cluster is still running even if DBox HA is not enabled, because the DBox itself has not failed.
A replacement DBox ships with new DNodes but without SSDs or NVRAMs. During the procedure, you migrate the SSDs and NVRAMs from the faulty DBox to the new DBox.
Prerequisites
The procedure requires you to connect the new DBox to the switches before disconnecting the old DBox. Therefore, the cluster's network switches must have enough spare unused ports to accommodate an extra DBox.
Please consult your VAST Data sales engineer for help designating switch ports and ensuring that they are configured with the correct port designations for DNodes as required.
Similarly, you'll need rack space and PSUs in order to install the new DBox before physically removing the faulty DBox.
Required Equipment
Replacement DBox with rail mount kit and four C13/C14 power cables. All SSD slots on the DBox must be empty.
4 x 100Gb/s QSFP28 cables for connecting the new DBox to the cluster's switches.
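The prerequisites and equipment list above can be turned into a quick pre-flight check before the replacement DBox is racked. The following is a minimal sketch, not a VAST tool: the `SiteInventory` fields and the `preflight_problems` function are hypothetical names, the numbers are entered by hand, and the sketch assumes one switch port per QSFP28 cable and one power outlet per power cable.

```python
from dataclasses import dataclass

# Counts taken from the equipment list above. The one-port-per-cable and
# one-outlet-per-cable mapping is an assumption of this sketch.
CABLES_PER_DBOX = 4        # 4 x 100Gb/s QSFP28 cables
POWER_CABLES_PER_DBOX = 4  # four C13/C14 power cables


@dataclass
class SiteInventory:
    """Hand-collected facts about the rack (hypothetical structure)."""
    spare_switch_ports: int   # unused ports across the cluster's switches
    free_power_outlets: int   # outlets available for the new DBox
    free_rack_units: int      # free rack space, in rack units
    dbox_rack_units: int      # height of the replacement DBox (check your model)
    occupied_ssd_slots: int   # SSDs found in the replacement DBox (must be 0)


def preflight_problems(site: SiteInventory) -> list[str]:
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    if site.spare_switch_ports < CABLES_PER_DBOX:
        problems.append(
            f"need {CABLES_PER_DBOX} spare switch ports, found {site.spare_switch_ports}"
        )
    if site.free_power_outlets < POWER_CABLES_PER_DBOX:
        problems.append(
            f"need {POWER_CABLES_PER_DBOX} power outlets, found {site.free_power_outlets}"
        )
    if site.free_rack_units < site.dbox_rack_units:
        problems.append("not enough rack space to install the new DBox alongside the old one")
    if site.occupied_ssd_slots != 0:
        problems.append("all SSD slots on the replacement DBox must be empty")
    return problems
```

An empty result only means the counts look sufficient. Port designations for DNodes still need to be confirmed as described in the prerequisites.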
Step 1: Install and Add the Replacement DBox
Without removing the faulty DBox, rack mount the new DBox and add it to the cluster by following the cluster expansion procedure. Make sure to select Empty box in the General Settings screen.
Step 2: Begin DBox Removal
On the DBoxes tab, open the Actions menu for the faulty DBox that you want to replace and select Replace.
Click Yes to confirm your action.
Step 3: Migrate the SSDs to the New DBox
On the Clusters tab of the Infrastructure page, check that the cluster's RAID State is healthy.
Prepare to move SSDs from the old DBox into the new DBox. Plan to insert each SSD into the slot in the new DBox that has the same slot number as in the old DBox.
Migrate each SSD, one at a time, as follows:
Remove the SSD from the faulty DBox.
The SSD's state changes to Failed and the cluster's RAID state changes to Rebuild.
Insert the removed SSD into the target slot in the new DBox.
The SSD is activated automatically.
Verify that the cluster's RAID state has returned to healthy before proceeding with the next SSD.
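The key rule in this step is that only one SSD is in transit at a time, and the next one is not touched until the RAID state is healthy again. The following minimal sketch shows that control flow only. It does not call any VAST API: `get_raid_state` and `move_ssd` are hypothetical callables that you would supply (for example, a function that prompts the operator to read the state from the Clusters tab, and a prompt to physically move the SSD). The polling interval and timeout are arbitrary illustrative values.

```python
import time
from typing import Callable, Iterable


def wait_for_state(
    get_state: Callable[[], str],
    wanted: str = "healthy",
    poll_seconds: float = 30.0,
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Poll get_state() until it reports `wanted` (case-insensitive)."""
    waited = 0.0
    while get_state().lower() != wanted:
        if waited >= timeout_seconds:
            raise TimeoutError(f"state did not become {wanted!r} in time")
        sleep(poll_seconds)
        waited += poll_seconds


def migrate_ssds(
    slots: Iterable[int],
    get_raid_state: Callable[[], str],  # hypothetical: returns the RAID state
    move_ssd: Callable[[int], None],    # hypothetical: move one SSD, same slot number
    **wait_options,
) -> None:
    """Move SSDs one at a time, gating each move on a healthy RAID state."""
    # Do not start unless the cluster is healthy.
    wait_for_state(get_raid_state, **wait_options)
    for slot in slots:
        # Remove the SSD from the faulty DBox and insert it into the slot
        # with the same number in the new DBox.
        move_ssd(slot)
        # The RAID state is expected to go to Rebuild and then return to healthy.
        wait_for_state(get_raid_state, **wait_options)
```

Passing `sleep` and the state reader as parameters keeps the sketch easy to dry-run with canned values before using it as an operator checklist.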
Step 4: Migrate the NVRAMs to the New DBox
Prepare to move NVRAMs from the old DBox into the new DBox. Plan to insert each NVRAM into the slot in the new DBox that has the same slot number as in the old DBox.
On the Clusters tab of the Infrastructure page, check that the cluster's NVRAM State is healthy.
For each NVRAM in turn:
In the NVRAMs tab, open the Actions menu for the NVRAM and select Deactivate.
When the NVRAM is deactivated, remove the NVRAM from the faulty DBox.
Insert it into the target slot in the new DBox. If the NVRAM is faulty, insert a replacement NVRAM into the planned slot instead.
Verify that the slot is active and the device is healthy.
Open the Actions menu for the moved NVRAM and select Activate.
Verify that the cluster's NVRAM State is healthy before proceeding with the next NVRAM.
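The NVRAM steps differ from the SSD steps in that each device is explicitly deactivated before removal and explicitly activated after insertion. The sketch below captures that ordering and the health gates between devices. Every callable is a hypothetical stand-in for a VMS action or a manual check performed by the operator. None of them is a real VAST API, and the sketch stops on an unhealthy result rather than waiting.

```python
from typing import Callable, Iterable


def migrate_nvrams(
    slots: Iterable[int],
    deactivate: Callable[[int], None],          # Actions menu > Deactivate
    move: Callable[[int], None],                # physical move, same slot number
    slot_is_healthy: Callable[[int], bool],     # slot active and device healthy?
    activate: Callable[[int], None],            # Actions menu > Activate
    nvram_state_is_healthy: Callable[[], bool], # cluster NVRAM State healthy?
) -> None:
    """Move NVRAMs one at a time in the order described in Step 4."""
    if not nvram_state_is_healthy():
        raise RuntimeError("NVRAM State is not healthy; do not start the migration")
    for slot in slots:
        deactivate(slot)
        move(slot)
        if not slot_is_healthy(slot):
            raise RuntimeError(f"slot {slot} is not active or the device is not healthy")
        activate(slot)
        if not nvram_state_is_healthy():
            raise RuntimeError(
                f"NVRAM State is not healthy after slot {slot}; stop before the next NVRAM"
            )
```

Raising an error on the first unhealthy check is a deliberate choice in this sketch: it forces the operator to investigate before touching the next NVRAM.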
Step 5: Remove the Faulty DBox
Verify that the faulty DBox is empty of devices.
Open the Actions menu again for the faulty DBox and select Conclude Replacement.
Click Yes to confirm the replacement.
The process of removing the DBox and DNodes takes some time. You can monitor the progress by watching the replace_dbox task on the Activities page.
Wait until the task is complete and then physically remove the faulty DBox. Ship it back to VAST Data.
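Waiting for the replace_dbox task before unracking the faulty DBox can be expressed as a simple polling loop. This is a sketch under stated assumptions: `get_task_state` is a hypothetical callable that returns the task's current state as text (for example, typed in by the operator from the Activities page), and the state names `completed` and `failed` are placeholders, not confirmed VMS values.

```python
import time
from typing import Callable, Sequence


def wait_for_task(
    get_task_state: Callable[[], str],            # hypothetical state reader
    done_states: Sequence[str] = ("completed",),  # placeholder state names
    failed_states: Sequence[str] = ("failed",),   # placeholder state names
    poll_seconds: float = 60.0,
    max_polls: int = 1000,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the task reaches a done state; raise if it fails or stalls."""
    for _ in range(max_polls):
        state = get_task_state().lower()
        if state in done_states:
            return state
        if state in failed_states:
            raise RuntimeError(f"replace_dbox task reported state {state!r}")
        sleep(poll_seconds)
    raise TimeoutError("replace_dbox task did not finish within the polling budget")
```

Only after this returns should the faulty DBox be physically removed.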