Cisco VIRL 1.3 Metadata CRC error detected xfs_repair

Over the weekend I received a phone call from a co-worker who was having issues with VIRL 1.3 (Aug 2017 build). He hit the problem after deploying the VIRL 1.3 OVA (either the 1-NIC or the 5-NIC OVA) on ESXi 6.5 Update 1. After deploying the OVA he would reconfigure the virtual machine with the CPU and memory needed for his lab. VIRL would boot successfully to the CLI, allowing him to run virl_setup. However, after running virl_setup and rebooting, he would receive XFS "Metadata CRC error detected" errors. I was able to run xfs_repair, which temporarily fixed the issue, but after a reboot or shutdown the errors would reappear.
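
For reference, a check-then-repair pass with xfs_repair might look like the sketch below. It is a minimal illustration, not necessarily the exact commands used here: the helper name is made up, the device path in the usage note is a placeholder, and the filesystem has to be unmounted first (for a root filesystem that usually means booting from rescue media).

```shell
# Sketch: check an XFS filesystem without modifying it, and repair only if needed.
# repair_xfs_if_needed is a hypothetical helper name; $1 is the unmounted device.
repair_xfs_if_needed() {
    dev="$1"
    # -n is no-modify mode: it only reports, and exits non-zero when corruption is found
    if xfs_repair -n "$dev"; then
        echo "No corruption reported on $dev"
    else
        # Corruption was reported, so run the actual repair
        xfs_repair "$dev"
    fi
}
```

It would be called as `repair_xfs_if_needed /dev/sdXN`, where `/dev/sdXN` stands in for the real partition. As described above, a clean result here did not last while the SSD firmware problem was still present.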

My co-worker re-downloaded both OVAs (1-NIC and 5-NIC) and redeployed them 3 or 4 times before involving me. After some troubleshooting on both ESXi and VIRL, I found the cause: a firmware issue with the Intel 600p NVMe SSD installed in his ESXi host. The 600p was used as the VMFS datastore hosting the VMDK for the VIRL 1.3 virtual machine. I SSH'd into the ESXi host and ran voma against the Intel 600p, which reported 1150 errors on the device; voma was unable to repair them.
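
A voma metadata check generally takes the form sketched below. The helper name and the device identifier are placeholders (the real identifier can be looked up with `esxcli storage vmfs extent list`), and all virtual machines on the datastore should be powered off before running it.

```shell
# Sketch: run a report-only VMFS metadata check with voma from the ESXi shell.
# check_vmfs_metadata is a hypothetical helper; $1 is "<device-id>:<partition>".
check_vmfs_metadata() {
    # -m vmfs: use the VMFS module; -f check: report errors only; -d: partition to examine
    voma -m vmfs -f check -d "/vmfs/devices/disks/$1"
}
```

For example, `check_vmfs_metadata "<device-id>:1"` would check partition 1 of the device backing the datastore; voma prints the number of errors it found at the end of the run.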

What I found really funny was that the same VMFS datastore had the following virtual machines running with no problems:

  • VIRL 1.2
  • Windows 10 1709 64-bit

So I did some research and came across a reported bug pertaining to XFS and NVMe. The issue was related to the firmware currently loaded on the 600p, and Intel had released new firmware to resolve it.

https://bugzilla.redhat.com/show_bug.cgi?id=1402533

How I resolved the issue

  1. Downloaded the Intel® SSD Firmware Update Tool (in this case I chose the bootable ISO).

  2. Mounted the ISO, restarted the ESXi host, and booted off the ISO.

  3. Upgraded the firmware on the SSD to v121c.

  4. Restarted the host back into ESXi

  5. Downloaded the intel-nvme VIB for ESXi and extracted the ZIP. (ESXi Intel Drive)

  6. Uploaded “VMW-ESX-6.5.0-intel-nvme-1.2.1.15-offline_bundle-5330543.zip” to the VMFS datastore.

  7. SSH'd into the ESXi host.

  8. Ran the following command:

    esxcli software vib install -d /vmfs/volumes/nameofdatastore/VMW-ESX-6.5.0-intel-nvme-1.2.1.15-offline_bundle-5330543.zip

  9. Verified that the driver from the VIB was installed successfully.

  10. Rebooted the ESXi host.

  11. Ran another voma check, which came back with 0 errors.

  12. Created a new VIRL 1.3 virtual machine from the OVA.
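
The verification in step 9 can be done from the same SSH session. The sketch below shows one possible way: it filters the installed VIB list by name. The helper name is made up, and `intel-nvme` as the VIB name is an assumption based on the bundle file name.

```shell
# Sketch: confirm that a driver VIB is present after installation.
# verify_vib is a hypothetical helper; $1 is a name fragment to look for.
verify_vib() {
    # List all installed VIBs and keep only lines matching the name (case-insensitive)
    esxcli software vib list | grep -i "$1"
}
```

Running `verify_vib intel-nvme` should print the matching VIB line with its version; if nothing matches, grep returns a non-zero exit status and nothing is printed.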

Hopefully this helps somebody else who comes across the same issue. If you are seeing it with another brand of NVMe SSD, I would encourage you to update its firmware.