Thursday, July 14, 2016

Unmounting a LUN or detaching a datastore from ESXi 5.x or 6.0

You may have seen this: you unmount or detach a LUN from an ESXi host, and some time later a few hosts in your environment show as Inaccessible or Not Responding. When you check the VMkernel logs on an affected host, you find APD/PDL-related entries.

This can happen when the proper procedure is not followed during the LUN detach. Skipping the proper unmount and detach steps can leave the host in an All-Paths-Down (APD) or Permanent Device Loss (PDL) state.

In this post I summarize the best practice for unmounting a LUN from ESXi 5.x or 6.0.

Before doing anything, ensure that:
  • The host has no registered virtual machines or templates residing on this datastore, and all CD/DVD images located on the VMFS datastore are also unregistered from any virtual machines.
  • The datastore is not used for vSphere HA heartbeat.
  • The datastore is not part of a datastore cluster (managed by Storage DRS).
  • The datastore is not configured as a diagnostic coredump partition.
  • Storage I/O Control is disabled for the datastore.
  • If the LUN is being used as an RDM, remove the RDM from the virtual machine. Click Edit Settings, highlight the RDM hard disk, and click Remove. Select Delete from disk and click OK.

    Note: This destroys the mapping file but not the LUN content.
  • The LUN/datastore is not used as the persistent scratch location for the host.

Note: When using the vSphere Web Client with vSphere 5.1, 5.5, and 6.0, only the following checks are required during a datastore unmount:
  • The host has no virtual machines residing on the datastore.
  • The host does not use the datastore for HA heartbeats.
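
The first prerequisite can be spot-checked from the ESXi shell. The sketch below is only an illustration: the function name vms_on_datastore and the sample rows are made up, and it assumes that `vim-cmd vmsvc/getallvms` prints each VM's .vmx path as "[datastore] folder/vm.vmx". It only sees VMs registered on the host where it runs, and it does not catch ISO images or extra disks that a VM stored elsewhere has attached from this datastore.

```shell
#!/bin/sh
# vms_on_datastore NAME: read `vim-cmd vmsvc/getallvms` output on stdin and
# print the rows whose .vmx path lives on the named datastore.
# (Hypothetical helper; the output layout is an assumption.)
vms_on_datastore() {
    # The File column looks like "[datastore] folder/vm.vmx"; -F matches the
    # bracketed name literally rather than as a regex character class.
    grep -F "[$1] "
}

# Hypothetical sample output (VM names are made up for illustration).
sample_vms='Vmid  Name   File                        Guest OS       Version  Annotation
1     web01  [LUN01] web01/web01.vmx     rhel6_64Guest  vmx-08
2     db01   [Storage2] db01/db01.vmx    rhel6_64Guest  vmx-08'

if printf '%s\n' "$sample_vms" | vms_on_datastore LUN01; then
    echo "LUN01 still has registered VMs; unregister or migrate them first"
else
    echo "no registered VMs found on LUN01"
fi
```

On a real host the same function would be fed live output, for example `vim-cmd vmsvc/getallvms | vms_on_datastore LUN01`.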

Obtaining the NAA ID of the LUN to be removed

From the vSphere Client, this information is visible in the Properties window of the datastore.

From the ESXi host, run this command:

# esxcli storage vmfs extent list

You see output similar to:

Volume Name  VMFS UUID                            Extent Number  Device Name                           Partition
-----------  -----------------------------------  -------------  ------------------------------------  ---------
datastore1   4de4cb24-4cff750f-85f5-0019b9f1ecf6              0  naa.6001c230d8abfe000ff76c198ddbc13e          3
Storage2     4c5fbff6-f4069088-af4f-0019b9f1ecf4              0  naa.6001c230d8abfe000ff76c2e7384fc9a          1
Storage4     4c5fc023-ea0d4203-8517-0019b9f1ecf4              0  naa.6001c230d8abfe000ff76c51486715db          1
LUN01        4e414917-a8d75514-6bae-0019b9f1ecf4              0  naa.60a98000572d54724a34655733506751          1

Make a note of the NAA ID of the datastore to use this information later in this procedure.

Note: Alternatively, you can run the esxcli storage filesystem list command, which lists all file systems recognized by the ESXi host.

To find the unique identifier of the LUN housing the datastore to be removed, run this command:

# esxcfg-scsidevs -m

This command generates a list of VMFS datastore volumes and their related unique identifiers. Make a note of the unique identifier (NAA_ID) for the datastore you want to unmount, as it is used later in the procedure.
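
If you would rather not pick the NAA ID out by eye, the lookup can be scripted. This is a minimal sketch, not an official tool: naa_for_datastore is a hypothetical helper, and it assumes the column layout of the `esxcli storage vmfs extent list` output shown above (two header lines, then one row per extent).

```shell
#!/bin/sh
# naa_for_datastore NAME: read `esxcli storage vmfs extent list` output on
# stdin and print the device (NAA ID) backing the named datastore.
# (Hypothetical helper; assumes the column layout shown in this post.)
naa_for_datastore() {
    awk -v want="$1" '
        NR <= 2 { next }   # skip the header and the dashed rule
        NF >= 5 {
            # The last four fields are UUID, extent number, device, partition;
            # everything before them is the volume name (which may contain spaces).
            name = $1
            for (i = 2; i <= NF - 4; i++) name = name " " $i
            if (name == want) print $(NF - 1)
        }
    '
}

# Sample rows copied from the output shown above.
sample_extents='Volume Name  VMFS UUID                            Extent Number  Device Name                           Partition
-----------  -----------------------------------  -------------  ------------------------------------  ---------
datastore1   4de4cb24-4cff750f-85f5-0019b9f1ecf6  0  naa.6001c230d8abfe000ff76c198ddbc13e  3
LUN01        4e414917-a8d75514-6bae-0019b9f1ecf4  0  naa.60a98000572d54724a34655733506751  1'

printf '%s\n' "$sample_extents" | naa_for_datastore LUN01
```

On a host this would be `esxcli storage vmfs extent list | naa_for_datastore LUN01`. A datastore that spans several extents prints one device per line, and every one of them has to be detached.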

Unmounting and detaching a LUN using the vSphere Client

To detach a storage device using the vSphere Client, first unmount the datastore and then detach the LUN. The process is as follows:
1. If the LUN is an RDM, skip to step 2. Otherwise, in the Configuration tab of the ESXi host, click Storage. Right-click the datastore being removed and click Unmount.

A Confirm Datastore Unmount window appears. When the prerequisite criteria have been passed, click OK.

Note: To unmount a datastore from multiple hosts in the vSphere Client, click Hosts and Clusters > Datastores and Datastore Clusters view (Ctrl+Shift+D). Perform the unmount task and select the appropriate hosts that should no longer access the datastore to be unmounted.

2. Click the Devices view (under Configuration > Storage).

3. Right-click the NAA ID of the LUN (as noted above) and click Detach. A Confirm Device Unmount window is displayed. When the prerequisite criteria are passed, click OK. Under the Operational State of the Device, the LUN is listed as Unmounted.

Note: The Detach function must be performed on a per-host basis and does not propagate to other hosts in vCenter Server. If a LUN is presented to an initiator group or storage group on the SAN, the Detach function must be performed on every host in that initiator group before unmapping the LUN from the group on the SAN. Failing to follow this step results in an all-paths-down (APD) state for those hosts in the storage group on which Detach was not performed for the LUN being unmapped.
4. Confirm that the LUN is successfully detached. The LUN can then be safely unpresented from the SAN.
5. Perform a rescan on all ESXi hosts that had visibility to the LUN. The device is automatically removed from the Storage Adapters.

When the device is detached, it stays in an unmounted state even if the device is re-presented (that is, the detached state is persistent). To bring the device back online, the device must be attached.

If you want to permanently decommission the device from an ESXi host, manually remove the NAA entries from the host configuration:

  • To list the permanently detached devices, run this command:

# esxcli storage core device detached list

You see output similar to:

Device UID State
------------------------------------ -----
naa.50060160c46036df50060160c46036df off
naa.6006016094602800c8e3e1c5d3c8e011 off 
  • To permanently remove the device configuration information from the system, run this command:

# esxcli storage core device detached remove -d NAA_ID

For example:

# esxcli storage core device detached remove -d naa.50060160c46036df50060160c46036df

This is it.

Unmounting a LUN using the command line

To unmount a LUN from an ESXi 5.x/6.0 host using the command line:

  • As earlier, obtain the NAA ID of the LUN to be removed
  • Now unmount the datastore by running this command:

    # esxcli storage filesystem unmount [-u UUID | -l label | -p path ]

    For example, use one of these commands to unmount the LUN01 datastore:

    # esxcli storage filesystem unmount -l LUN01
    # esxcli storage filesystem unmount -u 4e414917-a8d75514-6bae-0019b9f1ecf4
    # esxcli storage filesystem unmount -p /vmfs/volumes/4e414917-a8d75514-6bae-0019b9f1ecf4

    Note: If the VMFS filesystem you are attempting to unmount has active I/O or has not fulfilled the prerequisites to unmount the VMFS datastore, you see an error in the VMkernel logs similar to:

    WARNING: VC: 637: unmounting opened volume ('4e414917-a8d75514-6bae-0019b9f1ecf4' 'LUN01') is not allowed.
    VC: 802: Unmount VMFS volume f530 28 2 4e414917a8d7551419006bae f4ecf19b 4 1 0 0 0 0 0 : Busy
  • To verify that the datastore is unmounted, run this command:

    # esxcli storage filesystem list

    You see output similar to:

    Mount Point  Volume Name  UUID  Mounted  Type  Size  Free
    ------------------------------------------------- ----------- ----------------------------------- ------- ------ ----------- -----------
    /vmfs/volumes/4de4cb24-4cff750f-85f5-0019b9f1ecf6  datastore1  4de4cb24-4cff750f-85f5-0019b9f1ecf6  true  VMFS-5  140660178944  94577360896
    /vmfs/volumes/4c5fbff6-f4069088-af4f-0019b9f1ecf4  Storage2  4c5fbff6-f4069088-af4f-0019b9f1ecf4  true  VMFS-3  146028888064  7968129024
    /vmfs/volumes/4c5fc023-ea0d4203-8517-0019b9f1ecf4  Storage4  4c5fc023-ea0d4203-8517-0019b9f1ecf4  true  VMFS-3  146028888064  121057050624
    LUN01  4e414917-a8d75514-6bae-0019b9f1ecf4  false VMFS-unknown  version 0 0

    Note that the Mounted field is set to false, the Type field is set to VMFS-unknown version, and no Mount Point exists.

    Note: The unmounted state of the VMFS datastore persists across reboots. This is the default behavior. If you need to unmount a datastore temporarily, you can do so by appending the --no-persist flag to the unmount command.
  • To detach the device/LUN, run this command:

    # esxcli storage core device set --state=off -d NAA_ID
  • To verify that the device is offline, run this command:

    # esxcli storage core device list -d NAA_ID

    You see output, which shows that the Status of the disk is off, similar to:

    naa.60a98000572d54724a34655733506751
    Display Name: NETAPP Fibre Channel Disk (naa.60a98000572d54724a34655733506751)
    Has Settable Display Name: true
    Size: 1048593
    Device Type: Direct-Access
    Multipath Plugin: NMP
    Devfs Path: /vmfs/devices/disks/naa.60a98000572d54724a34655733506751
    Vendor: NETAPP
    Model: LUN
    Revision: 7330
    SCSI Level: 4
    Is Pseudo: false
    Status: off
    Is RDM Capable: true
    Is Local: false
    Is Removable: false
    Is SSD: false
    Is Offline: false
    Is Perennially Reserved: false
    Thin Provisioning Status: yes
    Attached Filters:
    VAAI Status: unknown
    Other UIDs: vml.020000000060a98000572d54724a346557335067514c554e202020
This device is now successfully detached from the host. It remains visible in the UI at this point.

If the device is to be permanently decommissioned, it is now possible to unpresent the LUN from the SAN.
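
Before asking the storage team to unpresent the LUN, the two verification steps above can be combined into one check. The sketch below is only that, a sketch: datastore_mounted_state and device_status are hypothetical helpers, and they assume the `esxcli storage filesystem list` and `esxcli storage core device list -d NAA_ID` output formats shown in this post.

```shell
#!/bin/sh
# datastore_mounted_state UUID: read `esxcli storage filesystem list` output
# on stdin and print the Mounted column (true/false) for that UUID.
# (Hypothetical helper; assumes the output format shown in this post.)
datastore_mounted_state() {
    awk -v uuid="$1" '
        {
            # An unmounted volume has no Mount Point column, so locate the UUID
            # field first; Mounted is the field right after it.
            for (i = 1; i < NF; i++)
                if ($i == uuid) { print $(i + 1); exit }
        }
    '
}

# device_status: read `esxcli storage core device list -d NAA_ID` output on
# stdin and print the value of its "Status:" line (on/off).
device_status() {
    awk '$1 == "Status:" { print $2; exit }'
}

# Sample rows copied (and shortened) from the outputs shown above.
sample_fs='/vmfs/volumes/4de4cb24-4cff750f-85f5-0019b9f1ecf6  datastore1  4de4cb24-4cff750f-85f5-0019b9f1ecf6  true  VMFS-5  140660178944  94577360896
LUN01  4e414917-a8d75514-6bae-0019b9f1ecf4  false  VMFS-unknown version  0  0'
sample_dev='naa.60a98000572d54724a34655733506751
   Display Name: NETAPP Fibre Channel Disk (naa.60a98000572d54724a34655733506751)
   Status: off
   VAAI Status: unknown'

mounted=$(printf '%s\n' "$sample_fs" | datastore_mounted_state 4e414917-a8d75514-6bae-0019b9f1ecf4)
status=$(printf '%s\n' "$sample_dev" | device_status)
if [ "$mounted" = "false" ] && [ "$status" = "off" ]; then
    echo "unmounted and detached: OK to unpresent the LUN"
else
    echo "not ready: Mounted=$mounted Status=$status"
fi
```

The check has to pass on every host that can see the LUN, not just one, because the detach does not propagate between hosts.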
  • To rescan all devices on the ESXi host, run this command:

    # esxcli storage core adapter rescan [ -A vmhba# | --all ]

    The devices are automatically removed from the Storage Adapters.
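
Since the order of operations is what matters here, it can help to print the per-host plan before touching anything. The helper below, print_removal_plan, is a hypothetical dry-run sketch: it only echoes the commands from this section in order and executes nothing.

```shell
#!/bin/sh
# print_removal_plan LABEL NAA_ID: print, in order, the commands from this
# section for one host. Nothing is executed. (Hypothetical helper.)
print_removal_plan() {
    label=$1
    naa=$2
    echo "esxcli storage filesystem unmount -l $label"
    echo "esxcli storage core device set --state=off -d $naa"
    echo "# unpresent the LUN on the array only after every host has run the two commands above"
    echo "esxcli storage core adapter rescan --all"
}

print_removal_plan LUN01 naa.60a98000572d54724a34655733506751
```

The third line is a comment on purpose: unpresenting happens on the SAN side, and only after the unmount and detach are done on all hosts.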

    Notes:
  • A rescan must be run on all hosts that had visibility to the removed LUN.
  • When the device is detached, it stays in an unmounted state even if the device is re-presented (that is, the detached state is persistent). To bring the device back online, the device must be attached. To do this via the command line, run this command:

    # esxcli storage core device set --state=on -d NAA_ID
  • If the device is to be permanently decommissioned from an ESXi host (that is, the LUN has been or will be destroyed), remove the NAA entries from the host configuration by running these commands:
  • To list the permanently detached devices:

    # esxcli storage core device detached list

    You see output similar to:

    Device UID State
    ---------------------------- -----
    naa.50060160c46036df50060160c46036df off
    naa.6006016094602800c8e3e1c5d3c8e011 off
  • To permanently remove the device configuration information from the system:

    # esxcli storage core device detached remove -d NAA_ID

    For example:

    # esxcli storage core device detached remove -d naa.50060160c46036df50060160c46036df
  • The reference to the device configuration is permanently removed from the ESXi host's configuration.

    Note: If the device is detached but still presented (that is, the step of unpresenting the LUN from the SAN was skipped), the preceding command fails to permanently remove the device from the system, and the device is automatically re-attached. You must unpresent the LUN from the SAN for the device to be permanently removed.
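
When several LUNs were decommissioned at once, typing the remove command per device gets tedious. As a rough sketch (detached_remove_commands is a hypothetical helper and assumes the two-column `detached list` output shown above), the commands can be generated for review first:

```shell
#!/bin/sh
# detached_remove_commands: read `esxcli storage core device detached list`
# output on stdin and print one remove command per device in state "off".
# (Hypothetical helper; it prints commands and does not run them.)
detached_remove_commands() {
    awk 'NR > 2 && $2 == "off" {
        print "esxcli storage core device detached remove -d " $1
    }'
}

# Sample rows copied from the output shown above.
sample_detached='Device UID                            State
------------------------------------  -----
naa.50060160c46036df50060160c46036df  off
naa.6006016094602800c8e3e1c5d3c8e011  off'

printf '%s\n' "$sample_detached" | detached_remove_commands
```

Printing rather than executing is deliberate: the list may include devices you detached only temporarily, so review the output before running any line of it.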
Reference: VMware KB 2004605 and KB 2004684

That's it... :)

