Yesterday, when I logged in to vCenter, I noticed a warning icon on one of
the hosts, and the Summary tab showed a warning about the host's boot device.
This warning message indicates that the embedded Flash/SD card (ESXi
embedded install) is no longer available to the ESXi host. As this is an HP
ProLiant server, I logged into iLO and checked the diagnostics, which reported
the SD card status as OK. I then looked at the iLO logs and found that the
SD card had been restarted recently.
The good news is that the whole ESXi OS loads into memory, so there was no
outage for the VMs, and once connectivity is restored the host can access
the storage again. The bad news is that the error did not clear automatically,
and since no one likes to see errors or warnings in a production environment,
I needed to find a solution to this issue.
The simplest solution to this issue is to put the host in maintenance mode
and restart the management agents. There are two ways to do this. Either
connect to the host using SSH and run the commands below:
/etc/init.d/hostd restart
/etc/init.d/vpxa restart
Or, alternatively, connect to the host through iLO, open a remote console, log in to the DCUI and restart the management agents from there.
Once the management agents have restarted, vCenter will show the
host back in a normal state.
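The SSH route above can be wrapped in a small script. This is only a sketch under a few assumptions: the `run` helper, the `DRY_RUN` switch and the function names are made up for illustration, and the maintenance mode step uses `esxcli system maintenanceMode set`, which waits until the VMs have been migrated or powered off. By default the script only prints what it would do; set `DRY_RUN=0` on the host to actually run the commands.

```shell
#!/bin/sh
# Sketch only: prints each command unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Put the host in maintenance mode (VMs must be migrated or powered off first).
enter_maintenance_mode() {
    run esxcli system maintenanceMode set --enable true
}

# Restart the two management agents named in the post.
restart_mgmt_agents() {
    run /etc/init.d/hostd restart
    run /etc/init.d/vpxa restart
}

enter_maintenance_mode
restart_mgmt_agents
```

Restarting hostd drops the host's connection to vCenter for a short while, so the host may briefly show as disconnected before it returns to a normal state.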
Note: In some cases the SD card has issues because of buggy firmware, and
to fix the issue you may need to upgrade or downgrade the firmware.
We are at firmware version 2.20, and according to various forums this version
has an SD-card-related bug that was supposedly fixed in firmware version 2.22.
Version 2.30 is also available, so one may upgrade to either of these
versions.
Other scenario: What if the SD card has failed? You can try to remove and
reattach the SD card, but if it still doesn't come online you need to call
the server vendor for a replacement.
If the SD card is bad, migrate all VMs to other hosts, put the host in
maintenance mode and take a backup of the host configuration. Then shut down
the host, replace the flash drive and reinstall ESXi (the host will not come
up after a reboot). Once the host is up, configure the management network
and VLANs, then restore the host configuration.
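The backup and restore of the host configuration can be done from the ESXi shell with `vim-cmd`. The following is a rough sketch rather than a tested procedure: the `run` helper, the `DRY_RUN` switch and the function names are assumptions made for illustration, and `/tmp/configBundle.tgz` is only an example location for the copied-back bundle. As far as I know the restore expects the reinstalled host to be on the same ESXi build as the one that produced the backup, so check that before relying on it.

```shell
#!/bin/sh
# Sketch only: prints each command unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Run on the host BEFORE shutting it down for the SD card replacement.
backup_host_config() {
    # Write pending configuration changes to persistent storage first.
    run vim-cmd hostsvc/firmware/sync_config
    # Prints a download URL for the generated configBundle .tgz file.
    run vim-cmd hostsvc/firmware/backup_config
}

# Run on the reinstalled host, after copying the bundle back to it.
restore_host_config() {
    bundle="${1:-/tmp/configBundle.tgz}"   # example path, adjust as needed
    run vim-cmd hostsvc/maintenance_mode_enter
    # The host reboots once the restore completes.
    run vim-cmd hostsvc/firmware/restore_config "$bundle"
}

backup_host_config
```

Download the bundle from the printed URL and keep it off the host, since the SD card it was generated on is about to be replaced.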
Reference: Daniel's blog and discussion on other forums.
Update, 05/11/2015: This week we faced the same issue again, so instead of fixing it myself I contacted HP support, and they confirmed that the issue is with firmware version 2.20, which we have on these G9 servers.
Response from HP support:
That version 2.20 has been removed
from our site due to it causing issues with server components, including the
embedded flash cards. The new iLO firmware 2.22 addresses/fixes issues with
the embedded cards disconnecting.
HP support further suggested that we upgrade the firmware of the G9 servers to the most recent version, 2.30.
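If you track iLO firmware versions across several servers, the HP response can be turned into a tiny check. This is just an illustrative sketch: the function name `ilo_fw_status` is made up, the version string has to be supplied by hand (for example from the iLO web interface), and the mapping only covers the versions mentioned in this post.

```shell
#!/bin/sh
# Hypothetical helper: classify an iLO firmware version string
# using only what HP support stated in their response.
ilo_fw_status() {
    case "$1" in
        2.20) echo "affected: withdrawn by HP, upgrade to 2.22 or later" ;;
        2.22) echo "fixed: addresses the embedded card disconnects per HP" ;;
        2.30) echo "recommended: version HP support suggested for G9" ;;
        *)    echo "unknown: check with HP support" ;;
    esac
}

# Example: classify the version given as the first argument (default 2.20).
ilo_fw_status "${1:-2.20}"
```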
That’s it… :)
Hey Noor, I am checking to see if you still have this problem. We have a few G9 servers that have been upgraded from 2.40 to 2.44 but this issue still persists.
After upgrading the firmware to ver 2.30, we didn't see this issue again.
We are running 4x BL460c G9 and one of them has this issue. We are running firmware 2.30.
We have this issue only on some of our BL465c Gen8 servers, with iLO version 2.40 or 2.44.
Did you try to remove and then reconnect the SD card? By the way, I would suggest you check with HP Support about this.
I have a BL460c G9 with spinning disks on v2.40 that had this issue. Upgraded iLO to v2.50. I am reading elsewhere that the issue is still not resolved with v2.50.
Not sure in the case of magnetic disks, but in the case of a flash drive/SD card I didn't see this issue after upgrading the iLO. In your case there might be some other issue. If you have active support, please check with HP.
HP Gen 8/9 lose access to device backing boot filesystem: https://kb.vmware.com/kb/2144283
I use DCUI -> Troubleshooting Option -> Restart Management Agent and it's working fine. Thank you Noor Mohammad.
Super; thank you. I used your instructions and SSH-ed to it with PuTTY.