Fix - Invalid Guest on Virtual Center

December 9, 2008

After encountering an ESX host problem the other night, I ran into an issue today with a VM guest showing up as “invalid” in Virtual Center. I was able to bring the guest back into VC without an outage by following the procedure below.

First some background.

Due to circumstances still being investigated, the console of an ESX box froze, disconnecting it from Virtual Center. All of the guests (approximately 40) on the host were still available and running, but VMware support confirmed that the state of the server was so degraded that fixing it would require a reboot of the host, and thus an outage of all the guests on it. Since the ESX box is in an HA cluster, after some necessary VM guest applications were shut down the ESX box was rebooted, and HA promptly brought up the guest VMs on other hosts in the cluster. All the affected guests were then checked out and appeared fine.

Thinking I was in the clear, today I noticed that the icon for one of the affected VMs appeared blue in Virtual Center, with the name italicized and “(invalid)” added after it. Knowing that I had successfully started and checked this particular VM the night before, I was, needless to say, confused.

First things first: since the VM was a Linux guest, I tried to ssh to it to see if it was still running. Luckily, I was able to log in to the VM and everything looked normal. Next, I logged onto the console of the ESX host that this VM had last been registered to and issued a vmware-cmd -l. There was no entry for the invalid VM, so to double check I issued a ps -axf | grep -i for the VM’s name and found that there was indeed a process running for the VM in question on this particular ESX host.

I decided to try to re-add the VM into VC manually by first removing the invalid guest from inventory in VC and then re-adding it by browsing to the .vmx file. To do this, click on the ESX host in VC and, on the Summary tab, double-click the datastore that holds the VM’s .vmx file. You can then browse to the directory for the VM guest and should be able to right-click the .vmx file and choose the “Add to inventory” option. I say “should be able to” because in this particular instance that option was grayed out and not selectable.

In an attempt to get more information from the ESX host logs, I then logged onto the ESX host the VM was last registered on and navigated to the /var/log/vmware directory. Issuing a grep -i * (searching for the VM’s name) gave a lot of good output. The interesting bits were some entries concerning .vmx file syntax errors. They appeared as follows:

hostd-9.log:[2008-12-07 17:28:17.388 'BaseLibs' 20241328 info] Reloading config state: /vmfs/volumes/48dabc48-573b1344-46f8-001ec939c5cb/vmabc123/vmabc123.vmx
hostd-9.log:[2008-12-07 17:28:17.435 'BaseLibs' 20241328 warning] VMHSVMLoadConfig failed: File "/vmfs/volumes/48dabc48-573b1344-46f8-001ec939c5cb/vmabc123/vmabc123.vmx" line 94: Syntax error.
hostd-9.log:[2008-12-07 17:28:17.448 'vm:/vmfs/volumes/48dabc48-573b1344-46f8-001ec939c5cb/vmabc123/vmabc123.vmx' 3076424608 info] Failed to load virtual machine.
hostd-9.log:[2008-12-07 17:28:17.466 'vm:/vmfs/volumes/48dabc48-573b1344-46f8-001ec939c5cb/vmabc123/vmabc123.vmx' 3076424608 info] Failed to load virtual machine. Marking as unavailable: vim.fault.InvalidVmConfig
hostd-9.log:[2008-12-07 17:28:17.467 'vm:/vmfs/volumes/48dabc48-573b1344-46f8-001ec939c5cb/vmabc123/vmabc123.vmx' 3076424608 info] State Transition (VM_STATE_INITIALIZING -> VM_STATE_INVALID_CONFIG)
hostd-9.log:[2008-12-07 17:28:17.467 'vm:/vmfs/volumes/48dabc48-573b1344-46f8-001ec939c5cb/vmabc123/vmabc123.vmx' 3076424608 info] Marking VirtualMachine invalid
hostd-9.log:[2008-12-07 17:28:17.467 'Vmsvc' 3076424608 info] Loaded virtual machine: /vmfs/volumes/48dabc48-573b1344-46f8-001ec939c5cb/vmabc123/vmabc123.vmx
hostd.log:[2008-12-08 09:18:04.516 'vm:/vmfs/volumes/48dabc48-573b1344-46f8-001ec939c5cb/vmabc123/vmabc123.vmx' 60660656 info] State Transition (VM_STATE_INVALID_CONFIG -> VM_STATE_UNREGISTERING)
hostd.log:[2008-12-08 09:18:04.586 'vm:/vmfs/volumes/48dabc48-573b1344-46f8-001ec939c5cb/vmabc123/vmabc123.vmx' 60660656 info] State Transition (VM_STATE_UNREGISTERING-> VM_STATE_GONE)
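
When hostd has several rotated logs, the failing file and line number can be pulled out in one pass. The following is a minimal sketch, not a VMware tool: the function name find_vmx_syntax_errors is made up for illustration, and it assumes the warning is worded like the VMHSVMLoadConfig entry shown above.

```shell
# Hypothetical helper (not part of ESX): print "<vmx path> line <n>" for every
# config load failure found in the given hostd log files.
# Assumes the warning text matches the "VMHSVMLoadConfig failed" entry above.
find_vmx_syntax_errors() {
    grep -h 'VMHSVMLoadConfig failed' "$@" |
        sed -n 's|.*\(/vmfs/[^" ]*\.vmx\).* line \([0-9][0-9]*\): Syntax error.*|\1 line \2|p'
}

# Example (run from the log directory):
#   find_vmx_syntax_errors hostd*.log
```

For the entries above this would report the vmabc123.vmx path followed by "line 94", which is where to look in the configuration file.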

These entries are from approximately 17 hours after I successfully restarted the invalid VM after the ESX host outage. Since they specified bad .vmx entries, I navigated to the .vmx file in question and made a backup copy of the file. Then I opened the original .vmx file and noticed the last three lines of the file were:
evcCompatibilityMode = "FALSE"
0001e9ebd3fbff"
evcCompatibilityMode = "FALSE"

The .vmx file is basically the configuration of the VM, and each line should hold a relevant setting. The second-to-last line, a stray string of hex digits ending in a quote, is not a valid entry, and the evcCompatibilityMode entry should appear only once. It seems I had found the syntax errors the hostd logs were complaining about. After editing the .vmx file to remove the last two lines, I decided to stop and restart the VMware management agents to see if they could now pick up the orphaned VM guest process.
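
A quick check like the following can flag both kinds of problem in a .vmx file. It is only a sketch under one assumption: that every setting sits on its own line in the form key = "value", with blank lines and # comments allowed. check_vmx is a hypothetical helper name, not a VMware command, and hostd’s real parser may accept or reject things this does not.

```shell
# Hypothetical helper: report lines that do not look like  key = "value"
# and keys that occur more than once. Exit status is 1 if anything is reported.
# Usage: check_vmx /path/to/file.vmx
check_vmx() {
    awk '
        BEGIN { bad = 0 }
        /^[ \t]*$/ || /^[ \t]*#/ { next }      # skip blank lines and comments
        !/^[ \t]*[A-Za-z0-9_.:-]+[ \t]*=[ \t]*".*"[ \t]*$/ {
            printf "line %d: malformed: %s\n", NR, $0
            bad = 1
            next
        }
        {
            key = $0
            sub(/[ \t]*=.*/, "", key)          # drop everything from the "=" on
            sub(/^[ \t]+/, "", key)            # drop leading whitespace
            if (key in seen) {
                printf "line %d: duplicate of line %d: %s\n", NR, seen[key], $0
                bad = 1
            } else {
                seen[key] = NR
            }
        }
        END { exit bad }
    ' "$1"
}
```

Run against a copy of the file, it prints one line per finding, so a stray fragment and a repeated evcCompatibilityMode entry would each show up with their line numbers.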

This was done using the following commands:
#/etc/rc.d/init.d/vmware-vpxa stop
#service mgmt-vmware stop
#service mgmt-vmware start
#/etc/rc.d/init.d/vmware-vpxa start

After restarting the services, I tried manually registering the VM guest to the host using #vmware-cmd -s register with the path to the .vmx file. This returned successfully, so I checked the VM’s operational state using vmware-cmd getstate. The command showed that the VM was powered on, which also meant that the VMware services now recognized the VM as a valid guest. I logged back into VC and, sure enough, the VM guest icon was now showing as powered on and I was able to open a console to the guest.

I’m still not sure who or what created the bad entries in the .vmx file to begin with, or why they didn’t cause an issue until so long after the guest was rebooted, but at least I was able to fix the issue without an outage.

Building an ESX3i White Box: Enterprise Virtualization for Less

June 25, 2008

Server virtualization is and has been a hot topic in the IT field in recent years. The use of virtualized servers allows businesses to expand their infrastructure in density (vertically) instead of in physical machine quantity (horizontally). Among other benefits, this minimizes the floor space, power consumption, and physical hardware required to keep the business running. These benefits are very appealing to large corporations which own and operate their own datacenters, and for the individual or small business owner who is renting rack space in a secure datacenter, the savings created by server virtualization can be even more important. This article documents the steps required to build a relatively low cost server virtualization platform with most of the features being used by billion dollar corporations today.
If you wish to try ESXi before you build a server specifically for it, simply follow the Software section of this article and plug the loaded USB drive into an available computer which can boot from a USB port. Most computers made within the past few years should work to some degree, depending on the hardware and whether ESXi comes with drivers for it.

Software
Although various vendors currently offer virtualization software, the leading choice among top businesses is VMware. VMware offers a full range of virtualization products, but for this article we will focus on their most recent release: ESXi. This software is a minimal version of their popular ESX Server product line which provides “an enterprise-class hypervisor with a thin 32 MB footprint”. It is intended to be installed on an SD card which is included with a select number of enterprise class servers from hardware vendors such as Dell, IBM, and HP. To learn more about ESXi, see http://vmware.com/products/esxi/.

Edit: VMware now gives users a license of ESXi for free after registration!  You are no longer limited to a trial.
Although the licensing for ESXi is currently $495, VMware allows anyone to download a 60 day trial of the product for free. This evaluation copy was used throughout the system build, and can be upgraded to the retail copy at the end of the trial period if desired.

Note: The following instructions are a summary of the instructions found at http://vmetc.com/2008/02/05/create-your-own-bootable-esx-3i-usb-stick/ .
Since USB jump drives are much more common than PCI to SD card adaptors, we will install the software onto a USB drive of 1GB or larger. To start, download an evaluation copy of the software (VMware ESXi 3.5 Installable Update 1) from VMware’s site http://vmware.com/download/vi/ . Next you will need to download a trial copy of WinImage (http://www.winimage.com) and a freeware archiver such as 7zip (http://www.7-zip.org). Then perform the following steps:

    1. Extract INSTALL.TGZ from the root directory of the ESXi ISO image using 7zip.
    2. Extract /usr/lib/vmware/installer/VMware-VMvisor-big-3.5.0-67921.i386.dd.bz2 from INSTALL.TGZ using 7zip
    3. Extract VMware-VMvisor-big-3.5.0-67921.i386.dd from VMware-VMvisor-big-3.5.0-67921.i386.dd.bz2 using 7zip
    4. Attach the USB flash drive and make sure you no longer need the data on it
    5. Use WinImage to transfer VMware-VMvisor-big-3.5.0-67921.i386.dd to the USB flash drive
        a. Disk->Restore Virtual Hard Disk image on physical drive…
        b. Select the USB flash drive (Warning: If you select the wrong disk you will lose data!)
        c. Select the image file VMware-VMvisor-big-3.5.0-67921.i386.dd
        d. Confirm the warning message
        e. Wait for the transfer to complete
    6. Unplug the USB flash drive (Warning: If you forget to unplug the flash drive from the PC you might lose the data on your hard drives the next time you boot!)
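
For readers without a Windows machine, steps 1 through 3 can be approximated on Linux with tar and bzip2 in place of 7zip. This is a swapped-in route that the instructions above do not cover, sketched under the assumption that INSTALL.TGZ contains the .dd.bz2 member named in step 2; extract_esxi_image is a hypothetical helper name.

```shell
# Hypothetical helper: unpack the ESXi disk image from INSTALL.TGZ into the
# current directory and print the name of the resulting .dd file.
# Usage: extract_esxi_image /path/to/INSTALL.TGZ
extract_esxi_image() {
    tgz=$1
    # Locate the compressed image inside the archive (name pattern from step 2).
    member=$(tar -tzf "$tgz" | grep 'VMware-VMvisor-big-.*\.dd\.bz2$' | head -n 1)
    if [ -z "$member" ]; then
        echo "no .dd.bz2 image found in $tgz" >&2
        return 1
    fi
    out=$(basename "$member" .bz2)
    # Stream the member to stdout and decompress it in one pass.
    tar -xzOf "$tgz" "$member" | bzip2 -dc > "$out" && echo "$out"
}
```

Writing the resulting .dd file to the stick would then be a job for a raw-copy tool such as dd, and the same warning applies: pointing it at the wrong device destroys the data on that disk.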

Hardware
Now that we have our bootable ESX drive, it is time to build the physical box which will host the virtual machines. Because ESX is meant to be an enterprise product, it is only sold and supported on a select number of expensive server platforms which sell for anywhere from $3,000 to $30,000. However, even though these are the only officially supported platforms VMware lists, the ESXi hypervisor can actually work with a variety of undocumented hardware. Because of this, building a white box often involves finding out by trial and error which hardware works and which doesn’t. The following configuration, which I can confirm works, was used during my build to fit my personal needs and budget, but other more or less powerful configurations may also be suitable. All of the parts used were purchased from online PC hardware providers, and the prices listed may reflect sales or rebates that are no longer available.

    Part / Price
    4U rack mountable case: Norco RPC-800 / $74
    8GB Corsair XMS2 DDR2 800 RAM (4×2GB) / $152
    Corsair CMPSU-550VX 550W Power Supply / $80
    Intel Q9300 Processor / $275
    ABIT IP35 Pro Motherboard / $130
    Intel EXPI9300PT Gigabit PCI-E Network Card (x2) / $90
    2GB Sony USB Micro Drive / $13
    Cheap PCI or PCI-E video card (only needed for setup) / $15
    1TB HDD (Western Digital Caviar SE16 WD5000AAKS SATA drive x2) / $180
    Total $1,009

The hardware used provides a fairly beefy machine (quad core with 8GB RAM and 1TB of disk space) easily capable of running numerous virtual servers simultaneously. For example, the above hardware is currently running four different VMs, each with a 3 GHz CPU core and 2GB RAM. The particular motherboard used includes two on-board gigabit NICs which are unfortunately not supported by ESXi at this time. Therefore, the two PCI-E Intel NICs were included to provide separate physical Ethernet ports for VM traffic and ESX management. Because ESXi also does not support IDE devices, the hard drives used must be SATA. And although the motherboard includes onboard RAID, the chipset used is only supported in IDE mode (meaning no RAID or AHCI support). If RAID is desired, a PCI or PCI-E RAID card may be used.

Installation
After setting up the physical ESX box, insert the USB drive with the newly loaded hypervisor into an available USB port. Once the box starts up, a setting needs to be configured in the BIOS for the server to boot from the USB drive. Under Integrated Peripherals->OnChip PCI Device->USB Device Settings, make sure the USB Storage Function is set to Enabled. Then make sure the Hard Disk Boot Priority is set to have the USB drive boot first. Finally, save the settings and let the machine boot into ESX. You are now ready to configure the ESX server as desired.

Final Thoughts
Server virtualization is an increasingly important skill for IT professionals. While the system described in this article is certainly stable enough to be used in a business environment, it uses white box components and is not officially supported by VMware, so it is more practical for use in a “home” or lab environment. By following the above procedures, the novice techie can use ESXi to learn, play, and build the skills required for in-demand positions in the IT field.