Recently one of my customers had issues with his VMware vSphere environment: the semi-outage of one ESXi host took down quite a large number of his VMs. After we fixed the issue in his production infrastructure and brought his IT back online, everything seemed to be working again. Some days later we noticed errors in some of his Veeam backup jobs. Strangely, not all VMs were affected, and those affected had no direct correlation (e.g. same ESXi host, same datastore, same network, etc.).
The error message shown inside the Veeam Backup & Replication console was one I had not seen before:
Error: DiskLib error: .A file error was encountered -- Failed to read the file
Error Failed to retrieve next FILE_PUT message. File path: [[<DATASTORE>] <VMFOLDER>/<VM>.vmx]. File pointer: . File size: .
Today I demonstrated some of the new features of VMware vSAN 7.0 Update 3 (7.0 U3) related to 2-node deployments to one of our customers who uses this topology extensively. We focused particularly on improvements to the resilience of 2-node clusters and the witness host. You can find a short video about the features here. An extensive list of the new features and the release notes can be found here and here.
Another new feature I noticed but forgot to demonstrate to the customer is the new vSAN cluster shutdown and restart operation. I had already tested the shutdown and restart feature with my 4-node cluster, but once the demo was finished I decided to give the feature a try with the 2-node cluster and the external witness appliance.
As every year, some of my customers use the weeks after Christmas to update their environments. Nearly all of them run their ESXi hosts with vendor-specific custom images that provide additional drivers or agents on top of the standard VMware image. Unfortunately, there are almost always problems with conflicting VIBs when these images are updated, and this time was no exception. In my case, the problem involved a custom image from Fujitsu.
To perform the update successfully, the problematic VIB must be removed. I outline the necessary steps below.
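As a rough sketch of what removing a conflicting VIB can look like with esxcli: the excerpt does not name the problematic VIB, so `example-vib` below is a placeholder, and the `run` helper and `DRY_RUN` switch are illustrative additions that only print the commands unless explicitly disabled.

```shell
#!/bin/sh
# Sketch: remove a conflicting VIB before retrying the update.
# DRY_RUN=1 (default) only prints the commands; set DRY_RUN=0 on the ESXi host to execute them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

remove_vib() {
    vib=$1
    # List the installed VIBs to confirm the exact name
    run esxcli software vib list
    # Let esxcli report what would happen without changing anything
    run esxcli software vib remove --vibname="$vib" --dry-run
    # Actual removal
    run esxcli software vib remove --vibname="$vib"
}

# "example-vib" is a placeholder; use the name reported in the update error.
remove_vib example-vib
```

Depending on the VIB, the removal may ask for a reboot before the update can be retried.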
Today I learned something very useful that I want to share with you. We have some smaller customers from the SMB segment. Here we often find small vSphere clusters with two or three ESXi hosts. These are usually licensed with vSphere Essentials Plus to get the benefits of vSphere High Availability (vSphere HA). The functions contained in this bundle are quite sufficient for regular operation. One function that is missing in vSphere Essentials Plus is the ability to move running VMs from one datastore to another using Storage vMotion. In this post I want to show you a way to move your running VMs between datastores anyway.
After the general availability of ESXi 6.7 U3, I decided it was time to update my homelab. I used the integrated vSphere Update Manager (VUM) for my nested vSAN cluster without any issues. During my VCAP-DVC Deploy preparation earlier this year I used an old Intel NUC system to deploy an additional vCenter Server Appliance (VCSA) in order to play around with Enhanced Linked Mode (ELM). I wanted to have the VCSAs in different Layer 3 networks, so I copied the design from the nested cluster and installed a nested router along with a nested ESXi host. The nested ESXi is used to host the second VCSA. Due to this design I cannot use the integrated VUM of the second VCSA to update this standalone ESXi host.
The virtual machines (VMs) on the standalone ESXi host are not very important, so I decided to shut them down together with the VCSA and update the nested ESXi host via CLI. For details on how to update ESXi 6.x using the CLI, take a look at one of my previous blog posts here. I have used this method for quite some time, so I was certain I would have no issues with the update. But this time it was different: the update failed.
During my work with VMware products I have always needed a platform to test new products, updated versions and software dependencies. In the past I used the virtual environment of my employer to deploy, install and configure the necessary systems. This always had a major downside: my colleagues and I used the same environment to perform our tests, so we had conflicting systems now and then. For the past couple of months, I have been experimenting with different lab setups, but nothing really satisfied my needs. So I decided to approach my homelab design the way I would approach any other client design: by collecting requirements. (Even if it may not reach the level of detail of a real customer design.)
Here is the list of the central requirements for my homelab:
R001: Small physical footprint.
R002: Low energy consumption.
R003: Low noise level.
R004: Powerful enough to handle different use cases at the same time:
VMware vSAN (including All-Flash, multiple disk groups and automatic component rebuilds)
VMware NSX (including NSX DLR and NSX Edges)
VMware Horizon (including virtual desktops)
R005: Portable solution for customer demonstrations.
R006: Isolated environment without dependency or negative impact to other infrastructure components.
R007: IPMI support to power it on and off remotely.
To have the automatic rebuild of components (see R004) with vSAN even in case of a host failure, at least four nodes are required. Regarding R001, R002 and R003, a full sized 19-inch solution with four physical servers is not an option. Regarding the 10Gbit/s requirement for vSAN All-Flash (see R004), a solution consisting of several mini PCs is out of the question. Also, such a solution is not ideal for regular assembly and disassembly (see R005) due to the different components. To avoid dependencies on infrastructure components (see R006) despite the different use cases (see R004), a completely nested environment including AD, DNS and DHCP behind a virtual router seems to be the best option to me.
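The four-node figure can be checked with a little arithmetic. The sketch below assumes a mirroring (RAID-1) storage policy, where tolerating n failures needs 2n+1 hosts, plus one additional host as a rebuild target; the function and parameter names are made up for illustration.

```shell
#!/bin/sh
# Minimum vSAN host count for a mirroring (RAID-1) policy.
#   $1 = failures to tolerate (FTT)
#   $2 = extra hosts kept available as rebuild targets (defaults to 0)
vsan_min_hosts() {
    ftt=$1
    spare=${2:-0}
    echo $(( 2 * ftt + 1 + spare ))
}

echo "FTT=1, no rebuild target:  $(vsan_min_hosts 1) hosts"
echo "FTT=1, one rebuild target: $(vsan_min_hosts 1 1) hosts"
```

With three hosts the data stays available after a host failure, but there is nowhere to rebuild the lost components; the fourth host provides that target.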
The network design of the physical ESXi host is as simple as possible and as complex as necessary. I configured two vSphere Standard Switches (vSS). The first one (vSwitch0) contains the VMkernel NIC enabled for management traffic and a standard port group for the VMs which need a connection to the home network. vSwitch1 handles all the nested virtualization magic.
To do so I configured this vSS to accept promiscuous mode, MAC address changes and forged transmits. I also configured it for jumbo frames (MTU 9000) so I can use it for the VXLAN overlay networks using VMware NSX-V. All VMs which form the nested environment attach to vSwitch1.
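For those who prefer the command line, the same vSwitch settings could be applied with esxcli roughly as follows. This is a sketch, not necessarily how the host was configured here; the `run` helper and `DRY_RUN` switch are illustrative and only print the commands by default.

```shell
#!/bin/sh
# Sketch: apply the nested-virtualization settings to a standard vSwitch.
# DRY_RUN=1 (default) only prints the commands; set DRY_RUN=0 on the ESXi host to execute them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

configure_nested_vswitch() {
    vswitch=$1
    # Accept promiscuous mode, MAC address changes and forged transmits
    run esxcli network vswitch standard policy security set \
        --vswitch-name="$vswitch" \
        --allow-promiscuous=true \
        --allow-mac-change=true \
        --allow-forged-transmits=true
    # Jumbo frames for the overlay networks
    run esxcli network vswitch standard set --vswitch-name="$vswitch" --mtu=9000
}

configure_nested_vswitch vSwitch1
```

Port groups inherit the security policy from the vSwitch unless they override it, so setting it once at the switch level covers all nested VMs attached to vSwitch1.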
Six VMs are deployed natively on the physical ESXi host. The first VM is a virtual router which acts as the gateway to the outside world for the nested systems. It is configured with two network interfaces, one facing the home network while the other connects to the nested environment. The second VM is a domain controller running AD, DNS, DHCP and ADCS. The remaining four VMs are the nested ESXi hosts which later form the vSphere cluster for all other workloads. The nested router and the nested domain controller can be regarded as "physical" workloads running outside of the virtualization.
I went for a single Supermicro SuperServer E300-8D as a starting point. This not-so-small beauty convinced me with its good base specification and its numerous expansion options. I replaced the two stock fans with Noctua ones and added a third one for better overall cooling.
I was able to reuse most of the components from my homelab experiments and a nice system with 128GB RAM, a 64GB SATADOM device, a 1TB mSATA SSD and a 2TB NVMe SSD came together.
For all who are interested, here is the complete bill of materials of my current homelab configuration.
Supermicro SuperServer E300-8D
Noctua premium fan 40x40mm
Kingston ValueRAM 32GB
Supermicro SATADOM 64GB
Samsung 850 EVO mSATA 1TB
Samsung 970 EVO NVMe 2TB
Samsung PM983 NVMe 3.84TB
With this space and power saving configuration I have enough resources to run a nested 4 node vSAN cluster that uses NSX-V to run VMware Horizon on it. In one of my next posts, I’ll go deeper into the configuration of nested ESXi hosts and the associated implementation of VMware vSAN.
Recently I was asked to update some ESXi 6.7 U1 hosts to the latest patch release. A quick look at the list of build and version numbers showed me that there is a current express patch (EP6) which could be installed. Before performing this update in production I decided to update my test ESXi host first. If the update completes successfully and there are no issues after a few days, I will continue and update the production hosts to EP6.
The following list outlines the steps necessary to perform the update using the CLI (this is just a test ESXi host, so there is no vSphere Update Manager (VUM) available). The procedure should be the same for other versions of ESXi 6.x, although a restart of the ESXi host may be necessary at the end.
Step 1: Make sure SSH is enabled on the host
Because I perform the update using the CLI, I need to log in to the host via SSH. Therefore I need to check whether SSH is enabled. There are at least three ways to check:
Trial and error: Use your favorite SSH client and simply try to connect to the management IP of your ESXi host.
VMware Host Client: Use your web browser of choice and connect to the management IP of your ESXi host. Navigate to Host > Manage > Services. Check whether the TSM-SSH service is in the Running state; if not, start the service using the corresponding button.
Direct Console User Interface (DCUI): Use the IPMI of your server vendor, a KVM system or directly connected peripherals to access the DCUI. Navigate to Troubleshooting Mode Options > Enable/Disable SSH and check whether SSH is enabled; if not, enable it.
Step 2: Enable firewall rule for web traffic
Connect to the ESXi host via SSH and enable the firewall rule to allow web traffic. This is necessary to use the VMware Online Depot in the next steps.
esxcli network firewall ruleset set -e true -r httpClient
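If you want to confirm the rule state, or close the rule again once the update is done, something along these lines should work. The `run` helper and `DRY_RUN` switch are illustrative and only print the commands by default.

```shell
#!/bin/sh
# Sketch: toggle and inspect the httpClient firewall ruleset.
# DRY_RUN=1 (default) only prints the commands; set DRY_RUN=0 on the ESXi host to execute them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = true to allow outbound web traffic, false to block it again
set_http_client() {
    run esxcli network firewall ruleset set --enabled="$1" --ruleset-id=httpClient
}

# Show the current state of the ruleset
show_http_client() {
    run esxcli network firewall ruleset list --ruleset-id=httpClient
}

set_http_client true
show_http_client
# ... perform the update steps here ...
set_http_client false
```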
Optional Step 3: Check the current ESXi build and version
To confirm the current version of your ESXi host use the following command.
esxcli system version get
Step 4: List all available profiles in the VMware Online Depot
To get a list of all available profiles in the VMware Online Depot use the following command.
esxcli software sources profile list -d https://hostupdate.vmware.com/software/VUM/PRODUCTION/main/vmw-depot-index.xml
This list can be very extensive, so it is recommended to work with filters. Appending | grep -i 'ESXi-6.7.*standard' lists only the standard profiles for version 6.7, which include VMware Tools. The quotes keep the shell from interpreting the * itself.
If the update finishes with Reboot Required: true use the following command to reboot the host. Make sure there are no running VMs and, optionally, that the host is in maintenance mode.
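To see what such a filter does without querying the depot, here is a small sketch that runs the pattern against a few sample lines. The profile names are placeholders, not real depot output, and `filter_standard_profiles` is a made-up helper name.

```shell
#!/bin/sh
# Keep only the 6.7 "standard" profiles from a profile listing read on stdin.
# The pattern is quoted so the shell does not expand * as a filename glob,
# and the dot in 6.7 is escaped so it matches a literal dot only.
filter_standard_profiles() {
    grep -i 'ESXi-6\.7.*standard'
}

# Illustrative sample; these names are placeholders.
cat <<'EOF' | filter_standard_profiles
ESXi-6.5.0-EXAMPLE-standard
ESXi-6.7.0-EXAMPLE1-standard
ESXi-6.7.0-EXAMPLE1-no-tools
ESXi-6.7.0-EXAMPLE2-standard
EOF
```

On the host, the same function can sit at the end of the profile list command, for example esxcli software sources profile list -d <depot URL> | filter_standard_profiles.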
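The reboot command itself is not shown in this excerpt. One way to do it with esxcli, sketched under the assumption that the host should go into maintenance mode first, looks like this. The `run` helper, the `DRY_RUN` switch and the default reason text are illustrative.

```shell
#!/bin/sh
# Sketch: put the host into maintenance mode and reboot it.
# DRY_RUN=1 (default) only prints the commands; set DRY_RUN=0 on the ESXi host to execute them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

reboot_host() {
    reason="${1:-ESXi patch update}"
    # When executing for real, refuse to continue while VMs are still running
    if [ "$DRY_RUN" != "1" ] && [ -n "$(esxcli vm process list)" ]; then
        echo "VMs are still running; shut them down first" >&2
        return 1
    fi
    # esxcli only reboots a host that is in maintenance mode
    run esxcli system maintenanceMode set --enable=true
    # A reason is mandatory and ends up in the host logs
    run esxcli system shutdown reboot --reason="$reason"
}

reboot_host "EP6 update"
```

After the host is back up, remember to leave maintenance mode again (the same maintenanceMode command with --enable=false).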
Optional Step 7: Check ESXi version again to ensure the update completed successfully
I finished the update with a final check of the running ESXi build and version using esxcli again.
esxcli system version get
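To compare the build before and after the update without reading the output by eye, the build line can be extracted with a small helper. The sample below mimics the layout of the esxcli output, but its values are placeholders, and `parse_build` is a made-up function name.

```shell
#!/bin/sh
# Extract the build string from "esxcli system version get" output read on stdin.
parse_build() {
    sed -n 's/^ *Build: *//p'
}

# Illustrative sample in the layout esxcli prints; the values are placeholders.
sample='   Product: VMware ESXi
   Version: 6.7.0
   Build: Releasebuild-00000000
   Update: 1'

echo "$sample" | parse_build
```

On the host you could store before=$(esxcli system version get | parse_build) prior to the update, do the same into after once the host is back, and compare the two values.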
With just a few commands it is possible to update an existing host to the currently available EP6 using only the VMware Online Depot. Just make sure that the host is configured with correct DNS settings and is able to access the internet. If this is not possible, a corresponding offline bundle can still be used for the update.