
Category: VMware

Understanding VMware vSphere Distributed Resource Scheduler (DRS)

 

Diagram: a vSphere cluster with three hosts, where VMs are migrated from the heavily loaded Host A to Hosts B and C for load balancing.

VMware vSphere Distributed Resource Scheduler (DRS) is a critical feature for dynamic resource management in virtualized environments. By intelligently balancing workloads across hosts, DRS ensures optimal performance and efficient use of resources. In this article, we’ll explore how DRS works, its benefits, and how to configure it in your environment.

 

Key Benefits of DRS

  1. Workload Balancing:
    Ensures that no host is overloaded, improving the overall cluster performance.
  2. Power Efficiency:
    Shuts down underutilized hosts during periods of low demand to save energy and reduce costs (via Distributed Power Management, DPM).
  3. Improved VM Performance:
    Allocates resources dynamically to prevent resource contention.
  4. Operational Simplicity:
    Automates the process of balancing workloads, reducing manual intervention.

How Does DRS Work?

  1. Resource Monitoring:
    DRS continuously monitors CPU, memory, and other resources across the cluster.
  2. vMotion:
    Uses VMware vMotion to migrate VMs between hosts without downtime.
  3. Dynamic Thresholds:
    Balances workloads based on thresholds defined by the user (e.g., conservative vs. aggressive).

Configuring DRS in vSphere

  1. Enable DRS:
    • In the vCenter Server, right-click on your cluster and go to “Settings.”
    • Enable DRS and set the automation level (manual, partially automated, or fully automated).
  2. Set Resource Pools (Optional):
    • Create resource pools to allocate resources to groups of VMs as needed.
  3. Define Rules:
    • Create affinity and anti-affinity rules to control VM placement.
  4. Monitor Performance:
    • Use the DRS dashboard to analyze the balancing actions and cluster performance.

Best Practices for Using DRS

  • Always enable vMotion alongside DRS for seamless migrations.
  • Regularly review automation levels to match operational requirements.
  • Test rules and thresholds in a controlled environment before applying them in production.
  • Combine DRS with VMware High Availability (HA) for enhanced fault tolerance.

 

Resolving ESXi Host CPU Overload Issues

An ESXi host experiencing CPU overload typically exhibits symptoms such as:

  • VMs becoming unresponsive or slow.
  • High CPU Ready Times in vSphere performance metrics.
  • Consistently maxed-out CPU usage in the host’s performance tab.

Common causes include:

  1. Oversized VMs: Allocating more vCPUs than needed.
  2. Resource Contention: Too many VMs competing for CPU resources.
  3. Misconfigured Resource Pools: Imbalanced resource allocation.
  4. Unoptimized Applications: Inefficient software consuming excessive CPU.
  5. Background Processes: Host-level tasks like backups or snapshots running during peak hours.

Steps to Troubleshoot and Resolve

  1. Analyze Performance Metrics
  • In the vSphere Client, go to Monitor > Performance for the affected host or VMs.
  • Look for:
    • CPU Usage (%): High values indicate overload.
    • CPU Ready (%): High values (above 5%) indicate VMs waiting too long for CPU.
    • Co-Stop (%): High values indicate vCPU scheduling issues.
  2. Optimize VM Configurations
  • Reduce the number of vCPUs allocated to each VM unless absolutely necessary. Many applications perform well with fewer vCPUs.
  • Power off unused or idle VMs to free up resources.
  3. Check and Reconfigure Resource Pools
  • Review resource pools to ensure proper allocation.
  • Avoid strict limits unless required, as they can starve VMs of CPU during peak loads.
  4. Balance Workloads Across Hosts
  • Use vMotion to migrate high-load VMs to hosts with spare CPU capacity.
  • Enable DRS (Distributed Resource Scheduler), if available, to balance workloads automatically.
  5. Address Application-Level Issues
  • Identify high-CPU-consuming processes within the VMs.
  • Work with application owners to optimize software settings or update inefficient programs.
  6. Update ESXi and Guest OS Drivers
  • Ensure that the ESXi host and VMware Tools are updated to the latest versions. Outdated software can lead to inefficient CPU usage.
  7. Monitor Background Tasks
  • Stagger resource-intensive tasks such as backups, virus scans, or snapshots to run during off-peak hours.
  8. Add Host Resources
  • If the cluster consistently runs at high capacity, consider adding more hosts or upgrading the existing hardware to handle increased demand.
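As a quick illustration of the CPU Ready rule of thumb from step 1, a short shell filter can flag VMs above the 5% threshold. The VM names and values here are mocked; real numbers would come from esxtop in batch mode or the vSphere performance charts:

```shell
# Mocked "vmname ready_percent" samples; on a real host, export them with
# esxtop in batch mode (esxtop -b -n 5 > /tmp/esxtop.csv) and extract the
# per-VM %RDY values
printf 'vm-web 2.1\nvm-db 7.4\nvm-app 4.9\n' |
  awk '$2 > 5 { print $1 " exceeds the 5% CPU Ready threshold" }'
```

With the sample values above, only `vm-db` is reported.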

Preventive Measures

  • Monitor Regularly: Use vRealize Operations or another monitoring tool to proactively track resource usage.
  • Enable DRS: Automate load balancing to prevent bottlenecks.
  • Right-Size VMs: Periodically evaluate and adjust vCPU and memory allocations based on actual usage patterns.
  • Reserve Resources Strategically: Use reservations for critical VMs but avoid over-reserving resources unnecessarily.
  • Plan Capacity: Regularly review cluster capacity to ensure it aligns with business needs and future growth.

How to Configure High Availability (HA) in VMware vSphere

High Availability (HA) in VMware vSphere is an essential feature to ensure service continuity in case of host or VM failures. This article will guide you through the steps to configure HA in vSphere and highlight best practices for maintaining a robust and reliable environment.

Before enabling HA on a cluster, ensure the following requirements are met:

  • Proper vSphere licensing.
  • A configured cluster in vSphere.
  • A stable and redundant management network.
  • Shared storage accessible by all hosts in the cluster.

Step-by-Step Guide to Configure HA in vSphere

  1. Access the vCenter Server:
    • Log in to the vCenter interface.
  2. Create or Select a Cluster:
    • Right-click on the Datacenter, select “New Cluster,” and set up the cluster.
  3. Enable High Availability:
    • In the cluster menu, click “Configure.”
    • Under “Availability,” click “Edit.”
    • Enable HA, set failover policies, and click “OK.”
  4. Add Hosts to the Cluster:
    • Drag and drop hosts into the cluster or add them manually.
  5. Test the Configuration:
    • Simulate a failure to confirm that VMs automatically restart on another host.

Best Practices for Configuring HA

  • Configure multiple paths for management networks.
  • Use Distributed Resource Scheduler (DRS) with HA for load balancing.
  • Regularly monitor the cluster to detect issues before failures occur.

“Scan or remediation is not supported on … because of unsupported OS” for certain operating systems

 

VMware published this workaround so you can manually add the operating system to the list of guests supported by VMware Update Manager.

  1. Connect to your vCenter Server Appliance via SSH and log in.
  2. Create a backup of the vci-integrity.xml file:
mkdir /backup && cp /usr/lib/vmware-updatemgr/bin/vci-integrity.xml /backup/
  3. Open the vci-integrity.xml file in the vi editor:
vi /usr/lib/vmware-updatemgr/bin/vci-integrity.xml
  4. Locate the <vci_vcIntegrity> ….. </vci_vcIntegrity> section.
  5. Enter edit mode by pressing Insert or the letter i.
  6. Before the </vci_vcIntegrity> line, add the following lines, depending on the operating system configured in your virtual machine. If you are adding both versions of the same OS (e.g., Windows Server 2019 and 2022), place both guest IDs inside a single wrapper block rather than repeating the wrapper element.
  • For Debian 11 (32 bit):
    <supportedLinuxGuestIds>
      <debian11Guest/>
    </supportedLinuxGuestIds>
  • For Debian 11 (64 bit):
    <supportedLinuxGuestIds>
      <debian11_64Guest/>
    </supportedLinuxGuestIds>
  • For Red Hat Enterprise Linux 9 (64 bit):
    <supportedLinuxGuestIds>
      <rhel9_64Guest/>  
    </supportedLinuxGuestIds>

Other Linux distributions can hit the same issue; in that case, pick the matching ID from the list of all supported Linux guest OS IDs:

asianux3Guest
asianux3_64Guest
asianux4Guest
asianux4_64Guest
asianux5_64Guest
centosGuest
centos64Guest
coreos64Guest
debian4Guest
debian4_64Guest
debian5Guest
debian5_64Guest
debian6Guest
debian6_64Guest
debian7Guest
debian7_64Guest
debian8Guest
debian8_64Guest
oracleLinuxGuest
oracleLinux64Guest
rhel7Guest
rhel7_64Guest
rhel6Guest
rhel6_64Guest
rhel5Guest
rhel5_64Guest
rockylinux_64Guest
fedoraGuest
fedora64Guest
sles12Guest
sles12_64Guest
sles11Guest
sles11_64Guest
sles10Guest
sles10_64Guest
opensuseGuest
opensuse64Guest
ubuntuGuest
ubuntu64Guest
otherLinuxGuest
otherLinux64Guest
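Because a stray character in vci-integrity.xml can break Update Manager, it is worth confirming the edited file is still well-formed XML before restarting the service. The snippet below runs the check against a mocked fragment; on the VCSA you would point it at /usr/lib/vmware-updatemgr/bin/vci-integrity.xml instead (python3 availability on your appliance version is an assumption):

```shell
# Mock a fragment with the structure from the steps above; note that
# several guest IDs can sit inside one supportedLinuxGuestIds block
VCI=$(mktemp)
cat > "$VCI" <<'EOF'
<vci_vcIntegrity>
  <supportedLinuxGuestIds>
    <debian11_64Guest/>
    <rhel9_64Guest/>
  </supportedLinuxGuestIds>
</vci_vcIntegrity>
EOF

# A missing or mismatched tag makes this exit non-zero instead of printing
python3 -c 'import sys, xml.dom.minidom; xml.dom.minidom.parse(sys.argv[1])' "$VCI" \
  && echo "fragment is well-formed"
```

After a successful check on the appliance, restart Update Manager so the change takes effect, for example with `service-control --stop vmware-updatemgr && service-control --start vmware-updatemgr` (service name as used on recent VCSA releases).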

Round Robin ESXi

Recommended round-robin (VMW_PSP_RR) claim rules to add on your ESXi hosts, depending on your storage array:

  • Dell Compellent – esxcli storage nmp satp rule add -s "VMW_SATP_ALUA" -V "COMPELNT" -P "VMW_PSP_RR" -O "iops=3"
  • Dell EMC PowerMax – esxcli storage nmp satp rule add -s "VMW_SATP_SYMM" -V "EMC" -M "SYMMETRIX" -P "VMW_PSP_RR" -O "iops=1"
  • Huawei – esxcli storage nmp satp rule add -V HUAWEI -M XSG1 -s VMW_SATP_DEFAULT_AA -P VMW_PSP_RR -O iops=1 -c tpgs_off
  • IBM SVC – esxcli storage nmp satp set --default-psp VMW_PSP_RR --satp VMW_SATP_SVC
  • Dell EMC Unity – esxcli storage nmp satp rule add --satp "VMW_SATP_ALUA_CX" --vendor "DGC" --psp "VMW_PSP_RR" --psp-option "iops=1" --claim-option "tpgs_on"

A datastore may still appear (and refuse to unmount) because an active process is using it, or because an ISO is mounted or a snapshot exists on it.

First, select your cluster, right-click, and rescan the storage.

Second, in vCenter select the datastore and check which hosts it is still mounted on; you can also open the datastore’s VMs tab to check whether any VM still resides there.

Log in to the ESXi host over SSH and check the output of:

lsof | grep <name of datastore>

Then run esxcli storage filesystem list to find the datastore’s UUID, and check:

lsof | grep <storage uuid>

Next, replace the naa. ID with your datastore’s device ID and run:

vsish -e ls /storage/scsifw/devices/<naa. id of your datastore>/worlds/ | sed 's:/::' | while read i; do ps | grep $i; done

This maps the world IDs still attached to the device to their processes. Identify the lock-holding process, stop it, and then unmount the datastore.
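The while-loop in the vsish command above simply strips the trailing slash from each world ID and looks it up in the process list. Against mocked world IDs (real ones come from the vsish listing on the host), the pattern behaves like this:

```shell
# Mocked world-ID listing; vsish prints each entry with a trailing slash,
# which sed removes before the per-ID lookup
printf '1001/\n1002/\n' | sed 's:/::' | while read -r id; do
  # On the ESXi host this line would be:  ps | grep "$id"
  echo "would inspect world $id"
done
```

Each world ID that still holds the device open will then show up with its owning process in the `ps` output.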

 

Remove vSphere Replication flag from VM

In my case the customer had removed the vSphere Replication appliance, so I needed to remove the replication flag from all previously replicated VMs.

  1. Log in to the ESXi host over SSH.
  2. Execute this command to list the VM IDs:
    vim-cmd vmsvc/getallvms | awk '$3 ~ /^\[/ {print $1}'
  3. Execute this command to remove the flag (replace 1011 with your VM's ID):
    vim-cmd hbrsvc/vmreplica.disable 1011
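The awk filter in step 2 works because, in getallvms output, the third column is the VM's .vmx path, which always starts with a bracketed datastore name; rows that don't match (such as the header) are skipped. Against a mocked output line (names and IDs are illustrative):

```shell
# Mocked `vim-cmd vmsvc/getallvms` output: header plus one VM row; the
# File column (field 3) starts with "[datastore]", which awk keys on
printf 'Vmid Name File Guest\n1011 web01 [ds1]web01/web01.vmx rhel9_64Guest\n' |
  awk '$3 ~ /^\[/ {print $1}'
```

This prints `1011`, the ID you would then pass to `vmreplica.disable`.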