vSAN is a storage solution from VMware, released as a beta in 2013, made generally available in March 2014, and updated to version 6.5 in November 2016. vSAN is fully integrated with vSphere. It is an object-based storage system and a platform for Virtual Machine Storage Policies that aims to simplify virtual machine storage placement decisions for vSphere administrators. It fully supports, and is integrated with, core vSphere features such as vSphere High Availability (HA), vSphere Distributed Resource Scheduler (DRS), and vMotion.
vSAN 6.6 requires ESXi 6.5d and vCenter Server 6.5d. vSAN can be managed by both the Windows version of vCenter Server and the vCenter Server Appliance (VCSA).
vSAN is configured and monitored via the vSphere Web Client, which also needs to be version 6.5d.
vSAN requires at least 3 vSphere hosts contributing local storage in order to form a supported vSAN cluster. This minimum allows the cluster to tolerate at least one host failure; with fewer hosts, the availability of virtual machines is at risk if a single host goes down. The vSphere hosts must be running vSphere 6.5. The maximum number of hosts supported is 64.
DISK AND NETWORK
IMPORTANT: All components (hardware, drivers, firmware) must be listed on the VMware Compatibility Guide for vSAN. All other configurations are unsupported.
• Hybrid disk group configuration: at least one flash cache device and one or more SAS, NL-SAS, or SATA magnetic disks.
• All-flash disk group configuration: one SAS or SATA solid-state disk (SSD) or PCIe flash device used for caching, and one or more flash devices used for capacity.
• In a vSAN 6.5 hybrid cluster, the SSD provides both a write buffer (30%) and a read cache (70%). The more SSD capacity in the host, the greater the performance, since more I/O can be cached.
• In a vSAN all-flash cluster, 100% of the cache is allocated for writes; read performance from the capacity flash tier is more than sufficient.
• Not every node in a vSAN cluster needs to have local storage although a balanced configuration is recommended. Hosts with no local storage can still leverage the distributed vSAN datastore.
• Each host must have minimum bandwidth dedicated to vSAN: 1 GbE for hybrid configurations, 10 GbE for all-flash configurations.
• A Distributed Switch can be optionally configured between all hosts in the vSAN cluster, although VMware Standard Switches (VSS) will also work.
• A vSAN VMkernel port must be configured for each host. With a Distributed Switch, Network I/O Control can also be enabled to dedicate bandwidth to the vSAN network.
• In versions prior to vSAN 6.6, Layer 2 multicast must be enabled on the physical switch that handles vSAN traffic; vSAN 6.6 and later use unicast instead.
• Version 6.2 and later of vSAN support IPv4-only configurations, IPv6-only configurations, and also configurations where both IPv4 & IPv6 are enabled. This addresses requirements for customers moving to IPv6 and, additionally, supports mixed mode for migrations.
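As a rough illustration of the hybrid cache split described above, the 70/30 read/write allocation for a given cache device can be sketched as follows (the 600 GB device size is a made-up example, not a recommendation):

```python
def hybrid_cache_split(cache_device_gb):
    """Split a hybrid vSAN cache device: 70% read cache, 30% write buffer."""
    read_cache_gb = cache_device_gb * 0.70
    write_buffer_gb = cache_device_gb * 0.30
    return read_cache_gb, write_buffer_gb

# Hypothetical 600 GB cache SSD
read_gb, write_gb = hybrid_cache_split(600)
print(f"Read cache: {read_gb:.0f} GB, write buffer: {write_gb:.0f} GB")
```

In an all-flash cluster the same device would instead be used entirely as a write buffer, as noted above.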
The VMkernel port is labeled vSAN. This port is used for intra-cluster node communication, and for reads and writes when one of the vSphere hosts in the cluster owns a particular virtual machine but the actual data blocks making up the virtual machine's files are located on a different vSphere host in the cluster. In that case, I/O must traverse the network configured between the hosts in the cluster.
STORAGE POLICY BASED MANAGEMENT
A brief description of each of the storage policy settings follows.
Number of disk stripes per object – The number of capacity devices across which each replica of a virtual machine object is striped. A value higher than 1 might result in better performance, but also results in higher use of system resources.
Flash read cache reservation – Flash capacity reserved as read cache for the virtual machine object. Specified as a percentage of the logical size of the virtual machine disk (vmdk) object. Reserved flash capacity cannot be used by other objects. Unreserved flash is shared fairly among all objects. This option should be used only to address specific performance issues.
Primary level of failures to tolerate – For non-stretched clusters, defines the number of disk, host, or fault domain failures a storage object can tolerate. For n failures tolerated, n+1 copies of the virtual machine object are created and 2n+1 hosts contributing storage are required.
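The arithmetic above can be expressed directly. The function below is a sketch of the relationship, not a vSAN API:

```python
def ftt_requirements(n):
    """For n failures to tolerate with mirroring:
    n+1 replicas are created, and 2n+1 hosts contributing storage
    are needed (the extra hosts hold witness components for quorum)."""
    if n < 0:
        raise ValueError("failures to tolerate must be >= 0")
    return {"replicas": n + 1, "min_hosts": 2 * n + 1}

print(ftt_requirements(1))  # {'replicas': 2, 'min_hosts': 3}
```

Tolerating one failure therefore matches the three-host minimum noted at the start of this section.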
Force provisioning – If the option is set to Yes, the object will be provisioned even if the policy specified in the storage policy is not satisfiable by the datastore. Use this parameter in bootstrapping scenarios and during an outage when standard provisioning is no longer possible.
Object space reservation – Percentage of the logical size of the virtual machine disk (vmdk) object that should be reserved (thick provisioned) when deploying virtual machines.
Disable object checksum – If the option is set to No, the object calculates checksum information to ensure the integrity of its data. If this option is set to Yes, the object will not calculate checksum information. Checksums ensure the integrity of data by confirming that each copy of a file is exactly the same as the source file. If a checksum mismatch is detected, Virtual SAN automatically repairs the data by overwriting the incorrect data with the correct data.
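To illustrate the detect-and-repair idea (not vSAN's actual implementation), a checksum comparison can be sketched as below; Python's CRC-32 is used purely for demonstration:

```python
import zlib

def verify_and_repair(source: bytes, copy: bytes) -> bytes:
    """Compare checksums of two copies of a block; on a mismatch,
    'repair' the copy by overwriting it with the correct source data."""
    if zlib.crc32(copy) != zlib.crc32(source):
        return source  # mismatch detected: overwrite with correct data
    return copy

good = b"virtual machine data block"
corrupted = b"virtual machine data blocK"
repaired = verify_and_repair(good, corrupted)
print(repaired == good)
```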
Failure tolerance method – Specifies whether the data replication method optimizes for Performance or Capacity. If you choose Performance, Virtual SAN uses more disk space to place the components of objects but provides better performance for accessing the objects. If you select Capacity, Virtual SAN uses less disk space, but reduces the performance.
IOPS limit for object – Defines the IOPS limit for a disk. IOPS is calculated as the number of IO operations, using a weighted size. If the system uses the default base size of 32KB, then a 64KB IO represents two IO operations. When calculating IOPS, read and write are considered equivalent, while cache hit ratio and sequentiality are not considered. If a disk’s IOPS exceeds the limit, IO operations will be throttled. If the IOPS limit for object is set to 0, IOPS limits are not enforced.
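The weighted-size calculation can be sketched as follows, using the document's 32KB base size; each I/O counts as the number of base-size chunks it spans, and reads and writes are weighted identically:

```python
import math

def normalized_iops(io_sizes_kb, base_kb=32):
    """Count each I/O as ceil(size / base) operations: with a 32 KB
    base, a 64 KB I/O counts as two operations, while anything at or
    below 32 KB counts as one."""
    return sum(math.ceil(size / base_kb) for size in io_sizes_kb)

# One 64 KB I/O (2 ops) plus one 4 KB I/O (1 op) = 3 normalized operations
print(normalized_iops([64, 4]))  # 3
```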
STORAGE POLICY BASED MANAGEMENT – RAID 5/6 (ERASURE CODING)
Note that there is a requirement on the number of hosts needed to implement RAID-5 or RAID-6 configurations on vSAN.
For RAID-5, a minimum of 4 hosts are required; for RAID-6, a minimum of 6 hosts are required.
The objects are then deployed across the storage on each of the hosts, along with a parity calculation. The configuration uses distributed parity, so there is no dedicated parity disk. When a failure occurs in the cluster, and it impacts the objects that were deployed using RAID-5 or RAID-6, the data is still available and can be calculated using the remaining data and parity if necessary.
A new policy setting has been introduced to accommodate the new RAID-5/RAID-6 configurations.
This new policy setting is called Failure Tolerance Method. This policy setting takes two values: performance and capacity. When it is left at the default value of performance, objects continue to be deployed with a RAID-1/mirror configuration for the best performance. When the setting is changed to capacity, objects are now deployed with either a RAID-5 or RAID-6 configuration.
The RAID-5 or RAID-6 configuration is determined by the number of failures to tolerate setting. If this is set to 1, the configuration is RAID-5; if this is set to 2, the configuration is RAID-6.
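The capacity savings can be made concrete. vSAN implements RAID-5 as a 3+1 layout and RAID-6 as a 4+2 layout, so the raw capacity consumed for a given usable size works out as sketched below (illustrative arithmetic only):

```python
def raw_capacity_needed(usable_gb, ftt, method="capacity"):
    """Raw capacity consumed by a vSAN object.
    RAID-1 (performance): ftt + 1 full copies.
    RAID-5 (capacity, ftt=1): 3 data + 1 parity -> 1.33x.
    RAID-6 (capacity, ftt=2): 4 data + 2 parity -> 1.5x."""
    if method == "performance":          # RAID-1 mirroring
        return usable_gb * (ftt + 1)
    if ftt == 1:                         # RAID-5, minimum 4 hosts
        return usable_gb * 4 / 3
    if ftt == 2:                         # RAID-6, minimum 6 hosts
        return usable_gb * 6 / 4
    raise ValueError("erasure coding supports ftt of 1 or 2 only")

# A 100 GB object, ftt=1: RAID-5 vs RAID-1
print(raw_capacity_needed(100, 1))                 # ~133 GB
print(raw_capacity_needed(100, 1, "performance"))  # 200 GB
```

This is the trade-off the Failure Tolerance Method setting controls: the same object consumes roughly 133 GB with erasure coding versus 200 GB mirrored.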