Designing Systems for Continuous Availability – Multi-Node with Block Storage

Speakers: Elden Christensen and Mallikarjun Chadalapaka

This session will focus on block-based storage. It’s a clustering session. It seems like failover clustering is not optimised for the cloud. *joking*

Sneak Peek at Failover Clustering
– scale up to 4,000 VMs in a cluster
– scale out to 63 nodes in a cluster
– 4 x more than W2008 R2

Note: more persistent reservations and SCSI-3 reservations to the SAN!

Multi-Machine Management with Server Manager, Featuring Cluster Integration
– Remote server management
– Server groups to manage sets of machines – single click to affect all nodes at once (nice!)
– Simplified management
– Launch clustering management from Server Manager

New Placement Policies
– Virtual Machine Priority: start with the most important VMs first (start backend first, then mid tier, then front tier). Ensure the most important VMs are running – shut down low priority VMs to allow high priority VMs to get access to constrained resources
– Enhanced Failover Placement: Each VM is placed on the node with the best available memory resources. Memory requirements are determined on a per-VM basis; the best node is found based on how Dynamic Memory (DM) is configured. NUMA aware.
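
Note: to make the two policies concrete, here is a minimal Python sketch of priority-ordered startup combined with best-available-memory placement. It is not the cluster's actual algorithm. The `VM` and `Node` types, the numeric priority scale, and `place_vms` are all hypothetical names made up for illustration, and it ignores NUMA, Dynamic Memory and the shutting down of low priority VMs.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class VM:
    name: str
    priority: int   # hypothetical scale: higher number = more important
    memory_mb: int  # memory the VM needs in order to start


@dataclass
class Node:
    name: str
    free_memory_mb: int


def place_vms(vms: List[VM], nodes: List[Node]) -> Dict[str, Optional[str]]:
    """Place VMs in priority order, each on the node with the most free memory.

    Returns a mapping of VM name to node name, or None if no node can fit the VM.
    """
    placement: Dict[str, Optional[str]] = {}
    # Most important VMs are considered first (backend before front tier).
    for vm in sorted(vms, key=lambda v: -v.priority):
        candidates = [n for n in nodes if n.free_memory_mb >= vm.memory_mb]
        if not candidates:
            placement[vm.name] = None  # constrained: this VM stays off
            continue
        best = max(candidates, key=lambda n: n.free_memory_mb)
        best.free_memory_mb -= vm.memory_mb
        placement[vm.name] = best.name
    return placement


if __name__ == "__main__":
    nodes = [Node("A", 8192), Node("B", 4096)]
    vms = [VM("web", 1, 4096), VM("sql", 3, 6144), VM("app", 2, 4096)]
    print(place_vms(vms, nodes))
```

In this sketch the low priority VM simply does not start when memory runs out; the real feature goes further and can shut down running low priority VMs to make room.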

VM Mobility
– Live Migration Queuing
– Storage Live Migration
– Concurrent Live Migrations – multiple simultaneous LMs for a given source or target
– Hyper-V Replica is integrated with clustering

Cluster Management
Demo: The demo cluster has 4001 VMs and 63 nodes (RDP into Redmond). In the FCM, it is smooth and fast. You can see the priority of each VM. You can search for VMs with basic and complex queries. The thumbnail of the VM is shown in the FCM.

Guest Clustering – Increased Storage Support
– Most common scenario is SQL Server
– Previously this could only be done with iSCSI. Now we have a virtual Fibre Channel HBA

VM Monitoring
– Application level recovery: Service Control Manager or event triggered
– Guest Level HA Recovery – FC reboots the VM
– Host level HA recovery – FC fails over VM to another node
– Generic health monitoring for any application: Service Control Manager and generation of specific event IDs
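
Note: the three recovery levels read like an escalation ladder. Here is a small Python sketch of that idea. The escalation order (service restart first, then VM reboot, then failover) and the one-step-per-failure rule are my assumptions, not something stated in the session, and `next_recovery_action` is a made-up name.

```python
# Recovery levels from the session, ordered from least to most disruptive.
# ASSUMPTION: each consecutive failure escalates one step up the ladder.
RECOVERY_LADDER = [
    "restart service (application level)",
    "reboot VM (guest level)",
    "fail over VM to another node (host level)",
]


def next_recovery_action(failure_count: int) -> str:
    """Return the recovery action for the Nth consecutive failure (1-based)."""
    if failure_count < 1:
        raise ValueError("failure_count must be at least 1")
    # Stay on the last rung once the ladder is exhausted.
    index = min(failure_count, len(RECOVERY_LADDER)) - 1
    return RECOVERY_LADDER[index]


if __name__ == "__main__":
    for n in range(1, 5):
        print(n, next_recovery_action(n))
```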

VM Monitoring VS Guest Clustering
– VM Monitoring: Application monitoring, simplified configuration and event monitoring – good for tier 2 apps
– Guest clustering: application health monitoring, application mobility (for scheduled maintenance) – still for tier 1 apps

Automated Node Draining
Like VMM maintenance mode. Click a node to drain it of hosted roles (VMs).

Cluster Aware Updating
CAU updates all cluster nodes in an automated fashion without impacting service availability. It is an end-to-end orchestration of updates. Built on top of WUA. Patching does not impact cluster quorum. Workflow:

– Scan nodes to ID appropriate updates
– ID node with fewest workloads
– Place node into maintenance mode to drain
– WSUS update
– Rinse and repeat

The workloads return to their original node at the end of the process.
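
Note: the workflow above can be simulated in a few lines of Python. This is only a sketch of the orchestration order as described in the session, not how CAU is implemented. `cluster_aware_update` is a hypothetical function, the "drain to the least loaded other node" rule is my assumption, and the patch step itself is just recorded rather than performed.

```python
from typing import Dict, List, Tuple


def cluster_aware_update(
    workloads_by_node: Dict[str, List[str]]
) -> Tuple[List[str], Dict[str, List[str]]]:
    """Simulate one updating run. Returns (patch order, final workload layout)."""
    if len(workloads_by_node) < 2:
        raise ValueError("need at least two nodes to drain one of them")

    original = {n: list(w) for n, w in workloads_by_node.items()}
    current = {n: list(w) for n, w in workloads_by_node.items()}
    patched: List[str] = []

    while len(patched) < len(current):
        # Pick the not-yet-patched node with the fewest workloads.
        pending = [n for n in current if n not in patched]
        node = min(pending, key=lambda n: len(current[n]))

        # Maintenance mode: drain each workload to the least loaded other node
        # (ASSUMPTION: the session did not say how drain targets are chosen).
        others = [n for n in current if n != node]
        for workload in current[node]:
            target = min(others, key=lambda n: len(current[n]))
            current[target].append(workload)
        current[node] = []

        # The update itself would be applied here; we only record the order.
        patched.append(node)

    # At the end of the run the workloads return to their original nodes.
    final = {n: list(w) for n, w in original.items()}
    return patched, final


if __name__ == "__main__":
    layout = {"N1": ["vm1", "vm2"], "N2": ["vm3"], "N3": ["vm4", "vm5", "vm6"]}
    order, final = cluster_aware_update(layout)
    print(order)
    print(final)
```

Only one node is out of service at a time, which is the point: the remaining nodes keep the workloads running while each node is patched in turn.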

Note: The machine managing this is called the orchestrator. That might be a little confusing because SC Orchestrator can do this stuff too.
Note: I wonder how well this will play with updates in VMM 2012?

There is extensibility to include firmware, BIOS, etc, via updates, via 3rd party plugin.

Demo: Streaming video from a HA VM. The cluster is updated, the workflow runs, and the videos stay running. The wizard gives you the PSH. You can save that and schedule it. No dedicated WSUS needed by the looks of it.

Cluster Shared Volume
Redirected I/O is b-a-d.

Windows Server 8: Improved backup / restore of CSV. Expanded CSV to include more roles. CSV expands out to 63 nodes. Enables zero downtime for planned and unplanned failures of SMB workloads. Provides interoperability with file system mini-filter drivers (a/v and backup), and lots more.

CSV no longer needs to be enabled. Just right click on a disk to make it a CSV. The file system now appears as CSVFS. It is NTFS under the covers. This enables applications to know they are on CSV and ensure their compatibility.

AV, Continuous data protection, backup and replication all use filter drivers to insert themselves in the CSV pseudo-file system stack.

High speed CSV I/O redirection will have negligible impact. CSV is integrated with SMB multi-channel. This allows streaming CSV traffic across multiple networks and delivers improved performance when in redirected mode. CSV takes advantage of SMB 2 Direct and RDMA.

BitLocker is now supported on traditional shared nothing disks and CSV. The Cluster Name Object (CNO) ID is used.

Cluster Storage Requirements Are:
– FC
– Storage Spaces
– FCoE

Data Replication storage requirements:
– Hardware
– Software replication
– Application Replication (Exchange, SQL Denali AlwaysOn)

SCSI Command requirements: storage must support SCSI-3 SPC-3 compliant SCSI Commands.

Cost Effective & Scale Out with Storage Spaces. Integrated and supported by clustering and CSV.

Redirected I/O is normally file level. There is now a block level variant – not covered in this talk.

What if your Storage Spaces servers were in the same cluster as the Hyper-V hosts? High speed block level redirected IO. Simplified management. Single CSV namespace accessible on all nodes. Unified security model. Single cluster to manage. VMs can run anywhere.

Note: Wow!

Called an asymmetric configuration.

CSV Backup
Support for parallel backups on same or different CSV volumes, or on same or different cluster nodes. Improved I/O performance. Direct IO mode for snapshot and backup operations. (!!!) Software snapshots will stay in direct IO mode. (!!!!) CSV volume ownership does not change during backup. Improved filter driver support for incremental backups. Backup applications do not need to be CSV aware. Fully compatible with W2008 R2 “requestors”.

Distributed App-Consistent VM Shadow Copies:
Say you have a LUN with VMs scattered across lots of hosts. You can now snap the entire LUN using an orchestrated snapshot.

Comparing Backup With W2008 R2
– Backup app: W2008 R2 requires CSV aware backup app
– IO performance: No redirected IO for backup
– Locality of CSV volume: Snapshot can be created on any node
– Complexity: Cluster coordinates the backup process

Note: I’m still trying to get over that we stay in direct IO during a system VSS provider backup of a CSV.

Cluster.exe is deprecated. Not there by default but you can install it in Server Manager. Use PSH instead.

SCSI Inquiry Data (page 83h) is now changed from recommended to required.
