AWS EC2 High Availability for SQL Server: Waive LI on Passive Nodes to Cut Costs

AWS has introduced a practical, low-friction way to cut the licensing bill for Microsoft SQL Server Always On deployments on EC2: it automatically waives the SQL Server License Included (LI) charge for passive standby nodes, potentially reducing total HA cost by up to about 40% in the worked example AWS provides for a pair of identical instances.

Background​

Enterprises running mission‑critical SQL Server workloads commonly deploy Always On Availability Groups (AGs) or Failover Cluster Instances (FCI) to meet strict availability SLAs. That architecture usually requires at least two EC2 hosts: an active (primary) node that serves application traffic and a passive (standby) node that maintains synchronized copies and can fail over when needed.
Traditionally, when using License Included (LI) images on EC2, every SQL Server instance is billed for SQL Server licensing on a per‑vCPU basis (subject to Microsoft’s minimums). AWS’s new EC2 High Availability for SQL Server capability recognizes the passive role, automates detection of active/passive state using Systems Manager, and stops charging the SQL Server LI fee for the passive node while continuing to bill for EC2 compute and Windows Server licensing. The result: a meaningful reduction in total HA cost for license-included deployments when conditions are met.
This article explains how the feature works, the technical and licensing prerequisites, a realistic cost model and example, operational and security implications, and practical guidance for production adoption — plus the risks and audit traps to avoid.

Overview: how the feature works​

The principle​

  • One EC2 instance is designated as the active node (runs the production SQL workloads).
  • The other node is passive — it replicates data but does not serve external SQL client traffic or run active workloads.
  • AWS uses AWS Systems Manager (SSM) to run an SSM Run Command document (AWSEC2-DetectSqlHaState) on each EC2 instance to determine whether the SQL Server instance is active or standby.
  • When SSM reports an instance as standby, EC2 billing for SQL Server License Included for that instance is waived automatically — you still pay for EC2 compute and Windows Server LI (if using LI Windows).
  • When failover occurs and the standby becomes active, billing updates automatically so SQL licensing applies to the new active instance.
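The billing behavior in the list above can be modelled in a few lines. This is a minimal sketch, not AWS's implementation: the `Node` class, the state strings, and the `billed_items` and `fail_over` helpers are hypothetical names chosen for illustration, and treating every state other than standby as billable is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    ha_state: str  # hypothetical labels: "active" or "standby"


def billed_items(node):
    """Return the line items billed for a node under the model in the text."""
    items = ["EC2 compute", "Windows Server LI"]
    # Assumption: only a node reported as standby has its SQL Server LI waived.
    if node.ha_state != "standby":
        items.append("SQL Server LI")
    return items


def fail_over(active, standby):
    """Swap roles, as happens when the standby takes over."""
    active.ha_state, standby.ha_state = "standby", "active"


primary = Node("node-a", "active")
secondary = Node("node-b", "standby")
print(billed_items(primary))    # includes SQL Server LI
print(billed_items(secondary))  # SQL Server LI waived
fail_over(primary, secondary)
print(billed_items(secondary))  # now includes SQL Server LI
```

Compute and Windows Server LI stay in every result; only the SQL Server LI item follows the reported role.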

Key automation pieces​

  • SSM Agent: must be installed and online on target EC2 instances.
  • SSM Run Command (AWSEC2-DetectSqlHaState): inspects SQL Server HA metadata and reports status.
  • IAM role/policy: instances require an IAM instance role containing the AWS‑managed policy (or equivalent permissions) used by the EC2 High Availability workflow.
  • Optional Secrets Manager integration: if the server’s default SYSTEM account is restricted, a SQL login stored in Secrets Manager can be used by the SSM document; that login needs a tightly scoped set of metadata permissions.

Supported configurations and hard constraints​

Supported HA topologies​

  • Always On Availability Groups (AGs), provided the secondary is not a readable secondary; readable secondaries do not qualify for the passive-waive benefit.
  • Always On Failover Cluster Instances (FCI).

Supported software and environment​

  • Microsoft SQL Server: 2017 or newer, Standard or Enterprise editions only.
  • Windows Server: 2019 or newer (the feature supports SQL Server LI on Windows Server only).
  • PowerShell: Windows PowerShell 5.1 or later for management tasks.
  • Only two-node clusters are supported by the EC2 console flow (active + single standby).
  • Multi-AZ deployments within the same region are supported; cross-region deployments are not.

Licensing constraints you must observe​

  • The passive node must meet Microsoft’s passive failover rules to qualify. It must:
      • not serve incoming client traffic,
      • not run active SQL Server workloads,
      • not be a readable secondary (with narrow database exceptions such as master, msdb, tempdb, model),
      • not host standalone databases outside the AG (for AG deployments).
  • The passive instance must be the same size or smaller than the active instance in vCPU count.
  • SQL Server LI is billed per vCPU per second and is subject to Microsoft’s four-core minimum: instances smaller than four vCPUs are still billed for four SQL cores.
  • Windows Server LI costs still apply for the passive node and are charged per vCPU per second with no minimum.
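The constraints above lend themselves to a pre-flight check. The sketch below is an illustration only: `NodeConfig`, its field names, and `passive_violations` are hypothetical, and the flags would have to be filled in from your own inventory of the cluster rather than from any AWS API.

```python
from dataclasses import dataclass
from typing import List

SQL_CORE_MINIMUM = 4  # Microsoft's four-core minimum, as stated in the text


@dataclass
class NodeConfig:
    vcpus: int
    serves_client_traffic: bool = False
    runs_active_workloads: bool = False
    readable_secondary: bool = False
    hosts_standalone_databases: bool = False


def billed_sql_cores(vcpus: int) -> int:
    """SQL Server LI cores billed for an instance with the given vCPU count."""
    return max(vcpus, SQL_CORE_MINIMUM)


def passive_violations(active: NodeConfig, passive: NodeConfig) -> List[str]:
    """List the reasons a standby would fail the passive rules in the text."""
    problems = []
    if passive.serves_client_traffic:
        problems.append("passive node serves client traffic")
    if passive.runs_active_workloads:
        problems.append("passive node runs active SQL Server workloads")
    if passive.readable_secondary:
        problems.append("passive node is a readable secondary")
    if passive.hosts_standalone_databases:
        problems.append("passive node hosts standalone databases outside the AG")
    if passive.vcpus > active.vcpus:
        problems.append("passive node has more vCPUs than the active node")
    return problems


print(billed_sql_cores(2))  # 4
print(passive_violations(NodeConfig(vcpus=16), NodeConfig(vcpus=32)))
```

An empty list means none of the listed rules is violated according to the flags you supplied; it is not a licensing determination.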

Step-by-step: enabling EC2 High Availability for SQL Server (console flow)​

  • Confirm prerequisites:
      • SSM Agent is installed and reporting online on all cluster nodes.
      • Attach an IAM role that includes the AWS-managed policy (or equivalent permissions) required for the EC2 SQL HA workflow.
      • Confirm that the SQL Server and Windows Server versions are on the supported list (SQL Server 2017+, Windows Server 2019+).
      • Make sure the cluster topology is two nodes and that the passive node configuration meets Microsoft’s passive requirements.
  • In the EC2 console:
      • Select the EC2 instances that are part of your SQL Server HA deployment (include both primary and secondary nodes).
      • Choose Actions → Instance settings → Modify SQL High Availability settings.
  • Follow the wizard:
      • Step 1: Review prerequisites (SSM and IAM).
      • Step 2: Choose to enable the feature for the selected instances and optionally supply SQL credentials stored in Secrets Manager if the SYSTEM account is restricted.
      • Step 3: Review and apply.
  • Verify:
      • After enabling, a new SQL High Availability tab appears in the instance details. The tab shows HA status (Active/Standby) and SQL license usage (Full vs Waived).
      • Monitor SSM Run Command logs and Systems Manager inventory to confirm the AWSEC2-DetectSqlHaState document runs successfully and reports accurate state.
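The first prerequisite, SSM Agent reporting online on every node, can also be checked from a script. The sketch below assumes the boto3 Systems Manager client and its `describe_instance_information` call; the helper name `offline_nodes`, the stub class, and the instance IDs are hypothetical. The client is passed in so the example runs without AWS access, and pagination is ignored because only two nodes are involved.

```python
def offline_nodes(ssm_client, instance_ids):
    """Return the instance IDs that Systems Manager does not report as Online."""
    response = ssm_client.describe_instance_information(
        Filters=[{"Key": "InstanceIds", "Values": list(instance_ids)}]
    )
    online = {
        info["InstanceId"]
        for info in response["InstanceInformationList"]
        if info.get("PingStatus") == "Online"
    }
    return [iid for iid in instance_ids if iid not in online]


class StubSsmClient:
    """Stand-in for boto3.client("ssm") so the sketch runs offline."""

    def describe_instance_information(self, Filters):
        return {
            "InstanceInformationList": [
                {"InstanceId": "i-0aaaaaaaaaaaaaaaa", "PingStatus": "Online"},
                {"InstanceId": "i-0bbbbbbbbbbbbbbbb", "PingStatus": "ConnectionLost"},
            ]
        }


nodes = ["i-0aaaaaaaaaaaaaaaa", "i-0bbbbbbbbbbbbbbbb"]
print(offline_nodes(StubSsmClient(), nodes))  # ['i-0bbbbbbbbbbbbbbbb']
```

Against a real account you would pass `boto3.client("ssm")` instead of the stub; a node missing from the response is treated as offline.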

Required SQL permissions when using Secrets Manager​

If you must provide SQL credentials (for environments where NT AUTHORITY\SYSTEM is disabled), the SQL login used by the SSM document needs only metadata-level read permissions:
  • VIEW DATABASE STATE
  • VIEW SERVER STATE
  • VIEW ANY DEFINITION
  • VIEW ANY DATABASE
Grant only these minimal permissions to the dedicated service login and rotate the secret regularly. Store the credential in AWS Secrets Manager and restrict access to the SSM document and instance role.
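As a rough illustration, the grants above could be generated as T-SQL from a small script. This is a sketch under assumptions: the login name `ssm_ha_detect` and the helper names are hypothetical, the server-level statements are meant to be run in master, and the database-level grant assumes a database user with the same name as the login already exists in each listed database. Review the output with your DBA before running it.

```python
SERVER_PERMISSIONS = ("VIEW SERVER STATE", "VIEW ANY DEFINITION", "VIEW ANY DATABASE")


def quote_name(identifier):
    """Bracket-quote a SQL Server identifier, escaping any closing bracket."""
    return "[" + identifier.replace("]", "]]") + "]"


def grant_statements(login, databases=()):
    """Build GRANT statements for the metadata-only permissions in the text."""
    principal = quote_name(login)
    statements = [f"GRANT {perm} TO {principal};" for perm in SERVER_PERMISSIONS]
    for db in databases:
        # Assumption: a user named like the login exists in this database.
        statements.append(
            f"USE {quote_name(db)}; GRANT VIEW DATABASE STATE TO {principal};"
        )
    return statements


for statement in grant_statements("ssm_ha_detect", ["SalesDb"]):
    print(statement)
```

Generating the statements rather than typing them by hand keeps the permission set identical across both nodes.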

Cost model, example and math you can run yourself​

The cost driver​

Total HA cost for a two‑node SQL Server cluster on EC2 = compute + Windows Server license + SQL Server license (LI) for each node, minus any waived charges for passive nodes.
Because SQL Server licensing typically represents a large chunk of the charge for license-included instances, waiving the SQL Server LI for the passive node delivers the most meaningful savings compared to waiving compute or Windows costs.

Illustrative example (conceptual)​

  • Two identical EC2 instances (active and standby).
  • When both are billed for SQL LI, the SQL license line item is present on both nodes.
  • With the passive-waive feature enabled, the active node is billed for SQL LI; the standby node is not billed for SQL LI — you still pay compute and Windows LI on the standby.
AWS provides a worked example where two m8i.4xlarge instances with Windows + SQL LI yield up to ~40% lower total HA cost when the passive SQL license is waived. Exact savings vary by:
  • SQL edition (Enterprise vs Standard),
  • instance family and size (vCPU count matters),
  • region pricing,
  • whether you use On‑Demand, Savings Plans, or Reserved Instances.

How to calculate expected savings quickly​

  • Determine hourly on‑demand rates for:
      • EC2 compute for your chosen instance type,
      • Windows Server LI per vCPU,
      • SQL Server LI per vCPU for your edition.
  • Compute per‑node total (compute + Windows LI + SQL LI).
  • Compute HA pair total when both nodes billed.
  • Compute HA pair total when standby SQL LI is waived (standby pays compute + Windows only).
  • Percent savings = (both_billed_total − waived_total) / both_billed_total × 100.
This is the method behind the advertised “up to 40%” claim; your actual percentage will depend on how high the SQL LI component is compared to compute + Windows charges for the chosen instance type.
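The calculation above is easy to script. In the sketch below the function names are hypothetical and the rates are placeholders, not real AWS prices; each rate is the hourly cost for the whole instance, so multiply per-vCPU license rates by the vCPU count before passing them in.

```python
def node_hourly_cost(compute, windows_li, sql_li, sql_waived=False):
    """Hourly cost of one node; the SQL LI component is dropped when waived."""
    return compute + windows_li + (0.0 if sql_waived else sql_li)


def ha_savings_percent(compute, windows_li, sql_li):
    """Percent saved on a pair of identical nodes when the standby SQL LI is waived."""
    both_billed_total = 2 * node_hourly_cost(compute, windows_li, sql_li)
    waived_total = node_hourly_cost(compute, windows_li, sql_li) + node_hourly_cost(
        compute, windows_li, sql_li, sql_waived=True
    )
    return (both_billed_total - waived_total) / both_billed_total * 100


# Placeholder hourly rates for illustration only (NOT real AWS prices).
print(ha_savings_percent(compute=1.0, windows_li=0.5, sql_li=1.5))  # 25.0
```

For identical nodes the saving reduces to the SQL LI rate divided by twice the per-node total, so it can approach but never reach 50%. The larger the SQL LI share of the per-node cost, the closer the result gets to that ceiling.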

Operational considerations and best practices​

Monitoring and observability​

  • Enable CloudWatch and SSM logging to capture the run results of AWSEC2-DetectSqlHaState.
  • Keep a retention trail proving that a node was passive during the billing periods that the waiver applied. This helps in licensing audits.
  • Integrate an automated alert if the SSM document fails to run or reports an unexpected state.
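One way to build the retention trail mentioned above is to reduce logged detection results to the periods in which a node was confirmed passive. The sketch below assumes you have already extracted `(timestamp, state)` pairs from your logs; that record shape, the state strings, and the function name are hypothetical, since the output format of AWSEC2-DetectSqlHaState is not described here.

```python
from datetime import datetime


def standby_intervals(observations):
    """Collapse (timestamp, state) observations into confirmed standby intervals.

    Each interval runs from the first to the last consecutive "standby"
    observation, so time between samples is never claimed beyond the evidence.
    """
    intervals = []
    start = None
    last = None
    for timestamp, state in sorted(observations):
        if state == "standby":
            if start is None:
                start = timestamp
            last = timestamp
        elif start is not None:
            intervals.append((start, last))
            start = None
    if start is not None:
        intervals.append((start, last))
    return intervals


samples = [
    (datetime(2025, 1, 1, 0), "standby"),
    (datetime(2025, 1, 1, 1), "standby"),
    (datetime(2025, 1, 1, 2), "active"),
    (datetime(2025, 1, 1, 3), "standby"),
]
for begin, end in standby_intervals(samples):
    print(begin.isoformat(), "to", end.isoformat())
```

Ending each interval at the last confirmed standby sample is a deliberately conservative choice for audit purposes.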

Failover testing and license behavior​

  • Test planned failovers and record how quickly billing state changes; the licensing waiver is tied to reported SQL HA state and billing is updated automatically.
  • Understand any grace windows — if a passive node becomes active briefly, licensing responsibility transfers and you must ensure compliance.

IAM and least privilege​

  • Use a narrowly scoped instance profile (or AWS-managed policy if recommended) for SSM operations.
  • Do not over‑privilege the instance role; keep access limited to the SSM and Secrets Manager APIs needed for detection and credential retrieval.

Security for SQL credentials​

  • If using SQL credentials stored in Secrets Manager, restrict access to the secret to the specific instance role and the SSM orchestration role, and enable automatic rotation.
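Restricting secret access can be expressed as an identity-based IAM policy attached to the instance role. The sketch below builds such a policy document as JSON; `secretsmanager:GetSecretValue` is the standard action for reading a secret, while the function name, the `Sid`, and the ARN are placeholders you would replace with your own. It does not reproduce the contents of the AWS-managed policy.

```python
import json


def secret_read_policy(secret_arn):
    """Build a policy document allowing reads of one specific secret only."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadSqlHaDetectionSecret",
                "Effect": "Allow",
                "Action": "secretsmanager:GetSecretValue",
                "Resource": secret_arn,
            }
        ],
    }


# Placeholder ARN for illustration; substitute the ARN of your own secret.
example_arn = "arn:aws:secretsmanager:us-east-1:111122223333:secret:sql-ha-detect-EXAMPLE"
print(json.dumps(secret_read_policy(example_arn), indent=2))
```

Scoping `Resource` to a single ARN, rather than a wildcard, keeps the role from reading any other secret in the account.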

Integration with existing automation​

  • The EC2 feature is accessible via the AWS API — integrate it into IaC workflows, automation scripts, and deployment pipelines.
  • AWS has signalled that CloudFormation support is coming; plan to templatize once native CloudFormation resource support or CloudFormation registry extensions are available.

Compliance and audit risks — what can trip you up​

  • Readable secondary: If your passive secondary is configured as a readable secondary and serves queries, it no longer qualifies as passive. That invalidates the waiver and creates potential licensing exposure.
  • Hidden workloads: Background processes or jobs on the passive node that constitute “active SQL Server workloads” (beyond allowed maintenance operations) can void the passive designation.
  • Instance size mismatch: If the standby has more vCPUs than the active server, it won’t qualify. Ensure instance sizes align if you plan to scale vertically.
  • Multiple passive nodes: Microsoft passive rights and the EC2 passive-waive restriction can differ — Microsoft’s rules around multiple passive replicas are specific and limited. Do not assume unlimited passive nodes can be unlicensed; validate per your licensing model (BYOL/SA vs LI).
  • Audit evidence: Maintain logs showing the node’s state over time. Billing disputes are far easier to defend with automated telemetry from SSM + CloudWatch logs.

Alternatives and complementary licensing strategies​

  • Bring‑Your‑Own‑License (BYOL) + License Mobility: If you have active Software Assurance (SA) or subscription licensing, you can use License Mobility to run BYOL in EC2. License Mobility has different failover rules — ensure you understand how they change your architecture and audit posture.
  • Dedicated Hosts / Dedicated Instances: For large deployments, licensing at the physical core can be cheaper — evaluate dedicated host pricing versus shared tenancy licensing requirements.
  • Optimize CPU for LI: AWS has introduced CPU options that let you tune vCPU counts or disable hyperthreading to lower vCPU-based license costs for LI instances. This is complementary; Optimize CPU reduces per-instance SQL license burden, and the passive-waive feature reduces cluster-level licensing by avoiding a second SQL license charge.
  • Edition selection: Where feasible, using SQL Server Standard or Web editions (where available and supported) may materially reduce license cost compared to Enterprise. The passive-waive feature applies to Standard and Enterprise LI, but edition selection remains one of the biggest levers for cost reduction.

Security and governance checklist before enabling in production​

  • Confirm SSM Agent version is current (SSM Agent 3.x or later recommended).
  • Ensure instance IAM role contains the required AWS-managed policy or an equivalent custom role with narrowly scoped permissions.
  • Validate SQL Server editions and Windows Server versions.
  • Confirm that the passive node is configured to never accept external SQL client traffic during normal operations.
  • Create a Secrets Manager secret and a minimal SQL metadata-only login if NT AUTHORITY\SYSTEM is restricted, and enable rotation.
  • Add CloudWatch dashboards and SSM runbooks that surface the HA detection results and any errors.
  • Create a runbook that documents failover scenarios and billing behavior — include steps to remediate SSM/permission issues that could break the passive detection and cause unwanted billing.

Practical pitfalls and troubleshooting guidance​

  • SSM Agent offline: the EC2 wizard checks SSM connectivity; if it's offline the feature cannot reliably detect HA state. Install or update the SSM Agent and verify connectivity.
  • IAM permissions missing: confirm the instance profile role includes the AWSEC2SqlHaInstancePolicy or equivalent permissions. When in doubt, test on a non‑production pair first.
  • NT AUTHORITY\SYSTEM disabled: many hardened Windows server images disable or restrict SYSTEM account access. If this is your environment, you must configure Secrets Manager with a SQL login that has the minimal metadata permissions described earlier.
  • Failure to reflect in billing: allow a short propagation delay. If billing does not reflect the expected waived license, collect SSM logs and contact AWS Support with evidence of detected standby state and timestamps.

When not to use this feature​

  • If your architecture requires readable secondaries for reporting, the passive-waive benefit is likely incompatible.
  • If you maintain more than two nodes in an AG or cluster and expect multiple secondaries to remain unlicensed: the feature covers a single standby only, and it does not remove the need to license additional copies as Microsoft's rules require.
  • Cross‑region DR topologies where a passive replica sits in another region are not supported by the EC2 passive-waive workflow as implemented.
  • If your security policy forbids any agent or IAM permissions on DB servers, you may not be able to use the automated detection flow.

Final analysis — strengths, limitations and risk assessment​

Strengths​

  • Immediate and automated cost reduction: removes the SQL LI charge for the passive node without manual billing adjustments.
  • Low operational overhead: the EC2 console wizard and SSM-based detection reduce manual configuration and human error.
  • Compatible with other cost optimizations: works alongside CPU optimization and Savings Plans / Reserved Instances models.
  • Audit-friendly when paired with telemetry: SSM logs and CloudWatch can provide a robust audit trail when configured properly.

Limitations and risks​

  • Narrow scope: only license-included SQL Server on Windows, only Standard and Enterprise editions, only two-node clusters, and only within a single region.
  • Dependency on SSM and IAM: environments that lock down SYSTEM account access or block SSM cannot use the default detection flow without adjustments.
  • Potential for licensing missteps: misconfigured readable secondaries, hidden workloads, or instance size mismatches create audit exposure.
  • Not a substitute for licensing governance: organizations still need to manage SQL edition choice, core counts, and compliance with Microsoft’s Product Terms.

Overall recommendation​

For shops running two‑node Always On clusters on EC2 using SQL Server License Included images, this feature represents a pragmatic and significant cost optimization — provided the teams handling SQL Server configuration, Windows hardening, and cloud IAM collaborate to ensure the passive node meets the strict criteria for waiver eligibility. The real work is in governance: automation reduces human friction, but auditors will expect documentary evidence and telemetry to prove that nodes designated as passive were truly passive during the billing periods in question.

Quick implementation checklist​

  • [ ] Confirm SQL Server 2017+ and Windows Server 2019+ across nodes.
  • [ ] Install and validate SSM Agent (v3.x or later).
  • [ ] Attach instance role with required permissions (or prepare custom policy).
  • [ ] Decide whether to allow NT AUTHORITY\SYSTEM access or create a Secrets Manager secret with a minimal SQL login.
  • [ ] Use EC2 console → Actions → Instance settings → Modify SQL High Availability settings to enable.
  • [ ] Configure CloudWatch + SSM logs; retain them for licensing audits.
  • [ ] Conduct planned failover test and verify billing updates.
  • [ ] Document architecture and enrollment decision in your compliance repository.

Conclusion​

AWS’s EC2 High Availability feature for SQL Server License Included on EC2 is a thoughtful, operationally simple response to a longstanding cost pain point: paying full SQL Server LI on both nodes of a two-node HA cluster. When deployed with care (aligning instance sizes, maintaining a true passive secondary, and preserving an auditable trail via Systems Manager and CloudWatch), the feature can materially reduce HA licensing expense while preserving the operational behavior of Always On AGs and FCIs.
The benefits are most pronounced where SQL Server licensing is a dominant portion of the stack cost. However, the downside risk is not negligible: misconfiguration or deviation from Microsoft’s passive failover rules can lead to noncompliance and retroactive licensing exposure. For that reason, treat this capability as a governed optimization: enable it where the architecture and operational controls can guarantee a truly passive standby, instrument the environment for evidence, and integrate the activation into your provisioning and audit processes.

Source: Amazon Web Services (AWS), "Reduce Microsoft SQL Server High Availability costs running on Amazon EC2"