Milliman’s move to Azure high‑performance computing (HPC) has turned a years‑long arms race in actuarial compute into a practical productivity story: what once required days of queuing, orchestration and expensive on‑prem hardware now completes in hours, enabling insurers to meet tighter regulatory deadlines and run far larger stochastic workloads at lower cost.
Source: Microsoft Milliman accelerates actuarial modeling with Azure HPC | Microsoft Customer Stories
Background
Actuarial modeling for life insurers is uniquely demanding: models must price long‑duration contracts, quantify reserve adequacy, and produce regulatory submissions that often require thousands of stochastic economic scenarios and millions of policy records. Over the past two decades Milliman’s Life Technology Solutions (LTS) practice developed distributed computing stacks, including the Integrate and MG‑ALFA platforms, to orchestrate those calculations for more than 100 life insurers worldwide. These platforms historically ran on on‑premises clusters and commodity cloud VMs, but regulatory complexity and dataset growth pushed them to the limits of elasticity and economics.
Regulatory drivers are central to the story. Europe’s Solvency II framework and U.S. principle‑based reserving developments such as NAIC’s VM‑22 require firms to consider broad ranges of economic scenarios and to validate internal models under strict governance and timelines. Solvency II internal model frameworks and supervisory guidance routinely expect thousands of stochastic runs to build distributions for value‑at‑risk and stress tests, while the NAIC’s VM‑22 workstream extends PBR approaches to non‑variable annuities and creates new modeling obligations for U.S. firms. These regulatory demands make high throughput and predictable turnarounds not optional, but core to solvency reporting.
Why classical cloud VMs were no longer enough
Early cloud migrations focused on general‑purpose VM families (D‑series and other “commodity” instances) for elasticity. That approach removed capital expense and made scaling easier, but it has limits for memory‑ and communication‑bound actuarial workloads.
- Actuarial jobs are often both memory‑heavy and massively parallel: they require large single‑run memory footprints (multi‑terabyte models) and tens of thousands of simultaneous threads to process scenario trees and nested loops across products.
- Distributed fleets of small VMs create orchestration sprawl: coordinating thousands of cores across hundreds of VMs increases overhead, raises the odds of stragglers, and complicates licensing and NUMA tuning.
- Memory‑to‑core ratios matter: insufficient memory per core forces disabling cores (wasting compute) or performing expensive out‑of‑core operations that dramatically increase runtime.
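The straggler point can be made concrete with a simple independence model. The sketch below assumes each VM independently runs slow or fails during a job with probability `p_slow`, so the chance that at least one of `n_vms` machines delays the job is 1 − (1 − p_slow)^n_vms. Both the function name and the 1% rate are hypothetical, chosen for illustration rather than taken from Milliman or Azure measurements.

```python
def prob_any_straggler(n_vms: int, p_slow: float) -> float:
    """Probability that at least one of n_vms independent VMs straggles."""
    if n_vms < 0 or not 0.0 <= p_slow <= 1.0:
        raise ValueError("n_vms must be >= 0 and p_slow must be in [0, 1]")
    return 1.0 - (1.0 - p_slow) ** n_vms


if __name__ == "__main__":
    # Hypothetical 1% per-VM straggler rate, compared across fleet sizes.
    for n in (10, 100, 500):
        print(f"{n:>4} VMs -> {prob_any_straggler(n, 0.01):.1%} chance of a straggler")
```

Under these assumed inputs the chance of hitting at least one straggler climbs from under 10% with 10 VMs to over 60% with 100, which is why consolidating onto fewer, larger machines can tighten completion times. Real fleets are not fully independent, so this is a rough guide only.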
The Azure HPC shift: H‑class machines, HBv3 → HX → HBv5
Milliman’s breakthrough came through adopting Azure’s H‑class HPC machines and subsequent generations of memory‑dense instances.
What the H‑class provides
Azure’s H‑series is purpose‑built for compute‑ and memory‑intensive HPC workloads. The HBv3 family, for example, offers large core counts, high memory bandwidth and a high‑performance InfiniBand fabric for MPI workloads. These features matter for tightly coupled actuarial jobs where low‑latency communication and predictable RDMA are required. Documentation for HBv3 confirms up to 120 vCPUs per VM, large L3 caches and a 200 Gb/s InfiniBand fabric for scale‑out MPI.
HBv5 and HX are successive steps toward even higher memory bandwidth and capacity. HBv5 targets memory‑bandwidth‑intensive HPC with terabytes‑per‑second class memory bandwidth and very large HBM capacities optimized for scientific simulations, while the HX series emphasizes high memory capacity: the HX family advertises twice the memory capacity of HBv4 and VM sizes with more than a terabyte (1,408 GB) of RAM per instance in its published specs. For memory‑bound actuarial models that previously required disabling cores to meet memory‑per‑thread needs, the HX and HBv5 families materially improve utilization and economics.
Why memory density changes the calculus
Two technical constraints governed Milliman’s earlier inefficiencies: raw core count and working‑set memory. Lower memory per core forces tradeoffs:
- Disable cores to increase memory per active core (wasting nominal capacity).
- Use distributed sharding across hosts, which increases cross‑host traffic and synchronization overhead, reducing parallel efficiency.
Memory‑dense instances with fast interconnects ease both tradeoffs. They make it possible to:
- Consolidate many parallel threads on fewer, larger VMs with high memory‑to‑core ratios.
- Exploit faster interconnects so that when parallel runs do require cross‑node communication, latency and throughput penalties are lower.
- Raise utilization (the percentage of available cores that can be effectively used). Milliman reports moving utilization from constrained levels to roughly 75% on newer HX hardware, a major cost and time win for clients.
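The link between memory per core and utilization comes down to simple arithmetic. The sketch below assumes every thread needs a fixed amount of RAM and ignores memory reserved for the operating system. The function names, the 8 GB per‑thread figure and the two VM shapes are illustrative assumptions, not Milliman's actual configuration.

```python
def usable_cores(total_cores: int, vm_memory_gb: float, gb_per_thread: float) -> int:
    """Cores that can run at once when each thread needs gb_per_thread of RAM."""
    if total_cores <= 0 or vm_memory_gb <= 0 or gb_per_thread <= 0:
        raise ValueError("all inputs must be positive")
    return min(total_cores, int(vm_memory_gb // gb_per_thread))


def core_utilization(total_cores: int, vm_memory_gb: float, gb_per_thread: float) -> float:
    """Fraction of the VM's cores that stay active under the memory limit."""
    return usable_cores(total_cores, vm_memory_gb, gb_per_thread) / total_cores


if __name__ == "__main__":
    # Illustrative shapes only; check current Azure documentation for real sizes.
    shapes = {"lower-memory VM": (120, 448.0), "memory-dense VM": (176, 1408.0)}
    for name, (cores, mem) in shapes.items():
        share = core_utilization(cores, mem, gb_per_thread=8.0)
        print(f"{name}: {share:.0%} of cores usable at 8 GB per thread")
```

With these assumed inputs the lower‑memory shape has to idle more than half its cores, while the memory‑dense shape keeps all of them busy. A real sizing exercise would use measured per‑thread footprints from profiling.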
How that translates into actuarial modeling gains
The combination of dense compute, expanded memory capacity and low‑latency fabric yields three practical results for actuarial workloads.
- Faster runtimes: Jobs that used to take days can finish in hours. This is not a marketing slogan; it’s a consequence of removing memory bottlenecks and minimizing cross‑host synchronization costs. Milliman reported dramatic runtime reductions after moving critical jobs to Azure H‑class instances.
- Greater reliability and predictability: Fewer nodes, larger VMs and RDMA‑backed fabrics reduce the number of potential failure points and variability in job completion times. For regulated reporting windows, predictable throughput matters as much as raw speed.
- Improved cost efficiency: By increasing utilization and using larger‑footprint machines rather than thousands of small VMs, the effective cost per run falls—assuming high throughput and mature orchestration to avoid idle time. Milliman noted better economics on H‑series machines compared with commodity VMs.
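The cost argument can be sketched with two small formulas: total spend per run, and the hourly price spread over only the cores doing useful work. Every price, VM count and duration below is a made‑up input for illustration, and the function names are hypothetical; actual Azure pricing and Milliman's run profiles are not given in the source.

```python
def cost_per_run(vm_count: int, price_per_vm_hour: float, wall_hours: float) -> float:
    """Total compute spend for one run: VMs x hourly price x wall-clock hours."""
    if vm_count <= 0 or price_per_vm_hour < 0 or wall_hours < 0:
        raise ValueError("vm_count must be positive; price and hours non-negative")
    return vm_count * price_per_vm_hour * wall_hours


def cost_per_useful_core_hour(price_per_vm_hour: float, cores: int, utilization: float) -> float:
    """Hourly VM price spread over only the cores doing useful work."""
    if cores <= 0 or not 0.0 < utilization <= 1.0:
        raise ValueError("cores must be positive and utilization in (0, 1]")
    return price_per_vm_hour / (cores * utilization)


if __name__ == "__main__":
    # All prices, counts and durations are hypothetical.
    fleet = cost_per_run(vm_count=200, price_per_vm_hour=0.50, wall_hours=48.0)
    dense = cost_per_run(vm_count=8, price_per_vm_hour=9.00, wall_hours=6.0)
    print(f"small-VM fleet: {fleet:.2f} per run; dense VMs: {dense:.2f} per run")
    for u in (0.40, 0.75):
        rate = cost_per_useful_core_hour(9.00, 176, u)
        print(f"utilization {u:.0%}: {rate:.4f} per useful core-hour")
```

The second function shows why idle capacity matters: the same VM price costs almost twice as much per useful core‑hour at 40% utilization as at 75%.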
Regulatory context revisited: scale, governance and auditability
The push to HPC is not purely about speed. Regulatory frameworks like Solvency II and the NAIC’s PBR/VM‑22 developments require robust scenario generation, repeatability and documented controls. Key points:
- Solvency II internal models and ORSA processes frequently require thousands of stochastic scenarios and careful model validation; regulators expect firms to document scenario design, model assumptions, and error handling. EIOPA’s guidance and internal‑model comparisons make clear that internal models must be statistically sound and auditable.
- The NAIC’s VM‑22 initiative formalizes principle‑based reserving for non‑variable annuities, adding new computational demands and disclosure expectations in the U.S., including new economic scenario generators and reporting fields. Firms implementing VM‑22 will need to run rigorous stochastic reserves and generate supporting documentation.
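To show why scenario counts drive compute, the sketch below estimates value‑at‑risk as an empirical quantile of simulated losses. It defaults to the 99.5% level to which Solvency II calibrates its one‑year capital requirement. The normally distributed losses are a stand‑in for output from a real economic scenario generator and projection model, and `empirical_var` is a hypothetical helper, not part of Integrate or MG‑ALFA.

```python
import math
import random


def empirical_var(losses: list[float], level: float = 0.995) -> float:
    """Smallest observed loss with at least `level` of outcomes at or below it."""
    if not losses:
        raise ValueError("losses must not be empty")
    if not 0.0 < level < 1.0:
        raise ValueError("level must be strictly between 0 and 1")
    ordered = sorted(losses)
    k = math.ceil(level * len(ordered)) - 1  # zero-based rank of the quantile
    return ordered[k]


if __name__ == "__main__":
    rng = random.Random(2024)  # fixed seed so the run is repeatable
    # Stand-in losses; a real run would take these from the projection model.
    losses = [rng.gauss(0.0, 1_000_000.0) for _ in range(10_000)]
    print(f"99.5% VaR over {len(losses)} scenarios: {empirical_var(losses):,.0f}")
```

At the 99.5% level only about 50 of 10,000 scenarios fall beyond the estimate, so tail statistics stay noisy unless the scenario count is large. That is one reason regulatory runs scale into the thousands of scenarios and beyond.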
Technical and operational best practices (what Milliman and others learned)
Adopting rack‑scale or memory‑dense cloud HPC is not plug‑and‑play. The following practices are necessary to convert potential into real production outcomes:
- Profile first: measure memory footprint, communication patterns and I/O behavior with representative jobs before selecting instance classes.
- Use topology‑aware scheduling: keep frequently communicating tasks within the same host or NUMA/NIC domain where possible to minimize cross‑node traffic.
- Right‑size memory and core counts: choose instances that match your memory‑per‑thread needs rather than maximizing cores alone.
- End‑to‑end benchmarking: measure not only raw compute throughput but also end‑to‑end time‑to‑report including data ingest, prefetch, checkpointing, and post‑processing.
- Plan for licensing and cost models: many actuarial tools are licensed per core or per socket—consolidating on larger instances may change licensing impacts.
- Embed governance and audit trails: ensure model inputs, seeds for stochastic generators, and versioned code are logged and recoverable for regulator inspection.
A pilot that applies these practices can follow a short checklist:
- Define representative jobs (policy mix, scenario count, stochastic seeds).
- Run single‑node and multi‑node benchmarks capturing wall time, tail latency, and resource utilization.
- Validate numerical fidelity across precisions and sharding strategies.
- Negotiate topology‑aware SLAs (rack locality, interconnect bandwidth guarantees).
- Verify the security perimeter: data residency, encrypted storage, and controlled access to compute nodes.
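The benchmarking steps above call for wall time, tail latency and utilization. A minimal way to summarize them, assuming per‑task durations and busy core‑seconds have already been collected by the scheduler, is sketched below. The function names and dictionary keys are hypothetical.

```python
import statistics


def summarize_task_times(task_seconds: list[float]) -> dict[str, float]:
    """Median, 95th percentile and their ratio for per-task durations."""
    if len(task_seconds) < 2:
        raise ValueError("need at least two task timings")
    median = statistics.median(task_seconds)
    if median <= 0:
        raise ValueError("task durations must be positive")
    # quantiles(n=20) returns 19 cut points; the last one is the 95th percentile.
    p95 = statistics.quantiles(task_seconds, n=20, method="inclusive")[-1]
    return {"median_s": median, "p95_s": p95, "tail_ratio": p95 / median}


def measured_utilization(busy_core_seconds: float, cores: int, wall_seconds: float) -> float:
    """Share of the reserved core-seconds that were spent doing work."""
    if cores <= 0 or wall_seconds <= 0:
        raise ValueError("cores and wall_seconds must be positive")
    return busy_core_seconds / (cores * wall_seconds)


if __name__ == "__main__":
    # Made-up timings: most tasks take about 60 s, two stragglers take far longer.
    timings = [60.0] * 18 + [180.0, 240.0]
    print(summarize_task_times(timings))
    print(f"utilization: {measured_utilization(75_000.0, 176, 600.0):.0%}")
```

A tail ratio well above 1 signals stragglers that will dominate wall time in a tightly coupled run, even when the median task looks healthy.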
Economics and vendor risks: the tradeoffs
Moving heavy actuarial workloads to Azure HPC produces clear benefits, but it also introduces risks that insurers and vendors must manage.
- Concentration risk: frontier HPC hardware (rack‑scale, NVLink‑first architectures) is expensive and initially available in limited regions. Reliance on a single hyperscaler or specific instance family increases exposure to regional outages and supply constraints.
- Vendor‑specific optimizations and lock‑in: topology‑aware performance gains often come from vendor‑specific fabrics, collectives and numeric optimizations. Porting tuned pipelines between clouds or to on‑prem hardware can be nontrivial.
- Cost predictability: high‑density instances are expensive when idle; realizing cost benefits depends on maintaining high sustained utilization or on batch window scheduling to smooth costs.
- Environmental and facility considerations: denser compute raises power and cooling demands. While modern designs often use liquid cooling and PUE optimizations, large deployments have real grid and facility implications that should be considered in TCO and sustainability reporting.
Milliman’s pragmatic architecture — a model for insurers
Milliman’s practical playbook offers a template for other firms:
- Platformization: expose actuarial engines (Integrate / MG‑ALFA) behind a governed data platform and embed version control, notebooks and BI to make models reproducible.
- Cloud‑first HPC: use Azure H‑class instances for the heaviest runs while keeping more agile workloads on smaller cloud families.
- Hybrid orchestration: combine Azure CycleCloud/Azure Batch with scheduler tuning, place constraints to ensure topology locality for latency‑sensitive stages, and use checkpointing for long runs.
- Economics by design: combine capacity reservations, off‑peak scheduling and consolidation on memory‑dense instances to lower cost per stochastic scenario.
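Checkpointing for long runs, mentioned in the playbook above, can be as simple as periodically writing progress to a plain JSON file and resuming from it. The sketch below is a single‑process illustration with hypothetical function names and state fields. It says nothing about how Integrate or MG‑ALFA actually persist state, and a distributed run would also need per‑worker coordination.

```python
import json
import os
import tempfile


def save_checkpoint(path: str, state: dict) -> None:
    """Write state to a temp file, then atomically swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
    os.replace(tmp_path, path)  # readers never see a half-written file


def load_checkpoint(path: str) -> dict:
    """Return saved progress, or a fresh state if no checkpoint exists."""
    if not os.path.exists(path):
        return {"next_scenario": 0, "running_total": 0.0}
    with open(path) as handle:
        return json.load(handle)


def run_scenarios(path, n_scenarios, evaluate, checkpoint_every=100, stop_after=None):
    """Evaluate scenarios in order, resuming from the last checkpoint."""
    state = load_checkpoint(path)
    done_this_call = 0
    for i in range(state["next_scenario"], n_scenarios):
        if stop_after is not None and done_this_call >= stop_after:
            break  # stands in for a graceful stop, such as an eviction notice
        state["running_total"] += evaluate(i)
        state["next_scenario"] = i + 1
        done_this_call += 1
        if state["next_scenario"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        ckpt = os.path.join(workdir, "run.ckpt.json")
        run_scenarios(ckpt, 1000, lambda i: float(i), stop_after=250)
        print(run_scenarios(ckpt, 1000, lambda i: float(i)))  # resumes at 250
```

Keeping the checkpoint in a plain, documented format rather than a vendor‑specific one also makes it easier to move a run between environments.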
Critical analysis: strengths, caveats and unresolved questions
Strengths
- Performance uplift: The combination of high memory density, fast interconnects and more cores per VM reduces communication penalties and allows actuarial models to run faster with fewer nodes.
- Governance posture: Integrating HPC with governed data platforms and versioned artifacts meets regulatory needs for reproducibility and audit trails.
- Economies at scale: Consolidating jobs on H‑class instances can lower cost per effective run when utilization is high.
Caveats and open questions
- Vendor claims vs. workload reality: Cloud and hardware vendors publish peak throughput and memory figures, but real gains are workload dependent. Independent benchmarking with representative actuarial jobs is essential before committing major programs. Treat headline performance figures and “first to deploy” claims as vendor statements that require validation.
- Portability risk: Optimizations that exploit vendor fabric features (NVLink, in‑network offloads) may reduce portability across clouds or on‑prem hardware; firms must preserve escape and fallback strategies.
- Regional capacity limits: Early availability of new instance classes tends to be regional; insurers with strict data residency or regional scheduling needs may face delays or additional cost to achieve required capacity.
- Sustainability and facilities: Dense racks and liquid cooling impose facility design and energy considerations that affect total cost of ownership and sustainability reporting.
Recommendations for actuarial teams and IT leaders
- Start with a production‑representative pilot: include full data ingest, stochastic libraries, and end‑to‑end reporting to capture real bottlenecks.
- Insist on workload‑matched SLAs from cloud vendors: topology locality guarantees, telemetry hooks and audit logs that map compute to regulatory artifacts.
- Build portability into code and data: abstract sharding/collective layers where possible and maintain checkpoint formats that are cloud‑agnostic.
- Negotiate predictable pricing and committed capacity where predictable reporting cycles are critical.
- Embed governance controls: versioned seeds for stochastic generation, immutable input snapshots, signed model artifacts and machine‑readable audit trails.
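A machine‑readable audit trail of the kind recommended above can start as a run manifest that records the seed, the code version and a hash of every input, plus a digest of the manifest itself. The sketch below uses only the Python standard library; the field names and file names are hypothetical. A content hash detects tampering but is not a signature, so real signing would add a keyed or asymmetric scheme on top.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def _canonical(body: dict) -> bytes:
    """Stable JSON encoding so the same content always hashes the same."""
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")


def build_manifest(inputs: dict[str, bytes], seed: int, code_version: str) -> dict:
    """Record seed, code version and per-input hashes, then hash the record."""
    manifest = {
        "code_version": code_version,
        "seed": seed,
        "inputs": {name: sha256_hex(blob) for name, blob in sorted(inputs.items())},
    }
    manifest["manifest_sha256"] = sha256_hex(_canonical(manifest))
    return manifest


def verify_manifest(manifest: dict) -> bool:
    """Recompute the digest over everything except the stored digest."""
    body = {k: v for k, v in manifest.items() if k != "manifest_sha256"}
    return sha256_hex(_canonical(body)) == manifest.get("manifest_sha256")


if __name__ == "__main__":
    # Hypothetical inputs; real runs would hash the actual snapshot files.
    snapshot = {"policies.csv": b"id,face\n1,100000\n", "scenarios.bin": b"\x00\x01"}
    record = build_manifest(snapshot, seed=12345, code_version="v1.4.2")
    print(json.dumps(record, indent=2))
    print("intact:", verify_manifest(record))
```

Storing such a manifest next to each run's outputs gives reviewers a direct way to confirm which inputs, seed and code version produced a given result.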
Conclusion
Milliman’s adoption of Azure HPC shows how domain expertise plus purpose‑built cloud hardware can change what’s operationally feasible in actuarial modeling. By moving heavy runs from sprawling fleets of small VMs into H‑class, memory‑dense instances, Milliman cut runtimes and improved utilization, delivering faster, more reliable, and more economical outcomes for insurers facing tougher regulatory requirements like Solvency II and VM‑22. The technical gains are real, but achieving them requires careful workload profiling, topology‑aware engineering, contract safeguards, and strong governance so that speed and scale do not come at the expense of auditability, portability or cost control. For life insurers that depend on timely, defensible reserves and risk reports, the new generation of cloud HPC is a practical enabler, provided that organizations treat it as a platform investment rather than a quick fix.