Milliman Accelerates Actuarial Modeling with Azure HPC for Faster, Compliant Runs

Milliman’s move to Azure high‑performance computing (HPC) has turned a years‑long arms race in actuarial compute into a practical productivity story: what once required days of queuing, orchestration and expensive on‑prem hardware now completes in hours, enabling insurers to meet tighter regulatory deadlines and run far larger stochastic workloads at lower cost.

Background​

Actuarial modeling for life insurers is uniquely demanding: models must price long‑duration contracts, quantify reserve adequacy, and produce regulatory submissions that often require thousands of stochastic economic scenarios and millions of policy records. Over the past two decades Milliman’s Life Technology Solutions (LTS) practice developed distributed computing stacks — including the Integrate and MG‑ALFA platforms — to orchestrate those calculations for more than 100 life insurers worldwide. These platforms historically ran on‑premises clusters and commodity cloud VMs, but regulatory complexity and dataset growth pushed them to the limits of elasticity and economics.

Regulatory drivers are central to the story. Europe’s Solvency II framework and U.S. principle‑based reserving developments such as NAIC’s VM‑22 require firms to consider broad ranges of economic scenarios and to validate internal models under strict governance and timelines. Solvency II internal model frameworks and supervisory guidance routinely expect thousands of stochastic runs to build distributions for value‑at‑risk and stress tests, while the NAIC’s VM‑22 workstream extends PBR approaches to non‑variable annuities and creates new modeling obligations for U.S. firms. These regulatory demands make high throughput and predictable turnarounds not optional, but core to solvency reporting.

Why classical cloud VMs were no longer enough​

Early cloud migrations focused on general‑purpose VM families — D‑series and other “commodity” instances — for elasticity. That approach removed capital expense and made scaling easier, but it has limits for memory‑ and communication‑bound actuarial workloads.
  • Actuarial jobs are often both memory‑heavy and massively parallel: they require large single‑run memory footprints (multi‑terabyte models) and tens of thousands of simultaneous threads to process scenario trees and nested loops across products.
  • Distributed fleets of small VMs create orchestration sprawl: coordinating thousands of cores across hundreds of VMs increases overhead, raises the odds of stragglers, and complicates licensing and NUMA tuning.
  • Memory‑to‑core ratios matter: insufficient memory per core forces disabling cores (wasting compute) or performing expensive out‑of‑core operations that dramatically increase runtime.
Milliman’s experience mirrors these points: general‑purpose VMs delivered scale, but not the predictable throughput, memory density, and cost‑efficiency their clients needed as VM counts, policy volumes and regulatory scenario mixes ballooned.
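The memory‑per‑core tradeoff can be made concrete with a small sizing sketch. The helper below estimates how many of a VM’s cores a memory‑hungry model can actually keep busy; the VM figures are illustrative approximations of published HBv3‑ and HX‑class specs, and the 8 GB‑per‑thread working set is an assumed number, not a Milliman figure.

```python
# Illustrative sizing helper: given a VM's core count and RAM, and a model's
# per-thread working set, how many cores can run concurrently without
# spilling out of memory? VM specs below are rough, published-figure proxies.

def usable_cores(vm_cores: int, vm_ram_gb: float, gb_per_thread: float) -> int:
    """Cores that can run concurrently within the VM's memory budget."""
    by_memory = int(vm_ram_gb // gb_per_thread)
    return min(vm_cores, by_memory)

def utilization(vm_cores: int, vm_ram_gb: float, gb_per_thread: float) -> float:
    """Fraction of nominal cores that are actually usable."""
    return usable_cores(vm_cores, vm_ram_gb, gb_per_thread) / vm_cores

# Assumed memory-hungry model needing 8 GB per thread:
general = utilization(120, 448, 8.0)   # HBv3-like: 448 GB RAM caps usable cores at 56
dense   = utilization(176, 1408, 8.0)  # HX-like: 1,408 GB RAM keeps every core busy
```

On the HBv3‑like shape, memory limits the job to 56 of 120 cores (under half the capacity you pay for); on the memory‑dense HX‑like shape, every core stays usable — which is the utilization shift the article describes.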

The Azure HPC shift: H‑class machines, HBv3 → HX → HBv5​

Milliman’s breakthrough came through adopting Azure’s H‑class HPC machines and subsequent generations of memory‑dense instances.

What the H‑class provides​

Azure’s H‑series is purpose‑built for compute‑ and memory‑intensive HPC workloads. The HBv3 family, for example, offers large core counts, high memory bandwidth and a high‑performance InfiniBand fabric for MPI workloads — features that matter for tightly coupled actuarial jobs where low‑latency communication and predictable RDMA are required. Documentation for HBv3 confirms up to 120 vCPUs per VM, large L3 caches and a 200 Gb/s InfiniBand fabric for scale‑out MPI.

HBv5 and HX are successive steps toward even higher memory bandwidth and capacity. HBv5 targets memory‑bandwidth‑intensive HPC with terabytes‑per‑second‑class memory bandwidth and very large HBM capacities optimized for scientific simulations, while the HX series emphasizes memory capacity: its published specs advertise twice the memory capacity of HBv4, with more than a terabyte (1,408 GB) of RAM per instance. For memory‑bound actuarial models that previously required disabling cores to meet memory‑per‑thread needs, the HX and HBv5 families materially improve utilization and economics.

Why memory density changes the calculus​

Two technical constraints governed Milliman’s earlier inefficiencies: raw core count and working‑set memory. Lower memory per core forces tradeoffs:
  • Disable cores to increase memory per active core (wasting nominal capacity).
  • Use distributed sharding across hosts, which increases cross‑host traffic and synchronization overhead, reducing parallel efficiency.
By moving to H‑class machines with much higher memory capacity (HX) and memory bandwidth (HBv5), Milliman was able to:
  • Consolidate many parallel threads on fewer, larger VMs with high memory-to-core ratios.
  • Exploit faster interconnects so that when parallel runs do require cross‑node communication, latency and throughput penalties are lower.
  • Raise utilization (the percentage of available cores that can be effectively used) — Milliman reports moving utilization from constrained levels to roughly 75% on newer HX hardware, a major cost and time win for clients.

How that translates into actuarial modeling gains​

The combination of dense compute, expanded memory capacity and low‑latency fabric yields three practical results for actuarial workloads.
  • Faster runtimes: Jobs that used to take days can finish in hours. This is not a marketing slogan—it’s a consequence of removing memory bottlenecks and minimizing cross‑host synchronization costs. Milliman reported dramatic runtime reductions after moving critical jobs to Azure H‑class instances.
  • Greater reliability and predictability: Fewer nodes, larger VMs and RDMA‑backed fabrics reduce the number of potential failure points and variability in job completion times. For regulated reporting windows, predictable throughput matters as much as raw speed.
  • Improved cost efficiency: By increasing utilization and using larger‑footprint machines rather than thousands of small VMs, the effective cost per run falls—assuming high throughput and mature orchestration to avoid idle time. Milliman noted better economics on H‑series machines compared with commodity VMs.
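The cost‑efficiency point is a simple piece of arithmetic worth making explicit: effective cost per run depends on hourly price, wall time, and how much of the paid capacity is actually utilized. The prices, hours and utilization figures below are made‑up illustrative numbers (only the ~75% utilization echoes the figure Milliman reports), not Azure rates.

```python
# Rough cost-per-run comparison. All dollar figures and wall times are
# invented for illustration; only the shape of the tradeoff matters.

def cost_per_run(hourly_price: float, wall_hours: float, utilization: float) -> float:
    """Cost attributable to one run, penalizing paid-for but idle capacity."""
    return hourly_price * wall_hours / utilization

# Sprawling fleet of small VMs: long runs, low effective utilization.
small_fleet = cost_per_run(hourly_price=200.0, wall_hours=30.0, utilization=0.45)
# One memory-dense H-class shape: pricier per hour, far shorter runs, ~75% utilization.
dense = cost_per_run(hourly_price=400.0, wall_hours=6.0, utilization=0.75)
```

Even at double the hourly price, the dense configuration comes out several times cheaper per run in this sketch — which is why the article stresses that the benefit only materializes "assuming high throughput and mature orchestration to avoid idle time": drop utilization back toward zero and the denominator erases the gain.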

Regulatory context revisited: scale, governance and auditability​

The push to HPC is not purely about speed. Regulatory frameworks like Solvency II and the NAIC’s PBR/VM‑22 developments require robust scenario generation, repeatability and documented controls. Key points:
  • Solvency II internal models and ORSA processes frequently require thousands of stochastic scenarios and careful model validation; regulators expect firms to document scenario design, model assumptions, and error handling. EIOPA’s guidance and internal‑model comparisons make clear that internal models must be statistically sound and auditable.
  • The NAIC’s VM‑22 initiative formalizes principle‑based reserving for non‑variable annuities, adding new computational demands and disclosure expectations in the U.S., including new economic scenario generators and reporting fields. Firms implementing VM‑22 will need to run rigorous stochastic reserves and generate supporting documentation.
For insurers and consulting partners, faster compute is useful only if it integrates with governance: reproducible pipelines, versioned model code, secure data handling, and machine‑readable audit trails are essential. Milliman’s Integrate platform and Microsoft Fabric integration aim to meet those needs by embedding analytics, version control and governed data lakes together with HPC compute.

Technical and operational best practices (what Milliman and others learned)​

Adopting rack‑scale or memory‑dense cloud HPC is not plug‑and‑play. The following practices are necessary to convert potential into real production outcomes:
  • Profile first: measure memory footprint, communication patterns and I/O behavior with representative jobs before selecting instance classes.
  • Use topology‑aware scheduling: keep frequently communicating tasks within the same host or NVLink/NIC domain where possible to minimize cross‑node traffic.
  • Right‑size memory and core counts: choose instances that match your memory‑per‑thread needs rather than maximizing cores alone.
  • End‑to‑end benchmarking: measure not only raw compute throughput but also end‑to‑end time‑to‑report including data ingest, prefetch, checkpointing, and post‑processing.
  • Plan for licensing and cost models: many actuarial tools are licensed per core or per socket—consolidating on larger instances may change licensing impacts.
  • Embed governance and audit trails: ensure model inputs, seeds for stochastic generators, and versioned code are logged and recoverable for regulator inspection.
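The last bullet — logging stochastic seeds, inputs and code versions for regulator inspection — can be sketched as a minimal reproducibility manifest. The field names and structure here are hypothetical illustrations, not part of Milliman’s Integrate platform or any regulatory schema.

```python
# A minimal reproducibility manifest for one stochastic run: record the RNG
# seed, a hash of the immutable input snapshot, and the model code version,
# so the run can be reproduced and audited later. Field names are hypothetical.
import datetime
import hashlib
import json

def run_manifest(seed: int, input_bytes: bytes, code_version: str) -> dict:
    return {
        "seed": seed,
        "input_sha256": hashlib.sha256(input_bytes).hexdigest(),
        "code_version": code_version,
        "timestamp_utc": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    }

manifest = run_manifest(seed=20240601,
                        input_bytes=b"policy snapshot ...",
                        code_version="v4.2.1")
print(json.dumps(manifest, indent=2))
```

Storing such a manifest alongside each run’s outputs (in the governed data lake the article describes) gives the machine‑readable audit trail that makes a fast run also a defensible one.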
A short, practical acceptance checklist for insurers moving to HPC:
  • Define representative jobs (policy mix, scenario count, stochastic seeds).
  • Run single‑node and multi‑node benchmarks capturing wall time, tail latency, and resource utilization.
  • Validate numerical fidelity across precisions and sharding strategies.
  • Negotiate topology‑aware SLAs (rack locality, interconnect bandwidth guarantees).
  • Verify the security perimeter: data residency, encrypted storage, and controlled access to compute nodes.
These operational controls reflect Milliman’s approach: they combine platform engineering with actuarial domain knowledge to make large, regulated runs reliable and auditable.
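The benchmarking items in the checklist — capturing wall time and tail latency, not just averages — can be sketched with a tiny harness. This is a generic illustration with a toy stand‑in job, not Milliman’s benchmarking tooling.

```python
# Capture wall time and tail latency across repeated benchmark runs.
# The percentile handling is simplistic and meant only to illustrate that
# tail behavior, not just the median, should be measured.
import statistics
import time

def benchmark(job, repeats: int = 10) -> dict:
    """Run `job` repeatedly; report median, ~p95 and worst-case wall time."""
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        job()
        times.append(time.perf_counter() - start)
    times.sort()
    return {
        "median_s": statistics.median(times),
        "p95_s": times[max(0, int(0.95 * len(times)) - 1)],
        "max_s": times[-1],
    }

def toy_job():
    # Stand-in for a representative scenario batch.
    sum(i * i for i in range(100_000))

stats = benchmark(toy_job, repeats=5)
```

In production the `job` would be a representative scenario batch (real policy mix, real seeds), run both single‑node and multi‑node, and the gap between median and p95 is what tells you whether stragglers will blow a reporting window.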

Economics and vendor risks: the tradeoffs​

Moving heavy actuarial workloads to Azure HPC produces clear benefits, but it also introduces risks that insurers and vendors must manage.
  • Concentration risk: frontier HPC hardware (rack‑scale, NVLink‑first architectures) is expensive and initially available in limited regions. Reliance on a single hyperscaler or specific instance family increases exposure to regional outages and supply constraints.
  • Vendor‑specific optimizations and lock‑in: topology‑aware performance gains often come from vendor‑specific fabrics, collectives and numeric optimizations. Porting tuned pipelines between clouds or to on‑prem hardware can be nontrivial.
  • Cost predictability: high‑density instances are expensive when idle; realizing cost benefits depends on maintaining high sustained utilization or on batch window scheduling to smooth costs.
  • Environmental and facility considerations: denser compute raises power and cooling demands. While modern designs often use liquid cooling and PUE optimizations, large deployments have real grid and facility implications that should be considered in TCO and sustainability reporting.
These concerns do not negate the advantages of HPC; they require explicit mitigation: multi‑region capacity planning, contractual SLAs with auditability, portability engineering, and workload orchestration that minimizes idle time.

Milliman’s pragmatic architecture — a model for insurers​

Milliman’s practical playbook offers a template for other firms:
  • Platformization: expose actuarial engines (Integrate / MG‑ALFA) behind a governed data platform and embed version control, notebooks and BI to make models reproducible.
  • Cloud‑first HPC: use Azure H‑class instances for the heaviest runs while keeping more agile workloads on smaller cloud families.
  • Hybrid orchestration: combine Azure CycleCloud/Azure Batch with scheduler tuning, place constraints to ensure topology locality for latency‑sensitive stages, and use checkpointing for long runs.
  • Economics by design: combine capacity reservations, off‑peak scheduling and consolidation on memory‑dense instances to lower cost per stochastic scenario.
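The checkpointing element of that playbook can be sketched simply: persist progress after each block of scenarios so a preempted or failed long run resumes rather than restarts. The file format and block granularity below are illustrative assumptions, not details of Milliman’s orchestration.

```python
# Minimal checkpoint/resume sketch for a long stochastic run. Progress is
# written atomically after each scenario block so a crash or preemption
# costs at most one block of recomputation.
import json
import os

CHECKPOINT = "run_checkpoint.json"  # illustrative path

def load_checkpoint() -> int:
    """Return the index of the next unprocessed block (0 on a fresh run)."""
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT) as f:
            return json.load(f)["next_block"]
    return 0

def save_checkpoint(next_block: int) -> None:
    tmp = CHECKPOINT + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"next_block": next_block}, f)
    os.replace(tmp, CHECKPOINT)  # atomic rename: a crash never corrupts it

def run_scenarios(total_blocks: int, run_block) -> None:
    """Process blocks from wherever the last run stopped."""
    for block in range(load_checkpoint(), total_blocks):
        run_block(block)          # e.g. one block = 1,000 scenarios
        save_checkpoint(block + 1)
```

Combined with off‑peak scheduling, this is what lets long runs use cheaper or preemptible capacity without risking the reporting deadline.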
The results reported by Milliman are concrete: substantial runtime reductions and utilization improvements that translate into faster reporting cycles and lower client costs — enabling insurers to run more scenarios, add sensitivity analyses, and meet regulatory timetables more comfortably.

Critical analysis: strengths, caveats and unresolved questions​

Strengths​

  • Performance uplift: The combination of high memory density, fast interconnects and larger cores per VM reduces communication penalties and allows actuarial models to run faster with fewer nodes.
  • Governance posture: Integrating HPC with governed data platforms and versioned artifacts meets regulatory needs for reproducibility and audit trails.
  • Economies at scale: Consolidating jobs on H‑class instances can lower cost per effective run when utilization is high.

Caveats and open questions​

  • Vendor claims vs. workload reality: Cloud and hardware vendors publish peak throughput and memory figures, but real gains are workload dependent. Independent benchmarking with representative actuarial jobs is essential before committing major programs. Treat headline performance or “first” deploy claims as vendor statements requiring validation.
  • Portability risk: Optimizations that exploit vendor fabric features (NVLink, in‑network offloads) may reduce portability across clouds or on‑prem hardware; firms must preserve escape and fallback strategies.
  • Regional capacity limits: Early availability of new instance classes tends to be regional; insurers with strict data residency or regional scheduling needs may face delays or additional cost to achieve required capacity.
  • Sustainability and facilities: Dense racks and liquid cooling impose facility design and energy considerations that affect total cost of ownership and sustainability reporting.
When considered together, the strengths are compelling for firms that can invest in portability engineering and governance. The caveats require measured procurement, contractual protections and representative benchmarking to turn platform potential into validated, regulator‑defensible outcomes.

Recommendations for actuarial teams and IT leaders​

  • Start with a production‑representative pilot: include full data ingest, stochastic libraries, and end‑to‑end reporting to capture real bottlenecks.
  • Insist on workload‑matched SLAs from cloud vendors: topology locality guarantees, telemetry hooks and audit logs that map compute to regulatory artifacts.
  • Build portability into code and data: abstract sharding/collective layers where possible and maintain checkpoint formats that are cloud‑agnostic.
  • Negotiate predictable pricing and committed capacity where predictable reporting cycles are critical.
  • Embed governance controls: versioned seeds for stochastic generation, immutable input snapshots, signed model artifacts and machine‑readable audit trails.
A practical, phased approach reduces transition risk: prove correctness and speed first, then scale to mission‑critical reporting cycles once benchmarks and governance are in place.

Conclusion​

Milliman’s adoption of Azure HPC shows how domain expertise plus purpose‑built cloud hardware can change what’s operationally feasible in actuarial modeling. By moving heavy runs from sprawling fleets of small VMs into H‑class, memory‑dense instances, Milliman cut runtimes and improved utilization—delivering faster, more reliable, and more economical outcomes for insurers facing tougher regulatory requirements like Solvency II and VM‑22. The technical gains are real, but achieving them requires careful workload profiling, topology‑aware engineering, contract safeguards, and strong governance so that speed and scale do not come at the expense of auditability, portability or cost control. For life insurers that depend on timely, defensible reserves and risk reports, the new generation of cloud HPC is a practical enabler — provided that organizations treat it as a platform investment rather than a quick‑fix.
Source: Microsoft Milliman accelerates actuarial modeling with Azure HPC | Microsoft Customer Stories
 
