Shift Left SDVs: AMD VAS on Azure with PAVE360

AMD’s new push to enable “shift‑left” development for software‑defined vehicles (SDVs) — delivering a Virtualized Automotive Stack (VAS) that runs on AMD Radeon PRO V710 GPUs and AMD EPYC CPUs in Microsoft Azure and integrating Siemens’ PAVE360 digital‑twin environment — promises a meaningful acceleration of cloud‑native vehicle engineering, but it also surfaces technical contradictions and operational risks that OEMs and Tier‑1 suppliers must validate before they change validation pipelines at scale.

[Image: Blue neon diagram of an autonomous car with perception sensors, ADAS modules, and a cloud hypervisor.]

Background​

Software‑defined vehicles (SDVs) are transforming automotive development: functionality that used to live in isolated ECUs is migrating into consolidated, zonal SoCs and cloud‑driven services, multiplying software complexity and increasing the need for system‑level validation earlier in the lifecycle. Cloud‑based simulation, large‑scale scenario libraries, and digital twins are the practical tools enabling a shift‑left approach — moving integration, safety analysis, and validation from late prototype stages into earlier software development sprints.
Microsoft, AMD, and Siemens position their joint offering as an answer to that need: Azure’s NVads V710 v5‑series VMs provide AMD EPYC host CPUs and Radeon PRO V710 accelerators; AMD supplies a Virtualized Automotive Stack (VAS) — a middleware layer that bundles VirtIO device models and a Xen hypervisor intended to run as a nested guest on top of Azure’s Hyper‑V hosts; and Siemens brings PAVE360 as the system‑of‑systems digital‑twin engine for large‑scale simulation. Together, the three position the stack as a cloud environment for verification of mixed‑criticality stacks such as ADAS perception, infotainment, and instrument clusters.

What the announcement actually says (clear summary)​

  • AMD has introduced the Virtualized Automotive Stack (VAS) and says it is available on Azure NVads V710 v5‑series instances powered by AMD Radeon PRO V710 GPUs and AMD EPYC CPUs. AMD describes VAS as a middleware stack that exposes VirtIO devices and runs a Xen nested hypervisor on top of Hyper‑V to host mixed‑criticality guest partitions.
  • Siemens has extended PAVE360 to run on AMD Radeon PRO V710 GPUs and EPYC CPUs in Azure, explicitly promoting system‑level digital‑twin validation for SDV use cases. Siemens frames the move as increasing cloud platform choices for customers who want to scale scenario simulations.
  • Microsoft’s documentation and product pages describe the NVads V710 v5‑series VM family (CPUs, vCPU counts, memory ranges, fractional GPU partitioning) and position these VMs for GPU‑accelerated visualization, VDI, and small‑to‑medium AI inference. Microsoft’s official VM documentation further lists supported features — and crucially enumerates feature limitations for these GPU‑enabled VM sizes.
These vendor statements are factual: AMD’s blog and Siemens’ press release each make the integration claims publicly, and Microsoft lists the NVads V710 v5‑series as an Azure SKU family. However, there are specific, verifiable technical claims in AMD’s narrative that conflict with Microsoft’s documented VM feature support — most notably the claim that nested virtualization is supported on GPU instances — and those contradictions must be resolved before production use.

Technical anatomy: VAS, NVads V710 v5, Xen, VirtIO, and PAVE360​

What is AMD’s Virtualized Automotive Stack (VAS)?​

VAS is described as a middleware suite that provides:
  • VirtIO‑based virtual device models to efficiently expose I/O to nested guests.
  • A Xen hypervisor running as a guest (nested hypervisor) to host partitioned workloads that emulate zonal or consolidated SoC domains.
  • Integration points to Azure host services (for instance, storage and networking) so that nested guests can be orchestrated at scale for simulation and validation.
AMD’s pitch is straightforward: nested Xen guests give automotive developers the logical isolation needed to run safety‑critical code alongside non‑safety workloads (infotainment, instrument clusters, visualization) on the same top‑level VM while preserving separation similar to zonal consolidation in real hardware.
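
To make the partitioning concept concrete, here is a minimal sketch (not AMD's VAS tooling) that emits xl‑style Xen guest configurations for a few hypothetical mixed‑criticality partitions. The domain names, sizes, kernel path, and device strings are illustrative assumptions; a real VAS deployment would use whatever configuration interface AMD ships.

```python
# Minimal sketch: generate xl-style Xen guest configs for hypothetical
# mixed-criticality partitions. Names, sizes, and device strings are
# illustrative assumptions, not AMD VAS definitions.
from pathlib import Path

PARTITIONS = [
    # (domain name, vCPUs, memory MiB, description)
    ("adas_perception", 8, 8192, "safety-relevant perception workload"),
    ("instrument_cluster", 2, 2048, "cluster rendering"),
    ("infotainment", 4, 4096, "non-safety IVI stack"),
]

def render_xl_config(name: str, vcpus: int, memory_mib: int, desc: str) -> str:
    """Render a plain xl-style guest config for one partition."""
    return "\n".join([
        f"# {desc}",
        f'name = "{name}"',
        f"vcpus = {vcpus}",
        f"memory = {memory_mib}",
        'kernel = "/boot/guest-vmlinuz"          # placeholder guest kernel',
        f'disk = ["phy:/dev/vg0/{name},xvda,w"]',
        'vif = ["bridge=xenbr0"]                 # paravirtual (VirtIO-style) NIC',
        "",
    ])

def main() -> None:
    out_dir = Path("xl-configs")
    out_dir.mkdir(exist_ok=True)
    for name, vcpus, mem, desc in PARTITIONS:
        (out_dir / f"{name}.cfg").write_text(render_xl_config(name, vcpus, mem, desc))
        print(f"wrote {out_dir / (name + '.cfg')}")

if __name__ == "__main__":
    main()
```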

NVads V710 v5 series: host and VM sizing​

Microsoft’s NVads V710 v5 documentation lists the host and VM characteristics that matter for SDV workloads:
  • Host CPU: AMD EPYC 9V64 F (Genoa family), with VM size options from 4 vCPUs to 28 vCPUs and memory ranges 16 GiB to 160 GiB depending on the SKU.
  • GPU: AMD Radeon PRO V710 accelerator offered as fractional GPU partitions (1/6, 1/3, 1/2, or full GPU) with exposed frame buffer sizes ranging from 4 GiB to 24 GiB for the VM family. The NVads V710 v5 family is explicitly sized for graphics, visualization, and small‑to‑medium inference workloads.
Note the important distinction: Microsoft’s VM documentation states the exposed frame buffer per VM in that family tops out at 24 GiB for a full GPU allocation, whereas AMD’s public product page and independent GPU databases report the Radeon PRO V710 as a physical card with 28 GB of GDDR6 memory. This discrepancy is real and expected in cloud environments (physical resource partitioning, driver mappings, reservation for hypervisor overhead), but it must be validated against real workload behavior to avoid memory‑sizing surprises.

Siemens PAVE360: the digital twin piece​

Siemens’ PAVE360 is a vehicle‑level systems‑of‑systems digital twin environment that simulates perception, vehicle dynamics, electronics, and interactions between subsystems. Siemens explicitly confirms PAVE360 runs on Azure NVads instances using AMD GPUs and EPYC CPUs, enabling large scenario libraries to be executed at scale for system‑level validation. For automakers, this is the component that converts cloud GPU compute into actionable validation runs and safety‑case evidence.

The central technical contradiction: nested virtualization​

AMD’s VAS announcement claims that Xen can be run as a nested guest on top of Azure’s Hyper‑V hosts, enabling nested virtualization on GPU instances. That capability is central to AMD’s value proposition because nested guests are used to emulate consolidated SoC partitions and to separate mixed‑criticality domains during system testing.
Microsoft’s NVads V710 v5 documentation, however, explicitly lists “Nested Virtualization: Not Supported” for the NVads V710 v5 series. Microsoft’s broader guidance and community Q&A threads also show long‑standing limitations around exposing the full virtualization instruction set, device passthrough semantics, and GPU virtualization features required for third‑party hypervisors to operate inside Azure GPU VMs. That means running Xen as a nested guest with full GPU access is not a generally supported configuration in the public Azure VM family documentation.
This is not an academic point: if nested virtualization is only available through a private preview, a partner arrangement, or a Microsoft‑AMD joint special configuration (and not as a standard, GA, fully supported SKU), programs that rely on it will have to accept limited SLAs, potential feature instability, and unclear long‑term support paths. Engineering teams should not assume “works in AMD’s demo” equals “supported for mass validation runs.”
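
A cheap first check before committing to a pilot is to verify, from inside a candidate VM, whether the AMD‑V (svm) instruction set is even exposed to the guest, which is a prerequisite for running Xen or KVM with hardware assist. The Linux‑only probe below is a heuristic sketch, not an official Microsoft or AMD support test.

```python
# Quick Linux-only probe: is hardware virtualization exposed inside this VM?
# A missing 'svm' flag (AMD-V) means Xen/KVM cannot run nested guests with
# hardware assist. Heuristic sketch only, not an official support check.
from pathlib import Path

def cpu_flags() -> set[str]:
    """Collect CPU feature flags from /proc/cpuinfo."""
    flags: set[str] = set()
    for line in Path("/proc/cpuinfo").read_text().splitlines():
        if line.startswith("flags"):
            flags.update(line.split(":", 1)[1].split())
    return flags

def main() -> None:
    flags = cpu_flags()
    svm = "svm" in flags              # AMD-V virtualization extensions
    kvm = Path("/dev/kvm").exists()   # kernel exposes a usable KVM device
    print(f"AMD-V (svm) flag visible : {svm}")
    print(f"/dev/kvm present         : {kvm}")
    if not svm:
        print("Nested hypervisors (Xen/KVM) will not have hardware assist here.")

if __name__ == "__main__":
    main()
```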

GPU memory and partitioning: 28 GB physical vs 24 GiB exposed​

  • Independent hardware databases and AMD’s product listings show the Radeon PRO V710 physical card commonly listed with 28 GB of GDDR6 memory.
  • Microsoft’s NVads V710 v5 documentation (the VM family that exposes V710 GPUs to Azure tenants) documents an exposed maximum frame buffer of 24 GiB for a full GPU VM allocation, with fractional GPU options down to 4 GiB for 1/6 GPU partitions. That means cloud guests may see slightly less GPU memory than the raw card specification, which is common where cloud providers reserve some resources for hypervisor and driver stack overhead or map cards into partitions.
Why this matters: large perception models, high‑resolution sensor simulation, and photorealistic rendering can be GPU‑memory bound. Mismatches between physical card specs and the memory available to VMs can cause out‑of‑memory errors or force workloads to run with smaller batch sizes, increasing run time and cost per simulation. Test runs should explicitly measure usable GPU memory through realistic workloads rather than relying on vendor‑quoted card specs alone.
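
One practical way to establish the usable ceiling is to allocate device memory in fixed‑size chunks until allocation fails. The sketch below assumes a ROCm build of PyTorch on the VM (AMD GPUs are addressed through the torch.cuda API in that build); the chunk size and device index are arbitrary choices.

```python
# Empirically measure allocatable GPU memory on the VM.
# Assumes a ROCm build of PyTorch; AMD GPUs are addressed via torch.cuda.
import torch

CHUNK_MIB = 256  # allocation granularity; arbitrary choice

def probe_allocatable_mib(device: str = "cuda:0") -> int:
    """Allocate CHUNK_MIB tensors until out-of-memory, return total MiB held."""
    chunks, allocated = [], 0
    try:
        while True:
            # uint8 tensor of CHUNK_MIB mebibytes
            chunks.append(torch.empty(CHUNK_MIB * 1024 * 1024,
                                      dtype=torch.uint8, device=device))
            allocated += CHUNK_MIB
    except RuntimeError:
        pass  # out-of-memory surfaces as RuntimeError/OutOfMemoryError
    finally:
        del chunks
        torch.cuda.empty_cache()
    return allocated

if __name__ == "__main__":
    props = torch.cuda.get_device_properties(0)
    print(f"Device: {props.name}, reported total: {props.total_memory / 2**30:.1f} GiB")
    print(f"Allocatable in practice: {probe_allocatable_mib() / 1024:.1f} GiB")
```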

Practical benefits for vehicle development teams​

When validated, this combined stack offers several concrete benefits that align with the “shift‑left” promise:
  • Scale: run thousands of virtual scenarios in parallel to uncover corner cases earlier, improving safety evidence and reducing late‑stage recalls. Siemens frames PAVE360 as the tool that enables thousands of scenario permutations. (A minimal fan‑out sketch follows this list.)
  • Faster iteration: cloud VMs let multiple teams iterate against the same system‑level digital twin, reducing hardware bottlenecks and accelerating integration cycles.
  • Cost efficiency: for heavy, bursty simulation workloads, cloud GPU instances can be more cost‑effective than maintaining on‑prem GPU farms, provided teams optimize for spot/scale pricing and minimize idle allocations.
  • Mixed‑criticality testing: if nested virtualization with strong isolation is genuinely supported, teams can validate consolidated SoC behavior and mixed‑criticality interactions earlier than constructing expensive validation hardware. AMD and Siemens explicitly tout this use case.
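
As referenced in the scale bullet above, the fan‑out pattern itself is simple: run many independent scenario jobs in parallel and collect failures for triage. In the sketch below, run_scenario is a placeholder stub; in a real pipeline it would submit a PAVE360 job to a VM pool or batch service.

```python
# Minimal scenario fan-out sketch: run many independent scenario jobs in
# parallel and collect pass/fail results. run_scenario() is a stub; a real
# pipeline would dispatch a simulation job to cloud VMs or a batch service.
import random
from concurrent.futures import ProcessPoolExecutor, as_completed

def run_scenario(scenario_id: int) -> tuple[int, bool]:
    """Stub for one scenario execution; replace with a real simulation call."""
    random.seed(scenario_id)
    passed = random.random() > 0.02  # pretend ~2% of scenarios expose a defect
    return scenario_id, passed

def sweep(n_scenarios: int, workers: int = 8) -> list[int]:
    failures = []
    with ProcessPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(run_scenario, i) for i in range(n_scenarios)]
        for fut in as_completed(futures):
            scenario_id, passed = fut.result()
            if not passed:
                failures.append(scenario_id)
    return sorted(failures)

if __name__ == "__main__":
    failing = sweep(1000)
    print(f"{len(failing)} of 1000 scenarios flagged for triage: {failing[:10]} ...")
```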

Risks, limitations, and areas that require verification​

  • Unsupported nested virtualization on public GPU VMs. Microsoft’s documented feature matrix lists nested virtualization as not supported for NVads V710 v5. If AMD’s demos rely on private/partner SKUs or preview features, those configurations may not be available to all customers or covered by typical SLAs. Treat AMD’s nested virtualization claim as a conditional capability that requires explicit validation with Microsoft.
  • Resource visibility and GPU memory fragmentation. The difference between the physical GPU memory (28 GB) and the VM‑exposed buffer (24 GiB) can affect workload sizing, especially for large‑model inference or high‑fidelity sensor simulation. Run representative workloads to establish safe memory limits and margin.
  • Isolation vs. safety certification. Nested hypervisors add layers of complexity and attack surface. Isolation provided by hypervisors is useful, but it is not a substitute for the formal evidence required by automotive functional safety (ISO 26262) and cybersecurity (ISO/SAE 21434). OEMs must map any cloud‑based validation process to their safety lifecycle and ensure traceability and auditability of results.
  • Operational support and SLAs. If the capability depends on preview or partner‑only SKUs, expect limited support windows, potential API/feature churn, and integration work that could delay program timelines.
  • Determinism and timing. Cloud VMs introduce variability in I/O latency and execution timing that can make reproducible timing analyses (for control‑loop validation) difficult. Hybrid architectures that keep deterministic control loops at the edge but use cloud resources for scenario generation and perception validation are safer for approval cases. A coarse jitter‑measurement sketch follows this list.
  • Data governance and privacy. Large validation runs may involve sensitive recorded sensor logs or driver data. Data residency, encryption at rest/in transit, and access control must be enforced as part of any cloud‑based validation pipeline.
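
The determinism concern can be given a first‑order, empirical check by measuring how far a nominally periodic loop drifts inside a candidate VM. The sketch below uses an arbitrary period and sample count and is a coarse indicator only, not a substitute for formal worst‑case timing analysis.

```python
# Coarse scheduling-jitter probe: how far does a nominally 10 ms periodic
# loop drift inside this VM? Illustrative only; not a WCET analysis.
import statistics
import time

PERIOD_S = 0.010   # target loop period (10 ms), arbitrary
SAMPLES = 2000     # number of iterations to measure

def measure_lateness_us() -> list[float]:
    errors = []
    next_deadline = time.perf_counter() + PERIOD_S
    for _ in range(SAMPLES):
        # Sleep until the next deadline, then record how late we woke up.
        time.sleep(max(0.0, next_deadline - time.perf_counter()))
        errors.append((time.perf_counter() - next_deadline) * 1e6)
        next_deadline += PERIOD_S
    return errors

if __name__ == "__main__":
    e = sorted(measure_lateness_us())
    print(f"mean lateness: {statistics.mean(e):8.1f} us")
    print(f"p99 lateness : {e[int(0.99 * len(e))]:8.1f} us")
    print(f"max lateness : {e[-1]:8.1f} us")
```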

Implementation checklist: how to evaluate this stack in a real program​

  • Confirm SKU and support: obtain written clarification from Microsoft on whether nested virtualization on NVads V710 v5 is supported for your subscription/region or whether a partner/preview agreement is required. Never assume public demos equate to GA support.
  • Validate nested Xen behavior: run a controlled test lab that boots Xen as a nested guest and runs representative mixed‑criticality workloads, verifying device assignment, interrupt latency, and isolation boundaries.
  • Measure GPU effective memory: deploy target perception and rendering workloads and observe maximum allocatable GPU memory; do not assume host card specs equal exposed VM memory.
  • Performance profiling: measure end‑to‑end simulation throughput, inferencing latency, and scenario execution costs to compute cost per regression test.
  • Security assessment: perform penetration testing and threat modeling on the stack — hypervisor layers, paravirtualization drivers (VirtIO), and cloud orchestration endpoints. Ensure test artifacts and logs are tamper‑evident.
  • Safety mapping: produce traceability matrices showing how cloud simulation outputs feed into ISO 26262 safety cases; identify gaps where cloud‑derived evidence may not meet regulator expectations.
  • Operational runbooks: define how to reproduce simulation environments (images, driver versions, ROCm/driver pins), how to manage software bills of materials (SBOMs), and how to maintain reproducible baselines over months/years; a minimal manifest‑capture sketch follows this checklist.
  • Cost governance: set budgets, reservation strategies, and automated job orchestration to maximize utilization and minimize waste.
  • Hybrid architecture validation: decide what remains on‑vehicle for deterministic tasks and what shifts to cloud; create integration tests that validate behavior across that hybrid boundary.
  • Legal/compliance review: confirm data residency, contractual SLA coverage, and export controls for any models or datasets used in validation.
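
As a concrete starting point for the runbook item above, the following sketch captures kernel, GPU‑driver, and Python‑package versions into a JSON manifest that can be archived alongside each simulation run. The commands and fields are assumptions (for example, rocm-smi options may differ across driver releases); this is not a VAS or PAVE360 artifact format.

```python
# Capture a reproducibility manifest for a simulation environment.
# Fields and commands are illustrative assumptions, not a vendor format.
import json
import platform
import subprocess
from datetime import datetime, timezone

def run(cmd: list[str]) -> str:
    """Run a command and return stdout, or a note if unavailable."""
    try:
        return subprocess.run(cmd, capture_output=True, text=True,
                              check=True).stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        return "unavailable"

manifest = {
    "captured_at": datetime.now(timezone.utc).isoformat(),
    "kernel": platform.release(),
    "os": platform.platform(),
    # rocm-smi is the ROCm management CLI; adjust the flag if your image differs.
    "gpu_driver": run(["rocm-smi", "--showdriverversion"]),
    "python_packages": run(["pip", "freeze"]).splitlines(),
}

with open("environment-manifest.json", "w") as fh:
    json.dump(manifest, fh, indent=2)
print("wrote environment-manifest.json")
```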

Sizing examples and practical guidance​

Microsoft lists the NVads V710 v5 series sizes as a straightforward ladder of SKUs:
  • Standard_NV4ads_V710_v5: 4 vCPU, 16 GiB RAM, 1/6 GPU (4 GiB frame buffer)
  • Standard_NV8ads_V710_v5: 8 vCPU, 32 GiB RAM
  • Standard_NV12ads_V710_v5: 12 vCPU, 64 GiB RAM
  • Standard_NV24ads_V710_v5: 24 vCPU, 128 GiB RAM
  • Standard_NV28ads_V710_v5: 28 vCPU, 160 GiB RAM (full GPU, 24 GiB frame buffer exposed)
Sizing guidance:
  • For bulk scenario generation where each scenario is relatively lightweight on GPU memory, fractional GPU partitions (1/6, 1/3) will increase parallelism and lower cost per run.
  • For high‑fidelity sensor simulation or large model inference, start tests on the full GPU SKU and validate peak GPU memory usage and GPU compute utilization. If your workload needs more than 24 GiB of Azure‑exposed memory, it will not fit without model optimizations or multi‑GPU partitioning strategies.
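
One way to operationalize that sizing guidance is a small helper that maps a measured peak GPU working set to the smallest partition that fits with headroom. The frame‑buffer figures for the 1/3 and 1/2 partitions below are pro‑rata assumptions and should be checked against Microsoft's published size table before use.

```python
# Pick the smallest NVads V710 v5 partition whose exposed frame buffer fits a
# measured peak GPU working set. Intermediate frame-buffer values are
# pro-rata assumptions -- verify against Microsoft's size documentation.
SKUS = [
    # (name, GPU fraction, assumed exposed frame buffer in GiB)
    ("Standard_NV4ads_V710_v5", "1/6 GPU", 4.0),
    ("Standard_NV8ads_V710_v5", "1/3 GPU", 8.0),    # assumed pro-rata
    ("Standard_NV12ads_V710_v5", "1/2 GPU", 12.0),  # assumed pro-rata
    ("Standard_NV28ads_V710_v5", "full GPU", 24.0),
]

def smallest_fitting_sku(peak_gib: float, headroom: float = 0.15):
    """Return the first SKU with (1 - headroom) * frame buffer >= peak usage."""
    for name, fraction, fb_gib in SKUS:
        if peak_gib <= fb_gib * (1.0 - headroom):
            return name, fraction, fb_gib
    return None  # needs model optimization or multi-GPU strategies

if __name__ == "__main__":
    for peak in (3.0, 9.5, 18.0, 25.0):
        fit = smallest_fitting_sku(peak)
        print(f"peak {peak:5.1f} GiB ->", fit or "does not fit a single V710 partition")
```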

Best practices and recommended architecture​

  • Embrace a hybrid model: use the cloud for compute‑heavy, non‑real‑time validation (mass scenario sweeps, perception training, and system‑level regressions) while retaining low‑latency control loops and safety‑critical determinism in edge or in‑vehicle compute. This reduces regulatory and timing risks while allowing cloud scale where it adds value.
  • Lock driver and hypervisor versions: for reproducibility, pin ROCm/driver stacks, guest kernels, Xen versions, and VAS releases. Maintain versioned artifacts and SBOMs for auditability.
  • Treat nested virtualization as a high‑risk capability until explicitly certified: require Microsoft and AMD to provide a formal support matrix and a documented security posture for any nested hypervisor configuration you plan to use in production validation.
  • Automate test harnesses with immutability in mind: use image‑based provisioning, containerized orchestration for test drivers where practical, and immutable logging/archives for traceability of simulation runs.
  • Engage legal, safety, and security teams early: ensure cloud‑based validation artifacts are admissible in safety cases and meet the cybersecurity standards your OEM requires.

Critical analysis: strengths, limitations, and where the partnership succeeds or falls short​

Strengths
  • Scale and capability: pairing PAVE360 with Azure GPU instances enables the sort of large‑scale scenario testing that is otherwise infeasible on local hardware. When used correctly, this reduces late discovery of integration defects and accelerates time‑to‑market.
  • Ecosystem alignment: having the SoC vendor (AMD), cloud provider (Microsoft), and simulation tool vendor (Siemens) align product messaging around SDV validation is a practical step toward standardization and can reduce integration friction for early adopters.
Limitations and risks
  • Supported feature mismatch: AMD’s nested virtualization claim directly conflicts with Microsoft’s public VM feature matrix. Until Microsoft clarifies support for nested hypervisors on GPU instances (or until AMD documents partner‑only preview paths), engineering teams must treat nested virtualization as an unverified capability.
  • Operational maturity: cloud‑based test validation for SDVs requires mature orchestration, reproducibility, and traceability. Vendors’ demos do not guarantee the operational discipline teams need to submit safety artifacts to regulators.
  • Security and safety posture: additional hypervisor layers and cloud orchestration enlarge the attack surface and complicate certification. Vendors’ claims about isolation are meaningful, but they are not a substitute for independent audits, formal verification, or penetration testing.

Final recommendations for engineering leaders​

  • Treat AMD’s VAS + Azure + PAVE360 as a powerful enabler for shift‑left SDV validation — but require hard evidence: written support commitments from Microsoft about nested virtualization, measurable test results for GPU memory exposure, and a security and safety assessment signed off by your safety office.
  • Begin with a focused pilot: validate nested hypervisor behavior (if needed), run a set of canonical PAVE360 scenarios, measure costs and throughput, and produce a reproducibility report that your safety and compliance teams can review.
  • Use conservative architecture assumptions: keep safety‑critical, low‑latency control logic at the edge; use cloud resources for perception, model training, visualization, and system‑level scenario sweeps.
  • Negotiate SLAs and support: if nested virtualization or special GPU partitioning is essential, negotiate explicit support and long‑term commitments from the cloud provider and hardware partner.
  • Instrument everything: collect deterministic logs, versioned images, SBOMs, and cryptographic attestations of simulation runs to create defensible evidence for safety cases.
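
To illustrate the instrumentation point, the sketch below hashes every artifact in a run directory into a simple attestation record. The layout is purely illustrative; real programs may prefer signed SBOMs or formal attestation formats.

```python
# Hash every artifact of a simulation run into a small attestation record.
# Layout is illustrative; real programs may use signed SBOM/attestation formats.
import hashlib
import json
import sys
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    h = hashlib.sha256()
    with path.open("rb") as fh:
        for block in iter(lambda: fh.read(1 << 20), b""):
            h.update(block)
    return h.hexdigest()

def attest(run_dir: str) -> dict:
    """Build a hash manifest for every file under run_dir."""
    root = Path(run_dir)
    return {
        "run_dir": str(root),
        "artifacts": {str(p.relative_to(root)): sha256_of(p)
                      for p in sorted(root.rglob("*")) if p.is_file()},
    }

if __name__ == "__main__":
    record = attest(sys.argv[1] if len(sys.argv) > 1 else ".")
    Path("attestation.json").write_text(json.dumps(record, indent=2))
    print(f"hashed {len(record['artifacts'])} artifacts into attestation.json")
```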

Cloud‑scale digital twins and GPU‑accelerated VM families open new possibilities for earlier, broader validation of software‑defined vehicles — the combination of AMD’s VAS, Microsoft’s NVads V710 v5 instances, and Siemens’ PAVE360 can materially reduce integration risk and speed development cadence when implemented carefully. However, the most valuable single action for any engineering team today is to treat the announcement as the start of a technical due‑diligence checklist: verify nested virtualization support with Microsoft, test effective GPU memory and driver behavior under realistic loads, and require formal safety and security validation before migrating critical validation artifacts into production pipelines. The promise is real; the path to safely realizing it depends on rigorous verification and tightly managed operational practices.

Source: EE Times Asia https://www.eetasia.com/embeddedblo...-left-sdv-development-with-microsoft-siemens/
 
