MTN EVA 3.0: Telco Lakehouse on Azure Databricks Scales to Billions of Records

Neon-blue infographic showcasing EVA 3.0: a cloud-native telco lakehouse on Azure Databricks.
MTN Group has completed a major re-engineering of its Enterprise Value Analytics (EVA) platform in South Africa, migrating the stack to Microsoft Azure and unveiling EVA 3.0, a cloud-native lakehouse built on Azure Databricks. The company says the platform now processes roughly 22 billion records per day, runs more than 800 analytics workflows and ingests about 1,700 data feeds, and that it is protected by Microsoft Defender.

Background

MTN’s migration to Azure is the latest and most visible outcome of a multi-year cloud strategy that the operator has been executing with Microsoft under a programme commonly referred to as Project Nephos, supported by a centralised Cloud Centre of Excellence (CCoE). MTN and Microsoft have positioned the move as a flagship implementation for telco analytics at hyperscale and as a repeatable blueprint for other MTN markets.
From a technology viewpoint, the public narrative and technical summaries point to a classic lakehouse pattern: Azure Databricks as the analytics compute fabric, Delta Lake on top of Azure Data Lake Storage (ADLS Gen2) for durability and incremental processing, and Azure-native security and governance tooling such as Microsoft Defender, Microsoft Entra ID (formerly Azure Active Directory), and cataloguing and lineage services. These choices are consistent with common industry practice for converging streaming and batch telco telemetry into a single governed platform.

What MTN says EVA 3.0 delivers

  • Greater agility to analyse telemetry and business data faster, enabling earlier detection of service degradations and shorter mean time to repair (MTTR).
  • Improved customer experience through faster insight-to-action cycles that help design more relevant offers and personalised interventions.
  • A hardened security posture built on Azure tooling and Microsoft Defender, with an explicit aim to support responsible AI practices for model governance and deployment.
  • A scalable reference architecture that MTN intends to template and replicate across its operating companies in Africa and the Middle East.
MTN Group Chief Information Officer Nikos Angelopoulos framed the migration as a direct enabler of reliability, relevance and responsiveness for customers, while Microsoft’s telecommunications CTO Rick Lievano highlighted AI and big‑data analytics as central to telco transformation.

Why the architecture is logical for a telco

Telco analytics workloads share three persistent demands: continuous high‑velocity ingestion, highly parallel processing for streaming and batch, and strict governance for sensitive personal and network data. The lakehouse pattern chosen for EVA 3.0 addresses those needs:
  • Scale and elasticity: Azure Databricks (managed Spark) allows autoscaling for bursty telemetry (events, outages, mass‑use incidents) without massive on‑premises overprovisioning.
  • Durability and operational correctness: Delta Lake on ADLS Gen2 supports ACID semantics and time‑travel for reliable incremental updates and controlled schema evolution.
  • Security and governance: A combination of Microsoft Entra ID (formerly Azure AD), Microsoft Defender, and catalog/lineage controls can centralise the access controls and audit trails required by telecom regulators.
This trio of ingest fabric, lakehouse storage and governed compute has become the mainstream approach for operators replatforming OSS/BSS and telemetry analytics to the cloud. Industry coverage of the EVA 3.0 rollout describes MTN’s implementation as following this pattern.
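To make the ACID and time-travel point concrete, the sketch below models a table as an append-only log of commits, each of which adds or removes data files, and rebuilds any earlier version by replaying the log. It is a toy illustration of the general idea only: the class `ToyVersionedTable`, its methods and the file names are hypothetical, and nothing here reflects MTN's pipelines or Delta Lake's actual implementation.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToyVersionedTable:
    """Toy append-only commit log (illustrative; not Delta Lake's real format)."""
    log: list[list[tuple[str, str]]] = field(default_factory=list)

    def commit(self, actions: list[tuple[str, str]]) -> int:
        """Append one all-or-nothing commit of ("add" | "remove", file) actions.

        Every action is validated before anything is written, so a bad
        commit leaves the log unchanged. Returns the new version number.
        """
        for op, _ in actions:
            if op not in ("add", "remove"):
                raise ValueError(f"unknown action: {op}")
        self.log.append(list(actions))
        return len(self.log) - 1

    def snapshot(self, version: Optional[int] = None) -> set[str]:
        """Replay the log up to `version` (latest if None) to get the live files."""
        if version is None:
            version = len(self.log) - 1
        files: set[str] = set()
        for actions in self.log[: version + 1]:
            for op, name in actions:
                if op == "add":
                    files.add(name)
                else:
                    files.discard(name)
        return files


table = ToyVersionedTable()
v0 = table.commit([("add", "cdr-0001.parquet")])
v1 = table.commit([("add", "cdr-0002.parquet")])
# Compaction-style commit: swap two small files for one in a single step.
v2 = table.commit([("remove", "cdr-0001.parquet"),
                   ("remove", "cdr-0002.parquet"),
                   ("add", "cdr-compacted.parquet")])

print(sorted(table.snapshot()))    # live files at the latest version
print(sorted(table.snapshot(v1)))  # "time travel" to an earlier version
```

Because readers only ever see whole commits, a half-finished compaction is never visible, which is the property that makes incremental updates safe to rerun.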

The scale claims: what they mean, and what to verify

MTN’s public statements assert that the platform now processes “around 22 billion records each day,” executes more than 800 analytics workflows, and consumes roughly 1,700 data feeds. These figures, repeated across multiple trade outlets and MTN’s own newsroom, are plausible given the telemetry volumes a large operator produces. They are, however, company-reported operational metrics and, as independent technical summaries note, have not been published as third-party audited benchmarks. Treat them as credible declarations of scale rather than independently verified performance data.
Why this matters operationally:
  • Operating at tens of billions of records per day implies heavy parallelism, autoscaling policies, robust back‑pressure handling, and mature SRE practices to keep latency SLAs under control.
  • Cost, observability and governance at that scale become first‑order problems: inefficient Spark jobs, uncontrolled cluster autoscaling, and unbounded storage retention will materially affect monthly cloud spend and platform reliability.
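A quick back-of-envelope conversion shows what the headline figure implies. The inputs are MTN's reported totals; the assumption that load is spread evenly across the day and across feeds is ours, and it understates real peaks, so the results are averages rather than measurements.

```python
# Back-of-envelope conversion of the reported daily totals into average rates.
# Inputs are company-reported figures; the uniform-load assumption is ours.

RECORDS_PER_DAY = 22_000_000_000   # "around 22 billion records each day"
DATA_FEEDS = 1_700                 # "roughly 1,700 data feeds"
SECONDS_PER_DAY = 24 * 60 * 60


def average_rates(records_per_day: int, feeds: int) -> dict[str, float]:
    """Return average records/second overall and records/day per feed."""
    return {
        "records_per_second": records_per_day / SECONDS_PER_DAY,
        "records_per_feed_per_day": records_per_day / feeds,
    }


rates = average_rates(RECORDS_PER_DAY, DATA_FEEDS)
print(f"{rates['records_per_second']:,.0f} records/s on average")
print(f"{rates['records_per_feed_per_day']:,.0f} records/feed/day on average")
```

That works out to roughly 255,000 records per second on average and about 12.9 million records per feed per day, before accounting for any daily peaks.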
Recommended checks for any organisation evaluating MTN’s claims:
  1. Request ingestion and pipeline telemetry (ingest rates, median and p95 latencies, job failure rates) for representative pipelines.
  2. Ask for cost attribution reports (compute hours, storage TB-months, egress) tied to the reported throughput period.
  3. If procurement is involved, seek a compact, independent technical due diligence engagement that includes load tests and security validation.
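As a minimal sketch of the first check, the snippet below summarises a batch of pipeline runs into median latency, p95 latency and failure rate. The `PipelineRun` record shape and the sample data are hypothetical, and the nearest-rank percentile is one of several common definitions, so confirm which one a vendor's report uses before comparing numbers.

```python
import math
from dataclasses import dataclass


@dataclass
class PipelineRun:
    """One execution of a pipeline (hypothetical record shape)."""
    latency_s: float   # end-to-end latency in seconds
    succeeded: bool


def percentile(values: list[float], pct: float) -> float:
    """Nearest-rank percentile: the value at rank ceil(pct% of n) in sorted order."""
    if not values:
        raise ValueError("no values")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarise(runs: list[PipelineRun]) -> dict[str, float]:
    """Latency statistics over successful runs, failure rate over all runs."""
    latencies = [r.latency_s for r in runs if r.succeeded]
    failures = sum(1 for r in runs if not r.succeeded)
    return {
        "median_latency_s": percentile(latencies, 50),
        "p95_latency_s": percentile(latencies, 95),
        "failure_rate": failures / len(runs),
    }


# Made-up sample: 19 successful runs (1 s to 19 s) plus one failure.
sample = [PipelineRun(float(s), True) for s in range(1, 20)]
sample.append(PipelineRun(0.0, False))
print(summarise(sample))
```

Latency is computed over successful runs only, so a pipeline that fails fast does not flatter the percentiles; the failure rate is reported separately for that reason.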

Operational benefits observed and expected

EVA 3.0’s design should deliver tangible, measurable benefits when executed well:
  • Faster incident detection and remediation: near‑real‑time streaming analytics correlating NOC telemetry, OSS/BSS alerts and customer signals reduces the time between fault occurrence and remediation. Business impact: lower outage time and fewer customer complaints.
  • Smarter personalization and offer design: unifying network quality signals with CRM and billing datasets enables contextual, relevant offers that can increase uptake and reduce churn.
  • Platform economics: moving to pay‑as‑you‑go compute and managed services reduces capital expenditures for on‑premises hardware while enabling faster access to new features from the hyperscaler and Databricks ecosystem.
These are the benefits MTN highlights, and the ones most telcos seek when migrating analytics to hyperscaler cloud platforms.

Security, governance and the promise of “responsible AI”

MTN explicitly links EVA 3.0 to responsible AI goals: governance controls, model registries, staging environments and policies intended to limit risk when productionising models that touch customer data. The platform’s stated use of Microsoft Defender and Azure governance services forms a baseline for workload protection, identity management, and centralised logging.
Important caveats:
  • Public announcements list Defender and Azure controls as core elements, but they do not include independent security attestations or detailed compliance mappings for each jurisdiction in which MTN operates. Independent security assessment results (penetration tests, SOC 2 / ISO artefacts, regulator audits) are typically confidential, so the public record should be treated as a high-level summary rather than detailed proof.
  • Responsible AI is not a single product: it requires sustained investment in governance workflows (bias testing, explainability tooling, drift detection), operational monitoring, and human‑in‑the‑loop controls for high‑risk actions. The announcement signals intent; operational detail remains limited in public materials.
Practical security expectations for a telco lakehouse:
  • Zero Trust identity controls (MFA, conditional access, role‑scoped permissions).
  • Data classification with pseudonymisation/tokenisation for PII and sensitive location telemetry.
  • Private peering (ExpressRoute/Private Link) for sensitive network flows and documented, auditable cross‑border transfer governance.
  • Integration of platform alerts into a mature SIEM and incident response playbooks with automated runbooks and canary testing.
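One common way to pseudonymise a subscriber identifier is a keyed hash, sketched below with HMAC-SHA256 from the Python standard library. This is an illustration of the technique only, not a description of MTN's controls: the field names, the sample record and the key are hypothetical, and a real deployment would also need key management, rotation and a policy for re-identification.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Return a stable keyed pseudonym for a subscriber identifier.

    HMAC-SHA256 keeps the mapping deterministic, so joins across datasets
    still work, while making it impractical to reverse without the key.
    """
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Placeholder key for illustration only; a real key would come from a
# managed secret store and be rotated under policy.
DEMO_KEY = b"demo-key-not-for-production"

# Hypothetical telemetry record with a made-up subscriber number.
record = {"msisdn": "27830000000", "cell_id": "C-1042", "dropped_calls": 3}
safe_record = {**record, "msisdn": pseudonymise(record["msisdn"], DEMO_KEY)}
print(safe_record)
```

A keyed hash is preferable to a plain hash here because phone numbers come from a small, guessable space; without the key, an attacker could simply hash every possible number and match the results.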

People and the Cloud Centre of Excellence

MTN emphasises significant upskilling alongside the technical migration: public materials cite between roughly 1,250 and 1,350 Azure certifications among MTN engineers in 2025, positioning the CCoE as central to replicability and sustainment. Skills and organisational design are often the decisive factors in cloud transformation success. Certifications are a useful leading indicator, but operational maturity depends on real-world experience, SRE practices, on-call rotations and institutionalised runbooks.
What to watch in people and process:
  • Depth of hands‑on experience beyond exams.
  • Cross‑functional workflows bridging network, data engineering, security and product teams.
  • Apprenticeship-like models to transfer tribal knowledge and reduce single‑person operational risks.

Strategic strengths, and important trade-offs

Strengths

  • Operational velocity: Centralised analytics and near‑real‑time insights compress detection‑to‑remediation cycles.
  • Repeatability at scale: A templatized EVA 3.0 can accelerate consistent rollouts across markets, lowering duplication of engineering effort.
  • Platform economics and innovation: Access to managed AI and analytics services enables faster model iteration and feature adoption.

Trade-offs and risks

  • Vendor lock‑in: Tight coupling to Azure managed services and Databricks accelerates delivery but increases switching costs. Adopt open formats (Delta Lake) and define export strategies to mitigate.
  • Cost governance: Massive Spark workloads and large storage footprints require strict cost telemetry, quotas and cost-aware autoscaling policies.
  • Regulatory complexity: Centralised cross‑border analytics must be paired with region‑specific residency and contractual controls.
  • Operational complexity: Hundreds of interdependent workflows at telco scale demand mature SRE, chaos testing and observability.
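The cost-governance point can be made concrete with a small guardrail that attributes spend to each workflow and flags the ones over budget. Everything in this sketch is assumed for illustration: the workflow names, usage figures, budgets and unit prices are made up and are not real Azure or Databricks rates.

```python
from dataclasses import dataclass


@dataclass
class WorkflowUsage:
    """Monthly usage for one analytics workflow (hypothetical shape)."""
    name: str
    compute_hours: float
    storage_tb_months: float
    monthly_budget: float


# Hypothetical unit prices in an arbitrary currency; not real cloud rates.
COMPUTE_RATE_PER_HOUR = 2.0
STORAGE_RATE_PER_TB_MONTH = 20.0


def monthly_cost(u: WorkflowUsage) -> float:
    """Attribute compute and storage spend to a single workflow."""
    return (u.compute_hours * COMPUTE_RATE_PER_HOUR
            + u.storage_tb_months * STORAGE_RATE_PER_TB_MONTH)


def over_budget(usages: list[WorkflowUsage]) -> list[tuple[str, float]]:
    """Return (workflow name, overspend) for workflows above budget, largest first."""
    flagged = [(u.name, monthly_cost(u) - u.monthly_budget)
               for u in usages if monthly_cost(u) > u.monthly_budget]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


usage = [
    WorkflowUsage("churn_features", compute_hours=500, storage_tb_months=10, monthly_budget=1500),
    WorkflowUsage("cell_kpi_rollup", compute_hours=1200, storage_tb_months=40, monthly_budget=3000),
    WorkflowUsage("billing_recon", compute_hours=100, storage_tb_months=5, monthly_budget=250),
]
print(over_budget(usage))
```

With hundreds of workflows, ranking by overspend rather than by absolute cost helps direct tuning effort at the jobs that have drifted furthest from plan.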

Recommendations for telcos and enterprise IT teams planning similar migrations

  1. Define measurable business outcomes first (e.g., cut mean time to detect, or MTTD, by X% and churn by Y%), then design the platform to deliver and measure them.
  2. Build a CCoE and include explicit programmes for SRE, runbooks, and rotational on‑call experience, not only certification quotas.
  3. Insist on performance telemetry and transparently validated benchmarks during procurement: ingest rates, latency percentiles, job failure distributions, storage retention and cost attribution.
  4. Adopt open formats and modular APIs to reduce future portability costs (Delta Lake for storage and documented export contracts for datasets).
  5. Bake governance and privacy controls into the platform from day one: data classification, pseudonymisation, region‑aware deployments and audited access logs.
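As an example of making the first recommendation measurable, the sketch below computes mean time to detect (MTTD) before and after a change and expresses the improvement as a percentage. The incident timestamps are invented for illustration, and a real measurement would need enough incidents, and a consistent definition of "occurred" and "detected", to be meaningful.

```python
from datetime import datetime
from statistics import mean


def mean_time_to_detect(incidents: list[tuple[datetime, datetime]]) -> float:
    """Mean minutes between fault occurrence and detection."""
    return mean((detected - occurred).total_seconds() / 60
                for occurred, detected in incidents)


def percent_reduction(baseline: float, current: float) -> float:
    """Improvement of `current` over `baseline`, as a percentage of baseline."""
    return (baseline - current) / baseline * 100


# Made-up incident timestamps: (occurred, detected).
before = [
    (datetime(2025, 1, 5, 10, 0), datetime(2025, 1, 5, 10, 40)),
    (datetime(2025, 1, 9, 14, 0), datetime(2025, 1, 9, 14, 20)),
]
after = [
    (datetime(2025, 6, 2, 8, 0), datetime(2025, 6, 2, 8, 15)),
    (datetime(2025, 6, 7, 19, 0), datetime(2025, 6, 7, 19, 9)),
]

baseline = mean_time_to_detect(before)  # mean of 40 and 20 minutes
current = mean_time_to_detect(after)    # mean of 15 and 9 minutes
print(f"MTTD reduction: {percent_reduction(baseline, current):.0f}%")
```

Fixing the baseline window before the migration starts matters more than the arithmetic: without it, any later improvement claim cannot be checked.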

Verification and claims that need independent validation

Several of the most load-bearing numeric claims in public coverage (22 billion records/day, 800+ workflows, ~1,700 feeds) are reported consistently across MTN announcements and trade press, but they are company-reported metrics and are not accompanied by third-party audited telemetry or a published engineering case study with benchmarks. Readers and procurement teams should therefore treat these as MTN’s operational metrics and request independent validation when the numbers are material to commercial or engineering decisions.
Where public detail is thin, ask for:
  • Representative ingest and processing traces for a defined window (e.g., a 24‑hour sample with ingest rate and latency percentiles).
  • Architecture diagrams that show region mapping, private connectivity (ExpressRoute), and key vault / key management locales for PII protection.
  • Security penetration test and audit summaries (redacted where needed) or attestations that map platform controls to regulatory requirements per country.

Broader implications for African telcos and cloud market dynamics

MTN’s move is consequential beyond a single company: hyperscaler investments in regional cloud capacity, the maturation of managed analytics platforms (Databricks on Azure), and large operator success stories lower the technical and perceived risk for other telcos contemplating similar transformations. If EVA 3.0 can be reliably adapted across markets with diverse regulatory regimes, it creates a reusable pattern for accelerating data‑driven telco services and AI monetisation across the continent. At the same time, the case underlines that success is as much organisational and regulatory as it is technical: people, runbooks, cost control, and compliance are what sustain any large cloud migration after the headlines.

Conclusion

MTN’s announcement of EVA 3.0 on Microsoft Azure represents a high‑profile, technologically coherent leap to a modern lakehouse architecture designed to handle telco‑scale telemetry and enable AI‑driven operations. The choice of Azure Databricks plus Azure security and governance tooling is consistent with best practices for converged streaming and batch analytics and supports the operator’s stated goals around faster incident detection, personalised services, and responsible AI.
The rollout’s headline scale metrics are impressive and plausible for a large operator, but they are company‑reported and should be validated with representative telemetry, cost and security attestations before being treated as independent facts. For telcos planning similar moves, the path forward requires more than technology selection: sustained investment in people, SRE practices, cost governance and regulatory compliance will determine whether the platform’s promise delivers measured business outcomes at scale.
Source: Extensia Ltd, “MTN Group upgrades major cloud platform”
 
