Here’s a summary of Microsoft’s new investments in European languages and culture for AI, according to the Neowin article:
Key Points:
- Preserving European Linguistic and Cultural Heritage
- Microsoft announced two major initiatives in Paris to better represent Europe’s diverse languages and cultural assets in large language models (LLMs).
- The initiative is part of Microsoft’s broader European Digital Commitments to expand AI/cloud infrastructure, strengthen data privacy, and support Europe’s digital competitiveness.
- Problem: English Dominance in AI Training Data
- Europe has over 200 languages with rich cultural legacies, but online content and AI training data are skewed towards English and American perspectives.
- For example, the Llama 3.1 model scores over 15 points lower in Greek and 25 lower in Latvian compared to English.
- Microsoft’s Response
- Microsoft will set up teams in Strasbourg, France, at its Open Innovation Centre and AI for Good Lab to develop and curate multilingual datasets on Microsoft Azure.
- The focus is on expanding training data in ten under-represented European languages, including Estonian, Alsatian, Slovak, Greek, and Maltese.
- There’s a call for proposals to source digital texts, transcripts, etc. for AI development, with grants (Azure credits + technical support) from September 1st, 2025.
- Cultural Project: Digital Replica of Notre Dame
- This autumn, Microsoft will extend its Culture AI program to create a high-fidelity digital replica of Notre Dame Cathedral in collaboration with the French Ministry of Culture and Iconem.
- Previous Culture AI projects include digital preservation of Ancient Olympia (Greece), Mount Saint-Michel (France), St Peter’s Basilica (Rome), and Normandy’s Allied landing sites.
- Microsoft’s Language Experience
- Microsoft has over 40 years of language localization experience.
- Windows supports over 90 languages (all EU official languages + regional languages like Basque, Catalan, Galician, Luxembourgish, Valencian).
- Microsoft 365 Office interfaces are available in 30+ European languages.
- Approach
- Microsoft states these steps are purely supportive, contributing open data, tools, and expertise, not proprietary assets.
To ensure Europe’s languages and culture are represented and preserved in the digital/AI era—empowering both citizens and businesses in Europe.
Source:
Neowin - Microsoft invests in European languages and culture to build smarter, more inclusive AI
Source: Neowin Microsoft invests in European languages and culture to build smarter, more inclusive AI