
Today a new Large Language Model has seen the light of day: Apertus (formerly known as ‘the Swiss LLM’). It was co-created by researchers at EPFL, ETH Zurich, and the Swiss National Supercomputing Centre (CSCS). It is fully open: anybody who wants to try it can do so on Hugging Face, and API and chat access via Public AI will follow soon.
Let me share why I think Apertus, a large model with 70B parameters trained on 15T tokens, matters.
Why does Apertus matter?
Transparency and openness
The model’s name (Apertus is Latin for ‘open’) is a testament to its greatest strength. It is the first major LLM with fully open training data, code, weights, and training recipes. This enables reproducible research and independent auditing, and it breaks the dependence on closed proprietary systems.
A business-friendly license
The model is released under the Apache 2.0 License, which is very permissive. Among other things, it allows commercial use (companies can incorporate Apache 2.0 code into proprietary products without releasing their modifications) and it includes an explicit patent grant.
Declaration of pretraining data
The fact that all data used for pretraining is published and that opt-outs are respected (as of January 2025) touches on all the other points. But it is so important that I list it separately anyway. Not only is Apertus the first large model to do this, it also allows for a profound discussion about the data used, about possible bias, and about its influence on quality.
Legal compliance
The project has tried to comply with Swiss law, the EU AI Act, and data protection laws such as the GDPR from the very beginning. As such, it provides a legally compliant alternative for European organizations, e.g. by respecting opt-outs and minimizing memorization. This is not only a regulatory advantage; it also reduces the risk of leaking proprietary information or PII, along with the liabilities that come with such leaks.
Democratic governance
The title is a bit unwieldy, but governance during alignment is essential: what (ethical) values does the model represent? Because it was developed in Switzerland by public academic institutions, the development was guided by academic and public interests that adhere to Swiss/European values rather than by commercial objectives and geopolitical agendas.
There are also many technical aspects (such as 40% non-English training data with equal computational cost across 1000+ languages) that I am skipping in this post.
Summary
The Apertus model represents the first serious attempt to create a fully transparent, legally compliant, and democratically governed large language model. It is an attempt to challenge the current duopoly of US and Chinese AI systems while advancing multilingual capabilities and responsible AI practices.
It is a first, but extremely important, step toward substantiating the concept of digital sovereignty with facts.