20 November 2015

The European Language Cloud, or How to Enable Multilingual Europe

Multilingualism is a core value of the European Union, as integral to Europe as the freedom of movement, the freedom of residence, and the freedom of expression. The European Charter of Fundamental Rights, which enshrines the foundational rights and freedoms protected in the EU, upholds respect for cultural, religious, and linguistic diversity as a cornerstone of European policy.

Europe’s commitment to linguistic diversity is most clearly apparent in its unwavering decision to maintain 24 official languages in the EU – no matter how large or small their speaker populations, and despite the bureaucratic hurdles in Brussels and Luxembourg – though many more regional languages are widely spoken and officially promoted across the continent.

Language barriers in the digital world

Though Europe’s multilingualism is a fundamental cultural and social value, its treasured linguistic diversity can also lead to significant communication barriers between people. Upholding “unity in diversity” remains a difficult challenge. The effects of linguistic fragmentation can be seen most clearly in maps of language use on social media sites such as Twitter, where conversations are mostly restricted to national languages and thus limited by geographic borders.

As these fascinating maps make all too clear, language barriers can hinder the free flow of information and knowledge between nations, effectively fragmenting Europe into “language silos.” This is a major obstacle to receiving and imparting information and ideas across national borders – a defining aspect of the freedom of expression.

Language barriers also represent a major obstacle to the creation of the Digital Single Market, which seeks to combine the 28 national digital markets, harmonizing regulations and uniting all 500 million citizens of the EU in a single online marketplace.

At the moment, as a Eurobarometer study shows, more than 40% of Europeans never purchase goods or services if they are not available in their native language. Language barriers therefore severely restrict access to goods for European consumers, hindering the creation of a Digital Single Market. Having more access to information in multiple languages would go a long way toward increasing the number of cross-border sales. According to the European Commission, only 15% of European consumers currently shop online in other EU countries, and only 7% of European SMEs sell cross-border. There is much room for growth.

How language technologies can help

Fortunately, there is a technological way to ease linguistic fragmentation online. Recent developments in language technologies, such as state-of-the-art machine translation and automated speech recognition, now enable us to overcome language barriers between people while allowing multilingualism to thrive in the digital world.

Thanks to language technologies, people can write, read, or speak online in their own native language, while others can access the information in a language that they understand.

The heightened application of these language technologies to the online market will not only foster communication between nations. It will also help boost the European economy by enabling more cross-border trade.

Just imagine: a digital market where absolutely all online content is instantly available in all languages of the European Union; where Internet users can interact seamlessly in real time with one another regardless of the language they are speaking or writing; and where goods and information can be searched for and accessed no matter where they were posted or in which language.

European Language Cloud

Where to begin to realize this vision? Fortunately, European excellence in language technology research and the thriving language technology industry have already laid the foundations for a viable solution. This includes recent breakthroughs in services from fields such as natural language processing, machine translation, text analytics, speech recognition, multilingual SEO/SEM, and semantic analysis.

But none of these services alone can meet the comprehensive needs of European industry and enable a truly multilingual Digital Single Market. To meet the complex needs of the market, language technology services must be accessed, combined, and leveraged into large-scale solutions, which can then be plugged directly into applications, making them fully multilingual.

This is where European policymakers can step in and help, by setting up a public language technology infrastructure – the European Language Cloud. This infrastructure would use the power of cloud technologies and combine the best that European industry and research have to offer.

A European Language Cloud would ensure easy access to key enabling language technologies for all EU languages, in the areas of natural language processing, automated translation, speech processing, and semantic analysis, among others. This would make these enabling services easily available to developers and integrators of commercial and public digital solutions.

The infrastructure should also include open access to multilingual language resources – the raw material for data-driven technologies and solutions – which all too often remain buried deep in corporate and government databases, instead of being used to build the solutions sorely needed by the marketplace.

Once a solid European Language Cloud infrastructure is in place, commercial players and public sector organizations could then use the available language technology services as building blocks, or core components, to create innovative multilingual solutions for their high-demand applications.

The role of European research and innovation

At the same time, the European Language Cloud must be continuously replenished with new services and innovations. The driver of this innovation is the cutting-edge language technology research emerging from Europe’s universities and research centers. However, several “knowledge gaps” still exist in research, and often our research doesn’t fully evolve into commercially viable applications.

Targeted actions are critically needed to address the gaps in coverage for all EU languages, and to provide novel methods to improve the quality and applicability of language technologies. In a tangible display of its respect for linguistic diversity, Europe must fill the gaps in existing knowledge and ensure that all EU languages (not just larger languages) have the same degree and quality of language technology services.

Europe must also guarantee that European excellence in research keeps up with the growing demands of global industry. This will help Europe to remain globally competitive with next-generation services and solutions.

What Europe can do to implement this vision

How to implement this vision? The European Commission already has the right instruments in place – an encouraging sign. The Connecting Europe Facility programme is taking the initial steps to create an automated translation infrastructure for Europe. Public institutions across Europe have already begun to reap the first fruits of this programme, as it extends and improves its translation technologies for European languages. But the CEF programme should be significantly expanded to include other essential language technologies as well.

Europe must also reinforce its innovative language technology research, through programmes like Horizon 2020 and other instruments. Unfortunately, language technologies are missing from the latest Work Programme for 2016-17. Not only should they return to the Work Programme for 2018-19, but they should also be given central priority to address this major challenge for Europe.

Breaking the language barrier in Europe is essential to make the EU more united in its diversity. It is crucial not only for increasing trade and commerce, but also for fostering communication and understanding between the 500 million citizens of multilingual Europe. This is needed today more than ever. We should not miss this opportunity.

Andrejs Vasiļjevs and Rihards Kalniņš, Tilde