LT-Innovate - The Association of the Language Technology Industry: 2013

24 December 2013

LT for Hire: NLP Recruiting Gets Business-friendly

In a recent Expert System blog post on the 10 Semantic Technology Trends for 2014 identified by the Italian company, trend 4 is boldly entitled “The Programmer of the Future is a Linguist” and claims that “the role of the linguist to bridge the gap between meaning and contextual relevance will become an essential part of technology applications.”

In other words, however much we try and automate the processes of understanding what a web page, a query, a text or even just a single sentence means, we know that the expert human linguist will still be a crucial factor in programming machines to understand better.

This role, by the way, was amply brought home in early December, when the UK firm Crystal Semantics was acquired by a media tech company: most of the commentary highlighted the fact that the founder David Crystal’s team of linguists took ten years to handcraft a disambiguation and categorisation engine for English (and other language) web pages that would aid a “sense” engine to understand the gist of polysemic words. Machine learning clearly can’t do it all.

To find out more about the job market for such linguists, LT-Innovate talked to Maxim Khalilov, the founder of NLPeople, and Nick Gallimore at Natural Language Recruiter, the language technology wing of the mployability site in the UK, about job needs in the LT industry as a whole.

Maxim Khalilov started the NLPeople site in May 2012, and now publishes around 60-80 jobs for researchers and scientists in industry per month, finding demand to be fairly stable. New rounds of European Commission-funded projects tend to spark a rise in demand, and he also noted a “moderate increase of about 5-10 %” in October 2012, May 2012 and June 2013, probably due to business expansion activities among big players on the NLP market. His primary focus is in fact on the research community, so his data do not necessarily reflect the global job scene.

Nick Gallimore has been working exclusively on job openings in the LT industry under the general mployability banner for three years now, and decided to focus on the LT industry via the dedicated Natural Language Recruiter brand. He himself is passionate about language and technology and is keen to build credibility in this fast-growing space.

For him, job openings in the field fluctuate considerably from month to month. “It takes companies quite a long time to hire people (their requirements are often very different to research organisations) so it's not usually clear how ‘new’ a vacancy is.” But quoting figures that are close to those cited by NLPeople, he adds: “we see 250-350 job openings in industry each year in Europe, and a similar level each year in the US. We also believe that there is quite a lot of commercial-side hiring in the LT space that takes place ‘under the radar’.”

Which LT fields do these jobs address? Khalilov sees increased demand in the machine translation industry in Europe and in the USA, as well as more positions for speech processing experts. The latest tendency is crowdsourcing – “we regularly receive jobs submitted as a part of various language crowdsourcing projects.” And of course the data analytics/data scientist segment offers a growing number of exciting openings for NLP people.

Not surprisingly, NLP developers with hands-on implementation experience are much in demand. Solid NLP background knowledge is usually required, in some cases combined with language expertise. Language technology researchers and scientists with proven coding skills come second.

What sorts of companies are advertising for NLP expertise? NLPeople receives a lot of jobs from recruitment agencies which, in many cases, prefer not to reveal the actual employer. A second major segment covers jobs at the “IT monsters”. Although Khalilov sees great potential in other companies needing NLP expertise, he reckons that they mostly “prefer to buy solutions and focus on integration only.” Then there are the NLP-oriented start-ups that typically require a broad view of computational linguistics, data mining and machine learning technologies.

Geographically speaking, Natural Language Recruiter works worldwide, even though the initial focus is on the UK. “We have clients in the UK, US, France, Spain, Germany, the Netherlands and China as LT is a truly international space,” says Gallimore.

For Khalilov, the USA is an “absolute leader on the industrial NLP market” – especially on the West Coast and in the Greater New York area. In Europe, there are a large number of localization jobs in Ireland, “the localization Mecca of the Old World.”

A noticeable number of start-ups concentrating their efforts on the interface between NLP and machine learning appeared in 2011-2013 in France, Germany and Spain. While in Germany these new companies tend to stick close to big university centres, in France and Spain virtually 100% of them are in the Paris and Barcelona areas respectively.

Overall, then, LT jobs are on the rise. Let’s hope the LT industry harvests the benefits.

19 December 2013

LTC upgrades Deutsche Post DHL's translation management infrastructure

Deutsche Post DHL has extended its partnership with LTC, The Language Technology Centre, for a managed technology solution in corporate language services, including the LTC Worx multilingual business-management system for global companies.

Doing business in more than 220 countries and territories demands more than high quality translation and localization; Deutsche Post also needs to maintain an advanced system to manage multilingual business processes and other business functions such as resource allocation and budgeting.

“LTC has been a large scale provider of language services and technology to Deutsche Post since 2000, and we have enjoyed an excellent ongoing business relationship,” explained Dr Adriane Rinsche, Managing Director at LTC. “During this time language needs have developed to support global operations, and LTC supports Deutsche Post in managing a clear strategy for multilingual content.”

This extended partnership with Deutsche Post will bring advanced automation options and the continued hosting of the LTC Worx SaaS solution. Importantly, LTC will also host the Kilgray memoQ 2013 translation environment as part of the complete translation management solution.

“Technology solutions such as LTC Worx and memoQ have allowed the Corporate Language Service (CLS) at Deutsche Post to plan, track and report on important language projects. This supports the innovation required to achieve our goal: to enable our customers within DPDHL and our subsidiaries to act globally, overcoming language barriers,” said Doro Meyer-Veit, Head of CLS.

“The memoQ translation environment allows you to carry out many precise functions such as managing multiple packages in one, multilingual project,” said István Lengyel, Kilgray CEO. “By using LTC Worx and memoQ together, users can manage projects with complete control and this makes life a lot easier.”

LTC language services have covered many different subject areas including logistics, infrastructure, environment, law, IT, marketing and more. The LTC Worx system has enabled the management of flexible and controlled multilingual processes.

The solution facilitates the design of unique workflows and gives control over the complete end-to-end process. DPDHL use memoQ 2013 and LTC Worx to drive down operational costs with advanced applications.

The full press release.

16 December 2013

Crystal Semantics: From Disambiguation Research to Smart Online Advertising Analyst

Crystal Semantics, one of the most singular European language tech companies and an LT-Innovate Prize-winner, has changed hands again. Previously owned by the Dutch media company ad pepper Media (from 2006), it was sold last week to the media monitoring giant WPP’s digital marketing technology company 24/7 Media, which will in turn merge with the technology company Xaxis early in 2014.

Why the interest in Crystal Semantics and its 15 employees, split fairly evenly between engineers developing the technology and semantic linguists?

The UK company uses proprietary technology to read web pages for their total (fully disambiguated) meaning in real time, thereby enabling an advertising agent to know exactly what is contained on the target web page where they might wish to place their advertisements. The kind of brands that WPP and others provide monitoring services for want to ensure that they do not get their adverts placed on a page of web content that is semantically inappropriate to their brand image – i.e. containing porn, political incorrectness, tobacco, alcohol, etc.

By buying into Crystal Semantics technology, media managers can ensure an automated understanding of content, and therefore benefit from an optimised guide to placement for their clients’ adverts in the cut-throat world of brand competition.

What makes the company so singular in the European technology landscape is that the underlying technology was originally developed as a use-neutral, almost academic attempt to digitise an English dictionary with all its multifarious word meanings. An encyclopaedic subject matter taxonomy was then used to assign a web page to a category on the basis of the interacting word meanings, giving an accurate picture of its fundamental message.
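To make the idea concrete, here is a toy sketch of the general principle: pick a sense for each ambiguous word from the words around it, then let the chosen senses vote for a page category. It is not the Sense Engine itself, whose internals are proprietary; the tiny sense inventory, the category names and the function names `pick_sense` and `categorise` are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical mini sense inventory: word -> list of (category, cue words).
# A real system would rely on a full dictionary and taxonomy.
SENSES = {
    "bank": [
        ("finance", {"money", "loan", "account", "interest"}),
        ("geography", {"river", "water", "shore", "fishing"}),
    ],
    "interest": [
        ("finance", {"loan", "rate", "bank", "account"}),
        ("leisure", {"hobby", "fishing", "music"}),
    ],
}


def pick_sense(word, context):
    """Return the category of the sense whose cue words overlap most with the context."""
    senses = SENSES.get(word)
    if not senses:
        return None
    best = max(senses, key=lambda sense: len(sense[1] & context))
    # If no cue word appears in the context, the word stays ambiguous.
    if not best[1] & context:
        return None
    return best[0]


def categorise(text):
    """Assign a text to the category that most disambiguated words vote for."""
    tokens = [t.strip(".,;:!?").lower() for t in text.split()]
    context = set(tokens)
    votes = Counter()
    for token in tokens:
        category = pick_sense(token, context - {token})
        if category:
            votes[category] += 1
    return votes.most_common(1)[0][0] if votes else None


print(categorise("The bank raised the interest rate on my loan account."))
print(categorise("We went fishing on the river bank near the water."))
```

In this sketch the word "bank" lands in a finance category on the first page and a geography category on the second, purely because of the neighbouring words. An advertiser could then decide whether a page category suits the brand.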

The progenitor of this effort – David Crystal – is an eminent linguist in the UK, with a remarkable track record of linguistic inquiry, ranging from speech therapy to indexing, via stylistics, consulting for the government and linguistics education for the general public. The company website claims that the effort of developing the semantic network – called the Sense Engine – that underlies the company’s application took over ten years. It was largely carried out with the Dutch AND Publishers in the mid-1990s and represented “one of the largest language engineering projects ever undertaken.”

Under the new ownership, will this core resource, which used to be available as a service for web page analysis, continue to aid others in providing semantic disambiguation?

Tomorrow, the world of brand advertising management on the web will be genuinely global, and Crystal Semantics, now part of the Xaxis technology stack for WPP, will need to extend its semantic expertise beyond the dozen or so European languages it covers today and adapt to target languages throughout Asia and elsewhere, despite the low expectations for advertising growth according to WPP’s boss.

11 December 2013

Horizon 2020 Work Programmes 2014/15 published today!

After the final adoption yesterday, the European Commission published the Horizon 2020 Work Programmes today.

All programmes and ancillary documents such as model contracts, rules of participation etc. can be found at:

http://ec.europa.eu/research/participants/portal/desktop/en/funding/reference_docs.html

02 December 2013

Jaap van der Meer of TAUS on How We Can Reinvent Translation for a New Generation of Users

In your Translation Technology Landscape Report published in April 2013, TAUS said that translation technology is at a deeply transformative point in its evolution, and that we were heading for a convergence era. Could you sum up for us what this will mean for (Europe's) translation industry?

JVDM: This convergence – as we say in the Translation Technology Landscape report – comes from three different angles: the technology, the functional (or business) and the social. The simplest example of the technology convergence is the fusion of speech and translation technology. It is so natural that we are moving towards speech-to-speech translation. We have seen demos of that at TAUS conferences and it’s clear that the technology integration still needs more work, but I think we will see some rapid evolution on this front. The functional convergence started earlier: it’s all the different disciplines and functions in organizations looking at the relevance and value of translation. Once you have a relatively simple and somewhat automated process for translation, everyone will come to you and ask you to plug in: customer support, social media, marketing, search and so on. Translation and localisation managers are becoming very popular. Social convergence is completely in the hands of the users, the community, and the crowd. Adding self-service (machine) translate buttons to online support sites for instance can make a world of difference. What this all means for the (European) translation industry is that we have to reinvent ourselves: both the way we set up processes and use technology and the way we set up our business and pricing models. We are making a shift from a static model to a dynamic model where the customer and the user have many different options and quality levels for translation.

Data-driven translation tech has now become a core feature of the translation landscape. Are there any legal, technical or commercial questions that TAUS is interested in solving in relation to the "market" for language data?

JVDM: Yes, technically it is still very complicated to train MT engines. TAUS would like to make it simpler. In the coming year we will add new features to the TAUS Data repository to allow users to identify data that are really close to the domain or industry for which they need to customize or train an engine. We call this the Matching Scores feature that is based on semantic clustering techniques. We are also considering setting up a library of language, translation and reordering models to help fast-track and fine-tune the development and customization of MT engines. That is quite an ambitious project but with the 55 billion words in 2,200 language pairs already in the data repository we have the right basis for it. Legally, yes of course, we are all still struggling with an outdated copyright law. I hope that policy-makers in Europe will recognize this issue and address it.

Will the emergence of new "device-based" and user-pulled translation apps that you mention in your report have any serious impact on the market as we know it for commercial translation? Or will the market just keep growing exponentially due to the massive creation of content?

JVDM: Our prediction is that the ubiquitous availability of translation will only drive the demand for professional and business-to-business translation at all levels. This means that business customers will look to their vendors to supply translation at different quality levels: from real-time customized MT to personalized transcreation and hyperlocalisation. We think there will be growth at all levels. The challenge though will be to establish the references and metrics that help all of us to deliver upon expectations. At our TAUS Annual Conference in 2012 we set up a competition for innovation insiders and innovation invaders in the translation sector. I think we will see many more innovators coming into the translation sector with fresh ideas on how to differentiate the offerings.

TAUS is eight years old this year. Can you tell us how the organisation is evolving to address the changing needs of the industry?

JVDM: TAUS has evolved from a think tank to a platform for shared industry services, from ideas to execution. We set up the TAUS Data repository in 2008 and since 2010 we have been developing the Dynamic Quality Framework. Both the data sharing and the translation quality evaluation platform are good examples of general industry support services that benefit all industry stakeholders. They help the industry to mature and add credibility. TAUS provides a unique service as a neutral and objective industry platform.

25 November 2013

OECD chooses TEMIS to semantically structure its Knowledge and Information Management Processes

TEMIS, the leading provider of Semantic Content Enrichment solutions for the Enterprise, announced today that they have won a call for tender issued by the Organisation for Economic Co-operation and Development (OECD) with their award-winning Semantic Content Enrichment solution Luxid®.

"This mark of trust by OECD represents a new recognition of our ability to address challenges in international organisations. For TEMIS, this is a link between our know-how in the publishing domain and our industrial experience in information systems", said Fabien Gauthier, Sales Director, Enterprise, TEMIS.

The OECD provides its expertise, data and analysis to its 34 member governments and 100 other countries to help them support sustainable economic growth, boost employment and raise living standards. To fulfill its vision of increased relevance and global presence, the OECD has launched a Knowledge and Information Management (KIM) Program that establishes an integrated framework for managing and delivering information and improving its accessibility and presentation. The KIM framework is intended as the steward of the OECD's information lifecycle, with a universal knowledge referential at its core facilitating enhanced searching and findability, rationalized content re-use/repurposing processes, and supporting the organisation's Open Data and Linked Data initiatives.

Based on patented and award-winning Natural Language Processing technologies, Luxid® exploits off-the-shelf extractors called Skill Cartridges® to extract targeted information from unstructured content and semantically enrich it with domain-specific metadata. This enables professional publishers to efficiently package and deliver relevant information to their audience, and helps enterprises to intelligently archive, manage, analyze, discover and share increasing volumes of information.

The full press release.

15 November 2013

Deep Multilingual Semantics Pays Off for Bitext

The Spanish semantic technology company Bitext (The Bits and Text Company) has been in the news quite a bit since it won an LT-Innovate Prize in June 2012. Analysts have complimented Bitext on its decision to develop a technology that enriches and extends existing systems with deep linguistic knowledge, rather than reinventing the wheel for each application.

We caught up with Antonio S. Valderrábanos, CEO & Founder, and Enrique Torrejon, the R&D Director, to check on their progress.

International strategy

Since June 2012, there have been three major events in Bitext’s business activities. In the last quarter of 2012, they reached an agreement with Salesforce, the leader in social media monitoring, to provide multilingual sentiment analysis in the Salesforce Marketing Cloud Insights ecosystem. With Bitext's real-time, multilingual sentiment analysis, companies can understand their customers better and faster than ever before. This decision to join an existing sales channel seems typical of the company’s approach to market outreach.

They also signed distribution and partnership agreements with companies such as Actuate. This means that Bitext is steadily positioning itself as a text analytics and sentiment analysis provider for Big Data in the US market as a whole.

Here in Europe, Bitext is now working with the Spain-based telecoms company Telefonica to provide multilingual voice-of-the-customer text analytics for their international product launches.

Verticalising the technology

Bitext’s business agenda for the next three years is to build a stronger presence in the US – especially in Silicon Valley – so that it can sign partnerships with major US corporations. On the technology front, they will be focusing on “verticalising” semantic applications beyond sentiment analysis. This will involve developing text analytics for specific purposes, such as making recommendations, detecting intent to buy, optimising contact centre performance, and detecting fraud.

How about the European market? Bitext agrees that one of the company’s major assets is the extensive multilingual capabilities of their solutions & services. But there is also a certain disadvantage in this for many European language tech companies, say Bitext’s senior executives: “Paradoxically, the existence of multiple languages in Europe segments the market according to languages. This makes it difficult for language technology providers to expand to other markets if they lack these multilingual capabilities.”

What Bitext would like to see more of is better access to financing at European level, be it through business angels or investors in general. Even their competitors would probably agree with them on that point!

10 November 2013

Out of the Mouths of Babes: Young Europeans Should Learn to Code for Natural Language

Coding is becoming cool again. Western countries have counted the cost of failing to educate the next generation of ICT-smart young people. There was a drop in academic interest in “computer science” a few years ago and as a result there’s now a steady stream of government proposals for boosting information and communication technology education for the youth of European and other countries.

In September, European Commissioner Neelie Kroes and the EC Education Commissioner released some alarming figures about the state of ICT in the educational infrastructure and in terms of educational content. They detailed a list of 24 new initiatives to beef up ICT in education in Europe.

Digital jobs

In the United States they reckon that 1.4 million jobs – and 60% of sci-tech/engineering jobs of the future – will require computing skills. So if young people will have to become IT-savvy for almost any job, one way to kick-start at least some IT education at school would be to work from what children use most but know about least – namely, natural language. Using NLP as an educational problem space might offer an intelligent entry point into the great instauration of computer coding. At the same time we could help expand the community of NLP-aware coders and perhaps learn new tricks from a new generation of device-happy young innovators.

Coding cohorts

There have been courageous attempts to boot up a cohort of young coders in Europe. The Irishman James Whelton set up CoderDojo to teach kids to code outside of the standard education system.

Another much-praised venture has been the UK’s Raspberry Pi 'simple computers for kids' project, driven by Cambridge University stakeholders. In late October this year, they notched up their millionth computer manufactured in Wales – a credit-card sized device that plugs into a TV and a keyboard. The founders wanted it to be used by children all over the world to learn programming. But it may actually be more popular among older IT hobbyists, as this news item about building a speech translator suggests.

Can these efforts really rise to the challenge of fostering the kind of large-scale IT literacy being proposed in countries from the US to India? In the US, the Association for Computing Machinery is holding an Hour of Code to introduce more than 10 million students of all ages to the basics of coding - “a foundational skill for careers in the 21st century”.

Hackathons, Code-Ins & Community Building

It is heartening to see that Europe’s Apertium free/open-source machine translation community is already participating in this year’s Google Code-in. The idea is for students from all around the world to tackle small tasks (code writing, debugging, documentation, production of training material) to learn how to prepare for larger projects in the future. But it’s only a start.

Europe’s multilingual footprint poses a tremendous challenge for cross-border transactions. But there are already hackathons (e.g. the Moses Marathons for statistical machine translation) that can help open up opportunities for dedicated communities.

So let’s give more incentives to younger hackers with an interest in the world of apps and human language, and encourage them to learn about existing resources and APIs to create new ways of addressing our language needs. Multilingual communication ought to be an enjoyable, cheap and profitable challenge.

05 November 2013

[Presentation] Language Technologies and Business in the Future

A very interesting and valuable presentation by Niko Papula of Multilizer at the KITES Symposium, Helsinki, 31.10.2013.

Exalead: very recherché

With its “connect the dots” tagline, Exalead is one of Europe’s most successful search companies to emerge in the last decade or so, and is ranked fourth among the world’s most-used web search engines. Largely focusing on the strategic business of enterprise search, the company has nevertheless explored the whole search space, from desktop and web search through to multimedia and voice search in some of its Lab projects. We see three good reasons for underscoring the “lead” in Exalead which, among other awards, was an LT-Innovate prize winner in 2012.

Semantic Vision

Exalead was founded in 2000 by two young computer scientists who wanted to build no less than the best and most comprehensive search engine from scratch. They heard the call of semantic technology before many others and decided to develop their own technology over the long term, ending up with a Semantic Factory that automates the whole process of aligning and enriching heterogeneous mixes of data. They also foresaw the need for searches over multimedia content, from video to voice, which has meant considerable investment in research as well as development.

The business vision paid off: in 2010 they were acquired by Dassault Systèmes (now known as 3DS) for around €150M. Their task was to round off the 3D design software company’s product lifecycle management portfolio with a powerful search engine for the huge documentation and parts databases that underpin very large engineering projects such as airliners or nuclear power plants. But this move also gave Exalead access to 3DS’ database of over 115,000 customers.

Exalead now has a staff of 150 and a number of products and solutions that enable other industries such as healthcare, defence and finance or organisations such as contact centres to embed powerful search capabilities into very large data silos and then link them all together to discover new insights.

The Exalead Ecosystem

As The Rude Baguette has shown in an excellent recent news story (to which this post is deeply indebted), Exalead has acted as a major incubator for twenty or so successful Paris-area start-ups. They have all been launched by former Exalead engineers in the past decade. Some of these businesses operate in the video or related search space but most of these younger companies draw in some way on Exalead’s strong culture of semantic technology skills. Exalead obviously knew how to pick ambitious software engineers!

Re-Search

Today, Exalead is sufficiently resourced to continue with its own applied research agenda in a number of fields under the leadership of Chief Science Officer Gregory Grefenstette, who joined the company in 2008 after a distinguished career as an NLP researcher in the US and Europe. Exalead contributes to the open source community, develops innovative solutions to outstanding search problems, and above all has provided much of the inspiration and expertise for the €200M Quaero project.

With its Latin for “I search” name, the largely Franco-German Quaero seems very much an emanation of Exalead’s original vision of seeing the universe of content through the eye of a search engineer. Its five areas of focus include personalized, multi-device distribution of video; better targeting for advertising; and multimedia search for the Web, all using the Exalead search engine.

Some of these application fields will presumably see products, apps or solutions emerging from the Quaero consortium – especially as the R&D phase is due to end in December this year. This means that the technology development process can begin. It’s worth noting that the project’s Voxalead application has already won three awards including a META 2nd Prize for outstanding audio visual search and transcription software and services in 2011. So it will be interesting to see which Quaero outcomes Exalead itself will productise in the next few years.

04 November 2013

New Opportunities for Virtual Assistants in European Customer Service

The IT consulting firm Gartner recently predicted that Virtual Personal Assistant (VPA) usage in retailing contexts will grow more quickly in 2017 and 2018 than iPad usage did in 2010 and 2011. VPA technology can use a variety of channels – sometimes through text chat, but also using conversational interfaces on a smartphone – to connect customers with services that benefit from smart digital efficiencies.

Gartner has even claimed that in 2014 the number of speech recognition applications running on deep neural network algorithms (as touted by Google, Microsoft, Nuance and others today) will double, though most of them will probably handle personal information management tasks for individuals rather than customer services. So the question is: how can language and speech technology meet the specific challenges facing the customer-service market with respect to this evolution towards VPAs?

An interesting recent survey by UK-based Creative Virtual, a customer experience management (CEM) virtual assistant supplier, looked at the state of play and expectations for the immediate future in EMEA (Europe, the Middle East and Africa) countries and in the US for customer support challenges. The results are revealing.

Over a million inquiries a year

In the EMEA, 55% of those questioned would like to resolve customer inquiries faster. 40% want to reduce call volumes to live chat agents. And another 40% – over a third – want to step up the use of self-service channels. This presumably means installing the kind of technologies – search, natural language processing, and smart knowledge management – that can streamline the whole engagement process for customers.

To get a feel for the volume of calls or inquiries involved in customer service, and hence of the pressure on this service, 19% of the respondents referred to over one million inquiries a year (close to 3,000 a day) in their businesses. Interestingly, there appears to be a slight decrease in inquiries to call centres and via email, presumably because more people are already using web channels such as social media or chat channels (a 45% increase). And of course customer service over mobile channels grew by 42%.

More self-service, less contact centre

What this means for businesses offering these CEM services is that increases in social, mobile and live chat suggest the rise of live/real-time services over the web. This translates into an obvious opportunity for technologies such as virtual assistants, which use text today, and perhaps speech tomorrow, as an initial user search interface, or which let users engage directly with virtual avatars on service sites.

According to the survey, there has been faster “new channel” adoption in the EMEA, with 61% citing social media, and 55% seeing a rise in live chat. Unsurprisingly, 62% of the respondents said they would be planning on developing social media channels in the coming years. This would presumably involve the use of smart technology to find and sort social responses to customer problems and personalise them to a given customer.

As for the tools they plan to roll out in the next 12 months, 23% mentioned virtual assistants, 21% online communities, and 16% cited forums. In the EMEA, 77% of those questioned already use FAQs, 69% feedback forms and 52% knowledge bases as primary service tools. This suggests that the emerging interfaces – VAs and speech – could both provide a faster, smoother front end to such inquiries.

In all cases, these emerging trends in service management point to an increased use of conversation or a more “natural” interface whereby the richness of natural language will simplify life for human users, but require more advanced forms of language intelligence for tool and system suppliers.

Make sure your PVA is multilingual
When it comes to PVAs, in the EMEA 70% of those questioned already use or plan to use virtual agents on their home page or customer service pages, 40% in their call centres, followed by 30% on live chat and on smartphones. In terms of budget, respondents using PVAs said they devoted 1 to 10% of their budget to the solution. The next challenge is to make sure all these PVAs are as multilingual as is necessary in a single European marketplace.

Overall, then, a highly positive outlook for the numerous European PVA companies, from Artificial Solutions and Inbenta to Sherpa - and even Creative Virtual itself.

23 October 2013

ICT2013: are you involved in "cracking the language barrier"?

We are!

"Cracking the Language Barrier" is the title of the session at which the European Commission will present its Workprogramme with regard to language technologies for 2014-15 at the ICT 2013 event in Vilnius. The draft agenda of the session and the draft text of the Workprogramme are available on the session's webpage.

LT-Innovate will be present at ICT2013 with its own networking session entitled "Language Technologies – The cornerstone key enabling technology for the Digital Single Market".

Other Workprogramme presentations and networking sessions that you might want to get engaged in or attend are for example:

WP2014-15 presentation on big data challenges (funding opportunities for content analytics and language understanding);
WP2014-15 presentation on multimodal and natural computer interaction;
A networking booth by the EU-Bridge project on Speech Translation and its Application.

22 October 2013

[Radio Podcast] Daniel Mayer, Marketing Director of TEMIS speaks about Language Technologies (in French)

"La danse des mots" is a French radio programme broadcast on RFI (Radio France International).

In this podcast you will hear Daniel Mayer, TEMIS' Marketing Director, speaking about semantics and language technologies. He explains how to use semantics to analyse texts and improve information search.

Temis helps organizations structure, manage and leverage their unstructured information assets. The company won an LT-Innovate Award in 2012 for its platform Luxid® that identifies and extracts targeted information to semantically enrich content with domain-specific metadata.

"La danse des mots" - 09/10/2013 (26:30)

Inbenta: from FAQs to Virtual, Semantically-enabled Q&As

At the beginning of 2011, Gartner predicted that by year-end 2013 (that’s very soon now) at least 15 per cent of Fortune 1000 companies would be using a virtual assistant. The question is: will this prove to be the case?

When it comes to answering questions like that in the digital world, we’ve certainly come a long way from a process that began with the FAQ. Back in 2000, this format was a new online content category, created when websites were first being built on the brand new WWW. The idea was to anticipate customer information needs by using the sort of interactive exchange that goes back to the Socratic method and beyond. As customers typically ask questions to solve their problems, the idea was to provide typical online answers. The FAQ was a frozen simulacrum of conversational interactivity.

Almost ever since Alexander Graham Bell invented the telephone, businesses have enabled users and consumers to ask “real-time” questions over the phone to a contact centre. You often waited a long time, or punched lots of buttons until the right agent answered and talked you through a solution for your hardware problem or your bank query. This kind of customer experience service inevitably led to high staff costs, plus additional management solutions to handle agent supervision, maintain quality and prevent customer loss due to unacceptable wait times.

One of the great breakthroughs in Q&A sessions like these, therefore, has been the shift from the relative complexity of real-time calls to the apparent simplicity of customer self-service, using ‘virtual assistant’ online software solutions to extract answers to questions from existing corporate content. Gartner estimates that this self-service market was already worth a billion dollars in 2012.
Up to 2016, the market for global intelligent VAs is likely to grow by 39% a year.

One of the key players in this new virtual assistant space has been the Barcelona-based company Inbenta, one of the first European companies to offer an online customer service that can actually understand the language of the question and then find the most relevant answer. Founded in 2005, Inbenta has, unlike many of its competitors, invested deeply in a linguistically sophisticated model of language meaning that can be implemented computationally to hide the understanding process from customers and optimise the search for the right information to solve the service question at hand.
This means paying close attention to the potential ambiguities of natural language.

Under the leadership of CEO Julio Prada López, Inbenta has expanded its customer-service client portfolio to more than 90 large companies and organisations, and posted sales of over a million euros in 2012. Its self-service solution has been adapted for websites and intranets and is available in multiple languages.

Inbenta has signed a number of partnerships to expand the range of VA opportunities for clients – one of them in 2011 with CodeBaby (http://langtechnews.hivefire.com/articles/share/69971/), which provides digital characters that engage website visitors and seamlessly guide them through the online self-service experience.

As a result of this focus on very high quality language understanding technology, Inbenta won an EU Platinum Seal of e-Excellence in March 2011. And earlier this year, Inbenta was awarded an LT-Innovate Prize at the second LT-I Summit in Brussels, crowning eight years of enviable progress in semantics-driven customer self-service, whatever the device or interface involved.

16 October 2013

LTi Workshop in Brussels on 13 November: Maximising your Chances of Success in EU Projects in 2014

As from 2014, the EU's ICT Programme will be operating under a new set of rules. Instead of being a component of the Framework Programme for R&D (FP7), it will operate under "Leadership in Enabling and Industrial Technologies", itself a component of the "Horizon 2020" Programme.

LT-Innovate is organising a Workshop to allow its members and other parties interested in Language Technologies (LT) to interact with experts who will inform them about the new framework and rules, provide insights into the draft ICT Work Programme for 2014-15 and advise them on building successful project consortia.

This Workshop is a unique opportunity for LT stakeholders who are genuinely interested in EU-funded projects.

[More details & REGISTRATION]

European Day of Languages in Vilnius - Summary

The European Day of Languages is celebrated each year on 26 September. It highlights one of Europe's important cultural assets: multilingualism, which at the same time represents a barrier to seamless communication and ubiquitous access to (understandable) content. This year, Vilnius hosted the two-day conference "Unity in Diversity" with an attractive, international programme that discussed the many facets of multilingualism, amongst them: multilingualism in digital content; languages for mobility, jobs and active citizenship; and ICT for language learning.
For images and videos of the conference, please visit the Lithuanian presidency/Parliamentary dimension website that fully documents the event. [Image credit: Office of the Seimas]

Margaretha Mazura is Director of EMF Services, a consultancy in Brussels. She specialises in business models and funding opportunities for ICT projects. On a more "esoteric" level, she specialises in the history of fans (éventails, Fächer) and has an internationally recognised collection of fans. And if some time is left, she writes about them or acts as curator for fan exhibitions.

14 October 2013

Single Digital Marketplace = Single LT Marketplace

The timing is good. MarketsandMarkets recently published a new report on the global Natural Language Processing market, estimating it would be worth some $9.8B in four years’ time. Today it stands at $3.7B. This represents an expected compound annual growth rate (CAGR) of 21.1% from 2013 to 2018.
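As a rough sanity check on these figures, the arithmetic behind a compound annual growth rate can be sketched in a few lines of Python. The function names are ours, and the five-year horizon (2013 to 2018) is an assumption taken from the dates quoted above; nothing here comes from the MarketsandMarkets model itself.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate that takes start_value to end_value."""
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value, rate, years):
    """Value reached after compounding start_value at rate for years."""
    return start_value * (1 + rate) ** years


# Figures quoted in the post, in billions of US dollars.
market_today = 3.7
market_forecast = 9.8
horizon_years = 5  # assumption: 2013 to 2018

implied_rate = cagr(market_today, market_forecast, horizon_years)
projected = project(market_today, 0.211, horizon_years)

print(f"Implied CAGR over {horizon_years} years: {implied_rate:.1%}")
print(f"$3.7B compounded at 21.1% for {horizon_years} years: ${projected:.2f}B")
```

Under that assumption the implied rate comes out a little above 21%, in line with the quoted CAGR, and compounding today's figure at 21.1% lands slightly short of the $9.8B forecast.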

Whether or not we actually reach this specific degree of growth with this time line, the signs are nevertheless promising for the immediate future of language technology. For as it happens, Europe’s Research and Innovation community is meeting up in Vilnius in November to plan the Digital Agenda for Europe. And LT Innovate is committed to irrigating the valleys and plains of Europe’s communication landscape by spurring the development of innovative language technologies.

Conversational interfaces, smart content and multilingual access are destined to underwrite the human dimension of the single digital market, allowing everyone to access what the thinker Ivan Illich once called ‘tools for conviviality’ – in the etymological meaning of the word – that is, tools for living together through our devices, our languages, our businesses and our desires!

At LT-Innovate, we have already developed our own market model to estimate the size of the LT market in terms of sales and services to consumers, users and citizens, rather than focus on the market for “components” such as NLP. In our 2013 market report we estimate the global language technology market to be worth around €19.3B today and we anticipate growth to nearly €30B by 2015 – at a slightly lower growth rate of around 11%. But in many ways these estimates reinforce the positive picture for LT foreshadowed by the report mentioned.

Globally, we estimate that the speech technology market is growing by 9.7% and will be worth some €8.6B by 2015. Intelligent content should grow to €6.2B. And the more buoyant translation technology market is worth some €8.6B today and should surge to a significant €14.9B in a few years’ time.
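To see how these segment figures relate to the headline total, the sum can be checked with a short Python sketch. It assumes that all three forecasts refer to roughly the same horizon (around 2015), which the text implies but does not state outright; the variable names are ours.

```python
# Segment forecasts quoted above, in billions of euros.
# Assumption: all three figures refer to roughly the same year (around 2015).
segment_forecasts_eur_bn = {
    "speech technology": 8.6,
    "intelligent content": 6.2,
    "translation technology": 14.9,
}

total = sum(segment_forecasts_eur_bn.values())
print(f"Sum of segments: €{total:.1f}B")

# Share of the total contributed by each segment.
for name, value in segment_forecasts_eur_bn.items():
    print(f"{name}: {value / total:.0%}")
```

The three segments add up to just under €30B, which matches the "nearly €30B" global estimate given earlier, with translation technology accounting for about half of it.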

The fascinating challenge of this particular market is that any given advance in corporate or consumer NLP software development at point A will almost inevitably need to be localised (i.e. have its interface translated into a series of languages) at point B. And every piece of text content could potentially be rolled out in a spoken form and translated into all the languages of the community.

In other words, our three segments of speech, content and translation technology are handy categories. But they will be intimately interdependent in tomorrow’s single digital marketplace. That’s why we try and offer a global LT market figure for what might today look like lots of arbitrary segments.

LT-Innovate at ICT 2013 in Vilnius - join our networking session!

LT-Innovate organises a networking session at the ICT2013 event in Vilnius. Join us on 7 November 2013 at 6pm to discuss: Language Technologies: the Cornerstone Key Enabling Technology for the Digital Single Market. For more info, have a look at our agenda or send us comments and food for discussion here.

08 October 2013

TEDxZurich - Thomas Zweifel - Leading through Language

What is the difference between a good company and a great one? Language, argues Thomas Zweifel, an accomplished leadership coach. In his interactive talk, he will set out to prove to the TEDxZurich audience that communication is the most important leadership tool of all.

Dr. Thomas D. Zweifel is a Consultant for Insigniam Performance and the former CEO of Swiss Consulting Group. Since 1984, living on four continents, he has helped top and senior managers develop leadership in the action of meeting strategic objectives. Dr. Zweifel is the author of six co-leadership books, including Communicate or Die, Culture Clash, and The Rabbi and the CEO. Since 2000, he has taught leadership at Columbia University and St. Gallen University. Dr. Zweifel often appears in the media, including ABC News, Bloomberg TV, and CNN. He lives in Zurich with his wife and two daughters.

More TEDxZurich talks on: http://www.tedxzurich.com/

07 October 2013

Actuate and Bitext Announce Collaboration to Deliver Text Analytics Engines and Sentiment Analysis for Big Data through BIRT

Actuate Corporation, The BIRT Company™ delivering more insights to more people than all BI companies combined, today announced their cooperation with Bitext, in parallel with Bitext’s U.S. event in San Francisco this evening at WeWork. Bitext provides text analytics engines – inherently multilingual semantic technologies including text analytics and natural language interfaces – sporting one of the highest degrees of accuracy available today. Bitext recently announced a partnership with Salesforce.com as well.

Combined with Actuate’s BIRT iHub™ development tools and platform, or with Actuate’s BIRT Analytics™ 4.2 predictive analysis solution, Bitext provides two main advantages for the BIRT developer or end user: it delivers high precision and recall; and it lends itself easily to a development process based on continuous improvement. BIRT Analytics 4.2 and Bitext are now available as a combined solution from Actuate.
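For readers unfamiliar with the two measures, precision is the share of extracted items that are correct, and recall is the share of correct items that were actually extracted. The sketch below is a generic Python illustration with made-up entity sets; the function name and data are hypothetical and it does not reflect how Bitext or Actuate evaluate their engines.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) for a set of predicted items against a gold set."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical example: entities a human annotator marked in a text (gold)
# versus entities an extraction engine returned (predicted).
gold_entities = {"Actuate", "Bitext", "Salesforce.com", "San Francisco"}
predicted_entities = {"Actuate", "Bitext", "BIRT"}

precision, recall = precision_recall(predicted_entities, gold_entities)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case two of the three extracted entities are correct, and two of the four annotated entities were found, so the engine would look more precise than complete.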

“We are very pleased to be working together with Actuate to further enrich their leading BIRT commercial suite with Bitext text and semantic analysis power,” said Antonio Valderrabanos, CEO and Founder, Bitext. “With Bitext analyzing unstructured data words as well as meaning, and Actuate performing advanced analysis of structured as well as unstructured, we cover the world of data.”

Bitext enables entity and concept extraction, categorization, and sentiment analysis with a focus on customer-centric business areas such as marketing; customer relationship management and support; content analytics; and any line of business unit that requires advanced analytics. Examples of solutions include text analytics (entity extraction, concept extraction, and sentiment analysis), metatagging (enhanced indexing) and search (natural language interfaces). Currently available for 10 languages, Bitext enables the addition of new languages by including new data sources (dictionaries and grammatical rules).

“Our collaboration with Bitext – providers of advanced semantic solutions for social media, search, and more – extends the types of analysis that can be performed with Actuate’s commercial BIRT developer and end-user platform or solution, by adding the ability to score sentiment toward products and services,” said Josep Arroyo, VP of Analytic Solutions at Actuate. “Users of Actuate with Bitext can now tap more than just negative or positive sentiment analysis. They can also visualize anticipated risks, opportunities and threats for personalized insights, in a single display on any device.”

For a demo of BIRT Analytics 4.2, please visit Actuate’s YouTube Channel

For a demo of Bitext’s new API, please visit the Bitext website

Read the full Press Release.

03 October 2013

LT-Innovate @ TEKOM / TC WORLD - 6-8 November 2013

From 6 to 8 November 2013, LT-Innovate participates in TEKOM / TC WORLD (Wiesbaden, Germany) with its own stand hosting ABBYY Language Services, CrossLang, ESTeam, Get Localization, The Language Technology Centre, Lingenio, Palex Languages and Software and TEMIS.

TEKOM / TC WORLD is Europe's largest professional fair for technical communication.

On 6 November, several LT-Innovate speakers appear on TEKOM's programme:

08.45-09.30: Knowledge meets Language by Jochen Hummel, ESTeam, Berlin, Germany (Room 2C)

09.45-10.30: Language Technology Scenarios for the Healthcare and Life Sciences Domain by Dr. Adriane Rinsche, The Language Technology Centre, London, UK (Room 2C)

11.15-12.00: Real-time Selection of Best Assets Based on Productivity Analysis by Anton Voronov, ABBYY Language Services, Moscow, Russia (Room 2C)

13.45-14.30: A Framework for Collaborative Efforts around Industrial Uses of Terminologies – The Luxid Community by Stefan Geißler, TEMIS Deutschland GmbH, Heidelberg, Germany (Room 2C)

14.45-15.30: Developing “Ideal” Software for the Language Industry by Anna Motovilova & Julia Makoushina, Palex, Alexandria, USA (Room 2C)

16.15-17.00: Extracting Translation Relations for Human-readable Dictionaries from Bilingual Text by Kurt Eberle, Lingenio, Heidelberg, Germany (Room 2C)

17.00-17.30: Tool Presentation: Machine Translation On Demand by Nathalie De Sutter, CrossLang, Gent, Belgium (Room 2B2)

17.15-18.00: Crowdsourcing in the Localization Process by Jari Herrgård, Get Localization, Helsinki, Finland (Room 2C)

Location:
Rhein-Main-Hallen GmbH, Rheinstraße 20, 65185 Wiesbaden, Germany
The LT-Innovate stand is located in Hall 4, booth 442

[More information]

Inbenta implements online customer service using Natural Language Processing

NoMoreRack has chosen Inbenta to implement its online customer service using Natural Language Processing.

NoMoreRack is an online shopping website that offers quality brand-name apparel and accessories for 50-80% off the retail price.

Thanks to Inbenta, the self-service rate on NoMoreRack's website is now more than 70%, as users find answers to the most commonly asked questions straight on the website, using their own words. As new FAQs and content are added, the self-service rate is growing rapidly.

More information

[Radio Podcast] Jean Senellart speaking about Machine Translation Software (in French)

"La danse des mots" is a French radio programme broadcast on RFI (Radio France International). The programme is presented by Yvan Amar and is about French language in the world.

In this podcast you will hear Jean Senellart, Systran's R&D director, speaking about machine translation software and language technology. He explains how machine translation software is designed, how it is developing and where its limits lie.

This year Systran won an LT-Innovate Summit Award for its SystranLinks solution, designed to optimize and accelerate the automation of website localisation.

"La danse des mots" - 23/09/2013
(26:30)

01 October 2013

Systran - an Historic MT Pure Play in a World of Changing Language Services

The first generation of machine translation engines have been great travellers. Many of the original rule-based systems – some of them now evolved into hybrid statistical/symbolic configurations – first saw the light of day decades ago. They represented a huge investment in person-years of design, optimisation, maintenance and redesign. But no one seemed to make genuinely viable businesses out of them. Apart from Systran, the great survivor, of which more anon.

One classic example of an itinerant system is METAL, originally developed at a university in Texas in the 1970s, then acquired by Siemens in Germany in the late 1980s to drive its huge documentation localisation programme, and later sold off in parts (usually language pairs) to various smaller translation companies in Europe over the subsequent years. Today much-mutated versions of METAL are still hard at work in numerous incarnations from Spain to Germany, even though their software core has had to be largely rewritten.

Another fascinating story is the Logos system, originally developed as a very large (and beautifully-crafted) bespoke rule system in the US during the Vietnam war to translate weapons documentation into Vietnamese, and later extended to another strategic language on the weapons agenda – namely Farsi – just as regime change arrived in Iran in 1979. It continued commercially with a German-English pair into the 1990s, but since 2010 has gone open source at SourceForge (led by DFKI experts), offering a massive linguistic resource for developers but (so far) of very little commercial value.

So the outright commercial champion of the first half-century of automated translation software is Systran. With its origins in the early experiments at Georgetown (US), mostly for US intelligence end users, Systran was founded by Peter Toma in 1968 and the company has never disappeared from sight or been dissolved into a larger service supplier. Today it is the great brand name of the world of machine translation.

Like its successors, it too has travelled. The system was partially acquired by a French businessman in the late 1980s, while certain language pairs were owned and developed by translation services at the European Commission as part of a first effort to apply MT to solving the multilingual information barrier in the European Union.

Today, Systran is still innovating, and in 2013 won an LT-Innovate Summit Award for its SystranLinks solution, designed to optimize and accelerate the automation of website localisation. Prior to this it was almost certainly the first MT system to be linked up to the early Web, driving the free Babelfish translation service ever since the Internet paleolithic age of 1995. And it can even claim to be the first online translation service ever, having provided automated translations through the French Minitel videotext service back in the early 1990s.

The company has also weathered the statistical tsunami by extending its technology stack to include data-driven benefits on top of its fundamentally ‘symbolic’ architecture – i.e. one based on the properties of words and phrases rather than on the probabilities of strings of letters.

Systran also offers a useful benchmark for mapping the sector’s monetary value. It is one of the very few publicly-listed companies in its sector (SDL and other similar listed vendors operate in a galaxy of multilingual service markets, not just MT) so its financials are public. In a translation technology software market that LT-Innovate estimates to have grown by some 15.5% in 2012 to a volume of $739.2M, Systran has been posting sales in software and services of around €10M. Other software service suppliers do better.

But remember that Systran has remained independent of the much more valuable translation services segment and focused constantly on improving its core technology and value proposition as a “pure play” MT supplier. Where it has been particularly successful recently is in helping translate content for national intelligence agencies, especially in the US.

The company has run the gamut from offering a free online translation service to providing highly domain-tailored services to enterprises and industries. Systran’s forty-five years of loyal service to its clients in the MT segment constitutes a pretty rare track record. How will it innovate tomorrow?

24 September 2013

Bitext: Sentiment Analysis industry: accuracy and evaluation.

Bitext, member of LT-Innovate, is organizing a presentation in San Francisco on 2 October 2013 on "The Sentiment Analysis industry: accuracy and evaluation."

Spain-based Bitext will discuss the following questions:

How can I provide accurate results to my clients?
How does accuracy correlate with achieving business goals?
How can I evaluate or measure these results?
What kind of tool can answer all these issues: being accurate and fulfilling business goals?

More information on this event on the dedicated page.

19 September 2013

ABBYY Language Services Delivers New Cloud-based Solution for Terminology Management in Companies

Moscow, Russia (September 19, 2013) - ABBYY Language Services, a hi-tech provider of localization services and translation technologies, announces the release of the beta version of Lingvo.Pro, a new online solution for terminology management to improve translation processes at international companies. ABBYY Lingvo.Pro expands the language technologies market with a new cloud-based terminology management solution for all types of translation resources: corporate glossaries, dictionaries and translation memories.

Lingvo.Pro is an intuitive, easy-to-use terminology management system built on linguistic technologies by ABBYY Language Services. Market research by ABBYY LS showed that companies in almost all industries run into the same problems in translation: existing solutions were unable to quickly display relevant terminology and systematically guarantee proper implementation of terminology for translation and localization projects.

Lingvo.Pro overhauls and simplifies the work of translators, offering easy access to corporate terminology and ensuring that terms are consistently used across company departments. The new cloud-based solution is a convenient and efficient tool for managing the linguistic assets of a company. It facilitates the consistent use of corporate terminology and, thus, increases the quality of translations and reduces editing and revision expenses.

The core advantage of Lingvo.Pro over existing technologies is its intuitive and easy-to-use interface: no installation or training is necessary to start using Lingvo.Pro. The product offers financial and quality benefits to both mid-sized and large businesses that work with multilingual documentation.

Inclusion of all types of translation inputs guarantees the best results for corporate terminology, which is why Lingvo.Pro supports almost all formats of corporate glossaries, dictionaries and translation memories. For better performance, the cloud-based solution can be integrated with CAT software and content management systems as well.

“For anyone who creates multilingual content and manages information, having centralized, well-defined terminology is crucial for accurate translations, as international companies have found out. The most challenging part is automating the term creation and updating processes with minimal effort, and making sure that the terminology is actually used by all of the company's employees, freelance translators and translation providers,” described Ivan Smolnikov, CEO of ABBYY Language Services. “Complicated, unwieldy terminology solutions require substantial expenses for installation and training. Lingvo.Pro is designed to be different: simple and efficient.”

The solution does not require installation on a user’s computer and can be easily accessed by as many users as needed regardless of their location. It serves as a crucial component in automating the translation process and allows companies to get the most out of their existing linguistic assets.

The beta version of Lingvo.Pro is available for trial use, free of charge, until the end of the year 2013. For more information about the software and how to use it, visit www.lingvo.pro.

About ABBYY Language Services

ABBYY Language Services offers localization services and technological solutions enabling businesses to go global. The company provides localization into over 80 languages as well as helps streamline multilingual content maintenance by means of translation workflow automation and cutting-edge linguistic solutions. ABBYY Language Services provides comprehensive language support to more than 2,500 companies worldwide, including 25 companies from the Top 100 Global Brands and 35 Fortune 500 companies.

ABBYY Language Services is part of the ABBYY Group, a leading provider of document recognition, data capture, and linguistic technologies and services. Its products include the ABBYY FineReader line of optical character recognition (OCR) applications, ABBYY FlexiCapture line of data capture solutions, ABBYY Lingvo dictionary software, and development tools.

Media Contacts

Anna Sidorova/ Head of Marketing

ABBYY Language Services

Phone: +7 495 783 37 00

Fax: +7 495 783 26 63

E-mail: info@abbyy-ls.com

www.ABBYY-ls.com

17 September 2013

TEMIS Selects immixGroup as its Master Distributor for U.S. Government Accounts

immixGroup will help TEMIS bring its semantic enrichment technology to the U.S. Government.

New York, NY – September 17, 2013 – TEMIS, the leading provider of Semantic Content Enrichment solutions for the Enterprise, announced today that immixGroup will represent TEMIS for public sector supply schedules, making procurement for federal organizations simple, fast and economical. This partnership gives TEMIS channel partners access to supply schedules and enables Federal agencies to unlock unstructured content as part of President Barack Obama's directive to Federal agencies on open data.

"We are excited to open new markets for TEMIS in the Federal space. The potential for TEMIS' market in the public sector is impressive," said Art Richer, President of immixGroup. "Our unique platform of services including strategic demand creation activities and distribution services will give TEMIS partners the resources they need to grow their business, while government agencies will enjoy reliable access to TEMIS products through their preferred contracts and solution providers."

TEMIS' flagship Luxid® Content Enrichment Platform, the recipient of the SIIA's 2013 CODiE Award for Best Semantic Technology Platform, is based on patented natural language processing technology that recognizes and extracts relevant items of information hidden in plain text, such as entities, relationships, topics and categories and enriches content with domain-specific metadata. Luxid® helps organizations to efficiently structure their unstructured information assets, to package and deliver targeted, relevant information to their stakeholders, and to enable the analysis, discovery and sharing of actionable insights to optimize their business.

"As we experienced at the recent White House Data Jam on Open Data, there is a major need for semantic enrichment to unlock government content that has been hard to find, much less manage," said Guillaume Mazieres, TEMIS Executive Vice President for North America. "Our Luxid® platform has proven how unstructured content can reveal critical information, from money laundering to scientific research. In immixGroup, we have found a partner with the integrity, depth and vigor to engage the Federal market place with TEMIS."

About immixGroup, Inc.

immixGroup helps technology companies do business with the government. immixGroup's unique platform of services enables software and hardware manufacturers and their channel partners to grow their public sector business and accelerate the sales cycle. Since 1997, immixGroup has delivered the specialized resources and expertise these companies need to increase their revenue, support their demand creators, and operate efficiently. And government agencies trust immixGroup to provide leading commercial technology products through their preferred contracts and business partners.

To see the full press release