24 December 2013

LT for Hire: NLP Recruiting Gets Business-friendly

In a recent Expert System blog post on the 10 Semantic Technology Trends for 2014 identified by the Italian company, trend 4 is boldly entitled “The Programmer of the Future is a Linguist” and claims that “the role of the linguist to bridge the gap between meaning and contextual relevance will become an essential part of technology applications.”

In other words, however much we try to automate the process of understanding what a web page, a query, a text or even just a single sentence means, the expert human linguist will still be a crucial factor in programming machines to understand better.

This role, by the way, was amply brought home in early December, when the UK firm Crystal Semantics was acquired by a media tech company: most of the commentary highlighted the fact that the founder David Crystal’s team of linguists took ten years to handcraft a disambiguation and categorisation engine for English (and other language) web pages that would aid a “sense” engine to understand the gist of polysemic words. Machine learning clearly can’t do it all.

To find out more about the job market for such linguists, and about job needs in the LT industry as a whole, LT Innovate talked to Maxim Khalilov, the founder of NLPeople, and Nick Gallimore at Natural Language Recruiter, the language technology wing of the mployability site in the UK.

Maxim Khalilov started the NLPeople site in May 2012, and now publishes around 60-80 jobs for researchers and scientists in industry per month, finding demand to be fairly stable. New rounds of European Commission-funded projects tend to spark a rise in demand, and he also noted a “moderate increase of about 5-10 %” in October 2012, May 2012 and June 2013, probably due to business expansion activities among big players on the NLP market. His primary focus is in fact on the research community, so his data do not necessarily reflect the global job scene.

Nick Gallimore has been working exclusively on job openings in the LT industry under the general mployability banner for three years now, and decided to focus on the LT industry via the dedicated Natural Language Recruiter brand. He himself is passionate about language and technology and is keen to build credibility in this fast-growing space.  

For him, job openings in the field fluctuate considerably from month to month. “It takes companies quite a long time to hire people (their requirements are often very different to research organisations) so it's not usually clear how ‘new’ a vacancy is.” But, quoting figures that are close to those cited by NLPeople, he adds: “we see 250-350 job openings in industry each year in Europe, and a similar level each year in the US. We also believe that there is quite a lot of commercial-side hiring in the LT space that takes place ‘under the radar’.”

Which LT fields do these jobs address? Khalilov sees increased demand in the machine translation industry in Europe and in the USA, as well as more positions for speech processing experts. The latest tendency is crowdsourcing – “we regularly receive jobs submitted as a part of various language crowdsourcing projects.” And of course the data analytics/data scientist segment offers a growing number of exciting openings for NLP people.

Not surprisingly, NLP developers with hands-on implementation experience are much in demand. Solid NLP background knowledge is usually required, in some cases in combination with language expertise. Language technology researchers and scientists with proven coding skills take second place.

What sorts of companies are advertising for NLP expertise? NLPeople receives a lot of jobs from recruitment agencies which, in many cases, prefer not to reveal the actual employer. A second major segment covers jobs at the “IT monsters”. Although Khalilov sees great potential in other companies needing NLP expertise, he reckons that they mostly “prefer to buy solutions and focus on integration only.” Then there are the NLP-oriented start-ups that typically require a broad grounding in computational linguistics, data mining and machine learning technologies.

Geographically speaking, Natural Language Recruiter works worldwide, even though the initial focus is on the UK. “We have clients in the UK, US, France, Spain, Germany, the Netherlands and China as LT is a truly international space,” says Gallimore.

For Khalilov, the USA is an “absolute leader on the industrial NLP market” – especially on the West Coast and in the Greater New York area. In Europe, there are a large number of localization jobs in Ireland, “the localization Mecca of the Old World.”

A noticeable number of start-ups concentrating their efforts on the interface between NLP and machine learning appeared in 2011-2013 in France, Germany and Spain. While in Germany these new companies tend to stick close to big university centres, in France and Spain virtually 100% of them are in the Paris and Barcelona areas respectively.

Overall, then, LT jobs are on the rise. Let’s hope the LT industry harvests the benefits. 

19 December 2013

LTC upgrades Deutsche Post DHL's translation management infrastructure

Deutsche Post DHL has extended its partnership with LTC, The Language Technology Centre, for a managed technology solution in corporate language services, including the LTC Worx multilingual business-management system for global companies.

Doing business in more than 220 countries and territories demands more than high quality translation and localization; Deutsche Post also needs to maintain an advanced system to manage multilingual business processes and other business functions such as resource allocation and budgeting.

“LTC has been a large-scale provider of language services and technology to Deutsche Post since 2000, and we have enjoyed an excellent ongoing business relationship,” explained Dr Adriane Rinsche, Managing Director at LTC. “During this time language needs have developed to support global operations, and LTC supports Deutsche Post in managing a clear strategy for multilingual content.”

This extended partnership with Deutsche Post will see advanced automation options, the continued hosting of the LTC Worx SaaS solution, and importantly LTC will also host the Kilgray memoQ 2013 translation environment as part of the complete translation management solution.

“Technology solutions such as LTC Worx and memoQ have allowed the Corporate Language Service (CLS) at Deutsche Post to plan, track and report on important language projects. This supports the innovation required to achieve our goal: to enable our customers within DPDHL and our subsidiaries to act globally, overcoming language barriers,” said Doro Meyer-Veit, Head of CLS.

“The memoQ translation environment allows you to carry out many precise functions such as managing multiple packages in one multilingual project,” said István Lengyel, Kilgray CEO. “By using LTC Worx and memoQ together, users can manage projects with complete control and this makes life a lot easier.”

LTC language services have covered many different subject areas including logistics, infrastructure, environment, law, IT, marketing and more. The LTC Worx system has enabled the management of flexible and controlled multilingual processes. 

The solution facilitates the design of unique workflows and gives control over the complete end-to-end process. DPDHL use memoQ 2013 and LTC Worx to drive down operational costs with advanced applications.

16 December 2013

Crystal Semantics: From Disambiguation Research to Smart Online Advertising Analyst

Crystal Semantics, one of the most singular European language tech companies and an LT-Innovate Prize-winner, has changed hands again. Previously owned by the Dutch media company ad pepper Media (from 2006), it was sold last week to 24/7 Media, the marketing technology company of the media monitoring giant WPP’s digital arm; 24/7 Media will in turn be merging with the technology company Xaxis early in 2014.

Why the interest in Crystal Semantics and its 15 employees, split fairly evenly between engineers developing the technology and semantic linguists?

The UK company uses proprietary technology to read web pages for their total (fully disambiguated) meaning in real time, thereby enabling an advertising agent to know exactly what is contained on the target web page where they might wish to place their advertisements. The kind of brands that WPP and others provide monitoring services for want to ensure that they do not get their adverts placed on a page of web content that is semantically inappropriate to their brand image – i.e. containing porn, political incorrectness, tobacco, alcohol, etc.

By buying into Crystal Semantics technology, media managers can ensure an automated understanding of content, and therefore benefit from an optimised guide to placement for their clients’ adverts in the cut-throat world of brand competition.

What makes the company so singular in the European technology landscape is that the underlying technology was originally developed as a use-neutral, almost academic attempt to digitise an English dictionary with all its multifarious word meanings. An encyclopaedic subject matter taxonomy was then used to assign a web page to a category on the basis of the interacting word meanings, giving an accurate picture of its fundamental message.

The progenitor of this effort – David Crystal – is an eminent linguist in the UK, with a remarkable track record of linguistic inquiry, ranging from speech therapy to indexing, via stylistics, consulting for the government and linguistics education for the general public. The company website claims that the effort of developing the semantic network – called the Sense Engine – that underlies the company’s application took over ten years. It was largely carried out with the Dutch AND Publishers in the mid-1990s and represented “one of the largest language engineering projects ever undertaken.”

Under the new ownership, will this core resource, which used to be available as a service for web page analysis, continue to aid others in providing semantic disambiguation?

Tomorrow, the world of brand advertising management on the web will be genuinely global, and Crystal Semantics, now part of the Xaxis technology stack for WPP, will need to extend its semantic expertise beyond the dozen or so European languages it covers today and adapt to target languages throughout Asia and elsewhere, despite the low expectations for advertising growth according to WPP’s boss.

11 December 2013

Horizon 2020 Work Programmes 2014/15 published today!

After the final adoption yesterday, the European Commission published the Horizon 2020 Work Programmes today. 
All programmes and ancillary documents such as model contracts, rules of participation etc. can be found at: 

02 December 2013

Jaap van der Meer of TAUS on How We Can Reinvent Translation for a New Generation of Users

In your Translation Technology Landscape Report published in April 2013, TAUS said that translation technology is at a deeply transformative point in its evolution, and that we were heading for a convergence era. Could you sum up for us what this will mean for (Europe's) translation industry?

JVDM : This convergence – as we say in the Translation Technology Landscape report – comes from three different angles: the technology, the functional (or business) and the social. The simplest example of the technology convergence is the fusion of speech and translation technology. It is so natural that we are moving towards speech-to-speech translation. We have seen demos of that at TAUS conferences and it’s clear that the technology integration still needs more work, but I think we will see some rapid evolution on this front. The functional convergence started earlier: it’s all the different disciplines and functions in organizations looking at the relevance and value of translation. Once you have a relatively simple and somewhat automated process for translation, everyone will come to you and ask you to plug in: customer support, social media, marketing, search and so on. Translation and localisation managers are becoming very popular. Social convergence is completely in the hands of the users, the community, and the crowd. Adding self-service (machine) translate buttons to online support sites for instance can make a world of difference. What this all means for the (European) translation industry is that we have to reinvent ourselves: both the way we set up processes and use technology and the way we set up our business and pricing models. We are making a shift from a static model to a dynamic model where the customer and the user have many different options and quality levels for translation.

Data-driven translation tech has now become a core feature of the translation landscape. Are there any legal, technical or commercial questions that TAUS is interested in solving in relation to the "market" for language data?

JVDM : Yes, technically it is still very complicated to train MT engines. TAUS would like to make it simpler. In the coming year we will add new features to the TAUS Data repository to allow users to identify data that are really close to the domain or industry for which they need to customize or train an engine. We call this the Matching Scores feature that is based on semantic clustering techniques. We are also considering setting up a library of language, translation and reordering models to help fast-track and fine-tune the development and customization of MT engines. That is quite an ambitious project but with the 55 billion words in 2,200 language pairs already in the data repository we have the right basis for it. Legally, yes of course, we are all still struggling with an outdated copyright law. I hope that policy-makers in Europe will recognize this issue and address it.

Will the emergence of new "device-based" and user-pulled translation apps that you mention in your report have any serious impact on the market as we know it for commercial translation? Or will the market just keep growing exponentially due to the massive creation of content?

JVDM : Our prediction is that the ubiquitous availability of translation will only drive the demand for professional and business-to-business translation at all levels. This means that business customers will look to their vendors to supply translation at different quality levels: from real-time customized MT to personalized transcreation and hyperlocalisation. We think there will be growth at all levels. The challenge though will be to establish the references and metrics that help all of us to deliver upon expectations. At our TAUS Annual Conference in 2012 we set up a competition for innovation insiders and innovation invaders in the translation sector. I think we will see many more innovators coming into the translation sector with fresh ideas on how to differentiate the offerings.

TAUS is eight years old this year. Can you tell us how the organisation is evolving to address the changing needs of the industry?

JVDM : TAUS has evolved from a think tank to a platform for shared industry services, from ideas to execution. We set up the TAUS Data repository in 2008 and since 2010 we have started developing the Dynamic Quality Framework. Both the data sharing and the translation quality evaluation platform are good examples of general industry support services that benefit all industry stakeholders. They help the industry to mature and add credibility. TAUS provides a unique service as a neutral and objective industry platform.