Friday, July 4, 2014

Language Prosthetics: After Life in the Digital Never Never

Media technology used to be thought of as simply an extension of the human sensorium. Now it will become an extension of our entire existence.

Back in the 1960s, media theorist Marshall McLuhan expounded a simple story about the evolution of media technology: (alphabetic) writing, print, photography, film, radio and TV are all extensions of our natural sensorium. Alphabet technology, for example, along with Chappe’s telegraph and similar devices, was a visual (hence inspectable) extension of human speech’s natural capacity to produce aural verbal messages. TV (along with microscopes, telescopes and X-rays) was another extension of our visual capacity to view events, while radio and telephones were an extension of our mouths and eardrums to distant contacts.

In this tale, the history of technology recounts the gradual extension of a sensory apprehension of the world into a hardware amplifier. Since the senses are few in number, McLuhanites had to produce complicated work-arounds to save the theory. For example, the post-electric world (yes, McLuhan rarely used the term ‘electronic’ to identify the microchip revolution happening around him) would be one of “secondary orality” – in the 1970s we were gradually shutting the library door on linear visual written knowledge, and gathering together in a tribe around the oral/aural campfire of CB radio and rock, and the promise down the road of podcasts, always-listening smartphones and speech translation.

Remember the “deaf, dumb and blind kid” of The Who’s song “Pinball Wizard”, from their rock opera Tommy? Human sensory disabilities have systematically offered premonitory probes into the art of the technologically possible, and McLuhan’s ‘extensions-of-the-senses’ story became even more complicated when it engaged with disability.

Braille, for example, becomes a tactile extension of an alphabet which is itself a visual “extension” of spoken language. Yet in prehistory (i.e. before writing), sightless people would not have needed such a medium – they would have developed acute hearing to catch the semantic grace notes of the ambient aural world. Today, though, the auditory/oral channel enabled by smartphones as an “extension” of the ear is becoming a far more powerful communication medium than Braille for the visually handicapped.

Now take the theory of media extensions into the digital world in which we live - and will increasingly die. McLuhan’s “media” have morphed into technologies (or apps) that we can use to extend our digital lives and surmount our physical failings. Braille was once a wonderful tool for accessing knowledge for the visually impaired. But now we can extend spoken knowledge to the terminally sightless, and give a plausible artificial voice to those struck dumb.

And, more grotesquely but also more touchingly, we can give the primi inter pares disabled – i.e. the truly brain dead - a new voice. Hallelujah. McLuhan had not expected the electronic nexus to afford room for the physically injured, the congenitally handicapped, or the terminally moribund.

Digital now allows us to extend our “lives” into the virtual, and broadcast our “voices” far beyond situated friends and family into the deep echo chamber of forever.

In the best of possible worlds, text-to-speech technology can invent voices for the congenitally mute. Such voices will probably be built from a cunning mix of real recorded voices chosen from a digital pool and totally artificial voices crafted into a unique timbre for someone who has become or always has been voiceless. But it raises the interesting question for a dumb speaker of which voice to choose: so watch out for “voice design” on your tech radar, especially for those who have always disliked their recorded voice.

Literature and historical movies give voices to dead souls. And we find it perfectly natural that Moses, Caesar, Elizabeth I, Catherine the Great and Mr. Bojangles have “spoken” to us from the stage or screen – the Greeks called it prosopopeia. Yet a newly crafted voice for a dead soul will eventually have to pick its way through the voice biometric devices that will underpin our online security systems.

Will these guard-dogs in future be able to handle artificial voices of real (yet currently speechless or even dead) humans? Or in the even longer run, the weirdly synthetic human voices of artificial beings – robotic avatars of the long gone?

Lastly, in a social media culture, who exactly will we be (with our digital identities and social graphs) when we shuffle off the mortal coil and go permanently virtual and post-human?

Will there be a sustainability app that keeps up our online presence as an eternally young speaking avatar (rather as actors tend to play Queen Elizabeth I as a forceful young woman when she was actually in her early dotage)?

Maybe this app could use intelligent methods to analyse what messages are sent to us after our death, and by mining data from our previous content stack, guess what message we would have sent back. But can we or should we age that voice from sprightly youth to creaky old age when we use it (post-mortem) to answer the phone? Or should we think about personality cosmetics?

My digital being is necessarily a virtual “extension” of the physical me. And analytics will inevitably characterise and embody me as a plausible avatar, sending out social-media messages digested by a smart reader with the kind of stuff I had blogged, uttered, You-tubed, tweeted, emailed, or merely “written” before.

Having an agent mine the web and automatically generate new in-genre content, I (but is it “me” any longer?) could extend my life almost indefinitely, by virtue of the smart robot that parses my old words, and keeps churning out simulacra variora of my textual life.

This is a big leap from McLuhan’s media vision. Digital media do not simply extend the reach of my senses; they transform my very persona (remember that etymologically this word means “sounding (sona) through (per)” – for example through a mask in a stage performance). Today, the irony is that I don’t have to die first. Theoretically I could sit back and watch my digital after-life evolve as an avatar of myself – and why not several different digital personae while I’m at it – and in a Joycean moment pare my fingernails from a stance of digital silence, exile and cunning. Perhaps I wouldn’t even need a “voice”.

Remember the old joke: you can never tell whether someone’s a dog on the web.

Woof woof!

Thursday, May 22, 2014

Language Technology is the drill to make Big Data "oil" flow in Europe!

It has always surprised me how much is written and talked about Big Data without pointing to the main barrier to the data revolution: our many languages (more than 60 in Europe alone). The numbers surely differ from sector to sector, but a fair guess would be that half of big data is unstructured, i.e. text. Most multimedia data is also converted to text (speech-to-text, tagging, metadata) before further processing. Text in Europe is always multilingual.

Europe prides itself on an “undeniable competitive advantage, thanks to [its] computer literacy level”. In fact, we have had this advantage for decades, but so far it hasn’t helped much. Good brains and companies are systematically bought by our American friends. No, we have to focus instead on what is specific to Europe: on what we have and the US doesn’t, even if it looks like a disadvantage at first sight.

What makes Europe special and different is the fact that we are trying to build a Single Market in spite of our different cultures and systems. Our multilingualism is always seen as a challenge, a big disadvantage. Most Big Data applications only work well in English and, with some luck, okayish in German, Spanish, or French. Smaller EU countries with less widely spoken languages are basically excluded from the data revolution. The dominance of English in content and tools is the reason for the US lead in Big Data. Many European companies have reacted to this and now use English as their corporate language. But Big Data is often big because it originates from customers and citizens, and they prefer to use their own languages.

What if we managed to turn this perceived handicap of a multilingual Europe into an asset? Overcoming the language barriers would be a great step towards a Single Market. We would make sure that smaller Member States participate and perhaps become drivers of the data revolution. Even more importantly, Europe would become the fittest for the global markets. The BRICs and all other emerging economies no longer accept the dominance of English. Europe has a unique chance... if it solves a problem the Americans do not have, or will discover too late.

The real opportunity is therefore to create the Digital Single Market for content/data independently of the latter's (linguistic) origin. This would require that we overcome the language-silos in which most data remains captive and make all data language-neutral.

To achieve this, we urgently need a European Language Cloud: a web-based set of APIs that provides the basic functionality to build text-based Big Data products for all languages of the Digital Single Market and Europe’s main trading partners. For more information, see my previous post.

While the European language technology industry might not have all the solutions readily available to deliver the European Language Cloud, many language resources could be pooled as a first step. In addition, many technologies are presently entering into a phase of maturity (after decades of European investment into R&D) and could be harnessed - through a set of common APIs - into a viral Language Infrastructure. This would go a long way towards delivering the European Language Cloud... without which the Big Data oil will only continue to flow from English grounds.

Jochen Hummel
CEO, ESTeam AB - Chairman, LT-Innovate

Sunday, May 18, 2014

2014 - The year of the verticals for Europe's language technology industry

In a recent interview, José Carlos Gonzalez, CEO of the Spanish firm Daedalus, said with great verve that his “goal for 2014 is to cover progressively the specific needs of our clients by helping them to develop solutions in vertical markets, freeing them from the complexity of language processing technology.”

Freeing verticals from the complexity of language technologies is a necessary step forward. But it means knowing about the specific needs of industries, and how solutions can be invented that address the infrastructural conditions of these often large-scale players requiring fairly long-term

At LT-Innovate, we believe that 2014 will be the year of the verticals. This means that instead of endlessly repeating what our language technology could do if there was, as the poet said, world enough and time (and above all money), we should deliver solutions that industries actually need.

We kick-started this process of market analysis some 18 months ago and have built up a useful body of knowledge about gaps, want-to-haves, on-going problems, and the sheer lack of awareness among various verticals of the potential benefits of LT. We recently published our findings on these markets to help our members compare their experience and insight with our own efforts at trying to identify opportunities.

Each industry naturally has its specific needs, even though all of them tend to follow the trend towards breaking down information silos and stepping up cross-lingual data sharing while keeping costs down. We found that the increasingly globalising Manufacturing industry tended to expect massively unified information centres with localised interfaces; that Tourism needed deep, multilingual sentiment analysis applications; and that Media & Publishing increasingly requires integrated multimodal (speech/text/image) monitoring, using multilingual speech recognition among other technologies.

We also learnt that whatever the structure of the industry, there are multiple touch points in most workflows where LT can play a role in lowering costs, improving efficiency and contributing to what we can call digital integration. Spoken interfaces can improve productivity in numerous industrial jobs, from store-room workers to clinicians making out reports on patients.

Likewise, the need for cross-lingual access to information of all sorts is now a constant in nearly every European vertical. Today this need tends to be addressed by point applications; tomorrow we can expect far more integrated solutions that can adapt more effectively to specific requirements in the online workplace.

This year LT-Innovate hopes to leverage this initial knowledge base to build a clearer picture of where language & speech technology can play a differentiating, even disruptive, role in simplifying processes, adding value to operations, lowering costs and breaking down data silos in different industries in Europe. So stay in touch.

Friday, March 21, 2014

ROCKIT: Paving the Road to the Future of Conversational Interaction Technologies

New conversational interaction technologies raise many business and societal opportunities. European research can provide interactive agents that are proactive, multimodal, social, and autonomous. Moreover, it is now possible to draw data from many different sources together to provide very rich context and knowledge to use in applications. But how can the organisations that want to exploit this technology decide what products and services to develop, and where to invest their R&D? ROCKIT is a new strategic roadmapping initiative that will create a shared vision and innovation agenda to guide this process for all types of stakeholders in this emerging area.

Most technology roadmaps merely describe the future and speculate about what will happen if technology is left to evolve of its own accord. ROCKIT is different – we will decide what we want the future to hold in ten years' time, and create a structured and visual map of the steps we need to take in order to realize our vision. Markets and drivers, products and services, and enabling technologies will all form different layers so that readers can see the basics at a glance, find what interests them most, and drill down into detail. For example, if a company knows its market requires a particular service, it will be able to see exactly what technology developments and science research are required to make that service happen, complete with an assessment of the readiness for every item. Conversely, technology providers will be able to see wider possibilities for their components than they could on their own.

We will start by defining our vision of the future – constructing a number of key “scenarios”, our future use cases – together with the relevant drivers and constraints, and by assessing where the community currently stands in its ability to deliver that vision. We will then establish the possible routes and required developments to fulfil that vision. With this mapping done we will be able to highlight key enablers, technology gaps, risks and resource gaps. Iteration will ensure our roadmap is robust and correct. The roadmapping process will be ably led by Vodera, which has produced roadmaps leading to sounder research and innovation programmes in diverse application domains such as automotive, aerospace, security, healthcare and environmental monitoring.

Getting all this information under control has traditionally been a problem for roadmaps – but ROCKIT pioneers the use of SharpCloud, a new online collaborative visualisation platform that makes it much easier to capture, edit, display, and disseminate roadmap contents than was previously possible. As a result, the knowledge of the community will be in an accessible format that allows easy identification of trends, gaps, opportunities, and resources.

A roadmap is only as good as the people who contribute to it. ROCKIT needs the right participants, and they have to cover every stakeholder community, from R&D and system integrators to component suppliers and usability experts, and more. SMEs are just as important as large companies and public sector research organisations, since most current commercial activity takes place there. If you want to be involved, join the Conversational Interaction Technology Innovation Alliance (CITIA) LinkedIn Group or speak to the ROCKIT partners. Forthcoming workshops are planned in conjunction with major sector events such as LREC (Reykjavik, May 2014), ICASSP (Florence, May 2014) and LT Innovate Summit (Brussels, June 2014).

Article contributed by Costis Kompis, Vodera

Costis Kompis is the managing partner of Vodera, where he helps private and public organisations align their R&D activities, develop innovation strategies for emerging technologies and design new business models to capture market opportunities.

Tuesday, March 18, 2014

Meetings of Minds: The Promise of Smart Knowledge Management for Online Conferences

Much effort is being put into building the market for technology that supports rich-media online meetings. This segment covers anything from telepresence and high-quality video conferences to private meetings, conference calls and webinars. In due course, driven by mobile and the cloud, it will extend to applications such as multiple-site remote surgery or online customer focus groups.

The common denominator of all these digital meet-ups is that they inevitably produce – because they are all about humans communicating together – large amounts of content: i.e. recordable language data that can provide substantial added value to all kinds of stakeholders if properly captured and processed.

To give an idea of the market value, 2011 revenue from the global video conferencing infrastructure grew by 26.9% (to $746M). In 2012 a company such as Cisco did significantly better. More important for knowledge management services is the software layer poised on top of the unified communications infrastructure, much invested in by telecoms equipment companies such as Avaya.

Just recently Oracle said it was making a substantial bet on video-conferencing as a major business line. Microsoft, among others, with its high-potential Skype asset, will surely be eyeing the same market.

The story of smart meeting technology goes back to the early days of exploring how computing could augment – rather than replace – human intellectual work. This was radically enabled by the research into computer interfaces, networks and graphics in the Augment programme led by Doug Engelbart back in the 1960s and 1970s. He tried to adapt the technology of his time to help teams of intellectual workers increase their grasp of complex decision-making and data handling during very large-scale industrial projects.

This singular seam of computing history is often contrasted with the development of the famous Artificial Intelligence agenda. The focus here was on automating essentially human practices such as using language to create meaning, and reason over semantic entities. This led to software applications for powering processes ranging from medical diagnoses to driving a car.

Augmenting the value of meetings poses a real challenge. Meetings can involve many participants in free-flowing conversations around documents or presentations that generate a huge flow of information, both trivial and critical. Sorting through the inevitable noise generated in order to identify the key takeaways is a demanding task. As is the parallel need to check on the relevance of those half-forgotten suggestions, criticisms, and expressions of support.

Note that a perfectly intelligent aural record of a meeting would change the psychological dynamics of how people process what happened, producing a cool, detailed photograph rather than human memory’s warm, impressionistic picture. Although everyone keeps their own notes as a partial record of a meeting, there is also a scribe whose job it is to take down the official “minutes”. What if there was an independent and searchable record of the whole event?

Enter Gridspace, which has come up with an NLP-driven solution for recording and indexing the content of meetings, and collecting and integrating all documentation associated with them, so that it produces a searchable knowledge base.
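To make the idea of a "searchable knowledge base" concrete, here is a minimal sketch of one common way to index meeting content: an inverted index over transcript segments. It is not a description of how Gridspace works; the function names (`tokenize`, `build_index`, `search`) and the sample transcript are assumptions invented for illustration, and a real system would add speech recognition, ranking and document linking on top.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def build_index(segments):
    """Map each word to the set of segment ids in which it occurs."""
    index = defaultdict(set)
    for seg_id, (_speaker, text) in enumerate(segments):
        for token in tokenize(text):
            index[token].add(seg_id)
    return index

def search(index, segments, query):
    """Return the segments that contain every word of the query."""
    tokens = tokenize(query)
    if not tokens:
        return []
    # Intersect the posting sets so that all query words must match.
    hits = set.intersection(*(index.get(t, set()) for t in tokens))
    return [segments[i] for i in sorted(hits)]

# Hypothetical transcript: (speaker, utterance) pairs.
segments = [
    ("Anna", "We should move the product launch to September."),
    ("Ben", "The budget for the launch is not approved yet."),
    ("Anna", "I will send the budget figures by email."),
]
index = build_index(segments)
results = search(index, segments, "launch budget")
```

Here `results` holds only Ben's remark, the one segment mentioning both words. Requiring every query word to match keeps the sketch simple; a production tool would more likely rank partial matches.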

The application also claims to provide a dashboard for meeting attendees of what the system considers to be the “most important” content of the meeting – i.e. a sort of automated minutes. The aim is of course to save time and give a rapid solution to the post-meeting problem of collating scribbled notes into an “objective” record. Another company operating in this space is VoiceBase.

Four leads for adding functionality to future meeting apps:

1. Although small-group online meetings may use a single language – possibly lingua franca English – any meeting knowledge support system will eventually need to be multilingual in scope. In some cases, interpreters could be integrated into the workflow (with the attendant data capture issues); in others, subtitles could be used to simplify communications. Subtitle companies using speech recognition to aid multilingual access include the people behind Jibbigo (before it was bought by Facebook – no news since), but also Translate Your World, which claims translation capabilities for subtitles in 78 languages or automated voice translation in 35 languages. A hard linguistic nut to crack, of course, but essential in the long run.

2. “Intelligent Meeting” applications will also need to be able to consolidate a whole historical series of meetings on the same topics and summarise their contents. They should provide people with references to previous meetings, what people said before, what updates have been shared by email, and so on. In other words, a fully-fledged ideas monitor that can take the burden of searching and consolidating information and morph it into usable input for everyone involved in the meeting.

3. Other add-ons will almost certainly be dreamt up to improve meeting productivity in due course. In a BYOD world, wearable computing devices such as smart glasses could well turn into a meeting interface for some, requiring a number of adaptations to the meeting scenario. Fine-tuned emotion recognition software may well find its way into meeting software to help participants detect the temperature of their co-members either in their facial expression or in their manner of speaking. Such tools will enable people who have never met before to rapidly evaluate personalities. Sinister but possibly manageable.

4. Knowledge technology advances in online meetings/webinars will almost certainly extend to the world of education and training (e.g. MOOCs) and open up an interesting multilingual marketplace for smart apps that help learners engage more easily, understand more intelligently, and access richer knowledge repositories. Again, across the language spectrum.

Data2Content, the smart solution that lets you create editorial content for websites automatically

Syllabs, the French expert in semantic analysis, launched Data2Content, the automated solution for producing editorial content for websites. Using a structured database (e.g., product information and specifications), the Data2Content solution automatically produces large volumes of human-quality editorial content (e.g., product descriptions, technical data sheets, etc.). The solution also takes care of automatic content updates and supports multilingualism through the creation of content in several languages from the same data set, without the need for translation.
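As a rough illustration of how text can be produced in several languages from one structured record without translating between them, here is a minimal template-based sketch. It does not reproduce Data2Content's actual method, which the announcement does not describe; the templates, the `CATEGORY_LABELS` lexicon, the `describe` function and the sample product are all assumptions made up for this example.

```python
# Hypothetical sentence templates, one per target language.
TEMPLATES = {
    "en": "{name} is a {category} with a capacity of {capacity_l} litres, "
          "priced at {price_eur} euros.",
    "fr": "{name} est un {category} d'une capacité de {capacity_l} litres, "
          "au prix de {price_eur} euros.",
}

# Hypothetical lexicon mapping a category code to a label in each language.
CATEGORY_LABELS = {
    "backpack": {"en": "hiking backpack", "fr": "sac à dos de randonnée"},
}

def describe(product, lang):
    """Render one product record as a sentence in the requested language."""
    template = TEMPLATES[lang]
    category = CATEGORY_LABELS[product["category"]][lang]
    return template.format(
        name=product["name"],
        category=category,
        capacity_l=product["capacity_l"],
        price_eur=product["price_eur"],
    )

# Hypothetical structured record, e.g. one row of a product database.
product = {
    "name": "Trail 40",
    "category": "backpack",
    "capacity_l": 40,
    "price_eur": 89,
}

# Each language is generated directly from the data, not from another text.
descriptions = {lang: describe(product, lang) for lang in TEMPLATES}
```

Because every language has its own template and lexicon, an update to the record (a new price, say) propagates to all language versions at once. Fixed templates like these sidestep agreement and inflection, which any serious generation system has to handle.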

Data2Content provides the following client benefits:

  • improved customer experience thanks to rich, detailed content
  • increased number of visitors and customer retention (relevant and attractive content) making it possible to lower the bounce rate
  • adaptation to target audience through the choice of an appropriate style
  • quick integration into a website’s workflow (various formats are possible and texts can also be made available in SaaS mode through a dedicated web service)
  • better visibility of the website’s content: Data2Content texts are optimized to boost natural ranking

A fully automated and standardized solution, Data2Content is already used by leading web players in the fields of e-commerce, e-tourism, online directories and classified ads websites such as Brioude Internet (web agency), or