Speaking in tongues, Part 2:
Foreign language KM technologies
I am a news junkie, and I don’t limit myself to U.S. news. I like finding out what is going on across the world. One thing I find very interesting is watching video coverage of foreign countries on sites like CNN, BBC, Al-Jazeera and other foreign news sources. I am also interested in whether different things are being reported in the native tongue of a country than in the internationalized English versions of those broadcasts. However, if you are, like me, just an English speaker, then reviewing information provided in foreign languages is a serious challenge. Enter automated foreign language video exploitation tools and services.
This article is the second of a two-part series on foreign language knowledge management technologies that can help you meet those foreign language challenges. In the first article (in the July/August 2007 issue of KMWorld), I focused on foreign language tools to support unstructured text mining software, which provides users with natural language processing, language identification, transliteration and name normalization capabilities. In this article, I will focus on BBN Technologies’ Broadcast Monitoring System (BMS) and some of the underlying components that make it work.
The basics
BMS is a suite of technologies that have been integrated to provide a comprehensive capability to monitor, search and supply alerts based on the specific content in streaming audio and video. BBN, a 60-year-old, DARPA-funded spinoff from Verizon, was originally formed in 1948 and is well known for its speech-to-text conversion technologies. [DARPA is the Defense Advanced Research Projects Agency.] BBN’s technologies are very mature and widely used in everything from telephone-based call management systems to automated audio and video speech-to-text transcription capabilities.
In a nutshell, BMS works by integrating various Internet video channels into a streaming video exploitation portal based on Internet Explorer 6.0 (or later) and Windows Media Player (see Figure 1, Page 8, KMWorld, Vol. 18, Issue 1). Back-office servers provide media extraction services that convert speech to text, align the text with a given video frame, and then allow the foreign language text to be converted to English or another supported language using automated machine translation. The languages BMS supports include Modern Standard Arabic, Spanish, Mandarin Chinese, Persian/Farsi and English.
Core features
Notable features of the system include continuous monitoring of incoming multimedia feeds and the ability to search for any text string in either English or the target language, such as Arabic or Chinese. The system can also extract segments of video in MPEG format or stills in JPEG format. Furthermore, users can set alerts based on text strings and keywords, which are triggered when matching content appears as the system monitors incoming broadcasts (see Figure 2, Page 8, KMWorld, Vol. 18, Issue 1).
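To make the alerting idea concrete, here is a minimal sketch of keyword alerts over transcribed broadcast segments. It is not BMS code: the `Segment` structure, the `find_alerts` function and the sample data are all hypothetical, and the matching is a plain case-insensitive substring test against both the source-language text and its translation.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcribed stretch of a broadcast (hypothetical structure)."""
    channel: str
    start_seconds: float
    text: str          # transcript in the source language
    translation: str   # machine-translated English text


def find_alerts(segments, keywords):
    """Return (keyword, segment) pairs where a keyword occurs in either text."""
    hits = []
    for seg in segments:
        # Search the source-language text and the translation together.
        haystack = (seg.text + " " + seg.translation).casefold()
        for kw in keywords:
            if kw.casefold() in haystack:
                hits.append((kw, seg))
    return hits


# Invented sample data, for illustration only.
segments = [
    Segment("news-1", 0.0,
            "El presidente habló sobre la economía.",
            "The president spoke about the economy."),
    Segment("news-1", 12.5,
            "Ahora el pronóstico del tiempo.",
            "Now the weather forecast."),
]

alerts = find_alerts(segments, ["economy", "pronóstico"])
for kw, seg in alerts:
    print(f"ALERT {kw!r} on {seg.channel} at {seg.start_seconds}s")
```

Searching both texts is what lets an English-only user set an alert in English while a native speaker sets one in the source language.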
A particularly useful design feature of BMS is that, as it plays a video segment, it automatically aligns and highlights the transcribed text in the source language and its translation in step with the playing video. As it displays the text, it labels each speaker as either male or female and numbers each separate voice it identifies (see Figure 3 on page 9, KMWorld, Vol. 18, Issue 1).
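One simple way to picture this alignment is a lookup from playback time to the transcript segment that should be highlighted. The sketch below assumes each segment carries a start time and a speaker label. The data layout, the labels and the `segment_at` function are hypothetical and are not taken from BMS.

```python
from bisect import bisect_right

# (start time in seconds, speaker label, transcript text), sorted by start time.
# Invented sample data, for illustration only.
segments = [
    (0.0, "Male 1", "Good evening, here are the headlines."),
    (6.2, "Female 1", "Talks resumed in the capital today."),
    (14.8, "Male 1", "We will have more after the break."),
]
starts = [start for start, _, _ in segments]


def segment_at(playback_seconds):
    """Return the segment being spoken at the given playback time, or None."""
    # bisect_right gives the count of segments starting at or before this time.
    index = bisect_right(starts, playback_seconds) - 1
    return segments[index] if index >= 0 else None


current = segment_at(7.0)
print(current[1], "->", current[2])
```

Because the start times are sorted, the lookup stays fast even for long broadcasts.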
Machine translation
One of the impressive features of BMS is its machine translation (MT) capability. Machine translation is a difficult technical challenge, and over the past 15 years, the focus has been on using rule-based systems. However, the accuracy of rule-based MT systems has been extremely limited, and their output has generally not been very useful. More recently, MT technology companies have focused on statistical machine translation approaches. In the case of BMS, BBN has partnered with Language Weaver, which provides the machine translation capability for the product.
Rather than use the rules of language (e.g. what’s a noun, verb, adverb, etc.) to provide the basis of converting from one language to another, Language Weaver uses statistical measures that analyze the frequency of phrases, sentences and relationships within the text, and contextually convert them to the targeted foreign language.
This method, more formally attributed to Warren Weaver (Weaver Method) and initially pioneered by IBM in the 1970s and 1980s, uses statistical methods and large samples of documents that have already been translated manually to build the automated underlying translation system. It normally takes about 2 million sample documents and their associated translations to the target language to build up the capability to translate information from one language to another with any reliability.
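The core statistical idea can be illustrated with a toy example: count how often each source phrase was translated a given way in already-translated material, then pick the most frequent option. This is a deliberately simplified sketch, not Language Weaver’s system. The function names and the tiny set of phrase pairs are invented for illustration, and a real system would also have to learn the phrase alignments themselves and weigh context.

```python
from collections import Counter, defaultdict


def build_phrase_table(aligned_pairs):
    """Estimate P(target | source) by relative frequency over aligned phrase pairs."""
    counts = defaultdict(Counter)
    for source, target in aligned_pairs:
        counts[source][target] += 1
    table = {}
    for source, target_counts in counts.items():
        total = sum(target_counts.values())
        table[source] = {t: n / total for t, n in target_counts.items()}
    return table


def translate(phrases, table):
    """Translate phrase by phrase, choosing the most probable known translation."""
    output = []
    for phrase in phrases:
        options = table.get(phrase)
        # Unknown phrases are passed through unchanged.
        output.append(max(options, key=options.get) if options else phrase)
    return " ".join(output)


# Invented stand-in for a large body of human-translated text.
aligned_pairs = [
    ("buenos días", "good morning"),
    ("buenos días", "good morning"),
    ("buenos días", "good day"),
    ("señor presidente", "mr. president"),
]

table = build_phrase_table(aligned_pairs)
result = translate(["buenos días", "señor presidente"], table)
print(result)
```

With only four phrase pairs the estimates are crude, which is why such systems need very large collections of translated documents before the output becomes reliable.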
Unlike rule-based foreign language translation, which is about 35 percent accurate, statistical methods for translation can bring the accuracy up to 90 percent or even higher. In the case of a system like BMS, this means that humans can look at the automated translation and get the "gist" of what is being said. If an analyst or other BMS user wants a more precise translation, he or she can copy the relevant section of the video/audio/text and send it to a human translator using BMS’ built-in capabilities.
Exploiting multimedia
BBN’s BMS is a leap ahead in terms of technologies that help with the automated exploitation of Internet foreign language multimedia. It provides a turnkey and automated (hands-off) means of monitoring broadcast audio and/or video information and allows exploitation of that information in real time.
Once you use BMS to extract text information from video or audio feeds, you have a body of text that can be further exploited in an automated fashion. In the first article in this series, I introduced you to a range of data mining and text exploitation tools that can automatically extract structure from unstructured information. Since the transcription capabilities of BMS provide both the native language text and the translated version as well, a range of text mining activities are now possible. Those activities include extraction of nouns, verbs, context, language identification, geospatial information, concepts and other metadata, all of which can be stored in a database for long-term historical access.
Though I am normally not a big fan of watching television, just for laughs, I think I will go and watch some of the old re-runs of Saturday Night Live that have been translated into Mandarin. I somehow think it’s going to be even funnier the second time around.