Fluent Machines
Developing a new approach to automatically translating text
documents from one language to another.
What's new
Fluent has just begun describing details of its technology and sketching
out a product roadmap.
Profile
Getting machines to understand human, or so-called natural, language is
one of the great challenges in
computer science. Every year, it seems, scientists refine their techniques
and get a few steps closer to the perhaps unreachable goal of making the
computer as fluent as a human being. As yet, no algorithm solves the
problem completely and definitively. Evidently, the
thing that most defines us as human is not easily reducible to a simple
transfer or processing of some mathematical stuff called
information. Machine translation of written texts, as
compared to getting computers to interpret and act on written or spoken
commands in real-time, might appear to be a fairly easy task. After all,
the computer can take its time, within reason, in analyzing a document and
creating its translation. And with documents of any size, lots of clues as
to meaning and context would seem to exist in the text
itself. Sure enough, there are a number of machine
translation programs commercially available in the form of desktop and
enterprise software products and as Web-based services. But their accuracy
rate generally hovers in the 70% range. That may suffice for the quick
turnaround of a technical product data sheet, say, but in most cases, the
software's output needs cleaning up by people familiar with the target
language. And that costs money. Fluent Machines
claims to have come up with a breakthrough in machine translation, or MT,
as it's often called. Eli Abir, its chief technologist, has devised an
enhancement to the so-called example-based approach to translation. In
essence, this approach involves building a large database of sentence
pairs, each in a different language but with equivalent meaning: "I closed
the door" and "J'ai fermé la porte," for
instance. Through a process of searching this
database of examples, matches can be found for new sentences and sentence
fragments and fairly accurate translations can be constructed
automatically. If needed, a set of grammatical rules can be applied to
refine the initial translation and produce a considerably more accurate
text. Much research in example-based translation has been done at
Carnegie Mellon University, led by Jaime Carbonell, an old hand in
artificial intelligence. Numerous other universities have looked into it,
too, often with funding from the military. Mr. Abir's
insight into the problem is twofold. The first part centers on an
improved way of determining how to connect translated sentence fragments
into proper sentences. The second is what seems to be a better way of
undertaking the extensive, automated statistical analysis of large volumes
of matching texts that's required to find and match those fragments.
Previous example-based systems, we understand, stored every matching pair
of sentences that they were given, thus creating a huge, unwieldy database.
But Mr. Abir has figured out a way to store only what's useful and thereby
reduce the size and improve the speed and usefulness of searches in that
database.
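As a rough illustration of the example-based idea described above, the following Python sketch stores a handful of sentence pairs and returns the stored translation of whichever example most resembles a new input. The example pairs, the word-overlap (Jaccard) score, and the function names are all assumptions made for illustration; the article does not say how Fluent or earlier systems actually score matches.

```python
# Toy example-based lookup: find the stored sentence most similar to the
# input and return its stored translation. All data here is illustrative.

EXAMPLES = [
    ("i closed the door", "j'ai fermé la porte"),
    ("i opened the window", "j'ai ouvert la fenêtre"),
    ("the dog barks", "le chien aboie"),
]


def jaccard(a, b):
    """Word-overlap similarity between two sentences (0.0 to 1.0)."""
    set_a, set_b = set(a.split()), set(b.split())
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def closest_example(sentence, examples=EXAMPLES):
    """Return (score, source, target) for the best-matching stored pair."""
    sentence = sentence.lower()
    source, target = max(examples, key=lambda pair: jaccard(sentence, pair[0]))
    return jaccard(sentence, source), source, target


score, source, target = closest_example("I closed the door quietly")
print(f"{score:.2f} {source!r} -> {target!r}")
```

The nearest stored example leaves the word "quietly" untranslated, which hints at why matching and connecting smaller fragments matters more than matching whole sentences.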
Instead of seeking to match complete sentences,
Mr. Abir's algorithm focuses on determining the frequency of practically
every word string in each of a pair of matching
texts. Clearly, words that match in meaning don't
necessarily show up in exactly the same sequential order in two
corresponding sentences written in different languages. But Mr. Abir
figures that it's quite possible for a computer to identify the words (and
strings of words) that each word typically appears in conjunction with.
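To make the idea of counting which words keep company with which more concrete, here is a minimal Python sketch that tallies sentence-level co-occurrences in a tiny made-up corpus. The corpus, the function name `cooccurrence_counts`, and the choice of a whole sentence as the context window are assumptions for illustration only; the article does not disclose how Mr. Abir's algorithm does its counting.

```python
from collections import Counter, defaultdict
from itertools import permutations

# Tiny made-up corpus; a real system would analyze large volumes of text.
CORPUS = [
    "the boxer won the fight",
    "the boxer lost the fight",
    "the dog barks at the boxer",
    "the airplane lands",
]


def cooccurrence_counts(sentences):
    """Map each word to a Counter of the words it shares a sentence with."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = set(sentence.split())  # count each word once per sentence
        for word, other in permutations(words, 2):
            counts[word][other] += 1
    return counts


counts = cooccurrence_counts(CORPUS)
print(counts["boxer"].most_common(2))
```

Frequent function words such as "the" dominate raw counts like these, so a practical system would need some form of normalization before the associations become useful.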
Boxers tend to punch other boxers and win fights, for instance, while dogs
bark and airplanes fly, land, and crash. This kind of association
eventually leads to a long list of matching words and sentence
fragments. The assumption is that in any language at
any given moment, there is in current use a finite number of these
DNA-like "blocks of meaning." The trick is identifying them somehow
without manually poring over vast amounts of text. Mr. Abir estimates this
number at between 1 billion and 5 billion blocks for modern tongues like
English and German. That's a big number, no doubt, but by analyzing enough
document pairs that are known to be good translations of each other, he
reckons, his software should be able to identify the different word
strings in each language that carry the same meaning. And, what's more,
the code can determine how these chunks typically fit together when used
properly. Fluent's is a purely statistical approach,
with no regard for grammatical or syntactical rules or the actual
semantics of words. That's not entirely new, we gather; though relatively
young, statistical MT has its own rich history. But by virtue of the way
Fluent does its analysis, company officials argue, the more matching texts
the system is fed the better it can translate texts in any of the
languages it handles. In other words, its English-to-French abilities will
improve not only with every new English-and-French text-pair it's given
but also with new German-to-French pairs. And it should improve
incrementally with every contribution of a new translation, including
those supplied by human translators who refine Fluent-produced texts
and feed the system their corrections. So far,
Fluent tells us, the firm has only just finished prototypes of its
database builder and word-string connector programs, the two key
algorithms it has developed. And now, the company is scrambling to find
translated document pairs to analyze. Thousands of these are available on
the Web, of course, but the company hopes to strike deals with government
agencies, including the military, and explore other
sources. Having reviewed Fluent's technology, Mr.
Carbonell has given it a generous endorsement: "...clearly the most
promising and theoretically important MT development in the past several
years." We'd be considerably more impressed with this assessment if we
weren't aware that the Carnegie Mellon professor has also joined Fluent's
board of directors. Still, we have to believe Fluent is on to something if
someone of Mr. Carbonell's stature actually joins the
firm. Estimates of the document translation market
vary, but the rapid globalization of markets is making it imperative to
get product manuals, website content, and other documentation translated
as quickly and with as little cost as possible. Even e-mail needs
translating, preferably in near real-time. Fluent sees worldwide
language-translation revenues of $5.7 billion in 2001 growing to $7.6
billion by 2006, with MT's portion
growing from $73 million to $117 million over the same period. That's not
exactly the biggest potential market we've heard touted by an
entrepreneur, but the implication is that significant improvements in MT's
accuracy would tend to give MT a bigger slice of the pie. Better accuracy
could well drive demand for new applications not considered economical right
now: translating daily newspapers, for instance, and providing access to
foreign websites. Fluent plans to pursue a
service-based business model, with availability starting within a year.
The choice of that model is dictated largely by the fact that the core
database will be huge (tens, maybe hundreds, of gigabytes) and
the actual translation of new texts will consume great amounts of memory
and processing capacity. Company officials tell us that they can foresee
an enterprise version of the system eventually being produced for
installation on dedicated sets of servers. But for now, the plan is to
translate documents submitted over the Web to a central site. This is the
business model already in place at WorldLingo, which retains banks of
trained translators who work on their own and with the aid of MT systems.
The firm can call on translators who have expertise not only in specific
languages but also in selected subject domains. (We submitted this article
for machine translation into Spanish at WorldLingo's website. The Spanish
text came back with the option of having a human translator touch it up
for $110.) Fluent's background isn't typical for the
technology industry. It is actually one of two wholly-owned subsidiaries
of an entity called Meaningful Machines. Its sister startup, Internet
Driver, is also commercializing technology conceived by Mr.
Abir. Internet Driver offers a browser plug-in and hosted service that
together enable non-English-speakers to navigate the Web in their own
languages and character sets. Ordinarily, Web addresses (URLs) must be
typed in the Latin characters used in English. Fluent holds an exclusive
license to patents filed by Meaningful that
cover use of Mr. Abir's technology for human language translation apps.
Mr. Abir is described as an inventor who served in the Israeli army
and then owned three restaurants and other small businesses in the U.S.
Fluent's backer, Apple Core Holdings, is a real-estate company and hotel
operator in New York that, in the '90s, began investing in early-stage tech
firms such as Register.com, GoAmerica, Cryptek, and Javelin
Technologies. An intriguing, offbeat story, we
believe. Now, the challenge will be to create a robust, commercial product
and to convince investors that there's something worthwhile here. Last
year, another MT system, called Gedanken and developed by a New York outfit
called Applied Knowledge Systems, fell off the map when a planned takeover
by The Translation Group fell through. (We notice that Translation,
publicly listed, itself seems to have gone belly up; its phone is
disconnected and website unreachable.) Even if it's as technically
advanced as Mr. Carbonell insists, Mr. Abir's idea won't necessarily
translate into profits right away.
Upside
Rampant globalization is making rapid and low-cost language translation a
must for many companies and agencies.
Downside
The technology is still unproven, particularly from a commercial point of
view. Though Fluent says its approach can potentially increase in accuracy
to 99%-plus over time, it's not clear how accurate it will be upon first
delivery, or how fast it will operate.
CEO and Chairman
Steve Klein, also chairman and CEO of Apple Core.
---------------------------------------
www.meaningfulmachines.com
212-716-0070
HQ New York
Founded 2000
Employees 11
Financing $4.1 million in one round
Investors Apple Core Holdings