What Is Machine Translation?

Translating text between languages poses many challenges, even for deep learning systems. Even so, machine translation has made great strides in this area of linguistics.

Why Is It Important to Have a Good Machine Translation Program?

When you need or want information written in a language you don’t know, you have a few possible solutions: learn the language yourself, find and pay for a good human translator, or use a machine translation program like Google Translate. Luckily, the easiest and cheapest solution is also rapidly becoming one of the most accurate. The benefits of this for business or other urgent uses are obvious.

Additionally, advances in any one aspect of computational linguistics tend to advance other aspects of the field. Chatbots for business use or mental health outreach programs, for example, both benefit from being able to communicate with non-English speakers in their native languages.

What Internal Data Should I Have for a Good Machine Translation Program?

Few internal data sources are relevant to this use case.

What External Data Is Essential for a Good Program?

A lot of external data is essential for automatic translation programs, particularly language dictionaries.

What External Data May Prove Useful for a Good Program?

Other useful external resources include native speakers of the target language on staff and an open-source model that accepts user-submitted corrections. Additionally, since many translation programs translate to and from English instead of directly from one language to another, accurate English dictionaries and strong English translation capabilities in the program can be very helpful.
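
Translating through English as an intermediate (pivot) language can be sketched as two lookups chained together. The example below is a toy illustration only: the word-level dictionaries, their entries, and the function names are hypothetical, and real systems use learned models rather than lookup tables. It shows how an ambiguous English word can let an error slip in that a direct dictionary would avoid.

```python
from typing import Dict, List

# Hypothetical toy dictionaries, made up for illustration.
ZH_TO_EN: Dict[str, str] = {"银行": "bank"}    # financial institution
EN_TO_FR: Dict[str, str] = {"bank": "rive"}    # assumed to pick the "riverbank" sense
ZH_TO_FR: Dict[str, str] = {"银行": "banque"}  # direct entry keeps the financial sense


def translate(tokens: List[str], table: Dict[str, str]) -> List[str]:
    """Word-by-word lookup; unknown tokens pass through unchanged."""
    return [table.get(token, token) for token in tokens]


def pivot_translate(
    tokens: List[str],
    source_to_pivot: Dict[str, str],
    pivot_to_target: Dict[str, str],
) -> List[str]:
    """Translate source -> pivot (English here) -> target."""
    return translate(translate(tokens, source_to_pivot), pivot_to_target)


via_english = pivot_translate(["银行"], ZH_TO_EN, EN_TO_FR)
direct = translate(["银行"], ZH_TO_FR)
print(via_english, direct)
```

In this sketch the pivot route loses the distinction between the two senses of "bank" once the text is in English, which is why the quality of the English step matters so much.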

What Are the Main Challenges of this Use Case?

Challenges of automatic translation abound, especially between languages from vastly different language families. In addition, computational linguistics already has trouble determining tone; in translation, this becomes an even greater difficulty. Translating creative writing is therefore much more difficult than translating technical writing.

Further, a lot of text submitted for translation contains typos, grammatical errors, and emojis. Enabling a machine translation service to interpret such noisy input correctly is an ongoing challenge.
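
One common way to reduce this kind of noise is to normalize the text before it reaches the translation model. The following is a minimal sketch using only the Python standard library, not a description of how any particular service works. The vocabulary list, the similarity cutoff, and the function names are assumptions chosen for illustration, and removing characters in the Unicode "So" category catches many emojis but not every emoji sequence.

```python
import difflib
import unicodedata
from typing import List

# Hypothetical vocabulary; a real system would use a much larger word list.
VOCAB: List[str] = ["please", "translate", "this", "message", "thanks"]


def strip_symbols(text: str) -> str:
    """Drop characters in Unicode category 'So' (Symbol, other), which includes many emojis."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "So")


def correct_token(token: str, vocab: List[str] = VOCAB, cutoff: float = 0.8) -> str:
    """Replace a token with its closest vocabulary word, if one is similar enough."""
    if token in vocab:
        return token
    matches = difflib.get_close_matches(token, vocab, n=1, cutoff=cutoff)
    return matches[0] if matches else token  # leave unknown words untouched


def normalize(text: str) -> str:
    """Lowercase, strip symbol characters, and fix near-miss spellings."""
    cleaned = strip_symbols(text.lower())
    return " ".join(correct_token(token) for token in cleaned.split())


print(normalize("Plese translte this mesage 😀"))
```

Simply deleting emojis is a design shortcut here; they often carry tone, so a production system might map them to words instead of discarding them.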

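Finally, building an automatic translation system requires specialized tools and techniques, such as the Keras deep learning library and neural network architectures like recurrent neural networks (RNNs) and long short-term memory (LSTM) networks.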
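
To make the LSTM idea more concrete, here is a minimal sketch of a single LSTM time step written in plain Python, using the standard gate equations. It is meant as an illustration of the architecture, not as how Keras implements it internally; the `params` layout and all function names are assumptions made for this example, and a real project would rely on a library layer rather than hand-written loops.

```python
import math
from typing import Dict, List, Tuple

Matrix = List[List[float]]
Vector = List[float]
GateParams = Tuple[Matrix, Matrix, Vector]  # (W for input, U for hidden state, bias)


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def affine(W: Matrix, U: Matrix, b: Vector, x: Vector, h: Vector) -> Vector:
    """Compute W @ x + U @ h + b with plain lists."""
    return [
        sum(w * xi for w, xi in zip(w_row, x))
        + sum(u * hi for u, hi in zip(u_row, h))
        + b_i
        for w_row, u_row, b_i in zip(W, U, b)
    ]


def lstm_step(
    x: Vector, h_prev: Vector, c_prev: Vector, params: Dict[str, GateParams]
) -> Tuple[Vector, Vector]:
    """One LSTM step: returns the new hidden state and the new cell state."""
    i = [sigmoid(v) for v in affine(*params["input"], x, h_prev)]        # input gate
    f = [sigmoid(v) for v in affine(*params["forget"], x, h_prev)]       # forget gate
    o = [sigmoid(v) for v in affine(*params["output"], x, h_prev)]       # output gate
    g = [math.tanh(v) for v in affine(*params["candidate"], x, h_prev)]  # candidate values
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


# Usage: 2 input features, 1 hidden unit, all weights set to zero for simplicity.
zero_gate: GateParams = ([[0.0, 0.0]], [[0.0]], [0.0])
params = {name: zero_gate for name in ("input", "forget", "output", "candidate")}
h, c = lstm_step([1.0, 2.0], [0.3], [0.8], params)
```

The cell state `c` is what lets the network carry information across many words in a sentence, which is useful when the words that determine a translation are far apart.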

Interesting Case Studies and Blogs to Look Into

Analytics Vidhya: A Must-Read NLP Tutorial on Neural Machine Translation – The Technique Powering Google Translate
TranslateFX: What is Neural Machine Translation & How does it work?

Tangible Examples of Impact

[Facebook research assistant Angela] Fan noted that many machine translation models begin by translating from Chinese to English first, and then from English to French. This is done “because English training data is the most widely available,” she said. But such a method can lead to mistakes in translation.
“Our model directly trains on Chinese to French data to better preserve meaning,” Fan said. Facebook said the system outperformed English-centered systems in a widely used system that uses data to measure the quality of machine translations.

VOA Learning English: Facebook Develops Machine Translation System for 100 Languages

Relevant datasets

INTERConnect Analytics AdaptiveNLP™

AdaptiveNLP™ provides an adaptive set of insights based on historic and ongoing analysis of the language used by input data sources.

Semasio Audience Targeting

by Semasio

Semasio Audience Targeting uses the Semasio semantic approach to optimize marketing strategies. This approach uses records of keywords and phrases used by site visitors to create Semantic User Profiles. Then Semasio takes keyword and phrasal similarities in the browsing habits of established customers to create Seed Audiences that you can use to plan your marketing campaigns.

In each case, Semasio provides companies the ability to tailor their marketing approach with either specific or more general keywords.

Semasio Brand Fit Targeting

by Semasio

Semasio Brand Fit Targeting applies the Semasio semantic approach to marketing and audience segmentation, using records of keywords and phrases used by site visitors to create Semantic User Profiles. With this data set, companies can also gain greater control over targeted campaigns by excluding certain keyphrases and webpages from consideration.

Global Tone Communication Language Technology

by Global Tone Communication

Global Tone Communication Language Technology provides multilingual machine translation, computer-aided translation, and human-machine collaborative translation solutions in different fields and industries.

Similar Data Providers

  • The Arabesque Group
    Established in 2013, the Arabesque Group is a leading global financial technology company that combines AI with environmental, social and governance (ESG) data to assess the performance and sustainability of corporations worldwide. In addition to their Asset Management consultation service, the group offers Arabesque S-Ray GmbH and Arabesque AI Ltd. datasets.
  • Black Box Intelligence Consumer Intelligence
    Black Box Intelligence Consumer Intelligence is designed to provide detailed analysis on individual competitor sales and performance data.
  • Home by Vendigi
    Home by Vendigi provides audience data on home buyers, remodelers, and sellers. Their data comes from first-party sources like top multiple listing systems (MLSs) and major brokers like RE/MAX, Coldwell Banker, Century 21, and Sotheby's. Users of Vendigi's Home data range from home and garden retailers to insurance institutions to telecom companies.