On Wednesday, Google announced an expansion of Google Translate to support 24 under-resourced languages, including Ilocano, through ‘unlocking zero-resource machine translation’.
Google created “monolingual datasets by developing and using specialized neural language identification models combined with novel filtering approaches” for these languages.
The update is part of Google’s effort described in “Building Machine Translation Systems for the Next Thousand Languages.” In it, Google describes building high-quality monolingual datasets for over a thousand languages that have no translation datasets available, and demonstrates how monolingual data alone can be used to train machine translation models.
Google said that machine translation (MT) technology has made significant advances in recent years as deep learning has been integrated with natural language processing (NLP), and performance on WMT research benchmarks has soared. Despite that, existing translation services cover only around 100 languages in total.
The languages covered are overwhelmingly European, and regions of high linguistic diversity, such as Africa and the Americas, have been largely overlooked.
Google states that two bottlenecks must be addressed to build MT models of sufficient quality for many more languages. The first is data scarcity: “digitized data for many languages is limited and can be difficult to find on the web due to quality issues with Language Identification (LangID) models.” The second is modeling limitations: “MT models usually train on large amounts of parallel (translated) text, but without such data, models must learn to translate from limited amounts of monolingual text, which is a novel area of research.”
Google also presented relevant graphs and data, saying that “automatically gathering usable textual data for under-resourced languages is much more difficult than it may seem.”
As part of its research, Google also acknowledged contributions from over 100 people and institutions who are native speakers of these languages.
Moreover, Google stressed that the translation quality produced by the new models still lags behind that of higher-resourced languages.
“These models are certainly a useful first tool for understanding content in under-resourced languages, but they will make mistakes and exhibit their own biases,” Google added.
Read the full Google AI blog here.