The translation service from HMS Core ML Kit supports multiple languages and, when combined with other services, suits a wide range of scenarios.
The translation service is perfect for those who travel overseas. Combined with the text to speech (TTS) service, an app can help users communicate with speakers of other languages in scenarios such as taking a taxi or ordering food. What's more, when translation works with text recognition, the two services help users understand menus or road signs simply from a photo.
Translation Delivers Better Performance with a New Direct MT System
Most machine translation (MT) systems are pivot-based: they first translate the source language into a third language (known as the pivot language, usually English) and then translate from that pivot language into the target language.
This process, however, compromises translation accuracy and wastes compute resources. Apps need a translation service that is both more efficient and more accurate, particularly when handling idiomatic language.
To meet such requirements, HMS Core ML Kit has strengthened its translation service by introducing a direct MT system in its new version, which supports translation between Chinese and Japanese, Chinese and German, Chinese and French, and Chinese and Russian.
Compared with MT systems that adopt English as the pivot language, the direct MT system has a number of advantages. For example, it can concurrently process 10 translation tasks of 100 characters each with an average latency of about 160 milliseconds, roughly twice the speed of the pivot-based approach. Translation quality is also remarkable: when translating culture-loaded expressions in Chinese, the system keeps the output idiomatic in the target language, as well as accurate and smooth.
As an entry in the shared task "Triangular MT: Using English to improve Russian-to-Chinese machine translation" at the Sixth Conference on Machine Translation (WMT21), the direct MT system adopted by ML Kit won first place by a clear margin.
Technical Advantages of the Direct MT System
The direct MT system builds on Huawei's pioneering research in machine translation, using Russian-English and English-Chinese corpora for knowledge distillation. Combined with an explicit curriculum learning (CL) strategy, this yields high-quality Russian-Chinese translation models even when only a small amount of Russian-Chinese corpora exists, or none at all. In this way, the system avoids the low-resource and cold-start issues that usually hamper pivot-based MT systems.
Direct MT
Technology 1: Multi-Lingual Encoder-Decoder Enhancement
This technology overcomes the cold start issue. Take Russian-Chinese translation as an example: English-Chinese corpora are imported into a multi-lingual model and knowledge distillation is performed on them so that the decoder better handles the target language (here, Chinese). Russian-English corpora are likewise imported to help the encoder better handle the source language (here, Russian).
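As a rough illustration of the distillation idea (the distributions and numbers below are invented for illustration, not ML Kit's actual models), a student model is trained to match a teacher's soft output distribution, and the cross-entropy between them falls as training succeeds:

```python
import math

def distillation_loss(teacher_probs, student_probs):
    """Cross-entropy of the student against the teacher's soft targets.
    Minimizing this pushes the student's distribution toward the teacher's."""
    return -sum(t * math.log(s) for t, s in zip(teacher_probs, student_probs))

# Toy next-token distributions over a 3-word vocabulary.
teacher = [0.7, 0.2, 0.1]            # well-trained "teacher" model
student_before = [0.34, 0.33, 0.33]  # untrained "student" model
student_after = [0.65, 0.22, 0.13]   # student after some distillation steps

# The loss drops as the student approaches the teacher's behaviour.
print(distillation_loss(teacher, student_before)
      > distillation_loss(teacher, student_after))  # True
```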
Technology 2: Explicit CL for Denoising
Sourced from HW-TSC's Participation in the WMT 2021 Triangular MT Shared Task
Explicit CL is used to train the direct MT system. Based on the volume of noisy data in the corpora, the training process is divided into three phases that adopt an incremental learning approach.
In the first phase, all the corpora (including the noisy data) are used to train the system, quickly raising its convergence rate. In the second phase, the corpora are denoised with a parallel text aligning tool, and incremental training is performed. In the final phase, incremental training continues on the denoised corpora output by the system in the second phase, until the system converges.
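A minimal sketch of this three-phase schedule, with a toy stand-in for the alignment-based denoising tool (the data and the noise heuristic here are invented for illustration):

```python
def looks_aligned(pair):
    """Stand-in for a parallel-text aligning tool: flags pairs whose
    source and target lengths differ wildly as probable noise."""
    src, tgt = pair
    ratio = len(src.split()) / max(len(tgt.split()), 1)
    return 0.5 <= ratio <= 2.0

def curriculum_phases(corpus):
    """Yield (phase_name, training_data) following the three-phase schedule."""
    yield "phase 1: all data (incl. noise)", corpus          # fast convergence
    denoised = [p for p in corpus if looks_aligned(p)]
    yield "phase 2: denoised data", denoised                 # incremental training
    # Phase 3 would retrain on corpora the system itself re-outputs;
    # here we reuse the denoised set as a stand-in.
    yield "phase 3: system-output data", denoised

corpus = [("a b c", "x y z"), ("a", "x y z w v u t s")]  # second pair is noisy
for name, data in curriculum_phases(corpus):
    print(name, len(data))
```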
Technology 3: FTST for Data Augmentation
FTST stands for forward translation and sampling backward translation. It uses the sampling method in its backward model for data augmentation, and the beam search method in its forward models for data balancing. In comparison experiments, FTST delivered the best results.
Sourced from HW-TSC's Participation in the WMT 2021 Triangular MT Shared Task
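The decoding contrast at the heart of FTST can be sketched in a small, self-contained example (the vocabulary and probabilities below are invented for illustration; real systems decode with full NMT models):

```python
import random

VOCAB = ["the", "cat", "sat"]
PROBS = [0.6, 0.3, 0.1]  # toy next-token distribution of a translation model

def greedy_token():
    """Beam-search-like decoding (beam size 1): always the most likely token.
    Used in the forward direction, where balanced, fluent output is wanted."""
    return VOCAB[PROBS.index(max(PROBS))]

def sampled_token(rng):
    """Sampling decoding: draw from the full distribution.
    Used in the backward direction to produce diverse synthetic source text."""
    return rng.choices(VOCAB, weights=PROBS, k=1)[0]

rng = random.Random(0)
forward = {greedy_token() for _ in range(20)}      # always the same token
backward = {sampled_token(rng) for _ in range(20)}  # varied tokens

print(sorted(forward))    # ['the']
print(len(backward) > 1)  # True: sampling yields diversity
```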
In addition to the languages mentioned above, ML Kit's translation service will support direct translation between Chinese and 11 more languages (Korean, Portuguese, Spanish, Turkish, Thai, Arabic, Malay, Italian, Polish, Dutch, and Vietnamese) by the end of 2022, opening up a new level of instant translation for users around the world.
The translation service can be used together with many other services from ML Kit. Check them out and see how they can help you develop an AI-powered app.
Related
Text Recognition with ML Kit
ML Kit gives developers the ability to integrate text recognition into their apps. When developing your HMS-powered app, you have two options: the text recognition API can be on-device or in-cloud. The on-device service recognizes Simplified Chinese, Japanese, Korean, and Latin-based languages (including English, Spanish, Portuguese, Italian, German, French, Russian, and special characters). The in-cloud API is more robust and recognizes a wider variety of languages, including Simplified Chinese, English, Spanish, Portuguese, Italian, German, French, Russian, Japanese, Korean, Polish, Finnish, Norwegian, Swedish, Danish, Turkish, Thai, Arabic, Hindi, and Indonesian.
The text recognition service is able to recognize text in both static images and dynamic camera streams with a host of APIs, which you can call synchronously or asynchronously to build your text recognition-enabled apps.
Using the ML Kit demo APK, you can see this technology in action. The app quickly and accurately recognizes any text your camera is pointed at; large text blocks are converted into actual text input for your phone in under a second. Translation features are also impressively fast, reading your words back to you in another language of your choice. This APK shows the extent of what the kit can do and how much easier it makes developing these features.
How Developers are Implementing Text Recognition
There are many different ways that developers are taking advantage of ML Kit's text recognition. The ability to point your phone at some text and save it to your device opens many possibilities for great app ideas. You can use text recognition to quickly save the information off of a business card, translate text, create documents, and much more. Any situation where you can avoid requiring users to manually input text should be taken advantage of. This makes your app easier and quicker to use.
Whether a developer uses the on-device API or the in-cloud API depends on the needs of their app. The on-device API lets you add real-time processing of images from the camera stream: a user can point their camera at some text, and the phone uses ML Kit to recognize it in real time. The in-cloud API is better for high-accuracy text recognition from images and documents, but cannot perform real-time recognition from a camera.
Developer Resources
Huawei provides plenty of documentation and guides to help you get started with ML Kit's text recognition. You can get started with this guide here.
For all of the functions of ML Kit, refer to their service portal here.
For an overview of their APIs, browse the comprehensive resource library here.
You can also look at different ways that ML Kit can be implemented, by seeing a collection of sample codes here.
Dear Translate on HMS vs Google Translate on GMS
Being able to translate language can come in very handy when traveling to different countries. Smartphones can translate speech, text, or photos in real time. Many different apps offer these features, but we are going to focus on Dear Translate and Google Translate.
While Google Translate is one of the most popular solutions for this, it's not available to newer Huawei phones that don't support GMS. Dear Translate is an HMS alternative that is available for free from the Huawei AppGallery.
Google Translate on GMS
Google Translate offers instant translation from many different types of inputs. It works with text input, camera access, or audio input. Pointing your camera at a sign or literature from another language will let Google Translate translate the text into your chosen language. To use the audio input, you can just speak your sentence into your phone, and then let your phone read it out in the target language.
When using Google Translate to communicate in real time, the best feature is "Conversation". It lets you choose two languages and provides an interface designed to be used by both parties: each person presses their microphone icon when they speak, and the speech is translated for both people.
For offline translation, you can download 59 different languages, so the app works in locations where the internet is unavailable.
Google Translate Features:
Text translation: Translate between 103 languages by typing
Tap to Translate: Copy text in any app and tap the Google Translate icon to translate (all languages)
Offline: Translate with no internet connection (59 languages)
Instant camera translation: Translate text in images instantly by just pointing your camera (88 languages)
Photos: Take or import photos for higher quality translations (50 languages)
Conversations: Translate bilingual conversations on the fly (43 languages)
Handwriting: Draw text characters instead of typing (95 languages)
Phrasebook: Star and save translated words and phrases for future reference (all languages)
Cross-device syncing: Login to sync phrasebook between app and desktop
Dear Translate on HMS:
If you're looking for an HMS alternative to Google Translate, Dear Translate might be your best option. This free app is available on AppGallery and is currently ad-free. It supports many similar features to Google Translate like text translate, camera-based AR translate, voice translate, and more.
The app is missing a feature comparable to Google Translate's Conversation feature. This was the only feature I found myself missing in Dear Translate, and you can get by without it. Overall, Dear Translate supports more languages than Google Translate for the features it does offer.
Dear Translate Features:
Languages: supports translation in 107 languages, such as English, Chinese, Japanese, Korean, French, Russian, and Spanish, covering 186 countries and meeting translation needs for study, work, and travel abroad.
Text translation: translates typed text, in all 107 languages.
AR translation: translates upon scanning with the camera, with no need to take a photo.
Simultaneous translation: with streaming speech recognition, translates while you are speaking.
Photo translation: camera-based OCR word capturing and photo translation let you translate what you shoot.
Emotion translation: fun emotion translation that makes results more engaging.
Offline translation: a free dictionary and translation app supporting offline translation, for traveling abroad when no network connection is available.
Dear Translate is a great free app that is a nice solution for anyone who needs to translate languages using their Huawei phone. The app really highlights the abilities of HMS Core.
ML Kit:
Added the face verification service, which compares captured faces with existing face records to generate a similarity value, and then determines whether the faces belong to the same person based on the value. This service helps safeguard your financial services and reduce security risks.
Added the capability of recognizing Vietnamese ID cards.
Reduced hand keypoint detection delay and added gesture recognition capabilities, with support for 14 gestures. Gesture recognition is widely used in smart household, interactive entertainment, and online education apps.
Added on-device translation support for 8 new languages, including Hungarian, Dutch, and Persian.
Added support for automatic speech recognition (ASR), text to speech (TTS), and real-time transcription services in Chinese and English on all mobile phones, and in French, Spanish, German, and Italian on Huawei and Honor phones.
Other updates: Optimized image segmentation, document skew correction, sound detection, and the custom model feature; added audio file transcription support for uploading multiple long audio files on devices.
Learn More
Nearby Service:
Added Windows to the list of platforms that Nearby Connection supports, allowing you to receive and transmit data between Android and Windows devices. For example, you can connect a phone to a computer as a touch panel, or use a phone to make a payment after placing an order on the computer.
Added iOS and macOS to the list of systems that Nearby Message supports, allowing beacon messages to be received on iOS and macOS devices. For example, after entering a store with beacons deployed, users can receive the store's marketing messages.
Learn More
Health Kit:
Added the details and statistical data type for medium- and high-intensity activities.
Learn More
Scene Kit:
Added fine-grained graphics APIs, including those of classes for resources, scenes, nodes, and components, helping you realize more accurate and flexible scene rendering.
Shadow features: Added support for real-time dynamic shadows and the function of enabling or disabling shadow casting and receiving for a single model.
Animation features: Added support for skeletal animation and morphing, playback controller, and playback in forward, reverse, or loop mode.
Added support for asynchronous loading of assets and loading of glTF files from external storage.
Learn More
Computer Graphics Kit:
Added multithreaded rendering capability to significantly increase frame rates in scenes with high CPU usage.
Learn More
Made necessary updates to other kits. Learn More
New Resources
Analytics Kit:
Sample Code: Added the Kotlin sample code to hms-analytics-demo-android and the Swift sample code to hms-analytics-demo-ios.
Learn More
Dynamic Tag Manager:
Sample Code: Added the iOS sample code to hms-dtm-demo-ios.
Learn More
Identity Kit:
Sample Code: Added the Kotlin sample code to hms-identity-demo.
Learn More
Location Kit:
Sample Code: Updated methods for checking whether GNSS is supported and whether the GNSS switch is turned on in hms-location-demo-android-studio; optimized the permission verification process to improve user experience.
Learn More
Map Kit:
Sample Code: Added guide for adding dependencies on two fallbacks to hms-mapkit-demo, so that Map Kit can be used on non-Huawei Android phones and in other scenarios where HMS Core (APK) is not available.
Learn More
Site Kit:
Sample Code: Added the strictBounds attribute to NearbySearchRequest in hms-sitekit-demo, which indicates whether to strictly restrict place search to the bounds specified by location and radius, and added the corresponding attribute to QuerySuggestionRequest and SearchFilter, indicating whether to strictly restrict place search to the bounds specified by Bounds.
Learn More
Our lives are now packed with advanced devices, such as mobile gadgets, wearables, smart home appliances, telematics devices, and more.
Of all the features that make them advanced, the major one is the ability to understand user speech. Speaking to a device and telling it what to do is naturally easier and more satisfying than using input devices (like a keyboard and mouse) for the same purpose.
To help devices understand human speech, HMS Core ML Kit introduced the automatic speech recognition (ASR) service, to create a smoother human-machine interaction experience.
Service Introduction
ASR can recognize speech no longer than 60 seconds and simultaneously convert it into text, using industry-leading deep learning technologies. With regularly updated algorithms and data, the service currently delivers a recognition accuracy of 95%+. Supported languages are: Mandarin Chinese (including Chinese-English bilingual speech), English, French, German, Spanish, Italian, Arabic, Russian, Thai, Malay, Filipino, and Turkish.
Demo
Use Cases
ASR covers many fields spanning life and work. It enhances recognition capabilities for searching for products, movies, TV series, and music, as well as for navigation services. When a user searches for a product in a shopping app by speech, the service recognizes the product name or feature in the speech and outputs it as text for the search.
Similarly, when a user uses a music app, this service recognizes the song name or singer input by voice as text to search for the song.
On top of this, ASR can even contribute to driving safety. While driving, when users should not pick up their phone to, say, search for a place, ASR lets them simply say where they want to go and converts the speech into text for the navigation app, which can then offer the search results.
Features
Real-time result output
Available options: with and without speech pickup UI
Endpoint detection: Start and end points of speech can be accurately located.
Silence detection: No voice packet is sent for silent parts.
Intelligent conversion of number formats: For example, when the speech is "year two thousand twenty-two", the text output by ASR will be "2022".
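As an illustration of what such number-format conversion involves, here is a deliberately tiny word-to-digit parser (a hypothetical sketch; ML Kit's actual post-processing is not public and is certainly far more sophisticated):

```python
WORDS = {"zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
         "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
         "twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
         "hundred": 100, "thousand": 1000}

def words_to_number(phrase):
    """Very small number-word parser: handles phrases like
    'two thousand twenty-two'. Real ASR post-processing is far richer."""
    total, current = 0, 0
    for word in phrase.lower().replace("-", " ").split():
        if word not in WORDS:
            continue                  # skip filler words like "year"
        value = WORDS[word]
        if value in (100, 1000):      # multiplier words scale the running group
            current = max(current, 1) * value
            if value == 1000:
                total, current = total + current, 0
        else:
            current += value
    return total + current

print(words_to_number("year two thousand twenty-two"))  # 2022
```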
How to Integrate ML Kit?
For guidance on ML Kit integration, please refer to its official documentation. You are also welcome to visit the HUAWEI Developers website, where you can find other resources for reference.
Optical character recognition (OCR) technology efficiently recognizes and extracts text in images of receipts, business cards, documents, and more, freeing us from the hassle of manually entering and checking text. This tech helps mobile apps cut the cost of information input and boost their usability.
So far, OCR has been applied to numerous fields, including the following:
In transportation scenarios, OCR is used to recognize license plate numbers for easy parking management, smart transportation, policing, and more.
In lifestyle apps, OCR helps extract information from images of licenses, documents, and cards — such as bank cards, passports, and business licenses — as well as road signs.
The technology also works for receipts, which is ideal for banks and tax institutes for recording receipts.
It doesn't stop there: books, reports, CVs, and contracts can all be saved digitally with the help of OCR.
How HMS Core ML Kit's OCR Service Works
HMS Core ML Kit released its OCR service, text recognition, on Jan. 15, 2020, featuring abundant APIs. The service can accurately recognize text that is tilted, typeset horizontally or vertically, or curved. Not only that, it can even precisely present how text is divided among paragraphs.
Text recognition offers both cloud-side and device-side services; the device-side service provides privacy protection when recognizing specific cards, licenses, and receipts. It performs real-time recognition of text in images or camera streams on the device, and also supports sparse text in images. The device-side service supports 10 languages: Simplified Chinese, Japanese, Korean, English, Spanish, Portuguese, Italian, German, French, and Russian.
The cloud-side service, by contrast, delivers higher accuracy and supports dense text in images of documents and sparse text in other types of images. This service supports 19 languages: Simplified Chinese, English, Spanish, Portuguese, Italian, German, French, Russian, Japanese, Korean, Polish, Finnish, Norwegian, Swedish, Danish, Turkish, Thai, Arabic, and Hindi. The recognition accuracy for some of the languages is industry-leading.
The OCR service was further improved in ML Kit, providing a lighter device-side model and higher accuracy. The following is a demo screenshot for this service.
How Text Recognition Has Been Improved
Lighter device-side model, delivering better recognition performance for all supported languages
The device-side model has been downsized by 42% without compromising KPIs, and the memory the service consumes during runtime has decreased from 19.4 MB to around 11.1 MB.
As a result, the service runs more smoothly. Cloud-side accuracy for recognizing Chinese has also increased, from 87.62% to 92.95%, higher than the industry average.
Technology Specifications
OCR is a process in which an electronic device examines characters printed on paper, detecting dark and light areas to determine each character's shape, and then translates the shapes into computer text using a character recognition method. In short, OCR is a technology (designed for printed characters) that converts the text in an image into a black-and-white dot matrix, then uses recognition software to turn that matrix into editable text.
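The thresholding step that produces such a dot matrix can be illustrated in a few lines of Python (toy data; real OCR pipelines use adaptive thresholds and much larger images):

```python
def binarize(gray, threshold=128):
    """Turn a grayscale image (rows of 0-255 pixel values) into the
    black-and-white dot matrix that character recognition operates on."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]

# A tiny 5x3 grayscale patch with a dark vertical stroke in the middle,
# roughly how the letter "I" might land on a sensor.
patch = [
    [250,  30, 250],
    [250,  25, 250],
    [250,  20, 250],
    [250,  28, 250],
    [250,  35, 250],
]
for row in binarize(patch):
    print(row)  # the dark stroke becomes a column of 1s
```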
In many cases, image text is curved, and therefore the algorithm team for text recognition re-designed the model of this service. They managed to make it support not only horizontal text, but also text that is tilted or curved. With such a capability, the service delivers higher accuracy and usability when it is used in transportation scenarios and more.
Compared with the cloud-side service, however, the device-side service is more suitable when the text to be recognized concerns privacy. Its performance can be affected by factors such as device computing power and power consumption. With these in mind, the team designed the model framework and adopted technologies such as quantization and pruning, reducing the model size without compromising recognition accuracy, to ensure a good user experience.
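Magnitude pruning and uniform quantization can be sketched on a toy weight vector (illustrative only; production model compression operates on full tensors with careful calibration):

```python
def prune(weights, keep_ratio=0.5):
    """Magnitude pruning: zero out the smallest-magnitude weights,
    keeping roughly keep_ratio of them."""
    k = int(len(weights) * keep_ratio)
    threshold = sorted(abs(w) for w in weights)[-k] if k else float("inf")
    return [w if abs(w) >= threshold else 0.0 for w in weights]

def quantize(weights, levels=16):
    """Uniform quantization: snap each weight to one of `levels` evenly
    spaced values between the min and max weight."""
    lo, hi = min(weights), max(weights)
    step = (hi - lo) / (levels - 1)
    return [round((w - lo) / step) * step + lo for w in weights]

w = [0.82, -0.05, 0.33, -0.71, 0.02, 0.49]
pruned = prune(w, keep_ratio=0.5)
print(pruned)                 # small-magnitude weights zeroed
print(quantize(w, levels=4))  # each weight snapped to a coarse grid
```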
Performance After Update
The updated text recognition service performs even better. Its cloud-side service delivers an accuracy 7% higher than its competitor's, at 55% of the competitor's latency.
As for the device-side service, it has a superior average accuracy and model size. In fact, the recognition accuracy for some minor languages is up to 95%.
Future Updates
Most OCR solutions currently support only printed characters. The ML Kit text recognition team is working on handwriting recognition; in future versions, the service will be able to recognize both printed characters and handwriting.
The number of supported languages will grow to include languages such as Romanian, Malay, Filipino, and more.
The service will be able to analyze the layout so that it can adjust PDF typesetting. By supporting more and more types of content, ML Kit remains committed to honing its AI edge.
In this way, the kit, together with other HMS Core services, will try to meet the tailored needs of apps in different fields.
References
HMS Core ML Kit home page
HMS Core ML Kit Development Guide