HUAWEI ML Kit's Automatic Speech Recognition (ASR) Service: Tongue Twisters - Huawei Developers

Overview
When I try to use voice commands on my devices, they often fail to recognize what I'm saying because of my poor pronunciation. For example, I sometimes can't distinguish between syllables, or make the "ch" and "sh" sounds, which has led to some frustrating experiences. I've always envied people who can enunciate well and recite tongue twisters with ease, and have dreamed of the day when that could be me. By chance, I came across the game Tongue Twister, which integrates HUAWEI ML Kit's ASR service, and it has changed my life for the better. Let's take a look at how the game works.
Application Scenarios
There are five levels in Tongue Twister, and as you'd expect, each level contains a tongue twister. The key to passing each level is ML Kit's ASR service. By integrating the service, the game can recognize the player's voice with a high degree of accuracy, so players pass a level when they enunciate clearly. The service has also proven highly useful in other fields, enhancing recognition capabilities for product, movie, and music searches, as well as navigation services.
Now, let's look at what the game looks like in practice.
Piqued your interest? With the ASR service, why not create a tongue twister game of your own? Here's how...
Development Procedures
1. For details about how to set the authentication information for your app, please refer to Notes on Using Cloud Authentication Information.
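For example, if you use the API key approach, it can be set once during app initialization. A minimal sketch using the ML Kit SDK's MLApplication entry point (the placeholder key is an assumption; use the key from your AppGallery Connect project):
Code:
MLApplication.getInstance().setApiKey("your-api-key");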
2. Call an API to create a speech recognizer.
Code:
MLAsrRecognizer mSpeechRecognizer = MLAsrRecognizer.createAsrRecognizer(context);
3. Create a speech recognition result listener callback.
Code:
/**
 * Implement the MLAsrListener API and its callback methods.
 */
protected class SpeechRecognitionListener implements MLAsrListener {
    @Override
    public void onStartListening() {
        // The recorder starts to receive speech.
    }

    @Override
    public void onStartingOfSpeech() {
        // The speech recognizer detects that the user has started to speak.
    }

    @Override
    public void onVoiceDataReceived(byte[] data, float energy, Bundle bundle) {
        // Return the original PCM stream and audio power to the user.
    }

    @Override
    public void onRecognizingResults(Bundle partialResults) {
        // Receive intermediate recognized text from MLAsrRecognizer.
    }

    @Override
    public void onResults(Bundle results) {
        // Receive the final recognized text.
    }

    @Override
    public void onError(int error, String errorMessage) {
        // Called when an error occurs. Without this callback, the app cannot
        // respond when, for example, the network connection is lost.
    }

    @Override
    public void onState(int state, Bundle params) {
        // Notify the app of status changes.
    }
}
4. Bind the new result listener callback to the speech recognizer.
Code:
mSpeechRecognizer.setAsrListener(new SpeechRecognitionListener());
5. Set the recognition parameters and initiate speech recognition.
Code:
// Set parameters and start the audio device.
Intent mSpeechRecognizerIntent = new Intent(MLAsrConstants.ACTION_HMS_ASR_SPEECH);
mSpeechRecognizerIntent
        // Set the recognition language. If this parameter is not set, English is recognized by default.
        // Examples: "zh-CN": Chinese; "en-US": English; "fr-FR": French; "es-ES": Spanish; "de-DE": German; "it-IT": Italian.
        .putExtra(MLAsrConstants.LANGUAGE, language)
        // Set how results are returned. If this parameter is not set, FEATURE_WORDFLUX is used by default. Options:
        // MLAsrConstants.FEATURE_WORDFLUX: recognizes and returns text in real time through onRecognizingResults.
        // MLAsrConstants.FEATURE_ALLINONE: returns text through onResults after recognition is complete.
        .putExtra(MLAsrConstants.FEATURE, MLAsrConstants.FEATURE_WORDFLUX);
mSpeechRecognizer.startRecognizing(mSpeechRecognizerIntent);
6. Release resources when the recognition ends.
Code:
if (mSpeechRecognizer != null) {
    mSpeechRecognizer.destroy();
    mSpeechRecognizer = null;
}
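For reference, here is how the pieces above might fit together in an activity. This is a minimal sketch: the startAsr() method name and the hard-coded "en-US" language are illustrative assumptions.
Code:
private MLAsrRecognizer mSpeechRecognizer;

// Create the recognizer, bind the listener, and start recognition (steps 2-5).
private void startAsr() {
    mSpeechRecognizer = MLAsrRecognizer.createAsrRecognizer(this);
    mSpeechRecognizer.setAsrListener(new SpeechRecognitionListener());
    Intent intent = new Intent(MLAsrConstants.ACTION_HMS_ASR_SPEECH)
            .putExtra(MLAsrConstants.LANGUAGE, "en-US")
            .putExtra(MLAsrConstants.FEATURE, MLAsrConstants.FEATURE_WORDFLUX);
    mSpeechRecognizer.startRecognizing(intent);
}

// Release resources when the activity is destroyed (step 6).
@Override
protected void onDestroy() {
    super.onDestroy();
    if (mSpeechRecognizer != null) {
        mSpeechRecognizer.destroy();
        mSpeechRecognizer = null;
    }
}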
Maven repository address
Code:
buildscript {
    repositories {
        maven { url 'https://developer.huawei.com/repo/' }
    }
}
allprojects {
    repositories {
        maven { url 'https://developer.huawei.com/repo/' }
    }
}
SDK import
Code:
dependencies {
    // Real-time transcription SDK (for long-form speech).
    implementation 'com.huawei.hms:ml-computer-voice-realtimetranscription:2.0.3.300'
    // Automatic speech recognition SDK.
    implementation 'com.huawei.hms:ml-computer-voice-asr:2.0.3.300'
    // Automatic speech recognition plugin.
    implementation 'com.huawei.hms:ml-computer-voice-asr-plugin:2.0.3.300'
}
Manifest files
Code:
<manifest
    ...
    <meta-data
        android:name="com.huawei.hms.ml.DEPENDENCY"
        android:value="asr" />
    ...
</manifest>
Permission
Code:
<uses-permission android:name="android.permission.RECORD_AUDIO" />
Dynamic permission application
Code:
private void requestAudioPermission() {
    final String[] permissions = new String[]{Manifest.permission.RECORD_AUDIO};
    if (!ActivityCompat.shouldShowRequestPermissionRationale(this, Manifest.permission.RECORD_AUDIO)) {
        ActivityCompat.requestPermissions(this, permissions, TongueTwisterActivity.AUDIO_CODE);
        return;
    }
    // Otherwise, show a rationale to the user before requesting the permission.
}
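After the request, the system delivers the user's choice through onRequestPermissionsResult. A minimal sketch of handling it (starting recognition via the startAsr() sketch above is an assumption):
Code:
@Override
public void onRequestPermissionsResult(int requestCode, String[] permissions, int[] grantResults) {
    super.onRequestPermissionsResult(requestCode, permissions, grantResults);
    if (requestCode == TongueTwisterActivity.AUDIO_CODE && grantResults.length > 0
            && grantResults[0] == PackageManager.PERMISSION_GRANTED) {
        // The microphone permission was granted, so it is safe to start recognition.
        startAsr();
    }
}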
Summary
In addition to games, ML Kit's ASR service is also useful in other scenarios, such as shopping apps. The service can recognize a spoken product name or feature and convert it into text to search for the product. In music apps, it can likewise recognize song and artist names. For navigation, a driver will naturally prefer to speak a destination rather than type it; ASR converts the speech into text, for an optimally safe driving experience.
Learn More
For more information, please visit HUAWEI Developers.
For detailed instructions, please visit Development Guide.
You can join the HMS Core developer discussion by going to Reddit.
You can download the demo and sample code on GitHub.
To solve integration problems, please go to Stack Overflow.

Related

How to use HUAWEI ML Kit service to quickly develop a photo translation app

A photo translation app is quite useful when traveling abroad, and this article will help developers build such an app in a short time. We use HUAWEI ML Kit to build the app, which greatly accelerates the development process.
Introduction
Many of us love to travel, and sometimes it's nice to go abroad for a tour. Before the trip, we plan out everything: what to eat, what to wear, where to stay, and which routes to take.
The imagined trip:
Before departure, you may picture a destination with beautiful buildings:
Delicious food
Beautiful women
Carefree life
The actual trip:
But in reality, if you visit a place where the language differs from your mother tongue, you may run into the following problems:
A confusing map
An unreadable menu
Indecipherable street signs
Traveling abroad without any translation tool is hard!
Photo translator will help you
With text recognition and translation services, none of the above is a problem. There are only two steps to developing a small photo translation app:
Text recognition
First take a photo, then send the image to the Huawei HMS ML Kit text recognition service.
Huawei's text recognition service provides both an on-device SDK and a cloud-based service. The on-device SDK is free and runs in real time, while the cloud service recognizes more text types with higher accuracy. In this walkthrough, we use the cloud-based capabilities.
Photo translation app development
1 Development preparation
Because cloud services are used, you need to register a developer account with the Huawei Developer Alliance and enable these services in AppGallery Connect. We won't go into the details here; just follow the official AppGallery Connect configuration and service enabling steps.
To register as a developer and enable the services, see:
https://developer.huawei.com/consumer/en/doc/development/HMS-Guides/ml-enable-service
1.1 Add the Maven repository to the project-level build.gradle
Open the Android Studio project-level build.gradle file and add the Maven repository address:
Code:
buildscript {
    repositories {
        maven { url 'https://developer.huawei.com/repo/' }
    }
}
allprojects {
    repositories {
        maven { url 'https://developer.huawei.com/repo/' }
    }
}
1.2 Add SDK dependencies to the app-level build.gradle
Integrate the SDK. (Because only cloud capabilities are used, just the base SDK packages need to be imported.)
Code:
dependencies {
    implementation 'com.huawei.hms:ml-computer-vision:1.0.2.300'
    implementation 'com.huawei.hms:ml-computer-translate:1.0.2.300'
}
1.3 Apply for camera and storage permissions in the AndroidManifest.xml file
Code:
<uses-permission android:name="android.permission.CAMERA" /><uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE" /><uses-feature android:name="android.hardware.camera" /><uses-feature android:name="android.hardware.camera.autofocus" />
2 Key steps of code development
2.1 Dynamic permission request
Code:
private static final int CAMERA_PERMISSION_CODE = 1;

@Override
public void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);
    // Check the camera and storage permissions, and request any that are missing.
    if (!allPermissionsGranted()) {
        getRuntimePermissions();
    }
}
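allPermissionsGranted() and getRuntimePermissions() are helper methods from the demo project. A minimal sketch of what they might look like, under the assumption that they simply check and request the manifest permissions above:
Code:
private String[] getRequiredPermissions() {
    return new String[]{Manifest.permission.CAMERA, Manifest.permission.WRITE_EXTERNAL_STORAGE};
}

// Returns true only if every required permission has already been granted.
private boolean allPermissionsGranted() {
    for (String permission : getRequiredPermissions()) {
        if (ContextCompat.checkSelfPermission(this, permission) != PackageManager.PERMISSION_GRANTED) {
            return false;
        }
    }
    return true;
}

// Requests any missing permissions at runtime.
private void getRuntimePermissions() {
    ActivityCompat.requestPermissions(this, getRequiredPermissions(), CAMERA_PERMISSION_CODE);
}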
2.2 Create a cloud text analyzer. You can create the analyzer from the text detection configurator MLRemoteTextSetting.
Code:
MLRemoteTextSetting setting = new MLRemoteTextSetting.Factory()
        .setTextDensityScene(MLRemoteTextSetting.OCR_LOOSE_SCENE)
        .create();
this.textAnalyzer = MLAnalyzerFactory.getInstance().getRemoteTextAnalyzer(setting);
2.3 Create an MLFrame object from an android.graphics.Bitmap for the analyzer to detect the image.
Code:
MLFrame mlFrame = new MLFrame.Creator().setBitmap(this.originBitmap).create();
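This assumes originBitmap already holds the captured photo. A minimal sketch of loading it from a file path (photoPath is a hypothetical placeholder for wherever your camera code saved the image):
Code:
// Decode the captured photo into a Bitmap before building the MLFrame.
Bitmap originBitmap = BitmapFactory.decodeFile(photoPath);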
2.4 Call the asyncAnalyseFrame method for text detection.
Code:
Task<MLText> task = this.textAnalyzer.asyncAnalyseFrame(mlFrame);
task.addOnSuccessListener(new OnSuccessListener<MLText>() {
    @Override
    public void onSuccess(MLText mlText) {
        // Processing logic for detection success.
        if (mlText != null) {
            RemoteTranslateActivity.this.remoteDetectSuccess(mlText);
        } else {
            RemoteTranslateActivity.this.displayFailure();
        }
    }
}).addOnFailureListener(new OnFailureListener() {
    @Override
    public void onFailure(Exception e) {
        // Processing logic for detection failure.
        RemoteTranslateActivity.this.displayFailure();
    }
});
2.5 Create a text translator. You can create the translator through the MLRemoteTranslateSetting class.
Code:
MLRemoteTranslateSetting.Factory factory = new MLRemoteTranslateSetting.Factory()
        // Set the target language code. The ISO 639-1 standard is used.
        .setTargetLangCode(this.dstLanguage);
if (!this.srcLanguage.equals("AUTO")) {
    // Set the source language code. The ISO 639-1 standard is used.
    factory.setSourceLangCode(this.srcLanguage);
}
this.translator = MLTranslatorFactory.getInstance().getRemoteTranslator(factory.create());
2.6 Call the asyncTranslate method to translate the text obtained through text recognition.
Code:
final Task<String> task = translator.asyncTranslate(this.sourceText);
task.addOnSuccessListener(new OnSuccessListener<String>() {
    @Override
    public void onSuccess(String text) {
        if (text != null) {
            RemoteTranslateActivity.this.remoteDisplaySuccess(text);
        } else {
            RemoteTranslateActivity.this.displayFailure();
        }
    }
}).addOnFailureListener(new OnFailureListener() {
    @Override
    public void onFailure(Exception e) {
        RemoteTranslateActivity.this.displayFailure();
    }
});
2.7 Release resources after translation.
Code:
if (this.textAnalyzer != null) {
    try {
        this.textAnalyzer.close();
    } catch (IOException e) {
        SmartLog.e(RemoteTranslateActivity.TAG, "Stop analyzer failed: " + e.getMessage());
    }
}
if (this.translator != null) {
    this.translator.stop();
}
3 Source code
The demo source code has been uploaded to GitHub (project directory: Photo translate). You can refer to it and adapt it to your own scenarios.
https://github.com/HMS-MLKit/HUAWEI-HMS-MLKit-Sample
4 Demo
5 Brainstorming
This demo shows how to use two cloud-based capabilities of HUAWEI ML Kit: text recognition and translation. These services can also power many other interesting and useful functions, such as:
[General text recognition]
1. Text recognition of bus license plates
2. Text recognition in document reading
[Card recognition]
1. Recognizing bank card numbers, for scenarios such as binding a bank card
2. Recognizing other card numbers in daily life, such as membership and discount cards
3. Recognizing ID cards, Hong Kong and Macao travel permits, and other certificate numbers
[Translation]
1. Signpost and signboard translation
2. Document translation
3. Web page translation, such as identifying the language of a website's comment section and translating it into the corresponding language
4. Translation of overseas product descriptions
5. Restaurant menu translation
For more information, see:
https://developer.huawei.com/consumer/en/doc/development/HMS-Guides/ml-introduction-4
Reply to rikkirose
rikkirose said:
Thanks for the guide. I'm not sure that this application is suitable for high-quality translation of documents, as machine translators do this poorly, but otherwise it looks very simple and convenient.
Hi rikkirose, document translation is not yet supported; it is expected to be supported in August this year. Currently, the key optimization areas for high-quality translation are news, travel, technology, and social content. If your content is outside those areas and you would still like to try the service, you can provide us with a sample, and we can verify it and improve the quality for you.
Please feel free to email the sample and detailed requirements to: [email protected]
Hi,
Nice post. Can we use HUAWEI ML Kit to translate our communication into other languages? It would help tourists communicate. Is that possible?
Very interesting, thanks


A Quick Introduction about How to Implement Sound Detection

For some apps, it's necessary to have a sound detection function that can recognize sounds like knocks on the door, rings of the doorbell, and car horns. Developing such a function can be costly for small and medium-sized developers, so what should they do?
There's no need to worry if you have the sound detection service in HUAWEI ML Kit. Integrating its SDK into your app is simple, and you can equip your app with a sound detection function that works well even when the device is not connected to the network.
Introduction to Sound Detection in HUAWEI ML Kit
This service detects sound events online through real-time recording. The detected sound events can trigger subsequent actions in your app. Currently, the following types of sound events are supported: laughter, child crying, snoring, sneezing, shouting, cat meowing, dog barking, running water (such as from taps, streams, and ocean waves), car horns, doorbells, knocking on doors, fire alarms (including smoke alarms), and sirens (such as those from fire trucks, ambulances, police cars, and air defense alarms).
Preparations
Configuring the Development Environment
Create an app in AppGallery Connect.
For details, see Getting Started with Android.
Enable ML Kit.
Click here to get more information.
After the app is created, an agconnect-services.json file will be automatically generated. Download it and copy it to the root directory of your project.
Configure the Huawei Maven repository address.
To learn more, click here.
Integrate the sound detection SDK.
It is recommended to integrate the SDK in full SDK mode. Add build dependencies for the SDK in the app-level build.gradle file.
Code:
// Import the sound detection package.
implementation 'com.huawei.hms:ml-speech-semantics-sounddect-sdk:2.1.0.300'
implementation 'com.huawei.hms:ml-speech-semantics-sounddect-model:2.1.0.300'
Add the AppGallery Connect plugin configuration as needed using either of the following methods:
Method 1: Add the following information under the declaration in the file header:
Code:
apply plugin: 'com.android.application'
apply plugin: 'com.huawei.agconnect'
Method 2: Add the plugin configuration in the plugins block:
Code:
plugins {
    id 'com.android.application'
    id 'com.huawei.agconnect'
}
Automatically update the machine learning model.
Add the following statements to the AndroidManifest.xml file. After a user installs your app from HUAWEI AppGallery, the machine learning model is automatically updated to the user's device.
Code:
<meta-data
    android:name="com.huawei.hms.ml.DEPENDENCY"
    android:value="sounddect" />
For details, go to Integrating the Sound Detection SDK.
Development Procedure
Obtain the microphone permission. If the app does not have this permission, error 12203 will be reported.
(Mandatory) Apply for the static permission.
Code:
<uses-permission android:name="android.permission.RECORD_AUDIO" />
(Mandatory) Apply for the dynamic permission.
Code:
ActivityCompat.requestPermissions(
        this, new String[]{Manifest.permission.RECORD_AUDIO}, 1);
Create an MLSoundDector object.
Code:
private static final String TAG = "MLSoundDectorDemo";
// Object of sound detection.
private MLSoundDector mlSoundDector;
// Create an MLSoundDector object and configure the callback.
private void initMLSoundDector(){
mlSoundDector = MLSoundDector.createSoundDector();
mlSoundDector.setSoundDectListener(listener);
}
Create a sound detection result callback to obtain the detection result, and pass the callback to the sound detection instance.
Code:
private MLSoundDectListener listener = new MLSoundDectListener() {
    @Override
    public void onSoundSuccessResult(Bundle result) {
        // Processing logic for detection success. The detection result ranges from 0 to 12,
        // corresponding to the 13 sound types whose names start with SOUND_EVENT_TYPE,
        // defined in MLSoundDectConstants.java.
        int soundType = result.getInt(MLSoundDector.RESULTS_RECOGNIZED);
        Log.d(TAG, "Detection success: " + soundType);
    }

    @Override
    public void onSoundFailResult(int errCode) {
        // Processing logic for detection failure. A possible cause is that your app
        // does not have the microphone permission (Manifest.permission.RECORD_AUDIO).
        Log.d(TAG, "Detection failure: " + errCode);
    }
};
Note: The code above logs the detected sound type as an integer. In practice, you can map the integer to a label that users understand, as shown below.
Definition for the types of detected sounds:
Code:
<string-array name="sound_dect_voice_type">
<item>laughter</item>
<item>baby crying sound</item>
<item>snore</item>
<item>sneeze</item>
<item>shout</item>
<item>cat's meow</item>
<item>dog's bark</item>
<item>running water</item>
<item>car horn sound</item>
<item>doorbell sound</item>
<item>knock</item>
<item>fire alarm sound</item>
<item>alarm sound</item>
</string-array>
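With this array in place, the integer delivered to onSoundSuccessResult can be mapped to a human-readable label. A minimal sketch (showing a toast is just one way to surface the result):
Code:
String[] soundTypes = getResources().getStringArray(R.array.sound_dect_voice_type);
if (soundType >= 0 && soundType < soundTypes.length) {
    // Map the detection result (0-12) to its human-readable label.
    Toast.makeText(this, "Detected: " + soundTypes[soundType], Toast.LENGTH_SHORT).show();
}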
Start and stop sound detection.
Code:
@Override
public void onClick(View v) {
    switch (v.getId()) {
        case R.id.btn_start_detect:
            if (mlSoundDector != null) {
                // The parameter is a Context.
                boolean isStarted = mlSoundDector.start(this);
                // If isStarted is true, the detection started successfully. If it is false,
                // the detection failed to start (a possible cause is that the microphone
                // is occupied by the system or another app).
                if (isStarted) {
                    Toast.makeText(this, "The detection is successfully started.", Toast.LENGTH_SHORT).show();
                }
            }
            break;
        case R.id.btn_stop_detect:
            if (mlSoundDector != null) {
                mlSoundDector.stop();
            }
            break;
    }
}
Call destroy() to release resources when the sound detection page is closed.
Code:
@Override
protected void onDestroy() {
    super.onDestroy();
    if (mlSoundDector != null) {
        mlSoundDector.destroy();
    }
}
Testing the App
Using a knock as an example, the expected output of sound detection is 10.
Tap Start detecting and simulate a knock on the door. If the corresponding logs appear in Android Studio's console, the sound detection SDK has been integrated successfully.
More Information
Sound detection is one of the six capability categories of ML Kit, which cover text, language/voice, image, face/body, natural language processing, and custom models.
Sound detection belongs to the language/voice-related category.
Interested in other categories? Feel free to have a look at the HUAWEI ML Kit documentation.
To learn more, please visit:
HUAWEI Developers official website
Development Guide
Reddit to join developer discussions
GitHub or Gitee to download the demo and sample code
Stack Overflow to solve integration problems
Follow our official account for the latest HMS Core-related news and updates.
Original Source

How to Add AI Dubbing to App

Text to speech (TTS) is highly sought after by audio/video editors, thanks to its ability to automatically turn text into natural-sounding speech, as a low-cost alternative to human dubbing. It can be used on all kinds of video, long or short.
I recently stumbled upon the AI dubbing capability of HMS Core Audio Editor Kit, which does just that. It can turn input text into speech with just a tap, and comes loaded with a selection of smooth, natural-sounding male and female timbres.
This makes it ideal for apps involving e-books, audio content creation, and audio/video editing. Below, I describe how I integrated this capability.
Making Preparations
Complete all necessary preparations by following the official guide.
Configuring the Project
1. Set the app authentication information
The information can be set via an API key or access token (recommended).
Use setAccessToken to set an access token during app initialization.
Java:
HAEApplication.getInstance().setAccessToken("your access token");
Or, use setApiKey to set an API key during app initialization. The API key needs to be set only once.
Java:
HAEApplication.getInstance().setApiKey("your ApiKey");
2. Initialize the runtime environment
Initialize HuaweiAudioEditor, and create a timeline and necessary lanes.
Java:
// Create a HuaweiAudioEditor instance.
HuaweiAudioEditor mEditor = HuaweiAudioEditor.create(mContext);
// Initialize the runtime environment of HuaweiAudioEditor.
mEditor.initEnvironment();
// Create a timeline.
HAETimeLine mTimeLine = mEditor.getTimeLine();
// Create a lane.
HAEAudioLane audioLane = mTimeLine.appendAudioLane();
Import audio.
Java:
// Add an audio asset to the end of the lane.
HAEAudioAsset audioAsset = audioLane.appendAudioAsset("/sdcard/download/test.mp3", mTimeLine.getCurrentTime());
3. Integrate AI dubbing.
Call HAEAiDubbingEngine to implement AI dubbing.
Java:
// Configure the AI dubbing engine.
HAEAiDubbingConfig haeAiDubbingConfig = new HAEAiDubbingConfig()
        // Set the volume.
        .setVolume(volumeVal)
        // Set the speech speed.
        .setSpeed(speedVal)
        // Set the speaker.
        .setType(defaultSpeakerType);
// Create a callback for the AI dubbing task.
HAEAiDubbingCallback callback = new HAEAiDubbingCallback() {
    @Override
    public void onError(String taskId, HAEAiDubbingError err) {
        // Called when an error occurs.
    }

    @Override
    public void onWarn(String taskId, HAEAiDubbingWarn warn) {}

    @Override
    public void onRangeStart(String taskId, int start, int end) {}

    @Override
    public void onAudioAvailable(String taskId, HAEAiDubbingAudioInfo haeAiDubbingAudioFragment, int i, Pair<Integer, Integer> pair, Bundle bundle) {
        // Start receiving and then saving the audio data.
    }

    @Override
    public void onEvent(String taskId, int eventID, Bundle bundle) {
        if (eventID == HAEAiDubbingConstants.EVENT_SYNTHESIS_COMPLETE) {
            // The AI dubbing task is complete; all synthesized audio data has been processed.
        }
    }

    @Override
    public void onSpeakerUpdate(List<HAEAiDubbingSpeaker> speakerList, List<String> lanList,
            List<String> lanDescList) {}
};
// Create the AI dubbing engine.
HAEAiDubbingEngine mHAEAiDubbingEngine = new HAEAiDubbingEngine(haeAiDubbingConfig);
// Set the listener for the playback process of the AI dubbing task.
mHAEAiDubbingEngine.setAiDubbingCallback(callback);
// Convert text to speech and play the speech. Here, text is the text to be converted,
// and mode is the mode for playing the converted audio.
String taskId = mHAEAiDubbingEngine.speak(text, mode);
// Pause playback.
mHAEAiDubbingEngine.pause();
// Resume playback.
mHAEAiDubbingEngine.resume();
// Stop AI dubbing.
mHAEAiDubbingEngine.stop();
Result
In the demo below, I successfully implemented the AI dubbing function in an app. It now converts text into emotionally expressive speech, with both default and custom timbres.
To learn more, please visit:
>> Audio Editor Kit official website
>> Audio Editor Kit Development Guide
>> Reddit to join developer discussions
>> GitHub to download the sample code
>> Stack Overflow to solve integration problems
Follow our official account for the latest HMS Core-related news and updates.

How to Develop an AR-Based Health Check App

Now that spring has arrived, it's time to get out and stretch your legs! As programmers, many of us are used to being seated for hours at a time, which can lead to back pain and aches. We're all aware that building a workout plan and keeping track of health indicators round the clock can have enormous benefits for body, mind, and soul.
Fortunately, AR Engine makes that remarkably easy. It comes with face tracking capabilities, and will soon support body tracking as well. Thanks to core AR algorithms, AR Engine is able to monitor heart rate, respiratory rate, facial health status, and heart rate waveform signals in real time during your workouts. You can also use it to build an app, for example, to track the real-time workout status, perform real-time health check for patients, or to monitor real-time health indicators of vulnerable users, like the elderly or the disabled. With AR Engine, you can make your health or fitness app more engaging and visually immersive than you might have believed possible.
Advantages and Device Model Restrictions
1. Monitors core health indicators like heart rate, respiratory rate, facial health status, and heart rate waveform signals in real time.
2. Enables devices to better understand their users. Thanks to technologies like Simultaneous Localization and Mapping (SLAM) and 3D reconstruction, AR Engine renders images to build 3D human faces on mobile phones, resulting in seamless virtual-physical cohesion.
3. Supports all of the device models listed in Software and Hardware Requirements of AR Engine Features.
Demo Introduction
A simple demo is available to give you a grasp of how to integrate AR Engine and use its human body and face tracking capabilities.
ENABLE_HEALTH_DEVICE: indicates whether to enable health check.
HealthParameter: health check parameter, including heart rate, respiratory rate, age and gender probability based on facial features, and heart rate waveform signals.
FaceDetectMode: face detection mode, including heart rate checking, respiratory rate checking, real-time health checking, and all three of the above.
Effect
The following details how you can run the demo using the source code.
Key Steps
1. Add the Huawei Maven repository to the project-level build.gradle file.
Java:
buildscript {
    repositories {
        maven { url 'https://developer.huawei.com/repo/' }
    }
    dependencies {
        ...
        // Add the AppGallery Connect plugin configuration.
        classpath 'com.huawei.agconnect:agcp:1.4.2.300'
    }
}
allprojects {
    repositories {
        maven { url 'https://developer.huawei.com/repo/' }
    }
}
2. Add dependencies on the SDK to the app-level build.gradle file.
Java:
implementation 'com.huawei.hms:arenginesdk:3.7.0.3'
3. Declare system permissions in the AndroidManifest.xml file.
Java:
<uses-permission android:name="android.permission.CAMERA" />
4. Check whether AR Engine has been installed on the current device. If yes, the app can run properly. If not, the app automatically redirects the user to AppGallery to install AR Engine.
Java:
boolean isInstallArEngineApk = AREnginesApk.isAREngineApkReady(this);
if (!isInstallArEngineApk && isRemindInstall) {
    Toast.makeText(this, "Please agree to install.", Toast.LENGTH_LONG).show();
    finish();
}
if (!isInstallArEngineApk) {
    startActivity(new Intent(this, ConnectAppMarketActivity.class));
    isRemindInstall = true;
}
return AREnginesApk.isAREngineApkReady(this);
Key Code
1. Call ARFaceTrackingConfig and create an ARSession object. Then set the face detection mode, configure the AR parameters for motion tracking, and enable tracking.
Java:
mArSession = new ARSession(this);
mArFaceTrackingConfig = new ARFaceTrackingConfig(mArSession);
mArFaceTrackingConfig.setEnableItem(ARConfigBase.ENABLE_HEALTH_DEVICE);
mArFaceTrackingConfig.setFaceDetectMode(ARConfigBase.FaceDetectMode.HEALTH_ENABLE_DEFAULT.getEnumValue());
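After setting the face detection mode, the configuration is applied to the session and tracking is started. A minimal sketch (calling this from onResume() is an assumption, not something the demo shows):
Java:
// Apply the face tracking configuration and start the AR session.
mArSession.configure(mArFaceTrackingConfig);
mArSession.resume();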
2. Add a FaceHealthServiceListener to receive the health check status and progress. handleProcessProgressEvent() delivers the health check progress.
Java:
mArSession.addServiceListener(new FaceHealthServiceListener() {
    @Override
    public void handleEvent(EventObject eventObject) {
        if (!(eventObject instanceof FaceHealthCheckStateEvent)) {
            return;
        }
        final FaceHealthCheckState faceHealthCheckState =
                ((FaceHealthCheckStateEvent) eventObject).getFaceHealthCheckState();
        runOnUiThread(new Runnable() {
            @Override
            public void run() {
                mHealthCheckStatusTextView.setText(faceHealthCheckState.toString());
            }
        });
    }

    @Override
    public void handleProcessProgressEvent(final int progress) {
        mHealthRenderManager.setHealthCheckProgress(progress);
        runOnUiThread(new Runnable() {
            @Override
            public void run() {
                setProgressTips(progress);
            }
        });
    }
});

private void setProgressTips(int progress) {
    String progressTips = "processing";
    if (progress >= MAX_PROGRESS) {
        progressTips = "finish";
    }
    mProgressTips.setText(progressTips);
    mHealthProgressBar.setProgress(progress);
}
3. Update data in real time and display the health check results.
Java:
mActivity.runOnUiThread(new Runnable() {
    @Override
    public void run() {
        mHealthParamTable.removeAllViews();
        TableRow heatRateTableRow = initTableRow(ARFace.HealthParameter.PARAMETER_HEART_RATE.toString(),
                healthParams.getOrDefault(ARFace.HealthParameter.PARAMETER_HEART_RATE, 0.0f).toString());
        mHealthParamTable.addView(heatRateTableRow);
        TableRow breathRateTableRow = initTableRow(ARFace.HealthParameter.PARAMETER_BREATH_RATE.toString(),
                healthParams.getOrDefault(ARFace.HealthParameter.PARAMETER_BREATH_RATE, 0.0f).toString());
        mHealthParamTable.addView(breathRateTableRow);
    }
});
References
>> AR Engine official website
>> AR Engine Development Guide
>> Reddit to join developer discussions
>> GitHub to download the sample code
>> Stack Overflow to solve integration problems
