Audio File Transcription, for Super-Efficient Recording - Huawei Developers

Introduction
Converting audio into text has a wide range of applications: generating video subtitles, taking meeting minutes, and writing interview transcripts. HUAWEI ML Kit's service makes doing so easier than ever before, converting audio files into meticulously accurate text, with correct punctuation as well!
Actual Effects
Build and run an app with audio file transcription integrated. Then, select a local audio file and convert it into text.
Development Preparations
For details about configuring the Huawei Maven repository and integrating the audio file transcription SDK, please refer to the Development Guide of ML Kit on HUAWEI Developers.
Declaring Permissions in the AndroidManifest.xml File
Open the AndroidManifest.xml in the main folder. Add the network connection, network status access, and storage read permissions before <application.
Please note that these permissions need to be dynamically applied for. Otherwise, Permission Denied will be reported.
Code:
<uses-permission android:name="android.permission.INTERNET" />
<uses-permission android:name="android.permission.ACCESS_NETWORK_STATE" />
<uses-permission android:name="android.permission.READ_EXTERNAL_STORAGE" />
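As noted above, the storage read permission also has to be requested at runtime, which the original walkthrough does not show. Below is a minimal sketch of such a request; the method name and request code value are illustrative (the ASR article later in this thread requests permissions in the same way).
Code:
// Minimal sketch: request the storage read permission at runtime.
// Requires androidx.core.app.ActivityCompat and androidx.core.content.ContextCompat.
private static final int STORAGE_PERMISSION_CODE = 1; // Illustrative request code.
private void requestStoragePermission(Activity activity) {
    if (ContextCompat.checkSelfPermission(activity, Manifest.permission.READ_EXTERNAL_STORAGE)
            != PackageManager.PERMISSION_GRANTED) {
        ActivityCompat.requestPermissions(activity,
                new String[]{Manifest.permission.READ_EXTERNAL_STORAGE}, STORAGE_PERMISSION_CODE);
    }
}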
Development Procedure
Creating and Initializing an Audio File Transcription Engine
Override onCreate in MainActivity to create an audio transcription engine.
Code:
private MLRemoteAftEngine mAnalyzer;
mAnalyzer = MLRemoteAftEngine.getInstance();
mAnalyzer.init(getApplicationContext());
mAnalyzer.setAftListener(mAsrListener);
Use MLRemoteAftSetting to configure the engine. The service currently supports Mandarin Chinese and English, that is, the options of mLanguage are zh and en.
Code:
MLRemoteAftSetting setting = new MLRemoteAftSetting.Factory()
.setLanguageCode(mLanguage)
.enablePunctuation(true)
.enableWordTimeOffset(true)
.enableSentenceTimeOffset(true)
.create();
enablePunctuation indicates whether to automatically punctuate the converted text, with a default value of false.
If this parameter is set to true, the converted text is automatically punctuated; false otherwise.
enableWordTimeOffset indicates whether to generate the text transcription result of each audio segment with the corresponding offset. The default value is false. You need to set this parameter only when the audio duration is less than 1 minute.
If this parameter is set to true, the offset information is returned along with the text transcription result. This applies to the transcription of short audio files with a duration of 1 minute or shorter.
If this parameter is set to false, only the text transcription result of the audio file will be returned.
enableSentenceTimeOffset indicates whether to output the offset of each sentence in the audio file. The default value is false.
If this parameter is set to true, the offset information is returned along with the text transcription result.
If this parameter is set to false, only the text transcription result of the audio file will be returned.
Creating a Listener Callback to Process the Transcription Result
Code:
private MLRemoteAftListener mAsrListener = new MLRemoteAftListener()
After the listener is initialized, call startTask in AftListener to start the transcription.
Code:
@Override
public void onInitComplete(String taskId, Object ext) {
Log.i(TAG, "MLRemoteAftListener onInitComplete" + taskId);
mAnalyzer.startTask(taskId);
}
Override onUploadProgress, onEvent, and onResult in MLRemoteAftListener.
Code:
@Override
public void onUploadProgress(String taskId, double progress, Object ext) {
Log.i(TAG, " MLRemoteAftListener onUploadProgress is " + taskId + " " + progress);
}
@Override
public void onEvent(String taskId, int eventId, Object ext) {
Log.e(TAG, "MLAsrCallBack onEvent" + eventId);
if (MLAftEvents.UPLOADED_EVENT == eventId) { // The file is uploaded successfully.
showConvertingDialog();
startQueryResult(); // Obtain the transcription result.
}
}
@Override
public void onResult(String taskId, MLRemoteAftResult result, Object ext) {
Log.i(TAG, "onResult get " + taskId);
if (result != null) {
Log.i(TAG, "onResult isComplete " + result.isComplete());
if (!result.isComplete()) {
return;
}
if (null != mTimerTask) {
mTimerTask.cancel();
}
if (result.getText() != null) {
Log.e(TAG, result.getText());
dismissTransferringDialog();
showCovertResult(result.getText());
}
List<MLRemoteAftResult.Segment> segmentList = result.getSegments();
if (segmentList != null && segmentList.size() != 0) {
for (MLRemoteAftResult.Segment segment : segmentList) {
Log.e(TAG, "MLAsrCallBack segment text is : " + segment.getText() + ", startTime is : " + segment.getStartTime() + ". endTime is : " + segment.getEndTime());
}
}
List<MLRemoteAftResult.Segment> words = result.getWords();
if (words != null && words.size() != 0) {
for (MLRemoteAftResult.Segment word : words) {
Log.e(TAG, "MLAsrCallBack word text is : " + word.getText() + ", startTime is : " + word.getStartTime() + ". endTime is : " + word.getEndTime());
}
}
List<MLRemoteAftResult.Segment> sentences = result.getSentences();
if (sentences != null && sentences.size() != 0) {
for (MLRemoteAftResult.Segment sentence : sentences) {
Log.e(TAG, "MLAsrCallBack sentence text is : " + sentence.getText() + ", startTime is : " + sentence.getStartTime() + ". endTime is : " + sentence.getEndTime());
}
}
}
}
Processing the Transcription Result in Polling Mode
After the transcription is completed, call getLongAftResult to obtain the transcription result. Process the obtained result every 10 seconds.
Code:
private void startQueryResult() {
Timer mTimer = new Timer();
mTimerTask = new TimerTask() {
@Override
public void run() {
getResult();
}
};
mTimer.schedule(mTimerTask, 5000, 10000); // Process the obtained long speech transcription result every 10s.
}
private void getResult() {
Log.e(TAG, "getResult");
mAnalyzer.setAftListener(mAsrListener);
mAnalyzer.getLongAftResult(mLongTaskId);
}
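One step the walkthrough above does not show is how the selected local audio file is handed to the engine. The following is a minimal sketch based on the transcription APIs shown in the speech-to-text article later in this thread (shortRecognize for audio up to 1 minute, longRecognize for longer files); the uri and audioDurationMs variables are assumed to come from your own file picker.
Code:
// Minimal sketch: submit the selected audio file to the engine.
if (audioDurationMs <= 60 * 1000) {
    // 1 minute or shorter; the result is delivered through onResult.
    mAnalyzer.shortRecognize(uri, setting);
} else {
    // Longer than 1 minute (up to 5 hours); this task ID is polled with getLongAftResult above.
    mLongTaskId = mAnalyzer.longRecognize(uri, setting);
}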
References:
To learn more, please visit:
HUAWEI Developers official website
Development Guide
Reddit to join developer discussions
GitHub or Gitee to download the demo and sample code
Stack Overflow to solve integration problems
Follow our official account for the latest HMS Core-related news and updates.
Original Source

Related

Product Visual Search – Ultimate Guide

For more information like this, you can visit the HUAWEI Developer Forum.
Introduction
The HMS ML Kit product visual search service searches a pre-established product image library for products that are the same as or similar to the product in an image taken by a customer, and returns the IDs of those products along with related information.
Use Case
We will capture the product image using the device camera in our shopping application.
We will show the returned product list in a RecyclerView.
Prerequisite
Java JDK 1.8 or higher is recommended.
Android Studio is recommended.
A Huawei Android device with HMS Core 4.0.0.300 or later.
Before developing an app, you will need to register as a HUAWEI developer. Refer to Register a HUAWEI ID.
Integrate the AppGallery Connect SDK. Refer to AppGallery Connect Service Getting Started.
Implementation
1. Enable ML Kit in Manage APIs. Refer to Service Enabling.
2. Integrate the following dependency in the app-level build.gradle file.
Code:
// Import the product visual search SDK.
implementation 'com.huawei.hms:ml-computer-vision-cloud:2.0.1.300'
3. Add the AGC plugin at the top of the app-level build.gradle file.
Code:
apply plugin: 'com.huawei.agconnect'
4. Add the following permissions in the manifest.
Camera permission android.permission.CAMERA: Obtains real-time images or videos from a camera.
Internet access permission android.permission.INTERNET: Accesses cloud services on the Internet.
Storage write permission android.permission.WRITE_EXTERNAL_STORAGE: Upgrades the algorithm version.
Storage read permission android.permission.READ_EXTERNAL_STORAGE: Reads photos stored on a device.
5. Request the camera permission at runtime.
Code:
private void requestCameraPermission() {
final String[] permissions = new String[] {Manifest.permission.CAMERA};
if (!ActivityCompat.shouldShowRequestPermissionRationale(this, Manifest.permission.CAMERA)) {
ActivityCompat.requestPermissions(this, permissions, this.CAMERA_PERMISSION_CODE);
return;
}
}
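The snippet above only requests the permission. A minimal sketch of handling the result and then opening the camera might look like the following; it reuses the CAMERA_PERMISSION_CODE and REQ_CAMERA_CODE constants from this article and is otherwise illustrative.
Code:
@Override
public void onRequestPermissionsResult(int requestCode, String[] permissions, int[] grantResults) {
    super.onRequestPermissionsResult(requestCode, permissions, grantResults);
    if (requestCode == CAMERA_PERMISSION_CODE
            && grantResults.length > 0
            && grantResults[0] == PackageManager.PERMISSION_GRANTED) {
        // Permission granted: open the camera (same intent as step 8 below).
        Intent intent = new Intent(MediaStore.ACTION_IMAGE_CAPTURE);
        startActivityForResult(intent, REQ_CAMERA_CODE);
    }
}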
6. Add the following code to the Application class.
Code:
public class MyApplication extends Application {
@Override
public void onCreate() {
super.onCreate();
MLApplication.getInstance().setApiKey("API KEY");
}
}
The API key can be obtained from AppGallery Connect (AGC) or read from the integrated agconnect-services.json file.
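If you use the agconnect-services.json file, a short sketch of reading the key at runtime looks like this (the same AGConnectServicesConfig call is used in the speech-to-text article later in this thread):
Code:
// Read the API key from agconnect-services.json instead of hard-coding it.
MLApplication.getInstance().setApiKey(
        AGConnectServicesConfig.fromContext(this).getString("client/api_key"));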
7. Create an analyzer for product visual search.
Code:
private void initializeProductVisionSearch() {
MLRemoteProductVisionSearchAnalyzerSetting settings = new MLRemoteProductVisionSearchAnalyzerSetting.Factory()
// Set the maximum number of products that can be returned.
.setLargestNumOfReturns(16)
// Set the product set ID. (Contact [email protected] to obtain the configuration guide.)
// .setProductSetId(productSetId)
// Set the region.
.setRegion(MLRemoteProductVisionSearchAnalyzerSetting.REGION_DR_CHINA)
.create();
analyzer
= MLAnalyzerFactory.getInstance().getRemoteProductVisionSearchAnalyzer(settings);
}
8. Capture an image using the device camera.
Code:
Intent intent = new Intent(MediaStore.ACTION_IMAGE_CAPTURE);
startActivityForResult(intent, REQ_CAMERA_CODE);
9. Once the image has been captured, the onActivityResult() method will be executed.
Code:
@Override
public void onActivityResult(int requestCode, int resultCode, Intent data) {
super.onActivityResult(requestCode, resultCode, data);
Log.d(TAG, "onActivityResult");
if(requestCode == 101) {
if (resultCode == RESULT_OK) {
Bitmap bitmap = (Bitmap) data.getExtras().get("data");
if (bitmap != null) {
// Create an MLFrame object using the bitmap, which is the image data in bitmap format.
MLFrame mlFrame = new MLFrame.Creator().setBitmap(bitmap).create();
mlImageDetection(mlFrame);
}
}
}
}
private void mlImageDetection(MLFrame mlFrame) {
Task<List<MLProductVisionSearch>> task = analyzer.asyncAnalyseFrame(mlFrame);
task.addOnSuccessListener(new OnSuccessListener<List<MLProductVisionSearch>>() {
public void onSuccess(List<MLProductVisionSearch> products) {
// Processing logic for detection success.
displaySuccess(products);
}})
.addOnFailureListener(new OnFailureListener() {
public void onFailure(Exception e) {
// Processing logic for detection failure.
// Recognition failure.
try {
MLException mlException = (MLException)e;
// Obtain the result code. You can process the result code and customize respective messages displayed to users.
int errorCode = mlException.getErrCode();
// Obtain the error information. You can quickly locate the fault based on the result code.
String errorMessage = mlException.getMessage();
} catch (Exception error) {
// Handle the conversion error.
}
}
});
}
private void displaySuccess(List<MLProductVisionSearch> productVisionSearchList) {
List<MLVisionSearchProductImage> productImageList = new ArrayList<>();
String prodcutType = "";
for (MLProductVisionSearch productVisionSearch : productVisionSearchList) {
Log.d(TAG, "type: " + productVisionSearch.getType() );
prodcutType = productVisionSearch.getType();
for (MLVisionSearchProduct product : productVisionSearch.getProductList()) {
productImageList.addAll(product.getImageList());
Log.d(TAG, "custom content: " + product.getCustomContent() );
}
}
StringBuffer buffer = new StringBuffer();
for (MLVisionSearchProductImage productImage : productImageList) {
String str = "ProductID: " + productImage.getProductId() + "\nImageID: " + productImage.getImageId() + "\nPossibility: " + productImage.getPossibility();
buffer.append(str);
buffer.append("\n");
}
Log.d(TAG , "display success: " + buffer.toString());
FragmentTransaction transaction = getFragmentManager().beginTransaction();
transaction.replace(R.id.main_fragment_container, new SearchResultFragment(productImageList, prodcutType ));
transaction.commit();
}
The onSuccess() callback gives us a list of MLProductVisionSearch objects, which we can use to get the product ID and image URL of each product. We can also get the product type using productVisionSearch.getType(), which returns a numeric category value that can be mapped to a readable name.
10. We can map the product type with the following code.
Code:
private String getProductType(String type) {
switch(type) {
case "0":
return "Others";
case "1":
return "Clothing";
case "2":
return "Shoes";
case "3":
return "Bags";
case "4":
return "Digital & Home appliances";
case "5":
return "Household Products";
case "6":
return "Toys";
case "7":
return "Cosmetics";
case "8":
return "Accessories";
case "9":
return "Food";
}
return "Others";
}
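Since displaySuccess() above stores the raw type value, you can pass it through this mapping when building the result screen. A small usage sketch (the label variable is illustrative):
Code:
// Map the raw type value returned by the service to a readable category label.
String label = getProductType(productVisionSearch.getType());
Log.d(TAG, "category label: " + label);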
11. Get the product ID and image URL from MLVisionSearchProductImage.
Code:
@Override
public void onBindViewHolder(ViewHolder holder, int position) {
final MLVisionSearchProductImage mlProductVisionSearch = productVisionSearchList.get(position);
holder.tvTitle.setText(mlProductVisionSearch.getProductId());
Glide.with(context)
.load(mlProductVisionSearch.getImageId())
.diskCacheStrategy(DiskCacheStrategy.ALL)
.into(holder.imageView);
}
Reference
https://developer.huawei.com/consumer/en/doc/development/HMSCore-Guides-V5/sdk-data-security-0000001050040129-V5

How to Integrate HUAWEI Drive Kit

For more information like this, you can visit the HUAWEI Developer Forum.
Hello everyone, in this article we will create an Android application using the capabilities of HUAWEI Drive Kit that will allow users to manage and edit files in HUAWEI Drive, as well as store photos, drawings, designs, recordings and videos.
Why should we use HUAWEI Drive Kit?
With HUAWEI Drive Kit, you can let your users store data on the cloud quickly and easily. Users can upload, download, synchronize, and view images, videos, and documents whenever and wherever they want.
There are 3 main functions provided by HUAWEI Drive Kit.
User file management: Includes file search, comment, and reply in addition to basic functions.
App data management: Supports app data storage and restoration.
Cross-platform capability: Provides RESTful APIs to support app access from non-Android devices.
Integration Preparations
Before you get started, you must first register as a HUAWEI developer and verify your identity on the HUAWEI Developer website. For more details, please refer to Register a HUAWEI ID.
Software Requirements
Java JDK installation package
Android SDK package
Android Studio 3.X
HMS Core (APK) 3.X or later
Supported Locations
To see all supported locations, please refer to HUAWEI Drive Kit Supported Locations
Integrating HMS Core SDK
To integrate HMS Core SDK in your application and learn creating a new project on AppGallery Connect, follow this great guide: https://medium.com/huawei-developers/android-integrating-your-apps-with-huawei-hms-core-1f1e2a090e98
Keep in mind: Users need to sign in with their HUAWEI ID before they can use the functions provided by HUAWEI Drive Kit. If the user has not signed in with a HUAWEI ID, the HMS Core SDK will prompt the user to sign in first.
Implementation
Code:
implementation 'com.huawei.hms:drive:5.0.0.301'
implementation 'com.huawei.hms:hwid:5.0.1.301'
Now that we’re ready to implement our methods, let’s see how they work.
Signing In with a HUAWEI ID
To implement the sign-in function through the HMS Core SDK, you will need to set the Drive scope for obtaining the permission to access Drive APIs.
Each Drive scope corresponds to a certain type of permissions. You may apply for the permissions as needed. For details about the corresponding APIs, please refer to HUAWEI Account Kit Development Guide.
Code:
private void driveSignIn() {
List<Scope> scopeList = new ArrayList<>();
scopeList.add(new Scope(DriveScopes.SCOPE_DRIVE));
scopeList.add(new Scope(DriveScopes.SCOPE_DRIVE_APPDATA));
scopeList.add(new Scope(DriveScopes.SCOPE_DRIVE_FILE));
scopeList.add(new Scope(DriveScopes.SCOPE_DRIVE_METADATA));
scopeList.add(new Scope(DriveScopes.SCOPE_DRIVE_METADATA_READONLY));
scopeList.add(new Scope(DriveScopes.SCOPE_DRIVE_READONLY));
scopeList.add(HuaweiIdAuthAPIManager.HUAWEIID_BASE_SCOPE);
HuaweiIdAuthParams authParams = new HuaweiIdAuthParamsHelper(HuaweiIdAuthParams.DEFAULT_AUTH_REQUEST_PARAM)
.setAccessToken()
.setIdToken()
.setScopeList(scopeList)
.createParams();
HuaweiIdAuthService authService = HuaweiIdAuthManager.getService(this, authParams);
startActivityForResult(authService.getSignInIntent(), REQUEST_SIGN_IN_LOGIN);
}
@Override
protected void onActivityResult(int requestCode, int resultCode, @Nullable Intent data) {
super.onActivityResult(requestCode, resultCode, data);
if (requestCode == REQUEST_SIGN_IN_LOGIN) {
Task<AuthHuaweiId> authHuaweiIdTask = HuaweiIdAuthManager.parseAuthResultFromIntent(data);
if (authHuaweiIdTask.isSuccessful()) {
AuthHuaweiId authHuaweiId = authHuaweiIdTask.getResult();
mAccessToken = authHuaweiId.getAccessToken();
mUnionId = authHuaweiId.getUnionId();
int returnCode = init(mUnionId, mAccessToken, refreshAccessToken);
if (returnCode == DriveCode.SUCCESS) {
Log.d(TAG, "onActivityResult: driveSignIn success");
} else {
Log.d(TAG, "onActivityResult: driveSignIn failed");
}
} else {
Log.d(TAG, "onActivityResult, signIn failed: " + ((ApiException) authHuaweiIdTask.getException()).getStatusCode());
}
}
}
Handling the Permissions
Add the read and write permissions for the phone storage to AndroidManifest.xml
Code:
<uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE" />
<uses-permission android:name="android.permission.READ_EXTERNAL_STORAGE" />
<uses-permission
android:name="android.permission.WRITE_MEDIA_STORAGE"
tools:ignore="ProtectedPermissions" />
Add the permissions to request in MainActivity.java.
Code:
private static String[] PERMISSIONS_STORAGE = {
Manifest.permission.READ_EXTERNAL_STORAGE,
Manifest.permission.WRITE_EXTERNAL_STORAGE,
Manifest.permission.CAMERA
};
Request the permissions in the onCreate method.
Code:
if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.M) {
requestPermissions(PERMISSIONS_STORAGE, 1);
}
Create a Folder and Upload a File
Now that we handled necessary permissions, let’s upload some files!
Create a folder in the root directory of Drive, and upload the file in the designated directory.
Keep in mind: the code below uses global variables and helper functions; you can download the sample code from the link at the end of the article to see their definitions (a sketch of the key helpers also follows the upload code below).
Code:
private void uploadFiles() {
new Thread(new Runnable() {
@Override
public void run() {
try {
if (mAccessToken == null) {
Log.d(TAG, "need to sign in first");
return;
}
if (StringUtils.isNullOrEmpty(et_uploadFileName.getText().toString())) {
Log.d(TAG, "file name is required to upload");
return;
}
String path = getExternalFilesDir(null).getAbsolutePath()
+ "/" + et_uploadFileName.getText();
Log.d(TAG, "run: " + path);
java.io.File fileObj = new java.io.File(path);
if (!fileObj.exists()) {
Log.d(TAG, "file does not exists");
return;
}
Drive drive = buildDrive();
Map<String, String> appProperties = new HashMap<>();
appProperties.put("appProperties", "property");
String dirName = "Uploads";
File file = new File();
file.setFileName(dirName)
.setAppSettings(appProperties)
.setMimeType("application/vnd.huawei-apps.folder");
directoryCreated = drive.files().create(file).execute();
// Upload the file
File fileToUpload = new File()
.setFileName(fileObj.getName())
.setMimeType(mimeType(fileObj))
.setParentFolder(Collections.singletonList(directoryCreated.getId()));
Drive.Files.Create request = drive.files()
.create(fileToUpload, new FileContent(mimeType(fileObj), fileObj));
fileUploaded = request.execute();
Log.d(TAG, "upload success");
} catch (Exception e) {
Log.d(TAG, "upload error " + e.toString());
}
}
}).start();
}
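The sign-in and upload code above call two helpers, init() and buildDrive(), that live in the downloadable sample. For reference, here is a minimal sketch of what they typically look like in the Drive Kit sample; treat the exact signatures as an assumption and check the sample code for the authoritative version (refreshAT is the callback used to refresh the access token when it expires).
Code:
private static DriveCredential mCredential;

// Initialize the Drive credential with the union ID and access token obtained at sign-in.
private static int init(String unionId, String accessToken, DriveCredential.AccessMethod refreshAT) {
    if (StringUtils.isNullOrEmpty(unionId) || StringUtils.isNullOrEmpty(accessToken)) {
        return DriveCode.ERROR;
    }
    DriveCredential.Builder builder = new DriveCredential.Builder(unionId, refreshAT);
    mCredential = builder.build().setAccessToken(accessToken);
    return DriveCode.SUCCESS;
}

// Build a Drive client from the stored credential.
private Drive buildDrive() {
    return new Drive.Builder(mCredential, this).build();
}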
If applicationData is selected, the operation will be performed on the app data folder. The app data folder is invisible to users and is used to store app-specific data.
After the upload is done, the users can view the file by going to Files > HUAWEI Drive.
Query File Details
With this method we can get the details of a file in Drive. Below, we get the id, file name and size information.
Code:
private void queryFiles() {
new Thread(new Runnable() {
@Override
public void run() {
try {
if (mAccessToken == null) {
Log.d(TAG, "need to sign in first");
return;
}
String containers = "";
String queryFile = "fileName = '" + et_searchFileName.getText()
+ "' and mimeType != 'application/vnd.huawei-apps.folder'";
if (cb_isApplicationData.isChecked()) {
containers = "applicationData";
queryFile = "'applicationData' in parentFolder and ".concat(queryFile);
}
Drive drive = buildDrive();
Drive.Files.List request = drive.files().list();
FileList files;
while (true) {
files = request
.setQueryParam(queryFile)
.setPageSize(10)
.setOrderBy("fileName")
.setFields("category,nextCursor,files(id,fileName,size)")
.setContainers(containers)
.execute();
if (files == null || files.getFiles().size() > 0) {
break;
}
if (!StringUtils.isNullOrEmpty(files.getNextCursor())) {
request.setCursor(files.getNextCursor());
} else {
break;
}
}
String text = "";
if (files != null && files.getFiles().size() > 0) {
fileSearched = files.getFiles().get(0);
text = fileSearched.toString();
} else {
text = "empty";
}
final String finalText = text;
MainActivity.this.runOnUiThread(new Runnable() {
@Override
public void run() {
tv_queryResult.setText(finalText);
}
});
Log.d(TAG, "query success");
} catch (Exception e) {
Log.d(TAG, "query error " + e.toString());
}
}
}).start();
}
Download Files
We can also download files from Drive. Just query the file (notice the fileSearched global variable above) and download it with a Drive.Files.Get request, as sketched below.
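Here is a rough sketch of that download step: it fetches the content of the file found by queryFiles() and writes it to local storage. The setForm("content") and executeContentAndDownloadTo() calls are based on my recollection of the official Drive Kit sample, so treat them as assumptions and verify against the sample code.
Code:
// Sketch: download the file found by queryFiles() (run on a worker thread).
Drive drive = buildDrive();
Drive.Files.Get request = drive.files().get(fileSearched.getId());
request.setForm("content"); // Request the file content rather than its metadata (assumed form value).
java.io.File target = new java.io.File(getExternalFilesDir(null), fileSearched.getFileName());
try (OutputStream out = new FileOutputStream(target)) {
    request.executeContentAndDownloadTo(out);
    Log.d(TAG, "download success: " + target.getAbsolutePath());
} catch (Exception e) {
    Log.d(TAG, "download error " + e.toString());
}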
This is not the end. For full content, you can visit https://forums.developer.huawei.com/forumPortal/en/topicview?tid=0202357120930420217&fid=0101187876626530001
Is there any restriction on the type of files that can be uploaded, and are any security measures, such as encryption of my data, taken while uploading?
Nice and useful article
Does HUAWEI Drive Kit work in India?

SMS Retriever API for HMS

Article Introduction
Validating identity is incredibly important in mobile development, and assuring new user signups are real human beings is critical. Developers need reliable ways to confirm the identities of their users to prevent security issues. Developers can implement verification systems using phone numbers in two main ways: either by calling or by sending an SMS containing a code that the user must input. In this article, we’ll implement SMS Retriever API to support both GMS and HMS, so our app can read the Verification/OTP SMS and verify the user automatically.
Automatic and one-tap SMS verification
Verify your users by SMS without making them deal with verification codes. If your app requires a user to enter a mobile number and verifies the user identity using an SMS verification code, you can integrate the ReadSmsManager service so that your app can automatically read the SMS verification code without applying for the SMS reading permission. After the integration, SMS verification codes are automatically filled in for verification, greatly improving user experience.
The process is described as follows:
1. HMS Core (APK) sends the SMS message that meets the rules to your app through a directed broadcast.
2. Your app receives the directed broadcast, parses it to obtain the SMS verification code, and displays it on your app.
3. The user checks whether the verification code is correct and if so, sends a verification request.
4. Your app sends the verification code from the user to your app server for check.
5. Your app server checks whether the verification code is correct and if so, returns the verification result to your app.
Pre-Requisites
ReadSmsManager supports the following:
Devices: Phones and tablets
Operating system: EMUI 3.0 or later
Android version: Android 4.4 or later
SMS message rules
SMS template: prefix_flag short message verification code is XXXXXX hash_value
prefix_flag: the prefix of the SMS message, which can be <#>, [#], or \u200b\u200b (invisible Unicode characters).
short message verification code is: the SMS message content, which you can define as needed.
XXXXXX: the verification code.
hash_value: the hash value generated by the HMS Core SDK based on the package name of the app, uniquely identifying the app.
For example, a message such as "<#> short message verification code is 123456 A1b2C3d4E5f" matches this template (the code and hash value here are placeholders).
Step 1: Get Hash Value:
You get your hash value by implementing the following class:
Java:
public class hashcodeHMS extends ContextWrapper {
public static final String TAG = hashcodeHMS.class.getSimpleName();
public hashcodeHMS(Context context) {
super(context);
}
public MessageDigest getMessageDigest() {
MessageDigest messageDigest = null;
try {
messageDigest = MessageDigest.getInstance("SHA-256");
} catch (NoSuchAlgorithmException e) {
Log.e(TAG, "No Such Algorithm.", e);
}
return messageDigest;
}
public String getSignature(Context context, String packageName) {
PackageManager packageManager = context.getPackageManager();
Signature[] signatureArrs;
try {
signatureArrs = packageManager.getPackageInfo(packageName, PackageManager.GET_SIGNATURES).signatures;
} catch (PackageManager.NameNotFoundException e) {
Log.e(TAG, "Package name inexistent.");
return "";
}
if (null == signatureArrs || 0 == signatureArrs.length) {
Log.e(TAG, "signature is null.");
return "";
}
Log.e("hashhms =>", signatureArrs[0].toCharsString());
return signatureArrs[0].toCharsString();
}
public String getHashCode(String packageName, MessageDigest messageDigest, String signature) {
String appInfo = packageName + " " + signature;
messageDigest.update(appInfo.getBytes(StandardCharsets.UTF_8));
byte[] hashSignature = messageDigest.digest();
hashSignature = Arrays.copyOfRange(hashSignature, 0, 9);
String base64Hash = Base64.encodeToString(hashSignature, Base64.NO_PADDING | Base64.NO_WRAP);
base64Hash = base64Hash.substring(0, 11);
return base64Hash;
}
}
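A short usage sketch for the class above, for example to log the hash during development so you can embed it in your SMS template (the call site is illustrative):
Code:
// Compute and log the app's SMS hash value during development.
hashcodeHMS helper = new hashcodeHMS(this);
String signature = helper.getSignature(this, getPackageName());
String hash = helper.getHashCode(getPackageName(), helper.getMessageDigest(), signature);
Log.d(hashcodeHMS.TAG, "SMS hash value: " + hash);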
Step 2: Start consent via ReadSmsManager:
Code:
val task = ReadSmsManager.startConsent(this@otp_read, null) // Pass the host Activity (otp_read in this example).
task.addOnCompleteListener { it ->
if (it.isSuccessful) {
// The service is enabled successfully. Perform other operations as needed.
tv_title.text = "Waiting for the OTP"
Toast.makeText(this, "SMS Retriever starts", Toast.LENGTH_LONG).show()
}else{
tv_title.text = "ReadSmsManager did not worked"
Toast.makeText(this, "SMS Retriever did not start", Toast.LENGTH_LONG).show()
}
}
Step 3: Prepare your Broadcast Receiver:
Kotlin:
class MySMSBrodcastReceiverHms : BroadcastReceiver() {
private var otpReceiver: OTPReceiveListenerHMS? = null
fun initOTPListener(receiver: OTPReceiveListenerHMS) {
this.otpReceiver = receiver
Log.e("Firas", "initOTPListener: Done", )
}
override fun onReceive(context: Context, intent: Intent) {
intent?.let { it ->
val bundle = it.extras
bundle?.let { itBundle ->
if (ReadSmsConstant.READ_SMS_BROADCAST_ACTION == it.action) {
val status : Status? = itBundle.getParcelable(ReadSmsConstant.EXTRA_STATUS)
if (status?.statusCode == CommonStatusCodes.TIMEOUT) {
// The service has timed out and no SMS message that meets the requirements is read. The service process ends.
Log.i("Firas", "onReceive: TIMEOUT")
otpReceiver!!.onOTPTimeOutHMS()
}else if (status?.statusCode == CommonStatusCodes.SUCCESS){
if (bundle.containsKey(ReadSmsConstant.EXTRA_SMS_MESSAGE)) {
// An SMS message that meets the requirement is read. The service process ends.
var otp: String = bundle.getString(ReadSmsConstant.EXTRA_SMS_MESSAGE) as String
Log.i("Firas", "onReceive: ${otp}")
otp = otp.replace("[#] short message verification code is ", "").split(":".toRegex()).dropLastWhile { it.isEmpty() }.toTypedArray()[0]
otp = otp.split(" ").dropLastWhile { it.isEmpty() }.toTypedArray()[0]
Log.i("Firas", "onReceive: ${otp}")
otpReceiver!!.onOTPReceivedHMS(otp)
}
}
}
}
}
}
interface OTPReceiveListenerHMS {
fun onOTPReceivedHMS(otp: String)
fun onOTPTimeOutHMS()
}
}
Add the broadcast receiver to the manifest:
XML:
<receiver
android:name=".MySMSBrodcastReceiverHms"
android:exported="true">
<intent-filter>
<action android:name="com.huawei.hms.support.sms.common.ReadSmsConstant.READ_SMS_BROADCAST_ACTION" />
</intent-filter>
</receiver>
Step 4: Implement the OTP listener in your Activity:
Implement the OTPReceiveListenerHMS interface from our MySMSBrodcastReceiverHms:
Code:
class otp_read : AppCompatActivity(), MySMSBrodcastReceiverHms.OTPReceiveListenerHMS {}
Then create the receiver and initialize the OTP listener:
Code:
var smsBroadcast = MySMSBrodcastReceiverHms()
smsBroadcast.initOTPListener(this)
Implement the functions of the interface:
Code:
override fun onOTPReceivedHMS(otp: String) {
Toast.makeText(this, " onOTPReceived", Toast.LENGTH_SHORT).show()
if (smsBroadcast != null) {
LocalBroadcastManager.getInstance(this).unregisterReceiver(smsBroadcast as MySMSBrodcastReceiverHms)
}
Toast.makeText(this, otp, Toast.LENGTH_SHORT).show()
tv_title.text = "$otp"
otp_view.setText(otp)
Log.e("OTP Received", otp)
}
override fun onOTPTimeOutHMS() {
tv_title.setText("Timeout")
Toast.makeText(this, " SMS retriever API Timeout", Toast.LENGTH_SHORT).show()
}
Step 5: Register your receiver:
Code:
val intentFilter = IntentFilter()
intentFilter.addAction(ReadSmsConstant.READ_SMS_BROADCAST_ACTION)
applicationContext.registerReceiver(
smsBroadcast as MySMSBrodcastReceiverHms,
intentFilter
)
Step 6: Run the application
Conclusion
This feature helps users get verified faster than with the regular flow and prevents typos when entering the OTP manually. Keep in mind that it only works if the SIM card is in the phone; otherwise the broadcast receiver will not be triggered. It also improves privacy: normally, an app that auto-fills OTPs asks for the READ SMS permission, which gives it access to every message on the device unless the user remembers to revoke the permission afterwards (and few do). With the SMS Retriever API, apps no longer need the READ SMS permission to auto-fill the OTP.
Tips & Tricks
Remove AppSignatureHelper class from your project before going to production.
Debug and release APKs might have different hash codes; make sure you get the hash code from the release APK.
References
Automatically Reading an SMS Verification Code Without User Authorization:
https://developer.huawei.com/consumer/en/doc/development/HMSCore-Guides/readsmsmanager-0000001050050861
Automatically Reading an SMS Verification Code After User Authorization:
https://developer.huawei.com/consumer/en/doc/development/HMSCore-Guides/authotize-to-read-sms-0000001061481826#EN-US_TOPIC_0000001126286229__section1186673334918
ReadSmsManager Reference:
https://developer.huawei.com/consumer/en/doc/development/HMSCore-Guides/readsmsmanager-0000001050050861
Obtaining the Hash Value Reference:
https://developer.huawei.com/consumer/en/doc/development/HMSCore-Guides-V5/obtaining-hash-value-0000001050194405-V5

How a Programmer Used 300 Lines of Code to Help His Grandma Shop Online with Voice Input

"John, why the writing pad is missing again?"
John, a programmer at Huawei, has a grandma who loves novelty, and lately she's been obsessed with online shopping. Familiarizing herself with major shopping apps and their functions proved to be a piece of cake, and she had thought that her online shopping experience would be effortless. Unfortunately, however, she was hindered by product searching.
John's grandma tended to use handwriting input. When using it, she would often make mistakes, like switching to another input method she found unfamiliar, or tapping on undesired characters or signs.
It's not just shopping apps: most mobile apps feature interface designs oriented to younger users, so it's no wonder that elderly users often struggle to figure out how to use them.
John patiently helped his grandma search for products with handwriting input several times. But then, he decided to use his skills as a veteran coder to give his grandma the best possible online shopping experience. More specifically, instead of helping her adjust to the available input method, he was determined to create an input method that would conform to her usage habits.
Since his grandma tended to err during manual input, John developed an input method that converts speech into text. Grandma was enthusiastic about the new method, because it is remarkably easy to use. All she has to do is to tap on the recording button and say the product's name. The input method then recognizes what she has said, and converts her speech into text.
Actual Effects
Real-time speech recognition and speech to text are ideal for a broad range of apps, including:
Game apps (online): Real-time speech recognition comes to users' aid when they team up with others. It frees up users' hands for controlling the action, sparing them from having to type to communicate with their partners. It can also free users from any potential embarrassment related to voice chatting during gaming.
Work apps: Speech to text can play a vital role during long conferences, where typing to keep meeting minutes can be tedious and inefficient, with key details being missed. Using speech to text is much more efficient: during a conference, users can use this service to convert audio content into text; after the conference, they can simply retouch the text to make it more logical.
Learning apps: Speech to text can offer users an enhanced learning experience. Without the service, users often have to pause audio materials to take notes, resulting in a fragmented learning process. With speech to text, users can concentrate on listening intently to the material while it is being played, and rely on the service to convert the audio content into text. They can then review the text after finishing the entire course, to ensure that they've mastered the content.
How to Implement
Two services in HUAWEI ML Kit, automatic speech recognition (ASR) and audio file transcription, make it easy to implement the above functions.
ASR can recognize speech of up to 60s, and convert the input speech into text in real time, with recognition accuracy of over 95%. It currently supports Mandarin Chinese (including Chinese-English bilingual speech), English, French, German, Spanish, Italian, and Arabic.
- Real-time result output
- Available options: with and without speech pickup UI
- Endpoint detection: start and end points can be accurately located.
- Silence detection: no voice packet is sent for silent portions.
- Intelligent conversion to digital formats: for example, the year 2021 is recognized from voice input.
Audio file transcription can convert an audio file of up to five hours into text with punctuation, and automatically segment the text for greater clarity. In addition, this service can generate text with timestamps, facilitating further function development. In this version, both Chinese and English are supported.
Development Procedures
1. Preparations
(1) Configure the Huawei Maven repository address, and put the agconnect-services.json file under the app directory.
Open the build.gradle file in the root directory of your Android Studio project.
Add the AppGallery Connect plugin and the Maven repository.
- Go to allprojects > repositories and configure the Maven repository address for the HMS Core SDK.
- Go to buildscript > repositories and configure the Maven repository address for the HMS Core SDK.
- If the agconnect-services.json file has been added to the app, go to buildscript > dependencies and add the AppGallery Connect plugin configuration.
Code:
buildscript {
repositories {
google()
jcenter()
maven { url 'https://developer.huawei.com/repo/' }
}
dependencies {
classpath 'com.android.tools.build:gradle:3.5.4'
classpath 'com.huawei.agconnect:agcp:1.4.1.300'
// NOTE: Do not place your app dependencies here; they belong
// in the individual module build.gradle files.
}
}
allprojects {
repositories {
google()
jcenter()
maven { url 'https://developer.huawei.com/repo/' }
}
}
(2) Add the build dependencies for the HMS Core SDK.
Code:
dependencies {
//The audio file transcription SDK.
implementation 'com.huawei.hms:ml-computer-voice-aft:2.2.0.300'
// The ASR SDK.
implementation 'com.huawei.hms:ml-computer-voice-asr:2.2.0.300'
// Plugin of ASR.
implementation 'com.huawei.hms:ml-computer-voice-asr-plugin:2.2.0.300'
...
}
apply plugin: 'com.huawei.agconnect' // AppGallery Connect plugin.
(3) Configure the signing certificate in the build.gradle file under the app directory.
Code:
signingConfigs {
release {
storeFile file("xxx.jks")
keyAlias xxx
keyPassword xxxxxx
storePassword xxxxxx
v1SigningEnabled true
v2SigningEnabled true
}
}
buildTypes {
release {
minifyEnabled false
proguardFiles getDefaultProguardFile('proguard-android-optimize.txt'), 'proguard-rules.pro'
}
debug {
signingConfig signingConfigs.release
debuggable true
}
}
(4) Add permissions in the AndroidManifest.xml file.
Code:
<uses-permission android:name="android.permission.INTERNET" />
<uses-permission android:name="android.permission.READ_EXTERNAL_STORAGE" />
<uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE" />
<uses-permission android:name="android.permission.ACCESS_NETWORK_STATE" />
<uses-permission android:name="android.permission.ACCESS_WIFI_STATE" />
<uses-permission android:name="android.permission.RECORD_AUDIO" />
<application
android:requestLegacyExternalStorage="true"
...
</application>
2. Integrating the ASR Service
(1) Dynamically apply for the permissions.
Code:
if (ActivityCompat.checkSelfPermission(this, Manifest.permission.RECORD_AUDIO) != PackageManager.PERMISSION_GRANTED) {
requestCameraPermission();
}
private void requestCameraPermission() {
final String[] permissions = new String[]{Manifest.permission.RECORD_AUDIO};
if (!ActivityCompat.shouldShowRequestPermissionRationale(this, Manifest.permission.RECORD_AUDIO)) {
ActivityCompat.requestPermissions(this, permissions, Constants.AUDIO_PERMISSION_CODE);
return;
}
}
(2) Create an Intent to set parameters.
Code:
// Set authentication information for your app.
MLApplication.getInstance().setApiKey(AGConnectServicesConfig.fromContext(this).getString("client/api_key"));
//// Use Intent for recognition parameter settings.
Intent intentPlugin = new Intent(this, MLAsrCaptureActivity.class)
// Set the language that can be recognized to English. If this parameter is not set, English is recognized by default. Example: "zh-CN": Chinese; "en-US": English.
.putExtra(MLAsrCaptureConstants.LANGUAGE, MLAsrConstants.LAN_EN_US)
// Set whether to display the recognition result on the speech pickup UI.
.putExtra(MLAsrCaptureConstants.FEATURE, MLAsrCaptureConstants.FEATURE_WORDFLUX);
startActivityForResult(intentPlugin, 1); // The request code must be an int; it is checked in onActivityResult below.
(3) Override the onActivityResult method to process the result returned by ASR.
Code:
@Override
protected void onActivityResult(int requestCode, int resultCode, @Nullable Intent data) {
super.onActivityResult(requestCode, resultCode, data);
String text = "";
if (null == data) {
addTagItem("Intent data is null.", true);
}
if (requestCode == 1) {
if (data == null) {
return;
}
Bundle bundle = data.getExtras();
if (bundle == null) {
return;
}
switch (resultCode) {
case MLAsrCaptureConstants.ASR_SUCCESS:
// Obtain the text information recognized from speech.
if (bundle.containsKey(MLAsrCaptureConstants.ASR_RESULT)) {
text = bundle.getString(MLAsrCaptureConstants.ASR_RESULT);
}
if (text == null || "".equals(text)) {
text = "Result is null.";
Log.e(TAG, text);
} else {
// Display the recognition result in the search box.
searchEdit.setText(text);
goSearch(text, true);
}
break;
// MLAsrCaptureConstants.ASR_FAILURE: Recognition fails.
case MLAsrCaptureConstants.ASR_FAILURE:
// Check whether an error code is contained.
if (bundle.containsKey(MLAsrCaptureConstants.ASR_ERROR_CODE)) {
text = text + bundle.getInt(MLAsrCaptureConstants.ASR_ERROR_CODE);
// Troubleshoot based on the error code.
}
// Check whether error information is contained.
if (bundle.containsKey(MLAsrCaptureConstants.ASR_ERROR_MESSAGE)) {
String errorMsg = bundle.getString(MLAsrCaptureConstants.ASR_ERROR_MESSAGE);
// Troubleshoot based on the error information.
if (errorMsg != null && !"".equals(errorMsg)) {
text = "[" + text + "]" + errorMsg;
}
}
// Check whether a sub-error code is contained.
if (bundle.containsKey(MLAsrCaptureConstants.ASR_SUB_ERROR_CODE)) {
int subErrorCode = bundle.getInt(MLAsrCaptureConstants.ASR_SUB_ERROR_CODE);
// Troubleshoot based on the sub-error code.
text = "[" + text + "]" + subErrorCode;
}
Log.e(TAG, text);
break;
default:
break;
}
}
}
3. Integrating the Audio File Transcription Service
(1) Dynamically apply for the permissions.
Code:
private static final int REQUEST_EXTERNAL_STORAGE = 1;
private static final String[] PERMISSIONS_STORAGE = {
Manifest.permission.READ_EXTERNAL_STORAGE,
Manifest.permission.WRITE_EXTERNAL_STORAGE };
public static void verifyStoragePermissions(Activity activity) {
// Check if the write permission has been granted.
int permission = ActivityCompat.checkSelfPermission(activity,
Manifest.permission.WRITE_EXTERNAL_STORAGE);
if (permission != PackageManager.PERMISSION_GRANTED) {
// The permission has not been granted. Prompt the user to grant it.
ActivityCompat.requestPermissions(activity, PERMISSIONS_STORAGE,
REQUEST_EXTERNAL_STORAGE);
}
}
(2) Create and initialize an audio transcription engine, and create an audio file transcription configurator.
Code:
// Set the API key.
MLApplication.getInstance().setApiKey(AGConnectServicesConfig.fromContext(getApplication()).getString("client/api_key"));
MLRemoteAftSetting setting = new MLRemoteAftSetting.Factory()
// Set the transcription language code, complying with the BCP 47 standard. Currently, Mandarin Chinese and English are supported.
.setLanguageCode("zh")
// Set whether to automatically add punctuations to the converted text. The default value is false.
.enablePunctuation(true)
// Set whether to generate the text transcription result of each audio segment and the corresponding audio time shift. The default value is false. (This parameter needs to be set only when the audio duration is less than 1 minute.)
.enableWordTimeOffset(true)
// Set whether to output the time shift of a sentence in the audio file. The default value is false.
.enableSentenceTimeOffset(true)
.create();
// Create an audio transcription engine.
MLRemoteAftEngine engine = MLRemoteAftEngine.getInstance();
engine.init(this);
// Pass the listener callback to the audio transcription engine created beforehand.
engine.setAftListener(aftListener);
(3) Create a listener callback to process the audio file transcription result.
Transcription of short audio files with a duration of 1 minute or shorter:
Code:
private MLRemoteAftListener aftListener = new MLRemoteAftListener() {
public void onResult(String taskId, MLRemoteAftResult result, Object ext) {
// Obtain the transcription result notification.
if (result.isComplete()) {
// Process the transcription result.
}
}
@Override
public void onError(String taskId, int errorCode, String message) {
// Callback upon a transcription error.
}
@Override
public void onInitComplete(String taskId, Object ext) {
// Reserved.
}
@Override
public void onUploadProgress(String taskId, double progress, Object ext) {
// Reserved.
}
@Override
public void onEvent(String taskId, int eventId, Object ext) {
// Reserved.
}
};
Transcription of audio files with a duration longer than 1 minute:
Code:
private MLRemoteAftListener asrListener = new MLRemoteAftListener() {
@Override
public void onInitComplete(String taskId, Object ext) {
Log.e(TAG, "MLAsrCallBack onInitComplete");
// The long audio file is initialized and the transcription starts.
start(taskId);
}
@Override
public void onUploadProgress(String taskId, double progress, Object ext) {
Log.e(TAG, " MLAsrCallBack onUploadProgress");
}
@Override
public void onEvent(String taskId, int eventId, Object ext) {
// Used for the long audio file.
Log.e(TAG, "MLAsrCallBack onEvent" + eventId);
if (MLAftEvents.UPLOADED_EVENT == eventId) { // The file is uploaded successfully.
// Obtain the transcription result.
startQueryResult(taskId);
}
}
@Override
public void onResult(String taskId, MLRemoteAftResult result, Object ext) {
Log.e(TAG, "MLAsrCallBack onResult taskId is :" + taskId + " ");
if (result != null) {
Log.e(TAG, "MLAsrCallBack onResult isComplete: " + result.isComplete());
if (result.isComplete()) {
TimerTask timerTask = timerTaskMap.get(taskId);
if (null != timerTask) {
timerTask.cancel();
timerTaskMap.remove(taskId);
}
if (result.getText() != null) {
Log.e(TAG, taskId + " MLAsrCallBack onResult result is : " + result.getText());
tvText.setText(result.getText());
}
List<MLRemoteAftResult.Segment> words = result.getWords();
if (words != null && words.size() != 0) {
for (MLRemoteAftResult.Segment word : words) {
Log.e(TAG, "MLAsrCallBack word text is : " + word.getText() + ", startTime is : " + word.getStartTime() + ". endTime is : " + word.getEndTime());
}
}
List<MLRemoteAftResult.Segment> sentences = result.getSentences();
if (sentences != null && sentences.size() != 0) {
for (MLRemoteAftResult.Segment sentence : sentences) {
Log.e(TAG, "MLAsrCallBack sentence text is : " + sentence.getText() + ", startTime is : " + sentence.getStartTime() + ". endTime is : " + sentence.getEndTime());
}
}
}
}
}
@Override
public void onError(String taskId, int errorCode, String message) {
Log.i(TAG, "MLAsrCallBack onError : " + message + "errorCode, " + errorCode);
switch (errorCode) {
case MLAftErrors.ERR_AUDIO_FILE_NOTSUPPORTED:
break;
}
}
};
// Upload a transcription task.
private void start(String taskId) {
Log.e(TAG, "start");
engine.setAftListener(asrListener);
engine.startTask(taskId);
}
// Obtain the transcription result.
private Map<String, TimerTask> timerTaskMap = new HashMap<>();
private void startQueryResult(final String taskId) {
Timer mTimer = new Timer();
TimerTask mTimerTask = new TimerTask() {
@Override
public void run() {
getResult(taskId);
}
};
// Periodically obtain the long audio file transcription result every 10s.
mTimer.schedule(mTimerTask, 5000, 10000);
// Clear timerTaskMap before destroying the UI.
timerTaskMap.put(taskId, mTimerTask);
}
(4) Obtain an audio file and upload it to the audio transcription engine.
Code:
// Obtain the URI of an audio file.
Uri uri = getFileUri();
// Obtain the audio duration.
Long audioTime = getAudioFileTimeFromUri(uri);
// Check whether the duration is longer than 60s.
if (audioTime < 60000) {
// uri indicates audio resources read from the local storage or recorder. Only local audio files with a duration not longer than 1 minute are supported.
this.taskId = this.engine.shortRecognize(uri, this.setting);
Log.i(TAG, "Short audio transcription.");
} else {
// longRecognize is an API used to convert audio files with a duration from 1 minute to 5 hours.
this.taskId = this.engine.longRecognize(uri, this.setting);
Log.i(TAG, "Long audio transcription.");
}
private Long getAudioFileTimeFromUri(Uri uri) {
Long time = null;
Cursor cursor = this.getContentResolver()
.query(uri, null, null, null, null);
if (cursor != null) {
cursor.moveToFirst();
time = cursor.getLong(cursor.getColumnIndexOrThrow(MediaStore.Video.Media.DURATION));
} else {
MediaPlayer mediaPlayer = new MediaPlayer();
try {
mediaPlayer.setDataSource(String.valueOf(uri));
mediaPlayer.prepare();
} catch (IOException e) {
Log.e(TAG, "Failed to read the file time.");
}
time = Long.valueOf(mediaPlayer.getDuration());
}
return time;
}
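The snippet above relies on a getFileUri() helper that is not shown. A minimal illustrative version, assuming the audio file sits in the app's external files directory under a hard-coded name, could be:
Code:
// Illustrative only: build a Uri for a local audio file; replace with a real file picker in production.
private Uri getFileUri() {
    java.io.File audioFile = new java.io.File(getExternalFilesDir(null), "sample.m4a");
    return Uri.fromFile(audioFile);
}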
For more details, you can go to:
Reddit to join our developer discussion
GitHub to download demos and sample code
Stack Overflow to solve any integration problems
Original Source

How to Convert Audio File into Text With Machine Learning

Introduction
Converting audio into text has a wide range of applications: generating video subtitles, taking meeting minutes, and writing interview transcripts. Machine learning makes doing so easier than ever before, converting audio files into meticulously accurate text, with correct punctuation as well!
Actual Effects
Build and run an app with audio file transcription integrated. Then, select a local audio file and convert it into text.
Development Preparations
For details about configuring the Huawei Maven repository and integrating the audio file transcription SDK, please refer to the Development Guide of ML Kit on HUAWEI Developers.
Declaring Permissions in the AndroidManifest.xml File
Open the AndroidManifest.xml in the main folder. Add the network connection, network status access, and storage read permissions before <application.
Please note that these permissions need to be dynamically applied for. Otherwise, Permission Denied will be reported.
Code:
<uses-permission android:name="android.permission.INTERNET" />
<uses-permission android:name="android.permission.ACCESS_NETWORK_STATE" />
<uses-permission android:name="android.permission.READ_EXTERNAL_STORAGE" />
Development Procedure
Creating and Initializing an Audio File Transcription Engine
Override onCreate in MainActivity to create an audio transcription engine.
Code:
private MLRemoteAftEngine mAnalyzer;
mAnalyzer = MLRemoteAftEngine.getInstance();
mAnalyzer.init(getApplicationContext());
mAnalyzer.setAftListener(mAsrListener);
Use MLRemoteAftSetting to configure the engine. The service currently supports Mandarin Chinese and English, that is, the options of mLanguage are zh and en.
Code:
MLRemoteAftSetting setting = new MLRemoteAftSetting.Factory()
.setLanguageCode(mLanguage)
.enablePunctuation(true)
.enableWordTimeOffset(true)
.enableSentenceTimeOffset(true)
.create();
enablePunctuation indicates whether to automatically punctuate the converted text, with a default value of false.
If this parameter is set to true, the converted text is automatically punctuated; false otherwise.
enableWordTimeOffset indicates whether to generate the text transcription result of each audio segment with the corresponding offset. The default value is false. You need to set this parameter only when the audio duration is less than 1 minute.
If this parameter is set to true, the offset information is returned along with the text transcription result. This applies to the transcription of short audio files with a duration of 1 minute or shorter. If this parameter is set to false, only the text transcription result of the audio file will be returned.
enableSentenceTimeOffset indicates whether to output the offset of each sentence in the audio file. The default value is false.
If this parameter is set to true, the offset information is returned along with the text transcription result. If this parameter is set to false, only the text transcription result of the audio file will be returned.
Creating a Listener Callback to Process the Transcription Result
Code:
private MLRemoteAftListener mAsrListener = new MLRemoteAftListener()
After the listener is initialized, call startTask in AftListener to start the transcription.
Code:
@Override
public void onInitComplete(String taskId, Object ext) {
Log.i(TAG, "MLRemoteAftListener onInitComplete" + taskId);
mAnalyzer.startTask(taskId);
}
Override onUploadProgress, onEvent, and onResult in MLRemoteAftListener.
Code:
@Override
public void onUploadProgress(String taskId, double progress, Object ext) {
Log.i(TAG, " MLRemoteAftListener onUploadProgress is " + taskId + " " + progress);
}
@Override
public void onEvent(String taskId, int eventId, Object ext) {
Log.e(TAG, "MLAsrCallBack onEvent" + eventId);
if (MLAftEvents.UPLOADED_EVENT == eventId) { // The file is uploaded successfully.
showConvertingDialog();
startQueryResult(); // Obtain the transcription result.
}
}
@Override
public void onResult(String taskId, MLRemoteAftResult result, Object ext) {
Log.i(TAG, "onResult get " + taskId);
if (result != null) {
Log.i(TAG, "onResult isComplete " + result.isComplete());
if (!result.isComplete()) {
return;
}
if (null != mTimerTask) {
mTimerTask.cancel();
}
if (result.getText() != null) {
Log.e(TAG, result.getText());
dismissTransferringDialog();
showCovertResult(result.getText());
}
List<MLRemoteAftResult.Segment> segmentList = result.getSegments();
if (segmentList != null && segmentList.size() != 0) {
for (MLRemoteAftResult.Segment segment : segmentList) {
Log.e(TAG, "MLAsrCallBack segment text is : " + segment.getText() + ", startTime is : " + segment.getStartTime() + ". endTime is : " + segment.getEndTime());
}
}
List<MLRemoteAftResult.Segment> words = result.getWords();
if (words != null && words.size() != 0) {
for (MLRemoteAftResult.Segment word : words) {
Log.e(TAG, "MLAsrCallBack word text is : " + word.getText() + ", startTime is : " + word.getStartTime() + ". endTime is : " + word.getEndTime());
}
}
List<MLRemoteAftResult.Segment> sentences = result.getSentences();
if (sentences != null && sentences.size() != 0) {
for (MLRemoteAftResult.Segment sentence : sentences) {
Log.e(TAG, "MLAsrCallBack sentence text is : " + sentence.getText() + ", startTime is : " + sentence.getStartTime() + ". endTime is : " + sentence.getEndTime());
}
}
}
}
Processing the Transcription Result in Polling Mode
After the transcription is completed, call getLongAftResult to obtain the transcription result. Process the obtained result every 10 seconds.
Code:
private void startQueryResult() {
Timer mTimer = new Timer();
mTimerTask = new TimerTask() {
@Override
public void run() {
getResult();
}
};
mTimer.schedule(mTimerTask, 5000, 10000); // Process the obtained long speech transcription result every 10s.
}
private void getResult() {
Log.e(TAG, "getResult");
mAnalyzer.setAftListener(mAsrListener);
mAnalyzer.getLongAftResult(mLongTaskId);
}
References:
For more details, you can go to:
ML Kit official website
ML Kit Development Documentation page, to find the documents you need
Reddit to join our developer discussion
GitHub to download ML Kit sample codes
Stack Overflow to solve any integration problems
Thanks for sharing!
