Research has shown that our voice is often an indicator of our personality, which is why we're so fascinated with changing our voice to make it sound more fun and uplifting in, for example, videos and live streams.
As a mobile developer, I have implemented a voice-changing function in my own app, which you can try out in my demo. It allows users to mask their voice using seven preset voices: seasoned, cute, male, female, monster, cartoon, and robot.
I don't want to brag, but I myself still find this function amazing. Let's move on to how it's developed.
Making Preparations
Ensure you have completed these steps first.
Configuring the Project
Set the app authentication information. This can be set via an API key or access token.
Call setAccessToken during app initialization to set an access token. This only needs to be set once.
Java:
HAEApplication.getInstance().setAccessToken("your access token");
Or, use setApiKey to set an API key during app initialization. This only needs to be set once.
Code:
HAEApplication.getInstance().setApiKey("your ApiKey");
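For context, here is a minimal sketch of where this one-time initialization can live, assuming a custom Application subclass registered in the manifest (the HAEApplication import path is an assumption and may vary by SDK version):
Java:
import android.app.Application;
import com.huawei.hms.audioeditor.common.HAEApplication; // Import path assumed; check your SDK version.

public class MyApplication extends Application {
    @Override
    public void onCreate() {
        super.onCreate();
        // Set the authentication information once, at app startup.
        HAEApplication.getInstance().setApiKey("your ApiKey");
    }
}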
Calling the File API
Create the callback for the file API of the voice changer function. This callback is required before the file API can be called.
Java:
private ChangeSoundCallback callBack = new ChangeSoundCallback() {
@Override
public void onSuccess(String outAudioPath) {
// Callback when the processing is successful.
}
@Override
public void onProgress(int progress) {
// Callback when the processing progress is received.
}
@Override
public void onFail(int errorCode) {
// Callback when the processing fails.
}
@Override
public void onCancel() {
// Callback when the processing is canceled.
}
};
Implementing the Voice Changer Capability
Call applyAudioFile to change the voice.
Java:
// Change the voice.
HAEChangeVoiceFile haeChangeVoiceFile = new HAEChangeVoiceFile();
ChangeVoiceOption changeVoiceOption = new ChangeVoiceOption();
changeVoiceOption.setSpeakerSex(ChangeVoiceOption.SpeakerSex.MALE);
changeVoiceOption.setVoiceType(ChangeVoiceOption.VoiceType.CUTE);
haeChangeVoiceFile.changeVoiceOption(changeVoiceOption);
// Call the API.
haeChangeVoiceFile.applyAudioFile(inAudioPath, outAudioDir, outAudioName, callBack);
// Cancel the task of changing the voice.
haeChangeVoiceFile.cancel();
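As a usage sketch that is not part of the kit itself, the processed file can be played back as soon as it is ready, for example with Android's MediaPlayer inside the onSuccess callback defined earlier:
Java:
import android.media.MediaPlayer;
import java.io.IOException;

@Override
public void onSuccess(String outAudioPath) {
    // Play the file produced by the voice changer.
    MediaPlayer player = new MediaPlayer();
    try {
        player.setDataSource(outAudioPath);
        player.prepare();
        player.start();
    } catch (IOException e) {
        // Handle playback errors and release the player.
        player.release();
    }
}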
And now it's implemented. Easy, right? I hope this article can help you develop your own voice changer, and feel free to leave a comment if you want to learn more about my development journey.
References
Voice Changer
1. About the Function
Topic-based messaging is a Push Kit function that boosts user engagement and retention by ensuring that users receive messages only on topics they have subscribed to, such as photography, sports, and food, giving them a rich stream of information on their interests.
2. Background
During busy shopping festivals, e-commerce platforms tend to make purchase reservations broadly accessible for a wide range of products. Thanks to push channels, merchants can send messages recommending products to users in a timely manner. However, not all users will have demonstrated interest in a product, and many will be turned off by the sheer flood of messages. Topic-based messaging, on the other hand, only reaches users who have subscribed to specific topics, helping developers and merchants avoid this pitfall.
3. Process
4. Key Integration Steps and Code
(1) Integrate the HMS Core Push SDK.
For details, please refer to the Push SDK integration document at https://developer.huawei.com/consum...s-V5/service-introduction-0000001050040060-V5.
(2) Configure automatic initialization. After the configuration, onNewToken returns a token each time the app starts. Topic-based messaging does not depend on the token, but the app must still obtain one first; a sketch of receiving it follows the snippet below.
Code:
<meta-data
    android:name="push_kit_auto_init_enabled"
    android:value="true" />
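For reference, here is a minimal sketch of receiving the token, assuming a service class named DemoHmsMessageService that is registered in the manifest (the class and tag names are illustrative):
Code:
import android.util.Log;
import com.huawei.hms.push.HmsMessageService;

public class DemoHmsMessageService extends HmsMessageService {
    private static final String TAG = "PushDemo";

    @Override
    public void onNewToken(String token) {
        super.onNewToken(token);
        // With automatic initialization enabled, the token is delivered here
        // each time the app starts. Topic subscription does not use the token
        // itself, but the app must have obtained one before subscribing.
        Log.i(TAG, "token received");
    }
}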
(3) Set topic to a product ID. When a user taps the reservation button, the subscribe method is called to subscribe to the topic.
Code:
/**
* Subscribe to topics in asynchronous mode.
*/
private void addTopic(String topic) {
try {
HmsMessaging.getInstance(MainActivity2.this)
.subscribe(topic)
.addOnCompleteListener(new OnCompleteListener<Void>() {
@Override
public void onComplete(Task<Void> task) {
if (task.isSuccessful()) {
Log.i(TAG, "subscribe Complete");
changToCancelAppointment();
isAppointment = true;
showLog("subscribe successful");
} else {
isAppointment = false;
changeToAppointment();
showLog("subscribe failed: ret=" + task.getException().getMessage());
}
}
});
} catch (Exception e) {
isAppointment = false;
changeToAppointment();
showLog("subscribe failed: exception=" + e.getMessage());
}
}
(4) Call the downlink messaging API to push messages by topic. Use Postman to simulate sending a message; an illustrative packet is shown below.
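The original packet screenshot is not reproduced here; the following is an illustrative packet for the Push Kit downlink messaging API (POST https://push-api.cloud.huawei.com/v1/{appId}/messages:send), assuming the topic name watch123456 and placeholder notification text:
Code:
{
    "validate_only": false,
    "message": {
        "android": {
            "notification": {
                "title": "Reservation reminder",
                "body": "The product you reserved will soon be available.",
                "click_action": {
                    "type": 3
                }
            }
        },
        "topic": "watch123456"
    }
}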
(5) When reserving a watch purchase, the user may wish to purchase other watches in the same price range and of the same model. In this case, push the pre-sales information of these watches as a group to the user by condition. For example, if the user has subscribed to at least one of watch123456, watch32165, and watch321684, the pre-sales information about the other watches will be pushed to the user, as sketched below.
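As a sketch under the same assumptions, the message then carries a condition field that combines the topics instead of a single topic field (only the changed parts of the packet are shown):
Code:
{
    "message": {
        "android": {
            "notification": {
                "title": "Pre-sales information",
                "body": "Similar watches you may like are open for reservation.",
                "click_action": {
                    "type": 3
                }
            }
        },
        "condition": "'watch123456' in topics || 'watch32165' in topics || 'watch321684' in topics"
    }
}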
(6) When the user cancels the reservation, call the unsubscribe method. After the cancellation, the user will no longer receive the pre-sales information for the watch.
Code:
/**
* Unsubscribe from topics in asynchronous mode.
*/
private void deleteTopic(String topic) {
try {
HmsMessaging.getInstance(MainActivity2.this)
.unsubscribe(topic)
.addOnCompleteListener(new OnCompleteListener<Void>() {
@Override
public void onComplete(Task<Void> task) {
if (task.isSuccessful()) {
showLog("unsubscribe successful");
changeToAppointment();
isAppointment = false;
} else {
isAppointment = true;
showLog("unsubscribe failed: ret=" + task.getException().getMessage());
changToCancelAppointment();
}
}
});
} catch (Exception e) {
showLog("unsubscribe failed: exception=" + e.getMessage());
isAppointment = true;
changeToAppointment();
}
}
5. Display Effects
(1) Prior to reservation (2) At the time of reservation (3) Reservation canceled
6. Others
Topic-based messaging does not limit the number of subscriptions to each topic. However, Push Kit has the following restrictions:
(1) An app instance can subscribe to a maximum of 2,000 topics.
(2) For Huawei devices running EMUI 10.0 or later, the version of HMS Core (APK) must be 3.0.0 or later. For Huawei devices running EMUI earlier than 10.0, the version of HMS Core (APK) must be 4.0.3 or later. HMS Core (APK) of a later version supplements the functions that are missing in EMUI of an earlier version.
(3) The number of topics through which messages can be pushed simultaneously cannot exceed 100.
Text to speech (TTS) is highly sought after by audio/video editors, thanks to its ability to automatically turn text into natural-sounding speech, as a low-cost alternative to human dubbing. It can be used on all kinds of videos, long or short.
I recently stumbled upon the AI dubbing capability of HMS Core Audio Editor Kit, which does just that. It is able to turn input text into speech with just a tap, and comes loaded with a selection of smooth, natural-sounding male and female timbres.
This is ideal for apps involving e-books, audio content creation, and audio/video editing. Below is how I integrated this capability.
Making Preparations
Complete all necessary preparations by following the official guide.
Configuring the Project
1. Set the app authentication information
The information can be set via an API key or access token (recommended).
Use setAccessToken to set an access token during app initialization.
Java:
HAEApplication.getInstance().setAccessToken("your access token");
Or, use setApiKey to set an API key during app initialization. The API key needs to be set only once.
Java:
HAEApplication.getInstance().setApiKey("your ApiKey");
2. Initialize the runtime environment
Initialize HuaweiAudioEditor, and create a timeline and necessary lanes.
Java:
// Create a HuaweiAudioEditor instance.
HuaweiAudioEditor mEditor = HuaweiAudioEditor.create(mContext);
// Initialize the runtime environment of HuaweiAudioEditor.
mEditor.initEnvironment();
// Create a timeline.
HAETimeLine mTimeLine = mEditor.getTimeLine();
// Create a lane.
HAEAudioLane audioLane = mTimeLine.appendAudioLane();
Import audio.
Java:
// Add an audio asset to the end of the lane.
HAEAudioAsset audioAsset = audioLane.appendAudioAsset("/sdcard/download/test.mp3", mTimeLine.getCurrentTime());
3. Integrate AI dubbing.
Call HAEAiDubbingEngine to implement AI dubbing.
Java:
// Configure the AI dubbing engine.
HAEAiDubbingConfig haeAiDubbingConfig = new HAEAiDubbingConfig()
// Set the volume.
.setVolume(volumeVal)
// Set the speech speed.
.setSpeed(speedVal)
// Set the speaker.
.setType(defaultSpeakerType);
// Create a callback for an AI dubbing task.
HAEAiDubbingCallback callback = new HAEAiDubbingCallback() {
@Override
public void onError(String taskId, HAEAiDubbingError err) {
// Callback when an error occurs.
}
@Override
public void onWarn(String taskId, HAEAiDubbingWarn warn) {}
@Override
public void onRangeStart(String taskId, int start, int end) {}
@Override
public void onAudioAvailable(String taskId, HAEAiDubbingAudioInfo haeAiDubbingAudioFragment, int i, Pair<Integer, Integer> pair, Bundle bundle) {
// Start receiving and then saving the file.
}
@Override
public void onEvent(String taskId, int eventID, Bundle bundle) {
// Synthesis is complete.
if (eventID == HAEAiDubbingConstants.EVENT_SYNTHESIS_COMPLETE) {
// The AI dubbing task is complete, meaning the synthesized audio data has been fully processed.
}
}
@Override
public void onSpeakerUpdate(List<HAEAiDubbingSpeaker> speakerList, List<String> lanList,
List<String> lanDescList) { }
};
// AI dubbing engine.
HAEAiDubbingEngine mHAEAiDubbingEngine = new HAEAiDubbingEngine(haeAiDubbingConfig);
// Set the listener for the playback process of an AI dubbing task.
mHAEAiDubbingEngine.setAiDubbingCallback(callback);
// Convert text to speech and play the speech. In the method, text indicates the text to be converted to speech, and mode indicates the mode for playing the converted audio.
String taskId = mHAEAiDubbingEngine.speak(text, mode);
// Pause playback.
mHAEAiDubbingEngine.pause();
// Resume playback.
mHAEAiDubbingEngine.resume();
// Stop AI dubbing.
mHAEAiDubbingEngine.stop();
Result
In the demo below, I successfully implemented the AI dubbing function in my app. Now, it can convert text into emotionally expressive speech, with default and custom timbres.
To learn more, please visit:
>> Audio Editor Kit official website
>> Audio Editor Kit Development Guide
>> Reddit to join developer discussions
>> GitHub to download the sample code
>> Stack Overflow to solve integration problems
Follow our official account for the latest HMS Core-related news and updates.
Efficient records management is more relevant now than ever. In our digital age, a huge volume of information — audio, video, and more — must be handled in limited time. This makes a real-time transcription function essential, because it is useful in so many scenarios.
In audio or video conferencing, it records meeting minutes that I can refer to later, which is far more convenient than writing them all down myself. I've also seen my kids struggling to take notes during their online courses, and this process becomes much easier with the help of transcription: it removes the job of writing down everything the teacher says, letting them focus on the lecture itself and easily review the content later. Likewise, live captions give viewers real-time subtitles for a better watching experience.
As a coder, I am a believer in "Actions speak louder than words". That's why I developed a real-time transcription function, with the help of a real-time transcription capability from ML Kit, like this.
Demo
This function transcribes up to five hours of speech in real time into Chinese, English (or a mix of Chinese and English), or French. In addition, the output text is punctuated and contains timestamps.
This function has some requirements: support for French depends on the phone model, whereas Chinese and English are available on all phone models. The function also requires an Internet connection.
Okay, let's move on to the point of this article: How I developed this real-time transcription function.
Development Procedure
1. Make necessary preparations. This is described in detail in the References section.
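Among these preparations, the app must hold the microphone and network permissions, since recognition captures live audio and runs over the network. A minimal manifest sketch (the full preparation steps are in the linked guide):
Code:
<uses-permission android:name="android.permission.RECORD_AUDIO" />
<uses-permission android:name="android.permission.INTERNET" />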
2. Create and then configure a speech recognizer.
Code:
MLSpeechRealTimeTranscriptionConfig config = new MLSpeechRealTimeTranscriptionConfig.Factory()
// Set the language, which can be Chinese, English, both Chinese and English, or French.
.setLanguage(MLSpeechRealTimeTranscriptionConstants.LAN_ZH_CN)
// Punctuate the text recognized from the speech.
.enablePunctuation(true)
// Set the sentence offset.
.enableSentenceTimeOffset(true)
// Set the word offset.
.enableWordTimeOffset(true)
.create();
MLSpeechRealTimeTranscription mSpeechRecognizer = MLSpeechRealTimeTranscription.getInstance();
3. Create a callback for the speech recognition result listener.
Code:
// Use the callback to implement the MLSpeechRealTimeTranscriptionListener API and its methods.
protected class SpeechRecognitionListener implements MLSpeechRealTimeTranscriptionListener {
@Override
public void onStartListening() {
// The recorder starts to receive speech.
}
@Override
public void onStartingOfSpeech() {
// The speech recognizer detects the user speaking.
}
@Override
public void onVoiceDataReceived(byte[] data, float energy, Bundle bundle) {
// Return the original PCM stream and audio power to the user. The API does not run in the main thread, and the return result is processed in a sub-thread.
}
@Override
public void onRecognizingResults(Bundle partialResults) {
// Receive recognized text from MLSpeechRealTimeTranscription.
}
@Override
public void onError(int error, String errorMessage) {
// Callback when an error occurs during recognition.
}
@Override
public void onState(int state, Bundle params) {
// Notify the app of the recognizer status change.
}
}
4. Bind the speech recognizer.
Code:
mSpeechRecognizer.setRealTimeTranscriptionListener(new SpeechRecognitionListener());
5. Call startRecognizing to begin speech recognition.
Code:
mSpeechRecognizer.startRecognizing(config);
6. Stop recognition and release resources occupied by the recognizer when the recognition is complete.
Code:
if (mSpeechRecognizer != null) {
mSpeechRecognizer.destroy();
}
References
Audio Transcription: What It Is, What It Is Not, and Why It's in High Demand
Configuring Necessary Information During Preparation
Adding a Plug-In and the Maven Repository Address, and Configuring the Building Dependencies
Background
Videos are memories — so why not spend more time making them look better? Many mobile apps on the market offer only basic editing functions, such as applying filters and adding stickers. This is not enough for those who want to create dynamic videos in which a moving person stays in focus. Traditionally, that requires adding a keyframe and manually adjusting the video image, which could scare off many amateur video editors.
I am one of those people and I've been looking for an easier way of implementing this kind of feature. Fortunately for me, I stumbled across the track person capability from HMS Core Video Editor Kit, which automatically generates a video that centers on a moving person, as the images below show.
Before using the capability
After using the capability
Thanks to the capability, I can now confidently create a video with the person tracking effect.
Let's see how the function is developed.
Development Process
Preparations
Configure the app information in AppGallery Connect.
Project Configuration
1. Set the authentication information for the app via an access token or API key.
Use the setAccessToken method to set an access token during app initialization. This needs setting only once.
Code:
MediaApplication.getInstance().setAccessToken("your access token");
Or, use setApiKey to set an API key during app initialization. The API key needs to be set only once.
Code:
MediaApplication.getInstance().setApiKey("your ApiKey");
2. Set a unique License ID.
Code:
MediaApplication.getInstance().setLicenseId("License ID");
3. Initialize the runtime environment for HuaweiVideoEditor.
When creating a video editing project, first create a HuaweiVideoEditor object and initialize its runtime environment. Release this object when exiting a video editing project.
(1) Create a HuaweiVideoEditor object.
Code:
HuaweiVideoEditor editor = HuaweiVideoEditor.create(getApplicationContext());
(2) Specify the preview area position.
This area renders video images, which the SDK implements by creating a SurfaceView. The preview area position must be specified before the area is created.
Code:
<LinearLayout
android:id="@+id/video_content_layout"
android:layout_width="0dp"
android:layout_height="0dp"
android:background="@color/video_edit_main_bg_color"
android:gravity="center"
android:orientation="vertical" />
// Specify the preview area position.
LinearLayout mSdkPreviewContainer = view.findViewById(R.id.video_content_layout);
// Configure the preview area layout.
editor.setDisplay(mSdkPreviewContainer);
(3) Initialize the runtime environment. LicenseException will be thrown if license verification fails.
Creating the HuaweiVideoEditor object does not occupy any system resources. You choose when to initialize the runtime environment, at which point the necessary threads and timers are created in the SDK.
Code:
try {
editor.initEnvironment();
} catch (LicenseException error) {
SmartLog.e(TAG, "initEnvironment failed: " + error.getErrorMsg());
finish();
return;
}
4. Add a video or an image.
Create a video lane. Add a video or an image to the lane using the file path.
Code:
// Obtain the HVETimeLine object.
HVETimeLine timeline = editor.getTimeLine();
// Create a video lane.
HVEVideoLane videoLane = timeline.appendVideoLane();
// Add a video to the end of the lane.
HVEVideoAsset videoAsset = videoLane.appendVideoAsset("test.mp4");
// Add an image to the end of the video lane.
HVEImageAsset imageAsset = videoLane.appendImageAsset("test.jpg");
Function Building
Call the person tracking APIs on the asset added earlier (in the snippet below, visibleAsset refers to that video or image asset).
Code:
// Initialize the capability engine.
visibleAsset.initHumanTrackingEngine(new HVEAIInitialCallback() {
@Override
public void onProgress(int progress) {
// Initialization progress.
}
@Override
public void onSuccess() {
// The initialization is successful.
}
@Override
public void onError(int errorCode, String errorMessage) {
// The initialization failed.
}
});
// Select the person to track by touch point; bitmap and position2D are assumed to come
// from the tapped preview frame. The method returns the coordinates of the two vertices
// of the rectangle that contains the selected person.
List<Float> rects = visibleAsset.selectHumanTrackingPerson(bitmap, position2D);
// Enable the effect of person tracking.
visibleAsset.addHumanTrackingEffect(new HVEAIProcessCallback() {
@Override
public void onProgress(int progress) {
// Handling progress.
}
@Override
public void onSuccess() {
// Handling successful.
}
@Override
public void onError(int errorCode, String errorMessage) {
// Handling failed.
}
});
// Interrupt the effect.
visibleAsset.interruptHumanTracking();
// Remove the effect.
visibleAsset.removeHumanTrackingEffect();
References
The Importance of Visual Effects
Track Person
Background
It's now possible to carry a mobile recording studio in your pocket, thanks to a range of apps on the market that allow music enthusiasts to sing and record themselves anytime and anywhere.
However, you'll often find that nasty background noise creeps into recordings. That's where HMS Core Audio Editor Kit comes into the mix, which, when integrated into an app, will cancel out background noise. Let's see how to integrate it to develop a noise reduction function.
Development Process
Making Preparations
Complete these prerequisites.
Configuring the Project
1. Set the app authentication information via an access token or API key.
Call setAccessToken during app initialization to set an access token. This needs setting only once.
Code:
HAEApplication.getInstance().setAccessToken("your access token");
Or, call setApiKey to set an API key during app initialization. This needs to be set only once.
Code:
HAEApplication.getInstance().setApiKey("your ApiKey");
2. Call the file API for the noise reduction capability. Before doing so, create the callback for the file API.
Code:
private ChangeSoundCallback callBack = new ChangeSoundCallback() {
@Override
public void onSuccess(String outAudioPath) {
// Callback when the processing is successful.
}
@Override
public void onProgress(int progress) {
// Callback when the processing progress is received.
}
@Override
public void onFail(int errorCode) {
// Callback when the processing fails.
}
@Override
public void onCancel() {
// Callback when the processing is canceled.
}
};
3. Call applyAudioFile for noise reduction.
Code:
// Reduce noise.
HAENoiseReductionFile haeNoiseReductionFile = new HAENoiseReductionFile();
// Call the API.
haeNoiseReductionFile.applyAudioFile(inAudioPath, outAudioDir, outAudioName, callBack);
// Cancel the noise reduction task.
haeNoiseReductionFile.cancel();
And the function is now created.
This function is ideal for audio/video editing, karaoke, live streaming, instant messaging, and for holding online conferences, as it helps mute steady state noise and loud sounds captured from one or two microphones, to make a person's voice sound crystal clear. How would you use this function? Share your ideas in the comments section.
References
Types of Noise
How to Implement Noise Reduction?