Changing an image/video background has always been a hassle, whereby the most tricky part is to extract the element other than the background.
Traditionally, it requires us to use a PC image-editing program that allows us to select the element, add a mask, replace the canvas, and more. If the element has an extremely uneven border, then the whole process can be very time-consuming.
Luckily, ML Kit from HMS Core offers a solution that streamlines the process: the image segmentation service, which supports both images and videos. This service draws upon a deep learning framework, as well as detection and recognition technology. The service can automatically recognize — within seconds — the elements and scenario of an image or a video, delivering a pixel-level recognition accuracy. By using a novel framework of semantic segmentation, image segmentation labels each and every pixel in an image and supports 11 element categories including humans, the sky, plants, food, buildings, and mountains.
{
"lightbox_close": "Close",
"lightbox_next": "Next",
"lightbox_previous": "Previous",
"lightbox_error": "The requested content cannot be loaded. Please try again later.",
"lightbox_start_slideshow": "Start slideshow",
"lightbox_stop_slideshow": "Stop slideshow",
"lightbox_full_screen": "Full screen",
"lightbox_thumbnails": "Thumbnails",
"lightbox_download": "Download",
"lightbox_share": "Share",
"lightbox_zoom": "Zoom",
"lightbox_new_window": "New window",
"lightbox_toggle_sidebar": "Toggle sidebar"
}
This service is a great choice for entertaining apps. For example, an image-editing app can use the service to realize swift background replacement. A photo-taking app can count on this service for optimization on different elements (for example, the green plant) to make them appear more attractive.
Below is an example showing how the service works in an app.
Cutout is another field where image segmentation plays a role. Most cutout algorithms, however, cannot delicately determine fine border details such as that of hair. The team behind ML Kit's image segmentation has been working on its algorithms designed for handling hair and highly hollowed-out subjects. As a result, the capability can now retain hair details during live-streaming and image processing, delivering a better cutout effect.
Development Procedure
Before app development, there are some necessary preparations. In addition, the Maven repository address should be configured for the SDK, and the SDK should be integrated into the app project.
The image segmentation service offers three capabilities: human body segmentation, multiclass segmentation, and hair segmentation.
Human body segmentation: supports videos and images. The capability segments the human body from its background and is ideal for those who only need to segment the human body and background. The return value of this capability contains the coordinate array of the human body, human body image with a transparent background, and gray-scale image with a white human body and black background. Based on the return value, your app can further process an image to, for example, change the video background or cut out the human body.
Multiclass segmentation: offers the return value of the coordinate array of each element. For example, when the image processed by the capability contains four elements (human body, sky, plant, and cat & dog), the return value is the coordinate array of the four elements. Your app can further process these elements, such as replacing the sky.
Hair segmentation: segments hair from the background, with only images supported. The return value is a coordinate array of the hair element. For example, when the image processed by the capability is a selfie, the return value is the coordinate array of the hair element. Your app can then further process the element by, for example, changing the hair color.
Static Image Segmentation
1. Create an image segmentation analyzer.
Integrate the human body segmentation model package.
Code:
// Method 1: Use default parameter settings to configure the image segmentation analyzer.
// The default mode is human body segmentation in fine mode. All segmentation results of human body segmentation are returned (pixel-level label information, human body image with a transparent background, gray-scale image with a white human body and black background, and an original image for segmentation).
MLImageSegmentationAnalyzer analyzer = MLAnalyzerFactory.getInstance().getImageSegmentationAnalyzer();
// Method 2: Use MLImageSegmentationSetting to customize the image segmentation analyzer.
MLImageSegmentationSetting setting = new MLImageSegmentationSetting.Factory()
// Set whether to use fine segmentation. true indicates yes, and false indicates no (fast segmentation).
.setExact(false)
// Set the segmentation mode to human body segmentation.
.setAnalyzerType(MLImageSegmentationSetting.BODY_SEG)
// Set the returned result types.
// MLImageSegmentationScene.ALL: All segmentation results are returned (pixel-level label information, human body image with a transparent background, gray-scale image with a white human body and black background, and an original image for segmentation).
// MLImageSegmentationScene.MASK_ONLY: Only pixel-level label information and an original image for segmentation are returned.
// MLImageSegmentationScene.FOREGROUND_ONLY: A human body image with a transparent background and an original image for segmentation are returned.
// MLImageSegmentationScene.GRAYSCALE_ONLY: A gray-scale image with a white human body and black background and an original image for segmentation are returned.
.setScene(MLImageSegmentationScene.FOREGROUND_ONLY)
.create();
MLImageSegmentationAnalyzer analyzer = MLAnalyzerFactory.getInstance().getImageSegmentationAnalyzer(setting);
Integrate the multiclass segmentation model package.
When the multiclass segmentation model package is used for processing an image, an image segmentation analyzer can be created only by using MLImageSegmentationSetting.
Code:
MLImageSegmentationSetting setting = new MLImageSegmentationSetting
.Factory()
// Set whether to use fine segmentation. true indicates yes, and false indicates no (fast segmentation).
.setExact(true)
// Set the segmentation mode to image segmentation.
.setAnalyzerType(MLImageSegmentationSetting.IMAGE_SEG)
.create();
MLImageSegmentationAnalyzer analyzer = MLAnalyzerFactory.getInstance().getImageSegmentationAnalyzer(setting);
Integrate the hair segmentation model package.
When the hair segmentation model package is used for processing an image, a hair segmentation analyzer can be created only by using MLImageSegmentationSetting.
Code:
MLImageSegmentationSetting setting = new MLImageSegmentationSetting
.Factory()
// Set the segmentation mode to hair segmentation.
.setAnalyzerType(MLImageSegmentationSetting.HAIR_SEG)
.create();
MLImageSegmentationAnalyzer analyzer = MLAnalyzerFactory.getInstance().getImageSegmentationAnalyzer(setting);
2. Create an MLFrame object by using android.graphics.Bitmap for the analyzer to detect images. JPG, JPEG, and PNG images are supported. It is recommended that the image size range from 224 x 224 px to 1280 x 1280 px.
Code:
// Create an MLFrame object using the bitmap, which is the image data in bitmap format.
MLFrame frame = MLFrame.fromBitmap(bitmap);
3. Call asyncAnalyseFrame for image segmentation.
Code:
// Create a task to process the result returned by the analyzer.
Task<MLImageSegmentation> task = analyzer.asyncAnalyseFrame(frame);
// Asynchronously process the result returned by the analyzer.
task.addOnSuccessListener(new
OnSuccessListener<MLImageSegmentation>() {
@Override
public void onSuccess(MLImageSegmentation segmentation) {
// Callback when recognition is successful.
}})
.addOnFailureListener(new OnFailureListener() {
@Override
public void onFailure(Exception e) {
// Callback when recognition failed.
}});
4. Stop the analyzer and release the recognition resources when recognition ends.
Code:
if (analyzer != null) {
try {
analyzer.stop();
} catch (IOException e) {
// Exception handling.
}
}
The asynchronous call mode is used in the preceding example. Image segmentation also supports synchronous call of the analyseFrame function to obtain the detection result:
Code:
SparseArray<MLImageSegmentation> segmentations = analyzer.analyseFrame(frame);
References
Home page of HMS Core ML Kit
Development Guide of HMS Core ML Kit
Related
Have you ever been sent compressed images that have poor definition? Even when you zoom in, the image is still blurry. I recently received a ZIP file of travel photos from a trip I went on with a friend. After opening it, I found to my dismay that each image was either too dark, too dim or too blurry. How am I going to show off with such terrible photos? So, I sought help from the Internet, and luckily, I came across HUAWEI ML Kit's image super-resolution capability. The amazing thing is that this SDK is free of charge, and can be used with all Android phones.
Background
ML Kit's image super-resolution capability is backed by a deep neural network and provides two super-resolution capabilities for mobile apps:
1x super-resolution: Doesn’t change the size but removes compression noise, for vivid, natural images.
3x super-resolution: Provides 3x magnification (with 9x higher pixel resolution) while also removing compression noise, for clear details and textures.
Of course, these capabilities are not easy for non-professionals to develop themselves. But here is the sample code, so you can download and experience it yourself!
Application Scenarios
Image super-resolution is not just about optimizing faces and text; it’s useful in a huge range of situations. When shopping apps integrate 3x image super-resolution, they can give shoppers the option to zoom in on a product, without losing any image quality. For news and reading apps, the 1x image super-resolution capability can give users clearer images without changing the image resolution. Then, for photography apps which integrate image super-resolution, users can capture more vivid and natural images.
Code Development
Before we get started with API development, we need to make necessary development preparations. Also, ensure that the Maven repository address of the HMS Core SDK has been configured in your project, and the SDK of this service has been integrated.
Note
When using the image super-resolution capability, you need to set the value of targetSdkVersion to less than 29 in the build.gradle file.
1. Create an image super-resolution analyzer.
You can create the analyzer using the MLImageSuperResolutionAnalyzerSetting class.
Code:
// Method 1: Use default parameter settings, that is, the 1x super-resolution capability.
MLImageSuperResolutionAnalyzer analyzer = MLImageSuperResolutionAnalyzerFactory.getInstance
().getImageSuperResolutionAnalyzer();
// Method 2: Use custom parameter settings. Currently, only 1x super-resolution is supported. More capabilities will be supported in future.
MLImageSuperResolutionAnalyzerSetting settings = new MLImageSuperResolutionAnalyzerSetting.Factory()
// Set the image super-resolution multiplier to 1x.
setScale(MLImageSuperResolutionAnalyzerSetting.ISR_SCALE_1X)
create();
MLImageSuperResolutionAnalyzer analyzer = MLImageSuperResolutionAnalyzerFactory.getInstance
().getImageSuperResolutionAnalyzer(setting)
2. Create an MLFrame object by using android.graphics.Bitmap. (Note that the bitmap type must be ARGB8888. Convert if necessary.)
Code:
// Create an MLFrame object using the bitmap. The parameter bitmap indicates the input image.
MLFrame frame = new MLFrame.Creator().setBitmap(bitmap).create();
3. Perform super-resolution processing on the image. (For details about the error codes, please refer to ML Kit Error Codes).
Code:
Task<MLImageSuperResolutionResult> task = analyzer.asyncAnalyseFrame(frame);
task.addOnSuccessListener(new OnSuccessListener<MLImageSuperResolutionResult>() {
public void onSuccess(MLImageSuperResolutionResult result) {
// Processing logic for detection success.
}})
addOnFailureListener(new OnFailureListener() {
public void onFailure(Exception e) {
// Processing logic for detection failure.
// failure.
if(e instanceof MLException) {
MLException mlException = (MLException)e;
// Obtain the error code. You can process the error code and customize the messages users see.
int errorCode = mlException.getErrCode();
// Obtain the error information. You can quickly locate the fault based on the error code.
String errorMessage = mlException.getMessage();
} else {
// Other errors.
}
}
});
4. Stop the analyzer to release recognition resources after the recognition is complete.
Code:
if (analyzer != null) {
analyzer.stop();
}
Demo
Now, let's compare some before and after images, to see what this capability can do.
Original image
{
"lightbox_close": "Close",
"lightbox_next": "Next",
"lightbox_previous": "Previous",
"lightbox_error": "The requested content cannot be loaded. Please try again later.",
"lightbox_start_slideshow": "Start slideshow",
"lightbox_stop_slideshow": "Stop slideshow",
"lightbox_full_screen": "Full screen",
"lightbox_thumbnails": "Thumbnails",
"lightbox_download": "Download",
"lightbox_share": "Share",
"lightbox_zoom": "Zoom",
"lightbox_new_window": "New window",
"lightbox_toggle_sidebar": "Toggle sidebar"
}
Image after 1x super-resolution
Original image
Image after 3x super-resolution
Related Links
With HUAWEI ML Kit, you can integrate a wide range of capabilities and services into your apps, including text recognition, card recognition, text translation, facial recognition, speech recognition, text to speech, image classification, image segmentation, and product visual search.
If you’re interested, go to HUAWEI ML Kit - About the Service to learn more.
To find out more about these capabilities and services, here are some useful resources:
https://developer.huawei.com/consumer/cn/doc/31403
How much time it takes to integrate ML kit in a project. What is the use case of ML kit super resolution capability.
Introducton:
HUAWEI HiAI is an open Artificial Intelligence (AI) capable platform for smart devices, which adopts a chip-device-cloud architecture, opening up the chip, app, and service capabilities for a fully intelligent ecosystem. The HiAI is used for Text Recognition, Image Recognition etc. Text Recognition is used to extract text data from an image and the Image recognition is used to obtain quality, category and scene of a particular image.
This article is giving a brief explanation on Document Skew Correction, General Text Recognition and Table Recognition. Here we are using DevEco plugin to configure the HiAI application. To know about the integrate of application via DevEco you can refer the article "HUAWEI HiAI Image Super-Resolution Via DevEco".
General Text Recognition:
The API detects and identifies text information in a given image. It supports text extraction from screenshots with simple to more complex backgrounds. In snapshot Optical Character Recognition(OCR), the image resolution should be higher than 720p (1280×720 px), and the aspect ratio (length-to-width ratio) should be lower than 2:1. If the input image does not meet the above requirements, text recognition accuracy may be affected.
Code Implementation:
Initialize with the VisionBase static class and asynchronously get the connection of the service.
Code:
private void initHiAI() {
VisionBase.init(this, new ConnectionCallback() {
@Override
public void onServiceConnect() {
/** Invoke the callback method when the service is connected successfully. You can also perform initialization and mark the service connection status in this method*/
}
@Override
public void onServiceDisconnect() {
/** Invoke the callback method when the service is disconnected. You can choose to reconnect to the service or handle exceptions API invocation*/
}
});
}
Define the detector instance and frame. Select the OCR type to be called via TextConfiguration and finally process the Json result.
Code:
private void setHiAi() {
/** Define the detector instance and use the Context of this project as a parameter*/
TextDetector detector = new TextDetector(this);
/** Define a frame and initialize it*/
Frame frame = new Frame();
/** Place the bitmap to be checked in the frame*/
frame.setBitmap(bitmap);
/** Select the OCR type to be called via TextConfiguration, for example TYPE_TEXT_DETECT_SCREEN_SHOT_GENERAL is the detection screenshot*/
TextConfiguration config = new TextConfiguration();
config.setEngineType(TextDetectType.TYPE_TEXT_DETECT_SCREEN_SHOT_GENERAL);
detector.setTextConfiguration(config);
/** Check the text and process the result. Note that this line of code must be placed in the working thread rather than the primary thread*/
JSONObject jsonObject = detector.detect(frame, null);
/** Run the convertResult() command to convert the JSON character string into a Java class. You can also parse the JSON character string yourself*/
Text text = detector.convertResult(jsonObject);
System.out.println("Json:"+jsonObject);
String txtValue = text.getValue();
this.txtValue = txtValue;
handler.sendEmptyMessage(Utils.HiAI_RESULT);
}
Screenshot:
{
"lightbox_close": "Close",
"lightbox_next": "Next",
"lightbox_previous": "Previous",
"lightbox_error": "The requested content cannot be loaded. Please try again later.",
"lightbox_start_slideshow": "Start slideshow",
"lightbox_stop_slideshow": "Stop slideshow",
"lightbox_full_screen": "Full screen",
"lightbox_thumbnails": "Thumbnails",
"lightbox_download": "Download",
"lightbox_share": "Share",
"lightbox_zoom": "Zoom",
"lightbox_new_window": "New window",
"lightbox_toggle_sidebar": "Toggle sidebar"
}
Document Skew Correction:
The API can be used to recognize automatically the location information of the document in the image and correct the angle of the document. For example, we can duplicate paper photos and letters and correct their locations, thereby reserving memories.
Here the bitmap of the frame type must be in the ARGB8888 format with a size not greater than 20 megapixels.
Code Implementation:
As mentioned above sample, we have to Initialize with the VisionBase static class. After that define detector instance and frame. Put the picture which need to be detected into the frame. Call the detect method and process the result.
Code:
private void setHiAi() {
/** Define class detector, the context of this project is the input parameter */
DocRefine docResolution = new DocRefine(this);
/** Get Bitmap image (Note: bitmap type must be ARGB8888, please pay attention to the necessary conversion)*/
/** Define frame class, and put the picture which need to be detected into the frame */
Frame frame = new Frame();
frame.setBitmap(bitmap);
/** Call the detect method to get the information of the result*/
/** Note: This line of code must be placed in the worker thread instead of the main thread */
JSONObject jsonDoc = docResolution.docDetect(frame, null);
/** Returns a jsonObject format, you can call convertResult() method to convert the json to java class and get the result*/
DocCoordinates sc = docResolution.convertResult(jsonDoc);
ImageResult sr = docResolution.docRefine(frame, sc, null);
/** Get bitmap result*/
Bitmap bmp = sr.getBitmap();
/** Note: The result and the Bitmap in the result must be NULL, but also to determine whether the returned error code is 0 (0 means no error)*/
this.bmp = bmp;
handler.sendEmptyMessage(Utils.HiAI_RESULT);
}
Screenshot:
Table Recognition:
The API can perfectly retrieve the information of tables as well as text from cells, besides, merged cells can also be recognized. It supports recognition of tables with clear and unbroken lines, but not supportive of tables with crooked lines or cells divided by color background. Currently the API supports recognition from printed materials and snapshots of slide meetings, but it is not functioning for the screenshots or photos of excel sheets and any other table editing software.
Here, the image resolution should be higher than 720p (1280×720 px), and the aspect ratio (length-to-width ratio) should be lower than 2:1.
Code Implementation:
As mentioned above sample, we have to Initialize with the VisionBase static class. After that define TableDetector instance and frame. Call the detect method and process the result.
Code:
<p style="line-height: 2.0em;">private void setHiAi() {
startTime = System.currentTimeMillis();
/** Define the detector instance and use the Context of this project as a parameter*/
TableDetector detector = new TableDetector(this);
/** Define a frame and initialize it*/
Frame frame = new Frame();
/** Place the bitmap to be checked in the frame*/
frame.setBitmap(bitmap);
/** Check the text and process the result. Note that this line of code must be placed in the working thread rather than the primary thread*/
JSONObject jsonObject = detector.detect(frame, null);
/** Run the convertResult() command to convert the JSON character string into a Java class. You can also parse the JSON character string yourself*/
Table table = detector.convertResult(jsonObject);
System.out.println("Json:"+jsonObject);
extractFromTable(table);
this.txtValue = txtValue;
handler.sendEmptyMessage(Utils.HiAI_RESULT);
}
</p>
Screenshot:
Conclusion:
In this article, we discussed about Document Skew Correction, General Text Recognition and Table Recognition. We can use these in so many real-time scenarios, as follows:
The Document Skew Correction API can be used to easily record works in exhibition regardless of the objective factors such as the angle and the stream of people.
General Text Recognition can provide accurate recognition in various scenarios, even with interferences such as complex light and background situations.
The operational scenarios of Table Recognition are various, so that the third-party apps can generate excel sheets according to the returned results, minimizing manual workload
Reference:
HiAI Guide
{
"lightbox_close": "Close",
"lightbox_next": "Next",
"lightbox_previous": "Previous",
"lightbox_error": "The requested content cannot be loaded. Please try again later.",
"lightbox_start_slideshow": "Start slideshow",
"lightbox_stop_slideshow": "Stop slideshow",
"lightbox_full_screen": "Full screen",
"lightbox_thumbnails": "Thumbnails",
"lightbox_download": "Download",
"lightbox_share": "Share",
"lightbox_zoom": "Zoom",
"lightbox_new_window": "New window",
"lightbox_toggle_sidebar": "Toggle sidebar"
}
Have you ever watched a video of the northern lights? Mesmerizing light rays that swirl and dance through the star-encrusted sky. It's even more stunning when they are backdropped by crystal-clear waters that flow smoothly between and under ice crusts. Complementing each other, the moving sky and water compose a dynamic scene that reflects the constant rhythm of the mother nature.
Now imagine that the video is frozen into an image: It still looks beautiful, but lacks the dynamism of the video. Such a contrast between still and moving images shows how videos are sometimes better than still images when it comes to capturing majestic scenery, since the former can convey more information and thus be more engaging.
This may be the reason why we sometimes regret just taking photos instead of capturing a video when we encounter beautiful scenery or a memorable moment.
In addition to this, when we try to add a static image to a short video, we will find that the transition between the image and other segments of the video appears very awkward, since the image is the only static segment in the whole video.
If we want to turn a static image into a dynamic video by adding some motion effects to the sky and water, one way to do this is to use a professional PC program to modify the image. However, this process is often very complicated and time-consuming: It requires adjustment of the timeline, frames, and much more, which can be a daunting prospect for amateur image editors.
Luckily, there are now numerous AI-driven capabilities that can automatically create time-lapse videos for users. I chose to use the auto-timelapse capability provided by HMS Core Video Editor Kit. It can automatically detect the sky and water in an image and produce vivid dynamic effects for them, just like this:
The movement speed and angle of the sky and water are customizable.
Now let's take a look at the detailed integration procedure for this capability, to better understand how such a dynamic effect is created.
Integration ProcedurePreparations1. Configure necessary app information. This step requires you to register a developer account, create an app, generate a signing certificate fingerprint, configure the fingerprint, and enable the required services.
2. Integrate the SDK of the kit.
3. Configure the obfuscation scripts.
4. Declare necessary permissions.
Project Configuration1. Set the app authentication information. This can be done via an API key or an access token.
Set an API key via the setApiKey method: You only need to set the app authentication information once during app initialization.
Code:
MediaApplication.getInstance().setApiKey("your ApiKey");
Or, set an access token by using the setAccessToken method: You only need to set the app authentication information once during app initialization.
Code:
MediaApplication.getInstance().setAccessToken("your access token");
2. Set a License ID. This ID should be unique because it is used to manage the usage quotas of the service.
Code:
MediaApplication.getInstance().setLicenseId("License ID");
3. Initialize the runtime environment for the HuaweiVideoEditor object. Remember to release the HuaweiVideoEditor object when exiting the project.
Create a HuaweiVideoEditor object.
Code:
HuaweiVideoEditor editor = HuaweiVideoEditor.create(getApplicationContext());
Specify the preview area position. Such an area is used to render video images, which is implemented by SurfaceView created within the SDK. Before creating such an area, specify its position in the app first.
Code:
<LinearLayout
android:id="@+id/video_content_layout"
android:layout_width="0dp"
android:layout_height="0dp"
android:background="@color/video_edit_main_bg_color"
android:gravity="center"
android:orientation="vertical" />
// Specify the preview area position.
LinearLayout mSdkPreviewContainer = view.findViewById(R.id.video_content_layout);
// Specify the preview area layout.
editor.setDisplay(mSdkPreviewContainer);
Initialize the runtime environment. If license verification fails, LicenseException will be thrown.
After it is created, the HuaweiVideoEditor object will not occupy any system resources. You need to manually set when the runtime environment of the object will be initialized. Once you have done this, necessary threads and timers will be created within the SDK.
Code:
try {
editor.initEnvironment();
} catch (LicenseException error) {
SmartLog.e(TAG, "initEnvironment failed: " + error.getErrorMsg());
finish();
return;
}
Function Development
Code:
// Initialize the auto-timelapse engine.
imageAsset.initTimeLapseEngine(new HVEAIInitialCallback() {
@Override
public void onProgress(int progress) {
// Callback when the initialization progress is received.
}
@Override
public void onSuccess() {
// Callback when the initialization is successful.
}
@Override
public void onError(int errorCode, String errorMessage) {
// Callback when the initialization failed.
}
});
// When the initialization is successful, check whether there is sky or water in the image.
int motionType = -1;
imageAsset.detectTimeLapse(new HVETimeLapseDetectCallback() {
@Override
public void onResult(int state) {
// Record the state parameter, which is used to define a motion effect.
motionType = state;
}
});
// skySpeed indicates the speed at which the sky moves; skyAngle indicates the direction to which the sky moves; waterSpeed indicates the speed at which the water moves; waterAngle indicates the direction to which the water moves.
HVETimeLapseEffectOptions options =
new HVETimeLapseEffectOptions.Builder().setMotionType(motionType)
.setSkySpeed(skySpeed)
.setSkyAngle(skyAngle)
.setWaterAngle(waterAngle)
.setWaterSpeed(waterSpeed)
.build();
// Add the auto-timelapse effect.
imageAsset.addTimeLapseEffect(options, new HVEAIProcessCallback() {
@Override
public void onProgress(int progress) {
}
@Override
public void onSuccess() {
}
@Override
public void onError(int errorCode, String errorMessage) {
}
});
// Stop applying the auto-timelapse effect.
imageAsset.interruptTimeLapse();
// Remove the auto-timelapse effect.
imageAsset.removeTimeLapseEffect();
Now, the auto-timelapse capability has been successfully integrated into an app.
ConclusionWhen capturing scenic vistas, videos, which can show the dynamic nature of the world around us, are often a better choice than static images. In addition, when creating videos with multiple shots, dynamic pictures deliver a smoother transition effect than static ones.
However, for users not familiar with the process of animating static images, if they try do so manually using computer software, they may find the results unsatisfying.
The good news is that there are now mobile apps integrated with capabilities such as Video Editor Kit's auto-timelapse feature that can create time-lapse effects for users. The generated effect appears authentic and natural, the capability is easy to use, and its integration is straightforward. With such capabilities in place, a video/image app can provide users with a more captivating user experience.
In addition to video/image editing apps, I believe the auto-timelapse capability can also be utilized by many other types of apps. What other kinds of apps do you think would benefit from such a feature? Let me know in the comments section.
{
"lightbox_close": "Close",
"lightbox_next": "Next",
"lightbox_previous": "Previous",
"lightbox_error": "The requested content cannot be loaded. Please try again later.",
"lightbox_start_slideshow": "Start slideshow",
"lightbox_stop_slideshow": "Stop slideshow",
"lightbox_full_screen": "Full screen",
"lightbox_thumbnails": "Thumbnails",
"lightbox_download": "Download",
"lightbox_share": "Share",
"lightbox_zoom": "Zoom",
"lightbox_new_window": "New window",
"lightbox_toggle_sidebar": "Toggle sidebar"
}
In photography, cutout is a function that is often used to editing images, such as removing the background. To achieve this function, a technique known as green screen is universally used, which is also called as chroma keying. This technique requires a green background to be added manually.
This, however, makes the green screen-dependent cutout a challenge to those new to video/image editing. The reason is that most images and videos do not come with a green background, and adding such a background is actually quite complex.
Luckily, a number of mobile apps on the market help with this, as they are able to automatically cut out the desired object for users to later edit. To create an app that is capable of doing this, I turned to the recently released object segmentation capability from HMS Core Video Editor Kit for help. This capability utilizes the AI algorithm, instead of the green screen, to intelligently separate an object from other parts of an image or a video, delivering an ideal segmentation result for removing the background and many other further editing operations.
This is what my demo has achieved with the help of the capability.
It is a perfect cutout, right? As you can see, the cut object comes with a smooth edge, without any unwanted parts appearing in the original video.
Before moving on to how I created this cutout tool with the help of object segmentation, let's see what lies behind the capability.
How It WorksThe object segmentation capability adopts an interactive way for cutting out objects. A user first taps or draws a line on the object to be cut out, and then the interactive segmentation algorithm checks the track of the user's tap and intelligently identifies their intent. The capability then selects and cuts out the object. The object segmentation capability performs interactive segmentation on the first video frame to obtain the mask of the object to be cut out. The model supporting the capability traverses frames following the first frame by using the mask obtained from the first frame and applying it to subsequent frames, and then matches the mask with the object in them before cutting the object out.
The model assigns frames with different weights, according to the segmentation accuracy of each frame. It then blends the weighted segmentation result of the intermediate frame with the mask obtained from the first frame, in order to segment the desired object from other frames. Consequently, the capability manages to cut out an object as wholly as possible, delivering a higher segmentation accuracy.
What makes the capability better is that it has no restrictions on object types. As long as an object is distinctive to the other parts of the image or video and is against a simple background, the capability can cleanly cut out this object.
Now, let's check out how the capability can be integrated.
Integration ProcedureMaking PreparationsThere are necessary steps to do before the next part. The steps include:
Configure app information in AppGallery Connect.
Integrate the SDK of HMS Core.
Configure obfuscation scripts.
Apply for necessary permissions.
Setting Up the Video Editing Project1. Configure the app authentication information. Available options include:
Call setAccessToken to set an access token, which is required only once during app startup.
Code:
MediaApplication.getInstance().setAccessToken("your access token");
Or, call setApiKey to set an API key, which is required only once during app startup.
Code:
MediaApplication.getInstance().setApiKey("your ApiKey");
2. Set a License ID.
Because this ID is used to manage the usage quotas of the mentioned service, the ID must be unique.
Code:
MediaApplication.getInstance().setLicenseId("License ID");
1) Initialize the runtime environment for HuaweiVideoEditor.
When creating a video editing project, we first need to create an instance of HuaweiVideoEditor and initialize its runtime environment. When you exit the project, the instance shall be released.
Create a HuaweiVideoEditor object.
Code:
HuaweiVideoEditor editor = HuaweiVideoEditor.create(getApplicationContext());
Determine the layout of the preview area.
Such an area renders video images, and this is implemented by SurfaceView within the fundamental capability SDK. Before the area is created, we need to specify its layout.
Code:
<LinearLayout
android:id="@+id/video_content_layout"
android:layout_width="0dp"
android:layout_height="0dp"
android:background="@color/video_edit_main_bg_color"
android:gravity="center"
android:orientation="vertical" />
// Specify a preview area.
LinearLayout mSdkPreviewContainer = view.findViewById(R.id.video_content_layout);
// Design the layout for the area.
editor.setDisplay(mSdkPreviewContainer);
Initialize the runtime environment. If the license verification fails, LicenseException will be thrown.
After the HuaweiVideoEditor instance is created, it will not use any system resources, and we need to manually set the initialization time for the runtime environment. Then, the fundamental capability SDK will internally create necessary threads and timers.
Code:
try {
editor.initEnvironment();
} catch (LicenseException error) {
SmartLog.e(TAG, "initEnvironment failed: " + error.getErrorMsg());
finish();
return;
}
Integrating Object Segmentation
Code:
// Initialize the engine of object segmentation.
videoAsset.initSegmentationEngine(new HVEAIInitialCallback() {
@Override
public void onProgress(int progress) {
// Initialization progress.
}
@Override
public void onSuccess() {
// Callback when the initialization is successful.
}
@Override
public void onError(int errorCode, String errorMessage) {
// Callback when the initialization failed.
}
});
// After the initialization is successful, segment a specified object and then return the segmentation result.
// bitmap: video frame containing the object to be segmented; timeStamp: timestamp of the video frame on the timeline; points: set of coordinates determined according to the video frame, and the upper left vertex of the video frame is the coordinate origin. It is recommended that the coordinate count be greater than or equal to two. All of the coordinates must be within the object to be segmented. The object is determined according to the track of coordinates.
int result = videoAsset.selectSegmentationObject(bitmap, timeStamp, points);
// After the handling is successful, apply the object segmentation effect.
videoAsset.addSegmentationEffect(new HVEAIProcessCallback() {
@Override
public void onProgress(int progress) {
// Progress of object segmentation.
}
@Override
public void onSuccess() {
// The object segmentation effect is successfully added.
}
@Override
public void onError(int errorCode, String errorMessage) {
// The object segmentation effect failed to be added.
}
});
// Stop applying the object segmentation effect.
videoAsset.interruptSegmentation();
// Remove the object segmentation effect.
videoAsset.removeSegmentationEffect();
// Release the engine of object segmentation.
videoAsset.releaseSegmentationEngine();
And this concludes the integration process. A cutout function ideal for an image/video editing app was just created.
I just came up with a bunch of fields where object segmentation can help, like live commerce, online education, e-conference, and more.
In live commerce, the capability helps replace the live stream background with product details, letting viewers conveniently learn about the product while watching a live stream.
In online education and e-conference, the capability lets users switch the video background with an icon, or an image of a classroom or meeting room. This makes online lessons and meetings feel more professional.
The capability is also ideal for video editing apps. Take my demo app for example. I used it to add myself to a vlog that my friend created, which made me feel like I was traveling with her.
I think the capability can also be used together with other video editing functions, to realize effects like copying an object, deleting an object, or even adjusting the timeline of an object. I'm sure you've also got some great ideas for using this capability. Let me know in the comments section.
ConclusionCutting out objects used to be a thing of people with editing experience, a process that requires the use of a green screen.
Luckily, things have changed thanks to the cutout function found in many mobile apps. It has become a basic function in mobile apps that support video/image editing and is essential for some advanced functions like background removal.
Object segmentation from Video Editor Kit is a straightforward way of implementing the cutout feature into your app. This capability leverages an elaborate AI algorithm and depends on the interactive segmentation method, delivering an ideal and highly accurate object cutout result.
{
"lightbox_close": "Close",
"lightbox_next": "Next",
"lightbox_previous": "Previous",
"lightbox_error": "The requested content cannot be loaded. Please try again later.",
"lightbox_start_slideshow": "Start slideshow",
"lightbox_stop_slideshow": "Stop slideshow",
"lightbox_full_screen": "Full screen",
"lightbox_thumbnails": "Thumbnails",
"lightbox_download": "Download",
"lightbox_share": "Share",
"lightbox_zoom": "Zoom",
"lightbox_new_window": "New window",
"lightbox_toggle_sidebar": "Toggle sidebar"
}
I dare say there are two types of people in this world: people who love Toy Story and people who have not watched it.
Well, this is just the opinion of a huge fan of the animation film. When I was a child, I always dreamed of having toys that could move and play with me, like my own Buzz Lightyear. Thanks to a fancy technique called rigging, I can now bring my toys to life, albeit I'm probably too old for them now.
What Is Rigging in 3D Animation and Why Do We Need It?Put simply, rigging is a process whereby a skeleton is created for a 3D model to make it move. In other words, rigging creates a set of connected virtual bones that are used to control a 3D model.
It paves the way for animation because it enables a model to be deformed, making it moveable, which is the very reason that rigging is necessary for 3D animation.
What Is Auto Rigging3D animation has been adopted by mobile apps in a number of fields (gaming, e-commerce, video, and more), to achieve more realistic animations than 2D.
However, this graphic technique has daunted many developers (like me) because rigging, one of its major prerequisites, is difficult and time-consuming for people who are unfamiliar with modeling. Specifically speaking, most high-performing rigging solutions have many requirements. An example of this is that the input model should be in a standard position, seven or eight key skeletal points should be added, as well as inverse kinematics which must be added to the bones, and more.
Luckily, there are solutions that can automatically complete rigging, such as the auto rigging solution from HMS Core 3D Modeling Kit.
This capability delivers a wholly automated rigging process, requiring just a biped humanoid model that is generated using images taken from a mobile phone camera. After the model is input, auto rigging uses AI algorithms for limb rigging and generates the model skeleton and skin weights (which determine the degree to which a bone can influence a part of the mesh). Then, the capability changes the orientation and position of the skeleton so that the model can perform a range of preset actions, like walking, running, and jumping. Besides, the rigged model can also be moved according to an action generated by using motion capture technology, or be imported into major 3D engines for animation.
Lower requirements do not compromise rigging accuracy. Auto rigging is built upon hundreds of thousands of 3D model rigging data records. Thanks to some fine-tuned data records, the capability delivers ideal algorithm accuracy and generalization.
I know that words alone are no proof, so check out the animated model I've created using the capability.
Movement is smooth, making the cute panda move almost like a real one. Now I'd like to show you how I created this model and how I integrated auto rigging into my demo app.
Integration ProcedurePreparationsBefore moving on to the real integration work, make necessary preparations, which include:
Configure app information in AppGallery Connect.
Integrate the HMS Core SDK with the app project, which includes Maven repository address configuration.
Configure obfuscation scripts.
Declare necessary permissions.
Capability Integration1. Set an access token or API key — which can be found in agconnect-services.json — during app initialization for app authentication.
Using the access token: Call setAccessToken to set an access token. This task is required only once during app initialization.
Code:
ReconstructApplication.getInstance().setAccessToken("your AccessToken");
Using the API key: Call setApiKey to set an API key. This key does not need to be set again.
Code:
ReconstructApplication.getInstance().setApiKey("your api_key");
The access token is recommended. And if you prefer the API key, it is assigned to the app when it is created in AppGallery Connect.
2. Create a 3D object reconstruction engine and initialize it. Then, create an auto rigging configurator.
Code:
// Create a 3D object reconstruction engine.
Modeling3dReconstructEngine modeling3dReconstructEngine = Modeling3dReconstructEngine.getInstance(context);
// Create an auto rigging configurator.
Modeling3dReconstructSetting setting = new Modeling3dReconstructSetting.Factory()
// Set the working mode of the engine to PICTURE.
.setReconstructMode(Modeling3dReconstructConstants.ReconstructMode.PICTURE)
// Set the task type to auto rigging.
.setTaskType(Modeling3dReconstructConstants.TaskType.AUTO_RIGGING)
.create();
3. Create a listener for the result of uploading images of an object.
Code:
private Modeling3dReconstructUploadListener uploadListener = new Modeling3dReconstructUploadListener() {
@Override
public void onUploadProgress(String taskId, double progress, Object ext) {
// Callback when the upload progress is received.
}
@Override
public void onResult(String taskId, Modeling3dReconstructUploadResult result, Object ext) {
// Callback when the upload is successful.
}
@Override
public void onError(String taskId, int errorCode, String message) {
// Callback when the upload failed.
}
};
4. Use a 3D object reconstruction configurator to initialize the task, set an upload listener for the engine created in step 1, and upload images.
Code:
// Use the configurator to initialize the task, which should be done in a sub-thread.
Modeling3dReconstructInitResult modeling3dReconstructInitResult = modeling3dReconstructEngine.initTask(setting);
String taskId = modeling3dReconstructInitResult.getTaskId();
// Set an upload listener.
modeling3dReconstructEngine.setReconstructUploadListener(uploadListener);
// Call the uploadFile API of the 3D object reconstruction engine to upload images.
modeling3dReconstructEngine.uploadFile(taskId, filePath);
5. Query the status of the auto rigging task.
Code:
// Initialize the task processing class.
Modeling3dReconstructTaskUtils modeling3dReconstructTaskUtils = Modeling3dReconstructTaskUtils.getInstance(context);
// Call queryTask in a sub-thread to query the task status.
Modeling3dReconstructQueryResult queryResult = modeling3dReconstructTaskUtils.queryTask(taskId);
// Obtain the task status.
int status = queryResult.getStatus();
6. Create a listener for the result of model file download.
Code:
private Modeling3dReconstructDownloadListener modeling3dReconstructDownloadListener = new Modeling3dReconstructDownloadListener() {
@Override
public void onDownloadProgress(String taskId, double progress, Object ext) {
// Callback when download progress is received.
}
@Override
public void onResult(String taskId, Modeling3dReconstructDownloadResult result, Object ext) {
// Callback when download is successful.
}
@Override
public void onError(String taskId, int errorCode, String message) {
// Callback when download failed.
}
};
7. Pass the download listener to the 3D object reconstruction engine, to download the rigged model.
Code:
// Set download configurations.
Modeling3dReconstructDownloadConfig downloadConfig = new Modeling3dReconstructDownloadConfig.Factory()
// Set the model file format to OBJ or glTF.
.setModelFormat(Modeling3dReconstructConstants.ModelFormat.OBJ)
// Set the texture map mode to normal mode or PBR mode.
.setTextureMode(Modeling3dReconstructConstants.TextureMode.PBR)
.create();
// Set the download listener.
modeling3dReconstructEngine.setReconstructDownloadListener(modeling3dReconstructDownloadListener);
// Call downloadModelWithConfig, passing the task ID, path to which the downloaded file will be saved, and download configurations, to download the rigged model.
modeling3dReconstructEngine.downloadModelWithConfig(taskId, fileSavePath, downloadConfig);
Where to UseAuto rigging is used in many scenarios, for example:
Gaming. The most direct way of using auto rigging is to create moveable characters in a 3D game. Or, I think we can combine it with AR to create animated models that can appear in the camera display of a mobile device, which will interact with users.
Online education. We can use auto rigging to animate 3D models of popular toys, and liven them up with dance moves, voice-overs, and nursery rhymes to create educational videos. These models can be used in educational videos to appeal to kids more.
E-commerce. Anime figurines look rather plain compared to how they behave in animes. To spice up the figurines, we can use auto rigging to animate 3D models that will look more engaging and dynamic.
Conclusion3D animation is widely used in mobile apps, because it presents objects in a more fun and interactive way.
A key technique for creating great 3D animations is rigging. Conventional rigging requires modeling know-how and expertise, which puts off many amateur modelers.
Auto rigging is the perfect solution to this challenge because its full-automated rigging process can produce highly accurate rigged models that can be easily animated using major engines available on the market. Auto rigging not only lowers the costs and requirements of 3D model generation and animation, but also helps 3D models look more appealing.