Augmented reality (AR) bridges the real and virtual worlds by integrating digital content into real-world environments. It allows people to interact with virtual objects as if they were real. Examples include product displays in shopping apps, interior design layouts in home design apps, accessible learning materials, real-time navigation, and immersive AR games. AR technology makes digital services and experiences more accessible than ever.
This has enormous implications in daily life. For instance, when shooting short videos or selfies, users can switch between special effects or control the shutter button with specific gestures, which spares them from having to touch the screen. When browsing clothes or accessories on an e-commerce website, users can use AR to "wear" the items virtually and determine which clothing articles fit them or which accessories match which outfits. All of these services depend on precise hand gesture recognition, which HMS Core AR Engine provides via its hand skeleton tracking capability. If you are considering developing an app providing AR features, you would be remiss not to check out this capability, as it can streamline your app development process substantially.
The hand skeleton tracking capability works by detecting and tracking the positions and postures of up to 21 hand skeleton joints, and generating true-to-life hand skeleton models with attributes like fingertip endpoints and palm orientation, as well as the hand skeleton itself. Please note that when there is more than one hand in an image, the service will only send back results and coordinates for the hand in which it has the highest confidence. Currently, this service is only supported on certain Huawei phone models that are capable of obtaining image depth information.
AR Engine detects the hand skeleton in a precise manner, allowing your app to superimpose virtual objects on the hand with a high degree of accuracy, including on the fingertips or palm. You can also perform a greater number of precise operations on virtual hands, to enrich your AR app with fun new experiences and interactions.
Hand skeleton diagram
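Before moving on to specific scenarios, here is a minimal sketch of configuring hand tracking with AR Engine's Java SDK. It assumes the com.huawei.hiar classes and methods such as ARHandTrackingConfig, getHandskeletonArray(), and getGestureType() behave as described in the official documentation; verify the exact signatures against the current SDK before relying on them.

```java
import android.content.Context;

import com.huawei.hiar.ARConfigBase;
import com.huawei.hiar.ARHand;
import com.huawei.hiar.ARHandTrackingConfig;
import com.huawei.hiar.ARSession;
import com.huawei.hiar.ARTrackable;

public class HandTrackingSketch {

    private ARSession session;

    // Configure an AR Engine session for hand skeleton tracking.
    public void startHandTracking(Context context) throws Exception {
        session = new ARSession(context);
        ARHandTrackingConfig config = new ARHandTrackingConfig(session);
        // The front camera suits selfie-style gesture control.
        config.setCameraLensFacing(ARConfigBase.CameraLensFacing.FRONT);
        session.configure(config);
        session.resume();
    }

    // Call once per rendered frame to read the tracked hand skeleton.
    public void onDrawFrame() throws Exception {
        session.update(); // refresh tracking data for this frame
        for (ARHand hand : session.getAllTrackables(ARHand.class)) {
            if (hand.getTrackingState() != ARTrackable.TrackingState.TRACKING) {
                continue;
            }
            // Flat array of joint coordinates (x, y, z per joint, up to 21 joints).
            float[] skeleton = hand.getHandskeletonArray();
            // Integer ID of the static gesture recognized for this hand, if any.
            int gesture = hand.getGestureType();
            // Anchor virtual objects to fingertip or palm joints here.
        }
    }
}
```

The joint array returned here is the raw material for the gesture mapping and contactless interaction ideas discussed in the following sections.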
Simple Sign Language Translation
The hand skeleton tracking capability can also be used to translate simple gestures in sign languages. By detecting key hand skeleton joints, it predicts how the hand posture will change, and maps movements like finger bending to a set of predefined gestures, based on a set of algorithms. For example, holding up the hand in a fist with the index finger sticking out is mapped to the gesture for the number one (1). This means that the kit can help equip your app with sign language recognition and translation features.
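To illustrate the idea of mapping joints to a gesture, below is a small, purely illustrative helper that checks whether a skeleton matches the "number one" posture described above. The joint index layout it uses (wrist at index 0, index fingertip at 8, and so on) is an assumption for this sketch, not the official layout; consult the hand skeleton diagram for the real joint order.

```java
/**
 * Illustrative only: maps a hand skeleton (flat x, y, z array) to a simple
 * "number one" gesture. The joint indices are placeholder assumptions.
 */
public final class SimpleGestureMapper {

    // A finger is treated as extended when its tip is noticeably farther
    // from the wrist than its middle joint.
    private static boolean isFingerExtended(float[] joints, int wrist, int midJoint, int tip) {
        return distance(joints, wrist, tip) > distance(joints, wrist, midJoint) * 1.2f;
    }

    private static float distance(float[] joints, int a, int b) {
        float dx = joints[3 * a] - joints[3 * b];
        float dy = joints[3 * a + 1] - joints[3 * b + 1];
        float dz = joints[3 * a + 2] - joints[3 * b + 2];
        return (float) Math.sqrt(dx * dx + dy * dy + dz * dz);
    }

    // "Number one": index finger extended, middle/ring/little fingers curled.
    public static boolean isNumberOne(float[] joints) {
        return isFingerExtended(joints, 0, 6, 8)        // index
                && !isFingerExtended(joints, 0, 10, 12) // middle
                && !isFingerExtended(joints, 0, 14, 16) // ring
                && !isFingerExtended(joints, 0, 18, 20); // little
    }
}
```

A production sign language feature would of course need many more gestures and temporal smoothing across frames; this only shows the basic geometric test.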
Building a Contactless Operation Interface
In science fiction movies, it is quite common to see a character controlling a computer panel with air gestures. With the skeleton tracking capability in AR Engine, this mind-bending technology is no longer out of reach.
With the phone's camera tracking the user's hand in real time, key skeleton joints like the fingertips are identified with a high degree of precision, allowing the user to interact with virtual objects using simple gestures. For example, pressing down on a virtual button can trigger an action, pressing and holding a virtual object can display menu options, spreading two fingers apart over a small object can enlarge it to show its details, and pinching a virtual object can shrink it and place it in a virtual pocket.
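As a sketch of the virtual button case mentioned above, the snippet below treats the fingertip's projected screen position as a pointer and registers a press after a short dwell time. How the 3D fingertip joint gets projected into screen coordinates depends on your rendering pipeline and is assumed to happen elsewhere.

```java
import android.graphics.RectF;

/**
 * Illustrative sketch: a "press" fires when the fingertip dwells inside a
 * virtual button's screen-space bounds for a short time.
 */
public class VirtualButton {

    private final RectF bounds;       // button area in screen pixels
    private long insideSinceMs = -1;  // when the fingertip entered the bounds
    private static final long PRESS_DWELL_MS = 400;

    public VirtualButton(RectF bounds) {
        this.bounds = bounds;
    }

    // Call every frame with the fingertip's screen position; returns true once
    // the fingertip has stayed over the button long enough to count as a press.
    public boolean update(float fingertipX, float fingertipY, long nowMs) {
        if (!bounds.contains(fingertipX, fingertipY)) {
            insideSinceMs = -1;
            return false;
        }
        if (insideSinceMs < 0) {
            insideSinceMs = nowMs;
        }
        return nowMs - insideSinceMs >= PRESS_DWELL_MS;
    }
}
```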
Such contactless gesture-based controls have been widely used in fields as diverse as medical equipment and vehicle head units.
Interactive Short Videos & Live Streaming
The hand skeleton tracking capability in AR Engine can help with adding gesture-based special effects to short videos or live streams. For example, when the user is shooting a short video or starting a live stream, the capability will enable your app to identify their gestures, such as a V-sign, thumbs-up, or finger heart, and then apply the corresponding special effects or stickers to the short video or live stream. This makes the interactions more engaging and immersive, and makes your app more appealing to users than competitor apps.
Hand skeleton tracking is also ideal in contexts like animation, course material presentation, medical training and imaging, and smart home controls.
The rapid development of AR technologies has made human-computer interactions based on gestures a hot topic throughout the industry. Implementing natural and human-friendly gesture recognition solutions is key to making these interactions more engaging. Hand skeleton tracking is the foundation for gesture recognition. By integrating AR Engine, you will be able to use this tracking capability to develop AR apps that provide users with more interesting and effortless features. Apps that offer such outstanding AR features will undoubtedly provide an enhanced user experience that helps them stand out from the myriad of competitor apps.
Conclusion
Augmented reality is one of the most exciting technological developments of the past few years, and a proven method for presenting a variety of digital content, including text, graphics, and videos, in a visually immersive manner. An increasing number of apps are now opting to provide AR-based features of their own, in order to deliver an interactive and easy-to-use experience in fields as diverse as medical training, interior design and modeling, real-time navigation, virtual classrooms, health care, and entertainment. Hand gesture recognition is at the core of this trend. If you are currently developing an app, the right development kit, which offers all the preset capabilities that you need, is key to reducing the development workload and building the features that you want. It also lets you focus on optimizing the app's feature design and the user experience. AR Engine offers an effective and easy-to-use hand gesture tracking capability for AR apps. By integrating this kit, your app will be able to identify user hand gestures in real time with high precision, implement responsive user-device interactions based on these detected gestures, and therefore provide users with a highly immersive and engaging AR experience.
Huawei's kits ensure Huawei smartphone users can unlock enhanced functionality and enjoy new benefits
A fast-growing and innovative photo editing tool is now available to users outside of China through AppGallery.
The innovative photo editing tool Cut Cut gives everyday users access to professional retouching and editing software, enabling them to easily create crisper, cleaner, and prettier images from their Huawei smartphone. It is quickly becoming one of the most popular and highest-rated photography apps in the world, amassing almost 100 million global users in less than 18 months.
Cut Cut integrates some of Huawei's leading technology to give users a unique and seamless photo editing experience. As well as allowing users to remove or edit backgrounds, add filters and stickers to images, access a massive material library, and create new collages and artworks, it also includes features made possible by Huawei's chip, device, and cloud capabilities. For example, Huawei's Machine Learning Kit reduces on-device processing time and improves image segmentation precision even if the user is offline. The image segmentation service also unlocks other enhanced functionality features such as image area optimization, portrait colouring, and sky filter effects - Cut Cut can even identify pixels of different kinds of elements more accurately to enhance images of things like the sky or grass.
The app uses HiAI, Huawei's latest AI technology. This ground-breaking platform provides capabilities at the chip, device, and cloud level, allowing Cut Cut to automatically detect people in photos and intelligently cut out backgrounds and differentiate objects within images.
Cut Cut also integrates a range of other HMS Core and HMS Capabilities kits to enable a smoother, easier, and more tailored user experience. For example, the Analytics Kit feeds developers useful insights on user behaviour, helping them better understand user preferences and constantly improve the application. Other kits integrated into Cut Cut that deliver user advantages include the Account Kit, which allows users to easily sign in using their Huawei ID; the Push Kit, which notifies users of important promotions or messages; and the Ads Kit, which helps ensure ads are high quality, personalized, and therefore less intrusive. Users also enjoy a range of other benefits, such as convenient in-app purchases and easy sharing to social media and other platforms.
Furthermore, Cut Cut developer APUS has given a number of Huawei smartphone users a six-month VIP gift package as an early-bird incentive for downloading the app from AppGallery. The package unlocks all paid materials, giving members free access to more than 20,000 resources such as backgrounds, stickers, and filters.
Cut Cut, along with thousands of other quality apps across 18 categories, is available on Huawei's open, innovative, and secure app distribution platform, AppGallery. One of the top three app marketplaces globally, AppGallery connects 400 million monthly active users in more than 170 countries and regions to Huawei's smart and innovative ecosystem.
For more information about AppGallery, please visit consumer.huawei.com/en/mobileservices/appgallery/
Interested in knowing more about HMS kits and capabilities? Please visit: developer.huawei.com/consumer/en/hms
1. Introduction to Virtual Human
Virtual Human is a service that utilizes cutting-edge AI technologies, including image vision, emotion generation, voice cloning, and semantic understanding, and has a wide range of applications, spanning news broadcasting, financial customer service, and virtual gaming.
Application scenarios:
2. ML Kit's Virtual Human Service
ML Kit's Virtual Human service is backed by core Huawei AI technologies, such as image processing, text to speech, voice cloning, and semantic understanding, and provides innovative, cost-effective authoring modes for education, news, and multimedia production enterprises. Virtual Human service features a number of crucial advantages over other similar services, including the following:
Ultra-HD 4K cinematic effects
Supports large-screen displays. The details and textures of the entire body are rendered in the same definition.
Generates images that fit seamlessly with the real background, and achieve trackless fusion under HD resolution.
Generates detailed lip features, distinct lipstick reflection, and lifelike textures.
Produces clear and visible teeth, and true-to-life textures.
Hyper-real synthesis effects
True restoration of teeth (no painting involved), lips, and even lipstick reflections.
True restoration of facial features such as illumination, contrasts, shadows, and dimples.
Seamless connections between the generated texture for the mouth and the real texture.
Intricate animation effects that outperform those for 3D live anchors.
These effects compare favorably with the services provided by other enterprises.
3. ML Kit's Virtual Human Video Display
As shown below, Virtual Human generates ultra-HD video effects, delivers clearer enunciation, and handles key details better, such as lip features, lipstick reflections, actual pronunciation, and illumination.
4. Integrating ML Kit's Virtual Human Service
4.1 Integration Process
4.1.1 Submitting the Text of the Video for Generation
Call the customized API for converting text into the virtual human video, and pass the required configurations (specified by the config parameter) and text (specified by the data parameter) to the backend for processing through the API. First, check the length of the passed text: the maximum length is 1,000 characters for Chinese text and 3,000 characters for English text. Perform a non-null check on the passed configurations, then submit the text and configurations so that the text can be converted into audio.
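Since this article does not show the exact request schema, the following is only a hedged sketch of the submission step, using plain HttpURLConnection and a JSON body built from the config and data parameters described above. The field layout, error handling, and text escaping are placeholders; follow the official API reference for the real format.

```java
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of submitting text for virtual human video generation. */
public class VirtualHumanSubmitter {

    private static final int MAX_CHINESE_CHARS = 1000;
    private static final int MAX_ENGLISH_CHARS = 3000;

    // Validates the text length and configuration, then posts them to the submit endpoint.
    // The JSON structure below is illustrative; a real client should use a JSON
    // library and escape the text properly.
    public static void submit(String endpoint, String configJson, String text,
                              boolean isChinese) throws Exception {
        int limit = isChinese ? MAX_CHINESE_CHARS : MAX_ENGLISH_CHARS;
        if (text == null || text.isEmpty() || text.length() > limit) {
            throw new IllegalArgumentException("Text is empty or exceeds " + limit + " characters");
        }
        if (configJson == null) {
            throw new IllegalArgumentException("config must not be null");
        }
        String body = "{\"config\":" + configJson + ",\"data\":\"" + text + "\"}";

        HttpURLConnection conn = (HttpURLConnection) new URL(endpoint).openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body.getBytes(StandardCharsets.UTF_8));
        }
        System.out.println("Submit returned HTTP " + conn.getResponseCode());
        conn.disconnect();
    }
}
```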
4.1.2 Using the Data Provided by an Asynchronous Scheduled Task
Based on the data provided by the asynchronous scheduled task, generate the video using the text-to-speech algorithm, and synthesize the video with the previously obtained audio.
4.1.3 Checking Whether the Text Has Been Successfully Converted
Call the API for querying the results of converting text into the virtual human video, to check whether the text has been successfully converted. If the execution is complete, the video link will be returned.
4.1.4 Accessing the Videos via the Link
Access the generated video through the link returned by the API for querying the results of converting text into the virtual human video.
4.2 Main APIs for Integration
4.2.1 Customized API for Converting Text into the Virtual Human Video
URL: http://10.33.219.58:8888/v1/vup/text2vedio/submit
Request parameters
Main functions:
Call the customized API for converting text into the virtual human video. The API is asynchronous. Currently, Virtual Human can only complete the conversion in offline mode, a process that takes some time. The conversion results can be queried via the API for querying the results of converting text into the virtual human video. If the submitted text has already been synthesized, the video is returned and can be played directly.
Main logic:
Convert the text into audio based on the text and configurations to be synthesized, which are passed by the frontend. Execute multithreading asynchronously, generate a video that meets the pronunciation requirements based on the text-to-speech algorithm, and then combine the video with the audio to generate the virtual human video. If the submitted text has already been synthesized, the video is returned and can be played directly.
4.2.2 API for Querying the Results of Converting Text into the Virtual Human Video
URL: http://10.33.219.58:8888/v1/vup/text2vedio/query
Request parameters
Main functions:
Query the conversion status in batches, based on the submitted text IDs.
Main logic:
Query the synthesis status of the video through textIds (the ID list of the synthesized text passed by the frontend), save the obtained status results to a set as the output parameter, and insert the parameter into the returned response. If the requested text has already been synthesized, the video is returned and can be played directly.
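Because the conversion is asynchronous, a client typically polls this query API until the video link appears. The sketch below assumes a JSON body carrying the textIds list and a response that eventually contains a video link field; the field names used here (such as videoUrl) are placeholders, not the documented schema.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.List;

/** Hypothetical sketch of polling the query API until the video link is ready. */
public class VirtualHumanPoller {

    // Posts the list of text IDs and returns the raw JSON response as a string.
    public static String queryStatus(String endpoint, List<String> textIds) throws Exception {
        StringBuilder ids = new StringBuilder();
        for (String id : textIds) {
            if (ids.length() > 0) {
                ids.append(',');
            }
            ids.append('"').append(id).append('"');
        }
        String body = "{\"textIds\":[" + ids + "]}";

        HttpURLConnection conn = (HttpURLConnection) new URL(endpoint).openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body.getBytes(StandardCharsets.UTF_8));
        }

        StringBuilder response = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                response.append(line);
            }
        }
        conn.disconnect();
        return response.toString();
    }

    // Polls until the response appears to contain a video link or the attempt limit is hit.
    public static String waitForVideo(String endpoint, List<String> textIds) throws Exception {
        for (int attempt = 0; attempt < 30; attempt++) {
            String json = queryStatus(endpoint, textIds);
            if (json.contains("videoUrl")) { // placeholder field name
                return json;
            }
            Thread.sleep(5000); // conversion runs offline and takes time
        }
        return null;
    }
}
```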
4.2.3 API for Taking the Virtual Human Video Offline in Batches
URL: http://10.33.219.58:8888/v1/vup/text2vedio/offline
Request parameters
Main functions:
Bring the video offline in batches, based on the submitted text ID.
Main logic:
Change the status of the videos corresponding to the IDs in the array to offline through textIds (the ID array of the synthesized text transmitted by the frontend), and then delete the videos. Once taken offline, a video can no longer be played.
4.3 Main Functions of ML Kit's Virtual Human
ML Kit's Virtual Human service has a myriad of powerful functions.
1. Dual language support: Virtual Human currently supports Chinese and English, and thus text in either Chinese or English can be used as audio data.
2. Multiple virtual anchors: The service supports up to four virtual anchors, one Chinese female voice, one English female voice, and two English male voices.
3. Picture-in-picture video: Picture-in-picture playback, in essence small-window video playback, is supported as well. When a video is played in picture-in-picture mode, the video window moves in accordance with the rest of the screen. Users can view the text while the video plays, and can drag the video window to any location on the screen for easier reading.
4. Adjustable speech speed, volume, and tone: The speech speed, volume, and tone can be adjusted at will, to meet a wide range of user needs.
5. Multi-background settings: The service allows you to choose from diverse backgrounds for virtual anchors. There are currently three built-in backgrounds provided: transparent, green-screen, and technological. You can also upload an image to apply a customized background.
6. Subtitles: Virtual Human is capable of automatically generating Chinese, English, and bilingual subtitles.
7. Multi-layout settings: You can change the position of the virtual anchors on the screen (left, right, or middle of the screen) by setting parameters. You can also determine the size of the virtual anchors and choose to place either their upper body or entire body in view. In addition, you are free to set a channel logo, its position on the screen, as well as the video to be played. This ensures that the picture-in-picture effect achieves a bona fide news broadcast experience.
Picture-in-picture effect:
5. Final Thoughts
As a developer, after using ML Kit's Virtual Human service to generate a video, I was shocked at its capabilities, especially the picture-in-picture capability, which helped me generate real news broadcast effects. It has got me wondering whether virtual humans will soon replace real anchors.
To learn more, please visit the official website:
Reference
Official website of Huawei Developers
Development Guide
HMS Core official community on Reddit
Demo and sample code
Discussions on Stack Overflow
Augmented reality (AR) has been widely deployed in many fields, such as marketing, education, and gaming, as well as in exhibition halls. 2D image and 3D object tracking technologies allow users to add AR effects to photos or videos taken with their phones, like a 2D poster or card, or a 3D cultural relic or garage kit. More and more apps are using AR technologies to provide innovative and fun features. But standing out from the pack requires putting more resources into app development, which is time-consuming and entails a huge workload.
HMS Core AR Engine makes development easier than ever. With 2D image and 3D object tracking based on device-cloud synergy, you will be able to develop apps that deliver a premium experience.
2D Image Tracking
Real-time 2D image tracking technology is largely employed by online shopping platforms for product demonstration, where shoppers interact with the AR effects to view products from different angles. According to the backend statistics of one platform, products with AR special effects sell in far greater volumes than other products, and AR-based activities involve twice as much interaction as common activities. This is one example of how platforms can deploy AR technologies to make a profit.
With traditional device-side 2D image tracking solutions, applying AR effects to more images in an app requires releasing a new app version, which can be costly. In addition, increasing the number of images inflates the app size. That's why AR Engine adopts device-cloud synergy, which allows you to easily apply AR effects to new images by simply uploading them to the cloud, without updating your app or taking up extra space.
2D image tracking with device-cloud synergy
This technology consists of the following modules:
Cloud-side image feature extraction
Cloud-side vector retrieval engine
Device-side visual tracking
To keep the round trip to and from the cloud fast, AR Engine runs a high-performance vector retrieval engine that leverages the platform's hardware acceleration capability, ensuring millisecond-level retrieval from massive volumes of feature data.
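For context, here is a hedged sketch of device-side 2D image tracking with AR Engine's Java SDK, using a local reference image. Class and method names such as ARImageTrackingConfig, ARAugmentedImageDatabase, and getCenterPose() reflect my understanding of the SDK and should be checked against the current documentation; the device-cloud synergy variant, where images are uploaded to the cloud instead of being bundled with the app, is configured differently.

```java
import android.content.Context;
import android.graphics.Bitmap;

import com.huawei.hiar.ARAugmentedImage;
import com.huawei.hiar.ARAugmentedImageDatabase;
import com.huawei.hiar.ARImageTrackingConfig;
import com.huawei.hiar.ARSession;
import com.huawei.hiar.ARTrackable;

public class ImageTrackingSketch {

    private ARSession session;

    // Sets up a session that tracks a single reference image supplied as a bitmap.
    public void startImageTracking(Context context, Bitmap posterBitmap) throws Exception {
        session = new ARSession(context);
        ARImageTrackingConfig config = new ARImageTrackingConfig(session);
        ARAugmentedImageDatabase db = new ARAugmentedImageDatabase(session);
        db.addImage("poster", posterBitmap); // the name is arbitrary
        config.setAugmentedImageDatabase(db);
        session.configure(config);
        session.resume();
    }

    // Per frame: check which reference images are currently tracked.
    public void onDrawFrame() throws Exception {
        session.update();
        for (ARAugmentedImage image : session.getAllTrackables(ARAugmentedImage.class)) {
            if (image.getTrackingState() == ARTrackable.TrackingState.TRACKING) {
                // Anchor AR content relative to the tracked image's pose here,
                // e.g. via image.getCenterPose().
            }
        }
    }
}
```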
3D Object Tracking
AR Engine also allows real-time tracking of 3D objects like cultural relics and products. It presents 3D objects as holograms to supercharge images.
3D objects come in various textures and materials, such as a textureless sculpture, or metal utensils that reflect light and appear shiny. In addition, as the light changes, 3D objects can cast shadows. These conditions pose a great challenge to 3D object tracking. AR Engine implements quick, accurate object recognition and tracking with multiple deep neural networks (DNNs) in three major steps: object detection, coarse positioning of object poses, and pose optimization.
3D object tracking with device-cloud synergy
This technology consists of the following modules:
Cloud-side AI-based generation of training samples
Cloud-side automatic training of DNNs
Cloud-side DNN inference
Device-side visual tracking
Training DNNs with manually labeled data is labor- and time-consuming. Drawing on massive offline data and generative adversarial networks (GANs), AR Engine uses an AI-based algorithm for generating training samples, so that it can accurately identify 3D objects in complex scenarios without manual labeling.
Currently, Huawei Cyberverse uses the 3D object tracking capability of AR Engine to create an immersive tour guide for Mogao Caves, to reveal never-before-seen details about the caves to tourists.
These premium technologies were built and released by the Central Media Technology Institute, 2012 Labs. They are open for you to use to bring users a differentiated AR experience.
Learn more about AR Engine at HMS Core AR Engine.
New Features
Analytics Kit
Released the function of saving churned users as an audience in the retention analysis function. This function enables multi-dimensional examination of churned users, helping you take targeted measures to win back such users.
Changed Audience analysis to Audience insight, which has two submenus: User grouping and User profiling. User grouping allows for segmenting users into different audiences according to different dimensions, and User profiling provides audience features like profiles and attributes to facilitate in-depth user analysis.
Added the Page access in each time segment report to Page analysis. The report compares the numbers of access times and users in different time segments. Such vital information gives you access to your users' product usage preferences and thus helps you seize operations opportunities.
Learn more>>
3D Modeling Kit
Debuted the auto rigging capability. Auto rigging can load a preset motion to a 3D model of a biped humanoid, by using the skeleton points on the model. In this way, the capability automatically rigs and animates such a biped humanoid model, lowering the threshold of 3D animation creation and making 3D models appear more interesting.
Added the AR-based real-time guide mode. This mode accurately locates an object, provides real-time image collection guide, and detects key frames. Offering a series of steps for modeling, the mode delivers a fresh, interactive modeling experience.
Learn more>>
Video Editor Kit
Offered the auto-smile capability in the fundamental capability SDK. This capability detects faces in the input image and then lightens up the faces with a smile (closed-mouth or open-mouth).
Supplemented the fundamental capability SDK with the object segmentation capability. This AI algorithm-dependent capability separates the selected object from a video, to facilitate operations like background removal and replacement.
Learn more>>
ML Kit
Released the interactive biometric verification service. It captures faces in real time and determines whether a face is of a real person or a face attack (like a face recapture image, face recapture video, or a face mask), by checking whether the specified actions are detected on the face. This service delivers a high level of security, making it ideal in face recognition-based payment scenarios.
Improved the on-device translation service by supporting 12 more languages, including Croatian, Macedonian, and Urdu. Note that the following languages are not yet supported by on-device language detection: Maltese, Bosnian, Icelandic, and Georgian.
Learn more>>
Audio Editor Kit
Added the on-cloud REST APIs for the AI dubbing capability, which makes the capability accessible on more types of devices.
Added the asynchronous API for the audio source separation capability. On top of this, a query API was added for maintaining an audio source separation task via its taskId. This solves the issue where, because an audio source separation task can take a long time to complete, a user could not find their previous task after exiting and re-opening the app.
Enriched on-device audio source separation with the following newly supported sound types: accompaniment, bass sound, stringed instrument sound, brass stringed instrument sound, drum sound, accompaniment with the backing vocal voice, and lead vocalist voice.
Learn more>>
Health Kit
Added two activity record data types: apnea training and apnea testing in diving, and supported the free diving record data type on the cloud-side service, giving access to the records of more activity types.
Added the sampling data type of the maximum oxygen uptake to the device-side service. Each data record indicates the maximum oxygen uptake in a period. This sampling data type can be used as an indicator of aerobic capacity.
Added the open atomic sampling statistical data type of location to the cloud-side service. This type of data records the GPS location of a user at a certain time point, which is ideal for recording data of an outdoor sport like mountain hiking and running.
Opened the activity record segment statistical data type on the cloud-side service. Activity records now can be collected by time segment, to better satisfy requirements on analysis of activity records.
Added the subscription of scenario-based events and supported the subscription of total step goal events. These fresh features help users set their running/walking goals and receive push messages notifying them of their goals.
Learn more>>
Video Kit
Released the HDR Vivid SDK that provides video processing features like opto-electronic transfer function (OETF), tone mapping, and HDR2SDR. This SDK helps you immerse your users with high-definition videos that get rid of overexposure and have clear details even in dark parts of video frames.
Added the capability for killing the WisePlayer process. This capability frees resources occupied by WisePlayer after the video playback ends, to prevent WisePlayer from occupying resources for too long.
Added the capability to obtain a list of video source thumbnails that covers each frame of the video source, so that a thumbnail is shown for the corresponding time point when a user slowly drags the video progress bar, improving the video watching experience.
Added the capability to accurately play video via dragging on the progress bar. This capability can locate the time point on the progress bar, to avoid the inaccurate location issue caused by using the key frame for playback location.
Learn more>>
Scene Kit
Added the 3D fluid simulation component. This component allows you to set the boundaries and volume of fluid (VOF), to create interactive liquid sloshing.
Introduced the dynamic diffuse global illumination (DDGI) plugin. This plugin can create diffuse global illumination in real time when the object position or light source in the scene changes. In this way, the plugin delivers a more natural-looking rendering effect.
Learn more>>
New Resources
Map Kit
For the hms-mapkit-demo sample code: Added the MapsInitializer.initialize API, which initializes the Map SDK before it is used (see the short snippet after the Map Kit items below).
Added the public layer (precipitation map) in the enhanced SDK.
Go to GitHub>>
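For reference, here is a minimal sketch of the MapsInitializer.initialize call mentioned above, assuming it is invoked in an activity's onCreate before any map objects are created; see the hms-mapkit-demo sample for the authoritative usage.

```java
import android.os.Bundle;

import androidx.appcompat.app.AppCompatActivity;

import com.huawei.hms.maps.MapsInitializer;

// Minimal sketch: initialize the Map SDK before any map objects are created.
public class MapDemoActivity extends AppCompatActivity {

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        // Initialize the Map SDK with the application context before using it.
        MapsInitializer.initialize(getApplicationContext());
        // ... set the content view and obtain a HuaweiMap as usual ...
    }
}
```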
Site Kit
For the hms-sitekit-demo sample code: Updated the Gson version to 2.9.0 and optimized the internal memory.
Go to GitHub >>
Game Service
For the hms-game-demo sample code: Added the configuration of removing the dependency installation boost of HMS Core (APK), and supported HUAWEI Vision.
Go to GitHub >>
Made necessary updates to other kits. Learn more >>
What do you usually do if you like a particular cartoon character? Buy a figurine of it?
That's what most people would do. Unfortunately, however, a figurine is just for decoration. So I tried to find a way of sending these figurines back to the virtual world. In short, I created a virtual but movable 3D model of a figurine.
This is done with auto rigging, a new capability of HMS Core 3D Modeling Kit. It can animate a biped humanoid model that can even interact with users.
Check out what I've created using the capability.
What a cutie.
The auto rigging capability is ideal for many types of apps when used together with other capabilities. Take those from HMS Core as an example:
Audio-visual editing capabilities from Audio Editor Kit and Video Editor Kit. We can use auto rigging to animate 3D models of popular stuffed toys, livening them up with dances, voice-overs, and nursery rhymes to create educational videos for kids. With the adorable models, such videos are better at attracting kids and helping them absorb knowledge.
The motion creation capability. This capability, coming from 3D Engine, is loaded with features like real-time skeletal animation, facial expression animation, full-body inverse kinematics (FBIK), blending of animation state machines, and more. These features help create smooth 3D animations. Combining models animated by auto rigging with the mentioned features, as well as numerous other 3D Engine features such as HD rendering, visual special effects, and intelligent navigation, is helpful for creating fully functioning games.
AR capabilities from AR Engine, including motion tracking, environment tracking, and human body and face tracking. They allow a model animated by auto rigging to appear in the camera display of a mobile device, so that users can interact with the model. These capabilities are ideal for a mobile game to implement model customization and interaction. This makes games more interactive and fun, which is illustrated perfectly in the image below.
As mentioned earlier, the auto rigging capability supports only the biped humanoid object. However, I think we can try to add two legs to an object (for example, a candlestick) for auto rigging to animate, to recreate the Be Our Guest scene from Beauty and the Beast.
How It Works
After a static model of a biped humanoid is input, auto rigging uses AI algorithms for limb rigging and automatically generates the skeleton and skin weights for the model, to finish the skeleton rigging process. Then, the capability changes the orientation and position of the model skeleton so that the model can perform a range of actions such as walking, jumping, and dancing.
Advantages
Delivering a wholly automated rigging process
Rigging can be done either manually or automatically. Most highly accurate rigging solutions that are available on the market require the input model to be in a standard position and seven or eight key skeletal points to be added manually.
Auto rigging from 3D Modeling Kit does not have any of these requirements, yet it is able to accurately rig a model.
Utilizing massive data for high-level algorithm accuracy and generalization
Accurate auto rigging depends on hundreds of thousands of 3D model rigging data records that are used to train the Huawei-developed algorithms behind the capability. Thanks to these fine-tuned data records, auto rigging delivers ideal algorithm accuracy and generalization. It can even rig an object model created from photos taken with a standard mobile phone camera.
Input Model Specifications
The capability's official document lists the following suggestions for an input model that is to be used for auto rigging.
Source: a biped humanoid object (like a figurine or plush toy) that is not holding anything.
Appearance: The limbs and trunk of the object model are not separate, do not overlap, and do not feature any large accessories. The object model should stand on two legs, without its arms overlapping.
Posture: The object model should face forward along the z-axis and be upward along the y-axis. In other words, the model should stand upright, with its front facing forward. None of the model's joints should twist beyond 15 degrees, while there is no requirement on symmetry.
Mesh: The model meshes can be triangles or quadrilaterals. The number of mesh vertices should not exceed 80,000, and no large part of the mesh should be missing from the model.
Others: The limbs-to-trunk ratio of the object model complies with that of most toys. The limbs and trunk cannot be too thin or short, which means that the ratio of the arm width to the trunk width and the ratio of the leg width to the trunk width should be no less than 8% of the length of the object's longest edge.
Driven by AI, the auto rigging capability lowers the threshold of 3D modeling and animation creation, opening them up to amateur users.
While learning about this capability, I also came across three other fantastic capabilities of the 3D Modeling Kit. Wanna know what they are? Check them out here. Let me know in the comments section how your auto rigging has come along.