Upscaling a Blurry Text Image with Machine Learning - Huawei Developers

Unreadable image text caused by motion blur, poor lighting, low image resolution, or distance can render an image useless. This issue can adversely affect user experience in many scenarios, for example:
A user takes a photo of a receipt and uploads the photo to an app, expecting the app to recognize the text on the receipt. However, the text is unclear (due to the receipt being out of focus or poor lighting) and cannot be recognized by the app.
A filer takes images of old documents and wants an app to automatically extract the text from them to create a digital archive. Unfortunately, some characters on the original documents have become so blurred that they cannot be identified by the app.
A user receives a funny meme containing text and reposts it on different apps. However, the text of the reposted meme has become unreadable because the meme was compressed by the apps when it was reposted.
As you can see, this issue spoils the user experience and keeps people from sharing fun things with others. I knew that machine learning could help with it, and the solution I found is the text image super-resolution service from HMS Core ML Kit.
What Is Text Image Super-Resolution
The text image super-resolution service can zoom in on an image containing text to make it appear three times as big, dramatically improving text definition.
Check out the images below to see the difference with your own eyes.
Before
After
Where Text Image Super-Resolution Can Be Used
This service is ideal for identifying text from a blurry image. For example:
In a fitness app: The service can enhance the image quality of a nutrition facts label so that fitness freaks can understand what exactly they are eating.
In a note-taking app: The service can fix blurry images taken of a book or writing on a whiteboard, so that learners can digitally collate their notes.
What Text Image Super-Resolution Delivers
Remarkable enhancement result: It enlarges a text image up to three times its resolution, and works particularly well on JPG and downsampled images.
Fast process: The algorithm behind the service is built upon a deep neural network and fully utilizes the NPU of Huawei mobile phones to accelerate that network, delivering a 10-fold speedup.
Less development time and smaller app package size: The service comes with an API that is easy to integrate, and it saves the ROM that the algorithm model would otherwise occupy.
What Text Image Super-Resolution Requires
An input bitmap in ARGB format, which is also the output format of the service.
A compressed JPG image or a downsampled image, which is the kind of input the service works best on. If the resolution of the input image is already high, the effect of the service may not be noticeable.
The maximum dimensions of the input image are 800 x 800 px. The long edge of the input image should contain at least 64 pixels.
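The size limits above are easy to check before handing a bitmap to the service. The following is a minimal Python sketch and not the ML Kit API: the constant and function names are hypothetical, and it assumes that "maximum dimensions of 800 x 800 px" means neither side may exceed 800 px. It also computes the output size implied by the threefold enlargement described earlier.

```python
from typing import Tuple

MAX_SIDE = 800      # assumed: neither width nor height may exceed 800 px
MIN_LONG_EDGE = 64  # the long edge should contain at least 64 px
SCALE = 3           # the service enlarges a text image threefold


def is_valid_input(width: int, height: int) -> bool:
    """Return True if the image size fits the limits listed above."""
    if width <= 0 or height <= 0:
        return False
    long_edge = max(width, height)
    return MIN_LONG_EDGE <= long_edge <= MAX_SIDE


def upscaled_size(width: int, height: int) -> Tuple[int, int]:
    """Return the output size expected after threefold enlargement."""
    return width * SCALE, height * SCALE
```

Under these assumptions, a 200 x 100 px crop passes the check and would come back as 600 x 300 px, which is nine times the original pixel count.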
That concludes this overview of the service. If you want to know more about how to integrate the service, you can check out the walkthrough here.
The text image super-resolution service is just one function of the larger ML Kit. Click the link to learn more about the kit.

Related

Huawei Shares its Cutting-edge Camera Capabilities with Developers

Huawei phones are renowned for having state-of-the-art camera technology. By using their wide range of shooting modes, including Wide angle, Wide aperture, Portrait, Night, Slow-mo, and AI Cinema, experienced and novice photographers alike can capture precious moments in stunning quality.
Huawei is now making its Camera Kit available to third-party developers, who can integrate it into their apps and take advantage of the powerful image processing capabilities offered by Huawei cameras to bring fresh experiences to their users.
Huawei + TikTok: More camera modes means more fun!
[Super Slow-mo]
TikTok is currently the most downloaded social media app in the world, and has inspired hundreds and thousands of people to start creating their own short videos. The Slow-mo mode integrated from HUAWEI Camera Kit can produce incredible slow-motion videos, with high-definition, super slow-motion footage of up to 960 fps.
[Ultra-wide Angle]
Huawei's Ultra-wide angle mode is another option for TikTok users. It can be activated by simply opening the app, logging in, and choosing "Wide Angle" from the right sidebar. It’s never been easier to take impressive panoramic videos!
With a ToF 3D-depth camera, the subject can be identified accurately, and then reshaped and edited without making the background of the photo appear uneven. Users can apply this 3D shaping effect to videos as well. Currently, these features are available on many phones, including HUAWEI P30 series, Mate 20 series, nova 40, and HONOR V20 phones.
*Actual features may vary depending on the phone brand and model.
More third-party apps are integrating Huawei's media features
Huawei is providing an impressive collection of open capabilities for app developers. By integrating Huawei's ultra-high-resolution image processing feature, third-party apps can increase image pixel size up to ninefold (threefold for width and height respectively). Photos are clearer, more immersive, and look alive. The compression noise is also largely reduced. Even when the screen is enlarged, the photo has clean edges, sharpness, and fine details.
Huawei has built a wholly-open media technology sharing platform, which developers can use to easily integrate Huawei's advanced multimedia capabilities into their apps. All they need to do is download the SDKs. Aside from the camera modes mentioned above, HUAWEI Camera Kit has many more features, including Dual View, Super Zoom, HDR (front camera), HDR (video sensor), Super Macro, among others. Users of non-Huawei phones can now experience the amazing image processing capabilities of Huawei phone cameras when using third-party apps.

Explore the World in a Whole New Way with Visual Searches

It's probably happened to you before: You're scrolling through Instagram, and see a friend's post about their recent vacation. You notice they're wearing a beanie, but can't figure out the brand. This is where visual searches can step in.
What is a visual search?
Visual searches are backed by a power-efficient hardware architecture, a suite of advanced AI technologies, and some very complicated algorithms. For instance, artificial neural networks, such as a convolutional neural network (CNN), a deep learning algorithm, are commonly used in the process of image recognition and processing. These technologies help visual search engines understand and identify objects in photos and even videos, in order to generate related search results.
A visual search engine mimics the human brain's ability to recognize objects in images, using data collected from cameras and sensors. Whether you're trying to identify a plant in your yard, or a cute little puppy on the street, you can easily do so, just by taking a picture – without having to type words to describe what you're looking for in a search box.
A few things a visual search can do:
- Perform a search faster than a typed search
- Allow you to easily compare prices between different stores
- Support searching for multiple results simultaneously
Performing visual searches in Petal Search
Huawei has introduced visual search technology into Petal Search, to provide a better searching and shopping experience. All you have to do is snap a photo, and the app will identify the object and find information on it. You can search for a wide range of content across different categories, such as animals, food, clothes, and furniture.
To search by image, simply open Petal Search, and tap the camera icon in the main search bar. You can take a photo with your camera or choose one from your gallery. When you search for a product – let's say a houseplant – Petal Search picks out details of the plant from the photo you upload to the engine, and then processes the image data. It can consequently discover information such as how much it costs, where to buy it, and can even return images of other plants that are visually similar.
Better yet, Petal Search is capable of searching for and detecting multiple targets in an image at a time. For instance, when you take a photo of a sofa with a blanket and cushion on it in a furniture showroom, Petal Search can identify all three items at the same time and deliver comprehensive search results in a single list. This saves you the hassle of having to take tons of photos and search them one by one. Visual search technology has made shopping so much easier.
So, the next time you're trying to identify a plant in your yard or a bag you've seen your coworker carry, or even to solve a math problem, simply take a photo, and Petal Search will find it in a matter of seconds!

[HMS Core 6.0 Launch] Build a 3D Model in No Time with 3D Modeling Kit

3D Modeling Kit is another service with graphics- and image-related technologies provided by Huawei. This AI-powered kit automatically generates 3D models and physically based rendering (PBR) texture maps, to satisfy the need for efficient 3D model and animation creation.
Utilizing Huawei-developed algorithms, 3D Modeling Kit supports all Android devices with minimal hardware requirements. The kit can collect all details of an object for modeling, by combining mutual visibility information of images taken from different angles of the object and estimating the depth of field of the image. This process does not require any special devices such as RGB-D or light detection and ranging (LiDAR) sensors. Using just a mobile phone with a standard RGB camera, 3D Modeling Kit can generate a 3D model with 40,000 to 200,000 patches. Moreover, the kit cuts modeling costs and is more efficient than conventional manual modeling methods. Once the kit has been integrated, it supports data collection and upload, as well as model preview and download. 3D Modeling Kit can separate the object from its background, so it is able to generate models that have sharp and smooth edges with no background. It is undoubtedly an ideal option for creating product models, teaching, creating games and animations, and making short videos.
This kit uses an AI-powered tool to convert one or more RGB images into four PBR texture maps (diffuse, normal, specular, and roughness maps). These maps, supported by mainstream rendering engines, can bring a lifelike level of lighting and shading. Generally, a large number of realistic textures are needed in the gaming and video industries. Now, with 3D Modeling Kit, developers can create the desired textures in a highly intuitive way, rather than creating them from scratch.

AI Color from HMS Core Video Editor Kit Rejuvenates Old Photos

Since 1839, when Louis Daguerre invented the daguerreotype (the first publicly available photographic process), new inventions have continued to advance photography. It has reached the point where people can record experiences through photos, anytime and anywhere. However, it is a shame that many early photos exist only in black and white.
HMS Core Video Editor Kit provides the AI color function that can liven up such photos, intelligently adding color to black-and-white images or videos to endow them with a more contemporary feel.
In addition to AI color, the kit also provides other AI-empowered capabilities, such as allowing your users to copy a desired filter, track motions, change hair color, animate a picture, and mask faces.
In terms of input and output support, Video Editor Kit allows multiple images and videos to be imported, which can be flexibly arranged and trimmed, and allows videos of up to 4K and with a frame rate up to 60 fps to be exported.
Useful in Various Scenarios
Video Editor Kit is ideal for numerous application scenarios, to name a few:
Video editing: The kit helps accelerate video creation by providing functions such as video clipping/stitching and allowing special effects/music to be added.
Travel: The kit enables users to make vlogs on the go to share their memories with others.
Social media: Functions like video clipping/stitching, special effects, and filters are especially useful for social media app users, and are a great way for them to spice up videos.
E-commerce: Product videos with subtitles, special effects, and background music allow products to be displayed in a more intuitive and immersive way.
Flexible Integration Methods
Video Editor Kit can now be integrated via its:
UI SDK, which comes with a product-level UI for straightforward integration.
Fundamental capability SDK, which offers hundreds of APIs for fundamental capabilities, including the AI-empowered ones. The APIs can be integrated as needed.
Both of the SDKs serve as a one-stop toolkit for editing videos, providing functions including file import, editing, rendering, output, and material management. Integrating either of the SDKs allows you to access the kit's powerful capabilities.
These capabilities enable your users to restore early photos and record life experiences. Check out the official documentation for this great Video Editor Kit, to know more about how it can help you create a mobile life recorder.

Shot It & Got It: Know What You Eat with Image Classification

Washboard abs, buff biceps, or a curvy figure — a body shape that most of us probably desire. However, let's be honest: We're too lazy to get it.
Hitting the gym is a great way to get ourselves in shape, but paying attention to what we eat and how much we eat requires not only great persistence, but also knowledge about what goes into food.
The food recognition function can be integrated into fitness apps, letting users use their phone's camera to capture food and displaying on-screen details about the calories, nutrients, and other bits and pieces of the food in question. This helps health fanatics keep track of what they eat on a meal-by-meal basis.
The GIF below shows the food recognition function in action.
Technical Principles
This fitness assistant is made possible by image classification technology, a widely adopted basic branch of the AI field. Traditionally, image classification works by pre-processing images, extracting their features, and developing a classifier. The second part of the process, feature extraction, entails a huge amount of manual labor, so such a process can classify only images with limited information, let alone images that contain lists of details.
Luckily, in recent years, image classification has developed considerably with the help of deep learning. This method adopts a specific inference framework and a neural network to classify and tag elements in images, to better determine the image themes and scenarios.
Image classification from HMS Core ML Kit is one service that adopts such a method. It works by: detecting the input image in static image mode or camera stream mode → analyzing the image by using the on-device or on-cloud algorithm model → returning the image category (for example, plant, furniture, or mobile phone) and its corresponding confidence.
The figure below illustrates the whole procedure.
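The last step of that procedure, returning categories with their confidence, usually ends with the app deciding which categories to trust. The following is a minimal Python sketch of that post-processing. The `Label` type, the `top_labels` helper, the 0.5 threshold, and the assumption that confidence lies between 0 and 1 are all hypothetical; they stand in for whatever result objects the ML Kit SDK actually returns.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Label:
    """Hypothetical stand-in for one classification result."""
    name: str          # image category, for example "plant"
    confidence: float  # assumed to be a score between 0 and 1


def top_labels(labels: List[Label],
               min_confidence: float = 0.5,
               limit: int = 3) -> List[Label]:
    """Keep labels at or above the threshold, highest confidence first."""
    kept = [label for label in labels if label.confidence >= min_confidence]
    kept.sort(key=lambda label: label.confidence, reverse=True)
    return kept[:limit]
```

Filtering on the app side keeps low-confidence categories from reaching the user, and the threshold can be tuned per scenario.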
Advantages of ML Kit's Image Classification
This service is built upon deep learning. It recognizes image content (such as objects, scenes, behavior, and more) and returns their corresponding tag information. It is able to provide accuracy, speed, and more by utilizing:
Transfer learning algorithm: The service is equipped with a higher-performance image-tagging model and a better knowledge transfer capability, as well as a regularly refined deep neural network topology, to boost accuracy by 38%.
Semantic network WordNet: The service optimizes the semantic analysis model and analyzes images semantically. It can automatically deduce the image concepts and tags, and supports up to 23,000 tags.
Acceleration based on Huawei GPU cloud services: Huawei GPU cloud services increase the cache bandwidth by 2 times and the bit width by 8 times, which are vastly superior to the predecessor. These improvements mean that image classification requires only 100 milliseconds to recognize an image.
Sounds tempting, right? Here's something even better if you want to use the image classification service from ML Kit for your fitness app: You can either directly use the classification categories offered by the service, or customize your image classification model. You can then train your model with the images collected for different foods, and import their tag data into your app to build up a huge database of food calorie details. When your user uses the app, the depth of field (DoF) camera on their device (a Huawei phone, for example) measures the distance between the device and the food to estimate the size and weight of the food. Your app then matches the estimate with the information in its database, to break down the food's calories.
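The matching step at the end of that flow is simple arithmetic once the app has a food tag and an estimated weight. The following is a minimal Python sketch under stated assumptions: the lookup table, its tag names, and the `estimate_calories` function are hypothetical, and the energy values are placeholders for illustration, not real nutritional data or anything provided by ML Kit.

```python
from typing import Optional

# Hypothetical in-app table: classification tag -> kcal per 100 g.
# The numbers are placeholders, not real nutritional data.
KCAL_PER_100G = {
    "food_a": 100.0,
    "food_b": 250.0,
}


def estimate_calories(tag: str, weight_grams: float) -> Optional[float]:
    """Return the estimated kcal for a portion, or None for an unknown tag."""
    per_100g = KCAL_PER_100G.get(tag)
    if per_100g is None:
        return None
    return per_100g * weight_grams / 100.0
```

Returning `None` for an unknown tag lets the app ask the user to pick the food manually instead of showing a misleading number.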
In addition to fitness management, ML Kit's image classification can also be used in a range of other scenarios, for example, image gallery management, product image classification for an e-commerce app, and more.
All these can be realized with the image classification categories of the mentioned image classification service. I have integrated it into my app, so what are you waiting for?
