ML Kit: Text Recognition Development Procedure - Huawei Developers

Text recognition from images on the device
1. Create the text analyzer MLTextAnalyzer to recognize text in images. You can set MLLocalTextSetting to specify languages that can be recognized. If you do not set the languages, only Latin-based languages can be recognized by default.
Code:
// Method 1: Use default parameter settings to configure the on-device text analyzer. Only Latin-based languages can be recognized.
MLTextAnalyzer analyzer = MLAnalyzerFactory.getInstance().getLocalTextAnalyzer();
// Method 2: Use the customized parameter MLLocalTextSetting to configure the text analyzer on the device.
MLLocalTextSetting setting = new MLLocalTextSetting.Factory()
.setOCRMode(MLLocalTextSetting.OCR_DETECT_MODE)
// Specify languages that can be recognized.
.setLanguage("en")
.create();
MLTextAnalyzer analyzer = MLAnalyzerFactory.getInstance().getLocalTextAnalyzer(setting);
2. Create an MLFrame using android.graphics.Bitmap. JPG, JPEG, PNG, and BMP images are supported. It is recommended that the length-width ratio range from 1:2 to 2:1.
Code:
// Create an MLFrame object using the bitmap, which is the image data in bitmap format.
MLFrame frame = MLFrame.fromBitmap(bitmap);
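If the image comes from a local file, it first needs to be decoded into a bitmap; a minimal sketch (the file path here is only a placeholder):
Code:
// Decode an image file into a bitmap and wrap it in an MLFrame.
Bitmap bitmap = BitmapFactory.decodeFile("/sdcard/Pictures/sample.jpg");
MLFrame frame = MLFrame.fromBitmap(bitmap);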
3. Pass the MLFrame object to the asyncAnalyseFrame method for text recognition.
Code:
Task<MLText> task = analyzer.asyncAnalyseFrame(frame);
task.addOnSuccessListener(new OnSuccessListener<MLText>() {
@Override
public void onSuccess(MLText text) {
// Recognition success.
}
}).addOnFailureListener(new OnFailureListener() {
@Override
public void onFailure(Exception e) {
// Recognition failure.
}
});
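Inside onSuccess, the recognized content can be read from the returned MLText object. The following is a minimal sketch; the getStringValue() and getBlocks() accessors are assumed to match the MLText API in the ML Kit reference:
Code:
// Obtain the whole recognized text as one string.
String wholeText = text.getStringValue();
// Or walk through the recognized blocks one by one.
for (MLText.Block block : text.getBlocks()) {
    String blockText = block.getStringValue();
    // Process each block as required, for example, append it to a TextView.
}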
The preceding sample code uses the asynchronous call mode. On-device text recognition also supports synchronous calls, in which case the recognition result is returned as a SparseArray of MLText.Block objects.
Code:
Context context = getApplicationContext();
MLTextAnalyzer analyzer = new MLTextAnalyzer.Factory(context).create();
SparseArray<MLText.Block> blocks = analyzer.analyseFrame(frame);
4. After the recognition is complete, stop the analyzer to release recognition resources.
Code:
if (analyzer != null) {
analyzer.close();
}
Text recognition from images on the cloud
1. Create an analyzer. The recommended way is to use MLRemoteTextSetting, which lets you specify the languages to be recognized for more accurate text recognition.
Code:
// Method 1: Use customized parameter settings.
MLRemoteTextSetting setting = new MLRemoteTextSetting.Factory()
//Set the on-cloud text detection mode.
// MLRemoteTextSetting.OCR_COMPACT_SCENE: dense text recognition
// MLRemoteTextSetting.OCR_LOOSE_SCENE: sparse text recognition
.setTextDensityScene(MLRemoteTextSetting.OCR_LOOSE_SCENE)
// Specify the languages that can be recognized, which should comply with ISO 639-1.
.setLanguageList(new ArrayList<String>(){{this.add("zh"); this.add("en");}})
// Set the format of the returned text border box.
// MLRemoteTextSetting.NGON: Return the coordinates of the four vertices of the quadrilateral.
// MLRemoteTextSetting.ARC: Return the vertices of a polygon border in an arc. The coordinates of up to 72 vertices can be returned.
.setBorderType(MLRemoteTextSetting.ARC)
.create();
MLTextAnalyzer analyzer = MLAnalyzerFactory.getInstance().getRemoteTextAnalyzer(setting);
// Method 2: Use the default parameter settings to automatically detect languages for text recognition. This method is applicable to sparse text scenarios. The format of the returned text box is MLRemoteTextSetting.NGON.
MLTextAnalyzer analyzer = MLAnalyzerFactory.getInstance().getRemoteTextAnalyzer();
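Because both methods above call an on-cloud API, the app also needs to set the API key (or access token) obtained from AppGallery Connect before starting recognition. A minimal sketch, using the same call that appears in the TTS example later in this thread:
Code:
// Set the API key from agconnect-services.json so that the on-cloud API can be called.
MLApplication.getInstance().setApiKey(AGConnectServicesConfig
        .fromContext(getApplicationContext()).getString("client/api_key"));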
2. Create an MLFrame using the bitmap. JPG, JPEG, PNG, and BMP images are supported.
Code:
MLFrame frame = MLFrame.fromBitmap(bitmap);
3. Pass the MLFrame object to the asyncAnalyseFrame method for text recognition.
Code:
Task<MLText> task = analyzer.asyncAnalyseFrame(frame);
task.addOnSuccessListener(new OnSuccessListener<MLText>() {
@Override
public void onSuccess(MLText text) {
// Recognition success.
}
}).addOnFailureListener(new OnFailureListener() {
@Override
public void onFailure(Exception e) {
// Recognition failure.
}
});
4. After the recognition is complete, stop the analyzer to release recognition resources.
Code:
try {
if (analyzer != null) {
analyzer.close();
}
} catch (IOException e) {
// Exception handling.
}
Text recognition from camera streams on the device
Your app can process camera streams, convert camera frames into the MLFrame object, and recognize text using the local static image recognition method. If the synchronous recognition API is called, your app can also use the LensEngine class built in the SDK to locally detect text in camera streams and create and initialize a LensEngine object. For details, please refer to Sample Code.
1. Create an analyzer.
Code:
MLTextAnalyzer analyzer = new MLTextAnalyzer.Factory(context).create();
2. Create the OcrDetectorProcessor class for processing detection results. This class provides the MLAnalyzer.MLTransactor<T> API, which uses the transactResult method to obtain the detection results and implement specific services.
Code:
public class OcrDetectorProcessor implements MLAnalyzer.MLTransactor<MLText.Block> {
@Override
public void transactResult(MLAnalyzer.Result<MLText.Block> results) {
SparseArray<MLText.Block> items = results.getAnalyseList();
// Determine detection result processing as required. Note that only the detection results are processed.
// Other detection-related APIs provided by HUAWEI ML Kit cannot be called.
}
@Override
public void destroy() {
// Callback method used to release resources when the detection ends.
}
}
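As an illustration of what the result processing could look like, the transactResult body might simply concatenate the text of the detected blocks. This is a hedged sketch; getStringValue() on MLText.Block is assumed from the API reference:
Code:
// Example body for transactResult: join the text of all detected blocks.
StringBuilder builder = new StringBuilder();
for (int i = 0; i < items.size(); i++) {
    MLText.Block block = items.valueAt(i);
    builder.append(block.getStringValue()).append("\n");
}
// Post builder.toString() to the UI thread if you want to display it.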
3. Set the detection result processor to bind the analyzer to the result processor.
Code:
analyzer.setTransactor(new OcrDetectorProcessor());
4. Use the LensEngine class built into the SDK to create a LensEngine object, register the analyzer with it, and initialize camera parameters. (LensEngine works with the synchronous recognition API.)
Code:
LensEngine lensEngine = new LensEngine.Creator(getApplicationContext(),analyzer)
.setLensType(LensEngine.BACK_LENS)
.applyDisplayDimension(1440, 1080)
.applyFps(30.0f)
.enableAutomaticFocus(true)
.create();
5. Call the run method to start the camera and read camera streams for recognition.
Code:
// Implement other logic of the SurfaceView control by yourself.
SurfaceView mSurfaceView = findViewById(R.id.surface_view);
try {
lensEngine.run(mSurfaceView.getHolder());
} catch (IOException e) {
// Exception handling logic.
}
6. After the recognition is complete, stop the analyzer to release recognition resources.
Code:
if (analyzer != null) {
try {
analyzer.close();
} catch (IOException e) {
// Exception handling.
}
}
if (lensEngine != null) {
lensEngine.release();
}
In camera stream detection, when MLAnalyzer.MLTransactor<T> is inherited to process detection results, if your app needs to stop detection after a specific result is detected and continue detection after the result is processed, please refer to Development for Multi Detections in Camera Stream Detection Mode.

Related

ML Kit: Face Detection Development Procedure

Before API development, you need to make necessary development preparations, ensure that the Maven repository address of the HMS Core SDK has been configured in your project, and the SDK of this service has been integrated.
Static image detection
1. Create a face analyzer. You can create the analyzer using the MLFaceAnalyzerSetting class.
Code:
// Method 1: Use customized parameter settings.
// If the Full SDK mode is used for integration, set parameters based on the integrated model package.
MLFaceAnalyzerSetting setting = new MLFaceAnalyzerSetting.Factory()
// Set whether to detect key face points.
.setKeyPointType(MLFaceAnalyzerSetting.TYPE_KEYPOINTS)
// Set whether to detect facial features.
.setFeatureType(MLFaceAnalyzerSetting.TYPE_FEATURES)
// Set whether to detect face contour points.
.setShapeType(MLFaceAnalyzerSetting.TYPE_SHAPES)
// Set whether to enable face tracking.
.setTracingAllowed(true)
// Set the speed and precision of the detector.
.setPerformanceType(MLFaceAnalyzerSetting.TYPE_SPEED)
.create();
MLFaceAnalyzer analyzer = MLAnalyzerFactory.getInstance().getFaceAnalyzer(setting);
// Method 2: Use the default parameter settings. This method can be used when the Lite SDK is used for integration. The default parameters are key points, face contour, facial features, precision mode, and face tracking (disabled by default) for detection.
MLFaceAnalyzer analyzer = MLAnalyzerFactory.getInstance().getFaceAnalyzer();
2. Create an MLFrame object by using android.graphics.Bitmap for the analyzer to detect images. JPG, JPEG, and PNG images are supported. It is recommended that the image size be within the range of 320 x 320 px to 1920 x 1920 px.
Code:
// Create an MLFrame by using the bitmap.
MLFrame frame = MLFrame.fromBitmap(bitmap);
3. Call the asyncAnalyseFrame method to perform face detection.
Code:
Task<List<MLFace>> task = analyzer.asyncAnalyseFrame(frame);
task.addOnSuccessListener(new OnSuccessListener<List<MLFace>>() {
@Override
public void onSuccess(List<MLFace> faces) {
// Detection success.
}
}).addOnFailureListener(new OnFailureListener() {
@Override
public void onFailure(Exception e) {
// Detection failure.
try {
MLException mlException = (MLException)e;
// Obtain the result codes. You can process the result codes and customize respective messages displayed to users. For details about the result codes, please refer to MLException.
int errorCode = mlException.getErrCode();
// Obtain the error information. You can quickly locate the fault based on the result code.
String errorMessage = mlException.getMessage();
} catch (Exception error) {
// Handle the conversion error.
}
}
});
4. After the detection is complete, stop the analyzer to release detection resources.
Code:
try {
if (analyzer != null) {
analyzer.stop();
}
} catch (IOException e) {
// Exception handling.
}
The asynchronous call mode is used in the preceding sample code. Face detection also supports synchronous calls to the analyseFrame function to obtain the detection result.
Code:
SparseArray<MLFace> faces = analyzer.analyseFrame(frame);
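The returned SparseArray can be traversed like any other. A minimal sketch; MLFace#getBorder() returning the face bounding rectangle is an assumption based on the API reference:
Code:
// Iterate over the detected faces and read their bounding rectangles.
for (int i = 0; i < faces.size(); i++) {
    MLFace face = faces.valueAt(i);
    Rect border = face.getBorder();
    // Use the border, for example, to draw a frame around the face.
}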
Camera stream detection
You can process camera streams, convert video frames into an MLFrame object, and detect faces using the local static image detection method. If the synchronous detection API is called, you can also use the LensEngine class built in the SDK to locally detect faces in camera streams. The sample code is as follows:
1. Create a face analyzer.
Code:
MLFaceAnalyzer analyzer = MLAnalyzerFactory.getInstance().getFaceAnalyzer();
2. Create the FaceAnalyzerTransactor class for processing detection results. This class implements the MLAnalyzer.MLTransactor<T> API and uses the transactResult method to obtain the detection results and implement specific services.
Code:
public class FaceAnalyzerTransactor implements MLAnalyzer.MLTransactor<MLFace> {
@Override
public void transactResult(MLAnalyzer.Result<MLFace> results) {
SparseArray<MLFace> items = results.getAnalyseList();
// Determine detection result processing as required. Note that only the detection results are processed.
// Other detection-related APIs provided by HUAWEI ML Kit cannot be called.
}
@Override
public void destroy() {
// Callback method used to release resources when the detection ends.
}
}
3. Set the detection result processor to bind the analyzer to the result processor.
Code:
analyzer.setTransactor(new FaceAnalyzerTransactor());
4. Create an instance of the LensEngine class provided by the HMS Core ML SDK to capture dynamic camera streams and pass the streams to the analyzer. It is recommended that the camera display size be set to a value ranging from 320 x 320 px to 1920 x 1920 px.
Code:
LensEngine lensEngine = new LensEngine.Creator(getApplicationContext(), analyzer)
.setLensType(LensEngine.BACK_LENS)
.applyDisplayDimension(1440, 1080)
.applyFps(30.0f)
.enableAutomaticFocus(true)
.create();
5. Call the run method to start the camera and read camera streams for recognition.
Code:
// Implement other logic of the SurfaceView control by yourself.
SurfaceView mSurfaceView = findViewById(R.id.surface_view);
try {
lensEngine.run(mSurfaceView.getHolder());
} catch (IOException e) {
// Exception handling logic.
}
6. After the detection is complete, stop the analyzer to release detection resources.
Code:
if (analyzer != null) {
try {
analyzer.stop();
} catch (IOException e) {
// Exception handling.
}
}
if (lensEngine != null) {
lensEngine.release();
}
In camera stream detection, when MLAnalyzer.MLTransactor<T> is inherited to process detection results, if your app needs to stop detection after a specific result is detected and continue detection after the result is processed, please refer to Development for Multi Detections in Camera Stream Detection Mode.

CameraX — Camera Kit comparison

For more information like this, you can visit the HUAWEI Developer Forum.
Original link: https://forums.developer.huawei.com/forumPortal/en/topicview?tid=0201332917232620018&fid=0101187876626530001
CameraX
CameraX is a Jetpack support library, built to help you make camera app development easier. It provides a consistent and easy-to-use API surface that works across most Android devices, with backward compatibility to Android 5.0 (API level 21).
While it leverages the capabilities of camera2, it uses a simpler, use-case-based approach that is lifecycle-aware. It also resolves device compatibility issues for you so that you don’t have to include device-specific code in your codebase. These features reduce the amount of code you need to write when adding camera capabilities to your app.
Use Cases
CameraX introduces use cases, which allow you to focus on the task you need to get done instead of spending time managing device-specific nuances. There are several basic use cases:
Preview: get an image on the display
Image analysis: access a buffer seamlessly for use in your algorithms, such as to pass into MLKit
Image capture: save high-quality images
CameraX has an optional add-on, called Extensions, which allows you to access the same features and capabilities as those in the native camera app that ships with the device, with just two lines of code.
The first set of capabilities available includes Portrait, HDR, Night, and Beauty. These capabilities are available on supported devices.
Implementing Preview
When adding a preview to your app, use PreviewView, which is a View that can be cropped, scaled, and rotated for proper display.
The image preview streams to a surface inside the PreviewView when the camera becomes active.
Implementing a preview for CameraX using PreviewView involves the following steps, which are covered in later sections:
Optionally configure a CameraXConfig.Provider.
Add a PreviewView to your layout.
Request a CameraProvider.
On View creation, check for the CameraProvider.
Select a camera and bind the lifecycle and use cases.
Using PreviewView has some limitations. When using PreviewView, you can’t do any of the following things:
Create a SurfaceTexture to set on TextureView and PreviewSurfaceProvider.
Retrieve the SurfaceTexture from TextureView and set it on PreviewSurfaceProvider.
Get the Surface from SurfaceView and set it on PreviewSurfaceProvider.
If any of these happen, then the Preview will stop streaming frames to the PreviewView.
In your app-level build.gradle file, add the following:
Code:
// CameraX core library using the camera2 implementation
def camerax_version = "1.0.0-beta03"
def camerax_extensions = "1.0.0-alpha10"
implementation "androidx.camera:camera-core:${camerax_version}"
implementation "androidx.camera:camera-camera2:${camerax_version}"
// If you want to additionally use the CameraX Lifecycle library
implementation "androidx.camera:camera-lifecycle:${camerax_version}"
// If you want to additionally use the CameraX View class
implementation "androidx.camera:camera-view:${camerax_extensions}"
// If you want to additionally use the CameraX Extensions library
implementation "androidx.camera:camera-extensions:${camerax_extensions}"
In your layout .xml file, using PreviewView is highly recommended:
Code:
<androidx.camera.view.PreviewView
android:id="@+id/camera"
android:layout_width="match_parent"
android:layout_height="match_parent"
android:contentDescription="@string/preview_area"
android:importantForAccessibility="no"/>
Let's start the backend coding for our previewView in our Activity or a Fragment:
Code:
private val REQUIRED_PERMISSIONS = arrayOf(Manifest.permission.CAMERA)
private lateinit var cameraSelector: CameraSelector
private lateinit var previewView: PreviewView
private lateinit var cameraProviderFeature: ListenableFuture<ProcessCameraProvider>
private lateinit var cameraControl: CameraControl
private lateinit var cameraInfo: CameraInfo
private lateinit var imageCapture: ImageCapture
private lateinit var imageAnalysis: ImageAnalysis
private lateinit var torchView: ImageView
private val executor = Executors.newSingleThreadExecutor()
takePicture() method:
Code:
fun takePicture() {
val file = createFile(
outputDirectory,
FILENAME,
PHOTO_EXTENSION
)
val outputFileOptions = ImageCapture.OutputFileOptions.Builder(file).build()
imageCapture.takePicture(
outputFileOptions,
executor,
object : ImageCapture.OnImageSavedCallback {
override fun onImageSaved(outputFileResults: ImageCapture.OutputFileResults) {
val msg = "Photo capture succeeded: ${file.absolutePath}"
previewView.post {
Toast.makeText(
context.applicationContext,
msg,
Toast.LENGTH_SHORT
).show()
//You can create a task to save your image to any database you like
getImageTask(file)
}
}
override fun onError(exception: ImageCaptureException) {
val msg = "Photo capture failed: ${exception.message}"
showLogError(mTAG, msg)
}
})
}
As mentioned, you can get a URI from the file and use it anywhere you like:
Code:
fun getImageTask(file: File) {
val uri = Uri.fromFile(file)
}
This part is an example of starting the front camera; with minor changes, you can switch between the front and back cameras:
Code:
fun startCameraFront() {
showLogDebug(mTAG, "startCameraFront")
CameraX.unbindAll()
torchView.visibility = View.INVISIBLE
imagePreviewView = Preview.Builder().apply {
setTargetAspectRatio(AspectRatio.RATIO_4_3)
setTargetRotation(previewView.display.rotation)
setDefaultResolution(Size(1920, 1080))
setMaxResolution(Size(3024, 4032))
}.build()
imageAnalysis = ImageAnalysis.Builder().apply {
setImageQueueDepth(ImageAnalysis.STRATEGY_KEEP_ONLY_LATEST)
}.build()
imageAnalysis.setAnalyzer(executor, LuminosityAnalyzer())
imageCapture = ImageCapture.Builder().apply {
setCaptureMode(ImageCapture.CAPTURE_MODE_MAXIMIZE_QUALITY)
}.build()
cameraSelector =
CameraSelector.Builder().requireLensFacing(CameraSelector.LENS_FACING_FRONT).build()
cameraProviderFeature.addListener(Runnable {
val cameraProvider = cameraProviderFeature.get()
val camera = cameraProvider.bindToLifecycle(
this,
cameraSelector,
imagePreviewView,
imageAnalysis,
imageCapture
)
previewView.preferredImplementationMode =
PreviewView.ImplementationMode.TEXTURE_VIEW
imagePreviewView.setSurfaceProvider(previewView.createSurfaceProvider(camera.cameraInfo))
}, ContextCompat.getMainExecutor(context.applicationContext))
}
LuminosityAnalyzer is essential for autofocus measures, so I recommend you use it:
Code:
private class LuminosityAnalyzer : ImageAnalysis.Analyzer {
private var lastAnalyzedTimestamp = 0L
/**
* Helper extension function used to extract a byte array from an
* image plane buffer
*/
private fun ByteBuffer.toByteArray(): ByteArray {
rewind() // Rewind the buffer to zero
val data = ByteArray(remaining())
get(data) // Copy the buffer into a byte array
return data // Return the byte array
}
override fun analyze(image: ImageProxy) {
val currentTimestamp = System.currentTimeMillis()
// Calculate the average luma no more often than every second
if (currentTimestamp - lastAnalyzedTimestamp >=
TimeUnit.SECONDS.toMillis(1)
) {
val buffer = image.planes[0].buffer
val data = buffer.toByteArray()
val pixels = data.map { it.toInt() and 0xFF }
val luma = pixels.average()
showLogDebug(mTAG, "Average luminosity: $luma")
lastAnalyzedTimestamp = currentTimestamp
}
image.close()
}
}
Now, before saving our image to our folder, let's define our constants:
Code:
companion object {
private const val REQUEST_CODE_PERMISSIONS = 10
private const val mTAG = "ExampleTag"
private const val FILENAME = "yyyy-MM-dd-HH-mm-ss-SSS"
private const val PHOTO_EXTENSION = ".jpg"
private var recPath = Environment.getExternalStorageDirectory().path + "/Pictures/YourNewFolderName"
fun getOutputDirectory(context: Context): File {
val appContext = context.applicationContext
val mediaDir = context.externalMediaDirs.firstOrNull()?.let {
File(
recPath
).apply { mkdirs() }
}
return if (mediaDir != null && mediaDir.exists()) mediaDir else appContext.filesDir
}
fun createFile(baseFolder: File, format: String, extension: String) =
File(
baseFolder, SimpleDateFormat(format, Locale.ROOT)
.format(System.currentTimeMillis()) + extension
)
}
Simple torch control:
Code:
fun toggleTorch() {
when (cameraInfo.torchState.value) {
TorchState.ON -> {
cameraControl.enableTorch(false)
}
else -> {
cameraControl.enableTorch(true)
}
}
}
private fun setTorchStateObserver() {
cameraInfo.torchState.observe(this, androidx.lifecycle.Observer { state ->
if (state == TorchState.ON) {
torchView.setImageResource(R.drawable.ic_flash_on)
} else {
torchView.setImageResource(R.drawable.ic_flash_off)
}
})
}
Remember, torchView can be any View type you want it to be:
Code:
torchView.setOnClickListener {
toggleTorch()
setTorchStateObserver()
}
Now, in onCreateView() for a fragment (or onCreate() for an activity), you can start the preview once the camera permission has been granted:
Code:
// The permission check below is an assumed example for a Fragment; adapt it to your component.
if (REQUIRED_PERMISSIONS.all { ContextCompat.checkSelfPermission(requireContext(), it) == PackageManager.PERMISSION_GRANTED }) {
previewView.post { startCameraFront() }
} else {
requestPermissions(
REQUIRED_PERMISSIONS,
REQUEST_CODE_PERMISSIONS
)
}
Camera Kit
HUAWEI Camera Kit encapsulates the Google Camera2 API to support multiple enhanced camera capabilities.
Unlike other camera APIs, Camera Kit focuses on bringing the full capability of your phone's camera to your apps. Think of it this way: many social media apps have their own camera features, yet the output from their in-app cameras is almost always worse than the quality your phone's native camera actually provides. Your camera may support 50x zoom, Super Night mode, or Wide Aperture mode, but the full potential of the phone's camera goes unused, regardless of the price or features of the phone, when you take a shot from a third-party camera API.
HUAWEI Camera Kit provides a set of advanced programming APIs for you to integrate powerful image processing capabilities of Huawei phone cameras into your apps. Camera features such as wide aperture, Portrait mode, HDR, background blur, and Super Night mode can help your users shoot stunning images and vivid videos anytime and anywhere.
Features
Unlike most open-source camera APIs, Camera Kit accesses the device's native camera features and is able to unleash them in your apps.
Front Camera HDR: In a backlit or low-light environment, front camera High Dynamic Range (HDR) improves the details in both the well-lit and poorly-lit areas of photos to present more life-like qualities.
Super Night Mode: This mode is used for you to take photos with sufficient brightness by using a long exposure at night. It also helps you to take photos that are properly exposed in other dark environments.
Wide Aperture: This mode blurs the background and highlights the subject in a photo. You are advised to be within 2 meters of the subject when taking a photo and to disable the flash in this mode.
Recording: This mode helps you record HD videos with effects such as different colors, filters, and AI film. Effects: Video HDR, Video background blurring
Portrait: Portraits and close-ups
Photo Mode: This mode supports general capabilities that include but are not limited to the following. Rear camera: flash, color modes, face/smile detection, filter, and master AI. Front camera: face/smile detection, filter, SensorHdr, and mirror reflection.
Super Slow-Mo Recording: This mode allows you to record super slow-motion videos with a frame rate of over 960 FPS in manual or automatic (motion detection) mode.
Slow-mo Recording: This mode allows you to record slow-motion videos with a frame rate lower than 960 FPS.
Pro Mode (Video): The Pro mode is designed to open the professional photography and recording capabilities of the Huawei camera to apps to meet diversified shooting requirements.
Pro Mode (Photo): This mode allows you to adjust the following camera parameters to obtain the same shooting capabilities as those of Huawei camera: Metering mode, ISO, exposure compensation, exposure duration, focus mode, and automatic white balance.
Integration Process
Registration and Sign-in
Before you get started, you must register as a HUAWEI developer and complete identity verification on the HUAWEI Developer website. For details, please refer to Register a HUAWEI ID.
Signing the HUAWEI Developer SDK Service Cooperation Agreement
When you download the SDK from SDK Download, the system prompts you to sign in and sign the HUAWEI Media Service Usage Agreement…
Environment Preparations
Android Studio v3.0.1 or later is recommended.
Huawei phones equipped with Kirin 980 or later and running EMUI 10.0 or later are required.
Code Part (Portrait Mode)
Now let's do an example for Portrait mode. In our manifest, let's set up some permissions:
Code:
<uses-permission android:name="android.permission.CAMERA" />
<uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE" />
<uses-permission android:name="android.permission.WRITE_INTERNAL_STORAGE" />
<uses-permission android:name="android.permission.READ_INTERNAL_STORAGE" />
<uses-permission android:name="android.permission.READ_EXTERNAL_STORAGE" />
<uses-permission android:name="android.permission.RECORD_AUDIO" />
<uses-permission android:name="android.permission.ACCESS_FINE_LOCATION" />
<uses-feature android:name="android.hardware.camera" />
<uses-feature android:name="android.hardware.camera.autofocus" />
A view for the camera isn't provided by Camera Kit, so we have to write our own view first:
Code:
public class OurTextureView extends TextureView {
private int mRatioWidth = 0;
private int mRatioHeight = 0;
public OurTextureView(Context context) {
this(context, null);
}
public OurTextureView(Context context, AttributeSet attrs) {
this(context, attrs, 0);
}
public OurTextureView(Context context, AttributeSet attrs, int defStyle) {
super(context, attrs, defStyle);
}
public void setAspectRatio(int width, int height) {
if ((width < 0) || (height < 0)) {
throw new IllegalArgumentException("Size cannot be negative.");
}
mRatioWidth = width;
mRatioHeight = height;
requestLayout();
}
@Override
protected void onMeasure(int widthMeasureSpec, int heightMeasureSpec) {
super.onMeasure(widthMeasureSpec, heightMeasureSpec);
int width = MeasureSpec.getSize(widthMeasureSpec);
int height = MeasureSpec.getSize(heightMeasureSpec);
if ((0 == mRatioWidth) || (0 == mRatioHeight)) {
setMeasuredDimension(width, height);
} else {
if (width < height * mRatioWidth / mRatioHeight) {
setMeasuredDimension(width, width * mRatioHeight / mRatioWidth);
} else {
setMeasuredDimension(height * mRatioWidth / mRatioHeight, height);
}
}
}
}
.xml part:
Code:
<com.huawei.camerakit.portrait.OurTextureView
android:id="@+id/texture"
android:layout_width="wrap_content"
android:layout_height="wrap_content"
android:layout_alignParentStart="true"
android:layout_alignParentTop="true" />
Let's look at our variables:
Code:
private Mode mMode;
private @Mode.Type int mCurrentModeType = Mode.Type.PORTRAIT_MODE;
private CameraKit mCameraKit;
Our permissions:
Code:
@Override
public void onRequestPermissionsResult(int requestCode, @NonNull String[] permissions,
@NonNull int[] grantResults) {
Log.d(TAG, "onRequestPermissionsResult: ");
if (!PermissionHelper.hasPermission(this)) {
Toast.makeText(this, "This application needs camera permission.", Toast.LENGTH_LONG).show();
finish();
}
}
First, let's check in our code whether Camera Kit is supported by the device:
Code:
private boolean initCameraKit() {
mCameraKit = CameraKit.getInstance(getApplicationContext());
if (mCameraKit == null) {
Log.e(TAG, "initCamerakit: this devices not support camerakit or not installed!");
return false;
}
return true;
}
The captureImage() method captures an image:
Code:
private void captureImage() {
Log.i(TAG, "captureImage begin");
if (mMode != null) {
mMode.setImageRotation(90);
// Default jpeg file path
mFile = new File(getExternalFilesDir(null), System.currentTimeMillis() + "pic.jpg");
// Take picture
mMode.takePicture();
}
Log.i(TAG, "captureImage end");
}
Callback method for our actionState:
Code:
private final ActionStateCallback actionStateCallback = new ActionStateCallback() {
@Override
public void onPreview(Mode mode, int state, PreviewResult result) {
}
@Override
public void onTakePicture(Mode mode, int state, TakePictureResult result) {
switch (state) {
case TakePictureResult.State.CAPTURE_STARTED:
Log.d(TAG, "onState: STATE_CAPTURE_STARTED");
break;
case TakePictureResult.State.CAPTURE_COMPLETED:
Log.d(TAG, "onState: STATE_CAPTURE_COMPLETED");
showToast("take picture success! file=" + mFile);
break;
default:
break;
}
}
};
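The sample above assumes that a Mode instance (mMode) already exists. It is created through the CameraKit instance; the sketch below follows the structure of the official Camera Kit sample, and mModeStateCallback and mCameraKitHandler are assumed to be defined elsewhere, as in that sample:
Code:
private void createMode() {
    // Obtain the IDs of cameras that support Camera Kit (index 0 is usually the rear camera).
    String[] cameraLists = mCameraKit.getCameraIdList();
    if (cameraLists != null && cameraLists.length > 0) {
        // Create the Portrait mode; mMode is delivered through the onCreated callback of mModeStateCallback.
        mCameraKit.createMode(cameraLists[0], mCurrentModeType, mModeStateCallback, mCameraKitHandler);
    }
}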
Now let us compare CameraX with Camera Kit
CameraX
Limited to already built-in functions
No Video capture
ML only exists on luminosity builds
Easy to use, lightweight, easy to implement
Any device running API level 21 or higher can use it.
Produces average but acceptable output quality
Gives you the mirrored image
Implementation requires only app level build.gradle integration
Has limited image adjusting while capturing
https://developer.android.com/training/camerax
Camera Kit
Lets you use the full capacity of the phone's native camera
Video capture exists with multiple modes
ML exists on both rear and front camera (face/smile detection, filter, and master AI)
Hard to implement. Implementation takes time
Requires the flagship Huawei device to operate
Has incredible quality outputs
The mirrored image can be adjusted easily.
SDK must be downloaded and handled by the developer
References:
https://developer.huawei.com/consumer/en/CameraKit
Will Camera Kit support Portrait mode?

How a Programmer Developed a Text Reader App for His 80-Year-Old Grandpa

"John, have you seen my glasses?"
Our old friend John, a programmer at Huawei, has a grandpa who, despite his old age, is an avid reader. Leaning back, struggling to make out what was written on the newspaper through his glasses, but unable to take his eyes off the text: this was how his grandpa used to read, John explained.
Reading this way was harmful to his grandpa's vision, and it occurred to John that the ears could take over the role of "reading" from the eyes. He soon developed a text-reading app that followed this logic, recognizing and then reading out text from a picture. Thanks to this app, John's grandpa can now "read" from the comfort of his rocking chair, without having to strain his eyes.
How to Implement
The user takes a picture of a text passage. The app then automatically identifies the location of the text within the picture, and adjusts the shooting angle to an angle directly facing the text.
The app recognizes and extracts the text from the picture.
The app converts the recognized text into audio output by leveraging text-to-speech technology.
These functions are easy to implement when relying on three services in HUAWEI ML Kit: document skew correction, text recognition, and text-to-speech (TTS).
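The text recognition step, for example, reuses the on-device MLTextAnalyzer flow shown in the first article of this thread; a condensed sketch:
Java:
// Recognize text from a bitmap of the skew-corrected document image.
MLTextAnalyzer analyzer = MLAnalyzerFactory.getInstance().getLocalTextAnalyzer();
MLFrame frame = MLFrame.fromBitmap(bitmap);
analyzer.asyncAnalyseFrame(frame).addOnSuccessListener(new OnSuccessListener<MLText>() {
    @Override
    public void onSuccess(MLText text) {
        // Pass text.getStringValue() to the TTS engine created below.
    }
}).addOnFailureListener(new OnFailureListener() {
    @Override
    public void onFailure(Exception e) {
        // Recognition failure.
    }
});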
Preparations
1. Configure the Huawei Maven repository address.
2. Add the build dependencies for the HMS Core SDK.
Code:
dependencies {
// Import the base SDK.
implementation 'com.huawei.hms:ml-computer-voice-tts:2.1.0.300'
// Import the bee voice package.
implementation 'com.huawei.hms:ml-computer-voice-tts-model-bee:2.1.0.300'
// Import the eagle voice package.
implementation 'com.huawei.hms:ml-computer-voice-tts-model-eagle:2.1.0.300'
// Import a PDF file analyzer.
implementation 'com.itextpdf:itextg:5.5.10'
}
Tap PREVIOUS or NEXT to turn to the previous or next page. Tap speak to start reading; tap it again to pause reading.
Development process
1. Create a TTS engine by using the custom configuration class MLTtsConfig. Here, on-device TTS is used as an example.
Java:
private void initTts() {
// Set authentication information for your app to download the model package from the server of Huawei.
MLApplication.getInstance().setApiKey(AGConnectServicesConfig.
fromContext(getApplicationContext()).getString("client/api_key"));
// Create a TTS engine by using MLTtsConfig.
mlTtsConfigs = new MLTtsConfig()
// Set the language of the text to be converted into speech to English.
.setLanguage(MLTtsConstants.TTS_EN_US)
// Set the speaker with the English male voice (eagle).
.setPerson(MLTtsConstants.TTS_SPEAKER_OFFLINE_EN_US_MALE_EAGLE)
// Set the speech speed whose range is (0, 5.0]. 1.0 indicates a normal speed.
.setSpeed(.8f)
// Set the volume whose range is (0, 2). 1.0 indicates a normal volume.
.setVolume(1.0f)
// Set the TTS mode to on-device.
.setSynthesizeMode(MLTtsConstants.TTS_OFFLINE_MODE);
mlTtsEngine = new MLTtsEngine(mlTtsConfigs);
// Update the configuration when the engine is running.
mlTtsEngine.updateConfig(mlTtsConfigs);
// Pass the TTS callback function to the TTS engine to perform TTS.
mlTtsEngine.setTtsCallback(callback);
// Create an on-device TTS model manager.
manager = MLLocalModelManager.getInstance();
isPlay = false;
}
2. Create a TTS callback function for processing the TTS result.
Java:
MLTtsCallback callback = new MLTtsCallback() {
@Override
public void onError(String taskId, MLTtsError err) {
// Processing logic for TTS failure.
}
@Override
public void onWarn(String taskId, MLTtsWarn warn) {
// Alarm handling without affecting service logic.
}
@Override
// Return the mapping between the currently played segment and text. start: start position of the audio segment in the input text; end (excluded): end position of the audio segment in the input text.
public void onRangeStart(String taskId, int start, int end) {
// Process the mapping between the currently played segment and text.
}
@Override
// taskId: ID of a TTS task corresponding to the audio.
// audioFragment: audio data.
// offset: offset of the audio segment to be transmitted in the queue. One TTS task corresponds to a TTS queue.
// range: text area where the audio segment to be transmitted is located; range.first (included): start position; range.second (excluded): end position.
public void onAudioAvailable(String taskId, MLTtsAudioFragment audioFragment, int offset,
Pair<Integer, Integer> range, Bundle bundle) {
// Audio stream callback API, which is used to return the synthesized audio data to the app.
}
@Override
public void onEvent(String taskId, int eventId, Bundle bundle) {
// Callback method of a TTS event. eventId indicates the event name.
boolean isInterrupted;
switch (eventId) {
case MLTtsConstants.EVENT_PLAY_START:
// Called when playback starts.
break;
case MLTtsConstants.EVENT_PLAY_STOP:
// Called when playback stops.
isInterrupted = bundle.getBoolean(MLTtsConstants.EVENT_PLAY_STOP_INTERRUPTED);
break;
case MLTtsConstants.EVENT_PLAY_RESUME:
// Called when playback resumes.
break;
case MLTtsConstants.EVENT_PLAY_PAUSE:
// Called when playback pauses.
break;
// Pay attention to the following callback events when you focus on only the synthesized audio data but do not use the internal player for playback.
case MLTtsConstants.EVENT_SYNTHESIS_START:
// Called when TTS starts.
break;
case MLTtsConstants.EVENT_SYNTHESIS_END:
// Called when TTS ends.
break;
case MLTtsConstants.EVENT_SYNTHESIS_COMPLETE:
// TTS is complete. All synthesized audio streams are passed to the app.
isInterrupted = bundle.getBoolean(MLTtsConstants.EVENT_SYNTHESIS_INTERRUPTED);
break;
default:
break;
}
}
};
3. Extract text from a PDF file.
Java:
private String loadText(String path) {
String result = "";
try {
PdfReader reader = new PdfReader(path);
result = result.concat(PdfTextExtractor.getTextFromPage(reader,
mCurrentPage.getIndex() + 1).trim() + System.lineSeparator());
reader.close();
} catch (IOException e) {
showToast(e.getMessage());
}
// Obtain the position of the header.
int header = result.indexOf(System.lineSeparator());
// Obtain the position of the footer.
int footer = result.lastIndexOf(System.lineSeparator());
if (footer != 0){
// Do not display the text in the header and footer.
return result.substring(header, footer - 5);
}else {
return result;
}
}
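The mParcelFileDescriptor, mPdfRenderer, and mCurrentPage fields used here and in onDestroy() come from Android's standard PdfRenderer API. A minimal sketch of how they might be initialized (the path handling is an assumption, not part of the original post):
Java:
private void openPdf(String path) throws IOException {
    // Open the PDF file and create a renderer for it.
    mParcelFileDescriptor = ParcelFileDescriptor.open(new File(path), ParcelFileDescriptor.MODE_READ_ONLY);
    mPdfRenderer = new PdfRenderer(mParcelFileDescriptor);
    // Open the first page; PREVIOUS/NEXT simply close the current page and open another index.
    mCurrentPage = mPdfRenderer.openPage(0);
}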
4. Perform TTS in on-device mode.
Java:
// Create an MLTtsLocalModel instance to set the speaker so that the language model corresponding to the speaker can be downloaded through the model manager.
MLTtsLocalModel model = new MLTtsLocalModel.Factory(MLTtsConstants.TTS_SPEAKER_OFFLINE_EN_US_MALE_EAGLE).create();
manager.isModelExist(model).addOnSuccessListener(new OnSuccessListener<Boolean>() {
@Override
public void onSuccess(Boolean aBoolean) {
// If the model is not downloaded, call the download API. Otherwise, call the TTS API of the on-device engine.
if (aBoolean) {
String source = loadText(mPdfPath);
// Call the speak API to perform TTS. source indicates the text to be synthesized.
mlTtsEngine.speak(source, MLTtsEngine.QUEUE_APPEND);
if (isPlay){
// Pause playback.
mlTtsEngine.pause();
tv_speak.setText("speak");
}else {
// Resume playback.
mlTtsEngine.resume();
tv_speak.setText("pause");
}
isPlay = !isPlay;
} else {
// Call the API for downloading the on-device TTS model.
downloadModel(MLTtsConstants.TTS_SPEAKER_OFFLINE_EN_US_MALE_EAGLE);
showToast("The offline model has not been downloaded!");
}
}
}).addOnFailureListener(new OnFailureListener() {
@Override
public void onFailure(Exception e) {
showToast(e.getMessage());
}
});
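The downloadModel() helper called above is not shown in the original post. The sketch below is one possible implementation based on the usual ML Kit model manager pattern; the MLModelDownloadStrategy and MLModelDownloadListener usage here is an assumption to verify against the current API reference:
Java:
private void downloadModel(String speaker) {
    MLTtsLocalModel model = new MLTtsLocalModel.Factory(speaker).create();
    MLModelDownloadStrategy strategy = new MLModelDownloadStrategy.Factory().create();
    manager.downloadModel(model, strategy, new MLModelDownloadListener() {
        @Override
        public void onProcess(long alreadyDownloadLength, long totalLength) {
            // Update the download progress on the UI if needed.
        }
    }).addOnSuccessListener(new OnSuccessListener<Void>() {
        @Override
        public void onSuccess(Void result) {
            showToast("Offline model downloaded.");
        }
    }).addOnFailureListener(new OnFailureListener() {
        @Override
        public void onFailure(Exception e) {
            showToast(e.getMessage());
        }
    });
}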
5. Release resources when the current UI is destroyed.
Java:
@Override
protected void onDestroy() {
super.onDestroy();
try {
if (mParcelFileDescriptor != null) {
mParcelFileDescriptor.close();
}
if (mCurrentPage != null) {
mCurrentPage.close();
}
if (mPdfRenderer != null) {
mPdfRenderer.close();
}
if (mlTtsEngine != null){
mlTtsEngine.shutdown();
}
} catch (IOException e) {
e.printStackTrace();
}
}
Other Applicable Scenarios
TTS can be used across a broad range of scenarios. For example, you could integrate it into an education app to read bedtime stories to children, or integrate it into a navigation app, which could read out instructions aloud.
For more details, you can go to:
Reddit to join our developer discussion
GitHub to download demos and sample codes
Stack Overflow to solve any integration problems
Original Source
Well explained. Will it support all languages?

How to Build a 3D Product Model Within Just 5 Minutes

Displaying products with 3D models is something too great for an e-commerce app to ignore. With such eye-catching visuals, an app can give users a fresh first impression of its products!
The 3D model plays an important role in boosting user conversion. It allows users to carefully view a product from every angle, before they make a purchase. Together with the AR technology, which gives users an insight into how the product will look in reality, the 3D model brings a fresher online shopping experience that can rival offline shopping.
Despite its advantages, the 3D model has yet to be widely adopted. The underlying reason for this is that applying current 3D modeling technology is expensive:
Technical requirements: Learning how to build a 3D model is time-consuming.
Time: It takes at least several hours to build a low polygon model for a simple object, and even longer for a high polygon one.
Spending: The average cost of building a simple model can be more than one hundred dollars, and even higher for building a complex one.
Luckily, 3D object reconstruction, a capability in 3D Modeling Kit newly launched in HMS Core, makes 3D model building straightforward. This capability automatically generates a 3D model with a texture for an object, via images shot from different angles with a common RGB-Cam. It gives an app the ability to build and preview 3D models. For instance, when an e-commerce app has integrated 3D object reconstruction, it can generate and display 3D models of shoes. Users can then freely zoom in and out on the models for a more immersive shopping experience.
Actual Effect
Technical Solutions
3D object reconstruction is implemented on both the device and cloud. RGB images of an object are collected on the device and then uploaded to the cloud. Key technologies involved in the on-cloud modeling process include object detection and segmentation, feature detection and matching, sparse/dense point cloud computing, and texture reconstruction. Finally, the cloud outputs an OBJ file (a commonly used 3D model file format) of the generated 3D model with 40,000 to 200,000 patches.
Preparations
1. Configuring a Dependency on the 3D Modeling SDK
Open the app-level build.gradle file and add a dependency on the 3D Modeling SDK in the dependencies block.
Code:
// Build a dependency on the 3D Modeling SDK.
implementation 'com.huawei.hms:modeling3d-object-reconstruct:1.0.0.300'
2. Configuring AndroidManifest.xml
Open the AndroidManifest.xml file in the main folder. Add the following information before <application> to apply for the storage read and write permissions and camera permission.
Code:
<!-- Permission to read data from and write data into storage. -->
<uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE" />
<uses-permission android:name="android.permission.READ_EXTERNAL_STORAGE" />
<!-- Permission to use the camera. -->
<uses-permission android:name="android.permission.CAMERA" />
Development Procedure
1. Configuring the Storage Permission Application
In the onCreate() method of MainActivity, check whether the storage read and write permissions have been granted; if not, apply for them by using requestPermissions.
Code:
if (EasyPermissions.hasPermissions(MainActivity.this, PERMISSIONS)) {
Log.i(TAG, "Permissions OK");
} else {
EasyPermissions.requestPermissions(MainActivity.this, "To use this app, you need to enable the permission.",
RC_CAMERA_AND_EXTERNAL_STORAGE, PERMISSIONS);
}
Check the application result. If the permissions are not granted, prompt the user to grant them.
Code:
@Override
public void onPermissionsGranted(int requestCode, @NonNull List<String> perms) {
Log.i(TAG, "permissions = " + perms);
if (requestCode == RC_CAMERA_AND_EXTERNAL_STORAGE && PERMISSIONS.length == perms.size()) {
initView();
initListener();
}
}
@Override
public void onPermissionsDenied(int requestCode, @NonNull List<String> perms) {
if (EasyPermissions.somePermissionPermanentlyDenied(this, perms)) {
new AppSettingsDialog.Builder(this)
.setRequestCode(RC_CAMERA_AND_EXTERNAL_STORAGE)
.setRationale("To use this app, you need to enable the permission.")
.setTitle("Insufficient permissions")
.build()
.show();
}
}
2. Creating a 3D Object Reconstruction Configurator
Code:
// Set the PICTURE mode.
Modeling3dReconstructSetting setting = new Modeling3dReconstructSetting.Factory()
.setReconstructMode(Modeling3dReconstructConstants.ReconstructMode.PICTURE)
.create();
3. Creating a 3D Object Reconstruction Engine and Initializing the Task
Call getInstance() of Modeling3dReconstructEngine and pass the current context to create an instance of the 3D object reconstruction engine.
Code:
// Create an engine.
modeling3dReconstructEngine = Modeling3dReconstructEngine.getInstance(mContext);
Use the engine to initialize the task.
Code:
// Initialize the 3D object reconstruction task.
modeling3dReconstructInitResult = modeling3dReconstructEngine.initTask(setting);
// Obtain the task ID.
String taskId = modeling3dReconstructInitResult.getTaskId();
4. Creating a Listener Callback to Process the Image Upload Result
Create a listener callback that allows you to configure the operations triggered upon upload success and failure.
Code:
// Create an upload listener callback.
private final Modeling3dReconstructUploadListener uploadListener = new Modeling3dReconstructUploadListener() {
@Override
public void onUploadProgress(String taskId, double progress, Object ext) {
// Upload progress.
}
@Override
public void onResult(String taskId, Modeling3dReconstructUploadResult result, Object ext) {
if (result.isComplete()) {
isUpload = true;
ScanActivity.this.runOnUiThread(new Runnable() {
@Override
public void run() {
progressCustomDialog.dismiss();
Toast.makeText(ScanActivity.this, getString(R.string.upload_text_success), Toast.LENGTH_SHORT).show();
}
});
TaskInfoAppDbUtils.updateTaskIdAndStatusByPath(new Constants(ScanActivity.this).getCaptureImageFile() + manager.getSurfaceViewCallback().getCreateTime(), taskId, 1);
}
}
@Override
public void onError(String taskId, int errorCode, String message) {
isUpload = false;
runOnUiThread(new Runnable() {
@Override
public void run() {
progressCustomDialog.dismiss();
Toast.makeText(ScanActivity.this, "Upload failed." + message, Toast.LENGTH_SHORT).show();
LogUtil.e("taskid" + taskId + "errorCode: " + errorCode + " errorMessage: " + message);
}
});
}
};
5. Passing the Upload Listener Callback to the Engine to Upload Images
Pass the upload listener callback to the engine. Call uploadFile(), pass the task ID obtained in step 3 and the path of the images to be uploaded, and upload the images to the cloud server.
Code:
// Pass the listener callback to the engine.
modeling3dReconstructEngine.setReconstructUploadListener(uploadListener);
// Start uploading.
modeling3dReconstructEngine.uploadFile(taskId, filePath);
6. Querying the Task Status
Call getInstance of Modeling3dReconstructTaskUtils to create a task processing instance. Pass the current context.
Code:
// Create a task processing instance.
modeling3dReconstructTaskUtils = Modeling3dReconstructTaskUtils.getInstance(Modeling3dDemo.getApp());
Call queryTask of the task processing instance to query the status of the 3D object reconstruction task.
Code:
// Query the task status, which can be: 0 (images to be uploaded); 1 (image upload completed); 2 (model being generated); 3 (model generation completed); 4 (model generation failed).
Modeling3dReconstructQueryResult queryResult = modeling3dReconstructTaskUtils.queryTask(task.getTaskId());
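A hedged sketch of acting on the query result; getStatus() returning the integer status listed in the comment above is an assumption to check against the API reference:
Code:
// Check the status returned by the query.
int status = queryResult.getStatus();
if (status == 3) {
    // Model generation completed: the model file can now be downloaded (see the next steps).
} else if (status == 4) {
    // Model generation failed: prompt the user to capture and upload the images again.
}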
7. Creating a Listener Callback to Process the Model File Download Result
Create a listener callback that allows you to configure the operations triggered upon download success and failure.
Code:
// Create a download listener callback.
private Modeling3dReconstructDownloadListener modeling3dReconstructDownloadListener = new Modeling3dReconstructDownloadListener() {
@Override
public void onDownloadProgress(String taskId, double progress, Object ext) {
((Activity) mContext).runOnUiThread(new Runnable() {
@Override
public void run() {
dialog.show();
}
});
}
@Override
public void onResult(String taskId, Modeling3dReconstructDownloadResult result, Object ext) {
((Activity) mContext).runOnUiThread(new Runnable() {
@Override
public void run() {
Toast.makeText(getContext(), "Download complete", Toast.LENGTH_SHORT).show();
TaskInfoAppDbUtils.updateDownloadByTaskId(taskId, 1);
dialog.dismiss();
}
});
}
@Override
public void onError(String taskId, int errorCode, String message) {
LogUtil.e(taskId + " <---> " + errorCode + message);
((Activity) mContext).runOnUiThread(new Runnable() {
@Override
public void run() {
Toast.makeText(getContext(), "Download failed." + message, Toast.LENGTH_SHORT).show();
dialog.dismiss();
}
});
}
};
8. Passing the Download Listener Callback to the Engine to Download the File of the Generated Model
Pass the download listener callback to the engine. Call downloadModel, pass the task ID obtained in step 3 and the path for saving the model file to download it.
Code:
// Pass the download listener callback to the engine.
modeling3dReconstructEngine.setReconstructDownloadListener(modeling3dReconstructDownloadListener);
// Download the model file.
modeling3dReconstructEngine.downloadModel(appDb.getTaskId(), appDb.getFileSavePath());
More Information
The object should have rich texture, be medium-sized, and be a rigid body. The object should not be reflective, transparent, or semi-transparent. The object types include goods (like plush toys, bags, and shoes), furniture (like sofas), and cultural relics (such as bronzes, stone artifacts, and wooden artifacts).
The object dimension should be within the range from 15 x 15 x 15 cm to 150 x 150 x 150 cm. (A larger dimension requires a longer time for modeling.)
3D object reconstruction does not support modeling for the human body and face.
Ensure the following requirements are met during image collection: Put a single object on a stable plane in pure color. The environment shall not be dark or dazzling. Keep all images in focus, free from blur caused by motion or shaking. Ensure images are taken from various angles including the bottom, flat, and top (it is advised that you upload more than 50 images for an object). Move the camera as slowly as possible. Do not change the angle during shooting. Lastly, ensure the object-to-image ratio is as big as possible, and all parts of the object are present.
These are all about the sample code of 3D object reconstruction. Try to integrate it into your app and build your own 3D models!
References
For more details, you can go to:
3D Modeling Kit official website
3D Modeling Kit Development Documentation page, to find the documents you need
Reddit to join our developer discussion
GitHub to download 3D Modeling Kit sample codes
Stack Overflow to solve any integration problems

3D Product Model: See How to Create One in 5 Minutes

Quick question: How do 3D models help e-commerce apps?
The most obvious answer is that it makes the shopping experience more immersive, and there are a whole host of other benefits they bring.
To begin with, a 3D model is a more impressive way of showcasing a product to potential customers. One way it does this is by displaying richer details (allowing potential customers to rotate the product and view it from every angle), to help customers make more informed purchasing decisions. Not only that, customers can virtually try on 3D products to recreate the experience of shopping in a physical store. In short, all these factors contribute to boosting user conversion.
As great as it is, the 3D model has not been widely adopted among those who want it. A major reason is that the cost of building a 3D model with existing advanced 3D modeling technology is very high, due to:
Technical requirements: Building a 3D model requires someone with expertise, which can take time to master.
Time: It takes at least several hours to build a low-polygon model for a simple object, not to mention a high-polygon one.
Spending: The average cost of building just a simple model can reach hundreds of dollars.
Fortunately for us, the 3D object reconstruction capability found in HMS Core 3D Modeling Kit makes 3D model creation easy-peasy. This capability automatically generates a texturized 3D model for an object, via images shot from multiple angles with a standard RGB camera on a phone. And what's more, the generated model can be previewed. Let's check out a shoe model created using the 3D object reconstruction capability.
Shoe Model Images
Technical Solutions
3D object reconstruction requires both the device and cloud. Images of an object are captured on a device, covering multiple angles of the object. And then the images are uploaded to the cloud for model creation. The on-cloud modeling process and key technologies include object detection and segmentation, feature detection and matching, sparse/dense point cloud computing, and texture reconstruction. Once the model is created, the cloud outputs an OBJ file (a commonly used 3D model file format) of the generated 3D model with 40,000 to 200,000 patches.
Now the boring part is out of the way. Let's move on to the exciting part: how to integrate the 3D object reconstruction capability.
Integrating the 3D Object Reconstruction Capability
Preparations
1. Configure the build dependency for the 3D Modeling SDK.
Add the build dependency for the 3D Modeling SDK in the dependencies block in the app-level build.gradle file.
Code:
// Build dependency for the 3D Modeling SDK.
implementation 'com.huawei.hms:modeling3d-object-reconstruct:1.0.0.300'
2. Configure AndroidManifest.xml.
Open the AndroidManifest.xml file in the main folder. Add the following information before <application> to apply for the storage read and write permissions and camera permission as needed:
Code:
<!-- Write into and read from external storage. -->
<uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE" />
<uses-permission android:name="android.permission.READ_EXTERNAL_STORAGE" />
<!-- Use the camera. -->
<uses-permission android:name="android.permission.CAMERA" />
Function Development
1. Configure the storage permission application.
In the onCreate() method of MainActivity, check whether the storage read and write permissions have been granted; if not, apply for them by using requestPermissions.
Code:
if (EasyPermissions.hasPermissions(MainActivity.this, PERMISSIONS)) {
Log.i(TAG, "Permissions OK");
} else {
EasyPermissions.requestPermissions(MainActivity.this, "To use this app, you need to enable the permission.",
RC_CAMERA_AND_EXTERNAL_STORAGE, PERMISSIONS);
}
Check the application result. If the permissions are granted, initialize the UI; if the permissions are not granted, prompt the user to grant them.
Code:
@Override
public void onPermissionsGranted(int requestCode, @NonNull List<String> perms) {
Log.i(TAG, "permissions = " + perms);
if (requestCode == RC_CAMERA_AND_EXTERNAL_STORAGE && PERMISSIONS.length == perms.size()) {
initView();
initListener();
}
}
@Override
public void onPermissionsDenied(int requestCode, @NonNull List<String> perms) {
if (EasyPermissions.somePermissionPermanentlyDenied(this, perms)) {
new AppSettingsDialog.Builder(this)
.setRequestCode(RC_CAMERA_AND_EXTERNAL_STORAGE)
.setRationale("To use this app, you need to enable the permission.")
.setTitle("Insufficient permissions")
.build()
.show();
}
}
2. Create a 3D object reconstruction configurator.
Code:
// PICTURE mode.
Modeling3dReconstructSetting setting = new Modeling3dReconstructSetting.Factory()
.setReconstructMode(Modeling3dReconstructConstants.ReconstructMode.PICTURE)
.create();
3. Create a 3D object reconstruction engine and initialize the task.
Call getInstance() of Modeling3dReconstructEngine and pass the current context to create an instance of the 3D object reconstruction engine.
Code:
// Initialize the engine.
modeling3dReconstructEngine = Modeling3dReconstructEngine.getInstance(mContext);
Use the engine to initialize the task.
Code:
// Create a 3D object reconstruction task.
modeling3dReconstructInitResult = modeling3dReconstructEngine.initTask(setting);
// Obtain the task ID.
String taskId = modeling3dReconstructInitResult.getTaskId();
4. Create a listener callback to process the image upload result.
Create a listener callback in which you can configure the operations triggered upon upload success and failure.
Code:
// Create a listener callback for the image upload task.
private final Modeling3dReconstructUploadListener uploadListener = new Modeling3dReconstructUploadListener() {
@Override
public void onUploadProgress(String taskId, double progress, Object ext) {
// Upload progress
}
@Override
public void onResult(String taskId, Modeling3dReconstructUploadResult result, Object ext) {
if (result.isComplete()) {
isUpload = true;
ScanActivity.this.runOnUiThread(new Runnable() {
@Override
public void run() {
progressCustomDialog.dismiss();
Toast.makeText(ScanActivity.this, getString(R.string.upload_text_success), Toast.LENGTH_SHORT).show();
}
});
TaskInfoAppDbUtils.updateTaskIdAndStatusByPath(new Constants(ScanActivity.this).getCaptureImageFile() + manager.getSurfaceViewCallback().getCreateTime(), taskId, 1);
}
}
@Override
public void onError(String taskId, int errorCode, String message) {
isUpload = false;
runOnUiThread(new Runnable() {
@Override
public void run() {
progressCustomDialog.dismiss();
Toast.makeText(ScanActivity.this, "Upload failed." + message, Toast.LENGTH_SHORT).show();
LogUtil.e("taskid" + taskId + "errorCode: " + errorCode + " errorMessage: " + message);
}
});
}
};
5. Set the image upload listener for the 3D object reconstruction engine and upload the captured images.
Pass the upload callback to the engine. Call uploadFile(), pass the task ID obtained in step 3 and the path of the images to be uploaded, and upload the images to the cloud server.
Code:
// Set the upload listener.
modeling3dReconstructEngine.setReconstructUploadListener(uploadListener);
// Upload captured images.
modeling3dReconstructEngine.uploadFile(taskId, filePath);
6. Query the task status.
Call getInstance of Modeling3dReconstructTaskUtils to create a task processing instance. Pass the current context.
Code:
// Initialize the task processing class.
modeling3dReconstructTaskUtils = Modeling3dReconstructTaskUtils.getInstance(Modeling3dDemo.getApp());
Call queryTask to query the status of the 3D object reconstruction task.
Code:
// Query the reconstruction task status: 0 (images to be uploaded); 1 (image upload completed); 2 (model being generated); 3 (model generation completed); 4 (model generation failed).
Modeling3dReconstructQueryResult queryResult = modeling3dReconstructTaskUtils.queryTask(task.getTaskId());
7. Create a listener callback to process the model file download result.
Create a listener callback in which you can configure the operations triggered upon download success and failure.
Code:
// Create a download callback listener
private Modeling3dReconstructDownloadListener modeling3dReconstructDownloadListener = new Modeling3dReconstructDownloadListener() {
@Override
public void onDownloadProgress(String taskId, double progress, Object ext) {
((Activity) mContext).runOnUiThread(new Runnable() {
@Override
public void run() {
dialog.show();
}
});
}
@Override
public void onResult(String taskId, Modeling3dReconstructDownloadResult result, Object ext) {
((Activity) mContext).runOnUiThread(new Runnable() {
@Override
public void run() {
Toast.makeText(getContext(), "Download complete", Toast.LENGTH_SHORT).show();
TaskInfoAppDbUtils.updateDownloadByTaskId(taskId, 1);
dialog.dismiss();
}
});
}
@Override
public void onError(String taskId, int errorCode, String message) {
LogUtil.e(taskId + " <---> " + errorCode + message);
((Activity) mContext).runOnUiThread(new Runnable() {
@Override
public void run() {
Toast.makeText(getContext(), "Download failed." + message, Toast.LENGTH_SHORT).show();
dialog.dismiss();
}
});
}
};
8. Pass the download listener callback to the engine to download the generated model file.
Pass the download listener callback to the engine. Call downloadModel. Pass the task ID obtained in step 3 and the path for saving the model file to download it.
Code:
// Set the listener for the model file download task.
modeling3dReconstructEngine.setReconstructDownloadListener(modeling3dReconstructDownloadListener);
// Download the model file.
modeling3dReconstructEngine.downloadModel(appDb.getTaskId(), appDb.getFileSavePath());
Notes
1. To deliver an ideal modeling result, 3D object reconstruction has some requirements on the object to be modeled. For example, the object should have rich textures and a fixed shape. The object is expected to be non-reflective and medium-sized. Transparency or semi-transparency is not recommended. An object that meets these requirements may fall into one of the following types: goods (including plush toys, bags, and shoes), furniture (like sofas), and cultural relics (like bronzes, stone artifacts, and wooden artifacts).
2. The object dimensions should be within the range of 15 x 15 x 15 cm to 150 x 150 x 150 cm. (Larger dimensions require a longer modeling time.)
3. Modeling for the human body or face is not yet supported by the capability.
4. Suggestions for image capture: Put a single object on a stable plane in pure color. The environment should be well lit and plain. Keep all images in focus, free from blur caused by motion or shaking, and take pictures of the object from various angles including the bottom, face, and top. Uploading more than 50 images for an object is recommended. Move the camera as slowly as possible, and do not suddenly alter the angle when taking pictures. The object-to-image ratio should be as big as possible, and no part of the object should be missing.
With all these in mind, as well as the development procedure of the capability, now we are ready to create a 3D model like the shoe model above. Looking forward to seeing your own models created using this capability in the comments section below.
Reference
Home page of 3D Modeling Kit
Service introduction to 3D Modeling Kit
Detailed information about 3D object reconstruction
