Implementing Real-Time Transcription in an Easy Way
Real-time onscreen subtitles are a must-have feature in video apps. However, building this feature can be costly for small and medium-sized developers, and even once it is implemented, speech recognition is often inaccurate. Fortunately, there’s a better way: HUAWEI ML Kit, which is easy to integrate and makes real-time transcription a breeze!
Introduction to ML Kit
ML Kit lets your app draw on Huawei’s long experience in machine learning to apply artificial intelligence (AI) across a wide range of contexts. It provides a broad set of easy-to-use machine learning capabilities that serve as building blocks for AI apps. ML Kit capabilities include those related to:
- Text (including text recognition, document recognition, and ID card recognition)
- Language/Voice (such as real-time/on-device translation, automatic speech recognition, and real-time transcription)
- Image (such as image classification, object detection and tracking, and landmark recognition)
- Face/Body (such as face detection, skeleton detection, liveness detection, and face verification)
- Natural language processing (text embedding)
- Custom model (including the on-device inference framework and model development tool)
Real-time transcription is the capability needed to implement the subtitle function described above.
Now let’s move on to how to integrate this service.
Integrating Real-Time Transcription
1. Registering as a Huawei developer on HUAWEI Developers
2. Creating an app
We’ve provided some screenshots for your reference:
3. Enabling ML Kit
4. Integrating the HMS Core SDK
Add the AppGallery Connect configuration file by completing the steps below:
Download the agconnect-services.json file and copy it to the app directory of your Android Studio project.
Call setApiKey during app initialization.
To learn more, go to Adding the AppGallery Connect Configuration File.
Add build dependencies.
Import the real-time transcription SDK.
Add the AppGallery Connect plugin configuration.
Method 1: Add the following information under the declaration in the file header:
apply plugin: 'com.huawei.agconnect'
Method 2: Add the plugin configuration in the plugins block.
Please refer to Integrating the Real-Time Transcription SDK to learn more.
Setting the cloud authentication information
When using on-cloud services of ML Kit, you can set the API key or access token (recommended) in either of the following ways:
You can use the following API to initialize the access token when the app is started. The access token does not need to be set again once initialized.
MLApplication.getInstance().setAccessToken("your access token");
You can use the following API to initialize the API key when the app is started. The API key does not need to be set again once initialized.
MLApplication.getInstance().setApiKey("your API key");
For details, see Notes on Using Cloud Authentication Information.
Create and configure a speech recognizer.
Create a speech recognition result listener callback.
The recognition result can be obtained from the listener callbacks, including onRecognizingResults. Design the UI content according to the obtained results. For example, display the text transcribed from the input speech.
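When displaying results as subtitles, it usually helps to keep text that is still being revised separate from text that is already settled. The sketch below is one way to do that in plain Java. It assumes each callback can be reduced to a text string plus a flag saying whether the segment is final. The article does not show the callback's payload format, so `SubtitleBuffer`, `onResult`, and `render` are hypothetical names for illustration only and are not part of the ML Kit SDK.

```java
/**
 * Keeps confirmed text and the current interim hypothesis separate,
 * so the subtitle line can be redrawn on every callback.
 * Hypothetical helper; not part of the ML Kit SDK.
 */
final class SubtitleBuffer {
    private final StringBuilder confirmed = new StringBuilder();
    private String interim = "";

    /**
     * @param text    text delivered by a recognition callback (assumed shape)
     * @param isFinal true if this segment will no longer be revised
     */
    void onResult(String text, boolean isFinal) {
        if (text == null) {
            return; // nothing to show for this callback
        }
        if (isFinal) {
            if (confirmed.length() > 0 && !text.isEmpty()) {
                confirmed.append(' ');
            }
            confirmed.append(text);
            interim = "";
        } else {
            // Replace rather than append: interim text is revised in place.
            interim = text;
        }
    }

    /** Text to put on screen: confirmed segments followed by the interim one. */
    String render() {
        if (interim.isEmpty()) {
            return confirmed.toString();
        }
        return confirmed.length() == 0 ? interim : confirmed + " " + interim;
    }
}
```

In an app, you would call `onResult` from inside the listener callback and pass the return value of `render()` to the text view on the UI thread.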
Bind the speech recognizer.
Call startRecognizing to start speech recognition.
Release resources after recognition is complete.
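Binding, starting, and releasing depend on their order: it is generally unsafe to start recognition on a recognizer that has already been released, and releasing twice should do no harm. Here is a minimal sketch of a guard for that ordering. The `Recognizer` interface is a hypothetical stand-in with placeholder method names, not the ML Kit class, and `TranscriptionSession` is likewise an illustrative name.

```java
/** Hypothetical stand-in for a speech recognizer; NOT the ML Kit class. */
interface Recognizer {
    void start();
    void destroy();
}

/** Wraps a recognizer so that it is released at most once and never restarted afterwards. */
final class TranscriptionSession implements AutoCloseable {
    private final Recognizer recognizer;
    private boolean released = false;

    TranscriptionSession(Recognizer recognizer) {
        this.recognizer = recognizer;
    }

    /** Starts recognition, refusing to do so once resources are released. */
    void start() {
        if (released) {
            throw new IllegalStateException("recognizer already released");
        }
        recognizer.start();
    }

    /** Releases the recognizer; repeated calls are ignored. */
    @Override
    public void close() {
        if (!released) {
            released = true;
            recognizer.destroy();
        }
    }
}
```

In an Android activity, a natural place to call `close()` would be `onDestroy`, so the recognizer does not outlive the screen that uses it.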
(Optional) Obtain the list of supported languages.
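Once you have the list of supported languages, you will probably want to pick the one closest to the user's device locale. The following sketch assumes the list contains language tags such as "en-US" or "zh-CN". That format is our assumption, as the article does not show it, and `LanguagePicker` is a hypothetical helper rather than an SDK class.

```java
import java.util.List;
import java.util.Locale;

/** Hypothetical helper for choosing a transcription language; not part of the ML Kit SDK. */
final class LanguagePicker {
    private LanguagePicker() {}

    /**
     * Returns the supported tag that best matches the locale:
     * an exact tag match first, then the same language with any region,
     * and otherwise the given fallback.
     */
    static String pick(List<String> supported, Locale locale, String fallback) {
        String wanted = locale.toLanguageTag(); // e.g. "en-US"
        for (String tag : supported) {
            if (tag.equalsIgnoreCase(wanted)) {
                return tag;
            }
        }
        String language = locale.getLanguage(); // e.g. "en"
        for (String tag : supported) {
            if (Locale.forLanguageTag(tag).getLanguage().equals(language)) {
                return tag;
            }
        }
        return fallback;
    }
}
```

For example, with a supported list of "zh-CN", "en-US", and "fr-FR", a device set to British English would fall through the exact match and land on "en-US" through the language-only match.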
We’ve finished integration here, so let’s test it out on a simple screen.
Tap START RECORDING. The text recognized from the input speech is displayed in the lower portion of the screen.
We’ve now built a simple audio transcription function.
Eager to build a fancier UI with stunning animations and other effects? By all means, take your shot!
To learn more, please visit:
>> Reddit to join developer discussions
>> Stack Overflow to solve integration problems