Machine Learning use case overview
In this tutorial we explain and show how to apply Machine Learning techniques, such as Speech-to-Text and Natural Language Processing, to extract actionable information from videos or audio recordings.
We also show how to automate the workflow using AWS services.
AWS services involved
As a scenario, suppose we need to write reports by converting audio recordings to text (Speech-to-Text) and then applying Natural Language Processing to the result (for example sentiment analysis and automatic language detection).
To achieve this, we use the following AWS services:
- AWS Lambda: runs code without provisioning or managing servers. With Lambda, we can run code for virtually any type of application or backend service, with no administration.
- AWS Step Functions: coordinates multiple AWS services into serverless workflows so we can build and update apps quickly.
- Amazon Transcribe: uses a deep learning process called automatic speech recognition (ASR) to convert speech to text quickly and accurately. It makes it easy for developers to add speech-to-text capability to their applications.
- Amazon Comprehend: a natural language processing (NLP) service that uses machine learning to find insights and relationships in text.
Scenario: high-level technical description
In this use case, AWS Step Functions provides what we need to automate the different steps of the workflow. Once a video or audio recording has been stored in S3, a Step Functions workflow is triggered to automate and coordinate the whole process.
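The trigger described above could be sketched as follows. This is only one possible wiring, assuming a small trigger Lambda subscribed to S3 upload notifications; the source does not say how the workflow is started. The function names and the STATE_MACHINE_ARN environment variable are hypothetical.

```python
import json
import os
from urllib.parse import unquote_plus


def build_execution_input(s3_event):
    """Extract the bucket and key of the first uploaded object from an S3 event."""
    record = s3_event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    # Object keys arrive URL-encoded in S3 notifications (spaces become '+').
    key = unquote_plus(record["s3"]["object"]["key"])
    return {"bucket": bucket, "key": key, "media_uri": f"s3://{bucket}/{key}"}


def lambda_handler(event, context):
    """Hypothetical trigger: start one workflow execution per uploaded file."""
    import boto3  # imported lazily so the helper above can be tested offline

    sfn = boto3.client("stepfunctions")
    response = sfn.start_execution(
        stateMachineArn=os.environ["STATE_MACHINE_ARN"],  # assumed env variable
        input=json.dumps(build_execution_input(event)),
    )
    return {"executionArn": response["executionArn"]}
```

Keeping the event parsing in its own function means it can be checked locally with a hand-written event, without any AWS credentials.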
First, an AWS Lambda function is started to perform Speech-to-Text on the video or recording by calling Amazon Transcribe. Once the transcription job has finished, Natural Language Processing (NLP) is applied to the generated transcript to detect sentiment, using Amazon Comprehend.
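A minimal sketch of these two steps with boto3 might look like the code below. The helper names are hypothetical, and the 5,000-byte cut before sentiment detection is an assumption about the Comprehend input limit that should be checked against the current service quotas. The `comprehend` argument is expected to behave like `boto3.client("comprehend")`.

```python
import re


def build_transcription_request(bucket, key, output_bucket):
    """Build keyword arguments for Transcribe's start_transcription_job."""
    # Job names only allow letters, digits, '.', '_' and '-', so sanitise the key.
    # Job names must also be unique per account; a real system may add a suffix.
    job_name = re.sub(r"[^0-9a-zA-Z._-]", "-", key)[:200]
    return {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": f"s3://{bucket}/{key}"},
        "IdentifyLanguage": True,  # let the service detect the spoken language
        "OutputBucketName": output_bucket,
    }


def extract_transcript(transcribe_output):
    """Return the full transcript text from a Transcribe result document."""
    return transcribe_output["results"]["transcripts"][0]["transcript"]


def truncate_utf8(text, max_bytes=5000):
    """Cut text to at most max_bytes of UTF-8 without splitting a character."""
    return text.encode("utf-8")[:max_bytes].decode("utf-8", errors="ignore")


def analyse_transcript(text, comprehend):
    """Detect the dominant language, then the sentiment, of a transcript."""
    snippet = truncate_utf8(text)  # assumed size limit, see the note above
    languages = comprehend.detect_dominant_language(Text=snippet)["Languages"]
    language_code = max(languages, key=lambda lang: lang["Score"])["LanguageCode"]
    # Sentiment detection supports only some languages; unsupported codes
    # will raise an error that a real workflow should handle.
    sentiment = comprehend.detect_sentiment(Text=snippet, LanguageCode=language_code)
    return {
        "language": language_code,
        "sentiment": sentiment["Sentiment"],
        "scores": sentiment["SentimentScore"],
    }
```

The request built by `build_transcription_request` would be passed as `transcribe.start_transcription_job(**request)`; the transcript JSON that Transcribe writes to the output bucket is what `extract_transcript` expects.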
Feedback on set-up and deployment
Thanks to the AWS ML and serverless compute services, deployment was easy and straightforward. We only had to focus on the application side, because operations and the underlying infrastructure are fully managed by the services themselves: zero administration. Resources are consumed on demand, only when they are needed, which is a real advantage for cost optimisation.
In addition, we used AWS Step Functions to orchestrate, track and log the state of each step in the workflow.
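To illustrate the orchestration, here is a hedged sketch of what such a workflow definition could look like in the Amazon States Language, built as a Python dictionary. The state names, the three Lambda ARNs and the `$.status` field are assumptions made for the example, not the definition used in this project.

```python
import json


def build_state_machine(start_arn, status_arn, sentiment_arn, wait_seconds=30):
    """Return an Amazon States Language definition as a Python dict."""
    return {
        "Comment": "Speech-to-text followed by sentiment analysis (sketch)",
        "StartAt": "StartTranscription",
        "States": {
            "StartTranscription": {"Type": "Task", "Resource": start_arn, "Next": "WaitForJob"},
            # Transcription is asynchronous, so wait and then poll its status.
            "WaitForJob": {"Type": "Wait", "Seconds": wait_seconds, "Next": "GetJobStatus"},
            "GetJobStatus": {"Type": "Task", "Resource": status_arn, "Next": "IsJobDone"},
            "IsJobDone": {
                "Type": "Choice",
                "Choices": [
                    {"Variable": "$.status", "StringEquals": "COMPLETED", "Next": "DetectSentiment"},
                    {"Variable": "$.status", "StringEquals": "FAILED", "Next": "JobFailed"},
                ],
                "Default": "WaitForJob",  # still queued or in progress
            },
            "DetectSentiment": {"Type": "Task", "Resource": sentiment_arn, "End": True},
            "JobFailed": {"Type": "Fail", "Error": "TranscriptionFailed"},
        },
    }


def undefined_targets(definition):
    """Return transition targets that do not name a defined state."""
    states = definition["States"]
    targets = {definition["StartAt"]}
    for state in states.values():
        targets.update(state[k] for k in ("Next", "Default") if k in state)
        targets.update(choice["Next"] for choice in state.get("Choices", []))
    return targets - set(states)
```

The dictionary can be serialised with `json.dumps` and supplied as the `definition` when creating the state machine. The small `undefined_targets` check catches misspelled state names before deployment.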
As mentioned above, we used managed ML services: Amazon Transcribe and Amazon Comprehend.
In specialised domains, the results of these services may be less accurate than expected. In that case they can often be tuned: for instance, you can build a custom vocabulary to improve ASR accuracy with Amazon Transcribe.
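As an illustration of the custom vocabulary idea, the sketch below prepares the arguments for Transcribe's `create_vocabulary` call and attaches the vocabulary to a transcription job. The helper names and the sample phrases are hypothetical, and the job is assumed to use a fixed `LanguageCode`, since a custom vocabulary is tied to one language.

```python
def build_vocabulary_request(name, language_code, phrases):
    """Build keyword arguments for Transcribe's create_vocabulary."""
    # In the phrase list format, multi-word terms are written with hyphens.
    cleaned = ["-".join(p.split()) for p in phrases if p.strip()]
    # Drop duplicates while keeping the original order.
    unique = list(dict.fromkeys(cleaned))
    return {"VocabularyName": name, "LanguageCode": language_code, "Phrases": unique}


def with_custom_vocabulary(job_request, vocabulary_name):
    """Return a copy of start_transcription_job arguments using the vocabulary."""
    settings = dict(job_request.get("Settings", {}), VocabularyName=vocabulary_name)
    return {**job_request, "Settings": settings}


# Example with made-up domain terms:
vocab = build_vocabulary_request(
    "medical-terms", "en-US", ["myocardial infarction", "stent"]
)
job = with_custom_vocabulary(
    {"TranscriptionJobName": "report-001", "LanguageCode": "en-US"},
    vocab["VocabularyName"],
)
```

With a boto3 Transcribe client, `vocab` would be passed as `create_vocabulary(**vocab)`. The vocabulary has to finish processing before a job can reference it.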
Low-effort and cost-effective deployment with AWS
Developing machine learning models is normally quite time consuming. It is worth noting that the design and implementation of this use case took only two days.
Combining these AWS services helped us greatly reduce the time and workload of deployment by relying on prebuilt AWS machine learning algorithms, and reduce compute resources by relying on AWS serverless services.