Audio recognition using Tensorflow Lite in Flutter applications

Carolina Albuquerque
Feb 22, 2021 · 7 min read

Mobile application development has been enhanced by Machine Learning (ML) and Artificial Intelligence (AI). Integrating ML models that classify or predict events allows the creation of applications able to understand and recognise the user's behaviour, making their experience more intelligent and interesting. An easy-to-understand example is the suggestion of the next words while texting, based on the content previously typed. This article analyses the integration of an ML model into a mobile Flutter application for audio recognition.

Creating a classification model

First of all, it is necessary to have a trained classification model. If you are a beginner with ML concepts, Google Teachable Machine is a fast and easy way to create one. This framework supports three project types: Image, Audio and Pose. This example uses a Google Teachable Machine (GTM) model and, as the main goal is the integration of an audio classification model, the Audio Project must be chosen.

Google Teachable Machine framework

The first step to create the model is to arrange data into different classes. It is possible to record samples on the spot or upload files with them. Each class requires a minimum number of audio samples. Background Noise is a default class that must contain samples of background noise. In this example, Bell, Whistle and Xylophone classes were created with audio samples of these sounds.

Creation of a classification model using Google Teachable Machine
Exporting trained model as Tensorflow Lite model

After the data are separated into classes, the model has to be trained and exported to the Tensorflow Lite format. The downloaded model contains two files: labels.txt (specifies the classes' labels) and soundclassifier.tflite (the model itself). These files will be added to the mobile application so that Tensorflow Lite can read them and load the model.
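
For the classes used in this example, the exported labels.txt should look something like this (the exact indices depend on the order in which the classes were created):

0 Background Noise
1 Bell
2 Whistle
3 Xylophone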

Note: If you are comfortable with ML concepts, I recommend developing and training your own model, because using a GTM model in your application can increase its size. The Tensorflow documentation has a guide that helps to create models.

Creating a flutter application

A simple application was created containing Text Widgets and a MaterialButton that will be used to start recording audio.

Here is the base main.dart code:

import 'package:flutter/material.dart';
import 'package:tflite_audio/tflite_audio.dart';

void main() {
  runApp(MyApp());
}

class MyApp extends StatelessWidget {
  // This widget is the root of your application.
  @override
  Widget build(BuildContext context) {
    return MaterialApp(
      title: 'Flutter Demo',
      theme: ThemeData.dark(),
      home: MyHomePage(),
    );
  }
}

class MyHomePage extends StatefulWidget {
  @override
  _MyHomePageState createState() => _MyHomePageState();
}

class _MyHomePageState extends State<MyHomePage> {
  String _sound = "Press the button to start";
  bool _recording = false;

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      body: Container(
        decoration: BoxDecoration(
          image: DecorationImage(
            image: AssetImage("assets/background.jpg"),
            fit: BoxFit.cover,
          ),
        ),
        child: Center(
          child: Column(
            mainAxisAlignment: MainAxisAlignment.spaceEvenly,
            children: <Widget>[
              Container(
                padding: EdgeInsets.all(20),
                child: Text(
                  "What's this sound?",
                  style: TextStyle(
                    color: Colors.white,
                    fontSize: 60,
                    fontWeight: FontWeight.w200,
                  ),
                ),
              ),
              MaterialButton(
                onPressed: () {},
                color: _recording ? Colors.grey : Colors.pink,
                textColor: Colors.white,
                child: Icon(Icons.mic, size: 60),
                shape: CircleBorder(),
                padding: EdgeInsets.all(25),
              ),
              Text(
                '$_sound',
                style: Theme.of(context).textTheme.headline5,
              ),
            ],
          ),
        ),
      ),
    );
  }
}

Integrating the model into application

The Tensorflow Lite framework is what allows Tensorflow models to run on mobile devices. To access it from Flutter, a plugin is needed: tflite. However, the tflite plugin does not support audio analysis yet. Therefore, a similar and more recent plugin for audio processing will be used instead: tflite_audio. Although this plugin supports GTM models and models with decoded wave inputs, it still has some constraints described in its documentation, namely that it only works when running on physical mobile devices. This subject will be discussed in more detail in the Conclusions.

First, add the following dependency to your pubspec.yaml file:

dependencies:
  tflite_audio: ^0.1.5+3

To install the package run:

flutter pub get

Create a new /assets directory, add the soundclassifier.tflite and labels.txt files downloaded from the Google Teachable Machine framework, and change the configuration of pubspec.yaml to include the /assets folder in your project.

flutter:
  # To add assets to your application, add an assets section:
  assets:
    - assets/labels.txt
    - assets/soundclassifier.tflite

After that, it is necessary to configure the Android and iOS environments; in addition, the use of a GTM model requires enabling the TensorFlow Select operators.

Android configurations

1. Add the following permissions to AndroidManifest.xml:
<uses-permission android:name="android.permission.RECORD_AUDIO" />
<uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE" />

2. Add aaptOptions to app/build.gradle file:

android {
    (...)
    aaptOptions {
        noCompress 'tflite'
    }
    lintOptions {
        disable 'InvalidPackage'
    }
    (...)
}

3. Enable select-ops in app/build.gradle file:

dependencies {
    implementation "org.jetbrains.kotlin:kotlin-stdlib-jdk7:$kotlin_version"
    implementation "org.tensorflow:tensorflow-lite-select-tf-ops:+"
}

4. Change the minSdkVersion to at least 21 in app/build.gradle file.
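
For reference, the minSdkVersion lives inside the defaultConfig block of app/build.gradle (other defaultConfig entries are omitted here):

android {
    defaultConfig {
        // tflite_audio needs Android 5.0 (API 21) or higher
        minSdkVersion 21
    }
}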

iOS configurations

Add the permission below to ios/Runner/Info.plist:

<key>NSMicrophoneUsageDescription</key>
<string>Record audio for playback</string>

It is also necessary to change the deployment target to at least 12.0. To do so, open ios/Runner.xcworkspace in Xcode, select Runner and change the iOS deployment target in the Info tab:

Changing iOS deployment target to 12.0

Now, you need to specify the same iOS deployment target in ios/Runner/Podfile:

platform :ios, '12.0'

As the model is a GTM model, add the pod 'TensorFlowLiteSelectTfOps' line to the Runner target in ios/Runner/Podfile:

target 'Runner' do
  use_frameworks!
  use_modular_headers!
  pod 'TensorFlowLiteSelectTfOps'
  flutter_install_all_ios_pods File.dirname(File.realpath(__FILE__))
end

After that, with the project opened in Xcode, select Targets > Runner > Build Settings > All and add the following line to Linking > Other Linker Flags:

-force_load $(SRCROOT)/Pods/TensorFlowLiteSelectTfOps/Frameworks/TensorFlowLiteSelectTfOps.framework/TensorFlowLiteSelectTfOps
Adding the force load command

Finally, inside the /ios directory, run the following commands in a terminal:

$ flutter pub get
$ pod install
$ flutter clean

Make sure that your CocoaPods installation is up to date. During pod install, it is possible that a warning like the following is displayed:

[!] CocoaPods did not set the base configuration of your project because your project already has 
a custom config set. In order for CocoaPods integration to work at all, please either set the base
configurations of the target `Runner` to `Target Support Files/Pods-Runner/Pods-Runner.profile.xcconfig`
or include the `Target Support Files/Pods-Runner/Pods-Runner.profile.xcconfig` in your build configuration
(`Flutter/Release.xcconfig`).

If you get this warning after the pod install command, add the following line to your ios/Flutter/Release.xcconfig file and run pod install again:

#include "Pods/Target Support Files/Pods-Runner/Pods-Runner.profile.xcconfig"

Load and use the model

In your initState() method, you must load the model:

TfliteAudio.loadModel(
    model: 'assets/soundclassifier.tflite',
    label: 'assets/labels.txt',
    numThreads: 1,
    isAsset: true);
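
For completeness, a minimal sketch of what the initState() override in _MyHomePageState could look like when making this call:

@override
void initState() {
  super.initState();
  // Load the TFLite model and its labels from the assets folder
  TfliteAudio.loadModel(
      model: 'assets/soundclassifier.tflite',
      label: 'assets/labels.txt',
      numThreads: 1,
      isAsset: true);
}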

Now, I suggest implementing a new method to record audio and analyse it. The startAudioRecognition() method of the tflite_audio package takes the following arguments:

  • numOfInferences — number of times the inference will be repeated
  • inputType — type of the input audio
  • sampleRate — number of audio samples per second
  • recordingLength — determines the size of the input tensor
  • bufferSize — size of the buffer used to read the incoming audio signal

As the classification model is a GTM model, you must set the inputType of the audio recognition to ‘rawAudio’. If you are using your own model (a decodedwav model) instead of a GTM model, the inputType must be ‘decodedWav’. For sampleRate, the values recommended in the package documentation are 16000, 22050 or 44100, and the highest one will be used to improve accuracy. The recordingLength value must match the size of the model's input tensor, and bufferSize should be half the value of recordingLength.

void _recorder() {
  String recognition = "";
  if (!_recording) {
    setState(() => _recording = true);
    result = TfliteAudio.startAudioRecognition(
      numOfInferences: 1,
      inputType: 'rawAudio',
      sampleRate: 44100,
      recordingLength: 44032,
      bufferSize: 22016,
    );
    result.listen((event) {
      recognition = event["recognitionResult"];
    }).onDone(() {
      setState(() {
        _recording = false;
        _sound = recognition.split(" ")[1];
      });
    });
  }
}

Also, the result stream variable must be declared:

class _MyHomePageState extends State<MyHomePage> {
  String _sound = "Press the button to start";
  bool _recording = false;
  Stream<Map<dynamic, dynamic>> result;
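
To actually trigger the recognition, the MaterialButton from the base code just needs to call the new method in its onPressed callback:

MaterialButton(
  onPressed: _recorder, // starts a new recognition when tapped
  color: _recording ? Colors.grey : Colors.pink,
  textColor: Colors.white,
  child: Icon(Icons.mic, size: 60),
  shape: CircleBorder(),
  padding: EdgeInsets.all(25),
),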

If you want to stop audio recognition while executing, you can use this method:

TfliteAudio.stopAudioRecognition(); 
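
As a minimal sketch (the _stopRecorder name and the UI reset are my own additions, not part of the original code), a helper that cancels an ongoing recognition and resets the state could look like this:

void _stopRecorder() {
  // Stop the ongoing audio recognition and reset the UI state
  TfliteAudio.stopAudioRecognition();
  setState(() {
    _recording = false;
    _sound = "Press the button to start";
  });
}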

Finally, the application is ready to be used! 😄 Note that the application will only run on physical Android and iOS devices, as mentioned before. If you have issues during the deployment process on iOS devices, I recommend this documentation. Also, note that the first time you run the application on an iOS device it will crash, because you have to trust the developer certificate first (details explained here).

Application preview both in iOS (left) and Android (right) devices

Code

If you want to view the entire code: https://github.com/cmalbuquerque/audio_recognition_app

Please check my Github profile! 😉

Conclusions

There are some constraints related to Tensorflow Lite:

  • Tensorflow no longer supports emulators with x86_64 architecture due to a CocoaPods issue.
  • The Tensorflow Lite plugin (tflite) does not support audio recognition yet.

The first constraint can be overcome by using physical devices to test and debug applications that use this plugin or plugins that depend on it. To overcome the second barrier, tflite_audio is a good alternative plugin that can load both raw audio and decoded wav models. However, there are some aspects related to the use of Google Teachable Machine models that are important to keep in mind. Also, this plugin depends on Tensorflow, which means applications will crash when running on virtual devices (both Android and iOS).

In this scenario, Tensorflow is used offline, since the model and its associated labels are bundled into the application, which increases the APK size. Alternatively, Tensorflow can be used online instead of shipping the model inside the application. In addition, as mentioned above, using Google Teachable Machine models will increase the size of the application even further. To avoid this, creating your own models is recommended.
