Continuous Speech to Text API Implementation on an Android App

In my last semester at university, I built an Android app as my final project (thesis) to help children who are learning to speak practice their speaking skills. The app's main functionality relies on the Google Speech to Text API. Although I was quite familiar with Java, Android app development was completely new to me, and it was one of those times I was really grateful for Google and Stack Overflow. I found many examples of Speech to Text implementations, but I needed recognition to be continuous and to reset automatically after the words requested by the app were said. So, for anyone who needs similar Speech to Text functionality, here’s how I implemented it:

1.      Create your Speech Recognizer

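private void resetSpeechRecognizer() {
    // Destroy any existing recognizer before creating a fresh one.
    if (speechRecognizer != null) {
        speechRecognizer.destroy();
    }
    speechRecognizer = SpeechRecognizer.createSpeechRecognizer(this);
    if (SpeechRecognizer.isRecognitionAvailable(this)) {
        // The activity itself implements RecognitionListener.
        speechRecognizer.setRecognitionListener(this);
    } else {
        // No recognition service on this device, so close the activity.
        finish();
    }
}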

As the code above shows, the program first checks whether a speechRecognizer already exists. If it does, it is destroyed and a new one is created. This piece of code also checks whether speech recognition is available on the current device and closes the activity if it is not.

2.      Set your Speech Recognizer Intent

private void setRecognizerIntent() {
    String language = "in-ID"; // Bahasa Indonesia
    recognizerIntent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
    recognizerIntent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, language);
    recognizerIntent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
            RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
    // EXTRA_MAX_RESULTS expects an integer; 5 is only an example value.
    recognizerIntent.putExtra(RecognizerIntent.EXTRA_MAX_RESULTS, 5);
}

Our speech recognizer needs an intent. These are the settings I used in my program. The “in-ID” tag sets the speech recognizer to recognize speech in Bahasa Indonesia. Note that EXTRA_MAX_RESULTS takes an integer (the maximum number of results to return), not a language string.
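The “in-ID” tag may look unusual, because the current ISO 639 code for Indonesian is “id”. Java treats “in” as the legacy code for the same language, which the small sketch below illustrates using only java.util.Locale. The class and method names (LanguageTagDemo, normalize) are made up for this illustration and are not part of my app; how a particular recognition service interprets the tag is outside what this sketch shows.

```java
import java.util.Locale;

class LanguageTagDemo {
    // Round-trips a language tag through java.util.Locale so that a legacy
    // code such as "in" comes back in its current BCP 47 form.
    static String normalize(String tag) {
        return Locale.forLanguageTag(tag).toLanguageTag();
    }

    public static void main(String[] args) {
        // Both tags describe Indonesian as used in Indonesia.
        System.out.println(normalize("in-ID"));
        System.out.println(normalize("id-ID"));
    }
}
```

If both lines print the same tag on your setup, the two spellings refer to the same locale as far as Java is concerned.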

3.      Make sure you have permission to record

int permissionCheck = ContextCompat.checkSelfPermission(getApplicationContext(),
        Manifest.permission.RECORD_AUDIO);
if (permissionCheck != PackageManager.PERMISSION_GRANTED) {
    ActivityCompat.requestPermissions(this,
            new String[]{Manifest.permission.RECORD_AUDIO},
            PERMISSIONS_REQUEST_RECORD_AUDIO);
    return;
} else {
    speechRecognizer.startListening(recognizerIntent);
}

 

@Override
public void onRequestPermissionsResult(int kodeRequest,
                                       @NonNull String[] permissions,
                                       @NonNull int[] grantResults) {
    super.onRequestPermissionsResult(kodeRequest, permissions, grantResults);

    if (kodeRequest == PERMISSIONS_REQUEST_RECORD_AUDIO) {
        if (grantResults.length > 0 && grantResults[0] == PackageManager.PERMISSION_GRANTED) {
            // Permission granted: restart the activity.
            this.recreate();
        } else {
            // Toast text: "Permission to record audio is required to play."
            Toast.makeText(PermainanActivity.this,
                    "Izin merekam suara diperlukan untuk bermain.", Toast.LENGTH_SHORT).show();
            finish();
        }
    }
}

With the code above, the program asks the user for permission to record audio from their device if it has not already been granted. If it has, the speech recognizer starts listening immediately.

Also, don’t forget to declare the permissions you are using in your AndroidManifest.

<uses-permission android:name="android.permission.INTERNET" />
<uses-permission android:name="android.permission.RECORD_AUDIO" />
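The grant check inside onRequestPermissionsResult can be pulled out into a small helper, which also makes it clear why the length check is there: Android delivers an empty result array when the permission dialog is interrupted, and that should be treated as a denial. This is a minimal plain-Java sketch, not code from my app. The class name PermissionResult and the method isGranted are hypothetical, and the two constants simply mirror the values of PackageManager.PERMISSION_GRANTED and PackageManager.PERMISSION_DENIED so the sketch runs without the Android SDK.

```java
class PermissionResult {
    // Mirrors PackageManager.PERMISSION_GRANTED on Android.
    static final int PERMISSION_GRANTED = 0;
    // Mirrors PackageManager.PERMISSION_DENIED on Android.
    static final int PERMISSION_DENIED = -1;

    // Returns true only when the first requested permission was granted.
    // A null or empty array (interrupted request) counts as not granted.
    static boolean isGranted(int[] grantResults) {
        return grantResults != null
                && grantResults.length > 0
                && grantResults[0] == PERMISSION_GRANTED;
    }

    public static void main(String[] args) {
        System.out.println(isGranted(new int[]{PERMISSION_GRANTED}));
        System.out.println(isGranted(new int[]{PERMISSION_DENIED}));
        System.out.println(isGranted(new int[0]));
    }
}
```

In the activity, the condition `grantResults.length > 0 && grantResults[0] == PackageManager.PERMISSION_GRANTED` plays the same role as isGranted here.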

4.      Match the speech to text result with requested words

@Override
public void onResults(Bundle results) {
    // ucap = recognized utterances, hasil = accumulated result text,
    // benar = whether the requested word (stringKata) was said.
    ArrayList<String> ucap = results.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION);
    String hasil = "";
    boolean benar = false;

    if (ucap != null) {
        for (String result : ucap) {
            hasil += result + "\n";
            if (hasil.toLowerCase().contains(stringKata.toLowerCase())) {
                benar = true;
            }
        }
    }

    if (benar) {
        // Correct answer: show the check image and update the score (nilai).
        imgKata.setImageDrawable(ContextCompat.getDrawable(this, R.drawable.cekimg));
        nilai++;
        txtSkorG.setText(String.valueOf(nilai));
        resetSpeechRecognizer();
        speechRecognizer.startListening(recognizerIntent);
        setCountDownTimer(500);
    } else {
        // No match: reset and keep listening.
        resetSpeechRecognizer();
        speechRecognizer.startListening(recognizerIntent);
    }
}

The speech recognizer’s results are stored in an ArrayList of String. The for loop checks every string in the ArrayList, and any matching string sets the deciding boolean to true. Whether that value ends up true or false, the speech recognizer is reset and starts listening again. This is the piece of code that makes speech recognition continuous.
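The matching step on its own can be sketched as a small helper that runs outside Android, which makes it easy to try with different candidate lists. This is only an illustration under my own assumptions: the class name SpeechMatcher and the method containsTarget are hypothetical and do not appear in my app, and the sample words are made up.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class SpeechMatcher {
    // Returns true if any recognition candidate contains the target word,
    // ignoring case. Null input or a blank target never matches.
    static boolean containsTarget(List<String> candidates, String target) {
        if (candidates == null || target == null || target.trim().isEmpty()) {
            return false;
        }
        String needle = target.trim().toLowerCase(Locale.ROOT);
        for (String candidate : candidates) {
            if (candidate != null
                    && candidate.toLowerCase(Locale.ROOT).contains(needle)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> candidates = Arrays.asList("Bola besar", "bolu");
        System.out.println(containsTarget(candidates, "BOLA"));
        System.out.println(containsTarget(candidates, "buku"));
    }
}
```

Like the code in onResults, this uses a substring check, so a target such as “bola” would also match a longer word that contains it. If that is too lenient for your app, comparing whole words instead is one possible alternative.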


You can find my simple program on the Play Store:

https://play.google.com/store/apps/details?id=com.speech.peech8

 

If you’d like to reach out with any questions or feedback, feel free to email me:

vivianivory22@gmail.com
