Day 3: Capturing and Analyzing Voice Input #VoiceInput #ReactNativeVoice

Voice input forms the core of any voice-controlled app. On Day 3, we’ll learn how to capture voice input, process it into text, and analyze the content to trigger actions or extract insights.


1. Overview of Voice Input Processing

The process of handling voice input involves:

  1. Capturing Audio: Use the microphone to collect spoken input.
  2. Speech-to-Text Conversion: Convert audio into text using libraries or APIs.
  3. Analyzing Text: Extract keywords, intents, or actionable commands.
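Before wiring up the native pieces, the three stages above can be sketched as plain functions. This is only an illustrative sketch: `speechToText`, `analyzeText`, and `handleVoiceInput` are hypothetical names standing in for the real recognizer and analyzer covered below.

```javascript
// Illustrative pipeline sketch; the real speech-to-text step is handled
// by the native recognizer, not a plain function like this.
const speechToText = (audio) => audio.transcript;

const analyzeText = (text) =>
    text.toLowerCase().includes("weather") ? "FETCH_WEATHER" : "UNKNOWN";

const handleVoiceInput = (audio) => analyzeText(speechToText(audio));

console.log(handleVoiceInput({ transcript: "Show me the weather" })); // "FETCH_WEATHER"
```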

2. Capturing Voice Input

Using React Native Voice

React Native Voice simplifies capturing spoken input and converting it to text.

Example:

import React, { useState, useEffect } from 'react';
import { Button, View, Text, StyleSheet, Platform, PermissionsAndroid } from 'react-native';
import Voice from '@react-native-voice/voice';

export default function App() {
    const [recognizedText, setRecognizedText] = useState("");
    const [isListening, setIsListening] = useState(false);

    useEffect(() => {
        // Register the handler once; assigning it on every render leaks listeners.
        Voice.onSpeechResults = (event) => {
            setRecognizedText(event.value?.[0] ?? "");
            setIsListening(false);
        };
        // Release native resources when the component unmounts.
        return () => {
            Voice.destroy().then(Voice.removeAllListeners);
        };
    }, []);

    const startListening = async () => {
        if (Platform.OS === 'android') {
            const granted = await PermissionsAndroid.request(
                PermissionsAndroid.PERMISSIONS.RECORD_AUDIO,
                {
                    title: "Microphone Permission",
                    message: "This app requires access to your microphone for voice recognition.",
                }
            );
            if (granted !== PermissionsAndroid.RESULTS.GRANTED) {
                return;
            }
        }

        try {
            setIsListening(true);
            await Voice.start("en-US");
        } catch (error) {
            setIsListening(false);
            console.error("Error starting voice recognition:", error);
        }
    };

    const stopListening = async () => {
        setIsListening(false);
        try {
            await Voice.stop();
        } catch (error) {
            console.error("Error stopping voice recognition:", error);
        }
    };

    return (
        <View style={styles.container}>
            <Text style={styles.instructions}>
                {isListening ? "Listening..." : "Press the button and start speaking"}
            </Text>
            <Button
                title={isListening ? "Stop Listening" : "Start Listening"}
                onPress={isListening ? stopListening : startListening}
            />
            <Text style={styles.resultText}>Recognized Text: {recognizedText}</Text>
        </View>
    );
}

const styles = StyleSheet.create({
    container: { flex: 1, justifyContent: "center", alignItems: "center" },
    instructions: { fontSize: 18, marginBottom: 20 },
    resultText: { fontSize: 20, marginTop: 20 },
});

3. Analyzing the Captured Text

Step 1: Extract Keywords

Use basic string methods to detect keywords:

const analyzeInput = (text) => {
    const lower = text.toLowerCase(); // speech results are often capitalized
    if (lower.includes("weather")) {
        console.log("Fetching weather...");
    } else if (lower.includes("news")) {
        console.log("Fetching news...");
    } else {
        console.log("Command not recognized.");
    }
};

Invoke analyzeInput after capturing the text:

Voice.onSpeechResults = (event) => {
    const text = event.value?.[0] ?? "";
    setRecognizedText(text);
    analyzeInput(text);
};
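As the command list grows, an if/else chain becomes hard to maintain. One common alternative is a keyword-to-handler map. The sketch below is an assumption about how you might structure this (`routeCommand` and the handler bodies are hypothetical placeholders, not part of the library):

```javascript
// Hypothetical command router: map keywords to handler functions.
// Handler bodies are placeholders; wire them to your real actions.
const commands = {
    weather: () => "Fetching weather...",
    news: () => "Fetching news...",
};

const routeCommand = (text) => {
    const lower = text.toLowerCase();
    const match = Object.keys(commands).find((k) => lower.includes(k));
    return match ? commands[match]() : "Command not recognized.";
};

console.log(routeCommand("What's the weather like?")); // "Fetching weather..."
```

Adding a new command is then a one-line change to the map rather than another branch.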

Step 2: Advanced Text Analysis with APIs

Use natural language processing (NLP) APIs for more advanced analysis:

  • Google Cloud Natural Language API: Analyze sentiment, entities, and syntax.
  • AWS Comprehend: Extract entities and determine intent.

Example: Sending Text to Google NLP API

const analyzeWithAPI = async (text) => {
    const apiKey = "YOUR_GOOGLE_API_KEY";
    const url = `https://language.googleapis.com/v1/documents:analyzeEntities?key=${apiKey}`;
    const body = {
        document: { type: "PLAIN_TEXT", content: text },
        encodingType: "UTF8",
    };

    try {
        const response = await fetch(url, {
            method: "POST",
            headers: { "Content-Type": "application/json" },
            body: JSON.stringify(body),
        });
        if (!response.ok) {
            throw new Error(`NLP API request failed with status ${response.status}`);
        }
        const result = await response.json();
        console.log("Entity Analysis:", result.entities);
    } catch (error) {
        console.error("Entity analysis failed:", error);
    }
};
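Once the response arrives, you usually only need a slice of it. The helper below pulls entity names out of an `analyzeEntities` response; it assumes the documented response shape (`{ entities: [{ name, type, salience, ... }] }`) and guards against a missing `entities` field:

```javascript
// Extract just the entity names from an analyzeEntities response.
// Guards against error responses that have no `entities` array.
const extractEntityNames = (result) =>
    (result.entities || []).map((entity) => entity.name);

console.log(extractEntityNames({
    entities: [{ name: "weather" }, { name: "London" }],
})); // ["weather", "London"]
```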

4. Error Handling in Voice Recognition

Step 1: Handle Recognition Failures

Add an event listener for recognition errors:

Voice.onSpeechError = (event) => {
    console.error("Speech recognition error:", event.error.message);
};

Step 2: Implement Timeouts

Stop listening if no input is detected within a set timeframe. Start the timer when listening begins and store its id (for example in a ref created with useRef) so a manual stop can clear it; a bare setTimeout that reads isListening would capture a stale value of that state:

// const listenTimeout = useRef(null); at the top of the component

// Inside startListening, after Voice.start():
listenTimeout.current = setTimeout(() => stopListening(), 10000); // 10-second timeout

// Inside stopListening:
clearTimeout(listenTimeout.current);
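The same idea can be packaged as a small reusable helper. This is a sketch (`listenWithTimeout` is a hypothetical name, not part of React Native Voice): it schedules a stop callback and hands back a cancel function for manual stops.

```javascript
// Schedule stopFn after ms; returns a cancel function so a manual
// stop doesn't leave a stale timer that fires later.
const listenWithTimeout = (stopFn, ms) => {
    const id = setTimeout(stopFn, ms);
    return () => clearTimeout(id);
};

// Usage sketch: const cancel = listenWithTimeout(stopListening, 10000);
// ...then call cancel() inside your manual stop handler.
```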

5. Testing the Implementation

Step 1: Start the Development Server

Run your app:

npx expo start

Note that React Native Voice includes native code, so it will not work inside the stock Expo Go client; use a development build (npx expo run:android or npx expo run:ios) or a bare React Native project.

Step 2: Test Voice Commands

  • Speak a command like “Show me the weather.”
  • Observe the recognized text and any triggered actions.

6. Key Concepts Covered

  • Capturing voice input using React Native Voice.
  • Analyzing text for keywords and intents.
  • Using APIs for advanced text analysis.
  • Handling errors and implementing timeouts.

Next Steps

On Day 4, we’ll focus on handling voice commands to perform app actions, making the app fully interactive.

