Day 3: Extracting Facial Landmarks for Avatar Mapping #FacialLandmarks #AIAvatars

On Day 3, we’ll extract key facial landmarks (eyes, nose, mouth, jaw) and prepare them for mapping onto a 3D avatar. This step is crucial for making the avatar mimic facial movements in real time.


1. Understanding Facial Landmarks

Facial landmarks are specific points detected on a person’s face. MediaPipe Face Mesh detects 468 key points, but for avatars we primarily focus on:

  • Eyes (blinking, gaze direction)
  • Eyebrows (raising, lowering)
  • Mouth (opening, smiling, speaking)
  • Jaw & head movements (tilting, nodding)

Each landmark is represented as an (x, y, z) coordinate.
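
Each detected face comes back as an array of 468 such points. An illustration of the shape (the values are made up):

// Illustrative shape of one detected face:
// landmarks.length === 468
// landmarks[1] (nose tip) -> [312.4, 428.9, -12.3]   // x, y in pixels; z is relative depth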


2. Extracting Facial Landmarks Using MediaPipe Face Mesh

Step 1: Modify CameraScreen.js to Capture Face Mesh Data

import React, { useState, useEffect } from 'react';
import { View, StyleSheet, Text } from 'react-native';
import { Camera } from 'expo-camera';
import * as tf from '@tensorflow/tfjs';
import * as faceLandmarksDetection from '@tensorflow-models/face-landmarks-detection';

export default function CameraScreen() {
    const [hasPermission, setHasPermission] = useState(null);
    const [model, setModel] = useState(null);
    const [landmarks, setLandmarks] = useState([]);

    useEffect(() => {
        (async () => {
            // Request camera access (newer expo-camera releases expose this
            // as Camera.requestCameraPermissionsAsync()).
            const { status } = await Camera.requestPermissionsAsync();
            setHasPermission(status === 'granted');
        })();
    }, []);

    useEffect(() => {
        const loadModel = async () => {
            await tf.ready(); // wait for a TF.js backend to initialize
            // Load the MediaPipe Facemesh model (legacy 0.0.x API of
            // @tensorflow-models/face-landmarks-detection).
            const loadedModel = await faceLandmarksDetection.load(
                faceLandmarksDetection.SupportedPackages.mediapipeFacemesh
            );
            setModel(loadedModel);
        };

        loadModel();
    }, []);

    const detectFace = async (image) => {
        if (!model) return;

        const predictions = await model.estimateFaces({
            input: image,          // tensor, video frame, or image element
            returnTensors: false,  // return plain arrays instead of tensors
            flipHorizontal: false, // don't mirror the landmark x-coordinates
        });

        if (predictions.length > 0) {
            // scaledMesh holds all 468 [x, y, z] points in input-pixel coordinates.
            console.log('Landmarks:', predictions[0].scaledMesh);
            setLandmarks(predictions[0].scaledMesh);
        }
    };

    if (hasPermission === null) {
        return <View />;
    }
    if (hasPermission === false) {
        return <Text>No access to camera</Text>;
    }

    return (
        <View style={styles.container}>
            <Camera style={styles.camera} type={Camera.Constants.Type.front} />
            <Text style={styles.text}>Face Detected: {landmarks.length > 0 ? 'Yes' : 'No'}</Text>
        </View>
    );
}

const styles = StyleSheet.create({
    container: { flex: 1, justifyContent: 'center', alignItems: 'center' },
    camera: { flex: 1 },
    text: { position: 'absolute', top: 50, fontSize: 18, fontWeight: 'bold' },
});
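
As written, detectFace is never actually invoked: expo-camera’s <Camera> component doesn’t hand us raw frames. One way to wire it up is the cameraWithTensors wrapper from @tensorflow/tfjs-react-native; the sketch below assumes that package (the texture and resize dimensions are illustrative):

import { cameraWithTensors } from '@tensorflow/tfjs-react-native';

// Wrap expo-camera so it emits each frame as a tf.Tensor3D.
const TensorCamera = cameraWithTensors(Camera);

// Pull one tensor per animation frame and run detection on it.
const handleCameraStream = (images) => {
    const loop = async () => {
        const imageTensor = images.next().value;
        if (imageTensor) {
            await detectFace(imageTensor);
            tf.dispose(imageTensor); // release GPU memory every frame
        }
        requestAnimationFrame(loop);
    };
    loop();
};

// Then replace <Camera /> in the render with:
// <TensorCamera
//     style={styles.camera}
//     type={Camera.Constants.Type.front}
//     cameraTextureHeight={1200}  // illustrative texture size
//     cameraTextureWidth={1600}
//     resizeHeight={200}          // small tensors keep inference fast
//     resizeWidth={152}
//     resizeDepth={3}
//     onReady={handleCameraStream}
//     autorender={true}
// />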

3. Extracting Key Facial Landmarks for Avatar Mapping

MediaPipe Face Mesh provides 468 landmarks, but we only need key points:

Feature               Landmark Index
Left Eye              33
Right Eye             263
Nose Tip              1
Mouth Left Corner     61
Mouth Right Corner    291
Jaw Bottom            199

Step 1: Extract Only Key Points

Modify detectFace to extract only essential points:

const getKeyLandmarks = (landmarks) => {
    // Indices refer to the MediaPipe Face Mesh topology (see the table above).
    return {
        leftEye: landmarks[33],
        rightEye: landmarks[263],
        nose: landmarks[1],
        mouthLeft: landmarks[61],
        mouthRight: landmarks[291],
        jawBottom: landmarks[199],
    };
};

const detectFace = async (image) => {
    if (!model) return;
    const predictions = await model.estimateFaces({ input: image });

    if (predictions.length > 0) {
        // Keep only the six points we care about instead of all 468.
        const keyLandmarks = getKeyLandmarks(predictions[0].scaledMesh);
        console.log('Key Landmarks:', keyLandmarks);
        setLandmarks(keyLandmarks);
    }
};

Note that landmarks now holds an object rather than an array, so update the render check accordingly (e.g., Object.keys(landmarks).length > 0).

4. Mapping Face Landmarks to Avatar Movement

Step 1: Normalize the Landmark Positions

Since different faces have different sizes, normalize the coordinates:

const normalizeLandmarks = (landmarks, width, height) => {
    // Divide pixel coordinates by the frame size so every value lands in 0–1,
    // regardless of camera resolution or face size.
    return {
        leftEye: [landmarks.leftEye[0] / width, landmarks.leftEye[1] / height],
        rightEye: [landmarks.rightEye[0] / width, landmarks.rightEye[1] / height],
        nose: [landmarks.nose[0] / width, landmarks.nose[1] / height],
        mouthLeft: [landmarks.mouthLeft[0] / width, landmarks.mouthLeft[1] / height],
        mouthRight: [landmarks.mouthRight[0] / width, landmarks.mouthRight[1] / height],
        jawBottom: [landmarks.jawBottom[0] / width, landmarks.jawBottom[1] / height],
    };
};
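
For instance, inside detectFace you could normalize against the frame dimensions handed to the model (frameWidth and frameHeight below are stand-ins for whatever size your camera stream actually uses):

const frameWidth = 152;   // assumed tensor width from the camera stream
const frameHeight = 200;  // assumed tensor height

const keyLandmarks = getKeyLandmarks(predictions[0].scaledMesh);
const normalized = normalizeLandmarks(keyLandmarks, frameWidth, frameHeight);
// e.g. normalized.nose -> [0.51, 0.47], independent of camera resolution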

Step 2: Convert Landmarks to Avatar-Friendly Data

We’ll map facial movements to avatar expressions:

  • Eye Blinking → Measure the distance between the upper and lower eyelid (see the blink sketch below).
  • Mouth Movement → Measure the distance between the mouth corners.
  • Head Tilting → Compare the heights of the left and right eyes.

const detectExpressions = (landmarks) => {
    // Expects normalized (0–1) landmarks from normalizeLandmarks.
    // Head tilt: vertical offset between the two eyes.
    const eyeHeightDiff = Math.abs(landmarks.leftEye[1] - landmarks.rightEye[1]);
    // Mouth width: horizontal distance between the mouth corners.
    const mouthWidth = Math.abs(landmarks.mouthLeft[0] - landmarks.mouthRight[0]);

    return {
        isTilting: eyeHeightDiff > 0.02,
        isSmiling: mouthWidth > 0.1,
    };
};
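
Blink detection needs eyelid points that aren’t among the six key landmarks above. A minimal sketch, assuming the commonly used Face Mesh indices 159/145 (left eye upper/lower lid) and 386/374 (right eye):

// Hypothetical helper: pull the eyelid points from the full 468-point mesh.
const getEyelidLandmarks = (mesh) => ({
    leftUpper: mesh[159],
    leftLower: mesh[145],
    rightUpper: mesh[386],
    rightLower: mesh[374],
});

const detectBlink = (lids, height) => {
    // Normalized vertical gap between the lids of each eye.
    const leftGap = Math.abs(lids.leftUpper[1] - lids.leftLower[1]) / height;
    const rightGap = Math.abs(lids.rightUpper[1] - lids.rightLower[1]) / height;
    // The 0.01 threshold is an assumption; tune it for your camera setup.
    return leftGap < 0.01 && rightGap < 0.01;
};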

5. Testing Facial Landmark Extraction

Step 1: Run the App

expo start

Step 2: Verify the Landmark Detection

  • Open the console to see extracted landmarks.
  • Check if eye blinking and mouth movement are detected.
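
For a quick sanity check, log the derived expressions next to the raw points (using the helpers defined above):

const expressions = detectExpressions(normalized);
console.log('Expressions:', expressions); // e.g. { isTilting: false, isSmiling: true }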

6. Optimizing Performance

  • Skip every other frame so detection runs at roughly half the camera frame rate (e.g., 60 FPS → 30 FPS):

// frameCount and frame come from your camera frame loop.
if (frameCount % 2 === 0) detectFace(frame);

  • Use GPU acceleration for faster computation:

await tf.setBackend('webgl');
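
Note that inside a React Native app the plain 'webgl' backend is generally not available; @tensorflow/tfjs-react-native registers its own GPU backend instead. A minimal sketch, assuming that package is installed:

import '@tensorflow/tfjs-react-native'; // registers the 'rn-webgl' backend

await tf.setBackend('rn-webgl');
await tf.ready();
console.log('Active backend:', tf.getBackend());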

7. Key Concepts Covered

✅ Extracted facial landmarks for avatar tracking.
✅ Normalized data for consistent avatar movements.
✅ Detected expressions like blinking & smiling.


8. Next Steps: Mapping Facial Features to a 3D Avatar

Tomorrow, we’ll:

🔹 Render 3D avatars using Three.js or Babylon.js.
🔹 Sync real-time face tracking with avatar movement.


9. SEO Keywords:

Real-time AI avatar, face tracking with MediaPipe, facial landmark detection React Native, building an AI VTuber app, mapping facial expressions to avatars.
