The Inference Engine: Real-time Detection

This is where the trained AI model is put to work. The “Inference Engine” is the bridge between the static .pt model file and the live webcam feed.

🔄 The Detection Lifecycle

The system runs in a continuous loop. Each iteration must be fast enough to maintain a smooth frame rate (FPS) so that no drowsiness events are missed.

Sequence Diagram: Frame-by-Frame Processing

sequenceDiagram
    participant Cam as Webcam
    participant AI as YOLOv11 Model
    participant Engine as Detection Engine
    participant History as Temporal Window (Deque)
    participant Alert as Alert System

    Cam->>Engine: Capture Raw Frame
    Engine->>AI: Pass Frame for Inference
    AI-->>Engine: Prediction (Open/Closed + Confidence)
    Engine->>Engine: Apply Confidence Cutoff
    Engine->>History: Push State (True/False)
    History->>History: Pop Oldest Frame (Maintain Size 15)
    Engine->>History: Calculate Closed Ratio
    Engine->>History: Check Consecutive Streak
    alt Drowsy State Detected
        Engine->>Alert: Trigger Alert Sequence
    else Awake State
        Engine->>Engine: Update UI Status (Green)
    end
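
The "Apply Confidence Cutoff" step in the diagram can be sketched as a small function that turns one model prediction into the True/False state pushed to the history. This is a minimal sketch, not the project's actual code: the `Prediction` container, the class names `"open"`/`"closed"`, and the cutoff value of 0.5 are all assumptions made for illustration.

```python
from typing import NamedTuple


class Prediction(NamedTuple):
    """Hypothetical container for one model output (not from the source)."""
    label: str         # assumed class names: "open" or "closed"
    confidence: float  # model confidence in [0, 1]


CONFIDENCE_CUTOFF = 0.5  # assumed value; tune it for the trained model


def eyes_closed(pred: Prediction, cutoff: float = CONFIDENCE_CUTOFF) -> bool:
    """Return True only for a 'closed' prediction at or above the cutoff."""
    return pred.label == "closed" and pred.confidence >= cutoff


predictions = [
    Prediction("open", 0.93),
    Prediction("closed", 0.41),  # below the cutoff, so treated as not closed
    Prediction("closed", 0.88),
]
states = [eyes_closed(p) for p in predictions]
print(states)  # [False, False, True]
```

Treating low-confidence "closed" predictions as not closed is one possible policy; a real system might instead skip such frames entirely.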

🛠️ Technical Implementation

The DrowsinessDetector Class

The system is encapsulated in a class to maintain the “state” of the user over time.

1. Temporal Analysis via deque

We use a collections.deque to store the history of eye states. A deque (double-ended queue) is used because its constructor accepts a maxlen argument that caps its length. With maxlen set to 15, appending the 16th frame automatically discards the 1st. This creates a sliding window over the last 15 frames, which is about 0.5 seconds of video at 30 FPS.
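
The eviction behavior described above can be checked with a few lines of standard-library Python. The window size of 15 and the 30 FPS figure come from the text; the frame indices stand in for eye states purely so the eviction is easy to see.

```python
from collections import deque

WINDOW_SIZE = 15  # window length used in this document
FPS = 30          # frame rate assumed in the text

history = deque(maxlen=WINDOW_SIZE)

# Append 16 items; the deque silently drops the oldest once it is full.
for frame_index in range(1, 17):
    history.append(frame_index)

print(len(history))  # 15
print(history[0])    # 2  (frame 1 was discarded)
print(history[-1])   # 16

window_seconds = WINDOW_SIZE / FPS
print(window_seconds)  # 0.5
```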

2. The Decision Matrix

The engine doesn’t just check if eyes are closed; it evaluates the evidence using two different mathematical paths:

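Path A: The Ratio Check (The “Drowsiness” Path) If the closed ratio (the share of frames in the window marked as closed) crosses the configured ratio threshold, the user is considered drowsy. This catches “slow-motion” fatigue where the user’s eyes flicker open and shut instead of staying closed.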

Path B: The Streak Check (The “Nodding Off” Path) The system scans back from the end of the deque for a continuous sequence of True (closed) values. If the length of that streak crosses the configured streak threshold, an immediate alert is triggered. This catches the moment a user’s head drops and the eyes stay shut.
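
The two paths can be sketched together in a small class. This is an illustrative sketch under stated assumptions, not the project's actual DrowsinessDetector: the method names, the ratio threshold of 0.6, and the streak threshold of 8 are hypothetical. Only the window size of 15 comes from the text.

```python
from collections import deque


class DrowsinessDetector:
    """Sketch of the two-path decision; threshold values are assumptions."""

    def __init__(self, window_size=15, ratio_threshold=0.6, streak_threshold=8):
        self.window_size = window_size
        self.history = deque(maxlen=window_size)
        self.ratio_threshold = ratio_threshold    # assumed value
        self.streak_threshold = streak_threshold  # assumed value

    def closed_ratio(self) -> float:
        # Divide by the full window size so a half-filled window
        # cannot trigger Path A during warm-up.
        return sum(self.history) / self.window_size

    def trailing_streak(self) -> int:
        # Count consecutive closed frames, scanning back from the newest.
        streak = 0
        for closed in reversed(self.history):
            if not closed:
                break
            streak += 1
        return streak

    def update(self, closed: bool) -> bool:
        """Push one eye state and return True if either path fires."""
        self.history.append(closed)
        path_a = self.closed_ratio() >= self.ratio_threshold
        path_b = self.trailing_streak() >= self.streak_threshold
        return path_a or path_b


# Path A: flickering eyes, never more than 2 closed frames in a row.
flicker = DrowsinessDetector()
flicker_results = [flicker.update(s) for s in [True, True, False] * 5]
print(flicker_results[-1], flicker.trailing_streak())  # True 0

# Path B: 8 closed frames in a row, while the ratio is only 8/15.
nod = DrowsinessDetector()
nod_results = [nod.update(s) for s in [False] * 7 + [True] * 8]
print(nod_results[-2], nod_results[-1])  # False True
```

In the second run the streak check fires while the ratio is still below its threshold, which shows why the two paths complement each other.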

🖥️ User Interface (UI) Overlay

The system uses cv2.putText to give the user real-time feedback:

  • Green Text: “EYES OPEN - ALERT”. The system is healthy and the user is awake.
  • Orange Text: “EYES CLOSING…”. Eyes are closed, but thresholds haven’t been hit yet.
  • Red Text: “⚠️ DROWSINESS DETECTED!”. The alert has been triggered.
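
The three states above could be mapped to text and color as in the sketch below. The function names, text position, font, scale, and thickness are assumptions chosen for illustration. Note that OpenCV colors are given in BGR order, so orange is (0, 165, 255) rather than (255, 165, 0).

```python
from typing import Tuple

Color = Tuple[int, int, int]

# OpenCV expects BGR, not RGB.
GREEN: Color = (0, 255, 0)
ORANGE: Color = (0, 165, 255)
RED: Color = (0, 0, 255)


def status_overlay(closed_now: bool, drowsy: bool) -> Tuple[str, Color]:
    """Pick the status text and color; the drowsy state takes priority."""
    if drowsy:
        return "DROWSINESS DETECTED!", RED
    if closed_now:
        return "EYES CLOSING...", ORANGE
    return "EYES OPEN - ALERT", GREEN


def draw_status(frame, closed_now: bool, drowsy: bool):
    """Draw the status line onto a BGR frame (a NumPy array) in place."""
    import cv2  # imported here so status_overlay works without OpenCV

    text, color = status_overlay(closed_now, drowsy)
    # Position, font, scale and thickness are assumed values.
    cv2.putText(frame, text, (20, 40), cv2.FONT_HERSHEY_SIMPLEX,
                1.0, color, 2, cv2.LINE_AA)
    return frame


print(status_overlay(closed_now=True, drowsy=False))
# ('EYES CLOSING...', (0, 165, 255))
```

The drawn strings are kept to plain ASCII because cv2.putText uses the built-in Hershey fonts, which replace characters they cannot render (such as the ⚠️ emoji or the “…” character) with question marks.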

Last Updated: 2026-05-03