EyeNav

Logo

Real-time eye tracking and voice-controlled web browsing with automated test script generation

Team Members

Purpose

EyeNav is a modular web interaction framework. It fuses real-time eye tracking (via the Tobii Pro SDK) and on-device natural-language processing (using Vosk) within a Chrome extension and Python backend to deliver:

By orchestrating gaze-driven pointer control, voice-command parsing, and concurrent logging threads, EyeNav enables both interactive accessibility and behavior-driven development in web environments.
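The idea of logging on a separate thread while gaze and voice events arrive can be sketched as follows. This is a minimal illustration, not EyeNav's actual code: the event fields (`source`, `x`, `y`, `command`) and the function names `logger_worker` and `run_session` are assumptions made for the example.

```python
import json
import queue
import threading

# Hypothetical event log: field names are illustrative, not EyeNav's schema.
_STOP = object()  # sentinel that tells the logger thread to exit


def logger_worker(events: queue.Queue, sink: list) -> None:
    """Drain events from the queue and append them to `sink` as JSON lines."""
    while True:
        event = events.get()
        if event is _STOP:
            break
        sink.append(json.dumps(event, sort_keys=True))


def run_session(raw_events: list) -> list:
    """Feed events to a background logger thread and return the logged lines."""
    events: queue.Queue = queue.Queue()
    sink: list = []
    worker = threading.Thread(target=logger_worker, args=(events, sink), daemon=True)
    worker.start()
    for event in raw_events:
        events.put(event)  # the producer never blocks on log formatting
    events.put(_STOP)
    worker.join()
    return sink


if __name__ == "__main__":
    demo = [
        {"source": "gaze", "x": 0.42, "y": 0.17},
        {"source": "voice", "command": "click"},
    ]
    for line in run_session(demo):
        print(line)
```

A queue keeps the producers (gaze and voice handlers) decoupled from the logger, so slow disk or formatting work would not delay pointer updates.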


Video


Screenshots

EyeNav in action in English. A screenshot of a web browser with a description of voice commands to the right.

EyeNav in action in Spanish. A screenshot of a web browser with a description of voice commands to the right.


Hardware and Software Requirements

Tobii Pro Nano Eyetracker


Summary

EyeNav implements the following core features:


Installation

  1. Clone the Repository

    git clone https://github.com/TheSoftwareDesignLab/EyeNav.git
    cd EyeNav
    
  2. Backend Setup

    From the backend/ folder:

    python3 -m venv venv
    source venv/bin/activate
    pip install -r requirements.txt
    
  3. Chrome Extension

    • Open chrome://extensions/ in Chrome (v114+)
    • Enable Developer mode
    • Click Load unpacked and select extension/
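After the backend setup, a quick check from inside the activated virtual environment can confirm that the expected packages are importable. The sketch below assumes that requirements.txt installs the `vosk` module and the Tobii Pro SDK Python bindings (`tobii_research`); adjust the names if the project's requirements differ. The helper `missing_modules` is hypothetical and not part of EyeNav.

```python
import importlib.util

# Assumed module names: `vosk` (Vosk speech recognition) and `tobii_research`
# (Tobii Pro SDK Python bindings). Adjust to match requirements.txt.
REQUIRED_MODULES = ("vosk", "tobii_research")


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All expected modules are importable.")
```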

Usage

  1. Start Backend

    python backend/main.py
    
  2. Load Web Page & Extension

    • Navigate to any web page
    • Click the EyeNav extension icon to open the side panel

  3. Initiate Session

    • Click Start in the side panel
    • Experiment with gaze, voice commands, or both

  4. Generate Tests

    • Interactions are logged automatically
    • Generated Gherkin scripts appear in the tests/ directory
    • Replay with Kraken
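To illustrate how logged interactions might be turned into a Gherkin script, here is a minimal sketch. It is not EyeNav's generator: the event format (`action`, `url`, `selector`, `text`), the step wording in `STEP_TEMPLATES`, and the function `to_gherkin` are all assumptions, and the step phrases would need to match whatever step definitions the replay tool actually provides.

```python
# Hypothetical step wording: adapt to the step definitions of the replay tool.
STEP_TEMPLATES = {
    "navigate": 'I navigate to page "{url}"',
    "click": 'I click on element with selector "{selector}"',
    "type": 'I enter text "{text}" into element with selector "{selector}"',
}


def to_gherkin(feature: str, scenario: str, events: list) -> str:
    """Render a list of logged events as a single-scenario Gherkin feature."""
    lines = [f"Feature: {feature}", "", f"  Scenario: {scenario}"]
    for index, event in enumerate(events):
        action = event["action"]
        if action not in STEP_TEMPLATES:
            raise ValueError(f"unsupported action: {action}")
        # First step is the precondition, second starts the actions,
        # and the rest are chained with "And".
        keyword = "Given" if index == 0 else ("When" if index == 1 else "And")
        step = STEP_TEMPLATES[action].format(**event)
        lines.append(f"    {keyword} {step}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    log = [
        {"action": "navigate", "url": "https://example.com"},
        {"action": "click", "selector": "#login"},
        {"action": "type", "selector": "#search", "text": "hello"},
    ]
    print(to_gherkin("Recorded session", "Replay logged interactions", log))
```

Keeping the step wording in one table makes it straightforward to swap in the exact phrases expected by the test runner.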

Configuration

Voice Model & Language

Logging & Selectors


Architecture

Components

Components Diagram

Context

Context Diagram


Use Cases

  1. Accessible Browsing

    Hands-free navigation for users with disabilities.

  2. Automated Testing (A-TDD)

    Generate and replay acceptance tests for regression.

  3. Accessibility Evaluation

    Collect interaction data for consultants and researchers.

  4. Intelligent Agents

    [TBD] Enable bots to navigate and test web UIs via gaze & speech.