A functional, real-time voice-activated Smart AI Assistant that uses Automatic Speech Recognition (ASR) to process voice commands and Text-to-Speech (TTS) to provide spoken feedback. This project showcases the practical application of voice recognition technology in an interactive desktop system.

Jeevan-u/speech-recognition

Smart AI Assistant (Speech Recognition)

A functional, real-time voice-activated Smart AI Assistant that uses Automatic Speech Recognition (ASR) to process voice commands and Text-to-Speech (TTS) to provide spoken feedback. This project showcases the practical application of voice recognition technology in an interactive desktop system.


💡 About the Project

This project implements a Smart AI Assistant that operates in real time, built using Python. It integrates the speech_recognition library for capturing and transcribing user audio and pyttsx3 for synthesized voice output. The project was completed in partial fulfillment of the requirements for the degree of Bachelor of Technology at Rai Technology University. The assistant was developed by the team of Jeevan KL, Chandan H, Towhid Aalam, and Dhamodhara for their 3rd Semester B.Tech CSE AIML program during the 2025-2026 session.

✨ Key Features

  • Voice-Controlled Execution: The assistant processes voice commands to execute various actions.
  • ASR and TTS Implementation: It uses the speech_recognition library with Google's API to convert speech to text (ASR) and pyttsx3 to convert text responses to natural-sounding speech (TTS).
  • Diverse Task Handling: The system can fetch the time and date, search the web, get summaries from Wikipedia, and play media on YouTube.
  • Real-time Status Feedback: A minimal tkinter GUI displays the assistant's current state, such as Listening or Processing, to the user.
  • Robust Multi-threading: A multi-threaded architecture keeps the GUI responsive while the system is waiting for voice input or processing a command.

⚙️ How It Works (Methodology)

The system uses a sequential pipeline of four main phases to execute a command:

  1. Audio Acquisition: The sr.Microphone() object captures live audio, and listener.adjust_for_ambient_noise() calibrates the recognizer's energy threshold to the ambient noise level so that background noise is less likely to be mistaken for speech.
  2. Speech Processing (ASR): The recorded audio is sent to the Google Speech Recognition API via listener.recognize_google(voice) to convert the speech into a text string.
  3. Command Logic: The recognized text is analyzed using Python's if...elif... conditional statements to identify keywords (e.g., "play," "time," "wikipedia") and execute the corresponding task using external libraries like pywhatkit or wikipedia.
  4. Text-to-Speech (TTS) Output: The talk(text) function uses the pyttsx3 engine to synthesize the response text and speak it back to the user.
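
The four phases above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: take_command(), talk(), listener, and voice are names taken from this README, while route_command(), its (action, payload) return value, and the reply strings are assumptions introduced here. The third-party libraries are imported inside the functions that need them, so the keyword logic can be tried without a microphone or speakers.

```python
import datetime


def take_command():
    """Phases 1-2: capture microphone audio and transcribe it to text."""
    import speech_recognition as sr  # third-party, imported lazily

    listener = sr.Recognizer()
    with sr.Microphone() as source:
        # Calibrate the energy threshold against the room's noise level.
        listener.adjust_for_ambient_noise(source, duration=1)
        voice = listener.listen(source)
    try:
        return listener.recognize_google(voice).lower()
    except (sr.UnknownValueError, sr.RequestError):
        # Speech was unintelligible or the web API was unreachable.
        return ""


def route_command(command, now=None):
    """Phase 3 (hypothetical helper): map keywords to (action, payload)."""
    now = now or datetime.datetime.now()
    if "play" in command:
        return "play", command.replace("play", "", 1).strip()
    elif "time" in command:
        return "say", "The time is " + now.strftime("%H:%M")
    elif "wikipedia" in command:
        return "wikipedia", command.replace("wikipedia", "", 1).strip()
    return "say", "Sorry, I did not understand that."


def talk(text):
    """Phase 4: speak the response with the pyttsx3 engine."""
    import pyttsx3  # third-party, imported lazily

    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()
```

In a full loop, a "play" result could be handed to pywhatkit.playonyt(payload) and a "wikipedia" result to wikipedia.summary(payload, sentences=1), with the returned text passed to talk(). Returning a pair instead of acting directly keeps the keyword matching easy to check on its own.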

💻 System Design

The assistant is implemented with a multi-threaded design to maintain responsiveness.

  • GUI Thread (Main): Handles the tkinter window (root.mainloop()) and updates the status label (e.g., "Listening...").
  • Assistant Thread (Daemon): A separate thread runs the core run_assistant() function, which contains the blocking calls for microphone input (take_command()) and speech synthesis (talk()). This separation prevents the user interface from freezing.
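
One way this two-thread layout might look is sketched below. The README does not say how the worker thread communicates status to the label, so the queue.Queue hand-off polled with root.after() is an assumption (a common pattern, since tkinter widgets are generally only safe to touch from the main thread). The parameters of run_assistant(), the rounds limit, and the input()/print stand-ins are also hypothetical.

```python
import queue
import threading


def run_assistant(status_queue, take_command, handle, rounds=None):
    """Worker loop: the blocking calls happen here, off the GUI thread."""
    count = 0
    while rounds is None or count < rounds:
        status_queue.put("Listening...")
        command = take_command()  # blocks until a command arrives
        status_queue.put("Processing...")
        handle(command)
        count += 1


def main():
    import tkinter as tk  # imported here so the worker can be tested headless

    root = tk.Tk()
    label = tk.Label(root, text="Starting...")
    label.pack()
    status_queue = queue.Queue()

    def poll():
        # Drain pending status messages on the GUI thread, then reschedule.
        try:
            while True:
                label.config(text=status_queue.get_nowait())
        except queue.Empty:
            pass
        root.after(100, poll)

    # input() and print() stand in for the real take_command() and handler.
    worker = threading.Thread(
        target=run_assistant,
        args=(status_queue, lambda: input("command> "), print),
        daemon=True,  # the thread exits when the window is closed
    )
    worker.start()
    poll()
    root.mainloop()
```

Calling main() opens the window. Because the worker is a daemon thread, closing the window ends the program even while the worker is blocked waiting for input.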

🚀 Getting Started

Prerequisites

  • Python 3.x
  • The following required Python libraries:

You can install the required libraries using pip:
