Smart AI Assistant (Speech Recognition)

A functional, real-time voice-activated Smart AI Assistant that uses Automatic Speech Recognition (ASR) to process voice commands and Text-to-Speech (TTS) to provide spoken feedback. This project showcases the practical application of voice recognition technology in an interactive desktop system.

💡 About the Project

This project implements a Smart AI Assistant that operates in real time, built using Python. It integrates the speech_recognition library for capturing and transcribing user audio and pyttsx3 for synthesized voice output. The project was completed in partial fulfillment of the requirements for the degree of Bachelor of Technology at Rai Technology University. The assistant was developed by Jeevan KL, Chandan H, Towhid Aalam, and Dhamodhara for their 3rd Semester B.Tech CSE AIML program during the 2025-2026 session.

✨ Key Features

Voice-Controlled Execution: The assistant listens for spoken commands and executes the matching action.
ASR and TTS Implementation: It uses the speech_recognition library with the Google Web Speech API to convert speech to text (ASR) and pyttsx3 to convert text responses to speech (TTS).
Diverse Task Handling: The system can report the time and date, search the web, fetch summaries from Wikipedia, and play media on YouTube.
Real-time Status Feedback: A minimal tkinter GUI displays the assistant's current state, such as Listening or Processing.
Multi-threading: The assistant runs in its own thread so that the GUI remains responsive while the system waits for voice input or processes a command.

⚙️ How It Works (Methodology)

The system handles each command through a sequential pipeline of four phases:

Audio Acquisition: The sr.Microphone() object captures live audio, and listener.adjust_for_ambient_noise() calibrates the recognizer's energy threshold to the background noise level before listening.
Speech Processing (ASR): The recorded audio is sent to the Google Web Speech API via listener.recognize_google(voice), which returns the recognized speech as a text string.
Command Logic: The recognized text is checked with Python's if...elif... conditional statements for keywords (e.g., "play," "time," "wikipedia"), and the corresponding task is executed using external libraries such as pywhatkit or wikipedia.
Text-to-Speech (TTS) Output: The talk(text) function uses the pyttsx3 engine to synthesize the response text and speak it back to the user.
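
The four phases above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: take_command() and talk() are the function names used in this README, while parse_command(), its return values, and the response wording are hypothetical. The third-party imports are placed inside the functions so that the keyword logic can be tried without a microphone or the libraries installed. Executing the resulting actions (for example with pywhatkit or wikipedia) is left out.

```python
from datetime import datetime


def take_command():
    """Phases 1-2: capture one utterance and return it as lowercase text.

    Returns an empty string if the speech could not be recognized.
    """
    import speech_recognition as sr  # third-party, imported lazily

    listener = sr.Recognizer()
    with sr.Microphone() as source:
        # Calibrate the energy threshold against ambient noise first.
        listener.adjust_for_ambient_noise(source, duration=1)
        voice = listener.listen(source)
    try:
        return listener.recognize_google(voice).lower()
    except (sr.UnknownValueError, sr.RequestError):
        # Unintelligible speech, or the web API could not be reached.
        return ""


def parse_command(text, now=None):
    """Phase 3 (hypothetical helper): map text to an (action, argument) pair."""
    now = now or datetime.now()
    text = text.lower().strip()
    if "play" in text:
        return "play", text.replace("play", "", 1).strip()
    elif "time" in text:
        return "say", now.strftime("The time is %H:%M")
    elif "date" in text:
        return "say", now.strftime("Today is %Y-%m-%d")
    elif "wikipedia" in text:
        return "wikipedia", text.replace("wikipedia", "", 1).strip()
    elif text:
        # Assumed fallback: treat anything else as a web search query.
        return "search", text
    return "none", ""


def talk(text):
    """Phase 4: speak the response text with pyttsx3."""
    import pyttsx3  # third-party, imported lazily

    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()
```

Because the checks are plain substring tests evaluated in order, a phrase containing several keywords is routed by the first match, so the order of the branches matters.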

💻 System Design

The assistant is implemented with a multi-threaded design to maintain responsiveness.

GUI Thread (Main): Handles the tkinter window (root.mainloop()) and updates the status label (e.g., "Listening...").
Assistant Thread (Daemon): A separate thread runs the core run_assistant() function, which contains the blocking calls for microphone input (take_command()) and speech synthesis (talk()).

This separation prevents the user interface from freezing.
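
One common way to wire up such a two-thread design is sketched below. It assumes the worker reports its state through a queue.Queue that the GUI thread polls with root.after(), so that widgets are only touched from the main thread. The README does not say how the project passes status updates, so the queue, latest_status(), and run_gui() are illustrative assumptions, and the time.sleep() calls merely stand in for the blocking take_command() and talk() calls.

```python
import queue
import threading
import time


def run_assistant(status_queue, cycles=1):
    """Worker loop: reports its state while doing (simulated) blocking work."""
    for _ in range(cycles):
        status_queue.put("Listening...")
        time.sleep(0.01)  # placeholder for blocking microphone input
        status_queue.put("Processing...")
        time.sleep(0.01)  # placeholder for command handling and speech
    status_queue.put("Idle")


def latest_status(status_queue, current):
    """Drain the queue without blocking and return the newest status."""
    while True:
        try:
            current = status_queue.get_nowait()
        except queue.Empty:
            return current


def run_gui():
    """Main thread: owns the tkinter window and polls for status updates."""
    import tkinter as tk  # imported here so the helpers work without a display

    root = tk.Tk()
    root.title("Smart AI Assistant")
    label = tk.Label(root, text="Starting...")
    label.pack(padx=20, pady=20)

    status_queue = queue.Queue()
    # A daemon thread does not keep the process alive after the window closes.
    threading.Thread(
        target=run_assistant, args=(status_queue, 3), daemon=True
    ).start()

    def poll():
        # Runs on the GUI thread, so updating the label here is safe.
        label.config(text=latest_status(status_queue, label.cget("text")))
        root.after(100, poll)

    poll()
    root.mainloop()
```

Calling run_gui() opens the window. The worker never touches the label directly, which avoids updating tkinter widgets from a background thread.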

🚀 Getting Started

Prerequisites

Python 3.x
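The following Python libraries, as referenced in this README:

SpeechRecognition (imported as speech_recognition)
PyAudio (required by sr.Microphone() for microphone access)
pyttsx3
pywhatkit
wikipedia

You can install the required libraries using pip:

pip install SpeechRecognition PyAudio pyttsx3 pywhatkit wikipedia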
