1. Introduction
As the title suggests, this demo project shows a very basic voice assistant script that answers your questions in the terminal based on Google search results.
You can find the full code in the GitHub repository: dimitryzub/serpapi-demo-projects/speech-recognition/cli-based/
Subsequent blog posts will cover:
- A web-based solution using Flask, some HTML, CSS and JavaScript.
- An Android and Windows solution using Flutter and Dart.
2. What we will build in this blog post
2.1 Environment preparation
First, let’s make sure we are working in a separate environment and have the libraries required for the project installed correctly. The hardest part (possibly) is installing pyaudio. Please refer to the following to overcome this difficulty:
[Solution] Fix PyAudio pip installation error on win 32/64-bit operating system
2.2 Virtual environment and library installation
Before we start installing libraries, we need to create and activate a new environment for this project:
# if you're on Linux based systems
$ python -m venv env && source env/bin/activate
$ (env) <path>

# if you're on Windows and using Bash terminal
$ python -m venv env && source env/Scripts/activate
$ (env) <path>

# if you're on Windows and using CMD
python -m venv env && .\env\Scripts\activate
$ (env) <path>
Explanation: python -m venv env tells Python to run the module (-m) venv and create a folder named env. && stands for “and”. source activates your environment, so the libraries you install end up only in that environment.
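To confirm that activation worked before installing anything, a small check like the following can help. It is a minimal sketch that relies only on the standard library; the helper name in_virtualenv is an illustrative choice, not part of the project.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment folder, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == '__main__':
    print('virtual environment active:', in_virtualenv())
    print('packages will be installed under:', sys.prefix)
```

If the first line prints False, the environment was most likely not activated in the current terminal session.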
Now install all required libraries:
pip install rich pyttsx3 SpeechRecognition google-search-results python-dotenv
Now for pyaudio. Keep in mind that pyaudio may throw errors during installation, so you may need to do some additional research.
If you are using Linux, you need to install some development dependencies to use pyaudio:
$ sudo apt-get install -y libasound-dev portaudio19-dev
$ pip install pyaudio
If you’re using Windows, it’s even simpler (tested with CMD and Git Bash):
pip install pyaudio
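Since installation problems are common here, it may be worth verifying that everything can be found before running the assistant. The sketch below uses importlib from the standard library; the helper name missing_modules and the REQUIRED_MODULES list are assumptions made for illustration, based on the imports the script uses.

```python
from importlib.util import find_spec

# Import names differ from the pip package names in a few cases:
# SpeechRecognition -> speech_recognition, google-search-results -> serpapi,
# python-dotenv -> dotenv.
REQUIRED_MODULES = ['rich', 'pyttsx3', 'speech_recognition', 'serpapi', 'dotenv', 'pyaudio']


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == '__main__':
    missing = missing_modules(REQUIRED_MODULES)
    print('missing modules:', ', '.join(missing) if missing else 'none')
```

The check only locates the modules without importing them, so it does not touch the microphone or audio drivers.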
3. Complete code
import os
import speech_recognition
import pyttsx3
from serpapi import GoogleSearch
from rich.console import Console
from dotenv import load_dotenv

load_dotenv('.env')
console = Console()


def main():
    console.rule('[bold yellow]SerpApi Voice Assistant Demo Project')
    recognizer = speech_recognition.Recognizer()

    while True:
        with console.status(status='Listening to you...', spinner='point') as progress_bar:
            try:
                with speech_recognition.Microphone() as mic:
                    recognizer.adjust_for_ambient_noise(mic, duration=0.1)
                    audio = recognizer.listen(mic)
                    text = recognizer.recognize_google(audio_data=audio).lower()
                    console.print(f'[bold]Recognized text[/bold]: {text}')

                    progress_bar.update(status='Looking for answers...', spinner='line')
                    params = {
                        'api_key': os.getenv('API_KEY'),
                        'device': 'desktop',
                        'engine': 'google',
                        'q': text,
                        'google_domain': 'google.com',
                        'gl': 'us',
                        'hl': 'en'
                    }
                    search = GoogleSearch(params)
                    results = search.get_dict()

                    try:
                        if 'answer_box' in results:
                            try:
                                primary_answer = results['answer_box']['answer']
                            except KeyError:
                                primary_answer = results['answer_box']['result']
                            console.print(f'[bold]The answer is[/bold]: {primary_answer}')
                        elif 'knowledge_graph' in results:
                            secondary_answer = results['knowledge_graph']['description']
                            console.print(f'[bold]The answer is[/bold]: {secondary_answer}')
                        else:
                            # 'answer_box' is absent here, so this lookup raises KeyError
                            # and the "didn't find the answer" handler below takes over
                            tertiary_answer = results['answer_box']['list']
                            console.print(f'[bold]The answer is[/bold]: {tertiary_answer}')

                        progress_bar.stop()  # if the answer is found -> stop progress bar
                        user_prompt_to_continue_if_answer_is_success = input('Would you like to search for something again? (y/n) ')

                        if user_prompt_to_continue_if_answer_is_success == 'y':
                            recognizer = speech_recognition.Recognizer()
                            continue  # run speech recognition again until `user_prompt` == 'n'
                        else:
                            console.rule('[bold yellow]Thank you for checking SerpApi Voice Assistant Demo Project')
                            break
                    except KeyError:
                        progress_bar.stop()
                        error_user_prompt = input("Sorry, didn't find the answer. Would you like to rephrase it? (y/n) ")

                        if error_user_prompt == 'y':
                            recognizer = speech_recognition.Recognizer()
                            continue  # run speech recognition again until `user_prompt` == 'n'
                        else:
                            console.rule('[bold yellow]Thank you for checking SerpApi Voice Assistant Demo Project')
                            break
            except speech_recognition.UnknownValueError:
                progress_bar.stop()
                user_prompt_to_continue = input("Sorry, didn't quite understand you. Could you say it again? (y/n) ")

                if user_prompt_to_continue == 'y':
                    recognizer = speech_recognition.Recognizer()
                    continue  # run speech recognition again until `user_prompt` == 'n'
                else:
                    progress_bar.stop()
                    console.rule('[bold yellow]Thank you for checking SerpApi Voice Assistant Demo Project')
                    break


if __name__ == '__main__':
    main()
4. Code Description
Import libraries:
import os
import speech_recognition
import pyttsx3
from serpapi import GoogleSearch
from rich.console import Console
from dotenv import load_dotenv
- rich: a Python library for beautiful formatting in the terminal.
- pyttsx3: a Python text-to-speech converter that works offline.
- SpeechRecognition: a Python library for converting speech to text.
- google-search-results: SerpApi's Python API wrapper that parses data from 15+ search engines.
- os: reads the secret environment variable. In this case, it’s the SerpApi API key.
- dotenv: loads environment variables (the SerpApi API key) from the .env file. The .env file can be given any other name (for example, .napoleon); by convention, the leading . (dot) marks it as an environment variable file.
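Although pyttsx3 is imported, the script shown in this post only prints its answers. If you want them read aloud as well, a minimal sketch could look like the one below. The helper name speak and its engine parameter are hypothetical additions (the parameter exists so the function can be exercised without audio hardware); init, say and runAndWait are the pyttsx3 calls used for basic playback.

```python
def speak(text, engine=None):
    """Say `text` out loud with pyttsx3 (hypothetical helper, not in the original script)."""
    if engine is None:
        # Imported lazily so that code using this helper still loads without pyttsx3.
        import pyttsx3
        engine = pyttsx3.init()
    engine.say(text)      # queue the phrase
    engine.runAndWait()   # block until the queued speech has been played
```

Calling speak(primary_answer) right after the corresponding console.print would read the answer back to the user.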
Define the rich Console(). It will be used to beautify the terminal output (animations, etc.):
console = Console()
Define the main function, where everything happens:
def main():
    console.rule('[bold yellow]SerpApi Voice Assistant Demo Project')
    recognizer = speech_recognition.Recognizer()
At the beginning of the function, we define speech_recognition.Recognizer() and call console.rule, which creates the following output:
─────────────────────────────────── SerpApi Voice Assistant Demo Project ───────────────────────────────────
The next step is to create a while loop that will constantly listen to microphone input to recognize speech:
while True:
    with console.status(status='Listening to you...', spinner='point') as progress_bar:
        try:
            with speech_recognition.Microphone() as mic:
                recognizer.adjust_for_ambient_noise(mic, duration=0.1)
                audio = recognizer.listen(mic)
                text = recognizer.recognize_google(audio_data=audio).lower()
                console.print(f'[bold]Recognized text[/bold]: {text}')
- console.status: a rich progress bar, used for decorative purposes only.
- speech_recognition.Microphone(): starts picking up input from the microphone.
- recognizer.adjust_for_ambient_noise: calibrates the energy threshold based on the ambient energy level.
- recognizer.listen: listens to the actual user speech.
- recognizer.recognize_google: performs speech recognition using the Google Speech Recognition API. lower() converts the recognized text to lowercase.
- console.print: a rich print statement that allows text modifications such as bold, italic, etc.
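To get a feel for what calibrating an energy threshold means, here is a purely conceptual sketch using only the standard library. It is not the SpeechRecognition implementation, whose actual algorithm may differ; the function names and the ratio value are assumptions chosen for illustration.

```python
import math


def rms(samples):
    """Root-mean-square energy of a sequence of audio samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def calibrate_threshold(ambient_samples, ratio=1.5):
    """Pick a threshold somewhat above the measured ambient energy.

    `ratio` is an illustrative assumption, not a value taken from the library.
    """
    return rms(ambient_samples) * ratio


def is_speech(samples, threshold):
    """Treat a chunk as speech when its energy exceeds the calibrated threshold."""
    return rms(samples) > threshold


if __name__ == '__main__':
    background = [3, -4, 3, -4]                            # quiet room noise
    threshold = calibrate_threshold(background)
    print(is_speech([2, -3, 2, -3], threshold))            # quiet chunk
    print(is_speech([300, -400, 300, -400], threshold))    # loud chunk
```

The idea is that a short sample of background noise sets the baseline, and only audio noticeably louder than that baseline is treated as speech.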
spinner='point' selects the animation shown while the assistant is listening (use python -m rich.spinner to see the list of available spinners).
After that, we need to initialize the SerpApi search parameters for searching:
progress_bar.update(status='Looking for answers...', spinner='line')
params = {
    'api_key': os.getenv('API_KEY'),  # serpapi api key
    'device': 'desktop',              # device used for the search
    'engine': 'google',               # serpapi parsing engine: https://serpapi.com/status
    'q': text,                        # search query
    'google_domain': 'google.com',    # google domain: https://serpapi.com/google-domains
    'gl': 'us',                       # country of the search: https://serpapi.com/google-countries
    'hl': 'en'                        # language of the search: https://serpapi.com/google-languages
    # other parameters such as locations: https://serpapi.com/locations-api
}
search = GoogleSearch(params)         # where data extraction happens on the SerpApi backend
results = search.get_dict()           # JSON -> Python dict
progress_bar.update updates progress_bar with a new status (the text printed in the console) and switches the animation to spinner='line'.
After that, the data is extracted from Google search using SerpApi's Google Search Engine API.
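The parameter setup can also be pulled into a small helper that fails early when the API key is missing. This is a sketch under assumptions: build_params is a hypothetical name, and the env argument exists only so the function can be tried with a plain dict instead of the real environment.

```python
import os


def build_params(query, env=os.environ):
    """Build the search parameters for a recognized query (hypothetical helper)."""
    api_key = env.get('API_KEY')
    if not api_key:
        # Failing early gives a clearer message than a rejected API request.
        raise RuntimeError('API_KEY is not set; add it to your .env file')
    return {
        'api_key': api_key,
        'engine': 'google',
        'q': query,
        'google_domain': 'google.com',
        'device': 'desktop',
        'gl': 'us',
        'hl': 'en',
    }
```

With this in place, the call in the script would become GoogleSearch(build_params(text)), provided load_dotenv has already run.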
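The following portion of the code looks for an answer in the results, prints it, and asks whether you want to search again. If no answer is found, it asks whether you want to rephrase the question: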
try:
    if 'answer_box' in results:
        try:
            primary_answer = results['answer_box']['answer']
        except KeyError:
            primary_answer = results['answer_box']['result']
        console.print(f'[bold]The answer is[/bold]: {primary_answer}')
    elif 'knowledge_graph' in results:
        secondary_answer = results['knowledge_graph']['description']
        console.print(f'[bold]The answer is[/bold]: {secondary_answer}')
    else:
        # 'answer_box' is absent here, so this lookup raises KeyError
        # and the "didn't find the answer" handler below takes over
        tertiary_answer = results['answer_box']['list']
        console.print(f'[bold]The answer is[/bold]: {tertiary_answer}')

    progress_bar.stop()  # if the answer is found -> stop progress bar
    user_prompt_to_continue_if_answer_is_success = input('Would you like to search for something again? (y/n) ')

    if user_prompt_to_continue_if_answer_is_success == 'y':
        recognizer = speech_recognition.Recognizer()
        continue  # run speech recognition again until `user_prompt` == 'n'
    else:
        console.rule('[bold yellow]Thank you for checking SerpApi Voice Assistant Demo Project')
        break
except KeyError:
    progress_bar.stop()  # if the answer is not found -> stop progress bar
    error_user_prompt = input("Sorry, didn't find the answer. Would you like to rephrase it? (y/n) ")

    if error_user_prompt == 'y':
        recognizer = speech_recognition.Recognizer()
        continue  # run speech recognition again until `user_prompt` == 'n'
    else:
        console.rule('[bold yellow]Thank you for checking SerpApi Voice Assistant Demo Project')
        break
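The answer lookup can be easier to follow and test as a standalone function that takes the results dict and returns an answer or None. The sketch below is one possible refactoring, not the author's code: extract_answer is a hypothetical name, and unlike the script it also checks the 'list' key inside an existing answer box and falls back to the knowledge graph when the answer box has none of the expected keys.

```python
def extract_answer(results):
    """Return the first available answer from a results dict, or None."""
    answer_box = results.get('answer_box', {})
    # Try the keys the script reads from the answer box, in order.
    for key in ('answer', 'result', 'list'):
        if key in answer_box:
            return answer_box[key]
    # Fall back to the knowledge graph description when present.
    return results.get('knowledge_graph', {}).get('description')


if __name__ == '__main__':
    sample = {'knowledge_graph': {'description': 'A sample description.'}}
    print(extract_answer(sample))
```

Returning None instead of raising KeyError lets the caller decide how to react, for example by asking the user to rephrase the question.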
The final step is to handle errors when the microphone doesn’t pick up sound:
# while True:
#     with console.status(status='Listening to you...', spinner='point') as progress_bar:
#         try:
#             speech recognition code
#             data extraction code
            except speech_recognition.UnknownValueError:
                progress_bar.stop()  # if the speech wasn't understood -> stop progress bar
                user_prompt_to_continue = input("Sorry, didn't quite understand you. Could you say it again? (y/n) ")

                if user_prompt_to_continue == 'y':
                    recognizer = speech_recognition.Recognizer()
                    continue  # run speech recognition again until `user_prompt` == 'n'
                else:
                    progress_bar.stop()  # if the user wants to quit -> stop progress bar
                    console.rule('[bold yellow]Thank you for checking SerpApi Voice Assistant Demo Project')
                    break
console.rule() will provide the following output:
────────────────────── Thank you for checking SerpApi Voice Assistant Demo Project ──────────────────────
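The script compares each reply to 'y' exactly, so an answer such as Y or yes ends the session. A more forgiving check could be sketched as follows; wants_to_continue is a hypothetical helper name, not something the script defines.

```python
def wants_to_continue(reply):
    """Interpret a (y/n) reply; anything other than y/yes counts as 'no'."""
    return reply.strip().lower() in ('y', 'yes')


if __name__ == '__main__':
    for reply in ('y', ' Yes ', 'n', ''):
        print(repr(reply), '->', wants_to_continue(reply))
```

Each of the three prompts in the script could then use if wants_to_continue(input(...)) instead of comparing strings directly.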
Add the if __name__ == '__main__' idiom, which prevents the script from running unintentionally (for example, when the file is imported), and call the main function that runs the entire script:
if __name__ == '__main__':
    main()
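The effect of the idiom can be seen with a stripped-down stand-in. In this sketch, main is a placeholder that returns a message instead of starting the assistant, so it is safe to run anywhere.

```python
def main():
    """Stand-in for the script's main(); returns a message instead of listening."""
    return 'assistant started'


# True only when the file is executed directly (python script.py),
# False when it is imported from another module.
if __name__ == '__main__':
    print(main())
```

Running the file prints the message, while importing it from another module defines main without calling it.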
5. Links
rich
pyttsx3
SpeechRecognition
google-search-results
os
dotenv