I'm trying to follow this example: https://www.thepythoncode.com/article/using-speech-recognition-to-convert-speech-to-text-python
I'm following the part of the example that does speech recognition from the microphone, and I wrote the code below:
class Voice:
    def __init__(self) -> None:
        import speech_recognition as sr
        r = sr.Recognizer()
        print("Recognizing...")
        with sr.Microphone() as source:
            # read the audio data from the default microphone
            audio_data = r.record(source, duration=5)
            # convert speech to text
            text = r.recognize_google(audio_data, language="es-ES")
            print(text)

new_voice = Voice()
I only get the right result when I use r.record with a fixed duration, such as 5 seconds:
audio_data = r.record(source, duration=5)
But I want something similar to Google's voice input, which recognizes when the user stops talking.
I tried these three other ways, but none of them ever returns:
audio_data = r.listen(source)
audio_data = r.listen(source, timeout=2)
audio_data = r.record(source)
The terminal doesn't show any error; it's as if it's still waiting for me to talk.
I found this in the documentation for the record method:
Records up to `duration` seconds of audio from `source` (an `AudioSource` instance) starting at `offset` (or at the beginning if not specified) into an `AudioData` instance, which it returns.
If `duration` is not specified, then it will record until there is no more audio input.
However, I literally muted the mic after speaking, and even then the terminal stayed the same.
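For reference, here is a minimal sketch of the stop-when-I-stop-talking behaviour I'm after. It is untested on my setup and rests on some assumptions: that r.listen never returns because the recognizer's energy threshold sits below the room's background noise (so it never detects silence), and that calibrating with adjust_for_ambient_noise plus setting timeout and phrase_time_limit bounds the wait. The helper names listen_until_silence and transcribe_from_microphone, and all the default values, are my own placeholders and not part of the library.

```python
def listen_until_silence(recognizer, source, calibrate_seconds=1.0,
                         pause_seconds=0.8, max_wait=5.0, max_phrase=15.0):
    """Capture one phrase and stop once the speaker pauses.

    `recognizer` is expected to behave like speech_recognition.Recognizer and
    `source` like an already-opened speech_recognition.Microphone.
    """
    # Sample the background noise so the energy threshold ends up above it;
    # if the threshold is too low, the room noise is treated as speech and
    # the phrase never ends.
    recognizer.adjust_for_ambient_noise(source, duration=calibrate_seconds)
    # Seconds of silence that mark the end of a phrase (library default: 0.8).
    recognizer.pause_threshold = pause_seconds
    # timeout: maximum seconds to wait for speech to start.
    # phrase_time_limit: hard cap on the phrase length, as a safety net.
    return recognizer.listen(source, timeout=max_wait,
                             phrase_time_limit=max_phrase)


def transcribe_from_microphone(language="es-ES"):
    """Hypothetical wrapper: record one phrase and print the transcription."""
    import speech_recognition as sr  # imported lazily, needs PyAudio as well

    r = sr.Recognizer()
    with sr.Microphone() as source:
        print("Recognizing...")
        try:
            audio_data = listen_until_silence(r, source)
        except sr.WaitTimeoutError:
            # Raised by listen() when no speech starts within `timeout`.
            print("No speech detected.")
            return None
    try:
        text = r.recognize_google(audio_data, language=language)
    except sr.UnknownValueError:
        print("Could not understand the audio.")
        return None
    except sr.RequestError as exc:
        print(f"Recognition request failed: {exc}")
        return None
    print(text)
    return text
```

Calling transcribe_from_microphone() would replace the body of Voice.__init__ above. As for r.record(source) without a duration, my understanding is that a microphone stream never reaches "no more audio input": muting the mic only produces silence, not the end of the stream, so that call would be expected to block forever.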