title | emoji | colorFrom | colorTo | sdk | sdk_version | app_file | pinned | license | short_description |
---|---|---|---|---|---|---|---|---|---|
Gender Detection | 🚀 | gray | indigo | streamlit | 1.42.0 | app.py | false | apache-2.0 | real time gender detection app |
This is a real-time gender detection app that uses a pre-trained wav2vec2 model to classify the gender of a speaker from their voice. The app is built with Streamlit and the Hugging Face transformers library.
- Real-time audio processing
- Gender classification using a pre-trained wav2vec2 model
- Simple and intuitive user interface
- Clone the repository to your local machine.
- Ensure you have Python 3.10 installed on your system.
- Install the required dependencies using `pip install -r requirements.txt`.
- Run the app using `streamlit run app.py`.
- Click the 'Start' button to begin real-time gender detection from your microphone input.
- streamlit
- numpy
- torch
- transformers
- pyaudio
The app uses the `alefiury/wav2vec2-large-xlsr-53-gender-recognition-librispeech` model from Hugging Face for gender recognition. It is a pre-trained wav2vec2 model for gender recognition on the LibriSpeech dataset. The model is loaded using the `AutoFeatureExtractor` and `AutoModelForAudioClassification` classes from the Hugging Face transformers library.
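The loading step described above might look roughly like the following. This is a minimal sketch, not the app's actual code: the helper names `load_model` and `classify` are hypothetical, and the waveform is assumed to be a 1-D float array sampled at 16 kHz. The label returned comes from the model's own `id2label` mapping, so no label names are assumed here.

```python
MODEL_ID = "alefiury/wav2vec2-large-xlsr-53-gender-recognition-librispeech"


def load_model(model_id=MODEL_ID):
    """Load the feature extractor and classifier (hypothetical helper)."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AutoFeatureExtractor, AutoModelForAudioClassification

    extractor = AutoFeatureExtractor.from_pretrained(model_id)
    model = AutoModelForAudioClassification.from_pretrained(model_id)
    model.eval()  # inference only, disables dropout
    return extractor, model


def classify(waveform, extractor, model, sampling_rate=16000):
    """Return the predicted label for a 1-D float waveform (hypothetical helper)."""
    import torch

    # The extractor converts the raw waveform into model-ready tensors.
    inputs = extractor(waveform, sampling_rate=sampling_rate, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    predicted_id = int(logits.argmax(dim=-1))
    # Map the class index to the label name stored in the model config.
    return model.config.id2label[predicted_id]
```

Calling `load_model()` downloads the model weights on first use, so it is usually worth loading once at startup and reusing the returned objects for every audio chunk.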
The app uses the `pyaudio` library to capture audio from the microphone. Capture is handled by the `AudioStream` class: its `start_stream` method begins recording and its `stop_stream` method ends it.
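As an illustration of microphone capture with `pyaudio`, here is a minimal sketch. It does not reproduce the app's `AudioStream` class, whose code is not shown here; instead it calls `pyaudio.PyAudio().open(...)` directly. The helper names `open_microphone` and `record_chunks` and the default parameter values are assumptions made for this example.

```python
from contextlib import contextmanager


@contextmanager
def open_microphone(rate=16000, channels=1, frames_per_buffer=1024):
    """Yield an open PyAudio input stream and release it afterwards."""
    # Imported lazily so the module loads on machines without audio drivers.
    import pyaudio

    audio = pyaudio.PyAudio()
    stream = audio.open(
        format=pyaudio.paInt16,  # 16-bit samples
        channels=channels,
        rate=rate,
        input=True,  # capture rather than playback
        frames_per_buffer=frames_per_buffer,
    )
    try:
        yield stream
    finally:
        # Stop and release the device even if reading fails.
        stream.stop_stream()
        stream.close()
        audio.terminate()


def record_chunks(num_chunks, frames_per_buffer=1024):
    """Read num_chunks buffers of raw PCM bytes from the default microphone."""
    with open_microphone(frames_per_buffer=frames_per_buffer) as stream:
        return [stream.read(frames_per_buffer) for _ in range(num_chunks)]
```

Wrapping the stream in a context manager keeps the stop, close, and terminate calls in one place, which matters in a Streamlit app where the script may be rerun or interrupted.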
The app uses the `streamlit` library to create the web application. It is configured to use a 16 kHz sampling rate and mono audio input. The audio stream parameters are defined as follows:
- FORMAT: 16-bit resolution
- CHANNELS: 1 (Mono audio)
- RATE: 16000 (16kHz sampling rate)
- CHUNK: 1024 (Number of frames per buffer)
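To make the parameters above concrete, the sketch below works out how much audio one buffer holds and converts raw 16-bit PCM bytes into floats. The constant names mirror the list above. The helper functions are hypothetical, and scaling samples to the range [-1.0, 1.0) is a common convention assumed here rather than something the app is confirmed to do.

```python
from array import array

SAMPLE_BYTES = 2  # FORMAT: 16-bit resolution, i.e. 2 bytes per sample
CHANNELS = 1      # mono audio
RATE = 16000      # samples per second
CHUNK = 1024      # frames per buffer


def chunk_duration_seconds(chunk=CHUNK, rate=RATE):
    """Length of one buffer in seconds."""
    return chunk / rate


def chunk_size_bytes(chunk=CHUNK, channels=CHANNELS, sample_bytes=SAMPLE_BYTES):
    """Size of one buffer in bytes."""
    return chunk * channels * sample_bytes


def pcm16_to_float(raw):
    """Convert native-endian signed 16-bit PCM bytes to floats in [-1.0, 1.0)."""
    samples = array("h")  # "h" is a signed 16-bit integer on common platforms
    samples.frombytes(raw)
    return [s / 32768.0 for s in samples]
```

With these values each buffer holds 1024 / 16000 = 0.064 seconds of audio and occupies 2048 bytes, so roughly 16 buffers make up one second of speech.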
The app uses Python's built-in logging module to log information with a basic configuration that includes timestamps and log levels.
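A basic logging configuration with timestamps and log levels could look like the sketch below. The exact format string, the logger name `gender_detection`, and the `configure_logging` helper are assumptions for illustration; the app's actual settings may differ.

```python
import logging

# Timestamp, level, then the message (assumed format).
LOG_FORMAT = "%(asctime)s - %(levelname)s - %(message)s"


def configure_logging(level=logging.INFO):
    """Set up root logging and return a named logger (hypothetical helper)."""
    # force=True (Python 3.8+) replaces any handlers already attached to the
    # root logger, so repeated calls do not stack duplicate handlers.
    logging.basicConfig(level=level, format=LOG_FORMAT, force=True)
    return logging.getLogger("gender_detection")


if __name__ == "__main__":
    logger = configure_logging()
    logger.info("Audio stream started")
```

Replacing existing handlers is useful under Streamlit, which reruns the script on each interaction and would otherwise risk repeated configuration.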
This project is licensed under the Apache-2.0 License.
- Hugging Face for providing the pre-trained wav2vec2 model.
- Streamlit for the easy-to-use web application framework.
- PyAudio for handling audio input.
https://huggingface.co/spaces/prashant-garg/gender-detection
Note: The model is hosted on Hugging Face and the app is deployed on Hugging Face Spaces. However, the deployed app cannot capture audio from the microphone, because Hugging Face Spaces does not have audio drivers enabled.