When I use the Silero VAD model on one thread (executing it alone) it works perfectly, but if I have, for example, a Django web server with multiple audio requests, it fails when handling two audio files at the same time.

```python
import torch

torch.set_num_threads(4)
model, utils = torch.hub.load(repo_or_dir='snakers4/silero-vad',
                              model='silero_vad',
                              force_reload=False,
                              onnx=False)
```

Here is the `vad()` function for returning the start and end of speech in an audio file:

```python
from pydub import AudioSegment

SAMPLING_RATE = 16000

def vad(filename, model, utils):
    # Resample the input file to 16 kHz and overwrite it in place
    org_wav = AudioSegment.from_file(filename)
    new_wav = org_wav.set_frame_rate(SAMPLING_RATE)
    new_wav.export(filename, format='wav')

    (get_speech_timestamps,
     save_audio,
     read_audio,
     VADIterator,
     collect_chunks) = utils

    wav = read_audio(filename, sampling_rate=SAMPLING_RATE)

    # Get speech timestamps from the full audio file
    try:
        speech_timestamps = get_speech_timestamps(wav, model, sampling_rate=SAMPLING_RATE)
    except Exception as e:
        print("Error occurred during VAD processing:", str(e))
        return None

    # Convert sample offsets to millisecond [start, end] pairs
    segments = []
    for speech in speech_timestamps:
        segments.append([int((speech['start'] / SAMPLING_RATE) * 1000),
                         int((speech['end'] / SAMPLING_RATE) * 1000)])
    return [segments[0][0], segments[-1][1]]
```

So when there are two requests at the same time I get these errors:

And then when I set threads to 4 I got:

Sorry for my ignorance in this domain, but could you clarify how to handle multiple requests (is it related to torch threading?) using this model?
Replies: 1 comment
Hi,

First of all, it is better to use the ONNX model, since it requires fewer dependencies: `onnx-runtime` weighs much less, and the `onnx` model is 2-3 times faster.

Secondly, the model itself is a reference to an underlying lower-level object. I am not familiar with how Django handles concurrency (is it thread-safe, does it use threads, processes, asyncio, etc.). But it looks like several requests are somehow made at the same time to the same underlying model object, which entails a collision.
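As a side note on the ONNX suggestion above, switching backends is a one-flag change in the same `torch.hub.load` call from the question (a minimal sketch, mirroring the question's arguments):

```python
import torch

# onnx=True returns an onnxruntime-backed model instead of the
# TorchScript one; onnxruntime is a much lighter dependency
model, utils = torch.hub.load(repo_or_dir='snakers4/silero-vad',
                              model='silero_vad',
                              force_reload=False,
                              onnx=True)
```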
There are many ways and instruments in Python to do this correctly:

If you process whole files, having a messaging system like `celery` or `rabbit-mq` may work well. A separate process (worker, consum…
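If a full message queue is more than you need, one lightweight in-process option (a sketch only, not an official recommendation; `vad` is the question's helper and `vad_threadsafe` is a hypothetical wrapper) is to serialize access to the shared model with a lock:

```python
import threading

# A single shared model instance guarded by a lock, so concurrent
# Django requests cannot run inference on it at the same time
_model_lock = threading.Lock()

def vad_threadsafe(filename, model, utils):
    # Requests block here until the lock is free, so inference
    # on the shared model is serialized across threads
    with _model_lock:
        return vad(filename, model, utils)
```

This trades throughput for safety, since requests queue behind the lock instead of colliding; with process-based workers (e.g. several Gunicorn processes), loading one model per process avoids sharing entirely.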