
GPU batch decoding + Online request queueing mechanism #30

Open
greed2411 opened this issue Oct 22, 2021 · 1 comment
Labels
winter-of-code gdsc's woc

Comments

@greed2411
Member

Possibly along with a request queueing mechanism like ServiceStreamer for online serving.

@greed2411 greed2411 added the winter-of-code gdsc's woc label Oct 22, 2021
@pskrunner14 pskrunner14 changed the title GPU online/offline batch decoding GPU batch decoding + Online request queueing mechanism Nov 17, 2021
@pskrunner14
Contributor

pskrunner14 commented Nov 17, 2021

Task 1

Write an interface and implement GPU batch decoding for Kaldi ASR models in the kaldi-serve core C++ library.

The current partial version (on the gpu-decoder branch) is buggy (see the stale issue here); you may use it as a starting point or write one from scratch, it's up to you. The main idea is to be able to pass a custom async callback to the batch decoding pipeline, which receives the final result once the GPU compute task completes.

Relevant links:

  1. Batched Decoding binary
  2. Batched Threaded CUDA Pipeline - Source

Task 2

Implement an online request queueing mechanism, similar to ServiceStreamer, that uses the GPU batch decoding interface (Task 1) to reduce latency in the kaldi-serve gRPC server under high load.
