Division by zero error when generating dataset as part of .\start_finetune.bat #489

raza-qazi · 2025-01-10T21:46:02Z

During Step 1 (Create dataset) of the finetuning process in AllTalk's .\start_finetune.bat, the script fails with a ZeroDivisionError. The full log is included below.

Steps to Reproduce:

1. Run start_finetune.bat
2. Upload a new audio sample
3. Set the project name
4. Use the following settings:
   - Whisper Model: large-v3, large-v2 (both failing)
   - Model Precision: Mixed
   - Dataset Language: en
   - Evaluation Data Split: 15
   - BPE Tokenizer: Enabled
   - VAD: Enabled
   - Min/Max audio length: default and 4-20 seconds (both failing)
5. Click "Step 1 - Create dataset"
6. The error appears in the log

1 sample file uploaded:
ffmpeg output:

Input #0, wav, from 'sample1.wav':
  Duration: 00:05:04.25, bitrate: 705 kb/s
    Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, mono, s16, 705 kb/s

Text/logs

[FINETUNE] [INFO] Initializing output directory: C:\Users\User\code\alltalk_tts\finetune\newvoice
[FINETUNE] [MODEL] Using device: cuda
[FINETUNE] [GPU] GPU Memory Status:
[FINETUNE] [GPU] Total: 12282.00 MB
[FINETUNE] [GPU] Used:  1282.73 MB
[FINETUNE] [GPU] Free:  10999.27 MB
[FINETUNE] [MODEL] Loading Whisper model: large-v2
[FINETUNE] [MODEL] Using mixed precision
[FINETUNE] [MODEL] Initializing Silero VAD
Using cache found in C:\Users\User/.cache\torch\hub\snakers4_silero-vad_master
[FINETUNE] [GPU] GPU Memory Status:
[FINETUNE] [GPU] Total: 12282.00 MB
[FINETUNE] [GPU] Used:  10870.58 MB
[FINETUNE] [GPU] Free:  1411.42 MB
[FINETUNE] [INFO] Using existing language setting
[FINETUNE] [AUDIO] Found 1 audio files to process
[FINETUNE] [INFO] Processing: sample1
[FINETUNE] [AUDIO] Original audio duration: 304.25s
[FINETUNE] [AUDIO] Processing with VAD
[FINETUNE] [SEG] Merged 0 segments into 22 segments with mid-range preference
[FINETUNE] [SEG] VAD processing: 22 original segments, 22 after merging
[FINETUNE] [SEG] Merged 0 segments into 22 segments with mid-range preference
[FINETUNE] [SEG] Merged 0 short segments
<...>
[FINETUNE] [AUDIO] Audio Processing Statistics:
[FINETUNE] [AUDIO] Total segments: 22
[FINETUNE] [AUDIO] Average duration: 14.42s
[FINETUNE] [AUDIO] Segments under minimum: 0
[FINETUNE] [AUDIO] Segments over maximum: 3
[FINETUNE] [DATA] Processing metadata and handling duplicates
[FINETUNE] [DUP] Found 15 files with multiple transcriptions
[FINETUNE] [DUP] wavs/sample1_00000020.wav: 4 occurrences
<...>
[FINETUNE] [DUP] Re-transcribing duplicate files to get best transcription
[FINETUNE] [DUP] Re-transcribing wavs/sample1_00000020.wav
[FINETUNE] [DUP] Re-transcribing wavs/sample1_00000003.wav
[FINETUNE] [DUP] Re-transcribing wavs/sample1_00000004.wav
[FINETUNE] [DUP] Re-transcribing wavs/sample1_00000014.wav
[FINETUNE] [DUP] Re-transcribing wavs/sample1_00000018.wav
Traceback (most recent call last):
  File "C:\Users\User\code\alltalk_tts\finetune.py", line 3866, in preprocess_dataset
    pd_train_meta, pd_eval_meta, pd_audio_total_size = format_audio_list(
                                                       ^^^^^^^^^^^^^^^^^^
  File "C:\Users\User\code\alltalk_tts\finetune.py", line 1367, in format_audio_list
    best_transcriptions = handle_duplicates(
                          ^^^^^^^^^^^^^^^^^^
  File "C:\Users\User\code\alltalk_tts\finetune.py", line 1920, in handle_duplicates
    "confidence": sum(s.get("confidence", 0) for s in result["segments"])
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ZeroDivisionError: division by zero
Traceback (most recent call last):
  File "C:\Users\User\code\alltalk_tts\alltalk_environment\env\Lib\site-packages\gradio\queueing.py", line 536, in process_events
    response = await route_utils.call_process_api(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\User\code\alltalk_tts\alltalk_environment\env\Lib\site-packages\gradio\route_utils.py", line 322, in call_process_api
    output = await app.get_blocks().process_api(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\User\code\alltalk_tts\alltalk_environment\env\Lib\site-packages\gradio\blocks.py", line 1945, in process_api
    data = await self.postprocess_data(block_fn, result["prediction"], state)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\User\code\alltalk_tts\alltalk_environment\env\Lib\site-packages\gradio\blocks.py", line 1717, in postprocess_data
    self.validate_outputs(block_fn, predictions)  # type: ignore
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\User\code\alltalk_tts\alltalk_environment\env\Lib\site-packages\gradio\blocks.py", line 1691, in validate_outputs
    raise ValueError(
ValueError: An event handler (preprocess_dataset) didn't receive enough output values (needed: 6, received: 3).
Wanted outputs:
    [<gradio.components.label.Label object at 0x00000218E74B2950>, <gradio.components.textbox.Textbox object at 0x00000218E7A466D0>, <gradio.components.textbox.Textbox object at 0x00000218E74972D0>, <gradio.components.textbox.Textbox object at 0x00000218E2B38A10>, <gradio.components.textbox.Textbox object at 0x00000218E8CB5690>, <gradio.components.textbox.Textbox object at 0x00000218E8CCA110>]
Received outputs:
    ["The data processing was interrupted due to an error!! Please check the console to verify the full error message! 
 Error summary: Traceback (most recent call last):
  File "C:\Users\User\code\alltalk_tts\finetune.py", line 3866, in preprocess_dataset
    pd_train_meta, pd_eval_meta, pd_audio_total_size = format_audio_list(
                                                       ^^^^^^^^^^^^^^^^^^
  File "C:\Users\User\code\alltalk_tts\finetune.py", line 1367, in format_audio_list
    best_transcriptions = handle_duplicates(
                          ^^^^^^^^^^^^^^^^^^
  File "C:\Users\User\code\alltalk_tts\finetune.py", line 1920, in handle_duplicates
    "confidence": sum(s.get("confidence", 0) for s in result["segments"])
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ZeroDivisionError: division by zero
", "", ""]

Desktop (please complete the following information):
AllTalk was updated: Jan 8th 2025 (fresh setup)
Custom Python environment: no
Text-generation-webUI was updated: Jan 8th 2025 (fresh setup)

erew123 · 2025-01-19T20:49:47Z

Hi @raza-qazi

Apologies for the late reply. I believe I have a fix for this, which I will apply when I get a chance. However, I am currently travelling for a family emergency, so I will get back to this when I can.
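
For anyone hitting the same traceback in the meantime: the maintainer's fix is not shown in this thread, so the following is only a minimal sketch of one possible workaround. It assumes the failing expression in handle_duplicates averages the per-segment "confidence" values by dividing the sum by len(result["segments"]), and that Whisper returned an empty segment list for one of the re-transcribed clips. The helper name mean_segment_confidence and its default parameter are hypothetical and do not exist in finetune.py.

```python
def mean_segment_confidence(segments, default=0.0):
    """Average the "confidence" of each segment dict.

    Returns `default` when `segments` is empty or None, which avoids the
    ZeroDivisionError from dividing by len(segments) == 0.
    Hypothetical helper: not part of AllTalk's finetune.py.
    """
    if not segments:
        return default
    # Segments without a "confidence" key count as 0, matching the
    # s.get("confidence", 0) call visible in the traceback.
    return sum(s.get("confidence", 0) for s in segments) / len(segments)


# Illustrative data only: a transcription result with no segments.
empty_result = {"segments": []}
print(mean_segment_confidence(empty_result["segments"]))

# A result with two segments averages their confidences.
normal_result = {"segments": [{"confidence": 0.5}, {"confidence": 1.0}]}
print(mean_segment_confidence(normal_result["segments"]))
```

Falling back to a confidence of 0 means a clip with no recognized segments would simply lose to any other transcription candidate. Whether such clips should instead be skipped or dropped from the dataset is a design decision for the maintainer.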
