My training got killed with no warning message #1702
Unanswered
Envelopepiano asked this question in General Q&A
Replies: 2 comments · 1 reply
-
I have a similar issue:

import os

# BaseDatasetConfig: defines name, formatter and path of the dataset.
from TTS.tts.configs.shared_configs import BaseDatasetConfig

output_path = "tts_train_PL"
if not os.path.exists(output_path):
    os.makedirs(output_path)

dataset_config = BaseDatasetConfig(
    formatter="ljspeech", meta_file_train="metadata.csv", path=os.path.join(output_path, "PL_example/")
)

# GlowTTSConfig: all model related values for training, validating and testing.
from TTS.tts.configs.glow_tts_config import GlowTTSConfig

config = GlowTTSConfig(
    batch_size=32,
    eval_batch_size=16,
    num_loader_workers=4,
    num_eval_loader_workers=4,
    run_eval=True,
    test_delay_epochs=-1,
    epochs=100,
    text_cleaner="phoneme_cleaners",
    use_phonemes=True,
    phoneme_language="pl",
    phoneme_cache_path=os.path.join(output_path, "phoneme_cache"),
    print_step=25,
    print_eval=False,
    mixed_precision=True,
    output_path=output_path,
    datasets=[dataset_config],
    save_step=1000,
)

# Initialize the audio processor from the config.
from TTS.utils.audio import AudioProcessor

ap = AudioProcessor.init_from_config(config)
# Modify the sample rate for a custom audio dataset:
ap.sample_rate = 16000

# Initialize the tokenizer; it may update the config with extra characters.
from TTS.tts.utils.text.tokenizer import TTSTokenizer

tokenizer, config = TTSTokenizer.init_from_config(config)

# Load the train and eval samples described by the dataset config.
from TTS.tts.datasets import load_tts_samples

train_samples, eval_samples = load_tts_samples(
    dataset_config,
    eval_split=True,
    eval_split_max_size=config.eval_split_max_size,
    eval_split_size=config.eval_split_size,
)

# Initialize the model, set up the trainer, and start training.
from TTS.tts.models.glow_tts import GlowTTS

model = GlowTTS(config, ap, tokenizer, speaker_manager=None)

from trainer import Trainer, TrainerArgs

trainer = Trainer(
    TrainerArgs(), config, output_path, model=model, train_samples=train_samples, eval_samples=eval_samples
)
trainer.fit()
On the second epoch it was killed. I tried three times and got the same result. Any ideas?
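If the process is being killed by the kernel's out-of-memory killer, one thing to try before adding RAM is lowering the memory-hungry settings in the same config. A minimal sketch against the GlowTTSConfig shown above (the specific values here are guesses for debugging, not tuned recommendations):

config = GlowTTSConfig(
    batch_size=16,              # halve the batch size to cut peak memory
    eval_batch_size=8,
    num_loader_workers=2,       # fewer data-loader worker processes
    num_eval_loader_workers=2,
    mixed_precision=False,      # rule out AMP-related issues while debugging
    # ... keep the remaining fields from the snippet above unchanged ...
)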
1 reply
-
I was also facing a similar issue, and it was due to a memory leak. I increased the RAM of my server to 32 GB and it worked fine.
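To confirm a leak rather than just a high but stable footprint, you can log the training process's resident memory as it runs. A minimal sketch using psutil (psutil is my assumption here; it is not mentioned anywhere in this thread):

import os
import psutil  # assumption: third-party package, installed via `pip install psutil`

def log_rss(tag: str) -> None:
    # Print the resident set size (RSS) of the current process in megabytes.
    rss_mb = psutil.Process(os.getpid()).memory_info().rss / (1024 * 1024)
    print(f"[{tag}] RSS: {rss_mb:.0f} MB")

# Call this at points of interest, e.g. right before trainer.fit() and between
# epochs if you patch it in; numbers that grow steadily across epochs suggest a
# leak, while a large but flat value suggests the settings are simply too big.
log_rss("before fit")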
-
Hi everyone, I'm a newbie trying to train a TTS model, but I've run into some problems; one of them is that my training keeps getting killed. Can someone help me? Thanks in advance!
Sorry for the bad English, by the way.
OS: Linux (Ubuntu 18)
Python: 3.6 (per the site-packages path in the log below)
The Linux top output shows my memory usage is about 60% when the training gets killed.
By the way, my friend suggested disabling mixed precision, but I don't know how to do that (a sketch of one way follows after the log below).
Here's my terminal output:
a10@a10-ASUSPRO-D640MB-M640MB:~/TTS2$ python TTS/bin/train_tacotron.py --config_path TTS/tts/configs/config.json
2022-06-28 22:59:13.007249: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2022-06-28 22:59:13.007330: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
/home/a10/.local/lib/python3.6/site-packages/numba/errors.py:137: UserWarning: Insufficiently recent colorama version found. Numba requires colorama >= 0.3.9
warnings.warn(msg)
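On the mixed-precision question: many Coqui TTS JSON configs expose a mixed_precision flag, but whether your config.json has one is an assumption on my part, so check the file first. A minimal sketch that flips it off using only the standard library:

import json

config_path = "TTS/tts/configs/config.json"  # the path passed to train_tacotron.py above

with open(config_path) as f:
    cfg = json.load(f)

# Assumes the config has a "mixed_precision" key, like the GlowTTSConfig earlier in this thread.
cfg["mixed_precision"] = False

with open(config_path, "w") as f:
    json.dump(cfg, f, indent=4)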