
so-VITS-svc training error in Colab

2023-02-18 23:24 · Author: 神庚Official

INFO:48k:{'train': {'log_interval': 200, 'eval_interval': 1000, 'seed': 1234, 'epochs': 10000, 'learning_rate': 0.0001, 'betas': [0.8, 0.99], 'eps': 1e-09, 'batch_size': 12, 'fp16_run': False, 'lr_decay': 0.999875, 'segment_size': 17920, 'init_lr_ratio': 1, 'warmup_epochs': 0, 'c_mel': 45, 'c_kl': 1.0, 'use_sr': True, 'max_speclen': 384, 'port': '8001'}, 'data': {'training_files': 'filelists/train.txt', 'validation_files': 'filelists/val.txt', 'max_wav_value': 32768.0, 'sampling_rate': 48000, 'filter_length': 1280, 'hop_length': 320, 'win_length': 1280, 'n_mel_channels': 80, 'mel_fmin': 0.0, 'mel_fmax': None}, 'model': {'inter_channels': 192, 'hidden_channels': 192, 'filter_channels': 768, 'n_heads': 2, 'n_layers': 6, 'kernel_size': 3, 'p_dropout': 0.1, 'resblock': '1', 'resblock_kernel_sizes': [3, 7, 11], 'resblock_dilation_sizes': [[1, 3, 5], [1, 3, 5], [1, 3, 5]], 'upsample_rates': [10, 8, 2, 2], 'upsample_initial_channel': 512, 'upsample_kernel_sizes': [16, 16, 4, 4], 'n_layers_q': 3, 'use_spectral_norm': False, 'gin_channels': 256, 'ssl_dim': 256, 'n_speakers': 0}, 'spk': {}, 'model_dir': './logs/48k'}

2023-02-18 14:55:41.659435: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA

To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.

DEBUG:tensorflow:Falling back to TensorFlow client; we recommended you install the Cloud TPU client directly with pip install cloud-tpu-client.

2023-02-18 14:55:42.619057: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib64-nvidia

2023-02-18 14:55:42.619210: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib64-nvidia

2023-02-18 14:55:42.619232: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.

DEBUG:h5py._conv:Creating converter from 7 to 5

DEBUG:h5py._conv:Creating converter from 5 to 7

DEBUG:h5py._conv:Creating converter from 7 to 5

DEBUG:h5py._conv:Creating converter from 5 to 7

DEBUG:root:Initializing MLIR with module: _site_initialize_0

DEBUG:root:Registering dialects from initializer <module 'jaxlib.mlir._mlir_libs._site_initialize_0' from '/usr/local/lib/python3.8/dist-packages/jaxlib/mlir/_mlir_libs/_site_initialize_0.so'>

DEBUG:jax._src.path:etils.epath found. Using etils.epath for file I/O.

INFO:numexpr.utils:NumExpr defaulting to 2 threads.

INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0

INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes.

/usr/local/lib/python3.8/dist-packages/torch/utils/data/dataloader.py:554: UserWarning: This DataLoader will create 8 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
  warnings.warn(_create_warning_msg(
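The DataLoader warning above fires because the training script asks for 8 worker processes while the Colab VM only reports 2 CPUs. One common workaround (an assumption on my part, not something stated in this log) is to cap the worker count at the machine's CPU count before constructing the DataLoader:

```python
import os

# train.py requests 8 workers (a value taken from the warning above);
# cap it at whatever the current machine actually provides.
requested_workers = 8
max_workers = os.cpu_count() or 1
num_workers = min(requested_workers, max_workers)
```

The resulting `num_workers` can then be passed to `torch.utils.data.DataLoader`; this silences the warning and avoids worker-related freezes on small VMs.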

Traceback (most recent call last):
  File "train.py", line 281, in <module>
    main()
  File "train.py", line 48, in main
    mp.spawn(run, nprocs=n_gpus, args=(n_gpus, hps,))
  File "/usr/local/lib/python3.8/dist-packages/torch/multiprocessing/spawn.py", line 240, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/usr/local/lib/python3.8/dist-packages/torch/multiprocessing/spawn.py", line 198, in start_processes
    while not context.join():
  File "/usr/local/lib/python3.8/dist-packages/torch/multiprocessing/spawn.py", line 160, in join
    raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:

-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/torch/multiprocessing/spawn.py", line 69, in _wrap
    fn(i, *args)
  File "/content/so-vits-svc/train.py", line 88, in run
    net_g = DDP(net_g, device_ids=[rank])  # , find_unused_parameters=True)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/parallel/distributed.py", line 657, in __init__
    _sync_module_states(
  File "/usr/local/lib/python3.8/dist-packages/torch/distributed/utils.py", line 136, in _sync_module_states
    _sync_params_and_buffers(
  File "/usr/local/lib/python3.8/dist-packages/torch/distributed/utils.py", line 154, in _sync_params_and_buffers
    dist._broadcast_coalesced(
RuntimeError: The size of tensor a (256) must match the size of tensor b (0) at non-singleton dimension 1
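The final RuntimeError lines up with the config dump at the top of the log: `'spk': {}` is empty and `'n_speakers': 0`, so the model's speaker embedding is built with zero entries, and syncing a 256-dim parameter against a size-0 tensor fails in DDP. This usually means the preprocessing step that writes the speaker list into the config was skipped. A minimal sketch of that consistency check, using values copied from the log (the helper name is hypothetical):

```python
# Values copied from the INFO:48k config dump at the top of this log.
cfg = {"model": {"gin_channels": 256, "n_speakers": 0}, "spk": {}}

def speaker_config_ok(cfg):
    """Return False when no speakers are registered in the config,
    i.e. the preprocessing that fills 'spk' and 'n_speakers' never ran."""
    return cfg["model"]["n_speakers"] > 0 and bool(cfg["spk"])

print(speaker_config_ok(cfg))  # prints False for this log's config
```

If this check fails, re-running the repo's preprocessing scripts (the ones that generate `filelists/train.txt`, `filelists/val.txt`, and the config's speaker map) before starting `train.py` is the usual fix.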


