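While running YOLOv9, resuming training from a checkpoint failed with “ValueError: loaded state dict contains a parameter group that doesn't match the size of optimizer's”. I looked into many possible causes, and they all pointed to the SGD optimizer, so the temporary workaround is: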

When resuming training, comment out the line that reloads the optimizer state:

In train_dual, find the following lines:

    # Resume
    best_fitness, start_epoch = 0.0, 0
    if pretrained:
        if resume:
            best_fitness, start_epoch, epochs = smart_resume(ckpt, optimizer, ema, weights, epochs, resume)
        del ckpt, csd

Then go into the smart_resume function:

    if ckpt['optimizer'] is not None:
        # optimizer.load_state_dict(ckpt['optimizer'])  # optimizer  
        best_fitness = ckpt['best_fitness']
    if ema and ckpt.get('ema'):
        # ema.ema.load_state_dict(ckpt['ema'].float().state_dict())  # EMA
        ema.updates = ckpt['updates']

Comment out both optimizer.load_state_dict(ckpt['optimizer']) and ema.ema.load_state_dict(ckpt['ema'].float().state_dict()), as shown above.
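As a gentler alternative to commenting the lines out, the load can be attempted and skipped only when it fails. Below is a minimal sketch of that idea. The helper name try_load_state is hypothetical and is not part of YOLOv9. It assumes the mismatch surfaces as a ValueError (as in the error above) or a RuntimeError. Whenever the load is skipped, the optimizer restarts without its saved state (for SGD, the momentum buffers), just as it does with the lines commented out.

```python
def try_load_state(target, state, name="optimizer"):
    """Call target.load_state_dict(state) and report whether it succeeded.

    Hypothetical helper (not part of YOLOv9). `target` is any object with a
    load_state_dict method, such as an optimizer or the EMA model.
    Returns True if the state was loaded, False if it was missing or rejected.
    """
    if state is None:
        return False
    try:
        target.load_state_dict(state)
        return True
    except (ValueError, RuntimeError) as err:
        # Mismatched checkpoint: keep the freshly built state instead of crashing.
        print(f"Skipping {name} state from checkpoint: {err}")
        return False


# Possible use inside smart_resume (an assumption, adapt to your copy of the code):
#     try_load_state(optimizer, ckpt['optimizer'])
#     try_load_state(ema.ema, ckpt['ema'].float().state_dict(), name="EMA")
```

With this approach a checkpoint that does match still restores its optimizer and EMA state, and only a mismatched one falls back to a fresh start.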

Training then starts successfully.
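To look into the root cause rather than only work around it, it can help to compare the parameter groups saved in the checkpoint with those of the newly built optimizer. The sketch below assumes both state dicts follow the usual PyTorch layout, where 'param_groups' is a list of dicts that each hold a 'params' list. The function names are hypothetical.

```python
def group_sizes(state_dict):
    """Return the number of parameters in each group of an optimizer state dict."""
    return [len(group["params"]) for group in state_dict["param_groups"]]


def describe_mismatch(saved, current):
    """List human-readable differences between two optimizer state dicts.

    `saved` is the optimizer state from the checkpoint and `current` is the
    state dict of the freshly created optimizer. An empty list means the
    group layout matches.
    """
    saved_sizes = group_sizes(saved)
    current_sizes = group_sizes(current)
    if len(saved_sizes) != len(current_sizes):
        return [
            f"group count differs: checkpoint has {len(saved_sizes)}, "
            f"optimizer has {len(current_sizes)}"
        ]
    return [
        f"group {i}: checkpoint has {s} params, optimizer has {c}"
        for i, (s, c) in enumerate(zip(saved_sizes, current_sizes))
        if s != c
    ]


# Possible use before the load in smart_resume (an assumption):
#     print(describe_mismatch(ckpt['optimizer'], optimizer.state_dict()))
```

If a group reports different sizes, the model or the way the optimizer groups its parameters has probably changed since the checkpoint was written.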
