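PyTorch: Saving and Restoring Models and Parameters (Python)

Date: 2023-07-12 19:29:50

After training, a model needs to be saved to a file, either for testing and deployment or so that training can later resume from its previous state.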

1. Best Practices

There are two main ways to serialize (save) and restore (load) a model.

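1.1 Method M1 (recommended)

Save and restore only the model parameters (the state_dict):

import torch

# Save
torch.save(the_model.state_dict(), PATH)

# Restore
the_model = TheModelClass(*args, **kwargs)
the_model.load_state_dict(torch.load(PATH))

# With this method, the network architecture (the model class definition)
# must be available separately in your own code when loading.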
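To make the round trip concrete, here is a minimal sketch of method M1. `TinyNet` is a hypothetical stand-in for `TheModelClass`, and the file name and temporary directory are assumptions chosen only so the snippet can run on its own.

```python
import os
import tempfile

import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for TheModelClass."""

    def __init__(self, in_features=4, out_features=2):
        super().__init__()
        self.fc = nn.Linear(in_features, out_features)

    def forward(self, x):
        return self.fc(x)


the_model = TinyNet()

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "tiny_net_state.pth")  # hypothetical file name

    # Save only the parameters (an ordered dict of name -> tensor).
    torch.save(the_model.state_dict(), path)

    # Restore: rebuild the architecture first, then load the parameters into it.
    restored = TinyNet()
    restored.load_state_dict(torch.load(path, map_location="cpu"))

# Both models should now hold identical parameters.
params_match = all(
    torch.equal(tensor, restored.state_dict()[name])
    for name, tensor in the_model.state_dict().items()
)
print(params_match)
```

Passing `map_location="cpu"` is optional here; it is included because it lets a file saved on a GPU machine be loaded on a machine without one.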

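1.2 Method M2

Save the parameters together with the network structure:

import torch

# Save
torch.save(the_model, PATH)

# Restore
# On PyTorch 2.6 and later, torch.load defaults to weights_only=True, so a whole
# pickled model needs torch.load(PATH, weights_only=False); only do that for files you trust.
the_model = torch.load(PATH)

# Data saved this way is bound to the specific classes and the exact directory
# structure used at save time, so it can break in various ways when loaded
# in another project or after heavy refactoring.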
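A small sketch of method M2 follows. It assumes a PyTorch version whose `torch.load` accepts the `weights_only` argument, and it uses only stock `torch.nn` layers so that the pickled class paths resolve wherever PyTorch is installed. The file name is hypothetical.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Stock layers only: a custom class would need its defining module
# to be importable under the same path at load time.
the_model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "whole_model.pth")  # hypothetical file name
    torch.save(the_model, path)  # pickles the module object, not just its tensors

    # weights_only=False allows arbitrary pickled objects.
    # Assumption: the file comes from a trusted source.
    loaded = torch.load(path, map_location="cpu", weights_only=False)

x = torch.randn(3, 4)
same_output = torch.equal(the_model(x), loaded(x))
print(type(loaded).__name__, same_output)
```

Because unpickling can execute arbitrary code, this route is best kept for files you produced yourself.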

2. Stack Overflow Answer

Choose the save and restore method according to the use case.

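Case C1: saving a model for your own inference

You save the model yourself and restore it yourself, then switch it to evaluation mode.

This matters because networks often contain BatchNorm and Dropout layers, which are in training mode by default and behave differently in training and in evaluation.

# Save the model
torch.save(model.state_dict(), filepath)

# Restore the model
model.load_state_dict(torch.load(filepath))
model.eval()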
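The effect of `model.eval()` can be checked directly. The sketch below uses a small hypothetical network with a Dropout layer and shows that a new module starts in training mode, and that repeated forward passes agree once evaluation mode is on.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy network containing a Dropout layer.
model = nn.Sequential(nn.Linear(8, 8), nn.Dropout(p=0.5), nn.Linear(8, 1))
x = torch.randn(4, 8)

# A freshly constructed module starts in training mode.
starts_in_train_mode = model.training

# Evaluation mode: Dropout becomes a no-op, so repeated passes agree exactly.
model.eval()
with torch.no_grad():
    eval_out_1 = model(x)
    eval_out_2 = model(x)
eval_is_deterministic = torch.equal(eval_out_1, eval_out_2)

print(starts_in_train_mode, model.training, eval_is_deterministic)
# prints: True False True
```

In training mode the Dropout layer would zero a random subset of activations on every call, so the two outputs would generally differ.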

Case C2: saving a model to resume training later

To keep the training state, save not only the model but also the optimizer state, the epoch count, the score, and so on.

# Save the model
state = {
    'epoch': epoch,
    'state_dict': model.state_dict(),
    'optimizer': optimizer.state_dict(),
    ...
}
torch.save(state, filepath)

# Load the model and resume training
state = torch.load(filepath)
model.load_state_dict(state['state_dict'])
optimizer.load_state_dict(state['optimizer'])

# Training continues from here, so do not call model.eval().
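The reason for saving the optimizer is that it carries state of its own. As a sketch, assuming SGD with momentum on a hypothetical one-layer model, the snippet below takes one training step, saves a checkpoint, and confirms that the momentum buffers survive the round trip.

```python
import os
import tempfile

import torch
import torch.nn as nn

torch.manual_seed(0)

model = nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

# One training step so that the optimizer accumulates momentum buffers.
loss = model(torch.randn(16, 4)).pow(2).mean()
loss.backward()
optimizer.step()
optimizer.zero_grad()

with tempfile.TemporaryDirectory() as tmp_dir:
    filepath = os.path.join(tmp_dir, "checkpoint.pth")  # hypothetical file name
    state = {
        "epoch": 1,
        "state_dict": model.state_dict(),
        "optimizer": optimizer.state_dict(),
    }
    torch.save(state, filepath)

    # Resume: build fresh objects, then load the saved state into them.
    resumed_model = nn.Linear(4, 1)
    resumed_optimizer = torch.optim.SGD(resumed_model.parameters(), lr=0.1, momentum=0.9)
    checkpoint = torch.load(filepath, map_location="cpu")
    resumed_model.load_state_dict(checkpoint["state_dict"])
    resumed_optimizer.load_state_dict(checkpoint["optimizer"])

# The fresh optimizer has momentum buffers only because its state was restored.
saved_buffers = [s["momentum_buffer"] for s in optimizer.state_dict()["state"].values()]
resumed_buffers = [s["momentum_buffer"] for s in resumed_optimizer.state_dict()["state"].values()]
buffers_match = len(saved_buffers) == len(resumed_buffers) and all(
    torch.equal(a, b) for a, b in zip(saved_buffers, resumed_buffers)
)
print(checkpoint["epoch"], buffers_match)
```

Without the `optimizer` entry, the resumed run would restart with empty momentum buffers, which can change the first steps after resuming.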

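Case C3: saving a model to share with others

In TensorFlow you can create a .pb file that defines both the network structure and the model weights. This is very convenient, especially with TensorFlow Serving.

The rough equivalent in PyTorch is:

# Save the model
torch.save(model, filepath)

# Load the model
model = torch.load(filepath)

This approach is still not robust, because PyTorch keeps changing between versions, so it is not recommended.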

3. Example

import os
import torch

state = {
    'epoch': epoch,
    'state_dict': model.state_dict(),
    'optimizer': optimizer.state_dict(),
    'best_score': best_score,
    ...
}
torch.save(state, '/path/to/checkpoint.pth')

if resume:
    if os.path.isfile(resume_file):
        print("=> loading checkpoint '{}'".format(resume_file))
        checkpoint = torch.load(resume_file)
        start_epoch = checkpoint['epoch']
        best_score = checkpoint['best_score']
        model.load_state_dict(checkpoint['state_dict'])
        # The optimizer state was saved too, so restore it as in case C2.
        optimizer.load_state_dict(checkpoint['optimizer'])
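The fragment above relies on variables defined elsewhere (`model`, `optimizer`, `epoch`, `resume_file`, and so on). One way to make it self-contained is to wrap the two halves in helpers. `save_checkpoint` and `load_checkpoint` are hypothetical names introduced here for illustration, and the epoch and score values are made up.

```python
import os
import tempfile

import torch
import torch.nn as nn


def save_checkpoint(path, model, optimizer, epoch, best_score):
    """Hypothetical helper: bundle the training state into one file."""
    torch.save(
        {
            "epoch": epoch,
            "state_dict": model.state_dict(),
            "optimizer": optimizer.state_dict(),
            "best_score": best_score,
        },
        path,
    )


def load_checkpoint(path, model, optimizer):
    """Hypothetical helper: restore state in place, return (start_epoch, best_score)."""
    if not os.path.isfile(path):
        return 0, float("-inf")  # nothing to resume from
    checkpoint = torch.load(path, map_location="cpu")
    model.load_state_dict(checkpoint["state_dict"])
    optimizer.load_state_dict(checkpoint["optimizer"])
    return checkpoint["epoch"], checkpoint["best_score"]


with tempfile.TemporaryDirectory() as tmp_dir:
    resume_file = os.path.join(tmp_dir, "checkpoint.pth")

    model = nn.Linear(2, 1)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

    # First run: no checkpoint exists yet, so training would start at epoch 0.
    first_start, _ = load_checkpoint(resume_file, model, optimizer)
    save_checkpoint(resume_file, model, optimizer, epoch=3, best_score=0.75)

    # Second run: fresh objects pick up the saved state.
    model2 = nn.Linear(2, 1)
    optimizer2 = torch.optim.SGD(model2.parameters(), lr=0.01)
    start_epoch, best_score = load_checkpoint(resume_file, model2, optimizer2)

print(first_start, start_epoch, best_score)
# prints: 0 3 0.75
```

Returning a default start epoch when the file is missing keeps the calling training loop the same for fresh and resumed runs.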

Inspecting the parameters of the network layers:

import torch.nn as nn
from collections import OrderedDict

# Network structure
model = nn.Sequential(OrderedDict([
    ('conv1', nn.Conv2d(1, 32, 5)),
    ('relu1', nn.ReLU()),
    ('conv2', nn.Conv2d(32, 64, 5)),
    ('relu2', nn.ReLU())
]))
print(model)

# Inspect the network parameters
params = model.state_dict()
for k, v in params.items():
    print(k)  # parameter name

print(params['conv1.weight'])  # weights of layer conv1
print(params['conv1.bias'])  # bias of layer conv1
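Printing full tensors gets noisy quickly, so a common alternative is to list only the name and shape of each entry. A minimal sketch on the same network:

```python
from collections import OrderedDict

import torch.nn as nn

model = nn.Sequential(OrderedDict([
    ("conv1", nn.Conv2d(1, 32, 5)),
    ("relu1", nn.ReLU()),
    ("conv2", nn.Conv2d(32, 64, 5)),
    ("relu2", nn.ReLU()),
]))

# Map each state_dict entry to its shape; ReLU layers contribute no entries.
shapes = {name: tuple(tensor.shape) for name, tensor in model.state_dict().items()}
for name, shape in shapes.items():
    print(f"{name:<14}{shape}")

# Total number of scalar values stored in the state_dict.
total_values = sum(tensor.numel() for tensor in model.state_dict().values())
print("total:", total_values)
# prints: total: 52096
```

The keys follow the pattern `<layer name>.<parameter name>`, which is why naming layers through `OrderedDict` makes the saved file easier to read.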
