您现在的位置是:首页 >技术交流 >深度学习之神经网络框架搭建及模型优化网站首页技术交流
深度学习之神经网络框架搭建及模型优化
                简介深度学习之神经网络框架搭建及模型优化            
            神经网络框架搭建及模型优化
目录
1 数据及配置
1.1 配置
需要安装PyTorch,下载安装torch、torchvision、torchaudio,GPU需下载cuda版本,CPU可直接下载
cuda版本较大,最后通过控制面板pip install +存储地址离线下载,
CPU版本需再下载安装VC_redist.x64.exe,可下载上述三个后运行,通过报错网址直接下载安装
 
1.2 数据
使用的是 torchvision.datasets.MNIST的手写数据,包括特征数据和结果类别
1.3 函数导入
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor
 
1.4 数据函数
train_data = datasets.MNIST(
    root='data',        # 数据集存储的根目录
    train=True,         # 加载训练集
    download=True,      # 如果数据集不存在,自动下载
    transform=ToTensor() # 将图像转换为张量
)
 
- root 指定数据集存储的根目录。如果数据集不存在,会自动下载到这个目录。
 - train 决定加载训练集还是测试集。True 表示加载训练集,False 表示加载测试集。
 - download 如果数据集不在 root 指定的目录中,是否自动下载数据集。True 表示自动下载。
 - transform 对加载的数据进行预处理或转换。通常用于将数据转换为模型所需的格式,如将图像转换为张量。
 
1.5 数据打包
train_dataloader = DataLoader(train_data, batch_size=64)
- train_data, 打包数据
 - batch_size=64,打包个数
 
代码展示:
import torch
print(torch.__version__)
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor
train_data = datasets.MNIST(
    root = 'data',
    train = True,
    download = True,
    transform = ToTensor()
)
test_data = datasets.MNIST(
    root = 'data',
    train = False,
    download = True,
    transform = ToTensor()
)
print(len(train_data))
print(len(test_data))
from matplotlib import pyplot as plt
figure = plt.figure()
for i in range(9):
    img,label = train_data[i+59000]
    figure.add_subplot(3,3,i+1)
    plt.title(label)
    plt.axis('off')
    plt.imshow(img.squeeze(),cmap='gray')
    a = img.squeeze()
plt.show()
train_dataloader = DataLoader(train_data, batch_size=64)
test_dataloader= DataLoader(test_data, batch_size=64)
 
运行结果:
 

调试查看:

 
2 神经网络框架搭建
2.1 框架确认
在搭建神经网络框架前,需先确认建立怎样的框架,目前并没有理论的指导,凭经验建立框架如下:
输入层:输入的图像数据(28*28)个神经元。
 中间层1:全连接层,128个神经元,
 中间层2:全连接层,256个神经元,
 输出层:全连接层,10个神经元,对应10个类别。
 需注意,中间层需使用激励函数激活,对累加数进行非线性的映射,以及forward前向传播过程的函数名不可更改,
2.2 函数搭建
- nn.Flatten() , 将输入展平为一维向量
 - nn.Linear(28*28, 128) ,全连接层,需注意每个连接层的输入输出需前后对应
 - torch.sigmoid(x),对中间层的输出应用Sigmoid激活函数
 
# 定义一个神经网络类,继承自 nn.Module
class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()  # 调用父类 nn.Module 的构造函数
        # 定义网络层
        self.flatten = nn.Flatten()  # 将输入展平为一维向量,适用于将图像数据(如28x28)展平为784维
        self.hidden1 = nn.Linear(28*28, 128)  # 第一个全连接层,输入维度为784(28*28),输出维度为128
        self.hidden2 = nn.Linear(128, 256)    # 第二个全连接层,输入维度为128,输出维度为256
        self.out = nn.Linear(256, 10)         # 输出层,输入维度为256,输出维度为10(对应10个类别)
    # 定义前向传播过程
    def forward(self, x):
        x = self.flatten(x)       # 将输入数据展平
        x = self.hidden1(x)       # 通过第一个全连接层
        x = torch.sigmoid(x)      # 对第一个全连接层的输出应用Sigmoid激活函数
        x = self.hidden2(x)       # 通过第二个全连接层
        x = torch.sigmoid(x)      # 对第二个全连接层的输出应用Sigmoid激活函数
        x = self.out(x)           # 通过输出层
        return x                  # 返回最终的输出
 
2.3 框架上传
- device = ‘cuda’ if torch.cuda.is_available() else ‘mps’ if torch.backends.mps.is_available() else ‘cpu’,确认设备, 检查是否有可用的GPU设备,如果有则使用GPU,否则使用CPU
 - model = NeuralNetwork().to(device),框架上传到GPU/CPU
 
模型输出展示:

3 模型优化
3.1 函数理解
- optimizer = torch.optim.Adam(model.parameters(), lr=0.001),定义优化器: 
  
- Adam()使用Adam优化算法,也可为SGD等优化算法
 - model.parameters()为优化模型的参数,
 - lr为学习率/梯度下降步长为0.001
 
 - loss_fn = nn.CrossEntropyLoss(pre,y),定义损失函数,使用交叉熵损失函数,适用于分类任务 
  
- pre,预测结果
 - y,真实结果
 - loss_fn.item(),当前损失值
 
 - model.train() ,将模型设置为训练模式,模型参数是可变的
 - x, y = x.to(device), y.to(device),将数据移动到指定设备(GPU或CPU)
 - 反向传播:清零梯度,计算梯度,更新模型参数 
  
- optimizer.zero_grad() ,清零梯度缓存
loss.backward(), 计算梯度
optimizer.step() , 更新模型参数 
 - optimizer.zero_grad() ,清零梯度缓存
 - model.eval(),将模型设置为评估模式,模型参数是不可变
 - with torch.no_grad(),禁用梯度计算,在测试过程中不需要计算梯度
 
3.2 训练模型和测试模型代码
optimizer = torch.optim.Adam(model.parameters(),lr=0.001)
loss_fn = nn.CrossEntropyLoss()
def train(dataloader,model,loss_fn,optimizer):
    model.train()
    batch_size_num = 1
    for x,y in dataloader:
        x,y = x.to(device),y.to(device)
        pred = model.forward(x)
        loss = loss_fn(pred,y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        loss_value = loss.item()
        if batch_size_num %100 ==0:
            print(f'loss: {loss_value:>7f}  [number: {batch_size_num}]')
        batch_size_num +=1
train(train_dataloader,model,loss_fn,optimizer)
def test(dataloader,model,loss_fn):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    model.eval()
    test_loss,correct = 0,0
    with torch.no_grad():
        for x,y in dataloader:
            x,y = x.to(device),y.to(device)
            pred = model.forward(x)
            test_loss += loss_fn(pred,y).item()
            correct +=(pred.argmax(1) == y).type(torch.float).sum().item()
            a = (pred.argmax(1)==y)
            b = (pred.argmax(1)==y).type(torch.float)
    test_loss /=num_batches
    correct /= size
    print(f'test result: 
 Accuracy: {(100*correct)}%, Avg loss:{test_loss}')
 
4 最终代码测试
4.1 SGD优化算法
torch.optim.SGD(model.parameters(),lr=0.01)
代码展示:
import torch
print(torch.__version__)
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor
train_data = datasets.MNIST(
    root = 'data',
    train = True,
    download = True,
    transform = ToTensor()
)
test_data = datasets.MNIST(
    root = 'data',
    train = False,
    download = True,
    transform = ToTensor()
)
print(len(train_data))
print(len(test_data))
from matplotlib import pyplot as plt
figure = plt.figure()
for i in range(9):
    img,label = train_data[i+59000]
    figure.add_subplot(3,3,i+1)
    plt.title(label)
    plt.axis('off')
    plt.imshow(img.squeeze(),cmap='gray')
    a = img.squeeze()
plt.show()
train_dataloader = DataLoader(train_data, batch_size=64)
test_dataloader= DataLoader(test_data, batch_size=64)
device = 'cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu'
print(f'Using {device} device')
class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        self.hidden1 = nn.Linear(28*28,128)
        self.hidden2 = nn.Linear(128, 256)
        self.out = nn.Linear(256,10)
    def forward(self,x):
        x = self.flatten(x)
        x = self.hidden1(x)
        x = torch.sigmoid(x)
        x = self.hidden2(x)
        x = torch.sigmoid(x)
        x = self.out(x)
        return x
model = NeuralNetwork().to(device)
#
print(model)
optimizer = torch.optim.SGD(model.parameters(),lr=0.01)
loss_fn = nn.CrossEntropyLoss()
def train(dataloader,model,loss_fn,optimizer):
    model.train()
    batch_size_num = 1
    for x,y in dataloader:
        x,y = x.to(device),y.to(device)
        pred = model.forward(x)
        loss = loss_fn(pred,y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        loss_value = loss.item()
        if batch_size_num %100 ==0:
            print(f'loss: {loss_value:>7f}  [number: {batch_size_num}]')
        batch_size_num +=1
def test(dataloader,model,loss_fn):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    model.eval()
    test_loss,correct = 0,0
    with torch.no_grad():
        for x,y in dataloader:
            x,y = x.to(device),y.to(device)
            pred = model.forward(x)
            test_loss += loss_fn(pred,y).item()
            correct +=(pred.argmax(1) == y).type(torch.float).sum().item()
            a = (pred.argmax(1)==y)
            b = (pred.argmax(1)==y).type(torch.float)
    test_loss /=num_batches
    correct /= size
    print(f'test result: 
 Accuracy: {(100*correct)}%, Avg loss:{test_loss}')
#
train(train_dataloader,model,loss_fn,optimizer)
test(test_dataloader,model,loss_fn)
 
运行结果:
 
4.2 Adam优化算法
自适应算法,torch.optim.Adam(model.parameters(),lr=0.01)
运行结果:
 
4.3 多次迭代
代码展示:
import torch
print(torch.__version__)
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor
train_data = datasets.MNIST(
    root = 'data',
    train = True,
    download = True,
    transform = ToTensor()
)
test_data = datasets.MNIST(
    root = 'data',
    train = False,
    download = True,
    transform = ToTensor()
)
print(len(train_data))
print(len(test_data))
from matplotlib import pyplot as plt
figure = plt.figure()
for i in range(9):
    img,label = train_data[i+59000]
    figure.add_subplot(3,3,i+1)
    plt.title(label)
    plt.axis('off')
    plt.imshow(img.squeeze(),cmap='gray')
    a = img.squeeze()
plt.show()
train_dataloader = DataLoader(train_data, batch_size=64)
test_dataloader= DataLoader(test_data, batch_size=64)
device = 'cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu'
print(f'Using {device} device')
class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        self.hidden1 = nn.Linear(28*28,128)
        self.hidden2 = nn.Linear(128, 256)
        self.out = nn.Linear(256,10)
    def forward(self,x):
        x = self.flatten(x)
        x = self.hidden1(x)
        x = torch.sigmoid(x)
        x = self.hidden2(x)
        x = torch.sigmoid(x)
        x = self.out(x)
        return x
model = NeuralNetwork().to(device)
#
print(model)
optimizer = torch.optim.Adam(model.parameters(),lr=0.01)
loss_fn = nn.CrossEntropyLoss()
def train(dataloader,model,loss_fn,optimizer):
    model.train()
    batch_size_num = 1
    for x,y in dataloader:
        x,y = x.to(device),y.to(device)
        pred = model.forward(x)
        loss = loss_fn(pred,y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        loss_value = loss.item()
        if batch_size_num %100 ==0:
            print(f'loss: {loss_value:>7f}  [number: {batch_size_num}]')
        batch_size_num +=1
def test(dataloader,model,loss_fn):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    model.eval()
    test_loss,correct = 0,0
    with torch.no_grad():
        for x,y in dataloader:
            x,y = x.to(device),y.to(device)
            pred = model.forward(x)
            test_loss += loss_fn(pred,y).item()
            correct +=(pred.argmax(1) == y).type(torch.float).sum().item()
            a = (pred.argmax(1)==y)
            b = (pred.argmax(1)==y).type(torch.float)
    test_loss /=num_batches
    correct /= size
    print(f'test result: 
 Accuracy: {(100*correct)}%, Avg loss:{test_loss}')
#
train(train_dataloader,model,loss_fn,optimizer)
test(test_dataloader,model,loss_fn)
#
e = 30
for i in range(e):
    print(f'e: {i+1}
------------------')
    train(train_dataloader, model, loss_fn, optimizer)
print('done')
test(test_dataloader, model, loss_fn)
 
运行结果:
 
风语者!平时喜欢研究各种技术,目前在从事后端开发工作,热爱生活、热爱工作。
        
    
        
    
            




U8W/U8W-Mini使用与常见问题解决
QT多线程的5种用法,通过使用线程解决UI主界面的耗时操作代码,防止界面卡死。...
stm32使用HAL库配置串口中断收发数据(保姆级教程)
分享几个国内免费的ChatGPT镜像网址(亲测有效)
Allegro16.6差分等长设置及走线总结