Table of Contents
- 📚 CNNs in Practice: MNIST Handwritten Digit Recognition
- 🧠 4.1 Prerequisites
- ⚙️ 4.1.1 `torch.nn.Conv2d()`: 2D Convolution
- 📏 4.1.2 `nn.MaxPool2d()`: The Role of Pooling Layers
- 📥 4.2 Data Input and Processing
- 🗃️ Loading the MNIST Dataset
- 🔍 Verifying the Data Format
- 🚀 4.3 Building and Training the Convolutional Model
- 🧩 4.3.1 Network Architecture
- ⚡ 4.3.2 GPU Acceleration and Model Initialization
- 📉 4.3.3 Training and Evaluation Functions
- 🔁 4.3.4 The Training Loop
- 🧪 4.4 The Functional API
- 🔌 4.4.1 Importing the Functional Module
- ⚡ 4.4.2 Applying Activation Functions
- 🧮 4.4.3 Pooling Operations
📚 CNNs in Practice: MNIST Handwritten Digit Recognition
🧠 4.1 Prerequisites
⚙️ 4.1.1 torch.nn.Conv2d(): 2D Convolution
torch.nn.Conv2d() is PyTorch's core class for 2D convolution. Its key parameters are:
- in_channels: number of input channels (3 for RGB images, 1 for grayscale)
- out_channels: number of output channels (i.e., the number of convolution kernels)
- kernel_size: kernel size (e.g., 3×3)
- stride: stride (default 1)
- padding: zero-padding (default 0)
```python
import torch
from torch import nn

# Random input batch (batch_size=20, channels=3, height=256, width=256)
input = torch.randn(20, 3, 256, 256)

# Convolution layer: 3 input channels → 16 output channels, 3×3 kernel, stride 1, padding 1
conv_layer = nn.Conv2d(3, 16, (3, 3), stride=1, padding=1)

# Apply the convolution
output = conv_layer(input)
output.shape  # torch.Size([20, 16, 256, 256])
```
💡 Output analysis: after the convolution the feature map stays 256×256 (because padding=1), while the channel count grows from 3 to 16.
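The shape arithmetic can be checked by hand: per spatial dimension, output size = ⌊(size + 2·padding − kernel) / stride⌋ + 1. A small sketch, in pure Python and independent of PyTorch:

```python
def conv2d_out_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a 2D convolution, per dimension."""
    return (size + 2 * padding - kernel) // stride + 1

# A 3×3 kernel with stride 1 and padding 1 preserves the 256×256 size
print(conv2d_out_size(256, kernel=3, stride=1, padding=1))  # 256
```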
📏 4.1.2 nn.MaxPool2d(): The Role of Pooling Layers
Why pooling matters:
- 🎯 Larger receptive field: a small kernel sees only a small area; pooling indirectly widens the region each unit covers
- 🛡️ Less overfitting: shrinking the feature maps reduces the size of downstream layers, improving generalization
- ⚡ Faster computation: smaller feature maps cut the cost of all later layers
Key parameter: kernel_size (the pooling window size)
```python
# Random image batch (64 RGB images, 256×256)
img_batch = torch.randn(64, 3, 256, 256)

# 2×2 max pooling
pool_out = torch.max_pool2d(img_batch, kernel_size=(2, 2))
pool_out.shape  # torch.Size([64, 3, 128, 128])
```
💡 Output analysis: pooling halves the spatial size (256 → 128) while leaving the channel count unchanged, reducing the feature dimensionality.
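The same kind of size check works for pooling. Note that in PyTorch the pooling stride defaults to the kernel size, which is exactly what makes a 2×2 window halve each dimension; a minimal sketch:

```python
def pool2d_out_size(size, kernel, stride=None):
    """Spatial output size of max pooling; stride defaults to kernel (PyTorch behavior)."""
    stride = stride if stride is not None else kernel
    return (size - kernel) // stride + 1

print(pool2d_out_size(256, 2))  # 128
```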
📥 4.2 Data Input and Processing
🗃️ Loading the MNIST Dataset
Load the handwritten digit dataset with PyTorch's built-in tools:
```python
import torchvision
from torchvision.transforms import ToTensor

# Download and load the training/test sets
train_ds = torchvision.datasets.MNIST("data/", train=True, transform=ToTensor(), download=True)
test_ds = torchvision.datasets.MNIST("data/", train=False, transform=ToTensor(), download=True)

# Create data loaders (batch_size=64)
train_dl = torch.utils.data.DataLoader(train_ds, batch_size=64, shuffle=True)
test_dl = torch.utils.data.DataLoader(test_ds, batch_size=64)
```
🔍 Verifying the Data Format
```python
imgs, labels = next(iter(train_dl))
print(imgs.shape, labels.shape)  # torch.Size([64, 1, 28, 28]) torch.Size([64])
```
✅ Data format: matches the convolutional network's expected input (batch_size, channels, height, width)
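Since MNIST ships with 60,000 training images and 10,000 test images, the number of batches per epoch can be sanity-checked without touching the loaders (a DataLoader's length is the ceiling of dataset size over batch size):

```python
import math

train_batches = math.ceil(60000 / 64)  # len(train_dl)
test_batches = math.ceil(10000 / 64)   # len(test_dl)
print(train_batches, test_batches)  # 938 157
```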
🚀 4.3 Building and Training the Convolutional Model
🧩 4.3.1 Network Architecture
A LeNet-style CNN model:
```python
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        # Conv layer 1: 1 → 6 channels, 5×5 kernel
        self.conv1 = nn.Conv2d(1, 6, 5)
        # Conv layer 2: 6 → 16 channels, 5×5 kernel
        self.conv2 = nn.Conv2d(6, 16, 5)
        # Fully connected layer 1: 256 → 256 units
        self.linear1 = nn.Linear(16 * 4 * 4, 256)
        # Output layer: 256 → 10 units (10 digit classes)
        self.linear2 = nn.Linear(256, 10)

    def forward(self, x):
        # Conv → ReLU → pool (28×28 → 24×24 → 12×12)
        x = torch.max_pool2d(torch.relu(self.conv1(x)), (2, 2))
        # Conv → ReLU → pool (12×12 → 8×8 → 4×4)
        x = torch.max_pool2d(torch.relu(self.conv2(x)), (2, 2))
        # Flatten the feature maps
        x = x.view(-1, 16 * 4 * 4)
        # Fully connected layer → ReLU
        x = torch.relu(self.linear1(x))
        # Output layer (raw logits)
        return self.linear2(x)
```
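The in_features=16*4*4 of linear1 falls straight out of tracing the spatial sizes through forward (5×5 convolutions without padding, each followed by 2×2 pooling); a quick check:

```python
def conv_out(s, k):    # stride-1 convolution, no padding
    return s - k + 1

def pool_out(s, k=2):  # 2×2 max pooling, stride 2
    return s // k

s = pool_out(conv_out(28, 5))  # conv1 + pool: 28 -> 24 -> 12
s = pool_out(conv_out(s, 5))   # conv2 + pool: 12 -> 8 -> 4
print(s, 16 * s * s)  # 4 256
```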
⚡ 4.3.2 GPU Acceleration and Model Initialization
```python
# Use the GPU automatically when available
device = "cuda" if torch.cuda.is_available() else "cpu"
model = Model().to(device)
model
```
```
Model(
  (conv1): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))
  (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
  (linear1): Linear(in_features=256, out_features=256, bias=True)
  (linear2): Linear(in_features=256, out_features=10, bias=True)
)
```
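A back-of-the-envelope parameter count for the layers in the repr above (weights plus biases, with layer shapes taken from the printed output):

```python
conv1 = 6 * 1 * 5 * 5 + 6    # out_ch * in_ch * kH * kW + bias
conv2 = 16 * 6 * 5 * 5 + 16
linear1 = 256 * 256 + 256    # in_features * out_features + bias
linear2 = 256 * 10 + 10
total = conv1 + conv2 + linear1 + linear2
print(total)  # 70934
```

Under 71k parameters in total, which is tiny by modern standards and one reason this model trains quickly even on CPU.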
📉 4.3.3 Training and Evaluation Functions
```python
# Training function
def train(dataloader, model, loss_fn, optimizer):
    model.train()
    total_samples = len(dataloader.dataset)
    total_batches = len(dataloader)
    train_loss, correct = 0, 0
    for X, y in dataloader:
        X, y = X.to(device), y.to(device)
        # Forward pass
        pred = model(X)
        loss = loss_fn(pred, y)
        # Backward pass
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        # Accumulate metrics
        with torch.no_grad():
            correct += (pred.argmax(1) == y).sum().item()
            train_loss += loss.item()
    return train_loss / total_batches, correct / total_samples

# Evaluation function (uses the global loss_fn defined in 4.3.4)
def test(dataloader, model):
    model.eval()
    total_samples = len(dataloader.dataset)
    total_batches = len(dataloader)
    test_loss, correct = 0, 0
    with torch.no_grad():
        for X, y in dataloader:
            X, y = X.to(device), y.to(device)
            pred = model(X)
            test_loss += loss_fn(pred, y).item()
            correct += (pred.argmax(1) == y).sum().item()
    return test_loss / total_batches, correct / total_samples
```
🔁 4.3.4 The Training Loop
```python
# Hyperparameters
optimizer = torch.optim.Adam(model.parameters(), lr=0.005)
loss_fn = nn.CrossEntropyLoss()
epochs = 20

# Training loop with logging
for epoch in range(epochs):
    train_loss, train_acc = train(train_dl, model, loss_fn, optimizer)
    test_loss, test_acc = test(test_dl, model)
    # Print training progress
    print(f"epoch:{epoch:2d}, train_loss:{train_loss:.5f}, "
          f"train_acc:{train_acc*100:.1f}%, test_loss:{test_loss:.5f}, "
          f"test_acc:{test_acc*100:.1f}%")
```
Training output:
epoch: 0, train_loss:0.24543, train_acc:92.8%, test_loss:0.07341, test_acc:97.7%
epoch: 1, train_loss:0.06720, train_acc:97.9%, test_loss:0.04788, test_acc:98.4%
...
epoch:19, train_loss:0.00509, train_acc:99.8%, test_loss:0.04585, test_acc:99.2%
Done
🎯 Performance summary: the model reaches **99.2%** test accuracy within 20 epochs, clearly outperforming a plain fully connected network.
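The loss values in the log come from nn.CrossEntropyLoss, which combines log-softmax with negative log-likelihood on the raw logits. A minimal per-sample sketch in pure Python, stabilized the same way real implementations are:

```python
import math

def cross_entropy(logits, target):
    """Per-sample cross-entropy on raw logits: log-sum-exp minus the target logit."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

print(cross_entropy([2.0, 1.0, 0.1], 0))  # ≈ 0.417
```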
🧪 4.4 The Functional API
🔌 4.4.1 Importing the Functional Module
```python
import torch.nn.functional as F  # the conventional import alias
```
⚡ 4.4.2 Applying Activation Functions
```python
# torch namespace style
output = torch.relu(input)

# Functional API style
output = F.relu(input)
```
🧮 4.4.3 Pooling Operations
```python
# torch namespace style
pooled = torch.max_pool2d(input, kernel_size=2)

# Functional API style
pooled = F.max_pool2d(input, kernel_size=2)
```
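To make the pooling semantics concrete, here is a pure-Python reference for what a stride-k max pool computes over a 2D grid (an illustration only, not the library implementation):

```python
def max_pool2d_ref(x, k):
    """Max over each non-overlapping k×k window of a 2D list (stride = k)."""
    h, w = len(x), len(x[0])
    return [[max(x[i + di][j + dj] for di in range(k) for dj in range(k))
             for j in range(0, w - k + 1, k)]
            for i in range(0, h - k + 1, k)]

x = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
print(max_pool2d_ref(x, 2))  # [[6, 8], [14, 16]]
```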