[Torch Uncertainty] Tutorial: Train a Bayesian Neural Network in Three Minutes

This post is a translation of the `Tutorials` section of the official `Torch Uncertainty` documentation.

Train a Bayesian Neural Network in Three Minutes

In this tutorial, we train a variational inference Bayesian Neural Network (BNN) LeNet classifier on the MNIST dataset.

Foreword on Bayesian Neural Networks

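Bayesian Neural Networks (BNNs) are a class of neural networks that estimate the uncertainty of their predictions through the uncertainty of their weights. They do so by treating the weights as random variables and learning a posterior distribution over them. This contrasts with standard neural networks, which learn a single set of weights; such weights can be seen as following a Dirac distribution.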
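To make the contrast in the paragraph above concrete, here is a minimal pure-Python sketch of a single weight treated as a random variable. It assumes a Gaussian posterior whose standard deviation is parameterized as `softplus(rho)`, as in the Bayes by Backprop paper cited in this post; the helper names `softplus` and `sample_weight` are hypothetical and this is not necessarily how TorchUncertainty's layers are implemented.

```python
import math
import random


def softplus(x: float) -> float:
    """Map an unconstrained parameter to a positive standard deviation."""
    return math.log1p(math.exp(x))


def sample_weight(mu: float, rho: float, rng: random.Random) -> float:
    """Draw one weight from N(mu, softplus(rho)^2) via w = mu + sigma * eps."""
    sigma = softplus(rho)
    eps = rng.gauss(0.0, 1.0)
    return mu + sigma * eps


rng = random.Random(0)

# A Bayesian weight: every forward pass may see a different value.
bayesian_samples = [sample_weight(mu=0.5, rho=-3.0, rng=rng) for _ in range(5)]

# A standard (deterministic) weight behaves like the limit sigma -> 0,
# i.e. a Dirac distribution centred on mu.
dirac_samples = [0.5 + 0.0 * rng.gauss(0.0, 1.0) for _ in range(5)]

print(bayesian_samples)
print(dirac_samples)
```

Training a BNN then amounts to learning `mu` and `rho` for every weight, rather than a single value per weight.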

For more information on Bayesian Neural Networks, see the following references:

Weight Uncertainty in Neural Networks, ICML 2015
Hands-on Bayesian Neural Networks - a Tutorial for Deep Learning Users, IEEE Computational Intelligence Magazine

Training a Bayesian LeNet using TorchUncertainty models and Lightning

We will now train a Bayesian LeNet using the models and routines already implemented in TorchUncertainty.

1. Loading the utilities

To train a BNN, we need to load the following modules:

Lightning's Trainer
the model `bayesian_lenet`, from `torch_uncertainty.models`
the classification training routine, from `torch_uncertainty.routines`
the Bayesian objective `ELBOLoss`, from `torch_uncertainty.losses`
the datamodule that handles the dataloaders: `MNISTDataModule`, from `torch_uncertainty.datamodules`

We also import `optim` from `torch` to define the optimizer and `nn` for the neural network utilities.

from pathlib import Path

from lightning.pytorch import Trainer
from torch import nn, optim

from torch_uncertainty.datamodules import MNISTDataModule
from torch_uncertainty.losses import ELBOLoss
from torch_uncertainty.models.lenet import bayesian_lenet
from torch_uncertainty.routines import ClassificationRoutine

2. The Optimization Recipe

We use the Adam optimizer with its default learning rate of 0.001.

def optim_lenet(model: nn.Module):
    optimizer = optim.Adam(
        model.parameters(),
        lr=1e-3,
    )
    return optimizer
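For intuition about what this learning rate means, the following is a rough scalar sketch of one Adam update, written in plain Python. The function `adam_step` is a hypothetical helper for illustration only; the default values for `beta1`, `beta2`, and `eps` mirror the defaults of `torch.optim.Adam`.

```python
import math


def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update to a scalar parameter and return the new state."""
    m = beta1 * m + (1 - beta1) * grad         # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad * grad  # second-moment estimate
    m_hat = m / (1 - beta1**t)                 # bias correction
    v_hat = v / (1 - beta2**t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


theta, m, v = 1.0, 0.0, 0.0
theta, m, v = adam_step(theta, grad=4.0, m=m, v=v, t=1)
print(theta)  # the first step has a size of about lr, whatever the gradient scale
```

Because the update is normalized by the running gradient magnitude, the learning rate roughly sets the per-step change of each parameter, which is one reason the default of 0.001 tends to work without tuning on small problems like this one.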

3. Creating the necessary variables

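In the following, we define the Lightning trainer and the root path for the datasets and logs. We also create the datamodule that handles the MNIST dataset, its dataloaders, and its transforms. The datamodule can also handle OOD detection when the `eval_ood` parameter is set to `True`. Finally, we create the model using the blueprint from `torch_uncertainty.models`.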

trainer = Trainer(accelerator="cpu", enable_progress_bar=False, max_epochs=1)

# datamodule
root = Path("data")
datamodule = MNISTDataModule(root=root, batch_size=128, eval_ood=False)

# model
model = bayesian_lenet(datamodule.num_channels, datamodule.num_classes)

4. The Loss and the Training Routine

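Next, we define the loss used during training. We instantiate `ELBOLoss` with explicit arguments instead of relying on its defaults; the hyperparameters are those proposed in the blitz library. Since we train a classification model, we use `CrossEntropyLoss` as the likelihood. We then define the training routine using the classification training routine from `torch_uncertainty.routines`, providing it with the model, the ELBO loss, and the optimizer.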

loss = ELBOLoss(
    model=model,
    inner_loss=nn.CrossEntropyLoss(),
    kl_weight=1 / 50000,
    num_samples=3,
)

routine = ClassificationRoutine(
    model=model,
    num_classes=datamodule.num_classes,
    loss=loss,
    optim_recipe=optim_lenet(model),
    is_ensemble=True
)
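The arguments `inner_loss`, `kl_weight`, and `num_samples` map onto the two terms of an ELBO-style objective: a data term averaged over several sampled forward passes, plus a weighted KL divergence between the weight posterior and the prior. The sketch below illustrates that structure in plain Python. It assumes a standard normal prior and a closed-form Gaussian KL, and all function names are hypothetical; the actual `ELBOLoss` may compute the KL term differently.

```python
import math


def cross_entropy(probs, label):
    """Negative log-likelihood of the true label under predicted probabilities."""
    return -math.log(probs[label])


def kl_gaussian_to_std_normal(mu, sigma):
    """Closed-form KL( N(mu, sigma^2) || N(0, 1) ) for one scalar weight."""
    return math.log(1.0 / sigma) + (sigma**2 + mu**2) / 2.0 - 0.5


def elbo_loss(sampled_probs, label, weights, kl_weight):
    """Average the data term over posterior samples and add the weighted KL."""
    nll = sum(cross_entropy(p, label) for p in sampled_probs) / len(sampled_probs)
    kl = sum(kl_gaussian_to_std_normal(mu, sigma) for mu, sigma in weights)
    return nll + kl_weight * kl


# Three forward passes (num_samples=3), each with freshly sampled weights.
sampled_probs = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1], [0.8, 0.1, 0.1]]
# Hypothetical (mu, sigma) pairs for two weights of a toy model.
weights = [(0.5, 0.1), (-0.2, 0.05)]

loss = elbo_loss(sampled_probs, label=0, weights=weights, kl_weight=1 / 50000)
print(loss)
```

With `kl_weight=1 / 50000` the KL term is scaled down heavily, so the data term dominates each update while the KL still regularizes the posterior toward the prior.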

5. Gathering Everything and Training the Model

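Now that everything is ready, we gather all the pieces and train the model with the Lightning trainer. The trainer needs the routine, which holds the model together with the training and evaluation logic, and the datamodule. The dataset is downloaded automatically into the `root/data` folder, and the logs are saved in the `root/logs` folder.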

trainer.fit(model=routine, datamodule=datamodule)
trainer.test(model=routine, datamodule=datamodule)

6. Testing the Model

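Now that the model is trained, let's test it on MNIST. We reshape the logits to separate the dimension that corresponds to the ensemble from the one that corresponds to the batch. In TorchUncertainty 0.2.0, the ensemble dimension is merged with the batch dimension in the order `(num_estimator x batch, classes)`.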

import matplotlib.pyplot as plt
import numpy as np
import torch
import torchvision


def imshow(img):
    npimg = img.numpy()
    plt.imshow(np.transpose(npimg, (1, 2, 0)))
    plt.axis("off")
    plt.tight_layout()
    plt.show()


dataiter = iter(datamodule.val_dataloader())
images, labels = next(dataiter)

# print images
imshow(torchvision.utils.make_grid(images[:4, ...]))
print("Ground truth: ", " ".join(f"{labels[j]}" for j in range(4)))

# Put the model in eval mode to use several posterior samples
model = model.eval()
# Un-merge the first dimension: (num_estimators, batch_size, num_classes)
logits = model(images).reshape(16, 128, 10)

# Apply the softmax over the classes, then aggregate over the estimators
probs = torch.nn.functional.softmax(logits, dim=-1)
avg_probs = probs.mean(dim=0)
std_probs = probs.std(dim=0)

_, predicted = torch.max(avg_probs, 1)

print("Predicted digits: ", " ".join(f"{predicted[j]}" for j in range(4)))
print(
    "Std. dev. of the scores over the posterior samples",
    " ".join(f"{std_probs[j][predicted[j]]:.3}" for j in range(4)),
)

Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Ground truth:  7 2 1 0
Predicted digits:  7 2 1 0
Std. dev. of the scores over the posterior samples 0.000171 0.000749 0.000214 0.00577

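Here, we display the standard deviation of the top prediction's score across the posterior samples. This measure is non-standard, but it is an intuitive way to show the diversity of the ensemble's predictions. Ideally, the spread should be high when the averaged top prediction is wrong.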
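The reshape-then-aggregate logic of the test code can be traced on a toy example in plain Python. The sketch below uses made-up logits for 3 estimators, 2 inputs, and 2 classes, laid out in the merged `(num_estimator x batch, classes)` order. It uses `statistics.stdev`, which applies the same n-1 correction as the default of `torch.std`.

```python
import math
import statistics


def softmax(logits):
    """Numerically stable softmax over one list of class logits."""
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


num_estimators, batch_size, num_classes = 3, 2, 2

# Merged layout: rows 0-1 come from estimator 0, rows 2-3 from estimator 1, ...
merged_logits = [
    [2.0, 0.0], [0.0, 1.0],  # estimator 0, inputs 0 and 1
    [1.5, 0.0], [0.0, 3.0],  # estimator 1
    [2.5, 0.0], [1.0, 0.0],  # estimator 2 (disagrees on input 1)
]

# Un-merge to (num_estimators, batch, classes), as the reshape call does.
per_estimator = [
    merged_logits[e * batch_size:(e + 1) * batch_size] for e in range(num_estimators)
]

results = []
for i in range(batch_size):
    probs = [softmax(per_estimator[e][i]) for e in range(num_estimators)]
    avg = [statistics.fmean(p[c] for p in probs) for c in range(num_classes)]
    predicted = max(range(num_classes), key=lambda c: avg[c])
    spread = statistics.stdev(p[predicted] for p in probs)
    results.append((predicted, spread))
    print(f"input {i}: predicted={predicted}, std={spread:.3f}")
```

The second input, on which one estimator disagrees with the other two, ends up with a noticeably larger spread than the first, which is the behavior the displayed measure is meant to expose.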

References

https://torch-uncertainty.github.io/auto_tutorials/tutorial_bayesian.html#sphx-glr-auto-tutorials-tutorial-bayesian-py

Train a Bayesian Neural Network in Three Minutes — TorchUncertainty 0.2.1.post0 documentation


LeNet & MNIST: LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE.

Bayesian Neural Networks: Blundell, C., Cornebise, J., Kavukcuoglu, K., & Wierstra, D. (2015). Weight Uncertainty in Neural Networks. ICML 2015.

The Adam optimizer: Kingma, D. P., & Ba, J. (2014). Adam: A method for stochastic optimization. ICLR 2015.

The Blitz library (for the hyperparameters).
