
2022 iThome Ironman Contest

DAY 25
AI & Data

A Rookie Engineer's First Computer Vision (CV) Project: Crop Image Recognition Competition, Part 25 of the series

D25 - Competition Dataset Computation: Pretrained_AlexNet_5th

Part0: Preface

While planning get-togethers for the long weekend, I made up my mind to keep one full day free to catch up on progress and clear out all the bugs. The environment issues (GPU) and the model setup turned out to be quite involved, and honestly I wasn't sure one day would be enough, but fortunately today's full-day effort paid off and the results look good!


Part1: Today's Goals

1. Training code: use the pretrained AlexNet model and apply transfer learning with the competition data
2. Debug notes: Debug_5~Debug_7 & Hint_1~Hint_3


Part2: Content

1. Training code

(1) Overall training procedure: use all of the competition's training data (train_data), 33 crop classes in total!
Step0: Set GPU environment

import torch
print(torch.cuda.is_available())

device = (torch.device('cuda') if torch.cuda.is_available()
          else torch.device('cpu'))
print(f"Training on device {device}.")

Step1: Load the data & Create Dataset

batch_size = 10  # 10 images per batch
val_size, test_size = 0.1, 0.1  # train:val:test = 0.8:0.1:0.1
shuffle_dataset = True
random_seed = 42
transform = transforms.Compose([
    transforms.ToTensor(),  # Debug_5; you can add other transformations to this list
    #transforms.Resize(224),  # int size keeps aspect ratio -> "got [3, 298, 224] at entry 0 and [3, 398, 224]"
    transforms.Resize(size=(255, 255)),  # Debug_6, Hint_2: force a uniform (h, w)
    #transforms.CenterCrop(255)
])

# Use all training data
dataset = torchvision.datasets.ImageFolder(ALL_data_path, transform=transform, target_transform=None)  # Hint_1

print(dataset.class_to_idx)  # Hint_3

Step2: Split Data & Create Dataloader

# Create data indices for the train/validation/test split
# Set the seed and shuffle, then split the indices into train, validation, test
dataset_size = len(dataset)
indices = list(range(dataset_size))
val_split = int(np.floor(val_size * dataset_size))
test_split = val_split + int(np.floor(test_size * dataset_size))

if shuffle_dataset:
    np.random.seed(random_seed)
    np.random.shuffle(indices)

train_indices, val_indices, test_indices = indices[test_split:], indices[:val_split], indices[val_split:test_split]
# print(train_indices, val_indices, test_indices)

# Creating PT data samplers and loaders:
train_sampler = SubsetRandomSampler(train_indices)
val_sampler = SubsetRandomSampler(val_indices)
test_sampler = SubsetRandomSampler(test_indices)


train_loader = torch.utils.data.DataLoader(dataset, batch_size=batch_size,
                                           sampler=train_sampler)
val_loader = torch.utils.data.DataLoader(dataset, batch_size=batch_size,
                                         sampler=val_sampler)
test_loader = torch.utils.data.DataLoader(dataset, batch_size=batch_size,
                                          sampler=test_sampler)

print(len(train_loader), len(val_loader), len(test_loader))  # number of batches in each loader: 7162 896 896
n_total_step = len(train_loader)
print(n_total_step)  # 7162

dataloaders={}
dataloaders["train"] = train_loader
dataloaders["val"] = val_loader

dataset_sizes = {}
dataset_sizes["train"] = len(train_indices)
dataset_sizes["val"] = len(val_indices)
  • Check the current state of the data:
### Check
for index_batch, (images, labels_tensor) in islice(enumerate(train_loader), 1, 3):
    # print(index_batch, (images, labels_tensor))  # batch index, plus the pixel tensors & labels of every image in that batch
    # print(len(images))  # each batch holds 10 images; `images` stores all 10 of them
    # plt.imshow(images)  # Debug: fails, imshow cannot take a 4-D batch tensor
    # print(images[0].shape[-1])  # 255

    print(images[0].shape, labels_tensor[0])  # (channels, height, width) of image 0
    plt.imshow(np.transpose(images[0].numpy(), (1, 2, 0)))  # Debug_7; images[0]: the first image of the batch
print("="*10)

Step3: Use PyTorch pretrained model

def train_model(model, criterion, optimizer, scheduler, num_epochs=25):
    since = time.time()

    best_model_wts = copy.deepcopy(model.state_dict())
    best_acc = 0.0

    for epoch in range(num_epochs):
        print(f'Epoch {epoch}/{num_epochs - 1}')
        print('-' * 10)

        # Each epoch has a training and validation phase
        for phase in ['train', 'val']:
            if phase == 'train':
                model.train()  # Set model to training mode
            else:
                model.eval()   # Set model to evaluate mode

            running_loss = 0.0
            running_corrects = 0

            # Iterate over data.
            for inputs, labels in tqdm(dataloaders[phase]):
            #for i, (inputs, labels) in islice(enumerate(dataloaders[phase]),1,3):
                print("=================="+"Train"+"==================")
                #print("train_batch_number: {}".format(i))
                inputs = inputs.to(device)
                labels = labels.to(device)

                # zero the parameter gradients
                optimizer.zero_grad()

                # forward
                # track history if only in train
                with torch.set_grad_enabled(phase == 'train'):
                    outputs = model(inputs)
                    _, preds = torch.max(outputs, 1)
                    loss = criterion(outputs, labels)

                    # backward + optimize only if in training phase
                    if phase == 'train':
                        loss.backward()
                        optimizer.step()

                # statistics
                running_loss += loss.item() * inputs.size(0)
                running_corrects += torch.sum(preds == labels.data)
            if phase == 'train':
                scheduler.step()

            epoch_loss = running_loss / dataset_sizes[phase]
            epoch_acc = running_corrects.double() / dataset_sizes[phase]

            print(f'{phase} Loss: {epoch_loss:.4f} Acc: {epoch_acc:.4f}')

            # deep copy the model
            if phase == 'val' and epoch_acc > best_acc:
                best_acc = epoch_acc
                best_model_wts = copy.deepcopy(model.state_dict())

        print()

    time_elapsed = time.time() - since
    print(f'Training complete in {time_elapsed // 60:.0f}m {time_elapsed % 60:.0f}s')
    print(f'Best val Acc: {best_acc:4f}')

    # load best model weights
    model.load_state_dict(best_model_wts)
    return model

Step3-1. Set Pretrained model & Parameters

model_alexnet = torchvision.models.alexnet(pretrained=True).to(device)  # load the pretrained AlexNet (ImageNet weights)
print(model_alexnet)

print("="*10)
print(model_alexnet.classifier[6])  # the final classifier layer: Linear(in_features=4096, out_features=1000, bias=True)

Step3-2. Train

# Transfer Learning: (Approach 2) ConvNet as fixed feature extractor
for param in model_alexnet.parameters():
    param.requires_grad = False  # freeze all layers so their weights are not updated

# Parameters of newly constructed modules have requires_grad=True by default
num_ftrs = model_alexnet.classifier[6].in_features  # input dimension of the final classifier layer
model_alexnet.classifier[6] = torch.nn.Linear(num_ftrs, 33)  # 33: number of output classes
# model_alexnet.classifier[6] plays the role of model.fc in the ResNet transfer-learning example

model_alexnet = model_alexnet.to(device)
criterion = torch.nn.CrossEntropyLoss()
# Only parameters of the final layer are being optimized
optimizer_conv = torch.optim.SGD(model_alexnet.classifier[6].parameters(), lr=0.001, momentum=0.9)  # update only the final layer's parameters
exp_lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer_conv, step_size=7, gamma=0.1)

### Train and evaluate
model_alexnet = train_model(model_alexnet, criterion, optimizer_conv,
                            exp_lr_scheduler, num_epochs=10)

Current progress: (set to run 10 epochs, but a single epoch takes about 9 hours; training started at 2022-10-09 19:15:32)

Step4. Save models

# Print model's state_dict
print("Model's state_dict:")
for param_tensor in model_alexnet.state_dict():
    print(param_tensor, "\t", model_alexnet.state_dict()[param_tensor].size())

# Print optimizer's state_dict
print("Optimizer's state_dict:")
for var_name in optimizer_conv.state_dict():  # the optimizer defined above is named optimizer_conv
    print(var_name, "\t", optimizer_conv.state_dict()[var_name])

torch.save(model_alexnet.state_dict(), output_path)  # save the weights only (recommended)
torch.save(model_alexnet, output_path)  # save the entire model; use a distinct path, otherwise this overwrites the file above

model_scripted = torch.jit.script(model_alexnet) # Export to TorchScript
model_scripted.save('model_scripted_alexnet.pt') # Save
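For later reuse, a minimal sketch of loading these artifacts back (assuming output_path points at the saved state_dict and the same 33-class head as above):

# Route 1: rebuild the architecture, then load the saved weights
model_loaded = torchvision.models.alexnet()
model_loaded.classifier[6] = torch.nn.Linear(model_loaded.classifier[6].in_features, 33)
model_loaded.load_state_dict(torch.load(output_path, map_location=device))
model_loaded.eval()  # switch to inference mode

# Route 2: load the TorchScript export (no Python model definition needed)
model_scripted_loaded = torch.jit.load('model_scripted_alexnet.pt', map_location=device)
model_scripted_loaded.eval()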

2. Debug notes: Debug_5~Debug_7 & Hint_1~Hint_3

(1)Hint_1: If the images are stored in one folder per class, the dataset can be built directly with torchvision.datasets.ImageFolder (a short sketch follows the tree below).

train_data
│     
└───lemon
│   │   00abbbec-6228-4bbd-b777-57287b29a616.jpg
│   │   00cac22f-0304-4f23-bdca-a2c7d0337c65.jpg
│   
└───onion
    │   00b6ec1a-a376-4ffe-abf5-fbf90dab6340.jpg
    │   00b41d15-59ab-406b-a502-335c863593bf.jpg
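With that layout, a minimal sketch (the path name is assumed) of how ImageFolder infers the labels from the folder names:

dataset = torchvision.datasets.ImageFolder('train_data', transform=transform)
print(dataset.classes)       # ['lemon', 'onion', ...] - one class per sub-folder, sorted alphabetically
print(dataset.class_to_idx)  # {'lemon': 0, 'onion': 1, ...}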

(2)Debug_5:

  • TypeError: batch must contain tensors, numbers, dicts or lists; found <class 'PIL.Image.Image'>
  • Reason: the DataLoader received a PIL image instead of a tensor, because no ToTensor transform was applied.
  • Sol: add a transform that converts the PIL image to a tensor:
transform = transforms.Compose([
    # you can add other transformations to this list
    transforms.ToTensor()
])

(3)Debug_6:

  • RuntimeError: stack expects each tensor to be equal size, but got [3, 2448, 3264] at entry 0 and [3, 1600, 900] at entry 1
  • Sol: crop every image to the same size with one of the transforms below (see the sketch after this list)
    • (1)transforms.CenterCrop(size): official docs: "Crops the given image at the center."
      With size=224, a 224x224 patch is cut from the center of the image. If the requested crop is larger than the original image, the missing area is padded with zero-valued (black) pixels.
    • (2)torchvision.transforms.RandomCrop(size, padding=None, pad_if_needed=False, fill=0, padding_mode='constant'): Crop the given image at a random location.
      Cuts a patch of the given size from a random position in the image.
    • (3)torchvision.transforms.RandomResizedCrop(size, scale=(0.08, 1.0), ratio=(0.75, 1.33), interpolation=<InterpolationMode.BILINEAR: 'bilinear'>): Crop a random portion of image and resize it to a given size.
      First crops the original image according to the scale range, then applies an aspect ratio from ratio, and finally resizes the crop to size via interpolation.
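A minimal sketch of the three options (255 matches the size used elsewhere in this post; pick one per pipeline):

center_crop = transforms.Compose([
    transforms.CenterCrop(255),  # deterministic 255x255 patch from the center
    transforms.ToTensor(),
])
random_crop = transforms.Compose([
    transforms.RandomCrop(255, pad_if_needed=True),  # 255x255 patch from a random position; pads if the image is smaller
    transforms.ToTensor(),
])
random_resized_crop = transforms.Compose([
    transforms.RandomResizedCrop(255),  # random scale/aspect-ratio crop, resized to 255x255
    transforms.ToTensor(),
])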

(4)Hint_2: crop transforms vs transforms.Resize

  • torchvision.transforms.Resize(size, interpolation=InterpolationMode.BILINEAR, max_size=None, antialias=None)
  • Ex: transforms.Resize((224, 224))
  • Def: Resize the input image to the given size (changes the image resolution).
  • Note: a PIL Image's size attribute returns (w, h), while Resize takes its size argument in (h, w) order.
  • Parameters:
    • size (sequence or int) – Desired output size. If size is a sequence like (h, w), output size will be matched to this. If size is an int, the smaller edge of the image will be matched to this number, i.e. if height > width, the image will be rescaled to (size * height / width, size). -> With an int, output shapes still differ across images: RuntimeError: stack expects each tensor to be equal size, but got [3, 298, 224] at entry 0 and [3, 398, 224] at entry 7
  • Comparison: the crop transforms cut the image directly to the target size (discarding pixels), while Resize keeps the whole image content and only rescales it. A short sketch follows below.
  • PyTorch official documentation
  • Ref: [PyTorch Study Notes] 2.3 Twenty-two transforms methods for image data preprocessing
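A quick sketch of the tuple-vs-int difference (the dummy image is illustrative only):

from PIL import Image

img = Image.new('RGB', (400, 300))     # PIL size is (w, h)
fixed = transforms.Resize((224, 224))  # sequence: exact (h, w) -> every image ends up 224x224
shorter_edge = transforms.Resize(224)  # int: shorter edge -> 224, aspect ratio preserved

print(fixed(img).size)         # (224, 224)
print(shorter_edge(img).size)  # roughly (298, 224): width varies per image, which breaks batch stacking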

(5)Hint_3: When using the ImageFolder class, use the attributes below to trace an index back to the original label (a short sketch follows this list).

  • Docs: ??torchvision.datasets.ImageFolder (IPython/Jupyter help syntax)
  • Attributes:
    classes (list): List of the class names sorted alphabetically.
    class_to_idx (dict): Dict with items (class_name, class_index).
    imgs (list): List of (image path, class_index) tuples
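A minimal sketch of mapping a stored class index back to its class name (idx_to_class is a helper built here, not a torchvision attribute):

print(dataset.classes)       # class names, sorted alphabetically
print(dataset.class_to_idx)  # {'class_name': class_index, ...}

# Invert the mapping to trace an index back to its original label
idx_to_class = {idx: name for name, idx in dataset.class_to_idx.items()}
img_path, class_idx = dataset.imgs[0]  # (image path, class_index) of the first sample
print(img_path, class_idx, idx_to_class[class_idx])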

(6)Debug_7: show image with matplotlib

  • TypeError: Invalid shape (10, 3, 255, 255) for image data, where (10, 3, 255, 255) corresponds to (batch, channel, height, width)
  • Reason:
    • The DataLoader outputs a 4-dimensional tensor: [batch, channel, height, width].
    • Matplotlib and other image-processing libraries usually expect [height, width, channel].
  • Sol: use a transpose: plt.imshow(np.transpose(images[0].numpy(), (1, 2, 0)))
    • Step1: `images` holds many images, so first pick one (or write a for loop to save all of them). That is simply images[i]; typically i=0, i.e. the first image of the batch (indices start at 0). Note: each batch holds 10 images, and `images` stores all 10 of them.
    • Step2: then convert the [channel, height, width] tensor into a [height, width, channel] one: np.transpose(image.numpy(), (1, 2, 0))
  • Ref_Stackoverflow: How do I turn a Pytorch Dataloader into a numpy array to display image data with matplotlib?
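Putting the two steps together, a minimal sketch that displays the first image of a batch (reusing train_loader and dataset from the steps above):

images, labels = next(iter(train_loader))         # one batch: [10, 3, 255, 255]
img = images[0]                                   # pick one image: [3, 255, 255]
plt.imshow(np.transpose(img.numpy(), (1, 2, 0)))  # -> [255, 255, 3] for matplotlib
plt.title(dataset.classes[labels[0]])             # show its class name
plt.show()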

Part3: Project Progress

Model training is now running smoothly.

Part4: Next Steps

Try other pretrained models & data preprocessing approaches.



Reflections:
I spent almost 3 hours today sorting out the GPU environment; all sorts of problems kept appearing, and in the end simply wiping everything and reinstalling did the trick, which is a bit of a mystery. The rest of the time went into training-setup issues and into learning some very handy PyTorch utilities, e.g. torchvision.datasets.ImageFolder, which builds the (image, label) dataset in one step; I had naively been writing my own function for that (facepalm~~). A day full of takeaways, and I'm very happy!
Today's work time: 50 min * 7

Challenges are what make life interesting. Overcoming them is what makes life meaningful.


Previous post
D24 - Deep Learning Applications for Crops: Literature Review (2nd)
Next post
D26 - Competition Dataset Computation: Pretrained_AlexNet_6th
Series
A Rookie Engineer's First Computer Vision (CV) Project: Crop Image Recognition Competition (32 articles)