Graduate School Exam Prep

Data Analysis Project (Classifying Covid-19 Images with Python, Part 1)

code2772 2023. 3. 25. 12:33


✔ Installation

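!pip install torch
!pip install torchvision
!pip install ipywidgets
# cv2 and matplotlib are imported below; install them as well if your environment lacks them
!pip install opencv-python matplotlib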

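1. torch (PyTorch): a deep learning framework that grew out of the Torch library. It is used for applications such as computer vision and natural language processing, and because it can run on a GPU, training is considerably faster.

2. torchvision: a companion library to PyTorch that provides ready-made models, along with image transforms and datasets.

3. ipywidgets: a library of interactive notebook widgets. Here it provides a slider for stepping through the images.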

✔ import

import torch
import copy
import os
import cv2
import matplotlib.pyplot as plt
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms, models
from ipywidgets import interact
from torch import nn

1. Loading image file paths

def list_image_file(data_dir, sub_dir): # e.g. './Covid19-dataset/train/', 'Normal'
    image_format = ['jpeg', 'jpg', 'png']
    image_files = []

    image_dir = os.path.join(data_dir, sub_dir) # './Covid19-dataset/train/Normal'
    # os.listdir returns the names of the entries in image_dir; loop over them one by one.
    for file_path in os.listdir(image_dir):
        # Split the name on '.' and take the last piece ([-1]), i.e. the extension;
        # keep the file only if that extension is in image_format.
        if file_path.split('.')[-1] in image_format:
            # Join the subfolder and the file name, e.g. 'Normal/001.jpeg' ('Normal\\001.jpeg' on Windows).
            image_files.append(os.path.join(sub_dir, file_path))
    return image_files

os.listdir: returns a list of the entry names in the given directory. The for loop assigns each name to file_path in turn.
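To see the extension filter in isolation, here is a minimal sketch that runs against a throwaway directory instead of the real dataset. The file names and the helper name list_images are made up for illustration. Unlike the function above, it lowercases the extension, which is an assumption about how a file such as 003.PNG should be treated.

```python
import os
import tempfile

IMAGE_FORMATS = {"jpeg", "jpg", "png"}

def list_images(image_dir):
    """Return the sorted names of the image files in image_dir (hypothetical helper)."""
    names = []
    for name in os.listdir(image_dir):
        # splitext returns the extension with its dot; strip the dot and lowercase it
        ext = os.path.splitext(name)[1].lstrip(".").lower()
        if ext in IMAGE_FORMATS:
            names.append(name)
    return sorted(names)  # os.listdir order is arbitrary, so sort for stable output

with tempfile.TemporaryDirectory() as tmp:
    # Create empty placeholder files; only three of them have image extensions.
    for name in ["001.jpeg", "002.jpg", "003.PNG", "notes.txt", "README"]:
        open(os.path.join(tmp, name), "w").close()
    found = list_images(tmp)

print(found)  # ['001.jpeg', '002.jpg', '003.PNG']
```

With the case-sensitive check used in list_image_file, 003.PNG would be skipped, so it is worth checking how the files in your own dataset are named.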

data_dir = './Covid19-dataset/train/'
normals_list = list_image_file(data_dir, 'Normal')
covids_list = list_image_file(data_dir, 'Covid') # 'Covid' is the folder name
pneumonias_list = list_image_file(data_dir, 'Viral Pneumonia')

print(len(normals_list))
print(len(covids_list))
print(len(pneumonias_list))

Output

70
111
70

These are the numbers of normal, Covid, and viral pneumonia images, in that order.

2. Loading image files as 3-D RGB arrays

def get_RGB_image(data_dir, file_name):
    # Build the path with os.path.join rather than manual string concatenation,
    # so the correct separator for the operating system is inserted.
    image_file = os.path.join(data_dir, file_name)
    # Read the image from disk.
    image = cv2.imread(image_file)
    # cv2 returns a NumPy ndarray with the channels in BGR order, so convert it to RGB.
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    return image
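For a 3-channel image, the BGR to RGB conversion amounts to reversing the order of the channel axis. The sketch below shows this with NumPy only, on a made-up 1 x 2 image, so it runs without OpenCV or the dataset. It illustrates the idea and is not how cv2.cvtColor is implemented.

```python
import numpy as np

# A 1 x 2 "image" in BGR order: one pure-blue pixel and one pure-red pixel.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the last axis swaps the B and R channels.
rgb = bgr[..., ::-1]

print(bgr.shape)   # (1, 2, 3): height, width, channels
print(rgb[0, 0])   # [  0   0 255]: the blue pixel, now in RGB order
print(rgb[0, 1])   # [255   0   0]: the red pixel, now in RGB order
```

Skipping the conversion does not raise an error. plt.imshow simply shows red and blue swapped, which is easy to miss on near-grayscale X-ray images.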

# The three classes have different numbers of images, so take the smallest count and use it as the slider's upper bound.
min_num_files = min(len(normals_list), len(covids_list), len(pneumonias_list))
# interact builds a slider; the index ranges from 0 to min_num_files - 1 (here 0 to 69).
@interact(index=(0, min_num_files-1))
def show_samples(index=0):
    # Load the image at this index from each class as an RGB array.
    normal_image = get_RGB_image(data_dir, normals_list[index])
    covid_image = get_RGB_image(data_dir, covids_list[index])
    pneumonia_image = get_RGB_image(data_dir, pneumonias_list[index])
    
    plt.figure(figsize=(12, 8))
    # plt.subplot draws several plots in one figure;
    # 131 means 1 row, 3 columns, and this is the first plot.
    plt.subplot(131)
    plt.title('Normal')
    plt.imshow(normal_image)
    
    plt.subplot(132)
    plt.title('Covid')
    plt.imshow(covid_image)
    
    plt.subplot(133)
    plt.title('Pneumonia')
    plt.imshow(pneumonia_image)
    plt.tight_layout()

3. Building the training dataset class

Dataset: stores the data (samples and their labels).
DataLoader: wraps a dataset in an iterable object so that it is easy to access.
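The two ideas can be sketched in plain Python without torch. ToyDataset and toy_loader below are hypothetical stand-ins, not the real torch.utils.data classes. They only show the len()/indexing protocol a dataset exposes and the way a loader walks over it in batches. The file names and labels are made up.

```python
class ToyDataset:
    """Stand-in for a Dataset: holds (sample, label) pairs."""
    def __init__(self, samples, labels):
        self.samples = samples
        self.labels = labels

    def __len__(self):               # called by len(dataset)
        return len(self.samples)

    def __getitem__(self, index):    # called by dataset[index]
        return {"image": self.samples[index], "target": self.labels[index]}


def toy_loader(dataset, batch_size):
    """Stand-in for a DataLoader: yields lists of consecutive items."""
    for start in range(0, len(dataset), batch_size):
        stop = min(start + batch_size, len(dataset))
        yield [dataset[i] for i in range(start, stop)]


toy = ToyDataset(["a.png", "b.png", "c.png"], [0, 1, 2])
batches = list(toy_loader(toy, batch_size=2))

print(len(toy))                            # 3
print(toy[1])                              # {'image': 'b.png', 'target': 1}
print([len(batch) for batch in batches])   # [2, 1]
```

The real DataLoader additionally handles shuffling, collating items into tensors, and parallel loading, none of which this sketch attempts.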

train_data_dir = './Covid19-dataset/train/'
class_list = ['Normal', 'Covid', 'Viral Pneumonia']

# Dataset: stores the data (samples and their labels).
# DataLoader: wraps a dataset in an iterable so that it is easy to access.

class Chest_dataset(Dataset):
    def __init__(self, data_dir, transform=None): # transform: optional preprocessing, so train and test can use different pipelines
        self.data_dir = data_dir
        normals = list_image_file(data_dir, 'Normal') # collect the file paths of each class
        covids = list_image_file(data_dir, 'Covid')
        pneumonias = list_image_file(data_dir, 'Viral Pneumonia')
        self.files_path = normals + covids + pneumonias # concatenate the lists, so the length is the total number of images
        self.transform = transform

    # __len__: makes len(dataset) work
    def __len__(self):
        return len(self.files_path)

    # __getitem__: called when the dataset is indexed, e.g. dataset[index]
    def __getitem__(self, index):
        image_file = os.path.join(self.data_dir, self.files_path[index])
        image = cv2.imread(image_file)
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        # os.sep is the directory separator ('/' or '\\'); splitting 'Normal/01.jpeg' on it
        # gives ['Normal', '01.jpeg'], so [-2] is the class folder name.
        # print(self.files_path[index].split(os.sep))
        # print(self.files_path[index].split(os.sep)[-2])
        target = class_list.index(self.files_path[index].split(os.sep)[-2])
        if self.transform:
            image = self.transform(image)
            target = torch.Tensor([target]).long() # convert the label to an integer (long) tensor
        return {'image':image, 'target':target}

dset = Chest_dataset(train_data_dir)

index = 100
plt.title(class_list[dset[index]['target']])
plt.imshow(dset[index]['image']) # the image at that index

The image at index 100.
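The label comes entirely from the folder name in each relative path. As a minimal sketch of that step on its own, the hypothetical helper label_from_path below uses os.path.dirname and os.path.basename instead of splitting on os.sep. The file names are made up, and the paths are built with os.path.join so the example behaves the same on any operating system.

```python
import os

class_list = ['Normal', 'Covid', 'Viral Pneumonia']

def label_from_path(relative_path):
    """Map a 'Covid/012.png'-style path to the index of its class folder."""
    folder = os.path.basename(os.path.dirname(relative_path))
    return class_list.index(folder)

paths = [os.path.join('Normal', '001.jpeg'),
         os.path.join('Covid', '012.png'),
         os.path.join('Viral Pneumonia', '007.jpg')]
labels = [label_from_path(p) for p in paths]

print(labels)  # [0, 1, 2]
```

Because the dataset concatenates the lists in the order Normal, Covid, Viral Pneumonia, and there are 70 Normal and 111 Covid images, indices 70 through 180 are Covid images. That is why index 100 shows a Covid sample.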

4. Converting arrays into tensors that support computation

# transforms.ToTensor(): converts the array to a tensor and scales pixel values to the range 0 to 1
# Resize(): sets the size to 224 x 224
# Normalize(): normalizes each channel with mean 0.5 and standard deviation 0.5
transformer = transforms.Compose([ # Compose takes a list of transforms and applies them in order
    transforms.ToTensor(),
    transforms.Resize((224, 224)), # make every image the same square size
    transforms.Normalize(mean=[0.5, 0.5, 0.5], # (x - 0.5) / 0.5 maps values from [0, 1] to [-1, 1],
                         std=[0.5, 0.5, 0.5])  # so every channel is on the same scale
])
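Normalize applies (x - mean) / std to each channel. With mean 0.5 and std 0.5, values that ToTensor scaled to [0, 1] end up in [-1, 1]. The arithmetic can be checked in plain Python. The normalize function here is a hypothetical one-value helper written for this check, not the torchvision implementation.

```python
def normalize(x, mean=0.5, std=0.5):
    """Per-channel normalization formula: (x - mean) / std."""
    return (x - mean) / std

# Pixel values after scaling to [0, 1]: black, mid-gray, white.
scaled = [0 / 255, 128 / 255, 255 / 255]
normalized = [normalize(v) for v in scaled]

print([round(v, 4) for v in normalized])  # [-1.0, 0.0039, 1.0]
```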


train_dset = Chest_dataset(train_data_dir, transformer)

index = 100
image = train_dset[index]['image']
label = train_dset[index]['target']

print(image.shape, label) # tensor([1]): Covid
# Result: 3 channels, each of size 224 x 224, labeled Covid

Output

torch.Size([3, 224, 224]) tensor([1])

A 3-channel image of size 224 x 224, labeled Covid.
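The channel count comes first in that shape because ToTensor turns the (height, width, channels) array from cv2 into a (channels, height, width) tensor while scaling it to [0, 1]. The NumPy-only sketch below mimics that layout change on a made-up all-white image. It assumes the image is already 224 x 224 for simplicity and is not torchvision's code.

```python
import numpy as np

# Hypothetical image in the layout cv2 returns: (height, width, channels), uint8.
hwc = np.full((224, 224, 3), 255, dtype=np.uint8)

# Move the channel axis to the front and scale to floats in [0, 1].
chw = np.transpose(hwc, (2, 0, 1)).astype(np.float32) / 255.0

print(hwc.shape)  # (224, 224, 3)
print(chw.shape)  # (3, 224, 224)
print(chw.max())  # 1.0
```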

Continued in the next part.
