4 Answers

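Contributor with 1798 experience points · 3+ upvotes
Try this snippet to check whether the images are in a valid format. It opens every file with Pillow and stops with an exception at the first file that cannot be decoded; the last path printed is the offending file.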
import os
from PIL import Image

folder_path = r'data\img'  # root folder with one sub-folder per class
extensions = []            # distinct file extensions seen so far
for fldr in os.listdir(folder_path):
    sub_folder_path = os.path.join(folder_path, fldr)
    for filee in os.listdir(sub_folder_path):
        file_path = os.path.join(sub_folder_path, filee)
        print('** Path: {} **'.format(file_path), end="\r", flush=True)
        # Raises an exception on a file that Pillow cannot decode
        im = Image.open(file_path)
        rgb_im = im.convert('RGB')
        # splitext also handles names with no dot or with several dots
        ext = os.path.splitext(filee)[1].lstrip('.')
        if ext not in extensions:
            extensions.append(ext)

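Contributor with 1946 experience points · 3+ upvotes
I don't know whether this is still relevant, but for anyone who runs into the same problem in the future:
In this particular case, there are two corrupted files in the dog_cat dataset:
Cat/666.jpg
Dog/11702.jpg
Delete them and the problem goes away.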
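
If you would rather script the deletion than do it by hand, here is a minimal sketch using only the standard library. The helper name `remove_files` and the `PetImages` root folder are assumptions for illustration; point it at wherever your copy of the dataset lives and adjust the class folder names if yours differ.

```python
from pathlib import Path


def remove_files(dataset_root, relative_paths):
    """Delete each listed file under dataset_root if it exists.

    Returns the list of paths that were actually removed.
    """
    removed = []
    for rel in relative_paths:
        target = Path(dataset_root) / rel
        if target.is_file():
            target.unlink()
            removed.append(target)
    return removed


# The two files named in the answer above.
CORRUPTED = ["Cat/666.jpg", "Dog/11702.jpg"]

if __name__ == "__main__":
    # "PetImages" is an assumed dataset root, not taken from the answer.
    for path in remove_files("PetImages", CORRUPTED):
        print(f"removed {path}")
```

Files that are already missing are skipped silently, so the script is safe to run more than once.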

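Contributor with 1801 experience points · 16+ upvotes
I ran into this problem before, so I wrote a Python script that checks the train and test directories for valid image files. The file extension must be one of jpg, png, bmp, or gif, so the script checks the extension first. It then tries to read the image with cv2. If the file is not a valid image, an exception is raised. In either case the path of the bad file is printed. At the end, the list named bad_list holds the paths of all bad files. Note that the directories must be named "test" and "train".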
import os
import cv2

bad_list = []
dir = r'c:\PetImages'
subdir_list = os.listdir(dir)  # sub directories of the root, i.e. train or test
for d in subdir_list:  # iterate through the sub directories train and test
    dpath = os.path.join(dir, d)  # path to the sub directory
    if d in ['test', 'train']:
        class_list = os.listdir(dpath)  # list of classes, i.e. dog or cat
        for klass in class_list:  # iterate through the two classes
            class_path = os.path.join(dpath, klass)  # path to the class directory
            file_list = os.listdir(class_path)  # files in the class directory
            for f in file_list:  # iterate through the files
                fpath = os.path.join(class_path, f)
                index = f.rfind('.')  # index of the last period in the file name
                ext = f[index+1:]  # the file's extension
                if ext not in ['jpg', 'png', 'bmp', 'gif']:
                    print(f'file {fpath} has an invalid extension {ext}')
                    bad_list.append(fpath)
                else:
                    try:
                        img = cv2.imread(fpath)
                        # imread returns None for an unreadable file, so .shape raises
                        size = img.shape
                    except Exception:
                        print(f'file {fpath} is not a valid image file')
                        bad_list.append(fpath)
print(bad_list)
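
Once bad_list is populated, one option is to move the flagged files out of the dataset rather than delete them, so that a false positive can be restored. The following is only a sketch under that assumption: the helper name `quarantine` and the destination folder are hypothetical and not part of the script above.

```python
import shutil
from pathlib import Path


def quarantine(bad_paths, quarantine_dir):
    """Move each existing file in bad_paths into quarantine_dir.

    Returns the list of new paths.
    """
    dest_root = Path(quarantine_dir)
    dest_root.mkdir(parents=True, exist_ok=True)
    moved = []
    for p in map(Path, bad_paths):
        if not p.is_file():
            continue  # already removed or never existed
        # Prefix with the class folder name so that Cat/1.jpg and
        # Dog/1.jpg do not overwrite each other in the flat folder.
        dest = dest_root / f"{p.parent.name}_{p.name}"
        shutil.move(str(p), str(dest))
        moved.append(dest)
    return moved
```

A call such as `quarantine(bad_list, r'c:\PetImages_bad')` (the folder name is an assumption) would leave the train and test directories containing only the files that passed both checks.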

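Contributor with 1848 experience points · 2+ upvotes
Instead of appending each corrupted file to a list, we can also delete it at the moment the error occurs: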
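import os
from PIL import Image

folder_path = r"C:\Users\ImageDatasets"
extensions = []
corupt_img_paths = []  # unused here: bad files are deleted instead of collected
for fldr in os.listdir(folder_path):
    sub_folder_path = os.path.join(folder_path, fldr)
    for filee in os.listdir(sub_folder_path):
        file_path = os.path.join(sub_folder_path, filee)
        print('** Path: {} **'.format(file_path), end="\r", flush=True)
        try:
            # Image.open is lazy, so convert() stays inside the try to catch
            # truncated files; the with block closes the file before os.remove
            with Image.open(file_path) as im:
                rgb_im = im.convert('RGB')
        except Exception:
            print(file_path)
            os.remove(file_path)
            continue
        # splitext also handles names with no dot or with several dots
        ext = os.path.splitext(filee)[1].lstrip('.')
        if ext not in extensions:
            extensions.append(ext)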