
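Create a nested dictionary from similar values in a column, using the value as the dictionary key and including all rows that have that value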

Python

慕村225694 2022-10-05 18:25:44

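Hi all, I have a DataFrame of bounding boxes in images, with each box on its own row. What I want to do is combine all the rows for a given image.

        image                    xmin  ymin  xmax  ymax  label
    0   bookstore_video0_40.jpg   763   899   806   940  pedestrian
    3   bookstore_video0_40.jpg  1026   754  1075   797  pedestrian
    4   bookstore_video0_40.jpg   868   770   927   822  biker
    5   bookstore_video0_40.jpg   413  1010   433  1040  pedestrian
    21  bookstore_video0_80.jpg   866   278   917   328  pedestrian
    22  bookstore_video0_80.jpg   761   825   820   865  biker

What I had in mind was to turn it into a single-level nested dictionary, like this (note that I am open to any solution and not set on a dictionary):

    {'bookstore_video0_40.jpg': {'xmin': 1066, 'ymin': 802, 'xmax': 1093, 'ymax': 829, 'label': 'pedestrian'}}

but with the data for all rows, keyed by image name. My end goal is to pass this to a function that writes each row's data to a file in order. That said, I am lost on how to group the data into chunks. I tried groupby('image'), but I don't know how to turn the result into what I want. Does anyone have ideas? I'm fairly sure this is easy; I looked around, but most of the replies I found were for more complex problems. Thanks in advance.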


4 Answers

躍然一笑

Contributed 1826 experience points, received more than 6 likes

How about this?

new_dict = df.set_index('image').stack().groupby('image').apply(list).to_dict()

print(new_dict)

{'bookstore_video0_40.jpg': [763, 899, 806, 940, 'pedestrian',
                             1026, 754, 1075, 797, 'pedestrian',
                             868, 770, 927, 822, 'biker',
                             413, 1010, 433, 1040, 'pedestrian'],
 'bookstore_video0_80.jpg': [866, 278, 917, 328, 'pedestrian',
                             761, 825, 820, 865, 'biker']}
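Note that the stacked result above flattens every box of an image into one long list, so the boundaries between rows are lost. If one dict per box is preferred, a minimal sketch along the following lines may help. It assumes the column names from the question; the name `boxes_by_image` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "image": ["bookstore_video0_40.jpg"] * 4 + ["bookstore_video0_80.jpg"] * 2,
    "xmin": [763, 1026, 868, 413, 866, 761],
    "ymin": [899, 754, 770, 1010, 278, 825],
    "xmax": [806, 1075, 927, 433, 917, 820],
    "ymax": [940, 797, 822, 1040, 328, 865],
    "label": ["pedestrian", "pedestrian", "biker",
              "pedestrian", "pedestrian", "biker"],
})

# Map each image name to a list of per-box dicts, preserving row order.
# boxes_by_image is an illustrative name, not something from the question.
boxes_by_image = {
    image: group.drop(columns="image").to_dict("records")
    for image, group in df.groupby("image", sort=False)
}

# Each box can then be handed to a writer function one at a time.
for image, boxes in boxes_by_image.items():
    for box in boxes:
        print(image, box["xmin"], box["ymin"], box["xmax"], box["ymax"], box["label"])
```

Each value is a plain list of dicts, so it can be iterated in order without further pandas calls.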

Answered 2022-10-05

開心每一天1111

Contributed 1836 experience points, received more than 13 likes

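Here is a working example based on your example, except that it does not read the actual XML files. Thank you very much. I suspect your answer will be useful, because this is a problem people in machine vision run into when doing things like cutting up 4K images that have already been annotated.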

from pathlib import Path
from xml.etree import ElementTree as ET

import pandas as pd

df = pd.DataFrame(dict(
    image = '40.jpg 40.jpg 40.jpg 40.jpg 80.jpg 80.jpg'.split(),
    xmin = [763, 1026, 868, 413, 866, 761],
    ymin = [899, 754, 770, 1010, 278, 825],
    xmax = [806, 1075, 927, 433, 917, 820],
    ymax = [940, 797, 822, 1040, 328, 865],
    label = 'pedestrian pedestrian biker pedestrian pedestrian biker'.split(),
))

# tree.write() does not create missing directories, so create the output folder first
out_dir = Path('./mergedxml')
out_dir.mkdir(parents=True, exist_ok=True)

for img in df['image'].unique():
    # Rows for this image only, re-indexed from 0 so .loc[box, ...] works below
    img_df = df[df['image']==img].drop(columns = 'image').reset_index()
    boxes = range(img_df.shape[0])
    print(img, '\n', img_df)

    # A custom VOC writer could be inited here; this version builds the XML directly
    print("New custom VOC Writer instance inited here!")
    annotation = ET.Element('annotation')
    ET.SubElement(annotation, 'folder').text = str(img)
    ET.SubElement(annotation, 'filename').text = str(img)
    ET.SubElement(annotation, 'segmented').text = '0'
    size = ET.SubElement(annotation, 'size')
    # Width and height are placeholders: the image size is not in the DataFrame
    ET.SubElement(size, 'width').text = '0'
    ET.SubElement(size, 'height').text = '0'
    ET.SubElement(size, 'depth').text = '3'

    for box in boxes:
        xmin = img_df.loc[box,'xmin']
        ymin = img_df.loc[box,'ymin']
        xmax = img_df.loc[box,'xmax']
        ymax = img_df.loc[box,'ymax']
        label = img_df.loc[box,'label']
        print(xmin, ymin, xmax, ymax)

        # Add one <object> element per bounding box
        ob = ET.SubElement(annotation, 'object')
        ET.SubElement(ob, 'name').text = str(label)
        ET.SubElement(ob, 'pose').text = 'Unspecified'
        ET.SubElement(ob, 'truncated').text = '0'
        ET.SubElement(ob, 'difficult').text = '0'
        bbox = ET.SubElement(ob, 'bndbox')
        ET.SubElement(bbox, 'xmin').text = str(xmin)
        ET.SubElement(bbox, 'ymin').text = str(ymin)
        ET.SubElement(bbox, 'xmax').text = str(xmax)
        ET.SubElement(bbox, 'ymax').text = str(ymax)

    # Once you exit that inner loop, save one .xml file per image
    print(".xml file saved here!")
    tree = ET.ElementTree(annotation)
    tree.write(str(out_dir / (str(img) + '.xml')), encoding='utf8')
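The answer notes that reading the actual XML files is not covered. As a rough sketch of that missing direction, the snippet below parses one VOC-style annotation back into DataFrame rows. The inline `xml_text` sample and the name `df_from_xml` are assumptions made for illustration; a real file could be loaded with `ET.parse(path).getroot()` instead of `ET.fromstring`.

```python
from xml.etree import ElementTree as ET

import pandas as pd

# Illustrative sample mirroring the element layout written by the code above.
xml_text = """<annotation>
  <filename>40.jpg</filename>
  <object>
    <name>pedestrian</name>
    <bndbox><xmin>763</xmin><ymin>899</ymin><xmax>806</xmax><ymax>940</ymax></bndbox>
  </object>
  <object>
    <name>biker</name>
    <bndbox><xmin>868</xmin><ymin>770</ymin><xmax>927</xmax><ymax>822</ymax></bndbox>
  </object>
</annotation>"""

root = ET.fromstring(xml_text)
image_name = root.findtext("filename")

rows = []
for ob in root.iter("object"):
    bbox = ob.find("bndbox")
    row = {"image": image_name, "label": ob.findtext("name")}
    # Coordinates are stored as text in the XML, so convert them back to int.
    for tag in ("xmin", "ymin", "xmax", "ymax"):
        row[tag] = int(bbox.findtext(tag))
    rows.append(row)

# Rebuild the same column order as the DataFrame in the question.
df_from_xml = pd.DataFrame(
    rows, columns=["image", "xmin", "ymin", "xmax", "ymax", "label"]
)
print(df_from_xml)
```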

Answered 2022-10-05

繁花如伊

Contributed 2012 experience points, received more than 12 likes

Perhaps you need to use dict together with tuple/list on the groupby:

images_dict = dict(tuple(df.groupby('image')))
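Here each key of `images_dict` is an image name and each value is the sub-DataFrame holding that image's rows. A short sketch of how the result might then be walked row by row, assuming the shortened sample data used elsewhere in this thread:

```python
import pandas as pd

df = pd.DataFrame({
    "image": "40.jpg 40.jpg 40.jpg 40.jpg 80.jpg 80.jpg".split(),
    "xmin": [763, 1026, 868, 413, 866, 761],
    "ymin": [899, 754, 770, 1010, 278, 825],
    "xmax": [806, 1075, 927, 433, 917, 820],
    "ymax": [940, 797, 822, 1040, 328, 865],
    "label": "pedestrian pedestrian biker pedestrian pedestrian biker".split(),
})

# Iterating a GroupBy yields (key, sub-DataFrame) pairs, which dict() accepts.
images_dict = dict(tuple(df.groupby("image")))

for image, img_df in images_dict.items():
    # itertuples gives one namedtuple per row, with columns as attributes.
    for row in img_df.itertuples(index=False):
        print(image, row.xmin, row.ymin, row.xmax, row.ymax, row.label)
```

The sub-DataFrames keep the `image` column and their original index labels, so `reset_index(drop=True)` may be wanted before positional lookups.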

Answered 2022-10-05

侃侃爾雅

Contributed 1801 experience points, received more than 16 likes

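I wanted to post this as a comment rather than an answer, but the link is too long:

I wrote a VOC writer. I just need to be able to pass the data in such a way that I can iterate over it. I have a different dataset where I do something similar, but the data is already in an easy-to-use form. For my project I have spent a lot of time editing, cleaning, converting, etc. the data. Not fun for me. – Robi Sen

How does your VOC writer work? Is it similar to the one I linked to (i.e., it uses OOP and has a class method for adding bbox data to an XML writer instance, and then another method for saving that instance to an XML file)? My comment was poorly written, so here is a better example of what I mean: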

import pandas as pd

df = pd.DataFrame(dict(
    image = '40.jpg 40.jpg 40.jpg 40.jpg 80.jpg 80.jpg'.split(),
    xmin = [763, 1026, 868, 413, 866, 761],
    ymin = [899, 754, 770, 1010, 278, 825],
    xmax = [806, 1075, 927, 433, 917, 820],
    ymax = [940, 797, 822, 1040, 328, 865],
    label = 'pedestrian pedestrian biker pedestrian pedestrian biker'.split(),
))

for img in df['image'].unique():
    img_df = df[df['image']==img].drop(columns = 'image').reset_index()
    boxes = range(img_df.shape[0])
    print(img, '\n', img_df)

    # Ideally your custom voc writer can be inited here
    # with something like:
    # v_writer = VocWriter(f'path/{img[:-4]}.xml')
    print('New custom VOC XML Writer instance inited here!')

    for box in boxes:
        xmin = img_df.loc[box,'xmin']
        ymin = img_df.loc[box,'ymin']
        xmax = img_df.loc[box,'xmax']
        ymax = img_df.loc[box,'ymax']
        label = img_df.loc[box,'label']
        print(xmin, ymin, xmax, ymax)

        # Inside of this loop,
        # you can add each box to your VocWriter object
        # something like:
        # v_writer.addObject(label, xmin, ymin, xmax, ymax)
        print('New bbox object added to writer instance here!')

    # Once you exit that inner loop,
    # you can save your data to your .xml file
    # with something like:
    # v_writer.save(f'path/{img[:-4]}.xml')
    print(f'path/{img[:-4]}.xml file saved here!')

Step through the example in Python Tutor to get a better idea of what I mean.
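The commented-out calls in the example (`VocWriter(...)`, `addObject`, `save`) describe an object-oriented writer without showing one. The class below is a purely hypothetical sketch of that shape, built on the standard library's `xml.etree.ElementTree`. It is not the asker's writer or the linked one, and the `to_string` helper is an extra added here for inspection.

```python
from xml.etree import ElementTree as ET


class VocWriter:
    """Hypothetical minimal VOC-style writer, for illustration only."""

    def __init__(self, filename):
        # One <annotation> root per image.
        self.root = ET.Element("annotation")
        ET.SubElement(self.root, "filename").text = filename

    def addObject(self, label, xmin, ymin, xmax, ymax):
        # Append one <object> with its <bndbox> coordinates.
        obj = ET.SubElement(self.root, "object")
        ET.SubElement(obj, "name").text = str(label)
        bndbox = ET.SubElement(obj, "bndbox")
        for tag, value in zip(("xmin", "ymin", "xmax", "ymax"),
                              (xmin, ymin, xmax, ymax)):
            ET.SubElement(bndbox, tag).text = str(value)

    def to_string(self):
        # Serialize to a str without touching the file system.
        return ET.tostring(self.root, encoding="unicode")

    def save(self, path):
        # Write the accumulated tree to disk; the directory must already exist.
        ET.ElementTree(self.root).write(path, encoding="utf-8")


v_writer = VocWriter("40.jpg")
v_writer.addObject("pedestrian", 763, 899, 806, 940)
v_writer.addObject("biker", 868, 770, 927, 822)
xml_text = v_writer.to_string()
print(xml_text)
```

With a class of this shape, one writer would be created per image in the outer loop, one `addObject` call made per box in the inner loop, and `save` called once after the inner loop finishes.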

Answered 2022-10-05
