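Batch-editing labelImg-generated XML files with Python
Overview
After labeling my images with labelImg, I wanted to train on only a few of the classes rather than all of them, without re-labeling and regenerating the .xml files. The approach I settled on is to delete the unwanted label classes, together with their attributes, directly from the .xml files.
Label names also ended up with inconsistent capitalization during annotation (an easy slip on a large job), so the script converts the kept label values to lower case. In my py-faster-rcnn training, all labels had to be lower-case.
Take the following .xml file as an example; I deliberately added upper-case letters to the labels.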
[Example annotation; surviving fields: filename test.jpg, path C:\Users\yasin\Desktop\test, size 400 x 300 x 3, segmented 0]
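For orientation, a labelImg annotation in Pascal VOC format generally has the shape sketched below. This is only an illustration: the object names, their capitalization, and the box coordinates are hypothetical placeholders, and treating 400 and 300 as width and height assumes the usual labelImg field order. The snippet parses the sample and lists the label of each object.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation: object names and box coordinates are made up for
# illustration; only the file-level values mirror the example in this article.
SAMPLE_XML = r"""<annotation>
    <filename>test.jpg</filename>
    <path>C:\Users\yasin\Desktop\test</path>
    <size>
        <width>400</width>
        <height>300</height>
        <depth>3</depth>
    </size>
    <segmented>0</segmented>
    <object>
        <name>People</name>
        <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>220</ymax></bndbox>
    </object>
    <object>
        <name>CAT</name>
        <bndbox><xmin>150</xmin><ymin>90</ymin><xmax>250</xmax><ymax>200</ymax></bndbox>
    </object>
    <object>
        <name>Dog</name>
        <bndbox><xmin>260</xmin><ymin>100</ymin><xmax>380</xmax><ymax>280</ymax></bndbox>
    </object>
</annotation>"""

root = ET.fromstring(SAMPLE_XML)

# Each labeled box is an <object> child of the root; its class is in <name>.
labels = [obj.findtext("name") for obj in root.findall("object")]
print(labels)  # ['People', 'CAT', 'Dog']
```

Each class to be dropped corresponds to one whole object element, so deleting that element removes the label and its bounding box together.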
Implementation
Suppose we want to keep only the people and cat classes in the image and delete everything else. The code is as follows:
from os import walk, path
from xml.etree.ElementTree import ElementTree


def read_xml(in_path):
    """Parse an XML file and return its ElementTree."""
    tree = ElementTree()
    tree.parse(in_path)
    return tree


def write_xml(tree, out_path):
    """Write the tree to out_path as UTF-8 with an XML declaration."""
    tree.write(out_path, encoding="utf-8", xml_declaration=True)


def find_nodes(tree, node_path):
    """Return all nodes matching node_path."""
    return tree.findall(node_path)


def del_node_by_target_classes(nodelist, target_classes_lower, tree_root):
    """Remove <object> nodes outside the target classes; lower-case the rest."""
    for parent_node in nodelist:
        if parent_node.tag != "object":
            continue
        # Element.getchildren() was removed in Python 3.9; look up <name> directly
        name_node = parent_node.find("name")
        label = name_node.text.lower()
        if label in target_classes_lower:
            name_node.text = label
        else:
            tree_root.remove(parent_node)


def get_fileNames(rootdir):
    """Walk rootdir and return the paths and name prefixes of all .xml files."""
    data_path = []
    prefixs = []
    for root, dirs, files in walk(rootdir, topdown=True):
        for name in files:
            pre, ending = path.splitext(name)
            if ending != ".xml":
                continue
            data_path.append(path.join(root, name))
            prefixs.append(pre)
    return data_path, prefixs


if __name__ == "__main__":
    # Get all the xml paths; the prefixes are not used here
    paths_xml, prefixs = get_fileNames("/home/yasin/old_labels/")
    target_classes = ["PEOPLE", "CAT"]  # target classes you want to keep
    # Make sure the target classes are lower-case
    target_classes_lower = [c.lower() for c in target_classes]
    for i, xml_path in enumerate(paths_xml):
        tree = read_xml(xml_path)
        # Get the root node
        tree_root = tree.getroot()
        # Get the <object> nodes directly under the root
        object_nodes = find_nodes(tree, "object")
        # Delete the unwanted classes and lower-case the kept ones
        del_node_by_target_classes(object_nodes, target_classes_lower, tree_root)
        # Save the output under a sequential name: 000000.xml, 000001.xml, ...
        # (the output directory must already exist)
        write_xml(tree, "/home/yasin/new_labels/{:06d}.xml".format(i))
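The script above saves each file under a sequential number, so the output names no longer match the original image names, and the collected prefixes go unused. If the original names should be kept instead, one possible variation is sketched below. The helper `output_path_for` and the sample file `sub/img_0007.xml` are hypothetical names introduced here, not part of the original script.

```python
from pathlib import Path


def output_path_for(xml_path, in_root, out_root):
    """Map an input XML path to the same relative path under out_root.

    Hypothetical helper: keeps the original file name (and any subfolder)
    instead of renumbering the files.
    """
    relative = Path(xml_path).relative_to(in_root)
    return Path(out_root) / relative


# Hypothetical input file, used only to show the mapping.
out = output_path_for(
    "/home/yasin/old_labels/sub/img_0007.xml",
    "/home/yasin/old_labels",
    "/home/yasin/new_labels",
)
print(out)

# Before writing, the target folder would need to exist, for example:
# out.parent.mkdir(parents=True, exist_ok=True)
```

Keeping the relative path also avoids collisions when two subfolders contain files with the same name, which a flat output folder would not.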
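Running the code above turns the example .xml into the .xml below. Every class other than people and cat (that is, the dog class) has been deleted, and the labels of the kept classes have been converted to lower case: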
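The effect described here can be checked on a small in-memory annotation without touching any files. The following is a minimal sketch, assuming a hypothetical annotation with three objects named People, CAT and Dog; `filter_objects` is an illustrative helper written for this check, not a function from the script.

```python
import xml.etree.ElementTree as ET


def filter_objects(root, keep):
    """Drop <object> elements whose name is not in keep; lower-case the rest."""
    keep = {k.lower() for k in keep}
    # findall() returns a list, so removing elements while looping is safe.
    for obj in root.findall("object"):
        name = obj.find("name")
        label = (name.text or "").lower() if name is not None else ""
        if label in keep:
            name.text = label
        else:
            root.remove(obj)
    return root


# Hypothetical in-memory annotation standing in for a file on disk.
root = ET.fromstring(
    "<annotation>"
    "<object><name>People</name></object>"
    "<object><name>CAT</name></object>"
    "<object><name>Dog</name></object>"
    "</annotation>"
)

filter_objects(root, ["PEOPLE", "CAT"])
remaining = [obj.findtext("name") for obj in root.findall("object")]
print(remaining)  # ['people', 'cat']
```

Unlike the script, this sketch also removes objects that have no name element at all, which is a design choice made here rather than something the original specifies.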
[Resulting annotation; surviving fields: filename test.jpg, path C:\Users\yasin\Desktop\test, size 400 x 300 x 3, segmented 0]