I have this XML input file:
<?xml version="1.0"?>
<zero>
  <First>
    <second>
      <third-num>1</third-num>
      <third-def>object001</third-def>
      <third-len>458</third-len>
    </second>
    <second>
      <third-num>2</third-num>
      <third-def>object002</third-def>
      <third-len>426</third-len>
    </second>
    <second>
      <third-num>3</third-num>
      <third-def>object003</third-def>
      <third-len>998</third-len>
    </second>
  </First>
</zero>
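My goal is to remove every <second> element whose <third-def> does not hold a given value (here, object001). To do that, I wrote the following code: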
try:
    import xml.etree.cElementTree as ET
except ImportError:
    import xml.etree.ElementTree as ET

inputfile = 'inputfile.xml'
tree = ET.parse(inputfile)
root = tree.getroot()

elem = tree.find('First')
for elem2 in tree.iter(tag='second'):
    if elem2.find('third-def').text == 'object001':
        pass
    else:
        elem.remove(elem2)
        #elem2.clear()
My problem is with elem.remove(elem2): it skips every other <second> element. This is the output of the code:
<?xml version="1.0" ?>
<zero>
  <First>
    <second>
      <third-num>1</third-num>
      <third-def>object001</third-def>
      <third-len>458</third-len>
    </second>
    <second>
      <third-num>3</third-num>
      <third-def>object003</third-def>
      <third-len>998</third-len>
    </second>
  </First>
</zero>
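Now, if I uncomment the elem2.clear() line, the script visits every element, but the output is not what I want, because the cleared <second> elements are kept as empty tags: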
<?xml version="1.0" ?>
<zero>
  <First>
    <second>
      <third-num>1</third-num>
      <third-def>object001</third-def>
      <third-len>458</third-len>
    </second>
    <second/>
    <second/>
  </First>
</zero>
Does anyone know why my element.remove() statement goes wrong?
You are iterating over a live tree:
for elem2 in tree.iter(tag='second'):
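and then changing that tree while you iterate. The iteration's "counter" is not told that the number of elements has changed. So when the iterator is looking at element number 1 and you remove it, the iterator moves on to element number 2, but what used to be element number 2 is now element number 1 and is never visited.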
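The skipping can be reproduced in a few lines. This is only a minimal sketch: it assumes a trimmed-down copy of the question's XML embedded as a string (the constant `XML` and the `visited` list are illustrative names, not part of the original script) and records which elements the loop actually sees.

```python
import xml.etree.ElementTree as ET

# Trimmed-down version of the question's input (assumed here for brevity).
XML = """<zero><First>
  <second><third-def>object001</third-def></second>
  <second><third-def>object002</third-def></second>
  <second><third-def>object003</third-def></second>
</First></zero>"""

root = ET.fromstring(XML)
first = root.find('First')

visited = []
for second in root.iter('second'):
    name = second.find('third-def').text
    visited.append(name)
    if name != 'object001':
        # Removing a child shifts the later siblings down by one position
        # while the iterator keeps its old position.
        first.remove(second)

remaining = [s.find('third-def').text for s in first.findall('second')]
print(visited)    # ['object001', 'object002']
print(remaining)  # ['object001', 'object003']
```

The loop never reaches object003, which is why it survives in the question's output.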
Capture a list of all the elements first, then loop over that:
for elem2 in tree.findall('.//second'):
.findall() returns a list of results, and that list is not updated when you alter the tree.
Now the iteration no longer skips the last element:
>>> print ET.tostring(tree)
<zero>
  <First>
    <second>
      <third-num>1</third-num>
      <third-def>object001</third-def>
      <third-len>458</third-len>
    </second>
  </First>
</zero>
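Put together, the fix might look like the sketch below. It assumes Python 3 and an in-memory copy of the XML rather than inputfile.xml; the helper name `prune_seconds` and the constant `XML` are hypothetical names introduced only for this illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed-down version of the question's input (assumed here for brevity).
XML = """<zero><First>
  <second><third-def>object001</third-def></second>
  <second><third-def>object002</third-def></second>
  <second><third-def>object003</third-def></second>
</First></zero>"""


def prune_seconds(root, keep):
    """Remove every <second> under <First> whose <third-def> text is not `keep`."""
    first = root.find('First')
    # findall() returns a plain list, so removing children from `first`
    # inside the loop does not disturb the iteration.
    for second in first.findall('second'):
        if second.find('third-def').text != keep:
            first.remove(second)
    return root


root = prune_seconds(ET.fromstring(XML), 'object001')
remaining = [s.find('third-def').text for s in root.iter('second')]
print(remaining)  # ['object001']
```

Wrapping the live iterator in `list(...)`, as in `list(root.iter('second'))`, would take the same kind of snapshot.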
This phenomenon is not limited to ElementTree trees.
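As a small illustration of the same effect outside ElementTree, here is a sketch using an ordinary Python list. The variable names are made up for this example.

```python
# Removing from a list while iterating over it skips the next item.
broken = ['object001', 'object002', 'object003']
for item in broken:
    if item != 'object001':
        broken.remove(item)
print(broken)  # ['object001', 'object003']

# Iterating over a copy leaves the loop unaffected by the removals.
safe = ['object001', 'object002', 'object003']
for item in list(safe):
    if item != 'object001':
        safe.remove(item)
print(safe)  # ['object001']
```

For plain lists, building a new list with a comprehension is often simpler than removing in place.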