Source: http://stackoverflow.com/questions/7237466/python-elementtree-support-for-parsing-unknown-xml-entities
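I have a set of very simple XML files to parse, but they use custom entities. I don't need to map these entities to characters, but I do want to parse them and act on each one. For example: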
<Style name="admin-5678">
    <Rule>
        <Filter>[admin_level]='5'</Filter>
        &maxscale_zoom11;
    </Rule>
</Style>
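There is a tantalizing hint at http://effbot.org/elementtree/elementtree-xmlparser.htm that XMLParser has limited entity support, but I can't find the methods it mentions, and everything I try gives an error: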
#!/usr/bin/python
##
## Where's the entity support as documented at:
## http://effbot.org/elementtree/elementtree-xmlparser.htm
## In Python 2.7.1+ ?
##
from pprint import pprint
from xml.etree import ElementTree
from cStringIO import StringIO

parser = ElementTree.ElementTree()
#parser.entity["maxscale_zoom11"] = unichr(160)
testf = StringIO('<foo>&maxscale_zoom11;</foo>')
tree = parser.parse(testf)
#tree = parser.parse(testf,"XMLParser")
for node in tree.iter('foo'):
    print node.text
Depending on which lines you comment or uncomment, this gives:
xml.etree.ElementTree.ParseError: undefined entity: line 1, column 5
or
AttributeError: 'ElementTree' object has no attribute 'entity'
or
AttributeError: 'str' object has no attribute 'feed'
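The first of these errors can be reproduced without the file object or the ElementTree instance. The following is a minimal sketch written for Python 3 rather than the Python 2.7 used in the question; `try_parse` is a hypothetical helper name introduced only for illustration.

```python
from xml.etree import ElementTree


def try_parse(xml_text):
    """Return the root element's text, or the ParseError message on failure."""
    try:
        return ElementTree.fromstring(xml_text).text
    except ElementTree.ParseError as err:
        # Undefined entities such as &maxscale_zoom11; end up here.
        return "ParseError: %s" % err


# The custom entity is rejected, while a predefined one (&amp;) parses fine.
print(try_parse("<foo>&maxscale_zoom11;</foo>"))
print(try_parse("<foo>&amp;</foo>"))
```

The second and third errors have a different cause: `ElementTree.ElementTree()` is a tree wrapper, not a parser, so it has no `entity` attribute, and the string "XMLParser" is not a parser object.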
For those curious, the XML comes from OpenStreetMap's mapnik project.
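I don't know whether this is a bug in ElementTree, but you need to call UseForeignDTD(True) on the underlying expat parser to make it behave the way it did in the past.
It's a bit hacky, but you can do this by creating your own ElementTree.XMLParser instance, calling the method on its xml.parsers.expat parser object, and then passing the XMLParser to ElementTree.parse():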
from xml.etree import ElementTree
from cStringIO import StringIO

testf = StringIO('<foo>&moo_1;</foo>')
parser = ElementTree.XMLParser()
parser.parser.UseForeignDTD(True)
parser.entity['moo_1'] = 'MOOOOO'
etree = ElementTree.ElementTree()
tree = etree.parse(testf, parser=parser)
for node in tree.iter('foo'):
    print node.text
This prints "MOOOOO".
Alternatively, use a mapping interface so that every entity name is handled:
from xml.etree import ElementTree
from cStringIO import StringIO

class AllEntities:
    def __getitem__(self, key):
        #key is your entity, you can do whatever you want with it here
        return key

testf = StringIO('<foo>&moo_1;</foo>')
parser = ElementTree.XMLParser()
parser.parser.UseForeignDTD(True)
parser.entity = AllEntities()
etree = ElementTree.ElementTree()
tree = etree.parse(testf, parser=parser)
for node in tree.iter('foo'):
    print node.text
This prints "moo_1".
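To see what UseForeignDTD(True) changes at the expat level, here is a minimal sketch that bypasses ElementTree and drives xml.parsers.expat directly. It is written for Python 3, unlike the Python 2.7 listings above, and it assumes an expat build with DTD support, which is the usual case. The helper name `collect_unknown_entities` and the sample input are illustrative assumptions, not part of the original answer.

```python
from xml.parsers import expat


def collect_unknown_entities(xml_text):
    """Return the names of undefined entity references, in document order."""
    seen = []

    def on_default(data):
        # Markup without a dedicated handler (tags, for instance) also lands
        # here, so keep only strings shaped like an entity reference: "&name;".
        if data.startswith("&") and data.endswith(";"):
            seen.append(data[1:-1])

    parser = expat.ParserCreate()
    # Tell expat that an external DTD may exist, so an unknown entity is
    # passed to the default handler instead of raising "undefined entity".
    parser.UseForeignDTD(True)
    # Route ordinary text and predefined entities such as &amp; to a no-op
    # handler so that they never reach on_default.
    parser.CharacterDataHandler = lambda data: None
    parser.DefaultHandlerExpand = on_default
    parser.Parse(xml_text, True)
    return seen


print(collect_unknown_entities("<foo>&moo_1; and &amp; and &moo_2;</foo>"))
```

Like the AllEntities mapping, this hands every undefined entity name to your own code. The difference is that it only records the names; it does not build an element tree.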
A more sophisticated fix would be to subclass ElementTree.XMLParser and fix it there.