Generating XML with xml.dom.minidom: preserving attribute order and putting the XML declaration on its own line

Category: Programming Languages · XML · Published: 5 years ago


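1. Problem description

The attributes are not written in the order they were set, and the XML declaration does not sit on a line of its own.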

# cat HKEX-EPS_20180830_003249795.xml

<?xml version="1.0" encoding="UTF-8"?><TCML><News Encoding="UTF-8" Language="en-us" TimeStamp="20180830194015">246843820180830194015HKEX-EPSAMENDEDen-usCHANGE OF COMPANY NAME,STOCK SHORT NAME AND COMPANY LOGO20180831MAINfalseHKEX-EPS_20180830_003249795_0.PDFAPPLICATION/PDF521386127001979010000185401400? 地科技股份 满地科技股份 MOODY TECH HLDGET Net IIS Category ListET Net Ltd?2018 ET Net Limited. All rights reserved.
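The behaviour described above can be reproduced with a few lines. This is a minimal sketch written for Python 3, not the original script: the element and attribute names are taken from the sample file, everything else is an assumption. On Python 2.7 the attributes come out sorted alphabetically; on Python 3.8 and later minidom keeps the order in which they were set, but the declaration still shares a line with the root element.

```python
import xml.dom.minidom

# Build a tiny document shaped like the sample file.
doc = xml.dom.minidom.Document()
root = doc.createElement("TCML")
doc.appendChild(root)

news = doc.createElement("News")
# Attributes are set in the order we would like to see in the output.
news.setAttribute("TimeStamp", "20180830194015")
news.setAttribute("Encoding", "UTF-8")
news.setAttribute("Language", "en-us")
root.appendChild(news)

# toxml() uses an empty newline string, so nothing separates the
# declaration from the root element.
xml_bytes = doc.toxml(encoding="UTF-8")
print(xml_bytes.decode("UTF-8"))
```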

Desired result:

# cat HKEX-EPS_20180830_003249795.xml

<?xml version="1.0" encoding="UTF-8"?>
<TCML><News TimeStamp="20180830194015" Encoding="UTF-8" Language="en-us">246843820180830194015HKEX-EPSAMENDEDen-usCHANGE OF COMPANY NAME,STOCK SHORT NAME AND COMPANY LOGO20180831MAINfalseHKEX-EPS_20180830_003249795_0.PDFAPPLICATION/PDF521386127001979010000185401400? 地科技股份 满地科技股份 MOODY TECH HLDGET Net IIS Category ListET Net Ltd?2018 ET Net Limited. All rights reserved.

2. Steps

2.1 Environment

The Python 2.6.6 bundled with the system was upgraded to Python 2.7.10.

If Python has not been upgraded to 2.7, check sys.path:

>>> import sys

>>> sys.path

In that case the module directory is /usr/lib64/python2.6/xml/dom

The module used is:

import xml.dom.minidom
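Since the edits that follow are made directly in the standard library, it helps to confirm which copy of minidom the interpreter actually loads. A small sketch (the variable names are illustrative):

```python
import os
import xml.dom.minidom

# Path of the module file the running interpreter imported.
minidom_path = xml.dom.minidom.__file__
# Directory that contains it, i.e. the one to cd into before editing.
minidom_dir = os.path.dirname(minidom_path)

print(minidom_path)
print(minidom_dir)
```

If the printed directory is not the one you expect (for example the 2.6 path instead of the 2.7 one), the script is running under a different interpreter than the one being patched.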

2.2 Putting the XML declaration on its own line

# cd /usr/local/lib/python2.7/xml/dom/

Original code (Document.writexml in minidom.py):

    def writexml(self, writer, indent="", addindent="", newl="",
                 encoding = None):
        if encoding is None:
            writer.write('<?xml version="1.0" ?>'+newl)
        else:
            writer.write('<?xml version="1.0" encoding="%s"?>%s' % (encoding, newl))
        for node in self.childNodes:
            node.writexml(writer, indent, addindent, newl)

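Modified code (the declaration is now always followed by '\n', whatever newl is):

    def writexml(self, writer, indent="", addindent="", newl="",
                 encoding = None):
        if encoding is None:
            writer.write('<?xml version="1.0" ?>'+'\n')
        else:
            writer.write('<?xml version="1.0" encoding="%s"?>%s' % (encoding, '\n'))
        for node in self.childNodes:
            node.writexml(writer, indent, addindent, newl)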
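If editing the standard library is not an option, a similar result can be had from application code. The sketch below is an alternative to the patch, not the method used in this article: it writes the declaration itself and serializes only the root element. The function name serialize_with_declaration_line is hypothetical, and the code assumes Python 3.

```python
import xml.dom.minidom


def serialize_with_declaration_line(doc, encoding="UTF-8"):
    """Return the document as bytes with the declaration on its own line.

    Only the root element is serialized, so comments or processing
    instructions that sit outside the root element are dropped.
    """
    declaration = '<?xml version="1.0" encoding="%s"?>\n' % encoding
    body = doc.documentElement.toxml()  # no declaration, no added newlines
    return (declaration + body).encode(encoding)


sample = xml.dom.minidom.parseString('<TCML><News Language="en-us"/></TCML>')
output = serialize_with_declaration_line(sample)
print(output.decode("UTF-8"))
```

The trade-off is that this leaves minidom untouched for every other program on the machine, at the cost of handling the declaration by hand.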

2.3 Preserving attribute order

Original code (Element.__init__ and Element.writexml in minidom.py):

    def __init__(self, tagName, namespaceURI=EMPTY_NAMESPACE, prefix=None,
                 localName=None):
        self.tagName = self.nodeName = tagName
        self.prefix = prefix
        self.namespaceURI = namespaceURI
        self.childNodes = NodeList()
        self._attrs = {}   # attributes are double-indexed:
        self._attrsNS = {} #    tagName -> Attribute
                           #    URI,localName -> Attribute
                           # in the future: consider lazy generation
                           # of attribute objects this is too tricky
                           # for now because of headaches with
                           # namespaces.

    ......

    def writexml(self, writer, indent="", addindent="", newl=""):
        # indent = current indentation
        # addindent = indentation to add to higher levels
        # newl = newline string
        writer.write(indent+"<" + self.tagName)
        attrs = self._get_attributes()
        a_names = attrs.keys()
        a_names.sort()

Modified code (_attrs becomes an OrderedDict, which is available from Python 2.7, and the sort in writexml is commented out):

    # Required near the top of minidom.py:
    from collections import OrderedDict

    def __init__(self, tagName, namespaceURI=EMPTY_NAMESPACE, prefix=None,
                 localName=None):
        self.tagName = self.nodeName = tagName
        self.prefix = prefix
        self.namespaceURI = namespaceURI
        self.childNodes = NodeList()
        #self._attrs = {}   # attributes are double-indexed:
        self._attrs = OrderedDict()   # attributes are double-indexed:
        self._attrsNS = {} #    tagName -> Attribute
                           #    URI,localName -> Attribute
                           # in the future: consider lazy generation
                           # of attribute objects this is too tricky
                           # for now because of headaches with
                           # namespaces.

    ......

    def writexml(self, writer, indent="", addindent="", newl=""):
        # indent = current indentation
        # addindent = indentation to add to higher levels
        # newl = newline string
        writer.write(indent+"<" + self.tagName)
        attrs = self._get_attributes()
        a_names = attrs.keys()
        #a_names.sort()
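Whether this patch is needed depends on the interpreter: from Python 3.8 onward, minidom writes attributes in the order they were set, so an unpatched library already behaves this way. The following sketch (for Python 3, with attribute values borrowed from the sample file) checks what the running interpreter does:

```python
import sys
import xml.dom.minidom

doc = xml.dom.minidom.Document()
news = doc.createElement("News")
# Set the attributes in the order we want them serialized.
for name, value in [("TimeStamp", "20180830194015"),
                    ("Encoding", "UTF-8"),
                    ("Language", "en-us")]:
    news.setAttribute(name, value)

xml_text = news.toxml()
# True when the serialized order matches the order of the setAttribute calls.
order_preserved = (xml_text.index("TimeStamp")
                   < xml_text.index("Encoding")
                   < xml_text.index("Language"))

print(sys.version_info[:2], xml_text, order_preserved)
```

On interpreters older than 3.8 the check reports False, because the names are sorted before being written.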

3. Summary

I have tested these changes myself and they work.

