Generating XML with xml.dom.minidom: fixing unordered attributes and putting the XML declaration on its own line

Category: Programming Languages · XML · Published: 6 years ago


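1. Problem description

The generated file has two problems: attributes are not written in the order they were set, and the XML declaration is not on a line of its own.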

# cat HKEX-EPS_20180830_003249795.xml

<?xml version="1.0" encoding="UTF-8"?> TCML><News Encoding="UTF-8" Language="en-us" TimeStamp="20180830194015">246843820180830194015HKEX-EPSAMENDEDen-usCHANGE OF COMPANY NAME,STOCK SHORT NAME AND COMPANY LOGO20180831MAINfalseHKEX-EPS_20180830_003249795_0.PDFAPPLICATION/PDF521386127001979010000185401400? 地科技股份 满地科技股份 MOODY TECH HLDGET Net IIS Category ListET Net Ltd?2018 ET Net Limited. All rights reserved.

Desired result:

# cat HKEX-EPS_20180830_003249795.xml

<?xml version="1.0" encoding="UTF-8"?>
TCML><News TimeStamp="20180830194015" Encoding="UTF-8" Language="en-us">246843820180830194015HKEX-EPSAMENDEDen-usCHANGE OF COMPANY NAME,STOCK SHORT NAME AND COMPANY LOGO20180831MAINfalseHKEX-EPS_20180830_003249795_0.PDFAPPLICATION/PDF521386127001979010000185401400? 地科技股份 满地科技股份 MOODY TECH HLDGET Net IIS Category ListET Net Ltd?2018 ET Net Limited. All rights reserved.
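The declaration half of the problem can be reproduced without the original feed file. The sketch below builds a small document with a `News` root whose attribute values are copied from the sample above and serializes it with `toxml`. The helper name `build_sample` is made up for this illustration, and attribute order is left out of the check because it depends on the Python version.

```python
import xml.dom.minidom


def build_sample():
    """Build a tiny document resembling the root element of the sample file."""
    doc = xml.dom.minidom.Document()
    news = doc.createElement("News")
    # Attributes are set in the order the desired output shows them.
    news.setAttribute("TimeStamp", "20180830194015")
    news.setAttribute("Encoding", "UTF-8")
    news.setAttribute("Language", "en-us")
    doc.appendChild(news)
    return doc


# toxml() passes an empty newline string down to writexml(), so the
# declaration and the root element end up on the same line.
output = build_sample().toxml(encoding="UTF-8")
print(output)
```

With an encoding given, `toxml` returns encoded bytes on Python 3, which is why the result is printed rather than compared with a text string.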

2. Procedure

2.1 Environment

The system ships with Python 2.6.6, which was upgraded to Python 2.7.10.

If Python has not been upgraded to 2.7:

>>> import sys

>>> sys.path

the module path is /usr/lib64/python2.6/xml/dom

The module used is:

import xml.dom.minidom
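Before editing anything, it helps to confirm which copy of minidom.py the interpreter actually loads, since a machine with both 2.6 and 2.7 installed has two. A minimal sketch; `minidom_source_path` is a hypothetical helper name, not part of the library:

```python
import os
import xml.dom.minidom


def minidom_source_path():
    """Return the path of the minidom source file used by this interpreter."""
    path = xml.dom.minidom.__file__
    # Python 2 may report the compiled .pyc/.pyo file; point at the .py next to it.
    if path.endswith((".pyc", ".pyo")):
        path = path[:-1]
    return os.path.abspath(path)


print(minidom_source_path())
```

Running this with each installed interpreter shows which directory the later `cd` step should target.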

2.2 Newline handling

# cd /usr/local/lib/python2.7/xml/dom/

Original code (Document.writexml in minidom.py):

    def writexml(self, writer, indent="", addindent="", newl="",
                 encoding = None):
        if encoding is None:
            writer.write('<?xml version="1.0" ?>'+newl)
        else:
            writer.write('<?xml version="1.0" encoding="%s"?>%s' % (encoding, newl))
        for node in self.childNodes:
            node.writexml(writer, indent, addindent, newl)

Modified code (the declaration is always followed by '\n', regardless of newl):

    def writexml(self, writer, indent="", addindent="", newl="",
                 encoding = None):
        if encoding is None:
            writer.write('<?xml version="1.0" ?>'+'\n')
        else:
            writer.write('<?xml version="1.0" encoding="%s"?>%s' % (encoding, '\n'))
        for node in self.childNodes:
            node.writexml(writer, indent, addindent, newl)
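Editing the installed standard library affects every program on the machine and is lost on the next upgrade. As an alternative, the same output can be produced from application code by writing the declaration line by hand and then serializing each top-level node. This is a sketch of that alternative, not the patch described above; `to_xml_with_declaration_line` is a hypothetical helper, and it returns text rather than encoded bytes.

```python
import xml.dom.minidom


def to_xml_with_declaration_line(doc, encoding="UTF-8"):
    """Serialize doc with the XML declaration on its own line."""
    declaration = '<?xml version="1.0" encoding="%s"?>' % encoding
    # Serialize the top-level nodes (doctype, root element, ...) one by one,
    # so the document's own declaration writer is never invoked.
    body = "".join(node.toxml() for node in doc.childNodes)
    return declaration + "\n" + body


doc = xml.dom.minidom.parseString("<News><Title>demo</Title></News>")
text = to_xml_with_declaration_line(doc)
print(text)
```

When writing the result to disk, encode the returned text with the same encoding named in the declaration.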

2.3 Preserving attribute order

Original code (Element.__init__ and Element.writexml in minidom.py):

    def __init__(self, tagName, namespaceURI=EMPTY_NAMESPACE, prefix=None,
                 localName=None):
        self.tagName = self.nodeName = tagName
        self.prefix = prefix
        self.namespaceURI = namespaceURI
        self.childNodes = NodeList()
        self._attrs = {}   # attributes are double-indexed:
        self._attrsNS = {} #    tagName -> Attribute
                           #    URI,localName -> Attribute
                           # in the future: consider lazy generation
                           # of attribute objects this is too tricky
                           # for now because of headaches with
                           # namespaces.
    ......
    def writexml(self, writer, indent="", addindent="", newl=""):
        # indent = current indentation
        # addindent = indentation to add to higher levels
        # newl = newline string
        writer.write(indent+"<" + self.tagName)
        attrs = self._get_attributes()
        a_names = attrs.keys()
        a_names.sort()

Modified code (store attributes in an OrderedDict and stop sorting the names on output):

    def __init__(self, tagName, namespaceURI=EMPTY_NAMESPACE, prefix=None,
                 localName=None):
        self.tagName = self.nodeName = tagName
        self.prefix = prefix
        self.namespaceURI = namespaceURI
        self.childNodes = NodeList()
        #self._attrs = {}   # attributes are double-indexed:
        # OrderedDict needs "from collections import OrderedDict" at the top of minidom.py
        self._attrs = OrderedDict()   # attributes are double-indexed:
        self._attrsNS = {} #    tagName -> Attribute
                           #    URI,localName -> Attribute
                           # in the future: consider lazy generation
                           # of attribute objects this is too tricky
                           # for now because of headaches with
                           # namespaces.
    ......
    def writexml(self, writer, indent="", addindent="", newl=""):
        # indent = current indentation
        # addindent = indentation to add to higher levels
        # newl = newline string
        writer.write(indent+"<" + self.tagName)
        attrs = self._get_attributes()
        a_names = attrs.keys()
        # sorting disabled so attributes keep their insertion order
        #a_names.sort()
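On newer interpreters this patch may be unnecessary: the minidom documentation notes that from Python 3.8 on, `writexml()` preserves the attribute order specified by the user. The small check below assumes Python 3.8 or later; the element name and attribute values are taken from the sample in section 1.

```python
import xml.dom.minidom

doc = xml.dom.minidom.Document()
news = doc.createElement("News")
# Insertion order deliberately differs from alphabetical order.
news.setAttribute("TimeStamp", "20180830194015")
news.setAttribute("Encoding", "UTF-8")
news.setAttribute("Language", "en-us")
doc.appendChild(news)

serialized = news.toxml()
print(serialized)

# Position of each attribute name in the serialized start tag.
positions = [serialized.index(name)
             for name in ("TimeStamp", "Encoding", "Language")]
print(positions == sorted(positions))
```

If the last line prints `False`, the interpreter still sorts attribute names and a change like the one above (or an upgrade) is needed.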

3. Summary

Tested and confirmed to work.

