StAX - get XML node as string


The XML looks like this:

<statements>
   <statement account="123">
      ...stuff...
   </statement>
   <statement account="456">
      ...stuff...
   </statement>
</statements>

I'm using StAX to process one <statement> at a time, and I have that working. Now I need to get the entire statement node as a string so I can create "123.xml" and "456.xml", or maybe even load it into a database table indexed by account.

I'm using this approach: http://www.devx.com/Java/Article/30298/1954

I'm looking to do something like this:

String statementXml = staxXmlReader.getNodeByName("statement");

//load statementXml into database

5 answers

#1


1  

Why not just use XPath for this?

You could use a fairly simple XPath expression to get all 'statement' nodes, like so:

//statement

EDIT #1: If possible, take a look at dom4j. You could read the string and get all 'statement' nodes fairly simply.

EDIT #2: Using dom4j, this is how you would do it (adapted from their cookbook):

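// Requires dom4j on the classpath (org.dom4j.Document, DocumentHelper, Node)
String text = "your xml here";
Document document = DocumentHelper.parseText(text);

public void bar(Document document) {
   List<?> list = document.selectNodes("//statement");
   for (Object o : list) {
      // asXML() returns the whole <statement> element as a string
      String statementXml = ((Node) o).asXML();
   }
}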
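
For readers who would rather not add dom4j, the same XPath idea can be sketched with the JDK's own DOM, XPath and Transformer APIs. This is only an illustration: the class name XPathStatements and the method statementsByAccount are made up for this example, and, like any tree-model approach, it loads the whole document into memory.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of any library.
class XPathStatements {

    // Returns a map from account number to the serialized <statement> element.
    static Map<String, String> statementsByAccount(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // Select every <statement> element in the document.
        NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate("//statement", doc, XPathConstants.NODESET);

        // Identity transformer, used only to serialize a node back to text.
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        Map<String, String> result = new LinkedHashMap<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Element statement = (Element) nodes.item(i);
            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(statement), new StreamResult(out));
            result.put(statement.getAttribute("account"), out.toString());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<statements>"
                + "<statement account=\"123\"><x>a</x></statement>"
                + "<statement account=\"456\"><x>b</x></statement>"
                + "</statements>";
        statementsByAccount(xml).forEach(
                (account, text) -> System.out.println(account + ".xml: " + text));
    }
}
```

Each map value could then be written to "<account>.xml" or stored in a database row keyed by account.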

#2


6  

I had a similar task and, although the original question is more than a year old, I couldn't find a satisfying answer. The most interesting answer so far was Blaise Doughan's, but I couldn't get it running on the XML I am expecting (maybe some parameters for the underlying parser could change that?). Here is the XML, very simplified:

<many-many-tags>
    <description>
        ...
        <p>Lorem ipsum...</p>
        Devils inside...
        ...
    </description>
</many-many-tags>

My solution:

public static String readElementBody(XMLEventReader eventReader)
    throws XMLStreamException {
    StringWriter buf = new StringWriter(1024);

    int depth = 0;
    while (eventReader.hasNext()) {
        // peek event
        XMLEvent xmlEvent = eventReader.peek();

        if (xmlEvent.isStartElement()) {
            ++depth;
        }
        else if (xmlEvent.isEndElement()) {
            --depth;

            // reached END_ELEMENT tag?
            // break loop, leave event in stream
            if (depth < 0)
                break;
        }

        // consume event
        xmlEvent = eventReader.nextEvent();

        // print out event
        xmlEvent.writeAsEncodedUnicode(buf);
    }

    return buf.getBuffer().toString();
}

Usage example:

XMLEventReader eventReader = ...;
while (eventReader.hasNext()) {
    XMLEvent xmlEvent = eventReader.nextEvent();
    if (xmlEvent.isStartElement()) {
        StartElement elem = xmlEvent.asStartElement();
        String name = elem.getName().getLocalPart();

        // Element names are case-sensitive: use the name exactly as it appears in your XML.
        if ("DESCRIPTION".equals(name)) {
            String xmlFragment = readElementBody(eventReader);
            // do something with it...
            System.out.println("'" + xmlFragment + "'");
        }
    }
    else if (xmlEvent.isEndElement()) {
        // ...
    }
}

Note that the extracted XML fragment will contain the complete extracted body content, including whitespace and comments. Filtering those on demand, or making the buffer size configurable, has been left out for brevity:

'
    <description>
        ...
        <p>Lorem ipsum...</p>
        Devils inside...
        ...
    </description>
    '
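
Applied to the question's XML, the same event-copying idea can be sketched so that the result includes the <statement> start and end tags as well as the body. This is an illustration under assumptions: the class WholeElementReader and its methods are hypothetical names, and the exact text produced by writeAsEncodedUnicode (for example the quote style around attribute values) may vary between StAX implementations.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.Attribute;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper class, not part of any library.
class WholeElementReader {

    // Copies the already-consumed start tag, then every event up to and
    // including the matching end tag, into a string.
    static String readElement(XMLEventReader reader, StartElement start)
            throws XMLStreamException {
        StringWriter buf = new StringWriter();
        start.writeAsEncodedUnicode(buf);
        int depth = 1; // we are inside the element that was just opened
        while (depth > 0 && reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement()) {
                depth++;
            } else if (event.isEndElement()) {
                depth--;
            }
            event.writeAsEncodedUnicode(buf);
        }
        return buf.toString();
    }

    // Returns a map from account number to the text of each <statement>.
    static Map<String, String> split(String xml) throws XMLStreamException {
        XMLEventReader reader = XMLInputFactory.newInstance()
                .createXMLEventReader(new StringReader(xml));
        Map<String, String> result = new LinkedHashMap<>();
        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement()
                    && "statement".equals(event.asStartElement().getName().getLocalPart())) {
                StartElement start = event.asStartElement();
                Attribute account = start.getAttributeByName(new QName("account"));
                String key = (account == null) ? "" : account.getValue();
                result.put(key, readElement(reader, start));
            }
        }
        return result;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<statements>"
                + "<statement account=\"123\"><x>a</x></statement>"
                + "<statement account=\"456\"><x>b</x></statement>"
                + "</statements>";
        split(xml).forEach((account, text) -> System.out.println(account + ": " + text));
    }
}
```

Only one statement is held in memory at a time, so this keeps the streaming character of StAX.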

#3


5  

You can use StAX for this. You just need to advance the XMLStreamReader to the start element of each statement. Check the account attribute to get the file name. Then use the javax.xml.transform APIs to transform a StAXSource into a StreamResult wrapping a File. The transform advances the XMLStreamReader past the statement, so you can simply repeat the process for the next one.

import java.io.File;
import java.io.FileReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stax.StAXSource;
import javax.xml.transform.stream.StreamResult;

public class Demo {

    public static void main(String[] args) throws Exception  {
        XMLInputFactory xif = XMLInputFactory.newInstance();
        XMLStreamReader xsr = xif.createXMLStreamReader(new FileReader("input.xml"));
        xsr.nextTag(); // Advance to statements element

        while(xsr.nextTag() == XMLStreamConstants.START_ELEMENT) {
            TransformerFactory tf = TransformerFactory.newInstance();
            Transformer t = tf.newTransformer();
            File file = new File("out" + xsr.getAttributeValue(null, "account") + ".xml");
            t.transform(new StAXSource(xsr), new StreamResult(file));
        }
    }

}
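
If the goal is a String rather than a file, the same transform can target a StringWriter instead. The following is a sketch, not a definitive implementation: the class StaxToString and the method split are hypothetical names. The loop checks the reader's current event instead of calling nextTag() unconditionally, because where the transformer leaves the reader after each transform may depend on the implementation.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stax.StAXSource;
import javax.xml.transform.stream.StreamResult;

// Hypothetical helper class, not part of any library.
class StaxToString {

    // Returns a map from account number to the serialized <statement> element.
    static Map<String, String> split(String xml) throws Exception {
        XMLStreamReader xsr = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));

        // One identity transformer, reused for every statement.
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        Map<String, String> result = new LinkedHashMap<>();
        while (xsr.hasNext()) {
            if (xsr.getEventType() == XMLStreamConstants.START_ELEMENT
                    && "statement".equals(xsr.getLocalName())) {
                // Read the attribute before the transform moves the reader on.
                String account = xsr.getAttributeValue(null, "account");
                StringWriter out = new StringWriter();
                t.transform(new StAXSource(xsr), new StreamResult(out));
                result.put(account, out.toString());
                // No explicit next() here: the transform has consumed the element.
            } else {
                xsr.next();
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<statements>\n"
                + "  <statement account=\"123\"><x>a</x></statement>\n"
                + "  <statement account=\"456\"><x>b</x></statement>\n"
                + "</statements>";
        split(xml).forEach((account, text) -> System.out.println(account + ": " + text));
    }
}
```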

#4


2  

StAX is a low-level access API, and it has neither lookups nor methods that access content recursively. But what are you actually trying to do? And why are you considering StAX?

Beyond using a tree model (DOM, XOM, JDOM, dom4j), which would work well with XPath, the best choice when dealing with data is usually a data-binding library like JAXB. With it you can pass in a StAX or SAX reader and ask it to bind the XML data into Java beans, so that instead of messing with XML you process Java objects. This is often more convenient, and it usually performs quite well. The only trick with larger files is that you do not want to bind the whole thing at once, but rather bind each sub-tree (in your case, one 'statement' at a time). This is most easily done by iterating with a StAX XMLStreamReader, then using JAXB to bind.

#5


1  

I've been googling and this seems painfully difficult.

Given my XML, I think it might just be simpler to do this:

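// Sketch; assumes each </statement> closing tag sits on a line of its own.
// Needs java.io.BufferedReader and java.io.IOException.
// Note: the first chunk also contains the opening <statements> line.
private static final String STMT_END_TAG = "</statement>";

private void split(BufferedReader reader) throws IOException {
   StringBuilder buffer = new StringBuilder();
   String line;
   while ((line = reader.readLine()) != null) {
      buffer.append(line).append('\n');
      if (line.trim().equals(STMT_END_TAG)) {
         parse(buffer.toString());
         buffer.setLength(0); // start collecting the next statement
      }
   }
}

private void parse(String statement) {
   // saxParser.parse(new InputSource(new StringReader(statement)), handler); // handler is a placeholder
   // do stuff
   // save string
}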
