Scala - modifying nested elements in XML


I'm learning Scala, and I'm looking to update a nested node in some XML. I've got something working, but I'm wondering if it's the most elegant way.

I have some XML:

val InputXml : Node =
<root>
    <subnode>
        <version>1</version>
    </subnode>
    <contents>
        <version>1</version>
    </contents>
</root>

I want to update the version node in subnode, but not the one in contents.

Here is my function:

def updateVersion( node : Node ) : Node = 
 {
   def updateElements( seq : Seq[Node]) : Seq[Node] = 
   {
        var subElements = for( subNode <- seq ) yield
        {
            updateVersion( subNode )
        }   
        subElements
   }

   node match
   {
     case <root>{ ch @ _* }</root> =>
     {
        <root>{ updateElements( ch ) }</root>
     }
     case <subnode>{ ch @ _* }</subnode> =>
     {
         <subnode>{ updateElements( ch ) }</subnode> 
     }
     case <version>{ contents }</version> =>
     {
        <version>2</version>
     }
     case other @ _ => 
     {
         other
     }
   }
 }

Is there a more succinct way of writing this function?

7 answers

#1


11  

I think the original logic is good. This is the same code with (dare I say?) a more Scala-ish flavor:

def updateVersion( node : Node ) : Node = {
   def updateElements( seq : Seq[Node]) : Seq[Node] = 
     for( subNode <- seq ) yield updateVersion( subNode )  

   node match {
     case <root>{ ch @ _* }</root> => <root>{ updateElements( ch ) }</root>
     case <subnode>{ ch @ _* }</subnode> => <subnode>{ updateElements( ch ) }</subnode>
     case <version>{ contents }</version> => <version>2</version>
     case other @ _ => other
   }
 }

It looks more compact (but is actually the same :) )

  1. I got rid of all the unnecessary braces
  2. If a brace is needed, it starts on the same line
  3. updateElements just defines a var and returns it, so I got rid of that and returned the result directly

If you want, you can get rid of updateElements too. You want to apply updateVersion to all the elements of the sequence, and that is exactly what the map method does. With that, you can replace the line

case <subnode>{ ch @ _* }</subnode> => <subnode>{ updateElements( ch ) }</subnode>

with

case <subnode>{ ch @ _* }</subnode> => <subnode>{ ch.map(updateVersion (_)) }</subnode>

Since updateVersion takes only one parameter, I'm 99% sure you can omit the placeholder and write:

case <subnode>{ ch @ _* }</subnode> => <subnode>{ ch.map(updateVersion) }</subnode>
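
The shorthand works because Scala turns a method name into a function value when a function is expected (eta-expansion). A minimal sketch with a plain list, where double and the sample values are made up for illustration and stand in for updateVersion and the child nodes:

```scala
object EtaExpansionDemo {
  // Hypothetical one-parameter method standing in for updateVersion.
  def double(n: Int): Int = n * 2

  // Explicit placeholder syntax, as in ch.map(updateVersion (_)).
  val explicit: List[Int] = List(1, 2, 3).map(double(_))

  // Method name only, as in ch.map(updateVersion).
  val shorthand: List[Int] = List(1, 2, 3).map(double)

  def main(args: Array[String]): Unit =
    println(explicit == shorthand)
}
```

Both forms build the same function, so the choice is purely a matter of style.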

And end up with:

def updateVersion( node : Node ) : Node = node match {
  case <root>{ ch @ _* }</root> => <root>{ ch.map(updateVersion) }</root>
  case <subnode>{ ch @ _* }</subnode> => <subnode>{ ch.map(updateVersion) }</subnode>
  case <version>{ contents }</version> => <version>2</version>
  case other @ _ => other
}

What do you think?

#2


54  

All this time, and no one actually gave the most appropriate answer! Now that I have learned of it, though, here's my new take on it:

import scala.xml._
import scala.xml.transform._

// Inner rule: replace the content of every <version> element with "2"
object t1 extends RewriteRule {
  override def transform(n: Node): Seq[Node] = n match {
    case Elem(prefix, "version", attribs, scope, _*)  =>
      // the Boolean is minimizeEmpty; the five-argument Elem.apply is deprecated since Scala 2.10
      Elem(prefix, "version", attribs, scope, false, Text("2"))
    case other => other
  }
}

object rt1 extends RuleTransformer(t1)

// Outer rule: hand only <subnode> elements (and everything below them) to rt1
object t2 extends RewriteRule {
  override def transform(n: Node): Seq[Node] = n match {
    case sn @ Elem(_, "subnode", _, _, _*) => rt1(sn)
    case other => other
  }
}

object rt2 extends RuleTransformer(t2)

rt2(InputXml)

Now, for a few explanations. The class RewriteRule is abstract. It defines two methods, both called transform: one takes a single Node, the other a Seq[Node]. Because it is abstract, we can't instantiate it directly. By adding a definition, in this case overriding one of the transform methods, we create a concrete subclass of it (here, a singleton object). Each RewriteRule need only concern itself with a single task, though it can do many.

Next, the class RuleTransformer takes a variable number of RewriteRule instances as parameters. Its transform method takes a Node and returns a Seq[Node], applying each and every RewriteRule used to instantiate it.

Both classes derive from BasicTransformer, which defines a few methods that one need not be concerned with at a higher level. Its apply method calls transform, though, so both RuleTransformer and RewriteRule can use the syntactic sugar associated with it. In the example, the former does and the latter does not.

Here we use two levels of RuleTransformer: the first applies a filter to higher-level nodes, and the second applies the change to whatever passes the filter.

The extractor Elem is also used, so there is no need to be concerned with details such as the namespace or whether there are attributes. Note that the content of the version element is completely discarded and replaced with 2. It can be matched against too, if needed.

Note also that the last parameter of the extractor is _*, and not _. That means these elements can have multiple children. If you forget the *, the match may fail. In the example, the match would not fail if there were no whitespace. Because whitespace is translated into Text nodes, a single whitespace under subnode would cause the match to fail.
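
A small sketch of the difference, assuming scala-xml is on the classpath; the object and method names are invented for illustration:

```scala
import scala.xml._

object WhitespaceMatchDemo {
  // One child element and no surrounding whitespace.
  val compact: Node = <subnode><version>1</version></subnode>

  // Same element, but the line breaks and indentation become Text children.
  val indented: Node =
    <subnode>
      <version>1</version>
    </subnode>

  // Pattern ending in `_`: requires exactly one child node.
  def matchesSingleChild(n: Node): Boolean = n match {
    case Elem(_, "subnode", _, _, _) => true
    case _ => false
  }

  // Pattern ending in `_*`: accepts any number of child nodes.
  def matchesAnyChildren(n: Node): Boolean = n match {
    case Elem(_, "subnode", _, _, _*) => true
    case _ => false
  }
}
```

With these definitions, matchesSingleChild accepts compact but rejects indented, while matchesAnyChildren accepts both.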

This code is bigger than the other suggestions presented, but it has the advantage of having much less knowledge of the structure of the XML than the others. It changes any element called version that is below -- no matter how many levels -- an element called subnode, no matter namespaces, attributes, etc.

Furthermore... well, if you have many transformations to do, recursive pattern matching quickly becomes unwieldy. Using RewriteRule and RuleTransformer, you can effectively replace XSLT files with Scala code.
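
As a rough sketch of how several independent rules can be combined in one RuleTransformer: the two rules and the sample document below are hypothetical and not part of the question. One rule renames contents to body, and the other drops obsolete elements.

```scala
import scala.xml._
import scala.xml.transform._

object MultiRuleDemo {
  // Hypothetical rule: rename <contents> elements to <body>.
  val renameContents = new RewriteRule {
    override def transform(n: Node): Seq[Node] = n match {
      case e: Elem if e.label == "contents" => e.copy(label = "body")
      case other => other
    }
  }

  // Hypothetical rule: remove <obsolete> elements entirely.
  val dropObsolete = new RewriteRule {
    override def transform(n: Node): Seq[Node] = n match {
      case e: Elem if e.label == "obsolete" => NodeSeq.Empty
      case other => other
    }
  }

  // One transformer applies both rules to every node of the tree.
  val transformer = new RuleTransformer(renameContents, dropObsolete)

  val result: Node =
    transformer(<root><contents><obsolete/></contents><keep/></root>)
}
```

Each rule stays small and only knows about the element it cares about, which is what makes the approach scale to many transformations.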

#3


12  

You can use Lift's CSS Selector Transforms and write:

"subnode" #> ("version *" #> 2)

See http://stable.simply.liftweb.net/#sec:CSS-Selector-Transforms

#4


5  

I have since learned more and presented what I deem to be a superior solution in another answer. I have also fixed this one, as I noticed I was failing to account for the subnode restriction.

Thanks for the question! I just learned some cool stuff when dealing with XML. Here is what you want:

def updateVersion(node: Node): Node = {
  // mayChange is true only while we are somewhere below a <subnode> element
  def updateNodes(ns: Seq[Node], mayChange: Boolean): Seq[Node] =
    for(subnode <- ns) yield subnode match {
      case <version>{ _ }</version> if mayChange => <version>2</version>
      // the Boolean is minimizeEmpty; children.isEmpty mirrors the deprecated five-argument Elem.apply
      case Elem(prefix, "subnode", attribs, scope, children @ _*) =>
        Elem(prefix, "subnode", attribs, scope, children.isEmpty, updateNodes(children, true) : _*)
      case Elem(prefix, label, attribs, scope, children @ _*) =>
        Elem(prefix, label, attribs, scope, children.isEmpty, updateNodes(children, mayChange) : _*)
      case other => other  // preserve text
    }

  updateNodes(node.theSeq, false)(0)
}

Now, the explanation. The first and last case statements should be obvious. The last one exists to catch those parts of the XML that are not elements or, in other words, text. Note, though, that the first statement tests the mayChange flag, which indicates whether version may be changed.

The second and third case statements will use a pattern matcher against the object Elem. This will break an element into all its component parts. The last parameter, "children @ _*", will match children to a list of anything. Or, more specifically, a Seq[Node]. Then we reconstruct the element, with the parts we extracted, but pass the Seq[Node] to updateNodes, doing the recursion step. If we are matching against the element subnode, then we change the flag mayChange to true, enabling the change of the version.

In the last line, we use node.theSeq to generate a Seq[Node] from Node, and (0) to get the first element of the Seq[Node] returned as result. Since updateNodes is essentially a map function (for ... yield is translated into map), we know the result will only have one element. We pass a false flag to ensure that no version will be changed unless a subnode element is an ancestor.
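
To make that last step concrete, here is a small sketch (the object and value names are invented for illustration): theSeq wraps a single Node in a one-element sequence, and a for ... yield over it keeps that length, which is why indexing with (0) is safe.

```scala
import scala.xml._

object TheSeqDemo {
  val node: Node = <root><version>1</version></root>

  // A single Node viewed as a sequence that contains only itself.
  val asSeq = node.theSeq

  // for ... yield desugars to map, which preserves the number of elements.
  val mapped = for (n <- asSeq) yield n
}
```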

There is a slightly different way of doing it, that's more powerful but a bit more verbose and obscure:

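def updateVersion(node: Node): Node = {
  // mayChange is true only while we are somewhere below a <subnode> element
  def updateNodes(ns: Seq[Node], mayChange: Boolean): Seq[Node] =
    for(subnode <- ns) yield subnode match {
      // matches <version> with any prefix, attributes or scope, as long as its only child is text
      case Elem(prefix, "version", attribs, scope, Text(_)) if mayChange => 
        Elem(prefix, "version", attribs, scope, false, Text("2"))
      // the Boolean is minimizeEmpty; children.isEmpty mirrors the deprecated five-argument Elem.apply
      case Elem(prefix, "subnode", attribs, scope, children @ _*) =>
        Elem(prefix, "subnode", attribs, scope, children.isEmpty, updateNodes(children, true) : _*)
      case Elem(prefix, label, attribs, scope, children @ _*) =>
        Elem(prefix, label, attribs, scope, children.isEmpty, updateNodes(children, mayChange) : _*)
      case other => other  // preserve text
    }

  updateNodes(node.theSeq, false)(0)
}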

This version allows you to change any "version" tag, whatever its prefix, attributes and scope.

#5


3  

Scales Xml provides tools for "in place" edits. Of course it's all immutable, but here's the solution in Scales:

val subnodes = top(xml).\*("subnode"l).\*("version"l)
val folded = foldPositions( subnodes )( p => 
  Replace( p.tree ~> "2"))

The XPath-like syntax is a signature feature of Scales; the l after the string specifies that it should have no namespace (local name only).

foldPositions iterates over the resulting elements and transforms them, joining the results back together.

#6


1  

One approach would be lenses (e.g. scalaz's). See http://arosien.github.io/scalaz-base-talk-201208/#slide35 for a very clear presentation.
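
To give a flavour of the idea, here is a minimal hand-rolled lens. It is not scalaz's API, and it works on hypothetical case classes (Root, Section) that mirror the XML from the question rather than on scala.xml nodes.

```scala
object LensSketch {
  // Hand-rolled lens: a getter plus a copying setter for one field.
  final case class Lens[A, B](get: A => B, set: (A, B) => A) {
    // Compose with a lens that looks further inside B.
    def andThen[C](inner: Lens[B, C]): Lens[A, C] =
      Lens[A, C](a => inner.get(get(a)), (a, c) => set(a, inner.set(get(a), c)))
  }

  // Hypothetical model mirroring <root>, <subnode> and <contents>.
  final case class Section(version: Int)
  final case class Root(subnode: Section, contents: Section)

  val subnodeL: Lens[Root, Section] =
    Lens[Root, Section](_.subnode, (r, s) => r.copy(subnode = s))
  val versionL: Lens[Section, Int] =
    Lens[Section, Int](_.version, (s, v) => s.copy(version = v))

  // Focus on the version inside subnode only.
  val subnodeVersion: Lens[Root, Int] = subnodeL.andThen(versionL)

  val updated: Root = subnodeVersion.set(Root(Section(1), Section(1)), 2)
}
```

The composed lens touches only the version under subnode and leaves contents alone, which is the same restriction the question asks for.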

#7


-2  

I really don't know how this could be done elegantly. FWIW, I would go for a different approach: use a custom model class for the info you're handling, and have conversion to and from XML for it. You'll probably find it's a better way to handle the data, and it's even more succinct.
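
A rough sketch of what that could look like, assuming scala-xml is available. The Document class, its field names and the fromXml/toXml helpers are all hypothetical, and fromXml assumes both version elements exist and hold integers.

```scala
import scala.xml._

object ModelSketch {
  // Hypothetical model for the document in the question.
  final case class Document(subnodeVersion: Int, contentsVersion: Int) {
    // Model -> XML
    def toXml: Elem =
      <root>
        <subnode><version>{ subnodeVersion }</version></subnode>
        <contents><version>{ contentsVersion }</version></contents>
      </root>
  }

  object Document {
    // XML -> model; throws if an element is missing or not a number.
    def fromXml(root: Node): Document =
      Document(
        (root \ "subnode" \ "version").text.trim.toInt,
        (root \ "contents" \ "version").text.trim.toInt)
  }

  val input: Node =
    <root>
      <subnode><version>1</version></subnode>
      <contents><version>1</version></contents>
    </root>

  // The update itself becomes a plain case-class copy.
  val updated: Document = Document.fromXml(input).copy(subnodeVersion = 2)
}
```

Calling updated.toXml then produces the new document, with only the version under subnode changed.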

However, if there is a nice way to do it with XML directly, I'd like to see it.
