HTML to XSLT conversion using C#


I am trying to convert an HTML page to an XSLT page using C#. For example, if I have something like

<a href="#compantnameURL#">#companyname#</a>

I have to convert it into

<a href="{test/companynameURL}"><xsl:value-of select="test/companyname" /></a>

I have an XSL file which has all these values. I don't want to substitute the values at this stage, because they need further processing before they replace the originals. The problem I am facing is that I have trouble identifying whether a placeholder sits at the attribute level of a tag or at the value (text) level of a tag, which determines the XSLT construct it should be replaced with.

I am trying to use regular expressions for this. Can someone help?

2 answers

#1


1  

Html Agility Pack is the way to go. Don't forget to add a reference to it. This code illustrates one way of using Html Agility Pack to create an XSLT stylesheet, which is what I think you want to do.

    using System;
    using System.Text;
    using System.Xml;
    using HtmlAgilityPack;

    // Parse the HTML that contains the #placeholder# tokens.
    HtmlDocument doc = new HtmlDocument();
    doc.LoadHtml("<html>" +
        "<a href='#compantnameURL1#'>#companyname1#</a>" +
        "<a href='#compantnameURL2#'>#companyname2#</a>" +
        "</html>");

    XmlWriterSettings settings = new XmlWriterSettings();
    settings.Indent = true;
    settings.IndentChars = ("    ");
    settings.Encoding = Encoding.UTF8;

    using (XmlWriter writer = XmlWriter.Create(Console.Out, settings))
    {
        writer.WriteStartDocument();
        writer.WriteStartElement("xsl", "stylesheet", "http://www.w3.org/1999/XSL/Transform");
        writer.WriteAttributeString("version", "1.0"); // required on xsl:stylesheet
        writer.WriteStartElement("template", "http://www.w3.org/1999/XSL/Transform");
        writer.WriteAttributeString("match", "/");
        writer.WriteElementString("apply-templates", "http://www.w3.org/1999/XSL/Transform", "");
        writer.WriteEndElement();
        writer.WriteStartElement("template", "http://www.w3.org/1999/XSL/Transform");
        // "test/" is not a valid match pattern; match the <test> element itself.
        writer.WriteAttributeString("match", "test");
        foreach (HtmlNode link in doc.DocumentNode.SelectNodes("//a"))
        {
            HtmlAttribute att = link.Attributes["href"];
            writer.WriteStartElement("a");
                // Attribute level: emit <xsl:attribute name="href"> around a value-of.
                writer.WriteStartElement("attribute", "http://www.w3.org/1999/XSL/Transform");
                    writer.WriteAttributeString("name", "href");
                    writer.WriteStartElement("value-of", "http://www.w3.org/1999/XSL/Transform");
                        // Strip the # delimiters so the select is a path relative to <test>.
                        writer.WriteAttributeString("select", att.Value.Trim('#'));
                    writer.WriteEndElement();
                writer.WriteEndElement();
                // Text level: emit a plain <xsl:value-of>.
                writer.WriteStartElement("value-of", "http://www.w3.org/1999/XSL/Transform");
                    writer.WriteAttributeString("select", link.InnerText.Trim('#'));
                writer.WriteEndElement();
            writer.WriteEndElement();
        }
        writer.WriteEndElement();
        writer.WriteEndDocument();
    }
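To check that a generated stylesheet does what you expect, you could run it against a small sample document. The following is only a sketch: the `StylesheetCheck` class and its `Apply` method are names I made up for illustration, and it assumes the stylesheet and the input are both available as strings. It uses `XslCompiledTransform` from `System.Xml.Xsl`.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

// Hypothetical helper, not part of Html Agility Pack or the original code.
public static class StylesheetCheck
{
    // Loads the stylesheet text, applies it to inputXml, and returns the result as a string.
    public static string Apply(string stylesheet, string inputXml)
    {
        var transform = new XslCompiledTransform();
        using (XmlReader xsl = XmlReader.Create(new StringReader(stylesheet)))
        {
            transform.Load(xsl);
        }

        var output = new StringWriter();
        using (XmlReader input = XmlReader.Create(new StringReader(inputXml)))
        // OutputSettings carries whatever <xsl:output> the stylesheet declares.
        using (XmlWriter writer = XmlWriter.Create(output, transform.OutputSettings))
        {
            transform.Transform(input, writer);
        }
        return output.ToString();
    }
}
```

Calling `StylesheetCheck.Apply(stylesheetText, "")` with a sample `<test>` element that holds the expected child elements lets you inspect the produced `<a>` tags before wiring the stylesheet into anything else.
<test>
string xslt =
    "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">" +
    "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>" +
    "<xsl:template match=\"/\">" +
    "<a href=\"{test/companynameURL}\"><xsl:value-of select=\"test/companyname\"/></a>" +
    "</xsl:template></xsl:stylesheet>";
string xml = "<test><companynameURL>http://example.com/</companynameURL><companyname>Example</companyname></test>";
string result = StylesheetCheck.Apply(xslt, xml);
if (!result.Contains("<a href=\"http://example.com/\">Example</a>")) throw new System.Exception("unexpected output: " + result);
</test>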

#2


0  

I'm not aware of a component that will get you all the way to XSLT, but the Html Agility Pack is wonderful for any sort of HTML manipulation. The parser provides a complete object tree with attributes, tags, styles, etc. clearly defined, and it's easily queryable with XPath or XSLT.
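Once you have a tree, telling attribute-level placeholders from text-level ones no longer needs any guessing: attributes and text nodes are separate objects. Here is a rough sketch of that idea. To keep it self-contained it uses `System.Xml.Linq` and assumes the markup is already well-formed XML; for real-world HTML you would load it with Html Agility Pack first and walk its nodes in the same way. The `PlaceholderConverter` class, the `ToStylesheet` method and the `root` parameter are names I made up, and literal `{` or `}` characters in attribute values are not escaped here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
using System.Xml.Linq;

// Hypothetical helper for illustration only.
public static class PlaceholderConverter
{
    static readonly XNamespace Xsl = "http://www.w3.org/1999/XSL/Transform";
    static readonly Regex Token = new Regex(@"#(\w+)#");

    // Wraps the markup in a stylesheet whose "/" template contains the rewritten markup.
    public static XElement ToStylesheet(string markup, string root)
    {
        XElement body = XElement.Parse(markup, LoadOptions.PreserveWhitespace);

        // Attribute level: #name# becomes an attribute value template, {root/name}.
        foreach (XAttribute attr in body.DescendantsAndSelf().Attributes().ToList())
        {
            if (Token.IsMatch(attr.Value))
                attr.Value = Token.Replace(attr.Value, m => "{" + root + "/" + m.Groups[1].Value + "}");
        }

        // Text level: #name# becomes <xsl:value-of select="root/name"/>.
        foreach (XText text in body.DescendantNodes().OfType<XText>().ToList())
        {
            MatchCollection matches = Token.Matches(text.Value);
            if (matches.Count == 0) continue;

            var parts = new List<object>();
            int pos = 0;
            foreach (Match m in matches)
            {
                // Keep any literal text that precedes the placeholder.
                if (m.Index > pos)
                    parts.Add(new XText(text.Value.Substring(pos, m.Index - pos)));
                parts.Add(new XElement(Xsl + "value-of",
                    new XAttribute("select", root + "/" + m.Groups[1].Value)));
                pos = m.Index + m.Length;
            }
            if (pos < text.Value.Length)
                parts.Add(new XText(text.Value.Substring(pos)));
            text.ReplaceWith(parts.ToArray());
        }

        return new XElement(Xsl + "stylesheet",
            new XAttribute(XNamespace.Xmlns + "xsl", Xsl),
            new XAttribute("version", "1.0"),
            new XElement(Xsl + "template", new XAttribute("match", "/"), body));
    }
}
```

Note that the regex is only applied to individual attribute values and text nodes, never to the raw markup, so it does not have to understand tags at all.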

Also, for a good discussion of parsing HTML with regex, see the first answer on this post.

