How do I select a tag with a given attribute whose value is unknown?


I'm guessing this is a trivial question for someone with a bit of experience with Nokogiri, but I haven't been able to find an answer in the documentation or tutorials I've found online.

I have a Nokogiri document like this:

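# Needs require 'nokogiri' and require 'open-uri'; on Ruby 3.0+ use URI.open instead of a bare open.
page = Nokogiri::HTML(URI.open("http://www.example.com"))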

And the page contains the following tag:

<a title="could be anything" href="http://www.example.com/foo"></a>

How do I get the value of href if the value of title is unknown?

3 solutions

#1


2  

If you want the value of the href attribute for <a> elements that have a title attribute, you can use Nokogiri's xpath as follows:

require 'nokogiri'

doc = Nokogiri::HTML(File.open('sample.html'))

# Collect the href of every <a> element that has a title attribute.
hrefs = doc.xpath('//a[@title]').map { |e| e['href'] }
puts hrefs

If you want to select from a page at an online URL, you can use:

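require 'nokogiri'
require 'open-uri'

# URI.open is required on Ruby 3.0+, where a bare open no longer fetches URLs.
doc = Nokogiri::HTML(URI.open('http://stackoverflow.com/'))

# Collect the href of every <a> element that has a title attribute.
hrefs = doc.xpath('//a[@title]').map { |e| e['href'] }
puts hrefs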

#2


1  

I finally figured it out. I believe the following will select the href from the first link element with a title attribute: page.css('a[title]')[0]['href'].

I had thought page.css('a[title]') was selecting the value of the title attribute, but in fact it selects the entire element. You can then reference this element to get values from it.

#3


0  

require 'nokogiri'


doc = Nokogiri::HTML::DocumentFragment.parse <<-SCRIPT
<a title="xx" href="http://www.example1.com/foo1"></a>
<a title="aa" href="http://www.example2.com/foo2"></a>
<a id=5 href="http://www.foo.com/foo3"></a>
<a title="zz" href="http://www.example3.com/foo4"></a>
<a id=5 href="http://www.test.com/foo5"></a>
 SCRIPT

# Keep the href of each <a> that has a title attribute; compact drops the nils from the others.
p doc.search("a").map { |nd| nd['href'] if nd.key?('title') }.compact

#=> ["http://www.example1.com/foo1", "http://www.example2.com/foo2", "http://www.example3.com/foo4"]
