I'm guessing this is a trivial question for someone with a bit of experience with Nokogiri, but I haven't been able to find an answer in the documentation or tutorials I've found online.
I have a Nokogiri document like this:
page = Nokogiri::HTML(URI.open("http://www.example.com")) # needs require 'open-uri'; a bare open no longer fetches URLs on Ruby 3.0+
And the page contains the following tag:
<a title="could be anything" href="http://www.example.com/foo"></a>
How do I get the value of href if the value of title is unknown?
2
If you want the value of the href attribute for every a element that has a title attribute, you can use Nokogiri's xpath method as follows:
require 'nokogiri'
doc = Nokogiri::HTML(File.open('sample.html'))
# Collect the href of every <a> that has a title attribute, then print them
hrefs = doc.xpath('//a[@title]').map { |e| e['href'] }
puts hrefs
If you want to select from a page fetched from a URL instead, you can use open-uri:
require 'nokogiri'
require 'open-uri'
# URI.open fetches the page; a bare open no longer accepts URLs on Ruby 3.0+
doc = Nokogiri::HTML(URI.open('http://stackoverflow.com/'))
hrefs = doc.xpath('//a[@title]').map { |e| e['href'] }
puts hrefs
1
I finally figured it out. I believe the following will select the href from the first link element with a title attribute: page.css('a[title]')[0]['href'].
I had thought page.css('a[title]') was selecting the value of the title attribute, but in fact it selects the entire element. You can then reference this element to get values from it.
0
require 'nokogiri'
doc = Nokogiri::HTML::DocumentFragment.parse <<-SCRIPT
<a title="xx" href="http://www.example1.com/foo1"></a>
<a title="aa" href="http://www.example2.com/foo2"></a>
<a id=5 href="http://www.foo.com/foo3"></a>
<a title="zz" href="http://www.example3.com/foo4"></a>
<a id=5 href="http://www.test.com/foo5"></a>
SCRIPT
p doc.search("a").map { |nd| nd['href'] if nd.key?('title')}.compact
#=> ["http://www.example1.com/foo1", "http://www.example2.com/foo2", "http://www.example3.com/foo4"]