Catch URL IDs from multiple matches in regexp


I'm writing a simple URL parser. With a regexp like the following:

preg_match_all('/^test\/(\w+)\/?$/', $url, $matches);

I can match URLs like:

test/5

and by inspecting the $matches array I can get the ID, which is 5. That's fine.

With a regexp like the following:

preg_match_all('/^test\/((\w+)\/?)+\/(\w+)\/?$/', $url, $matches);

I can match URLs like:

test/1/5
test/1/2/5
test/1/2/3/5

... and so on. The problem is that when I inspect the $matches array, I can't get all the matched IDs of the variable-length part (which is ((\w+)\/?)+). Instead of 1, 2, 3 I get 3, 3, 3: the last ID repeated N times.

What am I missing?

1 Answer

I would do this job in two steps.

First, you can check the URL format:

^test(?:\/\d+)+$
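
A minimal sketch of this first step in PHP, assuming the pattern above is used unchanged. The helper name isValidTestUrl is made up for illustration and is not part of the original answer.

```php
<?php
// isValidTestUrl() is a hypothetical helper name used only for this sketch.
function isValidTestUrl(string $url): bool
{
    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match('/^test(?:\/\d+)+$/', $url) === 1;
}

var_dump(isValidTestUrl('test/1/2/5')); // bool(true)
var_dump(isValidTestUrl('test/'));      // bool(false)
var_dump(isValidTestUrl('test/abc'));   // bool(false)
```

Note that this pattern accepts only numeric IDs and no trailing slash, which is stricter than the \w+ and \/? used in the question.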

Then, if the test succeeds, you can extract the IDs with this regex:

(?:\G|^test)\/\K\d+

The output array will only contain the IDs.
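
As a rough illustration, the extraction could be run with preg_match_all as below. A repeated capture group keeps only its last iteration, which is why the pattern in the question cannot return every ID; here each ID is a separate match instead. The sample URL and the variable names are assumptions, and splitting off the final ID with array_pop is only one possible way to mirror the question's layout.

```php
<?php
$url = 'test/1/2/3/5'; // sample input taken from the question

// Each ID becomes its own full match, so $matches[0] holds all of them.
preg_match_all('/(?:\G|^test)\/\K\d+/', $url, $matches);
print_r($matches[0]); // "1", "2", "3", "5"

// Optional: separate the final ID from the variable-length part.
$ids = $matches[0];
$lastId = array_pop($ids); // $lastId is "5", $ids is now "1", "2", "3"
```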

Explanation

  • (?:\G|^test) matches either at the position where the previous match ended (\G) or the literal test at the beginning of the string
  • \/ matches a /
  • \K resets the starting point of the current match, so the / is excluded from the result
  • \d+ matches 1 or more digits
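
The two steps can be combined as in the sketch below, which also shows why the validation step matters: on the first attempt \G succeeds at offset 0, so the extraction pattern on its own does not enforce the test prefix. The function name extractTestIds and the choice to return null for invalid input are assumptions made for this example.

```php
<?php
// extractTestIds() is a hypothetical helper, not part of the original answer.
function extractTestIds(string $url): ?array
{
    // Step 1: reject anything that is not "test" followed by /digits segments.
    if (preg_match('/^test(?:\/\d+)+$/', $url) !== 1) {
        return null;
    }
    // Step 2: each ID is its own match; \K drops the leading slash.
    preg_match_all('/(?:\G|^test)\/\K\d+/', $url, $matches);
    return $matches[0];
}

// Without step 1, \G also succeeds at offset 0, so "test" is not required.
preg_match_all('/(?:\G|^test)\/\K\d+/', '/7/8', $unchecked);

var_dump(extractTestIds('test/1/2/3/5')); // array of "1", "2", "3", "5"
var_dump(extractTestIds('/7/8'));         // NULL
var_dump($unchecked[0]);                  // array of "7", "8"
```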

