In .NET, regex is not organizing captures as I would expect. (I won't call this a bug, because obviously someone intended it. However, it's not how I'd expect it to work nor do I find it helpful.)
This regex is for recipe ingredients (simplified for the sake of example):
(?<measurement> # begin group
\s* # optional beginning space or group separator
(
(?<integer>\d+)| # integer
(
(?<numtor>\d+) # numerator
/
(?<dentor>[1-9]\d*) # denominator. 0 not allowed
)
)
\s(?<unit>[a-zA-Z]+)
)+ # end group. can have multiple
My string: 3 tbsp 1/2 tsp
Resulting groups and captures:
[measurement][0]=3 tbsp
[measurement][1]= 1/2 tsp
[integer][0]=3
[numtor][0]=1
[dentor][0]=2
[unit][0]=tbsp
[unit][1]=tsp
Notice how, even though 1/2 tsp is in the 2nd Capture of measurement, its parts (numtor and dentor) are at index [0], since those slots were previously unused.
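To make the behavior concrete, here is a minimal sketch that reproduces the listing above and also prints each capture's position in the input. The class name `CaptureDemo` and the helper `Describe` are made up for this illustration; `Capture.Index` is the standard .NET property. The position is the only thing that ties a sub-capture back to the measurement it came from, because the per-group indexes do not line up.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CaptureDemo
{
    // The pattern from the question, unchanged.
    public const string Pattern = @"
(?<measurement>                     # begin group
    \s*                             # optional beginning space or group separator
    (
        (?<integer>\d+)|            # integer
        (
            (?<numtor>\d+)          # numerator
            /
            (?<dentor>[1-9]\d*)     # denominator. 0 not allowed
        )
    )
    \s(?<unit>[a-zA-Z]+)
)+                                  # end group. can have multiple
";

    // Hypothetical helper: one line per capture, "[group][i]=value @position".
    public static List<string> Describe(string input)
    {
        var lines = new List<string>();
        Match m = Regex.Match(input, Pattern, RegexOptions.IgnorePatternWhitespace);
        foreach (string name in new[] { "measurement", "integer", "numtor", "dentor", "unit" })
        {
            // Each named group keeps its own capture list, numbered from 0.
            CaptureCollection caps = m.Groups[name].Captures;
            for (int i = 0; i < caps.Count; i++)
            {
                lines.Add($"[{name}][{i}]={caps[i].Value} @{caps[i].Index}");
            }
        }
        return lines;
    }

    public static void Main()
    {
        foreach (string line in Describe("3 tbsp 1/2 tsp"))
        {
            Console.WriteLine(line);
        }
    }
}
```

Each group's `Captures` collection only grows when that group actually participates in an iteration, which is why numtor and dentor start at index 0 even though they belong to the second measurement.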
Is there any way to get all of the parts to have predictable useful indexes without having to re-run each group through the regex again?
1
Is there any way to get all of the parts to have predictable useful indexes without having to re-run each group through the regex again?
Not with Captures. And if you're going to perform multiple matches anyway, I suggest you remove the + and match each component of the measurement separately, like so:
string s = @"3 tbsp 1/2 tsp";
Regex r = new Regex(@"\G\s* # anchor to end of previous match
(?<measurement> # begin group
(
(?<integer>\d+) # integer
|
(
(?<numtor>\d+) # numerator
/
(?<dentor>[1-9]\d*) # denominator. 0 not allowed
)
)
\s+(?<unit>[a-zA-Z]+)
) # end group.
", RegexOptions.IgnorePatternWhitespace | RegexOptions.ExplicitCapture);
foreach (Match m in r.Matches(s))
{
for (int i = 1; i < m.Groups.Count; i++)
{
Group g = m.Groups[i];
if (g.Success)
{
Console.WriteLine("[{0}] = {1}", r.GroupNameFromNumber(i), g.Value);
}
}
Console.WriteLine("");
}
output:
[measurement] = 3 tbsp
[integer] = 3
[unit] = tbsp
[measurement] = 1/2 tsp
[numtor] = 1
[dentor] = 2
[unit] = tsp
The \G at the beginning ensures that matches occur only at the point where the previous match ended (or at the beginning of the input if this is the first match attempt). You can also save the match-end position between calls, then use the two-argument Matches method to resume parsing at that same point (as if that were really the beginning of the input).
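A rough sketch of that save-and-resume idea follows. It uses `Regex.Match(string, int)`, the single-match counterpart of the two-argument `Matches`, plus `Regex.Matches(string, int)` itself; both are standard instance methods. The names `ResumeDemo`, `TryNext` and `Remaining` are invented for this example, and the pattern is the same one as above, just compacted onto one line.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ResumeDemo
{
    // Same single-measurement pattern as above, written without comments.
    static readonly Regex R = new Regex(
        @"\G\s*(?<measurement>((?<integer>\d+)|((?<numtor>\d+)/(?<dentor>[1-9]\d*)))\s+(?<unit>[a-zA-Z]+))",
        RegexOptions.ExplicitCapture);

    // Hypothetical helper: match one measurement starting exactly at 'pos',
    // then move 'pos' to the end of that match so a later call can resume.
    public static bool TryNext(string s, ref int pos, out string measurement)
    {
        Match m = R.Match(s, pos);      // \G is satisfied at the start position 'pos'
        measurement = m.Success ? m.Groups["measurement"].Value : "";
        if (m.Success)
        {
            pos = m.Index + m.Length;   // saved match-end position
        }
        return m.Success;
    }

    // Hypothetical helper: collect everything from a saved position onward
    // with the two-argument Matches overload.
    public static List<string> Remaining(string s, int pos)
    {
        var result = new List<string>();
        foreach (Match m in R.Matches(s, pos))
        {
            result.Add(m.Groups["measurement"].Value);
        }
        return result;
    }

    public static void Main()
    {
        string s = "3 tbsp 1/2 tsp";
        int pos = 0;
        while (TryNext(s, ref pos, out string measurement))
        {
            Console.WriteLine("{0} (next start: {1})", measurement, pos);
        }
    }
}
```

Keeping the position outside the regex call means other parsing steps can run between measurements without losing your place in the input.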
1
Seems like you probably need to loop through the input, matching one measurement at a time. Then you would have predictable access to the parts of that measurement, during the loop iteration for that measurement.
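As an illustration only, a loop like that might collect each measurement into a small object, so the parts are reached by name rather than by capture index. The `Measurement` class and `MeasurementParser.Parse` are hypothetical names, not anything from .NET. The sketch also makes two assumptions of its own: it anchors with `\G` so parsing stops at the first thing that is not a measurement, and it uses `[0-9]` instead of `\d` so that `int.Parse` only ever sees ASCII digits.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MeasurementParser
{
    // Hypothetical container for one parsed measurement.
    public sealed class Measurement
    {
        public int? Integer;        // set for "3 tbsp"
        public int? Numerator;      // set for "1/2 tsp"
        public int? Denominator;    // set for "1/2 tsp"
        public string Unit = "";
    }

    // One measurement per match; no trailing + on the group.
    static readonly Regex One = new Regex(
        @"\G\s*((?<integer>[0-9]+)|(?<numtor>[0-9]+)/(?<dentor>[1-9][0-9]*))\s+(?<unit>[a-zA-Z]+)",
        RegexOptions.ExplicitCapture);

    // A group that did not participate in the match yields null.
    static int? ToInt(Group g)
    {
        return g.Success ? int.Parse(g.Value) : (int?)null;
    }

    public static List<Measurement> Parse(string input)
    {
        var result = new List<Measurement>();
        // Each loop iteration sees exactly one measurement and its parts.
        for (Match m = One.Match(input); m.Success; m = m.NextMatch())
        {
            result.Add(new Measurement
            {
                Integer = ToInt(m.Groups["integer"]),
                Numerator = ToInt(m.Groups["numtor"]),
                Denominator = ToInt(m.Groups["dentor"]),
                Unit = m.Groups["unit"].Value
            });
        }
        return result;
    }

    public static void Main()
    {
        foreach (Measurement x in Parse("3 tbsp 1/2 tsp"))
        {
            string amount = x.Integer.HasValue
                ? x.Integer.Value.ToString()
                : x.Numerator + "/" + x.Denominator;
            Console.WriteLine("{0} {1}", amount, x.Unit);
        }
    }
}
```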
-1
Having had a look at this, here are a couple of suggestions that might help improve the regexp:
(?<measurement> # begin group
\s* # optional beginning space or group separator
(
(?<integer>\d+)\.?| # integer
(
(?<numtor>\d+) # numerator
/
(?<dentor>[1-9]\d*) # denominator. 0 not allowed
)
)
\s(?<unit>[a-zA-Z]+)
)+ # end group. can have multiple
For (?<integer>\d+), I would try \s? instead of \. to capture the whitespace, because \. is an escaped full stop and so expects a full stop to appear somewhere. \/