Regex to match a specific pattern multiple times within a sentence


I have the following problem with a LaTeX text file that consists of multiple sentences, e.g.:

Aaa \cref{fig:1}. Bbb \cref{fig:2} bbb \cref{fig:3}. Ccc \cref{fig:4}. Ddd \cref{fig:5} ddd \cref{fig:6} ddd \cref{fig:7}.

What I need to find out is how to isolate the \cref{fig:xxx} parts in each sentence. The catch is that the regex should only consider sentences in which \cref{fig:xxx} occurs more than once (>1).

A good result would be if the regex could return fig:2 and fig:3 from sentence bbb, as well as fig:5, fig:6, and fig:7 from sentence ddd.

I have to use regular expressions for the search in TextMate (a text editor).

2 solutions

#1


In addition to my comment, you could come up with a recursive approach. However, judging by the documentation, recursion does not seem to be supported in TextMate. In that case, you can simply repeat the pattern one more time (which satisfies your requirement of sentences with more than one occurrence):

(?:\\cref\{(fig:\d+)\})(?:[^.]+?(?:\\cref\{(fig:\d+)\}))+

Broken down, this looks for \\cref\{...\} and captures the inner fig: plus digits, then lazily matches one or more characters that are not a dot ([^.]+?) and repeats the first subpattern. As already mentioned in the comments, you will likely need to play around with the sentence conditions (e.g. what counts as a sentence; this is the [^.] part). See a demo of the approach on regex101.com.
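One caveat that is easy to miss: in many engines a capture group inside a repeated subpattern keeps only its last match, so the second group may not hand back every label. The sketch below is an illustration in Python's `re` module, not in TextMate, whose engine may behave differently. It uses the pattern above to find the sentence spans and then a second, simpler pattern to collect all labels in each span. The names `MULTI_CREF`, `SINGLE_CREF` and `labels_per_span` are made up for this example.

```python
import re

# Pattern from the answer: one \cref{fig:N}, then one or more repetitions of
# "non-dot characters followed by another \cref{fig:N}".
MULTI_CREF = re.compile(r"(?:\\cref\{(fig:\d+)\})(?:[^.]+?(?:\\cref\{(fig:\d+)\}))+")
# Simpler helper pattern (an assumption of this sketch) to list every label.
SINGLE_CREF = re.compile(r"\\cref\{(fig:\d+)\}")

def labels_per_span(text):
    # For each span containing at least two \cref calls, return all labels.
    return [SINGLE_CREF.findall(m.group(0)) for m in MULTI_CREF.finditer(text)]

text = (r"Aaa \cref{fig:1}. Bbb \cref{fig:2} bbb \cref{fig:3}. "
        r"Ccc \cref{fig:4}. Ddd \cref{fig:5} ddd \cref{fig:6} ddd \cref{fig:7}.")

all_labels = labels_per_span(text)
# Group 2 alone holds only the last repetition of each match in Python.
last_only = [m.group(2) for m in MULTI_CREF.finditer(text)]

print(all_labels)  # [['fig:2', 'fig:3'], ['fig:5', 'fig:6', 'fig:7']]
print(last_only)   # ['fig:3', 'fig:7']
```

The two-step approach trades a single expression for two simple ones, which sidesteps the repeated-group limitation without needing recursion.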

#2


What you need is a positive lookahead, e.g.:

\S*(?=\s*\\cref{)

Note: I'm not sure how escapes and symbols are entered in your text editor, so to be clear: the double backslash (\\) stands for a literal \ character, \s matches a whitespace character, and \S matches a non-whitespace character. To also return the fig label, you will need to introduce capture groups. This guide might help you: http://www.rexegg.com/regex-lookarounds.html#compound
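To make the lookahead idea concrete, here is a rough sketch in Python's `re` module (not TextMate). It uses a different pattern from the one above, written for this illustration: a capture group for the label, plus a lookahead asserting that another \cref{ follows before the next dot. The name `FOLLOWED_BY_CREF` is hypothetical, and treating a dot as the sentence boundary is an assumption carried over from the question's sample.

```python
import re

# Capture a label only when another \cref{ appears later in the same
# sentence (that is, before the next dot). The lookahead consumes nothing.
FOLLOWED_BY_CREF = re.compile(r"\\cref\{(fig:\d+)\}(?=[^.]*\\cref\{)")

text = (r"Aaa \cref{fig:1}. Bbb \cref{fig:2} bbb \cref{fig:3}. "
        r"Ccc \cref{fig:4}. Ddd \cref{fig:5} ddd \cref{fig:6} ddd \cref{fig:7}.")

followed = FOLLOWED_BY_CREF.findall(text)
print(followed)  # ['fig:2', 'fig:5', 'fig:6']
```

A lookahead on its own skips the last reference in each sentence (fig:3 and fig:7 here), because nothing follows it. Catching those as well needs an extra condition that looks backwards or a second pass over the matched sentence.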

