How can I remove duplicated consecutive characters and keep the first one using regex? [duplicate]


This question already has an answer here:

I found a code snippet on the web that uses a regex in Python to remove duplicated consecutive characters while keeping the first one:

import re
re.sub(r'(?s)(.)(?=.*\1)','','aabbcc')  #'abc'

But it has a defect: the lookahead (?=.*\1) matches when the same character appears anywhere later in the string, not just immediately after. So for 'aabbccaabb' it also drops the first 'aa' and 'bb' and returns 'cab'.

re.sub(r'(?s)(.)(?=.*\1)','','aabbccaabb')  #'cab'

Is there a way to solve this with a regex?

2 answers

#1


2  

Just remove the .* from the positive lookahead, so that a character is removed only when the very next character is the same.

import re

# print() with parentheses works in both Python 2 and Python 3
print(re.sub(r'(?s)(.)(?=\1)','','aabbcc'))
print(re.sub(r'(?s)(.)(?=\1)','','aabbccaabb'))

Output:

abc
abcab
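
As a complement to the lookahead version above, here is a minimal sketch of a backreference variant that matches a whole run at once and writes a single copy back. The helper name squeeze_runs is hypothetical and not part of the original answer; only re.sub from the standard library is used.

```python
import re

def squeeze_runs(text):
    # Hypothetical helper name, chosen for illustration only.
    # (.) captures one character, \1+ matches one or more repeats of it,
    # and the replacement r'\1' writes the captured character back once.
    # (?s) lets '.' match newlines too, as in the original pattern.
    return re.sub(r'(?s)(.)\1+', r'\1', text)

print(squeeze_runs('aabbcc'))
print(squeeze_runs('aabbccaabb'))
```

For these inputs both approaches give the same result. The difference is mostly one of readability: this pattern states "a character followed by repeats of itself" directly instead of deleting every character that is followed by its twin.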

#2


4  

Without regex: keep a character only if it is the first one or differs from the previous character, using a list comprehension with a condition, then join the results:

s='aabbccaabb'
print("".join([c for i,c in enumerate(s) if i==0 or s[i-1]!=c]))
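
The same idea can also be sketched with itertools.groupby from the standard library, which groups equal consecutive items. This is an alternative, not the answerer's method, and the function name squeeze_runs_groupby is made up for this example.

```python
from itertools import groupby

def squeeze_runs_groupby(text):
    # Hypothetical helper name, for illustration only.
    # groupby yields one (key, group) pair per run of equal consecutive
    # characters; keeping just the key keeps one character per run.
    return "".join(key for key, _group in groupby(text))

print(squeeze_runs_groupby('aabbccaabb'))
```

This avoids the index arithmetic of s[i-1] and works on any iterable of characters, at the cost of an extra import.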
