Lookbehind in date regex (python)
54 views
2 replies
319 author reputation
I am having some trouble figuring out the look-behind in Python. More specifically, I have this piece of text which has dates in (mm/dd/yyyy) and (mm-dd-yyyy) formats and just the years in (yyyy) format:
Jan-01-2001
Jan 01 2001
2003 2007
The year was 2009 when x decided to work for Google
What is the best way to extract only the standalone years (yyyy)? I should be able to extract 2003, 2007 and 2009, but not the years inside full dates such as Jan-01-2001 and Jan 01 2001. I tried the lookbehind operator, and the best I could come up with was ((?<!(-| ))\d{4}), but this selects only 2003 and not 2007 or 2009. I also tried using groups to define a date pattern and using them in conjunction with lookbehind, but that did not work. What would be the right and efficient way of doing this with regular expressions in Python?
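To make the symptom concrete, here is a minimal sketch that runs the attempted pattern against the sample text. The variable names (`text`, `attempt`, `matches`) are only illustrative. It uses `finditer` with `group(0)` because the pattern contains capturing groups, which would make `findall` return tuples instead of plain strings.

```python
import re

# Sample text from the question.
text = """Jan-01-2001
Jan 01 2001
2003 2007
The year was 2009 when x decided to work for Google"""

# The attempted pattern: four digits not directly preceded by '-' or ' '.
attempt = re.compile(r'((?<!(-| ))\d{4})')

# Collect the full match text for every hit.
matches = [m.group(0) for m in attempt.finditer(text)]
print(matches)  # ['2003']
```

The lookbehind rejects any year that follows a space, so 2007 and 2009 are dropped along with the years inside the dates. 2003 survives only because it sits at the start of a line.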
2 answers
1 like
14596 author reputation
Brief
This works for the sample strings you've presented, as long as a standalone year is never preceded by two digits followed by a space or hyphen. It assumes that every date writes the day of the month with exactly two digits, because lookbehinds in Python's re module (as in the majority of regex engines) must be fixed-width, so the number of digits inside the lookbehind cannot be given as a range.
Code
\b(?<!\b\d{2}[ -])\d{4}\b
Results
Input
Jan-01-2001
Jan 01 2001
2003 2007
The year was 2009 when x decided to work for Google
Output
2003
2007
2009
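The result above can be reproduced in Python with a short sketch like the following. The names `text`, `year_pattern` and `years` are illustrative and not part of the answer.

```python
import re

# Sample text from the question.
text = """Jan-01-2001
Jan 01 2001
2003 2007
The year was 2009 when x decided to work for Google"""

# Four digits on word boundaries, not preceded by a two-digit number plus ' ' or '-'.
year_pattern = re.compile(r'\b(?<!\b\d{2}[ -])\d{4}\b')

# The pattern has no capturing groups, so findall returns plain strings.
years = year_pattern.findall(text)
print(years)  # ['2003', '2007', '2009']
```

The `\b` inside the lookbehind is what lets 2007 through: the two digits before the space belong to 2003, so they do not start at a word boundary and the lookbehind does not apply.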
Explanation
- \b Assert position as a word boundary
- (?<!\b\d{2}[ -]) Negative lookbehind ensuring what precedes doesn't match the following
  - \b Assert position as a word boundary
  - \d{2} Match exactly 2 digits
  - [ -] Match either a space or a hyphen - character
- \d{4} Match exactly 4 digits
- \b Assert position as a word boundary
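The two-digit-day assumption can be probed with a small sketch. The input with a one-digit day (`sample`) and the chained-lookbehind variant (`chained`) are assumptions added for illustration and are not part of the answer above. The idea is that `re` rejects a lookbehind such as `\d{1,2}`, but it does accept several fixed-width lookbehinds placed side by side.

```python
import re

# A variable-width lookbehind is rejected by the standard re module.
try:
    re.compile(r'\b(?<!\b\d{1,2}[ -])\d{4}\b')
    variable_width_ok = True
except re.error:
    variable_width_ok = False

# The pattern from the answer (two-digit days only).
original = re.compile(r'\b(?<!\b\d{2}[ -])\d{4}\b')

# Hypothetical variant: one fixed-width lookbehind per possible day width.
chained = re.compile(r'\b(?<!\b\d{2}[ -])(?<!\b\d[ -])\d{4}\b')

# Hypothetical input containing a date with a one-digit day.
sample = "Jan 1 2001\n2003 2007"

print(variable_width_ok)         # False
print(original.findall(sample))  # ['2001', '2003', '2007']
print(chained.findall(sample))   # ['2003', '2007']
```

The chained variant also excludes a year that happens to follow any standalone one-digit number and a space or hyphen, so whether it is appropriate depends on the real data.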
0 likes
624 author reputation
I hope this may help you:
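import re

string = """Jan-01-2001
Jan 01 2001
2003 2007
The year was 2009 when x decided to work for Google"""

# Check each line of the text separately.
for line in string.split('\n'):
    # The negative lookahead rejects lines that start with a full date such as Jan-01-2001.
    search_date = re.search(r'^(?!\w{3}(?:\s+|-)\d{2}(?:\s+|-)\d{4}).+', line)
    if search_date:
        # Print every 4-digit run found in the lines that remain.
        print(re.findall(r'\d{4}', search_date.group()))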
Author: Pradam
Posted: December 27, 2017