I'm trying to find a regular expression that comma-separates large numbers according to the South Asian numbering system.
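Some examples, each shown first with Western grouping and then with South Asian grouping of the same number:
1,000,000 becomes 10,00,000
1,000,000,000 becomes 100,00,00,000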
The comma pattern repeats every 7 digits. For example: 1,00,00,000,00,00,000.
From Friedl's book Mastering Regular Expressions, I have the following regular expression for the Arabic numbering system:
r'(?<=\d)(?=(\d{3})+(?!\d))'
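As a quick check of what that pattern does, here is a minimal sketch using re.sub(). The helper name group_thousands is only illustrative and does not come from the book.

```python
import re

# Zero-width position that is preceded by a digit and followed by
# a multiple of three digits up to the end of the digit run.
THOUSANDS = re.compile(r'(?<=\d)(?=(\d{3})+(?!\d))')

def group_thousands(digits):
    """Insert a comma before every trailing group of three digits."""
    return THOUSANDS.sub(',', digits)

print(group_thousands('1000000'))     # 1,000,000
print(group_thousands('1000000000'))  # 1,000,000,000
```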
For the Indian numbering system, I came up with the following expression, but it doesn't work for numbers with more than 8 digits:
r'(?<=\d)(?=(((\d{2}){0,2}\d{3})(?=\b)))'
Using the above pattern, I get 100000000,00,00,000.
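That result can be reproduced with a short script. This is only a sketch: it assumes the input was the 16-digit string made of a 1 followed by fifteen zeros, which is consistent with the output shown, and the names ATTEMPT and subject are illustrative.

```python
import re

# The pattern from the question: the lookahead only accepts 3, 5 or 7
# digits before a word boundary, so nothing further to the left is grouped.
ATTEMPT = re.compile(r'(?<=\d)(?=(((\d{2}){0,2}\d{3})(?=\b)))')

subject = '1' + '0' * 15  # assumed input: 16 digits
result = ATTEMPT.sub(',', subject)
print(result)  # 100000000,00,00,000
```

With 8 digits or fewer the same pattern gives the expected grouping, because no comma is needed more than 7 digits from the end.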
I'm using the Python re module (re.sub()). Any ideas?
Try this:
(?<=\d)(?=(\d{2}){0,2}\d{3}(\d{7})*(?!\d))
For example:
>>> import re
>>> inp = ["1" + "0"*i for i in range(20)]
>>> [re.sub(r"(?<=\d)(?=(\d{2}){0,2}\d{3}(\d{7})*(?!\d))", ",", i) for i in inp]
['1', '10', '100', '1,000', '10,000', '1,00,000', '10,00,000', '1,00,00,000', '10,00,00,000', '100,00,00,000', '1,000,00,00,000', '10,000,00,00,000', '1,00,000,00,00,000', '10,00,000,00,00,000', '1,00,00,000,00,00,000', '10,00,00,000,00,00,000', '100,00,00,000,00,00,000', '1,000,00,00,000,00,00,000', '10,000,00,00,000,00,00,000', '1,00,000,00,00,000,00,00,000']
As a commented regex:
result = re.sub(
    r"""(?x)       # Enable verbose mode (comments)
    (?<=\d)        # Assert that we're not at the start of the number.
    (?=            # Assert that it's possible to match:
     (\d{2}){0,2}  # 0, 2 or 4 digits,
     \d{3}         # followed by 3 digits,
     (\d{7})*      # followed by 0, 7, 14, 21 ... digits,
     (?!\d)        # and no more digits after that.
    )              # End of lookahead assertion.""",
    ",", subject)
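To reuse the pattern, it can be wrapped in a small helper and cross-checked against plain arithmetic. The sketch below is an illustration under stated assumptions, not part of the answer itself: the names format_south_asian and format_by_counting are hypothetical, the input is assumed to be a string of digits only, and the counting version relies on the observation that a comma belongs wherever the number of digits remaining to the right is 3, 5 or 7 plus a multiple of 7.

```python
import re

PATTERN = re.compile(r"(?<=\d)(?=(\d{2}){0,2}\d{3}(\d{7})*(?!\d))")

def format_south_asian(digits):
    """Group a string of digits using the regex from the answer."""
    return PATTERN.sub(",", digits)

def format_by_counting(digits):
    """Same grouping without a regex (assumes digits is all digits)."""
    out = []
    n = len(digits)
    for i, ch in enumerate(digits):
        remaining = n - i  # digits from this position to the end
        # A comma goes before this digit when 3, 5 or 7 (+7k) digits remain.
        if i > 0 and remaining % 7 in (3, 5, 0):
            out.append(",")
        out.append(ch)
    return "".join(out)

# Both versions should agree on 1, 10, 100, ... up to 20 digits.
for k in range(20):
    s = "1" + "0" * k
    assert format_south_asian(s) == format_by_counting(s)

print(format_south_asian("1000000000"))  # 100,00,00,000
```

Compiling the pattern once avoids re-parsing it on every call, and the counting version gives an independent check that the lookahead encodes the 7-digit cycle correctly.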