小编典典

Best way to replace multiple characters in a string?


I need to replace some characters as follows: & ➔ \&, # ➔ \#, …

I coded it as follows, but I guess there should be a better way. Any hints?

strs = strs.replace('&', '\&')
strs = strs.replace('#', '\#')


2022-04-15

1 answer

小编典典

Replacing two characters

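I timed all the methods in the current answers, along with one extra.

With the input string abc&def#ghi and replacing & -> \& and # -> \#, the fastest way was to chain the replacements together like this: text.replace('&', '\&').replace('#', '\#').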
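
As a minimal sketch of that chained form, the function below returns the escaped string so it can be used outside a benchmark. The name escape_two is hypothetical and not part of the timed code. Newer Python versions warn about '\&' in a non-raw string as an invalid escape sequence, so the sketch uses raw strings, which produce the same two-character result.

```python
def escape_two(text):
    # Hypothetical helper: chain two str.replace calls, one per character.
    # r'\&' is a raw string holding a backslash followed by '&'.
    return text.replace('&', r'\&').replace('#', r'\#')


print(escape_two('abc&def#ghi'))  # abc\&def\#ghi
```

Each replace call makes one pass over the string, so this form stays simple when only a couple of characters need escaping.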

Timings for each function:

a) 1000000 loops, best of 3: 1.47 μs per loop
b) 1000000 loops, best of 3: 1.51 μs per loop
c) 100000 loops, best of 3: 12.3 μs per loop
d) 100000 loops, best of 3: 12 μs per loop
e) 100000 loops, best of 3: 3.27 μs per loop
f) 1000000 loops, best of 3: 0.817 μs per loop
g) 100000 loops, best of 3: 3.64 μs per loop
h) 1000000 loops, best of 3: 0.927 μs per loop
i) 1000000 loops, best of 3: 0.814 μs per loop

Here are the functions:

def a(text):
    chars = "&#"
    for c in chars:
        text = text.replace(c, "\\" + c)


def b(text):
    for ch in ['&','#']:
        if ch in text:
            text = text.replace(ch,"\\"+ch)


import re
def c(text):
    rx = re.compile('([&#])')
    text = rx.sub(r'\\\1', text)


RX = re.compile('([&#])')
def d(text):
    text = RX.sub(r'\\\1', text)


def mk_esc(esc_chars):
    return lambda s: ''.join(['\\' + c if c in esc_chars else c for c in s])
esc = mk_esc('&#')
def e(text):
    esc(text)


def f(text):
    text = text.replace('&', '\&').replace('#', '\#')


def g(text):
    replacements = {"&": "\&", "#": "\#"}
    text = "".join([replacements.get(c, c) for c in text])


def h(text):
    text = text.replace('&', r'\&')
    text = text.replace('#', r'\#')


def i(text):
    text = text.replace('&', r'\&').replace('#', r'\#')

Timed like this:

python -mtimeit -s"import time_functions" "time_functions.a('abc&def#ghi')"
python -mtimeit -s"import time_functions" "time_functions.b('abc&def#ghi')"
python -mtimeit -s"import time_functions" "time_functions.c('abc&def#ghi')"
python -mtimeit -s"import time_functions" "time_functions.d('abc&def#ghi')"
python -mtimeit -s"import time_functions" "time_functions.e('abc&def#ghi')"
python -mtimeit -s"import time_functions" "time_functions.f('abc&def#ghi')"
python -mtimeit -s"import time_functions" "time_functions.g('abc&def#ghi')"
python -mtimeit -s"import time_functions" "time_functions.h('abc&def#ghi')"
python -mtimeit -s"import time_functions" "time_functions.i('abc&def#ghi')"
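
The same kind of measurement can also be driven from Python with the standard timeit module. This is a rough sketch only: the chained function is redefined inline instead of being imported from time_functions, and the helper name best_usec and its loop counts are illustrative assumptions.

```python
import timeit


def chained(text):
    # Mirrors i above: chained replacement, result discarded as in the benchmark.
    text = text.replace('&', r'\&').replace('#', r'\#')


def best_usec(func, arg, number=100000, repeat=3):
    # Hypothetical helper: best-of-`repeat` time per call, in microseconds.
    timings = timeit.repeat(lambda: func(arg), number=number, repeat=repeat)
    return min(timings) / number * 1e6


if __name__ == "__main__":
    print("chained: %.3f usec per loop" % best_usec(chained, 'abc&def#ghi'))
```

The lambda wrapper adds a small constant overhead to every call, so figures from this sketch are not directly comparable with the command-line numbers.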

Replacing 17 characters

Here's similar code to do the same thing, but with more characters to escape (\`*_{}>#+-.!$):

def a(text):
    chars = "\\`*_{}[]()>#+-.!$"
    for c in chars:
        text = text.replace(c, "\\" + c)


def b(text):
    for ch in ['\\','`','*','_','{','}','[',']','(',')','>','#','+','-','.','!','$','\'']:
        if ch in text:
            text = text.replace(ch,"\\"+ch)


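import re
def c(text):
    # Note: this is still the two-character pattern from the first test.
    rx = re.compile('([&#])')
    text = rx.sub(r'\\\1', text)


# Note: the first ']' closes the character class early, so this pattern
# does not match each of the 17 characters.
RX = re.compile('([\\`*_{}[]()>#+-.!$])')
def d(text):
    text = RX.sub(r'\\\1', text)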


def mk_esc(esc_chars):
    return lambda s: ''.join(['\\' + c if c in esc_chars else c for c in s])
esc = mk_esc('\\`*_{}[]()>#+-.!$')
def e(text):
    esc(text)


def f(text):
    text = text.replace('\\', '\\\\').replace('`', '\`').replace('*', '\*').replace('_', '\_').replace('{', '\{').replace('}', '\}').replace('[', '\[').replace(']', '\]').replace('(', '\(').replace(')', '\)').replace('>', '\>').replace('#', '\#').replace('+', '\+').replace('-', '\-').replace('.', '\.').replace('!', '\!').replace('$', '\$')


def g(text):
    replacements = {
        "\\": "\\\\",
        "`": "\`",
        "*": "\*",
        "_": "\_",
        "{": "\{",
        "}": "\}",
        "[": "\[",
        "]": "\]",
        "(": "\(",
        ")": "\)",
        ">": "\>",
        "#": "\#",
        "+": "\+",
        "-": "\-",
        ".": "\.",
        "!": "\!",
        "$": "\$",
    }
    text = "".join([replacements.get(c, c) for c in text])


def h(text):
    text = text.replace('\\', r'\\')
    text = text.replace('`', r'\`')
    text = text.replace('*', r'\*')
    text = text.replace('_', r'\_')
    text = text.replace('{', r'\{')
    text = text.replace('}', r'\}')
    text = text.replace('[', r'\[')
    text = text.replace(']', r'\]')
    text = text.replace('(', r'\(')
    text = text.replace(')', r'\)')
    text = text.replace('>', r'\>')
    text = text.replace('#', r'\#')
    text = text.replace('+', r'\+')
    text = text.replace('-', r'\-')
    text = text.replace('.', r'\.')
    text = text.replace('!', r'\!')
    text = text.replace('$', r'\$')


def i(text):
    text = text.replace('\\', r'\\').replace('`', r'\`').replace('*', r'\*').replace('_', r'\_').replace('{', r'\{').replace('}', r'\}').replace('[', r'\[').replace(']', r'\]').replace('(', r'\(').replace(')', r'\)').replace('>', r'\>').replace('#', r'\#').replace('+', r'\+').replace('-', r'\-').replace('.', r'\.').replace('!', r'\!').replace('$', r'\$')
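
One thing to watch in the regex variants: c still uses the two-character class, and in d's pattern the first ] closes the character class early, so neither escapes the full set. Below is a hedged sketch of a pattern that does cover the 17 characters, built with re.escape so that ], \ and - are treated literally inside the class. The names ESCAPE_CHARS, RX_ALL and d_fixed are hypothetical, and this variant is not part of the timings.

```python
import re

ESCAPE_CHARS = "\\`*_{}[]()>#+-.!$"  # the 17 characters used by a above

# re.escape makes every special character literal inside the class.
RX_ALL = re.compile('([' + re.escape(ESCAPE_CHARS) + '])')


def d_fixed(text):
    # Prefix each matched character with a backslash and return the result.
    return RX_ALL.sub(r'\\\1', text)


print(d_fixed('a[b]*c'))  # a\[b\]\*c
```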

Here are the results for the same input string, abc&def#ghi:

a) 100000 loops, best of 3: 6.72 μs per loop
b) 100000 loops, best of 3: 2.64 μs per loop
c) 100000 loops, best of 3: 11.9 μs per loop
d) 100000 loops, best of 3: 4.92 μs per loop
e) 100000 loops, best of 3: 2.96 μs per loop
f) 100000 loops, best of 3: 4.29 μs per loop
g) 100000 loops, best of 3: 4.68 μs per loop
h) 100000 loops, best of 3: 4.73 μs per loop
i) 100000 loops, best of 3: 4.24 μs per loop

And with a longer input string (## *Something* and [another] thing in a longer sentence with {more} things to replace$):

a) 100000 loops, best of 3: 7.59 μs per loop
b) 100000 loops, best of 3: 6.54 μs per loop
c) 100000 loops, best of 3: 16.9 μs per loop
d) 100000 loops, best of 3: 7.29 μs per loop
e) 100000 loops, best of 3: 12.2 μs per loop
f) 100000 loops, best of 3: 5.38 μs per loop
g) 10000 loops, best of 3: 21.7 μs per loop
h) 100000 loops, best of 3: 5.7 μs per loop
i) 100000 loops, best of 3: 5.13 μs per loop

Adding a couple of variants:

def ab(text):
    for ch in ['\\','`','*','_','{','}','[',']','(',')','>','#','+','-','.','!','$','\'']:
        text = text.replace(ch,"\\"+ch)


def ba(text):
    chars = "\\`*_{}[]()>#+-.!$"
    for c in chars:
        if c in text:
            text = text.replace(c, "\\" + c)

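With the shorter input:

ab) 100000 loops, best of 3: 7.05 μs per loop
ba) 100000 loops, best of 3: 2.4 μs per loop

With the longer input:

ab) 100000 loops, best of 3: 7.71 μs per loop
ba) 100000 loops, best of 3: 6.08 μs per loop

So I'm going to use ba for readability and speed.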
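
For use outside the benchmark, ba needs to return its result. A minimal sketch under that assumption follows; the name escape_chars and the default argument are illustrative additions, not part of the timed code.

```python
def escape_chars(text, chars="\\`*_{}[]()>#+-.!$"):
    # The backslash comes first in `chars`, so backslashes added for later
    # characters are not escaped a second time.
    for c in chars:
        if c in text:  # skip the replace call when the character is absent
            text = text.replace(c, "\\" + c)
    return text


print(escape_chars("## *Something* and [another] thing"))
# \#\# \*Something\* and \[another\] thing
```

Keeping the backslash first matters: if it were processed last, every backslash inserted by an earlier replacement would itself be doubled.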

Addendum

Prompted by haccks in the comments, one difference between ab and ba is the if c in text: check. Let's test them against two more variants:

def ab_with_check(text):
    for ch in ['\\','`','*','_','{','}','[',']','(',')','>','#','+','-','.','!','$','\'']:
        if ch in text:
            text = text.replace(ch,"\\"+ch)

def ba_without_check(text):
    chars = "\\`*_{}[]()>#+-.!$"
    for c in chars:
        text = text.replace(c, "\\" + c)

Times are in μs per loop on Python 2.7.14 and 3.6.3, measured on a different machine from the earlier set, so they cannot be compared directly.
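
When reading such comparisons, it may help to confirm what actually differs between the variants. The sketch below, with hypothetical names LIST_CHARS, STR_CHARS, with_check and without_check, shows two things: the list used by ab also contains a single quote that the string used by ba lacks, and the membership check changes only the amount of work done, not the output.

```python
LIST_CHARS = ['\\', '`', '*', '_', '{', '}', '[', ']', '(', ')',
              '>', '#', '+', '-', '.', '!', '$', '\'']
STR_CHARS = "\\`*_{}[]()>#+-.!$"

# Characters that only the list-based variants escape.
only_in_list = [ch for ch in LIST_CHARS if ch not in STR_CHARS]


def with_check(text, chars):
    for ch in chars:
        if ch in text:
            text = text.replace(ch, "\\" + ch)
    return text


def without_check(text, chars):
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


sample = ("## *Something* and [another] thing in a longer sentence "
          "with {more} things to replace$")
# The membership test only skips no-op replace calls, so both loops agree.
assert with_check(sample, STR_CHARS) == without_check(sample, STR_CHARS)
print(only_in_list)  # ["'"]
```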
