The shlex module provides a simple lexer (that is, a tokenizer) for languages based on Unix shell syntax.
For example:
Suppose there is a text file, quotes.txt:
This string has embedded "double quotes" and 'single quotes' in it,
and even "a 'nested example'".
Python code:
test.py
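#!/usr/bin/env python3
import shlex
import sys

# Expect exactly one argument: the file to tokenize.
if len(sys.argv) != 2:
    print('please input a filename')
    sys.exit(1)

filename = sys.argv[1]
# Read the whole file as text.
with open(filename, 'rt') as f:
    body = f.read()
print('ORIGINAL:', repr(body))
print()

print('TOKENS:')
# Iterating over a shlex instance yields one token at a time.
lexer = shlex.shlex(body)
for token in lexer:
    print(repr(token))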
Run the command:
./test.py quotes.txt
ORIGINAL: 'This string has embedded "double quotes" and \'single quotes\' in it,\nand even "a \'nested example\'".\n'

TOKENS:
'This'
'string'
'has'
'embedded'
'"double quotes"'
'and'
"'single quotes'"
'in'
'it'
','
'and'
'even'
'"a \'nested example\'"'
'.'
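As the output shows, shlex keeps each quoted phrase together as a single token, including the single-quoted text nested inside double quotes, and splits punctuation such as the comma and period into tokens of their own. That is much more convenient than writing regular expressions to do the same job.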
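To make the convenience concrete, here is a minimal sketch that compares plain whitespace splitting with shlex.split(), the module's helper that tokenizes using POSIX shell rules. The sample sentence is a shortened version of quotes.txt, and the variable names are only illustrative. Note that, unlike the shlex.shlex() lexer used in test.py, shlex.split() removes the surrounding quote characters from each token.

```python
import shlex

# Shortened, single-line version of the sentence in quotes.txt (illustrative).
text = 'This string has embedded "double quotes" and \'single quotes\' in it'

# Naive whitespace splitting breaks each quoted phrase into pieces.
naive_tokens = text.split()

# shlex.split() applies POSIX shell rules: each quoted phrase stays
# together as one token and the surrounding quotes are stripped.
shell_tokens = shlex.split(text)

print(naive_tokens)
print(shell_tokens)
```

With str.split() the phrase "double quotes" ends up as two separate pieces, each still carrying a stray quote character, while shlex.split() returns it as the single token double quotes.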
Original article: https://www.cnblogs.com/rayong/p/7541799.html