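I currently do my text file manipulation through a bunch of badly remembered AWK, sed, Bash, and a tiny bit of Perl.
I have seen it mentioned in a few places that Python is good for this kind of thing. How can I use Python to replace shell scripting, AWK, sed, and friends?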
Current answer
In the beginning there was sh, sed, and awk (and find, and grep, and...). It was good. But awk can be an odd little beast and hard to remember if you don't use it often. Then the great camel created Perl. Perl was a system administrator's dream. It was like shell scripting on steroids. Text processing, including regular expressions, was just part of the language. Then it got ugly... People tried to make big applications with Perl. Now, don't get me wrong, Perl can be an application, but it can (can!) look like a mess if you're not really careful. Then there is all this flat data business. It's enough to drive a programmer nuts.
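Enter Python, Ruby, et al. These are really very good general-purpose languages. They support text processing, and do it well (though perhaps not as tightly entwined in the basic core of the language). But they also scale up very well, and still leave you with nice-looking code at the end of the day. They have also developed fairly sizable communities, with plenty of libraries for almost anything.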
Now, much of the negativity towards Perl is a matter of opinion, and certainly some people can write very clean Perl, but with this many people complaining about it being too easy to create obfuscated code, you know some grain of truth is there. The question really becomes, then: are you ever going to use this language for more than simple bash script replacements? If not, learn some more Perl; it is absolutely fantastic for that. If, on the other hand, you want a language that will grow with you as you want to do more, may I suggest Python or Ruby.
Either way, good luck!
Other answers
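I have built semi-long shell scripts (300-500 lines) and Python code with similar functionality. When many external commands are being executed, I find the shell easier to use. Perl is also a good option when there is a lot of text manipulation.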
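I suggest the awesome online book Dive Into Python. It is how I originally learned the language.
Beyond teaching you the basic structure of the language and a whole lot of useful data structures, it has a good chapter on file handling, followed by chapters on regular expressions and more.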
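To give a feel for what file handling plus regular expressions looks like in practice, here is a minimal sketch using only the standard library. It is not taken from the book; the helper names `grep` and `sum_by_key` and the sample data are made up for illustration, and the awk and grep equivalents mentioned in the comments are only rough.

```python
import re
from collections import defaultdict


def grep(lines, pattern):
    """Roughly `grep -E pattern`: keep only lines matching the regex."""
    regex = re.compile(pattern)
    return (line for line in lines if regex.search(line))


def sum_by_key(lines, key_index=0, value_index=1, sep=None):
    """Roughly `awk '{ t[$1] += $2 }'`: sum one column grouped by another."""
    totals = defaultdict(float)
    for line in lines:
        # sep=None splits on runs of whitespace, much like awk's default
        fields = line.split(sep)
        if len(fields) <= max(key_index, value_index):
            continue  # skip blank or short lines
        totals[fields[key_index]] += float(fields[value_index])
    return dict(totals)


# Hypothetical sample data; any iterable of lines works, including a file
sample = [
    "alice 10",
    "bob 5",
    "# a comment",
    "alice 2.5",
]
data_lines = grep(sample, r"^[a-z]+\s+[\d.]+$")
print(sum_by_key(data_lines))  # prints {'alice': 12.5, 'bob': 5.0}
```

Because an open file is itself an iterable of lines, the same functions should work on real data with `with open(path) as f: sum_by_key(grep(f, pattern))`, where `path` and `pattern` are whatever your task needs.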
With the ShellPy library you can use Python instead of bash.
Here is an example that downloads the avatar of the Python user from Github:
import json
import os
import tempfile

# get the api answer with curl
answer = `curl https://api.github.com/users/python

# syntactic sugar for checking returncode of executed process for zero
if answer:
    answer_json = json.loads(answer.stdout)
    avatar_url = answer_json['avatar_url']

    destination = os.path.join(tempfile.gettempdir(), 'python.png')

    # execute curl once again, this time to get the image
    result = `curl {avatar_url} > {destination}
    if result:
        # if there were no problems show the file
        p`ls -l {destination}
        print('Avatar downloaded')
    else:
        print('Failed to download avatar')
else:
    print('Failed to access github api')
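As you can see, every expression that starts with a grave accent (`) is executed in the shell. In the Python code you can capture the result of that execution and act on it. For example: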
log = `git log --pretty=oneline --grep='Create'
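This line first executes git log --pretty=oneline --grep='Create' in the shell and then assigns the result to the log variable. The result has the following properties:
stdout: the whole text from the stdout of the executed process
stderr: the whole text from the stderr of the executed process
returncode: the return code of the execution
This is a general overview of the library; a more detailed description with examples can be found in the project's documentation.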
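For comparison, the same three pieces of information (stdout, stderr, and return code) are available without any third-party library through the standard `subprocess` module. The sketch below is not ShellPy; the `run` wrapper is a made-up helper, and the child command is just the Python interpreter itself so that the example does not depend on any external tool.

```python
import subprocess
import sys


def run(command):
    """Run a command given as a list of arguments and capture its output."""
    # capture_output and text require Python 3.7 or newer
    return subprocess.run(command, capture_output=True, text=True)


# sys.executable is used only so the example runs without external tools
result = run([sys.executable, "-c", "print('hello')"])

# Unlike ShellPy's `if answer:` sugar, the return code is checked explicitly
if result.returncode == 0:
    print(result.stdout.strip())
else:
    print("command failed:", result.stderr.strip())
```

Passing the command as a list avoids going through a shell at all, which sidesteps quoting problems; the trade-off is that shell features such as `>` redirection have to be done in Python instead.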
Pythonpy is a tool that gives easy access to many of the features of awk and sed, but using Python syntax:
$ echo me2 | py -x 're.sub("me", "you", x)'
you2
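The same substitution can be sketched in plain Python with the standard `re` module, which is roughly what a `sed 's/me/you/g'` filter does. The function name `substitute` is made up for this illustration and is not part of Pythonpy.

```python
import re


def substitute(lines, pattern, replacement):
    """Roughly `sed 's/pattern/replacement/g'`: apply re.sub to each line."""
    regex = re.compile(pattern)
    for line in lines:
        yield regex.sub(replacement, line)


print(list(substitute(["me2", "me and me"], "me", "you")))
# prints ['you2', 'you and you']
```

To use it as a command-line filter, one option is `sys.stdout.writelines(substitute(sys.stdin, "me", "you"))` after `import sys`, since `sys.stdin` yields lines with their newlines intact.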
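While researching this topic, I found this proof-of-concept code (via a comment at http://jlebar.com/2010/2/1/Replacing_Bash.html) that lets you "write shell-like pipelines in Python using a terse syntax, and leveraging existing system tools where they make sense":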
for line in sh("cat /tmp/junk2") | cut(d=',',f=1) | 'sort' | uniq:
    sys.stdout.write(line)
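A similar pipeline can be approximated with ordinary generators and no special syntax. This is a sketch under stated assumptions, not the proof-of-concept library above: `cut` and `uniq` here are hypothetical helpers written for this example, the input rows are invented, and the built-in `sorted` stands in for the external `sort`.

```python
from itertools import groupby


def cut(lines, delimiter=",", field=1):
    """Roughly `cut -d, -f1`: yield one 1-based field from each line."""
    for line in lines:
        parts = line.rstrip("\n").split(delimiter)
        if field <= len(parts):
            yield parts[field - 1]


def uniq(lines):
    """Roughly `uniq`: collapse runs of identical adjacent lines."""
    for value, _run in groupby(lines):
        yield value


# Hypothetical input; an open file object could be passed to cut() instead
rows = ["b,1", "a,2", "b,3"]
for value in uniq(sorted(cut(rows))):
    print(value)
# prints:
# a
# b
```

Each stage consumes the previous one lazily except `sorted`, which has to read everything before it can emit anything, just as the shell's `sort` does.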