How do I concatenate text files in Python?

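I have a list of 20 file names, like ['file1.txt', 'file2.txt', …]. I want to write a Python script to concatenate these files into a new file. I could open each file with f = open(…), read it line by line by calling f.readline(), and write each line into the new file. That doesn't seem very "elegant" to me, especially the part where I have to read and write line by line.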

Is there a more "elegant" way to do this in Python?

Current answer

If the files are not very large:

with open('newfile.txt','wb') as newf:
    for filename in list_of_files:
        with open(filename,'rb') as hf:
            newf.write(hf.read())
            # newf.write(b'\n\n\n')   uncomment to insert blank lines between
            # the copied files (bytes, because newf is opened in binary mode)

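If the files are too big to be read entirely and held in RAM, the algorithm must be slightly different: copy each file in a loop that reads fixed-length chunks, for example with read(10000).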
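
A minimal sketch of the chunked variant described above. The function name concat_in_chunks and its chunk_size parameter are illustrative assumptions, not part of the original answer; the default of 10000 bytes simply mirrors the read(10000) suggestion.

```python
def concat_in_chunks(list_of_files, out_path, chunk_size=10000):
    """Append each input file to out_path, reading chunk_size bytes at a time."""
    with open(out_path, 'wb') as newf:
        for filename in list_of_files:
            with open(filename, 'rb') as hf:
                while True:
                    chunk = hf.read(chunk_size)
                    if not chunk:  # read() returns b'' at end of file
                        break
                    newf.write(chunk)
```

Only one chunk is held in memory at a time, so the size of the input files no longer matters.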

2012-11-28 20:04:38

Other answers

Check out the .read() method of the file object:

http://docs.python.org/2/tutorial/inputoutput.html#methods-of-file-objects

You could do something like:

concat = ""
for file in files:
    with open(file) as f:  # close each file as soon as it has been read
        concat += f.read()

Or a more "elegant" Pythonic way:

concat = ''.join([open(f).read() for f in files])

According to this article, http://www.skymind.com/~ocrow/python_string/, this would also be the fastest.

2012-11-28 20:04:20

If there are many files in the directory, glob2 may be a better way to generate the list of file names than writing them out by hand.

import glob2

filenames = glob2.glob('*.txt')  # list of all .txt files in the directory

with open('outfile.txt', 'w') as f:
    for file in filenames:
        with open(file) as infile:
            f.write(infile.read()+'\n')
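
For a flat pattern like '*.txt', the standard-library glob module can build the same list without a third-party install. The sketch below is an illustration, not the answerer's code: the function name concat_matching is hypothetical, and sorting the names and skipping the output file are assumptions added so that the order is predictable and a leftover output file that matches the pattern is not copied into itself.

```python
import glob
import os

def concat_matching(pattern, out_path):
    """Concatenate every file matching pattern into out_path, in sorted name order."""
    out_abs = os.path.abspath(out_path)
    # Build the list before opening the output, and leave out the output file
    # itself in case it already exists and matches the pattern.
    filenames = sorted(
        name for name in glob.glob(pattern)
        if os.path.abspath(name) != out_abs
    )
    with open(out_path, 'w') as f:
        for name in filenames:
            with open(name) as infile:
                f.write(infile.read() + '\n')
    return filenames
```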

2017-05-06 09:45:00

This is exactly what fileinput is for:

import fileinput
with open(outfilename, 'w') as fout, fileinput.input(filenames) as fin:
    for line in fin:
        fout.write(line)

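For this use case it's really not much simpler than iterating over the files manually, but in other cases having a single iterator that walks all of the files as if they were one file is very handy. (Also, the fact that fileinput closes each file as soon as it's done means there's no need to use with or close on each one, but that's just a one-line saving, not a big deal.)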

There are some other nifty features in fileinput, like the ability to modify files in place just by filtering each line.
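
As a rough illustration of the in-place feature, the sketch below strips trailing whitespace from every line of the given files. The function name strip_trailing_whitespace and the choice of filter are assumptions made for the example; it uses the with form, so it assumes Python 3.2 or later.

```python
import fileinput

def strip_trailing_whitespace(filenames):
    """Rewrite each file in place with trailing whitespace removed from every line."""
    with fileinput.input(files=filenames, inplace=True) as fin:
        for line in fin:
            # With inplace=True, standard output is redirected into the file
            # currently being read, so print() writes the replacement line.
            print(line.rstrip())
```

Pass a non-empty list of names: with an empty list, fileinput falls back to sys.argv[1:] and then to standard input.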

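As noted in the comments, and discussed in another post, the fileinput code above will not work as written in Python 2.7, where fileinput.input() cannot be used in a with statement (that support arrived in Python 3.2). Here is a slight modification that makes the code Python 2.7 compliant: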

with open('outfilename', 'w') as fout:
    fin = fileinput.input(filenames)
    for line in fin:
        fout.write(line)
    fin.close()

2012-11-28 20:07:45

Use shutil.copyfileobj.

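It automatically reads the input files chunk by chunk for you, which is more efficient than reading each file whole, and it works even if some of the input files are too large to fit into memory: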

import shutil

with open('output_file.txt','wb') as wfd:
    for f in ['seg1.txt','seg2.txt','seg3.txt']:
        with open(f,'rb') as fd:
            shutil.copyfileobj(fd, wfd)

2014-11-22 12:35:15
