How do I skip the header line and start reading a file from line 2?


Current answer

with open(fname) as fh:
    f = fh.readlines()  # reads the whole file into memory as a list of lines
firstLine = f.pop(0)    # removes (and returns) the first line
for line in f:
    ...

Other answers

If you want to read multiple CSV files starting from line 2, this works like a charm:

import csv

for path in csv_file_list:
    with open(path, 'r', newline='') as r:
        next(r)              # skip the header line
        rr = csv.reader(r)
        for row in rr:
            pass             # do something with row

(This is part of Parfait's answer to another question.)
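
A related option, sketched here as an assumption rather than as part of the answer above: csv.DictReader consumes the first row as column names, so the header never shows up in the loop. The function name rows_without_header and the sample data are hypothetical, and io.StringIO stands in for a real file.

```python
import csv
import io

def rows_without_header(filehandle):
    """Return the data rows as dicts keyed by the header row's column names."""
    # DictReader reads the first row as field names, so it is not yielded as data.
    reader = csv.DictReader(filehandle)
    return list(reader)

# Hypothetical sample data; in practice this would be open(path, newline='').
sample = io.StringIO("name,value\nalpha,1\nbeta,2\n")
rows = rows_without_header(sample)
for row in rows:
    print(row["name"], row["value"])
```

This only fits files with exactly one header row; values come back as strings.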

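To generalize the task of reading multiple header lines and to improve readability, I would use method extraction. Suppose you want to tokenize the first two lines of coordinates.txt as header information.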

Example

coordinates.txt
---------------
Name,Longitude,Latitude,Elevation, Comments
String, Decimal Deg., Decimal Deg., Meters, String
Euler's Town,7.58857,47.559537,0, "Blah"
Faneuil Hall,-71.054773,42.360217,0
Yellowstone National Park,-110.588455,44.427963,0

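Method extraction then lets you specify what to do with the header information (in this example we simply tokenize each header line on commas and return it as a list, but there is room to do much more).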

def __readheader(filehandle, numberheaderlines=1):
    """Reads the specified number of lines and yields the comma-delimited
    strings on each line as a list"""
    for _ in range(numberheaderlines):
        # A list comprehension keeps the result a list on Python 3,
        # where map() would return a lazy iterator instead.
        yield [token.strip() for token in filehandle.readline().split(',')]

with open('coordinates.txt', 'r') as rh:
    # Single header line
    # print(next(__readheader(rh)))

    # Multiple header lines
    for headerline in __readheader(rh, numberheaderlines=2):
        print(headerline)  # Or do other stuff with headerline tokens

Output

['Name', 'Longitude', 'Latitude', 'Elevation', 'Comments']
['String', 'Decimal Deg.', 'Decimal Deg.', 'Meters', 'String']

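If coordinates.txt contains another header line, just change numberheaderlines. Best of all, it is clear what __readheader(rh, numberheaderlines=2) does, and we avoid the ambiguity of having to figure out, or comment on, why the author of the accepted answer uses next() in his code.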
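
As a rough sketch of the same idea (not the answerer's code), the header lines can be consumed with itertools.islice and the same handle handed back for the data rows. The name split_header is hypothetical, and io.StringIO stands in for coordinates.txt so the snippet runs on its own.

```python
import io
from itertools import islice

def split_header(filehandle, numberheaderlines=1):
    """Consume the header lines; return (tokenized headers, handle at first data line)."""
    headers = [
        [token.strip() for token in line.split(',')]
        for line in islice(filehandle, numberheaderlines)
    ]
    return headers, filehandle

# Stand-in for coordinates.txt, shortened to one data row.
sample = io.StringIO(
    "Name,Longitude,Latitude,Elevation, Comments\n"
    "String, Decimal Deg., Decimal Deg., Meters, String\n"
    "Faneuil Hall,-71.054773,42.360217,0\n"
)
headers, body = split_header(sample, numberheaderlines=2)
data_lines = [line.rstrip('\n') for line in body]
print(headers)
print(data_lines)
```

islice takes exactly numberheaderlines items from the handle, so iteration over body resumes at the first data row.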

If only slicing worked on iterators...

from itertools import islice

with open(fname) as f:
    for line in islice(f, 1, None):  # start at the second line
        pass

with open(fname) as f:
    next(f)  # skip the header line
    for line in f:
        pass  # do something with line
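
One caveat with a bare next(f): on an empty file it raises StopIteration. A minimal sketch of a guard, assuming an empty file should simply produce no rows; the name data_lines is hypothetical and io.StringIO stands in for a real file.

```python
import io

def data_lines(filehandle):
    """Yield every line after the first; yield nothing if the file is empty."""
    next(filehandle, None)  # the default value avoids StopIteration on an empty file
    for line in filehandle:
        yield line.rstrip('\n')

print(list(data_lines(io.StringIO("header\nrow1\nrow2\n"))))
print(list(data_lines(io.StringIO(""))))
```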