Create a Pandas DataFrame by appending one row at a time

How do I create an empty DataFrame and then add rows to it, one by one?

I created an empty DataFrame:

df = pd.DataFrame(columns=('lib', 'qty1', 'qty2'))

Then I can add a new row at the end and fill in a single field:

df = df._set_value(index=len(df), col='qty1', value=10.0)

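This works for only one field at a time. What is a better way to add a new row to df?

Current answer

You can concatenate two DataFrames for this. I ran into this problem when adding a new row to an existing DataFrame that has a string index (not a numeric one).

So I put the data for the new row in a dict() and the index label in a list.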

new_dict = {put input for new row here}
new_list = [put your index here]

new_df = pd.DataFrame(data=new_dict, index=new_list)

df = pd.concat([existing_df, new_df])
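
The placeholders above can be filled in as in the following minimal sketch. The column names come from the question; the row labels ("row_one", "row_two") and all values are made up purely for illustration.

```python
import pandas as pd

# Existing frame with a string (label) index; labels and values are illustrative.
existing_df = pd.DataFrame(
    {"lib": ["a"], "qty1": [1.0], "qty2": [2.0]},
    index=["row_one"],
)

# One scalar per column for the new row, plus a one-element list holding its label.
new_dict = {"lib": "b", "qty1": 3.0, "qty2": 4.0}
new_list = ["row_two"]

# Build a one-row frame and concatenate it below the existing one.
new_df = pd.DataFrame(data=new_dict, index=new_list)
df = pd.concat([existing_df, new_df])
print(df)
```

Note that pd.concat returns a new DataFrame each time, so calling it once per row in a long loop copies the data repeatedly.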

2020-04-30 14:07:19

Other answers

I came up with a simple and nice way:

>>> df
     A  B  C
one  1  2  3
>>> df.loc["two"] = [4,5,6]
>>> df
     A  B  C
one  1  2  3
two  4  5  6

Please note the performance caveat mentioned in the comments.
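
One detail worth keeping in mind with this approach: assigning through loc appends only when the label is new. If the label already exists, the row is overwritten instead. A small sketch, reusing the frame from the session above (the replacement values 7, 8, 9 are illustrative):

```python
import pandas as pd

df = pd.DataFrame({"A": [1], "B": [2], "C": [3]}, index=["one"])

df.loc["two"] = [4, 5, 6]  # "two" is a new label, so a row is appended
df.loc["one"] = [7, 8, 9]  # "one" already exists, so that row is overwritten

print(df)
```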

2018-08-30 03:19:43

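If you know the number of entries in advance, you should preallocate the space by also providing the index (the data example is taken from a different answer):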

import pandas as pd
import numpy as np
# we know we're gonna have 5 rows of data
numberOfRows = 5
# create dataframe
df = pd.DataFrame(index=np.arange(0, numberOfRows), columns=('lib', 'qty1', 'qty2') )

# now fill it up row by row
for x in np.arange(0, numberOfRows):
    #loc or iloc both work here since the index is natural numbers
    df.loc[x] = [np.random.randint(-1,1) for n in range(3)]

In[23]: df
Out[23]: 
   lib  qty1  qty2
0   -1    -1    -1
1    0     0     0
2   -1     0    -1
3    0    -1     0
4   -1     0     0

Speed comparison

In[30]: %timeit tryThis() # function wrapper for this answer
In[31]: %timeit tryOther() # function wrapper without index (see, for example, @fred)
1000 loops, best of 3: 1.23 ms per loop
100 loops, best of 3: 2.31 ms per loop

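And, as noted in the comments, with a size of 6000 the speed difference becomes even larger:

Increasing the size of the array (12) and the number of rows (500) makes the speed difference more striking: 313 ms vs 2.29 s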
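
The tryThis and tryOther wrappers are not shown in the answer. The sketch below is one guess at what such a comparison could look like; the function names fill_preallocated and fill_by_enlargement are hypothetical stand-ins, and the row count is kept small so it runs quickly. Timings depend on the machine and the pandas version, so no numbers are claimed here.

```python
import timeit

import numpy as np
import pandas as pd

COLUMNS = ("lib", "qty1", "qty2")
NUMBER_OF_ROWS = 50  # illustrative size, not the one used in the answer


def fill_preallocated(n=NUMBER_OF_ROWS):
    """Create all n rows up front by passing the index, then fill them in."""
    df = pd.DataFrame(index=np.arange(0, n), columns=COLUMNS)
    for x in range(n):
        df.loc[x] = [np.random.randint(-1, 1) for _ in range(3)]
    return df


def fill_by_enlargement(n=NUMBER_OF_ROWS):
    """Start with no rows and let each loc assignment enlarge the frame."""
    df = pd.DataFrame(columns=COLUMNS)
    for x in range(n):
        df.loc[x] = [np.random.randint(-1, 1) for _ in range(3)]
    return df


# Time a few runs of each variant and report the totals.
t_pre = timeit.timeit(fill_preallocated, number=3)
t_grow = timeit.timeit(fill_by_enlargement, number=3)
print(f"preallocated: {t_pre:.4f} s, enlargement: {t_grow:.4f} s")
```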

2014-07-23 14:21:45

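Instead of a list of dictionaries as in ShikharDua's answer (row based), we can also represent our table as a dictionary of lists (column based), where each list stores one column in row order, provided we know our columns beforehand. At the end, we construct the DataFrame once.

In both cases, the dictionary keys are always the column names. Row order is stored implicitly as the order within each list. For c columns and n rows, this uses one dictionary of c lists instead of one list of n dictionaries. The list-of-dictionaries method makes every dictionary store all the keys redundantly and requires creating a new dictionary for each row. Here we only append to lists, which has the same overall time complexity (adding entries to a list and to a dictionary are both amortized constant time) but may have less overhead because the operation is simpler.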

import pandas as pd

# Current data
data = {"Animal":["cow", "horse"], "Color":["blue", "red"]}

# Adding a new row (be careful to ensure every column gets another value)
data["Animal"].append("mouse")
data["Color"].append("black")

# At the end, construct our DataFrame
df = pd.DataFrame(data)
#   Animal  Color
# 0    cow   blue
# 1  horse    red
# 2  mouse  black
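
Because every column list has to grow by exactly one value per row, it can help to route appends through a small helper that checks the keys first. The append_row function below is a hypothetical helper, not part of pandas or of the original answer; it is only a sketch of that guard.

```python
import pandas as pd


def append_row(data, row):
    """Append one row (a dict keyed by column name) to a dict of lists.

    Raises ValueError before touching any list if the row's keys do not
    match the columns exactly, so the lists never end up with unequal lengths.
    """
    if set(row) != set(data):
        raise ValueError(
            f"row keys {sorted(row)} do not match columns {sorted(data)}"
        )
    for column, value in row.items():
        data[column].append(value)


data = {"Animal": ["cow", "horse"], "Color": ["blue", "red"]}
append_row(data, {"Animal": "mouse", "Color": "black"})

# Construct the DataFrame once, after all rows have been collected.
df = pd.DataFrame(data)
print(df)
```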

2019-12-30 01:35:57

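For efficient appending, see "How to add an extra row to a pandas dataframe" and "Setting With Enlargement".

Rows can be added by setting through loc (or ix, in the old pandas versions that still have it) on a key that does not yet exist in the index. For example: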

In [1]: se = pd.Series([1,2,3])

In [2]: se
Out[2]:
0    1
1    2
2    3
dtype: int64

In [3]: se[5] = 5.

In [4]: se
Out[4]:
0    1.0
1    2.0
2    3.0
5    5.0
dtype: float64

Or:

In [1]: dfi = pd.DataFrame(np.arange(6).reshape(3,2),
   .....:                 columns=['A','B'])
   .....:

In [2]: dfi
Out[2]:
   A  B
0  0  1
1  2  3
2  4  5

In [3]: dfi.loc[:,'C'] = dfi.loc[:,'A']

In [4]: dfi
Out[4]:
   A  B  C
0  0  1  0
1  2  3  2
2  4  5  4

In [5]: dfi.loc[3] = 5

In [6]: dfi
Out[6]:
   A  B  C
0  0  1  0
1  2  3  2
2  4  5  4
3  5  5  5

2014-04-30 17:31:04

Another way to do it (probably not very efficient):

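import pandas as pd

# add a row
def add_row(df, row):
    colnames = list(df.columns)
    ncol = len(colnames)
    # wrap row in a 1-tuple so that a tuple row is formatted as a single value
    assert ncol == len(row), "Length of row must be the same as width of DataFrame: %s" % (row,)
    # DataFrame.append was removed in pandas 2.0; pd.concat does the same job
    return pd.concat([df, pd.DataFrame([row], columns=colnames)])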

You can also enhance the DataFrame class like this:

import pandas as pd
def add_row(self, row):
    self.loc[len(self.index)] = row
pd.DataFrame.add_row = add_row

2016-11-11 18:18:09
