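How to add a new column to an existing DataFrame?

I have the following indexed DataFrame with named columns and row labels that are not consecutive numbers: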

          a         b         c         d
2  0.671399  0.101208 -0.181532  0.241273
3  0.446172 -0.243316  0.051767  1.577318
5  0.614758  0.075793 -0.451460 -0.012493

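I would like to add a new column, 'e', to the existing DataFrame without changing anything else in it (i.e., the new column always has the same length as the DataFrame).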

0   -0.335485
1   -1.166658
2   -0.385571
dtype: float64

How can I add column e to the above example?

Current answer

To insert a new column at a given location (0 <= loc <= number of columns) in a DataFrame, just use DataFrame.insert:

DataFrame.insert(loc, column, value)

So, if you want to add column e at the end of a DataFrame called df, you can use:

e = [-0.335485, -1.166658, -0.385571]
df.insert(loc=len(df.columns), column='e', value=e)

value can be a Series, an integer (in which case all cells are filled with this one value), or an array-like structure.
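As a rough sketch of how this plays out on the question's data, the snippet below rebuilds the frame from the question and inserts a list at the end and a scalar at the front. The column name `flag` is made up for illustration and is not part of the original answer.

```python
import pandas as pd

# Frame from the question, with its non-contiguous index 2, 3, 5.
df = pd.DataFrame(
    {
        "a": [0.671399, 0.446172, 0.614758],
        "b": [0.101208, -0.243316, 0.075793],
        "c": [-0.181532, 0.051767, -0.451460],
        "d": [0.241273, 1.577318, -0.012493],
    },
    index=[2, 3, 5],
)

# A list is assigned by position, so the index of df does not matter.
e = [-0.335485, -1.166658, -0.385571]
df.insert(loc=len(df.columns), column="e", value=e)

# A scalar is broadcast to every row; "flag" is a hypothetical column name.
df.insert(loc=0, column="flag", value=0)

print(df.columns.tolist())  # ['flag', 'a', 'b', 'c', 'd', 'e']
```

Note that insert() modifies df in place and raises a ValueError if the column already exists, unless allow_duplicates=True is passed.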

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.insert.html

2019-04-07 15:12:59

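Other answers

Let me just add that, as with hum3, .loc did not solve the SettingWithCopyWarning, and I had to resort to df.insert(). In my case the false positive was generated by "fake" chained indexing dict['a']['e'], where 'e' is the new column and dict['a'] is a DataFrame coming from a dictionary.

Also note that if you know what you are doing, you can switch off the warning with pd.options.mode.chained_assignment = None and then use one of the other solutions given here.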
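A minimal sketch of the insert() route for a frame held in a dictionary follows. The dictionary `frames`, the column `x`, and the values are hypothetical stand-ins for the answer's dict['a'], not the author's actual data.

```python
import pandas as pd

# Hypothetical dictionary of frames, standing in for dict['a'] in the answer.
frames = {"a": pd.DataFrame({"x": [1, 2, 3]}, index=[2, 3, 5])}

# insert() adds the column in place on the frame stored in the dictionary,
# so no chained [] assignment is involved.
target = frames["a"]
target.insert(len(target.columns), "e", [0.1, 0.2, 0.3])

print(frames["a"].columns.tolist())  # ['x', 'e']
```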

2015-10-22 14:21:45

For completeness, here is another solution using the DataFrame.eval() method:

Data:

In [44]: e
Out[44]:
0    1.225506
1   -1.033944
2   -0.498953
3   -0.373332
4    0.615030
5   -0.622436
dtype: float64

In [45]: df1
Out[45]:
          a         b         c         d
0 -0.634222 -0.103264  0.745069  0.801288
4  0.782387 -0.090279  0.757662 -0.602408
5 -0.117456  2.124496  1.057301  0.765466
7  0.767532  0.104304 -0.586850  1.051297
8 -0.103272  0.958334  1.163092  1.182315
9 -0.616254  0.296678 -0.112027  0.679112

Solution:

In [46]: df1.eval("e = @e.values", inplace=True)

In [47]: df1
Out[47]:
          a         b         c         d         e
0 -0.634222 -0.103264  0.745069  0.801288  1.225506
4  0.782387 -0.090279  0.757662 -0.602408 -1.033944
5 -0.117456  2.124496  1.057301  0.765466 -0.498953
7  0.767532  0.104304 -0.586850  1.051297 -0.373332
8 -0.103272  0.958334  1.163092  1.182315  0.615030
9 -0.616254  0.296678 -0.112027  0.679112 -0.622436
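The .values part matters here: it strips the index from e, so the values are assigned row by row rather than matched on labels. The sketch below shows the difference without eval(), using the answer's e and only column a of df1 to keep it short. Using to_numpy() in place of .values is my substitution, not part of the original answer.

```python
import pandas as pd

# Data copied from the answer above (only column "a" of df1 is kept).
e = pd.Series([1.225506, -1.033944, -0.498953, -0.373332, 0.615030, -0.622436])
df1 = pd.DataFrame(
    {"a": [-0.634222, 0.782387, -0.117456, 0.767532, -0.103272, -0.616254]},
    index=[0, 4, 5, 7, 8, 9],
)

# Assigning the Series itself aligns on labels: only 0, 4 and 5 exist in e,
# so rows 7, 8 and 9 end up as NaN.
aligned = df1.copy()
aligned["e"] = e

# Stripping the index (as @e.values does) assigns row by row instead.
positional = df1.copy()
positional["e"] = e.to_numpy()

print(aligned["e"].isna().sum())               # 3
print(positional["e"].tolist() == e.tolist())  # True
```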

2017-03-14 21:49:44

One thing to note, though, is that if you do

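df1['e'] = pd.Series(np.random.randn(sLength), index=df1.index)

this is effectively a left join on df1.index. So if you want an outer-join effect, my probably imperfect solution is to create a DataFrame whose index covers all the index values in your data, and then use the code above. For example,

data = pd.DataFrame(index=all_possible_values)
df1['e'] = pd.Series(np.random.randn(sLength), index=df1.index)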
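To make the left-join versus outer-join distinction concrete, here is a small sketch with made-up values. Reindexing to the union of both indexes is one possible way to get the outer-join effect; it is an assumption about what the answer intends, not its exact code.

```python
import pandas as pd

# Made-up data: the two indexes overlap only on labels 3 and 5.
df1 = pd.DataFrame({"a": [1.0, 2.0, 3.0]}, index=[2, 3, 5])
e = pd.Series([10.0, 20.0, 30.0], index=[3, 5, 7], name="e")

# Plain assignment keeps df1's index: label 7 is dropped, label 2 gets NaN.
left = df1.copy()
left["e"] = e

# Reindexing to the union of both indexes first keeps every label.
all_labels = df1.index.union(e.index)
outer = df1.reindex(all_labels)
outer["e"] = e

print(left.index.tolist())   # [2, 3, 5]
print(outer.index.tolist())  # [2, 3, 5, 7]
```

If the Series has a name, df1.join(e, how="outer") should give the same union of labels in a single call.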

2015-02-20 17:32:19

A simple way to add new columns to an existing DataFrame is:

new_cols = ['a', 'b', 'c', 'd']

for col in new_cols:
    df[col] = 0  # assigning 0 as a placeholder

print(df.columns)
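Note that the names used in the loop above match the columns the question's frame already has, so running it on that frame would overwrite them with 0. A sketch of the same idea with DataFrame.assign() and hypothetical new names `e` and `f`:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3]}, index=[2, 3, 5])

# Hypothetical new column names, chosen so they do not overwrite "a".
new_cols = ["e", "f"]

# assign() returns a new frame; each keyword becomes a column filled with 0.
df = df.assign(**{col: 0 for col in new_cols})

print(df.columns.tolist())  # ['a', 'e', 'f']
```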

2021-09-15 07:54:01

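This is a special case of adding a new column to a pandas DataFrame. Here, I am adding a new feature/column based on the data in an existing column of the DataFrame.

So, let our DataFrame have the columns 'feature_1', 'feature_2', 'probability_score', and we have to add a new column 'predicted_class' based on the data in the 'probability_score' column.

I will use the map() method of the pandas Series and define a function of my own that implements the logic for assigning a particular class label to every row in the DataFrame.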

import pandas as pd

data = pd.read_csv('data.csv')

def myFunction(x):
    # implement your logic here; the 0.5 threshold and the 1/0 labels
    # are placeholders, not part of the original answer
    if x >= 0.5:
        return 1
    return 0

variable_1 = data['probability_score']
predicted_class = variable_1.map(myFunction)

data['predicted_class'] = predicted_class

# check the DataFrame: the new column is derived row by row from the existing column
data.head()
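For a version that runs without data.csv, the sketch below uses made-up rows and an assumed threshold of 0.5. The function name `label_from_score` and the column `predicted_class_np` are hypothetical. The np.where line is a vectorised alternative to map() for the same rule.

```python
import numpy as np
import pandas as pd

# Made-up rows standing in for data.csv.
data = pd.DataFrame(
    {
        "feature_1": [1.0, 2.0, 3.0],
        "feature_2": [0.5, 0.1, 0.9],
        "probability_score": [0.2, 0.5, 0.8],
    }
)

def label_from_score(score, threshold=0.5):
    """Return 1 when the score reaches the (assumed) threshold, else 0."""
    return 1 if score >= threshold else 0

# Element-wise mapping, as in the answer.
data["predicted_class"] = data["probability_score"].map(label_from_score)

# Vectorised equivalent for the same rule.
data["predicted_class_np"] = np.where(data["probability_score"] >= 0.5, 1, 0)

print(data["predicted_class"].tolist())  # [0, 1, 1]
```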

2020-06-19 12:24:35
