What does axis in pandas mean?

Here is my code to generate a dataframe:

import pandas as pd
import numpy as np

dff = pd.DataFrame(np.random.randn(1,2),columns=list('AB'))

Then I got the dataframe:

+------------+---------+--------+
|            |  A      |  B     |
+------------+---------+--------+
|      0     | 0.626386| 1.52325|
+------------+---------+--------+

When I type the command:

dff.mean(axis=1)

I get:

0    1.074821
dtype: float64

According to the pandas reference, axis=1 stands for columns, so I expected the result of the command to be

A    0.626386
B    1.523255
dtype: float64

So here is my question: what does axis in pandas mean?

Current answer

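This is how I understand it:

Say your operation requires traversing a dataframe from left to right or right to left: you are evidently merging columns, i.e. operating across the various columns. This is axis=1.

Example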

df = pd.DataFrame(np.arange(12).reshape(3,4),columns=['A', 'B', 'C', 'D'])
print(df)
   A  B   C   D
0  0  1   2   3
1  4  5   6   7
2  8  9  10  11 

df.mean(axis=1)

0    1.5
1    5.5
2    9.5
dtype: float64

df.drop(['A','B'],axis=1,inplace=True)

    C   D
0   2   3
1   6   7
2  10  11

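The point to note here is that we are operating on columns.

Similarly, if your operation requires traversing a dataframe from top to bottom or bottom to top, you are merging rows. This is axis=0.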
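To make the two directions concrete, here is a small sketch that rebuilds the same 3x4 frame and takes the mean along each axis. The names row_means and col_means are mine for illustration, not part of the answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(12).reshape(3, 4), columns=['A', 'B', 'C', 'D'])

# axis=1: walk left to right, combining the columns -> one value per row
row_means = df.mean(axis=1)

# axis=0: walk top to bottom, combining the rows -> one value per column
col_means = df.mean(axis=0)

print(row_means)
print(col_means)
```

row_means is indexed by the row labels 0, 1, 2, while col_means is indexed by the column names A to D.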

2018-12-28 04:06:23

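Other answers

There are two most common usages of axis in pandas:

used as an index, like df.iloc[0, 1]
used as an argument inside a function, like df.mean(axis=1)

When used as an index, we can read axis=0 as rows and axis=1 as columns, i.e. df.iloc[rows, columns]. So df.iloc[0, 1] means selecting the data from row 0 and column 1; in this case it returns 1.52325.

When used as an argument, axis=0 means selecting objects vertically across rows, while axis=1 means selecting objects horizontally across columns.

So df.mean(axis=1) means calculating the mean horizontally across columns, and it returns: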

0    1.074821
dtype: float64
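The two usages can be put side by side in a short sketch. It assumes a one-row frame filled with the fixed values printed in the question (instead of random numbers) so that the result is reproducible; the names cell and row_mean are illustrative only.

```python
import pandas as pd

# Fixed values copied from the question so the output is reproducible
dff = pd.DataFrame([[0.626386, 1.523255]], columns=list('AB'))

# As an index: the first position is the row (axis 0),
# the second position is the column (axis 1)
cell = dff.iloc[0, 1]

# As an argument: axis=1 averages across the columns, one result per row
row_mean = dff.mean(axis=1)

print(cell)
print(row_mean)
```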

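The general purpose of axis is to select the specific data to operate on. The key to understanding axis is to separate the process of "selecting" from that of "operating".

Let's explain it with one extra case: df.drop('A', axis=1)

The operation is df.drop(), which requires the name of the intended column, 'A' in this case. It is not the same as df.mean(), which operates on the data content. What is selected is the name of the column, not the data content of the column. Since all column names are arranged horizontally across the columns, we use axis=1 to select the name object.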
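A minimal sketch of the drop case, assuming a small 3x4 frame like the one in the first answer (the variable names are illustrative). It shows that axis only decides where the label is looked up:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(12).reshape(3, 4), columns=['A', 'B', 'C', 'D'])

# axis=1 tells drop() to look for the label 'A' among the column names
without_a = df.drop('A', axis=1)

# Equivalent spelling that names the target directly
same_result = df.drop(columns='A')

# axis=0 looks for the label in the row index instead
without_row_0 = df.drop(0, axis=0)

print(without_a)
print(without_row_0)
```

Without inplace=True, drop() returns a new frame and leaves df untouched.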

In short, we had better separate "selecting" from "operating" so as to have a clear view of:

what object to select
how it is arranged

2021-10-04 05:56:34

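My thinking: axis = n, where n = 0, 1, etc., means that the matrix is collapsed (folded) along that axis. So in a 2D matrix, when you collapse along 0 (rows), you are really operating on one column at a time. The same holds for higher-order matrices.

This is not the same as the normal reference to a dimension in a matrix, where 0 -> row and 1 -> column. The same holds for the other dimensions of an N-dimensional array.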
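The "collapse" reading can be checked on a tiny matrix. This is only a sketch with made-up values, using sum as the collapsing operation:

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])          # shape (2, 3): 2 rows, 3 columns

# Collapse axis 0 (the rows): each column is reduced to a single number
collapsed_rows = m.sum(axis=0)     # shape (3,)

# Collapse axis 1 (the columns): each row is reduced to a single number
collapsed_cols = m.sum(axis=1)     # shape (2,)

print(collapsed_rows)
print(collapsed_cols)
```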

2018-04-25 09:40:07

In programming, axis is the position in the shape tuple. Here is an example:

import numpy as np

a=np.arange(120).reshape(2,3,4,5)

a.shape
Out[3]: (2, 3, 4, 5)

np.sum(a,axis=0).shape
Out[4]: (3, 4, 5)

np.sum(a,axis=1).shape
Out[5]: (2, 4, 5)

np.sum(a,axis=2).shape
Out[6]: (2, 3, 5)

np.sum(a,axis=3).shape
Out[7]: (2, 3, 4)

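Reducing along an axis (a sum here, and likewise a mean) causes that dimension to be removed.

Referring to the original question, dff has shape (1, 2). Using axis=1 changes the shape to (1,).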
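The same shape bookkeeping can be checked directly on the frame from the question. The values are random, but the shapes are not:

```python
import numpy as np
import pandas as pd

dff = pd.DataFrame(np.random.randn(1, 2), columns=list('AB'))

shape_before = dff.shape               # (1, 2)
shape_axis1 = dff.mean(axis=1).shape   # (1,): axis 1 (length 2) is removed
shape_axis0 = dff.mean(axis=0).shape   # (2,): axis 0 (length 1) is removed

print(shape_before, shape_axis1, shape_axis0)
```

So axis=0 is what produces the two-element result, labelled A and B, that the question expected.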

2017-07-14 10:38:21

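Many answers here helped me a lot!

In case you get confused by the different behaviours of axis in Python and MARGIN in R (as in the apply function), you may find a blog post that I wrote of interest: https://accio.github.io/programming/2020/05/19/numpy-pandas-axis.html.

In essence: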

Their behaviours are, intriguingly, easier to understand with three-dimensional arrays than with two-dimensional ones. In the Python packages numpy and pandas, the axis parameter of a reduction such as mean tells numpy to calculate the mean of all values that can be fetched in the form array[0, 0, ..., i, ..., 0], where i iterates through all possible values. The process is repeated with the position of i fixed while the indices of the other dimensions vary one after the other (starting from the rightmost one). The result is an (n-1)-dimensional array.

In R, the MARGIN parameter lets the apply function calculate the mean of all values that can be fetched in the form array[, ..., i, ...,], where i iterates through all possible values. The process is not repeated once all values of i have been iterated. Therefore, the result is a simple vector.
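The Python half of this description can be sketched on a small three-dimensional array (the R half is not shown). The array contents and the names reduced and by_hand are assumptions made for illustration:

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)

# Reducing over axis=1 lets the middle index i run over all its values
# while the other indices stay fixed; this is repeated for every
# combination of the first and last index.
reduced = a.mean(axis=1)            # shape (2, 4)

# One entry computed by hand: fix first index 0 and last index 0, vary i
by_hand = np.mean([a[0, i, 0] for i in range(a.shape[1])])

print(reduced.shape)
print(by_hand, reduced[0, 0])
```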

2020-05-19 14:27:06

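I think there is another way to understand it.

For an np.array, if we want to eliminate the columns we use axis = 1; if we want to eliminate the rows, we use axis = 0.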

np.mean(np.ones((3, 5, 10)), axis=0).shape       # (5, 10)
np.mean(np.ones((3, 5, 10)), axis=1).shape       # (3, 10)
np.mean(np.ones((3, 5, 10)), axis=(0, 1)).shape  # (10,)

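For pandas objects, axis = 0 stands for a row-wise operation and axis = 1 stands for a column-wise operation. This differs from numpy's definition; we can check the definitions in numpy.doc and pandas.doc.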

2019-04-30 12:02:56
