Renaming column names in Pandas

I want to change the column labels of a Pandas DataFrame from

['$a', '$b', '$c', '$d', '$e']

to

['a', 'b', 'c', 'd', 'e']

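Current answer

Let's understand renaming with a small example.

Renaming columns using a mapping:

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})  # create a df with column names A and B
df.rename({"A": "new_A", "B": "new_B"}, axis='columns', inplace=True)  # rename column A to 'new_A' and column B to 'new_B'

Output:

   new_A  new_B
0      1      4
1      2      5
2      3      6

Renaming the index (row names) using a mapping:

df.rename({0: "x", 1: "y", 2: "z"}, axis='index', inplace=True)  # row names are replaced by 'x', 'y' and 'z'

Output:

   new_A  new_B
x      1      4
y      2      5
z      3      6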
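As a minimal sketch tying this back to the question (the DataFrame below is made up for illustration), rename also accepts a function instead of a dict. The function is applied to every label, and when inplace is not set a new DataFrame is returned:

```python
import pandas as pd

# Hypothetical data using column labels like those in the question
df = pd.DataFrame({"$a": [1, 2], "$b": [3, 4], "$c": [5, 6]})

# The callable is applied to each column label; the original df is left unchanged
renamed = df.rename(columns=lambda label: label.lstrip("$"))

print(list(df.columns))       # labels of the original frame
print(list(renamed.columns))  # labels of the renamed copy
```

This assumes every label is a string; a non-string label would need its own handling inside the function.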

2020-03-08 05:35:42

Other answers

Just assign the new names to the .columns attribute:

>>> df = pd.DataFrame({'$a':[1,2], '$b': [10,20]})
>>> df
   $a  $b
0   1  10
1   2  20

>>> df.columns = ['a', 'b']
>>> df
   a   b
0  1  10
1  2  20
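When the new labels can be derived from the old ones, as in the question, the list does not have to be typed by hand. A small sketch, assuming every label is a string and the only change needed is dropping the '$' character:

```python
import pandas as pd

df = pd.DataFrame({'$a': [1, 2], '$b': [10, 20]})

# .str methods work on the column Index; regex=False treats '$' as a literal character
df.columns = df.columns.str.replace('$', '', regex=False)

print(df.columns.tolist())
```

Note that this removes every '$' in a label, not only a leading one.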

2012-07-05 14:23:27

Column names vs. names of Series

I would like to explain a bit what happens behind the scenes.

A DataFrame is a set of Series.

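A Series is in turn built on top of a numpy.array.

A Series has a .name attribute (a plain numpy.array does not).

This is the name of the Series. Pandas seldom respects this attribute, but it lingers in places and can be used to hack some Pandas behaviors.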
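To make the .name attribute concrete, here is a small sketch (the frame and its values are made up for illustration). Selecting a column gives a Series whose .name is the column label; the attribute lives on the Series, and a plain NumPy array obtained from it has no .name:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"one": [4, 5, 6], "two": [1, 2, 3]})

col = df["one"]        # a pandas Series
arr = col.to_numpy()   # the underlying NumPy array

print(type(col).__name__, col.name)                        # the Series remembers its label
print(isinstance(arr, np.ndarray), hasattr(arr, "name"))   # the array has no .name
```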

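Naming the list of columns

A lot of answers here talk about the df.columns attribute being a list, when in fact it is an Index object, which behaves much like a Series. This means that it has a .name attribute.

This is what happens if you decide to fill in the name of the columns Index: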

df.columns = ['column_one', 'column_two']
df.columns.names = ['name of the list of columns']
df.index.names = ['name of the index']

name of the list of columns     column_one  column_two
name of the index
0                                    4           1
1                                    5           2
2                                    6           3

Note that the name of the index is always printed one line lower than the name of the columns.
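For anyone who wants to reproduce this, a self-contained sketch follows. The data is assumed from the printed output above, and it sets the singular .name attribute rather than .names, which should be equivalent for a single-level index:

```python
import pandas as pd

df = pd.DataFrame({"column_one": [4, 5, 6], "column_two": [1, 2, 3]})

print(type(df.columns).__name__)  # the columns object is an Index, not a list

# Name the column axis and the row axis
df.columns.name = "name of the list of columns"
df.index.name = "name of the index"

print(df)
```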

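Artifacts that linger

The .name attribute sometimes lingers on. If you set df.columns = ['one', 'two'], then df.one.name will be 'one'.

If you set df.one.name = 'three', then df.columns will still give you ['one', 'two'], and df.one.name will give you 'three'.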

BUT

pd.DataFrame(df.one) will return a DataFrame whose only column is named 'three', because Pandas reuses the .name of the already defined Series.
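A hedged sketch of this effect is below. It keeps an explicit reference to the Series instead of going through df.one each time, on the assumption that whether repeated attribute access returns the same object may depend on the pandas version:

```python
import pandas as pd

df = pd.DataFrame({"one": [4, 5, 6], "two": [1, 2, 3]})

s = df["one"]      # keep an explicit reference to the column Series
s.name = "three"   # rename the Series only, not the column in df

print(list(df.columns))               # the frame's own labels are untouched
print(list(pd.DataFrame(s).columns))  # the new frame picks up the Series name
```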

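Multi-level column names

Pandas has ways of doing multi-level column names. There is not much magic involved, but I wanted to cover this in my answer too, since I don't see anyone picking up on it here.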

    |one            |
    |one      |two  |
0   |  4      |  1  |
1   |  5      |  2  |
2   |  6      |  3  |

This is easily achieved by setting the columns to a list of lists, like this:

df.columns = [['one', 'one'], ['one', 'two']]
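The same result can be sketched with the explicit constructor pd.MultiIndex.from_arrays. That constructor is my choice here for clarity (the answer assigns the nested list directly), and the starting frame with columns 'a' and 'b' is made up for illustration:

```python
import pandas as pd

df = pd.DataFrame([[4, 1], [5, 2], [6, 3]], columns=["a", "b"])

# One inner list per level: first the top-level labels, then the second-level labels
df.columns = pd.MultiIndex.from_arrays([["one", "one"], ["one", "two"]])

print(df.columns.nlevels)            # number of column levels
print(df[("one", "two")].tolist())   # select a column by its full tuple label
```

With two levels, each column is addressed by a tuple such as ('one', 'two').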

2016-09-29 12:30:40

Here's a nifty little function I like to use to cut down on typing:

def rename(data, oldnames, newname):
    # Input can be a single string or a list of strings
    if isinstance(oldnames, str):
        oldnames = [oldnames]  # When renaming multiple columns,
        newname = [newname]    # pass the corresponding list of new names
    for i, name in enumerate(oldnames):
        # Substring match: doesn't have to be an exact match
        oldvar = [c for c in data.columns if name in c]
        if len(oldvar) == 0:
            raise ValueError("Sorry, couldn't find that column in the dataset")
        if len(oldvar) > 1:
            print("Found multiple columns that matched " + str(name) + ": ")
            for idx, c in enumerate(oldvar):
                print(str(idx) + ": " + str(c))
            ind = input('Please enter the index of the column you would like to rename: ')
            oldvar = oldvar[int(ind)]
        else:
            # Exactly one match: unwrap it from the list
            oldvar = oldvar[0]
        data = data.rename(columns={oldvar: newname[i]})
    return data

Here is an example of how it works:

In [2]: df = pd.DataFrame(np.random.randint(0, 10, size=(10, 4)), columns = ['col1', 'col2', 'omg', 'idk'])
# First list = existing variables
# Second list = new names for those variables
In [3]: df = rename(df, ['col', 'omg'],['first', 'ohmy'])
Found multiple columns that matched col:
0: col1
1: col2

Please enter the index of the column you would like to rename: 0

In [4]: df.columns
Out[4]: Index(['first', 'col2', 'ohmy', 'idk'], dtype='object')

2018-04-19 07:48:53

If you already have a list of the new column names, you can try this:

new_cols = ['a', 'b', 'c', 'd', 'e']
new_names_map = {df.columns[i]:new_cols[i] for i in range(len(new_cols))}

df.rename(new_names_map, axis=1, inplace=True)
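A slightly shorter way to build the same mapping is to pair old and new labels with zip. This is only a sketch: the one-row DataFrame is made up for illustration, and it assumes new_cols is listed in the same order as the existing columns:

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3, 4, 5]], columns=['$a', '$b', '$c', '$d', '$e'])
new_cols = ['a', 'b', 'c', 'd', 'e']

# Pair each existing label with its replacement, in positional order
new_names_map = dict(zip(df.columns, new_cols))

df = df.rename(columns=new_names_map)
print(df.columns.tolist())
```

zip stops at the shorter of the two sequences, so any columns beyond the end of new_cols keep their old names.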

2021-06-10 03:46:32

Assuming you can use a regular expression, this solution avoids writing the new names by hand by extracting them with a regex:

import pandas as pd
import re

srch = re.compile(r"\w+")

data = pd.read_csv("CSV_FILE.csv")
cols = data.columns
new_cols = [srch.search(c).group() for c in cols]  # first run of word characters in each name
data.columns = new_cols
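To try this without a CSV file, here is a self-contained sketch. The in-memory DataFrame stands in for CSV_FILE.csv, and the fallback for labels that contain no word characters is my own addition, not part of the answer above:

```python
import re
import pandas as pd

srch = re.compile(r"\w+")

# In-memory stand-in for the CSV file, with made-up values
data = pd.DataFrame([[1, 2, 3]], columns=["$a", "$b", "$c"])

# Keep the first run of word characters; fall back to the original label if there is none
new_cols = []
for col in data.columns:
    match = srch.search(col)
    new_cols.append(match.group() if match else col)

data.columns = new_cols
print(list(data.columns))
```

Keep in mind that \w+ keeps only the first run of word characters, so a label such as 'my col' would be shortened to 'my'.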

2019-04-11 15:08:57
