Pandas: How to Remove Rows with Duplicate Indices from a Series

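Calling the duplicated() method on the index of a pandas Series makes it easy to identify repeated index labels. The method is applied to the index (series.index), not passed to the Series constructor. A Series also has its own duplicated() method, but that one flags repeated values rather than repeated index labels.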

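When called on an index, duplicated() returns a boolean NumPy array with one entry per label. With the default keep="first", False marks the first occurrence of each label (including labels that occur only once), and True marks every later repeat.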
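As a minimal sketch of the mask described above, the snippet below prints what duplicated() returns for each setting of the keep parameter. The variable names (mask_first, mask_last, mask_all) are chosen here for illustration only.

```python
import pandas as pd

series = pd.Series(["a", "b", "c", "d", "e"], index=[1, 2, 1, 3, 2])

# True marks a label that should be treated as a duplicate
mask_first = series.index.duplicated(keep="first")  # later repeats are True
mask_last = series.index.duplicated(keep="last")    # earlier repeats are True
mask_all = series.index.duplicated(keep=False)      # every repeated label is True

print(mask_first)  # [False False  True False  True]
print(mask_last)   # [ True  True False False False]
print(mask_all)    # [ True  True  True False  True]
```

Each mask has the same length as the Series, so it can be inverted and used directly for boolean indexing.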

Example 1

In this example, we remove the rows of a Series that have duplicate index labels.

# importing pandas package
import pandas as pd

# create a series with repeated index labels
series = pd.Series(["a", "b", "c", "d", "e"], index=[1, 2, 1, 3, 2])

print(series)

# getting the index data
index = series.index

# keep only the first row for each index label
result = series[~index.duplicated(keep="first")]

print(result)

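Explanation

First, we create a pandas Series with the index labels [1, 2, 1, 3, 2] using pandas.Series(). We then call duplicated() on the index to identify the repeated labels.

Next, we apply the "~" operator to invert the boolean mask and use the inverted mask to select from the original Series. The result is a new Series in which each index label appears only once; the original Series is not modified.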

Output

The output is as follows:

1    a
2    b
1    c
3    d
2    e
dtype: object

1    a
2    b
3    d
dtype: object

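The output shows the original Series first, followed by the resulting Series without duplicate labels. For labels 1 and 2, only the first row ("a" and "b") is kept.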
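Example 1 keeps the first row for each label. As a hedged sketch of the other choices, the same Series can be filtered with keep="last" or keep=False; the names keep_last and drop_all are illustrative and do not come from the original example.

```python
import pandas as pd

series = pd.Series(["a", "b", "c", "d", "e"], index=[1, 2, 1, 3, 2])

# keep the last row for each label instead of the first
keep_last = series[~series.index.duplicated(keep="last")]

# drop every row whose label occurs more than once
drop_all = series[~series.index.duplicated(keep=False)]

print(keep_last)
print(drop_all)
```

Here keep_last holds the rows labelled 1, 3 and 2 with the values "c", "d" and "e", while drop_all holds only the row labelled 3.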

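Example 2

Let's look at another example of removing rows with duplicate index labels, this time with string labels and random integer values.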

# importing package
import pandas as pd
import numpy as np

# creating pandas series
series = pd.Series(np.random.randint(1,100,10),
   index=["a", "b", "a", "d", "c", "e", "f", "c", "d", "e"])

print(series)

# getting the index data
index = series.index

# keep only the first row for each index label
result = series[~index.duplicated(keep="first")]

print(result)

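Explanation

First, we create a Series with string index labels, then call duplicated() on the index to identify the repeated labels. Inverting the mask with "~" and indexing the Series with it keeps only the first row for each label.

Output

The output is as follows: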

a    66
b    73
a    83
d    63
c    23
e    56
f    55
c    22
d    26
e    20
dtype: int32

a    66
b    73
d    63
c    23
e    56
f    55
dtype: int32

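The labels a, d, c, and e occur more than once in the original Series. In the result, only the first row for each of these labels is kept and the later repeats are removed. Because the values are generated randomly, the numbers differ from run to run, and the dtype shown (int32 here) may be int64 on other platforms.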
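Since the values in Example 2 change on every run, a reproducible variant can be easier to check. The sketch below assumes a seeded NumPy generator (the seed 0 is arbitrary) and verifies that the filtered index is unique. It also shows groupby(level=0, sort=False).first() as one possible alternative that is not part of the original example. Note that first() skips missing values, so the two approaches can differ when the data contains NaN.

```python
import numpy as np
import pandas as pd

# seeded generator so the values are the same on every run (seed is arbitrary)
rng = np.random.default_rng(0)
labels = ["a", "b", "a", "d", "c", "e", "f", "c", "d", "e"]
series = pd.Series(rng.integers(1, 100, size=10), index=labels)

# boolean-mask approach from the examples above
result = series[~series.index.duplicated(keep="first")]
print(result.index.is_unique)

# alternative: group rows by index label and take the first value of each group
alternative = series.groupby(level=0, sort=False).first()
print(alternative.tolist() == result.tolist())
```

The mask-based form is usually the more direct choice when the goal is only to drop repeated labels; the groupby form is more convenient when the repeats should be aggregated instead, for example with sum() or mean().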