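Removing Duplicate Dictionaries from a List in Python
Python is a language widely used for web development, data science, machine learning, and automation. It offers several data types for storing data, such as lists, dictionaries, and sets. Dictionaries are mutable, so their contents can be edited and changed as needed.
This article presents several ways to remove duplicate dictionaries from a list. Because dictionaries are mutable, they are not hashable and cannot be placed in a set directly, so each method relies on a different Python feature to detect and drop the duplicates.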
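Methods for removing duplicate dictionaries
List comprehension
Dictionaries can be compared with ==, but they cannot be stored in a set, which is what makes duplicate detection fast. We therefore convert each dictionary into a hashable form, here a sorted tuple of its items, and compare those forms instead. The following example illustrates this:
Example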
def all_duplicate(whole_dict):
    same = set()  # Sorted-item tuples of the dictionaries already seen
    # Each dictionary is converted to a sorted tuple of its items, which is hashable.
    # A tuple already in the set marks a duplicate and is skipped. Otherwise it is
    # recorded (set.add() returns None, so "not same.add(key)" is always True)
    # and converted back into a dictionary.
    return [
        dict(key)
        for key in (tuple(sorted(dupl.items())) for dupl in whole_dict)
        if key not in same and not same.add(key)
    ]

# Example
Whole_Dictionary = [
    {"Place": "Haldwani", "State": 'Uttrakhand'},
    {"Place": "Hisar", "State": 'Haryana'},
    {"Place": "Shillong", "State": 'Meghalaya'},
    {"Place": "Kochi", "State": 'Kerala'},
    {"Place": "Bhopal", "State": 'Madhya Pradesh'},
    {"Place": "Kochi", "State": 'Kerala'},  # Duplicate entry that will be removed
    {"Place": "Haridwar", "State": 'Uttarakhand'}
]
Final_Dict = all_duplicate(Whole_Dictionary)
print(Final_Dict)  # Prints the list with the duplicate dictionary removed
Output
The output of the above example is:
[{'Place': 'Haldwani', 'State': 'Uttrakhand'}, {'Place': 'Hisar', 'State': 'Haryana'}, {'Place': 'Shillong', 'State': 'Meghalaya'}, {'Place': 'Kochi', 'State': 'Kerala'}, {'Place': 'Bhopal', 'State': 'Madhya Pradesh'}, {'Place': 'Haridwar', 'State': 'Uttarakhand'}]
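To see why the conversion step is needed, the following minimal sketch may help. The variable names are illustrative and not part of the example above. It shows that a dictionary cannot be added to a set, while a sorted tuple of its items can, and that the tuple comes out the same regardless of the order in which the keys were inserted.

```python
a = {"Place": "Kochi", "State": "Kerala"}
b = {"State": "Kerala", "Place": "Kochi"}  # Same content, different insertion order

# Dictionaries are mutable and therefore unhashable: putting one in a set fails.
try:
    {a}
    dict_is_hashable = True
except TypeError:
    dict_is_hashable = False

# A sorted tuple of (key, value) pairs is hashable and ignores insertion order.
key_a = tuple(sorted(a.items()))
key_b = tuple(sorted(b.items()))

print(dict_is_hashable)      # False
print(key_a == key_b)        # True
print(len({key_a, key_b}))   # 1
```

Note that this form assumes the keys can be ordered against each other and that every value is itself hashable.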
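The pandas library
This method is most worthwhile when the dataset is large and contains many different elements, that is, for dictionaries holding complex tabular data. For a short list, the overhead of building a DataFrame is rarely justified. The following example shows how pandas can be used:
Example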
import pandas as pd  # pandas must be installed, or this import raises ImportError

def all_duplicate(data):
    dd = pd.DataFrame(data)  # Convert the list of dictionaries into a DataFrame
    dd = dd.drop_duplicates()  # drop_duplicates() removes rows that repeat an earlier row
    return dd.to_dict(orient='records')  # Convert the DataFrame back into a list of dictionaries

# Example
Whole_Dictionary = [
    {"Place": "Haldwani", "State": 'Uttrakhand'},
    {"Place": "Hisar", "State": 'Haryana'},
    {"Place": "Shillong", "State": 'Meghalaya'},
    {"Place": "Kochi", "State": 'Kerala'},
    {"Place": "Bhopal", "State": 'Madhya Pradesh'},
    {"Place": "Kochi", "State": 'Kerala'},  # Duplicate entry that will be removed
    {"Place": "Haridwar", "State": 'Uttarakhand'}
]
Final_Dict = all_duplicate(Whole_Dictionary)
print(Final_Dict)  # Prints the list with the duplicate dictionary removed
Output
[{'Place': 'Haldwani', 'State': 'Uttrakhand'}, {'Place': 'Hisar', 'State': 'Haryana'}, {'Place': 'Shillong', 'State': 'Meghalaya'}, {'Place': 'Kochi', 'State': 'Kerala'}, {'Place': 'Bhopal', 'State': 'Madhya Pradesh'}, {'Place': 'Haridwar', 'State': 'Uttarakhand'}]
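Frozen dictionaries
A frozen dictionary is one way to work around the fact that dictionaries are not hashable. Because it is essentially an immutable form of a dictionary, it can serve as a key in another dictionary or as an element of a set. The third-party frozendict library provides a ready-made implementation. The example below gets a similar effect without that library by applying the built-in frozenset to the dictionary's items:
Example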
def make_hashable(d):
    # Freeze the (key, value) pairs into a frozenset, which is hashable and ignores key order.
    # The frozenset itself is returned instead of hash(...), so two different dictionaries
    # whose hashes happen to collide are not mistaken for duplicates.
    return frozenset(d.items())

def all_duplicate(dicts):
    seen = set()  # Frozen forms of the dictionaries already seen
    # A dictionary whose frozen form was already seen is skipped; otherwise it is recorded and kept
    return [d for d in dicts if not (make_hashable(d) in seen or seen.add(make_hashable(d)))]

# Example
Whole_Dictionary = [
    {"Place": "Haldwani", "State": 'Uttrakhand'},
    {"Place": "Hisar", "State": 'Haryana'},
    {"Place": "Shillong", "State": 'Meghalaya'},
    {"Place": "Kochi", "State": 'Kerala'},
    {"Place": "Bhopal", "State": 'Madhya Pradesh'},
    {"Place": "Kochi", "State": 'Kerala'},  # Duplicate entry that will be removed
    {"Place": "Haridwar", "State": 'Uttarakhand'}
]
Final_Dict = all_duplicate(Whole_Dictionary)
print(Final_Dict)  # Prints the list with the duplicate dictionary removed
Output
[{'Place': 'Haldwani', 'State': 'Uttrakhand'}, {'Place': 'Hisar', 'State': 'Haryana'}, {'Place': 'Shillong', 'State': 'Meghalaya'}, {'Place': 'Kochi', 'State': 'Kerala'}, {'Place': 'Bhopal', 'State': 'Madhya Pradesh'}, {'Place': 'Haridwar', 'State': 'Uttarakhand'}]
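The frozenset approach requires every value in the dictionary to be hashable, so a dictionary containing a list or a nested dictionary would raise a TypeError. One possible extension, sketched here under the assumption that values are built only from dicts, lists, tuples, sets, and hashable scalars, is a recursive helper. The names `freeze` and `remove_duplicates_nested` are hypothetical and do not come from the example above.

```python
def freeze(value):
    """Return a hashable stand-in for value, recursing into containers."""
    if isinstance(value, dict):
        return frozenset((k, freeze(v)) for k, v in value.items())
    if isinstance(value, (list, tuple)):
        return tuple(freeze(v) for v in value)
    if isinstance(value, set):
        return frozenset(freeze(v) for v in value)
    return value  # Assumed to be hashable already (str, int, None, ...)


def remove_duplicates_nested(dicts):
    """Keep the first occurrence of each dictionary, comparing frozen forms."""
    seen = set()
    result = []
    for d in dicts:
        key = freeze(d)
        if key not in seen:
            seen.add(key)
            result.append(d)
    return result


records = [
    {"Place": "Kochi", "Tags": ["port", "coast"]},
    {"Place": "Hisar", "Tags": []},
    {"Tags": ["port", "coast"], "Place": "Kochi"},  # Duplicate of the first entry
]
print(remove_duplicates_nested(records))
# [{'Place': 'Kochi', 'Tags': ['port', 'coast']}, {'Place': 'Hisar', 'Tags': []}]
```

One simplification in this sketch is that a list and a tuple with the same elements freeze to the same value, so dictionaries that differ only in that respect would be treated as duplicates.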
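Helper function
This is a more structured variant of the first method. The conversion is moved into a helper function that turns each dictionary into a sorted tuple of its items. The main function then uses these tuples to find the duplicates and drop them from the list of dictionaries. The following example illustrates this:
Example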
def sorted_dict_to_tuple(d):  # Takes a dictionary and returns its items as a sorted tuple
    return tuple(sorted(d.items()))

def all_duplicates(dicts):  # Walks the list and keeps track of the dictionaries already seen
    seen = set()
    return [d for d in dicts if not (sorted_dict_to_tuple(d) in seen or seen.add(sorted_dict_to_tuple(d)))]

# Example
Whole_Dictionary = [
    {"Place": "Haldwani", "State": 'Uttrakhand'},
    {"Place": "Hisar", "State": 'Haryana'},
    {"Place": "Shillong", "State": 'Meghalaya'},
    {"Place": "Kochi", "State": 'Kerala'},
    {"Place": "Bhopal", "State": 'Madhya Pradesh'},
    {"Place": "Kochi", "State": 'Kerala'},  # Duplicate entry that will be removed
    {"Place": "Haridwar", "State": 'Uttarakhand'}
]
Final_Dict = all_duplicates(Whole_Dictionary)
print(Final_Dict)  # Prints the list with the duplicate dictionary removed
Output
[{'Place': 'Haldwani', 'State': 'Uttrakhand'}, {'Place': 'Hisar', 'State': 'Haryana'}, {'Place': 'Shillong', 'State': 'Meghalaya'}, {'Place': 'Kochi', 'State': 'Kerala'}, {'Place': 'Bhopal', 'State': 'Madhya Pradesh'}, {'Place': 'Haridwar', 'State': 'Uttarakhand'}]
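Because a helper function isolates how a dictionary is turned into a comparable key, the same pattern can be adapted to deduplicate on only some of the fields. The sketch below is one way this might look. `unique_by` and its `fields` parameter are hypothetical names introduced for illustration, and the code assumes every dictionary contains the listed fields with hashable values.

```python
def unique_by(dicts, fields):
    """Keep the first dictionary for each distinct combination of the given fields."""
    seen = set()
    result = []
    for d in dicts:
        key = tuple(d[f] for f in fields)  # Raises KeyError if a field is missing
        if key not in seen:
            seen.add(key)
            result.append(d)
    return result


cities = [
    {"Place": "Kochi", "State": "Kerala"},
    {"Place": "Munnar", "State": "Kerala"},  # Same state as the first entry
    {"Place": "Hisar", "State": "Haryana"},
]
print(unique_by(cities, ["State"]))
# [{'Place': 'Kochi', 'State': 'Kerala'}, {'Place': 'Hisar', 'State': 'Haryana'}]
```

Passing every field name reproduces full-dictionary deduplication for records that share the same keys, while a shorter list keeps one record per group.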
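Conclusion
Following a sound procedure matters, because removing duplicate dictionaries from a list by hand is time-consuming and error-prone. This article covered several methods for eliminating duplicate dictionaries from a list: a list comprehension, the pandas library, frozen dictionaries, and a helper function. Any of them can be used, depending on personal convenience and the application.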