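How to Select Rows in a Pandas DataFrame Based on Conditions
In this tutorial, we will learn how to select rows of a Pandas DataFrame based on conditions using Python.
You can select rows based on the values of a particular column using the comparison operators '>', '<', '==', '<=', '>=', and '!='.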
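To make the idea concrete, here is a minimal sketch of what a comparison operator does to a column. It assumes a shortened, hypothetical copy of the tutorial's data and the conventional `pd` alias; the variable names `df`, `mask`, and `selected` are illustrative only.

```python
import pandas as pd

# Hypothetical three-row sample, used only for illustration
df = pd.DataFrame({
    'Name_1': ['Anuj', 'Ashu', 'Ray'],
    'Percentage_1': [88, 62, 70],
})

# A comparison on a column yields a boolean Series, one value per row
mask = df['Percentage_1'] >= 70
print(mask.tolist())

# Indexing the DataFrame with that mask keeps only the rows marked True
selected = df[mask]
print(selected['Name_1'].tolist())
```

Every condition in the rest of this tutorial follows the same pattern: build a boolean mask, then pass it to the DataFrame.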
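Conditions
We will discuss the different conditions that can be applied to a Pandas DataFrame.
Condition 1
Select all rows in the DataFrame where 'Percentage_1' is greater than 70, using the basic method (plain square-bracket indexing).
Code: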
# First, import pandas
import pandas as pnd
record_1 = {
'Name_1': ['Anuj', 'Ashu', 'Yashi', 'Mark', 'Joshua', 'John', 'Ray', 'Lilly', 'Rose', 'Rachel' ],
'Age_1': [23, 24, 21, 19, 21, 24, 25, 22, 23, 22],
'Subjects_1': ['DBMS', 'ADS', 'ASPM', 'BCM', 'MFCS', 'ADS', 'ASPM', 'TOC', 'Data Mining', 'OOPS'],
'Percentage_1': [88, 62, 85, 71, 55, 78, 70, 66, 71, 89] }
# Now, we are creating a dataframe
Data_Frame = pnd.DataFrame(record_1, columns = ['Name_1', 'Age_1', 'Subjects_1', 'Percentage_1'])
print("Given DataFrame: \n", Data_Frame)
# Then we will select rows based on condition
result_DataFrame = Data_Frame[Data_Frame['Percentage_1'] > 70]
print('\nFollowing is the Result DataFrame: \n', result_DataFrame)
Output:
Given DataFrame:
   Name_1  Age_1   Subjects_1  Percentage_1
0    Anuj     23         DBMS            88
1    Ashu     24          ADS            62
2   Yashi     21         ASPM            85
3    Mark     19          BCM            71
4  Joshua     21         MFCS            55
5    John     24          ADS            78
6     Ray     25         ASPM            70
7   Lilly     22          TOC            66
8    Rose     23  Data Mining            71
9  Rachel     22         OOPS            89
Following is the Result DataFrame:
   Name_1  Age_1   Subjects_1  Percentage_1
0    Anuj     23         DBMS            88
2   Yashi     21         ASPM            85
3    Mark     19          BCM            71
5    John     24          ADS            78
8    Rose     23  Data Mining            71
9  Rachel     22         OOPS            89
Condition 2
Select all rows in the DataFrame where 'Percentage_1' is greater than 70, using the loc[] indexer.
Code:
# First, import pandas
import pandas as pnd
record_1 = {
'Name_1': ['Anuj', 'Ashu', 'Yashi', 'Mark', 'Joshua', 'John', 'Ray', 'Lilly', 'Rose', 'Rachel' ],
'Age_1': [23, 24, 21, 19, 21, 24, 25, 22, 23, 22],
'Subjects_1': ['DBMS', 'ADS', 'ASPM', 'BCM', 'MFCS', 'ADS', 'ASPM', 'TOC', 'Data Mining', 'OOPS'],
'Percentage_1': [88, 62, 85, 71, 55, 78, 70, 66, 71, 89] }
# Now, we are creating a dataframe
Data_Frame = pnd.DataFrame(record_1, columns = ['Name_1', 'Age_1', 'Subjects_1', 'Percentage_1'])
print("Given DataFrame: \n", Data_Frame)
# Then select rows based on the condition, using the loc[] indexer
result_DataFrame = Data_Frame.loc[Data_Frame['Percentage_1'] > 70]
print('\nFollowing is the Result DataFrame: \n', result_DataFrame)
Output:
Given DataFrame:
   Name_1  Age_1   Subjects_1  Percentage_1
0    Anuj     23         DBMS            88
1    Ashu     24          ADS            62
2   Yashi     21         ASPM            85
3    Mark     19          BCM            71
4  Joshua     21         MFCS            55
5    John     24          ADS            78
6     Ray     25         ASPM            70
7   Lilly     22          TOC            66
8    Rose     23  Data Mining            71
9  Rachel     22         OOPS            89
Following is the Result DataFrame:
   Name_1  Age_1   Subjects_1  Percentage_1
0    Anuj     23         DBMS            88
2   Yashi     21         ASPM            85
3    Mark     19          BCM            71
5    John     24          ADS            78
8    Rose     23  Data Mining            71
9  Rachel     22         OOPS            89
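Conditions 1 and 2 return the same rows, so it may help to see where loc[] differs: it also accepts a column selection after the row condition. The following is a minimal sketch under assumptions, using a shortened, hypothetical copy of the tutorial's data and the conventional `pd` alias.

```python
import pandas as pd

# Hypothetical five-row sample based on the tutorial's data
df = pd.DataFrame({
    'Name_1': ['Anuj', 'Ashu', 'Yashi', 'Mark', 'Joshua'],
    'Age_1': [23, 24, 21, 19, 21],
    'Percentage_1': [88, 62, 85, 71, 55],
})

# First argument: boolean row condition; second argument: columns to keep
subset = df.loc[df['Percentage_1'] > 70, ['Name_1', 'Percentage_1']]
print(subset)
```

Plain square-bracket indexing with a boolean mask filters rows only, which is one reason loc[] is often preferred when rows and columns are selected together.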
Condition 3
Select all rows in the DataFrame where 'Percentage_1' is not equal to 71, using the loc[] indexer.
Code:
# First, import pandas
import pandas as pnd
record_1 = {
'Name_1': ['Anuj', 'Ashu', 'Yashi', 'Mark', 'Joshua', 'John', 'Ray', 'Lilly', 'Rose', 'Rachel' ],
'Age_1': [23, 24, 21, 19, 21, 24, 25, 22, 23, 22],
'Subjects_1': ['DBMS', 'ADS', 'ASPM', 'BCM', 'MFCS', 'ADS', 'ASPM', 'TOC', 'Data Mining', 'OOPS'],
'Percentage_1': [88, 62, 85, 71, 55, 78, 70, 66, 71, 89] }
# Now, we are creating a dataframe
Data_Frame = pnd.DataFrame(record_1, columns = ['Name_1', 'Age_1', 'Subjects_1', 'Percentage_1'])
print("Given DataFrame: \n", Data_Frame)
# Then select rows based on the condition, using the loc[] indexer
result_DataFrame = Data_Frame.loc[Data_Frame['Percentage_1'] != 71]
print('\nFollowing is the Result DataFrame: \n', result_DataFrame)
Output:
Given DataFrame:
   Name_1  Age_1   Subjects_1  Percentage_1
0    Anuj     23         DBMS            88
1    Ashu     24          ADS            62
2   Yashi     21         ASPM            85
3    Mark     19          BCM            71
4  Joshua     21         MFCS            55
5    John     24          ADS            78
6     Ray     25         ASPM            70
7   Lilly     22          TOC            66
8    Rose     23  Data Mining            71
9  Rachel     22         OOPS            89
Following is the Result DataFrame:
   Name_1  Age_1  Subjects_1  Percentage_1
0    Anuj     23        DBMS            88
1    Ashu     24         ADS            62
2   Yashi     21        ASPM            85
4  Joshua     21        MFCS            55
5    John     24         ADS            78
6     Ray     25        ASPM            70
7   Lilly     22         TOC            66
9  Rachel     22        OOPS            89
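Next, we will learn how to use the isin() method to select rows whose column value is present in a list.
Condition 4
Using the basic method, select all rows of the given DataFrame whose 'Subjects_1' value is present in the 'Subjects_2' list.
Code: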
# First, import pandas
import pandas as pnd
record_1 = {
'Name_1': ['Anuj', 'Ashu', 'Yashi', 'Mark', 'Joshua', 'John', 'Ray', 'Lilly', 'Rose', 'Rachel' ],
'Age_1': [23, 24, 21, 19, 21, 24, 25, 22, 23, 22],
'Subjects_1': ['DBMS', 'ADS', 'ASPM', 'BCM', 'MFCS', 'ADS', 'ASPM', 'TOC', 'Data Mining', 'OOPS'],
'Percentage_1': [88, 62, 85, 71, 55, 78, 70, 66, 71, 89] }
# Now, we are creating a dataframe
Data_Frame = pnd.DataFrame(record_1, columns = ['Name_1', 'Age_1', 'Subjects_1', 'Percentage_1'])
print("Given DataFrame: \n", Data_Frame)
Subjects_2 = ['ASPM', 'ADS', 'TOC']
# Then select rows based on the condition, using the isin() method
result_DataFrame = Data_Frame[Data_Frame['Subjects_1'].isin(Subjects_2)]
print('\nFollowing is the Result DataFrame: \n', result_DataFrame)
Output:
Given DataFrame:
   Name_1  Age_1   Subjects_1  Percentage_1
0    Anuj     23         DBMS            88
1    Ashu     24          ADS            62
2   Yashi     21         ASPM            85
3    Mark     19          BCM            71
4  Joshua     21         MFCS            55
5    John     24          ADS            78
6     Ray     25         ASPM            70
7   Lilly     22          TOC            66
8    Rose     23  Data Mining            71
9  Rachel     22         OOPS            89
Following is the Result DataFrame:
   Name_1  Age_1  Subjects_1  Percentage_1
1    Ashu     24         ADS            62
2   Yashi     21        ASPM            85
5    John     24         ADS            78
6     Ray     25        ASPM            70
7   Lilly     22         TOC            66
Condition 5
Using the loc[] indexer, select all rows of the given DataFrame whose 'Subjects_1' value is present in the 'Subjects_2' list.
Code:
# First, import pandas
import pandas as pnd
record_1 = {
'Name_1': ['Anuj', 'Ashu', 'Yashi', 'Mark', 'Joshua', 'John', 'Ray', 'Lilly', 'Rose', 'Rachel' ],
'Age_1': [23, 24, 21, 19, 21, 24, 25, 22, 23, 22],
'Subjects_1': ['DBMS', 'ADS', 'ASPM', 'BCM', 'MFCS', 'ADS', 'ASPM', 'TOC', 'Data Mining', 'OOPS'],
'Percentage_1': [88, 62, 85, 71, 55, 78, 70, 66, 71, 89] }
# Now, we are creating a dataframe
Data_Frame = pnd.DataFrame(record_1, columns = ['Name_1', 'Age_1', 'Subjects_1', 'Percentage_1'])
print("Given DataFrame: \n", Data_Frame)
Subjects_2 = ['ASPM', 'ADS', 'TOC']
# Then select rows based on the condition, using loc[] with the isin() method
result_DataFrame = Data_Frame.loc[Data_Frame['Subjects_1'].isin(Subjects_2)]
print('\nFollowing is the Result DataFrame: \n', result_DataFrame)
Output:
Given DataFrame:
   Name_1  Age_1   Subjects_1  Percentage_1
0    Anuj     23         DBMS            88
1    Ashu     24          ADS            62
2   Yashi     21         ASPM            85
3    Mark     19          BCM            71
4  Joshua     21         MFCS            55
5    John     24          ADS            78
6     Ray     25         ASPM            70
7   Lilly     22          TOC            66
8    Rose     23  Data Mining            71
9  Rachel     22         OOPS            89
Following is the Result DataFrame:
   Name_1  Age_1  Subjects_1  Percentage_1
1    Ashu     24         ADS            62
2   Yashi     21        ASPM            85
5    John     24         ADS            78
6     Ray     25        ASPM            70
7   Lilly     22         TOC            66
Condition 6
Using the loc[] indexer, select all rows of the given DataFrame whose 'Subjects_1' value is not present in the 'Subjects_2' list.
Code:
# First, import pandas
import pandas as pnd
record_1 = {
'Name_1': ['Anuj', 'Ashu', 'Yashi', 'Mark', 'Joshua', 'John', 'Ray', 'Lilly', 'Rose', 'Rachel' ],
'Age_1': [23, 24, 21, 19, 21, 24, 25, 22, 23, 22],
'Subjects_1': ['DBMS', 'ADS', 'ASPM', 'BCM', 'MFCS', 'ADS', 'ASPM', 'TOC', 'Data Mining', 'OOPS'],
'Percentage_1': [88, 62, 85, 71, 55, 78, 70, 66, 71, 89] }
# Now, we are creating a dataframe
Data_Frame = pnd.DataFrame(record_1, columns = ['Name_1', 'Age_1', 'Subjects_1', 'Percentage_1'])
print("Given DataFrame: \n", Data_Frame)
Subjects_2 = ['ASPM', 'ADS', 'TOC']
# Then select rows based on the condition: ~ negates the isin() result
result_DataFrame = Data_Frame.loc[~Data_Frame['Subjects_1'].isin(Subjects_2)]
print('\nFollowing is the Result DataFrame: \n', result_DataFrame)
Output:
Given DataFrame:
   Name_1  Age_1   Subjects_1  Percentage_1
0    Anuj     23         DBMS            88
1    Ashu     24          ADS            62
2   Yashi     21         ASPM            85
3    Mark     19          BCM            71
4  Joshua     21         MFCS            55
5    John     24          ADS            78
6     Ray     25         ASPM            70
7   Lilly     22          TOC            66
8    Rose     23  Data Mining            71
9  Rachel     22         OOPS            89
Following is the Result DataFrame:
   Name_1  Age_1   Subjects_1  Percentage_1
0    Anuj     23         DBMS            88
3    Mark     19          BCM            71
4  Joshua     21         MFCS            55
8    Rose     23  Data Mining            71
9  Rachel     22         OOPS            89
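The isin() conditions above rely on the same boolean-mask mechanism as the comparison operators. As a rough sketch, assuming a hypothetical four-value Series and the conventional `pd` alias, this is what isin() and the `~` operator return on their own:

```python
import pandas as pd

# Hypothetical sample of subject values, for illustration only
subjects = pd.Series(['DBMS', 'ADS', 'ASPM', 'BCM'], name='Subjects_1')
wanted = ['ASPM', 'ADS', 'TOC']

# isin() marks each element that appears in the list
in_list = subjects.isin(wanted)
print(in_list.tolist())

# ~ flips every True/False value, giving the "not in list" mask
not_in_list = ~in_list
print(not_in_list.tolist())
```

Values in the list that never occur in the column, such as 'TOC' here, simply match nothing.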
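Next, we will learn how to use the '&' operator to select rows based on conditions on multiple columns.
Condition 7
Using the basic method, select all rows of the given DataFrame where 'Percentage_1' is equal to 71 and the 'Subjects_1' value is present in the 'Subjects_2' list.
Code: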
# First, import pandas
import pandas as pnd
record_1 = {
'Name_1': ['Anuj', 'Ashu', 'Yashi', 'Mark', 'Joshua', 'John', 'Ray', 'Lilly', 'Rose', 'Rachel' ],
'Age_1': [23, 21, 21, 19, 21, 24, 25, 22, 23, 22],
'Subjects_1': ['DBMS', 'ADS', 'ASPM', 'BCM', 'MFCS', 'ADS', 'ASPM', 'TOC', 'Data Mining', 'OOPS'],
'Percentage_1': [88, 71, 71, 82, 55, 78, 70, 66, 71, 89] }
# Now, we are creating a dataframe
Data_Frame = pnd.DataFrame(record_1, columns = ['Name_1', 'Age_1', 'Subjects_1', 'Percentage_1'])
print("Given DataFrame: \n", Data_Frame)
Subjects_2 = ['ASPM', 'ADS', 'TOC']
# Then select rows where both conditions hold, combining them with &
result_DataFrame = Data_Frame[(Data_Frame['Percentage_1'] == 71) &
                              (Data_Frame['Subjects_1'].isin(Subjects_2))]
print('\nFollowing is the Result DataFrame: \n', result_DataFrame)
Output:
Given DataFrame:
   Name_1  Age_1   Subjects_1  Percentage_1
0    Anuj     23         DBMS            88
1    Ashu     21          ADS            71
2   Yashi     21         ASPM            71
3    Mark     19          BCM            82
4  Joshua     21         MFCS            55
5    John     24          ADS            78
6     Ray     25         ASPM            70
7   Lilly     22          TOC            66
8    Rose     23  Data Mining            71
9  Rachel     22         OOPS            89
Following is the Result DataFrame:
   Name_1  Age_1  Subjects_1  Percentage_1
1    Ashu     21         ADS            71
2   Yashi     21        ASPM            71
Condition 8
Using the loc[] indexer, select all rows of the given DataFrame where 'Percentage_1' is equal to 71 and the 'Subjects_1' value is present in the 'Subjects_2' list.
Code:
# First, import pandas
import pandas as pnd
record_1 = {
'Name_1': ['Anuj', 'Ashu', 'Yashi', 'Mark', 'Joshua', 'John', 'Ray', 'Lilly', 'Rose', 'Rachel' ],
'Age_1': [23, 21, 21, 19, 21, 24, 25, 22, 23, 22],
'Subjects_1': ['DBMS', 'ADS', 'ASPM', 'BCM', 'MFCS', 'ADS', 'ASPM', 'TOC', 'Data Mining', 'OOPS'],
'Percentage_1': [88, 71, 71, 82, 55, 78, 70, 66, 71, 89] }
# Now, we are creating a dataframe
Data_Frame = pnd.DataFrame(record_1, columns = ['Name_1', 'Age_1', 'Subjects_1', 'Percentage_1'])
print("Given DataFrame: \n", Data_Frame)
Subjects_2 = ['ASPM', 'ADS', 'TOC']
# Then select rows where both conditions hold, using loc[] and combining them with &
result_DataFrame = Data_Frame.loc[(Data_Frame['Percentage_1'] == 71) &
                                  (Data_Frame['Subjects_1'].isin(Subjects_2))]
print('\nFollowing is the Result DataFrame: \n', result_DataFrame)
Output:
Given DataFrame:
   Name_1  Age_1   Subjects_1  Percentage_1
0    Anuj     23         DBMS            88
1    Ashu     21          ADS            71
2   Yashi     21         ASPM            71
3    Mark     19          BCM            82
4  Joshua     21         MFCS            55
5    John     24          ADS            78
6     Ray     25         ASPM            70
7   Lilly     22          TOC            66
8    Rose     23  Data Mining            71
9  Rachel     22         OOPS            89
Following is the Result DataFrame:
   Name_1  Age_1  Subjects_1  Percentage_1
1    Ashu     21         ADS            71
2   Yashi     21        ASPM            71
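Conditions 7 and 8 combine masks with '&' (logical AND). A related case, not covered above, is the '|' operator (logical OR). The sketch below contrasts the two; it assumes a hypothetical four-row sample and the conventional `pd` alias. Each comparison is wrapped in parentheses because '&' and '|' bind more tightly than '==' in Python, so an unparenthesized `71 & ...` would be evaluated first.

```python
import pandas as pd

# Hypothetical four-row sample, for illustration only
df = pd.DataFrame({
    'Subjects_1': ['DBMS', 'ADS', 'BCM', 'TOC'],
    'Percentage_1': [88, 71, 71, 66],
})
wanted = ['ASPM', 'ADS', 'TOC']

is_71 = df['Percentage_1'] == 71
in_wanted = df['Subjects_1'].isin(wanted)

# & keeps rows where both masks are True
both = df[(is_71) & (in_wanted)]
print(both.index.tolist())

# | keeps rows where at least one mask is True
either = df[(is_71) | (in_wanted)]
print(either.index.tolist())
```

Naming the individual masks first, as done here, is one way to keep longer conditions readable.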
Conclusion
In this tutorial, we discussed how to select rows of a Pandas DataFrame based on different conditions.