Python for Data Science - Removing duplicates

2021-06-11 13:05

阅读:375

标签:--   ble   segment   ted   style   ati   cal   ext   head   

Chapter 2 - Data Preparation Basics

Segment 3 - Removing duplicates

import numpy as np
import pandas as pd

from pandas import Series, DataFrame

Removing duplicates

DF_obj = DataFrame({‘column 1‘:[1,1,2,2,3,3,3],
                    ‘column 2‘:[‘a‘,‘a‘,‘b‘,‘b‘,‘c‘,‘c‘,‘c‘],
                    ‘column 3‘:[‘A‘,‘A‘,‘B‘,‘B‘,‘C‘,‘C‘,‘C‘]})
DF_obj
column 1 column 2 column 3
0 1 a A
1 1 a A
2 2 b B
3 2 b B
4 3 c C
5 3 c C
6 3 c C
DF_obj.duplicated()
0    False
1     True
2    False
3     True
4    False
5     True
6     True
dtype: bool
DF_obj.drop_duplicates()
column 1 column 2 column 3
0 1 a A
2 2 b B
4 3 c C
DF_obj = DataFrame({‘column 1‘:[1,1,2,2,3,3,3],
                    ‘column 2‘:[‘a‘,‘a‘,‘b‘,‘b‘,‘c‘,‘c‘,‘c‘],
                    ‘column 3‘:[‘A‘,‘A‘,‘B‘,‘B‘,‘C‘,‘D‘,‘C‘]})
DF_obj
column 1 column 2 column 3
0 1 a A
1 1 a A
2 2 b B
3 2 b B
4 3 c C
5 3 c D
6 3 c C
DF_obj.drop_duplicates([‘column 3‘])
column 1 column 2 column 3
0 1 a A
2 2 b B
4 3 c C
5 3 c D

Python for Data Science - Removing duplicates

标签:--   ble   segment   ted   style   ati   cal   ext   head   

原文地址:https://www.cnblogs.com/keepmoving1113/p/14222849.html


评论


亲,登录后才可以留言!