Pandas Data Analysis Basics Practice - 5
2021. 9. 22. 17:18 · Big Data Study
Creating and Modifying Rows and Columns
In [3]:
import pandas as pd
In [4]:
friend_dict_list = [
    {'name': 'Jone', 'age': 15, 'job': 'student'},
    {'name': 'Jenny', 'age': 30, 'job': 'developer'},
    {'name': 'Nate', 'age': 30, 'job': 'teacher'},
]
column = ['name', 'age', 'job']
df = pd.DataFrame(friend_dict_list, columns=column)
df
Out[4]:
|   | name | age | job |
|---|---|---|---|
| 0 | Jone | 15 | student |
| 1 | Jenny | 30 | developer |
| 2 | Nate | 30 | teacher |
In [5]:
df['salary'] = 0  # add a new column; every row gets the default value 0
df
Out[5]:
|   | name | age | job | salary |
|---|---|---|---|---|
| 0 | Jone | 15 | student | 0 |
| 1 | Jenny | 30 | developer | 0 |
| 2 | Nate | 30 | teacher | 0 |
In [6]:
import numpy as np
In [7]:
# 'yes' where the job is not 'student', otherwise 'no' (overwrites the default 0)
df['salary'] = np.where(df['job'] != 'student', 'yes', 'no')
In [8]:
df.head()
Out[8]:
|   | name | age | job | salary |
|---|---|---|---|---|
| 0 | Jone | 15 | student | no |
| 1 | Jenny | 30 | developer | yes |
| 2 | Nate | 30 | teacher | yes |
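`np.where(condition, a, b)` works element by element: it picks `a` where the condition is true and `b` elsewhere. The sketch below repeats that step on a fresh frame and also stores the raw comparison result. The `has_salary` column name is a hypothetical addition for illustration and is not part of the original notebook.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'name': ['Jone', 'Jenny', 'Nate'],
    'age': [15, 30, 30],
    'job': ['student', 'developer', 'teacher'],
})

# Element-wise choice: 'yes' where the condition holds, 'no' elsewhere.
df['salary'] = np.where(df['job'] != 'student', 'yes', 'no')

# Hypothetical extra column: the comparison alone already yields booleans,
# which are often easier to filter on than 'yes'/'no' strings.
df['has_salary'] = df['job'] != 'student'

print(df)
```

Filtering then becomes `df[df['has_salary']]` instead of a string comparison.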
In [9]:
friend_dict_list = [
    {'name': 'Jone', 'midterm': 95, 'final': 85},
    {'name': 'Jenny', 'midterm': 85, 'final': 80},
    {'name': 'Nate', 'midterm': 30, 'final': 10},
]
column = ['name', 'midterm', 'final']
df = pd.DataFrame(friend_dict_list, columns=column)
df.head()
Out[9]:
|   | name | midterm | final |
|---|---|---|---|
| 0 | Jone | 95 | 85 |
| 1 | Jenny | 85 | 80 |
| 2 | Nate | 30 | 10 |
In [10]:
df['total'] = df['midterm']+df['final']
df.head()
Out[10]:
|   | name | midterm | final | total |
|---|---|---|---|---|
| 0 | Jone | 95 | 85 | 180 |
| 1 | Jenny | 85 | 80 | 165 |
| 2 | Nate | 30 | 10 | 40 |
In [11]:
df['average'] = df['total'] / 2
df.head()
Out[11]:
|   | name | midterm | final | total | average |
|---|---|---|---|---|---|
| 0 | Jone | 95 | 85 | 180 | 90.0 |
| 1 | Jenny | 85 | 80 | 165 | 82.5 |
| 2 | Nate | 30 | 10 | 40 | 20.0 |
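The total and average above are built by combining columns arithmetically. As a sketch of an alternative, the same values can be computed with row-wise aggregation, which avoids hard-coding the divisor when more score columns are added. The `score_columns` list is a hypothetical helper name used only here.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Jone', 'Jenny', 'Nate'],
    'midterm': [95, 85, 30],
    'final': [85, 80, 10],
})

# Hypothetical helper: the columns that hold scores.
score_columns = ['midterm', 'final']

# axis=1 aggregates across the columns of each row.
df['total'] = df[score_columns].sum(axis=1)
df['average'] = df[score_columns].mean(axis=1)

print(df)
```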
In [12]:
grades = []
for row in df['average']:
    if row >= 90:
        grades.append('A')
    elif row >= 80:
        grades.append('B')
    else:
        grades.append('F')
df['grade'] = grades  # a list can be assigned directly as a new column
df.head()
Out[12]:
|   | name | midterm | final | total | average | grade |
|---|---|---|---|---|---|---|
| 0 | Jone | 95 | 85 | 180 | 90.0 | A |
| 1 | Jenny | 85 | 80 | 165 | 82.5 | B |
| 2 | Nate | 30 | 10 | 40 | 20.0 | F |
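The loop above assigns a grade per row with an if/elif/else chain. One possible vectorized equivalent, swapped in here for illustration rather than taken from the original notebook, is `np.select`, which checks the conditions in order and uses the first one that matches.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'name': ['Jone', 'Jenny', 'Nate'],
    'average': [90.0, 82.5, 20.0],
})

# Conditions are evaluated in order; the first True one decides the grade.
conditions = [
    df['average'] >= 90,
    df['average'] >= 80,
]
choices = ['A', 'B']

# Rows that match no condition fall back to the default.
df['grade'] = np.select(conditions, choices, default='F')

print(df)
```

The ordering matters: a 95 also satisfies `>= 80`, but it receives 'A' because that condition is listed first.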
In [13]:
# Using the apply function
def pass_or_fail(row):
    if row != 'F':
        return 'Pass'
    else:
        return 'Fail'
In [14]:
df.grade = df.grade.apply(pass_or_fail)  # store the return values back into the grade column of df
df.head()
Out[14]:
|   | name | midterm | final | total | average | grade |
|---|---|---|---|---|---|---|
| 0 | Jone | 95 | 85 | 180 | 90.0 | Pass |
| 1 | Jenny | 85 | 80 | 165 | 82.5 | Pass |
| 2 | Nate | 30 | 10 | 40 | 20.0 | Fail |
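Assigning the result of `apply` back to `grade` replaces the letter grades. If both pieces of information are needed, a minimal sketch is to write the result to a separate column. The column name `result` is an assumption made for this example.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Jone', 'Jenny', 'Nate'],
    'grade': ['A', 'B', 'F'],
})

def pass_or_fail(grade):
    # apply calls this once per value in the Series.
    return 'Pass' if grade != 'F' else 'Fail'

# Hypothetical new column, so the letter grade is preserved.
df['result'] = df['grade'].apply(pass_or_fail)

print(df)
```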
In [18]:
date_list = [
    {'yyyy-mm-dd': '2000-06-27'},
    {'yyyy-mm-dd': '2007-10-27'},
]
df = pd.DataFrame(date_list, columns=['yyyy-mm-dd'])
df.head()
Out[18]:
|   | yyyy-mm-dd |
|---|---|
| 0 | 2000-06-27 |
| 1 | 2007-10-27 |
In [19]:
def extract_year(row):
    # 'YYYY-MM-DD' -> 'YYYY' (the result is still a string)
    return row.split('-')[0]
In [21]:
df['year'] = df['yyyy-mm-dd'].apply(extract_year)
df.head()
Out[21]:
|   | yyyy-mm-dd | year |
|---|---|---|
| 0 | 2000-06-27 | 2000 |
| 1 | 2007-10-27 | 2007 |
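The `year` values produced by splitting the string are themselves strings. As a sketch of two alternatives, the `.str` accessor does the same split without a custom function, and `pd.to_datetime` parses the column so the year comes out as an integer. The column names `year_str` and `year_int` are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'yyyy-mm-dd': ['2000-06-27', '2007-10-27']})

# String route: split on '-' and take the first piece (still a string).
df['year_str'] = df['yyyy-mm-dd'].str.split('-').str[0]

# Datetime route: parse the dates, then read the integer year.
df['year_int'] = pd.to_datetime(df['yyyy-mm-dd']).dt.year

print(df)
```

The integer form is usually more convenient when the year is used in comparisons or arithmetic.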
Modifying Rows
In [22]:
friend_dict_list = [
    {'name': 'Jone', 'midterm': 95, 'final': 85},
    {'name': 'Jenny', 'midterm': 85, 'final': 80},
    {'name': 'Nate', 'midterm': 30, 'final': 10},
]
column = ['name', 'midterm', 'final']
df = pd.DataFrame(friend_dict_list, columns=column)
df.head()
Out[22]:
|   | name | midterm | final |
|---|---|---|---|
| 0 | Jone | 95 | 85 |
| 1 | Jenny | 85 | 80 |
| 2 | Nate | 30 | 10 |
In [24]:
df2 = pd.DataFrame([['Ben', 50, 50]],
                   columns=['name', 'midterm', 'final'])
df2.head()
Out[24]:
|   | name | midterm | final |
|---|---|---|---|
| 0 | Ben | 50 | 50 |
In [25]:
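# DataFrame.append was removed in pandas 2.0; pd.concat gives the same result.
pd.concat([df, df2], ignore_index=True)  # ignore the existing indexes and renumber the rows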
Out[25]:
|   | name | midterm | final |
|---|---|---|---|
| 0 | Jone | 95 | 85 |
| 1 | Jenny | 85 | 80 |
| 2 | Nate | 30 | 10 |
| 3 | Ben | 50 | 50 |
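The combined frame shown above is a new object; `df` itself keeps its three rows unless the result is assigned to a name. The sketch below, assuming pandas 2.x where `pd.concat` is the supported way to stack frames, keeps the result in a hypothetical variable `combined` and then shows one way to add a row to `df` in place with `loc`.

```python
import pandas as pd

columns = ['name', 'midterm', 'final']
df = pd.DataFrame([['Jone', 95, 85], ['Jenny', 85, 80], ['Nate', 30, 10]],
                  columns=columns)
df2 = pd.DataFrame([['Ben', 50, 50]], columns=columns)

# concat returns a new DataFrame; ignore_index=True renumbers rows from 0.
combined = pd.concat([df, df2], ignore_index=True)
rows_in_df_after_concat = len(df)  # df itself is unchanged

# In-place alternative: assign to the next integer label.
# This assumes df has the default 0..n-1 index.
df.loc[len(df)] = ['Ben', 50, 50]

print(combined)
print(df)
```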