Pandas Data Analysis Basics: Hands-On Practice 1
2021. 9. 22. 17:13 · Big Data Study
In [2]:
import pandas as pd
In [12]:
data_frame = pd.read_csv('data/friend_list.csv')
data_frame
Out[12]:
| | name | age | job |
---|---|---|---|
0 | John | 20 | student |
1 | Jenny | 30 | developer |
2 | Nate | 30 | teacher |
3 | Julia | 40 | dentist |
4 | Brian | 45 | manager |
5 | Chris | 25 | intern |
6 | BoBo | 6 | Dog |
7 | Sol | 1 | Dog |
In [15]:
data_frame.head(2)  # first two rows
Out[15]:
| | name | age | job |
---|---|---|---|
0 | John | 20 | student |
1 | Jenny | 30 | developer |
In [17]:
data_frame.tail(2)  # last two rows; each column is a Series
Out[17]:
| | name | age | job |
---|---|---|---|
6 | BoBo | 6 | Dog |
7 | Sol | 1 | Dog |
In [19]:
type(data_frame.name)  # a single column is a Series
# a DataFrame is a collection of Series
Out[19]:
pandas.core.series.Series
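The cell above shows that one column of a DataFrame is a Series. Here is a minimal sketch of the same point, using a hypothetical three-row stand-in built in memory instead of `data/friend_list.csv`. The variable names are illustrative only.

```python
import pandas as pd

# Hypothetical stand-in for the first rows of the friend list (not read from disk)
friends = pd.DataFrame({
    "name": ["John", "Jenny", "Nate"],
    "age": [20, 30, 30],
    "job": ["student", "developer", "teacher"],
})

name_attr = friends.name      # attribute access, as in the cell above
name_key = friends["name"]    # bracket access, works for any column label

is_series = isinstance(name_key, pd.Series)   # one column -> Series
same_values = name_attr.equals(name_key)      # both forms return the same data
one_col_frame = friends[["name"]]             # a list of labels -> DataFrame
```

Bracket access is usually the safer habit, since attribute access fails for column names that contain spaces or collide with DataFrame methods.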
In [26]:
# a Series can be created from a list
list_tmp = [1, 2, 3]
s1 = pd.Series(list_tmp)  # first column
s2 = pd.Series(['one', 'two', 'three'])  # second column
In [29]:
pd.DataFrame(data=dict(num=s1, word=s2))  # a DataFrame can be built from a dict of Series
Out[29]:
| | num | word |
---|---|---|
0 | 1 | one |
1 | 2 | two |
2 | 3 | three |
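As a small follow-up to the cell above, the sketch below rebuilds the same two-column frame and checks that a dict of plain lists gives an identical result. The names `frame` and `frame_from_lists` are illustrative and not part of the original notebook.

```python
import pandas as pd

s1 = pd.Series([1, 2, 3])
s2 = pd.Series(["one", "two", "three"])

# Dict keys become column labels; each Series becomes one column
frame = pd.DataFrame({"num": s1, "word": s2})
columns = list(frame.columns)

# Plain lists work as well; pandas wraps them into columns the same way
frame_from_lists = pd.DataFrame({"num": [1, 2, 3], "word": ["one", "two", "three"]})
same = frame.equals(frame_from_lists)
```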
In [31]:
df = pd.read_csv('data/friend_list.csv')
In [32]:
df
Out[32]:
| | name | age | job |
---|---|---|---|
0 | John | 20 | student |
1 | Jenny | 30 | developer |
2 | Nate | 30 | teacher |
3 | Julia | 40 | dentist |
4 | Brian | 45 | manager |
5 | Chris | 25 | intern |
6 | BoBo | 6 | Dog |
7 | Sol | 1 | Dog |
In [33]:
df.head()  # first 5 rows (default)
Out[33]:
| | name | age | job |
---|---|---|---|
0 | John | 20 | student |
1 | Jenny | 30 | developer |
2 | Nate | 30 | teacher |
3 | Julia | 40 | dentist |
4 | Brian | 45 | manager |
In [35]:
df.tail()  # last 5 rows (default)
Out[35]:
| | name | age | job |
---|---|---|---|
3 | Julia | 40 | dentist |
4 | Brian | 45 | manager |
5 | Chris | 25 | intern |
6 | BoBo | 6 | Dog |
7 | Sol | 1 | Dog |
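To summarize how `head` and `tail` behave, here is a short sketch on an in-memory copy of the eight rows shown above (built by hand here, so the file path is not needed). A negative `n` is also accepted and means "everything except that many rows".

```python
import pandas as pd

# In-memory copy of the friend list shown above (assumed to match the CSV)
df = pd.DataFrame({
    "name": ["John", "Jenny", "Nate", "Julia", "Brian", "Chris", "BoBo", "Sol"],
    "age": [20, 30, 30, 40, 45, 25, 6, 1],
    "job": ["student", "developer", "teacher", "dentist",
            "manager", "intern", "Dog", "Dog"],
})

first_default = df.head()        # n defaults to 5
last_two = df.tail(2)            # last two rows, original index labels kept
all_but_last_two = df.head(-2)   # negative n: all rows except the last two
```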
In [37]:
df = pd.read_csv('data/friend_list.txt')  # a comma-separated .txt file works too
df
Out[37]:
| | name | age | job |
---|---|---|---|
0 | John | 20 | student |
1 | Jenny | 30 | developer |
2 | Nate | 30 | teacher |
3 | Julia | 40 | dentist |
4 | Brian | 45 | manager |
5 | Chris | 25 | intern |
6 | BoBo | 6 | Dog |
7 | Sol | 1 | Dog |
In [41]:
df = pd.read_csv('data/friend_list_tab.txt')  # with a separator other than a comma, everything lands in one column
df
Out[41]:
| | name\tage\tjob |
---|---|
0 | Jenny\t30\tdeveloper |
1 | Nate\t30\tteacher |
2 | Julia\t40\tdentist |
3 | Brian\t45\tmanager |
4 | Chris\t25\tintern |
5 | BoBo\t6\tDog |
6 | Sol\t1\tDog |
In [42]:
df = pd.read_csv('data/friend_list_tab.txt',delimiter='\t')
# if the file uses a different separator, pass it with delimiter
df
Out[42]:
| | name | age | job |
---|---|---|---|
0 | Jenny | 30 | developer |
1 | Nate | 30 | teacher |
2 | Julia | 40 | dentist |
3 | Brian | 45 | manager |
4 | Chris | 25 | intern |
5 | BoBo | 6 | Dog |
6 | Sol | 1 | Dog |
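The two cells above can be reproduced without any file on disk. The sketch below feeds a short tab-separated string through `io.StringIO`; the sample text is a hypothetical excerpt, not the real `friend_list_tab.txt`. It uses `sep`, for which `delimiter` is an alias in `read_csv`.

```python
import io
import pandas as pd

# Hypothetical tab-separated sample standing in for friend_list_tab.txt
tab_text = "name\tage\tjob\nJenny\t30\tdeveloper\nNate\t30\tteacher\n"

# Default separator is a comma, so each whole line ends up in a single column
wrong = pd.read_csv(io.StringIO(tab_text))

# Telling pandas the separator splits the lines into three columns
right = pd.read_csv(io.StringIO(tab_text), sep="\t")

wrong_shape = wrong.shape
right_cols = list(right.columns)
```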
In [43]:
df = pd.read_csv('data/friend_list_no_head.csv',header = None)
# when the file has no header row, say so with header=None
df
Out[43]:
| | 0 | 1 | 2 |
---|---|---|---|
0 | John | 20 | student |
1 | Jenny | 30 | developer |
2 | Nate | 30 | teacher |
3 | Julia | 40 | dentist |
4 | Brian | 45 | manager |
5 | Chris | 25 | intern |
6 | BoBo | 6 | Dog |
7 | Sol | 1 | Dog |
In [45]:
df.columns = ['name', 'age', 'job']  # assign the column headers
df
Out[45]:
| | name | age | job |
---|---|---|---|
0 | John | 20 | student |
1 | Jenny | 30 | developer |
2 | Nate | 30 | teacher |
3 | Julia | 40 | dentist |
4 | Brian | 45 | manager |
5 | Chris | 25 | intern |
6 | BoBo | 6 | Dog |
7 | Sol | 1 | Dog |
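Assigning to `df.columns` replaces every label at once and changes `df` in place. As an alternative sketch, `DataFrame.rename` takes an old-to-new mapping and returns a new frame. The two-row frame below is a hypothetical stand-in for the header-less data.

```python
import pandas as pd

# Stand-in for the header-less frame: columns are labeled 0, 1, 2 by default
df = pd.DataFrame([["John", 20, "student"], ["Jenny", 30, "developer"]])

# rename returns a new DataFrame; the original keeps its integer labels
renamed = df.rename(columns={0: "name", 1: "age", 2: "job"})

original_cols = list(df.columns)
renamed_cols = list(renamed.columns)
```

`rename` is handy when only some columns need new names, since labels missing from the mapping are left unchanged.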
In [47]:
df = pd.read_csv('data/friend_list_no_head.csv',header=None,names = ['name','age','job'])
# do both in a single step
df
Out[47]:
| | name | age | job |
---|---|---|---|
0 | John | 20 | student |
1 | Jenny | 30 | developer |
2 | Nate | 30 | teacher |
3 | Julia | 40 | dentist |
4 | Brian | 45 | manager |
5 | Chris | 25 | intern |
6 | BoBo | 6 | Dog |
7 | Sol | 1 | Dog |
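To see why the header option matters, the sketch below reads a hypothetical two-line sample (not the real `friend_list_no_head.csv`) with and without column names. When `names` is passed and `header` is left at its default, pandas treats the file as having no header row, so `header=None` in the cell above is explicit rather than strictly required.

```python
import io
import pandas as pd

# Hypothetical header-less sample standing in for friend_list_no_head.csv
no_header_text = "John,20,student\nJenny,30,developer\n"

# With defaults, the first data row is consumed as the column labels
lost_row = pd.read_csv(io.StringIO(no_header_text))

# Passing names keeps every line as data and labels the columns
named = pd.read_csv(io.StringIO(no_header_text), names=["name", "age", "job"])

rows_with_defaults = len(lost_row)
rows_with_names = len(named)
```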