Pandas Data Analysis Basics: Practice 1
2021. 9. 22. 17:13 · Big Data Study
In [2]:
import pandas as pd
In [12]:
data_frame = pd.read_csv('data/friend_list.csv')
data_frame
Out[12]:
| | name | age | job |
|---|---|---|---|
| 0 | John | 20 | student |
| 1 | Jenny | 30 | developer |
| 2 | Nate | 30 | teacher |
| 3 | Julia | 40 | dentist |
| 4 | Brian | 45 | manager |
| 5 | Chris | 25 | intern |
| 6 | BoBo | 6 | Dog |
| 7 | Sol | 1 | Dog |
In [15]:
data_frame.head(2)  # first two rows
Out[15]:
| | name | age | job |
|---|---|---|---|
| 0 | John | 20 | student |
| 1 | Jenny | 30 | developer |
In [17]:
data_frame.tail(2)  # last two rows; each column is a Series
Out[17]:
| | name | age | job |
|---|---|---|---|
| 6 | BoBo | 6 | Dog |
| 7 | Sol | 1 | Dog |
In [19]:
type(data_frame.name)  # a single column is a Series
# A DataFrame is a collection of Series that share one index
Out[19]:
pandas.core.series.Series
In [26]:
# A Series can be built from a list
list_tmp = [1, 2, 3]
s1 = pd.Series(list_tmp)  # first column
s2 = pd.Series(['one', 'two', 'three'])  # second column
In [29]:
pd.DataFrame(data=dict(num=s1, word=s2))  # a DataFrame can be built from a dict of Series
Out[29]:
| | num | word |
|---|---|---|
| 0 | 1 | one |
| 1 | 2 | two |
| 2 | 3 | three |
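The relationship between a Series and a DataFrame can be checked in both directions. The following is a minimal sketch; the sample values and the names `names`, `ages`, and `friends` are assumptions made for illustration and do not come from the data files used in this post.

```python
import pandas as pd

# Hypothetical sample values that mirror the first rows of the friend list.
names = pd.Series(['John', 'Jenny', 'Nate'])
ages = pd.Series([20, 30, 30])

# A dict of Series becomes a DataFrame; the dict keys become column names.
friends = pd.DataFrame({'name': names, 'age': ages})

# Selecting a single column gives back a Series.
name_column = friends['name']
print(type(name_column))
print(list(friends.columns))
```

Bracket access (`friends['name']`) is used here instead of attribute access (`friends.name`) because attribute access stops working when a column name collides with an existing DataFrame method, such as a column called `count`.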
In [31]:
df = pd.read_csv('data/friend_list.csv')
In [32]:
df
Out[32]:
| | name | age | job |
|---|---|---|---|
| 0 | John | 20 | student |
| 1 | Jenny | 30 | developer |
| 2 | Nate | 30 | teacher |
| 3 | Julia | 40 | dentist |
| 4 | Brian | 45 | manager |
| 5 | Chris | 25 | intern |
| 6 | BoBo | 6 | Dog |
| 7 | Sol | 1 | Dog |
In [33]:
df.head()  # first 5 rows (the default)
Out[33]:
| | name | age | job |
|---|---|---|---|
| 0 | John | 20 | student |
| 1 | Jenny | 30 | developer |
| 2 | Nate | 30 | teacher |
| 3 | Julia | 40 | dentist |
| 4 | Brian | 45 | manager |
In [35]:
df.tail()  # last 5 rows (the default)
Out[35]:
| | name | age | job |
|---|---|---|---|
| 3 | Julia | 40 | dentist |
| 4 | Brian | 45 | manager |
| 5 | Chris | 25 | intern |
| 6 | BoBo | 6 | Dog |
| 7 | Sol | 1 | Dog |
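To see how `head()` and `tail()` behave with and without an argument, here is a small sketch that rebuilds the same eight rows in memory. The name `df_demo` is an assumption for illustration; it stands in for the DataFrame loaded from the CSV file.

```python
import pandas as pd

# In-memory stand-in for the friend list (same names and ages as the table above).
df_demo = pd.DataFrame({
    'name': ['John', 'Jenny', 'Nate', 'Julia', 'Brian', 'Chris', 'BoBo', 'Sol'],
    'age': [20, 30, 30, 40, 45, 25, 6, 1],
})

first_default = df_demo.head()   # no argument: first 5 rows
first_two = df_demo.head(2)      # first 2 rows
last_three = df_demo.tail(3)     # last 3 rows, original index labels kept

print(len(first_default))
print(list(first_two['name']))
print(list(last_three.index))
```

Note that `tail()` keeps the original index labels, which is why the output above starts at index 3 rather than 0.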
In [37]:
df = pd.read_csv('data/friend_list.txt')  # a comma-separated .txt file works too
df
Out[37]:
| | name | age | job |
|---|---|---|---|
| 0 | John | 20 | student |
| 1 | Jenny | 30 | developer |
| 2 | Nate | 30 | teacher |
| 3 | Julia | 40 | dentist |
| 4 | Brian | 45 | manager |
| 5 | Chris | 25 | intern |
| 6 | BoBo | 6 | Dog |
| 7 | Sol | 1 | Dog |
In [41]:
df = pd.read_csv('data/friend_list_tab.txt')  # with a non-comma separator, every line lands in a single column
df
Out[41]:
| | name\tage\tjob |
|---|---|
| 0 | Jenny\t30\tdeveloper |
| 1 | Nate\t30\tteacher |
| 2 | Julia\t40\tdentist |
| 3 | Brian\t45\tmanager |
| 4 | Chris\t25\tintern |
| 5 | BoBo\t6\tDog |
| 6 | Sol\t1\tDog |
In [42]:
df = pd.read_csv('data/friend_list_tab.txt', delimiter='\t')
# when the separator is not a comma, pass it with delimiter
df
Out[42]:
| | name | age | job |
|---|---|---|---|
| 0 | Jenny | 30 | developer |
| 1 | Nate | 30 | teacher |
| 2 | Julia | 40 | dentist |
| 3 | Brian | 45 | manager |
| 4 | Chris | 25 | intern |
| 5 | BoBo | 6 | Dog |
| 6 | Sol | 1 | Dog |
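The effect of the separator can be reproduced without any file on disk. This sketch feeds a short tab-separated string to `read_csv` through `io.StringIO`; the string `tab_text` is an assumed stand-in for `friend_list_tab.txt` and holds only two data rows.

```python
import io

import pandas as pd

# Assumed stand-in for the tab-separated file: a header line plus two rows.
tab_text = "name\tage\tjob\nJenny\t30\tdeveloper\nNate\t30\tteacher\n"

# Default separator is a comma, so each line is read as one column.
wrong = pd.read_csv(io.StringIO(tab_text))

# Passing the tab character splits each line into three columns.
right = pd.read_csv(io.StringIO(tab_text), sep='\t')

print(wrong.shape)
print(right.shape)
print(list(right.columns))
```

`sep` and `delimiter` are aliases in `read_csv`, so `delimiter='\t'` as used in the cell above gives the same result.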
In [43]:
df = pd.read_csv('data/friend_list_no_head.csv', header=None)
# when the file has no header row, say so with header=None
df
Out[43]:
| | 0 | 1 | 2 |
|---|---|---|---|
| 0 | John | 20 | student |
| 1 | Jenny | 30 | developer |
| 2 | Nate | 30 | teacher |
| 3 | Julia | 40 | dentist |
| 4 | Brian | 45 | manager |
| 5 | Chris | 25 | intern |
| 6 | BoBo | 6 | Dog |
| 7 | Sol | 1 | Dog |
In [45]:
df.columns = ['name', 'age', 'job']  # assign the column names
df
Out[45]:
| | name | age | job |
|---|---|---|---|
| 0 | John | 20 | student |
| 1 | Jenny | 30 | developer |
| 2 | Nate | 30 | teacher |
| 3 | Julia | 40 | dentist |
| 4 | Brian | 45 | manager |
| 5 | Chris | 25 | intern |
| 6 | BoBo | 6 | Dog |
| 7 | Sol | 1 | Dog |
In [47]:
df = pd.read_csv('data/friend_list_no_head.csv', header=None, names=['name', 'age', 'job'])
# read the file and name the columns in one step
df
Out[47]:
| | name | age | job |
|---|---|---|---|
| 0 | John | 20 | student |
| 1 | Jenny | 30 | developer |
| 2 | Nate | 30 | teacher |
| 3 | Julia | 40 | dentist |
| 4 | Brian | 45 | manager |
| 5 | Chris | 25 | intern |
| 6 | BoBo | 6 | Dog |
| 7 | Sol | 1 | Dog |
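It can help to see what goes wrong when a headerless file is read with the defaults. The sketch below uses an in-memory string, `no_header_text`, as an assumed stand-in for `friend_list_no_head.csv` with only two rows.

```python
import io

import pandas as pd

# Assumed stand-in for the headerless file: two data rows, no column names.
no_header_text = "John,20,student\nJenny,30,developer\n"
column_names = ['name', 'age', 'job']

# With the defaults, the first data row is consumed as the header.
lost_row = pd.read_csv(io.StringIO(no_header_text))

# header=None keeps every row as data; names supplies the column labels.
named = pd.read_csv(io.StringIO(no_header_text), header=None, names=column_names)

# Passing names alone behaves the same way as also passing header=None.
names_only = pd.read_csv(io.StringIO(no_header_text), names=column_names)

print(len(lost_row), list(lost_row.columns))
print(len(named), list(named.columns))
```

Reading with the defaults silently drops John from the data, so it is worth checking the row count after loading any file whose header status is unknown.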