30-Minute Summary Course Season 2: Python Applications, Section 4: Web Crawling, Practice Problem 1
2021. 9. 17. 18:27 · Big Data Study
In [12]:
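import requests
from bs4 import BeautifulSoup
# Fetch the practice page and decode it as UTF-8 so the Korean text is read correctly
response = requests.get('http://www.paullab.co.kr/stock.html')
response.encoding = 'utf-8'
html = response.text
# Parse the HTML with Python's built-in parser
soup = BeautifulSoup(html,'html.parser')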
In [19]:
# Inspect the '.main' blocks one by one; the notebook displays only the last expression
soup.select('.main')[2]
soup.select('.main')[3]
soup.select('.main')[4]
soup.select('.main')[5]
Out[19]:
<div class="main"> <h2 id="제주코딩베이스캠프학원">제주코딩베이스캠프 학원</h2> <h3><span style="color: salmon">일별</span> 시세</h3> <table class="table table-hover"> <tbody> <tr> <th scope="col">날짜</th> <th scope="col">종가</th> <th scope="col">전일비</th> <th scope="col">시가</th> <th scope="col">고가</th> <th scope="col">저가</th> <th scope="col">거래량</th> </tr> <tr> <td align="center "><span class="date">2019.10.23</span></td> <td class="num"><span>2,600</span></td> <td class="num"> <img height="6 " src="ico_up.gif " style="margin-right:4px; " width="7 "/><span> 600 </span> </td> <td class="num"><span>2,055</span></td> <td class="num"><span>2,600</span></td> <td class="num"><span>2,020</span></td> <td class="num"><span>2,203,110</span></td> </tr> <tr> <td align="center"><span class="date">2019.10.22</span></td> <td class="num"><span>2,000</span></td> <td class="num"> <img alt="상승" height="6" src="ico_up.gif" style="margin-right:4px;" width="7"/><span> 5 </span> </td> <td class="num"><span>1,985</span></td> <td class="num"><span>2,005</span></td> <td class="num"><span>1,980</span></td> <td class="num"><span>32,212</span></td> </tr> <tr> <td align="center"><span class="date">2019.10.21</span></td> <td class="num"><span>1,995</span></td> <td class="num"> <img alt="하락" height="6" src="ico_down.gif" style="margin-right:4px;" width="7"/><span class="tah p11 nv01"> 30 </span> </td> <td class="num"><span>2,025</span></td> <td class="num"><span>2,035</span></td> <td class="num"><span>1,975</span></td> <td class="num"><span>35,186</span></td> </tr> <tr> <td align="center"><span class="date">2019.10.18</span></td> <td class="num"><span>2,025</span></td> <td class="num"> <img alt="상승" height="6" src="ico_up.gif" style="margin-right:4px;" width="7"/><span> 40 </span> </td> <td class="num"><span>1,985</span></td> <td class="num"><span>2,050</span></td> <td class="num"><span>1,980</span></td> <td class="num"><span>108,481</span></td> </tr> <tr> <td align="center"><span class="date">2019.10.17</span></td> <td 
class="num"><span>1,985</span></td> <td class="num"> <img alt="상승" height="6" src="ico_up.gif" style="margin-right:4px;" width="7"/><span> 10 </span> </td> <td class="num"><span>1,980</span></td> <td class="num"><span>1,990</span></td> <td class="num"><span>1,955</span></td> <td class="num"><span>20,766</span></td> </tr> <tr> <td align="center"><span class="date">2019.10.16</span></td> <td class="num"><span>1,975</span></td> <td class="num"> <img alt="하락" height="6" src="ico_down.gif" style="margin-right:4px;" width="7"/><span class="tah p11 nv01"> 5 </span> </td> <td class="num"><span>1,985</span></td> <td class="num"><span>1,995</span></td> <td class="num"><span>1,970</span></td> <td class="num"><span>19,243</span></td> </tr> <tr> <td align="center"><span class="date">2019.10.15</span></td> <td class="num"><span>1,980</span></td> <td class="num"> <img alt="상승" height="6" src="ico_up.gif" style="margin-right:4px;" width="7"/><span> 20 </span> </td> <td class="num"><span>1,970</span></td> <td class="num"><span>1,980</span></td> <td class="num"><span>1,960</span></td> <td class="num"><span>35,658</span></td> </tr> <tr> <td align="center"><span class="date">2019.10.14</span></td> <td class="num"><span>1,960</span></td> <td class="num"> <span>0</span> </td> <td class="num"><span>1,955</span></td> <td class="num"><span>1,970</span></td> <td class="num"><span>1,935</span></td> <td class="num"><span>26,698</span></td> </tr> <tr> <td align="center"><span class="date">2019.10.11</span></td> <td class="num"><span>1,960</span></td> <td class="num"> <img alt="상승" height="6" src="ico_up.gif" style="margin-right:4px;" width="7"/><span> 45 </span> </td> <td class="num"><span>1,925</span></td> <td class="num"><span>1,965</span></td> <td class="num"><span>1,910</span></td> <td class="num"><span>45,469</span></td> </tr> <tr> <td align="center"><span class="date">2019.10.10</span></td> <td class="num"><span>1,915</span></td> <td class="num"> <img alt="상승" height="6" src="ico_up.gif" 
style="margin-right:4px;" width="7"/><span> 15 </span> </td> <td class="num"><span>1,885</span></td> <td class="num"><span>1,915</span></td> <td class="num"><span>1,885</span></td> <td class="num"><span>32,773</span></td> </tr> <tr> <td align="center"><span class="date">2019.10.08</span></td> <td class="num"><span>1,900</span></td> <td class="num"> <img alt="하락" height="6" src="ico_down.gif" style="margin-right:4px;" width="7"/><span class="tah p11 nv01"> 20 </span> </td> <td class="num"><span>1,915</span></td> <td class="num"><span>1,935</span></td> <td class="num"><span>1,885</span></td> <td class="num"><span>62,433</span></td> </tr> <tr> <td align="center"><span class="date">2019.10.07</span></td> <td class="num"><span>1,920</span></td> <td class="num"> <img alt="하락" height="6" src="ico_down.gif" style="margin-right:4px;" width="7"/><span class="tah p11 nv01"> 50 </span> </td> <td class="num"><span>1,970</span></td> <td class="num"><span>1,980</span></td> <td class="num"><span>1,895</span></td> <td class="num"><span>89,504</span></td> </tr> <tr> <td align="center"><span class="date">2019.10.04</span></td> <td class="num"><span>1,970</span></td> <td class="num"> <img alt="하락" height="6" src="ico_down.gif" style="margin-right:4px;" width="7"/><span class="tah p11 nv01"> 20 </span> </td> <td class="num"><span>1,980</span></td> <td class="num"><span>2,005</span></td> <td class="num"><span>1,970</span></td> <td class="num"><span>47,894</span></td> </tr> <tr> <td align="center"><span class="date">2019.10.02</span></td> <td class="num"><span>1,990</span></td> <td class="num"> <span>0</span> </td> <td class="num"><span>1,975</span></td> <td class="num"><span>2,030</span></td> <td class="num"><span>1,965</span></td> <td class="num"><span>74,176</span></td> </tr> <tr> <td align="center"><span class="date">2019.10.01</span></td> <td class="num"><span>1,990</span></td> <td class="num"> <img alt="상승" height="6" src="ico_up.gif" style="margin-right:4px;" width="7"/><span> 20 
</span> </td> <td class="num"><span>1,975</span></td> <td class="num"><span>2,005</span></td> <td class="num"><span>1,965</span></td> <td class="num"><span>44,690</span></td> </tr> <tr> <td align="center"><span class="date">2019.09.30</span></td> <td class="num"><span>1,970</span></td> <td class="num"> <img alt="하락" height="6" src="ico_down.gif" style="margin-right:4px;" width="7"/><span class="tah p11 nv01"> 5 </span> </td> <td class="num"><span>1,980</span></td> <td class="num"><span>2,000</span></td> <td class="num"><span>1,970</span></td> <td class="num"><span>34,087</span></td> </tr> <tr> <td align="center"><span class="date">2019.09.27</span></td> <td class="num"><span>1,975</span></td> <td class="num"> <img alt="상승" height="6" src="ico_up.gif" style="margin-right:4px;" width="7"/><span> 5 </span> </td> <td class="num"><span>1,975</span></td> <td class="num"><span>2,060</span></td> <td class="num"><span>1,965</span></td> <td class="num"><span>109,372</span></td> </tr> <tr> <td align="center"><span class="date">2019.09.26</span></td> <td class="num"><span>1,970</span></td> <td class="num"> <img alt="하락" height="6" src="ico_down.gif" style="margin-right:4px;" width="7"/><span class="tah p11 nv01"> 30 </span> </td> <td class="num"><span>2,000</span></td> <td class="num"><span>2,035</span></td> <td class="num"><span>1,950</span></td> <td class="num"><span>83,120</span></td> </tr> <tr> <td align="center"><span class="date">2019.09.25</span></td> <td class="num"><span>2,000</span></td> <td class="num"> <img alt="하락" height="6" src="ico_down.gif" style="margin-right:4px;" width="7"/><span class="tah p11 nv01"> 65 </span> </td> <td class="num"><span>2,065</span></td> <td class="num"><span>2,065</span></td> <td class="num"><span>1,985</span></td> <td class="num"><span>78,144</span></td> </tr> <tr> <td align="center"><span class="date">2019.09.24</span></td> <td class="num"><span>2,065</span></td> <td class="num"> <img alt="상승" height="6" src="ico_up.gif" 
style="margin-right:4px;" width="7"/><span> 30 </span> </td> <td class="num"><span>2,020</span></td> <td class="num"><span>2,090</span></td> <td class="num"><span>2,020</span></td> <td class="num"><span>139,085</span></td> </tr> </tbody> </table> </div>
In [29]:
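# The '.main' blocks from index 2 onward are treated as one daily price table per group company
그룹사별일일시가 = soup.select('.main')[2:]
오늘종가 = []
오늘시가총액 = []
for i in 그룹사별일일시가:
    # Row 0 is the header, so row 1 is the latest day; cell 1 of that row is the closing price
    오늘종가.append(int(i.select('.table > tbody > tr')[1].select('td')[1].select('td > span')[0].text.replace(',','')))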
In [35]:
오늘시가총액 = [i*10000 for i in 오늘종가]  # each company has 10,000 shares
전그룹사시가총액 = format(sum(오늘시가총액),',')
전그룹사시가총액
Out[35]:
'538,000,000'
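The market-cap arithmetic above can be tried on its own, without the network request. This is only a minimal sketch: `parse_price` and `market_cap` are hypothetical helper names that do not appear in the original cells, and the 10,000-share figure is the assumption stated in the code comment above. The sample string `'2,600'` is the latest closing price shown in the table output earlier.

```python
SHARES_PER_COMPANY = 10_000  # assumption taken from the post: 10,000 shares per company


def parse_price(text: str) -> int:
    """Convert a price string such as '2,600' into an int."""
    return int(text.strip().replace(',', ''))


def market_cap(closing_price: int, shares: int = SHARES_PER_COMPANY) -> int:
    """Market capitalization = closing price x number of shares."""
    return closing_price * shares


cap = market_cap(parse_price('2,600'))
print(format(cap, ','))  # 26,000,000
```

Keeping the parsing in a small helper also makes the long chained `select(...)` expression easier to read, since the comma stripping no longer has to sit inline.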
In [58]:
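그룹사별일일시가 = soup.select('.main')[2:]
오늘종가 = []
오늘시가총액 = []
# Start at 1 to skip the header row of the table
for j in range(1, len(soup.select('.main')[2:][0].select('.table > tbody > tr'))):  # total number of rows
    오늘종가 = []
    for i in 그룹사별일일시가:
        # Closing price of each company on day j
        오늘종가.append(int(i.select('.table > tbody > tr')[j].select('td')[1].select('td > span')[0].text.replace(',','')))
    # Sum of all companies' closing prices for that day
    오늘시가총액.append(sum(오늘종가))
# Multiply by 10,000 shares to get the total market capitalization per day
오늘시가총액 = [i*10000 for i in 오늘시가총액]
오늘시가총액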
Out[58]:
[538000000, 531800000, 536150000, 523050000, 490350000, 487550000, 469700000, 461400000, 459000000, 457650000, 440000000, 432100000, 438300000, 443100000, 448500000, 443700000, 439350000, 441800000, 444100000, 462450000]
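The nested loop above adds up the closing prices of all companies for each row and then multiplies by the share count. The same idea can be sketched with `zip`, which regroups per-company price lists into per-day tuples. In the sketch below, the first list holds the three most recent closing prices from the table shown earlier; the second list is made-up sample data for illustration only, and `closing_prices` and `daily_totals` are hypothetical names.

```python
SHARES_PER_COMPANY = 10_000  # assumption from the post: 10,000 shares per company

# One list per company, newest day first.
closing_prices = [
    [2600, 2000, 1995],  # from the table output shown earlier
    [1000, 1100, 1200],  # made-up values, for illustration only
]

# zip(*...) groups the prices by day, so each tuple holds one day's prices
# across all companies.
daily_totals = [sum(day) * SHARES_PER_COMPANY for day in zip(*closing_prices)]
print(daily_totals)  # [36000000, 31000000, 31950000]
```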
In [59]:
# Dates come from the first company's table; the other tables are assumed to cover the same days
날짜 = soup.select('.main')[2:][0].select('.table > tbody > tr > td > .date')
date = []
for i in 날짜:
    date.append(i.text)
date
Out[59]:
['2019.10.23', '2019.10.22', '2019.10.21', '2019.10.18', '2019.10.17', '2019.10.16', '2019.10.15', '2019.10.14', '2019.10.11', '2019.10.10', '2019.10.08', '2019.10.07', '2019.10.04', '2019.10.02', '2019.10.01', '2019.09.30', '2019.09.27', '2019.09.26', '2019.09.25', '2019.09.24']
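The dates are scraped as plain strings in newest-first order. If real date objects are needed (for example, to sort chronologically instead of relying on list order), one option is `datetime.strptime` from the standard library. A minimal sketch using a few of the values above; `date_strings`, `parsed`, and `oldest_first` are hypothetical names:

```python
from datetime import datetime

date_strings = ['2019.10.23', '2019.10.22', '2019.10.21']

# '%Y.%m.%d' matches the 'YYYY.MM.DD' format used on the page.
parsed = [datetime.strptime(s, '%Y.%m.%d').date() for s in date_strings]

# Sorting date objects gives chronological (oldest-first) order.
oldest_first = sorted(parsed)
print(oldest_first[0].isoformat())  # 2019-10-21
```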
In [60]:
import matplotlib.pyplot as plt
plt.plot(date[::-1], 오늘시가총액[::-1])
plt.xticks(rotation = -45)
plt.show()