Crawling (2) - Environment Setup: pip, requests, BeautifulSoup

2021. 5. 10. 10:25

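Setting up the crawling environment

1. Install Python from the official Python site (https://www.python.org/downloads/)

2. Save the pip installation script (https://bootstrap.pypa.io/get-pip.py) with "Save link as". Recent Python installers already bundle pip, so this step is only needed when pip is missing.

In cmd, move to the folder where the script was saved and install pip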

python get-pip.py
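To confirm that the install actually took effect, a small check like the one below can help. This is only a sketch using the standard library's `importlib.metadata` (Python 3.8 or later); the helper names `installed_version` and `describe` are made up for this example and are not part of pip.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def describe(package: str) -> str:
    """Build a one-line status string for a package (hypothetical helper)."""
    version = installed_version(package)
    return f"{package}: {version}" if version else f"{package}: not installed"


# Prints the pip version if pip is installed for this interpreter.
print(describe("pip"))
```

The same helper can be pointed at any other package name once it has been installed with pip.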

3. Install the requests and BeautifulSoup libraries

pip install beautifulsoup4 requests

requests : sends HTTP requests and fetches the HTML data
BeautifulSoup : parses the fetched HTML
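To make the split between the two libraries concrete, here is a rough sketch of the parsing half only. The HTML string is invented for illustration and stands in for the text that a requests call would return; the function name `extract_title_and_links` is also just an assumption for this example.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for the text a requests call would return.
SAMPLE_HTML = """
<html>
  <head><title>Sample page</title></head>
  <body>
    <a href="/post/1">First post</a>
    <a href="/post/2">Second post</a>
  </body>
</html>
"""


def extract_title_and_links(html: str):
    """Parse HTML and return the page title plus every (text, href) pair."""
    # "html.parser" is the parser bundled with Python, so nothing extra is needed.
    soup = BeautifulSoup(html, "html.parser")
    title = soup.title.get_text(strip=True) if soup.title else None
    links = [
        (a.get_text(strip=True), a["href"])
        for a in soup.find_all("a", href=True)
    ]
    return title, links


title, links = extract_title_and_links(SAMPLE_HTML)
print(title)
print(links)
```

Because the sample is an inline string, this runs without any network access; in a real crawl the `html` argument would come from `response.text`.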

Testing Requests

I sent a request to my own Tistory blog address.

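import requests
from bs4 import BeautifulSoup  # not used yet; parsing comes in the next post


def crawling():
    url = 'https://rokroks.tistory.com/'
    response = requests.get(url)
    print(response)              # the Response object itself
    print(response.status_code)  # HTTP status code; 200 means success
    print(response.text)         # response body (the page HTML) as text


crawling()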
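The test above assumes the request always succeeds. A slightly more defensive variant might look like the sketch below; the function name `fetch_html` and the 10-second timeout are assumptions chosen for this example, not something from the original code.

```python
from typing import Optional

import requests


def fetch_html(url: str, timeout: float = 10.0) -> Optional[str]:
    """Return the response body as text, or None if the request fails."""
    try:
        # Without a timeout, requests.get can wait indefinitely on a slow server.
        response = requests.get(url, timeout=timeout)
        # Raise requests.HTTPError for 4xx/5xx responses instead of continuing.
        response.raise_for_status()
    except requests.RequestException as error:
        print(f"Request failed: {error}")
        return None
    return response.text


# An address without a scheme is rejected before any network traffic is sent,
# so this call exercises the error path and returns None.
print(fetch_html("rokroks.tistory.com"))
```

With the full address, `fetch_html('https://rokroks.tistory.com/')` would return the page HTML when the server answers with a success status, and `None` otherwise.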

▲ Execution result

The request succeeds and the page HTML is returned. Next time, let's parse the content we want out of it.

