## Crawling video information from YouTube
import requests
from bs4 import BeautifulSoup 

def get_html(url):
    ## Send a browser-like User-Agent header with the request
    headers = {
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
                      '(KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36'
    }
    ## Request the HTML of the given url (url is whatever page the caller wants)
    r = requests.get(url, headers=headers)
    ## Parse the response body into a BeautifulSoup object that is easier to work with
    html = BeautifulSoup(r.content, 'html.parser')
    ## Return the parsed result
    return html

def get_youtube_video_crawling():
    url = 'https://www.youtube.com/c/BLACKPINKOFFICIAL/videos'
    html = get_html(url)
    print(str(html)[:1000])

get_youtube_video_crawling()

<!DOCTYPE html>
<html lang="ko-KR" style="font-size: 10px;font-family: Roboto, Arial, sans-serif;" system-icons="" typography="" typography-spacing=""><head><script nonce="DmhUFCyjNnSpj/mUk8OFdw">var ytcfg={d:function(){return window.yt&&yt.config_||ytcfg.data_||(ytcfg.data_={})},get:function(k,o){return k in ytcfg.d()?ytcfg.d()[k]:o},set:function(){var a=arguments;if(a.length>1)ytcfg.d()[a[0]]=a[1];else for(var k in a[0])ytcfg.d()[k]=a[0][k]}};
window.ytcfg.set('EMERGENCY_BASE_URL', '\/error_204?t\x3djserror\x26level\x3dERROR\x26client.name\x3d1\x26client.version\x3d2.20220425.01.00');</script><script nonce="DmhUFCyjNnSpj/mUk8OFdw">(function(){window.yterr=window.yterr||true;window.unhandledErrorMessages={};window.unhandledErrorCount=0;
window.onerror=function(msg,url,line,columnNumber,error){var err;if(error)err=error;else{err=new Error;err.stack="";err.message=msg;err.fileName=url;err.lineNumber=line;if(!isNaN(columnNumber))err["columnNumber"]=columnNumber}var message=String(err.mes
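
The first 1,000 characters printed above are almost entirely inline JavaScript, which suggests that the video data sits inside script text rather than in ordinary HTML tags. As a minimal sketch of an alternative to raw string searching, the block below parses a JSON object embedded after a marker. The marker `var ytInitialData = `, the helper name `parse_embedded_json`, and the sample string are assumptions made for illustration; they are not taken from the code above, and the real page layout may differ or change.

```python
import json

# Hypothetical stand-in for str(html); a real page is far larger.
sample_page = (
    '<script>var ytInitialData = '
    '{"title": {"runs": [{"text": "Sample video"}]}};</script>'
)

def parse_embedded_json(page_text, marker='var ytInitialData = '):
    """Parse the JSON value that starts right after `marker`.

    Returns None when the marker is missing. The default marker is an
    assumption about the page layout, not a documented interface.
    """
    start = page_text.find(marker)
    if start == -1:
        return None
    # raw_decode parses one JSON value beginning at the given index and
    # ignores whatever follows it (here: ';</script>').
    data, _end = json.JSONDecoder().raw_decode(page_text, start + len(marker))
    return data

data = parse_embedded_json(sample_page)
print(data['title']['runs'][0]['text'])
```

Once the data is a Python dict, fields can be read by key instead of by character position, which tends to be less fragile when titles contain quotes or escape sequences.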

## Crawling video information from YouTube
import requests
from bs4 import BeautifulSoup 

def get_html(url):
    ## Send a browser-like User-Agent header with the request
    headers = {
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
                      '(KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36'
    }
    ## Request the HTML of the given url (url is whatever page the caller wants)
    r = requests.get(url, headers=headers)
    ## Parse the response body into a BeautifulSoup object that is easier to work with
    html = BeautifulSoup(r.content, 'html.parser')
    ## Return the parsed result
    return html

def get_youtube_video_crawling():
    url = 'https://www.youtube.com/c/BLACKPINKOFFICIAL/videos'
    html = get_html(url)
    
    ## Convert the parsed page to a string once and reuse it
    page = str(html)

    tmp_point = 0
    ## Markers that surround a video title in the page source
    title_param1 = '"title":{"runs":[{"text":"'
    title_param2 = '"}],"accessibility":{"accessibilityData":'

    ## Position of title_param1 in the page (first video)
    point1 = page.find(title_param1, tmp_point)
    ## Position of title_param2 (first video), searched from point1
    ## so that an earlier occurrence of the end marker is not picked up
    point2 = page.find(title_param2, point1)
    ## Print the text between the two markers
    print(page[point1+len(title_param1):point2])

    ## Markers that surround a video url in the page source
    url_param1 = '"commandMetadata":{"webCommandMetadata":{"url":"'
    url_param2 = '","webPageType":"WEB_PAGE_TYPE_WATCH"'
    ## Search for the url markers starting at the title position, so the url
    ## found is the one next to this title; this avoids mismatched extractions
    upoint1 = page.find(url_param1, point1)
    upoint2 = page.find(url_param2, upoint1)
    ## Print the text between the two markers as a full url
    print('https://www.youtube.com'+page[upoint1+len(url_param1):upoint2])

get_youtube_video_crawling()

BLACKPINK 2022 WELCOMING COLLECTION PREVIEW
https://www.youtube.com/watch?v=OZdK1czhuv8
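
The code above extracts only the first video. The `tmp_point` offset suggests the same search could be repeated from a later position, so the following is a rough sketch of that idea rather than the author's implementation. The helper names `extract_between` and `extract_all` and the `sample` string are hypothetical; the sample is a made-up fragment shaped like the markers used above, not real page data.

```python
def extract_between(text, start_marker, end_marker, offset=0):
    """Return (value, next_offset) for the first match at or after offset.

    Returns (None, -1) when either marker cannot be found.
    """
    start = text.find(start_marker, offset)
    if start == -1:
        return None, -1
    start += len(start_marker)
    # Look for the end marker only after the start marker.
    end = text.find(end_marker, start)
    if end == -1:
        return None, -1
    return text[start:end], end + len(end_marker)

def extract_all(text, start_marker, end_marker):
    """Collect every value enclosed by the two markers, in page order."""
    values = []
    offset = 0
    while True:
        value, offset = extract_between(text, start_marker, end_marker, offset)
        if value is None:
            break
        values.append(value)
    return values

TITLE_START = '"title":{"runs":[{"text":"'
TITLE_END = '"}],"accessibility":{"accessibilityData":'

# Made-up fragment for illustration only.
sample = (
    '{"title":{"runs":[{"text":"First video"}],"accessibility":{"accessibilityData":{}}},'
    '"title":{"runs":[{"text":"Second video"}],"accessibility":{"accessibilityData":{}}}}'
)

titles = extract_all(sample, TITLE_START, TITLE_END)
print(titles)
```

Returning the next offset along with the value keeps each search moving forward, so a title is never matched twice and a missing marker ends the loop instead of producing a wrong slice.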

[python] 10 Crawling video information from YouTube - Part 1
