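Let's build a crawler example step by step that scrapes jokes from Qiushibaike (糗事百科).
For now, we will not use the beautifulsoup package for parsing.
Step 1: visit the URL and fetch the page source.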
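As a point of comparison for this step, here is a minimal Python 3 sketch of fetching a page's source. The code in this article targets Python 2, where `urllib2` is available; in Python 3 the same functionality lives in `urllib.request`. The helper names `build_request` and `fetch_source`, the `Mozilla/5.0` User-Agent string, and the `http://example.com` URL are all placeholders chosen for illustration, not taken from the article.

```python
import urllib.request


def build_request(url, user_agent='Mozilla/5.0'):
    """Build a GET request that carries a browser-like User-Agent header."""
    return urllib.request.Request(url, headers={'User-Agent': user_agent})


def fetch_source(url, encoding='utf-8', timeout=10):
    """Download the page at `url` and return its decoded source text."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return resp.read().decode(encoding, errors='replace')


# Building the request does not touch the network, so it is safe to inspect.
req = build_request('http://example.com')  # placeholder URL
print(req.get_method(), req.full_url)
```

Calling `fetch_source('http://example.com')` would perform the actual download. Some sites reject requests that carry the default Python User-Agent, which is why the sketch sets one explicitly.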
# -*- coding: utf-8 -*-
# @Author: HaonanWu
# @Date: 2016-12-22 16:16:08
# @Last Modified by: HaonanWu
# @Last Modified time: 2016-12-22 20:17:13
import urllib
import urllib2
import re
import os

if __name__ == '__main__':
    # Visit the URL and fetch the page source
    url = 'http://…
…pile(u"\s+\$\s\d+\.\d+")
for book_title in all_book_title:
    try:
        print "Book's name is " + book_title.string.strip()
    except AttributeError as e:
        print e
        exit()
    book_price = book_title.find_next(text=price_regexp)
    try:
        print "Book's price is " + book_price.strip()
    except AttributeError as e:
        print e
        exit()
    print ""
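To see what the price pattern `\s+\$\s\d+\.\d+` accepts, it can be tried on a few strings with the standard `re` module alone. This is a minimal Python 3 sketch. The sample strings and the helper `looks_like_price` are made up for illustration, and the sketch assumes the pattern is applied with search semantics (a match anywhere in the text), which is an assumption about how the `text=` filter evaluates it.

```python
import re

# Same pattern as in the article: whitespace, a dollar sign,
# one whitespace character, then a decimal number such as 12.99.
price_regexp = re.compile(r"\s+\$\s\d+\.\d+")


def looks_like_price(text):
    """Return True if the pattern is found anywhere in `text`."""
    return price_regexp.search(text) is not None


# Made-up sample strings, not taken from any real page.
samples = ["  $ 12.99", "$12.99", " $ 7.5", "price: $ 3"]
results = {s: looks_like_price(s) for s in samples}
for s, ok in results.items():
    print(repr(s), ok)
```

A string needs whitespace before the dollar sign, one whitespace character after it, and a decimal point in the number, so `$12.99` and `price: $ 3` are both rejected.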