Python Web Scraping Projects

Python Web Scraping Projects: Implement modern Python libraries to build end-to-end automated web scrapers and crawlers

Paperback
Published on: 12/12/2020
Price: £29.99
Not available
This product is currently unavailable
No stock available in any shop.

Synopsis

Use powerful Python libraries and natural language parsing techniques to scrape 10 different websites efficiently

Key Features

Implement 10 interesting web scraping projects using modern Python libraries such as extruct, NLTK, spaCy, and requests
Perform advanced scraping operations using real-world examples and NLP techniques
Learn how to reverse engineer the websites you want and reproduce their results in Python

Book Description

Web scrapers are programmed to navigate through multiple web pages to extract data as per your needs. This book will cover core web scraping ideas in Python with the help of 10 interesting projects, which utilize real-world examples and varied datasets.

The book starts with an introduction to web scraping and guides you through creating a basic submission scraper. Each chapter addresses one end-to-end project to scrape and crawl a unique set of data. With every new project, you'll develop your skills in using web scraping at work or in your own projects. You'll also learn about synchronous and asynchronous HTTP scraping, HTML parsing, and web crawler modeling and scaling. Moving ahead, you'll cover related techniques, such as reverse engineering websites and their JavaScript behavior, that you can use in web scraping. Later, you'll get to grips with advanced projects in domains such as employment, sports, and eCommerce. To build on your skills, the book helps you handle difficult AJAX requests and scrape JavaScript-heavy pages, and guides you through automated web browser scraping. Finally, you'll learn to work with unstructured data by creating powerful scrapers and crawlers.

By the end of this book, you'll have learned how to build automated web scrapers to perform a wide range of complex tasks.

What you will learn

Model your web scraper for distributed scraping
Automate your web browser with the help of Selenium, Puppeteer, and Splash Lua scripting
Scrape the content behind authentication gates and combine values from multiple web pages
Discover industry trends and edge cases such as AI usage in crawler detection and crawler scaling
Replicate the behavior of your web apps in Python
Understand the processes of dealing with proxies, handling errors and bad responses, and managing big data storage
Integrate browser automation with a Python web scraper

Who This Book Is For

This book is for Python programmers, data analysts, web scraping professionals or anyone who wants to build efficient and powerful web scraping projects. Working knowledge of the Python programming language and web scraping fundamentals is mandatory.

Publisher information

  • Publisher: Packt Publishing Limited
  • ISBN: 9781838648671
  • Number of pages: 103
  • Dimensions: 235 x 191 mm
