How to Scrape Walmart Products for Free?

3.8 billion people around the world use eCommerce applications, and eCommerce is estimated to generate a mammoth US$2,723,991 million for retail enterprises in 2021. This post is a step-by-step guide to scraping Walmart products for free. But why should you scrape Walmart? Your business can use data scraped from eCommerce websites like Walmart for different use cases:

  • Building new products around these platforms to serve market requirements
  • Product intelligence platforms
  • Price comparison websites
  • Research and analysis

For example, Amazon dropshippers or FBA partners need data on ASINs, the unique identification numbers for products listed on Amazon.com. This data helps them shape their Amazon strategy. Amazon-ASIN is one such solution. You may build a similar tool for any eCommerce website, including Walmart, to source business-critical insights about products.

Price intelligence & monitoring competitors
It’s publicly known that Walmart scrapes price data from Amazon to price its own products intelligently. Why is that important? 87% of Americans say that price is very influential in determining where they shop! Walmart scrapes Amazon, and any retail enterprise may scrape its competitors’ eCommerce websites, to monitor and strategically price its own products and to research new product developments.

Shopper sentiment analysis by scraping product reviews & ratings
Many of the 3.8 billion people who shop online also leave reviews of the products they buy. Scraping review and rating data can help identify best-selling products, customer preferences, product features, and drawbacks, and helps in strategically preparing location-based personalized catalogs.

Let’s get started.

How To Scrape Walmart For Free?

https://miro.medium.com/max/1400/0*MO_1XLl1T7EEKz_9

For this tutorial, we shall use Octoparse, a free “click and scrape” tool, to scrape Walmart products. The tool doesn’t require any coding knowledge: anyone who knows how to browse the web can use it to scrape websites. Octoparse provides pre-built templates for scraping Walmart, Amazon, Etsy, Target, Rakuten, Tokopedia, eBay, BestBuyProduct, Flipkart, and 110+ other popular websites around the globe, across different industry verticals. Using these pre-built templates, you can start scraping websites within a minute.

Method 1 to scrape Walmart products:

Use pre-built Octoparse templates:

Step 1: Choose Walmart product template

For Walmart, we have several scraping templates, which scrape:

  • Walmart product data,
  • Customer reviews,
  • Q&A,
  • Walmart product URLs.

Choose one based on your scraping requirements. For this tutorial, we shall choose the template “Product Data”.

  1. Download the Octoparse tool.
  2. Register or log in.
  3. Select “Task” under “Task templates” on the home screen.

https://miro.medium.com/max/1178/0*qc2KnMjdnNkLh-5a

  4. Search Walmart and click on the result.

https://miro.medium.com/max/1260/0*dbAO9ZMOmv9HGdb1

  5. Select the “Product data” template.

https://miro.medium.com/max/1352/0*ZMYDLTy8-EWbViZ6

Here are some of the data points you can scrape using the Walmart product data template:

  • Walmart Number, i.e., a unique ID for Walmart products,
  • Star ratings,
  • Customer reviews,
  • Product title,
  • Product URL,
  • Product price,
  • Shipping details,
  • Brand details, etc.

Step 2: Use the template to scrape Walmart product data.

  1. Click on “use template”.
    https://miro.medium.com/max/1400/0*pwMNbpDuN0CLkFjY

  2. Enter a keyword for which you want to scrape Walmart products. For this tutorial, we chose “fashion” and clicked on “save and run”.
    https://miro.medium.com/max/1400/0*8RVzBtEs2AOEUHZy

  3. Choose your extraction mode. You can select to extract locally or in the cloud. You may even schedule your Walmart scraping.
    https://miro.medium.com/max/1174/0*4_9ado_MvNrKdX00

Here’s how the scraped data looks:
https://miro.medium.com/max/1400/0*xHKnTXItM54-nOUl

Method 2 to scrape Walmart products:

Scrape using a custom task under Advanced Mode.
Let’s scrape the women’s fashion section on Walmart.
We observe two things here:

  1. The products are listed in card style.
    https://miro.medium.com/max/1400/0*udKPndiP3lnKoic6

  2. There is pagination at the bottom of the screen.
    https://miro.medium.com/max/1400/0*2mDRYwD_H-SCfQfc

Basically, we’ll create a custom Walmart template with the following workflow:

  1. Visit the starting URL of our target niche on Walmart, i.e., “fashion” for this tutorial.
  2. Visit the individual landing page of each Walmart product by loop clicking on each of the items listed on the page. Scrape the product’s data points.
  3. Click on the next pagination link.
  4. Repeat steps 2 and 3 until all Walmart product search results are scraped.
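
The four-step workflow above can be sketched in plain Python. This is only an illustration of the control flow, not how Octoparse works internally: `FAKE_SITE`, `scrape_product`, and `crawl` are hypothetical names, and the in-memory dictionary stands in for real Walmart listing pages.

```python
# Hypothetical in-memory "site": each listing page has product links and an
# optional link to the next page. A real crawler would load these pages
# from the network instead.
FAKE_SITE = {
    "page-1": {"products": ["dress-a", "dress-b"], "next": "page-2"},
    "page-2": {"products": ["skirt-c"], "next": None},
}


def scrape_product(product_id):
    """Stand-in for opening a product page and extracting its data points."""
    return {"id": product_id, "title": product_id.replace("-", " ").title()}


def crawl(start_page):
    """Scrape every product on each listing page, following 'next' links."""
    results = []
    page = start_page                           # step 1: starting URL
    while page is not None:
        listing = FAKE_SITE[page]
        for product_id in listing["products"]:  # step 2: loop over each item
            results.append(scrape_product(product_id))
        page = listing["next"]                  # step 3: next pagination link
    return results                              # step 4: stop when no next page


rows = crawl("page-1")
print(len(rows), "products scraped")
```

The loop ends on its own once a listing page has no next link, which mirrors the “repeat until all results are scraped” condition in the last step.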

Step 1: Building the custom Walmart product data scraper:

  1. Download Octoparse Tool

  2. Register/Login

  3. Select “Task” under “Advanced Mode” on the home screen.
    https://miro.medium.com/max/1334/0*wBsIgllB2yh7BtrE

  4. Enter the target Walmart URL & then click on the Save Url button at the bottom: https://www.walmart.com/browse/clothing/women/5438_133162?page=1&povid=FashionTopNav_Women_Clothing
    https://miro.medium.com/max/1400/0*SnPPEfITgRvl6IQb

  5. If you see the following screen, you’re good to start building a custom Walmart scraping template.
    https://miro.medium.com/max/1400/0*ugLk3hDz8e0Gz7k2

  6. Create pagination for Walmart product search pages
    6.1 Click on the next page icon and select “Loop click next page” on the “Action Tips” panel.
    https://miro.medium.com/max/1400/0*_A_WPyv4BvAiFHIl

6.2 Click on the pagination box and update the XPath on the right half:
From: //BUTTON[@class='elc-icon paginator-hairline-btn paginator-btn paginator-btn-next']
https://miro.medium.com/max/1400/0*Kg_F0L4VirXLlOZm

To: //button[contains(string(), "Next Page")]
https://miro.medium.com/max/1400/0*y2YZnBi98N0z-FWf
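
To see why a text-based XPath tends to survive markup changes better than a class-based one, here is a minimal sketch using Python's standard library. The two HTML snippets are made up for illustration and are not real Walmart markup. `xml.etree.ElementTree` supports only a subset of XPath and has no `contains()`, so an exact text match (`[.='Next Page']`, available from Python 3.7) stands in for the `contains(string(), ...)` expression used in Octoparse.

```python
import xml.etree.ElementTree as ET

# Two hypothetical versions of the same pagination markup: the class names
# change between them, while the visible label stays the same.
OLD_MARKUP = (
    '<div><button class="elc-icon paginator-hairline-btn paginator-btn '
    'paginator-btn-next"><span>Next Page</span></button></div>'
)
NEW_MARKUP = (
    '<div><button class="pager-next-v2"><span>Next Page</span></button></div>'
)

CLASS_BASED = (
    ".//button[@class='elc-icon paginator-hairline-btn paginator-btn "
    "paginator-btn-next']"
)
# Exact match on the button's full text content (a stand-in for contains()).
TEXT_BASED = ".//button[.='Next Page']"


def count_matches(markup, path):
    """Return how many elements in the markup the path selects."""
    return len(ET.fromstring(markup).findall(path))


for name, markup in (("old", OLD_MARKUP), ("new", NEW_MARKUP)):
    print(name, count_matches(markup, CLASS_BASED), count_matches(markup, TEXT_BASED))
```

In this sketch the class-based path finds the button only in the old markup, while the text-based path finds it in both.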

  7. Extract product data by loop clicking each of the products listed
    7.1 Click on “Go To Web Page”. This brings us back to the starting page, because creating the pagination took us to the next page, i.e., page 2.
    https://miro.medium.com/max/472/0*kYEEsOBWa71Lza3P

7.2 Click on the pagination box.
7.3 Select the title of the first product. The Octoparse browser auto-selects all the product titles. Choose “Select all” on the “Action Tips” panel.
https://miro.medium.com/max/1400/0*tVEShlECSnRBUksd

7.4 Now, select “Loop click each element” from the “Action Tips” panel.
https://miro.medium.com/max/690/0*LJ0GCdwuvoZEjook

7.5 Click on the “Loop Element” box and update its “Loop mode” to “Variable list”. Then add the XPath:
//div[contains(@id, "mainSearchContent")]//span[contains(text(),"Product Title")]/following-sibling::a[contains(@class,"product-title-link")]
https://miro.medium.com/max/1400/0*JqGfmTTenRrZjFj5

This is important because the card list sometimes contains ads. The structure of the website may also change when unexpected elements appear, so using hard-coded XPaths is not recommended. I’ve found a bookmark-worthy resource on XPath.

7.6 Notice that when we created the loop click item, Octoparse automatically took us to the product page of the first listed product. Here, we can select the data points of interest and then click on “Extract data”. We shall scrape the following Walmart product data points:

  1. Brand: This is located right above the “Product Title”.
  2. Title: Product title
  3. Star: This is the average rating a product has garnered from different users.
  4. Ratings: This is the total count of the ratings.
  5. Comments: Total reviews from the shoppers.
  6. Price: Price of the product.
  7. Shipping Details: The number of days it takes to deliver the product.
  8. Features: Product details & descriptions
    https://miro.medium.com/max/1400/0*3FQaMfZJsUYVMqwk

7.7 You can edit the names of the data fields by clicking on them. When you scrape, some data points are often missing from the website. You can choose to keep those cells empty or to remove that column entirely. We chose to keep them blank. Our custom workflow now looks like this:
https://miro.medium.com/max/1400/0*aHqdJ4zwehFPtazv
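
If you ever post-process scraped rows yourself, the “keep missing cells blank” choice can be reproduced with Python's `csv` module. The sketch below is an assumption-laden illustration: the rows and values are invented, and the column names simply reuse the field names listed above.

```python
import csv
import io

FIELDS = ["Brand", "Title", "Star", "Ratings", "Comments", "Price"]

# Hypothetical scraped rows; the second product page had no rating data.
rows = [
    {"Brand": "Acme", "Title": "Linen Dress", "Star": "4.5", "Ratings": "120",
     "Comments": "35", "Price": "$19.99"},
    {"Brand": "Acme", "Title": "Denim Skirt", "Price": "$24.00"},
]

buffer = io.StringIO()
# restval="" writes an empty cell for any field missing from a row.
writer = csv.DictWriter(buffer, fieldnames=FIELDS, restval="")
writer.writeheader()
writer.writerows(rows)

csv_text = buffer.getvalue()
print(csv_text)
```

Keeping the column and leaving the cell empty preserves a consistent table shape, so rows with and without ratings can still be compared side by side.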

7.8 To add the product page URL, click on “add predefined field” -> “Add current page information” -> “Web page URL”.
https://miro.medium.com/max/1376/0*8m5EHo_xw5gvVlv-

7.9 Again, for data consistency, we shall add XPaths to each of these data fields. Select the target field as in the picture above and then click on the “edit” option, i.e., the one with the pencil icon.
https://miro.medium.com/max/1390/0*Qcy2Xgs8b9CVixNd

7.10 Click on the XPath option and enter the following XPath for the respective fields. The image below demonstrates this using the Brand XPath:
https://miro.medium.com/max/1382/0*JRglA0vOYxr0TQxH

Brand: //a[contains(@class, "prod-brandName")]/span
Title: //h1[contains(@class, "prod-ProductTitle")]
Rating: //div[contains(@class, "productsecondaryinformation")]//div[contains(@itemprop, "aggregateRating")]/span[contains(@class, "stars-reviews")]/span[contains(@class, "stars-reviews-count-node")]
Price: //span[@id="price"]//div[contains(@class, "prod-PriceHero")]//span[contains(text(),"$")]
Delivery: //section[contains(@class, "prod-PriceSection")]//div[contains(@class, "ShippingMessage-container")]
Features: //div[contains(text(), "Features")]/following-sibling::div
Walmart_id: //div[contains(@class, "productsecondaryinformation")]/div[contains(text(), "Walmart")]
Comments: //div[contains(@class, "productsecondaryinformation")]/button/span[contains(text(), "comments")]
Stars: //span[contains(@itemprop, "ratingValue")]
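
These XPaths return the text as it appears on the page. If you later analyze the export in Python, a small cleaning step can turn that text into numbers. This is a rough sketch: the sample strings are assumptions about what the raw values might look like, and `parse_price` and `parse_count` are hypothetical helper names.

```python
import re


def parse_price(raw):
    """Turn text such as '$1,299.50' into a float, or None if no number is found."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", raw or "")
    return float(match.group().replace(",", "")) if match else None


def parse_count(raw):
    """Turn text such as '1,204 ratings' into an int, or None if no number is found."""
    match = re.search(r"\d[\d,]*", raw or "")
    return int(match.group().replace(",", "")) if match else None


# Hypothetical raw values for one scraped row; "Comments" was missing.
raw_row = {"Price": "$1,299.50", "Ratings": "1,204 ratings", "Comments": ""}
clean_row = {
    "Price": parse_price(raw_row["Price"]),
    "Ratings": parse_count(raw_row["Ratings"]),
    "Comments": parse_count(raw_row["Comments"]),
}
print(clean_row)
```

Returning `None` for empty cells keeps missing data distinguishable from a genuine zero.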

Step 2: Extracting Walmart Product Data

In step 1, we’ve prepared a custom Walmart scraping template. Just save & run it by clicking on “Save” and then on “Start Extraction”.

You have three options to extract Walmart product data:

  • Local extraction, i.e., using your local computing and networking resources
  • Cloud extraction, i.e., using cloud computing and networking resources
  • Creating an API to enable data access via APIs

For this tutorial, we chose “Local extraction”.
https://miro.medium.com/max/1134/0*feIrP18noLEJWJQF

The extraction starts automatically as soon as you choose “Local extraction”. Here’s how it looks.
https://miro.medium.com/max/1400/0*oO3qxf5mdTOhsUc8

You can export the data as:

  • JSON
  • Excel sheets
  • CSV
  • HTML, or
  • Directly to a database.
    https://miro.medium.com/max/938/0*FwAS6OzROSWpLWW3
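
Once exported, the data can be loaded with Python's standard library. The sketch below assumes a CSV export with `Title`, `Price`, and `Star` columns. Those column names and values are made up, and an in-memory string stands in for the exported file, so in practice you would open your own file instead.

```python
import csv
import io
import json

# Stand-in for the contents of an exported CSV file (hypothetical columns).
EXPORTED_CSV = """Title,Price,Star
Linen Dress,19.99,4.5
Denim Skirt,24.00,4.1
"""

# Each CSV row becomes a dict keyed by the header names.
records = list(csv.DictReader(io.StringIO(EXPORTED_CSV)))

# Example analysis: find the cheapest product in the export.
cheapest = min(records, key=lambda r: float(r["Price"]))

# The same records can be re-serialized as JSON for other tools.
as_json = json.dumps(records, indent=2)

print(cheapest["Title"])
print(as_json)
```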

Conclusion:

Yayyy!! That was a breeze. We learned about

  • the use-cases of scraping Walmart product data, and
  • two easy methods to scrape Walmart products

How long did it take to follow the tutorial? Hardly 5–10 minutes? Now that you’re acquainted with Octoparse and how it works, scraping data with it will be a cakewalk. Besides, here are some other reasons you should consider Octoparse:
I. Ridiculously Cheap
II. Easy To Use
III. Highly Robust Scraping Tool
IV. Great Support
V. Flexible and Customizable
VI. Facilitates Cloud Extraction
VII. 110+ Pre-built Templates
VIII. Well Documented
IX. Blazing Fast
X. Facilitates Scheduling
Contact us for any assistance with scraping Walmart.

Happy Scraping!
