
How to Scrape eBay Product Data with Python?

Retailgators

Although Amazon is the e-commerce market leader, eBay holds a considerable share of the online market. Brands that sell online should monitor pricing on eBay to gain a competitive advantage.

Extracting product data from eBay frequently and at a large scale is a challenging job for data scientists. Let’s take an example: extracting eBay data with Python to compare mobile phone prices.

In this blog, we will scrape eBay to collect phone prices and find the differences between the offerings on the eBay website.

Scrape eBay Product Data in a Step-by-Step Procedure

Here, we will go through a step-by-step procedure for scraping eBay product listings and their prices.

1. Select the Necessary Information

The first task in data scraping is to identify the target web page, that is, the page from which you need to scrape the necessary information.

We will extract eBay product listings, so we can simply open the eBay website, type the product name in the search bar, and press Enter. Once the page with the product listings has loaded, all you need to do is copy the URL from the browser's address bar. This is the target URL. Here, the URL is:

https://www.ebay.com/sch/i.html?_from=R40&_nkw=galaxy+note+8&_sacat=0&_pgn=1

Notice two parameters in the URL: “_nkw” (means new keyword) and “_pgn” (means page number). These parameters define the search query. If we change “_pgn” to 2, the second page of listings for the Galaxy Note 8 opens. If we change “_nkw” to iPhone X, eBay searches for the iPhone X and shows the corresponding results.
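
As a minimal sketch of how these two parameters fit together, the helper below builds a search URL from a keyword and a page number. The function name `build_search_url` is hypothetical, and the fixed `_from=R40` and `_sacat=0` values are simply copied from the example URL, not documented eBay settings.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.ebay.com/sch/i.html"


def build_search_url(keyword, page=1):
    """Return an eBay search URL for a keyword and result page (hypothetical helper)."""
    params = {
        "_from": "R40",   # copied unchanged from the example URL
        "_nkw": keyword,  # search keyword; spaces are encoded as "+"
        "_sacat": 0,      # copied unchanged from the example URL
        "_pgn": page,     # page number of the result list
    }
    return BASE_URL + "?" + urlencode(params)


print(build_search_url("galaxy note 8", 1))
print(build_search_url("iphone x", 2))
```

Building the query with `urlencode` keeps the `?` and `&` separators in the right places, which is easy to get wrong when the URL is glued together by hand.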

2. Finalize Tags for Scraping

Once the target web page is finalized, we need to understand its HTML layout in order to scrape the results. This is the most important part of data scraping, and basic HTML knowledge is required for this step.

On the target web page, use “inspect element” or press CTRL+SHIFT+I to open the developer tools window. The new window shows the source code of the target web page. Here, all the products are given as list elements, so we need to grab all the list items.

To grab an HTML element, we need an identifier associated with it. This could be the element's id, its class name, or any other HTML attribute. We use the class name as the identifier here. All the list items share the same class name, s-item.

On further inspection, we find the class names for the product name and the product price: “s-item__title” and “s-item__price” respectively. With this information, step 2 is complete!
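
The lookup described in this step can be tried on a small hand-written HTML snippet before touching the live site. The sample markup below is an assumption that only mimics the class names named above; eBay's real page is much larger and its markup may change over time.

```python
from bs4 import BeautifulSoup

# Hand-written sample that mimics the class names above (not real eBay output).
SAMPLE_HTML = """
<ul>
  <li class="s-item">
    <h3 class="s-item__title">Samsung Galaxy Note 8 64GB</h3>
    <span class="s-item__price">INR 27,500.00</span>
  </li>
  <li class="s-item">
    <h3 class="s-item__title">Apple iPhone 8 64GB</h3>
    <span class="s-item__price">INR 31,999.00</span>
  </li>
</ul>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

rows = []
for listing in soup.find_all("li", class_="s-item"):
    title = listing.find("h3", class_="s-item__title")
    price = listing.find("span", class_="s-item__price")
    if title is None or price is None:
        continue  # skip entries that lack either field
    rows.append((title.get_text(strip=True), price.get_text(strip=True)))

for name, price in rows:
    print(name, "|", price)
```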

3. Put the Extracted Data in a Well-Structured Format

With the identifiers (extractors) in hand, we need to scrape the matching sections out of the HTML content. Once that is done, we need to organize the data in a suitable, well-structured format. We will create a table with the product names in one column and their prices in another.

4. Visualize the Results (Optional)

Because we are comparing the prices offered for different mobile phones, we will also visualize the results. This step is not compulsory for data scraping; it is rather a way to turn the collected data into actionable insights. We will draw boxplots to understand the distribution of price offerings for the iPhone 8 and the Galaxy Note 8.

Necessary Libraries & Installation

To implement data scraping for this use case, you need Python, pip (the package installer for Python), and the BeautifulSoup library. You also need the pandas and numpy libraries to organize the collected data in a well-structured format. The code below additionally uses the requests and scipy libraries, which can be installed with pip in the same way.

Install pip and Python

Set up Python and pip by following the installation instructions for your operating system.

Install BeautifulSoup Library

apt-get install python-bs4
pip install beautifulsoup4

Install NumPy and pandas

pip install pandas
pip install numpy

With the environment set up, we can start implementing the scraper in Python. The implementation follows the steps discussed in the earlier sections.

Implementing Python to Scrape eBay Data

Here, we will run two web scraping operations: one for the iPhone 8 and another for the Galaxy Note 8. The implementation is repeated for the two phones to keep it easy to follow. A better-optimized version would merge the two scraping runs into one, but that is not needed right now.
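
One way the two runs could be joined is sketched below: a single loop takes the search keywords as a list, so nothing has to be copied per phone. The names `scrape_keywords`, `fetch_page`, and `parse_page` are hypothetical, and letting the caller supply the download and parsing functions is an assumption of this sketch, not how the code in this post is organized.

```python
from urllib.parse import quote_plus

SEARCH_URL = "https://www.ebay.com/sch/i.html?_from=R40&_nkw={kw}&_sacat=0&_pgn={page}"


def scrape_keywords(keywords, pages, fetch_page, parse_page):
    """Collect parsed rows per keyword.

    fetch_page(url) -> HTML text and parse_page(html) -> list of rows
    are supplied by the caller (hypothetical hooks).
    """
    results = {}
    for keyword in keywords:
        rows = []
        for page in range(1, pages + 1):
            url = SEARCH_URL.format(kw=quote_plus(keyword), page=page)
            rows.extend(parse_page(fetch_page(url)))
        results[keyword] = rows
    return results


# Stand-ins so the sketch runs without network access.
def fake_fetch(url):
    return url  # pretend the "page" is just its own URL


def fake_parse(html):
    return [html.split("_nkw=")[1]]  # one "row" per page


data = scrape_keywords(["note 8", "iphone 8"], pages=2,
                       fetch_page=fake_fetch, parse_page=fake_parse)
print(data["iphone 8"])
```

In real use, `fetch_page` could wrap `requests.get(url).text` and `parse_page` could hold the BeautifulSoup lookup for names and prices.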

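Scrape eBay for Galaxy Note 8
import requests
import numpy as np
import pandas as pd
from re import sub
from bs4 import BeautifulSoup
from scipy import stats

item_name = []
prices = []

# Walk through result pages 1 to 9 of the "note 8" search.
for i in range(1, 10):

    ebayUrl = "https://www.ebay.com/sch/i.html?_from=R40&_nkw=note+8&_sacat=0&_pgn=" + str(i)
    r = requests.get(ebayUrl)
    soup = BeautifulSoup(r.text, "html.parser")

    # Every search result is an <li> element with the class "s-item".
    listings = soup.find_all('li', attrs={'class': 's-item'})

    for listing in listings:
        name = listing.find('h3', attrs={'class': "s-item__title"})
        price = listing.find('span', attrs={'class': "s-item__price"})
        if name is None or price is None:
            continue

        # Use only the text placed directly inside each tag.
        prod_name = name.find(string=True, recursive=False)
        prod_price = price.find(string=True, recursive=False)
        if prod_name is None or prod_price is None or "INR" not in prod_price:
            continue

        # "INR 25,999.00" -> 25999; skip prices that cannot be read.
        try:
            prod_price = int(sub(",", "", prod_price.split("INR")[1].split(".")[0]))
        except ValueError:
            continue

        # Append name and price together so both lists stay the same length.
        item_name.append(str(prod_name))
        prices.append(prod_price)

data_note_8 = pd.DataFrame({"Name": item_name, "Prices": prices})
# Drop outliers: keep rows whose price has an absolute z-score below 3.
data_note_8 = data_note_8[np.abs(stats.zscore(data_note_8["Prices"])) < 3]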
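
The price-cleaning line in the loop above assumes every price string looks like `INR 27,500.00`. As a hedged alternative, the sketch below uses a regular expression and returns `None` for anything it cannot read. The helper name `parse_inr_price` is hypothetical, and taking the first amount of a price range is a choice made for this sketch.

```python
import re

# Matches "INR" followed by an amount such as 27,500.00 or 999.
_PRICE_RE = re.compile(r"INR\s*([\d,]+)(?:\.\d+)?")


def parse_inr_price(text):
    """Return the whole-rupee amount from a price string, or None."""
    match = _PRICE_RE.search(text)
    if match is None:
        return None
    digits = match.group(1).replace(",", "")
    return int(digits) if digits else None


print(parse_inr_price("INR 27,500.00"))                 # 27500
print(parse_inr_price("INR 1,000.00 to INR 2,000.00"))  # 1000
print(parse_inr_price("Price on request"))              # None
```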
Gather Data For Galaxy Note 8
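Scrape eBay for iPhone 8
import requests
import numpy as np
import pandas as pd
from re import sub
from bs4 import BeautifulSoup
from scipy import stats

item_name = []
prices = []

# Walk through result pages 1 to 9 of the "iphone 8" search.
for i in range(1, 10):

    # The query parameters must be separated by "&".
    ebayUrl = "https://www.ebay.com/sch/i.html?_from=R40&_nkw=iphone+8&_sacat=0&_pgn=" + str(i)
    r = requests.get(ebayUrl)
    soup = BeautifulSoup(r.text, "html.parser")

    # Every search result is an <li> element with the class "s-item".
    listings = soup.find_all('li', attrs={'class': 's-item'})

    for listing in listings:
        name = listing.find('h3', attrs={'class': "s-item__title"})
        price = listing.find('span', attrs={'class': "s-item__price"})
        if name is None or price is None:
            continue

        # Use only the text placed directly inside each tag.
        prod_name = name.find(string=True, recursive=False)
        prod_price = price.find(string=True, recursive=False)
        if prod_name is None or prod_price is None or "INR" not in prod_price:
            continue

        # "INR 25,999.00" -> 25999; skip prices that cannot be read.
        try:
            prod_price = int(sub(",", "", prod_price.split("INR")[1].split(".")[0]))
        except ValueError:
            continue

        # Append name and price together so both lists stay the same length.
        item_name.append(str(prod_name))
        prices.append(prod_price)

# Stored under its own name so the Galaxy Note 8 table is not overwritten.
data_iphone_8 = pd.DataFrame({"Name": item_name, "Prices": prices})
# Drop outliers: keep rows whose price has an absolute z-score below 3.
data_iphone_8 = data_iphone_8[np.abs(stats.zscore(data_iphone_8["Prices"])) < 3]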
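
The last line of each scraping block drops outliers with `stats.zscore`. To show what that filter does, here is a plain-Python sketch of the same idea: a z-score is the distance of a value from the mean, measured in standard deviations, and values whose absolute z-score is 3 or more are removed. The function name `drop_outliers` and the sample prices are made up for illustration.

```python
from statistics import fmean, pstdev


def drop_outliers(values, threshold=3.0):
    """Keep values whose absolute z-score is below the threshold."""
    mean = fmean(values)
    std = pstdev(values)  # population standard deviation, like scipy's default (ddof=0)
    if std == 0:
        return list(values)  # all values are equal, nothing to drop
    return [v for v in values if abs((v - mean) / std) < threshold]


# Made-up prices: 19 ordinary listings and one mispriced entry.
sample_prices = [100] * 19 + [10000]
cleaned = drop_outliers(sample_prices)
print(len(sample_prices), "->", len(cleaned))
```

Note that with very few listings no value can reach a z-score of 3, so the filter only has an effect once a reasonable number of prices has been collected.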
Collect Data For IPhone 8
Visualize The Product Prices

Now it is time to visualize the extracted results. We will use boxplots to visualize the price distribution of the two mobile phones.

Box plots help us visualize the distribution of numerical values. The green line marks the median of the collected prices. The box spans from the first quartile (Q1) to the third quartile (Q3), with a line at the median (Q2). The whiskers extend from the edges of the box to show the range of the data.

For the iPhone 8, most prices fall between INR 25k and 35k, while most Galaxy Note 8 phones are available in the range of INR 25k to 30k.

Overall, the iPhone 8 is priced higher than the Galaxy Note 8. Even so, the cheapest iPhone 8 listing on eBay is about INR 15k, while the cheapest Galaxy Note 8 is about INR 22-23k!
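
To make the parts of a box plot concrete, the sketch below computes the numbers a box plot is drawn from, using only the standard library. The prices are made up for illustration, and the `inclusive` quartile method is one of several conventions, so a plotting library may place the box edges slightly differently.

```python
from statistics import quantiles


def box_summary(values):
    """Return minimum, Q1, median (Q2), Q3 and maximum of the values."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": q2, "q3": q3, "max": max(values)}


# Made-up prices in thousands of INR.
sample = [22, 25, 26, 28, 30]
summary = box_summary(sample)
print(summary)
```

Here the box would run from 25 to 28 with the median line at 26, and the remaining values determine how far the whiskers reach.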

Retailgators As A Dependable Scraping Partner

There are many tools available that can assist you in scraping data. However, if you want professional help that requires only minimal technical know-how on your side, Retailgators can assist you. We have a transparent and well-structured procedure for scraping data in real time and delivering it in the required format. We have assisted businesses across different industry verticals. From the hiring industry to retail, Retailgators has designed refined solutions for the majority of use cases.

Conclusion

Here, we have successfully used Python to scrape eBay data for different products and their prices. We have also compared the available prices for the iPhone 8 and the Galaxy Note 8 to make a better purchase decision. Web scraping combined with data science can be leveraged to make smart decisions in daily life.

Want eBay data scraping services for all your data requirements? Contact Retailgators or ask for a free quote!

source code: https://www.retailgators.com/how-to-scrape-ebay-product-data-with-python.php
