Is it legal to scrape AliExpress?

Scraping publicly available product data (prices, titles, ratings) is generally legal in most jurisdictions. However, you must respect AliExpress's Terms of Service, robots.txt directives, and applicable data protection laws (GDPR, CCPA) for personal data. Avoid scraping user reviews that contain personal information without compliance measures. This guide covers only public product and seller data.

Why do I need residential proxies to scrape AliExpress?

AliExpress's anti-bot system aggressively blocks datacenter IP ranges. Residential proxies route your requests through real home IPs, so your traffic looks like it comes from ordinary household connections rather than a server farm. This typically raises success rates from roughly 30–50% (datacenter) to 90–95% (residential). Mobile proxies push this higher still, to around 95–99%.

What's the best scraping cadence for AliExpress product research?

For trending product discovery, scrape every 4–6 hours. For price monitoring on tracked SKUs, check every 1–2 hours (flash deals change hourly). Stock levels should be checked every 2–4 hours. Seller reputation data only needs daily updates. Over-scraping wastes proxy bandwidth and increases your block risk without proportional value.

Can I scrape AliExpress variant SKU data?

Yes. The AliExpress mobile API returns a skuInfo.skuList array with each variant's skuId, price, stock count, and attribute string (e.g., 'Color:Black;Size:M'). This is far richer than the desktop HTML, which requires JavaScript execution to render variant selectors.

How do I get location-specific shipping costs from AliExpress?

The mobile product detail API accepts a 'country' parameter (e.g., country=US, country=DE). When you set this, the response includes shipping options, costs, and estimated delivery days specific to that destination. Use ProxyHat's geo-targeting (country-US, country-DE in the username) to ensure your proxy IP and API country parameter align.

Scrape AliExpress for Product Research | ProxyHat

Why Scrape AliExpress in 2025

If you run a dropshipping tool or product-research SaaS, AliExpress is your single richest source of trending-product signals. Over 200 million SKUs, real-time price shifts, and a seller ecosystem that rewards speed — the merchant who spots a rising product first wins. But AliExpress is also one of the hardest e-commerce sites to scrape reliably. This guide walks through what actually works in 2025: which endpoints to hit, how to handle the anti-bot stack, and how to build a pipeline that doesn't break on day two.

Whether you're building an internal research tool or a customer-facing product-discovery dashboard, you'll leave with concrete API patterns, CSS selectors, and a Python script you can run under residential proxies tonight.

AliExpress Site Structure: What to Scrape

AliExpress exposes product data across four main surfaces. Each has a different data density and scraping difficulty:

Search Results Pages

Desktop search lives at https://www.aliexpress.com/search?SearchText=.... Each SERP returns up to 60 product cards with title, price, orders count, star rating, and shipping badge. The HTML is server-rendered but heavily obfuscated — class names are hashed and rotate periodically.

Key selectors (desktop, as of early 2025):

Product card: div[class*='list--item'] or div._1OUGs (changes often)
Title: a[class*='title--item']
Price: span[class*='price--current']
Orders: span[class*='sale-value']

Because selectors shift, relying on HTML parsing for search is fragile. The mobile API (covered below) is far more stable.
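If you do need the HTML fallback, matching on class-name fragments rather than full hashed class names is one way to survive some of the rotation. The sketch below uses only the standard library and the fragments from the selector list above; the sample markup and the names CardParser and parse_cards are illustrative assumptions, not AliExpress's actual page structure.

```python
from html.parser import HTMLParser

# Class-name fragments from the selector list above; they may have rotated.
CARD_MARKER = "list--item"
FIELD_MARKERS = {
    "title": "title--item",
    "price": "price--current",
    "orders": "sale-value",
}
VOID_TAGS = {"br", "img", "input", "hr", "meta", "link"}


class CardParser(HTMLParser):
    """Collect text from elements whose class contains a known fragment."""

    def __init__(self):
        super().__init__()
        self.cards = []      # one dict per product card
        self._field = None   # field currently being captured, if any
        self._depth = 0      # nesting depth inside the captured element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._field is not None:
            self._depth += 1  # nested tag inside a captured element
            return
        classes = dict(attrs).get("class") or ""
        if CARD_MARKER in classes:
            self.cards.append({})
            return
        for field, marker in FIELD_MARKERS.items():
            if marker in classes and self.cards:
                self._field, self._depth = field, 1
                self.cards[-1][field] = ""
                return

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self._field is None:
            return
        self._depth -= 1
        if self._depth == 0:
            card = self.cards[-1]
            card[self._field] = " ".join(card[self._field].split())
            self._field = None

    def handle_data(self, data):
        if self._field is not None:
            self.cards[-1][self._field] += data


def parse_cards(html: str) -> list:
    parser = CardParser()
    parser.feed(html)
    return parser.cards


# Hypothetical markup shaped like the selectors above, for illustration only.
SAMPLE_HTML = """
<div class="x list--item_ab12">
  <a class="title--item_9f">USB <b>C</b> Cable</a>
  <span class="price--current_1k">$2.49</span>
  <span class="sale-value_zz">1,000+ sold</span>
</div>
"""
```

Calling parse_cards(SAMPLE_HTML) yields one dict with title, price, and orders keys. The parser only detects where a card opens, so treat it as a starting point rather than a robust extractor.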

Product Detail Pages

URL pattern: https://www.aliexpress.com/item/PRODUCT_ID.html. Product pages contain the richest data: variant SKUs, description HTML, image gallery, shipping options, and seller info. The description is loaded in an iframe from ae01.alicdn.com, which means a second request to get full product content.
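Since every downstream call keys on the product ID, it helps to normalize item URLs early. A minimal sketch, assuming the URL pattern shown above and a numeric ID; the helper names are illustrative.

```python
import re
from typing import Optional

# Matches the /item/PRODUCT_ID.html pattern described above (numeric ID assumed).
ITEM_URL_RE = re.compile(r"/item/(\d+)\.html")


def product_id_from_url(url: str) -> Optional[str]:
    """Return the product ID embedded in an item URL, or None if absent."""
    match = ITEM_URL_RE.search(url)
    return match.group(1) if match else None


def item_url(product_id: str) -> str:
    """Build the canonical desktop item URL for a product ID."""
    return f"https://www.aliexpress.com/item/{product_id}.html"
```

Tracking parameters after .html are ignored, so URLs copied from search results or ads resolve to the same ID.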

Store (Seller) Pages

URL: https://www.aliexpress.com/store/STORE_ID. Store pages expose seller rating, positive-feedback percentage, and a product catalog. Useful for reputation scoring and catalog-wide monitoring.

Hot-Product and Trending Feeds

AliExpress surfaces trending items via https://www.aliexpress.com/popular/ and category-level trending pages like /popular/electronics.html. These pages highlight products with surging order volumes — gold for product research. The data behind them is also available via the mobile API.

The API-vs-HTML Trade-Off

Here's the core decision: scrape rendered HTML or hit the mobile API?

Dimension	Desktop HTML	Mobile API (JSON)
Data richness	Full rendered page; description in iframe	Structured JSON; variants, shipping, specs in one call
Stability	Low — class names rotate every few weeks	Medium — field names stable, auth changes occasionally
Rate limits	~40 req/min per IP before CAPTCHA	~120 req/min per IP; stricter auth on some endpoints
Anti-bot difficulty	High — full browser fingerprinting	Medium — needs correct headers + token
Parsing effort	High — obfuscated DOM, frequent rewrites	Low — JSON fields map directly to data model

Verdict: For any production pipeline, the mobile API wins. Use HTML scraping only as a fallback or for description content that the API doesn't include.

The Alibaba-Group Anti-Bot Stack

AliExpress shares infrastructure with the broader Alibaba security group. Here's what you're up against:

Device fingerprinting: The desktop site runs a JavaScript fingerprint collector (similar to Alibaba's umid system) that profiles browser features, canvas rendering, and WebGL. Headless browsers that don't patch these get flagged fast.
Rate limiting: The desktop HTML site throttles aggressively at around 40 requests per minute per IP. The mobile API allows roughly 120 requests per minute per IP before you see 429s or 403s.
CAPTCHA: Alibaba deploys slider CAPTCHAs (AliCAPTCHA) and occasionally reCAPTCHA v2 on suspicious traffic patterns.
IP reputation: Datacenter IP ranges are flagged quickly. Residential and mobile IPs see significantly fewer challenges.
Token-based auth on mobile API: Some endpoints require an x-sign header generated from the request URL and a signing key. The key is extracted from the mobile app's JavaScript bundle and rotates every few weeks.

The practical implication: you need residential proxies and you need to mimic mobile traffic patterns. Datacenter IPs will burn out in minutes on AliExpress.
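To stay inside the per-IP budgets listed above, a client-side limiter is a reasonable first line of defense. The following is a sketch of a sliding-window limiter under the assumption that you are holding one IP (for example via a sticky session); the class name and the injectable clock are illustrative choices, and the limits are the approximate figures quoted in this guide.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_requests within any window_seconds span."""

    def __init__(self, max_requests: int, window_seconds: float = 60.0,
                 clock=time.monotonic):
        self.max_requests = max_requests
        self.window = window_seconds
        self.clock = clock       # injectable so the limiter can be tested
        self._sent = deque()     # timestamps of recent requests

    def wait_time(self) -> float:
        """Seconds to wait before the next request is allowed (0 if none)."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            return 0.0
        return self.window - (now - self._sent[0])

    def record(self) -> None:
        """Register that a request was just sent."""
        self._sent.append(self.clock())
```

In a request loop you would sleep for wait_time(), send the request, then call record(). With per-request IP rotation the gateway chooses the IP, so a per-IP budget like this is mainly meaningful for sticky sessions.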

Mobile API Endpoints That Return Richer JSON

AliExpress's mobile app communicates with backend services that return clean, structured JSON. These endpoints are far more scraper-friendly than the desktop HTML. The base domain for most mobile API calls is m.aliexpress.com or gw.aliexpress.com.

Product Search API

GET https://m.aliexpress.com/api/product/search?
  keyword=wireless+earbuds&
  page=1&
  sortType=bestSale&
  country=US

# Key response fields:
# .data.resultList[].title        - product title
# .data.resultList[].price       - price range
# .data.resultList[].saleNum     - orders count
# .data.resultList[].productId   - product ID
# .data.resultList[].storeInfo    - seller metadata
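The multi-line GET above is a display form; in code, the query string should be URL-encoded. A small sketch using the standard library, assuming the endpoint and parameter names shown above (the function name is illustrative):

```python
from urllib.parse import urlencode

# Endpoint as listed in this guide; treat it as subject to change.
SEARCH_ENDPOINT = "https://m.aliexpress.com/api/product/search"


def build_search_url(keyword: str, page: int = 1,
                     sort_type: str = "bestSale", country: str = "US") -> str:
    """Return the search URL with a properly encoded query string."""
    query = urlencode({
        "keyword": keyword,      # spaces become '+'
        "page": page,
        "sortType": sort_type,
        "country": country,
    })
    return f"{SEARCH_ENDPOINT}?{query}"
```

build_search_url("wireless earbuds") reproduces the request shown above as a single line.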

Product Detail API

GET https://m.aliexpress.com/api/product/detail?
  productId=100500612345678&
  country=US

# Returns: title, price, images, skuInfo (all variants),
# shippingInfo, storeInfo, specs, description URL

Store Info API

GET https://m.aliexpress.com/api/store/info?
  storeId=912345678

# Returns: storeName, positiveRate, serviceScore,
# shipScore, itemScore, totalProducts

Hot Products / Trending API

GET https://m.aliexpress.com/api/product/trending?
  categoryId=200000&
  country=US&
  page=1

A truncated sample response from the product detail API:

{
  "code": 200,
  "data": {
    "productId": "100500612345678",
    "title": "TWS Wireless Earbuds Bluetooth 5.3...",
    "price": {"min": "8.99", "max": "15.49", "currency": "USD"},
    "saleNum": 3842,
    "skuInfo": {
      "skuList": [
        {"skuId": "120000345", "price": "8.99", "attr": "Color:Black", "stock": 580},
        {"skuId": "120000346", "price": "12.49", "attr": "Color:White", "stock": 210}
      ]
    },
    "shippingInfo": {
      "toCountry": "US",
      "options": [
        {"method": "AliExpress Standard Shipping", "cost": "0.00", "estimatedDays": "15-25"}
      ]
    },
    "storeInfo": {
      "storeId": "912345678",
      "storeName": "TechGadgets Official",
      "positiveRate": "97.2%"
    }
  }
}

Notice how a single API call gives you variants, shipping, and seller reputation — data that would require three separate HTML page loads on desktop.
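To make that concrete, here is a sketch that reduces a detail response of the shape shown above to the handful of fields a research dashboard usually needs. The field names come from the sample; the summarize function and its output keys are illustrative assumptions.

```python
# Trimmed copy of the sample response above.
SAMPLE_DETAIL = {
    "code": 200,
    "data": {
        "productId": "100500612345678",
        "skuInfo": {"skuList": [
            {"skuId": "120000345", "price": "8.99", "attr": "Color:Black", "stock": 580},
            {"skuId": "120000346", "price": "12.49", "attr": "Color:White", "stock": 210},
        ]},
        "shippingInfo": {"toCountry": "US", "options": [
            {"method": "AliExpress Standard Shipping", "cost": "0.00", "estimatedDays": "15-25"},
        ]},
        "storeInfo": {"storeId": "912345678", "positiveRate": "97.2%"},
    },
}


def summarize(detail: dict) -> dict:
    """Flatten variants, shipping, and seller reputation into one record."""
    product = detail.get("data", {})
    skus = product.get("skuInfo", {}).get("skuList", [])
    options = product.get("shippingInfo", {}).get("options", [])
    cheapest = min(skus, key=lambda s: float(s["price"]), default=None)
    rate = str(product.get("storeInfo", {}).get("positiveRate", "0"))
    return {
        "product_id": product.get("productId"),
        "variant_count": len(skus),
        "cheapest_sku": cheapest["skuId"] if cheapest else None,
        "total_stock": sum(int(s.get("stock", 0)) for s in skus),
        "shipping_methods": [o.get("method", "") for o in options],
        "positive_rate": float(rate.rstrip("%")),
    }
```

Every lookup uses .get with a default, so a response missing skuInfo or shippingInfo degrades to empty values instead of raising.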

Python Example: Trending-Product Discovery Under Residential Proxies

Below is a starting-point script that discovers trending products via the mobile API, using ProxyHat residential proxies to avoid IP blocks. The rotating gateway assigns a fresh IP to each request, so the script needs no rotation logic of its own. It extracts the data fields that matter for product research.

import random
import time
from datetime import datetime, timezone

import requests

PROXY_USER = "user-country-US"
PROXY_PASS = "your_password"
PROXY_URL = f"http://{PROXY_USER}:{PROXY_PASS}@gate.proxyhat.com:8080"

HEADERS = {
    "User-Agent": "AliApp(AE/8.53.0)",
    "X-Client-Type": "android",
    "X-Client-Version": "8530",
    "Accept": "application/json",
    "Accept-Language": "en-US",
}


def fetch_trending(category_id: str = "200000", page: int = 1) -> dict:
    """Fetch trending products from AliExpress mobile API."""
    url = "https://m.aliexpress.com/api/product/trending"
    params = {
        "categoryId": category_id,
        "country": "US",
        "page": page,
    }
    proxies = {"http": PROXY_URL, "https": PROXY_URL}

    resp = requests.get(url, params=params, headers=HEADERS, proxies=proxies, timeout=15)
    resp.raise_for_status()
    return resp.json()


def fetch_product_detail(product_id: str) -> dict:
    """Fetch full product detail including SKU variants."""
    url = "https://m.aliexpress.com/api/product/detail"
    params = {"productId": product_id, "country": "US"}
    proxies = {"http": PROXY_URL, "https": PROXY_URL}

    resp = requests.get(url, params=params, headers=HEADERS, proxies=proxies, timeout=15)
    resp.raise_for_status()
    return resp.json()


def extract_research_fields(trending_item: dict) -> dict:
    """Normalize a trending item into a flat research record."""
    return {
        "product_id": trending_item.get("productId", ""),
        "title": trending_item.get("title", ""),
        "price_min": trending_item.get("price", {}).get("min", ""),
        "price_max": trending_item.get("price", {}).get("max", ""),
        "orders": trending_item.get("saleNum", 0),
        "store_id": trending_item.get("storeInfo", {}).get("storeId", ""),
        "store_name": trending_item.get("storeInfo", {}).get("storeName", ""),
        "scraped_at": datetime.now(timezone.utc).isoformat(),
    }


if __name__ == "__main__":
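    # Step 1: Get trending products in Electronics
    data = fetch_trending(category_id="200000", page=1)
    items = data.get("data", {}).get("resultList", [])
    # Flatten each raw item into a research record ready for storage
    records = [extract_research_fields(item) for item in items]

    print(f"Found {len(records)} trending products")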

    # Step 2: Enrich top items with full detail (variants + shipping)
    for item in items[:5]:
        pid = item.get("productId")
        if not pid:
            continue
        detail = fetch_product_detail(pid)
        product = detail.get("data", {})
        sku_count = len(product.get("skuInfo", {}).get("skuList", []))
        shipping = product.get("shippingInfo", {}).get("options", [])
        print(
            f"  {pid} | {sku_count} variants | "
            f"{shipping[0]['method'] if shipping else 'N/A'} | "
            f"orders: {product.get('saleNum', 'N/A')}"
        )
        time.sleep(random.uniform(1.5, 3.5))  # polite delay

A few things worth noting in this script:

Proxy geo-targeting: The country-US flag in the username routes through US residential IPs. AliExpress prices and shipping vary by destination, so this matters.
Mobile User-Agent: The AliApp(AE/8.53.0) header mimics the Android app. Without it, the API may return reduced data or redirect to HTML.
Rate management: The random 1.5–3.5 second delay between detail requests works out to at most about 17–40 requests per minute, well under the ~120 req/min per-IP threshold.

Handling Variant SKUs, Shipping Costs & Seller Reputation

Variant SKU Decomposition

AliExpress products often have dozens of variant SKUs — color, size, storage, bundle combinations. The mobile API's skuInfo.skuList array gives you each variant's skuId, price, stock count, and attribute string. For product research, the key operations are:

Explode variants: One product listing → N SKU rows in your database, each with its own price and stock.
Track stock velocity: Compare stock across scrapes. A variant dropping from 500 to 120 units while saleNum climbs is a strong trending signal.
Identify the best-seller variant: The variant with the lowest remaining stock relative to its initial stock (if you've been tracking) is often the top seller.

# Normalize variant attributes from "Color:Black;Size:M" format
def parse_sku_attrs(attr_string: str) -> dict:
    """Parse 'Color:Black;Size:M' into {'Color': 'Black', 'Size': 'M'}."""
    if not attr_string:
        return {}
    pairs = attr_string.split(";")
    result = {}
    for pair in pairs:
        if ":" in pair:
            key, value = pair.split(":", 1)
            result[key.strip()] = value.strip()
    return result


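# Usage: build a flat table from skuList
# (`product` is the "data" object of a product detail response)
rows = []
for sku in product.get("skuInfo", {}).get("skuList", []):
    attrs = parse_sku_attrs(sku.get("attr", ""))
    rows.append({
        **attrs,  # Color, Size, etc.; listed first so the fixed keys below win on a name clash
        "product_id": product["productId"],
        "sku_id": sku["skuId"],
        "price": float(sku["price"]),
        "stock": int(sku.get("stock", 0)),
    })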
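The stock-velocity idea from the list above can be sketched as a comparison of two snapshots. This assumes you store a {sku_id: stock} mapping per scrape; the function names are illustrative, and a falling stock count is only a proxy for sales, since sellers can also edit stock manually.

```python
from typing import Optional


def stock_velocity(previous: dict, current: dict, hours_between: float) -> dict:
    """Estimate units sold per hour for each SKU present in both snapshots."""
    velocity = {}
    for sku_id, old_stock in previous.items():
        if sku_id not in current:
            continue              # variant delisted or missing from this scrape
        sold = old_stock - current[sku_id]
        if sold < 0:
            continue              # stock went up: a restock, not sales
        velocity[sku_id] = sold / hours_between
    return velocity


def best_seller(velocity: dict) -> Optional[str]:
    """Return the SKU with the highest estimated sell-through, if any."""
    return max(velocity, key=velocity.get) if velocity else None
```

For the 500-to-120 example above over a four-hour gap, the estimate is 95 units per hour for that variant.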

Shipping Cost Estimation by Destination

The product detail API's shippingInfo.options array includes shipping method, cost, and estimated delivery days for the country specified in your request. To build a shipping cost matrix:

Set country in your API request to each target market (US, GB, DE, etc.).
Extract cost and estimatedDays per method.
For dropshipping, AliExpress Standard Shipping (free or near-free to most destinations) is the default. Premium methods like DHL or FedEx appear for heavier items.

Proxy geo-targeting matters here: use country-US, country-DE, etc. in your ProxyHat username to get location-specific shipping data.
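Once you have one detail response per target country, the matrix itself is a simple flattening step. A sketch, assuming responses shaped like the sample shown earlier; the function names are illustrative, and the DE figures in the usage note are made-up values for demonstration.

```python
def shipping_matrix(details_by_country: dict) -> list:
    """Flatten {country: detail_response} into one row per shipping option."""
    rows = []
    for country, detail in details_by_country.items():
        options = detail.get("data", {}).get("shippingInfo", {}).get("options", [])
        for opt in options:
            rows.append({
                "country": country,
                "method": opt.get("method", ""),
                "cost": float(opt.get("cost", 0)),
                "estimated_days": opt.get("estimatedDays", ""),
            })
    return rows


def cheapest_by_country(rows: list) -> dict:
    """Pick the lowest-cost option for each destination country."""
    best = {}
    for row in rows:
        current = best.get(row["country"])
        if current is None or row["cost"] < current["cost"]:
            best[row["country"]] = row
    return best
```

For example, feeding in a US response with free standard shipping and a hypothetical DE response offering a 2.10 standard option and a 24.00 express option returns the free option for US and the 2.10 option for DE.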

Seller Reputation Scoring

From the store info API, pull these fields:

positiveRate — percentage of positive feedback (aim for >95%)
serviceScore, shipScore, itemScore — 1–5 scale sub-ratings
totalProducts — catalog size (larger stores tend to be more reliable)

For product research, a simple composite works well:

def seller_score(store_info: dict) -> float:
    """Weighted seller reputation score (0–5)."""
    # positiveRate arrives as a string such as "97.2%"; dividing by 20 maps 0–100% onto 0–5
    positive = float(str(store_info.get("positiveRate", "0")).rstrip("%")) / 20
    service = float(store_info.get("serviceScore", 0))
    shipping = float(store_info.get("shipScore", 0))
    item = float(store_info.get("itemScore", 0))
    # Feedback percentage gets 40% of the weight; the three sub-ratings share the rest
    return round(0.4 * positive + 0.2 * service + 0.2 * shipping + 0.2 * item, 2)

Data Freshness: How Often Does AliExpress Change?

AliExpress is one of the most volatile e-commerce platforms for pricing and inventory. Here's what we've observed across large-scale scraping operations:

Prices change on 15–20% of top-selling SKUs daily. Flash deals update hourly.
Stock levels shift constantly — high-velocity products can sell through hundreds of units per day.
New listings appear at a rate of tens of thousands per day across major categories.
Seller metrics (feedback, store ratings) update roughly every 6–12 hours.

Recommended Scraping Cadence

Data Type	Cadence	Rationale
Trending product discovery	Every 4–6 hours	Hot-product rankings shift throughout the day
Price monitoring (tracked SKUs)	Every 1–2 hours	Flash deals expire fast; competitive repricing needs near-real-time data
Stock / inventory tracking	Every 2–4 hours	Stock-outs kill dropshipping listings — detect them early
Seller reputation updates	Once or twice daily	Changes are slow; no need for high frequency
New product discovery (full category scan)	Daily	Catch new listings within 24 hours

For a product-research tool, the minimum viable cadence is a trending scan every 6 hours and price checks every 2 hours for your monitored SKU list. Anything less and you'll miss flash deals and stock-out events.
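One way to encode the cadence table is a small interval map plus a check for which jobs are due. This sketch uses the slowest interval from each row of the table; the job names and the due_jobs function are illustrative, and a real deployment would more likely hand this to cron or a task queue.

```python
from datetime import datetime, timedelta, timezone

# Slowest recommended interval per data type, taken from the table above.
CADENCE = {
    "trending": timedelta(hours=6),
    "price": timedelta(hours=2),
    "stock": timedelta(hours=4),
    "seller": timedelta(hours=24),
    "category_scan": timedelta(hours=24),
}


def due_jobs(last_run: dict, now: datetime) -> list:
    """Return the jobs whose interval has elapsed (or that never ran)."""
    due = []
    for job, interval in CADENCE.items():
        previous = last_run.get(job)
        if previous is None or now - previous >= interval:
            due.append(job)
    return due
```

Passing the current UTC time and a dict of last-run timestamps returns the list of jobs to enqueue on this tick.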

Choosing the Right Proxy Type for AliExpress

Proxy Type	Success Rate on AliExpress	Best For	Cost Efficiency
Datacenter	Low (30–50%) — flagged quickly	Not recommended for AliExpress	Cheap but ineffective
Residential (rotating)	High (90–95%) — looks like real users	Search & trending discovery at scale	Moderate — pay per GB
Residential (sticky session)	High (90–95%) — same IP for 10–30 min	Multi-page flows, login-dependent scraping	Moderate
Mobile	Highest (95–99%) — matches mobile API traffic	Mobile API scraping, CAPTCHA avoidance	Higher cost, best results

For the mobile API approach described in this guide, residential rotating proxies are the sweet spot. If you need maximum reliability, especially for high-volume trending scans, mobile proxies are ideal: requests arrive from carrier IPs, which is consistent with the mobile-app headers you send.

Configure ProxyHat for your use case:

Rotating per request: http://user-country-US:pass@gate.proxyhat.com:8080 — each request gets a fresh IP.
Sticky session (10 min): http://user-country-US-session-abc123:pass@gate.proxyhat.com:8080 — same IP holds for the session duration, useful for paginated scraping.
Geo-target specific cities: http://user-country-US-city-newyork:pass@gate.proxyhat.com:8080 — for location-specific pricing research.
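The three username patterns above can be generated from one helper so that the proxy country and the API country parameter come from the same variable. This is a sketch based only on the formats shown in the list; combining a city flag with a session flag in one username, and the order used here, is an assumption to confirm against ProxyHat's documentation.

```python
from typing import Optional
from urllib.parse import quote

GATEWAY = "gate.proxyhat.com:8080"  # gateway host and port as shown above


def proxy_url(user: str, password: str, country: str = "US",
              city: Optional[str] = None, session: Optional[str] = None) -> str:
    """Build a proxy URL with geo-targeting flags encoded in the username."""
    username = f"{user}-country-{country}"
    if city:
        username += f"-city-{city}"
    if session:
        username += f"-session-{session}"   # sticky session identifier
    # Percent-encode the password so characters like '@' or ':' stay valid.
    return f"http://{username}:{quote(password, safe='')}@{GATEWAY}"


def requests_proxies(url: str) -> dict:
    """Return the mapping that the requests library expects for proxies=."""
    return {"http": url, "https": url}
```

Using the same country value for proxy_url and for the API's country parameter keeps the exit IP and the requested destination aligned.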

Anti-Detection Best Practices

Beyond proxies, these practices keep your scraper running:

Rotate User-Agent strings: Use a small pool of real AliExpress app versions. Don't randomize wildly — that's a fingerprint in itself.
Respect rate limits: Stay under 100 requests per minute per IP, which leaves headroom below the ~120 req/min ceiling on the mobile API. Spread requests across multiple proxy IPs if you need higher throughput.
Handle CAPTCHAs gracefully: If you get a CAPTCHA response, don't retry immediately. Rotate the proxy IP, add a delay, and try again. Consider using session-* flags to maintain clean sessions.
Scale back during peak anti-bot periods: Alibaba tightens detection during major sale events (11.11, 12.12, mid-year sale). Expect more CAPTCHAs and lower success rates during these periods, so lower your request rate or postpone non-urgent jobs.
Monitor your success rate: If it drops below 85%, rotate your proxy pool or add delays. Don't wait until you're fully blocked.
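The last two practices, backing off after a challenge and watching the success rate, can be sketched together. The 85% threshold comes from the list above; the window size, the minimum sample count, and the exponential backoff parameters are illustrative assumptions to tune for your own traffic.

```python
from collections import deque


class SuccessMonitor:
    """Track the success rate over the most recent requests."""

    def __init__(self, window: int = 200, threshold: float = 0.85):
        self.results = deque(maxlen=window)   # True for success, False for failure
        self.threshold = threshold

    def record(self, ok: bool) -> None:
        self.results.append(ok)

    @property
    def rate(self) -> float:
        """Share of successful requests in the window (1.0 when empty)."""
        return sum(self.results) / len(self.results) if self.results else 1.0

    def should_back_off(self, min_samples: int = 20) -> bool:
        """True once enough samples exist and the rate is below threshold."""
        return len(self.results) >= min_samples and self.rate < self.threshold


def backoff_delay(attempt: int, base: float = 2.0, cap: float = 60.0) -> float:
    """Exponential delay in seconds for retry number `attempt` (0-based)."""
    return min(cap, base * 2 ** attempt)
```

In practice you would record every response, and when should_back_off() turns true, switch to a new session or proxy pool and sleep for backoff_delay(attempt) before retrying.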

curl Quick-Start: Test the Mobile API in One Command

Before writing any code, verify your proxy setup and the API endpoint with a single curl call:

curl -x http://user-country-US:pass@gate.proxyhat.com:8080 \
  -H "User-Agent: AliApp(AE/8.53.0)" \
  -H "X-Client-Type: android" \
  -H "Accept: application/json" \
  "https://m.aliexpress.com/api/product/trending?categoryId=200000&country=US&page=1"

If you get a JSON response with a data.resultList array, your setup works. If you get HTML or a 403, check your proxy credentials and headers.
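The same check can be automated so that a pipeline distinguishes a working response from a block or a challenge page. A sketch, assuming the response shape described above; the category labels and the function name are illustrative.

```python
import json


def classify_response(status: int, content_type: str, body: str) -> str:
    """Label a response as ok, blocked, or one of several failure shapes."""
    if status in (403, 429):
        return "blocked"                 # check proxy credentials and headers
    if "json" not in content_type.lower():
        return "html_or_challenge"       # HTML redirect or CAPTCHA page
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return "invalid_json"
    data = payload.get("data") if isinstance(payload, dict) else None
    items = data.get("resultList") if isinstance(data, dict) else None
    return "ok" if isinstance(items, list) else "unexpected_shape"
```

With the requests library, the three arguments would be resp.status_code, resp.headers.get("Content-Type", ""), and resp.text.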

Key Takeaways

Use the mobile API, not HTML scraping. The JSON endpoints are more stable, richer in data, and easier to parse. Desktop HTML selectors break constantly.
Residential or mobile proxies are non-negotiable for AliExpress. Datacenter IPs get flagged within minutes. ProxyHat's rotating residential proxies with geo-targeting give you the best cost-to-reliability ratio.
Explode variant SKUs into individual records. The mobile API gives you per-variant pricing and stock — use it to track velocity and identify top sellers.
Scrape at the right cadence. Trending discovery every 4–6 hours, price monitoring every 1–2 hours, seller metrics daily. Don't over-scrape — it wastes proxy bandwidth and increases block risk.
Geo-target your proxies to match your destination markets. Shipping costs and product availability differ by country, and AliExpress serves different prices accordingly.
Monitor success rates. If your success rate drops below 85%, rotate proxies, add delays, or switch from datacenter to residential IPs.

Ready to start scraping AliExpress at scale? Get started with ProxyHat residential proxies — no minimum commitment, pay per GB, and geo-targeting in 190+ countries. For more scraping patterns, check out our web scraping use case guide and our web scraping solutions.

How to Scrape AliExpress for Product Research: APIs, Proxies & Data Pipelines
