Why Scrape AliExpress in 2025
If you run a dropshipping tool or product-research SaaS, AliExpress is your single richest source of trending-product signals. Over 200 million SKUs, real-time price shifts, and a seller ecosystem that rewards speed — the merchant who spots a rising product first wins. But AliExpress is also one of the hardest e-commerce sites to scrape reliably. This guide walks through what actually works in 2025: which endpoints to hit, how to handle the anti-bot stack, and how to build a pipeline that doesn't break on day two.
Whether you're building an internal research tool or a customer-facing product-discovery dashboard, you'll leave with concrete API patterns, CSS selectors, and a Python script you can run under residential proxies tonight.
AliExpress Site Structure: What to Scrape
AliExpress exposes product data on four main surfaces, each with a different data density and scraping difficulty:
Search Results Pages
Desktop search lives at https://www.aliexpress.com/search?SearchText=.... Each SERP returns up to 60 product cards with title, price, orders count, star rating, and shipping badge. The HTML is server-rendered but heavily obfuscated — class names are hashed and rotate periodically.
Key selectors (desktop, as of early 2025):
- Product card: div[class*='list--item'] or div._1OUGs (changes often)
- Title: a[class*='title--item']
- Price: span[class*='price--current']
- Orders: span[class*='sale-value']
Because selectors shift, relying on HTML parsing for search is fragile. The mobile API (covered below) is far more stable.
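If you do need the HTML fallback, matching on class-name substrings (the `[class*=...]` idea above) is what keeps a parser alive across hash rotations. Here is a minimal sketch using only the standard library; the `CardTextParser` name and the sample markup used to exercise it are illustrative assumptions, and the markers are simply the ones listed above, which may already have rotated:

```python
from html.parser import HTMLParser

# Tags that never get a closing tag, so they must not touch the open-tag stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class CardTextParser(HTMLParser):
    """Collect text from elements whose class attribute contains a marker substring."""

    # Substrings taken from the selector list above (assumed to still be current).
    MARKERS = {
        "title": "title--item",
        "price": "price--current",
        "orders": "sale-value",
    }

    def __init__(self):
        super().__init__()
        self.found = {name: [] for name in self.MARKERS}
        self._stack = []  # one entry per open tag: matched field name or None

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = dict(attrs).get("class") or ""
        field = next((n for n, m in self.MARKERS.items() if m in classes), None)
        self._stack.append(field)
        if field:
            self.found[field].append("")  # start a new text buffer for this match

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # Attribute text to the nearest open ancestor that matched a marker.
        for field in reversed(self._stack):
            if field:
                self.found[field][-1] += data.strip()
                break
```

Feed it a page with `parser.feed(html)` and read `parser.found["title"]`, `["price"]`, and `["orders"]`. The sketch does not group fields per card; for that you would also track the card container.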
Product Detail Pages
URL pattern: https://www.aliexpress.com/item/PRODUCT_ID.html. Product pages contain the richest data: variant SKUs, description HTML, image gallery, shipping options, and seller info. The description is loaded in an iframe from ae01.alicdn.com, which means a second request to get full product content.
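Because the product ID is embedded in the path, you can recover it from any item URL you collect, including ones with tracking parameters. A small sketch, assuming the `/item/PRODUCT_ID.html` pattern above holds (the function name is ours):

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Matches the /item/PRODUCT_ID.html path pattern described above.
_ITEM_PATH = re.compile(r"^/item/(\d+)\.html$")


def product_id_from_url(url: str) -> Optional[str]:
    """Return the numeric product ID from an item URL, or None if it is not one."""
    match = _ITEM_PATH.match(urlparse(url).path)
    return match.group(1) if match else None
```

Query strings are ignored because only the path is matched, so store and search URLs return `None` rather than a false positive.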
Store (Seller) Pages
URL: https://www.aliexpress.com/store/STORE_ID. Store pages expose seller rating, positive-feedback percentage, and a product catalog. Useful for reputation scoring and catalog-wide monitoring.
Hot-Product and Trending Feeds
AliExpress surfaces trending items via https://www.aliexpress.com/popular/ and category-level trending pages like /popular/electronics.html. These pages highlight products with surging order volumes — gold for product research. The data behind them is also available via the mobile API.
The API-vs-HTML Trade-Off
Here's the core decision: scrape rendered HTML or hit the mobile API?
| Dimension | Desktop HTML | Mobile API (JSON) |
|---|---|---|
| Data richness | Full rendered page; description in iframe | Structured JSON; variants, shipping, specs in one call |
| Stability | Low — class names rotate every few weeks | Medium — field names stable, auth changes occasionally |
| Rate limits | ~40 req/min per IP before CAPTCHA | ~120 req/min per IP; stricter auth on some endpoints |
| Anti-bot difficulty | High — full browser fingerprinting | Medium — needs correct headers + token |
| Parsing effort | High — obfuscated DOM, frequent rewrites | Low — JSON fields map directly to data model |
Verdict: For any production pipeline, the mobile API wins. Use HTML scraping only as a fallback or for description content that the API doesn't include.
The Alibaba-Group Anti-Bot Stack
AliExpress shares infrastructure with the broader Alibaba security group. Here's what you're up against:
- Device fingerprinting: The desktop site runs a JavaScript fingerprint collector (similar to Alibaba's umid system) that profiles browser features, canvas rendering, and WebGL. Headless browsers that don't patch these get flagged fast.
- Rate limiting: Desktop HTML: aggressive throttling around 40 requests per minute per IP. Mobile API: roughly 120 requests per minute per IP before you see 429s or 403s.
- CAPTCHA: Alibaba deploys slider CAPTCHAs (AliCAPTCHA) and occasionally reCAPTCHA v2 on suspicious traffic patterns.
- IP reputation: Datacenter IP ranges are flagged quickly. Residential and mobile IPs see significantly fewer challenges.
- Token-based auth on mobile API: Some endpoints require an x-sign header generated from the request URL and a signing key. The key is extracted from the mobile app's JavaScript bundle and rotates every few weeks.
The practical implication: you need residential proxies and you need to mimic mobile traffic patterns. Datacenter IPs will burn out in minutes on AliExpress.
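To stay under the per-IP thresholds quoted above, it helps to throttle on your side instead of waiting for 429s. The following is a rough sliding-window limiter, one instance per exit IP. The class name and the default of 100 requests per 60 seconds are assumptions chosen to leave headroom under the ~120 req/min figure, not values AliExpress documents:

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_requests within any window_seconds span."""

    def __init__(self, max_requests=100, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window = window_seconds
        self._clock = clock      # injectable so the limiter can be tested
        self._sent = deque()     # timestamps of requests inside the window

    def wait_time(self) -> float:
        """Seconds until the next request is allowed (0.0 if allowed now)."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            return 0.0
        return self.window - (now - self._sent[0])

    def record(self) -> None:
        """Register that a request was just sent."""
        self._sent.append(self._clock())

    def acquire(self, sleep=time.sleep) -> None:
        """Block until a request is allowed, then register it."""
        delay = self.wait_time()
        if delay > 0:
            sleep(delay)
        self.record()
```

Call `limiter.acquire()` immediately before each request that goes out through a given IP or sticky session.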
Mobile API Endpoints That Return Richer JSON
AliExpress's mobile app communicates with backend services that return clean, structured JSON. These endpoints are far more scraper-friendly than the desktop HTML. The base domain for most mobile API calls is m.aliexpress.com or gw.aliexpress.com.
Product Search API
GET https://m.aliexpress.com/api/product/search?
keyword=wireless+earbuds&
page=1&
sortType=bestSale&
country=US
# Key response fields:
# .data.resultList[].title - product title
# .data.resultList[].price - price range
# .data.resultList[].saleNum - orders count
# .data.resultList[].productId - product ID
# .data.resultList[].storeInfo - seller metadata
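When you assemble this URL in code, let the standard library handle the encoding so that keywords with spaces or symbols do not break the query string. A short sketch, assuming the endpoint and parameter names shown above (the helper name is ours):

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://m.aliexpress.com/api/product/search"


def build_search_url(keyword: str, page: int = 1,
                     sort_type: str = "bestSale", country: str = "US") -> str:
    """Build the search URL; urlencode turns spaces into '+' and escapes the rest."""
    params = {
        "keyword": keyword,
        "page": page,
        "sortType": sort_type,
        "country": country,
    }
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"
```

`build_search_url("wireless earbuds")` reproduces the request shown above.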
Product Detail API
GET https://m.aliexpress.com/api/product/detail?
productId=100500612345678&
country=US
# Returns: title, price, images, skuInfo (all variants),
# shippingInfo, storeInfo, specs, description URL
Store Info API
GET https://m.aliexpress.com/api/store/info?
storeId=912345678
# Returns: storeName, positiveRate, serviceScore,
# shipScore, itemScore, totalProducts
Hot Products / Trending API
GET https://m.aliexpress.com/api/product/trending?
categoryId=200000&
country=US&
page=1
A sample truncated response from the product detail API:
{
  "code": 200,
  "data": {
    "productId": "100500612345678",
    "title": "TWS Wireless Earbuds Bluetooth 5.3...",
    "price": {"min": "8.99", "max": "15.49", "currency": "USD"},
    "saleNum": 3842,
    "skuInfo": {
      "skuList": [
        {"skuId": "120000345", "price": "8.99", "attr": "Color:Black", "stock": 580},
        {"skuId": "120000346", "price": "12.49", "attr": "Color:White", "stock": 210}
      ]
    },
    "shippingInfo": {
      "toCountry": "US",
      "options": [
        {"method": "AliExpress Standard Shipping", "cost": "0.00", "estimatedDays": "15-25"}
      ]
    },
    "storeInfo": {
      "storeId": "912345678",
      "storeName": "TechGadgets Official",
      "positiveRate": "97.2%"
    }
  }
}
Notice how a single API call gives you variants, shipping, and seller reputation — data that would require three separate HTML page loads on desktop.
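To illustrate, here is one way to collapse a detail response into a single research row. It assumes the response has the shape of the sample above and treats every field as optional; the function and output key names are ours:

```python
def summarize_detail(response: dict) -> dict:
    """Flatten a product-detail response (shape assumed from the sample above)."""
    data = response.get("data") or {}
    skus = (data.get("skuInfo") or {}).get("skuList") or []
    options = (data.get("shippingInfo") or {}).get("options") or []
    rate = (data.get("storeInfo") or {}).get("positiveRate")

    # Prices and costs arrive as strings in the sample, so convert before comparing.
    prices = [float(s["price"]) for s in skus if s.get("price")]
    ship_costs = [float(o["cost"]) for o in options if o.get("cost") is not None]

    return {
        "product_id": data.get("productId"),
        "variant_count": len(skus),
        "total_stock": sum(int(s.get("stock", 0)) for s in skus),
        "min_sku_price": min(prices) if prices else None,
        "max_sku_price": max(prices) if prices else None,
        "cheapest_shipping": min(ship_costs) if ship_costs else None,
        "positive_rate": float(rate.rstrip("%")) if rate else None,
    }
```

Missing sections produce `None` or zero counts instead of raising, which matters because not every product returns every block.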
Python Example: Trending-Product Discovery Under Residential Proxies
Below is a production-oriented script that discovers trending products via the mobile API, using ProxyHat residential proxies to avoid IP blocks. It rotates the proxy per request and extracts the data fields that matter for product research.
import requests
import json
import time
import random
from datetime import datetime, timezone

PROXY_USER = "user-country-US"
PROXY_PASS = "your_password"
PROXY_URL = f"http://{PROXY_USER}:{PROXY_PASS}@gate.proxyhat.com:8080"

HEADERS = {
    "User-Agent": "AliApp(AE/8.53.0)",
    "X-Client-Type": "android",
    "X-Client-Version": "8530",
    "Accept": "application/json",
    "Accept-Language": "en-US",
}


def fetch_trending(category_id: str = "200000", page: int = 1) -> dict:
    """Fetch trending products from AliExpress mobile API."""
    url = "https://m.aliexpress.com/api/product/trending"
    params = {
        "categoryId": category_id,
        "country": "US",
        "page": page,
    }
    proxies = {"http": PROXY_URL, "https": PROXY_URL}
    resp = requests.get(url, params=params, headers=HEADERS, proxies=proxies, timeout=15)
    resp.raise_for_status()
    return resp.json()


def fetch_product_detail(product_id: str) -> dict:
    """Fetch full product detail including SKU variants."""
    url = "https://m.aliexpress.com/api/product/detail"
    params = {"productId": product_id, "country": "US"}
    proxies = {"http": PROXY_URL, "https": PROXY_URL}
    resp = requests.get(url, params=params, headers=HEADERS, proxies=proxies, timeout=15)
    resp.raise_for_status()
    return resp.json()


def extract_research_fields(trending_item: dict) -> dict:
    """Normalize a trending item into a flat research record."""
    return {
        "product_id": trending_item.get("productId", ""),
        "title": trending_item.get("title", ""),
        "price_min": trending_item.get("price", {}).get("min", ""),
        "price_max": trending_item.get("price", {}).get("max", ""),
        "orders": trending_item.get("saleNum", 0),
        "store_id": trending_item.get("storeInfo", {}).get("storeId", ""),
        "store_name": trending_item.get("storeInfo", {}).get("storeName", ""),
        "scraped_at": datetime.now(timezone.utc).isoformat(),
    }


if __name__ == "__main__":
    # Step 1: Get trending products in Electronics
    data = fetch_trending(category_id="200000", page=1)
    items = data.get("data", {}).get("resultList", [])
    print(f"Found {len(items)} trending products")

    # Step 2: Enrich top items with full detail (variants + shipping)
    for item in items[:5]:
        pid = item.get("productId")
        if not pid:
            continue
        detail = fetch_product_detail(pid)
        product = detail.get("data", {})
        sku_count = len(product.get("skuInfo", {}).get("skuList", []))
        shipping = product.get("shippingInfo", {}).get("options", [])
        print(
            f" {pid} | {sku_count} variants | "
            f"{shipping[0]['method'] if shipping else 'N/A'} | "
            f"orders: {product.get('saleNum', 'N/A')}"
        )
        time.sleep(random.uniform(1.5, 3.5))  # polite delay
A few things worth noting in this script:
- Proxy geo-targeting: The country-US flag in the username routes through US residential IPs. AliExpress prices and shipping vary by destination, so this matters.
- Mobile User-Agent: The AliApp(AE/8.53.0) header mimics the Android app. Without it, the API may return reduced data or redirect to HTML.
- Rate management: The random 1.5–3.5 second delay between detail requests keeps you under the ~120 req/min threshold per IP.
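The script above stops at the first 403 or 429 because of raise_for_status(). One possible hardening is to retry with exponential backoff and jitter. In this sketch, `call_with_retry`, the `(status, payload)` return convention, and the delay values are all assumptions for illustration; you would adapt your fetch function to return the status code instead of raising:

```python
import random
import time

# Status codes the article associates with throttling or blocking.
RETRYABLE = {403, 429}


def call_with_retry(fetch, attempts=4, base=2.0, cap=60.0, sleep=time.sleep):
    """Call fetch() until it returns a non-retryable status or attempts run out.

    fetch must return a (status_code, payload) tuple.
    """
    last_status = None
    for i in range(attempts):
        status, payload = fetch()
        if status not in RETRYABLE:
            return status, payload
        last_status = status
        if i < attempts - 1:
            delay = min(cap, base * 2 ** i)               # 2s, 4s, 8s, ... capped
            sleep(delay + random.uniform(0, delay / 2))   # jitter avoids lockstep retries
    return last_status, None
```

With a gateway that rotates per request, each retry should also leave through a different IP, so the backoff mostly protects against bursts rather than a single burned address.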
Handling Variant SKUs, Shipping Costs & Seller Reputation
Variant SKU Decomposition
AliExpress products often have dozens of variant SKUs — color, size, storage, bundle combinations. The mobile API's skuInfo.skuList array gives you each variant's skuId, price, stock count, and attribute string. For product research, the key operations are:
- Explode variants: One product listing → N SKU rows in your database, each with its own price and stock.
- Track stock velocity: Compare stock across scrapes. A variant dropping from 500 to 120 units while saleNum climbs is a strong trending signal.
- Identify the best-seller variant: The variant with the lowest remaining stock relative to its initial stock (if you've been tracking) is often the top seller.
# Normalize variant attributes from "Color:Black;Size:M" format
def parse_sku_attrs(attr_string: str) -> dict:
    """Parse 'Color:Black;Size:M' into {'Color': 'Black', 'Size': 'M'}."""
    if not attr_string:
        return {}
    pairs = attr_string.split(";")
    result = {}
    for pair in pairs:
        if ":" in pair:
            key, value = pair.split(":", 1)
            result[key.strip()] = value.strip()
    return result


# Usage: build a flat table from skuList.
# `product` is the "data" object from a product detail response.
records = []
for sku in product["skuInfo"]["skuList"]:
    attrs = parse_sku_attrs(sku.get("attr", ""))
    record = {
        "product_id": product["productId"],
        "sku_id": sku["skuId"],
        "price": float(sku["price"]),
        "stock": int(sku.get("stock", 0)),
        **attrs,  # one column per attribute, e.g. Color, Size
    }
    records.append(record)
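The stock-velocity idea from the list above can be sketched as a comparison of two snapshots. The snapshot layout here (`scraped_at` plus a `stock` mapping keyed by SKU ID) is an assumed storage format, not something the API returns, and a falling stock count is only a proxy for sales since sellers can edit stock manually:

```python
from datetime import datetime


def stock_velocity(prev: dict, curr: dict) -> dict:
    """Estimate units sold per hour for each SKU between two snapshots.

    Each snapshot looks like:
        {"scraped_at": "<ISO-8601 timestamp>", "stock": {sku_id: units}}
    """
    t0 = datetime.fromisoformat(prev["scraped_at"])
    t1 = datetime.fromisoformat(curr["scraped_at"])
    hours = (t1 - t0).total_seconds() / 3600
    if hours <= 0:
        raise ValueError("curr snapshot must be newer than prev")

    rates = {}
    for sku_id, units_now in curr["stock"].items():
        units_before = prev["stock"].get(sku_id)
        if units_before is None:
            continue  # new SKU, no baseline yet
        drop = units_before - units_now
        # A negative drop means a restock; report it as unknown instead of negative sales.
        rates[sku_id] = drop / hours if drop >= 0 else None
    return rates
```

Sorting the result by rate gives a rough ranking of the fastest-moving variants between scrapes.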
Shipping Cost Estimation by Destination
The product detail API's shippingInfo.options array includes shipping method, cost, and estimated delivery days for the country specified in your request. To build a shipping cost matrix:
- Set country in your API request to each target market (US, GB, DE, etc.).
- Extract cost and estimatedDays per method.
- For dropshipping, AliExpress Standard Shipping (free or near-free to most destinations) is the default. Premium methods like DHL or FedEx appear for heavier items.
Proxy geo-targeting matters here: use country-US, country-DE, etc. in your ProxyHat username to get location-specific shipping data.
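Once you have one detail response per target country, the matrix itself is a small transformation. A sketch, assuming each response follows the shippingInfo.options shape shown earlier; the function names are ours:

```python
from typing import Optional


def shipping_matrix(details_by_country: dict) -> dict:
    """Map country -> {method: {"cost": float, "days": str}} from detail responses."""
    matrix = {}
    for country, response in details_by_country.items():
        shipping = (response.get("data") or {}).get("shippingInfo") or {}
        options = shipping.get("options") or []
        matrix[country] = {
            o["method"]: {"cost": float(o["cost"]), "days": o.get("estimatedDays")}
            for o in options
            if "method" in o and "cost" in o
        }
    return matrix


def cheapest_method(matrix: dict, country: str) -> Optional[str]:
    """Return the lowest-cost shipping method for a country, or None if none listed."""
    methods = matrix.get(country) or {}
    if not methods:
        return None
    return min(methods, key=lambda name: methods[name]["cost"])
```

An empty entry for a country is itself useful: it usually means the product does not ship there.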
Seller Reputation Scoring
From the store info API, pull these fields:
- positiveRate: percentage of positive feedback (aim for >95%)
- serviceScore, shipScore, itemScore: sub-ratings on a 1–5 scale
- totalProducts: catalog size (larger stores tend to be more reliable)
For product research, a simple composite works well:
def seller_score(store_info: dict) -> float:
    """Weighted seller reputation score (0–5)."""
    positive = float(store_info.get("positiveRate", "0").replace("%", "")) / 20  # 0–5
    service = float(store_info.get("serviceScore", 0))
    shipping = float(store_info.get("shipScore", 0))
    item = float(store_info.get("itemScore", 0))
    return round(0.4 * positive + 0.2 * service + 0.2 * shipping + 0.2 * item, 2)
Data Freshness: How Often Does AliExpress Change?
AliExpress is one of the most volatile e-commerce platforms for pricing and inventory. Here's what we've observed across large-scale scraping operations:
- Prices change on 15–20% of top-selling SKUs daily. Flash deals update hourly.
- Stock levels shift constantly — high-velocity products can sell through hundreds of units per day.
- New listings appear at a rate of tens of thousands per day across major categories.
- Seller metrics (feedback, store ratings) update roughly every 6–12 hours.
Recommended Scraping Cadence
| Data Type | Cadence | Rationale |
|---|---|---|
| Trending product discovery | Every 4–6 hours | Hot-product rankings shift throughout the day |
| Price monitoring (tracked SKUs) | Every 1–2 hours | Flash deals expire fast; competitive repricing needs near-real-time data |
| Stock / inventory tracking | Every 2–4 hours | Stock-outs kill dropshipping listings — detect them early |
| Seller reputation updates | Once or twice daily | Changes are slow; no need for high frequency |
| New product discovery (full category scan) | Daily | Catch new listings within 24 hours |
For a product-research tool, the minimum viable cadence is a trending scan every 6 hours and price checks every 2 hours for your monitored SKU list. Anything less and you'll miss flash deals and stock-out events.
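The cadence table translates directly into a scheduler check. This sketch takes the upper bound of each range from the table as the interval; the job names and the `due_jobs` helper are assumptions for illustration:

```python
from datetime import datetime, timedelta

# Upper bound of each cadence range from the table above.
CADENCE = {
    "trending": timedelta(hours=6),
    "prices": timedelta(hours=2),
    "stock": timedelta(hours=4),
    "seller": timedelta(hours=24),
    "new_products": timedelta(hours=24),
}


def due_jobs(last_run: dict, now: datetime) -> list:
    """Return the jobs whose interval has elapsed (or that have never run)."""
    due = []
    for job, interval in CADENCE.items():
        last = last_run.get(job)
        if last is None or now - last >= interval:
            due.append(job)
    return due
```

Run it from a cron job or loop every few minutes, and update `last_run[job]` only after a scrape finishes successfully so that failed runs are retried on the next tick.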
Choosing the Right Proxy Type for AliExpress
| Proxy Type | Success Rate on AliExpress | Best For | Cost Efficiency |
|---|---|---|---|
| Datacenter | Low (30–50%) — flagged quickly | Not recommended for AliExpress | Cheap but ineffective |
| Residential (rotating) | High (90–95%) — looks like real users | Search & trending discovery at scale | Moderate — pay per GB |
| Residential (sticky session) | High (90–95%) — same IP for 10–30 min | Multi-page flows, login-dependent scraping | Moderate |
| Mobile | Highest (95–99%) — matches mobile API traffic | Mobile API scraping, CAPTCHA avoidance | Higher cost, best results |
For the mobile API approach described in this guide, residential rotating proxies are the sweet spot. If you need maximum reliability — especially for high-volume trending scans — mobile proxies are ideal since your requests literally look like they come from the AliExpress mobile app on a real phone.
Configure ProxyHat for your use case:
- Rotating per request: http://user-country-US:pass@gate.proxyhat.com:8080 (each request gets a fresh IP).
- Sticky session (10 min): http://user-country-US-session-abc123:pass@gate.proxyhat.com:8080 (the same IP holds for the session duration, useful for paginated scraping).
- Geo-target specific cities: http://user-country-US-city-newyork:pass@gate.proxyhat.com:8080 (for location-specific pricing research).
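If you switch between these modes in code, a small helper avoids hand-editing proxy strings. This is a sketch built only from the three URL formats above; the `proxy_url` function is ours, and because the examples show the session and city flags separately, it refuses to combine them instead of guessing an order:

```python
from typing import Optional
from urllib.parse import quote


def proxy_url(password: str, country: str = "US",
              session: Optional[str] = None, city: Optional[str] = None,
              user: str = "user", host: str = "gate.proxyhat.com",
              port: int = 8080) -> str:
    """Build a proxy URL in the username-flag format shown above."""
    if session and city:
        # The combined form is not shown above, so do not invent one.
        raise ValueError("pass either session or city, not both")
    parts = [user, f"country-{country}"]
    if session:
        parts.append(f"session-{session}")
    if city:
        parts.append(f"city-{city}")
    username = "-".join(parts)
    # Percent-encode credentials so characters like '@' or ':' do not break the URL.
    return f"http://{quote(username, safe='')}:{quote(password, safe='')}@{host}:{port}"
```

Pass the result as both the `http` and `https` entries of the `proxies` dict used by requests.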
Anti-Detection Best Practices
Beyond proxies, these practices keep your scraper running:
- Rotate User-Agent strings: Use a small pool of real AliExpress app versions. Don't randomize wildly — that's a fingerprint in itself.
- Respect rate limits: Stay under 100 requests per minute per IP. Spread requests across multiple proxy IPs if you need higher throughput.
- Handle CAPTCHAs gracefully: If you get a CAPTCHA response, don't retry immediately. Rotate the proxy IP, add a delay, and try again. Consider using session-* flags to maintain clean sessions.
- Don't scrape during peak anti-bot hours: Alibaba tightens detection during major sale events (11.11, 12.12, mid-year sale). Expect more CAPTCHAs and lower success rates during these periods.
- Monitor your success rate: If it drops below 85%, rotate your proxy pool or add delays. Don't wait until you're fully blocked.
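The success-rate check in the last bullet can be sketched as a rolling window over recent requests. The 85% threshold comes from the text above; the class name, window size, and minimum sample count are assumptions you would tune:

```python
from collections import deque
from typing import Optional


class SuccessRateMonitor:
    """Track the success rate over the most recent requests."""

    def __init__(self, window: int = 200, threshold: float = 0.85, min_samples: int = 20):
        self._results = deque(maxlen=window)  # oldest results fall off automatically
        self.threshold = threshold
        self.min_samples = min_samples

    def record(self, ok: bool) -> None:
        """Store the outcome of one request (True for a usable response)."""
        self._results.append(bool(ok))

    @property
    def rate(self) -> Optional[float]:
        """Current success rate, or None before any request is recorded."""
        if not self._results:
            return None
        return sum(self._results) / len(self._results)

    def should_back_off(self) -> bool:
        """True once enough samples exist and the rate is below the threshold."""
        if len(self._results) < self.min_samples:
            return False
        return self.rate < self.threshold
```

Count CAPTCHA pages and HTML redirects as failures, not only HTTP errors, otherwise the rate will look healthier than it is.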
curl Quick-Start: Test the Mobile API in One Command
Before writing any code, verify your proxy setup and the API endpoint with a single curl call:
curl -x http://user-country-US:pass@gate.proxyhat.com:8080 \
-H "User-Agent: AliApp(AE/8.53.0)" \
-H "X-Client-Type: android" \
-H "Accept: application/json" \
"https://m.aliexpress.com/api/product/trending?categoryId=200000&country=US&page=1"
If you get a JSON response with a data.resultList array, your setup works. If you get HTML or a 403, check your proxy credentials and headers.
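The same check is worth running on every response in your pipeline, since a 200 status with an HTML body is a common soft block. A sketch of such a classifier follows; the function name and the returned labels are ours:

```python
import json


def classify_response(status: int, content_type: str, body: str) -> str:
    """Label a response as 'ok', 'blocked', 'html_or_other', 'invalid_json',
    or 'unexpected_shape' based on the checks described above."""
    if status in (403, 429):
        return "blocked"
    if "json" not in (content_type or "").lower():
        return "html_or_other"  # likely a CAPTCHA page or redirect to the web site
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return "invalid_json"
    if not isinstance(payload, dict):
        return "unexpected_shape"
    result_list = (payload.get("data") or {}).get("resultList")
    return "ok" if isinstance(result_list, list) else "unexpected_shape"
```

With requests, call it as `classify_response(resp.status_code, resp.headers.get("Content-Type", ""), resp.text)`.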
Key Takeaways
- Use the mobile API, not HTML scraping. The JSON endpoints are more stable, richer in data, and easier to parse. Desktop HTML selectors break constantly.
- Residential or mobile proxies are non-negotiable for AliExpress. Datacenter IPs get flagged within minutes. ProxyHat's rotating residential proxies with geo-targeting give you the best cost-to-reliability ratio.
- Explode variant SKUs into individual records. The mobile API gives you per-variant pricing and stock — use it to track velocity and identify top sellers.
- Scrape at the right cadence. Trending discovery every 4–6 hours, price monitoring every 1–2 hours, seller metrics daily. Don't over-scrape — it wastes proxy bandwidth and increases block risk.
- Geo-target your proxies to match your destination markets. Shipping costs and product availability differ by country, and AliExpress serves different prices accordingly.
- Monitor success rates. If your success rate drops below 85%, rotate proxies, add delays, or switch from datacenter to residential IPs.
Ready to start scraping AliExpress at scale? Get started with ProxyHat residential proxies — no minimum commitment, pay per GB, and geo-targeting in 190+ countries. For more scraping patterns, check out our web scraping use case guide and our web scraping solutions.






