Methodology

Cadence is matched to each source's freshness. X runs every 5-15 minutes and Reddit every ~15 minutes. Google trending-now runs every ~10-30 minutes, with comparative/related queries daily. TikTok Creative Center runs daily and YouTube every 1-6 hours. All retail and review ranking snapshots run once daily: rankings change at most hourly, so a daily snapshot is sufficient and carries the lowest risk. Editorial and awards sources refresh daily or per awards release. Every chart and card shows both a 'last updated' timestamp and a 'source freshness' label (live, hourly, daily, or analyst-curated); we never use a single ambiguous 'real-time' label.
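As a rough illustration of how a refresh interval could map to the four freshness labels, here is a minimal Python sketch. The function name and the 30-minute and 6-hour thresholds are assumptions chosen to match the cadences listed above; they are not documented cut-offs.

```python
from datetime import timedelta
from typing import Optional


def freshness_label(cadence: Optional[timedelta]) -> str:
    """Map a refresh interval to a freshness label.

    None stands for sources with no automated schedule (editorial/awards).
    The thresholds below are illustrative assumptions, not documented values.
    """
    if cadence is None:
        return "analyst-curated"
    if cadence <= timedelta(minutes=30):
        return "live"
    if cadence <= timedelta(hours=6):
        return "hourly"
    return "daily"


# Example cadences taken from the paragraph above.
print(freshness_label(timedelta(minutes=15)))  # X conversation layer
print(freshness_label(timedelta(hours=6)))     # YouTube, slowest setting
print(freshness_label(timedelta(days=1)))      # retail ranking snapshot
print(freshness_label(None))                   # editorial reference
```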

Source coverage

Source | Domain | Scope | Cadence | Notes
--- | --- | --- | --- | ---
Google Trends (Country Verified) | country | US | Comparative/related daily; trending-now ~10-30 min | Country-scoped keyword interest and related terms (US, CA). Also joinable via BigQuery public datasets.
X (Country Verified) | country | US | Conversation layer every 5-15 min | Country-aware keyword/brand mention counts plus recent-post evidence.
Reddit (Country Verified) | country | US | Every ~15 min | Community signal over a K-beauty subreddit watchlist via the official Reddit Data API (registered OAuth app, rate-limited); never scraped. Country attribution only from a subreddit's geography (r/SkincareAddictionCanada -> CA; general subs stay global).
Instagram (Global Social) | social | Global | Daily | Watchlist hashtag/account evidence only. Global directional; never presented as US- or CA-specific.
TikTok (Global Social) | social | Global | Daily | Creative Center keyword/hashtag/top-product directional signals (ads-based). Global directional; never presented as country-specific.
Pinterest (Global Social) | social | Global | Daily | Global keyword/visual trend watchlist. Global directional; never presented as country-specific.
YouTube (Global Social) | social | Global | Every 1-6 h | Shorts-adjacent discovery and review/tutorial proof.
Amazon US (Retail Confirmed) | retail | US | Once daily | Best-Sellers + category nodes. Permitted/identified user-agent (ClaudeBot/GPTBot/CCBot are name-blocked). Commerce-intent, not direct sales truth.
Amazon.ca (Retail Confirmed) | retail | CA | Once daily | Bestsellers + Movers & Shakers + New Releases (beauty). Permitted user-agent; throttled against anti-bot 403/503.
Sephora (Retail Confirmed) | retail | US | Per editorial/awards release | Access-blocked at the access layer (Akamai 403) regardless of robots.txt. Referenced as an editorial/analyst source and via official awards content; NOT auto-ingested.
Ulta (Retail Confirmed) | retail | US | Per editorial/awards release | Prestige/mass growth surface treated as an editorial reference.
Walmart.ca (Retail Confirmed) | retail | CA | Per editorial release | Returns anti-bot 418 at the access layer; editorial reference only, not auto-ingested.
YesStyle (Retail Confirmed) | retail | Global | Once daily | /en/k-beauty + annual/mid-year Beauty Awards. robots allows base paths (avoid ?q* and /product-grid/).
Jolse (Retail Confirmed) | retail | Global | Once daily | BEST / NEW / TIME DEAL listings.
StyleKorean (Retail Confirmed) | retail | Global | Once daily | Curated best-selling picks (low weight; curated, not true rank). Now auto-applies CA customs.
Stylevana (Retail Confirmed) | retail | Global | Once daily | Region-segmented best sellers.
Wishtrend (Retail Confirmed) | retail | KR | Once daily | Korea-direct Shopify store popular with NA buyers. /collections/ best-sellers; *sort_by*/filter params disallowed.
TesterKorea (Retail Confirmed) | retail | KR | Once daily | Korea-direct e-tailer. Bestseller/category listings; only /Member/, /Payment/, /Search/ disallowed.
RoseRoseShop (Retail Confirmed) | retail | KR | Once daily | Korea-direct Shopify store. /collections/ best-sellers; *sort_by*/filter params disallowed.
Soko Glam (Retail Confirmed) | retail | Global | Once daily | US-based curated K-beauty (Shopify). /collections/bestsellers; avoid ?sort_by/filter.
Sukoshi Mart (Retail Confirmed) | retail | CA | Once daily | Canada-domestic specialist (Shopify). /collections/best-sellers curated; *sort_by* disallowed.
Lakinza (Retail Confirmed) | retail | CA | Once daily | Montreal-area Canada-domestic specialist (Shopify), ships domestic with no customs. Curated /collections/; filter params disallowed.
TheKShop (Retail Confirmed) | retail | CA | Once daily | Montreal Canada-domestic specialist (Shopify). Curated /collections/; filter params disallowed.
Mikaela Beauty (Retail Confirmed) | retail | CA | Once daily | Edmonton AB Canada-domestic specialist (Shopify). Curated /collections/; filter params disallowed.
ShopDama (Retail Confirmed) | retail | CA | Once daily | Toronto Canada-domestic specialist. Exposes an official /api/ucp/mcp agent endpoint, the preferred ingestion path over scraping.
Kbeauty Canada (Retail Confirmed) | retail | CA | Once daily | Canada-domestic specialist (kbeauty.ca). Curated /collections/; filter params disallowed.
Olive Young Global (Retail Confirmed) | retail | KR | Once daily | Highest-value demand anchor. /display/page/best-seller (Top Orders / Top in Korea) + Olive Young Awards (~180M purchase records). JS-rendered; robots allows /display with Crawl-delay 5. Ships to US and CA.
Hwahae (Retail Confirmed) | retail | KR | Per awards release / daily | Review-driven rankings + biannual awards, retailer-independent. Used as authenticity/recommendation corroboration; awards pages are the preferred low-risk target.
Glowpick (Retail Confirmed) | retail | KR | Per awards release / daily | Review-driven rankings + biannual awards, retailer-independent. Used as authenticity/recommendation corroboration; awards-first.

Scope, domains, and the retail-demand authenticity model

  • Country Intelligence (Google Trends, X, Reddit) is geo-safe: country is attributed only where the source genuinely carries it (e.g. a country-scoped Trends query or a Canada-specific subreddit).
  • Global Social Signal (Instagram, TikTok, Pinterest, YouTube) is directional and global. These signals are NEVER presented as US- or Canada-specific unless a payload explicitly contains geography.
  • Retail Demand sources carry a real storefront geography, so US / CA / KR / Global segmentation here is legitimate. Mass-market retailer position is treated as commerce-intent, not direct sales truth.
  • Retail authenticity model: authentic demand (Korean-domestic purchase data like Olive Young, then mass-market) and review-driven recommendation (Hwahae/Glowpick, retailer-independent) rank ABOVE ad/merchandising-driven Amazon position alone; specialist export retail is corroboration. Disagreement between social hype and retail demand is shown, not hidden.
  • Access-blocked channels (Sephora US/CA, Walmart.ca) are used only as editorial/analyst references and via official awards content; they are not auto-ingested.
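The geo-safety rule in the first two bullets can be sketched as a small attribution function. This is a minimal illustration, not the production logic: the function name, the payload keys ("subreddit", "country"), and the one-entry mapping table are hypothetical; only the r/SkincareAddictionCanada -> CA example comes from the source table.

```python
# Hypothetical subreddit-to-country mapping; only this one entry is
# taken from the source coverage table.
SUBREDDIT_COUNTRY = {"skincareaddictioncanada": "CA"}


def attribute_scope(source: str, payload: dict) -> str:
    """Return a country code only when the source genuinely carries one.

    Anything without explicit geography falls back to "Global".
    """
    if source == "reddit":
        # Reddit: country comes only from the subreddit's own geography.
        subreddit = payload.get("subreddit", "").lower()
        return SUBREDDIT_COUNTRY.get(subreddit, "Global")
    # Every other source: trust geography only if the payload states it.
    return payload.get("country") or "Global"


print(attribute_scope("reddit", {"subreddit": "SkincareAddictionCanada"}))
print(attribute_scope("reddit", {"subreddit": "AsianBeauty"}))
print(attribute_scope("tiktok", {"hashtag": "glassskin"}))
print(attribute_scope("google_trends", {"country": "US"}))
```

The default is deliberately "Global": a missing or empty geography field never gets promoted to a country label.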

Data source access & compliance policy

We read public pages responsibly. Compliance is a first-class, enforced constraint — not a footnote.

robots.txt gate
robots.txt is the gate. Before any scheduled fetch we fetch and cache the target site's robots.txt and only ingest paths Allowed for our user-agent. Disallowed paths (e.g. Shopify *sort_by* URLs, Amazon's name-blocked AI-bot rules, Gmarket's full block) are never fetched. We honor Crawl-delay (e.g. Olive Young Global: 5s) and throttle aggressively.
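A minimal sketch of such a gate, using Python's standard-library urllib.robotparser. The robots.txt text, the bot name ExampleTrendBot, and the example.com URLs are illustrative assumptions, not any site's real rules. Note that the standard-library parser matches rules by path prefix and does not interpret wildcard patterns such as *sort_by*, so wildcard rules would need a different parser or an extra check.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt content (an assumption, not a real site's file).
SAMPLE_ROBOTS = """\
User-agent: *
Crawl-delay: 5
Allow: /display
Disallow: /
"""


def load_rules(robots_txt: str) -> RobotFileParser:
    """Parse already-fetched (cached) robots.txt text into a rule set."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def gate(parser: RobotFileParser, user_agent: str, url: str):
    """Return (allowed, crawl_delay_seconds) for one candidate fetch."""
    allowed = parser.can_fetch(user_agent, url)
    delay = parser.crawl_delay(user_agent) or 0  # None means no delay declared
    return allowed, delay


rules = load_rules(SAMPLE_ROBOTS)
print(gate(rules, "ExampleTrendBot", "https://example.com/display/page/best-seller"))
print(gate(rules, "ExampleTrendBot", "https://example.com/cart"))
```

A scheduler built on this would skip any URL whose verdict is False and sleep for the returned delay between fetches to the same host.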
Once-daily cadence
All retail and review ranking snapshots run once per day. Rankings change at most hourly, so a single daily snapshot is sufficient, minimizes load on the source, and lowers risk.
Derived-data-only
We store only derived data: rankings, positions, product identity, price, and review counts. We never store bulk copies of copyrighted review text or full page captures.
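One way to enforce this at the storage boundary is to project every parsed listing onto a fixed record type that has no field for review text or page markup. The sketch below assumes a hypothetical parsed-listing dict with "rank", "product_id", "price", "reviews", and "html" keys; those names are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DerivedRow:
    """The only shape that reaches storage: no review text, no page capture."""
    source: str
    rank: int
    product_id: str
    price: float
    review_count: int


def to_derived(source: str, raw: dict) -> DerivedRow:
    """Keep derived fields only; review bodies and raw HTML are dropped."""
    return DerivedRow(
        source=source,
        rank=int(raw["rank"]),
        product_id=str(raw["product_id"]),
        price=float(raw["price"]),
        review_count=len(raw.get("reviews", [])),  # count only, never the text
    )


raw_listing = {
    "rank": 1,
    "product_id": "sku-1",
    "price": "18.50",
    "reviews": [{"text": "first review"}, {"text": "second review"}],
    "html": "<html>full page</html>",
}
print(to_derived("example-store", raw_listing))
```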
Access-blocked sites are editorial
Access-blocked sites are editorial, not ingested. Sephora (US/CA) and Walmart.ca return anti-bot 403/418 at the access layer regardless of robots.txt; these are referenced as analyst/editorial sources and via official awards content, not automated snapshots.
Identified user-agent
We fetch with an honest, identified user-agent that carries contact information and is restricted to an explicit allowlist of permitted UAs. Where a site name-blocks AI bots (Amazon blocks ClaudeBot/GPTBot/CCBot while allowing * on bestseller paths), we use a permitted, identified user-agent.
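A minimal sketch of an allowlisted, identified user-agent header. The bot name ExampleTrendBot, the contact address, and the header layout are hypothetical placeholders; the source does not state the actual UA string.

```python
# Hypothetical allowlist and contact address, for illustration only.
PERMITTED_UAS = {"ExampleTrendBot"}
CONTACT = "bot-admin@example.com"


def build_headers(ua_name: str, version: str = "1.0") -> dict:
    """Build request headers, refusing any UA that is not on the allowlist."""
    if ua_name not in PERMITTED_UAS:
        raise ValueError(f"user-agent {ua_name!r} is not on the allowlist")
    return {"User-Agent": f"{ua_name}/{version} (+mailto:{CONTACT})"}


print(build_headers("ExampleTrendBot"))
```

Raising on an unknown name, rather than silently falling back to a generic browser string, keeps every request attributable to a declared identity.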
  • robots.txt Allowed/Disallowed verdict is recorded on every ingestion run.
  • Crawl-delay is honored per source.
  • Once-daily cadence for all retail/review ranking snapshots.
  • Derived data only — never bulk review text or full page captures.
  • Identified, allowlisted user-agent with contact.
  • Access-blocked sites are editorial references, not automated sources.
  • Official awards pages (Olive Young Awards, YesStyle Beauty Awards, Glowpick/Hwahae awards) are preferred as low-risk, high-signal anchors.
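The first checklist item, recording the robots.txt verdict on every ingestion run, could look like the following sketch. The record fields, the function name, and the JSON-line output format are assumptions for illustration; the source does not specify how the verdict is stored.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class IngestionRun:
    source: str
    path: str
    user_agent: str
    robots_verdict: str  # "allowed" or "disallowed"
    crawl_delay_s: int
    checked_at: str      # ISO 8601 timestamp, UTC


def record_run(source: str, path: str, user_agent: str, allowed: bool,
               crawl_delay_s: int, now: Optional[datetime] = None) -> str:
    """Serialize one run's compliance verdict as a JSON line."""
    now = now or datetime.now(timezone.utc)
    run = IngestionRun(
        source=source,
        path=path,
        user_agent=user_agent,
        robots_verdict="allowed" if allowed else "disallowed",
        crawl_delay_s=crawl_delay_s,
        checked_at=now.isoformat(),
    )
    return json.dumps(asdict(run), sort_keys=True)


print(record_run("olive-young-global", "/display/page/best-seller",
                 "ExampleTrendBot", True, 5))
```

Writing the record even when the verdict is "disallowed" leaves an audit trail showing that the path was checked and skipped.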