# Issue #30: Lebanese retailer audit + scraper roadmap
- Issue: #30
- Started: 2026-04-28
- Completed: in progress
## Goal
Ship a single reference doc — `docs/reference/retailers.md` — that profiles 6-8 Lebanese tech retailers (the 3 currently scraped + 3-5 prioritized-next) on category coverage, pricing model, SKU scale, scraper feasibility (page structure, anti-bot, pagination), and affiliate program. Closes with a scraper roadmap that makes #20 (add 3-5 more retailers) trivial to plan.
## Out of scope
- No scraper code. This is a docs-only ticket. Touching `src/scrapers/` is forbidden (issue scope).
- No new ADRs/RFCs. Decisions are recommendations per retailer; the actual "we will add X" is finalized in #20.
- No re-research of global competitors. `docs/reference/competitive-landscape.md` (#35) is the source for global context. This doc is Lebanon-only.
- No persona work. `docs/reference/personas.md` (#36) already covers buyer behavior.
- No category-scope decisions. That's #32. We describe what categories each retailer carries; we don't prescribe what 961tech indexes.
## Approach
Two-phase research + one-pass compile:
- **Currently-scraped (3 retailers)** — info already lives in `src/scrapers/sites/*.ts` (page structure, sold-out signals) and in competitive-landscape §4.4 (UX + density teardowns). One WebFetch per retailer for current category-page structure and rough SKU count; cross-check against the scraper code.
- **Prioritized-next (4-5 candidates)** — picked from the competitive-landscape §4.4 candidate list (CompuOne, Mojitech, Ayoub Computers, Multitech, PCBuildingLeb, with PcMacLB / Gamma / Microcity as fallbacks). Dispatched as parallel subagents per `superpowers:dispatching-parallel-agents` — each subagent visits one retailer's homepage + at least one category page and returns a structured report against the scope checklist in the ticket.
- **Compile** `retailers.md` with a fixed per-retailer template (matches the ticket's "Per retailer, include" list), order: currently-scraped first, then prioritized-next. Add the Scraper roadmap section (top 3 to add, rationale).
- **Verify** with `mkdocs build --strict`.
Why this approach: parallel subagents give 4-5 retailer reads in roughly the wall-clock time of one. The competitive-landscape doc already did the cross-cutting Lebanese analysis — we don't repeat that work, we link to it.
## Per-retailer template (locked before research)
Every retailer entry uses exactly these fields, matching the ticket's scope list:
```markdown
### N. <Retailer Name>

| | |
|---|---|
| **URL** | <url> |
| **Languages** | <EN/FR/AR coverage> |
| **Categories** | <CPU/GPU/MB/RAM/storage/PSU/case/cooler/peripherals/laptops/prebuilt — what's carried> |
| **Pricing model** | <USD-only / LBP-only / dual / USD with cash-rate calc / call-for-price> |
| **SKU scale** | <small <100 / medium 100-500 / large 500+> + how counted |
| **Page structure** | <server-rendered HTML / SPA / API-backed> + platform if known (Shopify/Woo/custom) |
| **Pagination** | <numbered pages / infinite scroll / load-more / no pagination> |
| **Anti-bot signals** | <none observed / UA gating / Cloudflare / CAPTCHA> |
| **Affiliate program** | <yes/no/unknown> + link if known |
| **Notes** | <contact / business model quirks> |
| **Recommendation** | **H / M / L / Skip** + one-sentence rationale |
```
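A filled example of the template, using an entirely fictional retailer so the shape is concrete (all values below are invented, not research findings):

```markdown
### 0. Example Retailer (fictional)

| | |
|---|---|
| **URL** | https://example-retailer.test |
| **Languages** | EN only |
| **Categories** | CPU/GPU/RAM/storage; no peripherals |
| **Pricing model** | USD-only |
| **SKU scale** | medium 100-500, counted via category pagination totals |
| **Page structure** | server-rendered HTML, Shopify |
| **Pagination** | numbered pages |
| **Anti-bot signals** | none observed |
| **Affiliate program** | Unknown — flagged |
| **Notes** | none |
| **Recommendation** | **M**: solid catalog and easy scrape, but no differentiating categories |
```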
## Steps
- **Step 1: Stub the doc with frontmatter + section skeleton**
    - Create `docs/reference/retailers.md` with frontmatter (`title`, `description`, `status: active`, `tags: [reference, scraping, foundation]`), section headers (Scope & method, Currently scraped, Prioritized next, Scraper roadmap, Open questions, See also), and the per-retailer template embedded as a comment for consistency.
    - Add a row in `docs/reference/index.md` linking to it (mirror the pattern of the `competitive-landscape.md` and `personas.md` rows).
    - Verification: `mkdocs build --strict` from a clean state succeeds (no broken refs introduced).
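A minimal sketch of the Step 1 frontmatter, assuming mkdocs-material-style metadata; the description text is a placeholder, only the keys listed in the step are from the ticket:

```yaml
---
title: Lebanese retailer audit
description: <one-line summary — placeholder>
status: active
tags:
  - reference
  - scraping
  - foundation
---
```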
- **Step 2: Currently-scraped retailers — fill 3 entries**
    - PCAndParts — read `src/scrapers/sites/pcandparts.ts` for category URLs + WooCommerce/Flatsome confirmation; WebFetch `https://pcandparts.com/product-category/cpu/` to get rough listing count + pagination model + check for anti-bot; cross-reference competitive-landscape §4.4 for UX context (5/10, 2015-2018-era Woo/Flatsome). Fill the template.
    - 961Souq — read `src/scrapers/sites/souq961.ts` for selectors + Call-For-Price handling; WebFetch `https://961souq.com/collections/cpus`; cross-reference §4.4 (5/10 Shopify, "Call For Price" without inquiry mechanism). Fill the template.
    - Macrotronics — read `src/scrapers/sites/macrotronics.ts` for selectors; WebFetch `https://www.macrotronics.net/collections/processors-cpu`; cross-reference §4.4 (6.5/10 Shopify, includes 10% VAT in displayed prices). Fill the template.
    - Recommendation field for these three is already-on (continue) — keep maintained.
    - Verification: 3 entries filled, every template field has a value or an explicit "Unknown — flagged" note. No TODO/TBD strings.
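The "no TODO/TBD strings" check in Step 2's verification can be mechanized with a few lines of Python (illustrative only; the ticket doesn't require a script):

```python
import re

# Word-boundary match so e.g. "outdated" never trips the check.
PLACEHOLDER = re.compile(r"\b(TODO|TBD)\b")

def unfilled_lines(markdown: str) -> list[str]:
    """Return every line that still carries a TODO/TBD placeholder."""
    return [line for line in markdown.splitlines() if PLACEHOLDER.search(line)]

sample = "| **SKU scale** | TBD |\n| **Pagination** | numbered pages |"
print(unfilled_lines(sample))  # → ['| **SKU scale** | TBD |']
```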
- **Step 3: Prioritized-next — dispatch parallel subagents for 5 candidates**
    - Use `superpowers:dispatching-parallel-agents` to launch 5 `Explore`-type agents in a single message. Per agent: fetch the homepage + the most relevant category page (CPU or "components") + look for anti-bot, pagination, "call for price", a language switcher, and affiliate program disclosure. Each agent returns a structured report keyed to the per-retailer template.
    - Targets, in priority order:
        - CompuOne — `https://compuonelb.com` (most cited in spec §6.2 Wave-1; PC parts focus)
        - Mojitech — `https://mojitech.net` (distributor + retail; signals catalog scale)
        - Ayoub Computers — `https://ayoubcomputers.com` (components + retail)
        - PCBuildingLeb — `https://pcbuildingleb.com` (custom builds + accessories — interesting niche; tests whether they expose a parts catalog or only services)
        - Multitech — `https://multitech-lb.com` (Apple + PC, retail/wholesale; scope-fit risk to flag)
    - Fallbacks if any of the above are unreachable: PcMacLB (`pcmaclb.com`), Gamma Computers (`gammalb.com`), Microcity (`gomicrocity.com`).
    - Verification: 5 subagent reports received; each report covers all template fields or explicitly says "Unable to verify because …".
- **Step 4: Compile prioritized-next entries (4-5 total)**
    - Transcribe each subagent report into the doc, in priority order. Maintain the same template. Where an agent flagged "Unable to verify", state that explicitly in the doc — do not invent data (per ticket constraint).
    - For each, set Recommendation = H / M / L / Skip with a rationale tied to: catalog scale × scraper feasibility × strategic fit (does it bring a category 961tech is weak in, e.g. peripherals/prebuilt?).
    - If total entries land at 7 (3 scraped + 4 next) or 8 (3 + 5), both are within the 6-8 ticket target; prefer 8 unless one candidate is clearly Skip.
    - Verification: every prioritized-next entry has a recommendation; no Recommendation field reads "TBD".
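The "catalog scale × scraper feasibility × strategic fit" rationale could be made explicit as a toy scoring rule. Entirely illustrative: the real Step 4 call is editorial, and the 1-3 scores and thresholds below are invented for the sketch.

```python
def recommend(scale: int, feasibility: int, fit: int) -> str:
    """Map three 1-3 scores to H / M / L / Skip; thresholds are arbitrary."""
    if feasibility == 1:
        return "Skip"  # an unscrapeable site trumps everything else
    score = scale * feasibility * fit  # ranges 1..27
    if score >= 18:
        return "H"
    if score >= 8:
        return "M"
    return "L"

print(recommend(scale=3, feasibility=3, fit=2))  # → H
print(recommend(scale=3, feasibility=1, fit=3))  # → Skip
```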
- **Step 5: Scraper roadmap section**
    - Append a "Scraper roadmap" section: a prioritized list of the next 3-5 retailers to add (the subset of prioritized-next entries with H or M recommendations), with a one-paragraph rationale per pick and one combined "what makes #20 trivial to plan now" close.
    - Call out any retailer flagged strategically-important-but-infeasible (heavy SPA, anti-bot, login-walled) at the top of this section, separate from the recommended-add list. Do not bury it.
    - Verification: at least 3 retailers listed; all picks are present in the audit table above; infeasibility callouts (if any) are separate from the recommended list.
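Step 5's verification ("all picks are present in the audit table above", "at least 3 retailers") is a membership-and-count check. A quick sketch with hypothetical names:

```python
# Hypothetical data: the audit entries and roadmap picks from the doc.
audit_entries = {"CompuOne", "Mojitech", "Ayoub Computers", "PCBuildingLeb", "Multitech"}
roadmap_picks = ["CompuOne", "Mojitech", "Ayoub Computers"]

missing = [p for p in roadmap_picks if p not in audit_entries]
assert len(roadmap_picks) >= 3, "roadmap needs at least 3 picks"
assert not missing, f"picks absent from audit table: {missing}"
print("roadmap consistent with audit table")
```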
- **Step 6: Open questions + cross-links**
    - Add an "Open questions" subsection capturing anything that needs reviewer input or future research (e.g. Lebanese IG-only retailers per competitive-landscape §5.4 #4; brand overlap with `961gamers.com`).
    - Cross-link from competitive-landscape.md §5.4 #4 (an open question deferred to #30) only as a `See also` line in retailers.md — do not edit competitive-landscape.md (it is `status: active` and out-of-scope here).
    - Verification: the see-also block lists competitive-landscape, personas, and the writing-a-scraper guide.
- **Step 7: Gate — `mkdocs build --strict`**
    - Run `mkdocs build --strict` from the project root.
    - Expected: builds without errors. Any broken cross-link fails the build (that is what `--strict` enforces).
    - If failure: fix the offending link/anchor and re-run. Do not commit until clean.
- **Step 8: Atomic commit**
    - `git add docs/reference/retailers.md docs/reference/index.md docs/plans/2026-04-28-issue-30-retailer-audit.md`
    - Commit message: `docs(reference): add Lebanese retailer audit and scraper roadmap` (matches ticket spec exactly; no `--no-verify`, no push, no PR).
    - Verification: `git status` shows a clean tree post-commit; `git log -1 --stat` shows only the three docs files touched.
## Risks
| Risk | Likelihood | Mitigation |
|---|---|---|
| WebFetch returns 403 / anti-bot for one of the candidate retailers | Medium | Subagent flags "anti-bot signals: 403 on direct fetch" — that is itself a finding (it informs the feasibility recommendation). Don't substitute guesses for the missing data; state the gap. |
| Subagent invents data when a field isn't visible (e.g. SKU count, affiliate program) | Medium | Per-retailer template requires explicit "Unknown — flagged" for unverifiable fields. Reviewer sees what was guessed vs. observed. |
| Candidate retailer turns out to be heavy SPA (no useful HTML to scrape) | Low-medium | Ticket "Stop and ask if" clause #1 — pause and surface to MASTER for the strategic call (do we accept a heavier scraper). |
| One of the 5 candidates is dead/redirected | Low | Fallback list (PcMacLB, Gamma, Microcity) is pre-staged in Step 3. |
| Doc grows beyond 6-8 retailers and dilutes the recommendation signal | Low | Template enforces the Recommendation field; cap at 8. If a 9th is interesting, add it to "Open questions" instead of the table. |
## Tests
No code changes → no test additions. The doc passes if `mkdocs build --strict` succeeds (Step 7 gate).
## Doc updates
- Reference: new file `docs/reference/retailers.md`
- Reference index: row added in `docs/reference/index.md` linking to retailers.md
- Architecture: not changed (ingest-pipeline.md already covers the per-retailer scraper pattern)
- ADR: none — no new design decision; recommendations are advisory pending #20
- Glossary: only if a new term is coined; not anticipated
- Issue body: closing comment on #30 with commit SHA + summary
## Rollback
`git revert <sha>` removes the doc. No code/state to undo.