The Best MCPs for VOC Research in Claude (Tested Across 15 Platforms)
Search layer and extract layer, tested across Reddit, HN, Product Hunt, YouTube, Amazon, G2, App Store, GitHub Issues, and more. Real output shown, verdicts on what failed.
Claude can read and synthesize customer language better than any analyst. The problem is always the same: getting the data in. You're still opening Reddit tabs, copying posts, hitting context limits, losing an hour to something that should take five minutes. This guide covers the two-layer MCP stack that fixes it: a search layer to find where your customers are talking, an extract layer to pull what they're actually saying. Tested across 15 platforms with real output and honest verdicts on what failed.

You installed the Reddit MCP. The one that got recommended in two or three threads.
It showed Connected in the dashboard. Tools listed. Everything looked right.
First call: HTTP 403 Forbidden.
You checked the config. Reinstalled. Tried a second Reddit MCP. Same error. Tried a third. Same error. Went back to copy-paste, told yourself you'd figure it out later.
That was the correct response. Copy-paste was never the problem. The MCP was never going to work. The no-auth Reddit endpoints have been blocking unauthenticated requests since 2025. Every tool that calls the .json API without credentials returns 403. The Connected status is real. The data calls just fail. Nobody put that in the README.
That's one failure mode. I found a dozen more.
In early June I ran a full VOC stack test: 15 platforms, 6 tools, every path I could find from "find the right thread" to "paste-ready quote bank." I was planning to cover four platforms. I covered 15 because every block forced me toward a different tool, and every dead end revealed something worth documenting.
This is that report. Every verdict is grounded in an actual tool call and its real output. Including the ones that cost $0.10 and returned nothing.
The pattern that emerged from 15 platforms: every working path has exactly two layers. A search layer to find which threads to pull. An extract layer to pull the full text.
Skip the search layer and you brute-force the wrong threads. Skip the extract layer and Claude gets snippets, not quotes.
No VOC guide I found explained the split. This one does.
What's inside:
- Why VOC research stays stuck at copy-paste
- The two-layer VOC stack: search first, extract second
- How to configure the stack
- How to run a VOC session
- Hacker News: best platform in this stack
- Reddit: residential proxies only
- Product Hunt: structured VOC most people skip
- YouTube, TikTok, and Threads
- Amazon reviews
- G2, Capterra, and Trustpilot: structured B2B VOC
- App Store and Google Play: mobile VOC
- GitHub Issues and Stack Overflow: developer VOC
- What's completely dead: X, Instagram, and Firecrawl's blocked list
- How Claude turns raw comments into copy
- By the end: a configured two-layer stack, a session prompt, and tested verdicts on 15 platforms
- The platform verdict sheet: grab it at the end
🎁 Platform Verdict Sheet: every platform, best tool, cost per run, signal quality, and the gotcha you'll hit. All in one table.

*Hi, I'm Jenny 👋 I believe anyone can thrive with AI, not by mastering the tools, but by building real things with them. I run Build to Launch and the Practical AI Builder program, where we go from experimenting to shipping. Come build with us.
If you're new to Build to Launch, welcome! You might also enjoy:
- What Is an MCP Server? Plain English
- MCP Server Types: Installation Guide
- Research Automation Workflow*

Why VOC research stays stuck at copy-paste
Claude is brilliant at analyzing customer language. Give it 50 Reddit comments sorted by votes, and it will find the five pain themes, extract the verbatim phrases that convert, and tell you which one to lead with.
The problem is always upstream. Claude can't see the data.
Most builders hit this the same way. They have an article to write, a landing page to revise, a product to position. They know the customer language is out there: HN threads, Reddit posts, YouTube comments, Product Hunt reviews. They open tabs. Copy text. Paste. Hit the context limit around comment 30. Start over.
The copy-paste ceiling is real. A single Reddit thread with 300 comments is 15,000+ words. You get through maybe 20 or 30 comments before context runs out. The quotes that would actually change your copy are buried in comment 47.
Traditional social listening solves this at a cost: $15,000 to $200,000 per year for enterprise platforms that give you sentiment dashboards. What they don't give you: the actual verbatim phrases. They tell you 73% of comments are negative. They don't tell you "You hit walls fast. Anything beyond basic logic requires workarounds that take longer than just writing the code yourself."
That quote (from a real product review) is copy. "73% negative" is a chart.
The VOC MCP stack I tested this June runs between $0 and $20 per month depending on volume. The output is a quote bank, not a dashboard. Real sentences, sorted by votes, ready to paste.
The gap that nobody closes in existing guides: they show screenshots of prompts. None of them show the actual output. The whole point of this article is the opposite of that. Every platform section includes what I actually got back.
If you've already built an AI research agent, this is the same architecture applied to one specific job: pulling what your customers say about products like yours. If you haven't, the full research automation workflow is worth reading first.

The two-layer VOC stack
Every working path in this test suite has the same structure:
Layer 1: Search. Find which threads, videos, reviews, or posts to pull. This is discovery: you're looking for the 10 threads out of 10,000 where your specific customer pain shows up with the most votes and the most replies.
Layer 2: Extract. Pull the full text (comments, captions, transcripts, reviews) with vote counts intact.
Skip Layer 1 and you're scraping whatever comes back from a keyword query. Noisy. Wrong subreddits. Tutorial videos full of grateful beginners, not frustrated builders. Skip Layer 2 and Claude sees URL titles and 60-word snippets. No quotes. No vote context. No signal on which pain has the most resonance.
Three tools cover the search layer:
Exa: semantic search, not keyword matching. When you search for "MCP setup frustration," Exa finds posts where builders describe the experience in their own words, even if the word "frustration" never appears. Signal quality: 5/5 in my tests. Exa consistently surfaced the most relevant threads across every platform it could reach.
Tavily: keyword-ranked results with engagement signals. Fast landscape scan. Catches authoritative sources (blog posts, long-form writeups) that Exa sometimes misses. Use both together on high-stakes topics. They return different results.
Firecrawl search: URL discovery for open web targets. Useful for finding specific Product Hunt product pages and blog posts before you hand the URLs to the extract layer.
Two tools cover the extract layer:
Apify: platform-specific scraping at scale. Has actors (pre-built scrapers) for Reddit, Hacker News, Amazon, TikTok, Threads, YouTube, and 15,000+ other sites. Uses residential proxies for platforms that block datacenter IPs. Per-run pricing. A typical VOC session costs $0.07 to $0.25.
Firecrawl: open web scraping. Works on Product Hunt, Hacker News, YouTube pages and transcripts, blogs, and most public web content. Free tier gives you 500 credits per month. Blocked on Reddit, LinkedIn, Instagram, Threads, and X.
The pattern that works on every platform: Exa or Tavily finds the right threads. Apify or Firecrawl pulls the full text. That's the two-layer stack.
One caveat before you install: running 12 MCP servers simultaneously has a token cost. One builder documented 67,000 tokens consumed before a single prompt. Another tracked 27,000 tokens per session across restarts without realizing it. Install the two layers for this stack. Don't add every MCP you find. The stack here is four tools. That's the right number for this job.
For a full breakdown of how MCP connectors work and when to use them vs. other Claude integration types, that's the right starting point. If you're building your own VOC MCP for a platform this stack doesn't cover, custom MCPs in Claude Code is the path.

How to configure the stack
Add all four tools to your claude_desktop_config.json. If you need the full setup guide for every Claude surface, MCP setup for Claude, ChatGPT, VS Code, and Cursor covers it in order.
{
  "mcpServers": {
    "exa": {
      "command": "npx",
      "args": ["exa-mcp-server"],
      "env": { "EXA_API_KEY": "your-exa-key" }
    },
    "tavily": {
      "command": "npx",
      "args": ["tavily-mcp"],
      "env": { "TAVILY_API_KEY": "your-tavily-key" }
    },
    "firecrawl": {
      "command": "npx",
      "args": ["firecrawl-mcp"],
      "env": { "FIRECRAWL_API_KEY": "your-firecrawl-key" }
    },
    "apify": {
      "command": "npx",
      "args": [
        "-y",
        "@apify/actors-mcp-server",
        "--tools",
        "harshmaur/reddit-scraper,ryanclinton/hackernews-search,streamers/youtube-comments-scraper,clockworks/tiktok-comments-scraper,junglee/amazon-reviews-scraper"
      ],
      "env": { "APIFY_TOKEN": "your-apify-token" }
    }
  }
}
The Tavily gotcha. The package is tavily-mcp. Not @tavily/mcp-server. That package exists, the key validates, the install runs without error, and the tools never appear in Claude. It's a valid npm package pointing to the wrong server. Use tavily-mcp exactly as shown.
Apify --tools flag. The flag loads only the actors you name. Without it, the server attempts to load thousands of actors and usually times out. List only what you need for this session. You can update the list without reinstalling.
Apify token. The free plan includes $5/month in credits. At $0.07 per run that covers about 70 runs; at the $0.25 upper end of a typical session, about 20. Get your token from the Apify console under Settings > Integrations.
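The credit math is simple enough to check yourself. A minimal sketch, assuming a flat per-run cost (real Apify runs vary with the actor and the item count); the function name is made up for illustration:

```python
from decimal import Decimal


def runs_per_budget(monthly_credit_usd: str, cost_per_run_usd: str) -> int:
    """Whole number of runs a monthly credit covers at a flat per-run cost."""
    credit = Decimal(monthly_credit_usd)
    cost = Decimal(cost_per_run_usd)
    if cost <= 0:
        raise ValueError("cost_per_run_usd must be positive")
    return int(credit // cost)


# The $5 free credit at both ends of the $0.07 to $0.25 range quoted earlier.
print(runs_per_budget("5.00", "0.07"))
print(runs_per_budget("5.00", "0.25"))
```

Decimal keeps the division exact, which matters when the per-run cost is a fraction of a cent (as with the per-comment actors later in this guide).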
Firecrawl free tier. 500 credits per month. A typical page scrape costs 1–5 credits. Enough for a full VOC session without paying.
Verify after adding, not during. In Claude Code, run claude mcp list after adding. You want ✓ Connected next to each server. The confirmation message after claude mcp add only confirms the command ran. The server may still have failed to start. These are different things.
Restart Claude after config changes. The MCP server list loads on startup. Changes to claude_desktop_config.json, the file Claude Desktop reads, don't take effect until you restart the app.
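Most of the gotchas above can be caught before you restart anything. Here is a rough sketch of a config check. It only knows the three problems described in this section (placeholder keys, the wrong Tavily package, Apify without --tools). The function name and warning wording are my own, not part of any official tool:

```python
import json

PLACEHOLDER_PREFIX = "your-"  # matches the sample keys in the config above
WRONG_TAVILY_PACKAGE = "@tavily/mcp-server"


def lint_mcp_config(config: dict) -> list[str]:
    """Return human-readable warnings for the gotchas described in this section."""
    warnings = []
    for name, server in config.get("mcpServers", {}).items():
        args = server.get("args", [])
        # Placeholder API keys that were never replaced.
        for key, value in server.get("env", {}).items():
            if str(value).startswith(PLACEHOLDER_PREFIX):
                warnings.append(f"{name}: {key} still holds a placeholder value")
        # The Tavily package that installs cleanly but exposes no tools.
        if WRONG_TAVILY_PACKAGE in args:
            warnings.append(f"{name}: use tavily-mcp, not {WRONG_TAVILY_PACKAGE}")
        # Apify without --tools tries to load every actor.
        if any("actors-mcp-server" in arg for arg in args) and "--tools" not in args:
            warnings.append(f"{name}: add --tools with an explicit actor list")
    return warnings


sample = json.loads("""
{"mcpServers": {
  "tavily": {"command": "npx", "args": ["@tavily/mcp-server"],
             "env": {"TAVILY_API_KEY": "your-tavily-key"}},
  "apify": {"command": "npx", "args": ["-y", "@apify/actors-mcp-server"],
            "env": {"APIFY_TOKEN": "example-token"}}
}}
""")
for warning in lint_mcp_config(sample):
    print(warning)
```

To run it against your real file, load it with json.load(open(path)) and pass the result in. The path depends on your OS, so it is left out here.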
One thing production MCP builders document consistently: console.log in a stdio MCP server corrupts the JSON-RPC stream, because it writes to stdout, the same channel the protocol messages travel on. If you're building your own VOC MCP for a platform this stack doesn't cover, send all debugging output to stderr (console.error) instead.

How to run a VOC session
Two rules before the prompts:
- Never ask Claude to find and synthesize in one prompt. The pull step and the synthesis step are separate jobs. Combining them means Claude summarizes the search results (snippets and titles) instead of the actual post text.
- Run the session in three steps.
Step 1: Pull (search layer)
Search for Reddit threads about MCP server setup failures in r/ClaudeAI and r/mcp
from the last 90 days. Return the 10 most upvoted posts with titles, URLs,
and comment counts.
This goes to Exa or Tavily. You get a list of threads with URLs. Pick the 3–5 most relevant.
Step 2: Extract (extract layer)
Pull the top 50 comments from [URL] sorted by votes. Include vote counts.
This goes to Apify (for Reddit, YouTube, TikTok) or Firecrawl (for HN, Product Hunt, open web). You get the full comment text with vote data.
Step 3: Synthesize
Take these comments. Group into 5 pain themes. For each: 3-word theme name,
3 verbatim quotes a real person would screenshot, one-line summary of what
they're actually saying.
The output is a pain point matrix with copy-ready quotes. No export step. No interpretation layer between the raw complaints and your copy.
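If you save the extract output to a file instead of keeping it in chat, the hand-off into Step 3 can be scripted. A small sketch, assuming each comment is a dict with text and votes keys (that shape is my assumption; each actor names its fields differently):

```python
SYNTHESIS_INSTRUCTIONS = (
    "Take these comments. Group into 5 pain themes. For each: 3-word theme name, "
    "3 verbatim quotes a real person would screenshot, one-line summary of what "
    "they're actually saying."
)


def build_synthesis_prompt(comments: list[dict], top_n: int = 50) -> str:
    """Sort comments by votes, keep the top N, and prepend the Step 3 instructions."""
    ranked = sorted(comments, key=lambda c: c.get("votes", 0), reverse=True)[:top_n]
    lines = [f"[{c.get('votes', 0)} votes] {c['text'].strip()}" for c in ranked]
    return SYNTHESIS_INSTRUCTIONS + "\n\n" + "\n".join(lines)


comments = [
    {"text": "Can't believe we're back to early 80s text adventure games.", "votes": 19000},
    {"text": "It's like trying to outsmart a genie.", "votes": 35000},
]
print(build_synthesis_prompt(comments))
```

Keeping the vote count on every line is the point: Claude can weigh a 35,000-vote comment differently from a 3-vote one only if the number survives the paste.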
The full research automation workflow generalizes this session pattern to other research types. The best Claude Code prompts covers structuring multi-step Claude workflows if you want to extend this.
The two-layer framework works. These are the gotchas you won't find in anyone else's guide. Nobody else ran all 15 platforms.
- HN: the easiest platform to scrape AND the richest VOC. Most guides skip it entirely.
- Reddit: no-auth MCPs are dead; the proxy default determines success or failure
- YouTube: the dedicated MCP never connected. Two workarounds exist.
- Product Hunt: Firecrawl works cleanly here (the opposite of every social platform)
- X and Instagram: confirmed dead zones, no workaround

Hacker News: best platform in this stack
Start here. Every time.
HN has three structural advantages over every other platform in this stack: static HTML (no JavaScript rendering required), fully public (no login, no API key), and comment quality that consistently beats Reddit for technical topics. Long-form. Cited. Voted. The 865-point threads read like organized focus groups.
And Firecrawl scrapes it cleanly. No block, no CAPTCHA, no login wall. The cleanest scrape in the entire test suite.
Discovery: use Tavily AND Exa together.
Both tools return different HN threads. In my test on vibe coding, Tavily found 5 threads and Exa found 10 different ones. Fifteen unique threads across both. If you only run Exa, you miss a third of the dataset. If you only run Tavily, you miss two-thirds.
Search Hacker News for threads about vibe coding frustrations from the last 6 months.
Return the top 10 threads by points with titles, URLs, and comment counts.
Run this with Tavily and again with Exa. Deduplicate. You now have the high-signal thread list.
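You can ask Claude to deduplicate in chat, but if you're collecting results in a script, the merge is a few lines. A sketch that keys each thread on its HN item id, so the same thread returned with a different scheme or a #fragment still collapses to one entry. The result shape (title, url, points) is an assumption, and the ids below are placeholders:

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def hn_item_id(url: str) -> Optional[str]:
    """Return the item id from a news.ycombinator.com URL, or None if it isn't one."""
    parsed = urlparse(url)
    if parsed.netloc.lower() != "news.ycombinator.com":
        return None
    ids = parse_qs(parsed.query).get("id")
    return ids[0] if ids else None


def merge_thread_lists(*result_lists: list[dict]) -> list[dict]:
    """Merge search results, drop duplicate threads, sort by points (highest first)."""
    seen: dict[str, dict] = {}
    for results in result_lists:
        for thread in results:
            key = hn_item_id(thread["url"]) or thread["url"]
            seen.setdefault(key, thread)  # first occurrence wins
    return sorted(seen.values(), key=lambda t: t.get("points", 0), reverse=True)


tavily_results = [
    {"title": "Thread A", "url": "https://news.ycombinator.com/item?id=1001", "points": 120},
    {"title": "Thread B", "url": "https://news.ycombinator.com/item?id=1002", "points": 300},
]
exa_results = [
    {"title": "Thread B", "url": "http://news.ycombinator.com/item?id=1002#comments", "points": 300},
    {"title": "Thread C", "url": "https://news.ycombinator.com/item?id=1003", "points": 45},
]
for thread in merge_thread_lists(tavily_results, exa_results):
    print(thread["points"], thread["title"])
```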
Extraction: two paths.
Path A: Firecrawl
Scrape the full comment thread at [HN URL]. Return all top-level comments and
their vote scores.
Takes 3–8 seconds per thread. Returns the complete comment tree. No cost beyond Firecrawl credits (1–5 per page).
Path B: Apify ryanclinton/hackernews-search
Use the hackernews-search actor with query "vibe coding" and maxItems 50.
This actor queries HN via the Algolia search API, the same backend HN's own search uses. One call handles both discovery and extraction. In my test it ran in 6.8 seconds and cost $0.025 for 5 threads with 20 comments each.
More importantly: it found the two biggest threads in the dataset. The 865-point "After two years of vibecoding, I'm back to writing by hand" thread (634 comments) and the 616-point "The cult of vibe coding" thread (512 comments). Both Tavily and Exa missed them. The Algolia index catches content that semantic search underranks.
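The Algolia-backed HN search API is public, so you can also sanity-check it directly. This is a sketch of calling that API yourself, not a description of how the Apify actor works internally. It builds a search URL with a minimum-points filter and reduces a response to the fields that matter for thread selection. The objectID values in the sample payload are placeholders, not the real item ids:

```python
from urllib.parse import urlencode

ALGOLIA_SEARCH = "https://hn.algolia.com/api/v1/search"


def build_search_url(query: str, min_points: int = 0, hits_per_page: int = 50) -> str:
    """Build a story search URL, optionally filtered to threads above a points floor."""
    params = {"query": query, "tags": "story", "hitsPerPage": hits_per_page}
    if min_points > 0:
        params["numericFilters"] = f"points>{min_points}"
    return f"{ALGOLIA_SEARCH}?{urlencode(params)}"


def summarize_hits(payload: dict) -> list[dict]:
    """Reduce a search response to title, points, comment count, and thread URL."""
    threads = []
    for hit in payload.get("hits", []):
        threads.append({
            "title": hit.get("title"),
            "points": hit.get("points") or 0,
            "comments": hit.get("num_comments") or 0,
            "url": f"https://news.ycombinator.com/item?id={hit['objectID']}",
        })
    return sorted(threads, key=lambda t: t["points"], reverse=True)


# Sample payload shaped like a search response; ids are placeholders.
sample_payload = {"hits": [
    {"objectID": "222", "title": "The cult of vibe coding", "points": 616, "num_comments": 512},
    {"objectID": "111", "title": "After two years of vibecoding, I'm back to writing by hand",
     "points": 865, "num_comments": 634},
]}
print(build_search_url("vibe coding", min_points=100))
for thread in summarize_hits(sample_payload):
    print(thread["points"], thread["comments"], thread["title"])
```

To run it live, open the URL with urllib.request.urlopen and pass json.load(response) to summarize_hits.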
What HN actually returned.
From the 865-point thread on returning to manual coding after two years of vibe coding:
"vibe coding leaves me with that same empty hollow sort of tiredness, as a day filled with meetings."
"The AI had simply told me a good story... when you read the whole chapter, it's a mess."
"I never trust the opinion of a single LLM model anymore... always get a 2nd or 3rd opinion before assuming one LLM is correct."
From the 439-comment thread "Will vibe coding end like the maker movement?":
"I am Sisyphus rolling prompts into a terminal."
"2 out of 4 people from my company who were doing this kind of thing got sacked in the last round."
From the Exa-exclusive thread "After months of coding with LLMs, I'm going back to using my brain":
"You solve the immediate problem with the AI, and then you have a new problem... and the AI to fix the new problem introduces a third problem."
Five pain themes from the full dataset: maintenance cliff, code review breakdown, quality degradation loop, "intention is gone" (the AI builds something that works but isn't what you meant), and talent debt (the fear of not being able to maintain it yourself).
Cost: $0.025 for 5 threads + 20 comments each via Apify. Free via Firecrawl credits.
What worked, what didn't:
| Tool/Method | Status | Output | Failure mode | Auth needed | Cost/run | Signal |
|---|---|---|---|---|---|---|
| Exa discovery | ✅ | Thread URLs + titles | — | API key | Free tier | 4/5 |
| Tavily discovery | ✅ | Thread URLs + titles | — | API key | Free tier | 4/5 |
| Firecrawl scrape | ✅ | Full comment tree + vote scores | — | API key | Free credits | 5/5 |
| Apify ryanclinton/hackernews-search | ✅ | Full threads + comments via Algolia | — | API key | $0.025/5 threads | 5/5 |
| Semantic search alone (no extract) | ❌ | Snippets only | No full comment text without extract step | API key | Free | 1/5 |
Verdict: 5/5. Best platform in this stack. Static HTML + Algolia API + long-form comments = the highest-signal VOC source I found across 15 platforms.

Reddit: residential proxies only
The no-auth endpoints are dead. That's not recoverable with a config change.
At the protocol level: Reddit's .json API endpoints (e.g., reddit.com/r/ClaudeAI/.json) now return 403 for unauthenticated clients. The HTML pages aren't blocked. The API is. Every Reddit MCP that doesn't handle authentication (jordanburke/praw-mcp, eliasbiondo/reddit-mcp, and most others you'll find recommended) calls the .json endpoint. Every call fails. The MCP shows Connected because the server started. The tool calls fail because the endpoint blocks them.
What works: harshmaur/reddit-scraper (Apify)
This actor uses residential proxies: IP addresses assigned to real households, not datacenter servers. Reddit's block targets datacenter IPs. Residential proxies route around it.
Cost: $0.07 per run, 25 items. The run I tested took 23 seconds and returned 25 posts from r/mcp with full text, vote scores, and comment counts.
Use the harshmaur/reddit-scraper actor with subreddit "mcp",
keywords "setup failure", maxItems 25, sort "top".
What doesn't work: datacenter proxy actors.
I tested betterdevsscrape/reddit-scraper, which uses datacenter proxies by default. Result: 0 items returned. Cost: $0.10, billed twice. If an actor's default proxy is labeled "datacenter," it will fail on Reddit and still charge you. Check the actor's proxy settings before running.
Free fallback:
curl -sS -A "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)" \
"https://old.reddit.com/r/mcp/new/.json?limit=25"
old.reddit.com serves the .json endpoint without authentication blocks. I verified this returns live data. Zero credentials, zero cost. Useful for quick checks before spending Apify credits.
Two trust checks to run before extracting.
Reddit data has two silent failure modes:
Stale index: Exa served me a page dated 3 weeks old with no staleness indicator. The content was real but outdated. Cross-check timestamps against Reddit's own sort-by-new.
Login wall response: www.reddit.com returned HTTP 200 with an 8.4KB "Please wait for verification" payload. Zero posts, successful status code. Always check that the response includes actual post content, not a verification page.
What Reddit actually returned.
From r/programming: "MCP Security is still Broken" (343 upvotes, 114 comments):
Threads about authentication tokens persisted in MCP server logs, context injection via malicious tool descriptions, and developers realizing their API keys were being read by other tools in the same session.
From r/mcp: "Perplexity drops MCP" (238 upvotes):
Debate about whether MCP's complexity is fundamental or fixable, and whether the protocol adds more friction than it removes for production use cases.
The thread title "How I solved the 'dead but connected' MCP server problem" exists, and the title alone is better VOC than most synthesis reports.
What worked, what didn't:
| Tool/Method | Status | Output | Failure mode | Auth needed | Cost/run | Signal |
|---|---|---|---|---|---|---|
| jordanburke/praw-mcp | ❌ | HTTP 403 | .json API blocks unauthenticated clients | No | Free | — |
| eliasbiondo/reddit-mcp | ❌ | HTTP 403 | Same .json block | No | Free | — |
| Firecrawl scrape | ❌ | Hard block | Engine-level block confirmed | No | Free | — |
| betterdevsscrape/reddit-scraper | ❌ | 0 items returned | Datacenter IPs blocked by Reddit | API key | $0.10/run | — |
| curl old.reddit.com/.json | ✅ | Post JSON with text + votes | — | No | Free | 3/5 |
| Apify harshmaur/reddit-scraper | ✅ | Full posts + vote scores via residential proxy | 2.34/5 actor rating — test before production use | API key | $0.07/25 posts | 3/5 |
Verdict: 3/5. Signal is real but noisy. Tune subreddit and keywords carefully. Use r/mcp and r/ClaudeAI for MCP-specific topics. r/programming for broader developer pain.

Product Hunt: structured VOC most people skip
Product Hunt review pages are pre-structured VOC. Every review has three fields: What's great, What needs improvement, Alternatives considered. That's a customer interview format, self-administered by people who just paid for or seriously evaluated the product.
Firecrawl works cleanly on PH. No block. Full text. This is the opposite of Reddit and LinkedIn, where Firecrawl returns either 403 or a login page.
Workflow:
Search for Product Hunt review pages for [product name]. Return the URLs
for the review section.
Exa or Tavily handles this. Then:
Scrape the review page at [URL]. Extract all reviews including the
"What needs improvement" sections with reviewer names and dates.
Firecrawl handles the extract.
What Product Hunt actually returned.
From Base44 reviews (37 reviews, 4.4/5):
"You hit walls fast. Anything beyond basic logic requires workarounds that take longer than just writing the code yourself." (Naumaan Zahid)
"The vendor lock-in concern is real and underreported. The exported code is tied to Base44's own infrastructure and won't run anywhere else without a serious rewrite." (Nolan Vu)
From Emergent reviews (14 reviews, 4.6/5):
"I spent over 30 days building an app... Real users. Real data. Then I moved it to a shared workspace. At NO point was it made crystal clear this action was irreversible." (Antonio Fuček, 445 views)
1-star from KTerion Miller: "Lost a major paying client who was literally days away from final sign-off. Burned through weeks of billable dev time, never recoverable."
Five pain themes across both products: vendor lock-in (exit costs underestimated at purchase), credit models that punish exploration (hit the cap mid-project), data loss with no SLA, complexity wall after the initial build, and irreversible actions without clear warnings.
The Apify Product Hunt actor limitation.
The maximedupre/product-hunt-scraper actor exists but can't target specific product review pages. It accepts category URLs, not product URLs. Use it for breadth (all products in a category). Use Firecrawl for depth (specific product reviews). They're complementary, not interchangeable.
What worked, what didn't:
| Tool/Method | Status | Output | Failure mode | Auth needed | Cost/run | Signal |
|---|---|---|---|---|---|---|
| Firecrawl scrape | ✅ | Full reviews + structured pros/cons fields | — | API key | Free credits | 5/5 |
| Exa/Tavily discovery | ✅ | Review page URLs | — | API key | Free tier | 4/5 |
| maximedupre/product-hunt-scraper | ⚠️ | Category product listings | Can't target specific product review pages | API key | $0.05+ | 3/5 |
Verdict: 5/5 for open-web products with a PH presence. Highest-quality quote per scrape of any platform. Structured format means less filtering. The "What needs improvement" field is already sorted for you.

YouTube, TikTok, and Threads
Three platforms, three different workarounds.
YouTube
The dedicated YouTube Comments MCP (wynandw87) never connected. It installed without error. Claude listed the tools. Every tool call returned nothing: no response, no error, no output. Silent connection failure. Same pattern as the Amazon MCP (covered in the next section).
No other search tool reaches YouTube comment text directly. Tavily and Exa both hit the JavaScript-render wall on comment sections. They return the page metadata, not the comments.
Two workarounds that do work:
Apify streamers/youtube-comments-scraper: 4.8/5 stars, 99% uptime, $0.002 per comment. Pass any video URL and it returns the full comment thread with vote counts and timestamps.
Use the youtube-comments-scraper actor with videoUrl "[URL]" and maxComments 100.
Exa video titles as proxy VOC. Exa can't pull comment text, but it returns video titles. Video titles that pass the click test are themselves VOC. Every title is a headline a creator tested against their audience.
From a vibe coding Exa search:
- "Every Claude Code Tutorial I Watched Left Me More Confused"
- "Claude Code Has 5 Levels. Most People Are Stuck on Level 1."
- "Seriously, please watch this before you start learning Claude Code"
Those titles are pain points. No comment extraction needed. The title already surfaces the frustration.
Firecrawl bonus: full transcripts. Firecrawl can scrape YouTube watch pages and return the auto-generated transcript. Creator narration often directly addresses audience complaints. A creator making a video called "Every Tutorial Left Me Confused" is responding to something their audience said. The transcript captures that.
Video selection rule: "Why I quit X," "broken," and complaint-framed titles produce complaint-dense comment sections. Tutorial titles ("How to get started with...") produce grateful beginner comments with 1–2% genuine pain buried deep. Filter for the former.
Highest vote counts of any platform: YouTube Shorts comedy comments. From "What it's like to Vibe Code" (3,566 comments):
- 35,000 votes: "It's like trying to outsmart a genie."
- 19,000 votes: "Can't believe we're back to early 80s text adventure games."
Those vote counts are a signal about resonance, not just sample size. The 35,000-vote comment is more reliable VOC than a 300-upvote Reddit post.
What worked, what didn't:
| Tool/Method | Status | Output | Failure mode | Auth needed | Cost/run | Signal |
|---|---|---|---|---|---|---|
| wynandw87 YouTube Comments MCP | ❌ | Nothing | Silent failure — tools listed, every call returns nothing | No | Free | — |
| Firecrawl scrape | ❌ | Page metadata only | JavaScript render wall blocks comment section | No | Free | — |
| Tavily/Exa comment extraction | ❌ | 60-word snippets max | Same JS render wall | API key | Free | 1/5 |
| Exa video title proxy | ✅ | Video titles (tested headlines) | No comment text | API key | Free | 3/5 |
| Firecrawl transcript | ✅ | Auto-generated transcript | — | API key | Free credits | 4/5 |
| Apify streamers/youtube-comments-scraper | ✅ | Full comments + vote counts | — | API key | $0.002/comment | 4/5 |
TikTok
Apify clockworks/tiktok-comments-scraper: 4.77/5, 97.5% success, $0.00125 per comment. 29 seconds for 66 comments in my test.
The critical finding: CTA predicts comment quality, not view count.
A video with 50,000 views and a "Comment 'Fix' for the prompt" CTA returns keyword spam. Hundreds of people commenting "Fix" with zero context. A video with 8,000 views and "What do you struggle with?" returns real pain:
"Claude loses context, forgets what it was doing mid-task, and I have to restart."
"When it fails, it gets stuck in a death spiral — fixing one thing breaks another."
"I don't know when Claude is confused vs when it's confidently wrong."
"Asking for honesty: you are one of the few who is honest — the rest says build in 1 min."
From this open-question TikTok video, the pain themes: state and persistence failures, death-spiral debugging, large-file wall, free-model confusion, and what I'd call honesty fatigue. The relief when a creator admits difficulty is so strong it produces explicit gratitude.
Use Tavily to find TikTok video URLs with engagement counts. Hand the URLs to Apify for comment text. Don't use Tavily for comments directly. It can't extract them.
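You can spot a keyword-CTA video from a small sample before paying for the full pull. A rough heuristic sketch: count the share of comments that are both very short and repeated word for word. The two-word cutoff is a guess on my part, not something I tuned against real data:

```python
from collections import Counter


def keyword_spam_ratio(comments: list[str], max_words: int = 2) -> float:
    """Share of comments that are short and repeat another comment verbatim."""
    normalized = [c.strip().lower() for c in comments if c.strip()]
    if not normalized:
        return 0.0
    counts = Counter(normalized)
    spam = sum(
        1 for c in normalized
        if len(c.split()) <= max_words and counts[c] > 1
    )
    return spam / len(normalized)


keyword_cta_sample = [
    "Fix", "fix", "FIX ", "Fix",
    "Claude loses context, forgets what it was doing mid-task, and I have to restart.",
]
open_question_sample = [
    "Claude loses context, forgets what it was doing mid-task, and I have to restart.",
    "I don't know when Claude is confused vs when it's confidently wrong.",
]
print(keyword_spam_ratio(keyword_cta_sample))
print(keyword_spam_ratio(open_question_sample))
```

A high ratio on the first 20 or so comments is a reasonable cue to skip the video and move to one with an open-question CTA.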
What worked, what didn't:
| Tool/Method | Status | Output | Failure mode | Auth needed | Cost/run | Signal |
|---|---|---|---|---|---|---|
| Tavily/Exa direct extraction | ❌ | Video URLs only | No comment text accessible | API key | Free | — |
| Apify clockworks/tiktok-comments-scraper | ✅ | Full comments + engagement metrics | CTA type determines quality — filter for open-question CTAs | API key | $0.00125/comment | 4/5 |
Threads
Apify igview-owner/threads-search-scraper (99.8% success) returns post captions and engagement metrics from keyword searches.
The signal type is different from HN or Reddit. Threads runs on hot-takes and debates. You won't find 50-comment technical teardowns. You'll find a builder posting "MCPs ate $100 of Claude usage in one hour" and 200 people arguing about whether that's normal.
For replies, two-step: Apify discovery to find the post → logical_scrapers/threads-post-scraper to pull the reply thread.
Tavily surfaced the highest-signal Threads post in my session: "$100+ of extra Claude usage in 1 hour... the biggest lever most people miss: stop using Claude in Chrome for scraping" (Paweł Huryn). That post and its replies contain better VOC on MCP token overhead than three Reddit threads combined.
What worked, what didn't:
| Tool/Method | Status | Output | Failure mode | Auth needed | Cost/run | Signal |
|---|---|---|---|---|---|---|
| Firecrawl scrape | ❌ | Hard block | Engine-level block (confirmed blocked list) | No | Free | — |
| Tavily/Exa discovery | ✅ | Post captions + engagement | No reply text without second call | API key | Free | 2/5 |
| Apify igview-owner/threads-search-scraper | ✅ | Post captions + metrics from keyword search | Replies need separate second actor call | API key | ~$0.01/post | 3/5 |
| Apify logical_scrapers/threads-post-scraper | ✅ | Full reply thread | Requires post URL from prior discovery step | API key | ~$0.01/thread | 3/5 |
Verdict: YouTube 4/5 (pick the right videos). TikTok 4/5 (pick the right CTA). Threads 3/5 (debate-starters, not detailed complaints).

Amazon reviews
The dedicated voc-amazon-reviews-mcp is broken. It installs. The tools appear. Every call fails with a missing file error: the shell scripts it depends on (voc.sh, fetch.sh) aren't in the package. Don't install it.
What works: Apify junglee/amazon-reviews-scraper
Rated 2.34/5 across 41 reviews. Test before relying on it for production VOC. In my test it ran in 14 seconds and cost $0.061 for 10 reviews.
One gotcha: the maxTotalChargeUsd cap has a $0.50 minimum. If you set it below $0.50, the actor throws an error, even when the actual cost of your run is $0.06. Set it to at least $0.50.
Use the amazon-reviews-scraper actor with
asin "YOUR_ASIN",
filterByRatings ["oneStar","twoStar","threeStar"],
sort "recent",
maxItems 10,
maxTotalChargeUsd 0.50.
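If you build the actor input in a script rather than a prompt, it's worth failing locally before the actor does. A sketch that mirrors the parameter names from the prompt above. I haven't checked them against the actor's published input schema, so treat the field names and the $0.50 floor as what I observed, not as documented API:

```python
MIN_CHARGE_CAP_USD = 0.50  # floor observed in my run, not taken from Apify docs
COMPLAINT_RATINGS = ("oneStar", "twoStar", "threeStar")


def build_amazon_reviews_input(
    asin: str,
    max_items: int = 10,
    charge_cap_usd: float = 0.50,
    ratings: tuple = COMPLAINT_RATINGS,
    sort: str = "recent",
) -> dict:
    """Build the run input, rejecting a charge cap below the observed minimum."""
    if charge_cap_usd < MIN_CHARGE_CAP_USD:
        raise ValueError(
            f"maxTotalChargeUsd must be at least {MIN_CHARGE_CAP_USD:.2f}, "
            f"got {charge_cap_usd:.2f}"
        )
    return {
        "asin": asin,
        "filterByRatings": list(ratings),
        "sort": sort,
        "maxItems": max_items,
        "maxTotalChargeUsd": charge_cap_usd,
    }


print(build_amazon_reviews_input("YOUR_ASIN"))
```

The cap is a ceiling, not a price. A run that costs $0.06 still bills $0.06 with the cap set to $0.50.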
The two-variable lesson: product AND rating filter both matter.
I ran the same actor twice on different configurations:
Wrong combo: The Lean Startup, allStars filter, sort:recent → 10 five-star reviews. Zero VOC. Everyone loves the book.
Right combo: $100M Offers by Alex Hormozi, 1–3 star filter, sort:helpful → 10 complaint reviews. Real quotes.
"Tips and strategies aren't exactly new and revolutionary... likely a lead magnet for big-bucks services." (24 reactions)
"The writer is splitting this series up in pieces to make more sales." (6 reactions)
"I read this book because executives at my employer recommended it. Now I'm not sure what they found so valuable."
The product choice determines whether Amazon reviews are useful VOC at all. You need a product where your ICP is a buyer AND complains: business books, tools with Amazon listings, courses, physical products with a meaningful 1–3 star cluster. A beloved book produces five-star reviews. A polarizing one produces quotes.
What worked, what didn't:
| Tool/Method | Status | Output | Failure mode | Auth needed | Cost/run | Signal |
|---|---|---|---|---|---|---|
| voc-amazon-reviews-mcp | ❌ | Missing file error | Shell scripts (voc.sh, fetch.sh) not included in package | No | Free | — |
| Firecrawl scrape | ❌ | HTTP 200 = login page | Verification payload masquerades as 200 OK | No | Free | — |
| allStars filter on beloved product | ❌ | 10 five-star reviews | Zero complaint VOC from universally-liked products | API key | $0.06 | 0/5 |
| Apify junglee/amazon-reviews-scraper (1–3 star) | ✅ | Full review text + reaction counts | 2.34/5 actor rating — test before production; $0.50 minimum billing floor | API key | $0.061/10 reviews | 4/5 |
Verdict: 4/5 for the right product category. Broken MCP, working Apify actor. Use the rating filter. Don't skip the maxTotalChargeUsd minimum.

G2, Capterra, and Trustpilot: structured B2B VOC
Software review sites are the most underused source in this stack. Every review has a structured cons field. Every reviewer just paid for or seriously evaluated the product. And all three sites are public: no auth required for the first page of reviews, and Firecrawl works on all of them.
This is Product Hunt's VOC format applied to a much wider product universe. Anything with a G2 presence is fair game.
The workflow:
Search for G2 reviews of [product name]. Return the URL for the reviews page.
Then:
Scrape the G2 reviews page at [URL]. Extract all reviews, focusing on the
"What do you dislike?" sections with reviewer industry, company size, and date.
What you get back: structured reviews with star rating, "What do you like?" and "What do you dislike?" fields, reviewer industry, company size, and verified buyer badge. The "dislike" field is the equivalent of Product Hunt's "What needs improvement" — pre-sorted VOC from buyers.
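Once the scrape comes back, the "dislike" field can be pulled out and grouped by reviewer metadata before it goes to synthesis. A rough Python sketch: the `Review` shape and its field names are assumptions for illustration, not Firecrawl's or G2's actual output schema, and the sample records are placeholders.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Review:
    # Hypothetical record shape; map your scraper's real fields onto it.
    stars: float
    like: str
    dislike: str
    industry: str
    company_size: str


def dislikes_by_industry(reviews):
    """Group non-empty 'What do you dislike?' answers by reviewer industry."""
    grouped = defaultdict(list)
    for review in reviews:
        text = review.dislike.strip()
        if text:
            grouped[review.industry].append(text)
    return dict(grouped)


# Placeholder records, not real review data.
sample = [
    Review(3.0, "placeholder like", "placeholder dislike A", "Marketing", "11-50"),
    Review(4.0, "placeholder like", "", "Marketing", "51-200"),
    Review(2.0, "placeholder like", "placeholder dislike B", "Software", "1-10"),
]

for industry, complaints in dislikes_by_industry(sample).items():
    print(industry, complaints)
```

Grouping by industry or company size first makes it easier to see whether a complaint comes from your ICP or from a segment you don't sell to.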
Trustpilot adds a dimension G2 doesn't have: the reviewer wasn't necessarily evaluating a competitor — they're a customer of the exact product. More direct signal, less comparison noise.
The Apify path:
For Trustpilot: apify/trustpilot-scraper (well-established, high rating) handles bulk review extraction with star filtering. Pass 1,2,3 as the star filter to isolate complaint reviews.
For G2: search the Apify store for G2 scrapers. Several actors exist with varying reliability. Check ratings and recent run success rates before deploying.
What worked, what didn't:
| Tool/Method | Status | Output | Failure mode | Auth needed | Cost/run | Signal |
|---|---|---|---|---|---|---|
| Firecrawl scrape (G2, Capterra) | ✅ | Structured reviews + ratings + industry metadata | Some pages require login for reviews beyond page 1 | No | Free credits | 4/5 |
| Firecrawl scrape (Trustpilot) | ✅ | Full review text + star ratings | — | No | Free credits | 4/5 |
| Exa/Tavily discovery | ✅ | Review page URLs for any product | — | API key | Free tier | 3/5 |
| Apify apify/trustpilot-scraper | ✅ | Bulk reviews + star filter + metadata | — | API key | ~$0.05/50 reviews | 4/5 |
Verdict: 4/5. Highest-reliability data in this stack — structured format, public pages, no auth friction. Best for B2B SaaS, tools, and services where your ICP shops G2 or Trustpilot before buying.

App Store and Google Play: mobile VOC
Mobile reviews are a different kind of data than web reviews: shorter, more emotional, less hedged. The person who left a 2-star review while waiting for the app to load is not carefully considering their words. That's the point.
Both stores are fully public. Apify has well-established, highly rated actors for both. The star filter is the entire strategy: 1–3 stars returns complaint VOC, 4–5 stars returns feature requests from power users who want more.
Apify actors:
- App Store: apify/apple-store-scraper. Pass the app ID, star filter, and sort order.
- Google Play: apify/google-play-scraper. Same pattern.
Use the apple-store-scraper actor with appId "[App Store ID]",
rating [1, 2, 3], sort "mostRecent", maxItems 50.
The developer response signal: both stores show whether the developer responded to a review. A 2-star review that got a developer response is doubly useful: the complaint is real (low enough rating to surface) and the response reveals what the company knows is a pain point. When a company responds to "export is broken," they're confirming it's broken.
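The developer-response filter can be sketched in a few lines of Python. The field names (`rating`, `text`, `developerResponse`) are assumptions; check them against what your actor actually returns. The sample records are placeholders, not scraper output.

```python
def confirmed_complaints(reviews, max_stars=3):
    """Keep low-star reviews that also received a developer response."""
    return [
        r for r in reviews
        if r["rating"] <= max_stars and r.get("developerResponse")
    ]


# Placeholder records for illustration only.
sample = [
    {"rating": 2, "text": "export is broken", "developerResponse": "We're working on a fix."},
    {"rating": 1, "text": "crashes on launch", "developerResponse": None},
    {"rating": 5, "text": "love it", "developerResponse": "Thanks!"},
]

for review in confirmed_complaints(sample):
    print(review["rating"], review["text"])
```

Only the first record survives: it is both a complaint and one the company has acknowledged.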
What worked, what didn't:
| Tool/Method | Status | Output | Failure mode | Auth needed | Cost/run | Signal |
|---|---|---|---|---|---|---|
| Firecrawl scrape | ⚠️ | Review snippets only | Full review text requires JS render that Firecrawl doesn't execute for app stores | No | Free credits | 2/5 |
| Apify apify/apple-store-scraper | ✅ | Full review text + star rating + developer response | — | API key | ~$0.05/50 reviews | 4/5 |
| Apify apify/google-play-scraper | ✅ | Full review text + thumbsUpCount + developer reply | — | API key | ~$0.05/50 reviews | 4/5 |
Verdict: 4/5 for mobile products or any SaaS with a companion app. The developer response column adds a dimension no other platform gives you.

GitHub Issues and Stack Overflow: developer VOC
A 50-comment GitHub issue thread on a popular open-source tool is a focus group. Every participant is a builder. The thread is sorted by engagement. The language is precise because the person filing the issue is trying to communicate exactly what broke.
This is the platform tier no VOC guide covers because it's not "social media." It's not marketing. It's builders talking to each other about what's broken, in detail, with reproduction steps.
GitHub Issues:
Firecrawl works on public GitHub issue pages. No auth required. The URL pattern is predictable: github.com/[owner]/[repo]/issues?q=is%3Aopen+sort%3Acomments-desc.
Scrape the GitHub issues page at [repo_url]/issues?q=is%3Aopen+sort%3Acomments-desc.
Extract the top 10 issues by comment count with their titles, labels, and first comment.
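If you are pulling several repos, the URL pattern above can be generated instead of typed. A small Python sketch using only the standard library; `top_commented_issues_url` is a hypothetical helper name, and `OWNER` and `REPO` are placeholders:

```python
from urllib.parse import urlencode


def top_commented_issues_url(owner, repo):
    """Build the open-issues URL sorted by comment count, most-commented first."""
    # urlencode turns ':' into %3A and the space into '+', matching the pattern above.
    query = urlencode({"q": "is:open sort:comments-desc"})
    return f"https://github.com/{owner}/{repo}/issues?{query}"


print(top_commented_issues_url("OWNER", "REPO"))
```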
The GitHub MCP also gives you direct programmatic access if you want to search across repos or filter by label.
Stack Overflow:
Fully public, Firecrawl-accessible. Questions with 10+ answers on a specific tool are pain documentation. The answers themselves contain the exact error messages, version conflicts, and configuration failures that produce your best copy.
Search Stack Overflow for questions tagged [tool_name] with the most answers.
Return question titles, vote counts, and accepted answer summaries.
What worked, what didn't:
| Tool/Method | Status | Output | Failure mode | Auth needed | Cost/run | Signal |
|---|---|---|---|---|---|---|
| Firecrawl scrape (GitHub Issues) | ✅ | Issue titles + comment counts + labels | Private repos blocked; public repos fully accessible | No | Free credits | 4/5 |
| GitHub MCP | ✅ | Full issue text + comments + reactions | Rate limits apply at high volume | API token | Free | 5/5 |
| Firecrawl scrape (Stack Overflow) | ✅ | Questions + vote counts + answer summaries | — | No | Free credits | 4/5 |
| Exa/Tavily discovery | ✅ | GitHub issue URLs + SO question URLs | — | API key | Free tier | 3/5 |
Verdict: 5/5 for developer tools, open-source projects, or any SaaS where your ICP files GitHub issues. The signal quality is the highest of any platform in this stack for technical pain specifically.
What's completely dead: X, Instagram, and Firecrawl's blocked list
The honest map. No workaround, no "it depends," no "try this obscure actor."
| Platform | Tool/Method | Status | Output | Failure mode | Auth needed | Cost/run | Signal |
|---|---|---|---|---|---|---|---|
| Twitter/X | Apify store search | ❌ | — | Zero viable actors (API pricing killed the ecosystem) | — | — | — |
| Twitter/X | Firecrawl | ❌ | — | Engine-level block before any request | No | Free | — |
| Twitter/X | Tavily/Exa | ⚠️ | URLs returned | All content login-walled | API key | Free | 1/5 |
| Twitter/X | Exa article proxy | ✅ | Blog posts quoting viral threads | Not real-time | API key | Free | 3/5 |
| Instagram | Apify (official IG actor) | ❌ | 2 filler comments | Promotional content, emoji spam only | API key | Varies | 0/5 |
| Instagram | Firecrawl | ❌ | — | Hard block: "we do not support this site" | No | Free | — |
|  | Firecrawl | ❌ | — | Hard block (same error as Instagram) | No | Free | — |
|  | Tavily/Exa | ⚠️ | Snippets only | Full posts behind login | API key | Free | 1/5 |
|  | Firecrawl | ❌ | — | Engine-level block confirmed | No | Free | — |
| Amazon | Firecrawl | ❌ | HTTP 200 = login page | Verification payload masquerades as success | No | Free | — |
Firecrawl's confirmed blocked list: Reddit, LinkedIn, Instagram, Threads, X. Firecrawl confirmed working: Product Hunt, Hacker News, YouTube pages and transcripts, blogs, most open web.
Two platforms worth noting:
X: the workaround that does work. Exa finds web articles written up within days of viral X threads. Medium posts, DEV.to writeups, Substack newsletters quoting the original tweet with context. You get the recorded version of X discourse: full context, not 280 characters, often richer than the original. Andrej Karpathy's "Slopacolypse" post surfaced this way. The Exa result included three blog posts with complete transcripts of the Twitter thread and follow-up context.
It's not real-time. But for VOC on publicly-documented frustrations, it often returns more usable quotes than the original thread would have.
LinkedIn: the actor that works but I didn't test live. harvestapi/linkedin-post-search (4.98/5, no cookies required, $0.002 per post) is confirmed working by multiple reviews. I didn't run it in this session. I'm noting it here as the path if LinkedIn data is part of your VOC stack, with the caveat that scraping LinkedIn without authorization likely violates their ToS.
GitHub issue #52015 tracks Claude breaking working MCP servers after updates. Worth monitoring if you're maintaining this stack. A dev.to post tracking 29 MCP pain points across 7 communities documents Cloudflare MCP consuming 1.17 million tokens per session and GitHub MCP averaging 45,000 tokens. The token overhead isn't hypothetical.
If you want to build a VOC MCP for a platform this stack doesn't cover, the complete MCP server build guide is the starting point. The full MCP server landscape is worth reading if you're adding anything beyond what's in this stack.
Verdict: Dead is dead. Don't spend an afternoon debugging X scrapers. The Exa workaround for documented discourse is faster and often better.

How Claude turns raw comments into copy
The synthesis step is where the stack earns its cost.
Traditional social listening dashboards give you sentiment percentages. 73% negative. 18% neutral. 9% positive. That tells you how many people are unhappy. It doesn't tell you what they're saying.
Claude gives you the actual sentences.
The synthesis prompt:
Take these [N] comments. Group into 5 pain themes. For each:
- 3-word theme name
- 3 verbatim quotes a real person would screenshot
- One-line summary of what they're actually saying underneath the complaints
The "verbatim quotes a real person would screenshot" instruction matters. Without it, Claude summarizes. With it, Claude finds the sentences that already sound like copy. They were written by people trying to express something precisely.
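If you run this step in a script rather than in chat, the prompt can be assembled from whatever the extract layer returned. A minimal Python sketch; `build_synthesis_prompt` is a hypothetical helper, and numbering the comments is my own assumption to make quotes easier to trace back:

```python
def build_synthesis_prompt(comments, themes=5, quotes_per_theme=3):
    """Assemble the synthesis prompt from a list of raw comment strings."""
    header = (
        f"Take these {len(comments)} comments. Group into {themes} pain themes. For each:\n"
        "- 3-word theme name\n"
        f"- {quotes_per_theme} verbatim quotes a real person would screenshot\n"
        "- One-line summary of what they're actually saying underneath the complaints\n"
    )
    # Number each comment so a quote can be checked against its source.
    body = "\n".join(
        f"[{i}] {comment.strip()}" for i, comment in enumerate(comments, start=1)
    )
    return f"{header}\n{body}"


# Placeholder comments, not real thread data.
prompt = build_synthesis_prompt(["placeholder comment one", "  placeholder comment two  "])
print(prompt)
```

Keeping the "verbatim quotes" line intact is the part that matters; the rest of the wrapper is convenience.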
What comes out:
From the HN vibe coding dataset, Claude returned:
Maintenance cliff: "You solve the immediate problem with the AI, and then you have a new problem, and the AI to fix the new problem introduces a third problem." The AI-introduced complexity is exponential, not linear.
Quality degradation loop: "The AI had simply told me a good story... when you read the whole chapter, it's a mess." Outputs look correct at the section level and break at the system level.
Intention is gone: "vibe coding leaves me with that same empty hollow sort of tiredness, as a day filled with meetings." The build didn't become yours. You directed something else's version of it.
None of those are paraphrases. They're direct quotes from the thread, surfaced because Claude recognized them as the sentences people would send to a friend.
The format that becomes copy:
"You hit walls fast. Anything beyond basic logic requires workarounds that take longer than just writing the code yourself." That's a landing page headline.
"Complexity limitations." That's a dashboard widget.
The raw quote is copy. The theme label is a filing system. Claude gives you both, but only the raw quote goes in your product description.
What a full VOC session produces:
- Pain theme matrix (5 themes, 3 quotes each)
- Messaging brief (which pain has the highest vote-weighted resonance)
- Copy bank (verbatim sentences, sorted by votes, ready to paste)
No export step. No CSV. No interpretation layer between your customers' words and your marketing document.
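If you do want the ranking outside the chat, the messaging brief and copy bank reduce to two sorts. A Python sketch under one stated assumption: "vote-weighted resonance" is taken here to mean the sum of votes across a theme's quotes, which is my reading, not a definition from the test. The record shape and sample values are placeholders.

```python
from collections import defaultdict


def rank_themes(quotes):
    """Sum votes per theme and return (theme, total_votes), highest first."""
    totals = defaultdict(int)
    for q in quotes:
        totals[q["theme"]] += q["votes"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def copy_bank(quotes):
    """Return quotes sorted by votes, highest first."""
    return sorted(quotes, key=lambda q: q["votes"], reverse=True)


# Placeholder quotes and vote counts, for illustration only.
sample = [
    {"theme": "Maintenance cliff", "quote": "placeholder quote 1", "votes": 12},
    {"theme": "Quality degradation loop", "quote": "placeholder quote 2", "votes": 30},
    {"theme": "Maintenance cliff", "quote": "placeholder quote 3", "votes": 25},
]

print(rank_themes(sample))
print([q["quote"] for q in copy_bank(sample)])
```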

Next Steps
If you're starting today: configure the two layers (Exa + Firecrawl, or Exa + Apify), run one HN session on your product category, run the synthesis prompt on whatever comes back. The first session will take 20 minutes. The output will be better than a week of manual copy-paste.
The platform verdict sheet has every platform in one table: best tool, cost per run, signal quality, and the gotcha you'll hit.
For the full research automation architecture that this stack fits into, the AI research agent guide picks up where this one ends.
Related Reading
If you want to go deeper with the MCP side of this:
- What Is an MCP Server? Plain English: foundational context before installing anything
- MCP Second Brain: Connected Intelligence Guide: same two-layer architecture applied to knowledge management
- Claude Code + Perplexity MCP: AI Research Agent Workflow: the paid upgrade path for deeper research automation
Which platform did you try first? What came back?
Jenny
Build to Launch | Practical AI Builder Program | MCP Setup Guide