After 14 years building and managing B2B sales teams at companies like Aiven, Ververica, and Sharper Shape, I kept watching the same scene play out: a talented rep, 15 browser tabs open, trying to piece together enough context to write a halfway decent email before a call. The average SDR spends between 30 and 60 minutes researching each prospect manually. Across a team, that adds up to 60% of selling time spent on something other than selling. I wanted to see how much of that could be automated with LLMs, web scraping, and a retrieval layer. RevHunt was the result.
Enter a company name or domain and RevHunt produces a complete prospect brief in 60 seconds by pulling from 50+ data sources in parallel: website content, news, job postings, funding announcements, tech stack signals, SEC filings, and CRM history. It saves roughly 29 minutes per prospect versus manual research. Across a sales team, that compounds to 2,522 hours annually, or around $90,938 in recovered selling time, at an average ROI of 1,819%. The app runs in a Streamlit UI, so reps can use it without any technical setup.
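As a rough illustration of the parallel fan-out, here is a minimal sketch using Python's asyncio. It is not RevHunt's actual code: the source names, the `fetch_source` stand-in, and the timeout value are all assumptions made for the example. The point it shows is that every source is queried concurrently and that one slow or failing source does not block the rest.

```python
import asyncio

# Hypothetical source names for illustration only.
SOURCES = ["website", "news", "job_postings", "funding", "tech_stack"]


async def fetch_source(source: str, domain: str) -> dict:
    """Stand-in for a real fetcher (HTTP call, scraper, or CRM query)."""
    await asyncio.sleep(0.01)  # simulates network latency
    if source == "funding":
        # Simulate one source being unavailable.
        raise RuntimeError("source unavailable")
    return {"source": source, "domain": domain, "text": f"{source} data for {domain}"}


async def gather_signals(domain: str, timeout: float = 5.0) -> dict:
    """Query every source concurrently and collect successes and failures."""
    tasks = [asyncio.wait_for(fetch_source(s, domain), timeout) for s in SOURCES]
    # return_exceptions=True keeps one failure from cancelling the whole batch.
    results = await asyncio.gather(*tasks, return_exceptions=True)
    ok = [r for r in results if not isinstance(r, BaseException)]
    failed = [s for s, r in zip(SOURCES, results) if isinstance(r, BaseException)]
    return {"signals": ok, "failed": failed}


if __name__ == "__main__":
    brief_inputs = asyncio.run(gather_signals("example.com"))
    print(len(brief_inputs["signals"]), "sources ok; failed:", brief_inputs["failed"])
```

With concurrent requests, total wall-clock time is bounded by the slowest source (or the timeout) rather than the sum of all of them, which is what makes a sub-minute brief plausible at all.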
RevHunt was also my first attempt at building a complete application end to end with AI assistance from the start: not a script, not a prototype, but an actual tool with a UI, a backend, an API layer, a data pipeline, and a prompt engineering layer sitting on top of all of it. I came into this from a marketing background, and by the time RevHunt was working, I had personally wrestled with every decision that usually gets distributed across a product team.
The scraping problem
Web scraping sounds simple, but most company websites combine JavaScript rendering, bot detection, rate limiting, and dynamically loaded content in ways that make naive HTTP requests useless. RevHunt uses Browserless, a headless Chrome-as-a-service API, to render pages fully before extracting content. That solves the rendering problem. The signal-to-noise problem is harder.
Raw scraped content is mostly noise: navigation bars, cookie banners, boilerplate legal text, footer links. The extraction layer strips all of that and pulls structured signals: product descriptions, customer names, technology mentions, hiring patterns, recent press. The quality of the final brief depends almost entirely on the quality of this extraction step.
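To make the noise-stripping step concrete, here is a minimal sketch using only Python's standard-library HTML parser. It is an illustration, not RevHunt's extraction layer: the `NOISE_TAGS` set and the sample page are assumptions, and a tag-based filter like this would not catch cookie banners or legal text that live in ordinary `div` elements.

```python
from html.parser import HTMLParser

# Tags whose contents are treated as boilerplate in this sketch
# (an assumption, not RevHunt's actual rule set).
NOISE_TAGS = {"nav", "footer", "header", "script", "style", "aside", "form"}


class MainTextExtractor(HTMLParser):
    """Collects visible text while skipping anything nested inside a noise tag."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # how many noise tags we are currently inside
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in NOISE_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in NOISE_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_main_text(html: str) -> str:
    """Return the page text with navigation, footer, and script content removed."""
    parser = MainTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


page = """
<html><body>
  <nav><a href="/">Home</a> <a href="/pricing">Pricing</a></nav>
  <main><h1>Acme Data Platform</h1><p>Trusted by Globex and Initech.</p></main>
  <footer>Copyright Acme. Cookie settings.</footer>
</body></html>
"""
print(extract_main_text(page))
# prints: Acme Data Platform Trusted by Globex and Initech.
```

Pulling structured signals such as customer names or technology mentions out of the cleaned text is a separate step that this sketch does not attempt.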
What RAG adds
The RAG layer allows follow-up questions grounded in the scraped content. After the initial brief is generated, you can query the document store directly: 'What compliance certifications do they mention?', 'Who are their named customers?', 'What tech stack do their job postings suggest?'. Answers come from the actual scraped content, not from model training data, which matters a lot for companies the model has never seen.
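The retrieve-then-answer pattern can be sketched in a few lines. To keep the example self-contained, it swaps in simple word-count vectors and a brute-force cosine search where a real system would use an embedding model and a vector index. The function names, the sample chunks, and the prompt wording are all hypothetical.

```python
import math
import re
from collections import Counter


def vectorize(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k scraped chunks most similar to the question."""
    q = vectorize(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, vectorize(c)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context: list[str]) -> str:
    """Ground the answer in retrieved text rather than in model memory."""
    joined = "\n".join(f"- {c}" for c in context)
    return (
        "Answer using only the context below. "
        "If the context does not contain the answer, say so.\n"
        f"Context:\n{joined}\n\nQuestion: {question}"
    )


# Sample chunks standing in for scraped content.
chunks = [
    "Acme is SOC 2 Type II and ISO 27001 certified.",
    "Named customers include Globex and Initech.",
    "Job postings mention Kafka, Flink, and Kubernetes.",
]
question = "Who are their named customers?"
top = retrieve(question, chunks, k=1)
print(build_prompt(question, top))
```

Because the model only sees the retrieved chunks, the answer is tied to what was actually scraped, and an instruction to admit when the context is silent helps keep it from guessing about unfamiliar companies.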
FAISS handles the vector store. It is more than the document volumes strictly need, but the latency is good and it requires no external service. For a tool that is primarily about speed, going from zero to a usable prospect brief in under two minutes, latency at every step matters.
One feature that required more work than expected was familiarity scoring. RevHunt scores your relationship depth with each prospect as Cold, Warm, or Hot by analyzing CRM activity and interaction history, then adjusts the tone, talking points, and outreach strategy of the generated brief accordingly. Cold outreach to a company you have never touched reads very differently from a reactivation of a deal that stalled six months ago. The LangChain orchestration layer and the HubSpot API integration handle this; Bright Data provides the web intelligence layer for sources that require authenticated or JavaScript-rendered access.
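A familiarity score of this kind can be sketched as a small rule-based classifier. Everything specific here is an assumption made for illustration: the `CrmActivity` fields, the 30-day and three-interaction thresholds, and the tone guidance are hypothetical and do not describe RevHunt's actual scoring logic or the HubSpot data model.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class CrmActivity:
    """Hypothetical summary of what a CRM lookup might return for one company."""
    last_interaction: Optional[date]  # None if the company was never contacted
    interactions_last_year: int
    has_open_or_past_deal: bool


def familiarity(activity: CrmActivity, today: date) -> str:
    """Classify relationship depth as Cold, Warm, or Hot (thresholds are illustrative)."""
    if activity.last_interaction is None:
        return "Cold"
    days_since = (today - activity.last_interaction).days
    if activity.has_open_or_past_deal and days_since <= 30:
        return "Hot"
    if activity.interactions_last_year >= 3 or activity.has_open_or_past_deal:
        return "Warm"
    return "Cold"


# Tone guidance keyed by score; the wording is a placeholder.
TONE = {
    "Cold": "Introduce the company; lead with one relevant trigger event.",
    "Warm": "Reference the previous conversation; propose a concrete next step.",
    "Hot": "Skip the introduction; address open questions on the active deal.",
}

# A deal that stalled six months ago: history exists, but nothing recent.
stalled = CrmActivity(date(2024, 1, 10), interactions_last_year=5, has_open_or_past_deal=True)
score = familiarity(stalled, today=date(2024, 7, 10))
print(score, "->", TONE[score])
# prints: Warm -> Reference the previous conversation; propose a concrete next step.
```

The resulting label would then be passed into the brief-generation prompt so that tone and talking points follow from the relationship history rather than from a single default template.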
What I learned about AI-assisted sales tools
The main thing I learned is that the value is not in the brief itself. It is in the decision about what to include in the brief. A model left to its own devices produces a thorough summary. What a salesperson actually needs is a prioritized summary: the three things most likely to make this conversation worth having. Building that prioritization into the prompt took more iteration than everything else combined.
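One way to push a model toward a prioritized brief rather than a thorough one is to state the cap and the selection criterion directly in the prompt. The sketch below is a hypothetical template, not the prompt RevHunt uses; the function name, wording, and sample signals are assumptions.

```python
def build_brief_prompt(company: str, signals: list[str], max_points: int = 3) -> str:
    """Ask for a ranked shortlist instead of an exhaustive summary."""
    numbered = "\n".join(f"{i}. {s}" for i, s in enumerate(signals, start=1))
    return (
        f"You are preparing a sales rep for a call with {company}.\n"
        f"From the signals below, choose the {max_points} most likely to make "
        "the conversation worth having.\n"
        "For each, give one sentence on why it matters and one suggested opening line.\n"
        "Do not summarize the remaining signals.\n\n"
        f"Signals:\n{numbered}"
    )


# Sample signals for illustration only.
prompt = build_brief_prompt(
    "Acme",
    [
        "Hiring five data engineers with streaming experience",
        "Announced a funding round last month",
        "Website lists ISO 27001 certification",
        "New VP of Sales joined this quarter",
    ],
)
print(prompt)
```

The explicit instruction not to summarize the rest is the part that changes the output most: without it, a model tends to append everything it was given.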
Why every product person should build something
Building RevHunt end to end meant facing every decision personally: what data to collect and what to discard, where the UX breaks under real usage, what the performance constraints actually mean for feature choices, and which technical trade-offs have real costs versus which are theoretical. You cannot get this from a spec, a user story, or a product review; there is no substitute for building.
My recommendation to anyone who leads a product, owns a user experience, or works in product management: build something end to end, not a tutorial project but something real enough that you have to make hard decisions about what to cut, what to defer, and what to ship in a state you are not fully satisfied with. The problems you hit, the judgment calls nobody warned you about, the gap between what you planned and what you could actually build in a reasonable time frame: all of it is education that nothing else provides. Owning a product, even a small one, teaches product management in a way that no course, book, or job title ever will.
RevHunt also set the scene for something that mattered more than the tool itself. The experience of working through every layer of a product, from data pipeline to UI to user experience, directly influenced my move into product leadership. Going from running marketing to leading a product organization was a significant transition, and having built enough things myself to understand what engineering and product teams actually face made it possible. Taking that path has been one of the most satisfying professional decisions I have made. The full project page has a complete breakdown of the features, the ROI numbers, and the architecture. RevHunt is live at revhunt.ai.