Picking a lobbying firm is guesswork for outsiders.
Services like Leadership Directories list firms, but they don't help with fit. Small businesses, nonprofits, and individuals face a steep learning curve when seeking representation, while large corporations have in-house expertise and established relationships. Matchmaking relies on social or political connections, not merit.
Information asymmetry favors those already in the system. Decades of public Lobbying Disclosure Act filings sit in government databases, but nobody has turned that data into a tool that evaluates which firm is right for a specific client's needs.
What I prioritized, what I cut, and why
Building on public disclosure data required careful scoping to ship a working prototype:
client_self_select classification
A single field in LD-1 filings reliably distinguishes lobbying firms from self-filing corporations. Relying on it prevented weeks of false starts with keyword-based approaches.
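As a rough illustration, the check reduces to reading one flag per filing. This is a minimal sketch assuming each filing is already parsed into a dict; the payload shape and the flag's polarity (truthy for self-filers) are my assumptions, not the LDA schema verbatim.

    # Minimal sketch: classify a registrant from one LD-1 filing.
    # Assumes client_self_select is truthy when the registrant files on
    # its own behalf (a self-filing corporation) -- an assumption.
    def is_lobbying_firm(filing: dict) -> bool:
        return not filing.get("client_self_select", False)

    def lobbying_firms(filings: list[dict]) -> list[dict]:
        # External firms lobby for clients other than themselves.
        return [f for f in filings if is_lobbying_firm(f)]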
Activity description parsing
Extracting freeform lobbying descriptions and matching bill numbers to user inputs. Parked because the structure and use case weren't clear yet.
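For context on what that parsing would involve, here is a hedged sketch: a regex that normalizes bill references like "H.R. 1234" or "S. 567" out of an activity description and intersects them with the user's bills. The pattern and matching rule are illustrative, not shipped logic.

    import re

    # Illustrative pattern; real LDA descriptions are messier than this.
    BILL_PATTERN = re.compile(r"\b(H\.?\s?R\.?|S\.?)\s?(\d{1,5})\b", re.IGNORECASE)

    def extract_bills(description: str) -> set[str]:
        # Normalize "H.R. 1234", "HR 1234", "S. 567" to "HR1234" / "S567".
        return {
            m.group(1).replace(".", "").replace(" ", "").upper() + m.group(2)
            for m in BILL_PATTERN.finditer(description)
        }

    def mentions_user_bills(description: str, user_bills: set[str]) -> bool:
        return bool(extract_bills(description) & user_bills)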
Percentile-based relative scoring
Early versions used absolute thresholds, so top firms all scored identically. Percentile ranking exposes why one firm wins on Experience despite having fewer filings.
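A minimal version of that ranking, assuming raw per-firm metrics (filing counts, former-official counts) are already computed; the function names are mine, not the app's:

    def percentile_scores(raw: dict[str, float]) -> dict[str, int]:
        # Score each firm against the field, not an absolute threshold,
        # so top firms stop collapsing to the same maxed-out number.
        values = sorted(raw.values())
        n = max(len(values) - 1, 1)

        def pct(v: float) -> int:
            return round(100 * sum(1 for x in values if x < v) / n)

        return {firm: pct(v) for firm, v in raw.items()}

    # e.g. percentile_scores({"A": 51, "B": 120, "C": 8})
    #      -> {"A": 50, "B": 100, "C": 0}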
Covered position text extraction
The dataset has the count of covered positions but not the position descriptions themselves (e.g., "Former Chief of Staff, Senate Finance"). Identified as a data gap and deferred rather than letting it block launch.
Two-phase architecture
Server-side pre-computation of match scores paired with AI-generated narratives. This decision drove the 10x performance improvement.
Full committee validation
The current logic infers committee relationships from issue codes rather than parsing contribution recipients. A proper implementation would cross-reference Congress member databases.
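That heuristic amounts to a static lookup from LDA issue codes to plausibly related committees. The sketch below is my guess at its shape; the mapping entries are illustrative, not the real table.

    # Illustrative issue-code -> committee lookup (not the real table).
    ISSUE_TO_COMMITTEES = {
        "TAX": {"Senate Finance", "House Ways and Means"},
        "HCR": {"Senate HELP", "House Energy and Commerce"},  # health issues
    }

    def inferred_committees(issue_codes: list[str]) -> set[str]:
        # Proper validation would parse contribution recipients and
        # cross-reference Congress member databases instead.
        return set().union(*(ISSUE_TO_COMMITTEES.get(c, set()) for c in issue_codes))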
Enter your issue area, get ranked firm recommendations with AI-generated rationale.
Users describe their organization and lobbying needs. The system matches against an enriched dataset of lobbying firms derived from public LDA filings, scoring each firm on experience, committee relationships, and issue relevance.
Results are displayed with component scores that reveal tradeoffs: one firm might win on Experience (95) because it has 51 former officials with strong committee relationships in the user's specific issue area, while another has higher filing volume but less targeted relationships.
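To make the tradeoff concrete, here is a sketch of how component scores could roll up into a ranking; the weights and dataclass fields are assumptions for illustration.

    from dataclasses import dataclass

    @dataclass
    class FirmScore:
        name: str
        experience: int    # percentile, 0-100
        committees: int    # percentile, 0-100
        relevance: int     # percentile, 0-100

        def composite(self) -> float:
            # Illustrative weights; the real blend is a product decision.
            return 0.4 * self.experience + 0.3 * self.committees + 0.3 * self.relevance

    def rank(firms: list[FirmScore]) -> list[FirmScore]:
        # Keeping components visible alongside the composite is what lets a
        # 51-former-official firm beat a higher-volume one on Experience.
        return sorted(firms, key=FirmScore.composite, reverse=True)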
Technical and product insights
Architecture decisions matter more than optimization
Initial response times were 45-76 seconds. The breakthrough came from rethinking the processing model (deterministic analytics + AI narrative), not from tweaking existing code. Result: 5-10 second responses.
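The shape of that split, with both phases stubbed (the names here are illustrative, not the app's API):

    def score_firms(query: str) -> list[dict]:
        # Phase 1: deterministic analytics over precomputed percentiles.
        # Stubbed here; the prototype reads the enriched firm dataset.
        return [{"name": "Example Firm", "experience": 95, "committees": 88}]

    def generate_rationale(scored: list[dict], query: str) -> str:
        # Phase 2: the model only writes the narrative; it never computes
        # or alters scores. A real version would call an LLM API here.
        return f"Top match for '{query}': {scored[0]['name']}."

    def recommend(query: str) -> dict:
        scored = score_firms(query)                    # fast, reproducible
        rationale = generate_rationale(scored, query)  # the only slow step
        return {"firms": scored, "rationale": rationale}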
Early exit logic transforms data pipelines
The classification script went from 24+ hours to 2-3 hours after adding a "stop counting after 10 clients" rule and a 2025-only activity cutoff. Large firms with 1,000+ clients no longer require paginating through every filing.
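A sketch of the early-exit loop, assuming filings arrive as an iterable of dicts; the cap and cutoff mirror the description above, but the field names are mine.

    MAX_CLIENTS = 10     # enough to tell a multi-client firm from a self-filer
    ACTIVITY_YEAR = 2025

    def count_distinct_clients(filings) -> int:
        clients: set[str] = set()
        for filing in filings:
            if filing.get("year") != ACTIVITY_YEAR:
                continue                   # 2025-only activity cutoff
            clients.add(filing["client_name"])
            if len(clients) >= MAX_CLIENTS:
                break                      # skip paging through 1,000+ clients
        return len(clients)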
Domain knowledge compounds
Understanding that client_self_select reliably classifies firms came from knowing how LD-1 forms work. This prevented weeks of false starts with keyword-based or name-matching approaches.
Honest scoring beats cosmetic scoring
Early versions used absolute thresholds where top firms all maxed out identically. Percentile ranking exposed underlying tradeoffs and made the recommendations genuinely useful.
What's next
If I continued development, these would be the natural extensions: