🗺️ Internal Reference — Data Flow & Ownership

How a brand becomes an outreach lead.

Every step the data passes through: where it comes from, how it's transformed, who built it, who owns it after handoff, and how to troubleshoot it.

Legend: Status (Live, In progress, Building, Planned) · Owner on handoff
📖 Technical terms guide
🛠 Tools & Platforms used in this pipeline
SmartScout
An Amazon market-intelligence platform. It indexes brands selling on Amazon and exposes revenue, growth, and seller data. NAVIRA uses it as the source for every brand we evaluate.
Apollo.io
A B2B contact database. After a brand is scored, we use Apollo to find the right person to contact — their name, job title, and verified email address.
Smartlead
An email outreach platform. It manages warmed-up sending mailboxes and runs the multi-step cold-email sequences we send to scored brands.
HubSpot
NAVIRA's CRM. It receives a new contact record when a brand replies positively to outreach, and is where the team works the relationship. (The existing-client dedup list lives in D1, not the CRM. GHL was retired because it was redundant with HubSpot.)
trigger.dev
A background-job platform. Each stage of the pipeline (ingest, score, enrich, send) runs as a durable task here — it handles retries and scheduling automatically.
Cloudflare D1
A serverless SQLite database hosted by Cloudflare. It's the pipeline's central state store — every brand record, scoring config, and outreach event is saved here.
Cloudflare Pages
The platform hosting the NAVIRA web tools (this page, the scoring configurator, the sequence editor). It also runs small server-side functions that query D1 securely.
Cloudflare Zero Trust
An access-control layer in front of the internal site. It requires a verified login before any page loads — so the tools and data aren't publicly accessible.
Snowflake
A cloud data warehouse planned for Q3 2026. It will replace D1 as the permanent, scalable data store. The pipeline is already built so that the swap is a single configuration change.
💡 Technical concepts & terminology
API
Application Programming Interface. A structured channel that lets two software systems exchange data. Every external tool (SmartScout, Apollo, Smartlead, HubSpot) is accessed through its API.
API key
A secret credential, like a password, that proves the pipeline is authorised to call an external API. Each integration (SmartScout, Apollo, Smartlead) has its own key stored as a secret.
REST API
The most common web API style. Uses standard HTTP requests (for example, GET to read data and POST to create it) and typically returns responses in JSON format.
Webhook
An automatic HTTP callback. When an event happens in an external tool (e.g., a reply in Smartlead), that tool immediately sends an HTTP request to a URL we provide, so the pipeline can react in near real time.
D1 binding
A Cloudflare setting that gives a Pages Function a named reference to the D1 database (e.g., env.navira_pipeline). The credentials never leave the server.
Brand ID
A stable, hash-based identifier we assign each brand on first ingest. It ensures the same brand is never processed twice, even if its name appears with minor variations.
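As an illustration only, a stable hash-based ID could be derived roughly as follows. The normalization rule (lowercase, keep only letters and digits), the choice of SHA-256, the 16-character truncation, and the function name brand_id are all assumptions made for this sketch, not the pipeline's actual scheme.

```python
import hashlib
import re


def brand_id(name: str) -> str:
    """Return a stable ID for a brand name (hypothetical normalization)."""
    # Lowercase and drop everything except letters and digits, so that
    # "Acme Co." and "ACME-CO" collapse to the same key.
    normalized = re.sub(r"[^a-z0-9]", "", name.lower())
    # Hash the normalized key; only the first 16 hex characters are kept.
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]


# Minor name variations map to the same ID.
same = brand_id("Acme Co.") == brand_id("ACME-CO")
```

Hashing a normalized key rather than the raw name is what makes the ID tolerant of spacing, casing, and punctuation differences.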
Deduplication (dedup)
Removing records that already exist. Before writing, the ingestion engine checks each brand against the D1 suppression list (existing clients + lost/fired, matched by name and Amazon store slug) and against previously-contacted brand IDs, so matching brands are skipped automatically.
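The check described above might look something like this minimal sketch. The record field names (name, store_slug, brand_id), the in-memory sets, and the rule that a match on either name or slug is enough to suppress a brand are assumptions; the real matching logic may differ.

```python
def should_skip(
    brand: dict,
    suppressed_names: set,
    suppressed_slugs: set,
    contacted_ids: set,
) -> bool:
    """Return True if the brand must not be written or contacted."""
    # Compare case-insensitively, ignoring surrounding whitespace.
    name = brand["name"].strip().lower()
    slug = brand["store_slug"].strip().lower()
    return (
        name in suppressed_names          # existing or lost/fired client, by name
        or slug in suppressed_slugs       # same, by Amazon store slug
        or brand["brand_id"] in contacted_ids  # already contacted earlier
    )


# Hypothetical data for illustration.
incoming = [
    {"brand_id": "b1", "name": "Acme ", "store_slug": "acme-store"},
    {"brand_id": "b2", "name": "Birch", "store_slug": "birch"},
    {"brand_id": "b3", "name": "Cedar", "store_slug": "cedar"},
]
to_write = [
    b for b in incoming
    if not should_skip(b, {"acme"}, set(), {"b3"})
]
```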
Upsert
A database operation that inserts a new row if it doesn't exist, or updates it if it does. Used throughout so brand records always reflect the latest data without creating duplicates.
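Because D1 is SQLite-based, an upsert can be written with SQLite's INSERT ... ON CONFLICT clause. The sketch below uses Python's built-in sqlite3 module with an in-memory database as a local stand-in for D1; the brands table and its columns are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for D1 (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE brands (brand_id TEXT PRIMARY KEY, name TEXT, revenue REAL)"
)


def upsert_brand(conn: sqlite3.Connection, brand_id: str, name: str, revenue: float) -> None:
    """Insert the brand, or update the existing row with the same brand_id."""
    conn.execute(
        """
        INSERT INTO brands (brand_id, name, revenue)
        VALUES (?, ?, ?)
        ON CONFLICT(brand_id) DO UPDATE SET
            name = excluded.name,
            revenue = excluded.revenue
        """,
        (brand_id, name, revenue),
    )


upsert_brand(conn, "b1", "Acme", 1000.0)  # first call inserts
upsert_brand(conn, "b1", "Acme", 1500.0)  # second call updates in place
rows = conn.execute("SELECT brand_id, name, revenue FROM brands").fetchall()
```

After both calls the table still holds a single row for b1, carrying the latest revenue figure.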
Scoring config
The business-editable document (saved via the scoring UI) that defines how brands are evaluated — which signals matter, how much each one weighs, and where the segment thresholds sit.
Segment (A–E)
The tier a brand is placed into after scoring. A is the highest-opportunity; E is the lowest. Each segment maps to a distinct outreach sequence with different tone and messaging.
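To make the relationship between a scoring config and segments concrete, here is a minimal sketch that assumes the score is a weighted sum of signals normalized to the 0 to 1 range, and that each segment has a minimum score. The signal names, weights, and threshold values are invented for illustration; the real config format is defined by the scoring UI.

```python
# Hypothetical scoring config: signal weights and segment minimums.
CONFIG = {
    "weights": {"revenue": 0.6, "growth": 0.4},
    # Ordered from highest tier to lowest; anything below D falls into E.
    "thresholds": [("A", 0.8), ("B", 0.6), ("C", 0.4), ("D", 0.2)],
}


def score_brand(signals: dict, weights: dict) -> float:
    """Weighted sum of signals; a missing signal counts as 0."""
    return sum(weight * signals.get(name, 0.0) for name, weight in weights.items())


def segment_for(score: float, thresholds: list) -> str:
    """Return the first segment whose minimum score is met, else 'E'."""
    for segment, minimum in thresholds:
        if score >= minimum:
            return segment
    return "E"


score = score_brand({"revenue": 0.9, "growth": 0.5}, CONFIG["weights"])
segment = segment_for(score, CONFIG["thresholds"])
```

Keeping weights and thresholds in data rather than in code is what lets the business side adjust them without a deploy.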
Enrichment
The step that turns a brand name into a real contact. Apollo is queried to find a decision-maker's name, title, and verified email for each brand that passes the scoring threshold.
Backfill
Running a processing step retroactively on records that already exist. For example, after a new scoring signal is added, a backfill updates all existing brands to include that signal.
PIPELINE_MODE
A safety gate. Campaign tasks refuse to send any emails unless this environment variable is explicitly set to production — going live is always a deliberate choice, never a default.
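A gate of this kind can be sketched in a few lines. The exception class SendBlocked and the function name assert_production are hypothetical; only the PIPELINE_MODE variable and the production value come from the description above.

```python
import os
from typing import Mapping, Optional


class SendBlocked(RuntimeError):
    """Raised when a send is attempted outside production mode."""


def assert_production(env: Optional[Mapping[str, str]] = None) -> None:
    """Refuse to continue unless PIPELINE_MODE is exactly 'production'."""
    env = os.environ if env is None else env
    # An unset variable is treated like any non-production value.
    mode = env.get("PIPELINE_MODE", "")
    if mode != "production":
        raise SendBlocked(
            f"PIPELINE_MODE is {mode!r}; refusing to send emails"
        )
```

A campaign task would call assert_production() as its first step, so a missing or mistyped value fails closed rather than sending.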
Cursor-based paging
A way to step through large API result sets. Each response includes a cursor token pointing to the next batch. More reliable than page numbers for large datasets that change between requests.
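The loop below is a generic sketch of cursor-based paging. The response shape (an items list plus a next_cursor field) and the fake_fetch stand-in are assumptions for illustration; each real API names and places its cursor differently.

```python
from typing import Callable, Iterator, Optional

# Fake two-page API used in place of a real HTTP call.
_FAKE_PAGES = {
    None: {"items": [{"brand": "A"}, {"brand": "B"}], "next_cursor": "c1"},
    "c1": {"items": [{"brand": "C"}], "next_cursor": None},
}


def fake_fetch(cursor: Optional[str]) -> dict:
    """Return the page for a cursor; None requests the first page."""
    return _FAKE_PAGES[cursor]


def iter_items(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every item, following next_cursor until the API returns none."""
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page["items"]
        cursor = page.get("next_cursor")
        if not cursor:
            break


brands = [item["brand"] for item in iter_items(fake_fetch)]
```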
Python task
A Python script run as a managed background job inside trigger.dev. Each pipeline stage (ingest, segment, enrich, campaign) is one Python task with automatic retries and logging.