Steve Hill 🌰 Portfolio
🌰 predictive_scoring · logistic_regression

Website Squirrel — Predictive CAC & Tiered Retention Engine

Dual logistic regression models (conversion likelihood + churn risk) feeding an A–D customer tier score. Python + AWS pipelines write predictions back to HubSpot so campaigns target high-value prospects and reactivation triggers fire before churn.

-60% CAC reduction
+20% Retention lift
15% YoY profit contribution
48+ Campaigns scored

1. End-to-end data flow

Customer behavioral signals and CRM activity flow into an AWS pipeline, get scored by two logistic regression models, and write back to HubSpot as A–D tier tags — so every marketing decision downstream is made against a predicted profitability outcome, not a vanity metric.

Pipeline diagram (five stages):
SOURCES: HubSpot CRM, Transactional DB, Behavioral Tracking, Ad Platforms, Support & NPS (churn signal), Stripe / Billing, Email / SMS logs
INGEST (AWS pipeline): s3.put(), lambda.run(), glue.etl(), feature_eng(), train_test_split(), hubspot.write()
MODEL: LOGIT_CONVERSION P(purchase | traits); LOGIT_CHURN P(churn | 90d); CUSTOMER_TIER_SCORE (A · B · C · D bands); LTV_SIMULATOR (Monte Carlo · 10k runs); CAMPAIGN_ROI (Spend ÷ Tier-A yield)
SURFACE: Tiered CAC dash, Retention curves, Campaign ROI grid, Churn early warn, HubSpot tag sync
DECIDE: Scale Tier-A spend, Nurture Tier-B, Trigger win-back, Kill Tier-D ads, Expand verticals
↺ Outcomes feed back into retraining
01 · SOURCES

Behavior + CRM + money

HubSpot CRM, Stripe billing, on-site behavioral tracking, paid ad platforms, support tickets, and email/SMS engagement logs.

02 · INGEST

AWS-native pipeline

Python workers (pandas, scikit-learn) orchestrated by AWS Lambda. S3 for raw and feature stores, Glue jobs for ETL. Daily batch + hourly refresh for high-intent events.

03 · MODEL

Dual logistic regression

Conversion model predicts purchase likelihood; churn model predicts 90-day attrition. Outputs feed a composite tier score (A–D) and an LTV Monte Carlo simulator.

04 · SURFACE

Tiered dashboards + sync

Power BI dashboards for CAC by tier, retention curves, campaign ROI grid, and churn early warning. Tier tags written back to HubSpot for segmentation.

05 · DECIDE

Scale, nurture, kill, win-back

Marketing acts on tiers: scale Tier-A spend, nurture Tier-B, trigger Tier-C win-back sequences, kill Tier-D acquisition. Outcomes feed back into model retraining.
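Stage 03 mentions an LTV Monte Carlo simulator (10k runs in the diagram) without describing it. A minimal sketch of one way such a simulator could work is below. It assumes a flat monthly revenue per customer and a constant monthly churn hazard derived from the 90-day churn probability; the function name, the 36-month horizon, and both assumptions are illustrative and not taken from the actual pipeline.

```python
import random


def simulate_ltv(p_churn_90d, monthly_revenue, horizon_months=36,
                 runs=10_000, seed=0):
    """Monte Carlo estimate of revenue per customer over a fixed horizon.

    Assumptions (hypothetical, for illustration):
    - revenue is a flat amount per active month
    - churn risk is constant month to month
    """
    if not 0.0 <= p_churn_90d <= 1.0:
        raise ValueError("p_churn_90d must be a probability")

    # Convert a 90-day churn probability to an equivalent monthly hazard:
    # surviving 3 months means surviving each month independently.
    p_month = 1.0 - (1.0 - p_churn_90d) ** (1.0 / 3.0)

    rng = random.Random(seed)  # seeded so repeated runs are reproducible
    total = 0.0
    for _ in range(runs):
        months = 0
        while months < horizon_months:
            months += 1                 # customer pays for this month
            if rng.random() < p_month:  # then may churn
                break
        total += months * monthly_revenue
    return total / runs


low_risk = simulate_ltv(p_churn_90d=0.2, monthly_revenue=50.0)
high_risk = simulate_ltv(p_churn_90d=0.6, monthly_revenue=50.0)
```

With identical revenue, the lower-churn customer should come out with the higher simulated LTV, which is the property the tier score relies on.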

2. The model, plain-English

Two logistic regression models run in parallel. One predicts the probability someone becomes a customer. The other predicts the probability an existing customer churns. Combined, they drive the tier score that every downstream decision rides on.

# Conversion model
P(convert) = σ(β₀ + β₁·pageviews + β₂·session_dur + β₃·email_opens + β₄·vertical + β₅·ad_touches)

# Churn model
P(churn_90d) = σ(γ₀ + γ₁·days_since_last + γ₂·support_tickets + γ₃·nps_drop + γ₄·payment_fail)

# Composite tier score
tier = f( P(convert), 1 - P(churn), LTV_sim ) → { A, B, C, D }
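To make the formulas concrete, here is a small sketch of how a single record could be scored once coefficients are known. σ is the logistic function, σ(z) = 1 / (1 + e^(-z)). The coefficient values and intercepts below are invented for illustration and are not the fitted weights; `vertical` is treated as a 0/1 indicator, which is also an assumption.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(intercept, coefs, features):
    """sigmoid(intercept + sum(coef * feature)); missing features count as 0."""
    z = intercept + sum(w * features.get(name, 0.0) for name, w in coefs.items())
    return sigmoid(z)


# Hypothetical coefficients, for illustration only.
CONVERSION_COEFS = {
    "pageviews": 0.05,
    "session_dur": 0.005,
    "email_opens": 0.3,
    "vertical": 0.5,      # assumed 0/1 indicator for a target vertical
    "ad_touches": 0.2,
}
CHURN_COEFS = {
    "days_since_last": 0.03,
    "support_tickets": 0.2,
    "nps_drop": 0.1,
    "payment_fail": 0.9,
}

lead = {"pageviews": 12, "session_dur": 180, "email_opens": 3,
        "vertical": 1, "ad_touches": 2}
customer = {"days_since_last": 40, "support_tickets": 2,
            "nps_drop": 3, "payment_fail": 1}

p_convert = predict_proba(-4.0, CONVERSION_COEFS, lead)
p_churn_90d = predict_proba(-3.0, CHURN_COEFS, customer)
```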

Both models are trained on 18 months of customer outcomes and retrained weekly. Held-out AUC: 0.83 (conversion) / 0.79 (churn).
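For readers less familiar with AUC: it is the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one. The sketch below computes it directly from that definition in plain Python. It is a quadratic-time illustration, not the evaluation code used in the weekly report, and the sample labels and scores are made up.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a win. O(n_pos * n_neg), fine for a sketch.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up held-out sample: 1 = converted, 0 = did not convert.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 3 of 4 pairs correct
```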

3. Tier definitions

Each customer and lead gets one of four tier tags, written back to HubSpot as a property so marketing ops can build segments, triggers, and exclusions off a single predicted field.
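The write-back described above could be batched roughly as follows. This sketch only builds request bodies and makes no network call. It assumes the HubSpot CRM v3 contacts batch update endpoint (`POST /crm/v3/objects/contacts/batch/update`), whose body is an `inputs` list of `{id, properties}` objects. The property name `customer_tier` and the batch size of 100 are assumptions to confirm against the portal's configuration and HubSpot's current limits.

```python
def build_tier_payloads(tiers_by_contact_id, property_name="customer_tier",
                        batch_size=100):
    """Split contact -> tier assignments into batch-update request bodies.

    `property_name` is a hypothetical custom contact property.
    """
    items = sorted(tiers_by_contact_id.items())
    payloads = []
    for start in range(0, len(items), batch_size):
        chunk = items[start:start + batch_size]
        payloads.append({
            "inputs": [
                {"id": str(contact_id), "properties": {property_name: tier}}
                for contact_id, tier in chunk
            ]
        })
    return payloads


# 250 fake contact IDs, all tagged Tier B, just to show the batching.
example = build_tier_payloads({str(1000 + i): "B" for i in range(250)})
```

Each payload would then be sent with an authenticated HTTP client, ideally with retry handling for rate limits.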

A
P(convert) > 0.7 · LTV top quartile

High-value champions

Scale ad spend, priority sales routing, white-glove onboarding, reference program.

B
0.4 ≤ P(convert) ≤ 0.7

Nurture candidates

Email sequences, retargeting, webinars. Move to Tier-A with an additional activation event.

C
P(churn_90d) > 0.5 · existing customer

At-risk — win-back

Trigger reactivation offers, CSM check-in, discount workflow. Silence general campaigns.

D
P(convert) < 0.15

Suppress

Remove from paid acquisition and exclude from high-cost channels. Revisit in the quarterly Tier-D resurrection review.
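The four definitions above can be written as a single rule function. The thresholds come straight from the tier cards; the order in which rules are checked is an assumption, since the cards do not say which wins when two apply. Two cases are not covered by the cards (P(convert) between 0.15 and 0.4, and P(convert) above 0.7 without top-quartile LTV), so this sketch returns `None` for them rather than guessing.

```python
def assign_tier(p_convert, ltv_top_quartile, is_customer=False,
                p_churn_90d=None):
    """Map model outputs to a tier tag using the published thresholds.

    Rule precedence (C before A/B/D) is an assumption. Returns None when
    no published rule matches.
    """
    # C: existing customer with high 90-day churn risk.
    if is_customer and p_churn_90d is not None and p_churn_90d > 0.5:
        return "C"
    # A: very likely to convert and in the top LTV quartile.
    if p_convert > 0.7 and ltv_top_quartile:
        return "A"
    # B: mid-range conversion likelihood.
    if 0.4 <= p_convert <= 0.7:
        return "B"
    # D: very unlikely to convert.
    if p_convert < 0.15:
        return "D"
    return None  # not covered by the tier cards


tiers = [
    assign_tier(0.82, ltv_top_quartile=True),
    assign_tier(0.55, ltv_top_quartile=False),
    assign_tier(0.90, True, is_customer=True, p_churn_90d=0.62),
    assign_tier(0.08, ltv_top_quartile=False),
]
```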

4. Decision examples

The models are only valuable if they change marketing behavior. These are the literal calls the scoring engine let the business make — each backed by a number, not a hunch.

Scale Tier-A · +3x ad spend

Identified vertical/channel combinations where Tier-A concentration was 4x the average, redirected budget to them, and cut blended CAC by 60%.
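One way to read "Tier-A concentration" is the share of Tier-A contacts in a vertical/channel segment divided by the overall Tier-A share. The sketch below computes that index under this interpretation; the exact definition used in the dashboards is not stated, and the segment names and counts are invented.

```python
from collections import defaultdict


def tier_a_concentration(contacts):
    """Return {(vertical, channel): segment Tier-A share / overall Tier-A share}.

    `contacts` is an iterable of (vertical, channel, tier) tuples.
    An index of 4.0 means the segment holds 4x the average Tier-A share.
    """
    totals = defaultdict(int)
    a_counts = defaultdict(int)
    all_n = 0
    all_a = 0
    for vertical, channel, tier in contacts:
        key = (vertical, channel)
        totals[key] += 1
        all_n += 1
        if tier == "A":
            a_counts[key] += 1
            all_a += 1
    if all_a == 0:
        return {}  # no Tier-A contacts, so the index is undefined
    overall_share = all_a / all_n
    return {key: (a_counts[key] / totals[key]) / overall_share
            for key in totals}


# Invented sample: 2 of 8 contacts are Tier-A, both in one segment.
sample = [("dental", "search", "A")] * 2 + [("retail", "social", "D")] * 6
index = tier_a_concentration(sample)
```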

Trigger win-back sequences

An automated HubSpot workflow fires when a customer flips to Tier-C. It lifted 90-day retention by ~20% in the reached cohort.
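The workflow itself runs inside HubSpot off the tier property, but the "flip" can also be detected upstream by comparing consecutive daily scoring snapshots. A small sketch, assuming each snapshot is a dict of contact ID to tier; the function name and the decision to include contacts first seen as Tier-C are assumptions.

```python
def newly_at_risk(previous_tiers, current_tiers):
    """Contact IDs that are Tier-C today but were not Tier-C yesterday.

    Contacts missing from the previous snapshot are included, on the
    assumption that a first-seen Tier-C customer should also be reached.
    """
    return sorted(
        contact_id
        for contact_id, tier in current_tiers.items()
        if tier == "C" and previous_tiers.get(contact_id) != "C"
    )


yesterday = {"101": "A", "102": "C", "103": "B"}
today = {"101": "C", "102": "C", "103": "B", "104": "C"}
to_enroll = newly_at_risk(yesterday, today)
```

Contact 102 is skipped because it was already Tier-C, which keeps the win-back sequence from firing twice for the same customer.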

Kill Tier-D acquisition

Suppressed 7 ad campaigns that were spending against Tier-D lookalikes. The reallocated spend delivered a 15% YoY profit contribution.

5. Operational cadence

Cadence | Job | Consumer | Owner
Hourly | Behavioral + CRM event capture → S3 raw zone | Feature store | Data eng
Daily 03:00 | Feature build + tier rescore (all contacts) | HubSpot write-back | Data eng
Weekly Mon | Model retrain + AUC report + drift check | Marketing Ops review | Analytics
Monthly | Campaign ROI roll-up + channel reallocation | Exec readout | Marketing lead
Quarterly | Vertical expansion analysis + Tier-D resurrection review | Growth strategy | Marketing lead
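The weekly job includes a drift check, but the method is not specified. The population stability index (PSI) is swapped in here as one common choice: it compares how predicted scores are distributed across bins in a baseline window versus a recent window. The bin count, the epsilon floor, and the sample data are all assumptions.

```python
import math


def bin_shares(scores, bins=10, eps=1e-6):
    """Share of scores in each equal-width bin over [0, 1], floored at eps."""
    if not scores:
        raise ValueError("scores must not be empty")
    counts = [0] * bins
    for s in scores:
        idx = min(max(int(s * bins), 0), bins - 1)  # clamp into a valid bin
        counts[idx] += 1
    # The eps floor avoids log(0) and division by zero for empty bins.
    return [max(c / len(scores), eps) for c in counts]


def psi(baseline_scores, recent_scores, bins=10):
    """Population stability index between two score samples."""
    base = bin_shares(baseline_scores, bins)
    recent = bin_shares(recent_scores, bins)
    return sum((r - b) * math.log(r / b) for b, r in zip(base, recent))


# Made-up scores: a uniform baseline and a copy shifted upward by 0.3.
baseline = [i / 100 for i in range(100)]
shifted = [min(s + 0.3, 0.99) for s in baseline]
stable_psi = psi(baseline, baseline)
drifted_psi = psi(baseline, shifted)
```

A frequently cited rule of thumb treats PSI below 0.1 as stable and above 0.25 as a significant shift, though any alert threshold should be tuned to the models in question.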

What this unlocked

Predictive scoring moved CAC, retention, and profit contribution together — by making every marketing decision downstream of a model, not a meeting.

-60% CAC
+20% Retention
15% YoY profit lift