Production data science
Predictive modeling, client scoring, churn analysis, A/B testing frameworks. Where LLMs are overkill, classical ML still earns more.
Not everything needs to be an LLM
The GenAI hype masks an unsexy fact: classical ML still wins in 80% of business use cases. Client scoring, churn prediction, demand forecasting, A/B testing — these are tools that make money predictably and cheaply. An LLM here is over-engineering at 100x the price.
At IG Group I spent 6 years building exactly these systems. Client scoring models saved ~$250,000 annually through better retention resource allocation. Trading opportunity models delivered ~$40,000 in additional monthly revenue.
What I deliver
- Client scoring & lifetime value modeling — models that score customer value across lifecycle stages, ready to plug into CRM and campaigns.
- Churn analysis — identification of high-churn-risk customers with concrete action recommendations.
- Demand & revenue forecasting — Darts, Prophet, classical ARIMA where it makes sense. With honest backtesting.
- A/B testing infrastructure — experiment design, sample size calculators, sequential testing, multiple comparison correction. Done right, not “this variant looks better.”
- Data architecture for companies that have outgrown a single Postgres instance: S3 + dbt + Redshift, GCP BigQuery, or a data lakehouse. (At inFakt I designed a modern data architecture that cut data processing costs by 40%.)
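To make the "sample size calculator" item concrete, here is a minimal sketch of the power calculation behind one. It assumes a two-variant conversion test analysed with a two-sided z-test on proportions. The function names, the default alpha and power, and the traffic figures in the example are illustrative assumptions, not numbers from any client project.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(baseline_rate, absolute_lift, alpha=0.05, power=0.8):
    """Approximate users needed per variant to detect `absolute_lift`
    over `baseline_rate` with a two-sided z-test on two proportions."""
    p1 = baseline_rate
    p2 = baseline_rate + absolute_lift
    if not (0 < p1 < 1 and 0 < p2 < 1):
        raise ValueError("both conversion rates must lie strictly between 0 and 1")
    if absolute_lift == 0:
        raise ValueError("absolute_lift must be non-zero")

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the target power
    p_bar = (p1 + p2) / 2                          # pooled rate under the null

    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_power * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)


def days_needed(n_per_variant, daily_visitors, n_variants=2):
    """Days of traffic required if all visitors are split evenly across variants."""
    return math.ceil(n_per_variant * n_variants / daily_visitors)


# Hypothetical example: 5% baseline conversion, hoping to detect +1 point,
# with 1,000 eligible visitors per day.
n = sample_size_per_variant(baseline_rate=0.05, absolute_lift=0.01)
print(n, "users per variant,", days_needed(n, daily_visitors=1000), "days")
```

Running this before launch turns "how long should the test run?" into a number fixed in advance. Sequential testing and multiple comparison correction need their own adjustments on top of this and are not covered by the sketch.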
Common pitfalls
- Train/test leakage: the classic. The model scores 95% offline, then drops to 60% in production.
- No model monitoring after deployment: models degrade through data drift and concept drift. Without monitoring, after 6 months you have a random number generator.
- Optimizing on a proxy metric: the model maximizes CTR while the business loses conversions.
- A/B tests without a sample size calculation: conclusions drawn after 3 days when the test needed 3 weeks.
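The monitoring pitfall above is cheap to guard against. One common check is the Population Stability Index (PSI), which compares a feature's or score's distribution at training time with what the model sees in production. The sketch below is a plain-Python illustration under stated assumptions: the function name, the ten quantile bins, and the synthetic data are all hypothetical, and the 0.1 / 0.25 thresholds in the comments are a widely used rule of thumb rather than a standard.

```python
import bisect
import math
import random


def population_stability_index(expected, actual, n_bins=10, eps=1e-6):
    """PSI between a reference sample (`expected`, e.g. training data)
    and a recent sample (`actual`, e.g. last week's production data)."""
    if not expected or not actual:
        raise ValueError("both samples must be non-empty")

    # Bin edges are quantiles of the reference sample.
    ref = sorted(expected)
    edges = [ref[len(ref) * i // n_bins] for i in range(1, n_bins)]

    def bin_shares(values):
        counts = [0] * n_bins
        for v in values:
            counts[bisect.bisect_left(edges, v)] += 1
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(c / len(values), eps) for c in counts]

    e_shares = bin_shares(expected)
    a_shares = bin_shares(actual)
    return sum((a - e) * math.log(a / e) for e, a in zip(e_shares, a_shares))


# Synthetic illustration: the production distribution has shifted by one
# standard deviation relative to the training distribution.
rng = random.Random(0)
baseline = [rng.gauss(0.0, 1.0) for _ in range(5000)]
shifted = [rng.gauss(1.0, 1.0) for _ in range(5000)]

psi_value = population_stability_index(baseline, shifted)
# Rule of thumb (an assumption, tune per use case):
# below 0.1 stable, 0.1 to 0.25 worth a look, above 0.25 investigate or retrain.
print(f"PSI = {psi_value:.3f}")
```

PSI only catches data drift in the inputs or scores. Concept drift, where the relationship between inputs and outcome changes, still requires tracking realized performance once labels arrive.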
Who this fits
- E-commerce, fintech, SaaS with ≥10k customers, where scoring/segmentation has real revenue impact.
- Companies escaping the “let’s use ChatGPT for everything” trap — when classical ML is enough and 50x cheaper.
- Teams needing MLOps setup (CI/CD for models, monitoring, retraining).
Stack
Python · scikit-learn · Pandas · NumPy · PySpark · Darts · XGBoost · LightGBM · SQL · BigQuery · Redshift · dbt · Vertex AI · SageMaker · MLflow · Streamlit · Tableau