About Me

I’m a data scientist interested in causal inference and Bayesian methods. I mainly use this blog to practice what I learn, but hopefully others find it helpful as well!

For a work sample, please refer to this post

Double ML in Numpyro using scope

causal inference

numpyro

This is more of a tutorial for using numpyro’s scope handler. It’s fairly straightforward and allows one to use a composable model framework in numpyro - i.e. calling…

An easy way to choose evaluation metrics

metrics

model evaluation

time series

I’m not going to dive into forecasting evaluation here. I’m going to highlight a simple technique to consider when you’re struggling with metric choice.

Common Misconceptions with Multicollinearity

multicollinearity

causal inference

regression

I recently saw a viral LinkedIn post discussing how multicollinearity will ruin your regression estimates. The solution? Simply throw out variables with high Variance…

Automatic Dim Labelling with Numpyro?

numpyro

tensors

ArviZ

The goal of this post is to figure out how to use numpyro internals to auto-label variable dimensions for ArviZ. PyMC is heavily integrated with ArviZ and their dimension…

Pandera and Object Oriented Data Validation

object oriented programming

data processing

Pandera schemas are a useful tool for making sure input data is as expected. If you’ve ever used dbt before, they’re just like schema tests or great-expectations.

Modeling Anything With First Principles: Demand under extreme stockouts

Time Series

Demand Modeling

Causal Inference

Supply Chain

Discrete Choice

Survival Analysis

When deciding how much inventory to buy, we care about demand, not observed sales (or rentals in this example). Demand and sales are not the same thing.

Introduction to Surrogate Indexes

experimentation

Causal Inference

How should you design your experiments if the metric you want to change might take months to observe?

Designing an Experimentation Strategy

experimentation

Experiments have a lot more use cases than many give them credit for. At their simplest, they’re a tool for mitigating risk when making product decisions. But at their best…

Useful Tools for Weibull Survival Analysis

survival analysis

The Weibull distribution is an excellent choice for many survival analysis problems - it has an interpretable parameterization that is highly flexible to a large number of…

Why do we need A/B tests? The Potential Outcomes Model

experimentation

This blog post introduces the Potential Outcomes Model and explains why experiments are often necessary to measure what we want. This topic is already covered extensively…

Making out of sample predictions with PyMC

A cool thing about hierarchical models is that it’s easy to predict out of sample - i.e. if you want to make a prediction on a new zipcode, just sample from the state’s…

How long should you run an A/B test for?

People in industry who are new to A/B testing might wonder, “Why can’t we just run an A/B test for 2 days and be done with it?” Even those familiar with it might…

Uncertainty Intervals or p-values?

Uncertainty Intervals are better than p-values. Sure, it’s better to use both, but p-values are just a point estimate and they bring no concept of uncertainty in our estimate…

Explainable AI is not Causal Inference

Explainable AI is all the rage these days. Black box ML models come along with some fun tools such as LIME, SHAP, or Partial Dependence Plots that try to give visibility into…
