Per-user revenue data is generated below and analysed in two views: by combination and by factor. Combinations are tested directly against control; factor effects are estimated from a factorial regression, so each level is judged with the other factors held constant.
Each variant is compared directly against the control. P(beat) is the bootstrap probability that the variant’s true mean exceeds the control’s; significance uses Welch’s t-test with Benjamini–Hochberg correction, which controls the false discovery rate at q = 0.05 (reported as 95% confidence).
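The combination view can be sketched roughly as follows. This is a minimal illustration rather than the analysis code itself: the helper names `p_beat` and `bh_adjust`, the gamma-distributed synthetic revenue, the variant labels, and the bootstrap size are all assumptions made for the example.

```python
import numpy as np
from scipy import stats

def p_beat(control, variant, n_boot=2000, rng=None):
    """Bootstrap share of resamples in which the variant mean exceeds the control mean."""
    rng = np.random.default_rng(rng)
    control = np.asarray(control, dtype=float)
    variant = np.asarray(variant, dtype=float)
    # Resample each arm with replacement and take the mean of every resample.
    c = rng.choice(control, size=(n_boot, control.size), replace=True).mean(axis=1)
    v = rng.choice(variant, size=(n_boot, variant.size), replace=True).mean(axis=1)
    return float(np.mean(v > c))

def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values (compare against q to flag significance)."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working from the largest p-value downward.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)
    return adjusted

# Hypothetical synthetic per-user revenue (assumed shapes and scales).
rng = np.random.default_rng(0)
control = rng.gamma(2.0, 5.0, size=500)
variants = {
    "A": rng.gamma(2.0, 5.0, size=500),
    "B": rng.gamma(2.0, 6.5, size=500),
}

# Welch's t-test: equal_var=False drops the equal-variance assumption.
raw_p = [stats.ttest_ind(v, control, equal_var=False).pvalue for v in variants.values()]
adj_p = bh_adjust(raw_p)

for (name, v), p_adj in zip(variants.items(), adj_p):
    print(name, round(p_beat(control, v, rng=1), 3), round(p_adj, 4), p_adj <= 0.05)
```

A variant is flagged as significant when its adjusted p-value is at or below q = 0.05. The correction matters more as the number of variants grows, because every variant is tested against the same control.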
Effects are estimated from a factorial decomposition over cell means. An estimated marginal mean (EMM) is a level’s mean with all other factors held at their average; uplift is measured relative to the reference level; P(beat) is computed from the effect’s sampling distribution. Significance is assessed at 95% confidence.
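For a two-factor design, the factor view might look something like the sketch below. It assumes a full factorial (every cell observed), weights the cell means equally, and uses a normal approximation for the effect's sampling distribution. The factor names `price` and `layout`, the column `revenue`, and the helpers `emm_table` and `compare_to_reference` are hypothetical and not taken from the original analysis.

```python
import numpy as np
import pandas as pd
from scipy import stats

def emm_table(df, factor, other, value="revenue"):
    """EMM per level of `factor`: the unweighted average of its cell means over `other`."""
    cells = df.groupby([factor, other])[value].agg(["mean", "var", "count"])
    cells["var_mean"] = cells["var"] / cells["count"]  # variance of each cell mean
    k = df[other].nunique()                            # cells averaged per level
    by_level = cells.groupby(level=factor)
    return pd.DataFrame({
        "emm": by_level["mean"].mean(),
        # SE of an average of k independent cell means.
        "se": np.sqrt(by_level["var_mean"].sum()) / k,
    })

def compare_to_reference(emms, reference):
    """Uplift, P(beat) and two-sided p-value of each level against the reference level."""
    ref = emms.loc[reference]
    rows = emms.drop(index=reference)
    diff = rows["emm"] - ref["emm"]
    se = np.sqrt(rows["se"] ** 2 + ref["se"] ** 2)
    z = diff / se
    return pd.DataFrame({
        "uplift": diff / ref["emm"],
        "p_beat": stats.norm.cdf(z),              # normal approximation (assumption)
        "p_value": 2 * stats.norm.sf(np.abs(z)),
    }, index=diff.index)

# Hypothetical synthetic data: 2 x 2 design with deliberately unbalanced cells.
rng = np.random.default_rng(0)
frames = []
for price, layout, n, scale in [("low", "a", 400, 5.0), ("low", "b", 100, 5.5),
                                ("high", "a", 100, 6.0), ("high", "b", 400, 6.5)]:
    frames.append(pd.DataFrame({"price": price, "layout": layout,
                                "revenue": rng.gamma(2.0, scale, size=n)}))
data = pd.concat(frames, ignore_index=True)

emms = emm_table(data, factor="price", other="layout")
print(compare_to_reference(emms, reference="low"))
```

Averaging cell means rather than pooling raw rows is what holds the other factor at its average: with unbalanced cells, a raw marginal mean would mix in the effect of whichever `layout` level happens to be over-represented.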