Bayesian Econometrics for Empirical IO -- A journey¶

Episode 3: A Berry (1994)-style Discrete Choice model¶

So far, introducing a Bayesian method sounds rather cosmetic. Instead of assuming a fixed parameter and recovering an estimate and its distribution, we assume a random parameter and recover its posterior distribution. These approaches are philosophically different, but they do not seem to bring much more than a robustness test so far. Before moving on to more advanced models in which the Bayesian approach may yield gains, let's write a slightly more advanced discrete choice models with prices, competition and endogeneity.

There are still (J) inside goods, sold by (J) one-product sellers, in each of (M) markets. Each buyer has unit demand and buys at most one good. Prices are endogenous: we simulate equilibrium prices and show how unaddressed endogeneity may lead to bias.

Someone unfamiliar with the IO literature may wonder why we introduce several markets. The reason is that observing how firms compete across different markets gives rise to layers of variation that help identification. Feel free to modify this notebook and try it yourself: a simulation with a single market is usually too fragile to cleanly resolve the endogeneity problem that we discuss below.

The following specification is adopted from Berry (1994). Note that there is still no modelling of random coefficients: \begin{align*} u_{ijm} &= x_{jm}'\theta - \alpha p_{jm} + \xi_{jm} + \varepsilon_{ijm}, \qquad \varepsilon_{ijm} \sim_{\text{iid}} \text{Gumbel}. \end{align*}

Here $\xi_{jm}$ is the unobserved systematic quality of alternative $j$ in market $m$. The econometrician observes $x_{jm}$, $p_{jm}$, and market shares, but not $\xi_{jm}$. Firms, however, observe $\xi_{jm}$ when setting prices.

The implied market shares are: \begin{align*} s_{jm} &= \frac{\exp(x_{jm}'\theta - \alpha p_{jm} + \xi_{jm})}{1+\sum_{k \in \mathcal{J}_m} \exp(x_{km}'\theta - \alpha p_{km} + \xi_{km})},\\ s_{0m} &= \frac{1}{1+\sum_{k \in \mathcal{J}_m} \exp(x_{km}'\theta - \alpha p_{km} + \xi_{km})}. \end{align*}

Therefore, taking the log ratio of product (j)'s share to the outside share gives the Berry inversion: \begin{align*} \log\left(\frac{s_{jm}}{s_{0m}}\right) &= x_{jm}'\theta - \alpha p_{jm} + \xi_{jm}. \end{align*}

This equation looks like a linear regression. The problem is that $p_{jm}$ is endogenous. Products with high unobserved quality $\xi_{jm}$ have higher demand, and firms respond by charging higher prices. Therefore: \begin{align*} \operatorname{Cov}(p_{jm},\xi_{jm}) \neq 0. \end{align*}

As a result, OLS gives a biased estimate of the price coefficient.

We assume Bertrand competition. Each firm fixes a price optimally given the best response of all other firms. Firm (j)'s profit in market (m) is: \begin{align*} \pi_{jm}(p_{jm};p_{-jm}) &= (p_{jm} - mc_{jm}) \cdot s_{jm}(p_m). \end{align*}

The first-order condition is: \begin{align*} \frac{\partial}{\partial p_{jm}}\pi_{jm}(p_{jm};p_{-jm}) &= s_{jm}(p_m) + (p_{jm} - mc_{jm})\frac{\partial}{\partial p_{jm}}s_{jm}(p_m) = 0. \end{align*}

Under our logit assumption: \begin{align*} \frac{\partial s_{jm}}{\partial p_{jm}} &= -\alpha s_{jm}(1-s_{jm}). \end{align*}

So the Bertrand first-order condition can be written as: \begin{align*} p_{jm} &= mc_{jm} + \frac{1}{\alpha(1-s_{jm}(p_{jm}))}. \end{align*}

Note the fixed-point nature of the equation above: the price of product (j) depends on its market share, but its market share depends on the full vector of prices. We solve this fixed point numerically in each market.

We model marginal cost linearly: \begin{align*} mc_{jm} &= \gamma_0 + \gamma_w w_{jm} + x_{jm}'\gamma + \omega_{jm}. \end{align*}

Here $w_{jm}$ is an observed cost shifter. It affects marginal cost and therefore equilibrium prices, but it does not enter consumer utility directly. The term $\omega_{jm}$ is an unobserved cost shock.

This structure gives us the IV logic. A valid instrument must shift equilibrium prices, but must be excluded from demand: \begin{align*} \operatorname{Cov}(Z_{jm},p_{jm}) &\neq 0,\\ \operatorname{Cov}(Z_{jm},\xi_{jm}) &= 0. \end{align*}

In the simulation, true marginal cost (mc_{jm}) is available to us because we generated the data. We first use it as an oracle instrument to illustrate the IV logic as clearly as possible. In real data, marginal costs are usually not observed. The realistic analogue is to use observed cost shifters such as (w_{jm}), or BLP-style instruments based on rival product characteristics.

InĀ [1]:
import jax
import jax.numpy as jnp
import blackjax #Bayesian samplers

import math
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

import statsmodels.formula.api as smf
from linearmodels.iv import IV2SLS

import pyblp
pyblp.options.verbose = False
pyblp.options.digits = 4

Data Simulation¶

InĀ [2]:
key = jax.random.key(123)

M = 50          # number of markets
J = 15           # products per market
K = 2            # product characteristics

alpha_true = 0.7
theta_true = jnp.array([0.8, 0.6])

# Demand shock strength
xi_scale = 1.5

# Marginal cost parameters
gamma_0 = 0.5
gamma_x = jnp.array([0.05, 0.05])
gamma_w = 0.35
omega_scale = 0.08

# Price iteration
num_price_iter = 500
price_damping = 0.2
InĀ [3]:
key_X, key_xi, key_w, key_omega, subkey = jax.random.split(key, 5)

# Observed product characteristics
# Shape: M x J x K
X = jax.random.uniform(key_X, shape=(M, J, K))

# Demand shock: observed by firms, unobserved by econometrician
# Shape: M x J
xi = xi_scale * jax.random.normal(key_xi, shape=(M, J))

# Observed cost shifter: excluded from demand
# Shape: M x J
w = jax.random.uniform(key_w, shape=(M, J))

# Unobserved cost shock
omega = omega_scale * jax.random.normal(key_omega, shape=(M, J))

# Mean utility excluding price and xi
# v_jm = theta_1 x1_jm + theta_2 x2_jm
v = X@theta_true

# Hidden marginal cost
mc = gamma_0 + X@gamma_x + gamma_w * w + omega
InĀ [4]:
@jax.jit
def market_shares_one_market(v_m, p_m, xi_m, alpha):
    """
    Inside-good shares in one market.
    
    v_m:  shape (J,)
    p_m:  shape (J,)
    xi_m: shape (J,)
    """
    delta = v_m - alpha * p_m + xi_m
    full_delta = jnp.concatenate([jnp.array([0.0]), delta])
    full_shares = jax.nn.softmax(full_delta)
    return full_shares[1:]
InĀ [5]:
def solve_prices_one_market(
    v_m,
    xi_m,
    mc_m,
    alpha,
    damping=0.7,
    num_iter=500,
):
    """
    Fixed-point algorithm, that solves single-product Bertrand equilibrium prices in one market.
    """
    p0 = mc_m + 1.0 / alpha
    @jax.jit
    def body_fun(_, p):
        s = market_shares_one_market(v_m, p, xi_m, alpha)
        p_target = mc_m + 1.0 / (alpha * (1.0 - s))
        return p + damping * (p_target - p)

    p_eq = jax.lax.fori_loop(0, num_iter, body_fun, p0)
    s_eq = market_shares_one_market(v_m, p_eq, xi_m, alpha)

    return p_eq, s_eq


solve_prices_many_markets = jax.jit(
    jax.vmap(
        solve_prices_one_market,
        in_axes=(0, 0, 0, None, None, None),
    ),
    static_argnames=("num_iter",),
)
InĀ [6]:
p_eq, s_eq = solve_prices_many_markets(
    v,
    xi,
    mc,
    alpha_true,
    price_damping,
    num_price_iter,
)
InĀ [7]:
#checking that all FOC's are satisfied:
print(jnp.max(jnp.abs(p_eq - (mc + 1/(alpha_true*(1-s_eq))))))
9.536743e-07
InĀ [8]:
#Now, we build a dataset we may see in the real world

market_id = np.repeat(np.arange(M), J)
product_id = np.tile(np.arange(J), M)

agg_data = pd.DataFrame({
    "market_ids": market_id,
    "product_ids": product_id,

    "shares": np.asarray(s_eq).reshape(-1),
    "prices": np.asarray(p_eq).reshape(-1),

    "x1": np.asarray(X[:, :, 0]).reshape(-1),
    "x2": np.asarray(X[:, :, 1]).reshape(-1),
    "demand_instruments0": np.asarray(w).reshape(-1), #this is the cost shifter w

    # Hidden DGP objects, for diagnostics and oracle-IV only
    "xi_true": np.asarray(xi).reshape(-1),
    "mc_true": np.asarray(mc).reshape(-1),
})

# Outside share by market
agg_data["outside share"] = (
    1.0
    - agg_data.groupby("market_ids")["shares"].transform("sum")
)

# Berry inversion
agg_data["logratio"] = np.log(
    agg_data["shares"] / agg_data["outside share"]
)

agg_data.head()
Out[8]:
market_ids product_ids shares prices x1 x2 demand_instruments0 xi_true mc_true outside share logratio
0 0 0 0.004487 2.216928 0.311737 0.171598 0.183158 -1.174208 0.781918 0.048176 -2.373708
1 0 1 0.001273 2.267322 0.037933 0.381706 0.862042 -2.306006 0.836930 0.048176 -3.633761
2 0 2 0.009043 2.063396 0.519684 0.304478 0.115692 -0.826956 0.621789 0.048176 -1.672897
3 0 3 0.017060 2.099433 0.254136 0.439862 0.323540 -0.035779 0.646068 0.048176 -1.038155
4 0 4 0.176472 2.349051 0.753439 0.989069 0.270088 1.746440 0.614354 0.048176 1.298298
InĀ [9]:
sns.set_theme(style="whitegrid", font_scale=1.1)
plt.figure(figsize=(8, 6))
ax = sns.scatterplot(data=agg_data, 
                x='shares', 
                y='prices',
                hue='market_ids',
                alpha=0.5,   
                size=1,
                palette='mako',
                edgecolor='black',  
                color='#2077b4',
                legend=None)
plt.title('Price vs. Market Share', fontsize=16, fontweight='bold', pad=15)
plt.ylabel('Price', fontsize=12, fontweight='bold')
plt.xlabel('Market Share', fontsize=12, fontweight='bold')

sns.despine(left=True, bottom=True)

plt.tight_layout()
plt.show()
No description has been provided for this image

Warm-up: Frequentist approach¶

The model above is specified so that it can be estimated with linear models. $v_j$ is an alternative fixed effect, and we treat $\log\left(\frac{s_j}{s_0}\right)$ and prices as data. As is now usual, we do a frequentist warmup, before introducing Bayesian methods. First, we show that OLS yields biased estimates.

InĀ [10]:
ols_model = smf.ols(data=agg_data, formula='logratio ~ 0 + prices + x1 + x2').fit(cov_type='HC1')
print(ols_model.summary()) #see that the coefficients are biased, which should not be a surprise given the price endogeneity
                                 OLS Regression Results                                
=======================================================================================
Dep. Variable:               logratio   R-squared (uncentered):                   0.257
Model:                            OLS   Adj. R-squared (uncentered):              0.254
Method:                 Least Squares   F-statistic:                              78.35
Date:                Mon, 18 May 2026   Prob (F-statistic):                    4.51e-44
Time:                        11:29:53   Log-Likelihood:                         -1352.1
No. Observations:                 750   AIC:                                      2710.
Df Residuals:                     747   BIC:                                      2724.
Df Model:                           3                                                  
Covariance Type:                  HC1                                                  
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
prices        -0.4120      0.067     -6.113      0.000      -0.544      -0.280
x1             0.1998      0.184      1.084      0.278      -0.162       0.561
x2            -0.0469      0.189     -0.249      0.804      -0.417       0.323
==============================================================================
Omnibus:                        1.894   Durbin-Watson:                   2.008
Prob(Omnibus):                  0.388   Jarque-Bera (JB):                1.959
Skew:                          -0.118   Prob(JB):                        0.375
Kurtosis:                       2.916   Cond. No.                         8.74
==============================================================================

Notes:
[1] R² is computed without centering (uncentered) since the model does not contain a constant.
[2] Standard Errors are heteroscedasticity robust (HC1)

Two main types of instruments we can use to solve this problem are cost-shifters (here we modeled one as $w$), and BLP instruments, which consist in averaging over rival products' characteristics in each market.

InĀ [11]:
def add_blp_instruments(df, chars=("x1", "x2"), market_col="market_ids"):
    df = df.copy()

    for index, x in enumerate(chars):
        market_total = df.groupby(market_col)[x].transform("sum")
        df[f"demand_instruments{index+1}"] = market_total - df[x]

    return df


agg_data = add_blp_instruments(
    agg_data,
    chars=("x1", "x2"),
    market_col="market_ids",
)

agg_data.head()
Out[11]:
market_ids product_ids shares prices x1 x2 demand_instruments0 xi_true mc_true outside share logratio demand_instruments1 demand_instruments2
0 0 0 0.004487 2.216928 0.311737 0.171598 0.183158 -1.174208 0.781918 0.048176 -2.373708 7.915670 7.657978
1 0 1 0.001273 2.267322 0.037933 0.381706 0.862042 -2.306006 0.836930 0.048176 -3.633761 8.189474 7.447869
2 0 2 0.009043 2.063396 0.519684 0.304478 0.115692 -0.826956 0.621789 0.048176 -1.672897 7.707722 7.525097
3 0 3 0.017060 2.099433 0.254136 0.439862 0.323540 -0.035779 0.646068 0.048176 -1.038155 7.973271 7.389713
4 0 4 0.176472 2.349051 0.753439 0.989069 0.270088 1.746440 0.614354 0.048176 1.298298 7.473968 6.840506

First: see a first-best IV. If we could observe the true marginal cost, it would be an ideal instrument, theoretically grounded in the FOC of the firm's profit maximization problem.

InĀ [12]:
iv_mc_model = IV2SLS.from_formula(
    "logratio ~ 0 + x1 + x2 + [prices ~ mc_true]",
    data=agg_data,
).fit(cov_type="robust")

print(iv_mc_model.summary)
                          IV-2SLS Estimation Summary                          
==============================================================================
Dep. Variable:               logratio   R-squared:                      0.2376
Estimator:                    IV-2SLS   Adj. R-squared:                 0.2345
No. Observations:                 750   F-statistic:                    301.89
Date:                Mon, May 18 2026   P-value (F-stat)                0.0000
Time:                        11:29:53   Distribution:                  chi2(3)
Cov. Estimator:                robust                                         
                                                                              
                             Parameter Estimates                              
==============================================================================
            Parameter  Std. Err.     T-stat    P-value    Lower CI    Upper CI
------------------------------------------------------------------------------
x1             0.7354     0.1941     3.7893     0.0002      0.3550      1.1158
x2             0.5025     0.1941     2.5886     0.0096      0.1220      0.8829
prices        -0.6888     0.0692    -9.9522     0.0000     -0.8244     -0.5531
==============================================================================

Endogenous: prices
Instruments: mc_true
Robust Covariance (Heteroskedastic)
Debiased: False

And indeed, the estimates are very close to the true value. But using the marginal cost as an instrument is not realistic: we do not observe it ! At best, we may have access to a cost-shifter like $w$.

InĀ [13]:
iv_w_model = IV2SLS.from_formula(
    "logratio ~ 0 + x1 + x2 + [prices ~ demand_instruments0]",
    data=agg_data,
).fit(cov_type="robust")

print(iv_w_model.summary)
                          IV-2SLS Estimation Summary                          
==============================================================================
Dep. Variable:               logratio   R-squared:                      0.2423
Estimator:                    IV-2SLS   Adj. R-squared:                 0.2393
No. Observations:                 750   F-statistic:                    254.15
Date:                Mon, May 18 2026   P-value (F-stat)                0.0000
Time:                        11:29:53   Distribution:                  chi2(3)
Cov. Estimator:                robust                                         
                                                                              
                             Parameter Estimates                              
==============================================================================
            Parameter  Std. Err.     T-stat    P-value    Lower CI    Upper CI
------------------------------------------------------------------------------
x1             0.6647     0.2440     2.7241     0.0064      0.1865      1.1430
x2             0.4299     0.2352     1.8278     0.0676     -0.0311      0.8910
prices        -0.6522     0.0995    -6.5554     0.0000     -0.8472     -0.4572
==============================================================================

Endogenous: prices
Instruments: demand_instruments0
Robust Covariance (Heteroskedastic)
Debiased: False

We get something very close. If we try our BLP instruments:

InĀ [14]:
iv_blp_model = IV2SLS.from_formula(
    "logratio ~ 0 + x1 + x2 + [prices ~ demand_instruments1 + demand_instruments2]",
    data=agg_data,
).fit(cov_type="robust")

print(iv_blp_model.summary)
                          IV-2SLS Estimation Summary                          
==============================================================================
Dep. Variable:               logratio   R-squared:                      0.2394
Estimator:                    IV-2SLS   Adj. R-squared:                 0.2363
No. Observations:                 750   F-statistic:                    309.42
Date:                Mon, May 18 2026   P-value (F-stat)                0.0000
Time:                        11:29:53   Distribution:                  chi2(3)
Cov. Estimator:                robust                                         
                                                                              
                             Parameter Estimates                              
==============================================================================
            Parameter  Std. Err.     T-stat    P-value    Lower CI    Upper CI
------------------------------------------------------------------------------
x1             0.7101     0.1938     3.6639     0.0002      0.3303      1.0900
x2             0.4765     0.1983     2.4026     0.0163      0.0878      0.8652
prices        -0.6757     0.0697    -9.6967     0.0000     -0.8123     -0.5391
==============================================================================

Endogenous: prices
Instruments: demand_instruments1, demand_instruments2
Robust Covariance (Heteroskedastic)
Debiased: False

Also very close. Finally, with all instruments:

InĀ [15]:
iv_all_model = IV2SLS.from_formula(
    """
    logratio ~ 0 + x1 + x2
    + [prices ~ demand_instruments0 + demand_instruments1 + demand_instruments2]
    """,
    data=agg_data,
).fit(cov_type="robust")

print(iv_all_model.summary)
                          IV-2SLS Estimation Summary                          
==============================================================================
Dep. Variable:               logratio   R-squared:                      0.2398
Estimator:                    IV-2SLS   Adj. R-squared:                 0.2367
No. Observations:                 750   F-statistic:                    310.86
Date:                Mon, May 18 2026   P-value (F-stat)                0.0000
Time:                        11:29:53   Distribution:                  chi2(3)
Cov. Estimator:                robust                                         
                                                                              
                             Parameter Estimates                              
==============================================================================
            Parameter  Std. Err.     T-stat    P-value    Lower CI    Upper CI
------------------------------------------------------------------------------
x1             0.7039     0.1932     3.6433     0.0003      0.3252      1.0826
x2             0.4701     0.1953     2.4070     0.0161      0.0873      0.8529
prices        -0.6725     0.0684    -9.8260     0.0000     -0.8066     -0.5383
==============================================================================

Endogenous: prices
Instruments: demand_instruments0, demand_instruments1, demand_instruments2
Robust Covariance (Heteroskedastic)
Debiased: False

Note that we can also use a dedicated package, pyBLP, for this. https://jeffgortmaker.com/files/Best_Practices_for_Differentiated_Products_Demand_Estimation_with_PyBLP.pdf

InĀ [16]:
formulation = pyblp.Formulation("0 + prices + x1 + x2")

problem = pyblp.Problem(
    product_formulations=formulation,
    product_data=agg_data,
)

results = problem.solve()
def extract_beta(results):
    return pd.Series(
        results.beta.flatten(),
        index=["prices", "x1", "x2"],
    )
InĀ [17]:
## Final comparison of the frequentist estimates:

truth = pd.Series({
    "prices": -float(alpha_true),
    "x1": float(theta_true[0]),
    "x2": float(theta_true[1]),
})

comparison = pd.DataFrame({
    "True": truth,
    "OLS": ols_model.params[["prices", "x1", "x2"]],
    "IV: oracle mc": iv_mc_model.params[["prices", "x1", "x2"]],
    "IV: w": iv_w_model.params[["prices", "x1", "x2"]],
    "IV: BLP": iv_blp_model.params[["prices", "x1", "x2"]],
    "IV: all": iv_all_model.params[["prices", "x1", "x2"]],
    "PyBLP": extract_beta(results)
})

display(comparison)
True OLS IV: oracle mc IV: w IV: BLP IV: all PyBLP
prices -0.7 -0.411961 -0.688755 -0.652227 -0.675682 -0.672459 -0.672286
x1 0.8 0.199829 0.735427 0.664745 0.710131 0.703894 0.707738
x2 0.6 -0.046943 0.502450 0.429948 0.476503 0.470106 0.464750

A Quasi-Bayesian Approach¶

In Episode 2, we used the Bayesian approach as a substitute for maximum likelihood estimation. In this episode, our parameters $\beta=(\theta_1, \theta_2, \alpha)$ are identified by the moment condition $\text{E}[Z_{jm}\xi_{jm}(\beta)]=0$. One way to introduce a Bayesian approach here is to do Bayesian GMM posterior.

The IV moment condition is: \begin{align*} \mathbb{E}\left[Z_{jm}\xi_{jm}(\beta)\right] &= 0, \end{align*} where \begin{align*} \xi_{jm}(\beta) &= \log\left(\frac{s_{jm}}{s_{0m}}\right) +\alpha p_{jm} - \theta_1x_{1jm} - \theta_2x_{2jm}. \end{align*}

The sample moment is: \begin{align*} g_N(\beta) &= \frac{1}{N}\sum_{m=1}^M\sum_{j=1}^J Z_{jm}\xi_{jm}(\beta). \end{align*}

Classical GMM chooses $\beta$ to minimize: \begin{align*} Q_N(\beta) &= g_N(\beta)'Wg_N(\beta). \end{align*}

The Bayesian GMM approach instead builds a quasi-posterior: \begin{align*} \pi(\beta\mid data) &\propto \exp\left(-\frac{N}{2}Q_N(\beta)\right)\pi(\beta). \end{align*}

This is not a likelihood in the usual parametric sense. It is a likelihood-like object built from the moment restrictions. The posterior concentrates around parameter values that make the IV moments close to zero. To see this, note that, when $Q_N$ is small, $\exp(-N Q_N)$ is large, and the reverse holds as well. This means that the log-posterior "puts" more density on values of the parameters such that the GMM objective is small.

We sample from the posterior distribution above using Bayesian samplers. This procedure is close to the one described in Chernozhukov and Hong (2003) and Yin (2009). Note that this is not, strictly speaking, a pure Bayesian approach.

InĀ [18]:
# Outcome
y = jnp.array(agg_data["logratio"].to_numpy())

# Regressors: price, x1, x2
X_gmm = jnp.array(agg_data[["prices", "x1", "x2"]].to_numpy())

# Instruments: exogenous regressors + excluded instruments
Z_gmm = jnp.array(agg_data[["x1", "x2", "demand_instruments0", "demand_instruments1", "demand_instruments2"]].to_numpy())

N = y.shape[0]
K_beta = X_gmm.shape[1]
K_z = Z_gmm.shape[1]
InĀ [19]:
ZZ = (Z_gmm.T @ Z_gmm) / N
W = jnp.linalg.inv(ZZ + 1e-8 * jnp.eye(K_z)) #GMM weight matrix
InĀ [20]:
@jax.jit
def gmm_moments(beta):
    """
    beta = (beta_price, beta_x1, beta_x2)
    """
    xi_beta = y - X_gmm @ beta
    moments = (Z_gmm.T @ xi_beta) / N
    return moments

@jax.jit
def gmm_criterion(beta):
    g = gmm_moments(beta)
    return g @ W @ g

@jax.jit
def log_prior_beta(beta):
    # Weak normal prior
    return -0.5 * jnp.sum((beta / 10.0) ** 2)

@jax.jit
def log_quasi_posterior(beta):
    return -0.5 * N * gmm_criterion(beta) + log_prior_beta(beta)
InĀ [21]:
XZ = X_gmm.T @ Z_gmm
ZX = Z_gmm.T @ X_gmm
Zy = Z_gmm.T @ y

beta_gmm_init = jnp.linalg.solve(
    XZ @ W @ ZX,
    XZ @ W @ Zy,
)

print("GMM initialization:", beta_gmm_init)
print("initial quasi posterior:", log_quasi_posterior(beta_gmm_init)) 
#we initialize the bayesian sampling at the educated guess of the frequentist estimate
GMM initialization: [-0.6724593   0.7038948   0.47010505]
initial quasi posterior: -0.0965331
InĀ [22]:
warmup_key, sample_key, subkey = jax.random.split(subkey, 3)

warmup = blackjax.window_adaptation(
    blackjax.nuts,
    log_quasi_posterior,
    target_acceptance_rate=0.8,
    is_mass_matrix_diagonal=False,
)

adaptation_results, warmup_info = warmup.run(
    warmup_key,
    beta_gmm_init,
    num_steps=1_500,
)

nuts = blackjax.nuts(
    log_quasi_posterior,
    **adaptation_results.parameters,
)
InĀ [23]:
def inference_loop(rng_key, kernel, initial_state, num_samples):
    @jax.jit
    def one_step(state, rng_key):
        state, info = kernel(rng_key, state)
        return state, (state.position, info)

    keys = jax.random.split(rng_key, num_samples)
    _, (samples, info) = jax.lax.scan(one_step, initial_state, keys)

    return samples, info


beta_samples, info = inference_loop(
    sample_key,
    nuts.step,
    adaptation_results.state,
    num_samples=25_000,
)

beta_samples.block_until_ready()

print("divergences:", info.is_divergent.sum())
print("mean acceptance rate:", info.acceptance_rate.mean())
divergences: 0
mean acceptance rate: 0.9076951
InĀ [24]:
fig, ax = plt.subplots(figsize=(13, 4))
samples_df = pd.DataFrame(beta_samples)
samples_df.columns = ['price', 'x1', 'x2']

sns.histplot(
    data=samples_df,
    bins=120,
    kde=True,
    stat="density",
    common_norm=False,
    fill=True,
    ax=ax,
)

ax.set_title("Quasi-Posterior distributions from BlackJax NUTS")
ax.set_xlabel(r"Utility value relative to $v_0=0$")
ax.set_ylabel("Posterior density")

sns.move_legend(
    ax,
    "center left",
    bbox_to_anchor=(1.01, 0.5),
    title="Parameter",
    frameon=False,
)

fig.tight_layout()
plt.show()
No description has been provided for this image
InĀ [25]:
param_names = ["prices", "x1", "x2"]

truth = pd.Series({
    "prices": -float(alpha_true),
    "x1": float(theta_true[0]),
    "x2": float(theta_true[1]),
})

# Bayesian GMM posterior summaries
bayes_mean = pd.Series(np.asarray(beta_samples.mean(axis=0)), index=param_names)

comparison = pd.DataFrame({
    "True": truth,
    "OLS": ols_model.params[param_names],
    "IV: MC shifter": iv_w_model.params[param_names],
    "IV: BLP": iv_blp_model.params[param_names],
    "IV: all": iv_all_model.params[param_names],
    "PyBLP": extract_beta(results)[param_names],
    "Quasi-Bayes GMM": bayes_mean,
})

comparison = comparison.round(4)

display(comparison)
True OLS IV: MC shifter IV: BLP IV: all PyBLP Quasi-Bayes GMM
prices -0.7 -0.4120 -0.6522 -0.6757 -0.6725 -0.6723 -0.6726
x1 0.8 0.1998 0.6647 0.7101 0.7039 0.7077 0.7039
x2 0.6 -0.0469 0.4299 0.4765 0.4701 0.4648 0.4709