So far, introducing a Bayesian method sounds rather cosmetic. Instead of assuming a fixed parameter and recovering an estimate and its distribution, we assume a random parameter and recover its posterior distribution. These approaches are philosophically different, but they do not seem to bring much more than a robustness test so far. Before moving on to more advanced models in which the Bayesian approach may yield gains, let's write a slightly more advanced discrete choice models with prices, competition and endogeneity.
There are still (J) inside goods, sold by (J) one-product sellers, in each of (M) markets. Each buyer has unit demand and buys at most one good. Prices are endogenous: we simulate equilibrium prices and show how unaddressed endogeneity may lead to bias.
Someone unfamiliar with the IO literature may wonder why we introduce several markets. The reason is that observing how firms compete across different markets gives rise to layers of variation that help identification. Feel free to modify this notebook and try it yourself: a simulation with a single market is usually too fragile to cleanly resolve the endogeneity problem that we discuss below.
The following specification is adopted from Berry (1994). Note that there is still no modelling of random coefficients: \begin{align*} u_{ijm} &= x_{jm}'\theta - \alpha p_{jm} + \xi_{jm} + \varepsilon_{ijm}, \qquad \varepsilon_{ijm} \sim_{\text{iid}} \text{Gumbel}. \end{align*}
Here $\xi_{jm}$ is the unobserved systematic quality of alternative $j$ in market $m$. The econometrician observes $x_{jm}$, $p_{jm}$, and market shares, but not $\xi_{jm}$. Firms, however, observe $\xi_{jm}$ when setting prices.
The implied market shares are: \begin{align*} s_{jm} &= \frac{\exp(x_{jm}'\theta - \alpha p_{jm} + \xi_{jm})}{1+\sum_{k \in \mathcal{J}_m} \exp(x_{km}'\theta - \alpha p_{km} + \xi_{km})},\\ s_{0m} &= \frac{1}{1+\sum_{k \in \mathcal{J}_m} \exp(x_{km}'\theta - \alpha p_{km} + \xi_{km})}. \end{align*}
Therefore, taking the log ratio of product (j)'s share to the outside share gives the Berry inversion: \begin{align*} \log\left(\frac{s_{jm}}{s_{0m}}\right) &= x_{jm}'\theta - \alpha p_{jm} + \xi_{jm}. \end{align*}
This equation looks like a linear regression. The problem is that $p_{jm}$ is endogenous. Products with high unobserved quality $\xi_{jm}$ have higher demand, and firms respond by charging higher prices. Therefore: \begin{align*} \operatorname{Cov}(p_{jm},\xi_{jm}) \neq 0. \end{align*}
As a result, OLS gives a biased estimate of the price coefficient.
We assume Bertrand competition. Each firm fixes a price optimally given the best response of all other firms. Firm (j)'s profit in market (m) is: \begin{align*} \pi_{jm}(p_{jm};p_{-jm}) &= (p_{jm} - mc_{jm}) \cdot s_{jm}(p_m). \end{align*}
The first-order condition is: \begin{align*} \frac{\partial}{\partial p_{jm}}\pi_{jm}(p_{jm};p_{-jm}) &= s_{jm}(p_m) + (p_{jm} - mc_{jm})\frac{\partial}{\partial p_{jm}}s_{jm}(p_m) = 0. \end{align*}
Under our logit assumption: \begin{align*} \frac{\partial s_{jm}}{\partial p_{jm}} &= -\alpha s_{jm}(1-s_{jm}). \end{align*}
So the Bertrand first-order condition can be written as: \begin{align*} p_{jm} &= mc_{jm} + \frac{1}{\alpha(1-s_{jm}(p_{jm}))}. \end{align*}
Note the fixed-point nature of the equation above: the price of product (j) depends on its market share, but its market share depends on the full vector of prices. We solve this fixed point numerically in each market.
We model marginal cost linearly: \begin{align*} mc_{jm} &= \gamma_0 + \gamma_w w_{jm} + x_{jm}'\gamma + \omega_{jm}. \end{align*}
Here $w_{jm}$ is an observed cost shifter. It affects marginal cost and therefore equilibrium prices, but it does not enter consumer utility directly. The term $\omega_{jm}$ is an unobserved cost shock.
This structure gives us the IV logic. A valid instrument must shift equilibrium prices, but must be excluded from demand: \begin{align*} \operatorname{Cov}(Z_{jm},p_{jm}) &\neq 0,\\ \operatorname{Cov}(Z_{jm},\xi_{jm}) &= 0. \end{align*}
In the simulation, true marginal cost (mc_{jm}) is available to us because we generated the data. We first use it as an oracle instrument to illustrate the IV logic as clearly as possible. In real data, marginal costs are usually not observed. The realistic analogue is to use observed cost shifters such as (w_{jm}), or BLP-style instruments based on rival product characteristics.
import jax
import jax.numpy as jnp
import blackjax #Bayesian samplers
import math
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import statsmodels.formula.api as smf
from linearmodels.iv import IV2SLS
import pyblp
pyblp.options.verbose = False
pyblp.options.digits = 4
Data Simulation¶
key = jax.random.key(123)
M = 50 # number of markets
J = 15 # products per market
K = 2 # product characteristics
alpha_true = 0.7
theta_true = jnp.array([0.8, 0.6])
# Demand shock strength
xi_scale = 1.5
# Marginal cost parameters
gamma_0 = 0.5
gamma_x = jnp.array([0.05, 0.05])
gamma_w = 0.35
omega_scale = 0.08
# Price iteration
num_price_iter = 500
price_damping = 0.2
key_X, key_xi, key_w, key_omega, subkey = jax.random.split(key, 5)
# Observed product characteristics
# Shape: M x J x K
X = jax.random.uniform(key_X, shape=(M, J, K))
# Demand shock: observed by firms, unobserved by econometrician
# Shape: M x J
xi = xi_scale * jax.random.normal(key_xi, shape=(M, J))
# Observed cost shifter: excluded from demand
# Shape: M x J
w = jax.random.uniform(key_w, shape=(M, J))
# Unobserved cost shock
omega = omega_scale * jax.random.normal(key_omega, shape=(M, J))
# Mean utility excluding price and xi
# v_jm = theta_1 x1_jm + theta_2 x2_jm
v = X@theta_true
# Hidden marginal cost
mc = gamma_0 + X@gamma_x + gamma_w * w + omega
@jax.jit
def market_shares_one_market(v_m, p_m, xi_m, alpha):
"""
Inside-good shares in one market.
v_m: shape (J,)
p_m: shape (J,)
xi_m: shape (J,)
"""
delta = v_m - alpha * p_m + xi_m
full_delta = jnp.concatenate([jnp.array([0.0]), delta])
full_shares = jax.nn.softmax(full_delta)
return full_shares[1:]
def solve_prices_one_market(
v_m,
xi_m,
mc_m,
alpha,
damping=0.7,
num_iter=500,
):
"""
Fixed-point algorithm, that solves single-product Bertrand equilibrium prices in one market.
"""
p0 = mc_m + 1.0 / alpha
@jax.jit
def body_fun(_, p):
s = market_shares_one_market(v_m, p, xi_m, alpha)
p_target = mc_m + 1.0 / (alpha * (1.0 - s))
return p + damping * (p_target - p)
p_eq = jax.lax.fori_loop(0, num_iter, body_fun, p0)
s_eq = market_shares_one_market(v_m, p_eq, xi_m, alpha)
return p_eq, s_eq
solve_prices_many_markets = jax.jit(
jax.vmap(
solve_prices_one_market,
in_axes=(0, 0, 0, None, None, None),
),
static_argnames=("num_iter",),
)
p_eq, s_eq = solve_prices_many_markets(
v,
xi,
mc,
alpha_true,
price_damping,
num_price_iter,
)
#checking that all FOC's are satisfied:
print(jnp.max(jnp.abs(p_eq - (mc + 1/(alpha_true*(1-s_eq))))))
9.536743e-07
#Now, we build a dataset we may see in the real world
market_id = np.repeat(np.arange(M), J)
product_id = np.tile(np.arange(J), M)
agg_data = pd.DataFrame({
"market_ids": market_id,
"product_ids": product_id,
"shares": np.asarray(s_eq).reshape(-1),
"prices": np.asarray(p_eq).reshape(-1),
"x1": np.asarray(X[:, :, 0]).reshape(-1),
"x2": np.asarray(X[:, :, 1]).reshape(-1),
"demand_instruments0": np.asarray(w).reshape(-1), #this is the cost shifter w
# Hidden DGP objects, for diagnostics and oracle-IV only
"xi_true": np.asarray(xi).reshape(-1),
"mc_true": np.asarray(mc).reshape(-1),
})
# Outside share by market
agg_data["outside share"] = (
1.0
- agg_data.groupby("market_ids")["shares"].transform("sum")
)
# Berry inversion
agg_data["logratio"] = np.log(
agg_data["shares"] / agg_data["outside share"]
)
agg_data.head()
| market_ids | product_ids | shares | prices | x1 | x2 | demand_instruments0 | xi_true | mc_true | outside share | logratio | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 0.004487 | 2.216928 | 0.311737 | 0.171598 | 0.183158 | -1.174208 | 0.781918 | 0.048176 | -2.373708 |
| 1 | 0 | 1 | 0.001273 | 2.267322 | 0.037933 | 0.381706 | 0.862042 | -2.306006 | 0.836930 | 0.048176 | -3.633761 |
| 2 | 0 | 2 | 0.009043 | 2.063396 | 0.519684 | 0.304478 | 0.115692 | -0.826956 | 0.621789 | 0.048176 | -1.672897 |
| 3 | 0 | 3 | 0.017060 | 2.099433 | 0.254136 | 0.439862 | 0.323540 | -0.035779 | 0.646068 | 0.048176 | -1.038155 |
| 4 | 0 | 4 | 0.176472 | 2.349051 | 0.753439 | 0.989069 | 0.270088 | 1.746440 | 0.614354 | 0.048176 | 1.298298 |
sns.set_theme(style="whitegrid", font_scale=1.1)
plt.figure(figsize=(8, 6))
ax = sns.scatterplot(data=agg_data,
x='shares',
y='prices',
hue='market_ids',
alpha=0.5,
size=1,
palette='mako',
edgecolor='black',
color='#2077b4',
legend=None)
plt.title('Price vs. Market Share', fontsize=16, fontweight='bold', pad=15)
plt.ylabel('Price', fontsize=12, fontweight='bold')
plt.xlabel('Market Share', fontsize=12, fontweight='bold')
sns.despine(left=True, bottom=True)
plt.tight_layout()
plt.show()
Warm-up: Frequentist approach¶
The model above is specified so that it can be estimated with linear models. $v_j$ is an alternative fixed effect, and we treat $\log\left(\frac{s_j}{s_0}\right)$ and prices as data. As is now usual, we do a frequentist warmup, before introducing Bayesian methods. First, we show that OLS yields biased estimates.
ols_model = smf.ols(data=agg_data, formula='logratio ~ 0 + prices + x1 + x2').fit(cov_type='HC1')
print(ols_model.summary()) #see that the coefficients are biased, which should not be a surprise given the price endogeneity
OLS Regression Results
=======================================================================================
Dep. Variable: logratio R-squared (uncentered): 0.257
Model: OLS Adj. R-squared (uncentered): 0.254
Method: Least Squares F-statistic: 78.35
Date: Mon, 18 May 2026 Prob (F-statistic): 4.51e-44
Time: 11:29:53 Log-Likelihood: -1352.1
No. Observations: 750 AIC: 2710.
Df Residuals: 747 BIC: 2724.
Df Model: 3
Covariance Type: HC1
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
prices -0.4120 0.067 -6.113 0.000 -0.544 -0.280
x1 0.1998 0.184 1.084 0.278 -0.162 0.561
x2 -0.0469 0.189 -0.249 0.804 -0.417 0.323
==============================================================================
Omnibus: 1.894 Durbin-Watson: 2.008
Prob(Omnibus): 0.388 Jarque-Bera (JB): 1.959
Skew: -0.118 Prob(JB): 0.375
Kurtosis: 2.916 Cond. No. 8.74
==============================================================================
Notes:
[1] R² is computed without centering (uncentered) since the model does not contain a constant.
[2] Standard Errors are heteroscedasticity robust (HC1)
Two main types of instruments we can use to solve this problem are cost-shifters (here we modeled one as $w$), and BLP instruments, which consist in averaging over rival products' characteristics in each market.
def add_blp_instruments(df, chars=("x1", "x2"), market_col="market_ids"):
df = df.copy()
for index, x in enumerate(chars):
market_total = df.groupby(market_col)[x].transform("sum")
df[f"demand_instruments{index+1}"] = market_total - df[x]
return df
agg_data = add_blp_instruments(
agg_data,
chars=("x1", "x2"),
market_col="market_ids",
)
agg_data.head()
| market_ids | product_ids | shares | prices | x1 | x2 | demand_instruments0 | xi_true | mc_true | outside share | logratio | demand_instruments1 | demand_instruments2 | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 0.004487 | 2.216928 | 0.311737 | 0.171598 | 0.183158 | -1.174208 | 0.781918 | 0.048176 | -2.373708 | 7.915670 | 7.657978 |
| 1 | 0 | 1 | 0.001273 | 2.267322 | 0.037933 | 0.381706 | 0.862042 | -2.306006 | 0.836930 | 0.048176 | -3.633761 | 8.189474 | 7.447869 |
| 2 | 0 | 2 | 0.009043 | 2.063396 | 0.519684 | 0.304478 | 0.115692 | -0.826956 | 0.621789 | 0.048176 | -1.672897 | 7.707722 | 7.525097 |
| 3 | 0 | 3 | 0.017060 | 2.099433 | 0.254136 | 0.439862 | 0.323540 | -0.035779 | 0.646068 | 0.048176 | -1.038155 | 7.973271 | 7.389713 |
| 4 | 0 | 4 | 0.176472 | 2.349051 | 0.753439 | 0.989069 | 0.270088 | 1.746440 | 0.614354 | 0.048176 | 1.298298 | 7.473968 | 6.840506 |
First: see a first-best IV. If we could observe the true marginal cost, it would be an ideal instrument, theoretically grounded in the FOC of the firm's profit maximization problem.
iv_mc_model = IV2SLS.from_formula(
"logratio ~ 0 + x1 + x2 + [prices ~ mc_true]",
data=agg_data,
).fit(cov_type="robust")
print(iv_mc_model.summary)
IV-2SLS Estimation Summary
==============================================================================
Dep. Variable: logratio R-squared: 0.2376
Estimator: IV-2SLS Adj. R-squared: 0.2345
No. Observations: 750 F-statistic: 301.89
Date: Mon, May 18 2026 P-value (F-stat) 0.0000
Time: 11:29:53 Distribution: chi2(3)
Cov. Estimator: robust
Parameter Estimates
==============================================================================
Parameter Std. Err. T-stat P-value Lower CI Upper CI
------------------------------------------------------------------------------
x1 0.7354 0.1941 3.7893 0.0002 0.3550 1.1158
x2 0.5025 0.1941 2.5886 0.0096 0.1220 0.8829
prices -0.6888 0.0692 -9.9522 0.0000 -0.8244 -0.5531
==============================================================================
Endogenous: prices
Instruments: mc_true
Robust Covariance (Heteroskedastic)
Debiased: False
And indeed, the estimates are very close to the true value. But using the marginal cost as an instrument is not realistic: we do not observe it ! At best, we may have access to a cost-shifter like $w$.
iv_w_model = IV2SLS.from_formula(
"logratio ~ 0 + x1 + x2 + [prices ~ demand_instruments0]",
data=agg_data,
).fit(cov_type="robust")
print(iv_w_model.summary)
IV-2SLS Estimation Summary
==============================================================================
Dep. Variable: logratio R-squared: 0.2423
Estimator: IV-2SLS Adj. R-squared: 0.2393
No. Observations: 750 F-statistic: 254.15
Date: Mon, May 18 2026 P-value (F-stat) 0.0000
Time: 11:29:53 Distribution: chi2(3)
Cov. Estimator: robust
Parameter Estimates
==============================================================================
Parameter Std. Err. T-stat P-value Lower CI Upper CI
------------------------------------------------------------------------------
x1 0.6647 0.2440 2.7241 0.0064 0.1865 1.1430
x2 0.4299 0.2352 1.8278 0.0676 -0.0311 0.8910
prices -0.6522 0.0995 -6.5554 0.0000 -0.8472 -0.4572
==============================================================================
Endogenous: prices
Instruments: demand_instruments0
Robust Covariance (Heteroskedastic)
Debiased: False
We get something very close. If we try our BLP instruments:
iv_blp_model = IV2SLS.from_formula(
"logratio ~ 0 + x1 + x2 + [prices ~ demand_instruments1 + demand_instruments2]",
data=agg_data,
).fit(cov_type="robust")
print(iv_blp_model.summary)
IV-2SLS Estimation Summary
==============================================================================
Dep. Variable: logratio R-squared: 0.2394
Estimator: IV-2SLS Adj. R-squared: 0.2363
No. Observations: 750 F-statistic: 309.42
Date: Mon, May 18 2026 P-value (F-stat) 0.0000
Time: 11:29:53 Distribution: chi2(3)
Cov. Estimator: robust
Parameter Estimates
==============================================================================
Parameter Std. Err. T-stat P-value Lower CI Upper CI
------------------------------------------------------------------------------
x1 0.7101 0.1938 3.6639 0.0002 0.3303 1.0900
x2 0.4765 0.1983 2.4026 0.0163 0.0878 0.8652
prices -0.6757 0.0697 -9.6967 0.0000 -0.8123 -0.5391
==============================================================================
Endogenous: prices
Instruments: demand_instruments1, demand_instruments2
Robust Covariance (Heteroskedastic)
Debiased: False
Also very close. Finally, with all instruments:
iv_all_model = IV2SLS.from_formula(
"""
logratio ~ 0 + x1 + x2
+ [prices ~ demand_instruments0 + demand_instruments1 + demand_instruments2]
""",
data=agg_data,
).fit(cov_type="robust")
print(iv_all_model.summary)
IV-2SLS Estimation Summary
==============================================================================
Dep. Variable: logratio R-squared: 0.2398
Estimator: IV-2SLS Adj. R-squared: 0.2367
No. Observations: 750 F-statistic: 310.86
Date: Mon, May 18 2026 P-value (F-stat) 0.0000
Time: 11:29:53 Distribution: chi2(3)
Cov. Estimator: robust
Parameter Estimates
==============================================================================
Parameter Std. Err. T-stat P-value Lower CI Upper CI
------------------------------------------------------------------------------
x1 0.7039 0.1932 3.6433 0.0003 0.3252 1.0826
x2 0.4701 0.1953 2.4070 0.0161 0.0873 0.8529
prices -0.6725 0.0684 -9.8260 0.0000 -0.8066 -0.5383
==============================================================================
Endogenous: prices
Instruments: demand_instruments0, demand_instruments1, demand_instruments2
Robust Covariance (Heteroskedastic)
Debiased: False
Note that we can also use a dedicated package, pyBLP, for this. https://jeffgortmaker.com/files/Best_Practices_for_Differentiated_Products_Demand_Estimation_with_PyBLP.pdf
formulation = pyblp.Formulation("0 + prices + x1 + x2")
problem = pyblp.Problem(
product_formulations=formulation,
product_data=agg_data,
)
results = problem.solve()
def extract_beta(results):
return pd.Series(
results.beta.flatten(),
index=["prices", "x1", "x2"],
)
## Final comparison of the frequentist estimates:
truth = pd.Series({
"prices": -float(alpha_true),
"x1": float(theta_true[0]),
"x2": float(theta_true[1]),
})
comparison = pd.DataFrame({
"True": truth,
"OLS": ols_model.params[["prices", "x1", "x2"]],
"IV: oracle mc": iv_mc_model.params[["prices", "x1", "x2"]],
"IV: w": iv_w_model.params[["prices", "x1", "x2"]],
"IV: BLP": iv_blp_model.params[["prices", "x1", "x2"]],
"IV: all": iv_all_model.params[["prices", "x1", "x2"]],
"PyBLP": extract_beta(results)
})
display(comparison)
| True | OLS | IV: oracle mc | IV: w | IV: BLP | IV: all | PyBLP | |
|---|---|---|---|---|---|---|---|
| prices | -0.7 | -0.411961 | -0.688755 | -0.652227 | -0.675682 | -0.672459 | -0.672286 |
| x1 | 0.8 | 0.199829 | 0.735427 | 0.664745 | 0.710131 | 0.703894 | 0.707738 |
| x2 | 0.6 | -0.046943 | 0.502450 | 0.429948 | 0.476503 | 0.470106 | 0.464750 |
A Quasi-Bayesian Approach¶
In Episode 2, we used the Bayesian approach as a substitute for maximum likelihood estimation. In this episode, our parameters $\beta=(\theta_1, \theta_2, \alpha)$ are identified by the moment condition $\text{E}[Z_{jm}\xi_{jm}(\beta)]=0$. One way to introduce a Bayesian approach here is to do Bayesian GMM posterior.
The IV moment condition is: \begin{align*} \mathbb{E}\left[Z_{jm}\xi_{jm}(\beta)\right] &= 0, \end{align*} where \begin{align*} \xi_{jm}(\beta) &= \log\left(\frac{s_{jm}}{s_{0m}}\right) +\alpha p_{jm} - \theta_1x_{1jm} - \theta_2x_{2jm}. \end{align*}
The sample moment is: \begin{align*} g_N(\beta) &= \frac{1}{N}\sum_{m=1}^M\sum_{j=1}^J Z_{jm}\xi_{jm}(\beta). \end{align*}
Classical GMM chooses $\beta$ to minimize: \begin{align*} Q_N(\beta) &= g_N(\beta)'Wg_N(\beta). \end{align*}
The Bayesian GMM approach instead builds a quasi-posterior: \begin{align*} \pi(\beta\mid data) &\propto \exp\left(-\frac{N}{2}Q_N(\beta)\right)\pi(\beta). \end{align*}
This is not a likelihood in the usual parametric sense. It is a likelihood-like object built from the moment restrictions. The posterior concentrates around parameter values that make the IV moments close to zero. To see this, note that, when $Q_N$ is small, $\exp(-N Q_N)$ is large, and the reverse holds as well. This means that the log-posterior "puts" more density on values of the parameters such that the GMM objective is small.
We sample from the posterior distribution above using Bayesian samplers. This procedure is close to the one described in Chernozhukov and Hong (2003) and Yin (2009). Note that this is not, strictly speaking, a pure Bayesian approach.
# Outcome
y = jnp.array(agg_data["logratio"].to_numpy())
# Regressors: price, x1, x2
X_gmm = jnp.array(agg_data[["prices", "x1", "x2"]].to_numpy())
# Instruments: exogenous regressors + excluded instruments
Z_gmm = jnp.array(agg_data[["x1", "x2", "demand_instruments0", "demand_instruments1", "demand_instruments2"]].to_numpy())
N = y.shape[0]
K_beta = X_gmm.shape[1]
K_z = Z_gmm.shape[1]
ZZ = (Z_gmm.T @ Z_gmm) / N
W = jnp.linalg.inv(ZZ + 1e-8 * jnp.eye(K_z)) #GMM weight matrix
@jax.jit
def gmm_moments(beta):
"""
beta = (beta_price, beta_x1, beta_x2)
"""
xi_beta = y - X_gmm @ beta
moments = (Z_gmm.T @ xi_beta) / N
return moments
@jax.jit
def gmm_criterion(beta):
g = gmm_moments(beta)
return g @ W @ g
@jax.jit
def log_prior_beta(beta):
# Weak normal prior
return -0.5 * jnp.sum((beta / 10.0) ** 2)
@jax.jit
def log_quasi_posterior(beta):
return -0.5 * N * gmm_criterion(beta) + log_prior_beta(beta)
XZ = X_gmm.T @ Z_gmm
ZX = Z_gmm.T @ X_gmm
Zy = Z_gmm.T @ y
beta_gmm_init = jnp.linalg.solve(
XZ @ W @ ZX,
XZ @ W @ Zy,
)
print("GMM initialization:", beta_gmm_init)
print("initial quasi posterior:", log_quasi_posterior(beta_gmm_init))
#we initialize the bayesian sampling at the educated guess of the frequentist estimate
GMM initialization: [-0.6724593 0.7038948 0.47010505] initial quasi posterior: -0.0965331
warmup_key, sample_key, subkey = jax.random.split(subkey, 3)
warmup = blackjax.window_adaptation(
blackjax.nuts,
log_quasi_posterior,
target_acceptance_rate=0.8,
is_mass_matrix_diagonal=False,
)
adaptation_results, warmup_info = warmup.run(
warmup_key,
beta_gmm_init,
num_steps=1_500,
)
nuts = blackjax.nuts(
log_quasi_posterior,
**adaptation_results.parameters,
)
def inference_loop(rng_key, kernel, initial_state, num_samples):
@jax.jit
def one_step(state, rng_key):
state, info = kernel(rng_key, state)
return state, (state.position, info)
keys = jax.random.split(rng_key, num_samples)
_, (samples, info) = jax.lax.scan(one_step, initial_state, keys)
return samples, info
beta_samples, info = inference_loop(
sample_key,
nuts.step,
adaptation_results.state,
num_samples=25_000,
)
beta_samples.block_until_ready()
print("divergences:", info.is_divergent.sum())
print("mean acceptance rate:", info.acceptance_rate.mean())
divergences: 0 mean acceptance rate: 0.9076951
fig, ax = plt.subplots(figsize=(13, 4))
samples_df = pd.DataFrame(beta_samples)
samples_df.columns = ['price', 'x1', 'x2']
sns.histplot(
data=samples_df,
bins=120,
kde=True,
stat="density",
common_norm=False,
fill=True,
ax=ax,
)
ax.set_title("Quasi-Posterior distributions from BlackJax NUTS")
ax.set_xlabel(r"Utility value relative to $v_0=0$")
ax.set_ylabel("Posterior density")
sns.move_legend(
ax,
"center left",
bbox_to_anchor=(1.01, 0.5),
title="Parameter",
frameon=False,
)
fig.tight_layout()
plt.show()
param_names = ["prices", "x1", "x2"]
truth = pd.Series({
"prices": -float(alpha_true),
"x1": float(theta_true[0]),
"x2": float(theta_true[1]),
})
# Bayesian GMM posterior summaries
bayes_mean = pd.Series(np.asarray(beta_samples.mean(axis=0)), index=param_names)
comparison = pd.DataFrame({
"True": truth,
"OLS": ols_model.params[param_names],
"IV: MC shifter": iv_w_model.params[param_names],
"IV: BLP": iv_blp_model.params[param_names],
"IV: all": iv_all_model.params[param_names],
"PyBLP": extract_beta(results)[param_names],
"Quasi-Bayes GMM": bayes_mean,
})
comparison = comparison.round(4)
display(comparison)
| True | OLS | IV: MC shifter | IV: BLP | IV: all | PyBLP | Quasi-Bayes GMM | |
|---|---|---|---|---|---|---|---|
| prices | -0.7 | -0.4120 | -0.6522 | -0.6757 | -0.6725 | -0.6723 | -0.6726 |
| x1 | 0.8 | 0.1998 | 0.6647 | 0.7101 | 0.7039 | 0.7077 | 0.7039 |
| x2 | 0.6 | -0.0469 | 0.4299 | 0.4765 | 0.4701 | 0.4648 | 0.4709 |