Thanks to Chengkun Li (best co-lead!), Aki Vehtari (@avehtari), Luigi Acerbi (@AcerbiLuigi), Paul Bürkner (@paul_buerkner), and Stefan Radev.
That project is a product of my research visit to Aalto University in Finland 🇫🇮
🔗 Link: https://arxiv.org/abs/2409.04332
🔎 What shall we add in the full version?
To sum up, our adaptive workflow uses resources efficiently (sketch of the dispatch logic below):
1️⃣ Use amortized inference when it is accurate
2️⃣ Refine with PSIS when possible
3️⃣ Use ChEES-HMC with amortized initializations when needed
Re-using draws in later steps creates synergies.
Stay tuned for the full version! /9
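Here is a minimal Python sketch of how that dispatch logic could look. The step implementations are passed in as plain callables; all of these names (amortized_draws, draws_look_reliable, psis_refine, run_chees_hmc) are hypothetical placeholders, not an API from the paper.

```python
# Sketch of the adaptive escalation logic, assuming implementations of the
# individual steps are supplied by the caller. All parameter names below are
# illustrative placeholders, not an actual API from the paper.
def adaptive_workflow(datasets, amortized_draws, draws_look_reliable,
                      psis_refine, run_chees_hmc):
    results = {}
    for i, y in enumerate(datasets):
        # Step 1: near-instant amortized posterior draws.
        draws = amortized_draws(y)
        if draws_look_reliable(draws, y):
            results[i] = draws
            continue
        # Step 2: refine via Pareto-smoothed importance sampling;
        # returns None when the Pareto k-hat diagnostic is too high.
        refined = psis_refine(draws, y)
        if refined is not None:
            results[i] = refined
            continue
        # Step 3: ChEES-HMC, re-using the (rejected) amortized draws
        # as chain initializations to shorten warmup.
        results[i] = run_chees_hmc(y, inits=draws)
    return results
```

The key design point: each data set only pays for the cheapest step that passes its diagnostics, and the expensive step 3 is reserved for the few stubborn cases.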
Our workflow builds on synergies 🧬
Using amortized draws to initialize ChEES-HMC chains significantly reduces warmup time compared to the default random initialization ♨️ (sketch below)
...even though those amortized draws were rejected by our step-1 diagnostics! /8
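A minimal sketch of the warm-start idea, assuming we already have a matrix of amortized posterior draws; the shapes and variable names here are purely illustrative.

```python
# Warm-start sketch: instead of diffuse random initial states, seed each MCMC
# chain with a randomly chosen amortized posterior draw. Even imperfect draws
# tend to lie far closer to the posterior than a random initialization.
import numpy as np

rng = np.random.default_rng(42)

num_chains, dim = 1000, 5
amortized = rng.standard_normal((2000, dim))  # stand-in for (num_draws, dim)

# Default: diffuse random inits, typically far from the target.
random_inits = rng.uniform(-2.0, 2.0, size=(num_chains, dim))

# Ours: subsample the amortized draws as chain initializations.
idx = rng.choice(len(amortized), size=num_chains, replace=False)
warm_inits = amortized[idx]
```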
Proof of concept: a Generalized Extreme Value distribution with 1000 observed data sets -> 2M posterior draws!
1️⃣ Amortized: 678/1000 data sets accepted (120s NN training, 10s inference)
2️⃣ PSIS: 228/322 accepted (124s)
3️⃣ ChEES-HMC: 66/94 accepted (398s)
⌛️ Total time: 11 min for our workflow vs. 16 h for NUTS. /7
We initialize ChEES-HMC with amortized draws, so the chains start closer to the target distribution than with random inits.
Result: shorter warmup.
We see this even when the amortized draws haven't passed our diagnostics in step 1!
That's not recycling, it's upcycling ♻️ /6
ChEES-HMC is a massively parallelizable MCMC sampler by Hoffman et al. (2021). It is far less control-flow-heavy than NUTS, so thousands of chains can run in lockstep on a GPU (toy sketch below).
Our workflow offers initializations for all these chains.
🔗 More info here: https://proceedings.mlr.press/v130/hoffman21a.html
/5
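To see why the lockstep structure helps, here is a toy NumPy sketch of many HMC chains advancing together under one shared step size and trajectory length. This is plain fixed-parameter HMC on a stand-in standard-normal target, not the ChEES adaptation itself.

```python
# Toy sketch: one vectorized leapfrog update advances ALL chains at once,
# because every chain shares the same step size and trajectory length.
# On a GPU this becomes a single batched kernel per leapfrog step.
import numpy as np

def log_prob(x):                      # stand-in target: standard normal
    return -0.5 * np.sum(x**2, axis=-1)

def grad_log_prob(x):
    return -x

def hmc_step_all_chains(x, step_size, n_leapfrog, rng):
    p = rng.standard_normal(x.shape)  # fresh momenta for every chain
    x_new, p_new = x.copy(), p.copy()
    for _ in range(n_leapfrog):       # leapfrog, vectorized over chains
        p_new += 0.5 * step_size * grad_log_prob(x_new)
        x_new += step_size * p_new
        p_new += 0.5 * step_size * grad_log_prob(x_new)
    log_accept = (log_prob(x_new) - log_prob(x)
                  - 0.5 * np.sum(p_new**2, axis=-1)
                  + 0.5 * np.sum(p**2, axis=-1))
    accept = np.log(rng.uniform(size=x.shape[0])) < log_accept
    return np.where(accept[:, None], x_new, x)  # per-chain accept/reject

rng = np.random.default_rng(0)
chains = rng.standard_normal((1000, 2))         # 1000 chains, 2-dim target
for _ in range(100):
    chains = hmc_step_all_chains(chains, step_size=0.2, n_leapfrog=10, rng=rng)
```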
Our workflow gradually escalates each data set along three steps:
1️⃣ Amortized inference: near-instant posterior draws
2️⃣ Pareto-smoothed importance sampling (PSIS): refinement of the draws (sketch after this list)
3️⃣ ChEES-HMC: massively parallel MCMC with amortized initializations from steps 1 or 2
/4
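A minimal sketch of what step 2 could look like, assuming access to the model's (unnormalized) log joint and the amortized proposal density. The function names and the 0.7 threshold on the Pareto k-hat diagnostic are illustrative assumptions (0.7 is the customary PSIS cutoff; the paper's exact diagnostics may differ); the Pareto smoothing itself uses arviz.psislw.

```python
# Step-2 sketch: reweight the amortized draws by importance ratios,
# Pareto-smooth the weight tail, and accept only if k-hat is low enough.
import numpy as np
import arviz as az

def psis_refine(draws, log_target, log_proposal, khat_threshold=0.7):
    """draws: (S, D) amortized posterior draws for one data set.
    log_target(draws):   unnormalized log joint, log p(theta, y).
    log_proposal(draws): amortized log density, log q(theta | y)."""
    log_w = log_target(draws) - log_proposal(draws)   # log importance ratios
    log_w_smoothed, khat = az.psislw(log_w)           # Pareto-smooth the tail
    if khat > khat_threshold:
        return None, khat                             # escalate to step 3
    # Resample draws according to the smoothed, self-normalized weights.
    w = np.exp(log_w_smoothed)
    w /= w.sum()
    idx = np.random.default_rng().choice(len(draws), size=len(draws), p=w)
    return draws[idx], khat
```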
To run a Bayesian analysis, we have to choose an algorithm, and every choice trades off speed against accuracy.
Our workflow moves along this speed-accuracy Pareto front: amortized inference when it is accurate, slower (but reliable) MCMC when necessary. /3
Before we start, a quick primer.
Amortized inference has two stages:
1️⃣ Training: simulate data from the model and train a neural network once, as an upfront cost
2️⃣ Inference: near-instant posterior draws for any new data set (toy sketch below)
/2
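To make the two-stage pattern concrete, here is a deliberately tiny NumPy toy: a conjugate normal model where the "amortized posterior" is just a least-squares fit from the data mean to the posterior mean. This only illustrates where the cost lands (train once, infer instantly), not the neural approach from the paper.

```python
# Toy two-stage amortization: stage 1 pays a one-off simulation + training
# cost; stage 2 then yields posterior draws for any new data set at the
# cost of a single forward pass.
import numpy as np

rng = np.random.default_rng(0)

# Model: theta ~ N(0, 1); y_i | theta ~ N(theta, 1), i = 1..n.
# Conjugacy gives the exact posterior N(n*ybar/(n+1), 1/(n+1)), so a
# linear map from ybar to the posterior mean is learnable exactly.
n = 20

# --- Stage 1: training (slow, done once) ---
theta_train = rng.standard_normal(10_000)
ybar_train = theta_train + rng.standard_normal(10_000) / np.sqrt(n)
# "Train" the amortized posterior: least-squares fit ybar -> posterior mean.
slope = np.dot(ybar_train, theta_train) / np.dot(ybar_train, ybar_train)
post_sd = 1.0 / np.sqrt(n + 1)

# --- Stage 2: inference (near-instant, for any number of data sets) ---
def amortized_draws(y, num_draws=2000):
    mean = slope * y.mean()                     # one cheap forward pass
    return mean + post_sd * rng.standard_normal(num_draws)

y_obs = rng.normal(0.7, 1.0, size=n)            # one new observed data set
draws = amortized_draws(y_obs)
```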
Our new short paper on Amortized Bayesian Workflow is out! ✨
We developed an adaptive workflow that combines the speed of amortized inference with the reliability of MCMC across thousands of data sets.
🔗 Link: https://arxiv.org/abs/2409.04332
The whole is more than the sum of its parts 🧵👇