📢 PSA: Tired of #betareg in :rstats: complaining when you have 0 and/or 1 in your response variable?
🎉 Good news: betareg can now capture this at the expense of a single extra parameter! #rstats
📄 New arXiv working paper with @ikosmidis
https://www.zeileis.org/news/xbx/
=> More informations about this toot | More toots from zeileis@fosstodon.org
💡 Idea for extended-support beta mixture (XBX) distribution:
=> More informations about this toot | More toots from zeileis@fosstodon.org
🧭 This was a long journey of almost a decade (interrupted by a couple of kids and a pandemic).
2015: First idea and basic theory.
2017: First implementation on R-Forge, work out details.
2024: Full implementation, application, paper. 🥳
=> More informations about this toot | More toots from zeileis@fosstodon.org
@zeileis Thanks for pointing this out!! It's comforting to see that these things take a lot of time, even for very clever people.
I get frustrated too often because I feel I should be capable of "solving" a problem or implementing a methodological idea in a few days of work.
=> More informations about this toot | More toots from famuvie@oc.todon.fr
@famuvie Most of the cleverness was done by @ikosmidis almost 10 years ago. 😇
After that it was more tedious work, implementation, embellishing the application case study, and weaving everything into a coherent story for the paper. And these things just need time for thought and trying out different ideas.
=> More informations about this toot | More toots from zeileis@fosstodon.org
@zeileis I didn't quite get the advantage of using the XBX continuous mixture model over the simpler XB truncated model, as both require estimating an extra parameter (with different roles).
=> More informations about this toot | More toots from famuvie@oc.todon.fr
@famuvie We briefly discuss this in the paper. With XB you get all sorts of identifiability problems which are mitigated by shrinking it towards the beta distribution.
If you want to play around with this, you can do so via betareg(..., dist = "xbeta"), but you might often encounter problems with starting values, convergence, etc.
=> More informations about this toot | More toots from zeileis@fosstodon.org
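For readers who want to try the comparison mentioned above, here is a minimal sketch rather than anything taken from the paper. It simulates a response with exact zeros and ones by censoring a latent beta variable stretched to (-nu, 1 + nu), then fits betareg once with its defaults and once with dist = "xbeta" as quoted in the toot. The data-generating values, the variable names, and the assumption that a recent betareg version is installed are all illustrative; the fits are wrapped in try() because the XB fit may fail to converge.

```r
set.seed(1)
n   <- 500
x   <- runif(n, -1, 1)
mu  <- plogis(0.3 + 1.2 * x)  # mean of the underlying beta (logit link)
phi <- 5                      # precision (hypothetical value)
nu  <- 0.2                    # exceedence beyond 0 and 1 (hypothetical value)

# Latent beta stretched to (-nu, 1 + nu), then censored to [0, 1]
latent <- -nu + (1 + 2 * nu) * rbeta(n, mu * phi, (1 - mu) * phi)
d <- data.frame(x = x, y = pmin(pmax(latent, 0), 1))
print(c(zeros = sum(d$y == 0), ones = sum(d$y == 1)))

# Fitting is only attempted if betareg is available
if (requireNamespace("betareg", quietly = TRUE)) {
  fits <- list(
    default = try(betareg::betareg(y ~ x, data = d), silent = TRUE),
    xbeta   = try(betareg::betareg(y ~ x, data = d, dist = "xbeta"), silent = TRUE)
  )
  for (nm in names(fits)) {
    m <- fits[[nm]]
    if (inherits(m, "try-error")) {
      cat(nm, "fit failed:", conditionMessage(attr(m, "condition")), "\n")
    } else {
      cat(nm, "coefficients:\n")
      print(coef(m))
    }
  }
}
```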
@zeileis @ikosmidis
Genius!! very elegant and clever solution. 👏👏
Would the same idea be applicable to Poisson or negative-binomial #regression with non-negative outcomes?
=> More informations about this toot | More toots from famuvie@oc.todon.fr
@famuvie @ikosmidis I think it's the other way around: For XBX we also took inspiration from zero-inflated and hurdle count data models.
But for count data the situation is a bit different because you always have a point mass at zero. So the question is just how to inflate (or modify) that.
In contrast, the beta distribution has no probability mass on the boundaries. Hence the extended-support approach.
=> More informations about this toot | More toots from zeileis@fosstodon.org
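The contrast drawn above can be checked numerically with base R. This is only an illustrative sketch: lambda, pi0, and the beta shape parameters are hypothetical values, and the zero-inflated and hurdle formulas are the standard textbook ones rather than anything specific to the paper.

```r
lambda <- 2    # Poisson mean (hypothetical)
pi0    <- 0.3  # extra zero probability (hypothetical)

# Count models: zero already carries probability mass
p0_pois   <- dpois(0, lambda)                     # exp(-lambda) > 0
p0_zip    <- pi0 + (1 - pi0) * dpois(0, lambda)   # zero-inflated: mass is inflated
p0_hurdle <- pi0                                  # hurdle: zero mass modeled separately

# Beta model: continuous on (0, 1), so the boundaries have probability zero
p0_beta <- pbeta(0, 2, 3)        # P(Y <= 0)
p1_beta <- 1 - pbeta(1, 2, 3)    # P(Y >= 1)

print(c(pois = p0_pois, zip = p0_zip, hurdle = p0_hurdle,
        beta_at_0 = p0_beta, beta_at_1 = p1_beta))
```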
@zeileis @ikosmidis Right, I meant regression with continuous likelihood like Gamma.
=> More informations about this toot | More toots from famuvie@oc.todon.fr
@famuvie @ikosmidis Yes, there is some work in that direction. For example, Baran & Nemoda proposed to use a censored shifted gamma distribution for modeling precipitation (non-negative with point mass at zero).
But I haven't tested how reliably their exceedence parameter can be estimated in practice (without shrinkage towards zero).
https://doi.org/10.1002/env.2391
=> More informations about this toot | More toots from zeileis@fosstodon.org
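To make the censored shifted gamma idea concrete, here is a small base R sketch. It assumes the usual construction, a gamma variable shifted left by delta > 0 and then censored at zero, so the point mass at zero equals the gamma CDF at delta. The parameter values are hypothetical and not taken from the cited article.

```r
set.seed(42)
shape <- 2     # gamma shape (hypothetical)
scale <- 1.5   # gamma scale (hypothetical)
delta <- 1     # shift, which creates the mass at zero (hypothetical)

# Latent gamma shifted left by delta, then left-censored at zero
latent <- rgamma(1e5, shape = shape, scale = scale) - delta
y <- pmax(latent, 0)

p0_theory <- pgamma(delta, shape = shape, scale = scale)  # P(latent <= 0)
p0_sim    <- mean(y == 0)
print(c(theory = p0_theory, simulated = p0_sim))
```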
@zeileis @ikosmidis Thanks! This shrinkage idea is interesting, but defies my intuition. How in the world is a hyperparameter more easily identifiable than the parameter itself? 🤯
=> More informations about this toot | More toots from famuvie@oc.todon.fr
@famuvie @ikosmidis The exceedence parameters in the XB and the XBX distribution play very similar roles. I wouldn't think of it as a hyperparameter, really. Both parameters drive how much probability mass is in the tails of the latent distribution, outside of [0, 1], which then becomes the point masses at 0 and 1.
However, in XBX the support of the latent distribution does not depend on ν, which makes the marginal likelihood and its derivatives much more well-behaved.
=> More informations about this toot | More toots from zeileis@fosstodon.org
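The role of the exceedence parameter described above can be sketched in base R. This assumes the parameterization with a latent beta of mean mu and precision phi stretched to (-nu, 1 + nu) for XB, and, for XBX, exponential mixing over the exceedence with mean nu; that mixing distribution is my reading of the construction and should be treated as an assumption. The function names are hypothetical and are not part of betareg.

```r
# XB: point masses at 0 and 1 are the tails of the stretched latent beta
xb_p0 <- function(mu, phi, nu) {
  pbeta(nu / (1 + 2 * nu), mu * phi, (1 - mu) * phi)
}
xb_p1 <- function(mu, phi, nu) {
  pbeta((1 + nu) / (1 + 2 * nu), mu * phi, (1 - mu) * phi, lower.tail = FALSE)
}

# XBX (assumed): average the XB point mass over an exponential exceedence
# with mean nu, so no fixed latent support boundary depends on nu
xbx_mix <- function(xb_fun, mu, phi, nu) {
  integrate(function(u) xb_fun(mu, phi, u) * dexp(u, rate = 1 / nu),
            lower = 0, upper = Inf)$value
}

mu <- 0.5; phi <- 5; nu <- 0.2   # hypothetical values
xb  <- c(p0 = xb_p0(mu, phi, nu), p1 = xb_p1(mu, phi, nu))
xbx <- c(p0 = xbx_mix(xb_p0, mu, phi, nu), p1 = xbx_mix(xb_p1, mu, phi, nu))
print(rbind(XB = xb, XBX = xbx))
```

In both cases a larger nu pushes more latent mass outside [0, 1] and hence into the boundary point masses; with nu = 0 the XB masses vanish and the plain beta distribution is recovered.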