Variational inference #27
What are your thoughts on what should be returned by variational inference? I imagine we want to return the actual variational program (rather than summarise it with samples, for example), but it's not obvious to me how to do that. It seems like we'd need to reach into the thunk we're passed and set its ERP parameters to those found during inference. This doesn't seem straightforward. In Stochy, I return a function which runs the original program with a special co-routine which switches in the variational parameters at run-time. This is pretty ugly, I'm not convinced it's fully general, and it doesn't support the ERP interface. Do you have any better ideas?
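To make the "switch in the variational parameters at run-time" idea concrete, here is a rough JavaScript sketch, not the actual Stochy or webppl machinery. It assumes the program thunk accepts a sampling handler, and the names (`makeVariationalERP`, `paramsByAddress`, `guidedSample`) are purely illustrative.

```js
// Hypothetical sketch: wrap the original program in an ERP-like object whose
// sampler re-runs the program with learned variational parameters switched in.
function makeVariationalERP(thunk, paramsByAddress) {
  // Replacement sampling handler: at each random choice, use the learned
  // variational parameters for that address if available, otherwise fall back
  // to the parameters the program supplied.
  function guidedSample(address, erp, programParams) {
    var params = paramsByAddress[address] || programParams;
    return erp.sample(params);
  }
  return {
    sample: function() {
      // Re-run the original program under the guided sampler.
      return thunk(guidedSample);
    },
    score: function(value) {
      // The awkward part noted above: scoring the return value of an
      // arbitrary program is not straightforward.
      throw new Error('score is not implemented for this representation');
    }
  };
}
```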
I've started to clean up and test what we have so far. See my variational branch. Here's a simple test case I'm working with. We already get close to the optimal parameters as found by the hand-derived variational inference algorithm.
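As a guess at the shape of such a hand-checkable test (not the actual test in the variational branch): a conjugate Gaussian-Gaussian model, where the optimal mean-field Gaussian guide equals the exact posterior, so the parameters found by optimization can be compared against an analytic answer like the one computed below.

```js
// Analytic posterior for a Gaussian prior over the mean with known
// observation noise; the hand-derived optimal variational parameters for a
// Gaussian guide coincide with this posterior in the conjugate case.
function analyticPosterior(priorMu, priorSigma, obsSigma, observations) {
  var priorPrec = 1 / (priorSigma * priorSigma);
  var obsPrec = 1 / (obsSigma * obsSigma);
  var n = observations.length;
  var sum = observations.reduce(function(a, b) { return a + b; }, 0);
  var postPrec = priorPrec + n * obsPrec;
  var postMu = (priorMu * priorPrec + sum * obsPrec) / postPrec;
  return { mu: postMu, sigma: Math.sqrt(1 / postPrec) };
}

// e.g. prior N(0, 1), observation noise sigma = 1, data [2.5]:
// analyticPosterior(0, 1, 1, [2.5]) -> { mu: 1.25, sigma: ~0.707 }
```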
Awesome! The question of what kind of ERP representation to return is a good and tricky one. I think as a simple first pass, returning an empirical distribution built from samples is ok. The ERP object should have extra fields for the best variational parameters and the corresponding (estimated) variational lower bound on the marginal likelihood. As you say, a better representation would be one whose sampler re-runs the program with the variational parameters switched in. But even with the variational sampler, there's still the question of how to implement the ERP's score function. Anyhow, it requires more thought! But a lot of the things we'd want to do with variational can already be done with the simple solutions.
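A minimal sketch of that simple first-pass return value, with illustrative field names rather than the actual webppl API: an ERP-like object backed by an empirical distribution over samples, carrying the optimized variational parameters and the estimated lower bound as extra fields.

```js
// Hypothetical sketch of the "empirical distribution plus extra fields" ERP.
function makeEmpiricalERP(samples, variationalParams, elboEstimate) {
  var counts = {};
  samples.forEach(function(s) {
    var k = JSON.stringify(s);
    counts[k] = (counts[k] || 0) + 1;
  });
  return {
    sample: function() {
      return samples[Math.floor(Math.random() * samples.length)];
    },
    score: function(value) {
      var c = counts[JSON.stringify(value)] || 0;
      return c === 0 ? -Infinity : Math.log(c / samples.length);
    },
    variationalParams: variationalParams,  // best parameters found
    estimatedLowerBound: elboEstimate      // estimated variational lower bound
  };
}
```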
Great, thanks! I'm now returning the estimated lower-bound and an ERP built from samples. See the updated test case. I've also implemented the control variate idea from "Black Box Variational Inference". I think it's working but I need to test it more thoroughly.
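For reference, here is a minimal sketch of the scalar control-variate version of the score-function gradient estimator from "Black Box Variational Inference", for a single parameter; it operates on precomputed per-sample quantities and is not the code in the variational branch.

```js
// Inputs, one entry per guide sample z_s:
//   hs[s] = d/d(lambda) log q(z_s | lambda)
//   ds[s] = log p(x, z_s) - log q(z_s | lambda)
// The naive estimator averages hs[s] * ds[s]; the control variate subtracts
// a * hs[s], where a = Cov(h*d, h) / Var(h). Since E[h] = 0, the estimator
// stays unbiased while its variance drops.
function controlVariateGradient(hs, ds) {
  var n = hs.length;
  var mean = function(xs) {
    return xs.reduce(function(a, b) { return a + b; }, 0) / xs.length;
  };
  var fs = hs.map(function(h, i) { return h * ds[i]; });
  var fBar = mean(fs), hBar = mean(hs);
  var cov = 0, varH = 0;
  for (var i = 0; i < n; i++) {
    cov += (fs[i] - fBar) * (hs[i] - hBar);
    varH += (hs[i] - hBar) * (hs[i] - hBar);
  }
  var a = varH === 0 ? 0 : cov / varH;
  return mean(hs.map(function(h, i) { return h * (ds[i] - a); }));
}
```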
The returned ERP now has a field for the variational parameters. Also, is this to-do note still relevant? (The code looks ok to me.) I've also added inference tests and made a few other tweaks. This is all in the variational branch.
We've made a lot of progress in the daipp branch. I think the basic variational infrastructure (the ability to specify variational guide distributions, optimization of the ELBO via PW+LR estimators, etc.) will be ready to merge into dev soon. I am moving this to milestone 0.8 so that we'll have time to test and document before the summer school...
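As background on the two estimator families mentioned above (not webppl-specific notation): PW is the pathwise/reparameterization estimator and LR the likelihood-ratio/score-function estimator, for a guide q_lambda and ELBO L(lambda).

```latex
% LR (likelihood-ratio / score-function):
\nabla_\lambda \mathcal{L}
  = \mathbb{E}_{q_\lambda}\!\big[\nabla_\lambda \log q_\lambda(z)\,
    \big(\log p(x, z) - \log q_\lambda(z)\big)\big]
% PW (pathwise / reparameterization), with z = g(\epsilon, \lambda), \epsilon \sim p(\epsilon):
\nabla_\lambda \mathcal{L}
  = \mathbb{E}_{p(\epsilon)}\!\big[\nabla_\lambda\big(\log p(x, g(\epsilon, \lambda))
    - \log q_\lambda(g(\epsilon, \lambda))\big)\big]
```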
btw, it would be great if, around when this makes it to dev, there is a default for when no guide is given at a random choice. (Maybe this was already part of the plan...)
Yes, I was planning to do this. My intention is to factor the information about parameters and their constraints out of daipp, so that mean-field can also do appropriate parameter squishing.
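An illustrative sketch (not the daipp or webppl code) of what a default mean-field guide with parameter squishing could look like: the optimizer works with unconstrained values, and a transform maps them into the domain the ERP expects, e.g. a positive standard deviation.

```js
// Squishing transforms between unconstrained and constrained parameters.
function softplus(x) { return Math.log(1 + Math.exp(x)); }         // R -> (0, inf)
function softplusInverse(y) { return Math.log(Math.exp(y) - 1); }  // (0, inf) -> R

// Hypothetical default mean-field guide parameters for a gaussian choice:
// the mean is unconstrained, the standard deviation is squished to be positive.
function meanFieldGaussianParams(unconstrained) {
  return {
    mu: unconstrained.mu,                    // no constraint needed
    sigma: softplus(unconstrained.sigmaRaw)  // must be > 0
  };
}
```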
Here are the remaining changes I intend to make before opening a PR for this:
Ideally, for simplicity, I'd like to just merge the daipp branch once this is done. The only reason we might prefer not to do this is that it will include a few unfinished and untested bits. These are:
I take this to mean we'll add docs for this later.
sounds good! i think merging daipp into dev is ok -- the extra bits will just remain undocumented until they are done and tested. (perhaps put notes to this effect at the top of the relevant source files...) yes, we can document later (though if you have time to add a stub to the inference section of the docs, that'll get us started).
Basic black box variational inference (see http://arxiv.org/pdf/1301.1299v1.pdf and http://arxiv.org/pdf/1401.0118v1.pdf) is in the codebase.
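A minimal sketch of the basic algorithm from those papers, specialized to a single Gaussian guide parameter (the mean) for readability; this is illustrative, not the code in the repository. The model is assumed to provide `logJoint(z)`.

```js
// One step of noisy gradient ascent on the ELBO using the score-function
// estimator, with guide q(z | lambda) = N(lambda, 1).
function bbviStep(logJoint, lambda, numSamples, stepSize) {
  var grad = 0;
  for (var s = 0; s < numSamples; s++) {
    // Sample z ~ N(lambda, 1) via Box-Muller.
    var u1 = 1 - Math.random(), u2 = Math.random();
    var z = lambda + Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
    var logQ = -0.5 * Math.log(2 * Math.PI) - 0.5 * (z - lambda) * (z - lambda);
    var scoreGrad = z - lambda;  // d/d(lambda) log N(z; lambda, 1)
    grad += scoreGrad * (logJoint(z) - logQ) / numSamples;
  }
  return lambda + stepSize * grad;  // updated variational parameter
}
```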
It needs to be tested and benchmarked.
There are several major performance improvements described in the papers that need to be implemented. Most important is Rao-Blackwellization of the gradient estimates. This may require a flow analysis to determine the Markov blanket of random choices.
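For context, this is (to the best of my reading) the Rao-Blackwellized estimator from the BBVI paper: the gradient for the parameters of the i-th mean-field factor only needs the terms of the joint that involve z_i, which is why a Markov-blanket analysis is required.

```latex
\nabla_{\lambda_i} \mathcal{L}
  = \mathbb{E}_{q_{(i)}}\!\big[\nabla_{\lambda_i} \log q_i(z_i \mid \lambda_i)\,
    \big(\log p_i(x, z_{(i)}) - \log q_i(z_i \mid \lambda_i)\big)\big]
% p_i collects the factors of the joint that depend on z_i (its Markov blanket),
% and q_{(i)} is the guide marginal over the variables appearing in those factors.
```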
Once everything is working, try variationally-guided PF: in the particle filter, sample new choices from the variational distribution instead of the prior. Or possibly mix / interpolate the prior and the variational distribution. The idea is that the variational distribution gets you an importance sampler closer to the posterior modes, while the PF helps capture the joint structure ignored by the variational approximation.
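A sketch of that proposal idea (not implemented code): inside the particle filter, each new choice is drawn from a mixture of the prior and the variational distribution, and the particle's importance weight is corrected accordingly. ERPs are assumed to expose `sample(params)` and `score(params, value)`; `mix` in [0, 1] interpolates between prior (0) and variational (1) proposals.

```js
function guidedChoice(erp, priorParams, variationalParams, mix) {
  var useGuide = Math.random() < mix;
  var value = erp.sample(useGuide ? variationalParams : priorParams);
  var logPrior = erp.score(priorParams, value);
  var logProposal = Math.log(
      mix * Math.exp(erp.score(variationalParams, value)) +
      (1 - mix) * Math.exp(logPrior));
  // The particle's log-weight is incremented by log p(value) - log q(value);
  // likelihood terms are handled by the usual factor statements.
  return { value: value, logWeightIncrement: logPrior - logProposal };
}
```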