Variational inference #494
Conversation
* Adds wppl method getRelativeAddress.
* Rejuvenation (MH only) can now be used when generating initial traces for eubo training.
* Supports pretty-printing of optimization method instances.
The core primitive for registering parameters is now `registerParams`. This is more flexible than its predecessor in that it tracks arrays of parameters per name/address, and can optionally call a hook once parameters have been lifted.
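For illustration, here's a minimal sketch of a registry with the behaviour just described; the signature, the `lift` stand-in, and the hook mechanics are assumptions rather than webppl's actual implementation:

```js
// A sketch only: names and signature are assumptions, not webppl's API.
var paramStore = {};  // maps name/address -> array of parameter values

// Stand-in for AD lifting; in webppl this would wrap values for autodiff.
function lift(x) { return x; }

function registerParams(name, initParams, liftedHook) {
  if (!paramStore.hasOwnProperty(name)) {
    // First encounter: initialize the array of params for this name.
    paramStore[name] = initParams();
  }
  var lifted = paramStore[name].map(lift);
  if (liftedHook) {
    // Optional hook, called once the params have been lifted.
    liftedHook(lifted);
  }
  return lifted;
}

// Example: registerParams('mu', function() { return [0]; });
```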
This is exciting! Are there any breaking changes in this PR, or is it just additions? (It seems like maybe the signatures of some distributions have changed in a way that will break existing code?)
A suggestion: can we change the name of the naive VI (mean-field, optimize-then-sample) call to `'variational'`?
The only breaking changes (I hope) are to those distributions that now take a tensor rather than an array or nested array, i.e. `Dirichlet` and `MultivariateGaussian`. Attempting to create either of those using arrays throws a useful error message.
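For example, here's a hedged sketch of the change from the caller's side; the `Vector` helper (array to rank-1 tensor) and the named-parameter form of `Dirichlet` are assumptions about the interface, not code from this PR:

```js
// Illustrative webppl snippet.
Dirichlet({alpha: [1, 1, 1]});          // now throws a useful error
Dirichlet({alpha: Vector([1, 1, 1])});  // ok: alpha is a tensor
```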
As implemented, that's done via `Infer` with `method: 'optimize'`, as shown below. Also, just to be clear, this only does mean-field when no guide is specified. We can of course add the thing you suggested, but we'll have to do some extra work to massage the parameters into the format expected by `Optimize`. This massaging makes the interfaces a little less consistent (which then makes it harder to use, more fiddly to document), which is another reason I went the way I did:

```js
Infer({method: 'optimize', samples: 1000, estimator: {ELBO: {samples: 10}}}, model)

// If you want to start doing something fancy, you might start by
// writing the above as something like:

Optimize(model, {estimator: {ELBO: {samples: 10}}})
SampleGuide(model, {samples: 1000})

// (The fact that Optimize and Infer both take a method parameter
// spoils this a little. That could easily be fixed.)
```

versus:

```js
Infer({method: 'variational', samples: 1000, elboSamples: 10}, model)

// this is a little trickier =>

Optimize(model, {estimator: {ELBO: {samples: 10}}})
SampleGuide(model, {samples: 1000})
```

So that's what I had in mind, what do you think?
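(For concreteness, here is a hedged sketch of a complete model using the first interface above. The distribution-constructor syntax, the `scalarParam(mean, sd)` arguments, and the use of `factor` with `score` are assumptions about the API introduced in this PR, not code taken from it:)

```js
// Sketch: optimize a guide for one latent variable, then sample from it.
var model = function() {
  var mu = sample(Gaussian({mu: 0, sigma: 10}),
                  {guide: Gaussian({mu: scalarParam(0, 0.1), sigma: 1})});
  factor(Gaussian({mu: mu, sigma: 1}).score(2.5));  // condition on a datum
  return mu;
};

Infer({method: 'optimize', samples: 1000,
       estimator: {ELBO: {samples: 10}}}, model);
```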
That's reasonable... maybe make an additional method, `'variational'`, that just calls `'optimize'` with sensible default params?
I'm totally on board with the idea of having a slightly more convenient interface for this; I'm just not at all sure what it should be. I like the default-params idea, but I wonder whether having the option to vary the number of samples per step might turn out to be something we often need, especially while we don't have baselines.
Given my uncertainty I'd be happy to see how things go; things are likely to be clearer once people start using it. If someone wants to make a call, that's also fine with me. Either way, I don't think it needs to hold this PR up.
Why make this distinction? (As a user, I'd assume that either both forms are supported, or neither is.)
```js
function eqDim0(v, w) {
  // Useful for checking two vectors have the same length, or that the
  // dimension of a vector and matrix match.
  return v.dims[0] === w.dims[0];
}
```
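For instance, assuming adnn-style tensors whose `dims` property holds the shape:

```js
var Tensor = require('adnn/tensor');

var v = new Tensor([3]);     // a length-3 vector
var m = new Tensor([3, 2]);  // a 3x2 matrix

eqDim0(v, v);  // true: both vectors have length 3
eqDim0(v, m);  // true: vector length matches the matrix's first dimension
```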
Move the functions above somewhere else (to `tensor.js`, or `util.js`)?
First, I should say that it's not obvious to me what the optimal thing to do here is, and there may well be other good or better choices, but here's what I was thinking: having these helpers next to the distribution code that uses them makes their purpose (argument checking) easy to see. Something similar doesn't seem to hold for `tensor.js` or `util.js`, which collect more general-purpose utilities.
Re-reading this I've noticed that it glosses over a potentially more important issue: the values sampled from `Dirichlet` (and the other multivariate distributions) are now tensors rather than arrays.
That's a significant backwards incompatibility; many models that use `Dirichlet` will need updating. It's unfortunate that this change makes code a bit less readable/intuitive, going from plain array indexing to an explicit tensor accessor, but I don't see a good way around this. (We could macro-rewrite all list indexing to a call that dispatches based on type, as sketched below.)
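A sketch of that idea; `ind` is a hypothetical name, and tensors are assumed to expose adnn-style flat `data` storage:

```js
// Hypothetical accessor that a macro could rewrite x[i] into.
function ind(x, i) {
  if (Array.isArray(x)) {
    return x[i];                // plain array indexing
  } else if (x && x.data !== undefined) {
    return x.data[i];           // read from the tensor's flat storage
  } else {
    throw new Error('ind: expected an array or a tensor');
  }
}
```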
```js
var mapDataIndices = {};

// This has been developed as part of daipp. It's probably still
// buggy.
```
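For context, a sketch of the kind of model `mapData` is aimed at; the option names and distribution syntax here are assumptions, not the final interface:

```js
// Map an observation function over data, making the mapping visible
// to the inference backend (e.g. for subsampling or amortization).
var model = function() {
  var p = sample(Beta({a: 1, b: 1}));
  mapData({data: [true, false, true]}, function(datum) {
    factor(Bernoulli({p: p}).score(datum));
  });
  return p;
};
```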
Would it make sense to include `mapData` as part of a separate PR?
I'm happy to go either way on this, so I've removed it.
Indeed. What's more, I suspect that the effects of introducing tensors that are not as deeply embedded in the language as arrays will be felt for a while. Our current situation is perhaps similar to the early days of Python+NumPy, so I expect it will take some effort to get to something as nice as their current setup. Having multidimensional arrays everywhere (as in e.g. Julia) is an appealing long-term solution, but whether that's achievable...
Do you think there'll be much call for indexing tensors drawn from Dirichlets? Are the things you have in mind expressible using tensor operations?
You'll have thought of this I'm sure, but for the record there are other options, though I don't much like any of them. If you ever want to use a gradient-based inference algorithm you'll probably be best off writing the model in terms of tensors, so I suspect the system should encourage that.
Once we have ES2015 support everywhere we could implement this using a `Proxy`.
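A minimal sketch of that idea, assuming an adnn-style tensor with `data` and `dims`; the wrapper name is illustrative:

```js
// Wrap a tensor so that integer property lookups read from its flat
// storage, making it behave like a read-only array for indexing.
function arrayLikeTensor(t) {
  return new Proxy(t, {
    get: function(target, prop) {
      if (typeof prop === 'string' && /^[0-9]+$/.test(prop)) {
        return target.data[Number(prop)];  // x[0], x[1], ...
      }
      if (prop === 'length') {
        return target.dims[0];
      }
      return target[prop];  // fall through to tensor properties/methods
    }
  });
}
```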
I worry that people won't know about the distinction between vectors (tensors) and arrays, and that they will be surprised when they write functions that operate on arrays and that then break when applied to a value sampled from `Dirichlet`. We could either attempt to make the interface similar enough that most array functions can polymorphically apply to tensors as well, or we could try to make it very obvious which things are arrays and which are tensors, and give helpful error messages when users attempt to use one in place of the other. The current situation, where integer property lookups on tensors silently return `undefined`, is particularly confusing.
Good idea! Overall, I'm done with a shallow review of this PR. If we have a plan for the array/tensor issue above, I'm happy to move on and merge. (Given the amount of code that has been changed/added, a proper review would take at least another week or so, with some back-and-forth.)
It's true that this introduces some potential confusion and slightly uglier code, but I think we need proper tensors for a lot of algorithm development. In general, anything that was an array of numbers should probably now be a tensor, hence all the distribution changes. Where it is feasible to add helpful error messages, we should; in some places it probably won't be feasible (e.g. integer property lookups on tensors).
I think we should go ahead and merge, so that we can get feedback sooner rather than later. Let's warn the "public" and open an issue to think about more uniform array/tensor interfaces (and proxies and such) later on. @mhtess, you should check that these changes don't destroy the models for your class (some updates may be needed) and let us know if additional error checks or helper functions seem warranted.
I think we're good to merge the daipp branch. (Closes #27.) Here's a summary of what's included in this PR:

* `Optimize` coroutine for optimizing guide params using either elbo or eubo.
* `sample(targetDist, {guide: guideDist})`.
* `scalarParam`, `tensorParam`. Also still have `param`, which dispatches to the scalar or tensor version based on its first arg.
* `withImportanceDist` has been removed. (Partially addressing #119, "Unify treatment of importance distributions and custom MH proposers".)
* `ForwardSample` coroutine to sample from guide or target program. (Closes #368, "Have an 'inference' algorithm that just returns forward samples".)
* `Infer({method: 'optimize'}, model)` as a convenient way of doing the common thing of doing some optimization and then sampling to make a marginal.
* `saveTraces` option on `SMC` as a way to generate traces for the eubo estimator.
* New distributions: `TensorGaussian`, `DiagCovGaussian`, `LogisticNormal`, `MultivariateBernoulli`.
* Distributions now parameterized by tensors: `Dirichlet`, `MultivariateGaussian`, `Discrete`. (`Discrete` can still also take arrays, the others take only tensors.)
* Work-in-progress `mapData` construct.
* Changes to `Delta`. The only place where `Delta` is used (that I know of) is agent models, and I don't think this change will have any impact there.