Add CategoricalMADE #1269 (base: main)
Better keep this general: "input variable", because it could be `x` for MNLE, or `theta` when doing mixed NPE (see Daniel's recent PR).
Is this numerically stable? Better use `logsumexp`?
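For context on the concern above: summing exponentiated log-probabilities directly underflows in float32, while `torch.logsumexp` shifts by the maximum first. A minimal sketch (the tensor values are illustrative, not from the PR):

```python
import torch

# Log-probabilities with large negative magnitude: exp() underflows to 0.
logits = torch.tensor([-1000.0, -1000.5, -1001.0])

# Naive normalization: exp(-1000) == 0 in float32, so log(sum) = -inf.
naive = torch.log(torch.exp(logits).sum())

# logsumexp subtracts the max before exponentiating, so the result stays finite.
stable = torch.logsumexp(logits, dim=0)

print(naive)   # tensor(-inf)
print(stable)  # finite, ≈ -999.32
```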
Are these shapes correct?
Very naive question here: the outputs are coming from the MADE, i.e., the conditional dependencies are already taken care of internally, right? I am just wondering because for the 1-D case, we used the network-predicted `ps` to construct a `Categorical` distribution and then evaluated the `inputs` under that distribution. This is not needed here because the underlying `MADE` takes both the `inputs` and the `context` and outputs unnormalized conditional probabilities already?
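The 1-D pattern referred to above looks roughly like this (a hedged sketch; `ps` and `inputs` here are random stand-ins, not the PR's tensors):

```python
import torch
from torch.distributions import Categorical

# Stand-in for network-predicted per-sample class probabilities.
ps = torch.softmax(torch.randn(4, 3), dim=-1)  # shape (batch, num_categories)
inputs = torch.tensor([0, 2, 1, 0])            # observed discrete values

# 1-D approach: build a Categorical from ps and evaluate inputs under it.
log_probs = Categorical(probs=ps).log_prob(inputs)  # shape (batch,)
```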
Should this be initialized with uniform `torch.rand`?
Debugging leftover?
Same question as above: these samples are internally autoregressive, right? So each discrete variable is sampled given the upstream discrete variables?
I am just confused because I would have expected that for each iteration we need to pass the so-far sampled discrete samples as `context`; but this seems to be happening implicitly in the MADE?
Now I see it: in line 148 you are updating `samples` with the new samples of the current `i`. It probably boils down to the same thing, but you could also update all so-far sampled `samples`, i.e., ?
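The sampling pattern discussed in the last few comments can be sketched as follows. This is a library-agnostic illustration, not the PR's code: `net` is a hypothetical stand-in for the masked network, and in a real MADE the masks guarantee that the logits for variable `i` depend only on variables `< i`, so writing into `samples[:, i]` each iteration implicitly conditions the next forward pass:

```python
import torch

num_vars, num_categories, batch = 3, 5, 4

# Hypothetical stand-in for the MADE forward pass: maps current samples to
# per-variable logits of shape (batch, num_vars, num_categories). A real MADE
# would apply masks so logits for variable i ignore variables >= i.
weight = torch.randn(num_vars * num_categories, num_vars)

def net(samples):
    out = samples.float() @ weight.t()
    return out.view(-1, num_vars, num_categories)

# Autoregressive sampling: one forward pass per variable; previously filled
# entries of `samples` act as the conditioning for the current variable.
samples = torch.zeros(batch, num_vars, dtype=torch.long)
for i in range(num_vars):
    logits = net(samples)[:, i, :]
    samples[:, i] = torch.distributions.Categorical(logits=logits).sample()
```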
What is the custom init? Is it an abstract method from the `nflows` `MADE`? If so, we should probably raise "not implemented" here.
Yes, this is part of nflows.
However, we cannot raise NotImplementedError, since this is run on init of `MADE`.
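The constraint described here is the usual template-method pattern: the base constructor calls the hook, so a hook that raises would break instantiation. A minimal, library-agnostic sketch (class names are hypothetical, not nflows code):

```python
class Base:
    def __init__(self):
        # The base constructor invokes the hook, so a subclass whose hook
        # raises NotImplementedError could never be instantiated.
        self._initialize()

    def _initialize(self):
        pass  # default: no custom initialization


class Child(Base):
    def _initialize(self):
        # Raising NotImplementedError here would make Child() itself fail,
        # because Base.__init__ calls this hook during construction.
        pass
```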
Unless I am missing something, the `_initialize()` is needed only in `MixtureOfGaussiansMADE(MADE)`, not in `MADE`, so it's not needed here?