When to resample in SMC?
Currently particles are aligned by number of tokens, so when we resample, all particles have the same number of tokens (unless some have already hit EOS). But this isn't really fair. For example:
When intersecting "My favorite physicist is" and "My favorite writer is", we end up comparing particles that say, e.g., " Richard Feynman. He was" and " Neil deGrasse Tyson" -- when we really want to compare " Richard Feynman" to " Neil deGrasse Tyson".
When intersecting "A great personal finance tip is" and "A great tip for healthy living is", we end up comparing particles that say, e.g., " to avoid eating out" and " to make sure you're". The former loses out, intuitively because its weight already factors in the semantic constraints, whereas the prompts largely 'withhold judgment' on the vaguer latter particle, which has not yet committed to any specific tip.
It would be great to find a clear theoretical framework for thinking about these intermediate distributions, and other heuristics (or principled strategies) for alignment.
One heuristic worth trying might be to resample at syntax-directed points -- at the end of each sentence, clause, or some other grammatical element.
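As a rough illustration of the syntax-directed idea, here is a minimal Python sketch. It is not taken from any existing implementation: the names `at_boundary`, `ready_to_resample`, and `resample`, and the punctuation-based boundary test, are all assumptions made for this example. A real version would more likely detect boundaries with a parser or tokenizer-aware rule. The sketch treats each particle as a plain string with a log-weight, and uses ordinary multinomial resampling.

```python
import math
import random

# Hypothetical boundary markers (an assumption for this sketch);
# a real system might use a parser to find clause or sentence ends.
SENTENCE_END = (".", "!", "?")
CLAUSE_END = SENTENCE_END + (",", ";", ":")


def at_boundary(text, markers=SENTENCE_END):
    """Return True if the particle's text ends with one of the markers."""
    return text.rstrip().endswith(markers)


def ready_to_resample(particles, finished, markers=SENTENCE_END):
    """True when every unfinished particle has stopped at a boundary.

    `finished[i]` marks particles that have already hit EOS.
    """
    return all(done or at_boundary(p, markers)
               for p, done in zip(particles, finished))


def resample(particles, log_weights, rng=random):
    """Multinomial resampling in proportion to the particle weights.

    Returns the resampled particles and their new log-weights, each set
    to the log of the mean pre-resampling weight.
    """
    m = max(log_weights)  # subtract the max for numerical stability
    weights = [math.exp(lw - m) for lw in log_weights]
    chosen = rng.choices(particles, weights=weights, k=len(particles))
    log_mean = m + math.log(sum(weights) / len(weights))
    return chosen, [log_mean] * len(particles)
```

Under this scheme, a driver loop would keep extending each particle token by token until `at_boundary` holds for it, and call `resample` only once `ready_to_resample` is true. With sentence-level markers, " Richard Feynman." would be compared against other completed sentences rather than against " Richard Feynman. He was". Whether the resulting intermediate distributions are well justified is exactly the open question raised above.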