Loading model fails in tutorial #97
Possibly the TransformerLens version you're using is different from the one that was used to save the hypothesis, so the hook names are different. What's the list of edges from
Thanks for your lightning-fast response!

Before running the `.step()` function block:

dict_keys([('blocks.1.hook_resid_post', [:], 'blocks.1.attn.hook_result', [:, :, 7]), ('blocks.1.hook_resid_post', [:],

After running the `.step()` function block:

dict_keys([('blocks.1.hook_resid_post', [:], 'blocks.1.attn.hook_result', [:, :, 6]), ('blocks.1.hook_resid_post', [:], 'blocks.0.hook_resid_pre', [:]), ('blocks.1.attn.hook_result', [:, :, 6], 'blocks.1.attn.hook_q', [:, :, 6]), ('blocks.1.attn.hook_result', [:, :, 6], 'blocks.1.attn.hook_k', [:, :, 6]), ('blocks.1.attn.hook_result', [:, :, 6], 'blocks.1.attn.hook_v', [:, :, 6]), ('blocks.1.attn.hook_q', [:, :, 6], 'blocks.1.hook_q_input', [:, :, 6]), ('blocks.1.attn.hook_k', [:, :, 6], 'blocks.1.hook_k_input', [:, :, 6]), ('blocks.1.attn.hook_v', [:, :, 6], 'blocks.1.hook_v_input', [:, :, 6]), ('blocks.1.hook_q_input', [:, :, 6], 'blocks.0.hook_resid_pre', [:]), ('blocks.1.hook_k_input', [:, :, 6], 'blocks.0.attn.hook_result', [:, :, 0]), ('blocks.1.hook_v_input', [:, :, 6], 'blocks.0.hook_resid_pre', [:]), ('blocks.0.attn.hook_result', [:, :, 0], 'blocks.0.attn.hook_q', [:, :, 0]), ('blocks.0.attn.hook_result', [:, :, 0], 'blocks.0.attn.hook_k', [:, :, 0]), ('blocks.0.attn.hook_result', [:, :, 0], 'blocks.0.attn.hook_v', [:, :, 0]), ('blocks.0.attn.hook_v', [:, :, 0], 'blocks.0.hook_v_input', [:, :, 0]), ('blocks.0.hook_v_input', [:, :, 0], 'blocks.0.hook_resid_pre', [:])])

I don't think the issue is that the TransformerLens versions are different, because I can reproduce all of this from the same notebook in Colab. Thank you
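For anyone comparing the two listings, diffing the key sets makes the change easier to see than reading the raw `dict_keys` output. Here is a minimal sketch, assuming the edge keys can be treated as plain hashable tuples. The toy sets below are only loosely modeled on the output above (index values are written as strings for illustration), and `diff_edges` is a hypothetical helper, not part of the ACDC code.

```python
# Toy edge keys: (child_name, child_index, parent_name, parent_index).
# These are illustrative stand-ins, not the notebook's real key objects.
before = {
    ("blocks.1.hook_resid_post", "[:]", "blocks.1.attn.hook_result", "[:, :, 7]"),
    ("blocks.1.hook_resid_post", "[:]", "blocks.0.hook_resid_pre", "[:]"),
}
after = {
    ("blocks.1.hook_resid_post", "[:]", "blocks.1.attn.hook_result", "[:, :, 6]"),
    ("blocks.1.hook_resid_post", "[:]", "blocks.0.hook_resid_pre", "[:]"),
}


def diff_edges(before, after):
    """Return (removed, added) edge keys, each sorted for stable output."""
    return sorted(before - after), sorted(after - before)


removed, added = diff_edges(before, after)
for key in removed:
    print("only before .step():", key)
for key in added:
    print("only after .step(): ", key)
```

With the real correspondence, the same set difference would be taken over the actual keys captured before and after the `.step()` cell.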
Turns out the explanation is that the ACDC algorithm literally removes edges (i.e. removes them from the correspondence dictionaries), as opposed to saying … The loading code should be changed to fix this.
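To illustrate the kind of change this implies, here is a minimal sketch of a loader that tolerates deleted edges. It does not use the real ACDC classes: `Edge`, its `present` flag, `build_full_graph`, `prune_by_deletion`, and `load_subgraph` are all hypothetical names assumed for this example. The idea is to start from a fresh full graph and treat any key missing from the saved subgraph as a removed edge, rather than asserting that the two key sets match.

```python
from dataclasses import dataclass


@dataclass
class Edge:
    # Hypothetical per-edge flag, assumed for this sketch only.
    present: bool = True


def build_full_graph(edge_keys):
    """Create a toy graph in which every edge starts out present."""
    return {key: Edge() for key in edge_keys}


def prune_by_deletion(graph, key):
    """Mimic pruning that deletes the dictionary entry outright."""
    del graph[key]


def load_subgraph(full_graph, saved_keys):
    """Mark each edge of full_graph as present only if it was saved.

    Keys absent from saved_keys are read as pruned edges. Saved keys
    that the full graph does not know about are reported as an error.
    """
    saved = set(saved_keys)
    unknown = saved - set(full_graph)
    if unknown:
        raise KeyError(f"saved edges not in the full graph: {sorted(unknown)}")
    for key, edge in full_graph.items():
        edge.present = key in saved
    return full_graph


all_keys = ["a->b", "a->c", "b->c"]
pruned = build_full_graph(all_keys)
prune_by_deletion(pruned, "a->c")  # the saved subgraph now lacks this key

restored = load_subgraph(build_full_graph(all_keys), pruned.keys())
```

In this toy setup `restored` keeps all three keys, with `"a->c"` marked as absent, so a loader written this way would not depend on whether pruning deleted entries or flagged them.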
@velezbeltran I'm curious whether you would be so kind as to share the working code for loading the subgraph weights.
Hello!
I have been working on the `ACDC_Main_Demo.ipynb` repo and I am currently facing an issue where, if I attempt to load the model from a subgraph, I get an error. In particular, I attempt the following steps. If I do this, I get the following assertion error:
What could be causing this? Am I doing something wrong? Alternatively, what is the standard way of loading in circuits?
Also, if I do run the cell that contains the `.step()` method, I don't have this issue.

Thank you!
Nicolas