Flexible neuron, synapse data layout #195
More context: the GPU has to access global arrays directly, which are allocated for each kernel, so we need global accessor functions that are defined differently on CPU vs. GPU. With a separate set of such functions, we can just switch out the NeuronVarIdx functions to see the impact of different layouts.
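To make the idea concrete, here is a minimal sketch of how swapping an index function changes the memory layout without touching any reader or writer code. The function names, and the NNeurons / NVars dimensions, are illustrative assumptions, not the actual axon API:

```go
package main

import "fmt"

// Illustrative dimensions only.
const (
	NNeurons = 4
	NVars    = 3
)

// NeuronVarIdxVarOuter indexes with the variable as the outer (slow)
// dimension: all values of one variable are contiguous (SoA-style,
// typically better for GPU coalesced access).
func NeuronVarIdxVarOuter(vr, ni uint32) uint32 {
	return vr*NNeurons + ni
}

// NeuronVarIdxNrnOuter indexes with the neuron as the outer dimension:
// all variables of one neuron are contiguous (AoS-style, like the
// current array-of-structs layout).
func NeuronVarIdxNrnOuter(vr, ni uint32) uint32 {
	return ni*NVars + vr
}

func main() {
	// Same (var, neuron) coordinate, two different flat offsets:
	fmt.Println(NeuronVarIdxVarOuter(1, 2)) // 1*4 + 2 = 6
	fmt.Println(NeuronVarIdxNrnOuter(1, 2)) // 2*3 + 1 = 7
}
```

All access goes through the index function, so benchmarking the two layouts is just a matter of which function is compiled in.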
The non-float32 vars (flags, indexes) in Neuron would be stored separately, so type conversions don't complicate anything. Also, SynCa would be separated from Synapses memory, per #168.
Context requires data-parallel state for all the PVLV, NeuroMod stuff.
Plan: add a GlobalVars enum in globals.go, and store the memory in Network, with all the NeuroMod and PVLV state. This can be parameterized with NDrives, using computed offsets etc. for flexible storage; maybe have an offset-index lookup table for each val (above a given enum value). Use data inner-loop indexing for each var. The GPU just exposes the Globals directly as usual.
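A rough sketch of what that plan could look like, assuming hypothetical variable names and sizes (the real globals.go may differ): scalar globals each occupy NData slots (data inner loop), and the NDrives-sized PVLV drives block sits at a computed offset past them.

```go
package main

import "fmt"

// GlobalVars enumerates network-global variables (illustrative subset).
type GlobalVars int32

const (
	GvDA     GlobalVars = iota // dopamine (NeuroMod) -- scalar per data index
	GvACh                      // acetylcholine -- scalar per data index
	GvDrives                   // first of NDrives drive values (PVLV)
	GlobalVarsN
)

// Illustrative parameterization.
const (
	NData   = 2 // number of data-parallel items
	NDrives = 3 // number of PVLV drives
)

// GlobalIdx computes the flat index into the Network globals array for
// variable gv, data index di, and (for multi-valued vars above GvDrives)
// sub-index si.  Data is the inner loop, so values for all data indexes
// of one variable are contiguous.
func GlobalIdx(gv GlobalVars, di, si uint32) uint32 {
	if gv < GvDrives {
		return uint32(gv)*NData + di
	}
	off := uint32(GvDrives) * NData // computed offset past scalar vars
	return off + si*NData + di
}

func main() {
	fmt.Println(GlobalIdx(GvACh, 1, 0))    // 1*2 + 1 = 3
	fmt.Println(GlobalIdx(GvDrives, 0, 2)) // 4 + 2*2 + 0 = 8
}
```

Because the offsets are computed from NData and NDrives, the same flat []float32 buffer works for any configuration, and the GPU can index it with the identical arithmetic.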
This worked as expected and massively improves GPU performance, and even CPU performance is significantly improved in NData > 1 cases. |
This seems very straightforward, now that Neurons and Syns are all just giant arrays of structs on the Network: convert them to []float32 arrays (likewise on GPU). Context is the first arg (nearly ubiquitous), and in CPU mode it has a Network index that allows access to a global list of networks, which in turn allows access to these arrays. On GPU, the accessor methods just access the global arrays defined in the kernel.
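A minimal sketch of the CPU-side routing described above, with illustrative names (Networks, NetIdx, NrnV are assumptions, not the actual axon identifiers):

```go
package main

import "fmt"

// Network holds the flat variable storage.
type Network struct {
	Neurons []float32 // flat neuron variable array
}

// Networks is the global list of networks; in CPU mode, accessors find
// the right network via the index stored in Context.  On GPU, the
// equivalent accessor would read the kernel's global array directly.
var Networks []*Network

// Context is passed as the first arg nearly everywhere.
type Context struct {
	NetIdx uint32 // index into the global Networks list (CPU mode)
}

// NrnV reads the neuron variable array at flat index idx for the
// network identified by the Context.
func NrnV(ctx *Context, idx uint32) float32 {
	return Networks[ctx.NetIdx].Neurons[idx]
}

// SetNrnV writes the value at flat index idx.
func SetNrnV(ctx *Context, idx uint32, val float32) {
	Networks[ctx.NetIdx].Neurons[idx] = val
}

func main() {
	Networks = []*Network{{Neurons: make([]float32, 8)}}
	ctx := &Context{NetIdx: 0}
	SetNrnV(ctx, 3, 0.5)
	fmt.Println(NrnV(ctx, 3)) // 0.5
}
```

The key design point is that calling code only ever uses the accessors, so the CPU (global network list) and GPU (kernel global arrays) implementations can differ freely behind the same call signature.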