Profile CPU and memory usage of model eval & fit #18

Open
jacobpennington opened this issue Sep 14, 2022 · 2 comments
Assignees: jacobpennington
Labels: enhancement (New feature or request)

Comments

@jacobpennington
Collaborator

The current Layer.evaluate implementations were mostly copied from nems0, with refactoring focused mainly on readability. Model.evaluate, .fit, and related components were developed with the same focus. Now they should be revised with an eye toward efficiency, at least in cases where improvements are not overly complicated. On some level, TF or other backends will always be faster, so the revised scipy implementations should remain easily interpretable, since interpretability is their main advantage.

See nems.preprocessing.normalization.minmax for an example of how to reduce memory spiking when array shapes don't change (not pushed as of posting this issue but will be soon).
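For reference, a minimal sketch of that pattern (illustrative only, not the actual minmax implementation): when the output shape matches the input shape, writing results back through `out=` keeps peak memory near a single copy of the array instead of allocating full-size intermediates.

```python
import numpy as np

def minmax_inplace(x, floor, ceiling):
    # Hypothetical helper illustrating the pattern; not the actual
    # nems.preprocessing.normalization.minmax implementation.
    # Assumes `x` is a float array. Each ufunc writes its result back
    # into `x` via `out=`, so no full-size temporaries are allocated.
    np.subtract(x, floor, out=x)
    np.divide(x, ceiling - floor, out=x)
    return x
```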

@jacobpennington jacobpennington added the enhancement New feature or request label Sep 14, 2022
@jacobpennington jacobpennington self-assigned this Sep 14, 2022
@jacobpennington
Collaborator Author

jacobpennington commented Sep 16, 2022

Changing nonlinearity implementations to use numexpr got rid of some unnecessary copies (and sped up the computation); other Layers are still TODO since they're not as straightforward.
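As a rough sketch of the numexpr approach (a simplified nonlinearity for illustration, not the exact NEMS expression): numexpr compiles the whole expression string and evaluates it in one fused pass, so the intermediate arrays that a chain of numpy ufuncs would allocate never exist.

```python
import numexpr as ne
import numpy as np

def double_exponential(x, base, amplitude, shift, kappa):
    # Simplified form for illustration; the real Layer may differ.
    # numexpr pulls `base`, `amplitude`, etc. from the local scope and
    # evaluates the expression in a single pass over `x`, avoiding the
    # temporaries that an `np.exp(-np.exp(...))` chain would allocate.
    return ne.evaluate("base + amplitude * exp(-exp(-kappa * (x - shift)))")
```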

Profiling results on Model.evaluate show very little overhead outside of the actual Layer.evaluate methods (which is what we want):

(abbreviated output)

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
   >> in Model.evaluate <<
   336         1      21479.6  21479.6     99.7              data = self.get_layer_data(data_generator, n, n)[-1]['data']
  
   >> in Model.get_layer_data <<
   610         1          2.8      2.8      0.0          subset = itertools.islice(data_generator, first_index, last_index)
   611         1      21958.7  21958.7    100.0          return [d for d in subset]

   >> in Model.generate_layer_data <<
   524         3      21199.9   7066.6     99.2              a, k, o = self._evaluate_layer(layer, data, inplace_ok=inplace_ok)

   >> in Layer._evaluate <<
   380         3      20992.1   6997.4     99.4          output = self.evaluate(*args, **kwargs)

(memory results were also good - no memory blowups outside Layer.evaluate)
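The column layout above matches line_profiler's output. For anyone wanting to reproduce these numbers, a sketch along these lines should work (assuming `model` and `data` are an existing Model and its input; memory behavior can be checked the same way with memory_profiler):

```python
from line_profiler import LineProfiler

# `model` and `data` are assumed to already exist; the methods below
# are the ones shown in the abbreviated output above.
profiler = LineProfiler()
profiler.add_function(type(model).evaluate)
profiler.add_function(type(model).get_layer_data)
profiler.add_function(type(model).generate_layer_data)

profiler.enable_by_count()
model.evaluate(data)
profiler.disable_by_count()
profiler.print_stats()
```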

Similar story for .fit so far:

   >> in Model.fit <<
   786         4   31310635.9 7827659.0     99.9          fit_results = backend_obj._fit(
   787         3          2.5      0.8      0.0              data, eval_kwargs=eval_kwargs, **fitter_options
   788                                                       )

   >> in ScipyBackend._fit <<
    80         5  111461117.8 22292223.6    100.0                  fit_result = wrapper.get_fit_result(_data, **fitter_options)

   >> in _FitWrapper.__call__ <<
          > forgot to save the output, but ~99% spent on computing the cost function (expected) <

   >> in _FitWrapper.compute_cost <<
          > forgot to save the output, but ~90% spent on the cost function itself;
              mostly expected, some room for minor improvements <
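One example of the kind of minor improvement available there (an illustration, not the actual cost implementation): fusing the squared-error reduction with numexpr so the residual array and its square are never materialized.

```python
import numexpr as ne
import numpy as np

def mse_numpy(prediction, target):
    # Allocates two full-size temporaries: the residual and its square.
    return np.mean((prediction - target) ** 2)

def mse_fused(prediction, target):
    # Single fused pass over both arrays; numexpr reduces as it goes,
    # so no intermediate arrays are allocated.
    return float(ne.evaluate("sum((prediction - target) ** 2)")) / prediction.size
```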

@jacobpennington
Collaborator Author

In summary: still not sure why the current scipy backend fits at roughly half the speed of NEMS0.
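A hedged sketch for pinning that gap down, assuming comparable single-fit entry points in both packages (all names illustrative): time each backend on the same data and compare best-of-N wall times.

```python
import time

def best_wall_time(fit_fn, *args, n_repeats=3, **kwargs):
    # Crude wall-clock benchmark: take the best of several runs to
    # reduce noise from other processes and allocator warm-up.
    best = float("inf")
    for _ in range(n_repeats):
        start = time.perf_counter()
        fit_fn(*args, **kwargs)
        best = min(best, time.perf_counter() - start)
    return best

# Usage (hypothetical call signatures, for illustration only):
# new = best_wall_time(model.fit, data, backend='scipy')
# old = best_wall_time(nems0_fit, recording, modelspec)
# print(f"scipy backend / nems0: {new / old:.2f}x")
```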
