Simple gradient descent optimiser #1104

Merged 12 commits into master from 930-gradient-descent on Apr 17, 2020

Conversation

@MichaelClerx (Member)

Closes #930

  • Updates the optimiser interface to allow sensitivities
  • Updates the optimiser controller to work with sensitivities
  • Adds a simple fixed-learning-rate gradient descent method (sketched below)
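
A minimal sketch of the fixed-learning-rate update the new optimiser performs, written as plain NumPy rather than against the PINTS interface (the `gradient_descent` helper and the toy error below are illustrative, not code from this PR):

```python
import numpy as np

def gradient_descent(grad, x0, eta=0.01, max_iters=1000, tol=1e-8):
    """Minimise an error by repeatedly stepping against its gradient.

    grad: function returning the gradient (sensitivities) at x
    eta:  fixed learning rate
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iters):
        x_new = x - eta * grad(x)          # fixed-step update: x <- x - eta * grad f(x)
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# Example: the parabolic error f(x) = sum((x - c)**2) has gradient 2 * (x - c)
c = np.array([1.0, -2.0])
print(gradient_descent(lambda x: 2 * (x - c), x0=np.zeros(2)))  # ~ [1.0, -2.0]
```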

@MichaelClerx self-assigned this on Apr 7, 2020

codecov bot commented on Apr 7, 2020

Codecov Report

Merging #1104 into master will not change coverage.
The diff coverage is 100.00%.

@@            Coverage Diff            @@
##            master     #1104   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           69        70    +1     
  Lines         7343      7413   +70     
=========================================
+ Hits          7343      7413   +70     
| Impacted Files | Coverage Δ |
| --- | --- |
| pints/_optimisers/__init__.py | 100.00% <100.00%> (ø) |
| pints/_optimisers/_gradient_descent.py | 100.00% <100.00%> (ø) |
| pints/toy/_parabola.py | 100.00% <100.00%> (ø) |
| pints/plot.py | 100.00% <0.00%> (ø) |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 9c68cb2...29d8afb.

@MichaelClerx (Member, Author)

@ben18785 & @alisterde, could you have a quick look at this PR for me?
It adds a first gradient-based optimiser, which I think would pave the way for your BFGS optimiser.

@alisterde (Member) left a comment

Thanks @MichaelClerx, it looks good to me, and the structure will be really useful for the BFGS method.

@MichaelClerx (Member, Author)

Thanks @alisterde!

@ben18785 (Collaborator) left a comment

Thanks @MichaelClerx -- a really nice and helpful addition.

I have made some changes to the example notebook: the reason the logistic problem wasn't converging is that its parameter scales are so different in this case. I rescaled them and gradient descent now runs fine.
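
To make the scaling point concrete (an illustrative toy rather than the notebook's actual logistic example, with all numbers assumed): a fixed learning rate small enough to be stable for a very sensitive parameter barely moves an insensitive one, while optimising in rescaled units lets one rate serve both.

```python
import numpy as np

# The error is far more sensitive to p[0] than to p[1] per unit change,
# so a single fixed learning rate cannot serve both parameters well.
target = np.array([1.5, 500.0])
sens = np.array([100.0, 0.01])             # assumed per-parameter sensitivities

def error_grad(p):
    return 2.0 * sens**2 * (p - target)    # gradient of sum((sens * (p - target))**2)

def descend(grad, x0, eta, n=5000):
    x = np.asarray(x0, dtype=float)
    for _ in range(n):
        x = x - eta * grad(x)              # fixed-learning-rate update
    return x

# Raw parameters: eta must stay below ~1e-4 to be stable, so p[1] barely moves.
print(descend(error_grad, [0.0, 0.0], eta=5e-5))  # ~ [1.5, 0.025]: p[1] stuck near zero

# Rescaled parameters u = p / s, with s the typical scale of each parameter:
# the problem becomes well conditioned and one learning rate suits both.
s = 1.0 / sens
print(s * descend(lambda u: s * error_grad(s * u), [0.0, 0.0], eta=0.4))  # ~ [1.5, 500.0]
```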

Otherwise, have a look at my line comment about the name and see what you think.

Finally, shouldn't we have some tests for the parabolic error sensitivities?
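
A test along those lines could simply compare the returned sensitivities with the analytic gradient of the parabola. The sketch below assumes the toy is exposed as pints.toy.ParabolicError with centre c, defines f(x) = sum((x - c)**2), and follows the usual evaluateS1() convention of returning (error, gradient); check pints/toy/_parabola.py for the actual API.

```python
import numpy as np
import pints.toy

def test_parabolic_error_sensitivities():
    # Assumed API: pints.toy.ParabolicError(c) with f(x) = sum((x - c)**2)
    c = np.array([1.0, -2.0, 3.0])
    f = pints.toy.ParabolicError(c)
    x = np.array([0.5, 0.5, 0.5])

    fx, dfx = f.evaluateS1(x)               # assumed to return (error, sensitivities)
    assert np.isclose(fx, f(x))             # value from S1 matches a plain call
    assert np.allclose(dfx, 2 * (x - c))    # analytic gradient of sum((x - c)**2)
```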

@MichaelClerx (Member, Author)

Thanks @ben18785! I quite like how the notebook suggests improvements to this method :D

@MichaelClerx requested a review from ben18785 on April 17, 2020 at 12:08
@MichaelClerx (Member, Author)

I've added some tests and changed the name now, @ben18785!

@fcooper8472 (Member) left a comment

This all looks good to me code-wise.

Is this the canonical "gradient descent" optimiser or, for instance, are there examples with non-constant learning rate? I.e. is the name going to need to become more specific at some point in the future when someone implements a similar algorithm?

(Maybe for @ben18785)

@MichaelClerx (Member, Author)

> This all looks good to me code-wise.
>
> Is this the canonical "gradient descent" optimiser or, for instance, are there examples with non-constant learning rate? I.e. is the name going to need to become more specific at some point in the future when someone implements a similar algorithm?

It's a good question; we'll probably want to name it something more specific in the future. (Gradient descent is one of those things that became a class of methods, I suppose.)

But to be honest I'm not super worried about that, because this won't really be used in practice.

@MichaelClerx (Member, Author)

Then again, @fcooper8472, the more specific ones do seem to have specific names, so it could be fine to leave this as-is even when we add more?

See #1105

Someone will probably do these for fun at some point :D

@fcooper8472 (Member)

> Then again, @fcooper8472, the more specific ones do seem to have specific names, so it could be fine to leave this as-is even when we add more?

Sounds good to me.

@ben18785 (Collaborator) left a comment

Looks good -- thanks. I'll go ahead and merge.

@ben18785 merged commit 5791246 into master on Apr 17, 2020
@ben18785 deleted the 930-gradient-descent branch on April 17, 2020 at 12:32