diff --git a/README.md b/README.md
index c4b88ed..1ad50d6 100644
--- a/README.md
+++ b/README.md
@@ -15,20 +15,11 @@ TemporalGPs.jl is registered, so simply type the following at the REPL:
 ```
 While you can install TemporalGPs without AbstractGPs and KernelFunctions, in practice the latter are needed for all common tasks in TemporalGPs.
 
-## Note !!!
-
-This package is currently not guaranteed to work with all current versions of dependencies. If something is not working on the current release of TemporalGPs,
-please try out v0.6.7, which pins some dependencies in order to circumvent some of the problems. You can do so by typing instead:
-```julia
-] add AbstractGPs KernelFunctions TemporalGPs@0.6.7
-```
-Please report an issue if this work-around fails.
-
 # Example Usage
 
 Most examples can be found in the [examples](https://github.com/JuliaGaussianProcesses/TemporalGPs.jl/tree/master/examples) directory. In particular see the associated [README](https://github.com/JuliaGaussianProcesses/TemporalGPs.jl/tree/master/examples/README.md).
 
-This is a small problem by TemporalGPs' standard. See timing results below for expected performance on larger problems.
+The following is a small problem by TemporalGPs' standards. See timing results below for expected performance on larger problems.
 
 ```julia
 using AbstractGPs, KernelFunctions, TemporalGPs
@@ -66,72 +57,11 @@ logpdf(f_post(x), y)
 ## Learning kernel parameters with [Optim.jl](https://github.com/JuliaNLSolvers/Optim.jl), [ParameterHandling.jl](https://github.com/invenia/ParameterHandling.jl), and [Mooncake.jl](https://github.com/compintell/Mooncake.jl/)
 
 TemporalGPs.jl doesn't provide scikit-learn-like functionality to train your model (find good kernel parameter settings).
-Instead, we offer the functionality needed to easily implement your own training functionality using standard tools from the Julia ecosystem, as shown below.
-```julia
-# Load our GP-related packages.
-using AbstractGPs
-using KernelFunctions
-using TemporalGPs
-
-# Load standard packages from the Julia ecosystem
-using Optim # Standard optimisation algorithms.
-using ParameterHandling # Helper functionality for dealing with model parameters.
-using Mooncake # Algorithmic Differentiation
-
-using ParameterHandling: flatten
-
-# Declare model parameters using `ParameterHandling.jl` types.
-flat_initial_params, unflatten = flatten((
-    var_kernel = positive(0.6),
-    λ = positive(2.5),
-    var_noise = positive(0.1),
-))
-
-# Construct a function to unpack flattened parameters and pull out the raw values.
-unpack = ParameterHandling.value ∘ unflatten
-params = unpack(flat_initial_params)
-
-function build_gp(params)
-    f_naive = GP(params.var_kernel * Matern52Kernel() ∘ ScaleTransform(params.λ))
-    return to_sde(f_naive, SArrayStorage(Float64))
-end
-
-# Generate some synthetic data from the prior.
-const x = RegularSpacing(0.0, 0.1, 10_000)
-const y = rand(build_gp(params)(x, params.var_noise))
-
-# Specify an objective function for Optim to minimise in terms of x and y.
-# We choose the usual negative log marginal likelihood (NLML).
-function objective(params)
-    f = build_gp(params)
-    return -logpdf(f(x, params.var_noise), y)
-end
-
-# Check that the objective function works:
-objective(params)
-
-# Optimise using Optim. This optimiser often works fairly well in practice,
-# but it's not going to be the best choice in all situations. Consult
-# Optim.jl for more info on available optimisers and their properties.
-training_results = Optim.optimize(
-    objective ∘ unpack,
-    θ -> only(Mooncake.gradient(objective ∘ unpack, θ)),
-    flat_initial_params + randn(3), # Add some noise to make learning non-trivial
-    BFGS(
-        alphaguess = Optim.LineSearches.InitialStatic(scaled=true),
-        linesearch = Optim.LineSearches.BackTracking(),
-    ),
-    Optim.Options(show_trace = true);
-    inplace=false,
-)
-
-# Extracting the final values of the parameters.
-# Should be close to truth.
-final_params = unpack(training_results.minimizer)
-```
-Once you've learned the parameters, you can use `posterior`, `marginals`, and `rand` to make posterior-predictions with the optimal parameters.
+Instead, we offer the functionality needed to easily implement your own training loop using standard tools from the Julia ecosystem.
+See [exact_time_learning.jl](https://github.com/JuliaGaussianProcesses/TemporalGPs.jl/blob/master/examples/exact_time_learning.jl).
 
-In the above example we optimised the parameters, but we could just as easily have utilised e.g. [AdvancedHMC.jl](https://github.com/TuringLang/AdvancedHMC.jl) in conjunction with a prior over the parameters to perform approximate Bayesian inference in them -- indeed, [this is often a very good idea](http://proceedings.mlr.press/v118/lalchand20a/lalchand20a.pdf). We leave this as an exercise for the interested user (see e.g. the examples in [Stheno.jl](https://github.com/willtebbutt/Stheno.jl/) for inspiration).
+In that example we optimised the parameters, but we could just as easily have utilised e.g. [AdvancedHMC.jl](https://github.com/TuringLang/AdvancedHMC.jl) in conjunction with a prior over the parameters to perform approximate Bayesian inference in them -- indeed, [this is often a very good idea](http://proceedings.mlr.press/v118/lalchand20a/lalchand20a.pdf).
+We leave this as an exercise for the interested user (see e.g. the examples in [Stheno.jl](https://github.com/willtebbutt/Stheno.jl/) for inspiration).
 
 Moreover, it should be possible to plug this into probabilistic programming framework such as `Turing` and `Soss` with minimal effort, since `f(x, params.var_noise)` is a plain old `Distributions.MultivariateDistribution`.
 
@@ -155,20 +85,6 @@ This tells TemporalGPs that you want all parameters of `f` and anything derived
 
 Gradient computations use Mooncake.
 Custom adjoints have been implemented to achieve this level of performance.
-
-# On-going Work
-
-- Optimisation
-    + in-place implementation with `ArrayStorage` to reduce allocations
-    + input data types for posterior inference - the `RegularSpacing` type is great for expressing that the inputs are regularly spaced. A carefully constructed data type to let the user build regularly-spaced data when working with posteriors would also be very beneficial.
-- Interfacing with other packages
-    + When [Stheno.jl](https://github.com/willtebbutt/Stheno.jl/) moves over to the AbstractGPs interface, it should be possible to get some interesting process decomposition functionality in this package.
-- Approximate inference under non-Gaussian observation models
-
-If you're interested in helping out with this stuff, please get in touch by opening an issue, commenting on an open one, or messaging me on the Julia Slack.
-
-
-
 # Relevant literature
 
 See chapter 12 of [1] for the basics.
diff --git a/examples/README.md b/examples/README.md
index 381566f..2a84a38 100644
--- a/examples/README.md
+++ b/examples/README.md
@@ -2,7 +2,7 @@
 
 Ideally, you would have worked through a few examples involving AbstractGPs.jl, as the
 code in TemporalGPs.jl implements the (primary) interface specified there.
-Equally, these examples stand alone, so if you're not familiar with AbstractGPS.jl, don't
+Equally, these examples stand alone, so if you're not familiar with AbstractGPs.jl, don't
 worry too much.
 
 The examples in this directory are best worked through in the following order:
diff --git a/examples/exact_time_learning.jl b/examples/exact_time_learning.jl
index d880815..267d694 100644
--- a/examples/exact_time_learning.jl
+++ b/examples/exact_time_learning.jl
@@ -6,9 +6,6 @@
 using AbstractGPs
 using TemporalGPs
 
-# Load up the separable kernel from TemporalGPs.
-using TemporalGPs: RegularSpacing
-
 # Load standard packages from the Julia ecosystem
 using Optim # Standard optimisation algorithms.
 using ParameterHandling # Helper functionality for dealing with model parameters.
@@ -27,6 +24,9 @@ flat_initial_params, unpack = ParameterHandling.value_flatten((
 # Pull out the raw values.
 params = unpack(flat_initial_params);
 
+# Functionality to build a TemporalGPs.jl GP given the model parameters.
+# Specifying SArrayStorage ensures that StaticArrays.jl is used to represent model
+# parameters under the hood, which enables very strong performance.
 function build_gp(params)
     k = params.var_kernel * Matern52Kernel() ∘ ScaleTransform(params.λ)
     return to_sde(GP(params.mean, k), SArrayStorage(Float64))
@@ -34,7 +34,7 @@ end
 
 # Specify a collection of inputs. Must be increasing.
 T = 1_000_000;
-x = RegularSpacing(0.0, 1e-4, T);
+x = TemporalGPs.RegularSpacing(0.0, 1e-4, T);
 
 # Generate some noisy synthetic data from the GP.
 f = build_gp(params)
@@ -48,6 +48,7 @@ function objective(flat_params)
     return -logpdf(f(x, params.var_noise), y)
 end
 
+# A helper function to compute the gradient of the objective using Mooncake.
 function objective_grad(rule, flat_params)
     return Mooncake.value_and_gradient!!(rule, objective, flat_params)[2][2]
 end
@@ -74,7 +75,7 @@ f_post = posterior(f_final(x, final_params.var_noise), y);
 
 # Specify some locations at which to make predictions.
 T_pr = 1_200_000;
-x_pr = RegularSpacing(0.0, 1e-4, T_pr);
+x_pr = TemporalGPs.RegularSpacing(0.0, 1e-4, T_pr);
 
 # Compute the exact posterior marginals at `x_pr`.
 f_post_marginals = marginals(f_post(x_pr));
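For readers who want a self-contained picture of the workflow that the README now delegates to `exact_time_learning.jl`, the sketch below condenses the training loop implied by the changes above: declare parameters with ParameterHandling.jl, build an SDE-form GP with `to_sde` and `SArrayStorage`, and minimise the negative log marginal likelihood with Optim.jl, using Mooncake.jl for gradients. This is a minimal sketch rather than the package's canonical example; the `Mooncake.build_rrule` call and the optimiser settings are assumptions carried over from the README snippet removed above, so treat `exact_time_learning.jl` as the authoritative reference.

```julia
# Minimal training-loop sketch (assumes AbstractGPs, KernelFunctions, TemporalGPs,
# Optim, ParameterHandling, and Mooncake are installed).
using AbstractGPs, KernelFunctions, TemporalGPs
using Optim, ParameterHandling, Mooncake

# Declare constrained model parameters and flatten them into a plain vector.
flat_initial_params, unpack = ParameterHandling.value_flatten((
    var_kernel = positive(0.6),
    λ = positive(2.5),
    var_noise = positive(0.1),
))
params = unpack(flat_initial_params)

# Build a TemporalGPs GP in state-space (SDE) form. SArrayStorage keeps the
# per-time-step matrices as StaticArrays, which is what makes inference fast.
function build_gp(params)
    k = params.var_kernel * Matern52Kernel() ∘ ScaleTransform(params.λ)
    return to_sde(GP(k), SArrayStorage(Float64))
end

# Synthetic, regularly-spaced data drawn from the prior.
x = TemporalGPs.RegularSpacing(0.0, 0.1, 10_000)
y = rand(build_gp(params)(x, params.var_noise))

# Negative log marginal likelihood as a function of the flattened parameters.
function objective(flat_params)
    params = unpack(flat_params)
    f = build_gp(params)
    return -logpdf(f(x, params.var_noise), y)
end

# Gradients via Mooncake. build_rrule is assumed here; the diff above only shows
# the value_and_gradient!!-based helper from exact_time_learning.jl.
rule = Mooncake.build_rrule(objective, flat_initial_params)
objective_grad(flat_params) = Mooncake.value_and_gradient!!(rule, objective, flat_params)[2][2]

# Optimise with BFGS, starting from a perturbed initialisation.
training_results = Optim.optimize(
    objective,
    objective_grad,
    flat_initial_params + randn(length(flat_initial_params)),
    BFGS(
        alphaguess = Optim.LineSearches.InitialStatic(scaled=true),
        linesearch = Optim.LineSearches.BackTracking(),
    ),
    Optim.Options(show_trace = true);
    inplace = false,
)

# Constrained parameter values at the optimum; should be close to the truth.
final_params = unpack(training_results.minimizer)
```

With `final_params` in hand, `posterior`, `marginals`, and `rand` can be used to make predictions with the learned parameters.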