Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bottom up iterator #107

Draft
wants to merge 60 commits into
base: master
Choose a base branch
from
Draft

Bottom up iterator #107

wants to merge 60 commits into from

Conversation

ReubenJ
Copy link
Member

@ReubenJ ReubenJ commented May 21, 2024

Just to trigger CI for now...

Copy link

codecov bot commented May 21, 2024

Codecov Report

Attention: Patch coverage is 93.22034% with 16 lines in your changes missing coverage. Please review.

Project coverage is 32.59%. Comparing base (1876d08) to head (5ed48ef).
Report is 25 commits behind head on master.

Files with missing lines Patch % Lines
src/stochastic_iterator.jl 0.00% 12 Missing ⚠️
src/top_down_iterator.jl 0.00% 3 Missing ⚠️
src/bottom_up_iterators/bottom_up_iterator.jl 97.36% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           master     #107       +/-   ##
===========================================
- Coverage   67.35%   32.59%   -34.76%     
===========================================
  Files          21       24        +3     
  Lines         729      908      +179     
===========================================
- Hits          491      296      -195     
- Misses        238      612      +374     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

current_programs::Queue{RuleNode}
mutable struct BottomUpState
bank::Any
data::Any
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

non-urgent thing to think about: should we leave bank and data as Any, or should we introduce some subtyping? I assume the bank will always be either a Dict so that programs are ordered over their type, or some kind of priority as in Brute. If we figure out more precise type, the code might be faster.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have decided to introduce two new abstract types: BUDepthBank and BUDepthData that the concrete iterators will need to extend. Please let me know what you think about this.

iter::DepthIterator
)::Dict{Symbol, Vector{RuleNode}}
grammar::ContextSensitiveGrammar = get_grammar(iter.solver)
bank::Dict{Symbol, Vector{RuleNode}} = Dict{Symbol, Vector{RuleNode}}()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does it make sense to make the bank type Dict{Symbol, Dict{Int,Vector{RuleNode}}}? so that you can also index it over depth for easier fetching?

Julia 1.12 will clean up when implicit world age increments happen.
In particular, there will no longer be implicit increments within
the same statement at toplevel. However, we are likely retaining
implicit increments before each statement within @testset. This
PR makes a small rearrangements to be compatible with this change.
See JuliaLang/julia#56509.
@THinnerichs
Copy link
Member

THinnerichs commented Nov 20, 2024

@TudorMagirescu:
Could you merge the uniform iterator branch into this one and check what changes we should keep? Then I can give a full review of both including some hints for the design patterns we discussed during the Hackathon.

ReubenJ and others added 20 commits November 27, 2024 17:53
Add abstract layer to iterator hierarchy
Small tweak for Julia 1.12 world age change
Add `Aqua.jl` to test for this in the future as well as other Aqua
tests. Ignoring the type piracies on `RuleNode`s and
`AbstractGrammar`. Ideally, this functionality would be moved to
`HerbCore`, but that can wait.
…-16-01-46-08-093-02170095097

CompatHelper: bump compat for HerbSpecification to 0.2, (keep existing compat)
Simplify compat for `Random`
The caret is the default SemVer behavior (https://pkgdocs.julialang.org/v1/compatibility/\#Caret-specifiers)
and it seems other projects leave it off.
…-13-01-52-34-511-01072321669

CompatHelper: bump compat for HerbGrammar to 0.5
@TudorMagirescu TudorMagirescu self-assigned this Feb 13, 2025
@sebdumancic
Copy link
Member

sebdumancic commented Feb 14, 2025

I have been going over our struggles to define a clean interface for the bottom-up search. The situation with two iterators is quite clunky, and avoiding the inner iterator overloads the memory. We have to make it easy for people to implement various versions rather quickly.
Here is my proposal that (hopefully?) avoids our previous struggles.

Here is a proposal for a generic bottom-up iterator (the capitalized functions are the interface):

bank <- INITIALISE_BANK(iter)
POPULATE!(iter, bank, grammar)
to_combine, cont <- COMBINE(iter, bank, grammar, nothing)

while cont != nothing
     for prog in construct(to_combine)
          # prog is a uniform tree
          return prog
          address <- NEW_ADDRESS([parent addresses])
          ADD!(iter, prog, bank, address)
          to_combine, cont <- COMBINE(iter, bank, grammar, cont)
     end
end

The interface functions do the following:

  • INITIALISE_BANK: create the empty data structure for the bank
  • POPULATE!: populates the bank with the initial programs
  • COMBINE: returns the addresses (in the bank) of the programs to combine next and an arbitrary state to keep track of combinations so far. Returning the addresses instead of programs allows flexibility and customizability without overwhelming the memory. If the state is nothing, then the iteration stops. The generic combine function takes the addresses and constructs a program combining the programs indicated by those addresses (this is not a functionality that a user needs to implement).
  • NEW_ADDRESS: computes the address (in the bank) of a newly constructed program, from the parents' addresses
  • ADD!: adds a program into the bank at the given address

This interface would allow us to implement many versions of bottom-up search easily:

  • standard depth-indexed bottom-up:
    • the bank is a dict where the depth is the key and values are the programs with that depth
    • the COMBINE function combines the biggest entry in the bank with all the others
  • size-indexed bottom-up:
    • the bank is a dict with keys being the size of programs and the values are programs of that size
    • the COMBINE function combines the programs of the biggest size with all the others
  • Brute (execution-based distance)
    • the bank is a priority queue with the priority function being the distance to the desired output
    • the COMBINE function takes the best program from the queue and combines it with single-step refinements
  • cost-based search:
    • the bank is either a priority queue ordered by cost or a dict where the costs are the keys
    • the COMBINEfunction
  • Eco/Heap/Bee search
    • the bank is a nested dictionary with cost tuples
    • the COMBINEfunctions gradually increases the cost
  • beam search
    • the bank is a priority queue (of fixed size) ordered by how good programs are (for whatever metric) and a collection of simple programs
    • the COMBINE function extends each program in a queue with simple extensions

Optimisations are also easy to do:

  • observational equivalence: do it within the ADD! functions and store the previous executions in the bank
  • bank cleanup: also do it in the ADD! function?
  • want to store some additional information: extend the bank to contain that information, and use it in theCOMBINE! function, maintain the information within the ADD! function

Issue: how to handle returning individual programs when manipulating uniform trees within the iterator

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants