The package kpsearch
contains methods to search for portfolios of algorithms (e.g., SAT solvers).
A portfolio is a set of algorithms,
where the portfolio's runtime on a problem instance equals the runtime of the fastest algorithm from the portfolio.
This VBS (virtual best solver) approach corresponds to running all the portfolio's algorithms in parallel
and only recording the runtime of the fastest algorithm per problem instance.
(In reality, one could use a prediction model to choose a (hopefully fast) algorithm for each problem instance.)
Given the runtimes of multiple algorithms on multiple problem instances,
one wants to find the overall optimal (fastest) portfolio of size k
, i.e., containing k
algorithms.
This document provides:
- Steps for setting up the package.
- A short overview of the (portfolio-search) functionality.
- A demo of the functionality.
- Guidelines for developers who want to modify or extend the code base.
If you use this package for a scientific publication, please cite our paper
@inproceedings{bach2022comprehensive,
author={Bach, Jakob and Iser, Markus and B\"{o}hm, Klemens},
title={A Comprehensive Study of k-Portfolios of Recent {SAT} Solvers},
booktitle={25th International Conference on Theory and Applications of Satisfiability Testing (SAT 2022)},
location={Haifa, Israel},
year={2022},
doi={10.4230/LIPIcs.SAT.2022.2}
}
You can install our package from PyPI:
python -m pip install kpsearch
Alternatively, you can install the package from GitHub:
python -m pip install git+https://github.com/Jakob-Bach/Small-Portfolios.git#subdirectory=code/kpsearch_package
If you already have the source code for the package (i.e., the directory in which this README
resides)
as a local directory on your computer (e.g., after cloning the project), you can also perform a local install:
python -m pip install kpsearch_package/
kpsearch.py
provides several functions for searching k-portfolios:
- exact:
exhaustive_search()
,mip_search()
,smt_search()
,smt_search_nof()
- heuristic:
beam_search()
,kbest_search()
,random_search()
mip_search()
is a novel contribution of our paper, using a MIP solver to find optimal k-portfolios.
All other search methods are straightforward and/or adaptations from related work.
mip_search()
is usually the fastest exact search method except for very small or large k
(where the plain exhaustive_search()
may be faster).
All functions share two parameters:
- A
pandas.DataFrame
, where the rows represent problem (e.g., SAT) instances, the columns represent solvers (column names are solver names), and the cells represent runtimes. - The portfolio size
k
, i.e., the maximum number of solvers in the portfolio (some search methods may return smaller portfolios if adding more solvers is not beneficial).
The existence of further parameters depends on the search method; see the individual functions' documentation for details.
All search methods return a list, where each entry is a portfolio described by
- the solver names (column names from the runtime data) and
- the objective value (average runtime of the VBS, i.e., virtual best solver, which assumes the lowest runtime of the portfolio members for each instance).
Let's create a small demo dataset with runtimes of three solvers on four problem instances:
import pandas as pd
import kpsearch
runtimes = pd.DataFrame({'Solver1': [1, 2, 3, 4],
'Solver2': [2, 2, 5, 1],
'Solver3': [5, 3, 2, 1]})
I.e., the data looks as follows:
Solver1 Solver2 Solver3
0 1 2 5
1 2 2 3
2 3 5 2
3 4 1 1
Let's try exhaustive search:
print(kpsearch.exhaustive_search(runtimes=runtimes, k=2))
This search procedure returns all portfolios of the desired size k
(so if you only wanted the optimal portfolio, you would need to search in these results accordingly):
[(['Solver1', 'Solver2'], 1.75), (['Solver1', 'Solver3'], 1.5), (['Solver2', 'Solver3'], 1.75)]
Let's try greedy search, i.e., a beam search with a beam width of one:
print(kpsearch.beam_search(runtimes=runtimes, k=3, w=1))
This search procedure does not only yield a k
-portfolio, but also all intermediate results,
i.e., smaller portfolio sizes, since the procedure iteratively adds solvers to the portfolio:
[(['Solver1'], 2.5), (['Solver1', 'Solver3'], 1.5), (['Solver1', 'Solver2', 'Solver3'], 1.5)]
We can see that the third iteration does not improve the portfolio's VBS, i.e., Solver 2 cannot solve any instance faster than both other solvers.
Though there is no formal superclass or interface, all existing search methods follow specific conventions, which makes them compatible with each other. Thus, if you want to add another portfolio-search method, it may be beneficial to follow these conventions as well.
All search methods share two parameters:
The solver runtimes
as DataFrame
and the number of solvers k
as int
.
You can add further method-specific parameters as you like.
For example, beam search has the beam width w
as another parameter.
The result of each search method is a list of tuples of
- solver names (list of strings, corresponding to column names in
runtimes
) and - portfolio performance (float) in terms of VBS score.
The list may have a length of one in case the search only returns one portfolio.