forked from ShuhuaGao/geppy

A framework for gene expression programming (an evolutionary algorithm) in Python

geppy: a gene expression programming framework in Python

geppy is a computational framework dedicated to Gene Expression Programming (GEP), which was proposed by C. Ferreira in 2001 [1]. geppy is developed in Python 3.

What is GEP?

Gene Expression Programming (GEP) is a popular and established evolutionary algorithm for automatic generation of computer programs and mathematical models. It has found wide applications in symbolic regression, classification, automatic model design, combinatorial optimization and real parameter optimization problems [2].

GEP can be seen as a variant of traditional genetic programming (GP). It uses simple linear chromosomes of fixed length to encode the genetic information. Although each chromosome (and each of its genes) has a fixed length, it can produce expression trees of various sizes thanks to its genotype-phenotype expression system. Many experiments show that GEP is more efficient than GP, and the trees evolved by GEP tend to be smaller than those evolved by GP.
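
To make the fixed-length-gene, variable-size-tree idea concrete, here is a minimal sketch in plain Python. It does not use geppy's API; the names `ARITY`, `decode` and `to_infix` are hypothetical and chosen only for this illustration. It assumes the usual GEP gene layout from Ferreira's formulation: a head of length h that may hold functions or terminals, followed by a tail of length t = h * (n_max - 1) + 1 that holds terminals only, where n_max is the largest function arity. The gene is read left to right and the tree is filled level by level, so unused trailing symbols are simply ignored.

```python
from collections import deque

# Arity of each function symbol; any symbol not listed is a terminal (arity 0).
ARITY = {"+": 2, "-": 2, "*": 2}


def decode(gene):
    """Build an expression tree from a gene, filling it level by level.

    A node is a list [symbol, child, child, ...].
    Returns the root node and the number of gene symbols actually used.
    """
    symbols = iter(gene)
    root = [next(symbols)]
    queue = deque([root])
    used = 1
    while queue:
        node = queue.popleft()
        # Attach as many children as the symbol's arity requires.
        for _ in range(ARITY.get(node[0], 0)):
            child = [next(symbols)]
            used += 1
            node.append(child)
            queue.append(child)
    return root, used


def to_infix(node):
    """Render a tree of binary functions and terminals as an infix string."""
    symbol, *children = node
    if not children:
        return symbol
    left, right = children
    return f"({to_infix(left)} {symbol} {to_infix(right)})"


# Three genes with head length 3 and tail length 3 * (2 - 1) + 1 = 4,
# so every gene has exactly 7 symbols.
for gene in ("a+babab", "+*aabab", "+*-abab"):
    tree, used = decode(gene)
    print(gene, "uses", used, "symbols:", to_infix(tree))
```

All three genes have the same length, yet they decode to trees with 1, 5 and 7 nodes respectively. Because the tail contains only terminals and is long enough for the worst case, the decoding loop never runs out of symbols, which is why every gene of this layout is a valid program.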

geppy and DEAP

geppy is built on top of the excellent evolutionary computation framework DEAP for rapid prototyping and testing of ideas with GEP. DEAP provides fundamental support for GP but lacks support for GEP. geppy tries its best to follow the style of DEAP and to remain compatible with DEAP's major infrastructure. In other words, geppy may to some degree be considered a DEAP plugin that adds GEP support. If you are familiar with DEAP, geppy should be easy to grasp. Comprehensive documentation is also available.

Features

  • Compatibility with the DEAP infrastructure and easy access to DEAP's functionality, including:
    • Multi-objective optimisation
    • Straightforward parallelization of fitness evaluations for speedup
    • Hall of Fame of the best individuals that lived in the population
    • Checkpoints that take snapshots of a system regularly
    • Statistics and logging
  • Core data structures in GEP, including the gene, chromosome, expression tree, and K-expression.
  • Implementation of common mutation, transposition, inversion and crossover operators in GEP.
  • Boilerplate algorithms, including the standard GEP algorithm and advanced algorithms integrating a local optimizer for numerical constant optimization.
  • Support for numerical constant inference with a third Dc domain in genes: the GEP-RNC algorithm.
  • Flexible built-in algorithm interface, which can support an arbitrary number of custom mutation and crossover-like operators.
  • Visualization of the expression tree.
  • Symbolic simplification of a gene, a chromosome, or a K-expression in postprocessing.
  • Examples of different applications using GEP with detailed comments in Jupyter notebook.

Installation

geppy is currently still in its alpha phase. If you want to try it, you can install it from source.

  1. First download or clone this repository
git clone https://github.com/ShuhuaGao/geppy
  2. Change into the root directory, i.e., the one containing the setup.py file, and install geppy using pip
cd geppy
pip install .

(TODO) Later, I will publish geppy to PyPI so that it can be installed directly with pip.

Documentation

Check the geppy documentation for GEP theory, a comprehensive introduction to geppy's API, and tutorials and examples of typical usage.

Examples

A getting-started example is presented in the Jupyter notebook Boolean model identification, which infers a Boolean function from given input-output data with GEP. More examples are listed below.

Simple symbolic regression

  1. Boolean model identification (Getting started with no constants involved)
  2. Simple mathematical expression inference (Constants finding with ephemeral random constants (ERC))
  3. Simple mathematical expression inference with the GEP-RNC algorithm (Demonstrating the GEP-RNC algorithm for numerical constant evolution)

Advanced symbolic regression

  1. Improving symbolic regression with linear scaling (Use the linear scaling technique to evolve models with continuous real constants more efficiently)
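
As a rough illustration of the linear scaling idea, the sketch below fits a slope `a` and intercept `b` by ordinary least squares so that `a * model(x) + b` matches the targets as closely as possible. This is the usual closed-form formulation, written in plain Python; it is not taken from the geppy notebook, and the function name `linear_scaling` is hypothetical. The point is that evolution only needs to find the right shape of the model, while the two constants are computed directly.

```python
def linear_scaling(predictions, targets):
    """Return (a, b) minimizing sum((t - (a * p + b)) ** 2) over all pairs."""
    n = len(predictions)
    mean_p = sum(predictions) / n
    mean_t = sum(targets) / n
    var_p = sum((p - mean_p) ** 2 for p in predictions)
    if var_p == 0:
        # A constant model cannot be rescaled; the best fit is the mean target.
        return 0.0, mean_t
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(predictions, targets))
    a = cov / var_p
    b = mean_t - a * mean_p
    return a, b


# Suppose the evolved model is x ** 2 while the data follow 3 * x ** 2 + 2.
xs = [0, 1, 2, 3]
raw = [x ** 2 for x in xs]
targets = [3 * x ** 2 + 2 for x in xs]
a, b = linear_scaling(raw, targets)
print(a, b)
```

Here the raw model has a large error against the targets, but after scaling it fits them exactly, so an individual with the correct structure is no longer penalized for missing constants.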

Requirements

  • Python 3.5 or later
  • DEAP, which should be installed automatically when you install geppy if you don't already have it.
  • [optional] To visualize the expression tree using the geppy.export_expression_tree method, you need the graphviz module.
  • [optional] Since GEP/GP doesn't simplify expressions during evolution, the final result may contain many redundancies and the tree can be very large, like x + 5 * (2 * x - x - x) - 1, which is simply x - 1. You may want to simplify the final model evolved by GEP with symbolic computation to understand it better. The corresponding geppy.simplify method depends on the sympy package.
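
The redundancy in the example above comes from the term 2 * x - x - x, which is always 0, so the factor of 5 contributes nothing. The short check below confirms this numerically in plain Python. It is only a sanity check on sample points, not a symbolic simplification, and it uses neither geppy.simplify nor sympy; the function names are hypothetical.

```python
def evolved(x):
    # The redundant expression quoted above.
    return x + 5 * (2 * x - x - x) - 1


def simplified(x):
    # The claimed simplified form.
    return x - 1


# Integer inputs keep the arithmetic exact, so == is a safe comparison here.
samples = range(-5, 6)
print(all(evolved(x) == simplified(x) for x in samples))
```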

Reference

The bible of GEP is definitely C. Ferreira's monograph: Ferreira, C. (2006). Gene expression programming: mathematical modeling by an artificial intelligence (Vol. 21). Springer.

You can also get a lot of papers/documents by Googling 'gene expression programming'.

[1] Ferreira, C. (2001). Gene Expression Programming: a New Adaptive Algorithm for Solving Problems. Complex Systems, 13.

[2] Zhong, J., Feng, L., & Ong, Y. S. (2017). Gene expression programming: a survey. IEEE Computational Intelligence Magazine, 12(3), 54-72.
