
Rework the optimizing model pages #235

Open · 3 of 8 tasks
lesquoyb opened this issue Sep 13, 2024 · 5 comments

lesquoyb commented Sep 13, 2024

Is your request related to a problem? Please describe.

I think the pages about optimizing models are a bit incomplete, and now that we have better tools to assess the efficiency of a model, we could add more information. Here is what I have thought of so far:

  • check that everything said on the pages still works, obviously
  • add a section about the benchmark statement to explain how users can benchmark their models themselves
  • talk about the few optimization parameters of GAMA
  • document the new concatenate function for efficiently building big strings
  • measure the difference in performance between having a reflex in the agent and iterating over the agents to call an action
  • compare the performance of the different ways to interact with a container in basic scenarios
  • explain how to set the RAM usage in the ini file and/or in the headless command, as well as the heap workaround (see this issue)
  • mention that headless mode is far more efficient than the GUI, and give general knowledge about what impacts execution time (losing focus on the window, doing something else in parallel; I can't think of other things right now but there certainly are more)
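As a starting point for the section on the benchmark statement, here is a minimal sketch of its use (the message and repeat facets follow the examples used in this thread; my_species and do_something are hypothetical placeholders):

// minimal sketch of the benchmark statement: runs the enclosed block
// `repeat` times and reports min / max / average execution times
// `my_species` and `do_something` are hypothetical placeholders
reflex measure {
	benchmark "calling do_something on every agent" repeat: 10 {
		ask my_species {
			do do_something;
		}
	}
}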

Additional context
If you can think of anything else related to performance (options/operators/tricks that may have an impact, interesting scenarios to compare, etc.), please tell me and I'll try to add it too.

@lesquoyb

I'm going to post the experiments I run, as well as their results, here so you can all see them and comment if I forgot something obvious.
The first one compares actions vs. reflexes: an ask loop that calls an action on each agent, vs. a reflex in each agent calling the same action, vs. a reflex containing the same code as the action:

model actionsvsreflexes

global {
	int size <- 100;
}

species a {
	
	action do_random_things {
		let m <- matrix_with({size,size}, rnd(0,255));
		m <- shuffle(m);
		loop i from:0 to:size-1 {
			loop j from:0 to:size-1 {
				m[i,j] <- rnd(255);
			}
		}
	}
}

species b parent:a {
	reflex call_do_random_things{
		do do_random_things;
	}
}

species c {
	reflex do_random_things_rewritten_in_a_reflex{
		let m <- matrix_with({size,size}, rnd(0,255));
		m <- shuffle(m);
		loop i from:0 to:size-1 {
			loop j from:0 to:size-1 {
				m[i,j] <- rnd(255);
			}
		}
	}
}

experiment e {
	
	int nb_agents <- 1000;
	
	init {
		ask simulation{
			create a number:myself.nb_agents;			
		}
		create simulation{
			create b number:myself.nb_agents;
		}
		
		create simulation{
			create c number:myself.nb_agents;
		}
	}	
	
	// redefining _step_ lets the experiment control how each simulation
	// is stepped, so each step can be wrapped in a benchmark
	action _step_ {
		ask simulations{
			if length(a) >0 {
				benchmark "Manually call actions" repeat:10 {
					ask a {
						do do_random_things;
					}
				}				
			}
			else if length(b) >  0 {
				benchmark "Reflexes call actions" repeat:10 {
						do _step_;	
				}		
			}
			else {
				benchmark "Reflexes do the action" repeat:10 {
						do _step_;	
				}				
			}
		}
	}
}

Here are the results:

Manually call actions (over 10 iteration(s)): min = 4089.0007 ms (iteration #0) | max = 4134.1196 ms (iteration #9) | average = 4108.250980000001ms
Reflexes call actions (over 10 iteration(s)): min = 4096.4389 ms (iteration #7) | max = 4125.0502 ms (iteration #1) | average = 4106.92661ms
Reflexes do the action (over 10 iteration(s)): min = 4138.6298 ms (iteration #9) | max = 4182.6288 ms (iteration #6) | average = 4151.716270000001ms

Not much difference between the three methods. I have already run other experiments and noticed that the operators on lists can be more efficient than basic loops or ask, so it would be interesting to test a "foreach"-kind of operator to see whether that is also the case here.

lesquoyb commented Sep 13, 2024

Here is a model to test the best way to build a list out of properties of agents.
I tested loop over vs. loop from/to vs. loop times vs. ask vs. collect, with this model:


model accessinglistitems

global{
	int nb_agents <- 1000000;	
}

species b{
	
	int v;
	
	init {
		v <- rnd(0, 10);
	}
}

experiment e {


	reflex fill_list_from_agents {
		
		ask b{
			do die;
		}
		create b number:nb_agents;
		
		list<int> l1 <- [];
		benchmark "fill list loop over" repeat:10{
			loop obj over:b{
				l1 <+ obj.v;
			}
		}
		
		list<int> l2 <- [];
		benchmark "fill list loop from to" repeat:10{
			int to <- length(b)-1;
			loop i from:0 to:to{
				l2 <+ b[i].v;
			}
		}
		
		list<int> l3 <- [];
		benchmark "fill list loop times" repeat:10{
			int to <- length(b)-1;
			int i <- 0;
			loop times:to{
				l3 <+ b[i].v;
				i <- i + 1;
			}
		}
		
		list<int> l4 <- [];
		benchmark "fill list ask"  repeat:10{
			ask b{
				l4 <+ v;
			}
		}
		
		list<int> l5 <- [];
		benchmark "fill list collect" repeat:10{
			l5 <- b collect (each.v);
		}
	}
}

Which gave me these results:

fill list loop over (over 10 iteration(s)): min = 246.7017 ms (iteration #4) | max = 524.0814 ms (iteration #0) | average = 288.12743ms
fill list loop from to (over 10 iteration(s)): min = 469.3644 ms (iteration #9) | max = 555.4155 ms (iteration #2) | average = 486.78100000000006ms
fill list loop times (over 10 iteration(s)): min = 634.147 ms (iteration #1) | max = 658.3801 ms (iteration #0) | average = 647.32994ms
fill list ask (over 10 iteration(s)): min = 227.0711 ms (iteration #4) | max = 297.658 ms (iteration #2) | average = 246.10171ms
fill list collect (over 10 iteration(s)): min = 97.022 ms (iteration #8) | max = 102.0031 ms (iteration #0) | average = 98.54660000000003ms

We can see that ask and loop over have basically the same execution time, with a slight advantage for ask. Surprisingly, loop from/to is far behind: it was expected to perform worse, but not by that much. loop times is even worse, which again was expected, but not at that level either.
Finally, the best way was also the expected one: the collect operator, as it spares us the list concatenation in GAML.

What this teaches us is that there may be hidden (undocumented) behavioural differences between the different loops, and/or discrepancies in the implementations of the different syntaxes, that could bring performance gains if homogenized.

lesquoyb commented Sep 13, 2024

The same experiment, but with a simple list of ints instead of a list of agents, yields similar results:

model accessinglistitems

global{
	int nb_agents <- 1000000;	
}

species b{
	
	int v;
	
	init {
		v <- rnd(0, 10);
	}
}

experiment e {

	reflex fill_list_from_other_list{
		
		list<int> base <- [];
		loop times:nb_agents{
			base <+ rnd(0,10);
		}
		list<int> l1 <- [];
		benchmark "fill list loop over" repeat:10{
			loop obj over:base{
				l1 <+ obj;
			}
		}
		
		list<int> l2 <- [];
		benchmark "fill list loop from to" repeat:10{
			int to <- length(base)-1;
			loop i from:0 to:to{
				l2 <+ base[i];
			}
		}
		
		list<int> l3 <- [];
		benchmark "fill list loop times" repeat:10{
			int to <- length(base)-1;
			int i <- 0;
			loop times:to{
				l3 <+ base[i];
				i <- i + 1;
			}
		}
		
		list<int> l5 <- [];
		benchmark "fill list collect" repeat:10{
			l5 <- base collect (each);
		}
		
	}
	
}
fill list loop over (over 10 iteration(s)): min = 137.9432 ms (iteration #8) | max = 230.8834 ms (iteration #1) | average = 159.16049000000004ms
fill list loop from to (over 10 iteration(s)): min = 356.9075 ms (iteration #3) | max = 631.9033 ms (iteration #5) | average = 417.74827000000005ms
fill list loop times (over 10 iteration(s)): min = 498.8026 ms (iteration #9) | max = 649.7151 ms (iteration #0) | average = 535.07749ms
fill list collect (over 10 iteration(s)): min = 17.6604 ms (iteration #7) | max = 20.3564 ms (iteration #4) | average = 18.880190000000002ms

Once again collect is impressively faster than the rest; then comes loop over, then, far behind, loop from/to, and, even slower, loop times.

As this test was run with the same parameters and on the same computer as the previous one, we can also conclude that gathering data is much faster once it is stored in a plain list than extracting it from a list of agents (collect here being more than 5 times faster than on the list of agents).
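A practical consequence, sketched below under the assumption that the agents' v attribute does not change between uses: extract the attribute once with collect, then work on the plain list.

// hypothetical sketch: cache the agents' attribute in a plain list once...
list<int> cached <- b collect (each.v);
// ...then reuse the list, which the measurements above suggest is much
// faster than re-reading the attribute through the agents every time
int total <- sum(cached);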

@lesquoyb

Additionally, I tested the two examples of list optimization that were already provided in Optimizing models.
Here is the model to optimize one_of:

/**
* Name: oneofvsshufflefirst
* Based on the internal empty template. 
* Author: baptiste
* Tags: 
*/


model oneofvsshufflefirst




species s {
	
	int property;
	
	init {
		property <- rnd(0,10);
	}
	
}
experiment a {
	
	
	reflex test {
		int list_size <- 100000;
		int nb_repeat <- 1000;
		
		list l <- list_with(list_size, rnd(0,10));
		create s number:list_size;

		benchmark "list with one_of" repeat:nb_repeat{
			int n <- one_of(l where (each > 4));
		}
		benchmark "list with shuffle" repeat:nb_repeat{
			int n <- shuffle(l) first_with(each > 4);
		}
		
		benchmark "species with one_of" repeat:nb_repeat{
			s n <- one_of(s where (each.property > 4));
		}
		benchmark "species with shuffle" repeat:nb_repeat{
			s n <- shuffle(s) first_with(each.property > 4);
		}
		
	}
}

On my computer it yields these results:

list with one_of (over 1000 iteration(s)): min = 2.2603 ms (iteration #725) | max = 22.3685 ms (iteration #266) | average = 2.3916148000000015ms
list with shuffle (over 1000 iteration(s)): min = 1.2011 ms (iteration #543) | max = 8.9432 ms (iteration #0) | average = 1.260572500000002ms
species with one_of (over 1000 iteration(s)): min = 18.7069 ms (iteration #557) | max = 45.8503 ms (iteration #25) | average = 26.117754300000037ms
species with shuffle (over 1000 iteration(s)): min = 1.5321 ms (iteration #881) | max = 15.5877 ms (iteration #934) | average = 2.048304ms

And here is the model to optimize length on a filtered list:

/**
* Name: lengthvscount
* Based on the internal empty template. 
* Author: baptiste
* Tags: 
*/


model lengthvscount



species s {
	
	int property;
	
	init {
		property <- rnd(0,10);
	}
	
}
experiment a {
	
	
	reflex test {
		int list_size <- 100000;
		int nb_repeat <- 1000;
		
		list l <- list_with(list_size, rnd(0,10));
		create s number:list_size;

		benchmark "list with length" repeat:nb_repeat{
			int n <- length(l where (each > 4));
		}
		benchmark "list with count" repeat:nb_repeat{
			int n <- count(l, (each > 4));
		}
		
		benchmark "species with length" repeat:nb_repeat{
			int n <- length(s where (each.property > 4));
		}
		benchmark "species with count" repeat:nb_repeat{
			int n <- count(s, (each.property > 4));
		}
		
	}
}

Yielding these results:

list with length (over 1000 iteration(s)): min = 2.3611 ms (iteration #142) | max = 24.9621 ms (iteration #957) | average = 2.7138886999999974ms
list with count (over 1000 iteration(s)): min = 2.5347 ms (iteration #633) | max = 22.0283 ms (iteration #781) | average = 2.8719582999999997ms
species with length (over 1000 iteration(s)): min = 21.631 ms (iteration #511) | max = 116.4879 ms (iteration #855) | average = 27.276894799999972ms
species with count (over 1000 iteration(s)): min = 17.6016 ms (iteration #444) | max = 25.8563 ms (iteration #493) | average = 18.519500100000002ms

Once again we can observe that, in general, it is much more efficient to use lists of simple types (in this case int) than to select properties inside agents.
Another noticeable result is that optimizations that work on lists of agents may not work on lists of simple items, as is the case in this second example, where the "optimized" version is on par with the simple version, and on average actually a little slower.
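Summarizing the two comparisons as idioms (a sketch reusing the variables from the models above; note that for lists of simple values the length/count forms were measured as roughly equivalent):

// picking a random element satisfying a condition: on agents,
// shuffle + first_with was measured much faster than one_of + where
s picked <- shuffle(s) first_with (each.property > 4);

// counting elements satisfying a condition: count avoids building
// the intermediate filtered list that length(... where ...) creates
int n <- count(s, (each.property > 4));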

@tnguyenh

This is very slow and should not be:

/**
* Name: GrayScott
*
* This model simulates patterns emergence with Gray-Scott model. 
*
* Author: Tri Nguyen-Huu
* Tags: 
*/


model GrayScott

global torus: false{
	
	float ru <- 0.5;
	float rv <- 0.25;
	float f <- 0.04;
	float k <- 0.06;
	string color_palette_name <- "RdBu";
	float date_debut ;
	
	int neighborhood_type <- 8;
	
	float step <-1.0;
	int env_size <- 50;
	int grid_size <- 100;
	int nb_seeds <- 12;

	list<string>  palettes_list <- brewer_palettes(2);
	list<rgb> palette <- brewer_colors (color_palette_name);
	int nb_colors <- length(palette);
	geometry shape <- square(env_size) ;
	
	init{
		ask nb_seeds among env{
			self.v <- 1.0;
			ask self.neighbors {
				self.v <- 1.0;
			}
		}
	}
	
	reflex update {
		ask env{
			old_u <- u;
			old_v <- v;
		}
	}
	
	reflex debut when: cycle = 0{
		date_debut <- gama.machine_time;
	}
	
	
	reflex fin when: cycle > 20{
		write (gama.machine_time - date_debut)/1000;
		do die;
	}
}


grid env width: grid_size height: grid_size use_regular_agents: false neighbors: neighborhood_type parallel:true{	
	float u <- 1.0;
	float v <- 0.0;
	rgb color <- pal(u);	
	float old_u;
	float old_v;
	float laplacian_u;
	float laplacian_v;
	int nb_neighbors <- length(self.neighbors);
	
	reflex update{
		if neighborhood_type=4{
			laplacian_u <- sum(self.neighbors collect (each.old_u-self.old_u));
			laplacian_v <- sum(self.neighbors collect (each.old_v-self.old_v));
		}else{
			laplacian_u <- sum(self.neighbors collect (self.w(each)*(each.old_u-self.old_u)));
			laplacian_v <- sum(self.neighbors collect (self.w(each)*(each.old_v-self.old_v)));
		}
		
		u <- old_u + (ru*laplacian_u - old_u*old_v^2 + f*(1-old_u))*step;
		v <- old_v + (rv*laplacian_v + old_u*old_v^2 -(k+f)*old_v)*step;
		color <- pal(u);
	}
	
	rgb pal(float x){
		return palette[max(0,min(nb_colors-1,int(floor(x*nb_colors))))];
	}
	
	float w(env e){
		if even(e.grid_x-self.grid_x+e.grid_y-self.grid_y){
			return 0.05;
		}else{
			return 0.2;
		}
	}
	
}

experiment Gray_Scott type: gui autorun: true {
	text "Change here the reaction and diffusion parameters. u should diffuse faster than v."  category: "Reaction-Diffusion";
	parameter 'Diffusion u' var: ru  min: 0.01 max: 1.0 step: 0.01 category: "Reaction-Diffusion";
	parameter 'Diffusion v' var: rv  min: 0.01 max: 1.0 step: 0.01 category: "Reaction-Diffusion";
	parameter 'Feed rate' var: f  min: 0.0001 max: 0.1 step: 0.0001 category: "Reaction-Diffusion";
	parameter 'Kill rate' var: k  min: 0.01 max: 0.08 step: 0.0001 category: "Reaction-Diffusion";
	parameter 'Neighbors number' var: neighborhood_type <- 8 among:[4,8] category: "Topology";
	parameter 'Grid size' var: grid_size <- 100 min:10 max: 200 category: "Initialization";
	parameter 'Seeds number' var: nb_seeds min: 1 max: 50 category: "Initialization";
	parameter 'Color palette' var: color_palette_name among: brewer_palettes(2) category: "Color" 
				on_change: {palette <- brewer_colors (color_palette_name);nb_colors <- length(palette);};
	
	output {
		display Patterns type: 2d axes:false refresh: every(1#cycle) autosave: every(10#cycle) 
		background: #black toolbar: false fullscreen: false{ 
			grid env border: #black;
		}
	}

}

I wrote the same model in MATLAB (using GPU optimization) and it runs 1000x faster at a higher resolution (1000x1000 instead of 100x100).
