(Re)consider more condensed link events? #1228

Open
kainagel opened this issue Nov 11, 2020 · 4 comments
Labels
enhancement performance performance-related issues

Comments

@kainagel
Member

As we know, processing link enter and link leave events uses up a lot of hardware resources. These events are used in the following places:

  • visualization
  • link travel time computation
  • emissions

For visualization and possibly for emissions, it would in many situations be sufficient to output the corresponding events only at the end, or every 100 iterations. The following therefore concentrates on link travel time computation.

Concerning link travel time computation, one could think about the following accelerations:

  1. only report link leave events, which contain the link enter information implicitly
  2. aggregate link travel times already at the link level and send them only at the end of the time bin

The path to this would be, I think, a switch to an events format that has something like cnt="1" for the current processing, and something like "tTimesSum" or "tTimeMean" plus some second moment, e.g. "tTimeVariance".
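To make the aggregated format concrete, here is a minimal sketch of a per-link, per-time-bin accumulator. The class name `LinkTimeBinStats` and its methods are hypothetical and not part of MATSim; only the attribute names `cnt` and `tTimesSum` are taken from the proposal above. A single observation is simply an aggregate with cnt = 1, so both cases can share one format.

```java
// Hypothetical sketch, not MATSim API: aggregates link travel times for one
// link and one time bin so that only cnt, sum and a second moment are reported.
final class LinkTimeBinStats {
    private long cnt;            // number of vehicles that left the link in this bin
    private double tTimesSum;    // sum of observed travel times in seconds
    private double tTimesSqSum;  // sum of squared travel times, used for the variance

    // Record one vehicle's travel time (equivalent to an event with cnt="1").
    void add(double travelTime) {
        cnt++;
        tTimesSum += travelTime;
        tTimesSqSum += travelTime * travelTime;
    }

    // Combine two aggregates, e.g. when merging partial results.
    void merge(LinkTimeBinStats other) {
        cnt += other.cnt;
        tTimesSum += other.tTimesSum;
        tTimesSqSum += other.tTimesSqSum;
    }

    long count() {
        return cnt;
    }

    // Mean travel time; NaN if nothing was observed.
    double mean() {
        return cnt == 0 ? Double.NaN : tTimesSum / cnt;
    }

    // Population variance from the raw moments; clamped at 0 against rounding.
    double variance() {
        if (cnt == 0) {
            return Double.NaN;
        }
        double m = mean();
        return Math.max(0.0, tTimesSqSum / cnt - m * m);
    }
}
```

Storing the sum and the sum of squares keeps the aggregate mergeable, at the cost of some numerical precision compared with an incremental variance formula.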

@mrieser mrieser added the performance performance-related issues label Nov 11, 2020
@mrieser
Contributor

mrieser commented Nov 11, 2020

I like the thinking on how we could optimize the amount of data MATSim generates and processes. Just some thinking from my side:

  1. In regular traffic, having only linkEnter or linkLeave would be enough, at least as long as we don't have any special intersection logic that also uses up travel time.
  2. Having only linkLeave event seems natural, as it can report on what has happened and how long it took
  3. Having no linkEnter event makes analysis of congested scenarios more difficult, as you cannot observe from the events on which links vehicles get stuck. You only get the information which link they last successfully passed. In some special cases, this might be a minor hindrance. (Hint: the next version of Via will include a "stuck agent analysis" to help debug scenarios where a high number of agents do not reach the end of their plan. Knowing on what link an agent "disappears" is useful information in this analysis.)
  4. Maybe replace linkEnter and linkLeave with a new CrossedIntersection event that contains fromLink and toLink? Then, we would not lose any information. But it would require major changes, as linkLeave events are broadly used in analysis code.
  5. Outputting certain events only every Nth iteration sounds nice from a performance point, but basically renders them unusable. Essentially, an analysis cannot rely on these events being around, and so the events will become less and less important. So why include them at all?
  6. "aggregating link travel times at the link level" essentially means, it becomes a job of QSim. And JDEQSim. And HERMES. And would need to be supported by QLinks, and QLanes, and probably other implementations. Basically, we would no longer think of "travel times" as something observed from events, but something provided from the mobsim implementation. I don't say this is bad, but it sounds like quite a different paradigm than what we did the last 20 years.
  7. I can imagine that processing lots of LinkEnterEvents that are mostly ignored in the events handling adds significant overhead (memory allocation, distribution to the different threads handling the events). But how much overhead is the actual calculation of the link travel times? With the help of the indexed Ids this should now be fairly efficient (or, if it is not yet, it should be possible to implement it fairly efficiently), assuming that some kind of event (either only LinkLeaveEvents or some CrossedIntersectionEvent) is still generated and processed, as other event handlers require this information as well.
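As an illustration of items 1 and 2, link travel times can be derived from link leave events alone if leaving one link is assumed to coincide with entering the next, i.e. no intersection logic consumes time. The sketch below is a hypothetical stand-alone class, not MATSim's travel time calculator; plain strings stand in for link and vehicle ids, and the handler method names are assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not MATSim API: travel times from link leave events only.
// Assumption: a vehicle enters its next link at the moment it leaves the previous one.
final class LeaveOnlyTravelTimes {
    // Per vehicle: time of its last link leave, taken as enter time of the current link.
    private final Map<String, Double> enterTimeByVehicle = new HashMap<>();
    // Per link: [0] = sum of travel times, [1] = number of observations.
    private final Map<String, double[]> sumAndCountByLink = new HashMap<>();

    void handleLinkLeave(double time, String vehicleId, String linkId) {
        // Store this leave time as the enter time of the next link and
        // fetch the previously stored enter time of the link just left.
        Double entered = enterTimeByVehicle.put(vehicleId, time);
        if (entered == null) {
            // First link of a trip: the vehicle did not traverse it fully, so skip it.
            return;
        }
        double[] acc = sumAndCountByLink.computeIfAbsent(linkId, k -> new double[2]);
        acc[0] += time - entered;
        acc[1] += 1;
    }

    // Forget the vehicle when it leaves traffic, so a later trip starts fresh.
    void handleLeavesTraffic(String vehicleId) {
        enterTimeByVehicle.remove(vehicleId);
    }

    // Mean observed travel time on a link; NaN if there is no observation.
    double meanTravelTime(String linkId) {
        double[] acc = sumAndCountByLink.get(linkId);
        return acc == null ? Double.NaN : acc[0] / acc[1];
    }
}
```

The sketch also shows the limitation from item 3: a vehicle that gets stuck on a link never produces a leave event for it, so that link receives no observation at all.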

@kainagel
Member Author

@mrieser thanks. I took the liberty to number your items. Just some clarification for some of them:

Re 3.: The stuck event could report the link enter time. Also the vehicleLeavesTraffic event.

Re 5.: We already have the option to write events to file only every Nth iteration. I was essentially thinking of not generating link enter/leave events in iterations where we do not write them to file anyway.

Re 6.: Yes, I completely agree.

And I think that Marcel took this in the right stride: these are just some thoughts I had during a seminar talk; nothing to immediately act upon. We should probably close the ticket eventually and then just keep it for reference.

@mrieser
Contributor

mrieser commented Nov 11, 2020

Re 5.: Not generating these events when the events are not written to files means that any analysis running in parallel to MATSim cannot use these events (e.g. the traveltime-calculator). Only post-processing analysis code would be able to use those events. Do we want such a divide between events, or analysis-code respectively?

@neuma
Contributor

neuma commented Dec 17, 2020

Personally, I consider the events file to be a dump of all the events created at some point during an iteration. It should be the most complete view of what happened. So any dump (including intermediate ones) should contain all events thrown at some point.

I like the reasoning of 1-4. Squashing those two event types can make a difference, especially with the increasingly detailed networks in use. So we would have
a) a new event with fromLink, toLink, timeLeftFromLink, and timeEnteredToLink that replaces LinkEnter and LinkLeave
b) an events reader that translates back into two separate leave/enter events for backwards compatibility?
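
A rough sketch of what a) and b) could look like follows. All type names here (`CondensedLinkEvents`, `LinkCrossing`, `LegacyEvent`) are hypothetical; the field names follow the list above, and the type strings "left link" and "entered link" are assumed to match the existing events file format and should be checked against it.

```java
import java.util.List;

// Hypothetical sketch, not MATSim API: one condensed event per link transition,
// plus a translation back into separate leave/enter events for old analysis code.
final class CondensedLinkEvents {
    // a) The condensed event replacing LinkLeave and LinkEnter.
    record LinkCrossing(String vehicleId, String fromLink, String toLink,
                        double timeLeftFromLink, double timeEnteredToLink) {}

    // Simplified stand-in for the two existing event types.
    record LegacyEvent(double time, String type, String vehicleId, String linkId) {}

    // b) What a backwards-compatible reader would emit for each condensed event.
    static List<LegacyEvent> expand(LinkCrossing c) {
        return List.of(
                new LegacyEvent(c.timeLeftFromLink(), "left link", c.vehicleId(), c.fromLink()),
                new LegacyEvent(c.timeEnteredToLink(), "entered link", c.vehicleId(), c.toLink()));
    }
}
```

Keeping both timestamps leaves room for intersection logic that consumes time; if the two times were always equal, a single time field would be enough.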

Regarding 5: If we allow throwing an event to be optional, I guess we would require some registry where each consumer of an event type, e.g. some on-the-fly analysis, is forced to explicitly name the required event types. The producers could then check the registry and throw only the required event types. However, I don't see how this would work with more generic event types like LinkEnter/Leave. An analysis interested in e.g. only pt vehicles would still register for all LinkEnter events, including those related to cars. We may gain some performance by dropping more specialized event types that belong to a contrib. But then, not using the contrib would probably also not produce those event types in the first place.
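
The registry idea could be sketched as follows. This is a minimal illustration under the assumption that consumers register by event class; `EventTypeRegistry` and the placeholder event classes are hypothetical and do not exist in MATSim.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch, not MATSim API: consumers declare the event types they
// need, and producers ask the registry before creating an event.
final class EventTypeRegistry {
    // Placeholder event types for illustration only.
    static final class LinkEnter {}
    static final class LinkLeave {}

    private final Set<Class<?>> required = new HashSet<>();

    // Called by each consumer, e.g. an on-the-fly analysis, during setup.
    void require(Class<?> eventType) {
        required.add(eventType);
    }

    // Called by the producer; events of types nobody asked for can be skipped.
    boolean isRequired(Class<?> eventType) {
        return required.contains(eventType);
    }
}
```

Because registration is per event type, a pt-only analysis that calls `require(LinkEnter.class)` makes `isRequired` return true for every LinkEnter event, including those of cars, which is exactly the limitation described above.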
