Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pathfinding bug (binary domain / calc_longest_path) #12

Open
smearle opened this issue Mar 29, 2022 · 3 comments
Open

Pathfinding bug (binary domain / calc_longest_path) #12

smearle opened this issue Mar 29, 2022 · 3 comments

Comments

@smearle
Copy link
Collaborator

smearle commented Mar 29, 2022

The following code returns a sub-optimal path:

import gym
import numpy as np

import gym_pcgrl
from gym_pcgrl.envs.helper import get_string_map


maze = np.array([[0, 0],
                 [0, 0],
                 [0, 1],])

env_name = "binary-narrow-v0"
env = gym.make(env_name)
stats = env.unwrapped._prob.get_stats(
    get_string_map(maze, env.unwrapped._prob.get_tile_types()),
)
print(stats)

This prints {'regions': 1, 'path-length': 2}, but the maximum path length is 3 (from bottom left to top right).

In calc_longest_path, the empty tile (0, 0) is chosen as the source of a call to run_dijkstra. This returns the following dijkstra map:

0 1
1 2
2 -1

. The extremities relative to this tile are tiles (1, 1) and (2, 0), which are both 2 away from (0, 0). But (1, 1) is chosen as the source of the next call to dijkstra, which results in:

2 1
1 0
2 -1

, giving a path-length of 2.

If instead (2, 0) was chosen as the source of the second call to dijkstra, we would have:

2 3
1 2
0 -1

, resulting in a path-length of 3.

One obvious fix would be to call dijkstra from each of the extremities that results from the first search, then record the maximum path-length over these. Is there any faster way?

@smearle smearle changed the title Pathfinding bug Pathfinding bug (binary domain / calc_longest_path) Mar 29, 2022
@amidos2006
Copy link
Owner

amidos2006 commented Mar 29, 2022

Yup, that issue was intentional since the beginning of the PCGRL and it is well known. Getting the longest shortest path is an np problem and to solve it accurately you need to do dikjstra n^2 times which is extremely slow leading to O(n^4).

The way we calculate is by running dijkstra twice leading to a good approximation with O(n^2). The approximation is consistent and returns good relative values which were what is important for the framework not to get accurate results but to get consistent results that are relatively good (when you compare two maps, I can approximately tell you which one is better).

A higher accuracy solution is what you have suggested. The second run of dijkstra is to be repeated k times where k maximum will be probably n making the algorithm O(n^3) in the worst case. The k times equal to all the cells with the largest value after running the first dijkstra.

In the end, the function was not designed to return the perfect result but to return fast results for RL so O(n^2) is better than O(n^3) in the case of RL and the difference won't be as noticeable from RL perspective to learn because it usually differs in a small amount like the 3 and 2 example that you have shown.

@smearle
Copy link
Collaborator Author

smearle commented Mar 31, 2022

Thanks so much for the detailed explanation! I missed the memo, clearly, but nonetheless an interesting problem to think about. The current approximation sounds like a reasonable tradeoff.

Perhaps we should say "Calculate the approximate longest path on the map" for the sake of future naive coders like myself 🥲 .

@amidos2006
Copy link
Owner

Sorry about all of that, It is my fault not to comment that :) Let's add that comment :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants