The goal of this project is to advance research in maritime search and rescue missions using Reinforcement Learning techniques. The software provides researchers with two distinct environments: one simulates shipwrecked people drifting with maritime currents, creating a stochastic setting for training and evaluating autonomous agents; the other features a realistic particle simulation for mapping and optimizing search area coverage by autonomous agents.
Both environments adhere to open-source standards and offer extensive customization options, allowing users to tailor them to specific research needs. These tools enable Reinforcement Learning agents to learn efficient policies for locating shipwrecked individuals or maximizing search area coverage, thereby enhancing the effectiveness of maritime rescue operations.
Maritime navigation plays a crucial role across various domains, including leisure activities and commercial fishing. Maritime transportation is particularly significant, as it accounts for 80% to 90% of global trade (
To locate survivors efficiently, traditional search and rescue (SAR) operations have utilized path planning algorithms such as parallel sweep, expanding square, and sector searches (
The two primary metrics for evaluating an efficient search are coverage rate and time to detection. Coverage rate is the proportion of the search area covered by the search units over a specific period; higher coverage rates typically indicate more effective search strategies. Time to detection is the time taken from the start of the search operation to the successful detection of the target; minimizing this time is often a critical objective in SAR missions.
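As an illustration of how these two metrics can be computed, the minimal sketch below derives both from a log of visited grid cells. The grid size, the log format, and the helper names are assumptions made for this example, not part of the library's API.

```python
# Minimal sketch: coverage rate and time to detection from a search log.
# The log format and all names here are illustrative only.

def coverage_rate(visited_cells: list[tuple[int, int]], grid_size: int) -> float:
    """Proportion of distinct grid cells covered by the search units."""
    return len(set(visited_cells)) / (grid_size * grid_size)

def time_to_detection(visited_cells: list[tuple[int, int]],
                      target_cell: tuple[int, int],
                      seconds_per_step: float) -> float | None:
    """Elapsed time until the target's cell is first searched, or None."""
    for step, cell in enumerate(visited_cells):
        if cell == target_cell:
            return step * seconds_per_step
    return None  # target never detected during the episode

# Example: a 5x5 grid searched for 6 steps, target at cell (2, 2).
log = [(0, 0), (0, 1), (1, 1), (2, 1), (2, 2), (3, 2)]
print(coverage_rate(log, grid_size=5))       # 0.24
print(time_to_detection(log, (2, 2), 30.0))  # 120.0 seconds
```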
Expanding on the state-of-the-art research presented by Ai et al. (
In order to contribute to research on the effectiveness of integrating RL techniques into SAR path planning, the Drone Swarm Search Environment (DSSE) was developed and is distributed as an open-source Python package.
Figure: Simulation environment showcasing the algorithm's execution.
The environment depicted in the figure above is the first of the two: it simulates a person-in-water (PIW) drifting with the maritime currents, providing a stochastic setting in which agents are rewarded for finding the moving target quickly.
The package also includes a second environment option. Similar to the first, this alternative setup is designed for training agents, but with key differences in its objectives and mechanics. Unlike the first environment, which rewards agents for speed in their searches, this second option rewards agents that cover the largest area without repetition. It accepts a trade-off by using a stationary probability matrix, but enhances the simulation with a more advanced Lagrangian particle model (
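To make the described reward trade-off concrete, the sketch below builds a stationary probability matrix from simulated particle positions and pays a drone only the first time it searches a cell. This is a simplified reading of the description above, not the package's actual reward function, and the particle scatter is a stand-in for a real Lagrangian drift model.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
GRID = 40

# Stand-in for the Lagrangian particle model: particle cell positions
# scattered around a last-known point, histogrammed into probabilities.
particles = rng.normal(loc=20, scale=4, size=(10_000, 2)).astype(int)
particles = np.clip(particles, 0, GRID - 1)
prob_matrix, _, _ = np.histogram2d(
    particles[:, 0], particles[:, 1], bins=GRID, range=[[0, GRID], [0, GRID]]
)
prob_matrix /= prob_matrix.sum()  # stationary: computed once, never updated

visited = np.zeros((GRID, GRID), dtype=bool)

def coverage_reward(cell: tuple[int, int]) -> float:
    """Probability mass of a newly searched cell; zero on revisits."""
    x, y = cell
    if visited[x, y]:
        return 0.0  # repetition earns nothing, pushing agents to spread out
    visited[x, y] = True
    return float(prob_matrix[x, y])
```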
Using this environment, any researcher or practitioner can write code and execute an agent's training, such as in the source code presented below.
```python
from DSSE import DroneSwarmSearch

env = DroneSwarmSearch()

observations, info = env.reset()

rewards = 0
done = False
while not done:
    # `policy` is a user-defined function mapping the current observations
    # and the list of active agents to one action per agent.
    actions = policy(observations, env.get_agents())
    observations, rewards, terminations, truncations, infos = env.step(actions)
    # The episode ends once any agent terminates or is truncated.
    done = any(terminations.values()) or any(truncations.values())
```
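The `policy` function above is left to the user. As a starting point, the sketch below implements a random policy; it assumes the environment exposes a PettingZoo-style parallel API in which each agent has its own discrete action space, so if the installed version differs, the sampling call would need to be adapted. Defined before the loop above, this is enough to exercise the full environment cycle.

```python
# A minimal random policy, assuming a PettingZoo-style parallel API.
def policy(observations, agents):
    return {agent: env.action_space(agent).sample() for agent in agents}
```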
The grid is divided into square cells, each representing a quadrant with sides measuring 130 meters in the real world. This correspondence with real-world dimensions is crucial for developing agents capable of learning from realistic motion patterns. The drones, which are controlled by RL algorithms, serve as the agents. During the environment's instantiation, users define the drones' nominal speeds. These drones can move both orthogonally and diagonally across the grid, and they are equipped to search each cell for the presence of the PIW.
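To make the movement model concrete, the sketch below enumerates the eight orthogonal and diagonal moves plus a search action, and converts a cell index into real-world meters using the 130 m cell side. The action numbering and helper names are hypothetical, not DSSE's actual encoding.

```python
CELL_SIDE_M = 130  # each grid cell maps to a 130 m x 130 m quadrant

# Hypothetical action encoding (DSSE's real numbering may differ):
# four orthogonal moves, four diagonal moves, and a "search cell" action.
MOVES = {
    0: (0, -1),   # up
    1: (0, 1),    # down
    2: (-1, 0),   # left
    3: (1, 0),    # right
    4: (-1, -1),  # up-left
    5: (1, -1),   # up-right
    6: (-1, 1),   # down-left
    7: (1, 1),    # down-right
    8: (0, 0),    # search the current cell
}

def apply_action(cell: tuple[int, int], action: int, grid_size: int) -> tuple[int, int]:
    """Move one cell orthogonally or diagonally, staying inside the grid."""
    dx, dy = MOVES[action]
    x = min(max(cell[0] + dx, 0), grid_size - 1)
    y = min(max(cell[1] + dy, 0), grid_size - 1)
    return x, y

def cell_to_meters(cell: tuple[int, int]) -> tuple[float, float]:
    """Real-world position (in meters) of a cell's center."""
    return ((cell[0] + 0.5) * CELL_SIDE_M, (cell[1] + 0.5) * CELL_SIDE_M)
```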
Several works have been developed over the past few years to define better algorithms for the search and rescue of shipwreck survivors (
This new library makes it possible to implement and evaluate new RL algorithms, such as Deep Q-Networks (DQN) (
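As an illustration of how such an algorithm could be wired up, the sketch below trains a small DQN with PyTorch on a single DSSE drone. The observation encoding (`encode`), the observation structure, the action count, and all hyperparameters are assumptions made for this example; adapting it to the actual observation format and to multiple agents is left to the reader.

```python
import random
from collections import deque

import torch
import torch.nn as nn

from DSSE import DroneSwarmSearch

OBS_DIM, N_ACTIONS = 4, 9  # assumed sizes; adapt to the real observation format

def encode(obs) -> torch.Tensor:
    """Placeholder: flatten one agent's observation into a fixed-size vector."""
    return torch.as_tensor(obs, dtype=torch.float32).flatten()[:OBS_DIM]

q_net = nn.Sequential(nn.Linear(OBS_DIM, 64), nn.ReLU(), nn.Linear(64, N_ACTIONS))
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
buffer: deque = deque(maxlen=10_000)
gamma, eps, batch_size = 0.99, 0.1, 64

env = DroneSwarmSearch()
for episode in range(100):
    observations, info = env.reset()
    agent = env.get_agents()[0]  # single-agent sketch for brevity
    state, done = encode(observations[agent]), False
    while not done:
        # Epsilon-greedy action selection over the Q-network's outputs.
        if random.random() < eps:
            action = random.randrange(N_ACTIONS)
        else:
            action = int(q_net(state).argmax())
        observations, rewards, terminations, truncations, infos = env.step({agent: action})
        done = any(terminations.values()) or any(truncations.values())
        next_state = encode(observations[agent]) if agent in observations else state
        buffer.append((state, action, float(rewards[agent]), next_state, done))
        state = next_state
        if len(buffer) >= batch_size:
            s, a, r, s2, d = zip(*random.sample(buffer, batch_size))
            s, s2 = torch.stack(s), torch.stack(s2)
            a, r = torch.tensor(a), torch.tensor(r)
            d = torch.tensor(d, dtype=torch.float32)
            # One-step TD target; a separate target network is omitted for brevity.
            with torch.no_grad():
                target = r + gamma * (1 - d) * q_net(s2).max(dim=1).values
            pred = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
            loss = nn.functional.mse_loss(pred, target)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```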