From 8655430cf00f59ed07dd490be5463d8ac55ecad0 Mon Sep 17 00:00:00 2001
From: Jeremiah <4462211+jeremiahpslewis@users.noreply.github.com>
Date: Mon, 23 Oct 2023 10:53:00 -0500
Subject: [PATCH] Apply suggestions from code review

---
 .../src/algorithms/offline_rl/CQL_SAC.jl    | 4 ++--
 .../src/algorithms/offline_rl/offline_rl.jl | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/ReinforcementLearningZoo/src/algorithms/offline_rl/CQL_SAC.jl b/src/ReinforcementLearningZoo/src/algorithms/offline_rl/CQL_SAC.jl
index 72d284ca2..a77e42fb7 100644
--- a/src/ReinforcementLearningZoo/src/algorithms/offline_rl/CQL_SAC.jl
+++ b/src/ReinforcementLearningZoo/src/algorithms/offline_rl/CQL_SAC.jl
@@ -14,11 +14,11 @@ export CQLSACPolicy
     )
 
 Implements the Conservative Q-Learning algorithm [1] in its continuous variant on top of the SAC algorithm [2]. `CQLSACPolicy` wraps a classic `SACPolicy` whose networks will be trained normally, except for the additional conservative loss.
 
-    CQLSACPolicy contains the additional hyperparameters that are specific to this method. α_cql is the lagrange penalty for the conservative_loss, it will be automatically tuned if ` α_cql_autotune = true`. `cons_weight` is a scaling parameter
+    `CQLSACPolicy` contains the additional hyperparameters that are specific to this method. `α_cql` is the Lagrange penalty for the `conservative_loss`; it will be automatically tuned if `α_cql_autotune = true`. `cons_weight` is a scaling parameter
     which may be necessary to decrease if the scale of the Q-values is large. `τ_cql` is the threshold of the lagrange conservative penalty. See SACPolicy for all the other hyperparameters related to SAC.
 
-    If desired, you can provide an `Experiment(agent, env, stop_condition, hook)` to finetune_experiment to finish the training with a finetuning run. `agent` should be a normal `Agent` with policy being `sac`, an environment to finetune on.
+    If desired, you can provide an `Experiment(agent, env, stop_condition, hook)` to `finetune_experiment` to finish the training with a finetuning run. `agent` should be a normal `Agent` whose policy is the wrapped `sac`, and `env` an environment to finetune on.
     See the example in ReinforcementLearningExperiments.jl for an example on the Pendulum task.
 
     As this is an offline algorithm, it must be wrapped in an `OfflineAgent` which will not update the trajectory as the training progresses. However, it _will_ interact with the supplied environment, which may be useful to record the progress.
diff --git a/src/ReinforcementLearningZoo/src/algorithms/offline_rl/offline_rl.jl b/src/ReinforcementLearningZoo/src/algorithms/offline_rl/offline_rl.jl
index 7ccc7d823..d3ba98a59 100644
--- a/src/ReinforcementLearningZoo/src/algorithms/offline_rl/offline_rl.jl
+++ b/src/ReinforcementLearningZoo/src/algorithms/offline_rl/offline_rl.jl
@@ -8,4 +8,4 @@ include("PLAS.jl")
 include("ope/ope.jl")
 include("common.jl")
 =#
-include("CQL_SAC.jl")
\ No newline at end of file
+include("CQL_SAC.jl")
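
For readers of this patch without the surrounding file, the sketch below illustrates the "conservative loss" the docstring refers to: pushing down Q-values of sampled candidate actions (via a log-sum-exp) relative to the Q-values of dataset actions, scaled by `cons_weight`. This is a minimal, self-contained illustration of the idea only, not the implementation in CQL_SAC.jl; the names `conservative_penalty`, `q_sampled`, and `q_data` are hypothetical.

    # Minimal sketch of a CQL-style conservative penalty (illustrative only).
    # q_sampled: Q-values of candidate actions per state (candidates × batch),
    #            analogous to sampling `action_sample_size` actions per state.
    # q_data:    Q-values of the actions actually stored in the offline dataset (one per state).
    logsumexp(x; dims) = begin
        m = maximum(x; dims = dims)
        m .+ log.(sum(exp.(x .- m); dims = dims))
    end

    function conservative_penalty(q_sampled::AbstractMatrix, q_data::AbstractVector; cons_weight = 1f0)
        gap = vec(logsumexp(q_sampled; dims = 1)) .- q_data  # per-state conservative gap
        return cons_weight * sum(gap) / length(gap)          # mean over the batch
    end

    # Toy usage: 10 candidate actions for each of 4 states.
    q_sampled = randn(Float32, 10, 4)
    q_data    = randn(Float32, 4)
    conservative_penalty(q_sampled, q_data)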