
Discussion of methods alternative to CNNs (e.g. RNNs or transformers) #5

modenesi opened this issue Jan 28, 2025 · 6 comments

@modenesi

Hi all,

I'm not used to using issues on GitHub, but I love the idea and I want to use them more. Feel free to give me feedback anytime.

I just want to start an issue to discuss whether or not we want to implement alternatives to CNNs so that our ML model can take inputs of any size. It might be that the current joint CNN solution you both have is enough to tackle the issue. I'll try to learn a bit more about it.

ALTERNATIVE SOLUTIONS:
It might also be interesting to try alternative models, especially if the cost of running them is low. I wonder if we can hand our current code to ChatGPT and ask it to write a second model directly suited for time series. It might even capture temporal aspects better than CNNs. Some options:

  • RNNs (Recurrent Neural Nets, not to be confused with Recursive Neural Nets), e.g. LSTM or GRU
  • Transformers

I need to check, but I think we want to train these models with very long vectors, using padding and masking. We can talk more about it during our meeting today.

I read that RNNs and transformers are great for modeling temporal dependencies, while CNNs with global pooling are more efficient at capturing broad patterns in the time sequence.
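
To make that contrast concrete, here is a rough sketch of the two architectures using the keras R package (the layer sizes, kernel size, and 200-day padded length are made up for illustration; this is not a proposal for our final model):

```r
library(keras)

# CNN with global pooling: local filters summarize broad patterns over the
# whole padded window, largely ignoring exact timing.
cnn_model <- keras_model_sequential() %>%
  layer_conv_1d(filters = 16, kernel_size = 7, activation = "relu",
                padding = "same", input_shape = c(200, 1)) %>%
  layer_global_average_pooling_1d() %>%
  layer_dense(units = 1)

# Simple RNN: the hidden state is updated day by day, so the ordering of
# observations (temporal dependence) is modeled explicitly.
rnn_model <- keras_model_sequential() %>%
  layer_masking(mask_value = 0, input_shape = c(200, 1)) %>%
  layer_simple_rnn(units = 16) %>%
  layer_dense(units = 1)
```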

@modenesi (Author)

RNNs are like regular neural nets, but with a "hidden state" that keeps track of dependencies between observations over time. I like this simple explanation of it:
[Image: a simple diagram illustrating the RNN hidden state]
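
For intuition, here is a toy base-R version of that update, with made-up scalar weights and data:

```r
# The hidden state h carries information from one time step to the next.
set.seed(1)
x   <- rnorm(10)            # a univariate series with 10 time steps
W_h <- 0.5; W_x <- 0.8; b <- 0.1
h   <- 0                    # initial hidden state
for (t in seq_along(x)) {
  h <- tanh(W_h * h + W_x * x[t] + b)   # h_t depends on h_{t-1} and x_t
}
h  # the final hidden state summarizes the whole series
```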

@modenesi (Author)

modenesi commented Jan 30, 2025

So we would be able to incorporate time-invariant features (such as population size) into the RNN, which is great news!

Roughly, it would update the hidden state as:
$h_t = f(W_h \cdot h_{t-1} + W_x \cdot x_{t} + W_c \cdot C + b)$
where $C$ accounts for the time-invariant variables.

It is simple to implement:

  1. TIME SERIES VECTOR: get the time series data, pad it to e.g. 200 days, and mask it
  2. REPEAT TIME-INVARIANT: create 200 rows of the repeated time-invariant features (no masking or padding here)
  3. FINAL INPUT: concatenate the previous two datasets; this will be the input to the model (see the sketch below)
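
A rough base-R sketch of these three steps for a single series (the series length, padding length, and population value are all made up):

```r
set.seed(1)
max_len   <- 200
incidence <- rpois(137, lambda = 20)   # observed series, shorter than max_len
pop_size  <- 1e6                       # time-invariant feature

# 1. TIME SERIES VECTOR: pad with 0 up to max_len (0 is the mask value;
#    if real counts can be 0, a different mask value may be needed)
padded <- c(incidence, rep(0, max_len - length(incidence)))

# 2. REPEAT TIME-INVARIANT: one copy per time step, no padding or masking
static_repeated <- rep(pop_size, max_len)

# 3. FINAL INPUT: a max_len x 2 matrix per series, stacked into an array of
#    shape (n_series, max_len, 2) before feeding the model
x <- cbind(series = padded, pop = static_repeated)
dim(x)  # 200 x 2
```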

@gvegayon: What would be the time-invariant variables we'd like to include?

@modenesi (Author)

Also, after reading about the options and thinking about our problem, I think that an LSTM (which is a fancy RNN) might be overkill, given that our time series isn't that long and it is univariate. In fact, we might even have problems with overfitting the data.

My suggestion:

  1. Try a simple RNN first, adding the time-invariant variables as described above
  2. Also try a Gated Recurrent Unit (GRU), which is an RNN with a bit more structure for time dependencies, but not as complex as an LSTM, and compare it to the simple RNN (see the sketch below). If there are significant accuracy gains over the simple RNN, then consider training an LSTM. If accuracy is similar, stick with the simple RNN.
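
To make the comparison in point 2 concrete, here is a minimal keras-for-R sketch with both recurrent layers on the same input shape (illustrative sizes only, not a tuned architecture):

```r
library(keras)

# Same input shape for both: 200 padded days, 2 features (series + repeated
# time-invariant feature), as in the steps described earlier.
inp <- layer_input(shape = c(200, 2))

rnn_head <- inp %>% layer_simple_rnn(units = 32) %>% layer_dense(units = 1)
gru_head <- inp %>% layer_gru(units = 32) %>% layer_dense(units = 1)

simple_model <- keras_model(inp, rnn_head)
gru_model    <- keras_model(inp, gru_head)

# Train both on the same data and compare validation error; only consider an
# LSTM (layer_lstm) if the GRU clearly beats the simple RNN.
```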

Happy to talk about it more next time we meet.

@sima-njf (Collaborator)

Thank you so much, Bernardo!
That is an excellent explanation of LSTMs. I am now working on splitting the data, so I am still in the first steps. Once I get that part running faster, I will implement your ideas on it.

@gvegayon (Member)

> What would be the time-invariant variables we'd like to include?

That's a great question, @modenesi. For the moment, the only one I can think of is population size. We could add other things, but generally that would make the model less usable. For instance, we could add Rt and generation interval estimates, but that information is not always available.

@modenesi (Author)

I asked ChatGPT for a specific architecture for a simple RNN in R, with one time-invariant and one time-varying variable:

```r
library(keras)
library(tensorflow)

# Define Temporal Input (Variable-Length Time-Series Data)
temporal_input <- layer_input(shape = c(90, 1), name = "temporal_input")  # Max 90 days

# Apply Masking to Ignore Padded Timesteps
masked_temporal_input <- temporal_input %>%
  layer_masking(mask_value = 0.0)  # Ignore 0 values (padding)

# Define Time-Invariant Feature Input
time_invariant_input <- layer_input(shape = c(1), name = "time_invariant_input")

# Repeat Time-Invariant Feature Across Timesteps
time_invariant_repeated <- time_invariant_input %>%
  layer_repeat_vector(90)  # Repeat for max length = 90 days

# Concatenate Temporal & Time-Invariant Features
combined_input <- layer_concatenate(list(masked_temporal_input, time_invariant_repeated))

# RNN Layer (Handles Variable-Length Inputs)
rnn_output <- combined_input %>%
  layer_simple_rnn(units = 32, activation = "tanh",
                   kernel_regularizer = regularizer_l2(0.001)) %>%
  layer_dropout(rate = 0.2)  # Dropout for Regularization

# Final Output Layer
final_output <- rnn_output %>%
  layer_dense(units = 1, activation = "linear", name = "output")  # Regression output

# Define Model
model <- keras_model(inputs = list(temporal_input, time_invariant_input),
                     outputs = final_output)

# Compile Model
model %>% compile(
  optimizer = optimizer_adam(),
  loss = "mse"
)

# Print Model Summary
summary(model)
```
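
For completeness, here is a hedged sketch of how this model could be fit, using randomly generated placeholder data shaped to match the two inputs (not our real data):

```r
# Hypothetical usage of the model defined above.
n <- 500
x_temporal <- array(runif(n * 90), dim = c(n, 90, 1))  # padded to 90 days, 0 = masked
x_static   <- matrix(runif(n), ncol = 1)                # e.g. scaled population size
y          <- runif(n)                                  # regression target

model %>% fit(
  x = list(temporal_input = x_temporal, time_invariant_input = x_static),
  y = y,
  epochs = 20,
  batch_size = 32,
  validation_split = 0.2
)
```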
