
Non-normal Recurrent Neural Network (nnRNN): learning long time dependencies while improving expressivity with transient dynamics

nnRNN - NeurIPS 2019

expRNN code taken from here

EURNN tests based on code taken from here

Summary of Current Results

Copytask

[Figure: Copytask, T=200, with the same number of hidden units]

Permuted Sequential MNIST

[Figure: Permuted Sequential MNIST, with the same number of hidden units]

Hyperparameters for reported results

Copytask
| Model    | Hidden Size | Optimizer      | LR     | Orth. LR | δ      | T decay | Recurrent init |
|----------|-------------|----------------|--------|----------|--------|---------|----------------|
| RNN      | 128         | RMSprop α=0.9  | 0.001  |          |        |         | Glorot Normal  |
| RNN-orth | 128         | RMSprop α=0.99 | 0.0002 |          |        |         | Random Orth    |
| EURNN    | 128         | RMSprop α=0.5  | 0.001  |          |        |         |                |
| EURNN    | 256         | RMSprop α=0.5  | 0.001  |          |        |         |                |
| expRNN   | 128         | RMSprop α=0.99 | 0.001  | 0.0001   |        |         | Henaff         |
| expRNN   | 176         | RMSprop α=0.99 | 0.001  | 0.0001   |        |         | Henaff         |
| nnRNN    | 128         | RMSprop α=0.99 | 0.0005 | 10⁻⁶     | 0.0001 | 10⁻⁶    | Cayley         |
sMNIST
| Model    | Hidden Size | Optimizer      | LR     | Orth. LR | δ   | T decay | Recurrent init |
|----------|-------------|----------------|--------|----------|-----|---------|----------------|
| RNN      | 512         | RMSprop α=0.9  | 0.0001 |          |     |         | Glorot Normal  |
| RNN-orth | 512         | RMSprop α=0.99 | 5×10⁻⁵ |          |     |         | Random orth    |
| EURNN    | 512         | RMSprop α=0.9  | 0.0001 |          |     |         |                |
| EURNN    | 1024        | RMSprop α=0.9  | 0.0001 |          |     |         |                |
| expRNN   | 512         | RMSprop α=0.99 | 0.0005 | 5×10⁻⁵   |     |         | Cayley         |
| expRNN   | 722         | RMSprop α=0.99 | 5×10⁻⁵ |          |     |         | Cayley         |
| nnRNN    | 512         | RMSprop α=0.99 | 0.0002 | 2×10⁻⁵   | 0.1 | 0.0001  | Cayley         |
| LSTM     | 512         | RMSprop α=0.99 | 0.0005 |          |     |         | Glorot Normal  |
| LSTM     | 257         | RMSprop α=0.9  | 0.0005 |          |     |         | Glorot Normal  |

Usage

Copytask

python copytask.py [args]

Options:

  • net-type : type of RNN to use in test
  • nhid : number of hidden units
  • cuda : use CUDA
  • T : length of the delay between the input sequence and when the network must recall it
  • labels : number of labels in output and input, maximum 8
  • c-length : sequence length
  • onehot : onehot labels and inputs
  • vari : variable length
  • random-seed : random seed for experiment
  • batch : batch size
  • lr : learning rate for optimizer
  • lr_orth : learning rate for orthogonal optimizer
  • alpha : alpha value for optimizer (always RMSprop)
  • rinit : recurrent weight matrix initialization, options: [xavier, henaff, cayley, random orth.]
  • iinit : input weight matrix initialization, options: [xavier, kaiming]
  • nonlin : nonlinearity type, options: [None, tanh, relu, modrelu]
  • alam : strength of the regularization penalty (δ in the paper)
  • Tdecay : weight decay on upper triangular matrix values
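
As an illustration, the nnRNN copy-task configuration from the hyperparameter table above could be launched roughly as follows. The exact flag spelling (leading dashes, the accepted net-type string, and the modrelu nonlinearity) is an assumption inferred from the option names listed here, not verified against the script:

python copytask.py --net-type nnRNN --nhid 128 --T 200 --nonlin modrelu --lr 0.0005 --lr_orth 1e-6 --alpha 0.99 --rinit cayley --alam 0.0001 --Tdecay 1e-6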

Permuted Sequential MNIST

python sMNIST.py [args]

Options:

  • net-type : type of RNN to use in test
  • nhid : number of hidden units
  • epochs : number of epochs
  • cuda : use CUDA
  • permute : permute the order of the input
  • random-seed : random seed for the experiment (the permutation order uses its own independent seed)
  • batch : batch size
  • lr : learning rate for optimizer
  • lr_orth : learning rate for orthogonal optimizer
  • alpha : alpha value for optimizer (always RMSprop)
  • rinit : recurrent weight matrix initialization, options: [xavier, henaff, cayley, random orth.]
  • iinit : input weight matrix initialization, options: [xavier, kaiming]
  • nonlin : nonlinearity type, options: [None, tanh, relu, modrelu]
  • alam : strength of the regularization penalty (δ in the paper)
  • Tdecay : weight decay on upper triangular matrix values
  • save_freq : frequency in epochs to save data and network
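
Similarly, the nnRNN permuted sMNIST configuration reported above might be run as follows; as before, the flag spelling, the net-type string, the permute switch, and the nonlinearity are assumptions based on the option list rather than verified arguments:

python sMNIST.py --net-type nnRNN --nhid 512 --permute --nonlin modrelu --lr 0.0002 --lr_orth 2e-5 --alpha 0.99 --rinit cayley --alam 0.1 --Tdecay 0.0001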
