-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathnotes.txt
94 lines (69 loc) · 3.8 KB
/
notes.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
https://keras.io/
CNN: https://adeshpande3.github.io/A-Beginner's-Guide-To-Understanding-Convolutional-Neural-Networks/
Dataset: https://www.kaggle.com/smeschke/four-shapes
prevent local maxima: https://github.com/keras-team/keras/issues/1006
LeNet: https://www.pyimagesearch.com/2017/12/11/image-classification-with-keras-and-deep-learning/
RNN: https://github.com/sagar448/Keras-Recurrent-Neural-Network-Python
theano - installed for user, not in pip env
tensorflow, cntk - pip env
DeepDSL
DSL comparisons
install driver: https://askubuntu.com/a/851147 follow steps and choose the driver that is latest and proprietary
CUDA and cuDNN: https://gist.github.com/zhanwenchen/e520767a409325d9961072f666815bb8
tensorflow: https://www.tensorflow.org/install/install_linux install tensorflow==1.8.0 using pip in virtual env
20 epochs:
channels last:
cntk - 61
tensorflow - 42
theano - 100
mxnet - 42
channels first:
cntk - 41 (599 on cpu for 10 epochs)
tensorflow - 41 (489 on cpu for 10 epochs)
theano - 40
mxnet - 42
(x, y, 3) (3, x, y)
deepdsl 200 epochs - 3 sec - accuracy 40%
1. keras < high level API as DSL
2. deepdsl as backend for keras (after validating)
c++, java with GPU or directly for cuda
1. next frame prediction
2. matrix convergence
why deepdsl accuracy 40%, is it actually doing 200 epochs
python API that compiles into CUDA code:
1. understand flow of deepdsl
2. see how to reuse keras' code
#######################################################################################################################################################
DeepDSL
1. Lot of optimizations, better than mainstream libraries/frameworks
DSL
2. Notations (should be easier to use for non programmers)
-> New notations -> should have residual connection support, inceptions (to test state of the art dl models)
-> matlab, keras, and other high level apis
3. optimizations
-> use optimizations of DeepDSL
-> if new notations, then either by compiling to deepdsl or implementing optimizations
-> if new optimization, try future works that are shown in conclusion (RNN, genetic adversary network, reinforcement learning networks)
To Do
1. See how all other mainstream libraries implement deep learning, in order to understand what deepdsl does differently
2. see examples of other libraries/frameworks for notations
*. try different examples, try changing the java code, printing out the target and predictions, different models
*. check why convergence is slow
3. understand the optimizations
*. change code to represent model as list
*. try creating jar and execute without maven
Ideally, we need a complete DSL without external dependancies for optimization
-> opens opportunity for us and community to add new optimizations, features
-> as it is now, the deepdsl code can only be used for research purposes, for new papers that add new optimizations to the code. the optimizations cannot be added to the code due to license
For 2nd review,
-> show notations alone with VGG, overfeat, alexnet, lenet on MNIST
-> justification for notations in python
* easy to use.
* why is deepdsl not easy to use? few extra steps(solver, param, cudacompile), naming layers, lot of different imports
* can add support for image processing
* deepdsl does not have imaeg processing support. image processing(imagenet, lmdb, mnist) is done in java side. difficult for user
* deepdsl was not designed with user convenience in mind. it was designed to showcase the optimizations and efficiency
-> quantification of efficiency
* cannot be said at this stage
* lines of code maybe an indicator, but if notation changes it cannot be used
* ease of usage needs to be measured