Hey @rwightman! I have a question about the implementation of `drop_path`:

```python
def drop_path(x, drop_prob: float = 0., training: bool = False):
    if drop_prob == 0. or not training:
        return x
    keep_prob = 1 - drop_prob
    shape = (x.shape[0],) + (1,) * (x.ndim - 1)  # work with diff dim tensors, not just 2D ConvNets
    random_tensor = keep_prob + torch.rand(shape, dtype=x.dtype, device=x.device)
    random_tensor.floor_()  # binarize
    output = x.div(keep_prob) * random_tensor
    return output


class DropPath(nn.Module):
    def __init__(self, drop_prob=None):
        super(DropPath, self).__init__()
        self.drop_prob = drop_prob

    def forward(self, x):
        return drop_path(x, self.drop_prob, self.training)
```

How does this relate to the paper, please? In the paper, a residual block is either active or inactive, which doesn't seem to be the case for the implementation above: it rescales the tensor rather than returning either zeros or the tensor itself. As an example, my tensor value just got updated to 1.333, which is a version of Stochastic Depth that I do not understand. Is this expected? |
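For what it's worth, the 1.333 comes from the rescaling term: with `drop_prob = 0.25`, a surviving sample is divided by `keep_prob = 0.75`, i.e. multiplied by 1/0.75 ≈ 1.333, while a dropped sample becomes exactly zero. Each sample's path is still fully on or fully off; the division just keeps the expected activation unchanged so no rescaling is needed at eval time. A runnable sketch of that behaviour (a self-contained copy of the function above):

```python
import torch

def drop_path(x, drop_prob: float = 0., training: bool = False):
    if drop_prob == 0. or not training:
        return x
    keep_prob = 1 - drop_prob
    shape = (x.shape[0],) + (1,) * (x.ndim - 1)  # per-sample mask, broadcast over all other dims
    random_tensor = keep_prob + torch.rand(shape, dtype=x.dtype, device=x.device)
    random_tensor.floor_()  # binarize: 1 with probability keep_prob, else 0
    return x.div(keep_prob) * random_tensor

torch.manual_seed(0)
x = torch.ones(8, 4)
out = drop_path(x, drop_prob=0.25, training=True)
# Each row of `out` is either all zeros (path dropped for that sample)
# or all 1 / 0.75 ≈ 1.3333 (path kept, rescaled so the expectation matches x).
```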
@amaarora they are the same (as used here). `drop_connect` was the name used for stochastic depth in the original TF EfficientNet code, and I adopted that name, then realized it conflicts with another, different use of 'drop connect'. Since 'stochastic depth' is a mouth/keyboard full and more of a concept name than a good layer name, I used `drop_path` as the layer to implement stochastic depth (by dropping paths in the residual). The arg was renamed `drop_connect` -> `drop_path`.
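To make the "dropping paths in the residual" point concrete, here is a hedged sketch: `DropPath` sits on the residual branch, so when a sample's mask is zero the whole block collapses to the identity for that sample, which is exactly the per-sample active/inactive behaviour from the stochastic depth paper. `ToyResidualBlock` below is a hypothetical module for illustration, not code from timm:

```python
import torch
import torch.nn as nn

class DropPath(nn.Module):
    """Per-sample stochastic depth: zero out (or rescale) an entire residual branch."""
    def __init__(self, drop_prob=0.):
        super().__init__()
        self.drop_prob = drop_prob

    def forward(self, x):
        if self.drop_prob == 0. or not self.training:
            return x
        keep_prob = 1 - self.drop_prob
        shape = (x.shape[0],) + (1,) * (x.ndim - 1)
        mask = torch.rand(shape, dtype=x.dtype, device=x.device).add_(keep_prob).floor_()
        return x.div(keep_prob) * mask

# Hypothetical toy block: the branch output passes through DropPath
# before being added back onto the skip connection.
class ToyResidualBlock(nn.Module):
    def __init__(self, dim, drop_prob):
        super().__init__()
        self.fc = nn.Linear(dim, dim)
        self.drop_path = DropPath(drop_prob)

    def forward(self, x):
        return x + self.drop_path(self.fc(x))

torch.manual_seed(0)
block = ToyResidualBlock(4, drop_prob=0.5).train()
x = torch.randn(16, 4)
with torch.no_grad():
    out = block(x)
    kept = x + block.fc(x) / 0.5  # what a sample looks like when its path survives
# Each sample i is either exactly x[i] (path dropped -> block acts as identity)
# or x[i] + fc(x[i]) / keep_prob (path kept, branch rescaled).
```

In eval mode `DropPath` is a no-op, so the block always computes the full residual sum; no extra rescaling is required because of the `div(keep_prob)` applied during training.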