alexnet-head

All memes

An architectural pattern where the final layers of a convnet are dense, large layers (often taking up a bulk of the model's parameter count), finally fed into a softmax to create a multinomial distribution on the resulting labels conditioned on the input. Here is an example:

	nn.Flatten(),
	nn.Linear(final_out_dim, dense_mid_dim),
	nn.ReLU(),
	nn.Linear,(dense_mid_dim, dense_mid_dim),
	nn.ReLU(),
	nn.Linear(dense_mid_dim, num_labels),
	nn.ReLU(),
	nn.Softmax()