What is: Mixture of Softmaxes?
| Source | Breaking the Softmax Bottleneck: A High-Rank RNN Language Model |
| Year | 2017 |
| Data Source | CC BY-SA - https://paperswithcode.com |
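Mixture of Softmaxes (MoS) computes several softmax distributions over the output vocabulary and combines them as a weighted mixture, with mixture weights that depend on the context. The motivation is that the traditional softmax suffers from a softmax bottleneck: because each logit is a dot product between a context vector and a word embedding, the matrix of log-probabilities the model can represent has rank bounded by roughly the embedding dimension, which limits the conditional distributions it can express. A weighted sum of softmaxes is not itself the softmax of a single low-rank logit matrix, so a mixture of softmaxes can model the conditional probability more expressively.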
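As a rough illustration, the mixture can be written as P(x | c) = sum over k of pi_k(c) * softmax(W_emb h_k(c))[x], where pi_k(c) are the mixture weights and h_k(c) is the context vector for component k. The following is a minimal sketch in plain Python under stated assumptions, not the paper's implementation: the names `mixture_of_softmaxes`, `W_prior`, `W_proj`, and `W_emb`, the tanh projection per component, and the toy sizes are all chosen here for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]


def mixture_of_softmaxes(g, W_prior, W_proj, W_emb):
    """Return P(x | c) over the vocabulary as a mixture of K softmaxes.

    g       : context vector of length d (e.g. an RNN hidden state)
    W_prior : K x d matrix producing the mixture-weight logits
    W_proj  : list of K matrices (each d x d), one projection per component
    W_emb   : V x d output embedding matrix, shared by all components
    """
    pi = softmax(matvec(W_prior, g))  # mixture weights, sum to 1
    probs = [0.0] * len(W_emb)
    for k, weight in enumerate(pi):
        # Component-specific context vector (assumed tanh projection).
        h_k = [math.tanh(z) for z in matvec(W_proj[k], g)]
        # Ordinary softmax over the vocabulary for this component.
        component = softmax(matvec(W_emb, h_k))
        # Mix in probability space, not in logit space.
        for x, p in enumerate(component):
            probs[x] += weight * p
    return probs


def rand_matrix(rng, rows, cols):
    """Random Gaussian matrix, used only to build a toy example."""
    return [[rng.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]


# Toy sizes (hypothetical): d=4 hidden units, K=3 components, V=6 words.
rng = random.Random(0)
d, K, V = 4, 3, 6
g = [rng.gauss(0, 1) for _ in range(d)]
W_prior = rand_matrix(rng, K, d)
W_proj = [rand_matrix(rng, d, d) for _ in range(K)]
W_emb = rand_matrix(rng, V, d)

probs = mixture_of_softmaxes(g, W_prior, W_proj, W_emb)
print(sum(probs))  # should be 1 up to floating-point error
```

The key design point is that the components are averaged after each softmax has been applied. Averaging the logits first and applying one softmax would collapse back to a single softmax and keep the same rank limit.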
