Abstract: Traditionally, the success of the Transformer has been attributed to its token mixer, particularly the self-attention mechanism. However, recent studies suggest that replacing such attention ...