Why Scaling Mamba Beyond Small Models Could Lead to New Challenges
:::info
Authors:
(1) Albert Gu, Machine Learning Department, Carnegie Mellon University, with equal contribution;
(2) Tri Dao, Department of Computer Science, Princeton University, with equal co...