A new technical paper titled “LongSight: Compute-Enabled Memory to Accelerate Large-Context LLMs via Sparse Attention” was published by researchers at Cornell University. “Large input context windows ...