long-context-attention
USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long Context Transformers Model Training and Inference
Created: 2024-03-27T20:33:12
Updated: 2025-03-25T16:36:51
Stars: 537
Stars increase: 1