frad

Public

A from-scratch PyTorch LLM implementing a Sparse Mixture-of-Experts (MoE) architecture with Top-2 gating. It integrates modern Llama-3 components (RMSNorm, SwiGLU, RoPE, GQA) and a custom Byte-Level BPE tokenizer, and is pre-trained on a curated corpus of existential and dark philosophical literature.
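
To make the description concrete, here is a minimal sketch of what a Top-2 gated sparse MoE layer with SwiGLU experts might look like in PyTorch. All class and variable names (`SwiGLUExpert`, `Top2MoE`, etc.) are illustrative assumptions, not the identifiers used in this repository.

```python
# Hypothetical sketch of a Top-2 gated sparse MoE layer; names are
# illustrative and do not come from this repo.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SwiGLUExpert(nn.Module):
    """One feed-forward expert using a SwiGLU activation (Llama-style)."""

    def __init__(self, dim: int, hidden_dim: int):
        super().__init__()
        self.w_gate = nn.Linear(dim, hidden_dim, bias=False)
        self.w_up = nn.Linear(dim, hidden_dim, bias=False)
        self.w_down = nn.Linear(hidden_dim, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # SwiGLU: silu(W_gate x) * (W_up x), then project back to dim.
        return self.w_down(F.silu(self.w_gate(x)) * self.w_up(x))


class Top2MoE(nn.Module):
    """Sparse MoE layer: each token is routed to its top-2 experts."""

    def __init__(self, dim: int, hidden_dim: int, num_experts: int = 8):
        super().__init__()
        self.router = nn.Linear(dim, num_experts, bias=False)
        self.experts = nn.ModuleList(
            SwiGLUExpert(dim, hidden_dim) for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        batch, seq, dim = x.shape
        flat = x.reshape(-1, dim)                     # (tokens, dim)
        logits = self.router(flat)                    # (tokens, num_experts)
        top2_vals, top2_idx = logits.topk(2, dim=-1)  # 2 experts per token
        gates = F.softmax(top2_vals, dim=-1)          # renormalize over top-2

        out = torch.zeros_like(flat)
        for e, expert in enumerate(self.experts):
            # Which (token, slot) pairs were routed to expert e.
            token_ids, slot = (top2_idx == e).nonzero(as_tuple=True)
            if token_ids.numel() == 0:
                continue
            weight = gates[token_ids, slot].unsqueeze(-1)
            out[token_ids] += weight * expert(flat[token_ids])
        return out.reshape(batch, seq, dim)


if __name__ == "__main__":
    moe = Top2MoE(dim=64, hidden_dim=128, num_experts=4)
    y = moe(torch.randn(2, 10, 64))
    print(y.shape)  # torch.Size([2, 10, 64])
```

The loop-over-experts routing shown here is the simplest correct formulation; production MoE implementations typically batch tokens per expert and add a load-balancing auxiliary loss, details this sketch omits.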

Created: 2025-12-01T22:30:51
Updated: 2025-12-01T22:45:07
Stars: 0
Stars increase: 0