Initial commit: SheepOp LLM - Transformer-based language model implementation
- Complete transformer implementation from scratch
- Training pipeline with gradient accumulation and mixed precision
- Optimized inference with KV caching
- Multi-format data processing (PDFs, images, code, text)
- Comprehensive documentation
- Apache 2.0 license
- Example training plots included in docs/images/
docs/MATHEMATICS.md (normal file, 1087 lines changed)
File diff suppressed because it is too large