Initial commit: SheepOp LLM - Transformer-based language model implementation
- Complete transformer implementation from scratch - Training pipeline with gradient accumulation and mixed precision - Optimized inference with KV caching - Multi-format data processing (PDFs, images, code, text) - Comprehensive documentation - Apache 2.0 license - Example training plots included in docs/images/
This commit is contained in:
BIN
docs/images/loss_by_epoch.png
Normal file
BIN
docs/images/loss_by_epoch.png
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 26 KiB |
BIN
docs/images/training_curve.png
Normal file
BIN
docs/images/training_curve.png
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 73 KiB |
Reference in New Issue
Block a user