Commit Graph

14 Commits

Author SHA1 Message Date
Carlos Gutierrez
8fc6aa5a1e fixing memory 2025-11-16 16:55:58 -05:00
Carlos Gutierrez
fb1ca67be9 fixing memory 2025-11-16 16:52:26 -05:00
Carlos Gutierrez
9f17e1db24 Fix optimized attention mask handling for training
- Fix mask format conversion (float to boolean) for scaled_dot_product_attention
- Fix mask dimensions for proper broadcasting [batch, 1, seq_len, seq_len]
- Resolve conflict between is_causal and custom mask parameters
- Enable training with optimized attention and KV caching
2025-11-16 16:44:55 -05:00
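
The mask fixes listed in this commit can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the helper name `to_sdpa_mask` and the input convention (a float padding mask of shape [batch, seq_len] with 1.0 for real tokens and 0.0 for padding) are assumptions. It converts the float mask to boolean, reshapes it to broadcast as [batch, 1, seq_len, seq_len], and folds causality into the mask so that `is_causal` does not have to be passed alongside a custom mask. PyTorch raises an error when both are set.

```python
import torch
import torch.nn.functional as F


def to_sdpa_mask(padding_mask: torch.Tensor, causal: bool = True) -> torch.Tensor:
    """Hypothetical helper: convert a float padding mask [batch, seq_len]
    (1.0 = token, 0.0 = pad) into a boolean mask [batch, 1, seq_len, seq_len]
    where True means the key position may be attended to."""
    batch, seq_len = padding_mask.shape
    # Float -> bool, then add head and query axes: [batch, 1, 1, seq_len].
    keep = padding_mask.to(torch.bool)[:, None, None, :]
    keep = keep.expand(batch, 1, seq_len, seq_len)
    if causal:
        # Fold causality into the mask itself, so is_causal can stay False.
        lower = torch.tril(
            torch.ones(seq_len, seq_len, dtype=torch.bool, device=padding_mask.device)
        )
        keep = keep & lower
    return keep


# Usage: one sequence of length 3 whose last position is right-padding.
padding = torch.tensor([[1.0, 1.0, 0.0]])
mask = to_sdpa_mask(padding)
q = k = v = torch.randn(1, 2, 3, 4)  # [batch, heads, seq_len, head_dim]
# Only attn_mask is passed; is_causal is left at its default of False.
out = F.scaled_dot_product_attention(q, k, v, attn_mask=mask)
```

The singleton second axis lets one mask broadcast across all attention heads. Note that with left-padding, a causal row can end up with no permitted key at all, which may yield NaN outputs depending on the PyTorch version; the sketch assumes right-padding.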
Carlos Gutierrez
3fef3e2689 fixing memory 2025-11-16 16:39:11 -05:00
Carlos Gutierrez
49f9e700b4 Merge branch 'master' of github.com:CarGDev/sheepOp 2025-11-16 21:34:13 +00:00
Carlos Gutierrez
153343dca4 fixing memory 2025-11-16 16:34:08 -05:00
Carlos Gutierrez
19e9ae7fbe Merge branch 'master' of github.com:CarGDev/sheepOp 2025-11-16 21:29:20 +00:00
Carlos Gutierrez
b3b955442a fixing memory 2025-11-16 16:29:09 -05:00
Carlos Gutierrez
28bc0b4c27 adding the needed data 2025-11-16 21:28:55 +00:00
Carlos Gutierrez
a1e703423c fixing memory 2025-11-16 16:22:12 -05:00
Carlos Gutierrez
5fe3dc0753 adding the needed data 2025-11-16 21:03:15 +00:00
Carlos Gutierrez
87db20cc7b Fixing README 2025-11-16 15:54:13 -05:00
Carlos Gutierrez
82b1759c5a Update to dual license: Apache 2.0 (Research) + Commercial License
- Changed from Apache 2.0 only to dual license model
- Apache 2.0 for research, education, and non-commercial use
- Commercial license required for profit-making use
- Added citation requirement as condition of use for academic purposes
- Created CITATION.cff file for automatic citation suggestions
- Updated LICENSE, LICENSE.txt, README.md, and documentation
- Citation formats provided (BibTeX and text)
- Contact information for commercial licensing inquiries
2025-11-06 22:15:15 -05:00
Carlos Gutierrez
3d2da94ce2 Initial commit: SheepOp LLM - Transformer-based language model implementation
- Complete transformer implementation from scratch
- Training pipeline with gradient accumulation and mixed precision
- Optimized inference with KV caching
- Multi-format data processing (PDFs, images, code, text)
- Comprehensive documentation
- Apache 2.0 license
- Example training plots included in docs/images/
2025-11-06 22:07:41 -05:00