Initial commit: SheepOp LLM - Transformer-based language model implementation

- Complete transformer implementation from scratch
- Training pipeline with gradient accumulation and mixed precision
- Optimized inference with KV caching
- Multi-format data processing (PDFs, images, code, text)
- Comprehensive documentation
- Apache 2.0 license
- Example training plots included in docs/images/
Carlos Gutierrez
2025-11-06 22:07:41 -05:00
commit 3d2da94ce2
60 changed files with 25153 additions and 0 deletions

extensions.txt Normal file

@@ -0,0 +1,369 @@
_10083
_10083-h
1996
1997
1998
1999
2000
2001
2002
2003
2004
20041201
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
2019
2020
2020-06-27
2020-12-19
2020-12-28
2021
2021-01-26
2021-01-27
2022
2023
2024
2025
aaaa
aaab
aaac
aaad
aaae
aaaf
aaag
aaah
aaai
aaaj
aaak
aaal
aaam
aaan
aaao
aaap
aaaq
aaar
aaas
aaat
aaau
aaav
aaaw
aaax
aaay
aaaz
aaba
aabb
aabc
aabd
aabe
aabf
aabg
aabh
aabi
aabj
aabk
aabl
aabm
aabn
aabo
aabp
aabq
aabr
aabs
aabt
aabu
aabv
aabw
aabx
aaby
aabz
aaca
aacb
aacc
aacd
aace
aacf
aacg
aach
aaci
aacj
aack
aacl
aacm
aacn
aaco
aacp
aacq
aacr
aacs
aact
aacu
aacv
aacw
aacx
aacy
aacz
aada
aadb
aadc
aadd
aade
aadf
aadg
aadh
aadi
aadj
aadk
aadl
aadm
aadn
aado
aadp
aadq
aadr
aads
aadt
aadu
aadv
aadw
aadx
aady
aadz
aaea
aaeb
aaec
aaed
aaee
aaef
aaeg
aaeh
aaei
aaej
aaek
aael
aaem
aaen
aaeo
aaep
aaeq
aaer
aaes
aaet
aaeu
aaev
aaew
aaex
aaey
aaez
aafa
aafb
aafc
aafd
aafe
aaff
aafg
aafh
aafi
aafj
aafk
aafl
aafm
aafn
aafo
aafp
aafq
aafr
aafs
aaft
aafu
aafv
aafw
aafx
aafy
aafz
aaga
aagb
aagc
aagd
aage
aagf
aagg
aagh
aagi
aagj
aagk
aagl
aagm
aagn
aago
aagp
aagq
aagr
aags
aagt
aagu
aagv
aagw
aagx
aagy
aagz
aaha
aahb
aahc
aahd
aahe
aahf
aahg
aahh
aahi
aahj
aahk
aahl
aahm
aahn
aaho
aahp
aahq
aahr
aahs
aaht
aahu
aahv
aahw
aahx
aahy
aahz
aaia
aaib
aaic
aaid
aaie
aaif
aaig
aaih
aaii
aaij
aaik
aail
aaim
aain
aaio
aaip
aaiq
aair
aais
aait
aaiu
aaiv
aaiw
aaix
aaiy
aaiz
aaja
aajb
aajc
aajd
aaje
aajf
aajg
aajh
aaji
aajj
aajk
aajl
aajm
aajn
aajo
aajp
aajq
aajr
aajs
aajt
aaju
aajv
aajw
aajx
aajy
aajz
aaka
aakb
aakc
aakd
aake
aakf
aakg
aakh
aaki
aakj
aakk
aakl
aakm
aakn
aako
aakp
ALL
AUS
brl
bz2
cache
css
db
dcs
doc
DS_Store
eps
gif
gz
htm
ico
_images
iso
jigdo
jpg
JPG
jsonl
lit
ly
md5
message
mid
mp3
_old
PAR
PAR2
pdf
png
prc
py
pyc
rar
rtf
selected_editor
sfv
sh
sib
static
svg
tar
template
tex
txt
txt~
TXT
xml
zcompdump
zip
zip~
zshrc
zst