Carlos Gutierrez
2025-10-05 17:19:12 -04:00
parent bd03215133
commit 0fb21fd408
2 changed files with 120 additions and 43 deletions


@@ -182,18 +182,23 @@ SmartEdgeAI/
### Sample Output (iot_llm_sim)
```
-simSeconds 3.875651 # Simulation time
-simInsts 2665005563 # Instructions executed
-simOps 5787853650 # Operations (including micro-ops)
-hostInstRate 474335 # Instructions per second
+simSeconds 3.875651 # Simulation time (3.88 seconds)
+simInsts 2665005563 # Instructions executed (2.67 billion)
+simOps 5787853650 # Operations (5.79 billion including micro-ops)
+hostInstRate 476936 # Instructions per second (477K inst/s)
+hostOpRate 1035809 # Operations per second (1.04M op/s)
+hostMemory 11323568 # Host memory usage (11.3 MB)
+hostSeconds 5587.76 # Real time elapsed (93 minutes)
```
### Performance Metrics
-- **Simulation Speed**: ~474K instructions/second
-- **Memory Usage**: Successfully processes 24k tokens (24MB allocation)
-- **CPU Utilization**: O3CPU with realistic pipeline behavior
-- **Cache Performance**: Detailed L1/L2 hit/miss statistics
+- **Simulation Speed**: 477K instructions/second
+- **Total Instructions**: 2.67 billion for 24k token processing
+- **Cache Performance**: 98.75% hit rate, 1.25% miss rate
+- **Memory Efficiency**: 57.4M cache misses out of 4.58B total accesses
+- **Energy Consumption**: 568.4 mJ total (212.8 pJ per instruction)
+- **Power Consumption**: 146.5 mW average
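
The derived figures in this hunk can be cross-checked against the raw counters in the sample output. The sketch below assumes that `hostInstRate` and `hostOpRate` are simply `simInsts` and `simOps` divided by `hostSeconds`, and that the hit rate is one minus the miss rate; the variable names are illustrative and not taken from the project's scripts.

```python
# Raw counters copied from the sample output (new side of the diff).
sim_insts = 2_665_005_563
sim_ops = 5_787_853_650
host_seconds = 5587.76

# Assumed definitions: simulated work divided by wall-clock time on the host.
host_inst_rate = sim_insts / host_seconds
host_op_rate = sim_ops / host_seconds

# Cache figures as quoted in the metrics list (already rounded in the source).
cache_misses = 57.4e6
cache_accesses = 4.58e9
miss_rate = cache_misses / cache_accesses
hit_rate = 1.0 - miss_rate

print(f"hostInstRate ~ {host_inst_rate:,.0f} inst/s")
print(f"hostOpRate   ~ {host_op_rate:,.0f} op/s")
print(f"miss rate    ~ {miss_rate:.2%}, hit rate ~ {hit_rate:.2%}")
print(f"host time    ~ {host_seconds / 60:.0f} minutes")
```

Under these assumptions the recomputed rates agree with the updated `hostInstRate`, `hostOpRate`, and cache percentages shown in the new lines.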
## 🛠️ Usage Guide
@@ -292,20 +297,26 @@ sh scripts/check_gem5.sh
### Key Metrics
-- **simSeconds**: Total simulation time
-- **simInsts**: Instructions executed
-- **simOps**: Operations (including micro-ops)
-- **hostInstRate**: Simulation speed
-- **Cache Miss Rates**: L1/L2 performance
-- **Memory Bandwidth**: DRAM utilization
+- **simSeconds**: Total simulation time (3.88s for IoT LLM)
+- **simInsts**: Instructions executed (2.67B for 24k tokens)
+- **simOps**: Operations (5.79B including micro-ops)
+- **hostInstRate**: Simulation speed (477K inst/s)
+- **Cache Miss Rates**: 1.25% miss rate, 98.75% hit rate
+- **Memory Bandwidth**: 4.58B cache transactions processed
### Energy Analysis
The project includes energy post-processing scripts that calculate:
-- **Energy per Instruction (EPI)**
-- **Power consumption**
-- **Energy-Delay Product (EDP)**
-- **Drowsy vs Non-drowsy comparisons**
+**Actual IoT LLM Results**:
+- **Energy per Instruction (EPI)**: 212.8 pJ
+- **Total Energy**: 568.4 mJ for 24k token processing
+- **Power Consumption**: 146.5 mW average
+- **Memory Energy**: 34.4 mJ (6% of total energy)
+- **Energy-Delay Product (EDP)**: 2.204 J·s
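
The energy metrics listed here are related by standard definitions: EPI is total energy divided by instruction count, average power is total energy divided by simulated time, and EDP is total energy multiplied by simulated time. The sketch below recomputes them from the rounded totals quoted in this diff. It assumes the project's post-processing scripts use these same definitions, which the diff does not confirm, and the variable names are hypothetical.

```python
# Inputs quoted in the diff (rounded in the source).
total_energy_j = 568.4e-3    # 568.4 mJ total energy
memory_energy_j = 34.4e-3    # 34.4 mJ memory energy
sim_insts = 2_665_005_563    # simInsts
sim_seconds = 3.875651       # simSeconds

# Standard definitions (assumed, not taken from the project's scripts).
epi_pj = total_energy_j / sim_insts * 1e12        # energy per instruction, pJ
avg_power_mw = total_energy_j / sim_seconds * 1e3 # average power, mW
edp_js = total_energy_j * sim_seconds             # energy-delay product, J*s
memory_share = memory_energy_j / total_energy_j   # fraction of total energy

print(f"EPI          ~ {epi_pj:.1f} pJ")
print(f"Avg power    ~ {avg_power_mw:.1f} mW")
print(f"EDP          ~ {edp_js:.3f} J*s")
print(f"Memory share ~ {memory_share:.1%}")
```

Recomputed this way, each value lands within about 1% of the reported figure. The small gaps are presumably due to rounding in the quoted totals.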
+**Optimization Potential**:
+- **Drowsy Cache**: 15% energy reduction (483 mJ)
+- **Little Core**: 55% energy reduction (254 mJ)
+- **Hybrid+Drowsy**: 47% energy reduction (302 mJ)
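
The absolute energies in the optimization list can be checked against the quoted percentages. This is a minimal sketch, assuming each reduction is measured relative to the 568.4 mJ baseline reported above; the dictionary and variable names are illustrative only.

```python
baseline_mj = 568.4  # total energy of the baseline run, as quoted in the diff

# Fractional energy reductions and resulting energies as quoted in the diff.
scenarios = {
    "Drowsy Cache": (0.15, 483.0),
    "Little Core": (0.55, 254.0),
    "Hybrid+Drowsy": (0.47, 302.0),
}

remaining_mj = {}
for name, (reduction, reported_mj) in scenarios.items():
    # Energy left after applying the reduction to the assumed baseline.
    remaining_mj[name] = baseline_mj * (1.0 - reduction)
    diff_pct = abs(remaining_mj[name] - reported_mj) / reported_mj * 100
    print(f"{name}: ~{remaining_mj[name]:.1f} mJ "
          f"(reported {reported_mj:.0f} mJ, differs by {diff_pct:.2f}%)")
```

Under that assumption all three scenarios come out within about 1% of the reported values, which is consistent with the percentages having been rounded to whole numbers.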
## 🎯 Future Enhancements