Carlos Gutierrez
2025-10-05 16:27:45 -04:00
parent d8e51d8bc1
commit 91487b5c27
3 changed files with 646 additions and 126 deletions

419
README.md

@@ -1,171 +1,338 @@
# SmartEdgeAI - (gem5)
# SmartEdgeAI - IoT LLM Simulation with gem5
This repo holds **all scripts, commands, and logs** for Phase 3.
A gem5-based simulation framework for IoT LLM workloads, featuring a 16GB RAM configuration and a 24k-token processing workload.
## Prerequisites
## 🎯 Project Overview
### Install gem5
Before running any simulations, you need to install and build gem5:
This project simulates IoT (Internet of Things) systems running Large Language Models (LLMs) using the gem5 computer architecture simulator. The simulation includes:
- **IoT LLM Workload**: Simulates processing 24k tokens with memory allocation patterns typical of LLM inference
- **16GB RAM Configuration**: Full-system simulation with realistic memory constraints
- **Multiple CPU Architectures**: Support for big/little core configurations
- **Comprehensive Statistics**: Detailed performance metrics and energy analysis
## 🚀 Quick Start
### Prerequisites
```bash
# Clone gem5 repository
git clone https://github.com/gem5/gem5.git /home/carlos/projects/gem5/gem5src/gem5
# Install required dependencies
sudo apt update
sudo apt install python3-matplotlib python3-pydot python3-pip python3-venv
# Build gem5 for ARM
cd /home/carlos/projects/gem5/gem5src/gem5
scons build/ARM/gem5.opt -j$(nproc)
# Verify installation
sh scripts/check_gem5.sh
# Verify gem5 installation
ls /home/carlos/projects/gem5/gem5src/gem5/build/X86/gem5.opt
```
### Install ARM Cross-Compiler
```bash
# Ubuntu/Debian
sudo apt-get install gcc-arm-linux-gnueabihf
# macOS (if using Homebrew)
brew install gcc-arm-linux-gnueabihf
```
## Quick Start (Run Everything)
To run the complete workflow automatically:
### Run Complete Workflow
```bash
chmod +x run_all.sh
# Run everything automatically
sh run_all.sh
# Or run individual steps
sh scripts/check_gem5.sh # Verify prerequisites
sh scripts/env.sh # Setup environment
sh scripts/build_workloads.sh # Compile workloads
sh scripts/run_one.sh iot_llm_sim big high 0 1MB # Run simulation
```
This will execute all steps in sequence with error checking and progress reporting.
## 📁 Project Structure
## Manual Steps (Order of operations)
### 0. Check Prerequisites
```bash
sh scripts/check_gem5.sh
```
**Check logs**: Should show "✓ All checks passed!" or installation instructions
### 1. Setup Environment
```bash
sh scripts/env.sh
```
**Check logs**: `cat logs/env.txt` - Should show environment variables and "READY" message
### 2. Build Workloads
```bash
sh scripts/build_workloads.sh
```
**Check logs**: Look for "All workloads compiled successfully!" and verify binaries exist:
```bash
ls -la /home/carlos/projects/gem5/gem5-run/
SmartEdgeAI/
├── scripts/ # Automation scripts
│ ├── env.sh # Environment setup
│ ├── build_workloads.sh # Compile workloads
│ ├── run_one.sh # Single simulation run
│ ├── sweep.sh # Parameter sweep
│ ├── extract_csv.sh # Extract statistics
│ ├── energy_post.py # Energy analysis
│ └── bundle_logs.sh # Log collection
├── workloads/ # C source code
│ ├── tinyml_kws.c # TinyML keyword spotting
│ ├── sensor_fusion.c # Sensor data fusion
│ ├── aes_ccm.c # AES encryption
│ └── attention_kernel.c # Attention mechanism
├── iot_llm_sim.c # Main IoT LLM simulation
├── run_all.sh # Master workflow script
└── README.md # This file
```
### 3. Test Single Run
```bash
sh scripts/run_one.sh tinyml_kws big high 0 1MB
```
**Check logs**:
- Verify stats.txt has content: `ls -l /home/carlos/projects/gem5/gem5-data/SmartEdgeAI/results/tinyml_kws_big_high_l21MB_d0/stats.txt`
- Check simulation output: `cat logs/tinyml_kws_big_high_l21MB_d0.stdout.log`
- Check for errors: `cat logs/tinyml_kws_big_high_l21MB_d0.stderr.log`
## 🔧 Script Explanations
### Core Scripts
#### `scripts/env.sh`
**Purpose**: Sets up environment variables and paths for the entire workflow.
**Key Variables**:
- `ROOT`: Base gem5 installation path
- `CFG`: gem5 configuration script (x86-ubuntu-run.py)
- `GEM5_BIN`: Path to gem5 binary (X86 build)
- `RUN`: Directory for compiled workloads
- `OUT_DATA`: Simulation results directory
- `LOG_DATA`: Log files directory
#### `scripts/build_workloads.sh`
**Purpose**: Compiles all C workloads into x86_64 binaries.
**What it does**:
- Compiles `tinyml_kws.c`, `sensor_fusion.c`, `aes_ccm.c`, `attention_kernel.c`
- Creates `iot_llm_sim` binary for LLM simulation
- Uses `gcc -O2 -static` for optimized static binaries
#### `scripts/run_one.sh`
**Purpose**: Executes a single gem5 simulation with specified parameters.
**Parameters**:
- `workload`: Which binary to run (e.g., `iot_llm_sim`)
- `core`: CPU type (`big`=O3CPU, `little`=TimingSimpleCPU)
- `dvfs`: Frequency setting (`high`=2GHz, `low`=1GHz)
- `drowsy`: Cache drowsy mode (0=off, 1=on)
- `l2`: L2 cache size (e.g., `1MB`)
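These parameters also determine the names of the per-run result directory and log files, such as `tinyml_kws_big_high_l21MB_d0` for `tinyml_kws big high 0 1MB`. A minimal sketch of that naming follows, assuming the pattern inferred from that one example; `run_tag` is a hypothetical helper and not a function taken from `run_one.sh`:

```bash
# Hypothetical helper: build the run tag used for result and log names.
# The pattern is inferred from the example tag tinyml_kws_big_high_l21MB_d0.
run_tag() {
    workload=$1; core=$2; dvfs=$3; drowsy=$4; l2=$5
    printf '%s_%s_%s_l2%s_d%s\n' "$workload" "$core" "$dvfs" "$l2" "$drowsy"
}

# Prints: tinyml_kws_big_high_l21MB_d0
run_tag tinyml_kws big high 0 1MB
```

Note that the L2 size comes before the drowsy flag in the tag even though the flag comes first on the command line.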
**Key Features**:
- Maps core types to gem5 CPU models
- Copies stats from `m5out/stats.txt` to output directory
- Mirrors results to repository directories
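The mapping from `core` and `dvfs` values to gem5 settings can be sketched as two small lookup functions. This is an illustration of the documented mapping only; `cpu_model` and `cpu_clock` are hypothetical names, and the real script may implement the mapping differently:

```bash
# Hypothetical helpers mirroring the documented parameter mapping.
cpu_model() {
    case $1 in
        big)    echo O3CPU ;;
        little) echo TimingSimpleCPU ;;
        *)      echo "unknown core type: $1" >&2; return 1 ;;
    esac
}

cpu_clock() {
    case $1 in
        high) echo 2GHz ;;
        low)  echo 1GHz ;;
        *)    echo "unknown dvfs setting: $1" >&2; return 1 ;;
    esac
}
```

Rejecting unknown values with a non-zero status lets a calling script stop before launching a long simulation with a mistyped parameter.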
#### `iot_llm_sim.c`
**Purpose**: Simulates IoT LLM inference with 24k token processing.
**What it simulates**:
- Memory allocation for 24k tokens (1KB per token)
- Token processing loop with memory operations
- Realistic LLM inference patterns
- Memory cleanup and resource management
## 🐛 Problem-Solving Journey
### Initial Challenges
#### 1. **Empty stats.txt Files**
**Problem**: Simulations were running but generating empty statistics files.
**Root Cause**: ARM binaries were hitting unsupported system calls (syscall 398, which is `rseq` on 32-bit ARM Linux).
**Solution**: Switched from ARM to x86_64 architecture for better gem5 compatibility.
#### 2. **Syscall Compatibility Issues**
**Problem**: `fatal: Syscall 398 out of range` errors with ARM binaries.
**Root Cause**: gem5's syscall emulation mode doesn't support all Linux system calls, particularly newer ones such as `rseq`.
**Solution**:
- Tried multiple ARM configurations (starter_se.py, baremetal.py)
- Ultimately switched to x86_64 full-system simulation
- Used `x86-ubuntu-run.py` for reliable Ubuntu-based simulation
#### 3. **Configuration Complexity**
**Problem**: Custom gem5 configurations were failing with various errors.
**Root Cause**:
- Deprecated port names (`slave`/`master` → `cpu_side_ports`/`mem_side_ports`)
- Missing cache parameters (`tag_latency`, `data_latency`, etc.)
- Workload object creation issues
**Solution**: Used gem5's built-in `x86-ubuntu-run.py` configuration instead of custom scripts.
#### 4. **Stats Collection Issues**
**Problem**: Statistics were generated in `m5out/stats.txt` but scripts expected them elsewhere.
**Root Cause**: x86-ubuntu-run.py outputs to default `m5out/` directory.
**Solution**: Added automatic copying of stats from `m5out/stats.txt` to expected output directory.
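A minimal sketch of such a copy step is shown here, with a guard against the empty-file case from problem 1. `collect_stats` is a hypothetical helper and the destination layout is an assumption; the actual logic in `run_one.sh` may differ:

```bash
# Hypothetical helper: copy gem5's default stats file into a run directory,
# failing loudly if the simulation produced no statistics.
collect_stats() {
    src=$1       # e.g. m5out/stats.txt
    dest_dir=$2  # per-run output directory (layout assumed)
    if [ ! -s "$src" ]; then
        echo "error: $src is missing or empty" >&2
        return 1
    fi
    mkdir -p "$dest_dir" && cp "$src" "$dest_dir/stats.txt"
}
```

The `-s` test is true only for a file that exists and has a non-zero size, so a 0-byte `stats.txt` is reported instead of being copied silently.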
### Key Learnings
1. **Architecture Choice Matters**: In this project, x86_64 builds ran reliably under gem5 where the ARM builds failed on unsupported syscalls
2. **Full-System vs Syscall Emulation**: Full-system simulation avoided the unsupported-syscall failures seen in syscall emulation mode
3. **Use Built-in Configurations**: gem5's built-in configs worked where the custom scripts hit deprecated port names and missing cache parameters
4. **Path Management**: Always verify and handle gem5's default output paths
## 🏗️ How the Project Works
### Simulation Architecture
```
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ IoT LLM App │───▶│ gem5 X86 │───▶│ Statistics │
│ (24k tokens) │ │ Full-System │ │ (482KB) │
└─────────────────┘ └─────────────────┘ └─────────────────┘
```
### Workflow Process
1. **Environment Setup**: Configure paths and verify gem5 installation
2. **Workload Compilation**: Compile C workloads to x86_64 binaries
3. **Simulation Execution**: Run gem5 with Ubuntu Linux and workload
4. **Statistics Collection**: Extract performance metrics from gem5 output
5. **Analysis**: Process statistics for energy, performance, and efficiency metrics
### Memory Configuration
- **Total RAM**: 16GB (as requested for IoT configuration)
- **Memory Controllers**: 2x DDR3 controllers with 8GB each
- **Cache Hierarchy**: L1I (48KB), L1D (32KB), L2 (1MB)
- **Memory Access**: Timing-based simulation with realistic latencies
## 📊 Simulation Results
### Sample Output (iot_llm_sim)
```
simSeconds 3.875651 # Simulated time in seconds
simInsts 2665005563 # Instructions executed
simOps 5787853650 # Operations (including micro-ops)
hostInstRate 474335 # Simulated instructions per host second
```
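Individual values can be pulled out of a `stats.txt` with a one-line `awk` filter. The sketch below assumes the `name value # description` line layout shown in the sample; `get_stat` is a hypothetical helper, not part of the project's scripts:

```bash
# Hypothetical helper: print the value of one named statistic.
# usage: get_stat <stat_name> <path/to/stats.txt>
# Assumes gem5's "name value # description" line layout.
get_stat() {
    awk -v name="$1" '$1 == name { print $2; exit }' "$2"
}
```

For example, `get_stat simInsts m5out/stats.txt` would print `2665005563` for the run above. Dividing `simInsts` by `hostInstRate` for that run gives roughly 5,600 host seconds, which is about 94 minutes of wall-clock time for 3.88 simulated seconds.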
### Performance Metrics
- **Simulation Speed**: ~474K simulated instructions per host second (`hostInstRate`)
- **Memory Usage**: Successfully processes 24k tokens (24MB allocation)
- **CPU Utilization**: O3CPU with realistic pipeline behavior
- **Cache Performance**: Detailed L1/L2 hit/miss statistics
## 🛠️ Usage Guide
### Basic Usage
### 4. Run Full Matrix
```bash
# Run IoT LLM simulation
sh scripts/run_one.sh iot_llm_sim big high 0 1MB
# Run with different CPU types
sh scripts/run_one.sh iot_llm_sim little high 0 1MB # TimingSimpleCPU
sh scripts/run_one.sh iot_llm_sim big low 0 1MB # Low frequency
# Run parameter sweep
sh scripts/sweep.sh
```
**Check logs**: Monitor progress and verify all combinations complete:
### Advanced Usage
```bash
ls -la /home/carlos/projects/gem5/gem5-data/SmartEdgeAI/results/
# Custom memory size
sh scripts/run_one.sh iot_llm_sim big high 0 1MB 32GB
# Enable drowsy cache
sh scripts/run_one.sh iot_llm_sim big high 1 1MB
# Run specific workload
sh scripts/run_one.sh tinyml_kws big high 0 1MB
```
### 5. Extract Statistics
### Analysis Commands
```bash
# Extract CSV statistics
sh scripts/extract_csv.sh
```
**Check logs**: Verify CSV was created with data:
```bash
head -5 /home/carlos/projects/gem5/gem5-data/SmartEdgeAI/results/summary.csv
```
### 6. Compute Energy Metrics
```bash
# Energy analysis
python3 scripts/energy_post.py
```
**Check logs**: Verify energy calculations:
```bash
head -5 /home/carlos/projects/gem5/gem5-data/SmartEdgeAI/results/summary_energy.csv
```
### 7. Generate Plots
```bash
# Generate plots
python3 scripts/plot_epi.py
python3 scripts/plot_edp_tinyml.py
```
**Check logs**: Verify plots were created:
```bash
ls -la /home/carlos/projects/gem5/gem5-data/SmartEdgeAI/results/fig_*.png
```
### 8. Bundle Logs
```bash
# Bundle logs
sh scripts/bundle_logs.sh
```
**Check logs**: Verify bundled logs:
```bash
cat logs/TERMINAL_EXCERPTS.txt
cat logs/STATS_EXCERPTS.txt
```
### 9. (Optional) Generate Delta Analysis
```bash
python3 scripts/diff_table.py
```
**Check logs**: Verify delta calculations:
```bash
head -5 results/phase3_drowsy_deltas.csv
```
## Paths assumed
- gem5 binary: `/home/carlos/projects/gem5/gem5src/gem5/build/ARM/gem5.opt` (updated from tree.log analysis)
- config: `scripts/hetero_big_little.py`
- workloads: `/home/carlos/projects/gem5/gem5-run/{tinyml_kws,sensor_fusion,aes_ccm,attention_kernel}`
## Output Locations
- **Results**: `/home/carlos/projects/gem5/gem5-data/SmartEdgeAI/results/` (mirrored to `results/`)
- **Logs**: `/home/carlos/projects/gem5/gem5-data/SmartEdgeAI/logs/` (mirrored to `logs/`)
## Troubleshooting
## 🔍 Troubleshooting
### Common Issues
**Empty stats.txt files (0 bytes)**
- **Cause**: gem5 binary doesn't exist or simulation failed
- **Solution**: Run `sh scripts/check_gem5.sh` and install gem5 if needed
- **Check**: `ls -la /home/carlos/projects/gem5/gem5src/gem5/build/ARM/gem5.opt`
#### Empty stats.txt
```bash
# Check if simulation completed
ls -la m5out/stats.txt
**CSV extraction shows empty values**
- **Cause**: Simulation didn't run, so no statistics were generated
- **Solution**: Fix gem5 installation first, then re-run simulations
# If empty, check logs
cat logs/*.stderr.log
```
**"ModuleNotFoundError: No module named 'matplotlib'"**
- **Solution**: Install matplotlib: `pip install matplotlib` or `sudo apt-get install python3-matplotlib`
#### gem5 Binary Not Found
```bash
# Verify installation
ls /home/carlos/projects/gem5/gem5src/gem5/build/X86/gem5.opt
**"ValueError: could not convert string to float: ''"**
- **Cause**: Empty CSV values from failed simulations
- **Solution**: Fixed in updated scripts - they now handle empty values gracefully
# Build if missing
cd /home/carlos/projects/gem5/gem5src/gem5
scons build/X86/gem5.opt -j$(nproc)
```
**Permission errors**
- **Solution**: Make scripts executable: `chmod +x scripts/*.sh`
#### Compilation Errors
```bash
# Check compiler
gcc --version
**Path issues**
- **Solution**: Verify `ROOT` variable in `scripts/env.sh` points to correct gem5 installation
# Rebuild workloads
sh scripts/build_workloads.sh
```
### Debugging Steps
1. **Check gem5 installation**: `sh scripts/check_gem5.sh`
2. **Verify workload binaries**: `ls -la /home/carlos/projects/gem5/gem5-run/`
3. **Test single simulation**: `sh scripts/run_one.sh tinyml_kws big high 0 1MB`
4. **Check simulation logs**: `cat logs/tinyml_kws_big_high_l21MB_d0.stdout.log`
5. **Verify stats output**: `ls -l /home/carlos/projects/gem5/gem5-data/SmartEdgeAI/results/tinyml_kws_big_high_l21MB_d0/stats.txt`
### Debug Commands
```bash
# Check environment
sh scripts/env.sh
# Verify prerequisites
sh scripts/check_gem5.sh
# Manual gem5 run
/home/carlos/projects/gem5/gem5src/gem5/build/X86/gem5.opt \
/home/carlos/projects/gem5/gem5src/gem5/configs/example/gem5_library/x86-ubuntu-run.py \
--command=./iot_llm_sim --mem-size=16GB
```
## 📈 Performance Analysis
### Key Metrics
- **simSeconds**: Simulated time in seconds (not host wall-clock time)
- **simInsts**: Instructions executed
- **simOps**: Operations (including micro-ops)
- **hostInstRate**: Simulation speed, in simulated instructions per host second
- **Cache Miss Rates**: L1/L2 performance
- **Memory Bandwidth**: DRAM utilization
### Energy Analysis
The project includes energy post-processing scripts that calculate:
- **Energy per Instruction (EPI)**
- **Power consumption**
- **Energy-Delay Product (EDP)**
- **Drowsy vs Non-drowsy comparisons**
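As a rough illustration of how these metrics relate, the sketch below derives them from an assumed average power, using E = P × t, EPI = E / instructions, and EDP = E × t. The power figure is a made-up input, `energy_metrics` is a hypothetical helper, and this is not necessarily the model that `scripts/energy_post.py` implements:

```bash
# Hypothetical helper: derive energy metrics from an ASSUMED average power.
# usage: energy_metrics <avg_power_watts> <sim_seconds> <sim_insts>
energy_metrics() {
    awk -v p="$1" -v t="$2" -v n="$3" 'BEGIN {
        e = p * t                             # energy in joules
        printf "energy_j=%.6g\n", e
        printf "epi_nj=%.6g\n", e / n * 1e9   # energy per instruction, nJ
        printf "edp_js=%.6g\n", e * t         # energy-delay product, J*s
    }'
}
```

A call such as `energy_metrics 1.5 "$(get_stat simSeconds stats.txt)" "$(get_stat simInsts stats.txt)"` would combine it with a stats lookup, where `1.5` is a placeholder wattage and `get_stat` stands for any helper that reads a value from `stats.txt`.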
## 🎯 Future Enhancements
1. **Multi-core Support**: Extend to multi-core IoT configurations
2. **Real LLM Models**: Integrate actual transformer models
3. **Power Modeling**: Add detailed power consumption analysis
4. **Network Simulation**: Include IoT communication patterns
5. **Edge Computing**: Simulate edge-to-cloud interactions
## 📚 References
- [gem5 Documentation](https://www.gem5.org/documentation/)
- [gem5 Learning Resources](https://www.gem5.org/documentation/learning_gem5/)
- [ARM Research Starter Kit](http://www.arm.com/ResearchEnablement/SystemModeling)
## 🤝 Contributing
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Test with `sh run_all.sh`
5. Submit a pull request
## 📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
---
**Note**: This project was developed through iterative problem-solving, switching from ARM to x86_64 architecture and using gem5's built-in configurations for maximum reliability. The final solution provides a robust IoT LLM simulation framework with comprehensive statistics and analysis capabilities.