initial commit
86
multiScalar/Superscalar_Analysis_Report.md
Normal file
@@ -0,0 +1,86 @@
# Multiple Issue (Superscalar Execution) Analysis Report

## Superscalar Configuration Setup

Superscalar processors issue and execute multiple instructions per cycle within a single core. They exploit instruction-level parallelism (ILP) by identifying independent instructions and executing them in parallel, which can raise performance well beyond that of a scalar processor (Hennessy & Patterson, 2019). The out-of-order design studied here relies on dynamic instruction scheduling, register renaming, and out-of-order execution to maximize instruction throughput while preserving program correctness.

The experimental setup uses four superscalar configurations with pipeline widths of 1, 2, 4, and 8 (W1, W2, W4, W8). Each configuration uses the same gem5 O3 (out-of-order) processor model, `DerivO3CPU`, with LTAGE branch prediction. The fetch, decode, rename, issue, and commit widths are set to the same value, and the ROB, IQ, LQ, and SQ sizes scale in proportion to that width. The memory hierarchy and functional units are identical across configurations.

### Configuration Summary

**Pipeline Width Configurations:**
- **W1**: fetchWidth=1, decodeWidth=1, issueWidth=1, commitWidth=1, renameWidth=1
- **W2**: fetchWidth=2, decodeWidth=2, issueWidth=2, commitWidth=2, renameWidth=2
- **W4**: fetchWidth=4, decodeWidth=4, issueWidth=4, commitWidth=4, renameWidth=4
- **W8**: fetchWidth=8, decodeWidth=8, issueWidth=8, commitWidth=8, renameWidth=8

**Queue Configurations:**
- **W1**: ROB=32, IQ=16, LQ=16, SQ=16
- **W2**: ROB=64, IQ=32, LQ=32, SQ=32
- **W4**: ROB=128, IQ=64, LQ=64, SQ=64
- **W8**: ROB=256, IQ=128, LQ=128, SQ=128

**System Parameters:**
- CPU Frequency: 2 GHz (reported as 2000 MHz in `fs/proc/cpuinfo`; 500 ps clock period)
- Branch Predictor: LTAGE (a TAGE predictor, which uses tagged tables indexed with geometric history lengths, combined with a loop predictor)
- L1 I-Cache: 32KB, 2-way associative, 2-cycle latency
- L1 D-Cache: 64KB, 2-way associative, 2-cycle latency
- L2 Cache: 2MB, 8-way associative, 20-cycle latency
- Functional Units: 6 IntAlu, 2 IntMult, 4 FloatAdd, 2 FloatMult, 4 MemRead/Write, 1 IprAccess

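The scaling rule behind the width and queue lists above can be written down compactly. The following is a minimal sketch that mirrors the arithmetic in `run_superscalar.sh` (ROB = 32·W, IQ = LQ = SQ = 16·W) and rebuilds the `--param` overrides seen in the recorded command lines. The function names `params_for_width` and `param_flags` are illustrative and are not part of gem5.

```python
def params_for_width(w: int) -> dict:
    """Return the DerivO3CPU overrides used for a pipeline width of w."""
    # All five stage widths are set to the same value.
    widths = {name: w for name in (
        "fetchWidth", "decodeWidth", "renameWidth", "issueWidth", "commitWidth")}
    # Queue sizes scale linearly with width, as in run_superscalar.sh.
    queues = {
        "numROBEntries": 32 * w,
        "numIQEntries": 16 * w,
        "LQEntries": 16 * w,
        "SQEntries": 16 * w,
    }
    return {**widths, **queues}


def param_flags(w: int) -> list:
    """Format the overrides as se.py --param arguments."""
    flags = []
    for name, value in params_for_width(w).items():
        flags += ["--param", f"system.cpu[0].{name}={value}"]
    return flags


for w in (1, 2, 4, 8):
    p = params_for_width(w)
    print(f"W{w}: ROB={p['numROBEntries']} IQ={p['numIQEntries']} "
          f"LQ={p['LQEntries']} SQ={p['SQEntries']}")
```

Keeping the rule in one function makes it easy to check that every configuration differs only in width and queue depth.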
## Benchmarking Results

The benchmarking experiments ran the same workload (memtouch) on all four configurations. Each run was capped at 20 million instructions (`--maxinsts=20000000`) to keep simulation time short. No fast-forward or warmup option was passed, so the measured window includes program startup. The results show how little this workload gains from a wider pipeline.

### Performance Metrics Table

| Configuration | SimSeconds | SimInsts | IPC | Branch Mispredicts | L1I Miss % | L1D Miss % | ROB Occupancy | IQ Occupancy |
|---------------|------------|----------|-----|--------------------|------------|------------|---------------|--------------|
| W1 | 0.209538 | 20M | 0.047724 | 702 | 3.15% | 49.74% | n/a | n/a |
| W2 | 0.209481 | 20M | 0.047737 | 718 | 3.37% | 49.76% | n/a | n/a |
| W4 | 0.209591 | 20M | 0.047712 | 744 | 3.69% | 49.78% | n/a | n/a |
| W8 | 0.209698 | 20M | 0.047688 | 799 | 3.77% | 49.79% | n/a | n/a |

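The IPC column can be reproduced from the simulator logs alone. The sketch below assumes the 1 ps tick reported in `simout` (10^12 ticks per second), the 2 GHz core clock reported in `fs/proc/cpuinfo`, and exactly 20,000,000 committed instructions per run. The final tick values are copied from the `Exiting @ tick` lines. The helper name `ipc_from_tick` is illustrative.

```python
TICKS_PER_SECOND = 1_000_000_000_000   # "Global frequency" line in simout
CPU_HZ = 2_000_000_000                 # assumption: 2000 MHz from fs/proc/cpuinfo
INSTRUCTIONS = 20_000_000              # assumption: run stopped exactly at --maxinsts

TICKS_PER_CYCLE = TICKS_PER_SECOND // CPU_HZ   # 500 ticks per core cycle

# Final tick of each run, from the "Exiting @ tick" line in simout.
final_tick = {
    "W1": 209_538_034_000,
    "W2": 209_480_747_500,
    "W4": 209_590_996_000,
    "W8": 209_697_742_000,
}


def ipc_from_tick(tick: int) -> float:
    """Instructions per cycle for a run that ended at the given tick."""
    cycles = tick / TICKS_PER_CYCLE
    return INSTRUCTIONS / cycles


for name, tick in final_tick.items():
    ipc = ipc_from_tick(tick)
    print(f"{name}: cycles={tick / TICKS_PER_CYCLE:,.0f} "
          f"IPC={ipc:.6f} CPI={1 / ipc:.2f}")
```

Under these assumptions the computed IPC values match the table to six decimal places, and the CPI is about 21 cycles per instruction in every configuration.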
### Cache Performance Analysis

**Instruction Cache Miss Rates:**
- W1: 3.15% (562 misses out of 17,861 accesses)
- W2: 3.37% (615 misses out of 18,231 accesses)
- W4: 3.69% (694 misses out of 18,783 accesses)
- W8: 3.77% (764 misses out of 20,275 accesses)

**Data Cache Miss Rates:**
- W1: 49.74% (2,485,341 misses out of 4,995,187 accesses)
- W2: 49.76% (2,485,818 misses out of 4,995,438 accesses)
- W4: 49.78% (2,485,833 misses out of 4,995,234 accesses)
- W8: 49.79% (2,485,817 misses out of 4,995,572 accesses)

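Miss rates alone do not show how often misses occur per instruction. The sketch below converts the miss counts listed above into misses per kilo-instruction (MPKI), then divides the W1 cycle count by the W1 L1D misses to get a rough cost per miss. It assumes exactly 20,000,000 instructions per run and the 2 GHz clock (500 ticks per cycle) from `fs/proc/cpuinfo`. The helper name `mpki` is illustrative.

```python
INSTRUCTIONS = 20_000_000   # assumption: run stopped exactly at --maxinsts

l1i_misses = {"W1": 562, "W2": 615, "W4": 694, "W8": 764}
l1d_misses = {"W1": 2_485_341, "W2": 2_485_818, "W4": 2_485_833, "W8": 2_485_817}


def mpki(misses: int, instructions: int = INSTRUCTIONS) -> float:
    """Misses per thousand instructions."""
    return 1000 * misses / instructions


for name in l1d_misses:
    print(f"{name}: L1I MPKI={mpki(l1i_misses[name]):.3f} "
          f"L1D MPKI={mpki(l1d_misses[name]):.2f}")

# Rough cost per L1D miss for W1: total core cycles divided by L1D misses.
# Final tick from simout, divided by 500 ticks per cycle (2 GHz assumed).
W1_CYCLES = 209_538_034_000 // 500
cycles_per_l1d_miss = W1_CYCLES / l1d_misses["W1"]
print(f"W1: about {cycles_per_l1d_miss:.0f} cycles elapse per L1D miss")
```

Under these assumptions the L1D sees roughly 124 misses per thousand instructions while the L1I sees fewer than 0.04, and about 169 cycles elapse per L1D miss, well above the 20-cycle L2 latency.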
## Discussion on Instruction Mix and Performance Gains

### Findings & Interpretation

The central result is that **increasing pipeline width from 1 to 8 instructions per cycle produces virtually no performance improvement**: IPC stays at approximately 0.0477 in every configuration, and simulated time varies by only about 0.1% (W2 is the fastest run and W8 the slowest). A wider pipeline helps only when independent instructions are available to fill it, and this workload rarely supplies them.

The main bottleneck is memory. The L1 data cache miss rate is about 50%, which corresponds to roughly 2.49 million misses in 20 million instructions. At the 2 GHz clock reported in `fs/proc/cpuinfo`, the W1 run takes about 419 million cycles, or roughly 169 cycles per L1D miss. That is far more than the 20-cycle L2 latency, which suggests that many of these misses are served from DRAM rather than from L2, although L2 statistics are not included in this report. With a CPI of about 21, the core spends most of its time waiting on memory regardless of pipeline width. Scaling the ROB and load/store queues with width did not help either, which is consistent with misses that are serialized or dependent on one another.

Second, the workload appears to expose little instruction-level parallelism. The report does not include an instruction-mix or dependency breakdown, so this is inferred from the flat IPC: if independent instructions were available between misses, the wider configurations would retire them faster. The memtouch access pattern likely serializes execution behind outstanding loads.

The instruction cache miss rate rises slightly with width (3.15% to 3.77%). Accesses also rise (17,861 to 20,275) even though the instruction count is fixed, so the extra accesses and misses may come from additional speculative fetches by the wider front end. In absolute terms the effect is negligible: at most 764 misses over 20 million instructions.

Branch mispredictions rise from 702 to 799, a 13.8% increase in count. Against 20 million instructions this is fewer than 0.04 mispredictions per thousand instructions in every configuration, so misprediction penalties cannot explain the flat IPC. The small increase may reflect wider pipelines fetching more speculative instructions before branches resolve.

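The relative changes discussed above follow directly from the Performance Metrics Table. A short sketch, using the simulated times and misprediction counts from that table; `pct_change` is an illustrative helper name.

```python
sim_seconds = {"W1": 0.209538, "W2": 0.209481, "W4": 0.209591, "W8": 0.209698}
mispredicts = {"W1": 702, "W2": 718, "W4": 744, "W8": 799}


def pct_change(old: float, new: float) -> float:
    """Percentage change from old to new."""
    return 100 * (new - old) / old


# Speedup over the single-issue baseline: values above 1.0 mean faster than W1.
for name, seconds in sim_seconds.items():
    speedup = sim_seconds["W1"] / seconds
    print(f"{name}: speedup over W1 = {speedup:.4f}")

change = pct_change(mispredicts["W1"], mispredicts["W8"])
print(f"Mispredict count change W1 -> W8: {change:.1f}%")
```

Every speedup computed this way is within 0.1% of 1.0, which is the quantitative form of "no measurable improvement".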
### Key Takeaways

- **Memory bottleneck dominance**: The roughly 50% L1 data cache miss rate sets a performance ceiling that wider pipelines cannot raise
- **Limited ILP in workload**: The flat IPC indicates that memtouch offers too little independent work to benefit from wider superscalar execution
- **Diminishing returns**: Scaling width and queue sizes from W1 to W8 changes simulated time by only about 0.1%, so the limiting factor lies outside the core pipeline
- **Cache pressure effects**: Wider fetch slightly raises instruction cache accesses and misses, but the absolute counts are too small to matter
- **Speculation overhead**: Misprediction counts rise by 13.8% from W1 to W8 yet stay below 0.04 per thousand instructions

The results show that the benefit of a wider superscalar design depends on workload characteristics and system balance. For a memory-bound workload such as memtouch, increasing pipeline width without improving the memory hierarchy does not improve performance.

## References

Hennessy, J. L., & Patterson, D. A. (2019). *Computer architecture: A quantitative approach* (6th ed.). Morgan Kaufmann.
1455
multiScalar/W1/config.ini
Normal file
File diff suppressed because it is too large
1968
multiScalar/W1/config.json
Normal file
File diff suppressed because it is too large
19
multiScalar/W1/fs/proc/cpuinfo
Normal file
@@ -0,0 +1,19 @@
processor : 0
vendor_id : Generic
cpu family : 0
model : 0
model name : Generic
stepping : 0
cpu MHz : 2000.000
cache size: : 2048.0K
physical id : 0
siblings : 1
core id : 0
cpu cores : 1
fpu : yes
fpu exception : yes
cpuid level : 1
wp : yes
flags : fpu
cache alignment : 64

2
multiScalar/W1/fs/proc/stat
Normal file
@@ -0,0 +1,2 @@
cpu 0 0 0 0 0 0 0
cpu0 0 0 0 0 0 0 0
1
multiScalar/W1/fs/sys/devices/system/cpu/online
Normal file
@@ -0,0 +1 @@
0-0
1
multiScalar/W1/fs/sys/devices/system/cpu/possible
Normal file
@@ -0,0 +1 @@
0-0
12
multiScalar/W1/simerr
Normal file
@@ -0,0 +1,12 @@
warn: The `get_runtime_isa` function is deprecated. Please migrate away from using this function.
warn: The se.py script is deprecated. It will be removed in future releases of gem5.
warn: The `get_runtime_isa` function is deprecated. Please migrate away from using this function.
warn: No dot file generated. Please install pydot to generate the dot file and pdf.
src/mem/dram_interface.cc:690: warn: DRAM device capacity (8192 Mbytes) does not match the address range assigned (512 Mbytes)
src/base/statistics.hh:279: warn: One of the stats is a legacy stat. Legacy stat is a stat that does not belong to any statistics::Group. Legacy stat is deprecated.
system.remote_gdb: Listening for connections on port 7000
src/sim/simulate.cc:194: info: Entering event queue @ 0. Starting simulation...
src/sim/syscall_emul.cc:74: warn: ignoring syscall set_robust_list(...)
src/sim/syscall_emul.cc:74: warn: ignoring syscall rseq(...)
src/sim/mem_state.cc:443: info: Increasing stack size by one page.
src/sim/syscall_emul.cc:74: warn: ignoring syscall mprotect(...)
12
multiScalar/W1/simout
Normal file
@@ -0,0 +1,12 @@
Global frequency set at 1000000000000 ticks per second
gem5 Simulator System. https://www.gem5.org
gem5 is copyrighted software; use the --copyright option for details.

gem5 version 23.0.0.1
gem5 compiled Aug 28 2025 18:18:37
gem5 started Sep 21 2025 02:31:39
gem5 executing on cargdevgpu, pid 3056537
command line: /home/carlos/projects/gem5/gem5src/gem5/build/X86/gem5.opt --outdir=/home/carlos/projects/gem5/gem5-data/results/superscalar/W1 /home/carlos/projects/gem5/gem5src/gem5/configs/deprecated/example/se.py --cmd=/home/carlos/projects/gem5/gem5-run/memtouch/memtouch --cpu-type=DerivO3CPU --caches --l2cache --bp-type=LTAGE --maxinsts=20000000 --param 'system.cpu[0].fetchWidth=1' --param 'system.cpu[0].decodeWidth=1' --param 'system.cpu[0].renameWidth=1' --param 'system.cpu[0].issueWidth=1' --param 'system.cpu[0].commitWidth=1' --param 'system.cpu[0].numROBEntries=32' --param 'system.cpu[0].numIQEntries=16' --param 'system.cpu[0].LQEntries=16' --param 'system.cpu[0].SQEntries=16'

**** REAL SIMULATION ****
Exiting @ tick 209538034000 because a thread reached the max instruction count
1411
multiScalar/W1/stats.txt
Normal file
File diff suppressed because it is too large
1455
multiScalar/W2/config.ini
Normal file
File diff suppressed because it is too large
1968
multiScalar/W2/config.json
Normal file
File diff suppressed because it is too large
19
multiScalar/W2/fs/proc/cpuinfo
Normal file
@@ -0,0 +1,19 @@
processor : 0
vendor_id : Generic
cpu family : 0
model : 0
model name : Generic
stepping : 0
cpu MHz : 2000.000
cache size: : 2048.0K
physical id : 0
siblings : 1
core id : 0
cpu cores : 1
fpu : yes
fpu exception : yes
cpuid level : 1
wp : yes
flags : fpu
cache alignment : 64

2
multiScalar/W2/fs/proc/stat
Normal file
@@ -0,0 +1,2 @@
cpu 0 0 0 0 0 0 0
cpu0 0 0 0 0 0 0 0
1
multiScalar/W2/fs/sys/devices/system/cpu/online
Normal file
@@ -0,0 +1 @@
0-0
1
multiScalar/W2/fs/sys/devices/system/cpu/possible
Normal file
@@ -0,0 +1 @@
0-0
13
multiScalar/W2/simerr
Normal file
@@ -0,0 +1,13 @@
warn: The `get_runtime_isa` function is deprecated. Please migrate away from using this function.
warn: The se.py script is deprecated. It will be removed in future releases of gem5.
warn: The `get_runtime_isa` function is deprecated. Please migrate away from using this function.
warn: No dot file generated. Please install pydot to generate the dot file and pdf.
src/mem/dram_interface.cc:690: warn: DRAM device capacity (8192 Mbytes) does not match the address range assigned (512 Mbytes)
src/base/statistics.hh:279: warn: One of the stats is a legacy stat. Legacy stat is a stat that does not belong to any statistics::Group. Legacy stat is deprecated.
system.remote_gdb: Listening for connections on port 7000
src/sim/simulate.cc:194: info: Entering event queue @ 0. Starting simulation...
src/arch/x86/cpuid.cc:180: warn: x86 cpuid family 0x0000: unimplemented function 13
src/sim/syscall_emul.cc:74: warn: ignoring syscall set_robust_list(...)
src/sim/syscall_emul.cc:74: warn: ignoring syscall rseq(...)
src/sim/mem_state.cc:443: info: Increasing stack size by one page.
src/sim/syscall_emul.cc:74: warn: ignoring syscall mprotect(...)
12
multiScalar/W2/simout
Normal file
@@ -0,0 +1,12 @@
Global frequency set at 1000000000000 ticks per second
gem5 Simulator System. https://www.gem5.org
gem5 is copyrighted software; use the --copyright option for details.

gem5 version 23.0.0.1
gem5 compiled Aug 28 2025 18:18:37
gem5 started Sep 21 2025 02:36:27
gem5 executing on cargdevgpu, pid 3059926
command line: /home/carlos/projects/gem5/gem5src/gem5/build/X86/gem5.opt --outdir=/home/carlos/projects/gem5/gem5-data/results/superscalar/W2 /home/carlos/projects/gem5/gem5src/gem5/configs/deprecated/example/se.py --cmd=/home/carlos/projects/gem5/gem5-run/memtouch/memtouch --cpu-type=DerivO3CPU --caches --l2cache --bp-type=LTAGE --maxinsts=20000000 --param 'system.cpu[0].fetchWidth=2' --param 'system.cpu[0].decodeWidth=2' --param 'system.cpu[0].renameWidth=2' --param 'system.cpu[0].issueWidth=2' --param 'system.cpu[0].commitWidth=2' --param 'system.cpu[0].numROBEntries=64' --param 'system.cpu[0].numIQEntries=32' --param 'system.cpu[0].LQEntries=32' --param 'system.cpu[0].SQEntries=32'

**** REAL SIMULATION ****
Exiting @ tick 209480747500 because a thread reached the max instruction count
1413
multiScalar/W2/stats.txt
Normal file
File diff suppressed because it is too large
1455
multiScalar/W4/config.ini
Normal file
File diff suppressed because it is too large
1968
multiScalar/W4/config.json
Normal file
File diff suppressed because it is too large
19
multiScalar/W4/fs/proc/cpuinfo
Normal file
@@ -0,0 +1,19 @@
processor : 0
vendor_id : Generic
cpu family : 0
model : 0
model name : Generic
stepping : 0
cpu MHz : 2000.000
cache size: : 2048.0K
physical id : 0
siblings : 1
core id : 0
cpu cores : 1
fpu : yes
fpu exception : yes
cpuid level : 1
wp : yes
flags : fpu
cache alignment : 64

2
multiScalar/W4/fs/proc/stat
Normal file
@@ -0,0 +1,2 @@
cpu 0 0 0 0 0 0 0
cpu0 0 0 0 0 0 0 0
1
multiScalar/W4/fs/sys/devices/system/cpu/online
Normal file
@@ -0,0 +1 @@
0-0
1
multiScalar/W4/fs/sys/devices/system/cpu/possible
Normal file
@@ -0,0 +1 @@
0-0
13
multiScalar/W4/simerr
Normal file
@@ -0,0 +1,13 @@
warn: The `get_runtime_isa` function is deprecated. Please migrate away from using this function.
warn: The se.py script is deprecated. It will be removed in future releases of gem5.
warn: The `get_runtime_isa` function is deprecated. Please migrate away from using this function.
warn: No dot file generated. Please install pydot to generate the dot file and pdf.
src/mem/dram_interface.cc:690: warn: DRAM device capacity (8192 Mbytes) does not match the address range assigned (512 Mbytes)
src/base/statistics.hh:279: warn: One of the stats is a legacy stat. Legacy stat is a stat that does not belong to any statistics::Group. Legacy stat is deprecated.
system.remote_gdb: Listening for connections on port 7000
src/sim/simulate.cc:194: info: Entering event queue @ 0. Starting simulation...
src/arch/x86/cpuid.cc:180: warn: x86 cpuid family 0x0000: unimplemented function 13
src/sim/syscall_emul.cc:74: warn: ignoring syscall set_robust_list(...)
src/sim/syscall_emul.cc:74: warn: ignoring syscall rseq(...)
src/sim/mem_state.cc:443: info: Increasing stack size by one page.
src/sim/syscall_emul.cc:74: warn: ignoring syscall mprotect(...)
12
multiScalar/W4/simout
Normal file
@@ -0,0 +1,12 @@
Global frequency set at 1000000000000 ticks per second
gem5 Simulator System. https://www.gem5.org
gem5 is copyrighted software; use the --copyright option for details.

gem5 version 23.0.0.1
gem5 compiled Aug 28 2025 18:18:37
gem5 started Sep 21 2025 02:41:15
gem5 executing on cargdevgpu, pid 3063193
command line: /home/carlos/projects/gem5/gem5src/gem5/build/X86/gem5.opt --outdir=/home/carlos/projects/gem5/gem5-data/results/superscalar/W4 /home/carlos/projects/gem5/gem5src/gem5/configs/deprecated/example/se.py --cmd=/home/carlos/projects/gem5/gem5-run/memtouch/memtouch --cpu-type=DerivO3CPU --caches --l2cache --bp-type=LTAGE --maxinsts=20000000 --param 'system.cpu[0].fetchWidth=4' --param 'system.cpu[0].decodeWidth=4' --param 'system.cpu[0].renameWidth=4' --param 'system.cpu[0].issueWidth=4' --param 'system.cpu[0].commitWidth=4' --param 'system.cpu[0].numROBEntries=128' --param 'system.cpu[0].numIQEntries=64' --param 'system.cpu[0].LQEntries=64' --param 'system.cpu[0].SQEntries=64'

**** REAL SIMULATION ****
Exiting @ tick 209590996000 because a thread reached the max instruction count
1421
multiScalar/W4/stats.txt
Normal file
File diff suppressed because it is too large
1455
multiScalar/W8/config.ini
Normal file
File diff suppressed because it is too large
1968
multiScalar/W8/config.json
Normal file
File diff suppressed because it is too large
19
multiScalar/W8/fs/proc/cpuinfo
Normal file
@@ -0,0 +1,19 @@
processor : 0
vendor_id : Generic
cpu family : 0
model : 0
model name : Generic
stepping : 0
cpu MHz : 2000.000
cache size: : 2048.0K
physical id : 0
siblings : 1
core id : 0
cpu cores : 1
fpu : yes
fpu exception : yes
cpuid level : 1
wp : yes
flags : fpu
cache alignment : 64

2
multiScalar/W8/fs/proc/stat
Normal file
@@ -0,0 +1,2 @@
cpu 0 0 0 0 0 0 0
cpu0 0 0 0 0 0 0 0
1
multiScalar/W8/fs/sys/devices/system/cpu/online
Normal file
@@ -0,0 +1 @@
0-0
1
multiScalar/W8/fs/sys/devices/system/cpu/possible
Normal file
@@ -0,0 +1 @@
0-0
13
multiScalar/W8/simerr
Normal file
@@ -0,0 +1,13 @@
warn: The `get_runtime_isa` function is deprecated. Please migrate away from using this function.
warn: The se.py script is deprecated. It will be removed in future releases of gem5.
warn: The `get_runtime_isa` function is deprecated. Please migrate away from using this function.
warn: No dot file generated. Please install pydot to generate the dot file and pdf.
src/mem/dram_interface.cc:690: warn: DRAM device capacity (8192 Mbytes) does not match the address range assigned (512 Mbytes)
src/base/statistics.hh:279: warn: One of the stats is a legacy stat. Legacy stat is a stat that does not belong to any statistics::Group. Legacy stat is deprecated.
system.remote_gdb: Listening for connections on port 7000
src/sim/simulate.cc:194: info: Entering event queue @ 0. Starting simulation...
src/arch/x86/cpuid.cc:180: warn: x86 cpuid family 0x0000: unimplemented function 13
src/sim/syscall_emul.cc:74: warn: ignoring syscall set_robust_list(...)
src/sim/syscall_emul.cc:74: warn: ignoring syscall rseq(...)
src/sim/mem_state.cc:443: info: Increasing stack size by one page.
src/sim/syscall_emul.cc:74: warn: ignoring syscall mprotect(...)
12
multiScalar/W8/simout
Normal file
@@ -0,0 +1,12 @@
Global frequency set at 1000000000000 ticks per second
gem5 Simulator System. https://www.gem5.org
gem5 is copyrighted software; use the --copyright option for details.

gem5 version 23.0.0.1
gem5 compiled Aug 28 2025 18:18:37
gem5 started Sep 21 2025 02:45:58
gem5 executing on cargdevgpu, pid 3066429
command line: /home/carlos/projects/gem5/gem5src/gem5/build/X86/gem5.opt --outdir=/home/carlos/projects/gem5/gem5-data/results/superscalar/W8 /home/carlos/projects/gem5/gem5src/gem5/configs/deprecated/example/se.py --cmd=/home/carlos/projects/gem5/gem5-run/memtouch/memtouch --cpu-type=DerivO3CPU --caches --l2cache --bp-type=LTAGE --maxinsts=20000000 --param 'system.cpu[0].fetchWidth=8' --param 'system.cpu[0].decodeWidth=8' --param 'system.cpu[0].renameWidth=8' --param 'system.cpu[0].issueWidth=8' --param 'system.cpu[0].commitWidth=8' --param 'system.cpu[0].numROBEntries=256' --param 'system.cpu[0].numIQEntries=128' --param 'system.cpu[0].LQEntries=128' --param 'system.cpu[0].SQEntries=128'

**** REAL SIMULATION ****
Exiting @ tick 209697742000 because a thread reached the max instruction count
1434
multiScalar/W8/stats.txt
Normal file
File diff suppressed because it is too large
22
multiScalar/parse_superscalar.sh
Executable file
@@ -0,0 +1,22 @@
#!/bin/bash
set -eu

ROOT=/home/carlos/projects/gem5/gem5-data/results/superscalar
printf "%-4s %8s %10s %10s\n" "W" "IPC" "L1D MPKI" "Br MPKI"
for S in "$ROOT"/*/stats.txt; do
  [ -f "$S" ] || continue
  W=$(basename "$(dirname "$S")" | sed 's/^W//')  # directory "W4" becomes "4"
  awk -v W="$W" '
    /^simInsts/ {I=$2}                            # committed instructions
    /system\.cpu\.numCycles/ {C=$2}               # core cycles
    /system\.l1d\.overall_misses::total/ {Dm=$2}  # L1D misses; stat name varies by gem5 version
    /branchPred\.mispredictions/ {Bm=$2}          # mispredictions; stat name varies by gem5 version
    /branchPred\.lookups/ {Bl=$2}                 # parsed but not printed
    END{
      ipc=(C>0)? I/C : 0;                         # a stat that never matched is empty and counts as 0
      dmpki=(I>0)? 1000*Dm/I : 0;
      bmpki=(I>0)? 1000*Bm/I : 0;
      printf "%-4s %8.3f %10.2f %10.2f\n", W, ipc, dmpki, bmpki
    }' "$S"
done | sort -n
47
multiScalar/run_superscalar.sh
Executable file
@@ -0,0 +1,47 @@
#!/bin/bash
set -eu

GEM5=/home/carlos/projects/gem5/gem5src/gem5
BIN="$GEM5/build/X86/gem5.opt"
SE="$GEM5/configs/deprecated/example/se.py"
CMD=/home/carlos/projects/gem5/gem5-run/memtouch/memtouch

ROOT=/home/carlos/projects/gem5/gem5-data/results/superscalar
mkdir -p "$ROOT"

BP=LTAGE       # strong predictor so control hazards don't mask width effects
MAXI=20000000  # 20M to finish faster; keep constant across configs

for W in 1 2 4 8; do
  OUT="$ROOT/W$W"; mkdir -p "$OUT"
  echo "[*] W=$W -> $OUT"

  ROB=$((W*32))  # queue sizes scale linearly with width
  IQ=$((W*16))
  LQ=$((W*16))
  SQ=$((W*16))

  "$BIN" --outdir="$OUT" \
    "$SE" --cmd="$CMD" \
    --cpu-type=DerivO3CPU --caches --l2cache \
    --bp-type="$BP" --maxinsts="$MAXI" \
    --param "system.cpu[0].fetchWidth=$W" \
    --param "system.cpu[0].decodeWidth=$W" \
    --param "system.cpu[0].renameWidth=$W" \
    --param "system.cpu[0].issueWidth=$W" \
    --param "system.cpu[0].commitWidth=$W" \
    --param "system.cpu[0].numROBEntries=$ROB" \
    --param "system.cpu[0].numIQEntries=$IQ" \
    --param "system.cpu[0].LQEntries=$LQ" \
    --param "system.cpu[0].SQEntries=$SQ" \
    > "$OUT/simout" 2> "$OUT/simerr" || true  # under set -e, continue so the check below can report a failed run

  if [ -s "$OUT/stats.txt" ]; then
    echo "  ok: $OUT/stats.txt"
  else
    echo "  FAILED: check $OUT/simerr"
  fi
done

echo "[*] Superscalar sweep complete."