Using SimPoint in Gem5 to Speed up Simulation

Introduction

If you have ever used the Gem5 simulator, you know how slow it is. It is a common complaint across the community, and people have come up with various solutions to the problem. In this article, I’ll introduce SimPoint, a profiling and sampling method that is integrated into Gem5 to help speed up simulations.

What is SimPoint?

The original project can be found here: https://cseweb.ucsd.edu/~calder/simpoint/. But like most university projects, it was discontinued once the students who developed the tool graduated and/or the funding ended, in this case in 2006… which is not a huge problem as long as it still works.

But anyway, SimPoint profiles which code a program executes over time and uses that to obtain a phase analysis of the program. Intervals that execute similar code tend to show similar behavior in metrics like IPC, branch prediction accuracy and cache hit/miss rates. For example, a simple program may have an initialization phase, some computation phases, and a finalization phase. Programs often run in loops, so they show periodic behavior. From a simulation’s perspective, we don’t care much whether the program runs to completion if we can capture the most interesting (that is, most representative) regions of the execution and simulate only those regions.

If you want to know more about SimPoint, please check out their paper. Here we only focus on how to use it in Gem5.

Using SimPoint in Gem5

Here we use the latest release of Gem5 (as of October 2017) as an example to show how to use SimPoint.

First of all, here is the official documentation: http://gem5.org/Simpoints . Let’s break it down step by step.

Profiling & Generating BBV

Some basic concepts first: BBV stands for basic block vector. A basic block is a widely used program analysis/compiler term for a straight-line sequence of instructions with a single entry and a single exit. A BBV records how often each basic block was executed during one interval, which is the information SimPoint needs to do the analysis.

Unlike the official doc, we use the syscall emulation script se.py as an example.
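To make the idea of a BBV concrete, here is a minimal Python sketch. It is not how Gem5 collects the data internally; the trace format (a list of (block_id, num_instructions) records) and the function name are made up for illustration, and weighting each block by the instructions it contributes is an assumption about the convention.

```python
from collections import Counter


def basic_block_vectors(trace, interval):
    """Build one basic block vector per interval of about `interval` instructions.

    `trace` is a hypothetical list of (block_id, num_instructions) records,
    one per executed basic block. Each vector maps block_id to the number of
    instructions that block contributed during the interval.
    """
    vectors = []
    current = Counter()
    executed = 0
    for block_id, num_insts in trace:
        current[block_id] += num_insts
        executed += num_insts
        if executed >= interval:
            # The interval is full: emit its vector and start a new one.
            vectors.append(dict(current))
            current = Counter()
            executed = 0
    if current:
        # Keep the last, partially filled interval.
        vectors.append(dict(current))
    return vectors


# Toy trace: block 1 is a 4-instruction loop body, block 2 a 6-instruction one.
trace = [(1, 4)] * 5 + [(2, 6)] * 5
bbvs = basic_block_vectors(trace, interval=20)
print(bbvs)
```

In this toy trace the first interval contains only block 1 and the later ones only block 2, which is exactly the kind of difference a phase analysis picks up on.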

% build/ARM/gem5.opt <base options> configs/example/se.py --simpoint-profile --simpoint-interval 10000000 --cpu-type=AtomicSimpleCPU --fastmem

One of the most important parameters here is --simpoint-interval. It is the length of each profiling interval, in number of instructions. Smaller intervals might be more accurate, but could also result in too many unnecessary simpoints. It looks like people typically use intervals of 10M to 1B instructions, depending on how large your program is. Also note that if your interval is too small, you may not be able to warm up your memory system.
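A quick back-of-the-envelope calculation helps when picking the interval. The sketch below is plain arithmetic, not a Gem5 API; the function name and all the numbers (program length, number of simpoints, warmup) are hypothetical.

```python
def sampling_cost(total_insts, interval, num_simpoints, warmup=0):
    """Estimate how many intervals get profiled and what fraction of the
    program ends up in detailed simulation.

    Assumes every simpoint costs one interval plus its warmup.
    """
    # Ceiling division: a trailing partial interval still counts as one.
    num_intervals = (total_insts + interval - 1) // interval
    detailed_insts = num_simpoints * (interval + warmup)
    return num_intervals, detailed_insts / total_insts


# Hypothetical program: 50B instructions, 10M-instruction intervals,
# 30 simpoints, 1M instructions of warmup each.
num_intervals, fraction = sampling_cost(
    total_insts=50_000_000_000,
    interval=10_000_000,
    num_simpoints=30,
    warmup=1_000_000,
)
print(num_intervals, fraction)
```

With these made-up numbers, well under 1% of the program is simulated in detail, which is where the speedup comes from.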

It looks like in se mode you can only profile the program with the AtomicSimpleCPU model and fastmem.

If you specified outdir, you should see a simpoint.bb.gz generated there. Next we’re going to use SimPoint to do the analysis we need using this file (this process is called offline analysis).
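If you are curious about what is inside simpoint.bb.gz, the sketch below reads it into Python dictionaries. It assumes the file uses SimPoint’s frequency vector text format, where each interval is one line starting with T followed by :block_id:count entries; please check this against your own file before relying on it. The function names are made up.

```python
import gzip
import io


def read_bbv_lines(fileobj):
    """Parse frequency-vector lines such as 'T:1:20 :2:4' (assumed format).

    Returns one {block_id: count} dict per interval.
    """
    vectors = []
    for line in fileobj:
        line = line.strip()
        if not line.startswith("T"):
            continue  # skip blank lines and anything that is not a vector
        vector = {}
        for entry in line[1:].split():
            # Each entry looks like ':<block_id>:<count>'.
            _, block_id, count = entry.split(":")
            vector[int(block_id)] = int(count)
        vectors.append(vector)
    return vectors


def read_bbv_file(path):
    """Open a gzipped BBV file (e.g. simpoint.bb.gz) and parse it."""
    with gzip.open(path, "rt") as f:
        return read_bbv_lines(f)


# Tiny in-memory sample instead of a real profile.
sample = "T:1:20 :2:4\nT:2:24\n"
vectors = read_bbv_lines(io.StringIO(sample))
print(len(vectors), vectors)
```

The number of vectors should roughly match the program’s instruction count divided by the interval, which makes for an easy sanity check on the profile.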

Building & Running SimPoint

First obtain the source code of SimPoint, which was last updated in 2006, so don’t expect typing make to work out of the box (if it does, you should probably upgrade your compiler).

Luckily, there is not a lot that we need to change. Adding

#include <cstdlib>
#include <cstring>
#include <limits.h>
#include <iostream>
#include <fstream>

to Utilities.h and adding

#include <iostream>

to Datapoint.h should be enough for make to succeed.

Once you have the simpoint binary in your SimPoint/bin directory, you can run

% simpoint -loadFVFile simpoint.bb.gz -maxK 30 -saveSimpoints <simpoint_file> -saveSimpointWeights <weight_file> -inputVectorsGzipped

to obtain the simpoint file and the weight file. The simpoint output file looks like:

3 0
123 1
345 2

Each line represents one simpoint. Within each line, the number on the left is the interval number of the simpoint and the number on the right is the simpoint index. For example, simpoint 0 corresponds to interval 3 (counting from 0), which starts at instruction 3 * 10000000.

The weight output file has a similar format, except that the interval number is replaced by a weight. The weights sum to 1, and each weight tells you how important its simpoint is (roughly, the fraction of the program’s execution it represents).
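The two files are easy to post-process. Here is a small Python sketch that pairs them up and computes where each simpoint starts. The weights and CPI values below are invented for illustration, and the weighted average at the end is my assumption about how you would typically combine per-simpoint results, not something the Gem5 scripts do for you.

```python
def parse_pairs(text, value_type):
    """Parse lines of the form '<value> <simpoint index>' into {index: value}."""
    pairs = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        value, index = line.split()
        pairs[int(index)] = value_type(value)
    return pairs


def describe_simpoints(simpoint_text, weight_text, interval_length):
    """Join the simpoint and weight files and compute start instructions."""
    intervals = parse_pairs(simpoint_text, int)
    weights = parse_pairs(weight_text, float)
    return [
        {
            "index": i,
            "interval": intervals[i],
            "start_inst": intervals[i] * interval_length,
            "weight": weights[i],
        }
        for i in sorted(intervals)
    ]


def weighted_average(values, weights):
    """Combine one measurement per simpoint (e.g. CPI) using the weights."""
    return sum(values[i] * weights[i] for i in values)


simpoint_text = "3 0\n123 1\n345 2\n"   # the simpoint file shown above
weight_text = "0.2 0\n0.5 1\n0.3 2\n"   # hypothetical weights

simpoints = describe_simpoints(simpoint_text, weight_text, 10_000_000)
for sp in simpoints:
    print(sp)

weights = {sp["index"]: sp["weight"] for sp in simpoints}
cpi = {0: 1.0, 1: 2.0, 2: 1.5}           # made-up per-simpoint CPI values
print(weighted_average(cpi, weights))
```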

Take Checkpoints in Gem5

Now that we know from the SimPoint analysis which regions are the most interesting, we can take checkpoints based on this information using Gem5’s checkpointing facility. An easy way of doing this is the one from the official doc:

% build/ARM/gem5.opt <base options> configs/example/se.py --take-simpoint-checkpoint=<simpoint file path>,<weight file path>,<interval length>,<warmup length> <rest of se.py options>

The --take-simpoint-checkpoint option here is a convenient utility that calculates the instruction numbers at which to take checkpoints. You can also calculate where to take checkpoints yourself and use the --take-checkpoints option.

Note that the warmup length here is quite important, because when you run the detailed simulation you want the caches and CPU state to be warmed up. The warmup length makes Gem5 take the checkpoint at simpoint_inst - warmup_length, so that you don’t have to worry about warming up before you hit the most interesting region. It is strongly recommended to set the warmup_length here so that you don’t have to hand code it later.

After this command finishes, Gem5 will have generated a directory named cpt.simpoint_xx... for each simpoint.
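The checkpoint position is simple arithmetic, sketched below in Python. The function name is made up, and clamping to instruction 0 when the simpoint starts earlier than one warmup length into the program is my assumption about the sensible behavior, not a statement of what the Gem5 script does.

```python
def checkpoint_position(interval_index, interval_length, warmup_length):
    """Return (checkpoint_inst, effective_warmup) for one simpoint.

    The checkpoint is taken warmup_length instructions before the simpoint
    starts. If the simpoint starts too early for a full warmup, this sketch
    checkpoints at instruction 0 and shortens the warmup (an assumption).
    """
    simpoint_inst = interval_index * interval_length
    if simpoint_inst >= warmup_length:
        return simpoint_inst - warmup_length, warmup_length
    return 0, simpoint_inst


# Simpoint in interval 3 with 10M-instruction intervals and 1M warmup.
print(checkpoint_position(3, 10_000_000, 1_000_000))
# Simpoint in the very first interval: no room for warmup.
print(checkpoint_position(0, 10_000_000, 1_000_000))
```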

Restore Checkpoints in Gem5

% build/ARM/gem5.opt <base options> configs/example/se.py --restore-simpoint-checkpoint -r <N> --checkpoint-dir <simpoint checkpoint path> --cpu-type=TimingSimpleCPU --restore-with-cpu=DerivO3CPU <rest of se.py options>

Note that the <N> here is off by 1 relative to the simpoint index: it counts from 1, so simpoint 0 is restored with -r 1. The <simpoint checkpoint path> here is the directory containing those cpt.xxx directories.

The CPU types here are also a bit confusing. Maybe in another post I’ll explain them along with Gem5’s checkpointing system, but in short: --cpu-type is the CPU that runs the warmup period and --restore-with-cpu is the CPU that runs the actual simulation after warmup. You don’t have to use two CPU types as I did here, but this combination generally provides a good balance between simulation speed and accuracy.

Because the warmup_length and interval_length mentioned earlier are encoded in the checkpoint directory name, the Gem5 script takes care of them and you don’t have to specify them again, which is neat.
