gem5 v24.0.0.0
gem5::TraceCPU::ElasticDataGen Class Reference
The elastic data memory request generator reads a protobuf trace containing an execution trace annotated with data and ordering dependencies.
#include <trace_cpu.hh>
Classes | |
struct | ElasticDataGenStatGroup |
class | GraphNode |
The struct GraphNode stores an instruction in the trace file. | |
class | HardwareResource |
The HardwareResource class models structures that hold the in-flight nodes. | |
class | InputStream |
The InputStream encapsulates a trace file and the internal buffers and populates GraphNodes based on the input. | |
struct | ReadyNode |
Struct to store a ready-to-execute node and its execution tick. | |
Public Member Functions | |
ElasticDataGen (TraceCPU &_owner, const std::string &_name, RequestPort &_port, RequestorID requestor_id, const std::string &trace_file, const TraceCPUParams &params) | |
Tick | init () |
Called from TraceCPU init(). | |
void | adjustInitTraceOffset (Tick &offset) |
Adjust traceOffset based on what TraceCPU init() determines on comparing the offsets in the fetch request and elastic traces. | |
const std::string & | name () const |
Returns name of the ElasticDataGen instance. | |
void | exit () |
Exit the ElasticDataGen. | |
bool | readNextWindow () |
Reads the next window of records from the trace file. | |
template<typename T > | |
void | addDepsOnParent (GraphNode *new_node, T &dep_list) |
Iterate over the dependencies of a new node and add the new node to the list of dependents of the parent node. | |
void | execute () |
This is the main execute function which consumes nodes from the sorted readyList. | |
PacketPtr | executeMemReq (GraphNode *node_ptr) |
Creates a new request for a load or store assigning the request parameters. | |
void | addToSortedReadyList (NodeSeqNum seq_num, Tick exec_tick) |
Add a ready node to the readyList. | |
void | printReadyList () |
Print readyList for debugging using debug flag TraceCPUData. | |
void | completeMemAccess (PacketPtr pkt) |
When a load writeback is received, that is when the load completes, release the dependents on it. | |
bool | isExecComplete () const |
Returns the execComplete variable which is set when the last node is executed. | |
bool | checkAndIssue (const GraphNode *node_ptr, bool first=true) |
Attempts to issue a node once the node's source dependencies are complete. | |
uint64_t | getMicroOpCount () const |
Get number of micro-ops modelled in the TraceCPU replay. | |
Protected Attributes | |
gem5::TraceCPU::ElasticDataGen::ElasticDataGenStatGroup | elasticStats |
Private Types | |
typedef uint64_t | NodeSeqNum |
Node sequence number type. | |
typedef uint64_t | NodeRobNum |
Node ROB number type. | |
typedef ProtoMessage::InstDepRecord::RecordType | RecordType |
typedef ProtoMessage::InstDepRecord | Record |
Private Attributes | |
TraceCPU & | owner |
Reference of the TraceCPU. | |
RequestPort & | port |
Reference of the port to be used to issue memory requests. | |
const RequestorID | requestorId |
RequestorID used for the requests being sent. | |
InputStream | trace |
Input stream used for reading the input trace file. | |
std::string | genName |
String to store the name of the ElasticDataGen. | |
PacketPtr | retryPkt |
PacketPtr used to store the packet to retry. | |
bool | traceComplete |
Set to true when end of trace is reached. | |
bool | nextRead |
Set to true when the next window of instructions needs to be read. | |
bool | execComplete |
Set true when execution of trace is complete. | |
const uint32_t | windowSize |
Window size within which to check for dependencies. | |
HardwareResource | hwResource |
Hardware resources required to contain in-flight nodes and to throttle issuing of new nodes when resources are not available. | |
std::unordered_map< NodeSeqNum, GraphNode * > | depGraph |
Store the depGraph of GraphNodes. | |
std::queue< const GraphNode * > | depFreeQueue |
Queue of dependency-free nodes that are pending issue because resources are not available. | |
std::list< ReadyNode > | readyList |
List of nodes that are ready to execute. | |
The elastic data memory request generator reads a protobuf trace containing an execution trace annotated with data and ordering dependencies.
It deduces the time at which to send a load/store request by tracking the dependencies. It attempts to send a memory request for a load/store without performing real execution of micro-ops. If the L1 cache port sends the packet successfully, the generator checks which instructions became dependency-free as a result and schedules an event accordingly. If it fails to send the packet, it waits for a retry from the cache.
Definition at line 524 of file trace_cpu.hh.
typedef uint64_t gem5::TraceCPU::ElasticDataGen::NodeRobNum | private |
Node ROB number type.
Definition at line 531 of file trace_cpu.hh.
typedef uint64_t gem5::TraceCPU::ElasticDataGen::NodeSeqNum | private |
Node sequence number type.
Definition at line 528 of file trace_cpu.hh.
typedef ProtoMessage::InstDepRecord gem5::TraceCPU::ElasticDataGen::Record | private |
Definition at line 534 of file trace_cpu.hh.
typedef ProtoMessage::InstDepRecord::RecordType gem5::TraceCPU::ElasticDataGen::RecordType | private |
Definition at line 533 of file trace_cpu.hh.
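gem5::TraceCPU::ElasticDataGen::ElasticDataGen | ( | TraceCPU & | _owner, |
const std::string & | _name, |
RequestPort & | _port, |
RequestorID | requestor_id, |
const std::string & | trace_file, |
const TraceCPUParams & | params ) |
inline |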
Definition at line 805 of file trace_cpu.hh.
References DPRINTF, and windowSize.
void gem5::TraceCPU::ElasticDataGen::addDepsOnParent | ( | GraphNode * | new_node, |
T & | dep_list ) |
Iterate over the dependencies of a new node and add the new node to the list of dependents of the parent node.
new_node | new node to add to the graph |
dep_list | the dependency list of type rob or register, that is to be iterated, and may get modified |
Definition at line 344 of file trace_cpu.cc.
References depGraph, elasticStats, gem5::TraceCPU::ElasticDataGen::ElasticDataGenStatGroup::maxDependents, and gem5::statistics::ScalarBase< Derived, Stor >::value().
Referenced by readNextWindow().
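The dependency annotation described above can be sketched as follows. This is a minimal illustration, not the gem5 implementation: `Node` and `DepGraph` are hypothetical, pared-down stand-ins for GraphNode and depGraph, and the sketch assumes that a dependency whose parent is no longer in the graph has already completed and is therefore erased from the list (one way the list "may get modified").

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical, pared-down stand-in for GraphNode.
struct Node {
    uint64_t seqNum = 0;
    std::vector<uint64_t> robDep;   // sequence numbers this node waits on
    std::vector<Node*> dependents;  // nodes that wait on this node
};

// Hypothetical stand-in for depGraph: sequence number -> node.
using DepGraph = std::unordered_map<uint64_t, Node*>;

// Walk the dependency list of new_node. If the parent is still in the
// graph, register new_node as one of its dependents. Otherwise assume the
// parent already completed and erase the stale dependency.
template <typename T>
void addDepsOnParent(const DepGraph& graph, Node* new_node, T& dep_list)
{
    assert(new_node != nullptr);
    for (auto it = dep_list.begin(); it != dep_list.end();) {
        auto parent = graph.find(*it);
        if (parent != graph.end()) {
            parent->second->dependents.push_back(new_node);
            ++it;
        } else {
            it = dep_list.erase(it);
        }
    }
}
```

With a graph that holds only node 1, a new node depending on nodes 1 and 2 ends up as a dependent of node 1, and its dependency on node 2 is dropped.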
void gem5::TraceCPU::ElasticDataGen::addToSortedReadyList | ( | NodeSeqNum | seq_num, |
Tick | exec_tick ) |
Add a ready node to the readyList.
When inserting, ensure the nodes are sorted in ascending order of their execute ticks.
seq_num | seq. num of ready node |
exec_tick | the execute tick of the ready node |
Definition at line 748 of file trace_cpu.cc.
References elasticStats, gem5::TraceCPU::ElasticDataGen::ReadyNode::execTick, gem5::TraceCPU::ElasticDataGen::ElasticDataGenStatGroup::maxReadyListSize, readyList, gem5::Packet::req, retryPkt, gem5::TraceCPU::ElasticDataGen::ReadyNode::seqNum, and gem5::statistics::ScalarBase< Derived, Stor >::value().
Referenced by checkAndIssue().
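A sorted insertion of this kind can be sketched as follows. The `Tick` alias and the free-function form are assumptions made for the example, and the sketch leaves out any special handling of a packet that is waiting for a retry (the member references retryPkt, so the real function likely treats that case separately).

```cpp
#include <cassert>
#include <cstdint>
#include <list>

using Tick = uint64_t;  // stand-in for gem5's Tick type

struct ReadyNode {
    uint64_t seqNum;
    Tick execTick;
};

// Insert a ready node so that the list stays sorted in ascending order of
// execTick. Nodes with equal ticks keep their insertion order.
void addToSortedReadyList(std::list<ReadyNode>& ready_list,
                          uint64_t seq_num, Tick exec_tick)
{
    auto it = ready_list.begin();
    while (it != ready_list.end() && it->execTick <= exec_tick)
        ++it;
    // Invariant: 'it' is the first entry with a strictly later tick.
    assert(it == ready_list.end() || it->execTick > exec_tick);
    ready_list.insert(it, ReadyNode{seq_num, exec_tick});
}
```

A linear scan is enough here because the list is expected to stay short; a node with the same tick as existing entries is placed after them, which preserves arrival order among ties.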
void gem5::TraceCPU::ElasticDataGen::adjustInitTraceOffset | ( | Tick & | offset | ) |
Adjust traceOffset based on what TraceCPU init() determines on comparing the offsets in the fetch request and elastic traces.
offset | trace offset set by comparing both traces |
Definition at line 277 of file trace_cpu.cc.
References gem5::ArmISA::offset, and readyList.
Referenced by gem5::TraceCPU::init().
bool gem5::TraceCPU::ElasticDataGen::checkAndIssue | ( | const GraphNode * | node_ptr, |
bool | first = true ) |
Attempts to issue a node once the node's source dependencies are complete.
If resources are available then add it to the readyList, otherwise the node is not issued and is stored in depFreeQueue until resources become available.
node_ptr | pointer to node to be issued |
first | true if this is the first attempt to issue this node |
Definition at line 641 of file trace_cpu.cc.
References addToSortedReadyList(), gem5::Clocked::clockEdge(), gem5::TraceCPU::ElasticDataGen::GraphNode::compDelay, depFreeQueue, DPRINTFR, hwResource, gem5::TraceCPU::ElasticDataGen::HardwareResource::isAvailable(), gem5::TraceCPU::ElasticDataGen::HardwareResource::occupy(), owner, gem5::TraceCPU::ElasticDataGen::GraphNode::regDep, gem5::TraceCPU::ElasticDataGen::GraphNode::robDep, gem5::TraceCPU::ElasticDataGen::GraphNode::robNum, gem5::TraceCPU::ElasticDataGen::GraphNode::seqNum, and gem5::TraceCPU::ElasticDataGen::GraphNode::typeToStr().
Referenced by completeMemAccess(), execute(), and readNextWindow().
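The issue-or-park decision can be illustrated with a small sketch. `Resource` and `Issuer` are hypothetical types invented for this example (a single counter of in-flight slots rather than the real HardwareResource), and the sketch assumes a node is pushed onto depFreeQueue only on its first attempt, since on later attempts it is already queued.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <queue>
#include <vector>

struct Node { uint64_t seqNum; };

// Hypothetical resource model: a fixed number of in-flight slots.
struct Resource {
    std::size_t capacity;
    std::size_t inFlight = 0;
    explicit Resource(std::size_t cap) : capacity(cap) {}
    bool isAvailable() const { return inFlight < capacity; }
    void occupy() { assert(isAvailable()); ++inFlight; }
};

struct Issuer {
    Resource hw;
    std::vector<uint64_t> ready;           // stand-in for readyList
    std::queue<const Node*> depFreeQueue;  // dependency-free, not yet issued

    explicit Issuer(std::size_t cap) : hw(cap) {}

    // Returns true if the node was issued. If resources are unavailable,
    // the node is parked in depFreeQueue, but only on its first attempt.
    bool checkAndIssue(const Node* node, bool first = true)
    {
        assert(node != nullptr);
        if (hw.isAvailable()) {
            hw.occupy();
            ready.push_back(node->seqNum);
            return true;
        }
        if (first)
            depFreeQueue.push(node);
        return false;
    }
};
```

With a capacity of one, the first node is issued and the second is parked; a later retry with `first = false` fails again without queuing the node a second time.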
void gem5::TraceCPU::ElasticDataGen::completeMemAccess | ( | PacketPtr | pkt | ) |
When a load writeback is received, that is when the load completes, release the dependents on it.
This is called from the dcache port recvTimingResp().
Definition at line 681 of file trace_cpu.cc.
References checkAndIssue(), gem5::Clocked::clockEdge(), gem5::TraceCPU::ElasticDataGen::GraphNode::dependents, depGraph, DPRINTF, hwResource, gem5::Packet::isWrite(), nextRead, owner, printReadyList(), readyList, gem5::TraceCPU::ElasticDataGen::HardwareResource::release(), gem5::TraceCPU::ElasticDataGen::HardwareResource::releaseStoreBuffer(), gem5::Packet::req, retryPkt, gem5::TraceCPU::ElasticDataGen::GraphNode::robNum, gem5::TraceCPU::schedDcacheNextEvent(), gem5::TraceCPU::ElasticDataGen::GraphNode::seqNum, traceComplete, gem5::TraceCPU::updateNumOps(), and windowSize.
Referenced by gem5::TraceCPU::dcacheRecvTimingResp().
void gem5::TraceCPU::ElasticDataGen::execute | ( | ) |
This is the main execute function which consumes nodes from the sorted readyList.
First, attempt to issue the pending dependency-free nodes held in the depFreeQueue, inserting the ready-to-issue nodes into the readyList. Then iterate through the readyList and, when a node has its execute tick equal to curTick(), execute it. If the node is a load or a store, call executeMemReq(); if it is neither, simply mark it complete.
Definition at line 369 of file trace_cpu.cc.
References gem5::TraceCPU::ElasticDataGen::HardwareResource::awaitingResponse(), checkAndIssue(), gem5::Clocked::clockEdge(), gem5::curTick(), gem5::TraceCPU::ElasticDataGen::ElasticDataGenStatGroup::dataLastTick, gem5::TraceCPU::ElasticDataGen::GraphNode::dependents, depFreeQueue, depGraph, DPRINTF, DPRINTFR, elasticStats, execComplete, executeMemReq(), hwResource, gem5::TraceCPU::ElasticDataGen::HardwareResource::isAvailable(), gem5::TraceCPU::ElasticDataGen::GraphNode::isLoad(), gem5::TraceCPU::ElasticDataGen::GraphNode::isStore(), gem5::TraceCPU::ElasticDataGen::GraphNode::isStrictlyOrdered(), nextRead, gem5::TraceCPU::ElasticDataGen::ElasticDataGenStatGroup::numRetrySucceeded, owner, panic, port, gem5::TraceCPU::ElasticDataGen::HardwareResource::printOccupancy(), printReadyList(), readNextWindow(), readyList, gem5::TraceCPU::ElasticDataGen::HardwareResource::release(), gem5::Packet::req, retryPkt, gem5::TraceCPU::ElasticDataGen::GraphNode::robNum, gem5::TraceCPU::schedDcacheNextEvent(), gem5::RequestPort::sendTimingReq(), gem5::TraceCPU::ElasticDataGen::GraphNode::seqNum, traceComplete, gem5::TraceCPU::updateNumOps(), and windowSize.
Referenced by gem5::TraceCPU::schedDcacheNext().
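The consumption of the sorted readyList can be sketched as below. This covers only the second half of the description (draining nodes whose execute tick equals the current tick); the `Kind` enum, the `Outcome` struct, and the `executeAt` function are hypothetical names for the example, and the current tick is passed in as a parameter instead of being read from curTick().

```cpp
#include <cassert>
#include <cstdint>
#include <list>
#include <vector>

using Tick = uint64_t;  // stand-in for gem5's Tick type

enum class Kind { Load, Store, Comp };

struct ReadyNode {
    uint64_t seqNum;
    Tick execTick;
    Kind kind;
};

// What happened to the nodes consumed at this tick.
struct Outcome {
    std::vector<uint64_t> memReqs;    // would be passed to executeMemReq()
    std::vector<uint64_t> completed;  // marked complete immediately
};

// Drain every node at the head of the sorted list whose tick equals 'now'.
// Because the list is sorted, the loop stops at the first later node.
Outcome executeAt(std::list<ReadyNode>& ready_list, Tick now)
{
    Outcome out;
    while (!ready_list.empty() && ready_list.front().execTick == now) {
        const ReadyNode node = ready_list.front();
        ready_list.pop_front();
        if (node.kind == Kind::Load || node.kind == Kind::Store)
            out.memReqs.push_back(node.seqNum);
        else
            out.completed.push_back(node.seqNum);
    }
    // A node scheduled in the past would indicate a scheduling bug.
    assert(ready_list.empty() || ready_list.front().execTick > now);
    return out;
}
```

Keeping the list sorted is what lets the loop stop early: nothing behind the first future node can be due.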
PacketPtr gem5::TraceCPU::ElasticDataGen::executeMemReq | ( | GraphNode * | node_ptr | ) |
Creates a new request for a load or store assigning the request parameters.
Calls the port's sendTimingReq() and returns a packet if the send failed so that it can be saved for a retry.
node_ptr | pointer to the load or store node to be executed |
Definition at line 565 of file trace_cpu.cc.
References gem5::TraceCPU::cacheLineSize, gem5::Packet::createRead(), gem5::Packet::createWrite(), gem5::Packet::dataDynamic(), DPRINTF, elasticStats, gem5::TraceCPU::ElasticDataGen::GraphNode::flags, gem5::TraceCPU::ElasticDataGen::GraphNode::isLoad(), gem5::TraceCPU::ElasticDataGen::GraphNode::isStrictlyOrdered(), gem5::TraceCPU::ElasticDataGen::ElasticDataGenStatGroup::numSendAttempted, gem5::TraceCPU::ElasticDataGen::ElasticDataGenStatGroup::numSendFailed, gem5::TraceCPU::ElasticDataGen::ElasticDataGenStatGroup::numSendSucceeded, gem5::TraceCPU::ElasticDataGen::ElasticDataGenStatGroup::numSOLoads, gem5::TraceCPU::ElasticDataGen::ElasticDataGenStatGroup::numSOStores, gem5::TraceCPU::ElasticDataGen::ElasticDataGenStatGroup::numSplitReqs, owner, gem5::TraceCPU::ElasticDataGen::GraphNode::pc, gem5::TraceCPU::ElasticDataGen::GraphNode::physAddr, port, requestorId, gem5::RequestPort::sendTimingReq(), gem5::TraceCPU::ElasticDataGen::GraphNode::seqNum, gem5::TraceCPU::ElasticDataGen::GraphNode::size, and gem5::TraceCPU::ElasticDataGen::GraphNode::virtAddr.
Referenced by execute().
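The "return the packet only if the send failed" convention can be sketched as follows. `Packet` and `Port` here are hypothetical toy types, not gem5's Packet or RequestPort, and ownership is modelled with `std::unique_ptr` purely for the example; in this toy model a successfully sent packet is simply released.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// Hypothetical toy packet.
struct Packet {
    uint64_t addr;
    bool isRead;
};

// Hypothetical port: accepts a request unless it is marked busy.
struct Port {
    bool busy = false;
    bool sendTimingReq(const Packet* pkt)
    {
        assert(pkt != nullptr);
        return !busy;
    }
};

// Build a read or write packet and try to send it. Returns nullptr on
// success, or the unsent packet on failure so the caller can keep it
// for a retry.
std::unique_ptr<Packet> executeMemReq(Port& port, uint64_t addr, bool is_load)
{
    std::unique_ptr<Packet> pkt(new Packet{addr, is_load});
    if (port.sendTimingReq(pkt.get()))
        return nullptr;  // sent; nothing to retry
    return pkt;          // send failed; hand the packet back
}
```

A caller would store a non-null result in something like retryPkt and resend it once the port signals that a retry is possible.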
void gem5::TraceCPU::ElasticDataGen::exit | ( | ) |
Exit the ElasticDataGen.
Definition at line 285 of file trace_cpu.cc.
References gem5::TraceCPU::ElasticDataGen::InputStream::reset(), and trace.
uint64_t gem5::TraceCPU::ElasticDataGen::getMicroOpCount | ( | ) | const |
inline |
Get number of micro-ops modelled in the TraceCPU replay.
Definition at line 931 of file trace_cpu.hh.
References gem5::TraceCPU::ElasticDataGen::InputStream::getMicroOpCount(), and trace.
Tick gem5::TraceCPU::ElasticDataGen::init | ( | ) |
Reads the first message from the input trace file and returns the send tick.
Definition at line 246 of file trace_cpu.cc.
References depGraph, DPRINTF, panic_if, printReadyList(), readNextWindow(), readyList, and windowSize.
Referenced by gem5::TraceCPU::init().
bool gem5::TraceCPU::ElasticDataGen::isExecComplete | ( | ) | const |
inline |
Returns the execComplete variable which is set when the last node is executed.
Definition at line 916 of file trace_cpu.hh.
References execComplete.
Referenced by gem5::TraceCPU::schedDcacheNext().
const std::string & gem5::TraceCPU::ElasticDataGen::name | ( | ) | const |
inline |
Returns name of the ElasticDataGen instance.
Definition at line 843 of file trace_cpu.hh.
References genName.
void gem5::TraceCPU::ElasticDataGen::printReadyList | ( | ) |
Print readyList for debugging using debug flag TraceCPUData.
Definition at line 808 of file trace_cpu.cc.
References depGraph, DPRINTF, DPRINTFR, readyList, and gem5::TraceCPU::ElasticDataGen::GraphNode::typeToStr().
Referenced by completeMemAccess(), execute(), and init().
bool gem5::TraceCPU::ElasticDataGen::readNextWindow | ( | ) |
Reads the next window of records from the trace file.
Returns true if the read succeeded. If the end of the file has been reached, it returns false.
Definition at line 291 of file trace_cpu.cc.
References addDepsOnParent(), checkAndIssue(), depGraph, DPRINTF, gem5::TraceCPU::ElasticDataGen::InputStream::read(), gem5::TraceCPU::ElasticDataGen::GraphNode::regDep, gem5::TraceCPU::ElasticDataGen::GraphNode::robDep, gem5::TraceCPU::ElasticDataGen::GraphNode::seqNum, trace, traceComplete, and windowSize.
std::queue< const GraphNode * > gem5::TraceCPU::ElasticDataGen::depFreeQueue | private |
Queue of dependency-free nodes that are pending issue because resources are not available.
This is chosen to be FIFO so that dependent nodes which become free in program order get pushed into the queue in that order. Thus nodes are more likely to issue in program order.
Definition at line 988 of file trace_cpu.hh.
Referenced by checkAndIssue(), and execute().
std::unordered_map< NodeSeqNum, GraphNode * > gem5::TraceCPU::ElasticDataGen::depGraph | private |
Store the depGraph of GraphNodes.
Definition at line 979 of file trace_cpu.hh.
Referenced by addDepsOnParent(), completeMemAccess(), execute(), init(), printReadyList(), and readNextWindow().
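gem5::TraceCPU::ElasticDataGen::ElasticDataGenStatGroup gem5::TraceCPU::ElasticDataGen::elasticStats | protected |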
Referenced by addDepsOnParent(), addToSortedReadyList(), execute(), and executeMemReq().
bool gem5::TraceCPU::ElasticDataGen::execComplete | private |
Set true when execution of trace is complete.
Definition at line 959 of file trace_cpu.hh.
Referenced by execute(), and isExecComplete().
std::string gem5::TraceCPU::ElasticDataGen::genName | private |
String to store the name of the ElasticDataGen.
Definition at line 947 of file trace_cpu.hh.
Referenced by name().
HardwareResource gem5::TraceCPU::ElasticDataGen::hwResource | private |
Hardware resources required to contain in-flight nodes and to throttle issuing of new nodes when resources are not available.
Definition at line 976 of file trace_cpu.hh.
Referenced by checkAndIssue(), completeMemAccess(), and execute().
bool gem5::TraceCPU::ElasticDataGen::nextRead | private |
Set to true when the next window of instructions needs to be read.
Definition at line 956 of file trace_cpu.hh.
Referenced by completeMemAccess(), and execute().
TraceCPU & gem5::TraceCPU::ElasticDataGen::owner | private |
Reference of the TraceCPU.
Definition at line 935 of file trace_cpu.hh.
Referenced by checkAndIssue(), completeMemAccess(), execute(), and executeMemReq().
RequestPort & gem5::TraceCPU::ElasticDataGen::port | private |
Reference of the port to be used to issue memory requests.
Definition at line 938 of file trace_cpu.hh.
Referenced by execute(), and executeMemReq().
std::list< ReadyNode > gem5::TraceCPU::ElasticDataGen::readyList | private |
List of nodes that are ready to execute.
Definition at line 991 of file trace_cpu.hh.
Referenced by addToSortedReadyList(), adjustInitTraceOffset(), completeMemAccess(), execute(), init(), and printReadyList().
const RequestorID gem5::TraceCPU::ElasticDataGen::requestorId | private |
RequestorID used for the requests being sent.
Definition at line 941 of file trace_cpu.hh.
Referenced by executeMemReq().
PacketPtr gem5::TraceCPU::ElasticDataGen::retryPkt | private |
PacketPtr used to store the packet to retry.
Definition at line 950 of file trace_cpu.hh.
Referenced by addToSortedReadyList(), completeMemAccess(), and execute().
InputStream gem5::TraceCPU::ElasticDataGen::trace | private |
Input stream used for reading the input trace file.
Definition at line 944 of file trace_cpu.hh.
Referenced by exit(), getMicroOpCount(), gem5::TraceCPU::ElasticDataGen::InputStream::read(), readNextWindow(), and gem5::TraceCPU::ElasticDataGen::InputStream::reset().
bool gem5::TraceCPU::ElasticDataGen::traceComplete | private |
Set to true when end of trace is reached.
Definition at line 953 of file trace_cpu.hh.
Referenced by completeMemAccess(), execute(), and readNextWindow().
const uint32_t gem5::TraceCPU::ElasticDataGen::windowSize | private |
Window size within which to check for dependencies.
Its value is made equal to the window size used to generate the trace, which is recorded in the trace header. The dependency graph must stay populated enough that, when a node completes, any potential child node can be found and the dependency removed before the completed node itself is removed. Thus, as soon as the graph shrinks below this window size, the next window is read in.
Definition at line 970 of file trace_cpu.hh.
Referenced by completeMemAccess(), ElasticDataGen(), execute(), init(), and readNextWindow().
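The refill rule tied to windowSize can be sketched as below. `WindowedGraph` is a hypothetical type for this example: the trace file is modelled as an in-memory queue of sequence numbers, the graph as a set, and `complete()` stands in for whatever removes a finished node from depGraph.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <unordered_set>
#include <utility>

// Hypothetical model of a dependency graph that is refilled one window
// at a time from a trace.
struct WindowedGraph {
    std::size_t windowSize;
    std::deque<uint64_t> trace;          // stand-in for the trace file
    std::unordered_set<uint64_t> graph;  // sequence numbers in the graph
    bool traceComplete = false;

    WindowedGraph(std::size_t w, std::deque<uint64_t> t)
        : windowSize(w), trace(std::move(t))
    {
        assert(windowSize > 0);
    }

    // Read up to windowSize records. Returns false once the trace ends.
    bool readNextWindow()
    {
        if (traceComplete)
            return false;
        for (std::size_t n = 0; n < windowSize; ++n) {
            if (trace.empty()) {
                traceComplete = true;
                return false;
            }
            graph.insert(trace.front());
            trace.pop_front();
        }
        return true;
    }

    // Remove a completed node, then refill if the graph has shrunk
    // below one window.
    void complete(uint64_t seq_num)
    {
        graph.erase(seq_num);
        if (!traceComplete && graph.size() < windowSize)
            readNextWindow();
    }
};
```

With a window of two and a five-record trace, completing the first node triggers a read of the next two records, so the graph never holds less than one window until the trace runs out.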