|
gem5 [DEVELOP-FOR-25.1]
|
#include <bac.hh>
Classes | |
| struct | BACStats |
| struct | Stalls |
| Source of possible stalls. More... | |
Public Types | |
| enum | BACStatus { Active , Inactive } |
| Overall decoupled BPU stage status. More... | |
| enum | ThreadStatus { Idle , Running , Squashing , Blocked , FTQFull , FTQLocked , ThreadStatusMax } |
| Individual thread status. More... | |
Public Member Functions | |
| BAC (CPU *_cpu, const BaseO3CPUParams &params) | |
| BAC constructor. | |
| std::string | name () const |
| Returns the name of the stage. | |
| void | regProbePoints () |
| Registers probes and listeners. | |
| void | setTimeBuffer (TimeBuffer< TimeStruct > *tb_ptr) |
| Sets the main backwards communication time buffer pointer. | |
| void | setActiveThreads (std::list< ThreadID > *at_ptr) |
| Sets pointer to list of active threads. | |
| void | setFetchTargetQueue (FTQ *_ptr) |
| Connect the FTQ. | |
| void | startupStage () |
| Initialize stage. | |
| void | clearStates (ThreadID tid) |
| Clear all thread-specific states. | |
| void | drainResume () |
| Resume after a drain. | |
| void | drainSanityCheck () const |
| Perform sanity checks after a drain. | |
| bool | isDrained () const |
| Has the stage drained? | |
| void | drainStall (ThreadID tid) |
| Stall the fetch stage after reaching a safe drain point. | |
| void | takeOverFrom () |
| Takes over from another CPU's thread. | |
| void | deactivateThread (ThreadID tid) |
| void | tick () |
| Process all input signals and create the next fetch target. | |
| bool | updatePC (const DynInstPtr &inst, PCStateBase &fetch_pc, FetchTargetPtr &ft) |
| Calculate the next PC address depending on the instruction type and the branch prediction. | |
Protected Attributes | |
| gem5::o3::BAC::BACStats | stats |
Private Types | |
| typedef branch_prediction::BranchType | BranchType |
Private Member Functions | |
| void | resetStage () |
| Reset this pipeline stage. | |
| void | switchToActive () |
| Changes the status of this stage to active, and indicates this to the CPU. | |
| void | switchToInactive () |
| Changes the status of this stage to inactive, and indicates this to the CPU. | |
| bool | checkStall (ThreadID tid) const |
| Checks if a thread is stalled. | |
| void | updateBACStatus () |
| Updates overall BAC stage status; to be called at the end of each cycle. | |
| bool | checkSignalsAndUpdate (ThreadID tid) |
| Checks all input signals and updates the status as necessary. | |
| bool | checkAndUpdateBPUSignals (ThreadID tid) |
| Check the backward signals that update the BPU. | |
| FetchTargetPtr | newFetchTarget (ThreadID tid, const PCStateBase &start_pc) |
| Create a new fetch target. | |
| bool | predict (ThreadID tid, const StaticInstPtr &inst, const FetchTargetPtr &ft, PCStateBase &pc) |
| The prediction function for the BAC stage. | |
| void | generateFetchTargets (ThreadID tid, bool &status_change) |
| Main function that feeds the FTQ with new fetch targets. | |
| bool | updatePreDecode (ThreadID tid, const InstSeqNum seqNum, const StaticInstPtr &inst, PCStateBase &pc, const FetchTargetPtr &ft) |
| Pre-decode update: after pre-decoding in the fetch stage, all instructions are known and a sequence number is assigned to each of them. | |
| void | squash (const PCStateBase &new_pc, ThreadID tid) |
| Squashes BAC for a specific thread and resets the PC. | |
| void | squashBpuHistories (ThreadID tid) |
| Squashes the BPU histories in the FTQ. | |
| Addr | alignToCacheBlock (Addr addr) |
| Align an address to the start of a cache block. | |
Private Attributes | |
| BACStatus | _status |
| Overall BAC stage status. | |
| ThreadStatus | bacStatus [MaxThreads] |
| Per-thread status. | |
| CPU * | cpu |
| Pointer to the main CPU. | |
| branch_prediction::BPredUnit * | bpu |
| BPredUnit. | |
| FTQ * | ftq |
| Fetch target Queue. | |
| TimeBuffer< TimeStruct > * | timeBuffer |
| Time buffer interface. | |
| TimeBuffer< TimeStruct >::wire | fromFetch |
| Wire to get fetch's information from backwards time buffer. | |
| TimeBuffer< TimeStruct >::wire | fromDecode |
| Wire to get decode's information from backwards time buffer. | |
| TimeBuffer< TimeStruct >::wire | fromCommit |
| Wire to get commit's information from backwards time buffer. | |
| TimeBuffer< FetchStruct >::wire | toFetch |
| Wire used to write any information heading to fetch. | |
| std::unique_ptr< PCStateBase > | bacPC [MaxThreads] |
| The decoupled PC which runs ahead of fetch. | |
| bool | wroteToTimeBuffer |
| Variable that tracks if BAC has written to the time buffer this cycle. | |
| Stalls | stalls [MaxThreads] |
| Tracks which stages are telling the FTQ to stall. | |
| const bool | decoupledFrontEnd |
| Enables the decoupled front-end. | |
| const Cycles | fetchToBacDelay |
| Fetch to BAC delay. | |
| const Cycles | decodeToFetchDelay |
| Decode to fetch delay. | |
| const Cycles | commitToFetchDelay |
| Commit to fetch delay. | |
| const Cycles | bacToFetchDelay |
| BAC to fetch delay. | |
| const unsigned int | cacheBlkSize |
| Cache block size. | |
| const unsigned | fetchTargetWidth |
| The maximum width of a fetch target. | |
| const unsigned | minInstSize |
| The minimum size an instruction can have in the current architecture. | |
| std::list< ThreadID > * | activeThreads |
| List of Active FTQ Threads. | |
| const ThreadID | numThreads |
| Number of threads. | |
| const unsigned | maxFTPerCycle |
| const unsigned | maxTakenPredPerCycle |
|
private |
| gem5::o3::BAC::BAC | ( | CPU * | _cpu, |
| const BaseO3CPUParams & | params ) |
BAC constructor.
Definition at line 77 of file bac.cc.
References bacPC, bacToFetchDelay, bpu, cacheBlkSize, commitToFetchDelay, cpu, decodeToFetchDelay, decoupledFrontEnd, fatal_if, fetchTargetWidth, fetchToBacDelay, ftq, gem5::ArmISA::i, maxFTPerCycle, maxTakenPredPerCycle, gem5::o3::MaxThreads, minInstSize, numThreads, stalls, stats, and wroteToTimeBuffer.
Referenced by gem5::o3::BAC::BACStats::BACStats(), checkAndUpdateBPUSignals(), checkSignalsAndUpdate(), checkStall(), generateFetchTargets(), newFetchTarget(), squash(), squashBpuHistories(), updatePC(), and updatePreDecode().
Align an address to the start of a cache block.
Definition at line 421 of file bac.hh.
References gem5::X86ISA::addr, and cacheBlkSize.
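As a rough illustration of what aligning to a cache block means, the sketch below rounds an address down to a block boundary with a bit mask. It is a standalone free function written for this page, not the body of the member function; the `Addr` alias is a stand-in for gem5's type, and the power-of-two block size is an assumption.

```cpp
#include <cassert>
#include <cstdint>

using Addr = std::uint64_t;  // stand-in for gem5's Addr type (assumption)

// Round addr down to the start of the cache block that contains it.
// Assumes blk_size is a non-zero power of two, so masking is sufficient.
Addr alignToCacheBlock(Addr addr, unsigned blk_size)
{
    assert(blk_size != 0 && (blk_size & (blk_size - 1)) == 0);
    return addr & ~(static_cast<Addr>(blk_size) - 1);
}
```

With a 64-byte block, address 0x1234 maps to 0x1200, and an address that is already aligned is returned unchanged.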
|
private |
Check the backward signals that update the BPU.
Definition at line 290 of file bac.cc.
References BAC(), bacStatus, bpu, DPRINTF, fromCommit, fromDecode, fromFetch, squash(), squashBpuHistories(), Squashing, and stats.
Referenced by checkSignalsAndUpdate(), and tick().
|
private |
Checks all input signals and updates the status as necessary.
Definition at line 375 of file bac.cc.
References BAC(), bacStatus, Blocked, checkAndUpdateBPUSignals(), checkStall(), cpu, DPRINTF, ftq, FTQFull, FTQLocked, Idle, Running, squashBpuHistories(), Squashing, and stalls.
Referenced by tick().
|
private |
| void gem5::o3::BAC::clearStates | ( | ThreadID | tid | ) |
Clear all thread-specific states.
Definition at line 148 of file bac.cc.
References bacPC, bacStatus, cpu, ftq, Running, gem5::ArmISA::set, and stalls.
Referenced by resetStage().
| void gem5::o3::BAC::drainResume | ( | ) |
Resume after a drain.
Definition at line 174 of file bac.cc.
References DPRINTF, gem5::ArmISA::i, numThreads, and stalls.
| void gem5::o3::BAC::drainSanityCheck | ( | ) | const |
Perform sanity checks after a drain.
Definition at line 183 of file bac.cc.
References bacStatus, bpu, ftq, gem5::ArmISA::i, Idle, isDrained(), numThreads, and stalls.
| void gem5::o3::BAC::drainStall | ( | ThreadID | tid | ) |
Stall the fetch stage after reaching a safe drain point.
The CPU uses this method to stop fetching instructions from a thread that has been drained. The drain stall is different from all other stalls in that it is signaled instantly from the commit stage (without the normal communication delay) when it has reached a safe point to drain from.
|
private |
Main function that feeds the FTQ with new fetch targets.
By leveraging the BTB, up to N consecutive addresses are searched to detect a branch instruction. For every BTB hit, the direction predictor is asked to make a prediction. In every cycle one fetch target is created. A fetch target ends once the first branch instruction is detected or the maximum search bandwidth for a cycle is reached.
This function implements the head of the decoupled front-end. Instead of waiting for the current instruction to be pre-decoded, as done in the standard front-end, the BTB is leveraged to find branches in the instruction stream.
Starting from the current address, we search all consecutive addresses for an entry in the BTB. As soon as the BTB hits, we know we have reached a branch instruction and make a prediction for the branch. The start and end addresses of this so-called fetch target are stored together with the prediction in the FTQ.
Depending on the prediction of the BPU, either the branch target or the fall-through address determines the start address for the next fetch target and search cycle.
For simplicity, each fetch target contains at most one branch. However, because a not-taken branch does not require redirecting the fetch unit, CPUs may continue fetching past it. Therefore, this implementation may create multiple fetch targets per cycle. A cycle ends when (1) the fetch target size is reached, (2) an upper bound of fetch targets per cycle is reached, or (3) a branch is predicted as taken.
The same mechanism enables us to simulate making multiple taken predictions per cycle, as is the case in very recent commercial CPUs.
Definition at line 585 of file bac.cc.
References BAC(), bacPC, bacStatus, bpu, gem5::PCStateBase::clone(), DPRINTF, fetchTargetWidth, ftq, FTQFull, gem5::PCStateBase::instAddr(), gem5::StaticInst::isLastMicroop(), gem5::StaticInst::isMicroop(), maxFTPerCycle, maxTakenPredPerCycle, minInstSize, newFetchTarget(), predict(), gem5::ArmISA::set, gem5::PCStateBase::set(), gem5::StaticInst::size(), stats, and wroteToTimeBuffer.
Referenced by tick().
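The search loop described for generateFetchTargets() can be sketched as a small standalone model. This is not the gem5 implementation: the BTB is a plain map, the direction prediction is stored in the toy BTB entry, instructions are assumed to be fixed-size (`minInstSize` bytes), and every type and function name below other than the four configuration names is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

using Addr = std::uint64_t;

// Toy BTB entry: branch target plus a canned direction prediction.
struct BtbEntry { Addr target; bool predTaken; };

// Half-open address range [start, end) containing at most one branch.
struct FetchTarget { Addr start; Addr end; bool endsInBranch; bool predTaken; };

struct Config {
    unsigned fetchTargetWidth;      // maximum bytes per fetch target
    unsigned minInstSize;           // search granularity in bytes
    unsigned maxFTPerCycle;         // upper bound on fetch targets per cycle
    unsigned maxTakenPredPerCycle;  // upper bound on taken predictions per cycle
};

// Produce the fetch targets of one cycle. pc is advanced to the address
// where the search of the next cycle starts.
std::vector<FetchTarget>
generateCycle(const std::map<Addr, BtbEntry> &btb, const Config &cfg, Addr &pc)
{
    assert(cfg.minInstSize > 0);
    std::vector<FetchTarget> fts;
    unsigned taken = 0;
    while (fts.size() < cfg.maxFTPerCycle) {          // end condition (2)
        FetchTarget ft{pc, pc, false, false};
        Addr next_pc = pc;
        // Probe consecutive addresses until the BTB hits or the width is used up.
        while (ft.end - ft.start < cfg.fetchTargetWidth) {
            const Addr cur = ft.end;
            ft.end += cfg.minInstSize;
            next_pc = ft.end;                          // fall-through address
            auto it = btb.find(cur);
            if (it != btb.end()) {
                ft.endsInBranch = true;
                ft.predTaken = it->second.predTaken;
                if (ft.predTaken)
                    next_pc = it->second.target;       // redirect to the target
                break;
            }
        }
        fts.push_back(ft);
        pc = next_pc;
        if (!ft.endsInBranch)
            break;                                     // end condition (1)
        if (ft.predTaken && ++taken >= cfg.maxTakenPredPerCycle)
            break;                                     // end condition (3)
    }
    return fts;
}
```

In this model a not-taken branch closes the current fetch target but lets the same cycle open another one, while raising `maxTakenPredPerCycle` above one allows the cycle to continue past a taken branch as well.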
| bool gem5::o3::BAC::isDrained | ( | ) | const |
Has the stage drained?
Definition at line 196 of file bac.cc.
References bacStatus, ftq, gem5::ArmISA::i, Idle, and numThreads.
Referenced by drainSanityCheck().
| std::string gem5::o3::BAC::name | ( | ) | const |
|
private |
|
private |
The prediction function for the BAC stage.
In the decoupled scenario, the branch history is not added to the BPU's own predictor history, because the sequence number is not known at the moment a prediction is made.
| inst | The branch instruction. |
| ft | The fetch target that is currently processed. |
| pc | The predicted PC is passed back through this parameter. |
Perform the prediction. The prediction history object is pushed onto the fetch target, which allows tracking which object belongs to which branch. It also allows inserting dummy objects for branches that were not detected by the BAC stage due to BTB misses. The postFetch() function will move the history from the FT to the main history of the BPU and insert these missing histories.
Definition at line 565 of file bac.cc.
References bpu, DPRINTF, and gem5::MipsISA::pc.
Referenced by generateFetchTargets().
|
inline |
|
private |
Reset this pipeline stage.
Definition at line 162 of file bac.cc.
References _status, clearStates(), Inactive, numThreads, and wroteToTimeBuffer.
Referenced by startupStage(), and takeOverFrom().
Sets pointer to list of active threads.
Definition at line 126 of file bac.cc.
References activeThreads.
| void gem5::o3::BAC::setFetchTargetQueue | ( | FTQ * | _ptr | ) |
| void gem5::o3::BAC::setTimeBuffer | ( | TimeBuffer< TimeStruct > * | tb_ptr | ) |
Sets the main backwards communication time buffer pointer.
Definition at line 115 of file bac.cc.
References commitToFetchDelay, decodeToFetchDelay, fetchToBacDelay, fromCommit, fromDecode, fromFetch, and timeBuffer.
|
private |
Squashes BAC for a specific thread and resets the PC.
Definition at line 484 of file bac.cc.
References BAC(), bacPC, bacStatus, decoupledFrontEnd, DPRINTF, ftq, gem5::ArmISA::set, and Squashing.
Referenced by checkAndUpdateBPUSignals().
|
private |
Squashes the BPU histories in the FTQ by iterating from tail to head and reverting the predictions made.
Definition at line 464 of file bac.cc.
References BAC(), bpu, decoupledFrontEnd, DPRINTF, and ftq.
Referenced by checkAndUpdateBPUSignals(), checkSignalsAndUpdate(), and updatePreDecode().
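Why the iteration runs from tail to head can be seen in a toy model where each prediction records the global history it saw before updating it. Undoing the youngest prediction first leaves the predictor in the state it had before the oldest in-flight prediction. All types and names in this sketch are hypothetical; gem5's FTQ and BPredUnit interfaces differ.

```cpp
#include <cstdint>
#include <deque>

// Hypothetical prediction record: the global history before the prediction.
struct PredHistory { std::uint32_t ghrBefore; };

// Toy fetch target that may carry one prediction record.
struct FetchTarget { bool hasHist; PredHistory hist; };

// Toy predictor with a speculatively updated global history register.
struct ToyBpu {
    std::uint32_t ghr = 0;

    PredHistory predict(bool taken)
    {
        PredHistory h{ghr};
        ghr = (ghr << 1) | (taken ? 1u : 0u);
        return h;
    }

    void revert(const PredHistory &h) { ghr = h.ghrBefore; }
};

// The head (oldest entry) is at the front, the tail (youngest) at the back.
// Walking from tail to head undoes the speculative updates in reverse order.
void squashHistories(std::deque<FetchTarget> &ftq, ToyBpu &bpu)
{
    for (auto it = ftq.rbegin(); it != ftq.rend(); ++it) {
        if (it->hasHist) {
            bpu.revert(it->hist);
            it->hasHist = false;
        }
    }
}
```

Reverting in the opposite order would finish on the youngest record and leave the history register holding a value that still includes the older speculative updates.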
| void gem5::o3::BAC::startupStage | ( | ) |
Initialize stage.
Definition at line 139 of file bac.cc.
References resetStage(), and switchToActive().
|
private |
Changes the status of this stage to active, and indicates this to the CPU.
Definition at line 223 of file bac.cc.
References _status, Active, gem5::o3::CPU::BACIdx, cpu, DPRINTF, and Inactive.
Referenced by startupStage().
|
private |
|
inline |
Takes over from another CPU's thread.
Definition at line 177 of file bac.hh.
References resetStage().
| void gem5::o3::BAC::tick | ( | ) |
Process all input signals and create the next fetch target.
Definition at line 503 of file bac.cc.
References activeThreads, bacStatus, checkAndUpdateBPUSignals(), checkSignalsAndUpdate(), cpu, decoupledFrontEnd, DPRINTF, generateFetchTargets(), Idle, Running, stats, and updateBACStatus().
|
private |
| bool gem5::o3::BAC::updatePC | ( | const DynInstPtr & | inst, |
| PCStateBase & | fetch_pc, | ||
| FetchTargetPtr & | ft ) |
Calculate the next PC address depending on the instruction type and the branch prediction.
| inst | The currently processed dynamic instruction. |
| fetch_pc | The current fetch PC passed in by reference. It will be updated with what the next PC will be. |
| ft | The currently processed fetch target. Can be nullptr for the non-decoupled scenario. |
Definition at line 904 of file bac.cc.
References BAC(), bpu, decoupledFrontEnd, DPRINTF, ftq, stats, and updatePreDecode().
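The next-PC rule summarized for updatePC() can be illustrated with a toy version: a control instruction predicted taken redirects the fetch PC to the predicted target, and everything else falls through to the next sequential address. The instruction and prediction structs here are hypothetical stand-ins, and the sketch ignores fetch targets, micro-ops, and statistics.

```cpp
#include <cstdint>

using Addr = std::uint64_t;

struct ToyInst { bool isControl; unsigned size; };  // hypothetical instruction
struct ToyPrediction { bool taken; Addr target; };  // hypothetical BPU result

// Update fetch_pc in place and report whether the fetch stream was
// redirected by a predicted-taken branch.
bool updatePCSketch(const ToyInst &inst, const ToyPrediction &pred,
                    Addr &fetch_pc)
{
    if (inst.isControl && pred.taken) {
        fetch_pc = pred.target;
        return true;
    }
    fetch_pc += inst.size;  // fall through to the next instruction
    return false;
}
```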
|
private |
Pre-decode update: after pre-decoding in the fetch stage, all instructions are known and a sequence number is assigned to each of them.
Post-fetch part.
The fetch stage will call this function for every branch instruction to allow the BAC stage to update the branch predictor history.
There can be the following two cases:
This function performs the following steps:
Together with inserting an instruction into the instruction queue, check whether the instruction matches the predicted instruction type. If so, update the information with the new instruction. If the types do not match, something is wrong and we need to squash (this should not happen).
| seqNum | The sequence number of the branch that we want to update. |
| inst | The new pre-decoded branch instruction. |
| tid | The thread id. |
Definition at line 764 of file bac.cc.
References gem5::StaticInst::advancePC(), BAC(), gem5::branch_prediction::BPredUnit::PredictorHistory::bpHistory, bpu, DPRINTF, ftq, gem5::branch_prediction::getBranchType(), gem5::StaticInst::isLastMicroop(), gem5::StaticInst::isMicroop(), gem5::StaticInst::isUncondCtrl(), gem5::branch_prediction::BPredUnit::PredictorHistory::pc, gem5::MipsISA::pc, gem5::branch_prediction::BPredUnit::PredictorHistory::predTaken, gem5::branch_prediction::BPredUnit::PredictorHistory::seqNum, gem5::ArmISA::set, squashBpuHistories(), stats, gem5::branch_prediction::BPredUnit::PredictorHistory::target, gem5::branch_prediction::toString(), and gem5::branch_prediction::BPredUnit::PredictorHistory::type.
Referenced by updatePC().
|
private |
Overall BAC stage status.
Definition at line 119 of file bac.hh.
Referenced by resetStage(), switchToActive(), switchToInactive(), and updateBACStatus().
List of Active FTQ Threads.
Definition at line 408 of file bac.hh.
Referenced by setActiveThreads(), tick(), and updateBACStatus().
|
private |
The decoupled PC which runs ahead of fetch.
Definition at line 361 of file bac.hh.
Referenced by BAC(), clearStates(), generateFetchTargets(), and squash().
|
private |
Per-thread status.
Definition at line 122 of file bac.hh.
Referenced by checkAndUpdateBPUSignals(), checkSignalsAndUpdate(), clearStates(), drainSanityCheck(), generateFetchTargets(), isDrained(), squash(), tick(), and updateBACStatus().
|
private |
|
private |
BPredUnit.
Definition at line 340 of file bac.hh.
Referenced by BAC(), checkAndUpdateBPUSignals(), checkStall(), drainSanityCheck(), generateFetchTargets(), predict(), squashBpuHistories(), updatePC(), and updatePreDecode().
|
private |
Cache block size.
Definition at line 395 of file bac.hh.
Referenced by alignToCacheBlock(), and BAC().
|
private |
|
private |
Pointer to the main CPU.
Definition at line 337 of file bac.hh.
Referenced by BAC(), gem5::o3::BAC::BACStats::BACStats(), checkSignalsAndUpdate(), clearStates(), drainStall(), name(), newFetchTarget(), switchToActive(), switchToInactive(), tick(), and updateBACStatus().
|
private |
|
private |
Enables the decoupled front-end.
Definition at line 380 of file bac.hh.
Referenced by BAC(), squash(), squashBpuHistories(), tick(), and updatePC().
|
private |
The maximum width of a fetch target.
This also determines the maximum number of addresses searched in one cycle (fetch target width / minInstSize).
Definition at line 399 of file bac.hh.
Referenced by BAC(), gem5::o3::BAC::BACStats::BACStats(), and generateFetchTargets().
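The relation between the fetch target width and the number of addresses searched can be checked with a one-line helper. The function name is made up for this illustration, and the 32-byte width in the usage note is only an example value, not a gem5 default.

```cpp
#include <cassert>

// Number of candidate addresses probed for one fetch target:
// fetch target width divided by the minimum instruction size.
unsigned maxSearchAddrs(unsigned fetch_target_width, unsigned min_inst_size)
{
    assert(min_inst_size > 0);
    return fetch_target_width / min_inst_size;
}
```

For a 32-byte fetch target, a minimum instruction size of 4 bytes gives 8 probes, whereas a minimum instruction size of 1 byte gives 32 probes.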
|
private |
|
private |
Wire to get commit's information from backwards time buffer.
Definition at line 355 of file bac.hh.
Referenced by checkAndUpdateBPUSignals(), and setTimeBuffer().
|
private |
Wire to get decode's information from backwards time buffer.
Definition at line 352 of file bac.hh.
Referenced by checkAndUpdateBPUSignals(), and setTimeBuffer().
|
private |
Wire to get fetch's information from backwards time buffer.
Definition at line 349 of file bac.hh.
Referenced by checkAndUpdateBPUSignals(), and setTimeBuffer().
|
private |
Definition at line 343 of file bac.hh.
Referenced by BAC(), checkSignalsAndUpdate(), clearStates(), drainSanityCheck(), generateFetchTargets(), isDrained(), setFetchTargetQueue(), squash(), squashBpuHistories(), updatePC(), and updatePreDecode().
|
private |
Definition at line 414 of file bac.hh.
Referenced by BAC(), gem5::o3::BAC::BACStats::BACStats(), and generateFetchTargets().
|
private |
Definition at line 417 of file bac.hh.
Referenced by BAC(), and generateFetchTargets().
|
private |
The minimum size an instruction can have in the current architecture.
It determines the search granularity of the decoupled front-end. For x86, for example, this must be 1. For fixed-size ISAs it should be equal to the instruction size to speed up simulation.
Definition at line 405 of file bac.hh.
Referenced by BAC(), and generateFetchTargets().
|
private |
Number of threads.
Definition at line 411 of file bac.hh.
Referenced by BAC(), drainResume(), drainSanityCheck(), isDrained(), and resetStage().
|
private |
Tracks which stages are telling the FTQ to stall.
Definition at line 377 of file bac.hh.
Referenced by BAC(), checkSignalsAndUpdate(), checkStall(), clearStates(), drainResume(), drainSanityCheck(), and drainStall().
|
protected |
Referenced by BAC(), checkAndUpdateBPUSignals(), generateFetchTargets(), newFetchTarget(), tick(), updatePC(), and updatePreDecode().
|
private |
|
private |
|
private |
Variable that tracks if BAC has written to the time buffer this cycle.
Used to tell CPU if there is activity this cycle.
Definition at line 366 of file bac.hh.
Referenced by BAC(), generateFetchTargets(), and resetStage().