gem5  v20.1.0.0
ComputeUnit Class Reference

#include <compute_unit.hh>

Inheritance diagram for ComputeUnit (classes shown): ClockedObject, SimObject, Clocked, EventManager, Serializable, Drainable, Stats::Group

Classes

class  DataPort
 Data access Port. More...
 
class  DTLBPort
 Data TLB port. More...
 
class  GMTokenPort
 
class  ITLBPort
 
class  LDSPort
 The port intended to communicate between the CU and its LDS. More...
 
class  ScalarDataPort
 
class  ScalarDTLBPort
 
class  SQCPort
 

Public Types

typedef ComputeUnitParams Params
 
typedef std::unordered_map< Addr, std::pair< int, int > > pageDataStruct
 
- Public Types inherited from ClockedObject
typedef ClockedObjectParams Params
 Parameters of ClockedObject. More...
 
- Public Types inherited from SimObject
typedef SimObjectParams Params
 

Public Member Functions

int numExeUnits () const
 
int firstMemUnit () const
 
int lastMemUnit () const
 
int mapWaveToScalarAlu (Wavefront *w) const
 
int mapWaveToScalarAluGlobalIdx (Wavefront *w) const
 
int mapWaveToGlobalMem (Wavefront *w) const
 
int mapWaveToLocalMem (Wavefront *w) const
 
int mapWaveToScalarMem (Wavefront *w) const
 
void insertInPipeMap (Wavefront *w)
 
void deleteFromPipeMap (Wavefront *w)
 
 ComputeUnit (const Params *p)
 
 ~ComputeUnit ()
 
int oprNetPipeLength () const
 
int simdUnitWidth () const
 
int spBypassLength () const
 
int dpBypassLength () const
 
int scalarPipeLength () const
 
int storeBusLength () const
 
int loadBusLength () const
 
int wfSize () const
 
void exec ()
 
void initiateFetch (Wavefront *wavefront)
 
void fetch (PacketPtr pkt, Wavefront *wavefront)
 
void fillKernelState (Wavefront *w, HSAQueueEntry *task)
 
void startWavefront (Wavefront *w, int waveId, LdsChunk *ldsChunk, HSAQueueEntry *task, int bar_id, bool fetchContext=false)
 
void doInvalidate (RequestPtr req, int kernId)
 Trigger an invalidate operation in the CU. More...
 
void doFlush (GPUDynInstPtr gpuDynInst)
 Trigger a flush operation in the CU. More...
 
void dispWorkgroup (HSAQueueEntry *task, int num_wfs_in_wg)
 
bool hasDispResources (HSAQueueEntry *task, int &num_wfs_in_wg)
 
int cacheLineSize () const
 
int getCacheLineBits () const
 
int numYetToReachBarrier (int bar_id)
 
bool allAtBarrier (int bar_id)
 
void incNumAtBarrier (int bar_id)
 
int numAtBarrier (int bar_id)
 
int maxBarrierCnt (int bar_id)
 
void resetBarrier (int bar_id)
 
void decMaxBarrierCnt (int bar_id)
 
void releaseBarrier (int bar_id)
 
void releaseWFsFromBarrier (int bar_id)
 
int numBarrierSlots () const
 
template<typename c0 , typename c1 >
void doSmReturn (GPUDynInstPtr gpuDynInst)
 
virtual void init () override
 init() is called after all C++ SimObjects have been created and all ports are connected. More...
 
void sendRequest (GPUDynInstPtr gpuDynInst, PortID index, PacketPtr pkt)
 
void sendScalarRequest (GPUDynInstPtr gpuDynInst, PacketPtr pkt)
 
void injectGlobalMemFence (GPUDynInstPtr gpuDynInst, bool kernelMemSync, RequestPtr req=nullptr)
 
void handleMemPacket (PacketPtr pkt, int memport_index)
 
bool processTimingPacket (PacketPtr pkt)
 
void processFetchReturn (PacketPtr pkt)
 
void updatePageDivergenceDist (Addr addr)
 
RequestorID requestorId ()
 
bool isDone () const
 
bool isVectorAluIdle (uint32_t simdId) const
 
void updateInstStats (GPUDynInstPtr gpuDynInst)
 
void regStats () override
 Callback to set stat parameters. More...
 
LdsState & getLds () const
 
int32_t getRefCounter (const uint32_t dispatchId, const uint32_t wgId) const
 
bool sendToLds (GPUDynInstPtr gpuDynInst) __attribute__((warn_unused_result))
 Send a general request to the LDS. Make sure to check the return value: the request might be NACK'd, and a return value of false means the caller needs a backup plan. More...
 
void exitCallback ()
 
TokenManager * getTokenManager ()
 
Port & getPort (const std::string &if_name, PortID idx) override
 Get a port with a given name and index. More...
 
InstSeqNum getAndIncSeqNum ()
 
- Public Member Functions inherited from ClockedObject
 ClockedObject (const ClockedObjectParams *p)
 
const Params * params () const
 
void serialize (CheckpointOut &cp) const override
 Serialize an object. More...
 
void unserialize (CheckpointIn &cp) override
 Unserialize an object. More...
 
- Public Member Functions inherited from SimObject
const Params * params () const
 
 SimObject (const Params *_params)
 
virtual ~SimObject ()
 
virtual const std::string name () const
 
virtual void loadState (CheckpointIn &cp)
 loadState() is called on each SimObject when restoring from a checkpoint. More...
 
virtual void initState ()
 initState() is called on each SimObject when not restoring from a checkpoint. More...
 
virtual void regProbePoints ()
 Register probe points for this object. More...
 
virtual void regProbeListeners ()
 Register probe listeners for this object. More...
 
ProbeManager * getProbeManager ()
 Get the probe manager for this object. More...
 
virtual void startup ()
 startup() is the final initialization call before simulation. More...
 
DrainState drain () override
 Provide a default implementation of the drain interface for objects that don't need draining. More...
 
virtual void memWriteback ()
 Write back dirty buffers to memory using functional writes. More...
 
virtual void memInvalidate ()
 Invalidate the contents of memory buffers. More...
 
void serialize (CheckpointOut &cp) const override
 Serialize an object. More...
 
void unserialize (CheckpointIn &cp) override
 Unserialize an object. More...
 
- Public Member Functions inherited from EventManager
EventQueue * eventQueue () const
 
void schedule (Event &event, Tick when)
 
void deschedule (Event &event)
 
void reschedule (Event &event, Tick when, bool always=false)
 
void schedule (Event *event, Tick when)
 
void deschedule (Event *event)
 
void reschedule (Event *event, Tick when, bool always=false)
 
void wakeupEventQueue (Tick when=(Tick) -1)
 This function is not needed by the usual gem5 event loop but may be necessary in derived EventQueues which host gem5 on other schedulers. More...
 
void setCurTick (Tick newVal)
 
 EventManager (EventManager &em)
 Event manager manages events in the event queue. More...
 
 EventManager (EventManager *em)
 
 EventManager (EventQueue *eq)
 
- Public Member Functions inherited from Serializable
 Serializable ()
 
virtual ~Serializable ()
 
void serializeSection (CheckpointOut &cp, const char *name) const
 Serialize an object into a new section. More...
 
void serializeSection (CheckpointOut &cp, const std::string &name) const
 
void unserializeSection (CheckpointIn &cp, const char *name)
 Unserialize a child object. More...
 
void unserializeSection (CheckpointIn &cp, const std::string &name)
 
- Public Member Functions inherited from Drainable
DrainState drainState () const
 Return the current drain state of an object. More...
 
virtual void notifyFork ()
 Notify a child process of a fork. More...
 
- Public Member Functions inherited from Stats::Group
 Group (Group *parent, const char *name=nullptr)
 Construct a new statistics group. More...
 
virtual ~Group ()
 
virtual void resetStats ()
 Callback to reset stats. More...
 
virtual void preDumpStats ()
 Callback before stats are dumped. More...
 
void addStat (Stats::Info *info)
 Register a stat with this group. More...
 
const std::map< std::string, Group * > & getStatGroups () const
 Get all child groups associated with this object. More...
 
const std::vector< Info * > & getStats () const
 Get all stats associated with this object. More...
 
void addStatGroup (const char *name, Group *block)
 Add a stat block as a child of this block. More...
 
const Info * resolveStat (std::string name) const
 Resolve a stat by its name within this group. More...
 
 Group ()=delete
 
 Group (const Group &)=delete
 
Group & operator= (const Group &)=delete
 
- Public Member Functions inherited from Clocked
void updateClockPeriod ()
 Update the tick to the current tick. More...
 
Tick clockEdge (Cycles cycles=Cycles(0)) const
 Determine the tick when a cycle begins, by default the current one, but the argument also enables the caller to determine a future cycle. More...
 
Cycles curCycle () const
 Determine the current cycle, corresponding to a tick aligned to a clock edge. More...
 
Tick nextCycle () const
 Based on the clock of the object, determine the start tick of the first cycle that is at least one cycle in the future. More...
 
uint64_t frequency () const
 
Tick clockPeriod () const
 
double voltage () const
 
Cycles ticksToCycles (Tick t) const
 
Tick cyclesToTicks (Cycles c) const
 

Public Attributes

int numVectorGlobalMemUnits
 
WaitClass glbMemToVrfBus
 
WaitClass vrfToGlobalMemPipeBus
 
WaitClass vectorGlobalMemUnit
 
int numVectorSharedMemUnits
 
WaitClass locMemToVrfBus
 
WaitClass vrfToLocalMemPipeBus
 
WaitClass vectorSharedMemUnit
 
int numScalarMemUnits
 
WaitClass scalarMemToSrfBus
 
WaitClass srfToScalarMemPipeBus
 
WaitClass scalarMemUnit
 
int numVectorALUs
 
std::vector< WaitClass > vectorALUs
 
int numScalarALUs
 
std::vector< WaitClass > scalarALUs
 
int vrfToCoalescerBusWidth
 
int coalescerToVrfBusWidth
 
int numCyclesPerStoreTransfer
 
int numCyclesPerLoadTransfer
 
std::unordered_set< uint64_t > pipeMap
 
RegisterManager * registerManager
 
FetchStage fetchStage
 
ScoreboardCheckStage scoreboardCheckStage
 
ScheduleStage scheduleStage
 
ExecStage execStage
 
GlobalMemPipeline globalMemoryPipe
 
LocalMemPipeline localMemoryPipe
 
ScalarMemPipeline scalarMemoryPipe
 
EventFunctionWrapper tickEvent
 
std::vector< std::vector< Wavefront * > > wfList
 
int cu_id
 
std::vector< VectorRegisterFile * > vrf
 
std::vector< ScalarRegisterFile * > srf
 
int simdWidth
 
int spBypassPipeLength
 
int dpBypassPipeLength
 
int scalarPipeStages
 
int operandNetworkLength
 
Cycles issuePeriod
 
Cycles vrf_gm_bus_latency
 
Cycles srf_scm_bus_latency
 
Cycles vrf_lm_bus_latency
 
std::vector< uint64_t > lastExecCycle
 
Stats::VectorDistribution instInterleave
 
std::vector< uint64_t > instExecPerSimd
 
bool perLaneTLB
 
int prefetchDepth
 
int prefetchStride
 
std::vector< Addr > lastVaddrCU
 
std::vector< std::vector< Addr > > lastVaddrSimd
 
std::vector< std::vector< std::vector< Addr > > > lastVaddrWF
 
Enums::PrefetchType prefetchType
 
EXEC_POLICY exec_policy
 
bool debugSegFault
 
Tick idleCUTimeout
 
int idleWfs
 
bool functionalTLB
 
bool localMemBarrier
 
bool countPages
 
Shader * shader
 
Tick req_tick_latency
 
Tick resp_tick_latency
 
std::vector< int > numWfsToSched
 Number of WFs to schedule to each SIMD. More...
 
std::vector< int > vectorRegsReserved
 
std::vector< int > scalarRegsReserved
 
int numVecRegsPerSimd
 
int numScalarRegsPerSimd
 
std::map< Addr, int > pagesTouched
 
Stats::Scalar vALUInsts
 
Stats::Formula vALUInstsPerWF
 
Stats::Scalar sALUInsts
 
Stats::Formula sALUInstsPerWF
 
Stats::Scalar instCyclesVALU
 
Stats::Scalar instCyclesSALU
 
Stats::Scalar threadCyclesVALU
 
Stats::Formula vALUUtilization
 
Stats::Scalar ldsNoFlatInsts
 
Stats::Formula ldsNoFlatInstsPerWF
 
Stats::Scalar flatVMemInsts
 
Stats::Formula flatVMemInstsPerWF
 
Stats::Scalar flatLDSInsts
 
Stats::Formula flatLDSInstsPerWF
 
Stats::Scalar vectorMemWrites
 
Stats::Formula vectorMemWritesPerWF
 
Stats::Scalar vectorMemReads
 
Stats::Formula vectorMemReadsPerWF
 
Stats::Scalar scalarMemWrites
 
Stats::Formula scalarMemWritesPerWF
 
Stats::Scalar scalarMemReads
 
Stats::Formula scalarMemReadsPerWF
 
Stats::Formula vectorMemReadsPerKiloInst
 
Stats::Formula vectorMemWritesPerKiloInst
 
Stats::Formula vectorMemInstsPerKiloInst
 
Stats::Formula scalarMemReadsPerKiloInst
 
Stats::Formula scalarMemWritesPerKiloInst
 
Stats::Formula scalarMemInstsPerKiloInst
 
Stats::Vector instCyclesVMemPerSimd
 
Stats::Vector instCyclesScMemPerSimd
 
Stats::Vector instCyclesLdsPerSimd
 
Stats::Scalar globalReads
 
Stats::Scalar globalWrites
 
Stats::Formula globalMemInsts
 
Stats::Scalar argReads
 
Stats::Scalar argWrites
 
Stats::Formula argMemInsts
 
Stats::Scalar spillReads
 
Stats::Scalar spillWrites
 
Stats::Formula spillMemInsts
 
Stats::Scalar groupReads
 
Stats::Scalar groupWrites
 
Stats::Formula groupMemInsts
 
Stats::Scalar privReads
 
Stats::Scalar privWrites
 
Stats::Formula privMemInsts
 
Stats::Scalar readonlyReads
 
Stats::Scalar readonlyWrites
 
Stats::Formula readonlyMemInsts
 
Stats::Scalar kernargReads
 
Stats::Scalar kernargWrites
 
Stats::Formula kernargMemInsts
 
int activeWaves
 
Stats::Distribution waveLevelParallelism
 
Stats::Scalar tlbRequests
 
Stats::Scalar tlbCycles
 
Stats::Formula tlbLatency
 
Stats::Vector hitsPerTLBLevel
 
Stats::Scalar ldsBankAccesses
 
Stats::Distribution ldsBankConflictDist
 
Stats::Distribution pageDivergenceDist
 
Stats::Scalar dynamicGMemInstrCnt
 
Stats::Scalar dynamicFlatMemInstrCnt
 
Stats::Scalar dynamicLMemInstrCnt
 
Stats::Scalar wgBlockedDueBarrierAllocation
 
Stats::Scalar wgBlockedDueLdsAllocation
 
Stats::Scalar numInstrExecuted
 
Stats::Distribution execRateDist
 
Stats::Scalar numVecOpsExecuted
 
Stats::Scalar numVecOpsExecutedF16
 
Stats::Scalar numVecOpsExecutedF32
 
Stats::Scalar numVecOpsExecutedF64
 
Stats::Scalar numVecOpsExecutedFMA16
 
Stats::Scalar numVecOpsExecutedFMA32
 
Stats::Scalar numVecOpsExecutedFMA64
 
Stats::Scalar numVecOpsExecutedMAC16
 
Stats::Scalar numVecOpsExecutedMAC32
 
Stats::Scalar numVecOpsExecutedMAC64
 
Stats::Scalar numVecOpsExecutedMAD16
 
Stats::Scalar numVecOpsExecutedMAD32
 
Stats::Scalar numVecOpsExecutedMAD64
 
Stats::Scalar numVecOpsExecutedTwoOpFP
 
Stats::Scalar totalCycles
 
Stats::Formula vpc
 
Stats::Formula vpc_f16
 
Stats::Formula vpc_f32
 
Stats::Formula vpc_f64
 
Stats::Formula ipc
 
Stats::Distribution controlFlowDivergenceDist
 
Stats::Distribution activeLanesPerGMemInstrDist
 
Stats::Distribution activeLanesPerLMemInstrDist
 
Stats::Formula numALUInstsExecuted
 
Stats::Scalar numTimesWgBlockedDueVgprAlloc
 
Stats::Scalar numTimesWgBlockedDueSgprAlloc
 
Stats::Scalar numCASOps
 
Stats::Scalar numFailedCASOps
 
Stats::Scalar completedWfs
 
Stats::Scalar completedWGs
 
Stats::Distribution headTailLatency
 
pageDataStruct pageAccesses
 
TokenManager * memPortTokens
 
GMTokenPort gmTokenPort
 
LDSPort ldsPort
 The port to access the Local Data Store. Can be connected to an LDS object. More...
 
std::vector< DataPort > memPort
 The memory port for SIMD data accesses. More...
 
std::vector< DTLBPort > tlbPort
 
ScalarDataPort scalarDataPort
 
ScalarDTLBPort scalarDTLBPort
 
SQCPort sqcPort
 
ITLBPort sqcTLBPort
 
- Public Attributes inherited from ClockedObject
PowerState * powerState
 

Protected Attributes

RequestorID _requestorId
 
LdsState & lds
 
- Protected Attributes inherited from SimObject
const SimObjectParams * _params
 Cached copy of the object parameters. More...
 
- Protected Attributes inherited from EventManager
EventQueue * eventq
 A pointer to this object's event queue. More...
 

Private Member Functions

WFBarrier & barrierSlot (int bar_id)
 
int getFreeBarrierId ()
 

Private Attributes

const int _cacheLineSize
 
const int _numBarrierSlots
 
int cacheLineBits
 
InstSeqNum globalSeqNum
 
int wavefrontSize
 
ScoreboardCheckToSchedule scoreboardCheckToSchedule
 TODO: Update these comments once the pipe stage interface has been fully refactored. More...
 
ScheduleToExecute scheduleToExecute
 
std::vector< WFBarrier > wfBarrierSlots
 The barrier slots for this CU. More...
 
std::unordered_set< int > freeBarrierIds
 A set used to easily retrieve a free barrier ID. More...
 
std::unordered_map< GPUDynInstPtr, Tick > headTailMap
 

Additional Inherited Members

- Static Public Member Functions inherited from SimObject
static void serializeAll (CheckpointOut &cp)
 Serialize all SimObjects in the system. More...
 
static SimObject * find (const char *name)
 Find the SimObject with the given name and return a pointer to it. More...
 
- Static Public Member Functions inherited from Serializable
static const std::string & currentSection ()
 Gets the fully-qualified name of the active section. More...
 
static void serializeAll (const std::string &cpt_dir)
 Serializes all the SimObjects. More...
 
static void unserializeGlobals (CheckpointIn &cp)
 
- Protected Member Functions inherited from Drainable
 Drainable ()
 
virtual ~Drainable ()
 
virtual void drainResume ()
 Resume execution after a successful drain. More...
 
void signalDrainDone () const
 Signal that an object is drained. More...
 
- Protected Member Functions inherited from Clocked
 Clocked (ClockDomain &clk_domain)
 Create a clocked object and set the clock domain based on the parameters. More...
 
 Clocked (Clocked &)=delete
 
Clocked & operator= (Clocked &)=delete
 
virtual ~Clocked ()
 Virtual destructor due to inheritance. More...
 
void resetClock () const
 Reset the object's clock using the current global tick value. More...
 
virtual void clockPeriodUpdated ()
 A hook subclasses can implement so they can do any extra work that's needed when the clock rate is changed. More...
 

Detailed Description

Definition at line 198 of file compute_unit.hh.

Member Typedef Documentation

◆ pageDataStruct

typedef std::unordered_map<Addr, std::pair<int, int> > ComputeUnit::pageDataStruct

Definition at line 626 of file compute_unit.hh.

◆ Params

typedef ComputeUnitParams ComputeUnit::Params

Definition at line 287 of file compute_unit.hh.

Constructor & Destructor Documentation

◆ ComputeUnit()

ComputeUnit::ComputeUnit ( const Params * p)

This check is necessary because std::bitset only provides conversion to unsigned long or unsigned long long via to_ulong() or to_ullong(). There are a few places in the code where to_ullong() is used; however, if wavefrontSize is larger than a value the host can support, then bitset will throw a runtime exception. We should remove all uses of to_ulong() and to_ullong() so that wavefrontSize can be greater than 64; until that is done, this assert is required.

Definition at line 62 of file compute_unit.cc.

References exec().
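
The std::bitset limitation described above can be reproduced outside gem5. The following is a minimal sketch, not code from compute_unit.cc: the helper names fitsInUllong and wideMaskTopBit are hypothetical, and the 128-bit width is only an illustrative stand-in for a wavefront size larger than the host's unsigned long long.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical helper (not part of gem5): reports whether a lane mask can
// be converted with to_ullong() without throwing std::overflow_error.
template <std::size_t N>
bool fitsInUllong(const std::bitset<N> &mask)
{
    try {
        (void)mask.to_ullong();
        return true;
    } catch (const std::overflow_error &) {
        // Thrown when a set bit lies beyond the width of unsigned long long.
        return false;
    }
}

// Hypothetical 128-lane mask with only the highest lane active.
inline std::bitset<128> wideMaskTopBit()
{
    std::bitset<128> mask;
    mask.set(127);
    return mask;
}
```

Note that the exception depends on which bits are set, not only on the declared width: a 128-bit mask whose active lanes all fall in the low 64 bits still converts without error.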

◆ ~ComputeUnit()

ComputeUnit::~ComputeUnit ( )

Member Function Documentation

◆ allAtBarrier()

bool ComputeUnit::allAtBarrier ( int  bar_id)

Definition at line 638 of file compute_unit.cc.

References barrierSlot().

Referenced by ScoreboardCheckStage::ready().

◆ barrierSlot()

WFBarrier& ComputeUnit::barrierSlot ( int  bar_id)
inline, private

◆ cacheLineSize()

int ComputeUnit::cacheLineSize ( ) const
inline

Definition at line 414 of file compute_unit.hh.

References _cacheLineSize.

Referenced by FetchUnit::init(), and FetchUnit::initiateFetch().

◆ decMaxBarrierCnt()

void ComputeUnit::decMaxBarrierCnt ( int  bar_id)

Definition at line 673 of file compute_unit.cc.

References barrierSlot().

Referenced by Gcn3ISA::Inst_SOPP__S_ENDPGM::execute().

◆ deleteFromPipeMap()

void ComputeUnit::deleteFromPipeMap ( Wavefront * w)

Definition at line 491 of file compute_unit.cc.

References panic_if, pipeMap, and MipsISA::w.

Referenced by Wavefront::exec().

◆ dispWorkgroup()

void ComputeUnit::dispWorkgroup ( HSAQueueEntry * task,
int  num_wfs_in_wg 
)

◆ doFlush()

void ComputeUnit::doFlush ( GPUDynInstPtr  gpuDynInst)

Trigger a flush operation in the CU.

gpuDynInst: inst passed to the request

Definition at line 399 of file compute_unit.cc.

References injectGlobalMemFence().

◆ doInvalidate()

void ComputeUnit::doInvalidate ( RequestPtr  req,
int  kernId 
)

Trigger an invalidate operation in the CU.

req: request initialized in shader, carrying the invalidate flags

Definition at line 380 of file compute_unit.cc.

References getAndIncSeqNum(), and injectGlobalMemFence().

◆ doSmReturn()

template<typename c0 , typename c1 >
void ComputeUnit::doSmReturn ( GPUDynInstPtr  gpuDynInst)

◆ dpBypassLength()

int ComputeUnit::dpBypassLength ( ) const
inline

Definition at line 393 of file compute_unit.hh.

References dpBypassPipeLength.

Referenced by VectorRegisterFile::waveExecuteInst().

◆ exec()

void ComputeUnit::exec ( )

◆ exitCallback()

void ComputeUnit::exitCallback ( )

◆ fetch()

void ComputeUnit::fetch ( PacketPtr  pkt,
Wavefront * wavefront 
)

◆ fillKernelState()

void ComputeUnit::fillKernelState ( Wavefront * w,
HSAQueueEntry * task 
)

◆ firstMemUnit()

int ComputeUnit::firstMemUnit ( ) const

Definition at line 236 of file compute_unit.cc.

References numScalarALUs, and numVectorALUs.

Referenced by ScheduleStage::arbitrateVrfToLdsBus(), and ScheduleStage::exec().

◆ getAndIncSeqNum()

InstSeqNum ComputeUnit::getAndIncSeqNum ( )
inline

Definition at line 1023 of file compute_unit.hh.

References globalSeqNum.

Referenced by doInvalidate().

◆ getCacheLineBits()

int ComputeUnit::getCacheLineBits ( ) const
inline

Definition at line 415 of file compute_unit.hh.

References cacheLineBits.

Referenced by FetchUnit::initiateFetch().

◆ getFreeBarrierId()

int ComputeUnit::getFreeBarrierId ( )
inline, private

Definition at line 426 of file compute_unit.hh.

References freeBarrierIds.

Referenced by dispWorkgroup().
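
The member list describes freeBarrierIds as a set used to retrieve a free barrier ID, and releaseBarrier() also references it. The sketch below shows one way such a pool could behave; it is an assumption for illustration, not the gem5 implementation, and the class name BarrierIdPool and its methods are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>

// Hypothetical pool of barrier IDs backed by an unordered_set, loosely
// modelled on the documented purpose of ComputeUnit::freeBarrierIds.
class BarrierIdPool
{
  public:
    explicit BarrierIdPool(int numSlots)
    {
        // Initially every slot ID in [0, numSlots) is free.
        for (int i = 0; i < numSlots; ++i)
            freeIds.insert(i);
    }

    bool hasFree() const { return !freeIds.empty(); }

    std::size_t numFree() const { return freeIds.size(); }

    // Take any free ID out of the set; the caller must check hasFree().
    int acquire()
    {
        assert(!freeIds.empty());
        auto it = freeIds.begin();
        int id = *it;
        freeIds.erase(it);
        return id;
    }

    // Return an ID to the pool once its barrier is no longer in use.
    void release(int id) { freeIds.insert(id); }

  private:
    std::unordered_set<int> freeIds;
};
```

A set gives constant-time average insertion and removal, at the cost of handing out IDs in no particular order.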

◆ getLds()

LdsState& ComputeUnit::getLds ( ) const
inline

Definition at line 615 of file compute_unit.hh.

References lds.

Referenced by Gcn3ISA::Inst_SOPP__S_ENDPGM::execute().

◆ getPort()

Port& ComputeUnit::getPort ( const std::string &  if_name,
PortID  idx 
)
inline, override, virtual

Get a port with a given name and index.

This is used at binding time and returns a reference to a protocol-agnostic port.

gem5 has a request and response port interface. All memory objects are connected together via ports, which provide a rigid interface between these memory objects and allow SimObjects to communicate with each other. The ports implement three different memory system modes: timing, atomic, and functional. The most important is timing mode, which is used for conducting cycle-level timing experiments. The other modes are only used in special circumstances and should not be used to conduct cycle-level timing experiments.

Parameters
if_name: Port name
idx: Index in the case of a VectorPort
Returns
A reference to the given port

Reimplemented from SimObject.

Definition at line 1002 of file compute_unit.hh.

References SimObject::getPort(), ldsPort, memPort, scalarDataPort, scalarDTLBPort, sqcPort, sqcTLBPort, and tlbPort.
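
To illustrate the name-and-index lookup described above, here is a minimal sketch using stand-in types. It does not use the gem5 Port class, and the interface names "memory_port", "lds_port", and "sqc_port" are placeholders chosen for this example rather than names confirmed by this page. Where the real method is documented as referencing SimObject::getPort(), the sketch simply throws.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Stand-in for gem5's Port; it only carries a label for identification.
struct MockPort
{
    std::string label;
};

// Hypothetical object that owns one vector port and two scalar ports.
class MockComputeUnit
{
  public:
    explicit MockComputeUnit(std::size_t numMemPorts)
        : ldsPort{"lds"}, sqcPort{"sqc"}
    {
        for (std::size_t i = 0; i < numMemPorts; ++i)
            memPort.push_back(MockPort{"mem" + std::to_string(i)});
    }

    // Resolve a port by interface name; idx selects an element of a
    // vector port and is ignored for scalar ports.
    MockPort &
    getPort(const std::string &ifName, std::size_t idx = 0)
    {
        if (ifName == "memory_port" && idx < memPort.size())
            return memPort[idx];
        if (ifName == "lds_port")
            return ldsPort;
        if (ifName == "sqc_port")
            return sqcPort;
        // Placeholder for delegating to a base-class lookup.
        throw std::out_of_range("no such port: " + ifName);
    }

  private:
    std::vector<MockPort> memPort;
    MockPort ldsPort;
    MockPort sqcPort;
};
```

Returning a reference rather than a copy matters here: binding code needs the port object owned by the unit, not a temporary.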

◆ getRefCounter()

int32_t ComputeUnit::getRefCounter ( const uint32_t  dispatchId,
const uint32_t  wgId 
) const

Definition at line 2513 of file compute_unit.cc.

References LdsState::getRefCounter(), and lds.

◆ getTokenManager()

TokenManager* ComputeUnit::getTokenManager ( )
inline

Definition at line 981 of file compute_unit.hh.

References memPortTokens.

Referenced by GlobalMemPipeline::exec(), and Wavefront::exec().

◆ handleMemPacket()

void ComputeUnit::handleMemPacket ( PacketPtr  pkt,
int  memport_index 
)

◆ hasDispResources()

bool ComputeUnit::hasDispResources ( HSAQueueEntry * task,
int &  num_wfs_in_wg 
)

◆ incNumAtBarrier()

void ComputeUnit::incNumAtBarrier ( int  bar_id)

Definition at line 645 of file compute_unit.cc.

References barrierSlot().

Referenced by Gcn3ISA::Inst_SOPP__S_BARRIER::execute().

◆ init()

void ComputeUnit::init ( )
override, virtual

◆ initiateFetch()

void ComputeUnit::initiateFetch ( Wavefront * wavefront)

◆ injectGlobalMemFence()

void ComputeUnit::injectGlobalMemFence ( GPUDynInstPtr  gpuDynInst,
bool  kernelMemSync,
RequestPtr  req = nullptr 
)

◆ insertInPipeMap()

void ComputeUnit::insertInPipeMap ( Wavefront * w)

Definition at line 482 of file compute_unit.cc.

References panic_if, pipeMap, and MipsISA::w.

Referenced by ScheduleStage::addToSchList().

◆ isDone()

bool ComputeUnit::isDone ( ) const

◆ isVectorAluIdle()

bool ComputeUnit::isVectorAluIdle ( uint32_t  simdId) const

Definition at line 2520 of file compute_unit.cc.

References Shader::n_wf, numVectorALUs, Wavefront::S_STOPPED, shader, and wfList.

Referenced by isDone().

◆ lastMemUnit()

int ComputeUnit::lastMemUnit ( ) const

Definition at line 243 of file compute_unit.cc.

References numExeUnits().

Referenced by ScheduleStage::exec().

◆ loadBusLength()

int ComputeUnit::loadBusLength ( ) const
inline

Definition at line 396 of file compute_unit.hh.

References numCyclesPerLoadTransfer.

Referenced by LdsState::processPacket().

◆ mapWaveToGlobalMem()

int ComputeUnit::mapWaveToGlobalMem ( Wavefront * w) const

Definition at line 268 of file compute_unit.cc.

References numScalarALUs, and numVectorALUs.

Referenced by Wavefront::init().

◆ mapWaveToLocalMem()

int ComputeUnit::mapWaveToLocalMem ( Wavefront * w) const

Definition at line 276 of file compute_unit.cc.

References numScalarALUs, numVectorALUs, and numVectorGlobalMemUnits.

Referenced by Wavefront::init().

◆ mapWaveToScalarAlu()

int ComputeUnit::mapWaveToScalarAlu ( Wavefront * w) const

Definition at line 250 of file compute_unit.cc.

References numScalarALUs, and MipsISA::w.

Referenced by Wavefront::init(), and mapWaveToScalarAluGlobalIdx().

◆ mapWaveToScalarAluGlobalIdx()

int ComputeUnit::mapWaveToScalarAluGlobalIdx ( Wavefront * w) const

Definition at line 261 of file compute_unit.cc.

References mapWaveToScalarAlu(), numVectorALUs, and MipsISA::w.

Referenced by Wavefront::init().

◆ mapWaveToScalarMem()

int ComputeUnit::mapWaveToScalarMem ( Wavefront * w) const

◆ maxBarrierCnt()

int ComputeUnit::maxBarrierCnt ( int  bar_id)

Definition at line 659 of file compute_unit.cc.

References barrierSlot().

Referenced by Gcn3ISA::Inst_SOPP__S_ENDPGM::execute().

◆ numAtBarrier()

int ComputeUnit::numAtBarrier ( int  bar_id)

Definition at line 652 of file compute_unit.cc.

References barrierSlot().

Referenced by Gcn3ISA::Inst_SOPP__S_BARRIER::execute().

◆ numBarrierSlots()

int ComputeUnit::numBarrierSlots ( ) const
inline

Definition at line 445 of file compute_unit.hh.

References _numBarrierSlots.

◆ numExeUnits()

int ComputeUnit::numExeUnits ( ) const

◆ numYetToReachBarrier()

int ComputeUnit::numYetToReachBarrier ( int  bar_id)

Definition at line 631 of file compute_unit.cc.

References barrierSlot().

Referenced by Gcn3ISA::Inst_SOPP__S_BARRIER::execute().

◆ oprNetPipeLength()

int ComputeUnit::oprNetPipeLength ( ) const
inline

Definition at line 390 of file compute_unit.hh.

References operandNetworkLength.

◆ processFetchReturn()

void ComputeUnit::processFetchReturn ( PacketPtr  pkt)

◆ processTimingPacket()

bool ComputeUnit::processTimingPacket ( PacketPtr  pkt)

◆ regStats()

void ComputeUnit::regStats ( )
override, virtual

Callback to set stat parameters.

This callback is typically used for complex stats (e.g., distributions) that need parameters in addition to a name and a description. Stat names and descriptions should typically be set from the constructor using the ADD_STAT macro.

Reimplemented from Stats::Group.

Definition at line 1806 of file compute_unit.cc.

References activeLanesPerGMemInstrDist, activeLanesPerLMemInstrDist, argMemInsts, argReads, argWrites, completedWfs, completedWGs, controlFlowDivergenceDist, csprintf(), Stats::DataWrap< Derived, InfoProxyType >::desc(), dynamicFlatMemInstrCnt, dynamicGMemInstrCnt, dynamicLMemInstrCnt, execRateDist, execStage, fetchStage, Stats::DataWrap< Derived, InfoProxyType >::flags(), flatLDSInsts, flatLDSInstsPerWF, flatVMemInsts, flatVMemInstsPerWF, globalMemInsts, globalMemoryPipe, globalReads, globalWrites, groupMemInsts, groupReads, groupWrites, headTailLatency, hitsPerTLBLevel, ArmISA::i, Stats::VectorBase< Derived, Stor >::init(), Stats::Distribution::init(), Stats::VectorDistribution::init(), instCyclesLdsPerSimd, instCyclesSALU, instCyclesScMemPerSimd, instCyclesVALU, instCyclesVMemPerSimd, instInterleave, ipc, kernargMemInsts, kernargReads, kernargWrites, ldsBankAccesses, ldsBankConflictDist, ldsNoFlatInsts, ldsNoFlatInstsPerWF, localMemoryPipe, Shader::n_wf, SimObject::name(), Stats::DataWrap< Derived, InfoProxyType >::name(), numALUInstsExecuted, numCASOps, numFailedCASOps, numInstrExecuted, numTimesWgBlockedDueSgprAlloc, numTimesWgBlockedDueVgprAlloc, numVecOpsExecuted, numVecOpsExecutedF16, numVecOpsExecutedF32, numVecOpsExecutedF64, numVecOpsExecutedFMA16, numVecOpsExecutedFMA32, numVecOpsExecutedFMA64, numVecOpsExecutedMAC16, numVecOpsExecutedMAC32, numVecOpsExecutedMAC64, numVecOpsExecutedMAD16, numVecOpsExecutedMAD32, numVecOpsExecutedMAD64, numVecOpsExecutedTwoOpFP, numVectorALUs, Stats::oneline, pageDivergenceDist, Stats::pdf, privMemInsts, privReads, privWrites, readonlyMemInsts, readonlyReads, readonlyWrites, registerManager, FetchStage::regStats(), RegisterManager::regStats(), LocalMemPipeline::regStats(), ScoreboardCheckStage::regStats(), ExecStage::regStats(), ScalarMemPipeline::regStats(), GlobalMemPipeline::regStats(), ScheduleStage::regStats(), Stats::Group::regStats(), sALUInsts, sALUInstsPerWF, scalarMemInstsPerKiloInst, scalarMemoryPipe, 
scalarMemReads, scalarMemReadsPerKiloInst, scalarMemReadsPerWF, scalarMemWrites, scalarMemWritesPerKiloInst, scalarMemWritesPerWF, scheduleStage, scoreboardCheckStage, shader, spillMemInsts, spillReads, spillWrites, Stats::DataWrapVec< Derived, InfoProxyType >::subname(), threadCyclesVALU, tlbCycles, tlbLatency, tlbRequests, totalCycles, vALUInsts, vALUInstsPerWF, vALUUtilization, vectorMemInstsPerKiloInst, vectorMemReads, vectorMemReadsPerKiloInst, vectorMemReadsPerWF, vectorMemWrites, vectorMemWritesPerKiloInst, vectorMemWritesPerWF, vpc, vpc_f16, vpc_f32, vpc_f64, waveLevelParallelism, wfSize(), wgBlockedDueBarrierAllocation, and wgBlockedDueLdsAllocation.

◆ releaseBarrier()

void ComputeUnit::releaseBarrier ( int  bar_id)

Definition at line 680 of file compute_unit.cc.

References barrierSlot(), and freeBarrierIds.

Referenced by Gcn3ISA::Inst_SOPP__S_ENDPGM::execute().

◆ releaseWFsFromBarrier()

void ComputeUnit::releaseWFsFromBarrier ( int  bar_id)

◆ requestorId()

RequestorID ComputeUnit::requestorId ( )
inline

Definition at line 461 of file compute_unit.hh.

References _requestorId.

Referenced by FetchUnit::initiateFetch(), and injectGlobalMemFence().

◆ resetBarrier()

void ComputeUnit::resetBarrier ( int  bar_id)

Definition at line 666 of file compute_unit.cc.

References barrierSlot().

Referenced by ScoreboardCheckStage::ready().

◆ scalarPipeLength()

int ComputeUnit::scalarPipeLength ( ) const
inline

Definition at line 394 of file compute_unit.hh.

References scalarPipeStages.

Referenced by ScalarRegisterFile::waveExecuteInst().

◆ sendRequest()

void ComputeUnit::sendRequest ( GPUDynInstPtr  gpuDynInst,
PortID  index,
PacketPtr  pkt 
)

◆ sendScalarRequest()

void ComputeUnit::sendScalarRequest ( GPUDynInstPtr  gpuDynInst,
PacketPtr  pkt 
)

◆ sendToLds()

bool ComputeUnit::sendToLds ( GPUDynInstPtr  gpuDynInst)

Send a general request to the LDS. Make sure to check the return value: the request might be NACK'd, and a return value of false means the caller needs a backup plan.

Definition at line 2539 of file compute_unit.cc.

References ldsPort, MemCmd::ReadReq, Packet::senderState, and ComputeUnit::LDSPort::sendTimingReq().

Referenced by LocalMemPipeline::exec().
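
The "backup plan" mentioned above can be sketched as a retry queue on the caller's side. This is an illustrative assumption, not how LocalMemPipeline handles a NACK: MockLds, MockPipeline, and their members are hypothetical, and the standard [[nodiscard]] attribute stands in for the `__attribute__((warn_unused_result))` annotation shown in the member list.

```cpp
#include <cassert>
#include <deque>

// Hypothetical LDS model that accepts requests only while it has capacity.
struct MockLds
{
    explicit MockLds(int slots) : freeSlots(slots) {}

    // Returns false (a NACK) when no slot is free; callers must check.
    [[nodiscard]] bool trySend(int reqId)
    {
        if (freeSlots == 0)
            return false;
        --freeSlots;
        accepted.push_back(reqId);
        return true;
    }

    int freeSlots;
    std::deque<int> accepted;
};

// Hypothetical caller that parks NACK'd requests for a later attempt.
struct MockPipeline
{
    void issue(MockLds &lds, int reqId)
    {
        if (!lds.trySend(reqId))
            retryQueue.push_back(reqId);
    }

    // Re-send parked requests in order until one is NACK'd again.
    void drainRetries(MockLds &lds)
    {
        while (!retryQueue.empty() && lds.trySend(retryQueue.front()))
            retryQueue.pop_front();
    }

    std::deque<int> retryQueue;
};
```

Marking the sender so that an ignored result produces a compiler warning makes it harder to drop a NACK'd request silently.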

◆ simdUnitWidth()

int ComputeUnit::simdUnitWidth ( ) const
inline

Definition at line 391 of file compute_unit.hh.

References simdWidth.

◆ spBypassLength()

int ComputeUnit::spBypassLength ( ) const
inline

Definition at line 392 of file compute_unit.hh.

References spBypassPipeLength.

◆ startWavefront()

void ComputeUnit::startWavefront ( Wavefront * w,
int  waveId,
LdsChunk * ldsChunk,
HSAQueueEntry * task,
int  bar_id,
bool  fetchContext = false 
)

◆ storeBusLength()

int ComputeUnit::storeBusLength ( ) const
inline

Definition at line 395 of file compute_unit.hh.

References numCyclesPerStoreTransfer.

Referenced by LdsState::processPacket().

◆ updateInstStats()

void ComputeUnit::updateInstStats ( GPUDynInstPtr  gpuDynInst)

◆ updatePageDivergenceDist()

void ComputeUnit::updatePageDivergenceDist ( Addr  addr)

Definition at line 2455 of file compute_unit.cc.

References addr, ArmISA::PageBytes, pagesTouched, and roundDown().

Referenced by sendRequest().
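The function body is not reproduced on this page, but the references (roundDown(), a page-size constant, and pagesTouched) suggest per-page access counting. The sketch below shows that general technique under stated assumptions: the 4 KiB page size, `kPageBytes`, `roundDownTo`, and `recordPageTouch` are all illustrative and are not taken from compute_unit.cc.

```cpp
#include <cstdint>
#include <map>

using Addr = std::uint64_t;   // stand-in for gem5's Addr typedef

// Assumed 4 KiB page size, for illustration only.
constexpr Addr kPageBytes = 4096;

// Round addr down to a multiple of align (align must be a power of two).
inline Addr roundDownTo(Addr addr, Addr align)
{
    return addr & ~(align - 1);
}

// Count one access against the page that contains addr.
inline void recordPageTouch(std::map<Addr, int> &pagesTouched, Addr addr)
{
    ++pagesTouched[roundDownTo(addr, kPageBytes)];
}
```

The number of keys in the map after an instruction's accesses are recorded gives the number of distinct pages that instruction touched.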

◆ wfSize()

int ComputeUnit::wfSize ( ) const
inline

Member Data Documentation

◆ _cacheLineSize

const int ComputeUnit::_cacheLineSize
private

Definition at line 1026 of file compute_unit.hh.

Referenced by cacheLineSize().

◆ _numBarrierSlots

const int ComputeUnit::_numBarrierSlots
private

Definition at line 1027 of file compute_unit.hh.

Referenced by numBarrierSlots().

◆ _requestorId

RequestorID ComputeUnit::_requestorId
protected

Definition at line 467 of file compute_unit.hh.

Referenced by requestorId().

◆ activeLanesPerGMemInstrDist

Stats::Distribution ComputeUnit::activeLanesPerGMemInstrDist

Definition at line 594 of file compute_unit.hh.

Referenced by Wavefront::exec(), and regStats().

◆ activeLanesPerLMemInstrDist

Stats::Distribution ComputeUnit::activeLanesPerLMemInstrDist

Definition at line 595 of file compute_unit.hh.

Referenced by Wavefront::exec(), and regStats().

◆ activeWaves

int ComputeUnit::activeWaves

Definition at line 530 of file compute_unit.hh.

Referenced by Gcn3ISA::Inst_SOPP__S_ENDPGM::execute(), and startWavefront().

◆ argMemInsts

Stats::Formula ComputeUnit::argMemInsts

Definition at line 513 of file compute_unit.hh.

Referenced by regStats().

◆ argReads

Stats::Scalar ComputeUnit::argReads

Definition at line 511 of file compute_unit.hh.

Referenced by regStats(), and updateInstStats().

◆ argWrites

Stats::Scalar ComputeUnit::argWrites

Definition at line 512 of file compute_unit.hh.

Referenced by regStats(), and updateInstStats().

◆ cacheLineBits

int ComputeUnit::cacheLineBits
private

Definition at line 1028 of file compute_unit.hh.

Referenced by getCacheLineBits().

◆ coalescerToVrfBusWidth

int ComputeUnit::coalescerToVrfBusWidth

Definition at line 266 of file compute_unit.hh.

◆ completedWfs

Stats::Scalar ComputeUnit::completedWfs

Definition at line 604 of file compute_unit.hh.

Referenced by Gcn3ISA::Inst_SOPP__S_ENDPGM::execute(), and regStats().

◆ completedWGs

Stats::Scalar ComputeUnit::completedWGs

Definition at line 605 of file compute_unit.hh.

Referenced by Gcn3ISA::Inst_SOPP__S_ENDPGM::execute(), and regStats().

◆ controlFlowDivergenceDist

Stats::Distribution ComputeUnit::controlFlowDivergenceDist

Definition at line 593 of file compute_unit.hh.

Referenced by Wavefront::exec(), and regStats().

◆ countPages

bool ComputeUnit::countPages

Definition at line 354 of file compute_unit.hh.

Referenced by exitCallback().

◆ cu_id

int ComputeUnit::cu_id

◆ debugSegFault

bool ComputeUnit::debugSegFault

Definition at line 344 of file compute_unit.hh.

Referenced by sendRequest().

◆ dpBypassPipeLength

int ComputeUnit::dpBypassPipeLength

Definition at line 304 of file compute_unit.hh.

Referenced by dpBypassLength().

◆ dynamicFlatMemInstrCnt

Stats::Scalar ComputeUnit::dynamicFlatMemInstrCnt

Definition at line 552 of file compute_unit.hh.

Referenced by regStats(), and GPUDynInst::updateStats().

◆ dynamicGMemInstrCnt

Stats::Scalar ComputeUnit::dynamicGMemInstrCnt

Definition at line 550 of file compute_unit.hh.

Referenced by regStats(), and GPUDynInst::updateStats().

◆ dynamicLMemInstrCnt

Stats::Scalar ComputeUnit::dynamicLMemInstrCnt

Definition at line 553 of file compute_unit.hh.

Referenced by regStats(), and GPUDynInst::updateStats().

◆ exec_policy

EXEC_POLICY ComputeUnit::exec_policy

Definition at line 342 of file compute_unit.hh.

◆ execRateDist

Stats::Distribution ComputeUnit::execRateDist

Definition at line 563 of file compute_unit.hh.

Referenced by Wavefront::exec(), and regStats().

◆ execStage

ExecStage ComputeUnit::execStage

Definition at line 280 of file compute_unit.hh.

Referenced by exec(), init(), and regStats().

◆ fetchStage

FetchStage ComputeUnit::fetchStage

◆ flatLDSInsts

Stats::Scalar ComputeUnit::flatLDSInsts

Definition at line 484 of file compute_unit.hh.

Referenced by regStats(), and updateInstStats().

◆ flatLDSInstsPerWF

Stats::Formula ComputeUnit::flatLDSInstsPerWF

Definition at line 485 of file compute_unit.hh.

Referenced by regStats().

◆ flatVMemInsts

Stats::Scalar ComputeUnit::flatVMemInsts

Definition at line 482 of file compute_unit.hh.

Referenced by regStats(), and updateInstStats().

◆ flatVMemInstsPerWF

Stats::Formula ComputeUnit::flatVMemInstsPerWF

Definition at line 483 of file compute_unit.hh.

Referenced by regStats().

◆ freeBarrierIds

std::unordered_set<int> ComputeUnit::freeBarrierIds
private

A set used to easily retrieve a free barrier ID.

Definition at line 1074 of file compute_unit.hh.

Referenced by getFreeBarrierId(), hasDispResources(), and releaseBarrier().
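Keeping free IDs in a set makes both the "is any barrier free?" check and the hand-out of an ID cheap. A minimal sketch of that idea follows; `BarrierIdPool` and its methods are hypothetical and only illustrate the data structure, not the gem5 member functions listed above.

```cpp
#include <unordered_set>

// Hypothetical pool of barrier IDs backed by a set of free IDs.
class BarrierIdPool {
  public:
    explicit BarrierIdPool(int numSlots)
    {
        for (int id = 0; id < numSlots; ++id)
            freeIds.insert(id);
    }

    bool hasFree() const { return !freeIds.empty(); }

    // Hand out any free ID; returns -1 when none is available.
    int acquire()
    {
        if (freeIds.empty())
            return -1;
        auto it = freeIds.begin();
        int id = *it;
        freeIds.erase(it);
        return id;
    }

    // Return an ID to the pool once its barrier is no longer in use.
    void release(int id) { freeIds.insert(id); }

  private:
    std::unordered_set<int> freeIds;
};
```

Which ID `acquire()` returns is unspecified, since an unordered set has no defined iteration order; callers should only rely on the ID being free.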

◆ functionalTLB

bool ComputeUnit::functionalTLB

Definition at line 348 of file compute_unit.hh.

Referenced by sendRequest().

◆ glbMemToVrfBus

WaitClass ComputeUnit::glbMemToVrfBus

Definition at line 218 of file compute_unit.hh.

Referenced by GlobalMemPipeline::exec(), init(), and isDone().

◆ globalMemInsts

Stats::Formula ComputeUnit::globalMemInsts

Definition at line 510 of file compute_unit.hh.

Referenced by regStats().

◆ globalMemoryPipe

GlobalMemPipeline ComputeUnit::globalMemoryPipe

◆ globalReads

Stats::Scalar ComputeUnit::globalReads

Definition at line 508 of file compute_unit.hh.

Referenced by regStats(), and updateInstStats().

◆ globalSeqNum

InstSeqNum ComputeUnit::globalSeqNum
private

Definition at line 1029 of file compute_unit.hh.

Referenced by getAndIncSeqNum().

◆ globalWrites

Stats::Scalar ComputeUnit::globalWrites

Definition at line 509 of file compute_unit.hh.

Referenced by regStats(), and updateInstStats().

◆ gmTokenPort

GMTokenPort ComputeUnit::gmTokenPort

Definition at line 649 of file compute_unit.hh.

Referenced by init().

◆ groupMemInsts

Stats::Formula ComputeUnit::groupMemInsts

Definition at line 519 of file compute_unit.hh.

Referenced by regStats().

◆ groupReads

Stats::Scalar ComputeUnit::groupReads

Definition at line 517 of file compute_unit.hh.

Referenced by regStats(), and updateInstStats().

◆ groupWrites

Stats::Scalar ComputeUnit::groupWrites

Definition at line 518 of file compute_unit.hh.

Referenced by regStats(), and updateInstStats().

◆ headTailLatency

Stats::Distribution ComputeUnit::headTailLatency

Definition at line 609 of file compute_unit.hh.

Referenced by ComputeUnit::DataPort::processMemRespEvent(), and regStats().

◆ headTailMap

std::unordered_map<GPUDynInstPtr, Tick> ComputeUnit::headTailMap
private

Definition at line 1079 of file compute_unit.hh.

Referenced by ComputeUnit::DataPort::processMemRespEvent().

◆ hitsPerTLBLevel

Stats::Vector ComputeUnit::hitsPerTLBLevel

Definition at line 541 of file compute_unit.hh.

Referenced by regStats(), and sendRequest().

◆ idleCUTimeout

Tick ComputeUnit::idleCUTimeout

Definition at line 346 of file compute_unit.hh.

Referenced by Wavefront::setStatus().

◆ idleWfs

int ComputeUnit::idleWfs

Definition at line 347 of file compute_unit.hh.

Referenced by Wavefront::setStatus().

◆ instCyclesLdsPerSimd

Stats::Vector ComputeUnit::instCyclesLdsPerSimd

Definition at line 506 of file compute_unit.hh.

Referenced by Wavefront::exec(), and regStats().

◆ instCyclesSALU

Stats::Scalar ComputeUnit::instCyclesSALU

Definition at line 477 of file compute_unit.hh.

Referenced by regStats(), and updateInstStats().

◆ instCyclesScMemPerSimd

Stats::Vector ComputeUnit::instCyclesScMemPerSimd

Definition at line 505 of file compute_unit.hh.

Referenced by Wavefront::exec(), and regStats().

◆ instCyclesVALU

Stats::Scalar ComputeUnit::instCyclesVALU

Definition at line 476 of file compute_unit.hh.

Referenced by regStats(), and updateInstStats().

◆ instCyclesVMemPerSimd

Stats::Vector ComputeUnit::instCyclesVMemPerSimd

Definition at line 504 of file compute_unit.hh.

Referenced by Wavefront::exec(), and regStats().

◆ instExecPerSimd

std::vector<uint64_t> ComputeUnit::instExecPerSimd

Definition at line 329 of file compute_unit.hh.

Referenced by Wavefront::exec().

◆ instInterleave

Stats::VectorDistribution ComputeUnit::instInterleave

Definition at line 326 of file compute_unit.hh.

Referenced by Wavefront::exec(), and regStats().

◆ ipc

Stats::Formula ComputeUnit::ipc

Definition at line 592 of file compute_unit.hh.

Referenced by regStats().

◆ issuePeriod

Cycles ComputeUnit::issuePeriod

Definition at line 310 of file compute_unit.hh.

Referenced by Wavefront::exec().

◆ kernargMemInsts

Stats::Formula ComputeUnit::kernargMemInsts

Definition at line 528 of file compute_unit.hh.

Referenced by regStats().

◆ kernargReads

Stats::Scalar ComputeUnit::kernargReads

Definition at line 526 of file compute_unit.hh.

Referenced by regStats(), and updateInstStats().

◆ kernargWrites

Stats::Scalar ComputeUnit::kernargWrites

Definition at line 527 of file compute_unit.hh.

Referenced by regStats(), and updateInstStats().

◆ lastExecCycle

std::vector<uint64_t> ComputeUnit::lastExecCycle

Definition at line 320 of file compute_unit.hh.

Referenced by Wavefront::exec().

◆ lastVaddrCU

std::vector<Addr> ComputeUnit::lastVaddrCU

Definition at line 338 of file compute_unit.hh.

Referenced by ~ComputeUnit().

◆ lastVaddrSimd

std::vector<std::vector<Addr> > ComputeUnit::lastVaddrSimd

Definition at line 339 of file compute_unit.hh.

Referenced by ~ComputeUnit().

◆ lastVaddrWF

std::vector<std::vector<std::vector<Addr> > > ComputeUnit::lastVaddrWF

Definition at line 340 of file compute_unit.hh.

◆ lds

LdsState& ComputeUnit::lds
protected

◆ ldsBankAccesses

Stats::Scalar ComputeUnit::ldsBankAccesses

Definition at line 543 of file compute_unit.hh.

Referenced by LdsState::processPacket(), and regStats().

◆ ldsBankConflictDist

Stats::Distribution ComputeUnit::ldsBankConflictDist

Definition at line 544 of file compute_unit.hh.

Referenced by LdsState::processPacket(), and regStats().

◆ ldsNoFlatInsts

Stats::Scalar ComputeUnit::ldsNoFlatInsts

Definition at line 480 of file compute_unit.hh.

Referenced by regStats(), and updateInstStats().

◆ ldsNoFlatInstsPerWF

Stats::Formula ComputeUnit::ldsNoFlatInstsPerWF

Definition at line 481 of file compute_unit.hh.

Referenced by regStats().

◆ ldsPort

LDSPort ComputeUnit::ldsPort

The port to access the Local Data Store. Can be connected to an LDS object.

Definition at line 978 of file compute_unit.hh.

Referenced by getPort(), and sendToLds().

◆ localMemBarrier

bool ComputeUnit::localMemBarrier

Definition at line 349 of file compute_unit.hh.

◆ localMemoryPipe

LocalMemPipeline ComputeUnit::localMemoryPipe

Definition at line 282 of file compute_unit.hh.

Referenced by ScheduleStage::dispatchReady(), exec(), isDone(), and regStats().

◆ locMemToVrfBus

WaitClass ComputeUnit::locMemToVrfBus

Definition at line 226 of file compute_unit.hh.

Referenced by LocalMemPipeline::exec(), init(), and isDone().

◆ memPort

std::vector<DataPort> ComputeUnit::memPort

The memory port for SIMD data accesses.

Can be connected to PhysMem for Ruby for timing simulations.

Definition at line 989 of file compute_unit.hh.

Referenced by getPort(), injectGlobalMemFence(), ComputeUnit::DataPort::recvTimingResp(), and sendRequest().

◆ memPortTokens

TokenManager* ComputeUnit::memPortTokens

Definition at line 648 of file compute_unit.hh.

Referenced by getTokenManager(), and init().

◆ numALUInstsExecuted

Stats::Formula ComputeUnit::numALUInstsExecuted

Definition at line 597 of file compute_unit.hh.

Referenced by regStats().

◆ numCASOps

Stats::Scalar ComputeUnit::numCASOps

Definition at line 602 of file compute_unit.hh.

Referenced by AtomicOpCAS< T >::execute(), and regStats().

◆ numCyclesPerLoadTransfer

int ComputeUnit::numCyclesPerLoadTransfer

Definition at line 268 of file compute_unit.hh.

Referenced by loadBusLength().

◆ numCyclesPerStoreTransfer

int ComputeUnit::numCyclesPerStoreTransfer

Definition at line 267 of file compute_unit.hh.

Referenced by storeBusLength().

◆ numFailedCASOps

Stats::Scalar ComputeUnit::numFailedCASOps

Definition at line 603 of file compute_unit.hh.

Referenced by AtomicOpCAS< T >::execute(), and regStats().

◆ numInstrExecuted

Stats::Scalar ComputeUnit::numInstrExecuted

Definition at line 560 of file compute_unit.hh.

Referenced by Wavefront::exec(), and regStats().

◆ numScalarALUs

int ComputeUnit::numScalarALUs

◆ numScalarMemUnits

int ComputeUnit::numScalarMemUnits

Definition at line 232 of file compute_unit.hh.

Referenced by init(), and numExeUnits().

◆ numScalarRegsPerSimd

int ComputeUnit::numScalarRegsPerSimd

◆ numTimesWgBlockedDueSgprAlloc

Stats::Scalar ComputeUnit::numTimesWgBlockedDueSgprAlloc

Definition at line 601 of file compute_unit.hh.

Referenced by hasDispResources(), and regStats().

◆ numTimesWgBlockedDueVgprAlloc

Stats::Scalar ComputeUnit::numTimesWgBlockedDueVgprAlloc

Definition at line 599 of file compute_unit.hh.

Referenced by hasDispResources(), and regStats().

◆ numVecOpsExecuted

Stats::Scalar ComputeUnit::numVecOpsExecuted

Definition at line 565 of file compute_unit.hh.

Referenced by Wavefront::exec(), and regStats().

◆ numVecOpsExecutedF16

Stats::Scalar ComputeUnit::numVecOpsExecutedF16

Definition at line 567 of file compute_unit.hh.

Referenced by Wavefront::exec(), and regStats().

◆ numVecOpsExecutedF32

Stats::Scalar ComputeUnit::numVecOpsExecutedF32

Definition at line 569 of file compute_unit.hh.

Referenced by Wavefront::exec(), and regStats().

◆ numVecOpsExecutedF64

Stats::Scalar ComputeUnit::numVecOpsExecutedF64

Definition at line 571 of file compute_unit.hh.

Referenced by Wavefront::exec(), and regStats().

◆ numVecOpsExecutedFMA16

Stats::Scalar ComputeUnit::numVecOpsExecutedFMA16

Definition at line 573 of file compute_unit.hh.

Referenced by Wavefront::exec(), and regStats().

◆ numVecOpsExecutedFMA32

Stats::Scalar ComputeUnit::numVecOpsExecutedFMA32

Definition at line 574 of file compute_unit.hh.

Referenced by Wavefront::exec(), and regStats().

◆ numVecOpsExecutedFMA64

Stats::Scalar ComputeUnit::numVecOpsExecutedFMA64

Definition at line 575 of file compute_unit.hh.

Referenced by Wavefront::exec(), and regStats().

◆ numVecOpsExecutedMAC16

Stats::Scalar ComputeUnit::numVecOpsExecutedMAC16

Definition at line 577 of file compute_unit.hh.

Referenced by Wavefront::exec(), and regStats().

◆ numVecOpsExecutedMAC32

Stats::Scalar ComputeUnit::numVecOpsExecutedMAC32

Definition at line 578 of file compute_unit.hh.

Referenced by Wavefront::exec(), and regStats().

◆ numVecOpsExecutedMAC64

Stats::Scalar ComputeUnit::numVecOpsExecutedMAC64

Definition at line 579 of file compute_unit.hh.

Referenced by Wavefront::exec(), and regStats().

◆ numVecOpsExecutedMAD16

Stats::Scalar ComputeUnit::numVecOpsExecutedMAD16

Definition at line 581 of file compute_unit.hh.

Referenced by Wavefront::exec(), and regStats().

◆ numVecOpsExecutedMAD32

Stats::Scalar ComputeUnit::numVecOpsExecutedMAD32

Definition at line 582 of file compute_unit.hh.

Referenced by Wavefront::exec(), and regStats().

◆ numVecOpsExecutedMAD64

Stats::Scalar ComputeUnit::numVecOpsExecutedMAD64

Definition at line 583 of file compute_unit.hh.

Referenced by Wavefront::exec(), and regStats().

◆ numVecOpsExecutedTwoOpFP

Stats::Scalar ComputeUnit::numVecOpsExecutedTwoOpFP

Definition at line 585 of file compute_unit.hh.

Referenced by Wavefront::exec(), and regStats().

◆ numVecRegsPerSimd

int ComputeUnit::numVecRegsPerSimd

◆ numVectorALUs

int ComputeUnit::numVectorALUs

◆ numVectorGlobalMemUnits

int ComputeUnit::numVectorGlobalMemUnits

◆ numVectorSharedMemUnits

int ComputeUnit::numVectorSharedMemUnits

Definition at line 224 of file compute_unit.hh.

Referenced by ScheduleStage::init(), init(), mapWaveToScalarMem(), and numExeUnits().

◆ numWfsToSched

std::vector<int> ComputeUnit::numWfsToSched

Number of WFs to schedule to each SIMD.

This vector is populated by hasDispResources(), and consumed by the subsequent call to dispWorkgroup(), to schedule the specified number of WFs to the SIMD units. Entry I provides the number of WFs to schedule to SIMD I.

Definition at line 367 of file compute_unit.hh.

Referenced by dispWorkgroup(), and hasDispResources().
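The populate-then-consume contract described above can be sketched as a two-phase helper. This is an illustration under assumptions: `DispatchPlanner`, `freeSlotsPerSimd`, `planDispatch`, `commitDispatch`, and the round-robin placement are hypothetical and do not reproduce the resource checks gem5 performs.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical two-phase dispatch helper; not the gem5 implementation.
struct DispatchPlanner {
    std::vector<int> freeSlotsPerSimd;  // assumed per-SIMD capacity
    std::vector<int> numWfsToSched;     // entry i: WFs planned for SIMD i

    // Phase 1: check whether numWfs wavefronts fit and, if so, record a
    // round-robin placement. On failure the previous plan is left as-is.
    bool planDispatch(int numWfs)
    {
        std::vector<int> plan(freeSlotsPerSimd.size(), 0);
        int remaining = numWfs;
        bool progress = true;
        while (remaining > 0 && progress) {
            progress = false;
            for (std::size_t i = 0; i < plan.size() && remaining > 0; ++i) {
                if (plan[i] < freeSlotsPerSimd[i]) {
                    ++plan[i];
                    --remaining;
                    progress = true;
                }
            }
        }
        if (remaining > 0)
            return false;
        numWfsToSched = plan;
        return true;
    }

    // Phase 2: consume the plan, charging each SIMD for its wavefronts.
    void commitDispatch()
    {
        for (std::size_t i = 0; i < numWfsToSched.size(); ++i) {
            freeSlotsPerSimd[i] -= numWfsToSched[i];
            numWfsToSched[i] = 0;
        }
    }
};
```

Splitting the check from the commit lets a caller ask whether a workgroup fits without changing any state, and the second phase never has to recompute the placement.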

◆ operandNetworkLength

int ComputeUnit::operandNetworkLength

Definition at line 308 of file compute_unit.hh.

Referenced by oprNetPipeLength().

◆ pageAccesses

pageDataStruct ComputeUnit::pageAccesses

Definition at line 627 of file compute_unit.hh.

Referenced by exitCallback(), and GPUDynInst::updateStats().

◆ pageDivergenceDist

Stats::Distribution ComputeUnit::pageDivergenceDist

Definition at line 548 of file compute_unit.hh.

Referenced by regStats(), and GPUDynInst::updateStats().

◆ pagesTouched

std::map<Addr, int> ComputeUnit::pagesTouched

Definition at line 381 of file compute_unit.hh.

Referenced by updatePageDivergenceDist(), and GPUDynInst::updateStats().

◆ perLaneTLB

bool ComputeUnit::perLaneTLB

Definition at line 332 of file compute_unit.hh.

Referenced by sendRequest().

◆ pipeMap

std::unordered_set<uint64_t> ComputeUnit::pipeMap

Definition at line 273 of file compute_unit.hh.

Referenced by deleteFromPipeMap(), insertInPipeMap(), and Wavefront::nextInstr().

◆ prefetchDepth

int ComputeUnit::prefetchDepth

Definition at line 334 of file compute_unit.hh.

◆ prefetchStride

int ComputeUnit::prefetchStride

Definition at line 336 of file compute_unit.hh.

◆ prefetchType

Enums::PrefetchType ComputeUnit::prefetchType

Definition at line 341 of file compute_unit.hh.

◆ privMemInsts

Stats::Formula ComputeUnit::privMemInsts

Definition at line 522 of file compute_unit.hh.

Referenced by regStats().

◆ privReads

Stats::Scalar ComputeUnit::privReads

Definition at line 520 of file compute_unit.hh.

Referenced by regStats(), and updateInstStats().

◆ privWrites

Stats::Scalar ComputeUnit::privWrites

Definition at line 521 of file compute_unit.hh.

Referenced by regStats(), and updateInstStats().

◆ readonlyMemInsts

Stats::Formula ComputeUnit::readonlyMemInsts

Definition at line 525 of file compute_unit.hh.

Referenced by regStats().

◆ readonlyReads

Stats::Scalar ComputeUnit::readonlyReads

Definition at line 523 of file compute_unit.hh.

Referenced by regStats(), and updateInstStats().

◆ readonlyWrites

Stats::Scalar ComputeUnit::readonlyWrites

Definition at line 524 of file compute_unit.hh.

Referenced by regStats(), and updateInstStats().

◆ registerManager

RegisterManager* ComputeUnit::registerManager

◆ req_tick_latency

Tick ComputeUnit::req_tick_latency

Definition at line 358 of file compute_unit.hh.

Referenced by injectGlobalMemFence(), and sendRequest().

◆ resp_tick_latency

Tick ComputeUnit::resp_tick_latency

Definition at line 359 of file compute_unit.hh.

Referenced by ComputeUnit::DataPort::recvTimingResp().

◆ sALUInsts

Stats::Scalar ComputeUnit::sALUInsts

Definition at line 474 of file compute_unit.hh.

Referenced by regStats(), and updateInstStats().

◆ sALUInstsPerWF

Stats::Formula ComputeUnit::sALUInstsPerWF

Definition at line 475 of file compute_unit.hh.

Referenced by regStats().

◆ scalarALUs

std::vector<WaitClass> ComputeUnit::scalarALUs

Definition at line 246 of file compute_unit.hh.

Referenced by ScheduleStage::dispatchReady(), Wavefront::exec(), and init().

◆ scalarDataPort

ScalarDataPort ComputeUnit::scalarDataPort

Definition at line 993 of file compute_unit.hh.

Referenced by getPort(), and ComputeUnit::ScalarDataPort::MemReqEvent::process().

◆ scalarDTLBPort

ScalarDTLBPort ComputeUnit::scalarDTLBPort

Definition at line 995 of file compute_unit.hh.

Referenced by getPort(), and sendScalarRequest().

◆ scalarMemInstsPerKiloInst

Stats::Formula ComputeUnit::scalarMemInstsPerKiloInst

Definition at line 500 of file compute_unit.hh.

Referenced by regStats().

◆ scalarMemoryPipe

ScalarMemPipeline ComputeUnit::scalarMemoryPipe

Definition at line 283 of file compute_unit.hh.

Referenced by ScheduleStage::dispatchReady(), exec(), and regStats().

◆ scalarMemReads

Stats::Scalar ComputeUnit::scalarMemReads

Definition at line 492 of file compute_unit.hh.

Referenced by regStats(), and updateInstStats().

◆ scalarMemReadsPerKiloInst

Stats::Formula ComputeUnit::scalarMemReadsPerKiloInst

Definition at line 498 of file compute_unit.hh.

Referenced by regStats().

◆ scalarMemReadsPerWF

Stats::Formula ComputeUnit::scalarMemReadsPerWF

Definition at line 493 of file compute_unit.hh.

Referenced by regStats().

◆ scalarMemToSrfBus

WaitClass ComputeUnit::scalarMemToSrfBus

Definition at line 234 of file compute_unit.hh.

Referenced by ScalarMemPipeline::exec(), init(), and isDone().

◆ scalarMemUnit

WaitClass ComputeUnit::scalarMemUnit

◆ scalarMemWrites

Stats::Scalar ComputeUnit::scalarMemWrites

Definition at line 490 of file compute_unit.hh.

Referenced by regStats(), and updateInstStats().

◆ scalarMemWritesPerKiloInst

Stats::Formula ComputeUnit::scalarMemWritesPerKiloInst

Definition at line 499 of file compute_unit.hh.

Referenced by regStats().

◆ scalarMemWritesPerWF

Stats::Formula ComputeUnit::scalarMemWritesPerWF

Definition at line 491 of file compute_unit.hh.

Referenced by regStats().

◆ scalarPipeStages

int ComputeUnit::scalarPipeStages

Definition at line 306 of file compute_unit.hh.

Referenced by scalarPipeLength().

◆ scalarRegsReserved

std::vector<int> ComputeUnit::scalarRegsReserved

Definition at line 372 of file compute_unit.hh.

Referenced by StaticRegisterManagerPolicy::allocateRegisters(), and init().

◆ scheduleStage

ScheduleStage ComputeUnit::scheduleStage

Definition at line 279 of file compute_unit.hh.

Referenced by ExecStage::exec(), exec(), init(), and regStats().

◆ scheduleToExecute

ScheduleToExecute ComputeUnit::scheduleToExecute
private

Definition at line 1065 of file compute_unit.hh.

◆ scoreboardCheckStage

ScoreboardCheckStage ComputeUnit::scoreboardCheckStage

Definition at line 278 of file compute_unit.hh.

Referenced by exec(), and regStats().

◆ scoreboardCheckToSchedule

ScoreboardCheckToSchedule ComputeUnit::scoreboardCheckToSchedule
private

TODO: Update these comments once the pipe stage interface has been fully refactored.

Pipeline stage interfaces.

Buffers used to communicate between various pipeline stages.

List of waves which will be dispatched to each execution resource. An EXREADY implies dispatch list is non-empty and execution unit has something to execute this cycle. Currently, the dispatch list of an execution resource can hold only one wave because an execution resource can execute only one wave in a cycle. dispatchList is used to communicate between schedule and exec stage.

At a high level, the following intra-/inter-stage communication occurs:

SCB to SCH: readyList provides per exec resource list of waves that passed dependency and readiness checks. If selected by scheduler, attempt to add wave to schList conditional on RF support.

SCH: schList holds waves that are gathering operands or waiting for execution resource availability. Once ready, waves are placed on the dispatchList as candidates for execution. A wave may spend multiple cycles in SCH stage, on the schList due to RF access conflicts or execution resource contention.

SCH to EX: dispatchList holds waves that are ready to be executed. LM/FLAT arbitration may remove an LM wave and place it back on the schList. RF model may also force a wave back to the schList if using the detailed model.

Definition at line 1064 of file compute_unit.hh.
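The readyList, schList, and dispatchList hand-offs described above can be modelled for a single execution resource as follows. This is a simplified sketch, not gem5 code: `ExecResourceQueues` and its methods are hypothetical, waves are plain integer IDs, and operand readiness is reduced to a flag supplied by the caller.

```cpp
#include <deque>

// Hypothetical model of the three hand-off structures for one execution
// resource. Wave IDs are non-negative ints; -1 marks an empty slot.
struct ExecResourceQueues {
    std::deque<int> readyList;   // SCB to SCH: waves that passed checks
    std::deque<int> schList;     // SCH: waves waiting on operands/resources
    int dispatchSlot = -1;       // SCH to EX: holds at most one wave

    // SCH picks the oldest ready wave and starts tracking it.
    void schedule()
    {
        if (readyList.empty())
            return;
        schList.push_back(readyList.front());
        readyList.pop_front();
    }

    // Move the oldest scheduled wave to the dispatch slot once its
    // operands are ready and the slot is empty.
    void dispatch(bool operandsReady)
    {
        if (!operandsReady || schList.empty() || dispatchSlot != -1)
            return;
        dispatchSlot = schList.front();
        schList.pop_front();
    }

    // Arbitration or the RF model sends the dispatched wave back to SCH.
    void reschedule()
    {
        if (dispatchSlot == -1)
            return;
        schList.push_front(dispatchSlot);
        dispatchSlot = -1;
    }

    // EX consumes the dispatched wave; returns -1 if there is none.
    int execute()
    {
        int wave = dispatchSlot;
        dispatchSlot = -1;
        return wave;
    }
};
```

The single `dispatchSlot` reflects the stated limit of one wave per execution resource per cycle, and `reschedule()` puts the wave at the front of schList so that it keeps its place in line.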

◆ shader

Shader* ComputeUnit::shader

◆ simdWidth

int ComputeUnit::simdWidth

Definition at line 298 of file compute_unit.hh.

Referenced by simdUnitWidth().

◆ spBypassPipeLength

int ComputeUnit::spBypassPipeLength

Definition at line 301 of file compute_unit.hh.

Referenced by spBypassLength().

◆ spillMemInsts

Stats::Formula ComputeUnit::spillMemInsts

Definition at line 516 of file compute_unit.hh.

Referenced by regStats().

◆ spillReads

Stats::Scalar ComputeUnit::spillReads

Definition at line 514 of file compute_unit.hh.

Referenced by regStats(), and updateInstStats().

◆ spillWrites

Stats::Scalar ComputeUnit::spillWrites

Definition at line 515 of file compute_unit.hh.

Referenced by regStats(), and updateInstStats().

◆ sqcPort

SQCPort ComputeUnit::sqcPort

Definition at line 997 of file compute_unit.hh.

Referenced by FetchUnit::fetch(), and getPort().

◆ sqcTLBPort

ITLBPort ComputeUnit::sqcTLBPort

Definition at line 999 of file compute_unit.hh.

Referenced by getPort(), and FetchUnit::initiateFetch().

◆ srf

std::vector<ScalarRegisterFile*> ComputeUnit::srf

◆ srf_scm_bus_latency

Cycles ComputeUnit::srf_scm_bus_latency

Definition at line 315 of file compute_unit.hh.

Referenced by Wavefront::exec().

◆ srfToScalarMemPipeBus

WaitClass ComputeUnit::srfToScalarMemPipeBus

Definition at line 236 of file compute_unit.hh.

Referenced by ScheduleStage::checkMemResources(), Wavefront::exec(), init(), and isDone().

◆ threadCyclesVALU

Stats::Scalar ComputeUnit::threadCyclesVALU

Definition at line 478 of file compute_unit.hh.

Referenced by regStats(), and updateInstStats().

◆ tickEvent

EventFunctionWrapper ComputeUnit::tickEvent

Definition at line 285 of file compute_unit.hh.

Referenced by dispWorkgroup(), and exec().

◆ tlbCycles

Stats::Scalar ComputeUnit::tlbCycles

Definition at line 538 of file compute_unit.hh.

Referenced by regStats(), and sendRequest().

◆ tlbLatency

Stats::Formula ComputeUnit::tlbLatency

Definition at line 539 of file compute_unit.hh.

Referenced by regStats().

◆ tlbPort

std::vector<DTLBPort> ComputeUnit::tlbPort

Definition at line 991 of file compute_unit.hh.

Referenced by getPort(), and sendRequest().

◆ tlbRequests

Stats::Scalar ComputeUnit::tlbRequests

Definition at line 537 of file compute_unit.hh.

Referenced by regStats(), and sendRequest().

◆ totalCycles

Stats::Scalar ComputeUnit::totalCycles

Definition at line 587 of file compute_unit.hh.

Referenced by Wavefront::exec(), exec(), and regStats().

◆ vALUInsts

Stats::Scalar ComputeUnit::vALUInsts

Definition at line 472 of file compute_unit.hh.

Referenced by regStats(), and updateInstStats().

◆ vALUInstsPerWF

Stats::Formula ComputeUnit::vALUInstsPerWF

Definition at line 473 of file compute_unit.hh.

Referenced by regStats().

◆ vALUUtilization

Stats::Formula ComputeUnit::vALUUtilization

Definition at line 479 of file compute_unit.hh.

Referenced by regStats().

◆ vectorALUs

std::vector<WaitClass> ComputeUnit::vectorALUs

Definition at line 242 of file compute_unit.hh.

Referenced by ScheduleStage::dispatchReady(), Wavefront::exec(), and init().

◆ vectorGlobalMemUnit

WaitClass ComputeUnit::vectorGlobalMemUnit

◆ vectorMemInstsPerKiloInst

Stats::Formula ComputeUnit::vectorMemInstsPerKiloInst

Definition at line 497 of file compute_unit.hh.

Referenced by regStats().

◆ vectorMemReads

Stats::Scalar ComputeUnit::vectorMemReads

Definition at line 488 of file compute_unit.hh.

Referenced by regStats(), and updateInstStats().

◆ vectorMemReadsPerKiloInst

Stats::Formula ComputeUnit::vectorMemReadsPerKiloInst

Definition at line 495 of file compute_unit.hh.

Referenced by regStats().

◆ vectorMemReadsPerWF

Stats::Formula ComputeUnit::vectorMemReadsPerWF

Definition at line 489 of file compute_unit.hh.

Referenced by regStats().

◆ vectorMemWrites

Stats::Scalar ComputeUnit::vectorMemWrites

Definition at line 486 of file compute_unit.hh.

Referenced by regStats(), and updateInstStats().

◆ vectorMemWritesPerKiloInst

Stats::Formula ComputeUnit::vectorMemWritesPerKiloInst

Definition at line 496 of file compute_unit.hh.

Referenced by regStats().

◆ vectorMemWritesPerWF

Stats::Formula ComputeUnit::vectorMemWritesPerWF

Definition at line 487 of file compute_unit.hh.

Referenced by regStats().

◆ vectorRegsReserved

std::vector<int> ComputeUnit::vectorRegsReserved

Definition at line 370 of file compute_unit.hh.

Referenced by StaticRegisterManagerPolicy::allocateRegisters(), and init().

◆ vectorSharedMemUnit

WaitClass ComputeUnit::vectorSharedMemUnit

◆ vpc

Stats::Formula ComputeUnit::vpc

Definition at line 588 of file compute_unit.hh.

Referenced by regStats().

◆ vpc_f16

Stats::Formula ComputeUnit::vpc_f16

Definition at line 589 of file compute_unit.hh.

Referenced by regStats().

◆ vpc_f32

Stats::Formula ComputeUnit::vpc_f32

Definition at line 590 of file compute_unit.hh.

Referenced by regStats().

◆ vpc_f64

Stats::Formula ComputeUnit::vpc_f64

Definition at line 591 of file compute_unit.hh.

Referenced by regStats().

◆ vrf

std::vector<VectorRegisterFile*> ComputeUnit::vrf

◆ vrf_gm_bus_latency

Cycles ComputeUnit::vrf_gm_bus_latency

Definition at line 313 of file compute_unit.hh.

Referenced by Wavefront::exec().

◆ vrf_lm_bus_latency

Cycles ComputeUnit::vrf_lm_bus_latency

Definition at line 317 of file compute_unit.hh.

Referenced by Wavefront::exec().

◆ vrfToCoalescerBusWidth

int ComputeUnit::vrfToCoalescerBusWidth

Definition at line 265 of file compute_unit.hh.

◆ vrfToGlobalMemPipeBus

WaitClass ComputeUnit::vrfToGlobalMemPipeBus

Definition at line 220 of file compute_unit.hh.

Referenced by ScheduleStage::checkMemResources(), Wavefront::exec(), init(), and isDone().

◆ vrfToLocalMemPipeBus

WaitClass ComputeUnit::vrfToLocalMemPipeBus

Definition at line 228 of file compute_unit.hh.

Referenced by ScheduleStage::checkMemResources(), Wavefront::exec(), init(), and isDone().

◆ wavefrontSize

int ComputeUnit::wavefrontSize
private

Definition at line 1030 of file compute_unit.hh.

Referenced by wfSize().

◆ waveLevelParallelism

Stats::Distribution ComputeUnit::waveLevelParallelism

Definition at line 531 of file compute_unit.hh.

Referenced by regStats(), and startWavefront().

◆ wfBarrierSlots

std::vector<WFBarrier> ComputeUnit::wfBarrierSlots
private

The barrier slots for this CU.

Definition at line 1070 of file compute_unit.hh.

Referenced by barrierSlot().

◆ wfList

std::vector<std::vector<Wavefront*> > ComputeUnit::wfList

◆ wgBlockedDueBarrierAllocation

Stats::Scalar ComputeUnit::wgBlockedDueBarrierAllocation

Definition at line 555 of file compute_unit.hh.

Referenced by hasDispResources(), and regStats().

◆ wgBlockedDueLdsAllocation

Stats::Scalar ComputeUnit::wgBlockedDueLdsAllocation

Definition at line 556 of file compute_unit.hh.

Referenced by hasDispResources(), and regStats().


The documentation for this class was generated from the following files:

Generated on Wed Sep 30 2020 14:02:22 for gem5 by doxygen 1.8.17