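3.6.2.2. Describing resources and their usage
We have already seen that there are two ways to describe how an instruction executes. The first is the itinerary. It consists of a series of InstrItinData definitions, each containing a set of InstrStage definitions; the InstrItinClass definitions that tie an InstrItinData to instruction definitions; and a ProcessorItineraries definition that groups the related definitions together. The second describes resource usage directly and is made up of a set of interrelated definitions derived from SchedReadWrite.
Both approaches model the processor as a collection of resources and state how each instruction uses them. It is now time to emit the corresponding data structures.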
1250 void SubtargetEmitter::EmitSchedModel (raw_ostream &OS) {
1251 OS << "#ifdef DBGFIELD\n"
1252 << "#error \"<target>GenSubtargetInfo.inc requires a DBGFIELD macro\"\n"
1253 << "#endif\n"
1254 << "#ifndef NDEBUG\n"
1255 << "#define DBGFIELD(x) x,\n"
1256 << "#else\n"
1257 << "#define DBGFIELD(x)\n"
1258 << "#endif\n";
1259
1260 if (SchedModels.hasItineraries()) {
1261 std::vector<std::vector<InstrItinerary> > ProcItinLists;
1262 // Emit the stage data
1263 EmitStageAndOperandCycleData (OS, ProcItinLists);
1264 EmitItineraries(OS, ProcItinLists);
1265 }
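As in earlier sections, the ProcModels container in the SchedModels object holds the CodeGenProcModel objects for every processor in the family. If any processor is described with itineraries, the condition on line 1260 holds and the stage data for those processors is emitted. Mirroring the InstrStage definition used in TD files, LLVM has a C++ type with the same name and a similar purpose.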
59 struct InstrStage {
60 enum ReservationKinds {
61 Required = 0,
62 Reserved = 1
63 };
64
65 unsigned Cycles_; ///< Length of stage in machine cycles
66 unsigned Units_; ///< Choice of functional units
67 int NextCycles_; ///< Number of machine cycles to next stage
68 ReservationKinds Kind_; ///< Kind of the FU reservation
69
70 /// \brief Returns the number of cycles the stage is occupied.
71 unsigned getCycles() const {
72 return Cycles_;
73 }
74
75 /// \brief Returns the choice of FUs.
76 unsigned getUnits() const {
77 return Units_;
78 }
79
80 ReservationKinds getReservationKind() const {
81 return Kind_;
82 }
83
84 /// \brief Returns the number of cycles from the start of this stage to the
85 /// start of the next stage in the itinerary
86 unsigned getNextCycles() const {
87 return (NextCycles_ >= 0) ? (unsigned)NextCycles_ : Cycles_;
88 }
89 };
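InstrStage represents one non-pipelined step in the execution of an instruction. Cycles_ is the number of cycles needed to complete the step. Units_ is the set of functional units (FUs) that may be chosen to perform it, for example IntUnit1 or IntUnit2. NextCycles_ is the number of cycles that should elapse from the start of this step to the start of the next one. A value of -1 means the next step starts immediately after the current one finishes. For example:
{ 1, x, -1 }: the step occupies FU x for one cycle, and the next step starts immediately after this one.
{ 2, x|y, 1 }: the step occupies FU x or FU y for two consecutive cycles, and the next step starts one cycle after this one starts. In other words, the two steps must overlap in time.
{ 1, x, 0 }: the step occupies FU x for one cycle, and the next step starts in the same cycle as this one. This can express an instruction that needs several steps at the same time.
There are two kinds of FU reservation: Required, for FUs the instruction actually needs, and Reserved, for FUs it merely reserves. A reserved unit is not available to other instructions for execution, but several instructions may reserve the same unit multiple times. The two kinds are used to model stalls caused by instruction domain changes, FUs that share a resource (such as the same register file), and so on.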
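The three examples above can be checked with a small sketch. The `Stage` struct and the `stageStartCycles` helper below are hypothetical stand-ins written for this illustration (they are not LLVM types); only the `getNextCycles` rule is taken from the InstrStage listing above.

```cpp
#include <cassert>
#include <vector>

// Simplified, hypothetical stand-in for llvm::InstrStage.
struct Stage {
  unsigned Cycles;  // length of the stage in machine cycles
  unsigned Units;   // bitmask of candidate functional units
  int NextCycles;   // cycles to the start of the next stage; -1 means Cycles

  // Same rule as InstrStage::getNextCycles in the listing above.
  unsigned getNextCycles() const {
    return NextCycles >= 0 ? static_cast<unsigned>(NextCycles) : Cycles;
  }
};

// Returns the cycle at which each stage starts, with the first stage
// starting at cycle 0.
std::vector<unsigned> stageStartCycles(const std::vector<Stage> &Stages) {
  std::vector<unsigned> Starts;
  unsigned Cycle = 0;
  for (const Stage &S : Stages) {
    Starts.push_back(Cycle);
    Cycle += S.getNextCycles();
  }
  return Starts;
}
```

For the sequence { 1, x, -1 }, { 2, x|y, 1 }, { 1, x, 0 }, { 1, y, -1 }, this gives start cycles 0, 1, 2, 2: the third and fourth stages begin together, and the third begins while the second is still occupying its unit.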
97 struct InstrItinerary {
98 int NumMicroOps; ///< # of micro-ops, -1 means it's variable
99 unsigned FirstStage; ///< Index of first stage in itinerary
100 unsigned LastStage; ///< Index of last + 1 stage in itinerary
101 unsigned FirstOperandCycle; ///< Index of first operand rd/wr
102 unsigned LastOperandCycle; ///< Index of last + 1 operand rd/wr
103 };
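InstrItinerary holds the scheduling information for an instruction: the range of stages it occupies and the pipeline cycles in which its operands are read or written. It is LLVM's counterpart of the InstrItinData definition. One level up is InstrItineraryData, whose data members and constructors are shown below. It wraps this data for a subtarget.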
109 class InstrItineraryData {
110 public:
111 MCSchedModel SchedModel; ///< Basic machine properties.
112 const InstrStage *Stages; ///< Array of stages selected
113 const unsigned *OperandCycles; ///< Array of operand cycles selected
114 const unsigned *Forwardings; ///< Array of pipeline forwarding pathes
115 const InstrItinerary *Itineraries; ///< Array of itineraries selected
116
117 /// Ctors.
118 InstrItineraryData() : SchedModel(MCSchedModel::GetDefaultSchedModel()),
119 Stages(nullptr), OperandCycles(nullptr),
120 Forwardings(nullptr), Itineraries(nullptr) {}
121
122 InstrItineraryData( const MCSchedModel &SM, const InstrStage *S,
123 const unsigned *OS, const unsigned *F)
124 : SchedModel(SM), Stages(S), OperandCycles(OS), Forwardings(F),
125 Itineraries(SchedModel.InstrItineraries) {}
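3.6.2.2.1. Functional unit and bypass definitions
We already know that the ItinsDef member of a processor's CodeGenProcModel object is the Record of the ProcessorItineraries definition actually used by its Processor-derived definition (Processor→ProcItin or Processor→SchedModel→Itineraries).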
359 void SubtargetEmitter::
360 EmitStageAndOperandCycleData (raw_ostream &OS,
361 std::vector<std::vector<InstrItinerary> >
362 &ProcItinLists) {
363
364 // Multiple processor models may share an itinerary record. Emit it once.
365 SmallPtrSet<Record*, 8> ItinsDefSet;
366
367 // Emit functional units for all the itineraries.
368 for (CodeGenSchedModels::ProcIter PI = SchedModels.procModelBegin(),
369 PE = SchedModels.procModelEnd(); PI != PE; ++PI) {
370
371 if (!ItinsDefSet.insert(PI->ItinsDef).second)
372 continue ;
373
374 std::vector<Record*> FUs = PI->ItinsDef->getValueAsListOfDefs("FU");
375 if (FUs.empty())
376 continue ;
377
378 const std::string &Name = PI->ItinsDef->getName();
379 OS << "\n// Functional units for \"" << Name << "\"\n"
380 << "namespace " << Name << "FU {\n";
381
382 for (unsigned j = 0, FUN = FUs.size(); j < FUN; ++j)
383 OS << " const unsigned " << FUs[j]->getName()
384 << " = 1 << " << j << ";\n";
385
386 OS << "}\n";
387
388 std::vector<Record*> BPs = PI->ItinsDef->getValueAsListOfDefs("BP");
389 if (!BPs.empty()) {
390 OS << "\n// Pipeline forwarding pathes for itineraries \"" << Name
391 << "\"\n" << "namespace " << Name << "Bypass {\n";
392
393 OS << " const unsigned NoBypass = 0;\n";
394 for (unsigned j = 0, BPN = BPs.size(); j < BPN; ++j)
395 OS << " const unsigned " << BPs[j]->getName()
396 << " = 1 << " << j << ";\n";
397
398 OS << "}\n";
399 }
400 }
In the X86 family only Atom uses the itinerary mechanism. Atom's ProcessorItineraries definition defines no BP (bypass) entries and only two Port resources, so the output is:
#ifdef DBGFIELD
#error "<target>GenSubtargetInfo.inc requires a DBGFIELD macro"
#endif
#ifndef NDEBUG
#define DBGFIELD(x) x,
#else
#define DBGFIELD(x)
#endif
// Functional units for "AtomItineraries"
namespace AtomItinerariesFU {
const unsigned Port0 = 1 << 0;
const unsigned Port1 = 1 << 1;
}
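As a rough sketch of what lines 378~386 do, the helper below builds the same namespace text from a list of unit names, and a second helper shows how the resulting one-bit masks can be tested. Both `emitFUNamespace` and `mayUse` are hypothetical names introduced here, not part of SubtargetEmitter.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Builds the text of a functional-unit namespace: unit j gets the mask 1 << j.
std::string emitFUNamespace(const std::string &Name,
                            const std::vector<std::string> &FUs) {
  std::string Out = "namespace " + Name + "FU {\n";
  for (unsigned j = 0; j < FUs.size(); ++j)
    Out += "  const unsigned " + FUs[j] + " = 1 << " + std::to_string(j) + ";\n";
  Out += "}\n";
  return Out;
}

// Returns true if the single-unit mask Unit is one of the choices in Choices.
bool mayUse(unsigned Choices, unsigned Unit) {
  return (Choices & Unit) != 0;
}
```

Because every unit owns exactly one bit, a stage's "choice of units" is simply the bitwise OR of the masks, which is why the stage table entries later contain expressions such as `AtomItinerariesFU::Port0 | AtomItinerariesFU::Port1`.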
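Three tables are emitted next. The first is the stage array, whose elements are of type InstrStage. The second is the array of operand cycles, and the third is the array of bypasses (forwarding paths). The emitter assembles each table as a string before writing it out. The first entry of each array is reserved for the NoItineraries definition.
SubtargetEmitter::EmitStageAndOperandCycleData (continued)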
402 // Begin stages table
403 std::string StageTable = "\nextern const llvm::InstrStage " + Target +
404 "Stages[] = {\n";
405 StageTable += " { 0, 0, 0, llvm::InstrStage::Required }, // No itinerary\n";
406
407 // Begin operand cycle table
408 std::string OperandCycleTable = "extern const unsigned " + Target +
409 "OperandCycles[] = {\n";
410 OperandCycleTable += " 0, // No itinerary\n";
411
412 // Begin pipeline bypass table
413 std::string BypassTable = "extern const unsigned " + Target +
414 "ForwardingPaths[] = {\n";
415 BypassTable += " 0, // No itinerary\n";
416
417 // For each Itinerary across all processors, add a unique entry to the stages,
418 // operand cycles, and pipepine bypess tables. Then add the new Itinerary
419 // object with computed offsets to the ProcItinLists result.
420 unsigned StageCount = 1, OperandCycleCount = 1;
421 std::map<std::string, unsigned> ItinStageMap, ItinOperandMap;
422 for (CodeGenSchedModels::ProcIter PI = SchedModels.procModelBegin(),
423 PE = SchedModels.procModelEnd(); PI != PE; ++PI) {
424 const CodeGenProcModel &ProcModel = *PI;
425
426 // Add process itinerary to the list.
427 ProcItinLists.resize(ProcItinLists.size()+1);
428
429 // If this processor defines no itineraries, then leave the itinerary list
430 // empty.
431 std::vector<InstrItinerary> &ItinList = ProcItinLists.back();
432 if (!ProcModel.hasItineraries())
433 continue ;
434
435 const std::string &Name = ProcModel.ItinsDef->getName();
436
437 ItinList.resize(SchedModels.numInstrSchedClasses());
438 assert (ProcModel.ItinDefList.size() == ItinList.size() && "bad Itins");
439
440 for (unsigned SchedClassIdx = 0, SchedClassEnd = ItinList.size();
441 SchedClassIdx < SchedClassEnd; ++SchedClassIdx) {
442
443 // Next itinerary data
444 Record *ItinData = ProcModel.ItinDefList[SchedClassIdx];
445
446 // Get string and stage count
447 std::string ItinStageString;
448 unsigned NStages = 0;
449 if (ItinData)
450 FormItineraryStageString (Name, ItinData, ItinStageString, NStages);
451
452 // Get string and operand cycle count
453 std::string ItinOperandCycleString;
454 unsigned NOperandCycles = 0;
455 std::string ItinBypassString;
456 if (ItinData) {
457 FormItineraryOperandCycleString (ItinData, ItinOperandCycleString,
458 NOperandCycles);
459
460 FormItineraryBypassString (Name, ItinData, ItinBypassString,
461 NOperandCycles);
462 }
463
464 // Check to see if stage already exists and create if it doesn't
465 unsigned FindStage = 0;
466 if (NStages > 0) {
467 FindStage = ItinStageMap[ItinStageString];
468 if (FindStage == 0) {
469 // Emit as { cycles, u1 | u2 | ... | un, timeinc }, // indices
470 StageTable += ItinStageString + ", // " + itostr(StageCount);
471 if (NStages > 1)
472 StageTable += "-" + itostr(StageCount + NStages - 1);
473 StageTable += "\n";
474 // Record Itin class number.
475 ItinStageMap[ItinStageString] = FindStage = StageCount;
476 StageCount += NStages;
477 }
478 }
479
480 // Check to see if operand cycle already exists and create if it doesn't
481 unsigned FindOperandCycle = 0;
482 if (NOperandCycles > 0) {
483 std::string ItinOperandString = ItinOperandCycleString+ItinBypassString;
484 FindOperandCycle = ItinOperandMap[ItinOperandString];
485 if (FindOperandCycle == 0) {
486 // Emit as cycle, // index
487 OperandCycleTable += ItinOperandCycleString + ", // ";
488 std::string OperandIdxComment = itostr(OperandCycleCount);
489 if (NOperandCycles > 1)
490 OperandIdxComment += "-"
491 + itostr(OperandCycleCount + NOperandCycles - 1);
492 OperandCycleTable += OperandIdxComment + "\n";
493 // Record Itin class number.
494 ItinOperandMap[ItinOperandCycleString] =
495 FindOperandCycle = OperandCycleCount;
496 // Emit as bypass, // index
497 BypassTable += ItinBypassString + ", // " + OperandIdxComment + "\n";
498 OperandCycleCount += NOperandCycles;
499 }
500 }
501
502 // Set up itinerary as location and location + stage count
503 int NumUOps = ItinData ? ItinData->getValueAsInt("NumMicroOps") : 0;
504 InstrItinerary Intinerary = { NumUOps, FindStage, FindStage + NStages,
505 FindOperandCycle,
506 FindOperandCycle + NOperandCycles};
507
508 // Inject - empty slots will be 0, 0
509 ItinList[SchedClassIdx] = Intinerary;
510 }
511 }
512
513 // Closing stage
514 StageTable += " { 0, 0, 0, llvm::InstrStage::Required } // End stages\n";
515 StageTable += "};\n";
516
517 // Closing operand cycles
518 OperandCycleTable += " 0 // End operand cycles\n";
519 OperandCycleTable += "};\n";
520
521 BypassTable += " 0 // End bypass tables\n";
522 BypassTable += "};\n";
523
524 // Emit tables.
525 OS << StageTable;
526 OS << OperandCycleTable;
527 OS << BypassTable;
528 }
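3.6.2.2.2. Execution stage data
For every processor that uses itineraries to assist instruction scheduling, the ItinDefList container of its CodeGenProcModel instance holds entries taken from the IID list (of type list<InstrItinData>) of the corresponding ProcessorItineraries definition. The container associates each scheduling class with the InstrItinData definition that refers to the same InstrItinClass definition. The assertion on line 438 above must hold, because line 784 of collectProcItins resizes ProcModel.ItinDefList to NumInstrSchedClasses.
For a given CodeGenProcModel object, the loop on line 440 effectively iterates over all the non-inferred CodeGenSchedClass objects. Line 444 therefore fetches the Record of the InstrItinData definition that matches the given scheduling class, and line 450 passes it as the second argument to FormItineraryStageString.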
274 void SubtargetEmitter::FormItineraryStageString ( const std::string &Name,
275 Record *ItinData,
276 std::string &ItinString,
277 unsigned &NStages) {
278 // Get states list
279 const std::vector<Record*> &StageList =
280 ItinData->getValueAsListOfDefs("Stages");
281
282 // For each stage
283 unsigned N = NStages = StageList.size();
284 for (unsigned i = 0; i < N;) {
285 // Next stage
286 const Record *Stage = StageList[i];
287
288 // Form string as ,{ cycles, u1 | u2 | ... | un, timeinc, kind }
289 int Cycles = Stage->getValueAsInt("Cycles");
290 ItinString += " { " + itostr(Cycles) + ", ";
291
292 // Get unit list
293 const std::vector<Record*> &UnitList = Stage->getValueAsListOfDefs("Units");
294
295 // For each unit
296 for (unsigned j = 0, M = UnitList.size(); j < M;) {
297 // Add name and bitwise or
298 ItinString += Name + "FU::" + UnitList[j]->getName();
299 if (++j < M) ItinString += " | ";
300 }
301
302 int TimeInc = Stage->getValueAsInt("TimeInc");
303 ItinString += ", " + itostr(TimeInc);
304
305 int Kind = Stage->getValueAsInt("Kind");
306 ItinString += ", (llvm::InstrStage::ReservationKinds)" + itostr(Kind);
307
308 // Close off stage
309 ItinString += " }";
310 if (++i < N) ItinString += ", ";
311 }
312 }
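The emitted strings follow the examples given above for the InstrStage class. An InstrItinData definition also has an OperandCycles field, which gives, for each operand, the number of cycles after issue at which that operand's value has been read or written.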
319 void SubtargetEmitter::FormItineraryOperandCycleString (Record *ItinData,
320 std::string &ItinString, unsigned &NOperandCycles) {
321 // Get operand cycle list
322 const std::vector<int64_t> &OperandCycleList =
323 ItinData->getValueAsListOfInts("OperandCycles");
324
325 // For each operand cycle
326 unsigned N = NOperandCycles = OperandCycleList.size();
327 for (unsigned i = 0; i < N;) {
328 // Next operand cycle
329 const int OCycle = OperandCycleList[i];
330
331 ItinString += " " + itostr(OCycle);
332 if (++i < N) ItinString += ", ";
333 }
334 }
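Finally, an array describing bypasses is emitted. The InstrItinData definitions in the .td file are thus split across these three arrays, because the arrays capture three largely independent dimensions of an InstrItinData definition. Each dimension may also contain many duplicates, so building three arrays and identifying an InstrItinData definition by array indices yields a more compact data structure.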
336 void SubtargetEmitter::FormItineraryBypassString ( const std::string &Name,
337 Record *ItinData,
338 std::string &ItinString,
339 unsigned NOperandCycles) {
340 const std::vector<Record*> &BypassList =
341 ItinData->getValueAsListOfDefs("Bypasses");
342 unsigned N = BypassList.size();
343 unsigned i = 0;
344 for (; i < N;) {
345 ItinString += Name + "Bypass::" + BypassList[i]->getName();
346 if (++i < NOperandCycles) ItinString += ", ";
347 }
348 for (; i < NOperandCycles;) {
349 ItinString += " 0";
350 if (++i < NOperandCycles) ItinString += ", ";
351 }
352 }
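Note that in FormItineraryOperandCycleString the parameter NOperandCycles is a reference. Line 326 sets it to the size of OperandCycles in the InstrItinData definition. It is then passed to FormItineraryBypassString, where it controls the size of the bypass array.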
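The padding behaviour of FormItineraryBypassString can be sketched as follows. `padBypasses` is a hypothetical helper that returns the entries as a vector instead of appending to a string, and the assertion that there are no more bypasses than operand cycles is an assumption made for this sketch (the original function does not check it).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Produces one bypass entry per operand cycle: named bypasses first,
// then "0" (which equals NoBypass) for operands that have no bypass listed.
std::vector<std::string> padBypasses(const std::string &Name,
                                     const std::vector<std::string> &Bypasses,
                                     unsigned NOperandCycles) {
  // Assumption of this sketch, not enforced by the original code.
  assert(Bypasses.size() <= NOperandCycles && "more bypasses than operands");
  std::vector<std::string> Row;
  for (const std::string &BP : Bypasses)
    Row.push_back(Name + "Bypass::" + BP);
  while (Row.size() < NOperandCycles)
    Row.push_back("0");
  return Row;
}
```

Padding keeps the bypass row the same length as the operand-cycle row, so a single pair of indices (FirstOperandCycle, LastOperandCycle) can address both tables.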
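On line 466 of EmitStageAndOperandCycleData, NStages is the size of the Stages list of the InstrItinData definition, as set by FormItineraryStageString. The ItinStageMap container (std::map<std::string, unsigned>) guarantees that each InstrStage sequence is generated only once: lines 468~477 emit a sequence only if it has not been seen before. The ItinOperandMap container plays the same role for operand cycles.
Line 504 creates an InstrItinerary instance and stores it at the corresponding position in the ProcItinLists container. Starting at line 514 the three arrays are closed and then emitted. For the X86 target, for example, the result is: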
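The deduplication on lines 465~478 can be sketched in isolation. `UniqueTable` is a hypothetical class written for this illustration. It relies on the same trick as the emitter: index 0 is reserved for "no itinerary", so a map lookup that returns 0 means the entry has not been emitted yet.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper that mirrors the ItinStageMap logic.
class UniqueTable {
public:
  // Returns the index of the first element of Entry, appending Entry to the
  // table text if this exact string has not been seen before.
  unsigned intern(const std::string &Entry, unsigned NumElements) {
    assert(NumElements > 0 && "empty entries are represented by index 0");
    unsigned &Slot = Index[Entry]; // value-initialised to 0 when absent
    if (Slot == 0) {
      Text += Entry + ", // " + std::to_string(Next) + "\n";
      Slot = Next;
      Next += NumElements;
    }
    return Slot;
  }

  const std::string &text() const { return Text; }
  unsigned nextIndex() const { return Next; }

private:
  std::map<std::string, unsigned> Index;
  std::string Text;
  unsigned Next = 1; // index 0 is reserved for "no itinerary"
};
```

Two scheduling classes whose stage strings are identical receive the same starting index. This is consistent with the itinerary table shown later, where IIC_AAA_WriteMicrocoded and IIC_AAS_WriteMicrocoded both refer to stages 1 to 2.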
extern const llvm::InstrStage X86Stages[] = {
{ 0, 0, 0, llvm::InstrStage::Required }, // No itinerary
{ 13, AtomItinerariesFU::Port0 | AtomItinerariesFU::Port1, -1, (llvm::InstrStage::ReservationKinds)0 }, // 1
{ 7, AtomItinerariesFU::Port0 | AtomItinerariesFU::Port1, -1, (llvm::InstrStage::ReservationKinds)0 }, // 2
{ 21, AtomItinerariesFU::Port0 | AtomItinerariesFU::Port1, -1, (llvm::InstrStage::ReservationKinds)0 }, // 3
{ 1, AtomItinerariesFU::Port0 | AtomItinerariesFU::Port1, -1, (llvm::InstrStage::ReservationKinds)0 }, // 4
…
{ 202, AtomItinerariesFU::Port0 | AtomItinerariesFU::Port1, -1, (llvm::InstrStage::ReservationKinds)0 }, // 92
{ 0, 0, 0, llvm::InstrStage::Required } // End stages
};
extern const unsigned X86OperandCycles[] = {
0, // No itinerary
0 // End operand cycles
};
extern const unsigned X86ForwardingPaths[] = {
0, // No itinerary
0 // End bypass tables
};
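These three tables are tied together by the InstrItinerary array generated next. The ProcItinLists parameter of EmitItineraries was prepared earlier by EmitStageAndOperandCycleData. Note that the loop on line 546 walks the ProcModels container of SchedModels in the same order in which EmitStageAndOperandCycleData walked it when preparing the InstrItinerary data, and that ProcItinLists always has the same size as ProcModels (line 427 of EmitStageAndOperandCycleData). Line 432 also shows that for a processor without itineraries the ProcItinLists entry is left empty, while line 509 shows that for a processor with itineraries a new InstrItinerary instance is always stored in that processor's ProcItinLists entry, whether or not an identical instance already exists. The processors visited below therefore correspond one-to-one with the entries of ProcItinLists (the condition on line 562 filters out processors that do not use itineraries).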
536 void SubtargetEmitter::
537 EmitItineraries (raw_ostream &OS,
538 std::vector<std::vector<InstrItinerary> > &ProcItinLists) {
539
540 // Multiple processor models may share an itinerary record. Emit it once.
541 SmallPtrSet<Record*, 8> ItinsDefSet;
542
543 // For each processor's machine model
544 std::vector<std::vector<InstrItinerary> >::iterator
545 ProcItinListsIter = ProcItinLists.begin();
546 for (CodeGenSchedModels::ProcIter PI = SchedModels.procModelBegin(),
547 PE = SchedModels.procModelEnd(); PI != PE; ++PI, ++ProcItinListsIter) {
548
549 Record *ItinsDef = PI->ItinsDef;
550 if (!ItinsDefSet.insert(ItinsDef).second)
551 continue ;
552
553 // Get processor itinerary name
554 const std::string &Name = ItinsDef->getName();
555
556 // Get the itinerary list for the processor.
557 assert (ProcItinListsIter != ProcItinLists.end() && "bad iterator");
558 std::vector<InstrItinerary> &ItinList = *ProcItinListsIter;
559
560 // Empty itineraries aren't referenced anywhere in the tablegen output
561 // so don't emit them.
562 if (ItinList.empty())
563 continue ;
564
565 OS << "\n";
566 OS << "static const llvm::InstrItinerary ";
567
568 // Begin processor itinerary table
569 OS << Name << "[] = {\n";
570
571 // For each itinerary class in CodeGenSchedClass::Index order.
572 for (unsigned j = 0, M = ItinList.size(); j < M; ++j) {
573 InstrItinerary &Intinerary = ItinList[j];
574
575 // Emit Itinerary in the form of
576 // { firstStage, lastStage, firstCycle, lastCycle } // index
577 OS << " { " <<
578 Intinerary.NumMicroOps << ", " <<
579 Intinerary.FirstStage << ", " <<
580 Intinerary.LastStage << ", " <<
581 Intinerary.FirstOperandCycle << ", " <<
582 Intinerary.LastOperandCycle << " }" <<
583 ", // " << j << " " << SchedModels.getSchedClass(j).Name << "\n";
584 }
585 // End processor itinerary table
586 OS << " { 0, ~0U, ~0U, ~0U, ~0U } // end marker\n";
587 OS << "};\n";
588 }
589 }
For the X86 target only the Atom processor uses itineraries, so the emitted array looks like this (950 entries):
static const llvm::InstrItinerary AtomItineraries[] = {
{ 0, 0, 0, 0, 0 }, // 0 NoInstrModel
{ 1, 1, 2, 0, 0 }, // 1 IIC_AAA_WriteMicrocoded
{ 1, 2, 3, 0, 0 }, // 2 IIC_AAD_WriteMicrocoded
{ 1, 3, 4, 0, 0 }, // 3 IIC_AAM_WriteMicrocoded
{ 1, 1, 2, 0, 0 }, // 4 IIC_AAS_WriteMicrocoded
{ 1, 4, 5, 0, 0 }, // 5 IIC_BIN_CARRY_NONMEM_WriteALU
…
{ 1, 43, 44, 0, 0 }, // 948 LDMXCSR_VLDMXCSR
{ 1, 17, 18, 0, 0 }, // 949 STMXCSR_VSTMXCSR
{ 0, ~0U, ~0U, ~0U, ~0U } // end marker
};
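The comments give the names of the scheduling classes. Note that the order here is exactly the same as the order of the enumerators that denote scheduling classes in the Sched namespace in X86GenInstrInfo.inc. This consistency lets us use those enumerators to look up the concrete parameters of the corresponding scheduling class.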
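To make the indexing concrete, here is a minimal sketch of how a scheduling class index leads to its stages. The two miniature tables reuse only the first few entries printed above. The `Stage` and `Itinerary` structs, the table names, and `totalStageCycles` are hypothetical simplifications for this illustration, and the unit mask 3 stands for Port0 | Port1.

```cpp
#include <cassert>

// Simplified, hypothetical stand-ins for llvm::InstrStage and
// llvm::InstrItinerary.
struct Stage { unsigned Cycles; unsigned Units; int NextCycles; };
struct Itinerary {
  int NumMicroOps;
  unsigned FirstStage, LastStage;
  unsigned FirstOperandCycle, LastOperandCycle;
};

// First entries of the stage table shown earlier (index 0 = no itinerary).
static const Stage MiniStages[] = {
    {0, 0, 0}, {13, 3, -1}, {7, 3, -1}, {21, 3, -1}, {1, 3, -1}, {0, 0, 0}};

// First entries of AtomItineraries, indexed by scheduling class.
static const Itinerary MiniItins[] = {
    {0, 0, 0, 0, 0}, {1, 1, 2, 0, 0}, {1, 2, 3, 0, 0},
    {1, 3, 4, 0, 0}, {1, 1, 2, 0, 0}, {1, 4, 5, 0, 0}};

// Sums the cycles of all stages in the half-open range
// [FirstStage, LastStage) of the given scheduling class.
unsigned totalStageCycles(unsigned SchedClass) {
  assert(SchedClass < sizeof(MiniItins) / sizeof(MiniItins[0]));
  const Itinerary &I = MiniItins[SchedClass];
  unsigned Total = 0;
  for (unsigned S = I.FirstStage; S != I.LastStage; ++S)
    Total += MiniStages[S].Cycles;
  return Total;
}
```

With these tables, class 1 (IIC_AAA_WriteMicrocoded) and class 4 (IIC_AAS_WriteMicrocoded) both resolve to the 13-cycle stage at index 1, and class 0 (NoInstrModel) has an empty stage range.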