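The article at http://ilinuxkernel.com/?p=1755 described how to read the CPU Cycle Counter / Time Stamp Counter on ARMv8/ARM64. When tested on some machines, however, the values read back were clearly abnormal and did not reflect the real CPU cycle count.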
[root@localhost TSC]# ./arm_tsc
PMCR_EL0 Register:41013001
CPU Cycle Count:0x392151
CPU Cycle Count:0x44CD30
Register PMOVSCLR_EL0:0x0
Register pmuserenr_el0:0xF
Register PMCNTENSET_EL0:0x80000000
Register PMCCFILTR_EL0:0x0
Register PMCNTENCLR_EL0:0x80000000
Register PMOVSSET_EL0:0x0
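The dump above can be read field by field. The following is a minimal Python sketch, not part of the original test program. It assumes the ARMv8 PMU layouts: PMCR_EL0 with IMP in bits [31:24], IDCODE in [23:16], N in [15:11] and E in bit 0; PMUSERENR_EL0 with EN, SW, CR and ER in bits 0 to 3; and bit 31 of PMCNTENSET_EL0 as the cycle counter enable. The function names are hypothetical.

```python
def bits(value, hi, lo):
    """Return the field value[hi:lo], with both bit positions inclusive."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode_pmcr(pmcr):
    """Split a PMCR_EL0 value into the fields used here (assumed ARMv8 layout)."""
    return {
        "IMP": bits(pmcr, 31, 24),     # implementer code
        "IDCODE": bits(pmcr, 23, 16),  # identification code
        "N": bits(pmcr, 15, 11),       # number of event counters
        "E": bits(pmcr, 0, 0),         # global counter enable
    }


def decode_pmuserenr(value):
    """Split a PMUSERENR_EL0 value into its four low control bits."""
    return {
        "EN": bits(value, 0, 0),  # EL0 access enable
        "SW": bits(value, 1, 1),  # software increment write enable
        "CR": bits(value, 2, 2),  # cycle counter read enable
        "ER": bits(value, 3, 3),  # event counter read enable
    }


def cycle_counter_enabled(pmcntenset):
    """True when bit 31 (cycle counter enable) is set in PMCNTENSET_EL0."""
    return bits(pmcntenset, 31, 31) == 1


# Values taken from the console output above.
print(decode_pmcr(0x41013001))
print(decode_pmuserenr(0xF))
print(cycle_counter_enabled(0x80000000))
```

Under these assumptions the dump decodes to N = 6 event counters, E = 1, all four user-access bits set, and the cycle counter enable bit set, which suggests that the PMU configuration itself is not what makes the readings wrong.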
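Root cause: during initialization the BIOS sets bit 30 of the CPUACTLR_EL1 register to 0, that is, the "Force main clock enable active" field is not set correctly.
Fix: have the BIOS set bit 30 of CPUACTLR_EL1 to 1 during initialization.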
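The fix amounts to a read-modify-write that sets bit 30. The Python sketch below only models the arithmetic on a register value; it does not touch hardware, where the actual write would be done by firmware. The constant and function names are hypothetical, and the retention check reflects the note in the bit [30] description that the bit must be zero when processor dynamic retention is used.

```python
FORCE_MAIN_CLOCK_BIT = 30  # CPUACTLR_EL1[30], "Force main clock enable active"


def main_clock_forced(cpuactlr):
    """True when bit 30 is set in the given CPUACTLR_EL1 value."""
    return ((cpuactlr >> FORCE_MAIN_CLOCK_BIT) & 1) == 1


def force_main_clock(cpuactlr, dynamic_retention_used=False):
    """Return the value with bit 30 set, leaving every other bit unchanged."""
    if dynamic_retention_used:
        # The register description requires bit 30 to be 0 in this case.
        raise ValueError("bit 30 must stay 0 when dynamic retention is used")
    return cpuactlr | (1 << FORCE_MAIN_CLOCK_BIT)


old_value = 0x0  # illustrative value with bit 30 clear
new_value = force_main_clock(old_value)
print(hex(new_value), main_clock_forced(new_value))
```

Using an OR keeps the remaining IMPLEMENTATION DEFINED bits as the firmware found them, which is usually what a one-bit fix wants.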
CPU Auxiliary Control Register, EL1
- Purpose: Provides IMPLEMENTATION DEFINED configuration and control options for the processor. There is one 64-bit CPU Auxiliary Control Register for each core in the cluster.
CPUACTLR_EL1[31:0] bit assignments
Each entry lists the bits, the field name, and the function.
[31] Snoop-delayed exclusive handling. The possible values are:
- 0: Normal exclusive handling behavior. This is the reset value.
- 1: Modifies exclusive handling behavior by delaying certain snoop requests.
[30] Force main clock enable active. The possible values are:
- 0: Does not prevent the clock generator from stopping the processor clock. This is the reset value.
- 1: Prevents the clock generator from stopping the processor clock.
If the processor dynamic retention feature is used then this bit must be zero. See Processor dynamic retention.
[29] Force Advanced SIMD and floating-point clock enable active. The possible values are:
- 0: Does not prevent the clock generator from stopping the Advanced SIMD and Floating-point clock. This is the reset value.
- 1: Prevents the clock generator from stopping the Advanced SIMD and Floating-point clock.
If the processor dynamic retention feature is used then this bit must be zero. See Processor dynamic retention.
[28:27] Write streaming no-allocate threshold. The possible values are:
- 0b00: 12th consecutive streaming cache line does not allocate in the L1 or L2 cache. This is the reset value.
- 0b01: 128th consecutive streaming cache line does not allocate in the L1 or L2 cache.
- 0b10: 512th consecutive streaming cache line does not allocate in the L1 or L2 cache.
- 0b11: Disables streaming. All Write-Allocate lines allocate in the L1 or L2 cache.
[26:25] Write streaming no-L1-allocate threshold. The possible values are:
- 0b00: 4th consecutive streaming cache line does not allocate in the L1 cache. This is the reset value.
- 0b01: 64th consecutive streaming cache line does not allocate in the L1 cache.
- 0b10: 128th consecutive streaming cache line does not allocate in the L1 cache.
- 0b11: Disables streaming. All Write-Allocate lines allocate in the L1 cache.
[24] Non-cacheable streaming enhancement. You can set this bit only if your memory system meets the requirement that cache line fill requests from the Cortex-A57 processor are atomic. The possible values are:
- 0: Disables higher performance Non-cacheable load forwarding. This is the reset value.
- 1: Enables higher performance Non-cacheable load forwarding. See 6.4.4 Non-cacheable streaming enhancement for more information.
[23] Force in-order requests to the same set and way. The possible values are:
- 0: Does not force in-order requests to the same set and way. This is the reset value.
- 1: Forces in-order requests to the same set and way.
[22] Force in-order load issue. The possible values are:
- 0: Does not force in-order load issue. This is the reset value.
- 1: Forces in-order load issue.
[21] Disable L2 TLB prefetching. The possible values are:
- 0: Enables L2 TLB prefetching. This is the reset value.
- 1: Disables L2 TLB prefetching.
[20] Disable L2 translation table walk IPA PA cache. Disables the L2 translation table walk Intermediate Physical Address (IPA) to Physical Address (PA) cache. The possible values are:
- 0: Enables L2 translation table walk IPA to PA cache. This is the reset value.
- 1: Disables L2 translation table walk IPA to PA cache.
[19] Disable L2 stage 1 translation table walk cache. The possible values are:
- 0: Enables L2 stage 1 translation table walk cache. This is the reset value.
- 1: Disables L2 stage 1 translation table walk cache.
[18] Disable L2 stage 1 translation table walk L2 PA cache. The possible values are:
- 0: Enables L2 stage 1 translation table walk L2 PA cache. This is the reset value.
- 1: Disables L2 stage 1 translation table walk L2 PA cache.
[17] Disable L2 TLB performance optimization. The possible values are:
- 0: Enables L2 TLB optimization. This is the reset value.
- 1: Disables L2 TLB optimization.
[16] Enable full Strongly-ordered and Device load replay. The possible values are:
- 0: Disables full Strongly-ordered or Device load replay. This is the reset value.
- 1: Enables full Strongly-ordered or Device load replay.
[15] Force in-order issue in branch execute unit. The possible values are:
- 0: Disables forced in-order issue. This is the reset value.
- 1: Forces in-order issue.
[14] Force limit of one instruction group commit/de-allocate per cycle. The possible values are:
- 0: Normal commit and de-allocate behavior. This is the reset value.
- 1: Limits commit and de-allocate to one instruction group per cycle.
[13] Flush after Special Purpose Register (SPR) writes. The possible values are:
- 0: Normal behavior for SPR writes. This is the reset value.
- 1: Flushes after certain SPR writes.
[12] Force push of SPRs. The possible values are:
- 0: Normal behavior for SPRs. This is the reset value.
- 1: Pushes certain SPRs from local dispatch copies to shadow copies.
Note: Setting this bit to 1 forces the processor to behave as if bit[13] is set to 1.
[11] Limit to one instruction per instruction group. The possible values are:
- 0: Normal instruction grouping. This is the reset value.
- 1: Limits to one instruction per instruction group.
[10] Force serialization after each instruction group. The possible values are:
- 0: Disables forced serialization after each instruction group. This is the reset value.
- 1: Forces serialization after each instruction group.
Note: Setting this bit to 1 forces the processor to behave as if bit[11] is set to 1.
[9] Disable flag renaming optimization. The possible values are:
- 0: Enables normal flag renaming optimization. This is the reset value.
- 1: Disables normal flag renaming optimization.
[8] Execute WFI instruction as a NOP instruction. The possible values are:
- 0: Executes WFI instruction as defined in the ARM® Architecture Reference Manual ARMv8. This is the reset value.
- 1: Executes WFI instruction as a NOP instruction, and does not put the processor in WFI low-power state.
[7] Execute WFE instruction as a NOP instruction. The possible values are:
- 0: Executes WFE instruction as defined in the ARM® Architecture Reference Manual ARMv8. This is the reset value.
- 1: Executes WFE instruction as a NOP instruction, and does not put the processor in WFE low-power state.
[6] Reserved, RES0.
[5] Execute PLD and PLDW instructions as a NOP instruction. The possible values are:
- 0: Executes PLD and PLDW instructions as defined in the ARM® Architecture Reference Manual ARMv8. This is the reset value.
- 1: Executes PLD and PLDW instructions as a NOP instruction.
[4] Disable indirect predictor. The possible values are:
- 0: Enables indirect predictor. This is the reset value.
- 1: Disables indirect predictor.
[3] Disable micro-BTB. Disables the micro-Branch Target Buffer (BTB). The possible values are:
- 0: Enables micro-BTB. This is the reset value.
- 1: Disables micro-BTB.
[2] Reserved, RES0.
[1] Disable Instruction Cache miss streaming. The possible values are:
- 0: Enables Instruction Cache miss streaming. Sequential fetches resulting from Instruction Cache misses wait until individual packets arrive. This is the reset value.
- 1: Disables Instruction Cache miss streaming. Sequential fetches resulting from Instruction Cache misses internally generate misses for each packet.
[0] Enable invalidates of BTB. The possible values are:
- 0: The Invalidate Instruction Cache All and Invalidate Instruction Cache by VA instructions invalidate only the instruction cache array. This is the reset value.
- 1: The Invalidate Instruction Cache All and Invalidate Instruction Cache by VA instructions invalidate the instruction cache array and branch target buffer.
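Multi-bit fields in this register, such as the two write streaming thresholds, are decoded with a shift and a mask. The Python sketch below illustrates this for bits [28:27] and [26:25], using the encodings listed above; the table and function names are hypothetical, and `None` is used here to stand for the 0b11 encoding, where streaming is disabled.

```python
# Encoding -> threshold, in consecutive streaming cache lines (None: streaming disabled).
NO_ALLOCATE_THRESHOLD = {0b00: 12, 0b01: 128, 0b10: 512, 0b11: None}     # bits [28:27]
NO_L1_ALLOCATE_THRESHOLD = {0b00: 4, 0b01: 64, 0b10: 128, 0b11: None}    # bits [26:25]


def write_streaming_thresholds(cpuactlr):
    """Return (no-allocate, no-L1-allocate) thresholds for a CPUACTLR_EL1 value."""
    no_allocate = (cpuactlr >> 27) & 0b11
    no_l1_allocate = (cpuactlr >> 25) & 0b11
    return (NO_ALLOCATE_THRESHOLD[no_allocate],
            NO_L1_ALLOCATE_THRESHOLD[no_l1_allocate])


# The reset value 0 selects the 12th-line and 4th-line thresholds.
print(write_streaming_thresholds(0x0))
```

The same shift-and-mask pattern applies to the single-bit fields, with a mask of 1.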