Allows a program to request that a cache block be fetched before the program actually needs it.
Note: The dcbt instruction is supported only in the PowerPC architecture.
The dcbt instruction may improve performance by anticipating a load from the addressed byte. The block containing the byte addressed by the effective address (EA) is fetched into the data cache before the block is needed by the program. The program can later perform loads from the block and may not experience the added delay caused by fetching the block into the cache. Executing the dcbt instruction does not invoke the system error handler.
If general-purpose register (GPR) RA is not 0, the effective address (EA) is the sum of the content of GPR RA and the content of GPR RB. Otherwise, the EA is the content of GPR RB.
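The EA rule above can be modeled in a few lines of C. This is only an illustrative sketch: the `ppc_regs` structure and the `x_form_ea` function are hypothetical names introduced here, and a 64-bit register width is assumed.

```c
#include <stdint.h>

/* Hypothetical model of the register file: 32 GPRs, assumed 64 bits wide. */
typedef struct {
    uint64_t gpr[32];
} ppc_regs;

/*
 * Effective address for an instruction of the form "dcbt RA,RB".
 * A value of 0 in the RA field means the literal value 0,
 * not the content of GPR 0.
 */
static uint64_t x_form_ea(const ppc_regs *regs, unsigned ra, unsigned rb)
{
    uint64_t base = (ra == 0) ? 0 : regs->gpr[ra & 31];
    return base + regs->gpr[rb & 31];
}
```

Under this model, `dcbt 0,4` addresses the byte at the content of GPR 4 alone, whatever GPR 0 happens to hold.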
Consider the following when using the dcbt instruction:
Note: If a program needs to store to the data cache block, use the dcbtst (Data Cache Block Touch for Store) instruction.
The dcbt instruction has one syntax form and does not affect Condition Register field 0 or the Fixed-Point Exception register.
|RA|Specifies source general-purpose register for EA computation.|
|RB|Specifies source general-purpose register for EA computation.|
The following code sums the content of a one-dimensional vector:
    # Assume that GPR 4 contains the address of the first element
    # of the sum.
    # Assume 49 elements are to be summed.
    # Assume the data cache block size is 32 bytes.
    # Assume the elements are word aligned and the addresses
    # are multiples of 4.
            dcbt    0,4             # Issue hint to fetch first
                                    # cache block.
            addi    5,4,32          # Compute address of second
                                    # cache block.
            addi    8,0,6           # Set outer loop count.
            addi    7,0,8           # Set inner loop counter.
            dcbt    0,5             # Issue hint to fetch second
                                    # cache block.
            lwz     3,0(4)          # Set sum = element number 1.
    bigloop:
            addic.  8,8,-1          # Decrement outer loop count
                                    # and set CR field 0.
            mtspr   CTR,7           # Set counter (CTR) for
                                    # inner loop.
            addi    5,5,32          # Compute address for next
                                    # touch.
    lttlloop:
            lwzu    6,4(4)          # Fetch element.
            add     3,3,6           # Add to sum.
            bc      16,0,lttlloop   # Decrement CTR and branch
                                    # if result is not equal to 0.
            dcbt    0,5             # Issue hint to fetch next
                                    # cache block.
            bc      4,2,bigloop     # Branch if outer loop count is
                                    # not equal to 0.
    end:                            # Summation complete.
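For comparison, the same idea can be sketched in C with the GCC and Clang builtin `__builtin_prefetch`, which a compiler targeting PowerPC may lower to a touch instruction such as dcbt, although that is not guaranteed. The function name `sum_with_touch`, the two-block look-ahead distance, and the 32-byte block size are assumptions made for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed cache block size in bytes, matching the assembly example. */
#define CACHE_BLOCK_BYTES 32u
#define WORDS_PER_BLOCK   (CACHE_BLOCK_BYTES / sizeof(uint32_t))

/* Sum n 32-bit words, hinting a later cache block at each block boundary. */
static uint32_t sum_with_touch(const uint32_t *v, size_t n)
{
    uint32_t sum = 0;

    for (size_t i = 0; i < n; i++) {
        /* At the start of each block, touch the block two ahead,
         * but only if that block still lies inside the vector. */
        if (i % WORDS_PER_BLOCK == 0 && i + 2 * WORDS_PER_BLOCK < n) {
            /* Arguments: address, 0 = read access, 3 = high locality. */
            __builtin_prefetch(&v[i + 2 * WORDS_PER_BLOCK], 0, 3);
        }
        sum += v[i];
    }
    return sum;
}
```

The prefetch is only a hint, so the result is the same with or without it; any benefit depends on the processor and on how far ahead the hint is issued.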
The clcs (Cache Line Compute Size) instruction, clf (Cache Line Flush) instruction, cli (Cache Line Invalidate) instruction, dcbf (Data Cache Block Flush) instruction, dcbi (Data Cache Block Invalidate) instruction, dcbst (Data Cache Block Store) instruction, dcbtst (Data Cache Block Touch for Store) instruction, dcbz or dclz (Data Cache Block Set to Zero) instruction, dclst (Data Cache Line Store) instruction, icbi (Instruction Cache Block Invalidate) instruction, sync (Synchronize) or dcs (Data Cache Synchronize) instruction.
Processing and Storage