To implement the operations in a bitsliced fashion, we gather the bits by index yz, i.e. bits with the same row
index and the same slice index are grouped together. We call the resulting bit set a lane¹; all bits of a lane are kept
in the same register in our implementations. The state is rearranged as depicted in Figure 3a.
We use the following conventions. Let S denote the complete state; then S[*, y, z] denotes a particular
lane. In implementations for Scenario 1, two lanes of one state, S[*, y, z] and S[*, y+2, z] (y ∈ {0, 1}, z ∈ {0, ..., 3}),
are stored in one register, in the low half and the high half respectively. In Scenario 2, two lanes of two states,
S[*, y, z] and S'[*, y, z] (y ∈ {0, ..., 3}, z ∈ {0, ..., 3}), are stored in one register to process two blocks in parallel.
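
The two register layouts can be sketched as follows. This is only an illustration under stated assumptions: a 4x4x4 state indexed as S[x][y][z], 8-bit registers holding two 4-bit lanes, bit x of a lane placed at bit position x, and (for Scenario 2) the first state in the low half. None of these details are fixed by the text above, and the helper names are hypothetical.

```python
# Illustrative sketch only. Assumptions (not from the text): a 4x4x4 state
# indexed S[x][y][z], 8-bit registers, and bit x of a lane at bit position x.

def lane(S, y, z):
    """Collect the bits with row index y and slice index z into a 4-bit integer."""
    return sum(S[x][y][z] << x for x in range(4))

def pack_scenario1(S):
    """One state: lane (y, z) in the low half, lane (y+2, z) in the high half."""
    return {(y, z): lane(S, y, z) | (lane(S, y + 2, z) << 4)
            for y in range(2) for z in range(4)}

def pack_scenario2(S, S2):
    """Two states: lane (y, z) of S in the low half and of S2 in the high half.
    Which state goes in which half is an assumption."""
    return {(y, z): lane(S, y, z) | (lane(S2, y, z) << 4)
            for y in range(4) for z in range(4)}

def make_state(f):
    """Build a hypothetical test state with bit f(x, y, z) at position (x, y, z)."""
    return [[[f(x, y, z) for z in range(4)] for y in range(4)] for x in range(4)]

S = make_state(lambda x, y, z: (x + y + z) & 1)
S2 = make_state(lambda x, y, z: (x * y + z) & 1)
regs1 = pack_scenario1(S)      # 8 registers for one block
regs2 = pack_scenario2(S, S2)  # 16 registers for two blocks
```

Under these assumptions Scenario 1 occupies 8 registers for one block, while Scenario 2 occupies 16 registers for two blocks, so the register count per block is the same in both layouts.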
Rearranging the state takes 2 clocks per bit using the rotate-through-carry instructions (ROL and ROR).
Thus, rearranging the input state and restoring the arrangement of the output state take 4 clocks per bit in total.
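
One way to read the 2-clocks-per-bit figure is that one rotate shifts a bit out of the source register into the carry flag and a second rotate shifts the carry into the destination register. The sketch below models this on 8-bit registers. It assumes one clock per rotate and a 64-bit state; both are assumptions made for illustration, as are the function names.

```python
# Simplified model of 8-bit rotate-through-carry instructions.
# Assumptions (not from the text): 1 clock per rotate, 64-bit state.

def rol(reg, carry):
    """Rotate left through carry: returns (new_reg, new_carry)."""
    return ((reg << 1) | carry) & 0xFF, (reg >> 7) & 1

def ror(reg, carry):
    """Rotate right through carry: returns (new_reg, new_carry)."""
    return (reg >> 1) | (carry << 7), reg & 1

def move_bit(src, dst, carry=0):
    """Move the top bit of src into the top bit of dst using two rotates."""
    src, carry = rol(src, carry)  # top bit of src goes into the carry
    dst, carry = ror(dst, carry)  # carry goes into the top bit of dst
    return src, dst, 2            # 2 clocks under the 1-clock-per-rotate assumption

src, dst, clocks = move_bit(0b1000_0000, 0b0000_0000)

STATE_BITS = 64                          # assumed state size
total_clocks = 2 * clocks * STATE_BITS   # input and output rearrangement
clocks_per_bit = total_clocks // STATE_BITS
```

With these assumptions the rearrangement overhead for one block is 2 × 2 × 64 = 256 clocks, which matches the 4 clocks per bit stated above.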