pick pair of bit (0b11) in memory array

Question

my embedded system has 128KB memory array structure for specific purpose

and each 2bit represents 4state( state 0 ,state 1, state 2, state 3)

I'd like to count total state 3 (0b11) in memory array

for example 0xFF001234 = 1111 1111 0000 0000 0001 0010 0011 0100

It counts 5 (0b11)

I searched algorithm but it only counts single bit - https://www.geeksforgeeks.org/count-set-bits-in-an-integer/

I hope to avoid greedy algorithm like compare 0b11 every 2bit

anyone has good idea?

ps : I'm using LEON3 Sparc V8 32bit processor, using C language

What programming language are you using and how do you access the memory? — Socowi, Nov 08 '19 at 08:51
The processor really accesses the memory 4 bits per 4 bits ? Sounds surprising. — Guillaume Petitjean, Nov 08 '19 at 09:16
Writing an effective algorithm for this is all about how big chunks your CPU can munch in a single instruction. It will look wildly different on a 32 bit CPU and a 8 bit CPU. Also, it depends on if you can utilize data cache or not. Attempting to design an effective algorithm for this without a specific target in mind is quackery. So what is your target here? — Lundin, Nov 08 '19 at 09:41

Socowi · Accepted Answer · 2019-11-08T14:18:45.080

2

You have an array uint32_t states[] where each state[i] represents 16 states?

To count the number of 0b11 states in the variable uint32_t s you can use the following approach:

First, pre-process s such that every state 0b11 leads to exactly one 1 bit and all other states lead to 0 bits. Then count the numbers of 1 bits.

Pre-Processing

Split s into the left bits l and right bits r of each state.

s                    AB CD EF GH IJ LM NO PQ RS TU VW XY ZΓ ΔΠ ΦΨ ДЖ
l = s & 0xAAAAAAAA = A0 C0 E0 G0 I0 L0 N0 P0 R0 T0 V0 X0 Z0 Δ0 Φ0 Д0
r = s & 0x55555555 = 0B 0D 0F 0H 0J 0M 0O 0Q 0S 0U 0W 0Y 0Γ 0Π 0Ψ 0Ж

Then align the bits of l and r.

(l >>> 1)          = 0A 0C 0E 0G 0I 0L 0N 0P 0R 0T 0V 0X 0Z 0Δ 0Φ 0Д
r                  = 0B 0D 0F 0H 0J 0M 0O 0Q 0S 0U 0W 0Y 0Γ 0Π 0Ψ 0Ж

Finally, use & to get a 1-bit if and only if the state was 0b11.

(l >>> 1) & r      = 0? 0? 0? 0? 0? 0? 0? 0? 0? 0? 0? 0? 0? 0? 0? 0?

Here ? is 1 if the corresponding state was 0b11 and 0 otherwise.
The same result can be achived by the simplified formula (s >>> 1) & s & 0x55555555.

Bit-Counting

To count the 0b11 states in s we only have to count the 1-bits in (s >>> 1) & s & 0x55555555.

Bit-counting can be done without a loop as explained in the book Hacker's Delight, chapter 5 or in this Stackoverflow answer.

The methods shown here only apply to a single array element. To count the states in your whole array loop over its elements.

Optimization

As pointed out by Lundin in the comments, the operation (s >>> 1) might me expensive if your processors cannot fit uint32_t into its registers. In this case it would be sensible to declare your array states[] not as uint32_t but whatever works best on your processor – the procedure stays the same, you only have to use more or less 555…. If, for some reason you cannot change the type of your array, you can still access it as if it had another type, see how to cast an int array to a byte array in C.

edited Nov 08 '19 at 14:18

answered Nov 08 '19 at 09:29

Socowi

25,550
3
32
54

1

This will be very ineffective on low end CPUs < 32 bits. It isn't possible to suggest to optimal algorithm without a specific target in mind. – Lundin Nov 08 '19 at 09:43
I think it is effective (no matter how fast/slow it is). Regarding efficiency: I guess even on such CPUs this approach would still be faster than a loop, so it cannot be that bad. Anyways, thinking about stuff like that right now seems like premature optimization to me – OP asked for something "*to avoid [...] compare 0b11 every 2bit*" and here it is. – Socowi Nov 08 '19 at 09:47
1

Shifting a 32 bit integer on a 8 bit MCU is some ~100 times slower than iterating over a byte array. It is not premature optimization, it is the opposite: real-world optimization. – Lundin Nov 08 '19 at 10:14
`~100 times slower` – Ok, I didn't expect that as that means the shift `>>> 1` implemented in software (by shifting and combining single bytes) would be a lot faster than the hardware shift. I added some pointers to this answer. Thank you for the comments. – Socowi Nov 08 '19 at 10:28
@Socowi Better start with the naïve method `if ((arr[idx] & 0x03) == 0x03) count++;` (same for 0x0c, 0x30, 0xc0) in a loop. Any decent compiler will see dat `arr[idx]` is used four times, and wil keep it in a register in the loop body. (same for the pointer version). Also, the loop will only need at most three variables, which can all be kept in registers. – joop Nov 08 '19 at 12:03
@joop Feel free to add this as an additional answer in which you explain that OP's requirement "*"to avoid [...] compare 0b11 every 2bit"*" isn't useful. – Socowi Nov 08 '19 at 12:10
Thanks for @Socowi and Ludin and other guys for helping me. also, I am sorry for give you not enough information about I'm using 32bit CPU... If I do onboard test, I'll record the result against the greedy algorithm – I.C.Baek Nov 14 '19 at 02:45

pick pair of bit (0b11) in memory array

1 Answers1

Pre-Processing

Bit-Counting

Optimization