Commit fc5adbeb authored by Wang YanQing, committed by Brian Norris

Documentation: mtd: improve nand_ecc.txt for readability and correctness

This patch corrects some representation errors, adds a little
clarification in some places, and fixes indentation problems
in the pseudo code.

It also deletes one extra white space in one place.
Signed-off-by: Wang YanQing <udknight@gmail.com>
[Brian: a few tweaks]
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
parent 1b15b1f5
...@@ -107,7 +107,7 @@ for (i = 0; i < 256; i++) ...@@ -107,7 +107,7 @@ for (i = 0; i < 256; i++)
if (i & 0x01) if (i & 0x01)
rp1 = bit7 ^ bit6 ^ bit5 ^ bit4 ^ bit3 ^ bit2 ^ bit1 ^ bit0 ^ rp1; rp1 = bit7 ^ bit6 ^ bit5 ^ bit4 ^ bit3 ^ bit2 ^ bit1 ^ bit0 ^ rp1;
else else
rp0 = bit7 ^ bit6 ^ bit5 ^ bit4 ^ bit3 ^ bit2 ^ bit1 ^ bit0 ^ rp1; rp0 = bit7 ^ bit6 ^ bit5 ^ bit4 ^ bit3 ^ bit2 ^ bit1 ^ bit0 ^ rp0;
if (i & 0x02) if (i & 0x02)
rp3 = bit7 ^ bit6 ^ bit5 ^ bit4 ^ bit3 ^ bit2 ^ bit1 ^ bit0 ^ rp3; rp3 = bit7 ^ bit6 ^ bit5 ^ bit4 ^ bit3 ^ bit2 ^ bit1 ^ bit0 ^ rp3;
else else
...@@ -127,7 +127,7 @@ for (i = 0; i < 256; i++) ...@@ -127,7 +127,7 @@ for (i = 0; i < 256; i++)
if (i & 0x20) if (i & 0x20)
rp11 = bit7 ^ bit6 ^ bit5 ^ bit4 ^ bit3 ^ bit2 ^ bit1 ^ bit0 ^ rp11; rp11 = bit7 ^ bit6 ^ bit5 ^ bit4 ^ bit3 ^ bit2 ^ bit1 ^ bit0 ^ rp11;
else else
rp10 = bit7 ^ bit6 ^ bit5 ^ bit4 ^ bit3 ^ bit2 ^ bit1 ^ bit0 ^ rp10; rp10 = bit7 ^ bit6 ^ bit5 ^ bit4 ^ bit3 ^ bit2 ^ bit1 ^ bit0 ^ rp10;
if (i & 0x40) if (i & 0x40)
rp13 = bit7 ^ bit6 ^ bit5 ^ bit4 ^ bit3 ^ bit2 ^ bit1 ^ bit0 ^ rp13; rp13 = bit7 ^ bit6 ^ bit5 ^ bit4 ^ bit3 ^ bit2 ^ bit1 ^ bit0 ^ rp13;
else else
...@@ -158,7 +158,7 @@ the values in any order. So instead of calculating all the bits ...@@ -158,7 +158,7 @@ the values in any order. So instead of calculating all the bits
individually, let us try to rearrange things. individually, let us try to rearrange things.
For the column parity this is easy. We can just xor the bytes and in the For the column parity this is easy. We can just xor the bytes and in the
end filter out the relevant bits. This is pretty nice as it will bring end filter out the relevant bits. This is pretty nice as it will bring
all cp calculation out of the if loop. all cp calculation out of the for loop.
Similarly we can first xor the bytes for the various rows. Similarly we can first xor the bytes for the various rows.
This leads to: This leads to:
...@@ -271,11 +271,11 @@ to write our code in such a way that we process data in 32 bit chunks. ...@@ -271,11 +271,11 @@ to write our code in such a way that we process data in 32 bit chunks.
Of course this means some modification as the row parity is byte by Of course this means some modification as the row parity is byte by
byte. A quick analysis: byte. A quick analysis:
for the column parity we use the par variable. When extending to 32 bits for the column parity we use the par variable. When extending to 32 bits
we can in the end easily calculate p0 and p1 from it. we can in the end easily calculate rp0 and rp1 from it.
(because par now consists of 4 bytes, contributing to rp1, rp0, rp1, rp0 (because par now consists of 4 bytes, contributing to rp1, rp0, rp1, rp0
respectively) respectively, from MSB to LSB)
also rp2 and rp3 can be easily retrieved from par as rp3 covers the also rp2 and rp3 can be easily retrieved from par as rp3 covers the
first two bytes and rp2 the last two bytes. first two MSBs and rp2 covers the last two LSBs.
Note that of course now the loop is executed only 64 times (256/4). Note that of course now the loop is executed only 64 times (256/4).
And note that care must taken wrt byte ordering. The way bytes are And note that care must taken wrt byte ordering. The way bytes are
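The byte-lane claim in the hunk above (rp1, rp0, rp1, rp0 from MSB to LSB, with rp3 over the two MSBs and rp2 over the two LSBs) can be checked with a small sketch. The names `low_rp`, `load_le32`, `low_rp_from_par` and `low_rp_selfcheck` are hypothetical and not taken from the document. The sketch assumes byte 0 of each group of four lands in the LSB of the 32-bit word, and it keeps byte-wide accumulators, not yet reduced to single parity bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte-wide accumulators for the four lowest row parities (hypothetical type). */
struct low_rp { uint8_t rp0, rp1, rp2, rp3; };

/* Assumption: byte 0 goes into the LSB, independent of host endianness. */
uint32_t load_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Fold the 32-bit running xor into rp0..rp3. */
struct low_rp low_rp_from_par(uint32_t par)
{
    struct low_rp r;
    r.rp0 = (uint8_t)(par ^ (par >> 16));         /* byte lanes 0 and 2 */
    r.rp1 = (uint8_t)((par >> 8) ^ (par >> 24));  /* byte lanes 1 and 3 */
    r.rp2 = (uint8_t)(par ^ (par >> 8));          /* two LSBs: lanes 0 and 1 */
    r.rp3 = (uint8_t)((par >> 16) ^ (par >> 24)); /* two MSBs: lanes 2 and 3 */
    return r;
}

/* Compare against the byte-by-byte definition on a deterministic buffer. */
int low_rp_selfcheck(void)
{
    uint8_t buf[256];
    uint8_t rp0 = 0, rp1 = 0, rp2 = 0, rp3 = 0;
    uint32_t par = 0;

    for (size_t i = 0; i < sizeof(buf); i++) {
        buf[i] = (uint8_t)(i * 37u + 11u);
        if (i & 0x01) rp1 ^= buf[i]; else rp0 ^= buf[i];
        if (i & 0x02) rp3 ^= buf[i]; else rp2 ^= buf[i];
    }
    for (size_t i = 0; i < sizeof(buf); i += 4)   /* 64 iterations (256/4) */
        par ^= load_le32(buf + i);

    struct low_rp r = low_rp_from_par(par);
    return r.rp0 == rp0 && r.rp1 == rp1 && r.rp2 == rp2 && r.rp3 == rp3;
}
```

Building the word explicitly with `load_le32` sidesteps the byte-ordering caveat; code that casts the buffer to a 32-bit pointer would have to swap the lanes on a big-endian host.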
...@@ -387,11 +387,11 @@ Analysis 2 ...@@ -387,11 +387,11 @@ Analysis 2
The code (of course) works, and hurray: we are a little bit faster than The code (of course) works, and hurray: we are a little bit faster than
the linux driver code (about 15%). But wait, don't cheer too quickly. the linux driver code (about 15%). But wait, don't cheer too quickly.
THere is more to be gained. There is more to be gained.
If we look at e.g. rp14 and rp15 we see that we either xor our data with If we look at e.g. rp14 and rp15 we see that we either xor our data with
rp14 or with rp15. However we also have par which goes over all data. rp14 or with rp15. However we also have par which goes over all data.
This means there is no need to calculate rp14 as it can be calculated from This means there is no need to calculate rp14 as it can be calculated from
rp15 through rp14 = par ^ rp15; rp15 through rp14 = par ^ rp15, because par = rp14 ^ rp15;
(or if desired we can avoid calculating rp15 and calculate it from (or if desired we can avoid calculating rp15 and calculate it from
rp14). That is why some places refer to inverse parity. rp14). That is why some places refer to inverse parity.
Of course the same thing holds for rp4/5, rp6/7, rp8/9, rp10/11 and rp12/13. Of course the same thing holds for rp4/5, rp6/7, rp8/9, rp10/11 and rp12/13.
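The identity added in this hunk, par = rp14 ^ rp15, follows because every byte is xored into exactly one of the two accumulators. A minimal sketch, assuming the byte-wise loop over 256 bytes in which bit 7 of the index selects rp15, is given here. The names `row_par`, `row_par_bytewise` and `inverse_parity_selfcheck` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte-wide accumulators (hypothetical type, not from the document). */
struct row_par { uint8_t par, rp14, rp15; };

/* Each byte goes into par and into exactly one of rp14 / rp15. */
struct row_par row_par_bytewise(const uint8_t *buf)
{
    struct row_par r = { 0, 0, 0 };

    for (size_t i = 0; i < 256; i++) {
        r.par ^= buf[i];
        if (i & 0x80)
            r.rp15 ^= buf[i];   /* upper half of the 256 bytes */
        else
            r.rp14 ^= buf[i];   /* lower half of the 256 bytes */
    }
    return r;
}

/* Returns 1 when rp14 can be recovered as par ^ rp15 on a test buffer. */
int inverse_parity_selfcheck(void)
{
    uint8_t buf[256];

    for (size_t i = 0; i < sizeof(buf); i++)
        buf[i] = (uint8_t)(i * 29u + 5u);

    struct row_par r = row_par_bytewise(buf);
    return r.rp14 == (uint8_t)(r.par ^ r.rp15);
}
```

The same argument applies to each of the other pairs, since in every pair the two conditions partition the data.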
...@@ -419,12 +419,12 @@ with ...@@ -419,12 +419,12 @@ with
if (i & 0x20) rp15 ^= cur; if (i & 0x20) rp15 ^= cur;
and outside the loop added: and outside the loop added:
rp4 = par ^ rp5; rp4 = par ^ rp5;
rp6 = par ^ rp7; rp6 = par ^ rp7;
rp8 = par ^ rp9; rp8 = par ^ rp9;
rp10 = par ^ rp11; rp10 = par ^ rp11;
rp12 = par ^ rp13; rp12 = par ^ rp13;
rp14 = par ^ rp15; rp14 = par ^ rp15;
And after that the code takes about 30% more time, although the number of And after that the code takes about 30% more time, although the number of
statements is reduced. This is also reflected in the assembly code. statements is reduced. This is also reflected in the assembly code.
...@@ -524,12 +524,12 @@ THe code within the for loop was changed to: ...@@ -524,12 +524,12 @@ THe code within the for loop was changed to:
cur = *bp++; tmppar ^= cur; rp4 ^= cur; rp6 ^= cur; cur = *bp++; tmppar ^= cur; rp4 ^= cur; rp6 ^= cur;
cur = *bp++; tmppar ^= cur; rp6 ^= cur; cur = *bp++; tmppar ^= cur; rp6 ^= cur;
cur = *bp++; tmppar ^= cur; rp4 ^= cur; cur = *bp++; tmppar ^= cur; rp4 ^= cur;
cur = *bp++; tmppar ^= cur; rp10 ^= tmppar; cur = *bp++; tmppar ^= cur; rp10 ^= tmppar;
cur = *bp++; tmppar ^= cur; rp4 ^= cur; rp6 ^= cur; rp8 ^= cur; cur = *bp++; tmppar ^= cur; rp4 ^= cur; rp6 ^= cur; rp8 ^= cur;
cur = *bp++; tmppar ^= cur; rp6 ^= cur; rp8 ^= cur; cur = *bp++; tmppar ^= cur; rp6 ^= cur; rp8 ^= cur;
cur = *bp++; tmppar ^= cur; rp4 ^= cur; rp8 ^= cur; cur = *bp++; tmppar ^= cur; rp4 ^= cur; rp8 ^= cur;
cur = *bp++; tmppar ^= cur; rp8 ^= cur; cur = *bp++; tmppar ^= cur; rp8 ^= cur;
cur = *bp++; tmppar ^= cur; rp4 ^= cur; rp6 ^= cur; cur = *bp++; tmppar ^= cur; rp4 ^= cur; rp6 ^= cur;
...@@ -537,7 +537,7 @@ THe code within the for loop was changed to: ...@@ -537,7 +537,7 @@ THe code within the for loop was changed to:
cur = *bp++; tmppar ^= cur; rp4 ^= cur; cur = *bp++; tmppar ^= cur; rp4 ^= cur;
cur = *bp++; tmppar ^= cur; cur = *bp++; tmppar ^= cur;
par ^= tmppar; par ^= tmppar;
if ((i & 0x1) == 0) rp12 ^= tmppar; if ((i & 0x1) == 0) rp12 ^= tmppar;
if ((i & 0x2) == 0) rp14 ^= tmppar; if ((i & 0x2) == 0) rp14 ^= tmppar;
} }
...@@ -548,8 +548,8 @@ to rp12 and rp14. ...@@ -548,8 +548,8 @@ to rp12 and rp14.
While making the changes I also found that I could exploit that tmppar While making the changes I also found that I could exploit that tmppar
contains the running parity for this iteration. So instead of having: contains the running parity for this iteration. So instead of having:
rp4 ^= cur; rp6 = cur; rp4 ^= cur; rp6 ^= cur;
I removed the rp6 = cur; statement and did rp6 ^= tmppar; on next I removed the rp6 ^= cur; statement and did rp6 ^= tmppar; on next
statement. A similar change was done for rp8 and rp10 statement. A similar change was done for rp8 and rp10
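The tmppar substitution can be sketched on groups of four words. It assumes, as the surrounding text suggests, that tmppar restarts at the first word of each iteration, so after the second word it holds exactly the xor of the two words that belong to rp6. The names `rp46`, `rp46_direct`, `rp46_running` and `rp46_selfcheck` are hypothetical, and only the rp4/rp6 part of the loop is modelled.

```c
#include <stddef.h>
#include <stdint.h>

struct rp46 { uint32_t rp4, rp6; };   /* hypothetical helper type */

/* Direct form: every word is xored into each parity it belongs to. */
struct rp46 rp46_direct(const uint32_t *w, size_t nwords)
{
    struct rp46 r = { 0, 0 };

    for (size_t i = 0; i + 3 < nwords; i += 4) {
        r.rp4 ^= w[i];     r.rp6 ^= w[i];
        r.rp6 ^= w[i + 1];
        r.rp4 ^= w[i + 2];
        /* w[i + 3] contributes to neither rp4 nor rp6 */
    }
    return r;
}

/* Running-parity form: one rp6 update per group instead of two. */
struct rp46 rp46_running(const uint32_t *w, size_t nwords)
{
    struct rp46 r = { 0, 0 };

    for (size_t i = 0; i + 3 < nwords; i += 4) {
        uint32_t tmppar = w[i];   r.rp4 ^= w[i];
        tmppar ^= w[i + 1];       r.rp6 ^= tmppar;   /* w[i] ^ w[i + 1] */
        tmppar ^= w[i + 2];       r.rp4 ^= w[i + 2];
        tmppar ^= w[i + 3];       /* full-group parity; unused in this sketch */
    }
    return r;
}

/* Returns 1 when both forms agree on a deterministic 64-word buffer. */
int rp46_selfcheck(void)
{
    uint32_t w[64];

    for (size_t i = 0; i < 64; i++)
        w[i] = (uint32_t)i * 2654435761u + 12345u;

    struct rp46 a = rp46_direct(w, 64);
    struct rp46 b = rp46_running(w, 64);
    return a.rp4 == b.rp4 && a.rp6 == b.rp6;
}
```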
...@@ -593,22 +593,22 @@ The new code now looks like: ...@@ -593,22 +593,22 @@ The new code now looks like:
cur = *bp++; tmppar ^= cur; rp4_6 ^= cur; cur = *bp++; tmppar ^= cur; rp4_6 ^= cur;
cur = *bp++; tmppar ^= cur; rp6 ^= cur; cur = *bp++; tmppar ^= cur; rp6 ^= cur;
cur = *bp++; tmppar ^= cur; rp4 ^= cur; cur = *bp++; tmppar ^= cur; rp4 ^= cur;
cur = *bp++; tmppar ^= cur; rp10 ^= tmppar; cur = *bp++; tmppar ^= cur; rp10 ^= tmppar;
notrp8 = tmppar; notrp8 = tmppar;
cur = *bp++; tmppar ^= cur; rp4_6 ^= cur; cur = *bp++; tmppar ^= cur; rp4_6 ^= cur;
cur = *bp++; tmppar ^= cur; rp6 ^= cur; cur = *bp++; tmppar ^= cur; rp6 ^= cur;
cur = *bp++; tmppar ^= cur; rp4 ^= cur; cur = *bp++; tmppar ^= cur; rp4 ^= cur;
cur = *bp++; tmppar ^= cur; cur = *bp++; tmppar ^= cur;
rp8 = rp8 ^ tmppar ^ notrp8; rp8 = rp8 ^ tmppar ^ notrp8;
cur = *bp++; tmppar ^= cur; rp4_6 ^= cur; cur = *bp++; tmppar ^= cur; rp4_6 ^= cur;
cur = *bp++; tmppar ^= cur; rp6 ^= cur; cur = *bp++; tmppar ^= cur; rp6 ^= cur;
cur = *bp++; tmppar ^= cur; rp4 ^= cur; cur = *bp++; tmppar ^= cur; rp4 ^= cur;
cur = *bp++; tmppar ^= cur; cur = *bp++; tmppar ^= cur;
par ^= tmppar; par ^= tmppar;
if ((i & 0x1) == 0) rp12 ^= tmppar; if ((i & 0x1) == 0) rp12 ^= tmppar;
if ((i & 0x2) == 0) rp14 ^= tmppar; if ((i & 0x2) == 0) rp14 ^= tmppar;
} }
...@@ -700,7 +700,7 @@ Conclusion ...@@ -700,7 +700,7 @@ Conclusion
The gain when calculating the ecc is tremendous. Om my development hardware The gain when calculating the ecc is tremendous. Om my development hardware
a speedup of a factor of 18 for ecc calculation was achieved. On a test on an a speedup of a factor of 18 for ecc calculation was achieved. On a test on an
embedded system with a MIPS core a factor 7 was obtained. embedded system with a MIPS core a factor 7 was obtained.
On a test with a Linksys NSLU2 (ARMv5TE processor) the speedup was a factor On a test with a Linksys NSLU2 (ARMv5TE processor) the speedup was a factor
5 (big endian mode, gcc 4.1.2, -O3) 5 (big endian mode, gcc 4.1.2, -O3)
For correction not much gain could be obtained (as bitflips are rare). Then For correction not much gain could be obtained (as bitflips are rare). Then
again there are also much less cycles spent there. again there are also much less cycles spent there.