Thursday, October 25, 2012

Error Correcting Code, Error Detection and Correction, Checksum

Error Detection and Correction (EDC)

Engineering is such a beautiful field, how brilliant minds are able to find awesome solution to the difficult problems.Error detection and correction is again such an example. As you have reached this post, try to think and generate a strategy yourself for EDC. Error detection is really easy and you should be able to find the primitive solution to it. Error correction on the other hand is little tricky. Anyway I will also explain one of simple strategy here.

Definition (WikiPedia)
  • Error detection (say ED) is the detection of errors caused by noise or other impairments during transmission from the transmitter to the receiver.
  • Error correction(say EC) is the detection of errors and reconstruction of the original, error-free data.
So problem what we are trying to solve here is find 1) if there are incorrect bits in a byte stream (EC) and 2) find which bit is incorrect and correct it (EC)

Error Detection using Parity Check
Strategy is intentionally add few extra bit(s) to the original message which would help us finding sanity of the message on the other end.
Extra bits are called parity.
Original message + Parity is called Codeword.

Single parity check code
Simplest of strategy is called single parity check code (SPC).The SPC involves the addition of a single extra bit to the binary message, the value of this parity would depends on the bits in the message. In an even parity code, the additional bit added to each message ensures an even number of 1s in every codeword.
e.g. lets say message(7 bit ) is 1000 110. It has 3 bits as 1 so parity bit would be 1 to make total numbers event i.e 4
Code word would be 1000 1101

Mathematically
c = [c1 c2 c3 c4 c5 c6 c7 c8],
where each ci is either 0 or 1, and every codeword satisfies the constraint
c1 ⊕ c2 ⊕ c3 ⊕ c4 ⊕ c5 ⊕ c6 ⊕ c7 ⊕ c8 = 0

symbol ⊕ is modulo 2 addition.

You must be getting now, if message was sent through faulty channel and you recevied
1001 1101
Here number of 1's are 5 which are odd, means, there is error in this message.

Yeah, I get it you are already jumping your guns telling me
1) What if there are more than 1 bits which were turned from 0 to 1
2)  How to find which bit got flipped.

Answer of both of above questions is you DON'T. Remember when  I told its very primitive strategy. You have to add more redundant and sophisticated parity bits.

Below I will try to explain error detection strategy for 1 bit error correction

Encoding and Constraint Matrix
Lets say we have message bits c1, c2 and c3
and we are adding three parity bits
c4, c5,c6

we make a strategy to define c4, c5, c6 in such a way that

c4 = c1 ⊕ c2
c5 = c2 ⊕ c3
c6 = c1 ⊕ c2 ⊕ c3
These conditions of defining/choosing parity bits is called constraints. 

Using the constraints above the message 110 produces the parity-check bits
c4 = 1 ⊕ 1 = 0,
c5 = 1 ⊕ 0 = 1,
c6 = 1 ⊕ 1 ⊕ 0 = 0,
and so the codeword for this message is c = [1 1 0 0 1 0].

Above things can be put in mathematical form as



[c1 c2 c3 c4 c5 c6] = [c1 c2 c3][   1 0 0 1 0 1    ]
                                                       0 1 0 1 1 1
                                                       0 0 1 0 1 1           ---------Equation   1


Second matrix here is call Generator matrix or Parity check matrix(H)
suppose y is the code word received then

       c1
       c2           0
[G][c3  ]  = [ 0 ]                                                   ---------    Equation 2
       c4           0
       c5
       c6

            T
or H(y)     =  0

If its not all zero.. That particular code constraint is not satisfied.

The message bits are conventionally labeled by u = [u1, u2, · · · uk], where the vector u holds the
k message bits. Thus the codeword c corresponding to the binary message
u = [u1u2u3] can be found using the matrix equation c = uG.

Substituting each of 8 distinct messages c1 c2 c3 = 000, 001, . . . , 111
into equation above yields the following set of codewords for the code from
[0 0 0 0 0 0] [0 0 1 0 1 1] [0 1 0 1 1 1] [0 1 1 1 0 0]
[1 0 0 1 0 1] [1 0 1 1 1 0] [1 1 0 0 1 0] [1 1 1 0 0 1]   --------------------- Equation3


Error Detection and Correction

Suppose a code word is sent through binary channel and one bit got corrupted. This section would try to find and correct that bit.

The codeword c = [1 0 1 1 1 0] from the code in  was sent through a binary symmetric channel and the the string y = [1 0 1 0 1 0]  received.



substituting in equation 2

                  1
H(y)T  = [  0  ] 
                  0
This result is non zero that means bit flip must have been occurred. This resultant matrix is called syndrome of y.

Syndrome tells which parity constraint is not satisfied.

Error Detection/Correction
Above syndrome tells that first parity check constraint is not satisfied. Since this parity-check
equation involves the 1-st, 2-nd and 4-th codeword bits we can conclude that at
least one of these three bits has been inverted by the channel.

Hamming Distance
The Hamming distance between two codewords is defined as the number of bit positions in which they differ. For example the codewords [1 0 1 0 0 1 1 0] and [1 0 0 0 0 1 1 1] differ in two positions, the third and eight codeword bits, so the Hamming distance between them is two. The measure of the ability of a code to detect errors is the minimum Hamming distance or just minimum distance of the code. The minimum distance of a code, dmin, is defined as the smallest Hamming distance between any pair of codewords in the code. For the code in above example, dmin = 3, so the corruption of three or more bits in a codeword can result in another valid codeword.


A Simple Decoder to Find corrupted bit/ Maximum LikeliHood Decoder
To go further and correct the bit flipping errors requires that the decoder determine which codeword was most likely to have been sent. Based only on knowing the binary received string, y, the best decoder will choose the codeword closest in Hamming distance to y. When there is more than one codeword at the minimum distance from y the decoder will randomly choose one of them.This decoder is called the maximum likelihood (ML) decoder as it will always chose the codeword which is most likely to have produced y.

In above example we detected that the received string y = [1 0 1 0 1 0] was not a codeword of the code. By comparing y with each of the codewords in this code, the ML decoder will choose c = [1 0 1 1 1 0], as the closest codeword as it is the only codeword Hamming distance 1 from y.

This is one of the strategy. But still you can see that
1) Above exampled code can only correct errors  if hamming distance is 1
2) There are better algorithms available but this is just to give feel how things can work. You can check LDPC on wikipedia(reference below)




Refrences
1)http://www.cs.utexas.edu/~eberlein/cs337/errorDetection3.pdf
2)http://en.wikipedia.org/wiki/Error_detection_and_correction
3)http://en.wikipedia.org/wiki/Low-density_parity-check_code


No comments: