Difference between revisions of "CRC"

From SEGGER Wiki
Jump to: navigation, search
(C- Source code)
Line 20: Line 20:
 
== C- Source code ==
 
== C- Source code ==
   
The below is a piece of C-Code that SEGGER actually uses in prodution code. It computes a CRC LSB first, for any
+
The below is a piece of C-Code that SEGGER actually uses in production code. It computes a CRC LSB first, for any
Polynom up to 32 bits, on any amount of data. The starting value can be specified, and the routine can be used
+
Polynomial up to 32 bits, on any amount of data. The starting value can be specified, and the routine can be used
to compute the CRC of multiple memory areas by simply using the return value of the first block as Starting value for the
+
to compute the CRC of multiple memory areas by simply using the return value of the first block as initial value for the
 
second block. It is small and efficient. The only downside is that is a bit slow, as each byte is computed in the inner loop,
 
second block. It is small and efficient. The only downside is that is a bit slow, as each byte is computed in the inner loop,
on a bit-by-bit basis. Typically CPUs will need about 150 cycles per byte, so a 200MHz microcontroller will deliver
+
on a bit-by-bit basis, so being executed 8 times for each byte.
  +
Typically CPUs will need about 150 cycles per byte, so a 200MHz microcontroller will deliver
 
about 1.3 MB/sec. Faster versions using a table (with typically 256 entries, so the handle 1 byte per loop) can be optimized
 
about 1.3 MB/sec. Faster versions using a table (with typically 256 entries, so the handle 1 byte per loop) can be optimized
 
to use about 15 cycles per loop, delivering about 13MB performance.
 
to use about 15 cycles per loop, delivering about 13MB performance.

Revision as of 20:57, 7 May 2019

In Embedded Systems, CRC usually stands for cyclic redundancy check. A CRC is in many respect similar to a checksum, and in fact, this is what they are typically used for. CRCs come in all kinds of sizes, from 4 bits to 64 bits, but there is no upper limit.

Generating CRCs in Hardware

CRCs can be generated very easily in hardware, by shifting the data through a shift registers with feedback at certain bit positions. A 32-bit CRC requires not much more than a 32-bit shift registers. This is ideal for applications in which the data is actually transferred bit-by-bit. A lot of MCUs have built-in CRC circuits, that allow computation of typically a specific CRC, such as 0x1021 (16-bit). The downside of these hardware units are 2 things:

  1. Fixed to a specific CRC
  2. Not thread safe: The CRC unit can only be used from one thread(task) at a time, so the software needs to be designed in a way that makes sure this is so, typically using the unit only in a single thread, or by securing it using a Mutex/Semaphore/Critical region or simply disabling interrupts


Generating CRCs in software

Generating CRCs is something that needs to done a lot. Again, there are a lot of ways to do it. The easiest way is to do the same thing that is easy in hardware, and compute the CRC bit-by-bit. Unfortunately, this is also a slow way of doing it. More advance ways of doing it are table based. The tables can be any size, so typically 16 entries for 4 bits at a time or 256 entries for 8 bits (one byte) at a time.

More to come.

C- Source code

The below is a piece of C-Code that SEGGER actually uses in production code. It computes a CRC LSB first, for any Polynomial up to 32 bits, on any amount of data. The starting value can be specified, and the routine can be used to compute the CRC of multiple memory areas by simply using the return value of the first block as initial value for the second block. It is small and efficient. The only downside is that is a bit slow, as each byte is computed in the inner loop, on a bit-by-bit basis, so being executed 8 times for each byte. Typically CPUs will need about 150 cycles per byte, so a 200MHz microcontroller will deliver about 1.3 MB/sec. Faster versions using a table (with typically 256 entries, so the handle 1 byte per loop) can be optimized to use about 15 cycles per loop, delivering about 13MB performance. Note that the code does not need any globals / statics and is fully reentrant.

U32 CRC_Calc(const U8* pData, unsigned NumBytes, U32 crc, U32 Polynom) {

 int i;
 U8 Data;
 U8 Xor;
 do {
   Data = *pData++;
   i = 8;
   do {
     Xor = crc ^ Data;
     crc >>= 1;
     if (Xor & 1)
       crc ^= Polynom;
     }
     Data >>= 1;
   } while (--i);
 } while (--NumBytes);
 return crc;

}