Monday, March 2, 2015

Generating Audio PSK31 with an Arduino (Part 3) UPDATE

UPDATED: Complete code listing added to end of posting.  Apologies for previously posting an incorrect (and non-functional) version of the code.  The corrected code is now included at the end of this posting.

In previous posts (part 1, update, part 2), I described a working PSK31 encoder implemented on an Arduino ATMega328 or ATMega2560 device.  While this implementation worked for the most part, it was based on incorrect PSK31 bit timing.  Most PSK31 decoders are sufficiently robust to decode my generated signal anyway, but I wanted to write a follow-up posting to describe the error and propose a fix.

By way of review, PSK31 is a communication protocol with a transmission rate of 31.25 baud.  So, what does this mean?  From Google, we see the following definition of "baud".



So, a 31.25 baud transmission is 31.25 bits per second, or a bit time of 32 ms.  The technique described in my previous postings (above) results in a bit time of 32.768 ms, which apparently is still decodable, but I would still like to get this right.

My error is in how I constructed the waveform of the PSK31 signal.  I used a 1 kHz tone which of course has a 1 ms period (1/1000 = .001 seconds).  To construct this 1 kHz tone, I generated 32 phase points per cycle.  This means that I needed to generate a new phase point every 31.25 us.  (.001 / 32 = .00003125 seconds = 31.25 us).  To do this, I used a timer driven at 16 MHz in phase correct PWM mode.  This means that the timer counts from zero to 255 and then back down to zero at which time it generates an interrupt.  Therefore, the interrupt would happen every 512 ticks of the 16 MHz clock which is every 32 us.  (16000000 / 512 = 31250 interrupts per second, and 1 / 31250 = .000032 seconds = 32 us)

So, instead of generating a phase point every 31.25 us, I was generating one every 32 us. (Hey, what's 0.75 us amongst friends?)  This in and of itself is not really a problem, but the error I made was that I was using a model where I generated 1024 phase points per PSK31 bit time.  Therefore, my bit time ended up being 32.768 ms rather than the desired 32 ms.  (1024 * .000032 = .032768 seconds = 32.768 ms).

In an ideal world, I would run the timer at 16 MHz / 500 = 32 kHz instead of 16 MHz / 512 = 31.25 kHz and my current scheme would work fine.  The counter would run from zero to 250 and then back down to zero instead of counting up to 255.  However, there are a limited number of prescaler divisors available on Arduino timers and they are all powers of two, so this is not an option.  Additionally, I need phase correct PWM mode, so the counter has to run from minimum (zero) to max (255) and back down for each phase point.  The 1 kHz tone output goes high at the current phase amplitude value when the counter gets to it on the way up to 255 and sets the output low when the counter gets to the phase amplitude value on the way back down to zero.  This allows the generation of phase correct PWM output.  Once the PWM output is integrated, a nice phase continuous sine wave output is produced.

So, the fix appears to be to use 1000 phase points per PSK31 bit time rather than 1024.  (1000 * .000032 = .032 = 32 ms)  I will need to re-generate the sine tables to use possibly 20 phase points per cycle rather than 32 and make sure the state machine knows the new number of phase points per cycle, so the code changes should be pretty trivial (famous last words...)
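As a quick check of the arithmetic above, here is a small sketch in plain C++ that runs on a PC rather than on the Arduino.  The function names are made up for illustration, and the 512 clock ticks per interrupt are taken as given from the discussion above.

```cpp
#include <cassert>
#include <cstdint>

// Period of one timer interrupt in nanoseconds, given the CPU clock in Hz
// and the number of clock ticks between interrupts (512 in this posting).
uint32_t interruptPeriodNs(uint32_t cpuHz, uint32_t ticksPerInterrupt)
{
  assert(cpuHz != 0);
  return static_cast<uint32_t>((1000000000ULL * ticksPerInterrupt) / cpuHz);
}

// Length of one PSK31 bit in microseconds when each bit is built from
// phasePointsPerBit phase points, one phase point per interrupt.
uint32_t bitTimeUs(uint32_t phasePointsPerBit, uint32_t periodNs)
{
  return static_cast<uint32_t>(
      (static_cast<uint64_t>(phasePointsPerBit) * periodNs) / 1000);
}
```

With a 16 MHz clock and 512 ticks, interruptPeriodNs gives 32000 ns.  bitTimeUs(1024, 32000) then gives 32768 us (the old, slightly long bit) and bitTimeUs(1000, 32000) gives 32000 us (the desired 32 ms bit).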

The implementation does not separate the bit timing from the timing of the generation of the 1 kHz audio tone.  We count cycles to know when the bit has ended.  If these timing tasks were separated, then we could set the bit time independently of the frequency of the tone being generated.  But, since I coupled these two things together, I have to change the number of phase points per cycle in order to generate the correct character timing.
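To illustrate what separating the two timing tasks might look like, here is a hypothetical sketch in plain C++.  It is not what the listing below does.  It assumes a 16 bit phase accumulator for the tone and an independent sample counter for the bit time; the struct and function names are invented for this example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical illustration: a phase accumulator sets the tone frequency
// while a separate sample counter sets the bit time.
struct ToneAndBitClock
{
  uint16_t phase;          // upper bits would index a sine table
  uint16_t phaseStep;      // added to phase on every sample
  uint16_t samplesPerBit;  // bit length measured in samples
  uint16_t sampleInBit;    // samples elapsed in the current bit
  uint32_t bitsElapsed;    // number of completed bits
};

// Phase increment per sample for a given tone and sample rate.
uint16_t phaseStepFor(uint32_t toneHz, uint32_t sampleHz)
{
  assert(sampleHz != 0 && toneHz < sampleHz / 2);
  return static_cast<uint16_t>((static_cast<uint64_t>(toneHz) << 16) / sampleHz);
}

ToneAndBitClock makeClock(uint32_t toneHz, uint32_t sampleHz, uint16_t samplesPerBit)
{
  ToneAndBitClock c = { 0, phaseStepFor(toneHz, sampleHz), samplesPerBit, 0, 0 };
  return c;
}

// Call once per timer interrupt.  Returns true at each bit boundary.
bool tick(ToneAndBitClock &c)
{
  c.phase = static_cast<uint16_t>(c.phase + c.phaseStep);  // wraps modulo 65536
  if (++c.sampleInBit < c.samplesPerBit) return false;
  c.sampleInBit = 0;
  ++c.bitsElapsed;
  return true;
}
```

In this arrangement the tone frequency depends only on phaseStep and the bit time only on samplesPerBit, so either could be changed without touching the other.  The amplitude ramp at phase reversals would then have to be applied as a separate envelope, which the table driven approach in this posting gets for free.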

Here is the updated code listing with the changes discussed above.  I have verified the bit timing is now correct.

As always, your mileage may vary.  I am happy to try and help you if you have questions or comments about anything you read here.  Post a reply here or drop me a note at ko7m at arrl dot net.

// PSK31 audio generation
// Jeff Whitlatch - ko7m

// We are going to generate an audio tone.  The original design
// used a 1 kHz centre frequency built from 32 samples per cycle.
// This version builds each cycle of the sinusoid from 20 eight
// bit amplitude samples, one per timer interrupt (every 32 us),
// so one cycle lasts 640 us.  We construct each sinusoid from a
// 20 byte per cycle lookup table of amplitude values ranging from
// 0x00 to 0xff where the zero crossing value is 0x80.

// The PSK31 character bit time is 32 ms constructed of 1000
// samples.  A binary zero is represented by a phase reversal
// while a binary 1 is represented by the lack of a phase reversal.

// Characters are encoded with a variable bit length code (varicode)
// where the length of each character is inversely
// proportional to the frequency of use in the english language 
// of that character.  Characters are encoded with a bit
// pattern where there are no sequential zero bits.  Two zero bits
// in a row signify the end of a character.

// Varicode lookup table
//
// This table defines the PSK31 varicode.  There are 128 entries,
// corresponding to ASCII characters 0-127 with two bytes for each entry.
// The bits for the varicode are to be shifted out LSB-first.
//
// More than one zero in sequence signifies the end of the character.
// For modulation, a 0 represents a phase reversal while a 1 
// represents a steady-state carrier.

uint16_t varicode[] = {
  0x0355,  // 0 NUL
  0x036d,  // 1 SOH
  0x02dd,  // 2 STX
  0x03bb,  // 3 ETX
  0x035d,  // 4 EOT
  0x03eb,  // 5 ENQ
  0x03dd,  // 6 ACK
  0x02fd,  // 7 BEL
  0x03fd,  // 8 BS
  0x00f7,  // 9 HT
  0x0017,  // 10 LF
  0x03db,  // 11 VT
  0x02ed,  // 12 FF
  0x001f,  // 13 CR
  0x02bb,  // 14 SO
  0x0357,  // 15 SI
  0x03bd,  // 16 DLE
  0x02bd,  // 17 DC1
  0x02d7,  // 18 DC2
  0x03d7,  // 19 DC3
  0x036b,  // 20 DC4
  0x035b,  // 21 NAK
  0x02db,  // 22 SYN
  0x03ab,  // 23 ETB
  0x037b,  // 24 CAN
  0x02fb,  // 25 EM
  0x03b7,  // 26 SUB
  0x02ab,  // 27 ESC
  0x02eb,  // 28 FS
  0x0377,  // 29 GS
  0x037d,  // 30 RS
  0x03fb,  // 31 US
  0x0001,  // 32 SP
  0x01ff,  // 33 !
  0x01f5,  // 34 "
  0x015f,  // 35 #
  0x01b7,  // 36 $
  0x02ad,  // 37 %
  0x0375,  // 38 &
  0x01fd,  // 39 '
  0x00df,  // 40 (
  0x00ef,  // 41 )
  0x01ed,  // 42 *
  0x01f7,  // 43 +
  0x0057,  // 44 ,
  0x002b,  // 45 -
  0x0075,  // 46 .
  0x01eb,  // 47 /
  0x00ed,  // 48 0
  0x00bd,  // 49 1
  0x00b7,  // 50 2
  0x00ff,  // 51 3
  0x01dd,  // 52 4
  0x01b5,  // 53 5
  0x01ad,  // 54 6
  0x016b,  // 55 7
  0x01ab,  // 56 8
  0x01db,  // 57 9
  0x00af,  // 58 :
  0x017b,  // 59 ;
  0x016f,  // 60 <
  0x0055,  // 61 =
  0x01d7,  // 62 >
  0x03d5,  // 63 ?
  0x02f5,  // 64 @
  0x005f,  // 65 A
  0x00d7,  // 66 B
  0x00b5,  // 67 C
  0x00ad,  // 68 D
  0x0077,  // 69 E
  0x00db,  // 70 F
  0x00bf,  // 71 G
  0x0155,  // 72 H
  0x007f,  // 73 I
  0x017f,  // 74 J
  0x017d,  // 75 K
  0x00eb,  // 76 L
  0x00dd,  // 77 M
  0x00bb,  // 78 N
  0x00d5,  // 79 O
  0x00ab,  // 80 P
  0x0177,  // 81 Q
  0x00f5,  // 82 R
  0x007b,  // 83 S
  0x005b,  // 84 T
  0x01d5,  // 85 U
  0x015b,  // 86 V
  0x0175,  // 87 W
  0x015d,  // 88 X
  0x01bd,  // 89 Y
  0x02d5,  // 90 Z
  0x01df,  // 91 [
  0x01ef,  // 92 backslash
  0x01bf,  // 93 ]
  0x03f5,  // 94 ^
  0x016d,  // 95 _
  0x03ed,  // 96 `
  0x000d,  // 97 a
  0x007d,  // 98 b
  0x003d,  // 99 c
  0x002d,  // 100 d
  0x0003,  // 101 e
  0x002f,  // 102 f
  0x006d,  // 103 g
  0x0035,  // 104 h
  0x000b,  // 105 i
  0x01af,  // 106 j
  0x00fd,  // 107 k
  0x001b,  // 108 l
  0x0037,  // 109 m
  0x000f,  // 110 n
  0x0007,  // 111 o
  0x003f,  // 112 p
  0x01fb,  // 113 q
  0x0015,  // 114 r
  0x001d,  // 115 s
  0x0005,  // 116 t
  0x003b,  // 117 u
  0x006f,  // 118 v
  0x006b,  // 119 w
  0x00fb,  // 120 x
  0x005d,  // 121 y
  0x0157,  // 122 z
  0x03b5,  // 123 {
  0x01bb,  // 124 |
  0x02b5,  // 125 }
  0x03ad,  // 126 ~
  0x02b7   // 127 (del)
};

// 25 cycles of 20 samples each (500 bytes) of ramp-up
// sinusoid information.  There is an extra byte at the
// end of the table with the value 0x80 which allows the
// first byte to always be at the zero crossing point
// whether ramping up or down.
//
char data[] =
{
  0x80,0x82,0x85,0x86,0x88,0x88,0x88,0x86,0x85,0x82,
  0x80,0x7E,0x7B,0x7A,0x78,0x78,0x78,0x7A,0x7B,0x7E,
  0x80,0x85,0x89,0x8D,0x8F,0x90,0x8F,0x8D,0x89,0x85,
  0x80,0x7B,0x77,0x73,0x71,0x70,0x71,0x73,0x77,0x7B,
  0x80,0x87,0x8E,0x93,0x97,0x98,0x97,0x93,0x8E,0x87,
  0x80,0x79,0x72,0x6D,0x69,0x68,0x69,0x6D,0x72,0x79,
  0x80,0x8A,0x92,0x99,0x9D,0x9F,0x9D,0x99,0x92,0x8A,
  0x80,0x76,0x6E,0x67,0x63,0x61,0x63,0x67,0x6E,0x76,
  0x80,0x8C,0x97,0xA0,0xA5,0xA7,0xA5,0xA0,0x97,0x8C,
  0x80,0x74,0x69,0x60,0x5B,0x59,0x5B,0x60,0x69,0x74,
  0x80,0x8F,0x9C,0xA6,0xAD,0xAF,0xAD,0xA6,0x9C,0x8F,
  0x80,0x71,0x64,0x5A,0x53,0x51,0x53,0x5A,0x64,0x71,
  0x80,0x91,0xA0,0xAC,0xB3,0xB6,0xB3,0xAC,0xA0,0x91,
  0x80,0x6F,0x60,0x54,0x4D,0x4A,0x4D,0x54,0x60,0x6F,
  0x80,0x93,0xA4,0xB1,0xBA,0xBD,0xBA,0xB1,0xA4,0x93,
  0x80,0x6D,0x5C,0x4F,0x46,0x43,0x46,0x4F,0x5C,0x6D,
  0x80,0x95,0xA8,0xB7,0xC1,0xC4,0xC1,0xB7,0xA8,0x95,
  0x80,0x6B,0x58,0x49,0x3F,0x3C,0x3F,0x49,0x58,0x6B,
  0x80,0x97,0xAC,0xBD,0xC7,0xCB,0xC7,0xBD,0xAC,0x97,
  0x80,0x69,0x54,0x43,0x39,0x35,0x39,0x43,0x54,0x69,
  0x80,0x99,0xB0,0xC2,0xCD,0xD1,0xCD,0xC2,0xB0,0x99,
  0x80,0x67,0x50,0x3E,0x33,0x2F,0x33,0x3E,0x50,0x67,
  0x80,0x9B,0xB3,0xC6,0xD3,0xD7,0xD3,0xC6,0xB3,0x9B,
  0x80,0x65,0x4D,0x3A,0x2D,0x29,0x2D,0x3A,0x4D,0x65,
  0x80,0x9D,0xB7,0xCB,0xD8,0xDD,0xD8,0xCB,0xB7,0x9D,
  0x80,0x63,0x49,0x35,0x28,0x23,0x28,0x35,0x49,0x63,
  0x80,0x9E,0xBA,0xCF,0xDD,0xE2,0xDD,0xCF,0xBA,0x9E,
  0x80,0x62,0x46,0x31,0x23,0x1E,0x23,0x31,0x46,0x62,
  0x80,0xA0,0xBD,0xD3,0xE2,0xE7,0xE2,0xD3,0xBD,0xA0,
  0x80,0x60,0x43,0x2D,0x1E,0x19,0x1E,0x2D,0x43,0x60,
  0x80,0xA1,0xBF,0xD7,0xE6,0xEB,0xE6,0xD7,0xBF,0xA1,
  0x80,0x5F,0x41,0x29,0x1A,0x15,0x1A,0x29,0x41,0x5F,
  0x80,0xA2,0xC1,0xDA,0xEA,0xEF,0xEA,0xDA,0xC1,0xA2,
  0x80,0x5E,0x3F,0x26,0x16,0x11,0x16,0x26,0x3F,0x5E,
  0x80,0xA4,0xC4,0xDD,0xED,0xF3,0xED,0xDD,0xC4,0xA4,
  0x80,0x5C,0x3C,0x23,0x13,0x0D,0x13,0x23,0x3C,0x5C,
  0x80,0xA4,0xC5,0xDF,0xF0,0xF6,0xF0,0xDF,0xC5,0xA4,
  0x80,0x5C,0x3B,0x21,0x10,0x0A,0x10,0x21,0x3B,0x5C,
  0x80,0xA5,0xC7,0xE2,0xF3,0xF9,0xF3,0xE2,0xC7,0xA5,
  0x80,0x5B,0x39,0x1E,0x0D,0x07,0x0D,0x1E,0x39,0x5B,
  0x80,0xA6,0xC8,0xE4,0xF5,0xFB,0xF5,0xE4,0xC8,0xA6,
  0x80,0x5A,0x38,0x1C,0x0B,0x05,0x0B,0x1C,0x38,0x5A,
  0x80,0xA7,0xC9,0xE5,0xF7,0xFD,0xF7,0xE5,0xC9,0xA7,
  0x80,0x59,0x37,0x1B,0x09,0x03,0x09,0x1B,0x37,0x59,
  0x80,0xA7,0xCA,0xE6,0xF8,0xFE,0xF8,0xE6,0xCA,0xA7,
  0x80,0x59,0x36,0x1A,0x08,0x02,0x08,0x1A,0x36,0x59,
  0x80,0xA7,0xCB,0xE7,0xF9,0xFF,0xF9,0xE7,0xCB,0xA7,
  0x80,0x59,0x35,0x19,0x07,0x01,0x07,0x19,0x35,0x59,
  0x80,0xA7,0xCB,0xE7,0xF9,0xFF,0xF9,0xE7,0xCB,0xA7,
  0x80,0x59,0x35,0x19,0x07,0x01,0x07,0x19,0x35,0x59,
  0x80
  };

// The last 20 bytes (21 with the extra on the end)
// define a single cycle of full amplitude sinusoid.
#define one  (&data[24*20])      // Sine table pointer for a one bit
#define zero (&data[25*20])      // Sine table pointer for a zero bit

// Useful macros for setting and resetting bits
#define cbi(sfr, bit) (_SFR_BYTE(sfr) &= ~_BV(bit))
#define sbi(sfr, bit) (_SFR_BYTE(sfr) |= _BV(bit))

// Variables used by the timer ISR to generate sinusoidal information.
volatile char    rgchBuf[256];    // Buffer of text to send
volatile uint8_t head = 0;        // Buffer head (next character to send)
volatile uint8_t tail = 0;        // Buffer tail (next insert point)

volatile uint16_t vcChar = 0;     // Current varicode char being sent

volatile int   cbHalfBit = 500;   // 500 phase points required for PSK 1/2 bit time

volatile char *pbSine = zero;
volatile int   cbDirection = 500;
volatile char  fSendOne = false;

//volatile char *pbSine = one;
//volatile int   cbDirection = 32;
//volatile char  fSendOne = true;

volatile char ix       = -1;
volatile char phase    = 1;
volatile char fFullBit = 0;

volatile char cZeroBits = 0;
volatile char maxZeroBits = 2;

// Setup timer2 with prescaler = 1, PWM mode to phase correct PWM
// See the ATMega datasheet for all the gory details

void timer2Setup()
{
  // Clock prescaler = 1
  sbi (TCCR2B, CS20);    // 001 = no prescaling
  cbi (TCCR2B, CS21);
  cbi (TCCR2B, CS22);

  // Phase Correct PWM
  cbi (TCCR2A, COM2A0);  // 10 = clear OC2A on compare match when up counting
  sbi (TCCR2A, COM2A1);  //      set OC2A on compare match when down counting

  // Mode 1
  sbi (TCCR2A, WGM20);   // 01 = Mode 1 uses 0xff as TOP value
  cbi (TCCR2A, WGM21);
}

// Timer 2 interrupt service routine (ISR).
//
// Grab the next phase point from the table and 
// set the amplitude value of the sinusoid being
// constructed.  For a zero bit, play 500 phase points
// (25 cycles of 20 samples each) to ramp down to zero
// amplitude, reverse phase, and then immediately ramp
// back up to full amplitude for a total of 1000 phase points.
//
// For a one bit, there is no amplitude or phase
// change, so we just play 20 phase points of
// full amplitude data 50 times for a total of 1000
// phase points.
//
// Each end of the ramp-up table starts with a zero 
// crossing byte, so there is one extra byte in
// the table (501 entries).  Ramping up plays bytes
// 0 -> 499 and ramping down plays bytes 500 -> 1
// allowing each direction to start at the zero
// crossing point.
ISR(TIMER2_OVF_vect)
{
  // Set current amplitude value for the sine wave 
  // being constructed taking care to invert the
  // phase when processing the table in reverse order.
  OCR2A = *pbSine * ix * phase;
  pbSine += ix;
  
  // At the half bit time, we need to change phase
  // if generating a zero bit
  if (0 == --cbHalfBit)
  {
    cbHalfBit = 500;  // Reset 1/2 PSK bit time phase counter
    
    // Get the next varichar bit to send
    if (fFullBit)
    {
      // Count the number of sequential zero bits
      if (fSendOne = vcChar & 1) cZeroBits = 0; else cZeroBits++;

      // Shift off the least significant bit.
      vcChar >>= 1;
      
      // If we have sent two zero bits, end of character has occurred
      if (cZeroBits > maxZeroBits)
      {
        cZeroBits = 0;
        
        // If send buffer not empty, get next varicode character
        if (head != tail)
        {
          // Assumes a 256 byte buffer as index increments modulo 256
          vcChar = varicode[rgchBuf[head++]];
        }
        else
          if (maxZeroBits > 2) cbi (TIMSK2,TOIE2); else maxZeroBits = 75;
      }
    }
    
    fFullBit = !fFullBit;  // Toggle end of full bit flag
    
    // When we get done ramping down, phase needs to
    // change unless we are sending a one bit
    if (ix < 0 &&!fSendOne) phase = -phase;
  }
  
  // At the end of the table for the bit being
  // generated, we need to change direction
  // and process the table in the other direction.
  if (0 == --cbDirection)
  {
    cbDirection = fSendOne ? 20 : 500;
    ix = -ix;
  }
}

void setup() 
{
  // PWM output for timer2 is pin 10 on the ATMega2560
  // If you use an ATMega328 (such as the UNO) you need
  // to make this pin 11
  // See https://spreadsheets.google.com/pub?key=rtHw_R6eVL140KS9_G8GPkA&gid=0
  pinMode(10, OUTPUT);   // Timer 2 PWM output on mega2560 is pin 10

  // Set up timer2 for phase correct PWM (16 MHz / 512 = 31.25 kHz)
  timer2Setup();

  // Put something in the buffer to be sent
  strcpy((char *) &rgchBuf[0], "\nCQ CQ CQ de ko7m ko7m ko7m"
                               "\nCQ CQ CQ de ko7m ko7m ko7m"
                               "\nCQ CQ CQ de ko7m ko7m ko7m CN87xp pse k\n");
  tail = strlen((const char *) rgchBuf);
  head = 0;
  
  sbi (TIMSK2,TOIE2);    // Enable timer 2.
}

void loop() 
{
}
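The LSB-first shift-out of a varicode table entry, as described in the comments above, can be tried on a PC with a small sketch like this one.  The function name is hypothetical, and it simply appends the two zero bits that the comments describe as the end of character marker; the ISR above handles the separator with its own zero bit counter.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Return the bits of one varicode table entry in transmission order
// (least significant bit first), followed by two zero bits marking
// the end of the character.
std::string varicodeBits(uint16_t code)
{
  assert(code != 0);  // every table entry has at least one 1 bit
  std::string bits;
  while (code != 0)
  {
    bits += (code & 1) ? '1' : '0';
    code >>= 1;
  }
  bits += "00";
  return bits;
}
```

For example, the table entry for 'e' (0x0003) comes out as 1100, 'a' (0x000d) as 101100 and the space character (0x0001) as 100.  The most common characters get the shortest codes.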

Sunday, February 15, 2015

Sorry to be absent for a while

I had hoped to be more active posting of late but I guess it has been my turn to deal with pesky medical issues since about Christmas.  After this week, I think things will normalize a little more for me and I will get back to more regular postings.

73's de Jeff - ko7m

Monday, February 2, 2015

Arduino Due High Frequency Waveform Output

In response to a previous post on Due Timers a reader asked if it was possible to obtain more than 1 MHz output frequency for a generated waveform.  To examine this question, I initially looked at performance using the simplest approach to generating a waveform output by bit banging the output waveform.

void setup() 
{
  pinMode(8, OUTPUT);
}
void loop() 
{  
  while (true) 
  {
    digitalWrite(8, HIGH);
    digitalWrite(8, LOW);
  }
}

This bit of code generates a 200 kHz square wave on digital pin 8.  The purpose of the while loop inside the loop function may not be obvious.  If we comment out the while statement, our output frequency changes from 200 kHz to approximately 144.5 kHz and the waveform no longer has a symmetrical 50% duty cycle.  This is caused by the routine that calls the loop function doing some checking of the serial port on each iteration.  So, to eliminate this overhead, I use my own while loop.

The next most obvious way to improve on this performance is to eliminate the overhead of the digitalWrite call and manipulate the port directly.

void setup() 
{
  pinMode(8, OUTPUT);
}
void loop() 
{  
  while (true) 
  {
    g_APinDescription[8].pPort -> PIO_SODR = g_APinDescription[8].ulPin;
    g_APinDescription[8].pPort -> PIO_CODR = g_APinDescription[8].ulPin;
  }
}

With this code I can now obtain approximately 16.9 MHz square wave output on pin 8.  The waveform has a bit of overshoot at this frequency.  This approach also suffers from the inability to accurately control the frequency.

When looking to see just how fast we can clock the PWM outputs, as noted in my previous post, you will find that the highest PWM clock frequency that can be accommodated is 42 MHz.  If you look at pmc.c and pwmc.c in hardware/arduino/sam/system/libsam/source you will notice that a function "FindClockConfiguration" is used to ensure that the frequency parameter passed is less than the master clock (MCK) frequency of 84 MHz.  With integer dividers used as a prescaler, the smallest prescale value of 2 results in a 42 MHz frequency.

So, if you are looking for a way to output the highest frequency possible, use a one bit PWM which will essentially output a square wave with a duty cycle of one.  Setting the prescaler to 2 will obtain an 84 MHz waveform.

uint32_t pwmPin = 8;
uint32_t maxDutyCount = 2;
uint32_t clkAFreq = 42000000ul;
uint32_t pwmFreq = 42000000ul; 
void setup() {
  pmc_enable_periph_clk(PWM_INTERFACE_ID);
  PWMC_ConfigureClocks(clkAFreq, 0, VARIANT_MCK);
  PIO_Configure(
    g_APinDescription[pwmPin].pPort,
    g_APinDescription[pwmPin].ulPinType,
    g_APinDescription[pwmPin].ulPin,
    g_APinDescription[pwmPin].ulPinConfiguration);
  uint32_t channel = g_APinDescription[pwmPin].ulPWMChannel;
  PWMC_ConfigureChannel(PWM_INTERFACE, channel , pwmFreq, 0, 0);
  PWMC_SetPeriod(PWM_INTERFACE, channel, maxDutyCount);
  PWMC_EnableChannel(PWM_INTERFACE, channel);
  PWMC_SetDutyCycle(PWM_INTERFACE, channel, 1);
  pmc_mck_set_prescaler(2);
}
void loop() 
{
}

Many thanks to Kerry Wong for providing most of the details above.

Monday, January 19, 2015

Arduino ADC conversion rate

I know a lot of postings have been written about analogue to digital conversion rates in the 8 bit Arduino processors.  I decided to do a little poking around and performance timing to see for myself how well these little processors perform.  I will compare the performance of the 8 bit ATMega328 and ATMega2560 processors with the 32 bit Arduino Due processor.

The ADC clock is 16 MHz divided by a prescale factor.  The default setting is found in wiring.c:

        // set a2d prescale factor to 128
        // 16 MHz / 128 = 125 KHz, inside the desired 50-200 KHz range.
        // XXX: this will not work properly for other clock speeds, and
        // this code should use F_CPU to determine the prescale factor.
        sbi(ADCSRA, ADPS2);
        sbi(ADCSRA, ADPS1);
        sbi(ADCSRA, ADPS0);

        // enable a2d conversions
        sbi(ADCSRA, ADEN);

Using the default setting of 128 for the prescale factor gives a conversion clock of 125 kHz.  Since ADC conversion requires 13 ADC clocks the effective sample rate at best is approximately 125 kHz / 13 = 9.615 kHz.

Using a prescale of 16 would give an ADC clock of 1 MHz and a sample rate of 76.923 kHz.  Increasing the ADC clock can affect ADC accuracy, however.  Atmel notes that the maximum ADC clock frequency is limited by the internal DAC in the conversion circuitry and recommends that it not exceed 200 kHz.  However, frequencies up to 1 MHz do not reduce the ADC resolution significantly.  Operation above 1 MHz has not been characterized.
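The sample rate arithmetic above can be captured in a few lines of plain C++ for experimenting with other prescale factors.  The function names are made up for this example.  The mapping from the three ADPS bits to a prescale factor follows the ATmega datasheet (bit patterns 0 and 1 both select 2, then 4, 8 and so on up to 128).

```cpp
#include <cassert>
#include <cstdint>

// Prescale factor selected by the ADPS2:ADPS0 bits in ADCSRA.
uint32_t adcPrescaleFromBits(uint8_t adps)
{
  assert(adps <= 7);
  return adps == 0 ? 2u : (1u << adps);
}

// ADC clock in Hz for a given CPU clock and prescale factor.
uint32_t adcClockHz(uint32_t cpuHz, uint32_t prescale)
{
  assert(prescale != 0);
  return cpuHz / prescale;
}

// Best case sample rate in Hz, at 13 ADC clocks per conversion.
uint32_t adcSampleRateHz(uint32_t cpuHz, uint32_t prescale)
{
  return adcClockHz(cpuHz, prescale) / 13;
}
```

With a 16 MHz CPU clock, a prescale of 128 gives 9615 Hz and a prescale of 16 gives 76923 Hz, matching the figures above after integer truncation.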

So, as a quick test of the impact on performance, I wrote a quick and dirty sketch to measure the time required to do 1000 analogRead operations before and after speeding up the ADC clock to see how much performance gain there is.

// useful defines for setting and clearing register bits
#define cbi(sfr, bit) (_SFR_BYTE(sfr) &= ~_BV(bit))
#define sbi(sfr, bit) (_SFR_BYTE(sfr) |= _BV(bit))

void setup() {
 unsigned long start;  // millis() returns an unsigned long
 int i;                // shared by both timing loops below

 Serial.begin(115200) ;
 Serial.println("ADCTest at default 9.6 kHz sample rate") ;
 start = millis() ;
 for (i = 0 ; i < 1000 ; i++)
   analogRead(0) ;
 Serial.print(millis() - start) ;
 Serial.println(" ms (1000 calls)") ;
 Serial.println();

 // set prescale to 16
 sbi(ADCSRA,ADPS2) ;
 cbi(ADCSRA,ADPS1) ;
 cbi(ADCSRA,ADPS0) ;

 Serial.println("ADCTest at 76.93 kHz sample rate") ;
 start = millis() ;
 for (i = 0 ; i < 1000 ; i++)
   analogRead(0) ;
 Serial.print(millis() - start) ;
 Serial.println(" ms (1000 calls)") ;
}

void loop()
{

}

The results are about as you would expect, with roughly a sixfold improvement in ADC speed (111 ms versus 18 ms for 1000 calls).

ADCTest at default 9.6 kHz sample rate
111 ms (1000 calls)

ADCTest at 76.93 kHz sample rate

18 ms (1000 calls)
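To see how close these measurements are to the theoretical conversion time, here is a rough comparison in plain C++.  It assumes 13 ADC clocks per conversion and treats everything beyond that as per call overhead; the function names are invented for this example, and millis() only resolves 1 ms, so the overhead figures are approximate.

```cpp
#include <cassert>
#include <cstdint>

// Ideal time in microseconds for n back-to-back conversions
// at 13 ADC clocks per conversion.
uint32_t idealConversionTimeUs(uint32_t n, uint32_t adcClockHz)
{
  assert(adcClockHz != 0);
  return static_cast<uint32_t>(
      (static_cast<uint64_t>(n) * 13 * 1000000) / adcClockHz);
}

// Average extra time per analogRead call in microseconds, beyond the
// ideal conversion time, given a measured total in milliseconds.
int32_t overheadPerCallUs(uint32_t measuredMs, uint32_t n, uint32_t adcClockHz)
{
  assert(n != 0);
  int64_t measuredUs = static_cast<int64_t>(measuredMs) * 1000;
  int64_t idealUs = idealConversionTimeUs(n, adcClockHz);
  return static_cast<int32_t>((measuredUs - idealUs) / static_cast<int64_t>(n));
}
```

For 1000 conversions the ideal times are 104000 us at 125 kHz and 13000 us at 1 MHz.  Against the measured 111 ms and 18 ms, that leaves roughly 7 us and 5 us of overhead per call respectively.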

Running the same test on the Due gives the following result:

ADCTest on Due 

3 ms (1000 calls)

Here is the code used:

void setup() 
{
 int start ;
 int i ;

 Serial.begin(115200) ;

 Serial.println("ADCTest on Due ") ;
 start = millis() ;
 for (i = 0 ; i < 1000 ; i++)
   analogRead(0);
 Serial.print(millis() - start) ;
 Serial.println(" ms (1000 calls)") ;
 Serial.println();

}

void loop() 
{

}

Wednesday, January 14, 2015

Arduino Due Timers (Part 1)

My next foray into the wild and wonderful world of Arduino Due will be to take a close look at the Due notion of Timers.  Tighten up the seat belt as this world gets deep in a hurry.  I will endeavour to keep things as simple and practical as I can.

The Arduino Due timers, or Timer Counters (TC) as they are called, are implemented a bit differently from those on the 8 bit Arduino devices.  There is a lot of functionality in the Due Timer Counter module and it is not a simple thing to describe fully, so I will likely break this into several postings.

The SAM3X8E CPU has 3 Timer Counters (TCs) named TC0, TC1, TC2.  Each TC includes three identical 32-bit channels.  Each channel can be independently programmed to perform a wide range of functions including frequency measurement, event counting, interval measurement, pulse generation, delay timing and pulse width modulation (PWM).

Each channel has three external clock inputs, five internal clock inputs and two multi-purpose input/output signals which can be configured by the user.  Each channel drives an internal interrupt signal which can be programmed to generate processor interrupts.

The TC embeds quadrature decoder logic connected in front of the 3 timers and driven by TIOA0, TIOB0 and TIOA1 inputs. When enabled, the quadrature decoder performs input line filtering and decoding of quadrature signals.  We will not be covering this feature in these postings.

The TC block has two global registers which act upon all three TC channels. The Block Control Register allows the three channels to be started simultaneously with the same instruction.  The Block Mode Register defines the external clock inputs for each channel, allowing them to be chained.

Clocks are assigned to Timer Counters as follows:
  • TIMER_CLOCK1 - MCK/2
  • TIMER_CLOCK2 - MCK/8
  • TIMER_CLOCK3 - MCK/32
  • TIMER_CLOCK4 - MCK/128
  • TIMER_CLOCK5 - SLCK

MCK is the master clock (84 MHz) and SLCK is the slow clock (32 kHz).  It should be noted that it is possible to select the slow clock as the master clock, in which case the TIMER_CLOCK5 input is equivalent to the master clock.  As will be seen later, TCs can be chained together using TIOA0, TIOA1 and TIOA2 as external clock inputs for subsequent TCs, allowing further division of the clock frequency.  I may get into clock chaining in further detail in a separate post.

This rather daunting image is the Timer Counter block diagramme.  It is not as bad as it looks.



Channel signals seen above are as follows:

  • XC0, XC1, XC2 - External Clock Inputs
  • TIOA                - Capture Mode: TC Input, Waveform Mode: TC Output
  • TIOB                - Capture Mode: TC Input, Waveform Mode: TC I/O
  • INT                  - Interrupt Signal Output
  • SYNC               - Synchronization Input Signal


The three channels of TC are identical in operation except when Quadrature decoder is enabled.

Each channel is organized around a separate 32-bit counter. The value of the counter is incremented at each positive edge of the selected clock. When the counter has reached the value 0xFFFFFFFF and wraps around to 0x00000000, an overflow occurs and the COVFS bit in TC_SR (Status Register) is set.  The current value of the counter is accessible anytime by reading the Counter Value Register, TC_CV. The counter can also be reset by a trigger. In this case, the counter value resets to 0x00000000 on the next valid edge of the selected clock following the trigger event.

At the block level, input clock signals of each channel can either be connected to the external inputs TCLK0, TCLK1 or TCLK2, or be connected to the internal I/O signals TIOA0, TIOA1 or TIOA2 for chaining by programming the TC_BMR (Block Mode) register.

Each channel can independently select an internal or external clock source for its counter via the TCCLKS bits in the TC Channel Mode register (TC_CMR).

  • Internal clock signals: TIMER_CLOCK1, TIMER_CLOCK2, TIMER_CLOCK3, TIMER_CLOCK4, TIMER_CLOCK5
  • External clock signals: XC0, XC1 or XC2

The selected clock can be inverted using the CLKI bit in TC_CMR. This allows counting on the opposite edges of the clock.  There is a burst function which allows the clock to be validated when an external signal is high. The BURST parameter in the Mode Register defines this signal (none, XC0, XC1, XC2).

Note that in all cases, if an external clock is used, the duration of each of its levels must be longer than the master clock period and the external clock frequency must be at least 2.5 times lower than the master clock.

Here is a block diagramme of the clock selection logic:



We still have not covered clock control, operating modes, or triggers, but we will touch on these topics as we work through examples.


Ok, enough background about Due timers for now and on to the first practical example.  In this example we will define a function that allows the configuration of a TC to generate a square wave at a relatively low frequency of the caller's choice.

Firstly, let's think about clocking our timer.  We have a system clock speed of 84 MHz that can be divided by 4 different divisors (2, 8, 32 and 128) and the slow clock.  So, the available timer clock speeds are:

  • 42 MHz
  • 10.5 MHz
  • 2.625 MHz
  • 656.25 kHz
  • 32 kHz
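
The four divided clock speeds follow directly from the master clock, which this small plain C++ sketch confirms.  The function name is made up for illustration, and the slow clock is left out because it is not derived from MCK.

```cpp
#include <cassert>
#include <cstdint>

// Frequency in Hz of TIMER_CLOCK1..TIMER_CLOCK4 for a given master clock.
// clockNumber is 1 to 4, matching the TIMER_CLOCKn names.
uint32_t timerClockHz(uint32_t mckHz, unsigned clockNumber)
{
  static const uint32_t divisors[] = { 2, 8, 32, 128 };
  assert(clockNumber >= 1 && clockNumber <= 4);
  return mckHz / divisors[clockNumber - 1];
}
```

With MCK at 84000000 Hz this gives 42000000, 10500000, 2625000 and 656250 Hz for TIMER_CLOCK1 through TIMER_CLOCK4.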


As previously mentioned, TCs can be chained to obtain other clock speeds, but that topic is beyond the scope of this posting.

To start up a timer, we need to deal with at least 4 different bits of information when doing simple operations with TCs.

  • The Timer Counter (TC) you wish to use
  • The channel in that TC you wish to use
  • The IRQ if interrupts are used
  • The frequency of the timer


The following table is useful when performing TC configuration as it shows the relationship between the TC, its channels, the IRQ to use, what the IRQ function must be called and the power management ID for that peripheral.  Looking at the first TC in the list (TC0) we can see that it has three channels (0, 1, 2).  The Nested Interrupt controller IRQ values are TC0_IRQn, TC1_IRQn and TC2_IRQn respectively.  When using interrupts, the IRQ handler functions that are called are named TC0_Handler, TC1_Handler and TC2_Handler respectively.  Lastly, the power management controller IDs are ID_TC0, ID_TC1 and ID_TC2 respectively.  The remaining TCs follow the same pattern.
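The naming pattern described above can be expressed as a simple calculation: the number used in the IRQ, handler and ID names is three times the TC number plus the channel.  Here is a small plain C++ sketch of that rule.  The helper names are hypothetical and it only builds the names as strings for reference; it does not touch any hardware.

```cpp
#include <cassert>
#include <string>

// Number n used in TCn_IRQn, TCn_Handler and ID_TCn for a given
// TC block (0 to 2) and channel (0 to 2).
unsigned tcPeripheralIndex(unsigned tcBlock, unsigned channel)
{
  assert(tcBlock <= 2 && channel <= 2);
  return tcBlock * 3 + channel;
}

std::string tcIrqName(unsigned tcBlock, unsigned channel)
{
  return "TC" + std::to_string(tcPeripheralIndex(tcBlock, channel)) + "_IRQn";
}

std::string tcHandlerName(unsigned tcBlock, unsigned channel)
{
  return "TC" + std::to_string(tcPeripheralIndex(tcBlock, channel)) + "_Handler";
}

std::string tcPowerIdName(unsigned tcBlock, unsigned channel)
{
  return "ID_TC" + std::to_string(tcPeripheralIndex(tcBlock, channel));
}
```

For example, TC1 channel 0 maps to TC3_IRQn, TC3_Handler and ID_TC3, which are the names used in the example that follows.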



So, we will create a function to encapsulate all this to get a simple timer running.  The timer will generate a square wave at the specified frequency.


There is a bit of housekeeping that needs to occur.  

  • We need to enable the ability to modify the power management controller's registers.
  • We need to enable a specific peripheral clock specified by the IRQ.
  • We need to set the TC configuration.


Power Management Controller calls look like this.  We need to turn off write protection and then enable the peripheral clock for TC1 Channel 0.

   pmc_set_writeprotect(false);
   pmc_enable_periph_clk(ID_TC3);

You could also use TC3_IRQn rather than ID_TC3 as they are both different names for the same constant value.  It is more clear to use the correct constant name, but as we will see shortly, it does simplify the implementation if we don't.

TC_Configure is used to configure a TC to operate in a given mode.  The timer is stopped after configuration and must be restarted with TC_Start().  All interrupts of the timer are also disabled.

We will select Waveform Mode and instruct the TC to count up with a reset on register C (RC) compare.  The following graphic depicts this mode, though it seems to imply the maximum counter value is 0xffff which is not true.  With 32 bits, the maximum counter value would be 0xffffffff.


TC configuration is accomplished with the following code.  We will use Timer Clock 4 (master clock / 128  = 656.25 kHz) as for this example we will be generating low frequency waveforms.  The function takes the TC and Channel as the first two parameters.  The last parameter sets bits to indicate the fact we are in Waveform mode, only counting up to the maximum value specified in Register C (RC) and which of the 5 clocks we will use.

   TC_Configure(tc, channel, TC_CMR_WAVE | TC_CMR_WAVSEL_UP_RC |
                             TC_CMR_TCCLKS_TIMER_CLOCK4);

Now we need to set Register A (RA) to be the clock count where our output (TIOA) goes high and Register C (RC) at the clock count where our output goes low.  See the graphic above.  We chose points that would generate a symmetrical 50% duty cycle square wave.  Register C is set to the maximum count  specified by the clock frequency divided by the desired frequency.

   uint32_t rc = VARIANT_MCK / 128 / freq;
   TC_SetRA(tc, channel, rc / 2); // 50% duty cycle
   TC_SetRC(tc, channel, rc);
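
Because RC is computed with integer division, the actual output frequency can differ slightly from the requested one.  This plain C++ sketch, with made up function names, shows the register values and the resulting frequency in millihertz for a given request.

```cpp
#include <cassert>
#include <cstdint>

// Register C value: timer clock ticks per output cycle.
uint32_t rcFor(uint32_t mckHz, uint32_t divisor, uint32_t freqHz)
{
  assert(divisor != 0 && freqHz != 0);
  return mckHz / divisor / freqHz;
}

// Register A value for a 50% duty cycle.
uint32_t raForHalfDuty(uint32_t rc)
{
  return rc / 2;
}

// Output frequency actually produced, in millihertz.
uint32_t actualMilliHz(uint32_t mckHz, uint32_t divisor, uint32_t rc)
{
  assert(divisor != 0 && rc != 0);
  return static_cast<uint32_t>(
      (static_cast<uint64_t>(mckHz) * 1000) / divisor / rc);
}
```

For a 1 Hz request with MCK at 84 MHz and a divisor of 128, RC is 656250 and RA is 328125, giving exactly 1 Hz.  For a 1000 Hz request RC truncates to 656 and the output comes out at about 1000.381 Hz.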

Now we enable the Register C (RC) compare interrupt.  This bit is a little strange, because we have both an interrupt enable register and an interrupt disable register.  I suspect this is so that a complete set of interrupts that you might need can be set in the interrupt enable list and sub-sets turned off by modifying the list of disabled interrupts.  This way you don't have to remember which ones were enabled previously.  This code enables only the RC compare interrupt and disables everything except RC compare interrupt, or so I believe.

   tc->TC_CHANNEL[channel].TC_IER =  TC_IER_CPCS;
   tc->TC_CHANNEL[channel].TC_IDR = ~TC_IER_CPCS;
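
The effect of these two writes can be modelled in plain C++.  This is a sketch under the assumption that writing a 1 bit to the enable register sets the matching bit in the interrupt mask, and writing a 1 bit to the disable register clears it, which is how the SAM3X datasheet describes TC_IER, TC_IDR and TC_IMR.  The bit positions used for CPAS and CPCS are also taken from the datasheet, and the struct and function names are invented for this example.

```cpp
#include <cassert>
#include <cstdint>

const uint32_t CPAS_BIT = 1u << 2;  // RA compare interrupt
const uint32_t CPCS_BIT = 1u << 4;  // RC compare interrupt

// Model of the interrupt mask (TC_IMR) of one TC channel.
struct InterruptMaskModel
{
  uint32_t imr;
};

// Writing to the enable register sets the selected mask bits.
void writeIER(InterruptMaskModel &m, uint32_t value)
{
  m.imr |= value;
}

// Writing to the disable register clears the selected mask bits.
void writeIDR(InterruptMaskModel &m, uint32_t value)
{
  m.imr &= ~value;
}
```

Starting from any mask, writeIER(m, CPCS_BIT) followed by writeIDR(m, ~CPCS_BIT) leaves only the RC compare bit set, which matches the intent described above.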

Start the timer running again.

   TC_Start(tc, channel);

And tell the Nested Interrupt Controller to enable our IRQ.

   NVIC_EnableIRQ(irq);

Simple, eh?  Yeah...  Nothing to it...  Here is the entire function:

void TimerStart(Tc *tc, uint32_t channel, IRQn_Type irq, uint32_t freq)
{
   pmc_set_writeprotect(false);
   pmc_enable_periph_clk((uint32_t) irq);
   TC_Configure(tc, channel, TC_CMR_WAVE | TC_CMR_WAVSEL_UP_RC |
                             TC_CMR_TCCLKS_TIMER_CLOCK4);
   uint32_t rc = VARIANT_MCK / 128 / freq;
   TC_SetRA(tc, channel, rc/2); // 50% duty cycle square wave
   TC_SetRC(tc, channel, rc);
   TC_Start(tc, channel);
   tc->TC_CHANNEL[channel].TC_IER=TC_IER_CPCS;
   tc->TC_CHANNEL[channel].TC_IDR=~TC_IER_CPCS;
   NVIC_EnableIRQ(irq);
}

Whew...  Still with me?  Ok!

Now we will implement an ISR handler that just toggles the LED on digital pin 13 on and off every time the timer fires an interrupt.  It also has to read the status of the Timer Counter (TC) in order to allow the next interrupt.

volatile boolean ledOn;

void TC3_Handler()
{
   TC_GetStatus(TC1, 0);
   digitalWrite(13, ledOn = !ledOn);
}

Ok, so given that we are only interrupting when Register C compare match occurs (see graphic above), and we are toggling pin 13 on every interrupt, we effectively divide the frequency that the led blinks at by two.  If we want the frequency of the led blinking to match the frequency of the timer, we will need to interrupt on Register A compare match as well.  The code changes to implement this would be to just enable the interrupt on RA compare as well as RC compare.

   tc->TC_CHANNEL[channel].TC_IER=  TC_IER_CPCS | TC_IER_CPAS;
   tc->TC_CHANNEL[channel].TC_IDR=~(TC_IER_CPCS | TC_IER_CPAS);
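
The divide by two effect is easy to check with a little arithmetic in plain C++.  The function name is made up for this example.  Each interrupt toggles the LED once, and it takes two toggles to complete one on/off cycle.

```cpp
#include <cassert>
#include <cstdint>

// LED blink frequency in millihertz when the ISR toggles the LED on
// every interrupt.  interruptsPerTimerCycle is 1 for RC compare only,
// or 2 for RA compare plus RC compare.
uint32_t ledMilliHz(uint32_t timerHz, uint32_t interruptsPerTimerCycle)
{
  assert(interruptsPerTimerCycle == 1 || interruptsPerTimerCycle == 2);
  uint64_t togglesPerSecondMilli =
      static_cast<uint64_t>(timerHz) * interruptsPerTimerCycle * 1000;
  return static_cast<uint32_t>(togglesPerSecondMilli / 2);
}
```

With a 1 Hz timer, interrupting on RC compare alone blinks the LED at 0.5 Hz, while interrupting on both RA and RC compare blinks it at the full 1 Hz.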

So, the only thing remaining is to implement the setup function and stand back...  We set the LED pin to output and initialize timer TC1, channel 0 using the IRQ TC3_IRQn (from the table above) with a frequency of 1 Hz.

void setup()
{
  pinMode(13, OUTPUT);
  TimerStart(TC1, 0, TC3_IRQn, 1);

}

So all of this is to blink a freaking LED at a 1 Hz rate.  Amazing flexibility (and the associated complexity) comes at the cost of a bit of a steep learning curve.  Here is the complete listing for your reference.

volatile boolean ledOn;

//TC1 ch 0
void TC3_Handler()
{
   TC_GetStatus(TC1, 0);
   digitalWrite(13, ledOn = !ledOn);
}

void TimerStart(Tc *tc, uint32_t channel, IRQn_Type irq, uint32_t freq)
{
   pmc_set_writeprotect(false);
   pmc_enable_periph_clk(irq);
   TC_Configure(tc, channel, TC_CMR_WAVE | TC_CMR_WAVSEL_UP_RC |
                             TC_CMR_TCCLKS_TIMER_CLOCK4);
   uint32_t rc = VARIANT_MCK / 128 / freq;
   TC_SetRA(tc, channel, rc >> 1); // 50% duty cycle square wave
   TC_SetRC(tc, channel, rc);
   TC_Start(tc, channel);
   tc->TC_CHANNEL[channel].TC_IER=  TC_IER_CPCS | TC_IER_CPAS;
   tc->TC_CHANNEL[channel].TC_IDR=~(TC_IER_CPCS | TC_IER_CPAS);
   NVIC_EnableIRQ(irq);
}

void setup()
{
  pinMode(13, OUTPUT);
  TimerStart(TC1, 0, TC3_IRQn, 1);
}

void loop()
{

}

More to come, but have fun with this if you are so inclined.  I am always willing to help out if you have questions.  Drop me a note at ko7m at arrl dot net or comment here and I will do my best.