
Introduction

This project sends and receives low-latency audio over the network using the PIC32 microcontroller.

In this project, we implemented a system that sends audio packets over the Internet using the Jacktrip protocol, which was specifically designed for low-latency audio for musicians. We used the PIC32 microcontroller along with an ENC28J60 ethernet chip as both a sender and receiver, communicating with a Raspberry Pi set up in the lab. Hardware components used in the project include the ADC for sampling audio played by a musician, various DMA channels for data movement, SPI for communicating with peripherals and the ENC28J60 chip, and the DAC for playing received audio packets. We utilized a modified version of the Microchip network stack to handle the details of the transport layer and below, ultimately settling on UDP as the transport protocol for its speed and simplicity.

High Level Design

Project Motivation

COVID-19 has made it impossible for musicians around the world to meet in person and rehearse together. Unfortunately, most video calls, and even simple audio calls, cannot synchronize audio produced by musicians playing remotely because of long network delays. In most cases network latency is not a problem for verbal communication because dialogue naturally follows a turn-based structure; the listener simply waits to hear what the speaker has said before speaking themselves. In musical ensembles, however, musicians are expected to perform together in perfect synchrony, meaning that any significant network delay will wreak havoc.

Researchers at Stanford developed Jacktrip, a multi-machine, uncompressed audio streaming system, to address this exact issue, along with a Jacktrip-equipped Raspberry Pi intended for use by the general public. Our project explores the feasibility of using a PIC32 instead of a Raspberry Pi as the processing device, which would considerably reduce the cost to the consumer. The main advantages of a full computer such as a Raspberry Pi are its built-in network peripherals and the OS services that abstract away the details of network functions. On the other hand, the OS may unnecessarily hinder the data transfer rate due to preemptive multitasking and other sources of overhead. With a PIC32, we have to interface the microcontroller with an external Ethernet chip and handle the communication between the chip and the PIC ourselves, but we also avoid the overhead of an OS and can throw the full resources of the PIC at the task at hand.

Bare Minimum Data Transfer Rate

A key requirement is the bidirectional streaming data rate for 48 kHz uncompressed audio.
\[
\text{data rate} = \left(\frac{48000\ \text{samples}}{1\ \text{second}\cdot 1\ \text{channel}}\right)\left(\frac{16\ \text{bits}}{1\ \text{sample}}\right)\left(1\ \text{channel}\right) + \left(\frac{48000\ \text{samples}}{1\ \text{second}}\right)\left(\frac{1\ \text{packet}}{64\ \text{samples}}\right)\left(\frac{16 \cdot 8\ \text{header bits}}{1\ \text{packet}}\right) = 864\ \text{kbps}
\]
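
The same arithmetic can be checked with a few lines of C. This is only a sketch: the constant and function names are illustrative and do not come from the project source, and the figures are the ones stated in this section (48 kHz, 16-bit samples, one channel, 64 samples per packet, 16 header bytes per packet).

```c
#include <stdint.h>

/* Values taken from the text above; the names are illustrative. */
enum {
    SAMPLE_RATE_HZ     = 48000,
    BITS_PER_SAMPLE    = 16,
    NUM_CHANNELS       = 1,
    SAMPLES_PER_PACKET = 64,
    HEADER_BYTES       = 16
};

/* Audio payload: samples/s * bits/sample * channels. */
uint32_t payload_bps(void)
{
    return (uint32_t)SAMPLE_RATE_HZ * BITS_PER_SAMPLE * NUM_CHANNELS;
}

/* Header overhead: packets/s * header bits/packet. */
uint32_t header_bps(void)
{
    uint32_t packets_per_second = SAMPLE_RATE_HZ / SAMPLES_PER_PACKET;
    return packets_per_second * HEADER_BYTES * 8u;
}

/* Total bit rate in one direction, in bits per second. */
uint32_t total_bps(void)
{
    return payload_bps() + header_bps();
}
```

The payload contributes 768 kbps and the headers another 96 kbps, which together give the 864 kbps figure.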

Logical Structure

The primary project specifications include the ability to send and receive single-channel Jacktrip audio packets on the network using a PIC32. Since the PIC does not have hardware support for the physical and link layers, we obtained a Microchip ENC28J60 Ethernet controller chip to do so. The ENC28J60 communicates with the PIC via SPI. Other project requirements include being able to transform audio signals sampled by the ADC into network packets and converting received packets into audible sound using a DAC. A Raspberry Pi serves as the other Jacktrip-enabled end host and is used to verify that packets are being sent by the PIC correctly. The Pi is also used to transmit packets to the PIC to test the receive functionality on the PIC.

When a microphone or other audio capture device (in our case, we tested using the audio output from the lab desktop computers) produces an analog signal on AN11, we use the ADC – triggered by timer 3 – to sample the signal. Since we want to save CPU cycles whenever possible to perform critical functions, we use DMA to transfer the data in the ADC buffer through SPI to buffers on the ethernet chip. To achieve maximal performance, the DMA functions had to be modified to be non-blocking, as explained in the next section. Once the data is on the ENC28J60 chip, we call functions from the Microchip network stack to initiate a UDP packet send to the destination IP address of the Raspberry Pi.

When the Pi sends data to the PIC, packets are accumulated in the ENC28J60’s internal receive buffers. Again, we use DMA to move the data from the ethernet chip onto the PIC’s memory to save precious CPU cycles. From there, the audio data is written to the MCP4822 DAC, which also communicates with the PIC via SPI.
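
As an illustration of the last step, the sketch below converts one signed 16-bit sample into a 16-bit MCP4822 write command. The bit positions follow the MCP4822 data sheet; the choice of channel A and 1x gain, the offset-binary conversion, and the function name are assumptions made for this example and are not taken from the project source.

```c
#include <stdint.h>

/* MCP4822 command bits (per the MCP4822 data sheet). */
#define DAC_CHANNEL_A  0x0000u  /* bit 15 = 0 selects DAC A          */
#define DAC_GAIN_1X    0x2000u  /* bit 13 = 1 selects 1x output gain */
#define DAC_ACTIVE     0x1000u  /* bit 12 = 1 keeps the output on    */

/* Hypothetical helper: map a signed 16-bit sample onto the 12-bit DAC. */
uint16_t dac_word_from_s15(int16_t sample)
{
    /* Shift the signed range [-32768, 32767] up to [0, 65535]. */
    uint16_t offset_binary = (uint16_t)((int32_t)sample + 32768);
    /* Keep the 12 most significant bits for the 12-bit converter. */
    uint16_t code = (uint16_t)(offset_binary >> 4);
    return (uint16_t)(DAC_CHANNEL_A | DAC_GAIN_1X | DAC_ACTIVE | code);
}
```

With these assumptions, a sample of zero maps to mid-scale (0x3800), so silence sits in the middle of the DAC output range.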

Hardware/Software Tradeoffs

Tradeoffs between hardware and software implementations affected our design process and results. For instance, as mentioned before, data transfer can be accomplished either with CPU load and store instructions or with the DMA controller. While moving data with the CPU is easier to code, we likely would not have been able to meet our timing constraints due to wasted cycles. The DMA controller is more complicated to set up, but it operates independently of the software, meaning the CPU is free to run higher-level control code.

Program Design

Software

Our software drew heavily from the Microchip network stack modified by Alex Whiteway. The network stack has several high-level application examples available for use, most of which are removable (to gain higher performance) simply by commenting out a few lines at most in ethernet_entry.c. That network stack also requires the naming of a few pins using the latch register bit structs.

Configuring the Network Stack

Configuring the network stack is straightforward. The stack is set up to use non-framed SPI mode, so you must select a chip select pin using the tristate and latch register bit structures. We chose pin B3 as our chip select pin and placed the following lines in HardwareProfile.h:


#define ENC_CS_TRIS TRISBbits.TRISB3
#define ENC_CS_IO LATBbits.LATB3
#define _ENC_USE_SPI_1

We also included the following lines simply as a memory aid.


#define ENC_SDO_TRIS TRISBbits.TRISB8
#define ENC_SDI_TRIS TRISBbits.TRISB13
#define ENC_SCK_TRIS TRISBbits.TRISB14

If your ENC28J60 uses SPI2, you would have to use #define _ENC_USE_SPI_2 in the appropriate place.
In TCPIPConfig.h, we chose to use the DHCP client, DNS, the Berkeley sockets API, and the ICMP server by uncommenting the appropriate options. We also changed the default IP address to 0.0.0.0 to force our application to wait for DHCP configuration. That is accomplished by changing the appropriate define constants.


#define MY_DEFAULT_IP_ADDR_BYTE1        (0ul)//(192ul)
#define MY_DEFAULT_IP_ADDR_BYTE2        (0ul)//(168ul)
#define MY_DEFAULT_IP_ADDR_BYTE3        (0ul)//(1ul)
#define MY_DEFAULT_IP_ADDR_BYTE4        (0ul)//(120ul)

DMA Packet Sending and Receiving

The original network stack blocks on every SPI transfer, most notably on every byte of packet data. Each byte takes 8 SPI clocks to shift out, so at least 16 CPU cycles (assuming a 40 MHz system clock and a 20 MHz SPI clock) are wasted per packet byte. Considering that each packet is 144 bytes, the network stack easily becomes CPU bound. With overhead, we could only achieve around 650 kbps bidirectionally, and that test did not include any significant data processing such as reading from the ADC or writing to the DAC.
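
To make those numbers concrete, the sketch below lays out a 144-byte packet and counts the CPU cycles spent waiting on SPI for one packet. The struct is a hypothetical layout with an opaque 16-byte header (the real Jacktrip header fields are not reproduced here), and the function names are illustrative. It assumes the system clock is an integer multiple of the SPI clock.

```c
#include <stdint.h>

#define HEADER_BYTES        16u  /* header size used in the data rate calculation */
#define SAMPLES_PER_PACKET  64u  /* 16-bit samples per packet                     */

/* Hypothetical layout: opaque header followed by the audio samples. */
typedef struct {
    uint8_t header[HEADER_BYTES];
    int16_t samples[SAMPLES_PER_PACKET];
} audio_packet_t;

/* 16 + 64 * 2 = 144 bytes, matching the packet size quoted in the text. */
_Static_assert(sizeof(audio_packet_t) == 144, "unexpected packet size");

/* CPU cycles that pass while one byte (8 bits) shifts out over SPI. */
uint32_t cycles_per_spi_byte(uint32_t sysclk_hz, uint32_t spi_hz)
{
    return 8u * (sysclk_hz / spi_hz);
}

/* Cycles spent blocked on the packet data of a single packet. */
uint32_t blocked_cycles_per_packet(uint32_t sysclk_hz, uint32_t spi_hz)
{
    return (uint32_t)sizeof(audio_packet_t) * cycles_per_spi_byte(sysclk_hz, spi_hz);
}
```

At 40 MHz and 20 MHz this gives 16 cycles per byte and 2304 cycles per packet for the packet data alone, before any command bytes or stack overhead.
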
It was straightforward to boost performance by using DMA channels to read and write the packet data. To do that, we defined and implemented non-blocking, DMA-based functions in UDP.h/UDP.c and in mac.h/ENC28J60.c. They use DMA channel 2 to write to SPI1BUF and DMA channel 3 to read from SPI1BUF. When sending a packet, we do not care what is received, as the ENC28J60 has no documented full-duplex SPI operation, so we can simply use DMA channel 2 to send the packet buffer, saving many CPU cycles. When receiving a packet, DMA channel 2 must be set to auto-enable so that a one-byte dummy buffer is read and sent repeatedly to clock in the incoming data, and the channel must be disabled once the transfer is complete. A global variable blocks other SPI1 transfers while a DMA transfer on SPI1 is in progress. The state machine states poll the appropriate DCHxCON register to determine whether a packet send or receive has completed, and they handle the de-assertion of the ENC chip select signal.
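
The busy-flag and polling pattern described above can be modeled on a host as follows. This is a sketch under stated assumptions, not the code in UDP.c or ENC28J60.c: every name is hypothetical, two plain variables stand in for the DMA channel enable bit and the chip select latch, and it assumes the enable bit reads as cleared once the block transfer has finished.

```c
#include <stdbool.h>

/* Stand-ins for hardware so the sketch runs on a host (hypothetical names). */
static volatile bool sim_dma_channel_enabled; /* models the DCHxCON enable bit */
static volatile bool sim_cs_asserted;         /* models the ENC chip select    */

/* Global flag that keeps other code off SPI1 during a DMA transfer. */
static volatile bool spi1_dma_busy;

/* Try to start a transfer; returns false if SPI1 is already claimed. */
bool spi_dma_start(void)
{
    if (spi1_dma_busy) {
        return false;
    }
    spi1_dma_busy = true;
    sim_cs_asserted = true;          /* select the ENC28J60  */
    sim_dma_channel_enabled = true;  /* kick off the channel */
    return true;
}

/* Called from a state machine state on every pass; it never waits. */
bool spi_dma_poll_done(void)
{
    if (!spi1_dma_busy) {
        return true;                 /* nothing in flight */
    }
    if (sim_dma_channel_enabled) {
        return false;                /* bytes are still moving */
    }
    sim_cs_asserted = false;         /* release the ENC28J60 */
    spi1_dma_busy = false;
    return true;
}

/* Simulation hook: stands in for the hardware finishing the block. */
void sim_dma_complete(void)
{
    sim_dma_channel_enabled = false;
}
```

The point of the pattern is that a caller which finds the transfer unfinished returns immediately and does other work, instead of spinning for 16 cycles per byte.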

The following diagram shows the state machine for UDP packet handling. On startup, the PIC sets the destination IP address and creates a socket for transmission. It always initiates the first packet send, and then transitions into a send/receive loop which continues as long as the user maintains the connection. When receive data is not available, the state machine goes back to sending another packet before trying again.
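
The loop described in this paragraph can be sketched as a small state machine. The state names, the struct, and the ST_CLOSED state are hypothetical and only follow the prose above; the real implementation lives in the modified network stack.

```c
#include <stdbool.h>

typedef enum {
    ST_OPEN_SOCKET,  /* set the destination IP and create the UDP socket */
    ST_SEND,         /* transmit one audio packet                        */
    ST_RECEIVE,      /* take a received packet if one is waiting         */
    ST_CLOSED        /* the user ended the connection (assumed state)    */
} udp_state_t;

typedef struct {
    udp_state_t state;
    unsigned sent;      /* packets handed to the transmitter */
    unsigned received;  /* packets taken from the receiver   */
} udp_link_t;

/* Advance the state machine by one step without blocking. */
void udp_step(udp_link_t *link, bool connected, bool rx_available)
{
    if (!connected) {
        link->state = ST_CLOSED;
        return;
    }
    switch (link->state) {
    case ST_OPEN_SOCKET:
        link->state = ST_SEND;      /* the PIC always sends first */
        break;
    case ST_SEND:
        link->sent++;
        link->state = ST_RECEIVE;
        break;
    case ST_RECEIVE:
        if (rx_available) {
            link->received++;
        }
        link->state = ST_SEND;      /* send again before retrying a receive */
        break;
    case ST_CLOSED:
        break;
    }
}
```

Because the receive state never waits for data, a missing packet from the other end costs one pass through the loop rather than stalling the outgoing stream.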

Audio input/output

We used DMA channels 0 and 1 to load double-buffered packets directly from the ADC. Since Jacktrip uses s.15 fixed point for 16-bit audio transport, we simply had to copy the output from the ADC into the packets. Channel 1 is chained from channel 0. Both channels raise block transfer done interrupts whose ISRs signal the network routines that there is a packet available to send. Channel 1’s ISR also re-enables channel 0 because we could not satisfactorily chain the channels to each other.
When a packet is received, the receiver code sets playback start and end pointers. The Timer3 ISR writes a sample to the DAC at 48000 Hz when there is a sample available, and does nothing otherwise. Timer3 also signals the ADC to start a conversion.
Local audio loopback is achieved by having the DMA channel 0 and channel 1 ISRs, rather than the receiver code, set the playback pointers.
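
A host-side model of the playback pointer scheme might look like the following. All names are hypothetical, and a variable stands in for the actual DAC write; the sketch only shows how the timer tick consumes samples between a start and an end pointer and does nothing when none are queued.

```c
#include <stddef.h>
#include <stdint.h>

static const int16_t *play_ptr;  /* next sample to play               */
static const int16_t *play_end;  /* one past the last queued sample   */
static int16_t last_dac_sample;  /* stand-in for the SPI write to DAC */
static unsigned dac_writes;      /* number of samples played so far   */

/* Called by the receiver code, or by the DMA ISRs for local loopback. */
void playback_set(const int16_t *start, size_t count)
{
    play_ptr = start;
    play_end = start + count;
}

/* Body of the 48 kHz timer tick: play one sample if any is queued. */
void playback_tick(void)
{
    if (play_ptr != play_end) {
        last_dac_sample = *play_ptr++;
        dac_writes++;
    }
    /* Otherwise do nothing, so the DAC keeps its previous value. */
}
```

Switching between network playback and local loopback then only changes who calls playback_set, not the timer code.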

Hardware Design

The current design only sends and receives one channel of audio. The analog input from the left channel of a computer’s headphone jack is biased to mid-rail through two 10 kΩ resistors and AC-coupled with a 10 µF capacitor and then arrives at pin RA0. The right channel goes to RA1 in the same manner but is not used. The audio output runs through an MCP4822 SPI DAC which is AC-coupled and loaded through a 10 nF capacitor and 10 kΩ resistor to ground. The SPI pins for the ENC chip breakout board and MCP DAC are chosen in such a way that the external oscillator and USB pins are left open for possible improvement. Please see the schematic appendix for the specific connections.

Non-PIC32 Components

A Raspberry Pi ran Jacktrip and dnsmasq (for DHCP). Since Cornell may change the eduroam/RedRover 10-space address of the Pi, the Pi needed a way to share its IP address automatically. We used a cron job that checked every five minutes for a new IP address and pushed updates to a git repo as necessary.


#!/bin/bash
# Example update IP script.
# Run "crontab -e" and append the following line (uncommented) to the end of
# the user crontab to update every 5 minutes.
# This script assumes that update_ips is the git repo that you want to push to.
#*/5 * * * * /home/pi/update_ips/update.sh

# Stop if the repo directory is missing rather than running git elsewhere.
pushd /home/pi/update_ips || exit 1
CURR_IP=$(hostname -I)
# Name the record file after the hostname and the wlan0 MAC address.
HOSTNAME=$(hostname)-$(ifconfig wlan0 | grep -o -E '([[:xdigit:]]{1,2}:){5}[[:xdigit:]]{1,2}')
if [[ -e "$HOSTNAME" ]]; then
  REC_IP=$(cat "$HOSTNAME")
else
  REC_IP=0
fi
# Only commit and push when the address differs from the recorded one.
if [[ "$CURR_IP" != "$REC_IP" ]]; then
  hostname -I > "$HOSTNAME"
  git pull
  git add "$HOSTNAME"
  git commit -m "update $HOSTNAME"
  git push
fi
popd

Jacktrip on the Pi

It is best to compile Jacktrip from source. Assuming you are running Raspbian Buster, you need to install the qt5-qmake and libjack-jackd2-dev packages (which should pull in all other build dependencies). Get the source from the Jacktrip GitHub repo. Make a shadow build directory (e.g. jacktrip/build) and run qmake ../src/jacktrip.pro in that directory. Then run make. Running make install is not required.
If you are running a different flavor of Linux (e.g. Arch Linux, RHEL, CentOS), perhaps not on a Pi, be aware that there might not be recent JACK or Qt packages available. JACK, Qt, Qjackctl, and Jacktrip are straightforward to build and give decent performance with recent versions of GCC. The prebuilt Jacktrip package on Debian-based desktop OSes is often out of date, but the core functions implemented in this project have not changed between Jacktrip 1.2 (prepackaged) and Jacktrip 1.3 (on GitHub).
Jacktrip also runs on Windows and macOS. There are prebuilt packages for each, although it is very useful to have a DHCP server on your testing network.

Results

The device was tested in a somewhat quantitative manner with sine wave inputs after the demonstration, and the demonstration below is representative of the quality of the device audio.

Frequency Distortion on Packet Receipt

There is significant frequency distortion that worsens as the frequency increases, although the individual frequencies appear to make it through. Spectra were captured for 61 Hz, 122 Hz, and 488 Hz network inputs, and the spurs grow with the input frequency. Additionally, at certain frequencies, such as 1538 Hz, the DAC’s hold value creates a relatively clear tone. This is a result of the DAC holding its value when a packet fails to arrive in time to play. The scope reading suggests that approximately half of the samples are lost.

Frequency Distortion on Packet Send

The packets sent over the network are so distorted that they are essentially unusable. However, a clue may reside in the aliasing that occurs at 17:10 in the video above. It seems that not many packets are actually making it out of the device.


2021-05-18T18:29:14 10.253.98.244 send: 0/0 recv: 2077/4 prot: 292302/0/0 tot: 448144 sync: 10/0/0/4/290225 skew: -290229/-290229 bcast: 0/0 autoq: 1.9/29.5
2021-05-18T18:29:17 10.253.98.244 send: 0/0 recv: 3101/4 prot: 436462/0/0 tot: 593530 sync: 10/0/0/4/433361 skew: -433365/-433365 bcast: 0/0 autoq: 1.9/24.8

From the above Jacktrip logging on the Raspberry Pi, it was clear that thousands of packets per second were lost (first number in the “prot” section), which caused thousands of receive buffer underruns (first number in the “recv” section). We were unable to rectify the problem in the course of this project, but we suspect that it may be because the ENC28J60 does not easily support 10 Mbps full-duplex mode. We tried setting a switch in ENC28J60.c and using ethtool on the Raspberry Pi as


sudo ethtool -s eth0 speed 10 duplex half autoneg off

(and to reverse)


sudo ethtool -s eth0 autoneg on

but were unsuccessful in getting the ENC28J60 to connect as a full duplex device. The switching overhead is likely quite high, so it is unlikely that half duplex mode is appropriate for streaming.
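
As a rough check on the two Jacktrip log lines quoted earlier in this section, the sketch below computes how fast the first “prot” and “recv” numbers grow between them. It assumes those numbers are cumulative counters and uses the 3 second gap between the timestamps; the struct and function names are illustrative, and the sketch makes no claim about what unit Jacktrip uses for each counter.

```c
#include <stdint.h>

/* Counters copied from the two log lines (18:29:14 and 18:29:17). */
typedef struct {
    uint32_t second_of_minute;  /* seconds field of the timestamp     */
    uint32_t recv_first;        /* first number in the "recv" section */
    uint32_t prot_first;        /* first number in the "prot" section */
} log_sample_t;

static const log_sample_t LOG_A = { 14u, 2077u, 292302u };
static const log_sample_t LOG_B = { 17u, 3101u, 436462u };

/* Average increase per second of a counter between two readings. */
uint32_t rate_per_second(uint32_t first, uint32_t second, uint32_t seconds)
{
    return (second - first) / seconds;
}
```

Between the two lines the “prot” number grows by 144160 and the “recv” number by 1024, or roughly 48053 and 341 per second respectively.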

Local Loopback is Functional

If the recorded packet is simply played back locally, there is not much distortion, even when the network part of the program is running at full speed. This shows that our program is likely not CPU-bound and that the DMA-based audio sampling and network transmission lightens the CPU load enough.

Conclusions

Meeting Expectations

Our choice of networking hardware was not physically able to meet our specifications fully given our knowledge of it. However, we were able to unit test several functions and transmit intelligible speech and low frequency music over the internet to the PIC32. In that sense, the device met the expectation that we have something that can interface with existing Jacktrip computers. Our next step would be to try a stronger SPI Ethernet controller such as the ENC424J600 if we wanted to stay with the PIC32MX250F128B.

Intellectual Property Considerations

Code Licensing From Microchip

The Microchip network stack is licensable for use on Microchip microcontrollers, and our modifications to the library appear to be distributable in source form if we ensure that the users of our modified source code accept Microchip’s license to the Microchip Libraries for Applications and if we ensure that all future users know that the source code is modified from the original code. The ENC28J60 driver is compatible with that stack and is also licensable for ports to non-Microchip MCUs that interface with Microchip’s Ethernet controller. However, it is not acceptable to port the full network stack to a non-Microchip MCU.

Jacktrip Licensing

Jacktrip is available under an MIT-style license. We used its packet header struct in our code, but otherwise, simply blasted packets at the Raspberry Pi receiver and accepted the packets blasted at the ENC28J60.

Digital Rights Management

When making a device that can transmit signals over the internet, an important consideration is the rights of copyright holders. Our device implements no Digital Rights Management (DRM) strategies. Users of our device, especially on the full internet, will have to assume all risks and responsibilities under privacy and copyright laws. This is especially important as our proof of concept does not encrypt any data.

Aside: Jacktrip and FERPA

If we were to create an actual Jacktrip device, we may want to choose an MCU that has cryptographic primitives built into hardware so that the application has the best chance of being usable at institutions of learning. Alex Coy has a port of Jacktrip that encrypts the audio data using 256-bit AES in CBC mode. Jacktrip is fairly immune to the oracle attack weakness of AES-CBC mode because audio packets expire quickly and there is no way for an attacker to gain the necessary feedback from the server in a surreptitious way. There are several low-latency audio transport services advertised, but only Sonobus explicitly mentions that the connection is not encrypted. All other vendors advertise that the audio “gets through,” but make no mention of security or privacy. Plain Jacktrip as we have implemented in this proof of concept is not encrypted.

Patent Opportunities

The process of cramming uncompressed audio through a network is nothing new. Porting code from the open-source Jacktrip project is also not novel in a patent sense. We do not believe that there is anything patentable in this project.

Publishing Opportunities

This PIC32 proof of concept suggests that inexpensive, durable hardware can sustain the open-source Jacktrip program. We do not believe that anything is publishable at the moment, but Alex Coy may pursue an implementation on a beefier MCU such as the RP2040 or something like the MK64FN1M0VLL12, which has an Ethernet MAC and some cryptographic primitives built in. A better Ethernet controller such as the ENC424J600 would be essential as well.

Ethical Considerations

We have said several times that plain Jacktrip (which the PIC32 runs) does not encrypt its audio data. So, that poses a privacy issue that we hope that we have identified in the spirit of section I-1 of the IEEE Code of Ethics. Other than that, our device minimizes risk to the public by using low voltages which are attainable through an isolating power converter.
We do believe that it is ethical to work towards a world where more people have access to effective technology. Realtime audio transmission is certainly among that technology.

Legal Considerations

Our device is considered an unintentional radiator as it does not contain a radio transmitter. However, we should still avoid poor choices with high frequency SPI and Ethernet signals such that our device could pass FCC unintentional radiator certification if we ever solve the problem of basic functionality.

Appendix A

The group approves this report for inclusion on the course website. The group approves the demo video for inclusion on the course Youtube channel.

Source: PIC32 Realtime Network Audio
