SPI Explained Beyond MOSI and MISO

Embedded Systems Fundamentals #2
Most engineers learn SPI through a simple diagram:
Master                  Slave
  |-------- MOSI -------->|
  |<------- MISO ---------|
  |-------- SCLK -------->|
  |-------- CS ---------->|
A transfer begins. Bytes move. The transfer completes. Simple.
But inside the hardware, significantly more is happening. Clock generators are running. Shift registers are exchanging bits. Transmit and receive FIFOs are filling and draining. Interrupts may be firing. DMA engines may be moving data without CPU involvement. Understanding these mechanisms is critical when building high-performance firmware and device drivers.
This article explores what actually happens inside an SPI transfer.
Why SPI Exists
SPI (Serial Peripheral Interface) is a synchronous serial bus that typically runs at higher clock rates than UART or I2C links.
SPI offers:
Full-duplex communication
Higher throughput
Simple hardware implementation
Deterministic timing
Low protocol overhead
Common SPI devices include:
Flash memories
Displays
Sensors
ADCs
DACs
Wireless modules
Camera subsystems
Because of its simplicity and speed, SPI remains one of the most important communication interfaces in embedded systems.
The Application View
Most applications interact with SPI through a simple API.
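uint8_t tx_data[4] = {0x9F};  /* command byte (e.g. Read JEDEC ID on many SPI flashes); the rest default to 0x00 */
uint8_t rx_data[4];           /* receives one byte for every byte sent */
spi_transfer(tx_data, rx_data, 4);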
From the application's perspective, the flow is simple: call the SPI API, data transfers, receive the response. Underneath, however, several layers participate.
SPI Software Stack
A typical SPI transfer path looks like:
Application
↓
SPI Driver
↓
SPI Controller
↓
TX FIFO
↓
Shift Register
↓
SPI Bus
↓
Slave Device
↓
RX FIFO
↓
Driver
↓
Application
Each layer has a specific responsibility.
SPI Hardware Architecture
A modern SPI controller usually contains:
+-------------------+
| Control Registers |
+-------------------+
+-------------------+
| Clock Generator |
+-------------------+
+-------------------+
| TX FIFO |
+-------------------+
+-------------------+
| RX FIFO |
+-------------------+
+-------------------+
| Shift Register |
+-------------------+
+-------------------+
| Interrupt Logic |
+-------------------+
+-------------------+
| DMA Interface |
+-------------------+
These blocks work together to perform a transfer.
Understanding SPI Signals
SPI uses four primary signals.
SCLK - Serial Clock. Generated by the controller. Controls data timing.
MOSI - Master Out Slave In. Used to send data from controller to peripheral.
MISO - Master In Slave Out. Used to send data from peripheral to controller.
CS / SS - Chip Select (also called Slave Select). Usually active low. Activates a specific peripheral. Only the selected peripheral responds.
What Happens When a Transfer Starts?
Consider:
spi_transfer(tx_buffer, rx_buffer, 32);
The sequence typically looks like:
Driver validates parameters
Driver configures controller
Driver asserts Chip Select
Driver loads TX FIFO
Clock generation begins
Shift register starts transmission
RX FIFO receives data
Transfer completion event occurs
Driver deasserts Chip Select
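At a clock of a few megahertz, this entire sequence completes in tens of microseconds: 32 bytes are 256 bits, which take 25.6 µs on the wire at 10 MHz.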
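The steps above can be sketched as a polled transfer routine. This is a minimal sketch against a software model of a controller, not a real register map: `spi_model_t`, `FIFO_DEPTH`, `spi_transfer_polled`, and the pretend peripheral (which answers each byte with its bitwise inverse) are all assumptions made so the example is self-contained.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software model of an SPI controller; not a real register map. */
#define FIFO_DEPTH 8u

typedef struct {
    uint8_t tx_fifo[FIFO_DEPTH];
    uint8_t rx_fifo[FIFO_DEPTH];
    size_t  tx_count;   /* bytes waiting in the TX FIFO */
    size_t  rx_count;   /* bytes waiting in the RX FIFO */
    int     cs_active;  /* 1 while chip select is asserted */
} spi_model_t;

/* Models the shift register clocking out everything in the TX FIFO.
 * The pretend peripheral answers each byte with its bitwise inverse. */
static void model_clock_out(spi_model_t *spi)
{
    for (size_t i = 0; i < spi->tx_count; i++) {
        spi->rx_fifo[i] = (uint8_t)~spi->tx_fifo[i];
    }
    spi->rx_count = spi->tx_count;
    spi->tx_count = 0;
}

/* Returns 0 on success, -1 on invalid parameters. */
int spi_transfer_polled(spi_model_t *spi, const uint8_t *tx, uint8_t *rx, size_t len)
{
    if (spi == NULL || tx == NULL || rx == NULL || len == 0) {
        return -1;                               /* 1. validate parameters */
    }
    spi->tx_count = 0;                           /* 2. configure (reset FIFOs) */
    spi->rx_count = 0;
    spi->cs_active = 1;                          /* 3. assert chip select */

    size_t done = 0;
    while (done < len) {
        size_t chunk = len - done;
        if (chunk > FIFO_DEPTH) {
            chunk = FIFO_DEPTH;
        }
        for (size_t i = 0; i < chunk; i++) {     /* 4. load TX FIFO */
            spi->tx_fifo[i] = tx[done + i];
        }
        spi->tx_count = chunk;

        model_clock_out(spi);                    /* 5-6. clock runs, bits shift */

        for (size_t i = 0; i < spi->rx_count; i++) {  /* 7. drain RX FIFO */
            rx[done + i] = spi->rx_fifo[i];
        }
        spi->rx_count = 0;
        done += chunk;
    }
    spi->cs_active = 0;                          /* 8-9. done, deassert chip select */
    return 0;
}
```

On real hardware, steps 5 to 7 would be replaced by polling a status register until the controller reports that the FIFO has drained.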
The Role of the Shift Register
The shift register is the heart of SPI communication.
Example:
Transmit Byte
0xA5
10100101
The shift register shifts one bit every clock cycle, most significant bit first on most devices.
At the same time:
The controller drives one bit onto MOSI
The controller samples one bit from MISO
This is why SPI is full duplex.
Data moves in both directions simultaneously.
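The exchange can be modeled in a few lines. This is a sketch, assuming 8-bit frames sent MSB first; `spi_exchange_byte` is a hypothetical name, and each loop iteration stands in for one full SCLK cycle.

```c
#include <stdint.h>

/* Simulates one 8-bit SPI frame, MSB first. Each loop iteration stands for
 * one SCLK cycle: both sides drive their top bit, then both shift in the
 * bit driven by the other side. */
void spi_exchange_byte(uint8_t *controller_sr, uint8_t *peripheral_sr)
{
    for (int clk = 0; clk < 8; clk++) {
        uint8_t mosi = (uint8_t)((*controller_sr >> 7) & 1u); /* controller drives MOSI */
        uint8_t miso = (uint8_t)((*peripheral_sr >> 7) & 1u); /* peripheral drives MISO */

        *controller_sr = (uint8_t)((*controller_sr << 1) | miso);
        *peripheral_sr = (uint8_t)((*peripheral_sr << 1) | mosi);
    }
}
```

After eight clocks the two registers have swapped contents: the controller's 0xA5 sits in the peripheral, and whatever the peripheral held sits in the controller.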
Why SPI is Full Duplex
Unlike UART, where each direction is an independent one-way link:
TX --> RX
SPI operates as:
Master <----> Slave
Every clock cycle shifts one bit out and one bit in, so every transmitted bit is paired with a received bit.
Even when reading from a device, dummy bytes are often transmitted.
Example:
MOSI: 0x00 0x00 0x00
MISO: DATA DATA DATA
Clock pulses are required to receive data.
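A read frame of this shape can be built with a small helper. This is a sketch, assuming a command/response device that takes one command byte and ignores what follows; `spi_build_read_frame` and `SPI_DUMMY_BYTE` are hypothetical names, and some devices expect 0xFF rather than 0x00 as the dummy value.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SPI_DUMMY_BYTE 0x00u  /* assumed filler value */

/* Fills tx with one command byte followed by reply_len dummy bytes, so the
 * controller keeps clocking while the peripheral answers on MISO.
 * Returns the total frame length, or 0 if the buffer is missing or too small. */
size_t spi_build_read_frame(uint8_t cmd, size_t reply_len, uint8_t *tx, size_t tx_size)
{
    /* Need reply_len + 1 bytes; written this way to avoid overflow. */
    if (tx == NULL || reply_len >= tx_size) {
        return 0;
    }
    tx[0] = cmd;
    memset(&tx[1], SPI_DUMMY_BYTE, reply_len);
    return reply_len + 1u;
}
```

With command 0x9F and three reply bytes, the frame is 9F 00 00 00. The reply then lands in rx[1] through rx[3]; rx[0] was clocked in while the command was still going out and is usually ignored.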
Understanding CPOL and CPHA
CPOL and CPHA are among the most confusing SPI concepts.
CPOL - Clock Polarity.
Determines idle clock state.
CPOL = 0
____|‾|____|‾|____
CPOL = 1
‾‾‾|_|‾‾‾|_|‾‾‾
CPHA - Clock Phase.
Determines when data is sampled.
CPHA = 0
Sample on first edge
CPHA = 1
Sample on second edge
Combining CPOL and CPHA creates four SPI modes.
SPI Modes
Mode 0
CPOL = 0
CPHA = 0
Mode 1
CPOL = 0
CPHA = 1
Mode 2
CPOL = 1
CPHA = 0
Mode 3
CPOL = 1
CPHA = 1
Master and peripheral must use the same mode. Otherwise data corruption occurs.
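The mode table reduces to two small functions. This is a sketch with hypothetical names (`spi_mode_number`, `spi_samples_on_rising_edge`); it follows the common numbering where the mode is CPOL in bit 1 and CPHA in bit 0.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mode number as conventionally numbered: mode = (CPOL << 1) | CPHA. */
uint8_t spi_mode_number(bool cpol, bool cpha)
{
    return (uint8_t)(((cpol ? 1u : 0u) << 1) | (cpha ? 1u : 0u));
}

/* True if data is sampled on the rising edge of SCLK.
 * CPOL = 0: the clock idles low, so the first edge is rising.
 * CPOL = 1: the clock idles high, so the first edge is falling.
 * CPHA = 0 samples on the first edge, CPHA = 1 on the second. */
bool spi_samples_on_rising_edge(bool cpol, bool cpha)
{
    return cpol == cpha;
}
```

Modes 0 and 3 therefore sample on the rising edge, and modes 1 and 2 on the falling edge. That is why a device specified for mode 0 often also works in mode 3.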
TX and RX FIFOs
Modern controllers contain FIFOs.
Example:
TX FIFO
+------+
| 0x11 |
| 0x22 |
| 0x33 |
| 0x44 |
+------+
Benefits:
Reduces CPU overhead
Improves throughput
Minimizes interrupt frequency
Without FIFOs, software would need to service every byte individually.
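In software terms, a hardware FIFO behaves like a small ring buffer. This is a sketch of that behavior, not of any particular controller: `spi_fifo_t`, the function names, and the depth of 4 are assumptions chosen to match the diagram.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SPI_FIFO_DEPTH 4u  /* assumed depth; real controllers vary */

typedef struct {
    uint8_t data[SPI_FIFO_DEPTH];
    size_t  head;   /* index of the oldest entry */
    size_t  count;  /* number of stored entries */
} spi_fifo_t;

/* Appends a byte. Returns false if the FIFO is already full. */
bool spi_fifo_push(spi_fifo_t *f, uint8_t byte)
{
    if (f->count == SPI_FIFO_DEPTH) {
        return false;  /* full: a hardware RX FIFO would flag an overrun here */
    }
    f->data[(f->head + f->count) % SPI_FIFO_DEPTH] = byte;
    f->count++;
    return true;
}

/* Removes the oldest byte. Returns false if the FIFO is empty. */
bool spi_fifo_pop(spi_fifo_t *f, uint8_t *byte)
{
    if (f->count == 0) {
        return false;
    }
    *byte = f->data[f->head];
    f->head = (f->head + 1u) % SPI_FIFO_DEPTH;
    f->count--;
    return true;
}
```

Software can push up to four bytes in one burst and then leave the shift register to pop them one at a time, which is where the reduction in interrupts comes from.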
Interrupt Driven SPI
Polling is simple but inefficient.
Instead:
Application
↓
Driver
↓
FIFO Threshold
↓
Interrupt
↓
ISR
↓
Load More Data
Interrupts allow the CPU to perform other work while the transfer progresses.
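The "Load More Data" step might look like the handler below. This is a sketch under stated assumptions: `spi_irq_state_t` is a hypothetical driver state, the `fifo` array stands in for the hardware TX FIFO, and a real ISR would take no argument and write to a data register instead.

```c
#include <stddef.h>
#include <stdint.h>

#define TX_FIFO_DEPTH 8u  /* assumed depth */

/* Hypothetical driver state shared between the API call and the ISR. */
typedef struct {
    const uint8_t *tx_buf;               /* next byte to queue */
    size_t         remaining;            /* bytes not yet written to the FIFO */
    uint8_t        fifo[TX_FIFO_DEPTH];  /* stand-in for the hardware TX FIFO */
    size_t         fifo_level;           /* bytes currently in the FIFO */
    int            irq_enabled;          /* 1 while the threshold IRQ is armed */
} spi_irq_state_t;

/* Runs when the FIFO level drops to the threshold. Tops the FIFO up and
 * disables the interrupt once there is nothing left to queue. */
void spi_tx_threshold_isr(spi_irq_state_t *s)
{
    while (s->remaining > 0 && s->fifo_level < TX_FIFO_DEPTH) {
        s->fifo[s->fifo_level++] = *s->tx_buf++;
        s->remaining--;
    }
    if (s->remaining == 0) {
        s->irq_enabled = 0;  /* otherwise an empty FIFO would keep firing */
    }
}
```

For a 10-byte transfer with this depth, the handler runs twice: once to queue 8 bytes and once to queue the last 2. Polling would have kept the CPU busy for the whole transfer.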
DMA Driven SPI
For large transfers, DMA becomes essential.
Example:
CPU
|
| Configure DMA
|
v
DMA Engine
|
v
TX FIFO
|
v
SPI Bus
Advantages:
Higher throughput
Reduced CPU utilization
Better power efficiency
DMA is commonly used for:
Displays
Camera interfaces
External flash memory
Large data streams
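The "Configure DMA" step usually means filling in a descriptor and checking it against the engine's limits. This is a sketch only: `dma_descriptor_t`, `dma_prepare_tx`, the 4096-byte maximum, and the 4-byte alignment are all hypothetical, since every DMA engine documents its own layout and constraints.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limits; real DMA engines document their own. */
#define DMA_MAX_LEN   4096u
#define DMA_ALIGNMENT 4u

typedef struct {
    const void *src;      /* memory buffer to read from */
    uintptr_t   dst_reg;  /* address of the SPI TX data register */
    uint32_t    length;   /* bytes to move */
} dma_descriptor_t;

/* Returns 0 and fills desc on success, -1 if the request breaks a limit. */
int dma_prepare_tx(dma_descriptor_t *desc, const void *buf, size_t len,
                   uintptr_t spi_tx_reg)
{
    if (desc == NULL || buf == NULL || len == 0 || len > DMA_MAX_LEN) {
        return -1;  /* bad pointer or unsupported length */
    }
    if (((uintptr_t)buf % DMA_ALIGNMENT) != 0) {
        return -1;  /* misaligned buffer */
    }
    desc->src = buf;
    desc->dst_reg = spi_tx_reg;
    desc->length = (uint32_t)len;
    return 0;
}
```

Rejecting a bad length or alignment here is cheaper than debugging a transfer that silently stops short.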
Multi-Slave SPI Systems
One controller can communicate with multiple peripherals.
Sensor
|
|
Controller----Flash
|
|
Display
Separate Chip Select lines determine which peripheral is active.
Only the selected device participates in the transfer; the others ignore the clock and leave MISO undriven.
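Selecting one device out of three can be expressed as a bit pattern. This is a sketch, assuming active-low chip selects wired to bits 0 to 2 of a GPIO port; `cs_pattern` and `NUM_CS_LINES` are hypothetical names.

```c
#include <stdint.h>

#define NUM_CS_LINES 3u  /* e.g. sensor, flash, display */

/* Returns the GPIO output pattern for active-low chip selects: every line
 * high (idle) except the selected one. An out-of-range index deselects all. */
uint8_t cs_pattern(unsigned selected)
{
    uint8_t all_idle = (uint8_t)((1u << NUM_CS_LINES) - 1u);
    if (selected >= NUM_CS_LINES) {
        return all_idle;
    }
    return (uint8_t)(all_idle & ~(1u << selected));
}
```

Computing the whole pattern at once guarantees that at most one line is low, so two peripherals never drive MISO at the same time.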
Common SPI Problems
Wrong SPI Mode - The most common issue. Symptoms include corrupted data and seemingly random failures.
Incorrect Clock Speed - Peripheral may not support configured frequency.
FIFO Overflows - Software or DMA does not drain the RX FIFO fast enough, so received bytes are lost.
DMA Configuration Errors - Incorrect transfer lengths or alignment.
Chip Select Timing - Improper assertion or deassertion can break communication.
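The clock speed problem is often avoided by deriving the divider from the peripheral's rated maximum instead of hard-coding it. This is a sketch, assuming a controller with power-of-two dividers from 2 to 256; that range and the name `spi_pick_divider` are assumptions, not a property of any specific part.

```c
#include <stdint.h>

/* Picks the smallest power-of-two divider (2..256) that keeps SCLK at or
 * below max_hz. Returns 0 if even the largest divider is too fast. */
uint32_t spi_pick_divider(uint32_t src_hz, uint32_t max_hz)
{
    for (uint32_t div = 2; div <= 256; div *= 2) {
        /* Same as src_hz / div <= max_hz, but without integer truncation. */
        if ((uint64_t)src_hz <= (uint64_t)max_hz * div) {
            return div;
        }
    }
    return 0;
}
```

With a 48 MHz source clock and a peripheral rated for 10 MHz, this returns 8, giving a 6 MHz SCLK. A divider of 4 would give 12 MHz and exceed the rating.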
Performance Considerations
The theoretical throughput of an SPI link, one bit per clock cycle, is rarely achieved in practice.
Performance depends on:
FIFO depth
Interrupt latency
DMA efficiency
Bus contention
Peripheral response time
Driver implementation
A well-designed driver often matters as much as hardware speed.
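A rough estimate makes the point concrete. This is a sketch of a simple model, not a measurement: it assumes software adds a fixed gap after every chunk of bytes (for example, one FIFO refill per chunk), and `spi_transfer_time_ns` and its parameters are hypothetical.

```c
#include <stdint.h>

/* Estimated time in nanoseconds to move len bytes at sclk_hz when software
 * adds overhead_ns after every chunk of chunk_len bytes.
 * Returns 0 for a zero clock or zero chunk length.
 * The intermediate product fits in 64 bits for len below about 2 GB. */
uint64_t spi_transfer_time_ns(uint32_t len, uint32_t sclk_hz,
                              uint32_t chunk_len, uint32_t overhead_ns)
{
    if (sclk_hz == 0 || chunk_len == 0) {
        return 0;
    }
    uint64_t wire_ns = ((uint64_t)len * 8u * 1000000000u) / sclk_hz;
    uint64_t chunks  = ((uint64_t)len + chunk_len - 1u) / chunk_len; /* round up */
    return wire_ns + chunks * overhead_ns;
}
```

Take 32 bytes at 10 MHz with an assumed 2 µs of software overhead per refill. Servicing every byte individually gives 25.6 µs on the wire plus 64 µs of overhead, 89.6 µs in total. Refilling 8 bytes at a time cuts that to 33.6 µs on the same hardware.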
Putting Everything Together
The next time you call:
spi_transfer(tx, rx, len);
Remember the actual flow:
Application
↓
SPI Driver
↓
Controller Registers
↓
TX FIFO
↓
Shift Register
↓
Clock Generation
↓
SPI Bus
↓
Peripheral
↓
RX FIFO
↓
Driver
↓
Application
What appears to be a simple API call is actually a coordinated interaction between software, hardware, FIFOs, clocks, interrupts, DMA engines, and timing logic.
Understanding this flow is the foundation of building reliable and high-performance embedded systems.
What's Next?
Part 3:
Understanding I2C From a Driver Engineer's Perspective
We'll explore:
Open-drain signaling
Pull-up resistors
Addressing
Arbitration
Clock stretching
Interrupt handling
DMA support
Multi-controller systems
Real-world debugging techniques


