Vehicle Protocol
Architecture
CAN → LIN → FlexRay → DoIP as a single evolving narrative. Each protocol is an industry answer to a specific engineering constraint — cost, bandwidth, determinism, or remote access. Understanding why each exists is the difference between memorizing specs and having genuine architectural intuition.
Who this is for: New engineers entering the electric vehicle (EV) and software-defined vehicle (SDV) industry. Use this as your protocol foundation before diving into security requirements and OTA/diagnostics implementation.
The Master Narrative
The best way to understand this protocol stack is not as four separate specs to memorize, but as a single evolving answer to one recurring question: how do you move data reliably between dozens of computers inside a vehicle, given hard constraints on weight, cost, noise, timing, and bandwidth?
Each protocol represents an industry decision made at a specific moment in automotive history, solving a problem the previous protocol couldn't handle. Understanding why each wasn't fully replaced by its successor gives you genuine systems thinking — the kind that surfaces in architecture interviews.
Software-Defined Vehicle (SDV) Context
In a software-defined vehicle, features and behavior are increasingly determined by software (OTA updates, new functions, domain controllers) rather than fixed hardware. The in-vehicle network is the backbone: domain or zonal ECUs talk over CAN, LIN, or Ethernet; a central gateway connects to the cloud for OTA and diagnostics (DoIP over Ethernet). Knowing which protocol is used where — and why — is essential for SDV roles in architecture, integration, manufacturing, and security.
This guide focuses on the protocol layer. For security protocols and requirements (SecOC, secure boot, UN R155/ISO 21434, key provisioning), see the companion EV Cybersecurity Engineering Guide.
Historical Context
Each protocol solved the problem that its predecessor couldn't. The timeline is the narrative.
CAN: Mercedes-Benz approached Bosch because luxury vehicles were carrying about 2km of copper wiring. With point-to-point connections, every new feature needed its own dedicated wires, so harness cost and weight climbed steeply. CAN replaced that spider's web of wires with a single two-wire bus shared by all ECUs. Production vehicles have used it since 1991 (Mercedes W140 S-Class).
LIN: CAN was too expensive for seat motors, mirror controls, and window switches. LIN uses a standard UART peripheral, already on every cheap microcontroller, over a single wire. Cost per node dropped 3–5x. LIN didn't replace CAN; it extended below it for non-safety-critical leaf nodes.
FlexRay: Drive-by-wire required guaranteed, microsecond-precise message delivery that CAN's event-driven arbitration could never provide. FlexRay introduced time-triggered static slots: every safety-critical message transmits at a pre-scheduled moment, every cycle, unconditionally. The 2006 BMW X5 was the first production deployment.
DoIP: A 50MB ECU firmware update over CAN takes ~15 minutes per ECU. Modern vehicles have 70–100 ECUs. OTA at scale is impossible without Ethernet. DoIP wraps standard UDS diagnostic payloads in TCP/IP, delivering them over 100Mbit/s–1Gbit/s automotive Ethernet. The same UDS commands work, with roughly 200x the raw bandwidth of a 500 kbit/s CAN bus.
Protocol Comparison
| Protocol | Speed | Topology | Wires | Payload | Scheduling | Primary Use |
|---|---|---|---|---|---|---|
| CAN 2.0 | 1 Mbit/s | Multi-master bus | 2 (differential) | 8 bytes | Event-driven, arbitration | Powertrain, chassis, body |
| CAN FD | 8 Mbit/s | Multi-master bus | 2 (differential) | 64 bytes | Event-driven, arbitration | High-bandwidth ECUs, OTA prep |
| LIN 2.2A | 20 kbit/s | 1 master + up to 15 slaves | 1 + ground | 8 bytes | Master schedule table | Seats, mirrors, doors, lighting |
| FlexRay | 10 Mbit/s/ch | Multi-master, dual-channel | 2 per channel | 254 bytes | Time-triggered static + dynamic | Drive-by-wire, safety-critical sync |
| DoIP | 100M–1Gbit/s | Star (Ethernet switches) | 1 pair (100BASE-T1) | 64KB UDS | TCP/IP (connection-oriented) | OTA flashing, remote diagnostics |
Before CAN, every ECU that needed to talk to another had a dedicated point-to-point wire. A 1980s luxury vehicle had over 2km of copper wiring in its harness — heavier than the engine block in some configurations. Adding a new feature meant designing new wiring for every vehicle variant.
CAN replaced this with a shared serial bus: two wires running through the vehicle, and every ECU connects to those same two wires. Any ECU can broadcast a message; every ECU on the bus hears it and decides whether to act based on the message ID.
CAN's addressing model closely resembles a Kafka topic with multiple producers and consumers. The message ID plays the role of the topic. Producers broadcast; consumers filter by ID. There is no concept of "send this to ECU #7"; there is only "publish message 0x0C4, and whoever cares about 0x0C4 will consume it." If you've designed Kafka consumer groups, you already understand CAN's addressing philosophy.
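To make the filter-by-ID idea concrete, here is a minimal sketch of mask-based acceptance filtering, the usual way a receiver decides which IDs to consume. The ECU names and filter values in `SUBSCRIPTIONS` are hypothetical and chosen only for illustration.

```python
def accepts(can_id: int, filter_id: int, mask: int) -> bool:
    """Return True if can_id passes the filter; only bits set in mask are compared."""
    return (can_id & mask) == (filter_id & mask)


# Hypothetical subscriptions: each receiver lists (filter_id, mask) pairs.
SUBSCRIPTIONS = {
    "cluster": [(0x0C4, 0x7FF)],   # exact match on ID 0x0C4
    "body_ecu": [(0x200, 0x700)],  # any ID in 0x200..0x2FF
    "logger": [(0x000, 0x000)],    # mask 0 compares nothing, so every ID passes
}


def consumers_of(can_id: int) -> list:
    """Names of all receivers whose filters accept this broadcast ID."""
    return sorted(
        name
        for name, filters in SUBSCRIPTIONS.items()
        if any(accepts(can_id, f, m) for f, m in filters)
    )


print(consumers_of(0x0C4))
print(consumers_of(0x2A1))
```

The sender never names a destination; the set of consumers is decided entirely by the receivers' filters.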
CAN Frame Structure
Every message on a CAN bus follows this exact bit layout. Knowing it cold is a baseline expectation for any protocol-facing role.
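As a reference sketch, the field widths of a classic CAN data frame with an 11-bit identifier can be written down and summed. Bit stuffing is deliberately ignored here, so real frames on the wire are somewhat longer. The function name `frame_bits` is an illustrative choice.

```python
# Field widths in bits for a classic CAN data frame (11-bit identifier).
# A width of None marks the data field, whose size depends on the DLC.
STANDARD_FRAME_FIELDS = [
    ("SOF", 1),
    ("Identifier", 11),
    ("RTR", 1),
    ("IDE", 1),
    ("r0", 1),
    ("DLC", 4),
    ("Data", None),
    ("CRC", 15),
    ("CRC delimiter", 1),
    ("ACK slot", 1),
    ("ACK delimiter", 1),
    ("EOF", 7),
]
INTERFRAME_SPACE_BITS = 3


def frame_bits(data_len: int, include_ifs: bool = True) -> int:
    """Unstuffed bit count of a classic CAN frame carrying data_len bytes."""
    if not 0 <= data_len <= 8:
        raise ValueError("classic CAN carries 0 to 8 data bytes")
    total = sum(8 * data_len if width is None else width
                for _, width in STANDARD_FRAME_FIELDS)
    return total + (INTERFRAME_SPACE_BITS if include_ifs else 0)


print(frame_bits(8))  # full 8-byte frame including interframe space
```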
Non-Destructive Bitwise Arbitration
This is CAN's most elegant engineering decision. When two ECUs transmit simultaneously, there is no collision in the Ethernet sense. The lower-ID message wins — and crucially, it is not corrupted and does not need retransmission.
Ethernet CSMA/CD: both frames are destroyed on collision; both must retransmit. CAN CSMA/CR: only the loser backs off, and the winner's frame is never corrupted. This non-destructive arbitration is why CAN can bound worst-case latency for its highest-priority messages: the top-priority ID waits at most for the frame already on the bus, and if N frames are queued at the same instant, the lowest-priority one waits at most (N-1) frame durations, provided no higher-priority frame is queued again in the meantime.
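The arbitration rule can be sketched as a small simulation. It assumes all nodes start transmitting in the same bit time, that identifiers are unique, and that the bus behaves as a wired-AND (a dominant 0 overrides a recessive 1). The function name `arbitrate` is illustrative.

```python
def arbitrate(ids, id_bits=11):
    """Return the identifier that wins arbitration among simultaneous senders."""
    contenders = set(ids)
    for bit in range(id_bits - 1, -1, -1):  # identifiers are sent MSB first
        # Wired-AND bus: if any contender sends dominant (0), the bus reads 0.
        bus_level = min((i >> bit) & 1 for i in contenders)
        # A node that sent recessive (1) but reads dominant (0) stops sending.
        contenders = {i for i in contenders if ((i >> bit) & 1) == bus_level}
    (winner,) = contenders  # unique IDs leave exactly one node transmitting
    return winner


print(hex(arbitrate([0x0C4, 0x1A0, 0x0C5])))
```

The winner keeps transmitting without ever noticing the contest, which is why no retransmission is needed for its frame.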
Error Handling & Fault Confinement
CAN's fault confinement is implemented entirely in hardware, not in software. Every CAN controller maintains two counters: Transmit Error Counter (TEC) and Receive Error Counter (REC). These govern a three-state machine: Error Active, Error Passive, and Bus Off.
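A minimal sketch of how the two counters map to the three fault confinement states, using the standard thresholds (error passive above 127, bus off when TEC exceeds 255). The helper `state_after_tx_errors` is a simplification that assumes every transmission fails and adds 8 to TEC, with no successful frames in between.

```python
from enum import Enum


class NodeState(Enum):
    ERROR_ACTIVE = "error active"
    ERROR_PASSIVE = "error passive"
    BUS_OFF = "bus off"


def node_state(tec: int, rec: int) -> NodeState:
    """Map the transmit/receive error counters to a fault confinement state."""
    if tec > 255:
        return NodeState.BUS_OFF
    if tec > 127 or rec > 127:
        return NodeState.ERROR_PASSIVE
    return NodeState.ERROR_ACTIVE


def state_after_tx_errors(n: int) -> NodeState:
    """State after n consecutive failed transmissions (TEC += 8 each, simplified)."""
    return node_state(tec=8 * n, rec=0)


for n in (15, 16, 32):
    print(n, state_after_tx_errors(n).value)
```

Under these assumptions, a node that only ever sends malformed frames removes itself from the bus after 32 failed attempts.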
In manufacturing end-of-line testing, a Bus Off event on a factory tester CAN port is a common failure mode when a new ECU is first connected and sends malformed frames. Your monitoring system needs to catch Bus Off transitions: they won't surface as application errors, only as hardware counter events in the CAN controller's status register. Plan B's tooling (python-can) exposes the controller state through the bus.state attribute, although how much detail it reports depends on the interface driver.
LIN was not invented to replace CAN. It was invented to solve a different problem: CAN is too expensive for actuators that don't need it.
A seat motor moves slowly, transmits a few bytes per second, and has no safety-critical timing requirements. A full CAN controller with differential transceivers costs $3–5 per node in silicon alone. LIN uses a standard UART peripheral, which virtually every low-cost microcontroller already includes, over a single wire.
LIN's schedule table is the network equivalent of a PLC scan cycle. The master has a fixed program: at every time slot, it solicits a specific frame from a specific slave. No slave ever initiates. No arbitration. No surprises. The behavior of the entire LIN network is completely predictable from reading the schedule table alone — making it trivial to simulate for factory end-of-line testing.
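The predictability claim can be illustrated with a short sketch that expands a schedule table into a timeline. The frame IDs, node names, and 10 ms slot lengths in `SCHEDULE` are hypothetical.

```python
from itertools import cycle, islice

# Hypothetical schedule table: (frame_id, publisher, slot length in ms).
SCHEDULE = [
    (0x10, "master", 10),
    (0x21, "seat_module", 10),
    (0x22, "mirror_module", 10),
]


def timeline(schedule, n_slots):
    """Expand the repeating schedule into (start_ms, frame_id, publisher) tuples."""
    t = 0
    out = []
    for frame_id, publisher, slot_ms in islice(cycle(schedule), n_slots):
        out.append((t, frame_id, publisher))
        t += slot_ms
    return out


for start_ms, frame_id, publisher in timeline(SCHEDULE, 4):
    print(f"{start_ms:3d} ms  header 0x{frame_id:02X}  response from {publisher}")
```

Because the master is the only node that starts a frame, the whole bus timeline follows from the table with no runtime input.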
LIN Frame Structure & Schedule Table
Frame IDs 0x3C (master request) and 0x3D (slave response) are reserved in LIN 2.x for diagnostics; they allow UDS-over-LIN for ECU configuration at the factory. This is how body domain ECUs (seat modules, mirror controllers) get their variant coding during manufacturing end-of-line testing, even though they live on a LIN bus with no direct Ethernet connection.
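On the wire, a LIN header carries a protected identifier: the 6-bit frame ID plus two parity bits. The sketch below computes it, treating 0x3C and 0x3D as 6-bit frame IDs. The function name `lin_pid` is illustrative.

```python
def lin_pid(frame_id: int) -> int:
    """Protected identifier: 6-bit frame ID with parity bits P0 (bit 6) and P1 (bit 7)."""
    if not 0 <= frame_id <= 0x3F:
        raise ValueError("LIN frame IDs are 6 bits wide")
    b = [(frame_id >> i) & 1 for i in range(6)]
    p0 = b[0] ^ b[1] ^ b[2] ^ b[4]
    p1 = 1 ^ (b[1] ^ b[3] ^ b[4] ^ b[5])  # P1 is the inverted XOR
    return frame_id | (p0 << 6) | (p1 << 7)


for fid in (0x3C, 0x3D):
    print(f"frame ID 0x{fid:02X} -> protected ID 0x{lin_pid(fid):02X}")
```

A bus trace shows the protected value, so the slave response frame appears as 0x7D rather than 0x3D.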
By the early 2000s, engineers were designing drive-by-wire systems — brake-by-wire, steer-by-wire — with no mechanical fallback. CAN's event-driven arbitration is fundamentally unsuitable for this. In CAN, a high-priority message might be delayed by microseconds or milliseconds depending on bus load — non-deterministic. You cannot certify a steering system with non-deterministic latency under ISO 26262 ASIL-D.
If the steering angle sensor ECU sends "turn left 15°" and the steering actuator ECU receives it 1.2ms late instead of 0.8ms late — that 0.4ms variance is the problem. At highway speeds, that inconsistency compounds across control loops. FlexRay eliminates variance by guaranteeing the message transmits at a pre-scheduled, hardware-enforced time slot, every cycle, unconditionally.
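A minimal sketch of why static slots remove variance: the transmit instant is a pure function of the cycle number and slot number. The 5 ms cycle and 50 µs slot length are assumed values for illustration, not figures from a specific network design.

```python
def static_slot_start_us(cycle_index, slot_id, cycle_us=5000, slot_us=50):
    """Start time (µs) of 1-based static slot slot_id in 0-based cycle cycle_index."""
    if slot_id < 1:
        raise ValueError("static slots are numbered from 1")
    return cycle_index * cycle_us + (slot_id - 1) * slot_us


# The same slot in four consecutive cycles.
starts = [static_slot_start_us(c, slot_id=3) for c in range(4)]
intervals = [b - a for a, b in zip(starts, starts[1:])]
print(starts)
print(intervals)
```

Bus load does not appear anywhere in the calculation, so the interval between successive transmissions of the same message is constant.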
The FlexRay Communication Cycle
FlexRay's static schedule must be computed offline using specialized tooling, making network design rigid. Every node needs a dedicated FlexRay controller chip. And 10 Mbit/s became insufficient once ADAS pushed bandwidth requirements into the gigabit range. Automotive Ethernet with TSN (IEEE 802.1AS/Qbv) now provides microsecond-accurate time-triggered scheduling at 100Mbit/s–1Gbit/s, making FlexRay architecturally redundant for new designs. You will encounter FlexRay on vehicles from 2006–2020 and must understand it to maintain and test those platforms — but Rivian's architecture is Ethernet-native.
DoIP is the solution to one specific, quantifiable problem: OTA at scale is impossible over CAN. A 50MB ECU firmware image over CAN at 500 kbit/s (with ISO-TP overhead) takes approximately 15–20 minutes per ECU. A modern vehicle has 70–100 ECUs, so a full vehicle software update flashed sequentially over CAN would take on the order of a day or more (70 ECUs × 15 minutes is already about 17.5 hours). Over 100Mbit/s automotive Ethernet, that same 50MB takes under 5 seconds of raw transfer time.
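The figures above can be checked with back-of-the-envelope arithmetic. The sketch computes raw transfer time only; the `efficiency` parameter is a placeholder for protocol overhead, and no particular overhead value is asserted here.

```python
def transfer_seconds(size_bytes, bitrate_bps, efficiency=1.0):
    """Time to move size_bytes at bitrate_bps, scaled by a payload efficiency factor."""
    return size_bytes * 8 / (bitrate_bps * efficiency)


IMAGE_BYTES = 50 * 10**6  # 50 MB, decimal megabytes assumed

can_raw = transfer_seconds(IMAGE_BYTES, 500_000)      # classic CAN at 500 kbit/s
eth_raw = transfer_seconds(IMAGE_BYTES, 100_000_000)  # 100 Mbit/s Ethernet

print(f"CAN raw:      {can_raw:.0f} s ({can_raw / 60:.1f} min)")
print(f"Ethernet raw: {eth_raw:.0f} s")
print(f"Ratio:        {can_raw / eth_raw:.0f}x")
```

The raw CAN figure is a lower bound; frame and transport overhead push the real number toward the 15–20 minute range quoted above.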
The critical insight: DoIP does not change UDS. The same service IDs, the same session states, the same SecurityAccess seed/key — all identical. DoIP only replaces the transport layer beneath UDS, swapping ISO-TP over CAN for TCP over Ethernet. Everything above that layer is unchanged.
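To show how thin the DoIP layer is, here is a sketch that wraps an unchanged UDS request in a DoIP diagnostic message (generic header plus source and target logical addresses). The logical addresses 0x0E00 and 0x1001 are hypothetical, and protocol version 0x02 is assumed.

```python
import struct

DOIP_VERSION = 0x02          # assumed protocol version byte
DIAGNOSTIC_MESSAGE = 0x8001  # DoIP payload type for a diagnostic message


def doip_frame(payload_type: int, payload: bytes, version: int = DOIP_VERSION) -> bytes:
    """Prefix payload with the 8-byte header: version, inverse version, type, length."""
    header = struct.pack("!BBHI", version, version ^ 0xFF, payload_type, len(payload))
    return header + payload


def diagnostic_message(source: int, target: int, uds: bytes) -> bytes:
    """Wrap raw UDS bytes with source and target logical addresses."""
    return doip_frame(DIAGNOSTIC_MESSAGE, struct.pack("!HH", source, target) + uds)


uds_request = bytes([0x22, 0xF1, 0x90])  # ReadDataByIdentifier, DID 0xF190
msg = diagnostic_message(0x0E00, 0x1001, uds_request)
print(msg.hex())
```

The last three bytes of the frame are the UDS request exactly as it would appear inside an ISO-TP message on CAN.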
ISO-TP (ISO 15765-2) is the transport used for UDS over CAN: it segments payloads larger than 7 bytes into multiple CAN frames (First Frame, Consecutive Frames) with flow control. When diagnostics run over DoIP, TCP carries the same UDS payloads without segmentation limits (up to 64KB). Understanding ISO-TP helps when debugging gateway behavior (DoIP ↔ CAN translation) or when working with legacy CAN-only ECUs.
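The segmentation rule can be sketched for classic CAN with normal addressing. This simplified version emits only the sender's frames: it omits padding and the receiver's Flow Control frame, and it caps the payload at the 12-bit length limit of 4095 bytes.

```python
from typing import List


def isotp_segment(payload: bytes) -> List[bytes]:
    """Split a UDS payload into ISO-TP frame bodies (sender side only)."""
    n = len(payload)
    if not 1 <= n <= 4095:
        raise ValueError("this sketch handles 1 to 4095 bytes")
    if n <= 7:
        return [bytes([n]) + payload]  # Single Frame: PCI 0x0N
    # First Frame: PCI 0x1 plus 12-bit total length, then the first 6 data bytes.
    frames = [bytes([0x10 | (n >> 8), n & 0xFF]) + payload[:6]]
    seq = 1
    for offset in range(6, n, 7):
        # Consecutive Frame: PCI 0x2 plus a 4-bit sequence number that wraps.
        frames.append(bytes([0x20 | seq]) + payload[offset:offset + 7])
        seq = (seq + 1) & 0x0F
    return frames


for frame in isotp_segment(bytes(range(20))):
    print(frame.hex(" "))
```

A 20-byte payload becomes one First Frame and two Consecutive Frames, while a 3-byte request fits in a single frame.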
DoIP Session Establishment Sequence
This sequence must complete successfully before a single UDS byte can flow. Failures here are among the most common integration issues in OTA development. Your Python OTA client must implement all seven steps.
The gateway's address translation table, which maps DoIP logical addresses to CAN bus addresses, is itself a software artifact that must be maintained as vehicles gain new ECUs via OTA. A DoIP routing activation response carrying code 0x00 (routing activation denied, unknown source address) is one of the most common integration failures during OTA testing, and it means the gateway's routing table doesn't include your tester's source address. This is managed at the vehicle configuration layer, not the protocol layer.
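A sketch of the routing activation exchange at the byte level: building the request and pulling the response code out of the reply. The tester address 0x0E00 and gateway address 0x1000 are hypothetical, protocol version 0x02 is assumed, and only the success code is named; other codes should be looked up in ISO 13400-2.

```python
import struct

ROUTING_ACTIVATION_REQUEST = 0x0005
ROUTING_ACTIVATION_RESPONSE = 0x0006
ROUTING_ACTIVATED = 0x10  # response code for successful activation


def routing_activation_request(source: int, activation_type: int = 0x00,
                               version: int = 0x02) -> bytes:
    """Build a request: source address, activation type, 4 reserved bytes."""
    payload = struct.pack("!HB4s", source, activation_type, bytes(4))
    header = struct.pack("!BBHI", version, version ^ 0xFF,
                         ROUTING_ACTIVATION_REQUEST, len(payload))
    return header + payload


def parse_routing_activation_response(frame: bytes):
    """Return (tester_address, entity_address, response_code) from a response frame."""
    version, inverse, payload_type, _length = struct.unpack("!BBHI", frame[:8])
    if (version ^ inverse) != 0xFF or payload_type != ROUTING_ACTIVATION_RESPONSE:
        raise ValueError("not a routing activation response")
    return struct.unpack("!HHB", frame[8:13])


request = routing_activation_request(0x0E00)
# Hand-built example response, not captured from a real gateway.
response = bytes.fromhex("02fd0006000000090e0010001000000000")
tester, entity, code = parse_routing_activation_response(response)
print(request.hex(), hex(code), code == ROUTING_ACTIVATED)
```

Logging the raw response code on every activation attempt makes routing table problems visible before any UDS traffic is attempted.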
Modern EV — All Four Protocols Coexisting
In a current-generation EV like Rivian's platform, all four protocols are present simultaneously. Understanding the boundaries — where each protocol lives, who owns the translation between them — is the core architectural knowledge for all four target roles.
Automotive Ethernet: 100BASE-T1 / 1000BASE-T1 connects domain controllers and the central gateway. DoIP handles diagnostics and OTA. SOME/IP handles service-oriented runtime communication between major ECUs. TSN (Time-Sensitive Networking, IEEE 802.1) provides deterministic, time-synchronized Ethernet for ADAS and safety-critical traffic. All new SDV architecture design is Ethernet-native.
CAN FD: Connects ECUs within each domain (powertrain, chassis, ADAS) where bandwidth exceeds classic CAN but node count and cost don't justify Ethernet per ECU. BMS, motor controller, ABS, ESC all speak CAN FD internally.
LIN: Connects leaf-node actuators to body domain gateway ECUs: door modules, seat position motors, mirror controllers, ambient lighting nodes. The gateway bridges LIN ↔ CAN. Factory variant coding of LIN slaves happens via 0x3C/0x3D diagnostic frames.
FlexRay: Present in vehicles designed 2006–2020, possibly carried forward in specific safety-critical subsystems. Not used in new architecture design at Rivian. Understanding it matters for testing and maintaining platforms that bridge old and new architecture generations.
The central gateway speaks all protocols simultaneously. It enforces firewall rules between domains, performs address translation for DoIP routing, and is the single entry point for all OTA and diagnostic traffic. From a factory software role perspective, the gateway's behavior during manufacturing — how it's configured, how it routes diagnostic traffic, how it handles variant coding across all bus types — is the primary integration surface you would own. This is where your MES/systems engineering background maps most directly onto vehicle software architecture.
Factory Software View
How these protocols map to manufacturing end-of-line testing, OTA release pipelines, and your existing experience.
| Manufacturing Concept | Vehicle Software Equivalent | Protocol Involved | Your Bridging Advantage |
|---|---|---|---|
| PLC scan cycle | LIN master schedule table | LIN | Deterministic, master-controlled polling — identical design philosophy to PLC I/O scan |
| OPC-UA / MQTT message bus | CAN message bus (topic = CAN ID) | CAN | Pub/sub addressing model: producers broadcast, consumers filter — identical to Kafka topics |
| MES change control + software version management | Vehicle SW release pipeline + OTA campaign | DoIP | Your MES TPO role is structurally equivalent to a vehicle SW release role — apply the same change-control rigor |
| End-of-line functional test | UDS service 0x22 ReadDataByIdentifier, 0x2F IOControl | DoIP, CAN | EOL testers communicate via DoIP/UDS; designing test sequences uses the same structured validation methodology |
| Device variant coding at assembly | UDS 0x2E WriteDataByIdentifier + LIN 0x3C diagnostic frames | LIN, DoIP | Configuration at point of manufacture: same concept as MES recipe management, different protocol |
| DFMEA / process FMEA | ISO 21434 TARA (Threat Analysis and Risk Assessment) | All | Same structured risk decomposition methodology — STRIDE adds attacker intent dimension to familiar risk framework |
| Kafka / MQTT telemetry pipeline | NATS JetStream → ClickHouse for vehicle telemetry | All | Your Kafka experience maps directly — NATS is faster and lighter; the architectural pattern is identical |
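As an illustration of the end-of-line row in the table above, here is a sketch of a UDS ReadDataByIdentifier (0x22) request and response check of the kind an EOL test step performs. The exception class and function names are hypothetical, and the example response bytes are made up for illustration.

```python
class UdsNegativeResponse(Exception):
    """Raised when the ECU answers with a negative response (0x7F)."""

    def __init__(self, service_id: int, nrc: int):
        super().__init__(f"service 0x{service_id:02X} rejected, NRC 0x{nrc:02X}")
        self.service_id = service_id
        self.nrc = nrc


def read_data_by_identifier(did: int) -> bytes:
    """Build a 0x22 request for one 16-bit data identifier."""
    return bytes([0x22]) + did.to_bytes(2, "big")


def parse_rdbi_response(request: bytes, response: bytes) -> bytes:
    """Return the data record, or raise if the response is negative or mismatched."""
    if response[0] == 0x7F:
        raise UdsNegativeResponse(response[1], response[2])
    # A positive response echoes the service ID plus 0x40, then the DID.
    if response[0] != request[0] + 0x40 or response[1:3] != request[1:3]:
        raise ValueError("response does not match request")
    return response[3:]


req = read_data_by_identifier(0xF190)  # 0xF190 is the VIN data identifier
record = parse_rdbi_response(req, b"\x62\xf1\x90" + b"EXAMPLEVIN")
print(req.hex(), record)
```

The same request bytes work whether the transport underneath is ISO-TP on CAN or a DoIP diagnostic message.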
Session 2: CAN deep dive. Arbitration at the bit level, error frame types, ISO-TP segmentation math, DBC signal decoding, CAN FD differences.
Session 3: DoIP + UDS as a system. Full ECU flash sequence service by service, SOVD as the forward-looking replacement, Python implementation patterns.
Session 4: NATS JetStream architecture vs Kafka, subject hierarchy design for vehicle telemetry, JetStream consumers for OTA event streaming.
Session 5: ClickHouse MergeTree engine, columnar storage internals, schema design for CAN telemetry workloads, query patterns for fleet analytics.
For security protocols and requirements in EV/SDV (SecOC, secure boot, UN R155/R156, ISO/SAE 21434, key provisioning, OTA security), use the EV Cybersecurity Engineering Guide as your technical education companion.