Controller Area Network: Node fault detection and tolerance
This article walks through the phases of the hardware implementation methodology from one of my published papers, available at the following link.
The paper proposes a new algorithm for node fault detection and tolerance on a CAN-based system. CAN is an in-vehicle networking protocol introduced by Bosch in the 1980s; it quickly became popular for networking distributed electronic control units because of its simplicity, flexibility and reliability. This article concentrates on the hardware implementation phases, so if you are interested in the algorithm itself, please have a look at the paper.
Motivation
Recent progress in autonomous vehicle technology has drawn many researchers toward in-vehicle autonomous diagnosis systems that deliver highly dependable behaviour. This project addresses the need for an autonomous in-vehicle network fault detection algorithm.
Objectives
1. Hardware implementation of a new fault detection and tolerance algorithm for a 4-node CAN platform.
   - Hardware design of the 4-node CAN platform.
   - Development of the new fault detection algorithm on the hardware framework.
   - Detection of 3 node failures in a 4-node distributed system.
   - Hardware fault tolerance with redundancy for the critical node.
2. Hardware implementation and testing in power-train modules.
3. Simulation and performance evaluation under transient error burst faults.
Methodology
To achieve the above objectives, four phases were considered:
Phase 1: Defining the basic system model.
Phase 2: Design and development of the new fault detection and tolerance algorithm.
Phase 3: Automotive power-train module implementation.
Phase 4: Hardware-in-the-Loop simulation.
Phase 1
The basic system architecture consists of four nodes plus one redundant node for the critical fourth node, all interconnected by a Controller Area Network bus. The blue development board is the redundant node. Eight messages were defined in the basic system to verify the priority-based, non-preemptive message passing behaviour of CAN. Firmware for the message attributes was implemented using the Keil IDE, and the first hardware prototype with 4+1 nodes was developed.
The development board details are as follows.
The bus arbitration process and the message IDs considered are shown in the following figure.
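To make the priority-based, non-preemptive behaviour concrete, here is a minimal Python sketch of CAN bitwise arbitration. It is a simulation for illustration only, not the firmware from the project. The function names and the example identifiers are assumptions of mine, not the eight message IDs used on the hardware. The sketch assumes standard 11-bit identifiers and that all frames are already queued when the bus becomes idle.

```python
def arbitrate(ids, id_bits=11):
    """Return the identifier that wins bitwise arbitration.

    Each contender sends its identifier MSB first. The bus acts as a
    wired-AND: a dominant bit (0) overrides a recessive bit (1). A node
    that sends recessive but reads back dominant drops out of this round.
    """
    contenders = list(set(ids))
    for bit in range(id_bits - 1, -1, -1):
        sent = {i: (i >> bit) & 1 for i in contenders}
        bus_level = min(sent.values())  # any 0 pulls the bus to 0
        contenders = [i for i in contenders if sent[i] == bus_level]
        if len(contenders) == 1:
            break
    return contenders[0]


def transmission_order(pending_ids):
    """Order in which queued frames reach the bus.

    Arbitration only happens when the bus is idle, so a frame that has
    won is sent completely before the next round (non-preemptive).
    """
    pending = list(pending_ids)
    order = []
    while pending:
        winner = arbitrate(pending)
        order.append(winner)
        pending.remove(winner)
    return order


# Hypothetical identifiers, chosen only for illustration.
example_ids = [0x300, 0x100, 0x200]
order = transmission_order(example_ids)
```

With these example identifiers the lowest numeric ID (0x100) wins first, because it is the first to send a dominant bit where the others send recessive.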
Phase 2
This phase consists of the design and development of the proposed fault detection and tolerance algorithm for CAN. To achieve modularity and portability, the algorithm was implemented in three main sections. The first section deals with broadcasting the detection test message during periodic invocation of the detection cycle; it accounts for the delay of each message transmission and the interrupt timeouts of the test message broadcast. Sections 2 and 3 describe the behaviour when a new node enters or a repaired node re-enters. The fault cases considered include transient hardware faults, transient software faults due to error bursts, and permanent hardware faults.
The blue downward arrows show detection message broadcasts from different nodes at specific time instants. The red arrows represent each node's response, which depends on its status (i.e. faulty or fault-free). The real-time response from each node is the core factor in determining its status in every instance of the detection cycle. For a more in-depth understanding of phase 2, please refer to the paper.
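The general idea of judging node status from timely responses can be sketched as follows. This is not the algorithm from the paper, only a simplified illustration of a timeout-based detection cycle and of spotting node entry or re-entry between cycles. The function names, the status strings and the millisecond values are all assumptions made for this example.

```python
def run_detection_cycle(node_ids, responses, timeout_ms):
    """Classify every node for one detection cycle.

    `responses` maps node id -> response latency in ms. A node with no
    entry never answered the test message. Late or missing answers are
    both treated as a fault in this simplified model.
    """
    status = {}
    for node in node_ids:
        latency = responses.get(node)
        if latency is not None and latency <= timeout_ms:
            status[node] = "fault-free"
        else:
            status[node] = "faulty"
    return status


def update_membership(previous, current):
    """Compare two cycles and report failed and (re-)entered nodes."""
    failed = sorted(
        n for n in current
        if previous.get(n) == "fault-free" and current[n] == "faulty"
    )
    entered = sorted(
        n for n in current
        if previous.get(n) != "fault-free" and current[n] == "fault-free"
    )
    return failed, entered


# Hypothetical latencies in ms: node 2 is silent, node 3 answers too late.
nodes = [1, 2, 3, 4]
cycle_a = run_detection_cycle(nodes, {1: 2.0, 2: 3.0, 3: 1.5, 4: 2.5}, 5.0)
cycle_b = run_detection_cycle(nodes, {1: 2.0, 3: 9.0, 4: 2.5}, 5.0)
failed, entered = update_membership(cycle_a, cycle_b)
```

A node that was absent or faulty in the previous cycle and answers in time in the current one shows up in `entered`, which loosely mirrors the new-node entry and repaired-node re-entry cases.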
Phase 3
This phase demonstrates the compatibility of the proposed DNFDT CAN algorithm with power-train control unit tasks. Display node functions were implemented in node 1, driver input functions in node 2, transmission control unit functions in node 3 and engine control unit functions in node 4. Node 4 is equipped with a hot standby redundant unit to overcome critical node failure. The inputs read from the driver input node are used to control the fuel injector connected to the engine control unit, alongside the periodic invocation of the DNFDT CAN algorithm.
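The hot standby arrangement for node 4 can be pictured with a small failover sketch. It assumes the detection cycle yields a per-node status and that the standby takes over only when the active unit is reported faulty. The class name, the node labels and the choice not to switch back automatically are my assumptions for illustration; the paper should be consulted for the actual switchover behaviour.

```python
class HotStandbyPair:
    """Tracks which unit of a primary/standby pair is currently active."""

    def __init__(self, primary, standby):
        self.primary = primary
        self.standby = standby
        self.active = primary

    def on_detection_result(self, status):
        """Update the active unit from one detection cycle result.

        `status` maps node name -> "fault-free" or "faulty". The role
        switches only if the active unit is faulty and the other unit is
        healthy; otherwise the current assignment is kept.
        """
        if status.get(self.active) == "faulty":
            other = self.standby if self.active == self.primary else self.primary
            if status.get(other) == "fault-free":
                self.active = other
        return self.active


# Hypothetical node labels for the engine control unit and its standby.
pair = HotStandbyPair("node4", "node4_standby")
first = pair.on_detection_result({"node4": "fault-free", "node4_standby": "fault-free"})
second = pair.on_detection_result({"node4": "faulty", "node4_standby": "fault-free"})
third = pair.on_detection_result({"node4": "fault-free", "node4_standby": "fault-free"})
```

In this sketch the repaired primary does not reclaim the active role on its own, which avoids switching back and forth if the primary fails intermittently. Whether that suits a real power-train setup is a design decision outside this example.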
Phase 4
Quantitative verification and validation of the CAN message frames and the detection cycle timings was achieved by Hardware-in-the-Loop simulation of the implemented experimental setup. Real-time plots of the fault detection cycle timings were obtained under various failure conditions for 10 consecutive instances of the fault detection algorithm. All results were tabulated, and the timings were analysed using the average detection cycle time.
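The averaging step over 10 consecutive detection cycles per failure condition can be expressed in a few lines. The numbers below are placeholders invented purely to show the calculation; they are not measurements from the experiment, and the condition labels and function name are likewise assumptions.

```python
from statistics import mean


def average_cycle_times(samples_by_condition):
    """Return the mean detection cycle time for each failure condition."""
    return {
        condition: mean(samples)
        for condition, samples in samples_by_condition.items()
    }


# Placeholder timings in ms, NOT measured values.
placeholder_samples = {
    "no fault": [10.0] * 10,
    "one node failed": [12.0, 14.0] * 5,
}
averages = average_cycle_times(placeholder_samples)
```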
The worst-case response times of the engine control unit tasks were obtained under error burst conditions and compared with the values obtained without error bursts. For the values and plots of the worst-case response, please refer to the paper.
These are the four major phases of the hardware implementation methodology in the published paper. Please let me know if you have any questions or need clarification.