
Analysis and System Test of Powertrain Embedded Control Systems
in Heavy Vehicles during Start-Up and Shutdown

MARK BARTISH

Master of Science Thesis
Stockholm, Sweden 2011


Master's Thesis in Computer Science (30 ECTS credits)
at the School of Engineering Physics
Royal Institute of Technology, year 2011
Supervisor at CSC was Alexander Baltatzis
Examiner was Stefan Arnborg
TRITA-CSC-E 2011:065
ISRN-KTH/CSC/E--11/065--SE
ISSN-1653-5715

Royal Institute of Technology


School of Computer Science and Communication
KTH CSC
SE-100 44 Stockholm, Sweden
URL: www.kth.se/csc

Abstract
This diploma project was performed at Scania CV AB in Södertälje. The goal was to
investigate embedded powertrain control systems with respect to their start-up and
shutdown processes, which are particularly sensitive phases in these systems. The
reason is that these control systems, called Electronic Control Units (ECUs), interact
through a communication bus called a Controller Area Network (CAN). A unit that sends
faulty data may affect other ECUs on the same communication bus. The ECUs on a bus do
not all start simultaneously, and the variation in start-up times must be taken into
account. During shutdown, the sensitive process of saving Non-Volatile Memory (NVM)
data is initiated. Should something go wrong during this process, the result may be
corruption of operational data and of the End-Of-Line (EOL) configuration. Misleading
error codes may also be set.
Scania therefore wanted one or several test cases for system test of the powertrain
ECU software, focused specifically on these areas. The author of this report performed
a technical analysis of the problem areas of the ECUs, as well as a failure report
analysis, in order to determine the areas of greatest risk. Based on this analysis,
the system functional requirements on the ECUs were identified and test cases were
developed. The work resulted in two test cases, each related to an identified problem
area. The test cases are divided into test flows, which are sets of direct
instructions for how the tests should be performed. Each test case verifies one or
more system functional requirements and is meant to be implemented as scripts for the
test automation rigs. The implementation as test automation scripts was not part of
this diploma work; the test flows were only run manually in a laboratory environment.
A theoretical study of different software testing techniques was also performed. The
result of this study is presented in the theory chapter of the report.

Summary (Referat)
Analysis and system test of embedded powertrain control systems in heavy vehicles
during start-up and shutdown
This degree project was carried out at Scania CV AB in Södertälje. The aim was to
investigate embedded powertrain control systems in trucks and buses with respect to
problems associated with start-up and shutdown, which are particularly sensitive
phases for these control systems. This is because the control units communicate over
CAN (Controller Area Network), and a control unit that sends faulty data affects all
the others on the same communication bus. Not all systems on the network start at
exactly the same time, so variations in start-up times must be taken into account.
During shutdown, saving NVM data (NVM = Non-Volatile Memory) to EEPROM can be a
problem: an unexpected shutdown of a control system may result in corrupt data. The
problems above can lead to misleading error codes being set.
Scania therefore wanted to develop test cases for system test of the software in
these control systems, focused specifically on these problem areas. The work began
with a technical analysis of the problem areas and continued with a review of failure
reports, both internal ones and ones from authorized Scania workshops. Requirements
on the software were then identified, and test cases were developed according to the
company's governing document that defines the test case development process. The
result was two test cases, each of which concerns one identified problem area. The
test cases are divided into test flows, which are sets of direct instructions for how
the testing is to be performed. Each test flow verifies one or more system
requirements. The test flows are intended as a basis for implementing test scripts
for the test automation rigs. No script implementation was done within the scope of
the degree project, however, only a manual run in a laboratory environment. A
theoretical study of different software testing techniques was also carried out. Its
results are presented in the theory part of the report.

Contents
1 Introduction
2 Background and Problem Statement
2.1 Electronic Systems in Heavy Vehicles
2.2 EMS - Engine Management System
2.2.1 Hardware
2.2.2 Software
2.3 GMS - Gearbox Management System
2.3.1 Scania Opticruise
2.3.2 Scania Comfort Shift
2.3.3 Scania Retarder
2.4 EEC - Exhaust Emission Control
2.5 OBD - On Board Diagnostics
2.5.1 KWP - Keyword Protocol
2.6 The problem
2.7 Test platforms
2.7.1 Automated Testing in Simulator Rigs
2.7.2 Manual Testing in a Vehicle
3 Theory
3.1 CAN - Controller Area Network
3.2 Theory behind CAN
3.2.1 Real-time System
3.2.2 Differential bus
3.2.3 Data transmission
3.2.4 Bit stuffing
3.2.5 Bit arbitration
3.3 Software Testing
3.3.1 Module Testing (also called Unit Testing)
3.3.2 Function Testing
3.3.3 Integration testing
3.3.4 System testing
3.3.5 Acceptance Testing
3.4 Other Types of Testing
3.4.1 Regression testing
3.5 Software Development and Testing Procedures
3.5.1 The V-model
3.6 Test Techniques: Black Box Testing
3.6.1 Decision Tables and Decision Trees
3.6.2 State Transition Testing
3.6.3 Equivalence Class Partitioning and Boundary Value Analysis
3.7 White Box Testing
3.8 Gray Box Testing
3.9 Smoke testing
3.10 Non Functional Testing
3.11 How far should we test?
3.12 Formal Methods
4 Methods
4.1 Technical Analysis
4.2 Failure Report Analysis
4.3 Requirement Identification
4.4 Test Techniques
4.5 Developing the Test Cases
5 Results
5.1 Technical Analysis
5.1.1 CAN messages and signals
5.1.2 Signal and Component Statuses
5.1.3 Start Up
5.1.4 Cranking
5.1.5 Shutdown
5.2 Definitions
5.3 Test case 1: EMS EEC Communication, CAN timeouts detection
5.3.1 Requirement Identification
5.3.2 Test Techniques
5.3.3 Test Flows
5.3.4 Testing the Test Case
5.4 Test case 2: EMS Shutdown
5.4.1 Requirement Identification
5.4.2 Test Techniques
5.4.3 Test Flows
5.4.4 Detect an Abnormal Shutdown and Set a DTC and Internal Event (INTE)
5.4.5 Possibility to Cancel a Shutdown in Progress
5.4.6 Test Results
6 Conclusions and Suggestions for Future Work
6.1 Conclusions
6.2 Future Work Suggestions
Bibliography

Nomenclature
CCP CAN Calibration Protocol
ComP Common Platform
COO Coordinator control unit
CRC Cyclic Redundancy Check
DEC Diagnostic Event Code
DIN Deutsches Institut für Normung
DTC Diagnostic Trouble Code
E2 See EEPROM
ECU Electronic Control Unit
EEC Exhaust Emission Control
EEPROM Electrically Erasable and Programmable Read-Only Memory
EMS Engine Management System
EOL End of Line
GMS Gearbox Management System
ICL Instrument Cluster
ISO International Organization for Standardization
J1939 The SAE standard for CAN communication defining some of the signals that are
sent between control units
KWP Keyword Protocol
NEVS System Test group within NE
NE Powertrain Control System department at Scania
OBD On-Board Diagnostics
OPC Opticruise
S8 A version of EMS
SAE Society of Automotive Engineers
SCR Selective Catalytic Reduction
SDP3 Scania Diagnos and Programmer 3
SFR System Functional Requirements
U15 An input signal used for wake-up of the ECUs. It is 0 V if the ignition key is
OFF and at the same level as U30 if the ignition key is ON (DIN 72552)
U30 An input line for the main power source to the ECUs (DIN 72552)
VCI Vehicle Communication Interface

Chapter 1

Introduction
Today, electronic control systems are used in almost all motorized vehicles. An
electronic controller consists of computer hardware and software that constantly
reads input signal values from the sensors connected to it and, based on these
values, calculates the output signals to the actuators; in other words, it controls
the system. Such controllers are called Electronic Control Units (ECUs), a term that
is used throughout the report.
The ECUs control different parts of a vehicle's function, from the most vital, like
fuel injection in the engine, to the less important, like the radio and the cab
heating system. As the complexity of every vehicle's electrical and electronic system
grows, so does the need to systematize the ECUs. It is neither possible nor desirable
to have one single control unit for the whole vehicle. To allow modularization there
need to be several ECUs, one for each function. This in turn means that the ECUs must
be able to communicate with each other. A standard named CAN (Controller Area
Network) was developed for this purpose.
Scania CV AB is a manufacturer of heavy vehicles (trucks, buses) and industrial and
marine engines (subsequently denoted I&M-engines). Scania's research and development
department is located in Södertälje, Sweden.
This master's thesis project was performed at the System Test group within the
department for development of powertrain control systems at Scania Research and
Development. The objective was to analyze the operation of powertrain ECUs during
their start-up and shutdown phases: memory initialization, non-volatile data
management (EEPROM data), file handling and communication establishment, as well as
sensor/actuator signals and CAN signals with signal statuses and related DTCs
(Diagnostic Trouble Codes). After this analysis, a list of software SFRs (System
Functional Requirements) was made, consisting of the requirements already defined in
other documents at the department as well as the test developer's own requirements.
The latter may be requirements that are derived from SFDs (System Function
Descriptions) but not explicitly stated there. The ultimate aim was to develop one or
several test cases that cover the SFRs, a process that included choosing test
techniques and a test platform.

Chapter 2

Background and Problem Statement


As mentioned in the introduction, digital electronic controllers have made their way
into the automotive domain during the last decades. A considerable part of Scania
Research and Development (R&D) is involved in the development of these control
systems, commonly known as embedded systems.
Embedded systems are similar to desktop computers in the sense that they also have a
microprocessor and surrounding hardware, such as a motherboard and memory chips.
However, unlike desktop computers they are not general-purpose machines. They are
designed, both in hardware and software, to perform a very specific task, like
controlling the operation of an engine. Hardware resources like CPU and memory are
often much more limited in embedded systems than in desktop computers. Also, the
operating systems in embedded applications usually do not provide services such as
virtual memory management, which puts much stronger requirements on memory management
in the application software of such systems. Another difference from desktop
computers is the human interaction. Some embedded systems lack direct human
interaction of any kind, while others communicate with the user in a different way
than a desktop computer does. Light emitting diodes (LEDs), small displays, relays,
switches and similar devices may be used for user interaction.

2.1 Electronic Systems in Heavy Vehicles

Figures 2.1 and 2.2 illustrate a network of ECUs in a Scania truck. Each ECU is
responsible for different functionality in the vehicle. A typical ECU is connected to
different sensors and actuators; it constantly reads input signals from the sensors
and outputs control signals to the actuators. ECUs are connected to each other in
such a way that they can act as sensors and actuators for each other (more about this
in section 3.1). There are many ECUs in a typical modern vehicle. A modern car can
contain up to 40 ECUs [HYB10].

Figure 2.1. ECUs in a Scania truck. For the meaning of the different colors see
fig. 2.2.

Not all of the ECUs used in Scania vehicles are developed by Scania. Some are
purchased from external suppliers, but many of the ECUs shown in fig. 2.1 are
developed by Scania. This applies to all of the powertrain ECUs, which are named GMS
(with the exception of fully automatic gearboxes, which are purchased from external
suppliers as a complete unit), EMS and EEC (not shown in fig. 2.1).

Figure 2.2. Topological view of a CAN network in a typical Scania vehicle (the legend
marks the powertrain ECUs). The three different communication buses (red, yellow and
green) make a logical division of the ECUs into three groups. ECUs on the same bus
communicate with each other directly, while any communication between ECUs on
different buses is controlled by the Coordinator ECU. Each of the buses may operate
at a different bit rate. The red bus interconnects the ECUs most critical for a
vehicle's function and operational safety. The yellow bus contains less critical but
still important ECUs, and the green bus is for systems which do not have any impact
on the operational safety of a vehicle (however, some ECUs on the green bus are
responsible for providing information to the driver that may be considered safety
critical). Not all ECUs are directly connected to one of the three buses; one example
is the EEC, which communicates directly only with the EMS.

2.2 EMS - Engine Management System

Engine Management System: this ECU is found on all engines produced by Scania, both
road vehicle and I&M-engines. It is one of the most important and complex ECUs found
in a Scania product. There have been several different software and hardware versions
of EMS over the past years. The current one is named S8. The differences from the
previous version, S7, are mostly in software. The largest software modification with
respect to S7 is that S8 is based on what is called Common Platform (ComP). S8 is
fully compatible with S7 with respect to the connector-pin interface and replaces it
as a spare part.

Figure 2.3. EMS version S7 mounted on a 6-cylinder engine.

2.2.1 Hardware

EMS S8 has 140 connector pins to sensors and actuators, including four CAN pairs: one
for the red CAN bus, one for the EEC sub-bus, one for internal development use only
and one that is currently unused [Scania Internal Documentation].

2.2.2 Software

From the software point of view the EMS consists of different components (or layers).

Figure 2.4. EMS S8 software architecture layers and managers.

Common Platform (ComP) may be seen as the lowest-level software (i.e. closest to the
hardware). ComP contains software for direct signal I/O, signal processing and a part
of the non-volatile memory (NVM) management, and it interacts closely with the LLAP.
ComP also contains code that translates physical signals (measured in volts) into
engineering units. ComP can be seen as the foundation and is used in multiple ECUs
(all powertrain ECUs).
Low Level Application software (LLAP) is responsible for transmission and reception
of CAN messages, for encoding signals into CAN messages and decoding them again
according to the Scania CAN specifications, and for timeout diagnosis. It is also
responsible for some hardware checks, AD conversion and diagnosis of sensors and
actuators. Since this layer contains parts that are directly involved in the start-up
and shutdown process, it is analyzed quite thoroughly.
High Level Application software (APPL or HLAP) contains managers (which in turn
contain modules) that control and monitor fuel injection, combustion, gas exchange,
after-treatment and so on. These are of limited relevance to this thesis.
Virtual Sensors (VSEN) are irrelevant to the project and will not be described.
File manager (FILE) is responsible for reading and writing non-volatile data from and
to the EEPROM, as well as for the RAM mirror, a mechanism by which data normally
residing in EEPROM is read into RAM during execution for faster access. During
shutdown the data is written back to EEPROM.
Run-Time Database (RTDB) manages signals that are accessible across layers within the
unit.
Commonly used utilities (UTIL) are not relevant to the report and will therefore not
be described.
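
The RAM mirror described for the File manager can be illustrated with a small toy
model. This is only a sketch under stated assumptions, not the Scania implementation:
the class name RamMirror, the dictionary standing in for the EEPROM and the tracking
of changed keys are all hypothetical.

```python
class RamMirror:
    """Toy model of a RAM mirror in front of slow non-volatile storage."""

    def __init__(self, eeprom):
        # `eeprom` is a plain dict standing in for the EEPROM contents (assumption).
        self._eeprom = eeprom
        self._ram = {}
        self._dirty = set()

    def start_up(self):
        # Copy the non-volatile data into RAM once, at start-up.
        self._ram = dict(self._eeprom)
        self._dirty.clear()

    def read(self, key):
        # During execution all reads are served from RAM.
        return self._ram[key]

    def write(self, key, value):
        # Writes only touch RAM; the key is remembered for the write-back.
        self._ram[key] = value
        self._dirty.add(key)

    def shutdown(self):
        # Write changed values back to the EEPROM and report how many were saved.
        for key in self._dirty:
            self._eeprom[key] = self._ram[key]
        written = len(self._dirty)
        self._dirty.clear()
        return written


eeprom = {"odometer_km": 1000, "idle_adaptation": 5}
mirror = RamMirror(eeprom)
mirror.start_up()
mirror.write("odometer_km", 1001)
value_before_shutdown = eeprom["odometer_km"]  # still the old value
saved = mirror.shutdown()
```

In this model the EEPROM keeps the old value until shutdown() has run, which is why an
interrupted shutdown can leave stale or inconsistent non-volatile data behind.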


2.3 GMS - Gearbox Management System

GMS is a control unit that controls a special type of gearbox found in Scania trucks,
called Opticruise, as well as an auxiliary braking system named Retarder. There are
several different types of gearboxes in Scania road vehicles on the market today
(2011). These are:
- Manual gearbox
- Automatic gearbox, developed by external suppliers
- Opticruise, an older version with a clutch pedal
- Opticruise, the new version without a clutch pedal
- Comfort Shift
The manual gearbox is purely mechanical, which means that the driver controls the
shifting of gears manually by moving the gear lever into the right position. The
automatic gearbox obviously needs some sort of controller, but since it is developed
outside Scania, it is outside the scope of this description. Opticruise and Comfort
Shift, on the other hand, are developed by Scania and will be described in greater
detail below.
Scania Opticruise and Scania Retarder are controlled by a control unit named OPC (a
logical controller within the physical GMS unit), which currently (2011) is at
version OPC5.

2.3.1 Scania Opticruise

Scania Opticruise is a so-called Automated Manual Transmission. The gearbox itself is
an ordinary manual gearbox in which gear shifting is controlled by electric or
electro-hydraulic actuators instead of by the driver. The clutch is used by the
driver only to get the vehicle rolling from standstill. A gear change is done by
sending a request to the EMS for an engine speed that matches the speed of the wheels
and then engaging the gear without ever opening the clutch. In the second-generation
Opticruise there is no clutch pedal at all, meaning that even take-off from
standstill is controlled automatically. This is done by electro-hydraulic clutch
actuators.
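
The idea of matching the engine speed to the wheel speed before engaging a gear can be
made concrete with a short calculation. The sketch below assumes the simplest possible
relation, engine speed = gearbox output shaft speed times gear ratio; the function
name and the example ratios are hypothetical and are not taken from Scania
documentation.

```python
def target_engine_speed_rpm(output_shaft_rpm, gear_ratio):
    """Engine speed at which a gear can be engaged without a speed difference.

    gear_ratio is engine revolutions per output shaft revolution (assumption).
    """
    if gear_ratio <= 0:
        raise ValueError("gear ratio must be positive")
    return output_shaft_rpm * gear_ratio


# Hypothetical upshift while the output shaft turns at 1200 rpm.
current_gear_rpm = target_engine_speed_rpm(1200, 1.25)
next_gear_rpm = target_engine_speed_rpm(1200, 1.0)
```

With these made-up ratios the engine would have to slow down from 1500 rpm to
1200 rpm before the next gear is engaged, which is the kind of request the gearbox
controller sends to the EMS.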

2.3.2 Scania Comfort Shift

Scania Comfort Shift is a gear changing system used on buses. Instead of a mechanical
link between the gear lever and the gearbox, an electrical actuator is used to switch
the gear. A gear shift proceeds as follows: the driver requests a new gear by moving
the gear lever (which is not mechanically linked to the gearbox) into position
without opening the clutch; the controller registers the request and, as soon as the
clutch pedal is pressed down, commands the gear shifting actuator to switch gears.

2.3.3 Scania Retarder

A retarder is an auxiliary braking system that assists the vehicle's ordinary brakes
when moving downhill in order to decrease their wear and tear. The retarder control
unit (RET) is a so-called logical node that is physically present in the same box as
Opticruise (OPC), i.e. it uses the same embedded controller hardware as OPC, namely
GMS.

2.4 EEC - Exhaust Emission Control

Figure 2.5. An illustration of the increasingly restrictive emission limits for heavy
road vehicles in the EU.

Legislative requirements on exhaust emissions from diesel engines in heavy vehicles
are becoming more and more restrictive in Europe. Naturally, this increases the
demands on exhaust gas cleaning mechanisms (this process is known as after-treatment).
As this report is written, the legislation known as Euro5 is in force. It specifies
limits on four different kinds of toxic gases and particles allowed in vehicle
exhaust. These are Carbon Monoxide (CO), Nitrogen Oxides (NOx), Hydrocarbons (HC) and
Particulate Matter (PM), also known as soot. The limits for heavy vehicles are
defined as emitted mass per unit of energy output (g/kWh), unlike for passenger cars,
where they are defined as emitted mass per distance driven (g/km). In January 2013
Euro6 will come into force, which heavily restricts the allowed amount of NOx in the
exhaust [WPEMSTD].
A cleaning mechanism named Selective Catalytic Reduction (SCR) is used in Scania
vehicles to reduce NOx emissions. It makes use of a liquid called AdBlue, which
contains urea and is injected into the exhaust system of the vehicle, where a
catalytic reaction converts the NOx into water and nitrogen.

Figure 2.6. An illustration of the SCR process. The SCR unit is controlled by EEC3.

A separate ECU called the Exhaust Emission Control system (EEC) is responsible for
controlling this process. The EEC is located on a CAN sub-bus to the EMS, meaning
that it can only communicate directly with the EMS. Communication with other ECUs on
the red CAN bus must go through the EMS (see fig. 2.2).

2.5 OBD - On Board Diagnostics

Figure 2.7. A VCI unit that is used to connect a computer to the diagnostic port on a
Scania vehicle.

OBD is present on every modern vehicle; it has been a legal requirement for diesel
vehicles sold in the European Union since 2004 [WPCAN]. OBD is a generic term
referring to a vehicle's self-diagnostic and reporting capability [OBD]. OBD-II is a
connection interface standard for a 16-pin connector that enables an external device
(typically a laptop) to be connected to the vehicle's control system network. This
enables workshops to perform diagnosis on the vehicle's electrical/electronic system.
It also allows programming of the vehicle's ECUs, i.e. storing configuration
parameters in the ECU memory.
Each ECU in a vehicle performs a number of diagnostic tests on the sensors and
actuators connected to it, as well as on the CAN communication channels with other
ECUs. When such a diagnosis indicates that, for example, a component is faulty, does
not send sensor data at all or sends an implausible value, a Diagnostic Trouble Code
(DTC) is set. The DTC is stored in the ECU's non-volatile memory and can be read
using a diagnostic tool connected to the OBD port. The DTCs are primarily intended to
enable effective diagnosis in a workshop, as well as to support functional
degradation of the vehicle.
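
The way a diagnostic test leads to a stored DTC can be sketched as follows. This is a
minimal illustration under stated assumptions: the verdict names, the plausibility
range and the DTC numbers are hypothetical, and a Python set stands in for the ECU's
non-volatile DTC memory.

```python
from dataclasses import dataclass, field

# Hypothetical DTC numbers, chosen only for this illustration.
HYPOTHETICAL_DTCS = {"missing": 0x1001, "implausible": 0x1002}


@dataclass
class DtcMemory:
    """Stand-in for the non-volatile DTC storage of an ECU."""
    stored: set = field(default_factory=set)

    def set_dtc(self, code):
        self.stored.add(code)

    def read_all(self):
        # What a diagnostic tool would get back when requesting the DTCs.
        return sorted(self.stored)


def diagnose_sensor(sample, low, high):
    """Classify one sensor sample as ok, missing or implausible."""
    if sample is None:
        return "missing"
    if not low <= sample <= high:
        return "implausible"
    return "ok"


def run_diagnosis(samples, low, high, memory):
    """Run the test on every sample and store a DTC for each kind of fault."""
    for sample in samples:
        verdict = diagnose_sensor(sample, low, high)
        if verdict != "ok":
            memory.set_dtc(HYPOTHETICAL_DTCS[verdict])


memory = DtcMemory()
# Made-up coolant temperature samples in degrees Celsius; None means no data.
run_diagnosis([85.0, None, 400.0], low=-40.0, high=150.0, memory=memory)
stored_codes = memory.read_all()
```

A real ECU would typically also debounce the fault and store status information with
the code; those details are left out here.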

2.5.1 KWP - Keyword Protocol

Keyword Protocol (KWP), or KWP2000, is a protocol for communication between a
diagnostic device and a vehicle, standardized as ISO 14230. In modern Scania vehicles
the communication between the diagnostic device (often a PC) and the vehicle takes
place over the CAN network. KWP2000 makes it possible to send a diagnostic command to
the vehicle's ECU system and receive information from it. A command may, for
instance, be a request for DTCs or a request for an ECU to reset itself. The
diagnostic device sends a parameter identifier (PID), which is an integer of one or
more bytes, to the diagnostic server in the ECU and receives a number of bytes in
response. Two PC software tools are used to communicate with vehicles by KWP2000:
Scania Diagnos and Programmer 3 (SDP3), which is intended for workshops that service
Scania vehicles, and XCOM, which is intended for development purposes only and is
used internally at NE.
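
The request and response exchange can be sketched at byte level. The example below is
an assumption-laden illustration, not a description of SDP3 or XCOM: it relies on the
ISO 14230-3 conventions, as recalled here, that the first request byte identifies the
service, that a positive response echoes this byte plus 0x40, and that a negative
response starts with 0x7F followed by the rejected service byte and a response code.
The payload bytes are made up.

```python
# Service identifiers as recalled from ISO 14230-3 (treat as assumptions).
ECU_RESET = 0x11
READ_DTC_BY_STATUS = 0x18
NEGATIVE_RESPONSE = 0x7F
POSITIVE_OFFSET = 0x40


def classify_response(request: bytes, response: bytes) -> str:
    """Classify a diagnostic response as positive, negative or unexpected."""
    if not request or not response:
        return "invalid"
    service = request[0]
    if response[0] == service + POSITIVE_OFFSET:
        return "positive"
    if (response[0] == NEGATIVE_RESPONSE
            and len(response) >= 3
            and response[1] == service):
        return "negative"
    return "unexpected"


# Hypothetical exchange: the tester asks an ECU to reset itself.
reset_request = bytes([ECU_RESET, 0x01])
accepted = classify_response(reset_request, bytes([0x51]))
rejected = classify_response(reset_request, bytes([0x7F, 0x11, 0x12]))
```

Transport over CAN, addressing and timing are not modelled; the sketch only shows how
a tester can tell an accepted command from a rejected one.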

2.6 The problem

Whenever the driver of a Scania vehicle turns the ignition key on, a signal called
U15 is sent to all ECUs in the system. It is a wake-up signal for the ECUs: for the
vehicle to be fully operational, an ECU must, within a limited amount of time, power
itself on, start the boot loader, read the software that is stored in the EEPROM
(Electrically Erasable and Programmable Read-Only Memory, also called E2) and be
ready to send and receive messages to and from other ECUs on CAN. Otherwise, other
ECUs may set DTCs indicating communication errors with the ECU that did not wake up
in time. This may lead to problems both inside the ECU and in the communication
between several ECUs. For instance, the engine cannot start without a properly
operating EMS. It is therefore important that all ECUs comply with the requirements
on maximum start-up time and establish proper CAN communication.
Many things can go wrong during the start-up process. An ECU that does not wake up
properly within the required time limit will make other ECUs degrade the signal
statuses of the signals contained in the CAN messages expected from the failing ECU.
During cranking (starter motor operation) the starter motor draws a large current,
which causes a temporary drop in supply voltage that may make some ECUs shut down or,
even worse, enter an undefined state. When the ignition key is switched to the OFF
position, the powertrain ECUs are expected to perform a well-defined and controlled
shutdown, which includes saving operational data, adaptations, diagnostic information
and similar data to non-volatile memory (NVM).
The problem is to investigate which areas are the most problematic during start-up
and shutdown, and how test cases can be developed to perform a system test of the
powertrain ECUs developed at Scania with the focus on correct start-up and shutdown.
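
The start-up timing problem can be illustrated from the point of view of a receiving
ECU that expects a periodic CAN message from another ECU. The sketch below is a
simplified model: the function name, the millisecond timestamps and the timeout value
are hypothetical and do not come from the Scania CAN specifications.

```python
def find_timeouts(rx_times_ms, timeout_ms, start_ms, end_ms):
    """Return the (from, to) intervals in which no message arrived in time.

    rx_times_ms: reception times of the expected message, in milliseconds.
    start_ms:    time at which the receiver starts supervising (e.g. ignition on).
    end_ms:      end of the observation window.
    """
    gaps = []
    last_ms = start_ms
    for t_ms in sorted(rx_times_ms):
        if not start_ms <= t_ms <= end_ms:
            continue  # ignore receptions outside the observation window
        if t_ms - last_ms > timeout_ms:
            gaps.append((last_ms, t_ms))
        last_ms = t_ms
    if end_ms - last_ms > timeout_ms:
        gaps.append((last_ms, end_ms))
    return gaps


# Made-up scenario: the sender wakes up late and transmits first after 600 ms,
# while the receiver tolerates at most 500 ms of silence.
late_start = find_timeouts([600, 700, 800], timeout_ms=500, start_ms=0, end_ms=1000)
normal_start = find_timeouts([200, 400, 600, 800], timeout_ms=500, start_ms=0, end_ms=1000)
```

In this model a non-empty result corresponds to the situation described above, in
which the receiver would set a communication DTC and degrade the status of the
signals carried by the missing message.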

2.7 Test platforms

2.7.1 Automated Testing in Simulator Rigs

At Scania, automated testing of the ECUs at different levels has been performed for
some time. This involves system tests on single ECUs, integration tests on a few ECUs
and integration tests on all the ECUs that would normally be present in a vehicle.
These tests are performed in so-called HIL (Hardware In the Loop) rigs. The function
of a HIL rig is to simulate the environment of an ECU in such a way that the ECU
behaves as if it were inside a real vehicle. The real vehicle is simulated by
mathematical models, and the model communicates with the ECU through several types of
signal converters and processors, which convert the ECU's signals into data for the
model. The model is controlled by a computer, which makes it possible to choose
different environments to simulate. The rigs allow manipulation of CAN messages as
well as injection of different electrical failures or interruptions. There is a
Python-based framework for writing scripts that perform different test actions
automatically. There are currently three rigs in use in the powertrain control lab:
one that contains only EMS, another that contains only GMS, and a third that has EMS,
GMS, EEC, COO and ICL.
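
What a scripted test flow for such a rig might look like is sketched below. The
Scania framework itself is not described in this report, so everything here is a
hypothetical stand-in: FakeRig is a toy simulation, its method names are invented, and
the rule that cutting the main power before the NVM save has finished produces an
"ABNORMAL_SHUTDOWN" code is an assumption made for illustration.

```python
class FakeRig:
    """Toy stand-in for a rig interface; not the Scania test framework."""

    def __init__(self, save_time_s):
        self.save_time_s = save_time_s  # assumed time needed for the NVM save
        self.now_s = 0.0
        self.ignition_off_at = None
        self.dtcs = []

    def wait(self, seconds):
        self.now_s += seconds

    def ignition_on(self):
        self.ignition_off_at = None

    def ignition_off(self):
        self.ignition_off_at = self.now_s

    def cut_main_power(self):
        # Assumed rule: losing power before the save has finished is abnormal.
        if (self.ignition_off_at is None
                or self.now_s - self.ignition_off_at < self.save_time_s):
            self.dtcs.append("ABNORMAL_SHUTDOWN")

    def read_dtcs(self):
        return list(self.dtcs)


def shutdown_test_flow(rig, wait_before_power_cut_s):
    """One test flow: run, switch off, cut power, restart and read the DTCs."""
    rig.ignition_on()
    rig.wait(10.0)
    rig.ignition_off()
    rig.wait(wait_before_power_cut_s)
    rig.cut_main_power()
    rig.ignition_on()
    return rig.read_dtcs()


clean = shutdown_test_flow(FakeRig(save_time_s=2.0), wait_before_power_cut_s=5.0)
interrupted = shutdown_test_flow(FakeRig(save_time_s=2.0), wait_before_power_cut_s=0.5)
```

The point of the sketch is the shape of a test flow: a fixed sequence of stimuli
followed by a check of the observable result, which is what makes it suitable for
automation.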

2.7.2 Manual Testing in a Vehicle

Since the models that simulate the environment of the ECUs in the HIL (Hardware In
the Loop) rigs are, and always will be, approximations of the real world, many tests
for which a true operational environment is significant are performed in a real
vehicle under real working conditions. The interface of the tested ECU(s) in the test
vehicle is connected to CAN and the electrical components through a break-out box.
The break-out box makes it possible to interrupt each contact at the ECU interface
and to measure each signal that goes into or out of the ECU. One can, for instance,
measure the current in the voltage supply line (U30) in order to determine whether
the ECU is asleep or awake. Manipulation of signals is also possible through the
break-out box. The COO ECU provides a diagnostic port for connecting a computer to
the vehicle in order to perform operations like DTC reading and programming of some
parameters. It is also possible to connect a PC to the CCP (CAN Calibration Protocol)
port, allowing real-time logging of internal variables in the ECU. Logging of signals
sent on CAN is possible as well, by connecting a listening device to the CAN_L and
CAN_H contacts on the break-out box and using appropriate PC software. The software
used at NE is XCom, an internally developed application used for diagnosis and
programming of parameters via the KWP2000 protocol (see 2.5.1). Logging and
manipulation of ECU internal variables and CAN signals is done with the help of ATI
Vision (Accurate Technologies Inc). Since internal variables are read by reading
specific memory addresses in the ECU, a database containing a mapping between
variable names and memory addresses is loaded into Vision.
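
The mapping between variable names and memory addresses mentioned for section 2.7.2
can be illustrated with a small lookup. This sketch does not use CCP or ATI Vision;
the variable names, addresses and data formats are hypothetical, and a byte array
stands in for the ECU memory.

```python
import struct

# Hypothetical symbol database: name -> (address, struct format).
# "<H" is an unsigned 16-bit little-endian value, "<b" a signed 8-bit value.
SYMBOLS = {
    "engine_speed_rpm": (0x0010, "<H"),
    "coolant_temp_c": (0x0012, "<b"),
}


def read_variable(memory, name):
    """Look up a variable by name and decode its value from a memory image."""
    address, fmt = SYMBOLS[name]
    (value,) = struct.unpack_from(fmt, memory, address)
    return value


# Build a made-up memory image and read two variables back by name.
image = bytearray(32)
struct.pack_into("<H", image, 0x0010, 1500)
struct.pack_into("<b", image, 0x0012, -5)
engine_speed = read_variable(image, "engine_speed_rpm")
coolant_temp = read_variable(image, "coolant_temp_c")
```

Without such a database a logging tool would only see raw bytes at raw addresses,
which is why the mapping has to be loaded before internal variables can be logged by
name.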

Chapter 3

Theory
With separate control units controlling different functionality in a distributed
controller network, where some nodes control vital and safety-critical functions of a
heavy vehicle, there are hard requirements on the communication network that these
ECUs use to communicate with each other. CAN is used for ECU to ECU communication in
vehicles produced by Scania today.

3.1

CAN - Controller Area Network

At the beginning of the 1990s the demands for comfort in cars increased. Power
windows, electric seat adjustment, electrically adjustable rear-view mirrors, climate
control, audio systems, navigation, etc., appeared on the market. International
requirements regarding safety and the environment also increased, and vehicles
became more fuel-efficient and environmentally friendly. Safety functions like
ABS brakes and the immobilizer arrived, as well as more efficient automatic gear
changing mechanisms.
The increased demands in the automotive industry drove the development of a
communication bus system adapted to embedded microcomputer systems that would
fulfill high transmission rate demands, have good real-time properties and be robust
and cheap. By the beginning of the 1990s many vehicle manufacturers had developed their
own bus concepts and it became difficult for the suppliers to support the different
systems. Each large manufacturer attempted to make its own solution become
an international standard. A couple of these bus system solutions continued to be
used in the mid-90s. One of them was CAN. CAN (Controller Area Network) is a
standard developed by Bosch in the 1980s in order to fulfill the increasing demands
of the European automotive industry. It was later also accepted by the American
automotive industry due to its successful use in Europe. CAN was officially released
in 1986 by SAE (Society of Automotive Engineers) as a Recommended Practice.
The CAN data link layer and some aspects of the physical layer are standardized in
ISO 11898.

In figure 2.2 we can see a typical ECU network in a Scania vehicle. There are
three buses: red for vital and safety-critical ECUs, yellow for less critical but
still important ECUs and finally green for comfort function controllers. These three
buses are interconnected through a coordinator ECU (COO). The coordinator acts as
a gateway for messages that need to travel between ECUs on different buses, since
the buses can operate at different bit rates. Currently (year 2011), however, the red,
yellow and green buses are all operated at 250 kbit/s.

3.2

Theory behind CAN

CAN is a multi-master broadcast network for connecting ECUs into a distributed
controller network [CANEMB]. By multi-master we mean that there is no master
node; communication is carried out between the ECUs directly. Broadcast means
that each node sends its messages on the whole network and each node is able to
listen to every message on the bus. In other words, there is no way to send a message
from node A to node B without all other nodes on the bus seeing it.

3.2.1

Real-time System

The real-time system concept is often misunderstood as a system that needs to be
fast. In fact, real-time is not primarily about speed. Although it is often desirable
for a system to perform fast, this is not a requirement for a system to
be real-time. How fast a system should react to changes is defined by the
dynamics of the controlled environment. [RTOS, p.1] defines a real-time system as
a system in which performance depends not only on the correctness of the single
controller actions but also on the time at which actions are produced. The main
characteristic of a real-time system is that (in case it is a controller) it should,
given an input signal, finish the calculation of the output signal within a deadline,
i.e. the maximum time allowed to finish the execution of a computational process.
Real-time tasks can be divided into hard and soft ones. A missed deadline in a hard
real-time task does not only result in system malfunction but can be directly
dangerous. In a soft real-time task a certain rate of missed deadlines can be tolerated
without more severe effects than degradation of performance. A system that is able to
run hard real-time tasks is called a hard real-time operating system.
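The distinction between hard and soft deadlines can be made concrete with a small sketch. The function name, the sample response times and the tolerated miss rate below are illustrative assumptions and are not taken from any Scania specification; the sketch only shows how the same set of measurements would be judged under the two definitions.

```python
from dataclasses import dataclass


@dataclass
class DeadlineReport:
    misses: int        # number of samples that exceeded the deadline
    miss_rate: float   # misses divided by number of samples
    acceptable: bool   # verdict under the hard or soft rule


def check_deadlines(response_times_ms, deadline_ms, hard, tolerated_miss_rate=0.0):
    """Judge measured response times against a deadline.

    hard=True:  a single missed deadline makes the result unacceptable.
    hard=False: misses are tolerated up to tolerated_miss_rate (assumed value).
    """
    if not response_times_ms:
        raise ValueError("at least one measurement is required")
    misses = sum(1 for t in response_times_ms if t > deadline_ms)
    miss_rate = misses / len(response_times_ms)
    if hard:
        acceptable = misses == 0
    else:
        acceptable = miss_rate <= tolerated_miss_rate
    return DeadlineReport(misses, miss_rate, acceptable)


# Hypothetical measurements: one of five samples exceeds the 5 ms deadline.
samples = [4.1, 4.8, 5.3, 4.9, 4.7]
hard_result = check_deadlines(samples, deadline_ms=5.0, hard=True)
soft_result = check_deadlines(samples, deadline_ms=5.0, hard=False,
                              tolerated_miss_rate=0.25)
```

With these assumed numbers the hard task fails because of the single miss, while the soft task is still within its tolerated miss rate.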
One can find many examples of both hard and soft real-time applications in the
automotive industry. So-called drive-by-wire systems, which have some use in trucks
and buses, use communication networks (like CAN) to control the throttle. A
late response of such a system may result in an uncontrollably accelerating vehicle
(at least until the driver reacts and activates the brakes), which can be directly
dangerous.


CAN is clearly a hard real-time system. A late response of a CAN controller
(the unit within an ECU that is responsible for physical layer CAN communication)
can be damaging to a vehicle.

3.2.2

Differential bus

Physically, a CAN transmission channel consists of two electrical wires which we
denote as CAN High (CAN_H) and CAN Low (CAN_L). CAN is a differential
bus, meaning that the difference in voltage between the two lines gives either a
dominant (logical 0) or a recessive (logical 1) bit. In an arbitration situation
(described below) the dominant bit wins the arbitration, and the other nodes
transmitting simultaneously stop transmitting, allowing the node which sent the
dominant bit to continue its transmission.

Figure 3.1. CAN differential bus. In a differential bus, there are two states: dominant (logical 0) and recessive (logical 1). The states are based on voltage difference
between the CAN_L and CAN_H lines. As can be seen in table 3.1 recessive state
is given by CAN_L and CAN_H at the same voltage level and dominant otherwise.
The sequence given in the figure is 0101

                 Recessive                  Dominant
         Min    Nominal   Max       Min    Nominal   Max     Unit
CAN_H    2.0    2.5       3.0       2.75   3.5       4.5     Volt
CAN_L    2.0    2.5       3.0       0.5    1.5       2.25    Volt

Table 3.1. Voltage levels of the differential bus that give the recessive resp. dominant
logical state on the bus in a transmission.

A dominant bit wins over a recessive bit if a conflict between two simultaneous
transmissions arises. See 3.2.5.
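As a minimal sketch of how the two line voltages map to a logical bit, the following decodes a bit from the difference CAN_H - CAN_L and models the rule that a dominant bit overrides a recessive one. The 1.0 V decision threshold is an illustrative assumption chosen between the nominal differences of table 3.1 (0 V recessive, 2.0 V dominant); it is not a value taken from the table or from a standard, and the function names are hypothetical.

```python
DOMINANT, RECESSIVE = 0, 1


def decode_bit(can_h, can_l, threshold_v=1.0):
    """Map the differential voltage to a logical bit.

    threshold_v is an assumed decision level for illustration only.
    """
    v_diff = can_h - can_l
    return DOMINANT if v_diff >= threshold_v else RECESSIVE


def bus_level(transmitted_bits):
    """Resulting bus state when several nodes transmit at the same time.

    The bus is dominant as soon as at least one node sends a dominant bit.
    """
    return DOMINANT if DOMINANT in transmitted_bits else RECESSIVE


# Nominal levels from table 3.1
nominal_recessive = decode_bit(can_h=2.5, can_l=2.5)
nominal_dominant = decode_bit(can_h=3.5, can_l=1.5)
```

The `bus_level` helper is the property that bit arbitration (section 3.2.5) relies on: a node that sends a recessive bit but reads back a dominant one knows that another node is transmitting.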


3.2.3

Data transmission
[Figure 3.2: layout of the CAN extended data frame (extended frame format): SOF,
arbitration field (32 bits, containing the 11-bit identifier, the 18-bit identifier
extension and RTR), control field (6 bits, including DLC), data field (0 - 64 bits),
CRC field (15-bit CRC and delimiter) and ACK field, with the regions with and without
bit stuffing marked. Figure annotation: maximum frame length with bit stuffing = 150 bits.]

Figure 3.2. CAN data frame

There are four types of frames on CAN:

Data frame
A frame that contains ordinary data of up to 8 bytes in length. This type of frame is
referred to later in this report as CAN-Message. The format for this type of frame
is given below:

Field            Length in bits   Meaning
SOF              1                Start of Frame
Identifier       11               Identifier of the message. Contains priority information.
SRR              1                Substitute Remote Request (must be recessive)
IDE              1                Recessive (1) in an extended frame. Dominant (0) in a
                                  standard frame. Makes sure a standard frame gets higher
                                  priority than an extended
Identifier Ext   18               Extended part of the identifier
RTR              1                Should be dominant (0)
Reserved bits    2                Always dominant (0)
DLC              4                Data length in bytes, 0-8
Data Field       0 - 64           Data to transmit
CRC              15               Cyclic Redundancy Check. Used for message integrity check
CRC delimiter    1                Always recessive (1)
ACK bit          1                Sent as recessive (1), can be set dominant (0) by receivers
ACK delimiter    1                Always recessive (1)
End of Frame     7                Always recessive (1)

Remote Frame
CAN messages may be either periodic or sent only on request. In the latter case
the node that wishes to receive the information must make a request by sending a
remote frame. A remote frame looks like a data frame, with two differences: the RTR
bit is recessive and there is no data field. Because the RTR bit is recessive in a
remote frame, a remote frame that is sent simultaneously with a data frame with the
same identifier loses arbitration, so the data frame is transmitted first on the bus
and the remote frame must be resent.
Error Frame
A node that detects a fault may send a message that deliberately violates the bit
stuffing rule. By sending 6 bits of the same polarity, either dominant or recessive,
a node notifies the other nodes on the network that it has discovered the fault. Other
nodes will then transmit error frames as well. An error frame consists of 6 consecutive
bits, either all dominant or all recessive, followed by 8 recessive bits that form the
error delimiter. Six dominant bits indicate an active error flag, which is sent by a
node that detects an error, while six recessive bits are sent by a node that detects
an active error flag on the bus.
Overload Frame
An overload frame is a way for a receiving node to indicate to the sending node that it
is busy at the moment. Its overall layout is very similar to that of an error frame.

Figure 3.3. A data field of a CAN Message containing 64 bits of data. In this
example it consists of 5 signals each using 12 bits. We see bytes on the vertical axis
and bits on the horizontal. Bits are given from right to left. A part of each signal
name is hidden since they have information class Internal. We see five different
temperature signals in the depicted CAN message.


A CAN message contains up to 8 bytes (64 bits) of data, which hold one or
more signals. A signal can range from one bit to 64 bits in size, and different signals
within the same message can have different lengths in bits. In fig. 3.3 we see what a
layout of signals within a message can look like. In this particular case all signals
have the same length. The layout of each CAN message must be known by both the
sending and the receiving node on the network.
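The need for a shared layout can be illustrated with a short sketch that packs five 12-bit signals into an 8-byte data field and unpacks them again. The signal names, the bit positions and the little-endian bit numbering are assumptions made for the example; they are not the actual layout of the message in fig. 3.3.

```python
def pack_signals(values, layout):
    """Pack signal values into an 8-byte CAN data field.

    layout maps signal name -> (start_bit, length_in_bits); bit 0 is the least
    significant bit of byte 0 (assumed little-endian numbering).
    """
    raw = 0
    for name, (start_bit, length) in layout.items():
        value = values[name]
        if not 0 <= value < (1 << length):
            raise ValueError(f"{name}={value} does not fit in {length} bits")
        raw |= value << start_bit
    return raw.to_bytes(8, "little")


def unpack_signals(data, layout):
    """Extract all signals from an 8-byte data field using the same layout."""
    raw = int.from_bytes(data, "little")
    return {
        name: (raw >> start_bit) & ((1 << length) - 1)
        for name, (start_bit, length) in layout.items()
    }


# Hypothetical layout: five 12-bit signals placed back to back (60 of 64 bits used).
LAYOUT = {f"temp_{i}": (12 * i, 12) for i in range(5)}
values = {"temp_0": 100, "temp_1": 2047, "temp_2": 0, "temp_3": 4095, "temp_4": 873}

payload = pack_signals(values, LAYOUT)
decoded = unpack_signals(payload, LAYOUT)
```

If sender and receiver used different layouts, `decoded` would silently contain wrong values, which is why the layout has to be agreed upon, typically through a shared message database.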

3.2.4

Bit stuffing

Bit stuffing makes sure that normally no more than five consecutive bits of the same
logical value are sent over the bus. Six consecutive bits of the same polarity are in
fact used to indicate an error. A bit of opposite polarity is inserted after each
occurrence of five consecutive bits of the same polarity. At the receiver side the
opposite is done: if a sequence of five bits of the same value has occurred, the next
bit is removed. Here is an example:

Original frame:      00000 10101 11110 10000 11110 01111 1...
After bit stuffing:  00000[1] 1010 11111[0] 0100 00111 10011 111[0]...

(bits in square brackets are the inserted stuffing bits)

The receiver removes the stuffed bits:

Received bits:       00000[1] 1010 11111[0] 0100 00111 10011 111[0]...
After destuffing:    00000 10101 11110 10000 11110 01111 1...
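The stuffing and destuffing rules can be sketched in a few lines. This is an illustrative implementation of the rule as described in the text (a stuff bit counts towards the next run of identical bits), not code from any CAN controller; the function names are hypothetical.

```python
def stuff(bits):
    """Insert a bit of opposite polarity after five identical consecutive bits."""
    out = []
    run_bit, run_len = None, 0
    for b in bits:
        out.append(b)
        if b == run_bit:
            run_len += 1
        else:
            run_bit, run_len = b, 1
        if run_len == 5:
            stuffed = 1 - b
            out.append(stuffed)
            # The stuff bit itself starts a new run.
            run_bit, run_len = stuffed, 1
    return out


def destuff(bits):
    """Remove the bit that follows five identical consecutive bits."""
    out = []
    run_bit, run_len = None, 0
    expect_stuff_bit = False
    for b in bits:
        if expect_stuff_bit:
            if b == run_bit:
                raise ValueError("stuff error: six identical consecutive bits")
            run_bit, run_len = b, 1
            expect_stuff_bit = False
            continue
        out.append(b)
        if b == run_bit:
            run_len += 1
        else:
            run_bit, run_len = b, 1
        if run_len == 5:
            expect_stuff_bit = True
    return out


def to_bits(text):
    """Helper: turn a string such as '00000 10101' into a list of ints."""
    return [int(c) for c in text if c in "01"]


# The example sequence from the text above.
original = to_bits("00000 10101 11110 10000 11110 01111 1")
on_the_bus = stuff(original)
recovered = destuff(on_the_bus)
```

Running the sketch on the example sequence inserts three stuff bits, and destuffing the result returns the original sequence.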

The last ten bits of a frame in fig. 3.2, i.e. the CRC delimiter, the ACK bits and the
End-of-Frame bits, are not subjected to bit stuffing. Neither are the three inter-frame
bits. Bit stuffing implies that there may be more bits to send over the bus than the
frame contains. The worst-case number of stuff bits that are added is given by

$$ n_s(n) = \left\lfloor \frac{n-1}{4} \right\rfloor $$

where $n$ is the number of bits in the stream and $\lfloor x \rfloor$ means to floor $x$
(giving the largest integer smaller than or equal to $x$). [CANTIMING, eq. 13.1]
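A quick way to see where the divisor 4 comes from is to construct a stream that triggers as many stuff bits as possible: after the first five identical bits, each inserted stuff bit already counts as the first bit of the next run, so only four more data bits are needed to force the next one. The sketch below builds such a stream and compares the counted stuff bits with the floor expression. The helper names are hypothetical and the construction is an illustration, not a proof.

```python
def max_stuff_bits(n):
    """Worst-case number of stuff bits for n bits subject to stuffing."""
    return (n - 1) // 4


def worst_case_stream(n):
    """Five ones followed by alternating groups of four zeros and four ones."""
    bits = [1] * 5
    level = 0
    while len(bits) < n:
        bits.extend([level] * 4)
        level = 1 - level
    return bits[:n]


def count_stuff_bits(bits):
    """Count the stuff bits a transmitter would insert into the stream."""
    count = 0
    run_bit, run_len = None, 0
    for b in bits:
        if b == run_bit:
            run_len += 1
        else:
            run_bit, run_len = b, 1
        if run_len == 5:
            count += 1
            # The inserted stuff bit has opposite polarity and starts a new run.
            run_bit, run_len = 1 - b, 1
    return count


# Compare the constructed worst case with the formula for a range of lengths.
matches = all(
    count_stuff_bits(worst_case_stream(n)) == max_stuff_bits(n)
    for n in range(1, 201)
)
```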

3.2.5

Bit arbitration

Bit arbitration is a mechanism that prevents two nodes from transmitting on the bus
simultaneously. When a node is about to send a message on CAN, it first checks
whether the bus is idle. If it is not, the node waits until it is.
If two nodes start to transmit a data frame simultaneously, the one with the
highest priority wins the arbitration, because its dominant (0) priority bit wins
over the recessive bit of a message with lower priority. This means that the node
sending the lower-priority message stops transmitting and waits for the bus to become
free before it retransmits.
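Arbitration can be sketched as a bit-by-bit comparison of the identifiers, most significant bit first. In each bit position the bus takes the dominant level (0) if any contender sends it, and every node that sent a recessive bit (1) but sees a dominant bus withdraws. The function below is a simplified model with hypothetical identifiers and assumes all contenders start at exactly the same time with distinct identifiers.

```python
def arbitrate(identifiers, id_bits=11):
    """Return the identifier that wins bitwise arbitration.

    identifiers: distinct message identifiers starting transmission simultaneously.
    id_bits: identifier width (11 for the base identifier).
    """
    contenders = set(identifiers)
    if not contenders:
        raise ValueError("at least one contender is required")
    for position in range(id_bits - 1, -1, -1):
        # The bus level is dominant (0) if any contender transmits 0.
        bus = min((ident >> position) & 1 for ident in contenders)
        # Nodes whose transmitted bit differs from the bus level back off.
        contenders = {
            ident for ident in contenders if ((ident >> position) & 1) == bus
        }
        if len(contenders) == 1:
            break
    (winner,) = contenders
    return winner


# Three hypothetical messages start at the same time.
winner = arbitrate([0x123, 0x0F0, 0x7FF])
```

The result is always the numerically lowest identifier, which is why a lower identifier means a higher priority on CAN.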

3.3

Software Testing

The software testing process may be coarsely divided into three levels, which are
- Module/Unit Testing
- Integration Testing
- System Testing
Of course, depending on the application, other levels and sub-levels to those mentioned
above are possible, as shown in fig. 3.4.

3.3.1

Module Testing (also called Unit Testing)

Module test is performed in order to verify that a single unit within a software
system is functioning as intended. A unit (or module) is typically a class in
object-oriented programming or a source code file (a C module for instance) in
procedure-oriented programming. The unit is isolated from the rest of the system and
tested independently, often in a separate testing framework [AST].

3.3.2

Function Testing

Function test is testing of what may be a combination of modules controlling a
specific function, like engine torque limitation in a truck engine.

3.3.3

Integration testing

Successful and thorough unit testing makes it plausible that a unit within a software
system is functioning as intended on its own, but there is no guarantee that the unit
interacts with other units/modules as intended. Integration test is about verifying
that units tested separately interact with each other as expected. The method is
typically to take a few units, run some tests, add more units, run some more tests
and so on, gradually increasing the number of units. Any errors revealed at this
stage are most likely related to the interfaces between the units, since the units
have been tested successfully on their own.

3.3.4

System testing

System test is about testing the whole system against requirement specifications.
One wants to verify that the system and all of its components are functioning as
intended in its regular working environment.

3.3.5

Acceptance Testing

Acceptance test is usually performed by actual users of a software product (so-called
beta testing) before a final release in order to test the product in its real operating
environment and possibly discover bugs that were not discovered in earlier testing
phases.

3.4

Other Types of Testing

3.4.1

Regression testing

Regression test is a type of test that is performed after changes in the code, with the
focus on revealing bugs that have previously been fixed but may have reappeared
after the change. It aims to reveal software regressions, meaning that functionality
that previously worked correctly stops working, hence the name. Regressions usually
occur when newly added code, whether new functionality or a bug fix,
introduces a new bug or makes a fixed one reappear [MSDNRT]. The method is
to re-run previously successful tests and check that none of the bugs appear or
reappear. This type of test can be performed on almost all test levels, from module
to system integration.
Regarding the relation of regression testing to the system test of powertrain ECUs, it
can be mentioned that this is a very large area that involves very advanced hardware
and software. As regression tests are performed over and over again, it is not feasible
to do all of this manually. Manual tests are performed on a live vehicle, so the
cost in terms of both man-hours and equipment (vehicle) utilization would increase
massively if this type of testing were to be performed manually. The need for test
automation is therefore significant here, and so-called Hardware-In-the-Loop (HIL)
rigs are used for this purpose. Their working principle is to simulate the
environment of the ECU to make it believe that it is actually inside a real vehicle.
For example, the EMS located inside the HIL rig is made to act exactly as it would
if it were controlling a real engine. The dynamics of the engine are simulated by
computer models based on differential equations.

3.5

Software Development and Testing Procedures

3.5.1

The V-model

One can summarize the path of software development from an initial idea to a complete
product using a so-called V-model. When the project starts, one is at the
upper left tip of the V drawn in fig. 3.4 and moves successively down towards
the lower tip, where code implementation takes place. The development part
(left side of the V) goes from the highest level (overview architecture and design)
to the lowest (implementation of separate modules). The testing part (right side of
the V) goes from the lowest level (unit tests) to the highest level (system integration
test) [TDP, p.43]. This model is a coarse representation of the method used by the ECU
systems software development groups at Scania.
The model in fig. 3.4 is simplified. The points are generalized and the exact ones are
defined depending on the application. Also, software development (as well as testing)
is an iterative process. Testing is performed continuously during the development
at module level and system level. Acceptance level testing is done immediately prior
to a release, often together with the customer.

[Figure 3.4: V diagram. Left (development) side, from top to bottom: Business
Requirements, System Requirements, Function Requirements, Module Design Spec. Right
(testing) side, from bottom to top: Unit Test, Function Test, System Test, Acceptance
Test. Code is at the lower tip of the V.]

Figure 3.4. The V Model used in ECU software development at Scania

The concepts described below will be referred to later in the report; therefore a
brief presentation of the most important software testing types and techniques is
given in the subsequent sections.

3.6

Test Techniques: Black Box Testing

Black box testing is testing of a piece of software where no knowledge of the internals
of the object being tested is necessary. The primary goal is to test functionality from
a user perspective. Test cases are built based on the functional requirement
specification and documents describing what the software should do. The test developer
decides what input should be given and what the expected output is. A description of
different black box test techniques follows.

3.6.1

Decision Tables and Decision Trees

Decision tables are used to analyze and test complex sets of rules where a number
of variables are used. The purpose is to examine the logical correctness of the set
of rules and to identify appropriate test cases. [TDP, p. 181]
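As a minimal sketch of the mechanics, the columns of a decision table over n two-valued conditions can be enumerated mechanically, which gives $2^n$ candidate test cases, and combinations that cannot occur can then be filtered out. The condition names and the consistency rule below are hypothetical and only serve to illustrate the procedure.

```python
from itertools import product


def all_columns(conditions):
    """Enumerate every True/False combination of the given conditions.

    Each returned dict corresponds to one column of the decision table.
    """
    return [
        dict(zip(conditions, combination))
        for combination in product([True, False], repeat=len(conditions))
    ]


def valid_columns(conditions, is_consistent):
    """Keep only the columns that the consistency rule accepts."""
    return [column for column in all_columns(conditions) if is_consistent(column)]


# Hypothetical conditions and rule: condition_3 can only hold if condition_1 holds.
CONDITIONS = ["condition_1", "condition_2", "condition_3"]


def example_rule(column):
    return column["condition_1"] or not column["condition_3"]


columns = all_columns(CONDITIONS)
test_cases = valid_columns(CONDITIONS, example_rule)
```

With three conditions this gives eight columns, of which six remain after the assumed rule is applied. Each remaining column is a candidate test case, to which the expected actions still have to be assigned from the requirements.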
The technique basically means that all variables in a set of rules are listed and
combined. An example of a decision table is given in tab. 3.2. The first quadrant
lists all the conditions and the second quadrant the condition entries. In the third
and fourth quadrants we have actions and action entries respectively. A condition is
something that is easy to qualify as either fulfilled or not, e.g. engine output torque
within [200, 1000] Nm, gearbox in neutral or not, functional degradation due to a
DTC enabled or not, and similar. An action is something that should or should
not be done when a set of conditions is fulfilled.

               1    2    3    4
Condition 1    T    T    F    F
Condition 2    T    F    T    F

Action 1       Y    Y    N    Y
Action 2       N    N    Y    Y
Action 3       N    Y    N    Y

Table 3.2. Decision table example. T = True, F = False

In tab. 3.2 we have a complete list of all possible combinations of two conditions, of
which there are $2^2 = 4$; in general, $2^n$ condition combinations are possible given
$n$ conditions. The strategy is then to rule out inconsistent combinations of
conditions, which are often present. It may also be the case that one condition rules
all others out. After simplifying the table to contain only valid combinations of
conditions, we can use heuristics to choose which cases to test. Generally each column
in the table is a test case. However, the number of columns grows exponentially with
an increasing number of conditions. NEVS made a summary of the pros and cons of
decision tables, which are:
Pros
- Good survey
- Test cases directly from the table
- Easy to limit number of test cases

Cons
- Exponential growth with the number of conditions
- Difficult to identify all the conditions
- Important test cases may disappear during simplification

3.6.2

State Transition Testing

State transition testing is a model-based test technique. It is commonly used when
testing event-triggered systems, real-time systems and digital electronics hardware.
Since we often deal with that kind of system in the automotive industry, this is an
important test technique for us. What characterizes this technique is that we use a
finite automaton graph to represent the states as nodes and the transitions between
states as edges. A very simple ATM machine can serve as an example (fig. 3.5).

[Figure 3.5: state diagram of the ATM example with the states S1 (Waiting for a Card),
S2 (Waiting for PIN) and S3 (Waiting for transaction) and the transitions E1 to E9,
each labeled with its event and response as listed in tab. 3.3.]

Figure 3.5. A simplified picture of an ATM transaction process. There are
3 states and 9 transitions in this finite automaton model.

In state transition testing, as in any other test technique, we need a good heuristic
to cover the state graph in the best way. In real-world applications, the state graphs
are often much more complex than the one presented above, and testing all the
transitions would be infeasible. Some heuristics for graph coverage are
1. The most probable paths
2. Traveling Salesman path: visit all states once
3. Eulerian path: visit all edges once
4. Risk based: path where a combination of transitions can be problematic
5. All paths of a certain length: takes a long time to process
6. All ways out of a state
7. All events that should not trigger a transition: this verifies system robustness
[TDP]
A transition table for the ATM example is given in tab. 3.3.

Trans. No.   Start state   Event                         Response                          End state
E1           S1            Wrong card                    Eject card                        S1
E2           S1            Correct card inserted         Ask for PIN                       S2
E3           S2            Wrong PIN third time          Block card                        S1
E4           S2            Cancel transaction            Eject card                        S1
E5           S2            Wrong PIN                     Ask for PIN                       S2
E6           S2            Correct PIN entered           Ask for transaction               S3
E7           S3            Wrong transaction input       Ask for transaction               S3
E8           S3            Transaction request correct   Perform transaction, eject card   S1
E9           S3            Cancel transaction            Eject card                        S1

Table 3.3. State transition table of an ATM machine
CHAPTER 3. THEORY

3.6.3

Equivalence Class Partitioning and Boundary Value Analysis

The purpose of equivalence class partitioning is to reduce the set of tests or test
data by the heuristic that a single representative of each class of test cases can be
used with the highest probability of finding an error if one exists. The choice of such
a representative test case should be based on the assumption that (although not
absolutely certain)
- if one test case within the class finds a deviation then the other ones within
  the class also do
- if one test case does not find any deviations then neither do the other test
  cases
The test cases in a set are considered equivalent if the above-mentioned points are
met. By experience one knows that errors usually occur at the boundaries (both in
input and output data). This is especially true when working with numeric values.
Test case design by equivalence classes consists of two steps.
1. Identify the equivalence classes
2. Design the test case
[MYERS, p.42]

[Figure 3.6: engine speed (RPM) axis with boundaries at 0 and 3000, dividing it into
class 1 (invalid), class 2 (valid) and class 3 (invalid).]

Figure 3.6. Illustration of equivalence class partition and boundary value analysis.
Here we have three intervals: < 0 rpm, 0 to 3000 rpm and > 3000 rpm.

As an example we see in fig. 3.6 that a continuous numerical variable such as the
engine speed is divided into three equivalence classes. In order to test how engine
controller software reacts to different sensor values, we may manipulate the
sensor to give any value we want and observe the software reaction. By
choosing these values wisely we only have to test a few values in each equivalence
class and still have a very high probability of discovering a deviation (error), instead
of testing all possible values (this number is finite since the measurement resolution
is limited). Experience shows that tests of values on the boundaries usually have the
highest probability of discovering an error (boundary value analysis).
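For the engine speed example, the selection of test values can be sketched as follows. The step of 1 rpm stands in for the measurement resolution and is an assumption, as are the function names; the class limits 0 and 3000 rpm are those of fig. 3.6.

```python
def equivalence_class(rpm, lower=0, upper=3000):
    """Return the equivalence class of an engine speed value (limits from fig. 3.6)."""
    if rpm < lower:
        return "class 1 (invalid)"
    if rpm > upper:
        return "class 3 (invalid)"
    return "class 2 (valid)"


def boundary_values(lower, upper, step=1):
    """Values just below, on and just above each boundary.

    step is the assumed resolution of the manipulated sensor value.
    """
    return [lower - step, lower, lower + step, upper - step, upper, upper + step]


# Six boundary test values instead of every possible engine speed.
test_values = boundary_values(0, 3000)
classified = {value: equivalence_class(value) for value in test_values}
```

One could add a single mid-range representative per class (for instance 1500 rpm for the valid class) to cover the equivalence class heuristic in addition to the boundaries.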

3.7

White Box Testing

This type of testing requires knowledge of the internal working of a piece of software.
The purpose here is to test how the object behaves on the inside. Test cases typically
contain input data that tests different execution paths of the software.

3.8

Gray Box Testing

In gray box testing we use the knowledge that we have of data structures
and algorithms for test case design, but we test at the user level, i.e. only at the
interface (black box level).
Modifying a data repository, however, does qualify as gray box testing, as the user
would not normally be able to change the data outside of the system under test. Gray
box testing may also include reverse engineering to determine, for instance, boundary
values or error messages.

3.9

Smoke testing

Smoke testing in software development means to perform a quick and less detailed
test as a preliminary to further testing. A set of test cases that test the most
important functionality is used as a smoke test to reveal severe bugs that need to
be fixed before ordinary testing can begin.

3.10

Non Functional Testing

Apart from the functional requirements on software there are often other sorts of
requirements on how the software should behave. The standard ISO9126 Software
Product Evaluation defines six headlines for assessing software quality. These are
1. Functionality: Presence of desired functions, correctness, inter-operation,
compliance with standards, security
2. Reliability: System robustness and correct function in different situations,
failure tolerance, recovery, accessibility
3. Usability: Ease to understand and use the system
4. Efficiency: The optimal resource utilization, timing aspects (performance),
resource requirements, scalability
5. Maintainability: The possibility to upgrade the system when necessary, the
ability to analyze, modify and test the system
6. Portability: Possibility of the system to work in different environments,
different operating systems or with different databases
The above points are important to bear in mind during the development process. It
is also important to let the experts in the respective areas perform these assessments.
For example, usability is in focus at an early stage of the system design and
should be handled by experts in usability. It is not a good idea to let test developers
without specialized knowledge handle the questions concerning it. [TDP, p.96]

3.11

How far should we test?

There is no simple answer to this question. To perform an exhaustive test of the
software is practically impossible for anything other than the simplest software. One
way is to use five basic criteria, which together are used to determine when we are
finished or, more likely, have achieved good enough quality. These are:
1. We have achieved the goal we set in the test coverage heuristics.
2. The number of discovered deviations (errors) is less than a limit value we
define.
3. The cost of finding more errors is larger than the estimated value loss due to
undiscovered errors.
4. The project team decides together that the product is ready to be released.
5. The management team decides to release the product.
[TDP, p. 269]
Each of the above-mentioned criteria has its weaknesses on its own. The fact that
we do not find any more errors may mean that we are not doing the tests right, and
not that there are no errors. The decision that the cost is higher than the gain from
performing more tests is a subjective assessment that is difficult for a test developer
to make. The 5th criterion is a deadline that is decided outside the testers' domain
and is not related to the quality of the tests but to a business assessment of when the
product shall be released. Several of the mentioned criteria shall be used together
in order to decide when the product is ready for a release.

Sufficient Quality
As mentioned, it is often not the testers who decide when a product is released.
However, a tester should be prepared to answer the question of what opinion he/she
has about the quality. The answer should be well founded. James Bach's view of
what he calls good enough quality can be summarized in four
points (referring to the product):
- It has a sufficient number of advantages.
- It does not have critical problems.
- The advantages sufficiently outweigh the disadvantages.
- Further testing would, the current situation and all parts considered, do more
  harm than help.
[BACH]
The bottom line here is that several criteria must be taken into consideration together
before deciding that testing is finished and the product is release-ready.

3.12

Formal Methods

The methods for software testing presented above are well accepted and widely
used in practical applications due to their simplicity of use and the possibility to
reuse the test cases over software refactoring cycles. This applies primarily to black
box testing methods. Black box testing can also be performed without detailed knowledge
of the internal workings of the tested system, which is often desired since it is
almost always different people who develop and test software in large industrial
applications. However, these methods lack a formal theoretical foundation.
Formal methods is a somewhat ambiguous denomination of a collection of
mathematical/logical tools for software testing and verification. These are, unlike the
software testing methods above, very well founded theoretically, but their use in
practical applications is very limited due to their complexity.


Chapter 4

Methods
The overall aim of this project is to contribute to the work aimed at making the
powertrain ECU software less prone to problems during start up and shutdown.
A natural first step in this work is to investigate what makes the ECUs prone to
start up and shutdown problems. By using technical analysis and complementing the
investigation with an analysis of failure reports, we identify a number of areas that
are considered to be the highest-risk areas for the ECUs during start up and
shutdown. The work process itself was largely based on the steps described below.

4.1

Technical Analysis

During this part we go down into details of the process of three different phases of
ECU operation which are start up, cranking and shutdown. We study a number of
technical regulations, system descriptions, software architecture documentations as
well as looking at the ECU software code base itself in order to determine scenarios
that could lead to problems like misleading DTCs, corrupt files etc.

4.2

Failure Report Analysis

While the technical analysis reveals areas where problems can occur, we need to
complement it with information about where they actually do occur. This was done by
searching internal failure report databases as well as interviewing system architects
and system owners of EMS, GMS and EEC as well as platform software, LLAP and
ComP architects. The database search was one of the most difficult parts, because
it is not always straightforward to determine how a failure occurred based on the
information given in the database to which authorized workshops and field tests
report. One often needs more information about the state of the ECUs at the
time when the problem occurred in order to perform an investigation, and this was
not always available. Another database that was searched contained failure reports
that were filed by the staff at NE, the powertrain control system department. This
database often contained more detailed information that is usable for finding
the actual cause of the problem.

4.3

Requirement Identification

Now that we have knowledge of where actual problems may occur, we need to
define or identify system functional requirements for the ECU software. This is
done by studying documentation on system function description, system function
specification and technical descriptions, and working out how the system software
should function, which leads to a requirements definition. It is very common that a
round of discussion with software developers and/or function developers is needed.
In the case of this thesis we identified the requirements by going through the
documents [TBENV], [TBJ1939COM], the CANM(1) description and requirements document,
and the FILE(2) description and requirements. A number of requirements that were found
to be related to start up or shutdown of the ECUs were picked and combined into a
test case requirement set upon which the test cases were built.

4.4 Test Techniques

A thorough study of different test techniques was carried out and is reported in
chapter 3. During this phase we decide which test techniques to apply to our test
cases and how. We make a test design, which is a specification of the test techniques
we choose and how we use them. The test design is specified in each test specification
document.

4.5 Developing the Test Cases

After making a test design, which is a rather theoretical part of the test development
process, it is time to develop a test flow, which is a set of exact instructions of what
to do and when. We need to identify the testability of the different requirements, that
is, how the tests should technically be performed. During this part we choose the test
platform, which may be a single ECU, a HIL rig or a complete vehicle. We chose to
perform all of our test cases in HIL rigs because we need to perform many
high-precision timing measurements that are not feasible in a vehicle. A
single ECU is not an option either, since we often need access to scenarios
where several ECUs communicate with each other in a realistic manner. During this
step we also identify the variables that we need to observe, whether they are internal
state variables in an ECU or CAN signals, as well as the sampling frequency and similar
technical details. The result of this process is a complete test case specification
that may be implemented as a script for the test automation HIL rigs.
¹ CAN Communication Manager in the LLAP layer
² Non Volatile Memory Manager

Chapter 5

Results
5.1 Technical Analysis

5.1.1 CAN messages and signals

Each CAN message that is sent periodically on the bus by an ECU has a defined period
time Tmax (according to the SAE-J1939 specifications). A database that contains
all CAN messages and their specifications, including the period time, comes along
with each software release. Not all CAN messages are periodic, though. Some are
sent only on request. In order to ensure good real-time performance in the embedded
ECU systems, each ECU must be able to detect missing expected periodic CAN
messages from other ECUs.
In EMS, LLAP¹ is responsible for monitoring CAN messages and indicating a timeout
when a signal fails to be received by EMS when it is expected. The timeout is indicated
by setting a signal status that travels along with the signal from LLAP to
APPL through RTDB. The signal value itself should be set to some pre-programmed
default value.

5.1.2 Signal and Component Statuses

Each of the ECUs that we are working with in this project has a mechanism for
diagnostics of the electrical components connected to the ECU. This mechanism is
rather complex and only a subset containing the necessary basics is covered here.
One part of this mechanism lies inside the ComP layer of the controller software,
another part is in the application layer. We call the former DIMA-BSW and the
latter DIMA-AP. A diagnosed component is an electrical component that is either a
sensor/actuator connected to the ECU or a part of the ECU electronics such as
memory, AD/DA converters and similar. The status of a component can indicate that
a component:
• is flawless
• is possibly affected by an error that is present in the system
• is affected by an electrical or non-electrical error
• does not exist in the current configuration

¹ See 2.2

A signal is a software variable that is transferred between different parts of the
software. Most signals represent physical quantities with units such as volt, bar or
kelvin. A signal is usually transferred together with a signal status. The signal
status is encoded using 8 bits and is contained in the same C-struct as the signal
value. The signal status indicates whether the signal
• is flawless
• is possibly affected by an error that is present in the system
• value is a good replacement value
• value is possibly a bad replacement value
• value has a plausibility error
• is not available or based on a nonexistent component
At ECU start up, the signal status is initialized to INIT, meaning that the signals
are not classified as good or bad at all until a specified amount of time has passed
since power up of the ECU.
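The pairing of a signal value with an 8-bit status can be illustrated with a small Python sketch. It is not the ECU implementation: the status names FLAWLESS, NOTAVAILABLE, BADREPLACEMENT and INIT come from this report, but the numeric codes, the `Signal` class and the assumed byte layout (a 32-bit float followed by one status byte) are hypothetical.

```python
import struct
from dataclasses import dataclass
from enum import IntEnum


class SignalStatus(IntEnum):
    # Numeric codes are hypothetical; only the names appear in the report.
    INIT = 0
    FLAWLESS = 1
    BADREPLACEMENT = 2
    NOTAVAILABLE = 3


@dataclass
class Signal:
    value: float          # physical quantity, e.g. a pressure in bar
    status: SignalStatus  # 8-bit status that travels with the value

    def pack(self) -> bytes:
        # Assumed layout: little-endian 32-bit float followed by one status byte.
        return struct.pack("<fB", self.value, int(self.status))

    def is_trustworthy(self) -> bool:
        # Only a FLAWLESS signal is treated as reliable in this sketch.
        return self.status == SignalStatus.FLAWLESS


# At start up the status is INIT, so the value is neither good nor bad yet.
boost = Signal(value=1.5, status=SignalStatus.INIT)
```

A consumer would check `is_trustworthy()` before using `value`, which mirrors how the application layer is described as deciding whether to rely on a signal.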

5.1.3 Start Up

The start up times of two ECUs were measured. A measurement device
(Ipetronik M-SENS) was connected to the S8 U15 signal through a break-out box,
along with the CAN1 channel, which made it possible to record the U15 signal in
the same graph as all the CAN signals sent by EMS on the red CAN bus, using
ATI Vision. The difference between the time point where U15 becomes high (24 V)
and the time point when the first CAN signal is sampled is considered the start-up
time. The measurement was performed on a live vehicle. Five iterations were
performed, giving an average start up time of 0.2733 s with a standard deviation
of 0.0017 s. See fig 5.1.
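The evaluation of such a recording can be sketched as follows. The timestamps below are illustrative values only, not the measurements from the vehicle, and the use of the sample standard deviation is an assumption since the report does not say which estimator was used.

```python
from statistics import mean, stdev


def startup_time(t_u15_high: float, can_timestamps: list[float]) -> float:
    """Time from U15 going high to the first CAN frame seen after it."""
    first = min(t for t in can_timestamps if t >= t_u15_high)
    return first - t_u15_high


# Illustrative (made-up) recordings: (U15 rising edge, CAN frame timestamps) in s.
runs = [
    (1.000, [1.270, 1.280]),
    (2.000, [2.275, 2.285]),
    (3.000, [3.272, 3.282]),
]
times = [startup_time(t0, frames) for t0, frames in runs]
avg = mean(times)      # average start-up time over the runs
spread = stdev(times)  # sample standard deviation (assumed estimator)
```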
Further, a number of failure reports from different kinds of test vehicles indicate the
resetting of duty cycle data when starting with a low battery voltage as a problem
area. These problems should be handled at module level and/or ComP level. Other
reports point out possibly false or misleading DTCs with a variety of causes at start
up. Some of the reported problems are nearly impossible to test at the system level
and should be investigated by the developers at module level. Others seem to be a
result of a previous incorrect shutdown or file saving.
Figure 5.1. Measuring of the EMS S8 start up time. Markers in the plot: U15 is turned
on, first CAN signal from S8, first CAN signal from RET.

5.1.4 Cranking

It is a known phenomenon that engine start causes a short drop in the supply voltage
from the battery, since the starter motor in a heavy vehicle requires a massive
amount of electrical current. Technically, a driver must first turn the ignition key
to the ON position, which gives the U15 signal to all the system ECUs. Only after that
can the engine be started by turning the key to the START position. Also, the
engine management system EMS must be fully up and running in order for it to
be possible to start the engine. This diploma project is not focused on the engine
start up but rather on the ECU start up. However, this phenomenon is still interesting
since the voltage drop may affect the ECU behavior.

5.1.5 Shutdown

When the driver turns off the ignition key, the U15 signal is broken. The software
of the ECUs detects this and initiates the shutdown process of
the ECU. However, the ignition key turn-off is only one of several conditions for ECU
shutdown. To mention a few examples: the engine controller, EMS, must ensure
that the engine is standing still; the gearbox controller, GMS, must check that the
neutral gear is in; and the SCR system controller, EEC, must sometimes perform a
regeneration process which may take some time, during which the controller must
be running. Also, most of the ECUs have a set of data that is saved to non-volatile
memory during shutdown so that it can be read at the next start up. This includes but
is not limited to duty cycle data, DTC and freeze-frame² data, adaptation data etc.
This data is saved at shutdown and an ECU must not power itself down until the
saving process is complete. There are also kinds of data that are not allowed to be
written to NVM areas at shutdown. An example of the latter is End-of-Line (EOL)
configuration data. This data is critical for the system function and is therefore
written only during a reprogramming session initiated by a KWP request. During
system runtime a copy of this data resides in RAM and there is, at least in
theory, a possibility that it is accidentally modified in RAM and then written
back to NVM at shutdown, and in this way corrupted. A test case (denoted
Test Case 2) in this report is focused on these issues with the EMS.

Figure 5.2. A starter motor requires a lot of electrical current from the battery.
Batteries in heavy trucks normally give 24 V DC. But as we see, the heavy current
consumption during engine start may cause the voltage level to drop significantly
below 24 V. This may affect the ECU software. Here we see the U15 signal level
(which is the same as U30 provided ignition is ON, and 0 V otherwise). The step
at 4.3 s is due to the ignition key being turned to ON. At 5.2 s the ignition key is
turned to the ENGINE START position and the engine starts. Later, when the engine has
started, the generator produces about 28 V.

5.2 Definitions

Before we continue to describe the results that were achieved during this project we
must clearly define the concepts test case and test flow. A test case is
a system function test specification document that is a complete and detailed
description of the tested function, system requirements, prerequisites, post-requisites,
limitations, used test design techniques, used heuristics and finally test flows.

² A Freeze Frame is a set of state variables whose values are saved at the moment when
a DTC is set

5.3. TEST CASE 1: EMS EEC COMMUNICATION, CAN TIMEOUTS DETECTION

A test flow is a set of direct instructions to the implementer of the test. The
instructions describe what to do, when to do it and how to do it. A test flow typically
covers one requirement or a number of them. In some cases, several requirements
are best combined in a single test flow. A test flow may be conducted directly in a
vehicle or a test rig and it can also be implemented as a script for a test rig.
A total of two test cases were developed in this project. The complete test
case documents are not presented here since they contain some information that
is not allowed to be published, but the parts of these documents that are considered
most interesting and do not reveal internal information are given below. Test flows are
presented as completely as possible, with some modifications in order not to reveal
internal information. The main principles should however be clearly readable.

5.3 Test case 1: EMS EEC Communication, CAN timeouts detection

5.3.1 Requirement Identification

Correspondence with test developers at NEVS revealed that a test developer had
discovered a DTC that indicated a CAN message timeout. A
message that EEC sends to EMS with a nominal period time (which we call Tmax)
of 200 ms was wrongly programmed in EMS to be expected every 50 ms. For
some messages, a DTC is set when a time of 5 Tmax has passed without an expected
message (no related DTC is allowed to be set earlier than 5 Tmax due to a
requirement). In this particular case, the DTC "Communication with the SCR control
unit error" was set, apparently because EMS expected the message from
EEC every 50 ms, so after 5*50 = 250 ms it was allowed to set a CAN timeout DTC,
which it did. The message should in normal cases be received every 200 ms, which
is less than the 250 ms allowed to pass before a DTC is set. However, due to high bus
load the message was delayed for an additional amount of time and was missing for
250 ms or longer, which led to the DTC.
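The arithmetic behind this failure can be made explicit with a short sketch. The function name and its parameters are hypothetical; the numbers (50 ms, 200 ms and the factor 5) are those given in the text.

```python
def dtc_threshold_ms(expected_period_ms: int, tolerated_frames: int = 5) -> int:
    """Time without a message after which a timeout DTC may be set."""
    return tolerated_frames * expected_period_ms


nominal_ms = 200                      # period at which EEC actually sends
wrong = dtc_threshold_ms(50)          # EMS wrongly expected the message every 50 ms
right = dtc_threshold_ms(nominal_ms)  # threshold intended by the 5 Tmax rule

# Slack between the nominal period and the DTC threshold: how much extra
# delay a frame can suffer before the timeout DTC is allowed.
margin_wrong = wrong - nominal_ms
margin_right = right - nominal_ms
```

With the wrong configuration a single frame delayed by 50 ms or more is enough to reach the threshold, whereas the intended configuration tolerates 800 ms of additional delay.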
We decided that a test case should be developed to detect this kind of error.
This test case is to be based on the requirements:
1. No DTC is allowed to be set until 5 frames of a message are lost (i.e. until a time
of $5\,T_{max}$ has passed since the last received message)
2. S8 LLAP must detect a missing CAN message after 5 missing instances (i.e.
if $t_{message} \geq 5\,T_{max}$)
3. Results of the timeout test for a message must not be reported until at least 2
seconds + $5\,T_{max}$ have passed since ECU on
4. Timeout must not be reported outside Operating voltage mode, in this case
22 V–32 V
A closer inspection of these requirements follows.
Requirement 1
The rationale behind this requirement is that fault codes (DTCs) indicating a
communication error with an ECU should not be set too early. Such
a fault code may be misleading and make a user or a repair workshop mechanic
think that there is a faulty ECU that should be replaced, when in fact a CAN
message was just delayed a little because of high load on the CAN bus. Therefore
each ECU that receives CAN messages from other ECUs must tolerate a delay
of the messages that are expected to be received within a certain period time.
System architects at Scania have decided that a delay of up to 5
times the nominal message period time should be acceptable and not result in a DTC.
As to the testability of this requirement, it is not easy to check DTCs in an ECU
at exactly the right moment according to the test specifications. A KWP request
must be made to the ECU asking it for a list of DTCs, and time is
required to process the request and return an answer, which contains a list of DTCs.
Altogether this process can take up to several seconds to complete, so this
straightforward approach is not feasible since we must know when a DTC is set with
millisecond precision. We therefore use another way. By inspecting the code we find
that the CAN communication manager, call it CANM, reports a timeout diagnostic result
through what is called a test communication struct (testcom struct for short). A
testcom struct is a C-struct, or more precisely a bit-field, containing 8 bits. There
is a testcom struct variable declared for almost all CAN messages that S8 receives
from EEC3. These variables can be monitored in ATI Vision³ by using CCP⁴. By monitoring
this 8-bit integer we can see whether a timeout failure is reported by looking at one
bit in this structure.
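Evaluating such a recording amounts to testing one bit in each sampled byte. The sketch below is an illustration only: the bit position `TIMEOUT_BIT` and the function names are hypothetical, while the 10 ms sample period follows from the 100 Hz CCP sampling rate mentioned in this report.

```python
TIMEOUT_BIT = 0  # hypothetical position of the timeout flag in the testcom struct


def timeout_reported(testcom: int) -> bool:
    """True if the timeout bit is set in an 8-bit testcom value."""
    return bool((testcom >> TIMEOUT_BIT) & 1)


def first_timeout_ms(samples: list[int], sample_period_ms: int = 10):
    """Time in ms (from the first sample) at which the timeout bit first
    appears in a recording, or None if it never does."""
    for i, raw in enumerate(samples):
        if timeout_reported(raw & 0xFF):
            return i * sample_period_ms
    return None
```

The time resolution of this approach is limited by the sample period, so a timeout can only be located to within one sample.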
³ ATI Vision (or just Vision) is computer software used to perform measurements of
internal state variables in an ECU, CAN monitoring etc.
⁴ CCP = CAN Calibration Protocol, which is a way to monitor internal ECU variables
with a 100 Hz sampling rate.

Requirement 2
The signals that are contained within CAN messages often carry valuable real-time
data. There are cases where closed control loops over CAN are based on the
signals. Therefore it is critical that the application layer of the ECU software knows
whether a signal value is reliable or not. LLAP indicates this by altering the signal
status. The signal status is an 8-bit variable that travels from LLAP to APPL via
RTDB in the same C-struct as the signal value and is used by APPL to decide whether
to rely on the signal value, degrade some functions of the vehicle/engine or
set a DTC. When a CAN message that is expected to be received with a
certain period time stops arriving, the LLAP must degrade its signal status
after 5 times the period time and set the signal value itself to a pre-defined default
value. There are many levels of signal status; however, in this test case we
deal with only three of them: FLAWLESS, NOTAVAILABLE and BADREPLACEMENT.
In fig. 5.3 we see actual measurements (through CCP) that show how the signal status
is degraded for two messages, each of which is sent with a cycle time of 6 times its
nominal cycle time. When the CAN message with a nominal cycle time of 50
ms is received (every 6*50 = 300 ms), the status of its signals is set to FLAWLESS
and remains FLAWLESS for 5*50 = 250 ms. After that the signal
status is degraded to NOTAVAILABLE and holds that status for 50 ms, after which
a new instance of the message is received and the signal status is restored
to FLAWLESS. The same applies to the message with a cycle time of 200 ms. This
measurement was performed on a single EMS S8 unit connected to a DC power
source. A VCI2 unit was connected (see fig. 2.7) to the CAN2 and CAN3
ports of the S8 unit. The software used to program the S8 unit and read DTCs/DECs was
XCOM (internal Scania software), which can send and receive diagnostic information from
the ECU through its CAN3 channel. The software that was used to monitor the
internal ECU status variables, including signal statuses, was ATI Vision. Vision
also allows us to send simulated CAN messages so that they appear to come from
the EEC3 unit, which is not present. The CAN messages were sent this way.
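The expected duration of each status within one message cycle follows directly from the 5 times period rule. The helper below is a sketch with a hypothetical name; it ignores the quantization caused by the sampling rate and by the internal task period of the ECU.

```python
def status_durations_ms(t_max_ms: int, actual_cycle_ms: int,
                        tolerated_frames: int = 5) -> tuple[int, int]:
    """Return (flawless_ms, notavailable_ms) per cycle of a message that is
    received every actual_cycle_ms while its nominal period is t_max_ms."""
    limit = tolerated_frames * t_max_ms       # status is degraded after this time
    flawless = min(actual_cycle_ms, limit)
    return flawless, actual_cycle_ms - flawless
```

For the 50 ms message sent every 300 ms this gives 250 ms FLAWLESS and 50 ms NOTAVAILABLE, which matches the behavior described for fig. 5.3.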

Requirement 3
Variation in the wake-up time of different ECUs makes it inappropriate to set a
diagnostic trouble code too early. For the same reason that an ECU is required to
tolerate 5 missing CAN message frames from another ECU before setting a timeout DTC,
an ECU must tolerate ECUs that do not start
to transmit CAN messages immediately after U15 on. This is to ensure
that a DTC that indicates a faulty ECU is set if and only if there is a faulty ECU.
A time of two seconds is therefore allowed to pass after ECU wake up, in addition
to the usual rule of no DTCs until a time of 5 Tmax has passed without an expected
CAN message, to give all ECUs on the CAN bus a chance to power up properly.
As to the testability of this requirement, we need to be able to measure how much
time has passed since the ECU was turned on. As measurements indicate, it takes
some time for an ECU to wake up and start responding to CCP after U15 on. It
is therefore not accurate to base the time tracking on the time point when U15 is
turned on. Instead, we use an internal variable of the ECU that counts how many
10 ms loops have passed since initialization of the ECU. By recording this variable
together with other variables such as testcom structs we can see at which point in ECU
run-time a timeout indication appeared.
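Converting the loop counter to ECU run-time and checking it against requirement 3 can be sketched as below. The function names are hypothetical, and treating the limit itself as allowed (greater than or equal) is an assumption, since the exact boundary is not clearly defined by the requirements.

```python
LOOP_PERIOD_MS = 10  # the counter increments once per 10 ms loop


def ecu_uptime_ms(loop_count: int) -> int:
    """ECU run-time since initialization, derived from the loop counter."""
    return loop_count * LOOP_PERIOD_MS


def timeout_report_allowed(loop_count: int, t_max_ms: int) -> bool:
    """Requirement 3: no timeout result before 2 s + 5 Tmax after ECU on.
    The boundary value itself is assumed to be allowed."""
    return ecu_uptime_ms(loop_count) >= 2000 + 5 * t_max_ms
```

For a message with Tmax = 200 ms the earliest allowed report is at 3000 ms, i.e. at a counter value of 300.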

Figure 5.3. We see how the signal status (no engineering unit) jumps between
FLAWLESS (upper line) and NOTAVAILABLE (lower line) for two CAN messages
traveling from EEC3 to S8. The messages are sent with a cycle time 6 times the
nominal cycle time. In the upper graph the status belongs to a signal that is contained
in a message with a cycle time of 200 ms, and in the lower graph the corresponding cycle
time is 50 ms.

Requirement 4
This requirement is another measure to counter the phenomenon of misleading
DTCs. As we discovered during the technical analysis, the supply voltage from a
vehicle's battery may vary significantly. When the control unit system is powered by
the vehicle's battery, i.e. when the ignition is on but the engine is off, and the
battery is weak and unable to supply the nominal 24 V DC, not all sensors/actuators and
control systems can be expected to work properly and send reliable data. Therefore,
according to [TBENV], full ECU system functionality can only be expected when
the system supply voltage is within the range of 22 V–32 V. Outside this range, a
CAN timeout DTC may be misleading and is therefore not allowed.
Taking these requirements as a set of rules we can now formulate a decision table,
which is used to decide which sets of conditions are to be covered by the test,
see tab. 5.1.
Decisions   U30 within 22 V–32 V        T  F  T  F  T  F  T  F
            tECU on > 2 s + 5 Tmax      T  T  F  F  T  T  F  F
            tmsg > 5 Tmax               T  T  T  T  F  F  F  F
Actions     DTC allowed                 Y  N  N  N  N  N  N  N
            Timeout detected by LLAP    Y  N  N  N  N  N  N  N

Table 5.1. Decision table for the described test case
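The rule structure of the decision table can be reproduced programmatically, which is useful when the table is turned into an automated test script. This is a sketch with hypothetical names; the column order is chosen to match the table, with the voltage condition alternating fastest.

```python
from itertools import product


def dtc_allowed(voltage_ok: bool, uptime_ok: bool, msg_late: bool) -> bool:
    """A timeout DTC is allowed only when all three conditions hold."""
    return voltage_ok and uptime_ok and msg_late


# Each row: (U30 within range, tECU on large enough, tmsg large enough, DTC allowed).
rows = [
    (v, u, m, dtc_allowed(v, u, m))
    for m, u, v in product([True, False], repeat=3)
]
```

Only the first of the eight rules yields an allowed DTC; the remaining seven are the negative cases that the test must also cover.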

5.3.2 Test Techniques

In the cases where we verify that the system properly reacts to a timeout of a CAN
message, we do it by sending a simulated message to the ECU on CAN bus with a
modified cycle time.
Requirement 1
Tested using equivalence class partitioning and boundary value analysis in the time
domain for all diagnosed messages using their respective testcom struct. The
equivalence classes are
1. $t_{msg,act} < 5\,t_{msg}$
2. $t_{msg,act} \geq 5\,t_{msg}$
where $t_{msg}$ is the maximal period time of each tested message and $t_{msg,act}$
is the period at which a manipulated message is sent during the test.
Requirement 2
Tested using equivalence class partitioning and boundary value analysis in the time
domain in a similar way as for the previous requirement. In this case we use the signal
statuses specified in the system function specification.
Requirement 3
Tested using boundary value analysis in the time domain (testcom struct). The
equivalence classes are
1. $t_{ECU\,on} < 2\,\mathrm{s} + 5\,t_{msg}$
2. $t_{ECU\,on} \geq 2\,\mathrm{s} + 5\,t_{msg}$
Requirement 4
According to [TBENV] the Operating Voltage Mode is 22 V to 32 V. This should be
tested by equivalence class partitioning and boundary value analysis.
We have three equivalence classes
1. $U_{op} < 22\,\mathrm{V}$
2. $22\,\mathrm{V} \leq U_{op} \leq 32\,\mathrm{V}$
3. $32\,\mathrm{V} < U_{op}$
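The voltage partitioning can be expressed as a small classifier together with a generator of candidate boundary values. The limits 22 V and 32 V are from [TBENV] as cited above; the function names and the 0.5 V step are hypothetical choices for illustration.

```python
U_MIN, U_MAX = 22.0, 32.0  # Operating Voltage Mode limits in volts


def voltage_class(u_op: float) -> int:
    """Return the equivalence class (1, 2 or 3) of a supply voltage."""
    if u_op < U_MIN:
        return 1
    if u_op <= U_MAX:
        return 2
    return 3


def boundary_values(step: float = 0.5) -> list[float]:
    """Test voltages just below, on and just above each class boundary.
    The step size is an assumed value, not taken from the test specification."""
    return [U_MIN - step, U_MIN, U_MIN + step,
            U_MAX - step, U_MAX, U_MAX + step]
```

A relaxed boundary value analysis would simply drop the two values that lie exactly on the limits.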

5.3.3 Test Flows

By logically grouping the requirements we define three test flows which cover
different test requirements. For each test flow it is specified which requirements
it covers. The test flows are presented below. Please note that the variable/constant
names are not allowed to be published. Therefore the variable names given in the test
flows below are not real. The names given here are however chosen to be descriptive of
what each variable represents.
A note about the use of boundary value analysis: we are using a somewhat
relaxed version of it by not analyzing signal data exactly on the boundaries, since
the system requirements do not define them clearly and it is not deemed necessary.


Test Flow 1
This test flow covers requirements 1 and 3.


Test Flow 2
This test flow covers requirement 2. Note that the appendix mentioned here refers
to the appendix in the internal test specification document. Fig. 5.5 in this report
closely corresponds to it.


Test Flow 3
This test flow covers requirements 1, 3 and 4 and involves manipulating the system
supply voltage.

5.3.4 Testing the Test Case

This test case was run on two different software versions, one of which had a known
bug. The software version with the known bug was 61.38.00, and the bug
was fixed in version 61.38.01. In fig. 5.4 we see how the signal status of one message,
whose Tmax according to CANdb is 200 ms, behaves when the message is sent with an
actual cycle time of 1200 ms. In 61.38.00 we see cycles of 950 ms with status
NOTAVAILABLE and 250 ms with status FLAWLESS. This is incorrect behavior and violates
requirement 2 by degrading the signal status too early: with Tmax = 200 ms the status
should remain FLAWLESS for 5*200 = 1000 ms. In fig 5.5 we see the correct behavior of
the signal. The test case was able to detect this bug.


Figure 5.4. Signal status variation. The higher line indicates FLAWLESS signal
status, the lower NOTAVAILABLE signal status. This recording shows the faulty behavior.

Figure 5.5. Signal status variation. The higher line indicates FLAWLESS signal
status, the lower NOTAVAILABLE signal status. This recording shows the correct behavior.

5.4 Test case 2: EMS Shutdown

The process of shutting down an ECU may be complex and depend on conditions
both within the software of the ECU being shut down and in other ECUs. For
instance, EMS must wait for EEC3 to shut down in some cases. In general, an
ECU must be shut down in such a way that it is possible to start it up again
without it entering an undefined state. Another important issue is that data in the
RAM mirror must be saved correctly to EEPROM during shutdown. Each time
data is saved a CRC⁵ checksum is calculated and stored in EEPROM. During start-up,
the checksum of the EEPROM data is calculated again and compared to the one
stored at the time the data was saved. A mismatch indicates a corrupt data area, which
should then be replaced by pre-defined default values.

⁵ CRC = Cyclic Redundancy Check
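The save and verify cycle can be sketched as follows. The CRC-32 from Python's `zlib` module is used only as a stand-in, since the report does not state which CRC the ECU uses; `DEFAULTS`, `save` and `load` are hypothetical names.

```python
import zlib

DEFAULTS = bytes(16)  # hypothetical default contents of one NVM data area


def save(data: bytes) -> tuple[bytes, int]:
    """Shutdown: store the data together with its checksum."""
    return data, zlib.crc32(data)


def load(stored: bytes, stored_crc: int) -> bytes:
    """Start-up: recompute the checksum and fall back to the default
    values if it does not match the stored one."""
    if zlib.crc32(stored) != stored_crc:
        return DEFAULTS
    return stored
```

An interrupted write shows up as a checksum mismatch at the next start-up, so corrupt data is replaced by defaults instead of being used.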
The technical analysis shows that S8 in a road vehicle (truck or bus) is powered by
the vehicle's battery, or by the generator when the engine is running. Besides the U30
power supply, S8 (like almost any other ECU) has a U15 input signal, which is defined as

$$U_{15} = \begin{cases} U_{30} & \text{if ignition is on} \\ 0 & \text{otherwise} \end{cases}$$

When ignition is turned from on to off, the ECU detects the loss of the U15 signal and
initiates its shutdown sequence. During this sequence the ECU assumes that it has
full U30 voltage supply. If U30 is lost while the shutdown phase is in progress, the
ECU may enter an undefined state and/or corrupt the EEPROM data. This scenario is
fully realistic in a truck because trucks are equipped with a master battery
switch, usually located outside the cab near the battery, or in some special purpose
vehicles inside the cab. This switch turns off all of the vehicle's electrical systems
(with very few exceptions). If this switch is turned off while ignition is on, then
S8 will lose power at the same moment as the U15 signal is switched to 0. In the worst
case, the ECU may initiate the EEPROM saving process without having
a chance to complete it due to the sudden voltage loss.
The above illustrates the rationale behind developing a test case that tests the
correctness of the shutdown process. As with the previous test case we begin the
development process with a requirement identification.

5.4.1 Requirement Identification

Four requirements were identified and chosen to be covered by this test case. The
abbreviation EXEM in this section refers to a software module within the SYSM
manager that controls the initial phase of the execution of an ECU.
Requirement 1: The EMS should turn itself off if and only if the
following conditions are fulfilled
1. U15 is OFF for 100 ms
2. No engine movement for 100 ms
3. APPL layer allows shutdown or a timeout has occurred
4. EEC, if present, allows shutdown or a wait timeout has occurred
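The four conditions of requirement 1 combine into a single predicate. The sketch below is an interpretation for illustration; the field names are hypothetical, and a duration of exactly 100 ms is assumed to satisfy the timing conditions.

```python
from dataclasses import dataclass


@dataclass
class ShutdownInputs:
    u15_off_ms: int       # time for which U15 has been continuously OFF
    engine_still_ms: int  # time without engine movement
    appl_allows: bool     # APPL layer allows shutdown
    appl_timeout: bool    # APPL wait timeout has occurred
    eec_present: bool     # EEC exists in the current configuration
    eec_allows: bool      # EEC allows shutdown
    eec_timeout: bool     # EEC wait timeout has occurred


def may_shut_down(s: ShutdownInputs) -> bool:
    """True only when all four shutdown conditions are fulfilled."""
    u15_ok = s.u15_off_ms >= 100
    engine_ok = s.engine_still_ms >= 100
    appl_ok = s.appl_allows or s.appl_timeout
    eec_ok = (not s.eec_present) or s.eec_allows or s.eec_timeout
    return u15_ok and engine_ok and appl_ok and eec_ok
```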
Requirement 2: EXEM shall validate a DTC if the ECU does not
perform the correct shutdown actions for a configurable number of
times in a row
See description of requirement 3
Requirement 3: EXEM shall validate an INTE as soon as it detects
abnormal shutdown conditions
It is important that an incorrect shutdown process is detected by the ECU during
the following start-up. The cause may be an interruption of the power supply (U30) or a
software/hardware failure. According to a system requirement, an incorrect
shutdown must result in an internal event (INTE) and, after a configurable number of
times, a DTC. This test case verifies these requirements by provoking an incorrect
shutdown and checking that an INTE and a DTC are set. The incorrect shutdown is provoked
by cutting the U30 power supply to EMS while it is running, giving it no chance to save
files and perform a normal shutdown sequence.
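The behavior required by requirements 2 and 3 can be modeled with a counter of consecutive abnormal shutdowns. This is a sketch of one plausible reading, not the EXEM implementation: the class and attribute names are hypothetical, and resetting the counter after a correct shutdown is an assumption derived from the wording "in a row".

```python
class ShutdownMonitor:
    def __init__(self, max_abnormal: int):
        self.max_abnormal = max_abnormal  # configurable limit (hypothetical name)
        self.consecutive = 0              # abnormal shutdowns in a row
        self.inte_set = False
        self.dtc_set = False

    def on_startup(self, previous_shutdown_ok: bool) -> None:
        """Evaluate at start-up how the previous shutdown ended."""
        if previous_shutdown_ok:
            self.consecutive = 0          # assumed: a correct shutdown breaks the row
            return
        self.consecutive += 1
        self.inte_set = True              # INTE on every abnormal shutdown
        if self.consecutive >= self.max_abnormal:
            self.dtc_set = True           # DTC after the configured number in a row
```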
Requirement 4: It must be possible to abort a shutdown process until
all shutdown actions are acknowledged
There is a system requirement stating that a powertrain control unit should be
up and running within 500 ms from ignition key on. Measurements indicate that
a complete file saving process takes about 900 ms, meaning that an S8 shutdown takes
at least 900 ms. It is therefore impossible to fulfill the 500 ms start-up time
requirement if the ignition key is turned on very shortly after it has been turned off
and the system performs a complete shutdown and start-up sequence. Therefore
there is a system requirement that S8 must interrupt the ongoing shutdown sequence
and restore itself to a fully operational ECU if U15 is suddenly turned on during
the time that the ECU saves files and prepares for shutdown as a result of a U15
turnoff.


5.4.2 Test Techniques

For requirement 1 we use a decision table. Based on the conditions in requirement
1 we form a decision table as shown in tab. 5.2. With the help of the
decision table we get a good overview of all possible scenarios. Their number is
reasonably small in this case. One should be careful with decision tables, as one or
more important condition combinations may be excluded when they should not be.

Table 5.2. Decision table for the S8 shutdown test case. The majority of the
combinations of the four conditions were excluded from the test case heuristically,
based on the assumption that it is unlikely that two conditions in combination would
give a failure if they do not when tested separately. This way we test whether the EMS
does shut down when all conditions are fulfilled and does not when one condition at a
time is not fulfilled.

To verify the other requirements no special test technique was used, since the test
data set and the number of execution flows are limited and can be verified directly.
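The reduction from all combinations to the selected ones can be sketched as follows. The condition names are hypothetical labels for the four conditions of requirement 1.

```python
CONDITIONS = ["u15_off", "engine_still", "appl_allows", "eec_allows"]


def reduced_test_set():
    """All conditions fulfilled, then each condition unfulfilled on its own.
    Returns a list of (case, expected_shutdown) pairs."""
    cases = [dict.fromkeys(CONDITIONS, True)]
    for name in CONDITIONS:
        case = dict.fromkeys(CONDITIONS, True)
        case[name] = False
        cases.append(case)
    return [(case, all(case.values())) for case in cases]
```

This yields five combinations instead of the sixteen of a full decision table, at the cost of not covering failures that only appear when two conditions are unfulfilled at the same time.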

5.4.3 Test Flows

Test Flow 1: All Conditions Fulfilled

Note that variable names are modified in order not to disclose internal
information; the real variable names are different. Also note that u15
in this case is a boolean variable that is 1 when U15 (ignition) is on and 0 when
it is off. During this test flow we assume
that the EEC3 unit does not request stay-alive. We also assume that the signal
status of the EEC3 stay-alive request is FLAWLESS at all times. The variable
u15_allow_shutdown is used to determine whether the U15 condition for shutdown
is fulfilled. For the U15 condition to be fulfilled, u15 must be 0 for 100 ms.

Test Flow 2: Engine does not Stop


In this test flow we test that the EMS does not turn off when the engine is running.
S8 is not allowed to turn itself off until there has been no engine movement for some
time. In this test flow we verify that the EMS stays turned on for as long as the engine
is on. There is no timeout, so EMS may theoretically stay up for an unlimited amount
of time as long as the engine is running, but since we cannot test for an unlimited
time we must define a time limit. We set 60 seconds as it appears to be a reasonable
time limit. It is reasonable to assume that EMS either shuts itself down soon after
all conditions for shutdown except engine movement have been fulfilled (which would
mean that this test flow fails) or stays up as long as the engine is moving.
Even if the EMS were to shut down after more than 60 seconds with the engine still
moving, the consequences are not deemed severe.

Test Flow 3: APPL Layer Holds Shutdown due to Adaptations

[Test flow table not reproduced. Identifiers visible in it: adaptation_shutdown_halt,
ADAPTATION_TIMEOUT.]

Test Flow 4: EEC3 halts the Shutdown

[Test flow table not reproduced. Identifier visible in it: adaptation_shutdown_halt.]

5.4.4 Detect an Abnormal Shutdown and Set a DTC and Internal Event (INTE)

Test Flow 5

Pt  Action                                        Expected result
1   Turn U15 on and connect XCOM to the
    power train ECUs.
2   Clear DTCs, DECs and INTEs
3   Turn battery off (U15+U30)
4   Turn battery back on, turn U15 on and         Check INTEs, an INTE in module (reset
    reconnect XCOM to the ECU (if necessary)      cause check) should be set.
5   Clear all DTCs/DECs/INTEs
6   Perform battery turn off and turn back on     Check that a DTC 0xF001 (Incorrect EMS
    the number of times given by                  shutdown) is set
    MAX_NO_OF_ABN_SHUTDOWNS

5.4.5 Possibility to Cancel a Shutdown in Progress

Since it is not known exactly how much time the file saving process takes, we cannot put
a time requirement on how long it should be possible to interrupt a shutdown after U15
off. This test flow is therefore based on first measuring how much
time a file saving process takes and then performing the scenario again, interrupting
the shutdown after half the measured time. Since variations in file saving time of more
than 50% seem very unlikely, we consider that the system does not behave according to
the requirement if it is not possible to interrupt a shutdown after half the measured
time.
Test Flow 6

5.4.6 Test Results

This test case was run in the test automation environment, which is a
Hardware-in-the-Loop (HIL) rig containing the ECUs EMS (S8), GMS (OPC5), EEC3, COO7
and ICL2. A selection of the most interesting results is presented here.


Test Flow 1

Figure 5.6. Vision recording of the scenario given in test flow 1 of this test case.
The recording was stopped after the ECU (EMS) stopped responding to CCP, and
we assume that it powers itself down directly after that. This means that the last
samples in this recording are the last samples of the ECU before shutdown. At approx.
23.5 s the ignition (U15) is turned off, and approx. 100 ms after that the
u15_allow_shutdown condition becomes true. At approx. 25.5 s the engine stops moving
and the eec_allow_shutdown conditions become true, and after a further approx. 1.2 s the
EMS stops responding to CCP. This 1.2 s can be explained by the time it takes to
save the NVM data.
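The delays read off the recording can also be extracted programmatically. The sketch below, in Python, finds the first transition of a boolean signal in a list of samples. The function name first_edge_time is hypothetical, and the sample values are illustrative only: they are built from the approximate times quoted in the caption and are not the recorded data.

```python
from typing import Optional, Sequence


def first_edge_time(times: Sequence[float], values: Sequence[bool],
                    to: bool) -> Optional[float]:
    """Time of the first sample at which the signal changes to `to`, else None."""
    for i in range(1, len(values)):
        if values[i] == to and values[i - 1] != to:
            return times[i]
    return None


# Illustrative samples only, built from the approximate times in the caption.
times = [23.0, 23.5, 23.6, 25.5]
u15 = [True, False, False, False]
u15_allow_shutdown = [False, False, True, True]

t_off = first_edge_time(times, u15, to=False)
t_allow = first_edge_time(times, u15_allow_shutdown, to=True)
print(f"u15_allow_shutdown delay: {t_allow - t_off:.3f} s")
```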

Test Flow 2-4
The behavior in these test flows is very similar to that in the test flow 1 result.
The differences are only in details and variable names.
Test Flow 5
By inspection with XCOM it was found that the INTE code was set when the U30
switch was shut off while the ECU was running in the HIL lab. Also, after one shutdown
performed by cutting off U30 to the control unit, DTC 0xF001 (Incorrect EMS shutdown)
was set.
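The behaviour that test flow 5 expects can be summarised in a small model. The Python sketch below is an assumption about the logic implied by the test flow, not the actual EMS implementation: the class ShutdownMonitor and its members are hypothetical, and whether the counter is reset by clearing fault codes or by a clean shutdown is not stated in the source, so the sketch leaves it untouched.

```python
from dataclasses import dataclass, field

DTC_INCORRECT_EMS_SHUTDOWN = 0xF001  # DTC code quoted in the test flow


@dataclass
class ShutdownMonitor:
    """Toy model of the expected behaviour; not the actual EMS implementation."""
    max_abnormal_shutdowns: int          # stands in for MAX_NO_OF_ABN_SHUTDOWNS
    abnormal_count: int = 0
    inte_set: bool = False
    dtcs: set = field(default_factory=set)

    def power_on(self, previous_shutdown_clean: bool) -> None:
        """Called at start-up with the outcome of the reset cause check."""
        if previous_shutdown_clean:
            return
        self.inte_set = True             # INTE after every abnormal shutdown
        self.abnormal_count += 1
        if self.abnormal_count >= self.max_abnormal_shutdowns:
            self.dtcs.add(DTC_INCORRECT_EMS_SHUTDOWN)

    def clear_faults(self) -> None:
        """Models clearing DTCs/INTEs; the counter is assumed to be kept."""
        self.inte_set = False
        self.dtcs.clear()
```

With max_abnormal_shutdowns set to 1, the model reproduces the observation above that a single U30 cut-off sets both the INTE and the DTC.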

Test Flow 6
Please observe that in the following plots the units on the y-axis are different for
each line. Boolean variables (u15 and e2saving_in_progress) and time counters, which
have the unit seconds, are mixed in the same plot.

Figure 5.7. U15 is turned off at approx. 6.1 s and then turned back on at approx.
7.3 s. As we see, the counter that counts the time since the ECU was turned on did not
reset. This means that the ECU went back to the fully operational state without being
turned off first.

Figure 5.8. This is the scenario where the EMS actually does turn itself off, as we see
from the run time counter. The time period during which the counters (ecuon_time_counter
and e2saving_time) are constant is the time when the EMS is down and Vision
is not receiving any samples from CCP.
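The distinction between Figures 5.7 and 5.8 can be checked automatically from the run time counter. The Python sketch below assumes that the counter starts again from a lower value after a power-down, which is what the captions describe as a reset. The function name ecu_restarted and the sample values are hypothetical.

```python
from typing import Sequence


def ecu_restarted(ecuon_time_counter: Sequence[float]) -> bool:
    """True if the run time counter ever decreases, i.e. the ECU rebooted."""
    pairs = zip(ecuon_time_counter, ecuon_time_counter[1:])
    return any(later < earlier for earlier, later in pairs)


# Illustrative samples only, not recorded data.
cancelled = [5.0, 6.0, 7.0, 8.0]            # counter keeps running (Figure 5.7)
powered_down = [5.0, 6.0, 6.0, 6.0, 0.5]    # held value, then restart (Figure 5.8)

print(ecu_restarted(cancelled), ecu_restarted(powered_down))
```

A constant stretch alone is not treated as a restart, since it only indicates that no new samples arrived.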

Chapter 6

Conclusions and Suggestions for Future Work

6.1

Conclusions

In general, many things can go wrong during ECU or vehicle start-up and shutdown.
To classify the areas of highest risk, one needs experience in developing or testing
the ECU software (preferably both) and thorough knowledge of its internal structure.
The author's communication with platform software developers was of great help in
making progress and in identifying some high-risk problem areas.
Through communication (informal interviews) with other developers and testers
at the department, and through analysis of technical documents, some fault reports and
the fault code database, some problem areas were identified as probably the highest-risk
ones, and two test cases were developed. These problems can be coarsely partitioned into:
- Start-up problems, such as CAN communication timeouts and fault codes that
  may be false or misleading. A great number of these problems can most likely
  be explained by a previous improper shutdown.
- Cranking problems, which result from a sudden voltage drop on the U30 supply
  line to all the ECUs. At very cold temperatures (down to -40 °C) this problem
  is even worse than what is shown in fig. 5.2. In such cases the voltage may
  reach 0 V and stay at this level for a significant amount of time [Interview
  with the platform architect at NE, Feb. 2011].
- Shutdown problems, which comprise all cases where an ECU is turned off unexpectedly
  due to either a deliberate or non-deliberate U30 cut-off, or an unexpected
  ECU reset due to software (or hardware) problems. A common denominator
  for these scenarios is that the ECU has little chance of saving its data to
  NVM. It can be compared to typing a document in a word processor on a
  desktop computer and then having the computer suddenly lose power before
  the document has been saved.
The assignor's requirement was to develop several test cases related to the identified
problem areas within the start-up and shutdown of the ECUs and, if time and/or
resources allowed, also to implement the test cases in either a test automation rig or a
live vehicle. When the first test case was developed (i.e. the test specification was
produced) at the end of March 2011, it was clear that there was no time
to go on with the implementation phase. The natural next step for future work (a
recommendation to Scania) is therefore to implement the test case EMS-EEC CAN
Timeout, as well as part of the EMS Shutdown test case, as a HIL script.
One of the main conclusions of this work is that there is no simple solution to
the start-up and shutdown problems. System test alone is also not a sufficient approach
to attack these problems. Many things happen under the surface during
start-up and shutdown, which requires a more white-box approach. In addition, many of
the problems discovered are not caused by software bugs. This mainly applies to problems
that arise from battery voltage or system electrical supply interruptions. Other
approaches, such as module testing and possibly hardware redesign, must therefore be
considered.

6.2

Future Work Suggestions

One natural continuation of this work would be to investigate further which
files in the EMS S8 system are most critical to save correctly and to add test flows
focused on these to the S8 Shutdown test case. It must also be verified that data that
is not allowed to be written to non-volatile memory, such as EOL data, is not
written at shutdown. This can be done by modifying the configuration
parameters in the RAM mirror through CCP and shutting down the unit under
test. At the next start-up the RAM mirror must contain the original value of each
parameter, i.e. it must not be affected by the modification made before shutdown. Either
all configuration parameters can be tested this way, or a heuristic selection of the
most important ones can be made.
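The verification idea above could look roughly like the following Python sketch. All names are hypothetical: read_param, write_param and power_cycle stand in for whatever CCP access and rig control the test environment provides, parameters are assumed to be integers, and FakeEcu is only an in-memory stand-in that makes the sketch runnable.

```python
from typing import Callable, Dict, Iterable


def check_not_persisted(
    names: Iterable[str],
    read_param: Callable[[str], int],
    write_param: Callable[[str, int], None],
    power_cycle: Callable[[], None],
) -> Dict[str, bool]:
    """Map each parameter to True if its original value is back after restart."""
    names = list(names)
    originals = {name: read_param(name) for name in names}
    for name in names:
        # Flip the lowest bit so the modified value always differs.
        write_param(name, originals[name] ^ 1)
    power_cycle()
    return {name: read_param(name) == originals[name] for name in names}


class FakeEcu:
    """In-memory stand-in for the unit under test (illustration only)."""

    def __init__(self, nvm: Dict[str, int], persists_ram: bool) -> None:
        self.nvm = dict(nvm)
        self.ram = dict(nvm)
        self.persists_ram = persists_ram  # True models the faulty behaviour

    def read(self, name: str) -> int:
        return self.ram[name]

    def write(self, name: str, value: int) -> None:
        self.ram[name] = value

    def power_cycle(self) -> None:
        if self.persists_ram:
            self.nvm = dict(self.ram)     # fault: RAM mirror written to NVM
        self.ram = dict(self.nvm)         # start-up reloads the mirror from NVM


good = FakeEcu({"eol_param": 4}, persists_ram=False)
print(check_not_persisted(["eol_param"], good.read, good.write, good.power_cycle))
```

Passing the access functions as arguments keeps the check independent of the tool used, so the same routine could be run over all configuration parameters or over a selected subset.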
Another suggestion is to implement the given test flows in a test
automation environment, specifically the HIL rigs, to make them more useful as regression
tests. The test flows may have to be modified in order to be scriptable, as
they are currently written on the assumption that they are run manually in the
HIL rig. Instead of inspecting the signal and/or testcom struct graphs manually in
ATI Vision, a script can be written to go through the data arrays and compare the
values of samples at critical time points with predefined reference arrays.
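A minimal sketch of such a comparison script is given below in Python. It assumes that a recording is available as two parallel lists (sample times and values) and that the reference is a list of (time, expected value) pairs. The function names and the data layout are assumptions made for illustration and do not describe an ATI Vision export format.

```python
from bisect import bisect_right
from typing import List, Sequence, Tuple


def sample_at(times: Sequence[float], values: Sequence[float], t: float) -> float:
    """Value of the most recent sample at or before time t (zero-order hold)."""
    i = bisect_right(times, t) - 1
    if i < 0:
        raise ValueError(f"no sample at or before t={t}")
    return values[i]


def compare_with_reference(
    times: Sequence[float],
    values: Sequence[float],
    reference: Sequence[Tuple[float, float]],
    tol: float = 0.0,
) -> List[Tuple[float, float, float]]:
    """Return (time, expected, actual) for every reference point that deviates."""
    mismatches = []
    for t, expected in reference:
        actual = sample_at(times, values, t)
        if abs(actual - expected) > tol:
            mismatches.append((t, expected, actual))
    return mismatches
```

An empty result means that the recording agrees with the reference at all critical time points. The tolerance argument allows for analog signals, while boolean signals can be compared with the default of zero.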
Failure report analysis shows that many problems in development or field
test vehicles are due to battery/electrical problems and supply voltage variations.
One of the problems is that it is possible to shut down an ECU improperly
by switching off the master battery switch while the ignition is on. An intuitive idea
is to solve this problem by introducing a backup battery in each ECU,
independent of the main battery. When U30 is turned off, the ECU can then use
the backup battery power to perform a proper shutdown, which would significantly
reduce the number of misleading fault codes. This would, however, require a major
hardware redesign of the ECUs. The author of this report is unable to decide
whether it is worth the effort, and recommends that the matter be investigated.


