automated design software to save time and cost.
semiconductor and capacitor technologies ensure more reliable power electronic
monitoring and fault tolerant design allow extended lifetime.
level and system level smart de-rating operation.
of heat flow and thermal distribution by controlling the power in power
approach provides insight into how to avoid failures in power electronic
mission profile and on-line monitoring data from the field.
testing for reliability prediction and robustness validation.
understanding of failure mechanisms and failure modes of reliability critical
Increasing electrical/electronic content
in mission profile and strength of components.
paradigms and lack of understanding in design for reliability approach.
The major challenges and
opportunities in the research on reliability of power electronic systems are
efforts have been devoted to better power electronic systems in terms of
reliability to ensure higher availability, more power generation and lower
maintenance cost. A paradigm shift in reliability research on power
electronics is under way, from simple handbook-based calculations to the
physics-of-failure approach and a design for reliability process. A systematic
design procedure consisting of various design tools is presented in this report
to outline how to design reliability into power electronic products
from the early concept phase.
monitoring is an effective way to enhance reliability when the power converters
are in operation. It provides the real-time operating characteristics and
health condition of the system by monitoring specific parameters of the power
electronic components (e.g. the saturation voltage of IGBTs). Proactive
maintenance can therefore be planned to avoid failures that would otherwise occur.
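As a minimal sketch of how such monitoring data could trigger proactive maintenance, the Python snippet below flags a device once a monitored parameter has drifted from its healthy baseline for several consecutive samples. The function names, the 5% drift threshold, the three-sample window and the logged voltages are illustrative assumptions, not values taken from this report.

```python
from typing import Sequence


def relative_drift(measured: float, baseline: float) -> float:
    """Fractional change of a monitored parameter versus its healthy baseline."""
    return (measured - baseline) / baseline


def needs_maintenance(samples: Sequence[float], baseline: float,
                      threshold: float = 0.05, window: int = 3) -> bool:
    """Return True when the last `window` samples all exceed the drift threshold.

    `threshold` and `window` are hypothetical defaults chosen for illustration.
    Requiring several consecutive samples avoids reacting to a single noisy reading.
    """
    if len(samples) < window:
        return False
    recent = samples[-window:]
    return all(relative_drift(s, baseline) > threshold for s in recent)


# Hypothetical IGBT saturation-voltage readings in volts (healthy baseline 2.00 V).
vce_sat_log = [2.00, 2.01, 2.03, 2.11, 2.12, 2.13]
alarm = needs_maintenance(vce_sat_log, baseline=2.00)
```

In practice the threshold would come from the failure criteria chosen for the specific component and application.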
4.3 On-line condition monitoring
schemes can improve thermal loading of the power device. Some modulation
methods for thermal optimization of 3L-NPC-BTB wind power converters during
extreme LVRT are proposed in . The basic idea of these modulations is to
select the proper vector sequences, which can reduce the dwell time or the
commutations involving the zero voltage level. The losses and thermal stress in
the most stressed devices can thereby be reduced.
4.2 Low Voltage Ride Through (LVRT) thermal optimized modulation
Selection of IGBT modules has a significant
impact on the lifetime. A comprehensive analysis of the device selection based on
both cost and performance is needed to avoid either over-engineering the design or
failing to meet the specifications.
of proper devices
4 Methods to Improve Reliability
In the cycles-to-failure relation $N = k(\Delta T - \Delta T_0)^{-m}$, k and m are empirically determined constants and N is the number of cycles to failure. $\Delta T$ is the temperature cycle range and $\Delta T_0$ is the portion of that in the elastic strain range. The constant
parameters in the combined models can be estimated according to the available
data. The reliability of each critical individual component is therefore
predicted by considering each of its associated critical failure mechanisms. To
map the component level reliability prediction to the system level, a system
modeling method such as reliability block diagram (RBD), fault-tree analysis (FTA) or state-space
analysis (e.g. Markov analysis) is applied as discussed in detail in .
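The two steps in this paragraph can be sketched in Python: a component-level cycles-to-failure estimate, assuming the common Coffin-Manson form N = k(ΔT − ΔT0)^(−m), followed by a series reliability block diagram in which the system works only if every block works. The constants k and m, the temperature swings and the component reliabilities below are hypothetical placeholders. The series structure is only one possible RBD topology and is assumed here for illustration.

```python
import math
from typing import Iterable


def cycles_to_failure(delta_t: float, delta_t0: float, k: float, m: float) -> float:
    """Coffin-Manson style estimate: N = k * (delta_t - delta_t0) ** (-m).

    delta_t  : temperature cycle range (K)
    delta_t0 : part of the range that stays in the elastic region (K)
    Returns infinity when the whole swing is elastic (no damage accumulates).
    """
    plastic_range = delta_t - delta_t0
    if plastic_range <= 0.0:
        return math.inf
    return k * plastic_range ** (-m)


def series_reliability(component_reliabilities: Iterable[float]) -> float:
    """Series RBD: system reliability is the product of block reliabilities."""
    return math.prod(component_reliabilities)


# Hypothetical constants, for illustration only.
K_CONST, M_CONST = 1.0e10, 4.0
n_small_swing = cycles_to_failure(40.0, 10.0, K_CONST, M_CONST)
n_large_swing = cycles_to_failure(60.0, 10.0, K_CONST, M_CONST)

# Hypothetical reliabilities of three critical components at a given time.
r_system = series_reliability([0.99, 0.97, 0.95])
```

A larger temperature swing yields fewer cycles to failure, and the series system is never more reliable than its weakest block.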
Thermal cycling is
found to be one of the main drivers for failure of IGBT modules. The effect of
temperature cycling can be explained by the typical stress-strain curve in Fig. 9.
Here $\sigma$ is defined as the cyclic stress (e.g. temperature cycling) and $\varepsilon$ is defined as
the deformation. With a low cyclic stress below the yield stress $\sigma_{yield}$, no damage occurs, and the material is in
the elastic region. When the stress is increased above $\sigma_{yield}$, an irreversible deformation is induced, and
the material enters the plastic region. The coefficients of thermal expansion of
the different materials in the IGBT modules differ, leading to stress
formation in the packaging. The degradation continues with each cycle until
the material fails. The number of cycles to failure for temperature cycling can
be obtained as
Structural details of an IGBT module with red connections that are relevant to
cycling is a response to the converter line and loading variations as well as
the periodic commutation of power switching devices. It induces cyclic
temperature stress on the different layers of materials used in the fabrication of power
electronic components. For example, Fig. 8 shows the typical structural details
of IGBT modules.
3.3.2 Life model on the temperature cycling effect
The impact of additional stress on materials/device degradation
under low and high stress, respectively. Thus, a
power law dependence on stress is used to bridge the gap between low stress
and high stress.
combined stresses are applied, the activation energy is dependent on additional applied stress (e.g. electrical stress,
mechanical stress, and chemical stress) as shown in Fig.7. The parameters a and
b are determined from stress-induced degradation testing data. a is temperature
dependent and defined as . It can be obtained that
Free energy description of material/device degradation
Where $K_B$ is
Boltzmann's constant ($8.62 \times 10^{-5}$ eV/K), T is the temperature in kelvin and $k_0$ is
a material/device specific constant. It should be noted that the
simplified result in (1) has the same form as the Arrhenius equation, which is widely used
for reliability prediction. The value of the activation energy depends on
the type of material and device.
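To make the Arrhenius dependence concrete, the sketch below evaluates a rate of the form k0 · exp(−Ea / (K_B · T)) and the resulting acceleration factor between a use temperature and a hotter stress temperature. The symbol names, the 0.7 eV activation energy and the two temperatures are assumptions chosen for illustration. The real activation energy depends on the material and failure mechanism.

```python
import math

K_B_EV = 8.617e-5  # Boltzmann's constant in eV/K


def arrhenius_rate(k0: float, ea_ev: float, temp_k: float) -> float:
    """Reaction (degradation) rate k0 * exp(-Ea / (K_B * T))."""
    return k0 * math.exp(-ea_ev / (K_B_EV * temp_k))


def acceleration_factor(ea_ev: float, t_use_k: float, t_stress_k: float) -> float:
    """Ratio of the rate at the stress temperature to the rate at the use temperature.

    The prefactor k0 cancels, so only Ea and the two temperatures matter.
    """
    return arrhenius_rate(1.0, ea_ev, t_stress_k) / arrhenius_rate(1.0, ea_ev, t_use_k)


# Hypothetical example: Ea = 0.7 eV, use at 55 degC, stress test at 125 degC.
af = acceleration_factor(0.7, t_use_k=55.0 + 273.15, t_stress_k=125.0 + 273.15)
```

Because the temperature sits in the exponent, a moderate rise in temperature gives a large increase in degradation rate, which is why temperature dominates the stress sources discussed in this report.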
Fig. 6 illustrates
the degradation of a material or device from an initial stable state with free
energy $G_1$ to
a degraded state with free energy $G_2$. The driving force for this degradation
is the free energy difference between $G_1$ and $G_2$, defined
as $\Delta G$. The heat induced by power losses in
power electronic components provides the energy for the transformation from one
state to the other. The activation energy $E_a$ limits
the rate of the degradation. Define
$k_{forward}$, $k_{reverse}$ and $k_{net}$ as
the degradation rate, recovery rate and net reaction rate, respectively. It can
be derived that
3.3.1 Degradation model on the temperature effect
Fault tolerant design is a way to reduce system
level failures in critical applications requiring a high level of
reliability. Thanks to redundancy, a fault in a component or subsystem does
not induce the failure of the whole system, thus protecting the system from
significant loss or unexpected interruptions. At this stage, an initial
reliability prediction can be performed. Temperature and temperature cycling
are the major stressors that affect the reliability performance, as shown in Fig.
3, and they will become more significant with the trend towards high power density and
high temperature power electronic systems. Thus, two models are presented here
to study their effects.
Multi-domain simulation, especially
electrical-thermal simulation, is a very useful tool to virtually investigate
the static and dynamic properties of the system. The link between the
electrical domain and the thermal domain is the power loss and thermal model of
each individual component. Finite Element Analysis can be used to study the thermal
3.3 Design Phase – Initial Design
This element covers the following four aspects: a)
fundamental operation of the power electronic circuit and system; b) electrical
and thermal stress analysis based on the system specifications and mission
profile, for preliminary selection of components that meet the stress-strength
requirement; c) Failure Mode and Effects Analysis (FMEA) to identify the failure
mechanisms as shown in Table 1, the failure modes (e.g. open circuit, short circuit,
etc.), the occurrence and severity level of each failure and the likelihood of prior
detection for each cause of failure; and d) a list of the reliability critical
components in the system and their associated failure mechanisms.
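One common way to turn the severity, occurrence and detection ratings of step c) into the critical-component list of step d) is a risk priority number (RPN), the product of the three ratings. The sketch below assumes 1 to 10 scales where a higher detection score means the cause is harder to detect beforehand. The class name, the listed causes and all scores are hypothetical, and the report does not state that RPN is the ranking method used.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FailureCause:
    component: str
    mechanism: str
    severity: int    # 1 (negligible) .. 10 (catastrophic)
    occurrence: int  # 1 (rare) .. 10 (frequent)
    detection: int   # 1 (always detected beforehand) .. 10 (undetectable)

    @property
    def rpn(self) -> int:
        """Risk priority number: severity * occurrence * detection."""
        return self.severity * self.occurrence * self.detection


def rank_by_rpn(causes: List[FailureCause]) -> List[FailureCause]:
    """Return the causes ordered from highest to lowest risk priority."""
    return sorted(causes, key=lambda c: c.rpn, reverse=True)


# Hypothetical FMEA entries with illustrative scores.
fmea = [
    FailureCause("IGBT module", "bond wire lift-off", 8, 5, 6),
    FailureCause("Electrolytic capacitor", "electrolyte evaporation", 6, 6, 3),
    FailureCause("Gate driver", "optocoupler degradation", 7, 2, 4),
]
critical_list = rank_by_rpn(fmea)
```

The top entries of `critical_list` are the candidates for the reliability critical components carried into the initial design.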
3.2 Design Phase – Analysis
Distribution of wind speed (red) and energy generated (blue).
Proposed design for reliability procedure for power electronic systems [16]
In the initial concept phase, the relevant conditions
to which the power electronic systems are expected to be exposed (i.e. the mission
profile) are identified. Benchmarking of system architecture and circuit
topology is conducted. Then, the potential new risks brought into the design
are analysed based on past experience and on the new types of devices and topologies applied.
In the wind power application, the mission profile mainly depends on the wind
speed profile. Thus, it is necessary to analyse the features of the wind profile
based on recorded data for a specific location and time. Fig. 5 shows an example
of the wind speed distribution at the Lee Ranch facility during the year
2002 [16]. The histogram shows the measured data, while the curve is the
Rayleigh model distribution for the same average speed [16].
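A Rayleigh wind speed model is fully determined by the average speed, which makes it convenient for a first mission-profile estimate. The sketch below computes the Rayleigh density and the expected hours per year spent in each 1 m/s wind speed bin. The 7 m/s average speed is a hypothetical value, not the measured Lee Ranch average, and the function names are illustrative.

```python
import math


def rayleigh_scale(mean_speed: float) -> float:
    """Rayleigh scale parameter that reproduces the given mean speed."""
    return mean_speed * math.sqrt(2.0 / math.pi)


def rayleigh_pdf(v: float, mean_speed: float) -> float:
    """Probability density of wind speed v (m/s)."""
    if v < 0.0:
        return 0.0
    s = rayleigh_scale(mean_speed)
    return (v / s**2) * math.exp(-v**2 / (2.0 * s**2))


def rayleigh_cdf(v: float, mean_speed: float) -> float:
    """Probability that the wind speed is at most v (m/s)."""
    if v < 0.0:
        return 0.0
    s = rayleigh_scale(mean_speed)
    return 1.0 - math.exp(-v**2 / (2.0 * s**2))


def hours_in_bin(v_low: float, v_high: float, mean_speed: float,
                 hours_per_year: float = 8760.0) -> float:
    """Expected hours per year with wind speed between v_low and v_high."""
    return hours_per_year * (rayleigh_cdf(v_high, mean_speed)
                             - rayleigh_cdf(v_low, mean_speed))


MEAN_SPEED = 7.0  # m/s, hypothetical site average
bins = [(float(v), v + 1.0) for v in range(0, 30)]
hours = [hours_in_bin(lo, hi, MEAN_SPEED) for lo, hi in bins]
```

Combining the hours in each bin with the converter loading at that wind speed gives a simple long-term mission profile for the stress analysis.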
3.1 Concept Phase
A systematic DFR procedure specifically applicable to
power electronic system design is proposed, as shown in Fig. 4. The
procedure designs reliability into each stage of the development process
(i.e. concept, design, validation, production, and release) of power electronic
products, especially the design phase. Attention is therefore given to the
detailed procedures and various design tools applied in the design phase,
following the initial design concept.
3 Proposed DFR Procedure for Power Electronic Systems and Associated
Failure and stress distributions in power electronic systems [15]
(a) Failure root cause distribution. (b) Source of stresses distribution.
To perform reliability-oriented design, it is
worthwhile to explore the major failure mechanisms of all reliability
critical components. Fig. 3 (a) and Fig. 3 (b) show the failure distribution
among power electronic components [14] and the sources of stress that have a
significant impact on reliability. It can be noted that capacitors and
semiconductors are the most vulnerable power electronic components, and that temperature
has the most significant impact on the reliability of power electronic components
and systems. Thus, electrical-thermal analysis and simulation are important and
necessary to perform reliability-oriented design.
2.4 Typical Distribution of Failures and Source of Stresses in Power
Where: T-temperature; H-humidity; Δ-cyclic range; V-voltage; M-moisture; J-current
density; ∇-gradient; S-stress.
Failure mechanism, relevant loads, and models in electronics
The Physics-of-Failure (PoF) approach is a methodology
based on root-cause failure mechanism analysis and the impact of materials,
defects, and stresses on product reliability [12]. Failure mechanisms can be
generally classified into overstress and wear out. Overstress failure arises
from a single load event that exceeds the strength (e.g. over voltage), while wear out
failure arises because of cumulative damage related to load (e.g. temperature cycling). Compared
to empirical failure analysis based on historical data, the PoF approach requires
knowledge of deterministic science (i.e. materials, physics and chemistry)
and of probabilistic variation theory (i.e. statistics). The analysis involves the
mission profile of the component, the type of failure mechanism and the associated
physical-statistical model. Table 1 gives examples of wear out failure
mechanisms for electronic components [13].
2.3 Physics of Failure Approach
Load-strength analyses to explain overstress failure and wear out failure
Although the load and strength analysis may not ensure
an accurate prediction of the probability of failure, owing to the uncertainty of
the two distributions, it provides insight into how to reduce failures. Proactive
measures can be taken in the design phase by setting a reasonable design margin
(i.e. selection of S) or by managing the load (i.e. active control of L) during operation.
Thus, degradation models and lifetime models are necessary to estimate the
failures at the end of life in the initial design phase.
A component fails when the applied load L exceeds the
design strength S. In this context, load L refers to a kind of stress (e.g.
voltage, cyclic load, temperature, etc.) and strength S refers to any resisting
physical property (e.g. hardness, melting point, adhesion, etc.) [11]. Fig. 2
presents a typical load-strength interference evolving with time. For most power
electronic components, neither load nor strength is fixed; each is distributed
within a certain interval that can be represented by a specific probability density
function (e.g. a normal distribution). Moreover, the strength of a material or
device can degrade with time. Theoretically, the probability of failure
can be obtained by analysing the overlap area between the load and strength distributions.
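For the special case where load and strength are independent and normally distributed, the interference has a closed form: the margin S − L is itself normal, so the probability of failure is the probability that this margin is negative. The sketch below assumes exactly that case. The means and standard deviations are hypothetical values in arbitrary stress units, and real load and strength distributions need not be normal.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def failure_probability(mu_load: float, sigma_load: float,
                        mu_strength: float, sigma_strength: float) -> float:
    """P(L > S) for independent, normally distributed load L and strength S."""
    margin_mean = mu_strength - mu_load
    margin_std = math.hypot(sigma_load, sigma_strength)
    return normal_cdf(-margin_mean / margin_std)


# Hypothetical values: the mean strength degrades from 1000 to 800 over time,
# while the load distribution stays the same.
p_new = failure_probability(600.0, 50.0, 1000.0, 60.0)
p_aged = failure_probability(600.0, 50.0, 800.0, 60.0)
```

As the strength distribution drifts towards the load distribution, the overlap grows and the failure probability rises, which mirrors the time evolution described for the load-strength interference.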
2.2 Load and Strength Analysis
In power electronic systems, the degradation of one
component may affect the operation of another. For example, a reduction of
capacitance will increase the associated voltage ripple, which may cause over
voltage stress of the switching devices even though the capacitor itself can still
operate in normal mode. Similarly, the deterioration of input and output
performance of a specific power electronic converter may induce failures of
other subsystems. Thus, it can be more difficult to determine the failure
criteria of a component or system in power electronics than in other domains.
The selection of the parameter used as the failure indicator and the corresponding
criteria depend on the specific design, operating conditions and standard. For
illustration, the failure criteria for electrolytic capacitors can be set as a
100% increase of the equivalent series resistance or a 20% reduction of the
capacitance. Different results could be obtained for different choices of
2.1 Failure Criteria
In general, reliability is defined as the ability of
an item to perform its required duty under stated conditions for a certain period,
which is measured by probability of failure, frequency of failure, or in terms
of availability. The essence of reliability engineering is to prevent the creation
of failures. Deficiencies in the design phase affect all produced items,
and the cost of correcting them increases progressively as development
proceeds. Therefore, this section introduces the following aspects of failure
in power electronic systems.
2 Identification of Failures in Power Electronic Systems
reliance on handbook-based models and statistics. The military handbook MIL-HDBK-217F
is widely used to predict the failure rate of power electronic components [9].
However, temperature cycling, failure rate change with material, combined
environments, supplier variations, and hence technology and quality, are not
considered. Physically, the failure rate of a component is the sum of the failure
rates of all its failure modes, which have different reliability models
corresponding to specific failure mechanisms. Statistics is a necessary
foundation when dealing with the effects of uncertainty and variability on
reliability. However, as the variation is often a function of time and
operating conditions, statistics by itself is not sufficient to interpret the
reliability data without judgement of the assumptions and non-statistical
Failure rates represented by bathtub curve during three distinct periods
reliance on the calculated value of Mean-Time-To-Failure (MTTF) or
Mean-Time-Between-Failures (MTBF) and on the bathtub curve [7]. The bathtub curve separates
the device or system operation into three distinct periods. Although it is
approximately consistent with some practical cases, the assumptions of random
failure and constant failure rate during the useful life period are misleading,
and the true root causes of failure modes are not identified [8]. The basic
assumptions of MTTF are constant failure rate and no wear out. Thus, the
obtained values may have a high degree of inaccuracy if wear out occurs within
of systematic DFR approach specific for design of power electronic systems. The
DFR approach studied in reliability engineering is too broad in focus [6], while
power electronic systems have their own specific challenges and new
opportunities in enhancing reliability. Moreover, design tools other than
reliability prediction are rarely applied in state-of-the-art research on the
reliability of power electronic systems.
Industries have advanced the development of
reliability engineering from traditional methods of testing for reliability to
design for reliability (DFR) [4]. This process is conducted during the design
phase of a component or system to ensure that it achieves the required level of
reliability. It aims to understand and fix reliability issues up-front in the
design process. Accordingly, strong efforts have been devoted to
the reliability performance of power electronic components,
converters, and systems [5]. However, the following three limitations are
encountered in reliability research in the power electronics area:
Various techniques have been applied to investigate
failures in power electronic systems. Today, most of the lifetime data of
semiconductors are captured from statistics of failed products, and experience-based
handbooks are written to provide guidance on the lifetime of different devices [2]. However,
this approach has proven to be unreliable and inaccurate, as such data are too
general and application-independent. Although some device manufacturers
have tried to investigate the failure mechanisms and implement accelerated
tests to discover the lifetime boundaries of power semiconductors under extreme
stress levels [3], this approach may not be sufficient to acquire accurate
lifetime information of power semiconductors for practical use.
In a wind power application, power electronic
converters have particularly tough operating conditions. They need to withstand
a large amount of power, even up to a few megawatts, with frequent
fluctuations due to wind speed, perform a series of complex functions, and
endure harsh environments with temperature swings, dust, vibrations,
humidity, etc. [1]. Power electronics tend to be fragile and have become a
reliability bottleneck of the whole wind turbine system.
This issue significantly increases the cost of energy, not only because
of increased maintenance and repairs, but also because of the reduced
energy delivered to the customers.
Power electronics have enabled efficient conversion
and more flexible control of electric energy in the last four decades. However,
the reliability performance of power electronic systems poses major
challenges in various emerging applications, especially for the grid
integration of renewable energy with long operating hours under harsh
environmental conditions. It has a significant impact on the life cycle cost of the
systems, the levelized cost of energy, and customer expectations and satisfaction.
As a result, it remains a serious long-term challenge for the penetration of
renewable energy in the modern electrical grid.