Newsletter

Nonlinear DSP Structures





TechOnline

Digital signal processing is concerned with the problem of transforming one sequence of numbers into another in a way that meets certain practical objectives. For example, filtering involves the separation of one sequence into two or more components (such as, separating a telephone system voice signal from noise, echoes, and other unwanted disturbances). Historically, much successful use has been made of linear systems that satisfy the principle of superposition:



Furthermore, the most popular linear structures are time-invariant, meaning that if the input sequence {xk} is shifted j units in time to obtain {xk-j}, the output sequence {yk} is shifted by the same amount, to {yk-j}. All linear, time-invariant systems (abbreviated LTI) may be completely characterized by their response, here denoted {hk}, to the impulse input k (equal to 1 when k = 0 and zero otherwise). Given {hk}, the response {yk} for any input sequence {xk} is given by the convolution relation:



The practical utility of this class of systems arises from two sources: first, the fact that LTI models can reasonably approximate the dynamics of many important physical systems (at least over a suitably restricted range of operating conditions) and second, that the principle of superposition may be exploited to obtain many useful system analysis and design results.


Why We Need Nonlinear Structures
Nonlinear systems are those for which the principle of superposition does not hold and, as the following examples illustrate, there are important DSP problems for which linear solutions are either nonexistent or demonstrably inferior to nonlinear solutions. One example is the use of chaotic signals for secure communication, since linear models cannot exhibit chaotic responses to simple inputs. An example where nonlinear methods yield better results than the best known linear approaches is that of filtering to reduce interference terms in the discrete Wigner distribution, an extension of the power spectrum for signals with time-varying spectral characteristics. Similarly, nonlinear filters are better suited for edge preservation in noisy images than are linear filters.

Another important situation where linear structures are demonstrably inadequate is in treatment of impulsive noise: Figure 1 shows 1024 physical property measurements from an industrial manufacturing process, which exhibits spikes like the points marked "?" at k 720, k 800, and k 870.


Figure 1: Industrial process data sequences often contain outliers like those marked with "?" in this figure.

Denoting this data sequence {Yk}, it may be described by the additive outlier model:

Yk = k + ok

where {k} is the nominal part of the data sequence that is of primary interest.and {ok} is a sequence of outliers, assuming the value 0 for most k but occasionally assuming large values that may arise from a variety of sources, including gross measurement errors. These outliers can profoundly influence the results of subsequent analysis, leaving us with two practical alternatives: the development of special analysis procedures that are inherently outlier-resistant, or the use of data cleaning filters to attenuate the outliers, followed by standard analysis procedures. The first of these approaches is usually more effective, but generally more complex to implement, motivating practical interest in the development of simple data-cleaning filters.

An ideal data cleaning filter applied to the observed data sequence {Yk} would return the nominal data sequence {k} without any modification, regardless of the magnitude or location of the outliers {ok}. For a linear, time-invariant filter , the response to the contaminated data sequence is given by



To reject outliers regardless of their position or magnitude, the second sum should vanish, implying hi = 0 for all i, but to avoid distortion, the first sum should be equal to k, implying h0 = 1and hi = 0 forall i 0. Since these requirements are in conflict, it follows that ideal data-cleaning filters are necessarily nonlinear. Further, nonlinear filters exist whose impulse response is identically zero, thus making them effective against isolated outliers, but which also leave a large class of root sequences completely unmodified. A specific example is the standard median filter, defined as follows: first, for each k, define the moving data window wk as the set of successive data samples xk-m through xk+m. Then, rank-order the values in this data window from smallest to largest to obtain the set:



The output of the median filter is the middle value in this list, w(0). Complete characterizations of the median filter's root sequences are known, and this set includes all monotonically increasing or decreasing sequences.

Conversely, it is also known that the distortion introduced by the median filter is unacceptable for some applications, motivating interest in the Hampel filter which belongs to the class of decision-based filters. This filter compares the central point in the data window xk with the median w(0) and, if this distance is too large (relative to an outlier-resistant scale estimate S computed from wk), the filter output is Yk = w(0); otherwise, the filter output is Yk = xk. This filter involves two tuning parameters: the window width 2m + 1, and a threshold parameter t that determines the filter's agressiveness (specifically, the filter only modifies xk if ¦ xk - w(0) ¦ > tS). Two key observations are first, that this filter reduces to the standard median filter when t = 0 and second, that the root sequence set increases as t increases, including essentially all input sequences as t —> .


Nonlinear DSP Structures
Despite their utility, nonlinear systems are more difficult to analyze and design than linear systems, in part because only partial characterizations exist. These characterizations may be either behavioral, obtained by replacing Equation 1, or structural, obtained by replacing Equation 2. Because it is more popular in practice, this article focuses primarily on the structural approach, although the behavioral approach is considered briefly at the end of this section to illustrate its general nature.

For LTI systems, a significant disadvantage of Equation 2 is that {hi} represents an infinite sequence of design parameters. Fortunately, most of the LTI systems used in DSP applications may also be represented as



for some integers p > 0 and l < m, and some real numbers ai and bi. Here, we consider the following nonlinear generalization of this structure, which includes an uncountably vast range of important special cases:



For linear systems, taking p = 0 in Equation 4 removes the first sum and yields a system whose impulse response coefficients are given by hi = bi for < i < m and zero otherwise. In the digital filtering literature, these systems are called finite-impulse-response (FIR) filters. Analogously, taking p = 0 in Equation 5 removes the functional dependence of yk on past outputs yk-j and defines the class of nonlinear FIR structures. This class contains most of the nonlinear filter structures in current use in DSP applications. As specific examples, note that both the standard median filter and the Hampel filter belong to this class, with = -m.

Taking p > 0 in Equation 5 defines the class of recursive structures, which exhibit greater flexibility than the FIR case but are also more difficult to analyze. For example, Figure 2 shows the responses of two different systems to the same input sequence, consisting of five steps, each of the same amplitude. The smoother curve shows the response of the Hammerstein structure:



consisting of the cascade connection of the memoryless nonlinearity g(x)=x1/3 followed by a linear dynamic model. For continuous functions g(•) and stable linear models, the resulting Hammerstein structure is BIBO stable (refer to the section "Qualitative Behavior" for a definition of BIBO stability).

Figure 2:  This comparison of Hammerstein and Lur'e model responses to the same input sequence illustrates the difference between linear and nonlinear recursive terms

Similar results do not extend to Lur'e structures, also composed from these same two components, but with the nonlinear function g(•) appearing as a feedback element around the linear system. This point is illustrated by the less regular curve in Figure 2, which shows the response of the Lur'e structure:



Although both of these structures are recursive, the recursive terms are linear in the Hammerstein case but nonlinear in the Lur'e case; as a consequence, the general qualitative behavior of the Hammerstein structure does not change dramatically with increasing input amplitude, while that of the Lur'e structure does: the step responses change from monotonic, to slightly oscillatory, to highly oscillatory, to sustained oscillations, and finally chaotic. Further amplitude increases eventually lead to unstable (in other words, unbounded) responses.

Returning to the idea of behavioral characterizations of nonlinear systems, consider the following two examples. First, replacing Equation 1 with



defines the class of positive-homogeneous systems, which includes both the standard median filter and the Hampel filter. The second behavioral characterization considered here is location invariance and is also exhibited by both of these nonlinear filters:



for any constant c. Both of these conditions are natural ones in many filtering applications since, taken together, they imply that unit conversions (such as, temperature conversion from Farenheit to Celsius) applied to filter inputs result in the corresponding changes in filter outputs. Consequently, it is particularly interesting to note that the only nonlinear FIR filters based on functions (•) such that (x, . . ., x) is differentiable with respect to x are linear-weighted averages. This observation means that nonlinear FIR filters exhibiting this scale-invariance involve non-smooth nonlinearities to which Taylor series expansions and related notions like small-amplitude linearization are not applicable.


Choosing (•)
Historically, polynomials represent one of the most popular classes of functions because they are extremely well-behaved, being infinitely differentiable everywhere and easy to compute. Taking (•) as a multivariable polynomial leads to the class of polynomial filters but again, most applications use the FIR subclass obtained for p = 0, often called Volterra filters or Volterra models. A typical application is compensating for loudspeaker nonlinearities.

Another popular function class is based on artificial neural networks (ANNs), combining weighted linear combinations with smooth saturation nonlinearities. An extremely simple example is the following three-layer structure:



Here, xk represents the input sequence, h1(k) and h2(k) define a two-node hidden layer, Yk is the network output, and d is an unspecified synaptic weight parameter. These networks may be either recurrent structures corresponding to p > 0, as in this example, or nonlinear FIR structures. As in the case of polynomials, the functions (•) defined by ANNs are smooth (in other words, infinitely differentiable) and hence continuous, but unlike polynomials, these functions are also bounded, which can be a significant advantage in some applications. More generally, the functional flexibility of ANNs is comparable to that of polynomials, as both of these function classes can approximate any continuous function (•) arbitrarily well on any closed and bounded set. Despite these observations, it is important to note that there are certain forms of qualitative behavior that cannot be exhibited by either of these function classes; in particular, note that these functions cannot exhibit either positive-homogeneity or location invariance properties.

A class of functions that can exhibit both of these forms of behavior is the class of piecewise linear functions, leading to the class of linear multimodels. It has recently been shown that the FIR restriction of this structure class includes all L-filters, whose output Yk consists of a linear combination of the rank-ordered elements w(j) defined in Equation 3. Because this rank-ordering is invariant under both scaling by >0 and translation by a constant c, it follows that L-filters are both positive-homogeneous and location-invariant. A simple class of recursive linear multimodels is the positive-homogeneous TARMA or TARMAX structure, defined by:



Although this structure is not location-invariant, its simplicity of form facilitates behavioral analysis and it is quite flexible, reducing to Equation 2 when ci = 0 and di = 0, including well-behaved Hammerstein structures when ci = 0 but di 0, and yielding chaotic responses for some ci 0.


Qualitative Behavior
Stability may be defined in various ways, most of which are equivalent for LTI systems but not for nonlinear systems. The following stability notion is useful, both because it is a generally desirable system characteristic and because it can provide useful guidance in selecting nonlinear DSP structures. A system with input xk and output Yk is bounded-input, bounded-output stable, or BIBO stable if, given any finite M, ¦ xk¦< M for all k implies ¦ Yk ¦ < N for all k for some finite N. It can be shown that any nonlinear FIR structure based on a continuous function (•) is BIBO stable and, as an important special case, it follows that Volterra filters are BIBO stable. Conversely, Figure 3 shows the response of the following rational FIR structure:



to a noise-corrupted sinusoidal input sequence {xk} . The function (•) on which this nonlinear FIR filter is based is not continuous because Yk diverges when the denominator sum goes to zero, so the previous BIBO stability result does not apply and the character of the response shown in Figure 3 is a consequence of this system's lack of BIBO stability. Similarly, the Lur'e structure illustrates that recursive structures are generally not BIBO stable, although they can be. For example, because ANNs involve bounded nonlinearities, any ANN-based structure is BIBO stable, even in the recurrent case, p > 0.

Figure 3:  Not all filters are BIBO stable, as illustrated by this response of a rational FIR filter to a noisy sinusoid input.

Another general characteristic of nonlinear FIR structures is that if the input is ultimately periodic, meaning xk+P = xk for all k > M, then so is the output sequence. Specifically, note that for k > M + m, we have



As a special case, taking P = 1 corresponds to sequences that are ultimately constant, a class that includes both steps and impulses. The practical implication of Equation 7 is that nonlinear FIR structures cannot change the period of periodic inputs, nor can they exhibit nonperiodic responses to such inputs. Similarly, sustained oscillations or chaotic responses to pulse inputs like impulses and steps are also impossible. The Lur'e structure step responses shown in Figure 2 illustrate that these conclusions do not extend to nonlinear recursive structures.

More generally, typical behavior of recursive structures is illustrated in Figure 4. The upper two plots show the responses of the ANN defined in Equation 6 to the same step input for two different values of the parameter d: when d = 3.9 (upper left plot), the response decays initially to a presistent, regular oscillation, but when d = 3.0 (upper right plot), the network's response to this same step input is persistent but highly irregular. Qualitatively analogous behavior is seen in the lower left plot, showing the response to a sinusoidal input for d = 3.0; the key point here is that because of the nonlinear recursive terms in this recurrent network structure, nonperiodic responses to periodic inputs are possible, in marked contrast to the nonlinear FIR case. Finally, the lower right plot shows the sinusoidal response of the following Lur'e structure:



which is BIBO stable because it may be represented as a stable linear model with a bounded, additive perturbation (in other words, ¦ 4sinYk-1¦ < 4 for all k).

Figure 4:  Recurrent neural network and Lur'e structures need not preserve either asymptotic constancy or periodicity.

Most currently popular nonlinear DSP structures and infinitely many as yet unexplored ones may be expressed as special cases of Equation 5. Consequently, the design of nonlinear DSP structures generally reduces to the specification of the integers p, , and m and the function (•) appearing in this equation.

In some respects, the selection of the integers p, , and m has more profound consequences: specifically, setting p = 0 leads to the class of generally well-behaved and extremely popular nonlinear FIR structures. Alternatively, certain applications like the generation of chaotic sequences require recursive structures with p > 0, which are generally not as well behaved and more difficult to analyze. Similarly, real-time applications require strict causality, implying > 0, but off-line signal and data processing applications often benefit from non-causal structures, for which < 0; as a specific example, note that = -m for the median and Hampel filters.



 






 Featured Jobs
Silicon Labs seeking Sr. Product Test Engineer in Austin, TX

ON Semi seeking Sr Mixed-Signal Test Development Engr in Lower Gwynedd, PA

Cirrus Logic seeking Digital IC Design Engineer in Austin, TX

ITT Corporation seeking Senior Engineer 2 in Norfolk, VA

Protingent Staffing seeking Firmware/Software Verification Engr in Northridge, CA

More jobs on EETimesCareers
 Sponsor
 CAREER CENTER
Ready to take that job and shove it?
SEARCH JOBS:

 SPONSOR

 RECENT JOB POSTINGS
For more great jobs, career related news, features and services, please visit EETimes' Career Center.