Wavelet Transform

学海无涯GL 2013-07-15

展开全文

Wavelet Transform

Ruye Wang 2008-12-16

Why Wavelet?

A time signal contains complete information in time domain, i.e., the amplitude of the signal at any given moment . However, no information is explicitly available in regarding the frequency contents of the signal. On the other hand, as the spectrum obtained by Fourier transform of the time signal is extracted from the entire time duration of the signal, it contains complete information in the frequency domain in terms of the magnitude and phase angle of any frequency component , but there is no information explicitly available in the spectrum regarding the temporal characteriscs of the signal such as where in time certain frequency component appeared. Neither in time domain nor in frequency domain provides complete description of the signal.

To address this dilemma, we can truncate the signal by a time window $w(t-\tau)$

$\begin{displaymath}x'(t)=w(t-\tau) x(t) \end{displaymath}$

where the time window has width and can shift in time according to $\tau$ :

$\begin{displaymath}w(t)=\left\{\begin{array}{ll}1 & 0<t<T 0 & \mbox{otherwise} \end{array} \right. \end{displaymath}$

The spectrum of this windowed signal is for the specific time period only. One immediate drawback of this windowed signal is the severe distortion in frequency domain caused by the sudden truncation in time domain. To reduce this distortion, the square time window can be replaced by a smooth Gaussian bell-shaped window with gradual decay:

$\begin{displaymath}x'(t)=g(t) x(t),\;\;\;\;\mbox{where}\;\;\; g(t)=c\;exp((t-\tau)^2/\sigma^2) \end{displaymath}$

The Fourier transform of this Gassian filtered time signal is called the Gabor transform. Although the spectral property of a windowed signal can be better localized in time, its resolution in frequency is reduced as its spectrum is blurred by the convolution with $G(f)={\cal F}[g(t)]$ :

$\begin{displaymath}X'(f) = X(f)*G(f) \end{displaymath}$

Alternatively, we can assume the truncated signal repeats itself outside a finite period T. In this case, the spectrum becomes discrete, composed of a set of infinite number of frequency components ( $k=0,\pm 1, \pm 2, \cdots$ ) for discrete frequencies , including DC component at , fundamental frequency at , and the higher harmonics (). In this discrete spectrum, the information of any frequency in the gap between any two consecutive frequency components and is lost. Moreover, the higher temporal resolution we achieve by reducing , the lower frequency resolution will result due to the larger gaps between frequency components.

We see that it is simply impossible to have the complete information of a given signal in both time and frequency domains at the same time, as increasing the resolution in one domain will necessarily reduce that in the other. This effect is referred to as Heisenberg uncertainty, as it is anologous to the fact in quantum physics that the position and momentum of a particle cannot be measured simultaneously, higher precision in one quantity implies lower in the other.

If the characteristics of the signal in question do not change much over time, i.e, the signal is stationary, then Fourier transform is sufficient for the analysis of the signal. However, in many applications it is the transitory or nonstationary aspects of the signal (e.g., drift, trends, abrupt changes) that are of most interest. In such cases, Fourier analysis is unable to detect when/where such events take place and is therefore not suitable to describe or represent them.

In order to overcome this limitation of Fourier analysis to gain information in both time and frequency domain, a different kind of transform, called wavelet transform can be used. Wavelewt transform can be viewed as a trade-off between time and frequency domains. Unlike Fourier transform which transforms a signal between time (or space) and frequency domains, wavelet transform emphasizes locations and scales (instead of frequency). From the lowest time scale to the highest, the scale is always halved to reveal more details (time resolution doubled), while the gap between consecutive scales is doubled, i.e., the scale resolution is reduced as illustrated by the Heisenberg Box (Cell) below:

Haar Wavelets

A continuous signal can be approximated by a sequence of unit impulse functions, also called scaling functions, weighted by the sampling values of the intensity or amplitude of the signal:

$\begin{displaymath}x(t)=\sum_i s_i \phi_{[t_j,t_{j+1})} \end{displaymath}$

where $\phi_{[t_j,t_{j+1})}$ is a unit impulse with width $t_{j+1}-t_j$ defined as

$\begin{displaymath}\phi_{[t_j,t_{j+1})}=\left\{ \begin{array}{ll} 1 & t_j \le t < t_{j+1} 0 & \mbox{otherwise} \end{array} \right. \end{displaymath}$

Consider two adjacent impulse functions:

$\begin{displaymath}\phi_{[0,1/2)}=\left\{ \begin{array}{ll} 1 & 0 \le t <1/2 \\... ... 1 & 1/2 \le t <0 0 & \mbox{otherwise} \end{array} \right. \end{displaymath}$

The sum of two adjacent impulse functions is a wider impulse:

$\begin{displaymath}\phi_{[0,1)}=\phi_{[0,1/2)}+\phi_{[1/2,1)}=\left\{ \begin{arr... ...} 1 & 0 \le t < 1 0 & \mbox{otherwise} \end{array} \right. \end{displaymath}$

and the difference of two adjacent impulse functions is the basic wavelet, denoted by

$\begin{displaymath}\psi_{[0,1)}=\phi_{[0,1/2)}-\phi_{[1/2,1)}=\left\{ \begin{arr... ...l} 1 & 0 \le t < 1/2 -1 & 1/2 \le t <0 \end{array} \right. \end{displaymath}$

By solving (adding and subtracting) the two equations above, the two impulse functions can be obtained:

$\begin{displaymath}\left\{ \begin{array}{l} \phi_{[0,1/2)}=(\phi_{[0,1)}+\psi_... ...hi_{[1/2,1)}=(\phi_{[0,1)}-\psi_{[0,1)})/2 \end{array} \right. \end{displaymath}$

Then any two-sample function can be written as

$\begin{displaymath}x(t)=s_0 \phi_{[0,1/2)} + s_1 \phi_{[1/2,1)} =s_0 \frac{\phi... ...=\frac{s_0+s_1}{2} \phi_{[0,1)}+\frac{s_0-s_1}{2} \psi_{[0,1)} \end{displaymath}$

where represents the average of the function and represents the change in the function. This is the Haar transform of the function. See here for more details.

Scaling Functions

Consider a set of functions

$\begin{displaymath}\varphi_{j,k}(t)=2^{j/2}\varphi(2^j t-k) \end{displaymath}$

where the specific function $\varphi(t)$ is to be determined later. These functions have two integer subscripts or parameters and :

The first subscript specifies the magnitude $2^{j/2}$ as well as the scale of the function. A greater corresponds to a larger magnitude but a more compressed function (higher ``frequency'' representing more details). This corresponds to the parameter representing frequency in the basis functions $\varphi_n(t)=e^{-j2\pi nt}=cos(2\pi nt)-jsin(2\pi nt)$ for Fourier expansion. When $j\rightarrow -\infty$ , $\varphi_{j,k}$ becomes infinitely wide (zero frequency) with infinitely small magnitude, while $j\rightarrow \infty$ , $\varphi_{j,k}$ approaches a delta function with infinitely narrow width but infinitely high magnitude.
The second subscript specifies the position (integer location, translation or shift) of the function. This does not have a counterpart in the basis functions for Fourier transform, where the position information is totally missing.

Replacing

in the equation above by

, we get:

$\begin{displaymath} \varphi_{j+1,k}(t)=2^{(j+1)/2}\varphi(2^{j+1} t-k)=\sqrt{2} 2^{j/2}\varphi_{j,k}(2t-k) \end{displaymath}$

indicating how a function $\varphi_{j+1,k}(t)$ at the level is related to the corresponding function $\varphi_{j,k}(t)$ at the level.

Corresponding to a specific index , a subset of functions $\varphi_{j,k}\vert _{j=j_0}=\varphi_{j_0,k}$ spans a function space $V_{j_0}$ . In general, there are many such functions spaces: $\cdots, V_{-1}, V_0, V_1, \cdots$ .

If a family of functions $\varphi_{j,k}(t)$ satisfy the follow requirements, they become a set of basis functions that span a function space, so that a given function can be represented as a linear combination of these bases:

$\begin{displaymath}x(t)=\sum_j \sum_k c_{j,k} \varphi_{j,k}(t) \end{displaymath}$

The functions $\varphi_{j,k}(t)$ have to be orthogonal to its integer translates:

$\begin{displaymath}(\varphi_{j,k}(t),\varphi_{j,l}(t))=\int \varphi_{j,k}(t)\;\varphi_{j,l}(t))dt =0\;\;\;\;\;\mbox{if $l\ne k$} \end{displaymath}$
The function space spanned by a set of functions $\varphi_{j,k}$ is a subspace of the function space $V_{j+1}$ spanned by the functions of a higher scale $\varphi_{j+1,k}$ (with double-resolution and capable of representing more details):

$\begin{displaymath}V_{-\infty} \subset \cdots, \subset V_{-1} \subset V_0 \subset V_1,\cdots, \subset V_{\infty} \end{displaymath}$
The only function contained in all subspaces $V_j,\;(\cdots,-1,0,1,\cdots)$ is (constant of lowest resolution or frequency carrying no information), which is the only function contained in $V_{-\infty}$ , i.e., $V_{-\infty}=\{x(t)=0\}$ .
A function can be represented with arbitrary precision by including terms $\varphi_{j,k}$ of progressively higher scales (greater ) for representing more details (high frequency components) of the function. Any square-integrable functions (with arbitrary details) can be represented in $V_{\infty}$ , i.e.,

$\begin{displaymath}V_{\infty}=L^2(R) \end{displaymath}$

where represents all square-integrable functions ( $\int f^2(t) dt <\infty$ ).

Functions satisfying these requirements are called scaling functions. As the basis functions, a subset of scaling functions $\varphi_{j,k}$ with a certain scale can only represent a function up to a certain level of details corresponding to the scale of the bases. All details in finer than the limit of the scales of $\varphi_{j,k}$ are lost. However, such details can be better represented by the basis functions of the next higher scale $\varphi_{j+1,k}$ . In general, space $V_{j+1}$ contains all functions representable in , as well as those functions with more details not representable in , i.e., $V_j \subset V_{j+1}$ , and a function can always be more precisely represented by increasing the scales (the parameter ) of the bases.

Note: The nested relations between the sequence of subspaces shown above is closely related to the Gaussian-Laplacian pyramid discussed earlier. Essentially they both state that a signal can be decomposed into a set of components each representing details of different levels contained in the signal. The images in the Laplacian pyramid correspond to the subspaces as they both represent the difference between two consecutive levels of details.

The basis functions $\varphi_{j,k}(t)$ are themselves members of the space they span as well as all the super-spaces:

$\begin{displaymath}\varphi_{j,k}(t) \in V_j \subset V_{j+1} \subset \cdots \end{displaymath}$

And they can be expanded in the space $V_{j+1}$ of next higher scale with doubled resolution:

$\begin{displaymath}\varphi_{j,k}(t) =\sum_l h_{\varphi}[l] \varphi_{j+1,l}(t) =\sum_l h_{\varphi}[l] 2^{(j+1)/2} \varphi(2^{j+1}t-l) \end{displaymath}$

where $h_{\varphi}$ are called scaling functions coefficients. Usually we let and drop the subscripts and to indicate that any scaling function $\varphi_j(t)$ can be expressed as a linear combination of the basis scaling functions $\varphi_{j+1,k}(t)$ of the space with the next higher resolution:

$\begin{displaymath}\varphi(t)=\sum_l h_{\varphi}[l]\sqrt{2} \varphi(2t-l) \end{displaymath}$

This is called refinement, dilation or MRA (multiresolution analysis) equation.

The first 4 panels show the scaling functions: $\varphi(t)=\varphi_{0,0}(t)$ , $\varphi_{0,1}(t)=\varphi(t-1)$ , $\varphi_{1,0}(t)=\sqrt{2}\varphi(2t)$ , and $\varphi_{1,1}(t)=\sqrt{2}\varphi(2t-1)$ , The 5th panel shows a function in space spanned by $\varphi_{1,k}(t)$ . The 6th panel shows $\varphi_{0,0}(t)$ as a special function in space .

Example: The Haar scaling function at level is defined as a square pulse of unit width and unit height:

$\begin{displaymath}\varphi(t)=\varphi_{0,0}(t)=\left\{ \begin{array}{ll}1 & 0\le t < 1 \0 & \mbox{otherwise} \end{array} \right. \end{displaymath}$

Scaling function $\varphi(t)$ together with its shifted versions $\varphi_{0,k}(t)=\varphi(t-k),\;(k>0)$ form a set of basis functions that span the space . The scaling functions of higher scales () for higher resolutions (for signal details) can be obtained by

$\begin{displaymath}\varphi_{j,k}(t)=2^{j/2}\varphi(2^j t-k) \end{displaymath}$

For example, when , we have

$\begin{displaymath}\varphi_{1,k}(t)=\sqrt{2}\varphi(2t-k) \end{displaymath}$

A given function can be expanded as a linear combination of these scaling (basis) functions:

$\begin{displaymath}x(t)=0.5\varphi_{1,0}(t)+\varphi_{1,1}(t)-0.25\varphi_{1,4}(t) \end{displaymath}$

This is shown in panel 5 of the figure above.

Moreover, the Haar scaling functions in space are also functions in space and they can be expressed as a linear combination of the basis functions $\varphi_{1,k}(t)$ :

$\begin{displaymath}\varphi_{0,k}(t)=\frac{1}{\sqrt{2}}\varphi_{1,2k}(t) +\frac{1}{\sqrt{2}}\varphi_{1,2k+1}(t) \end{displaymath}$

This is shown in panel 6 of the figure above. In particular,

$\begin{displaymath} \varphi_{0,0}(t)=\frac{1}{\sqrt{2}}\varphi_{1,0}(t)+\frac{1}{\sqrt{2}}\varphi_{1,1}(t) \end{displaymath}$

But as

$\begin{displaymath}\varphi_{1,0}(t)=\sqrt{2}\varphi_{0,0}(2t),\;\;\;\; \varphi_{1,1}(t)=\sqrt{2}\varphi_{0,0}(2t-1) \end{displaymath}$

the above can be written as

$\begin{displaymath}\varphi(t)=\varphi_{0,0}(t)=h_{\varphi}[0]\sqrt{2}\varphi_{0,... ...0,0}(2t-1) =\sum_{l=0}^1 h_{\varphi}[l] \sqrt{2}\varphi(2t-l) \end{displaymath}$

where the two coefficients $h_{\varphi}[0]=h_{\varphi}[0]=1/\sqrt{2}$ are the first row of the Haar matrix ${\bf H}_2$ .

Wavelet Functions

We denote by the difference between the function space $V_{j+1}$ spanned by scaling functions $\varphi_{j+1,k}$ and the function spanned by $\varphi_{j,k}$ , i.e.

$\begin{displaymath}V_{j+1}=V_j \oplus W_j \end{displaymath}$

(where $\oplus$ represents the union of the two spaces). is composed of all functions representable in $V_{j+1}$ but not representable in (as the scale of the basis functions of is too coarse for the details of these functions). This can be carried out recursively to get:

$\begin{displaymath}V_{j+2}=V_{j+1} \oplus W_{j+1}=V_j \oplus W_j \oplus W_{j+1} \end{displaymath}$

and finally:

$\begin{displaymath}L^2(R)=V_{\infty}=V_j \oplus W_j \oplus W_{j+1} \oplus W_{j+2} \oplus \cdots \end{displaymath}$

Similar to a function space spanned by the scaling functions $\varphi_{j,k}(t)$ , the function space is also spanned by a set of basis function, called the wavelet functions:

$\begin{displaymath}\psi_{j,k}(t)=2^{j/2}\psi(2^jt-k) \end{displaymath}$

We see that the scaling sequence and the wavelet sequence correspond to low-pass filter and band-pass filter, respectively.

Also, as the wavelet functions $\psi_{j,k}(t)$ are members of the space they span as well as all the super-spaces:

$\begin{displaymath}\psi_{j,k}(t) \in W_j \subset V_{j+1} \subset \cdots \end{displaymath}$

they can be expanded in the space $V_{j+1}$ of next higher scale with doubled resolution:

$\begin{displaymath}\psi_{j,k}(t) =\sum_l h_{\psi}[l] \varphi_{j+1,l}(t) =\sum_l h_{\psi}[l] 2^{(j+1)/2} \varphi(2^{j+1}t-l) \end{displaymath}$

where $h_{\psi}$ are the expansion coefficients. Usually we let and drop the subscripts and to indicate that any wavelet function $\psi_j(t)$ can be expressed as a linear combination of the basis scaling functions $\varphi_{j+1}(t)$ of the space with the next higher resolution:

$\begin{displaymath}\psi(t)=\sum_l h_{\psi}[l]\sqrt{2} \varphi(2t-l) \end{displaymath}$

This is in the same form for the scaling functions:

$\begin{displaymath}\varphi(t)=\sum_l h_{\varphi}[l]\sqrt{2} \varphi(2t-l) \end{displaymath}$

It can be shown that the coefficients of the two expressions are related

$\begin{displaymath}h_{\psi}[l]=(-1)^l h_{\varphi}[1-l] \end{displaymath}$

Examples: The Haar scaling function $\varphi(t)$ is a square pulse with unit height and width, and the coefficients are $h_{\varphi}[0]=h_{\varphi}[1]=1/\sqrt{2}$ . Now the coefficients for the wavelet functions can be obtained as

$\begin{displaymath}\begin{array}{l} h_{\psi}[0]=(-1)^0 h_{\varphi}[1-l]=1/\sqrt{... ... h_{\psi}[1]=(-1)^1 h_{\varphi}[1-l]=-1/\sqrt{2} \end{array} \end{displaymath}$

and the wavelet function is:

$\begin{displaymath}\psi(t)=\sum_l h_{\psi}[l]\sqrt{2}\varphi[2t-l] =\frac{1}{\s... ...-1 & 0.5 \le t < 1 0 & \mbox{otherwise} \end{array} \right. \end{displaymath}$

where the two coefficients $h_{\psi}[0]=-h_{\psi}[0]=1/\sqrt{2}$ are the second row of the Haar matrix ${\bf H}_2$ .

The first two panels show the wavelet functions of scale : $\psi(t)=\psi_{0,0}(t)$ and $\psi_{0,2}(t)=\psi(t-2)$ . Note $\varphi_{1,k}(t)=\sqrt{2}\varphi_{0,k}(2t)$ can be generated by the linear combination of $\varphi_{0,k}(t)$ and $\psi_{0,k}(t)$ :

$\begin{displaymath}\varphi_{1,k}(t)=\frac{\sqrt{2}}{2}[\varphi_{0,k}(t)+\psi_{0,k}(t)] \end{displaymath}$

The 3rd panel shows the wavelet functions of scale : $\psi_{1,0}(t)=\sqrt{2}\psi(2t)$ . The 4th panel shows a function in space spanned by $\varphi_{0,k}(t)$ . The 5th panel shows a function in space spanned by $\psi_{0,k}(t)$ . The 6th panel shows a function in space $V_1=V_0 \oplus W_0$ , a linear combination of $\varphi_{1,k}(t)$ , or $\varphi_{0,k}(t)$ and $\psi_{0,k}(t)$ .

Wavelet Expansion

For a specific value , the equation discussed above can be written as

$\begin{displaymath}L^2(R)=V_{\infty}=V_{j_0} \oplus W_{j_0} \oplus W_{j_0+1} \oplus W_{j_0+2} \oplus \cdots \end{displaymath}$

which indicates that any square-integrable function $x(t) \in L^2$ can be expanded as a linear combination of both the scaling basis functions $\varphi_{j_0,k}(t)$ and the wavelet basis functions $\psi_{j,k}(t)$ ( $j=j_0, j_0+1,\cdots$ ):

$\begin{displaymath} x(t)=\sum_k c_{j_0,k}\varphi_{j_0,k}(t)+\sum_{j=j_0}^{\infty} \sum_k d_{j,k}\psi_{j,k}(t) \end{displaymath}$

where $c_{j_0,k}$ is the approximation coefficient:

$\begin{displaymath}c_{j_0,k}=(x(t),\varphi_{j_0,k}(t))=\int x(t)\varphi_{j_0,k}(t) dt \end{displaymath}$

and $d_{j,k}$ is the detail coefficient:

$\begin{displaymath}d_{j,k}=(x(t),\psi_{j,k}(t))=\int x(t)\psi_{j,k}(t) dt \end{displaymath}$

The first term contained in the wavelet expansion of the function represents the approximation of the function at scale level by the linear combination of the scaling functions $\varphi_{j_0,k}(t)$ , and the summation with index in the second term in the expansion is for the details of different levels contained in the function approximated by the linear combination of the wavelet functions of progressively higher scales $j_0+1, j_0+2, \cdots$ .

An Example

A continuous function is defined over the period $0 \le t < 1$ as:

$\begin{displaymath}x(t)=\left\{ \begin{array}{ll} t^2 & 0\le t < 1 0 & \mbox{otherwise} \end{array} \right. \end{displaymath}$

We use Haar wavelets, and a starting scale . Each individual space (, , , $\cdots$ ) is spanned by different number of basis functions. For example, there is only one basis function in spaces and , while space is spanned by 2 bases, and space is spanned by 4 bases (See Haar matrix).

$\begin{displaymath}c_0(0)=\int_0^1 t^2 \varphi_{0,0}(t) dt=\int_0^1 t^2 (t) dt=\frac{1}{3} \end{displaymath}$

$\begin{displaymath}d_0(0)=\int_0^1 t^2 \psi_{0,0}(t) dt =\int_0^{0.5} t^2 (t) dt-\int_{0.5}^1 t^2 (t) dt=-\frac{1}{4} \end{displaymath}$

$\begin{displaymath}d_1(0)=\int_0^1 t^2 \psi_{1,0}(t) dt =\int_0^{0.25} \sqrt{2}... ...dt-\int_{0.25}^{0.5} t^2 \sqrt{2}(t) dt =-\frac{\sqrt{2}}{32} \end{displaymath}$

$\begin{displaymath}d_1(1)=\int_0^1 t^2 \psi_{1,1}(t) dt =\int_{0.5}^{0.75} \sqr... ...t) dt-\int_{0.75}^1 t^2 \sqrt{2}(t) dt =-\frac{3\sqrt{2}}{32} \end{displaymath}$

Therefore the wavelet series expansion of the function is

$\begin{displaymath}x(t)=\frac{1}{3}\varphi_{0,0}(t)+[-\frac{1}{4}\psi_{0,0}(t)] ... ...}{32}\psi_{1,0}(t)-\frac{3\sqrt{2}}{32}\psi_{1,1}(t)]+ \cdots \end{displaymath}$

Here the first term is , the second term is , the third term is , and $V_1=V_0 \oplus W_0$ , $V_2=V_1\oplus W_1=V_0\oplus W_0 \oplus W_1$

This process can be carried out further. By contineously reducing the scale by half (spaces $V_3, V_4, \cdots$ ), higher temporal resolution (always doubled) is achieved. However, at the same time, the frequency resolution is reduced (always halved), as shown in the Heisenberg box.

Discrete Wavelet Transform

The discrete signal ${\bf x}=[x[0],\cdots,x[N-1]]^T$ is a set of N samples taken from a continuous signal

$\begin{displaymath}x[m]=x(t_0+m \triangle t),\;\;\;\;(m=0,1,\cdots,N-1) \end{displaymath}$

for some initial time and sampling period $\triangle t$ . The basis functions ${\bf\varphi}=[\varphi[0],\cdots,\varphi[N-1]]^T$ and ${\bf\psi}=[\psi[0],\cdots, \psi[N-1]]^T$ are also vectors containing elements. We let , , $j=0,1,\cdots,J-1$ and $k=0,1,\cdots,2^j-1$ . Correspondingly the wavelet expansion becomes discrete wavelet transform (DWT). The discrete function is represented as a weighted sum in the space spanned by the bases ${\bf\varphi}$ and ${\bf\psi}$ :

$\begin{displaymath} x[m]=\frac{1}{\sqrt{N}}\sum_k W_{\varphi}[j_0,k]\varphi_{j_0... ... \sum_k W_{\psi}[j,k]\psi_{j,k}[m], \;\;\;\;\;(m=0,\cdots,N-1) \end{displaymath}$

This is the inverse wavelet transform where the summation over is for different scale levels and the summation over is for different translations in each scale level, and the coefficients (weights) are projections of the function onto each of the basis functions:

$\begin{displaymath}W_{\varphi}[j_0,k]=({\bf x},{\bf\varphi}_{j_0,k})=\frac{1}{\s... ...=0}^{N-1} x[m] \varphi_{j_0,k}[m],\;\;\;\;\mbox{(for all $k$)} \end{displaymath}$

$\begin{displaymath}W_{\psi}[j,k]=({\bf x},{\bf\psi}_{j,k})=\frac{1}{\sqrt{N}}\su... ...m] \psi_{j,k}[m],\;\;\;\;\mbox{(for all $k$ and all $j>j_0$)} \end{displaymath}$

where $W_{\varphi}[j_0,k]$ is called the approximation coefficient and $W_{\psi}[j,k]$ is called the detail coefficient. These are the forward wavelet transform.

An Example:

Assume -point discrete singal ${\bf x}=[x[0],\cdots,x[N-1]]^T=[1, 4, -3, 0]^T$ and the discrete Haar scaling and wavelet functions are:

$\begin{displaymath}\left[ \begin{array}{rrrr} 1 & 1 & 1 & 1 1 & 1 & -1 & -1 \... ... \psi_{0,0}[m] \psi_{1,0}[m] \psi_{1,1}[m] \end{array}\end{displaymath}$

The coefficient for :

$\begin{displaymath}W_{\varphi}[0,0]=\frac{1}{2}\sum_{m=0}^3 x[m]\varphi_{0,0}[m] =\frac{1}{2}[1\cdot 1+4\cdot 1-3\cdot 1+0\cdot 1]=1 \end{displaymath}$

The coefficient for :

$\begin{displaymath}W_{\psi}[0,0]=\frac{1}{2}\sum_{m=0}^3 x[m]\psi_{0,0}[m] =\frac{1}{2}[1\cdot 1+4\cdot 1-3\cdot (-1)+0\cdot (-1)]=4 \end{displaymath}$

The two coefficients for :

$\begin{displaymath}W_{\psi}[1,0]=\frac{1}{2}\sum_{m=0}^3 x[m]\psi_{1,0}[m] =\fr... ...ot \sqrt{2}+4\cdot (-\sqrt{2})-3\cdot 0+0\cdot 0]=-1.5\sqrt{2} \end{displaymath}$

$\begin{displaymath}W_{\psi}[1,1]=\frac{1}{2}\sum_{m=0}^3 x[m]\psi_{1,0}[m] =\fr... ...ot 0+4\cdot 0-3\cdot \sqrt{2}+0\cdot (-\sqrt{2})]=-1.5\sqrt{2} \end{displaymath}$

In matrix form, we have

$\begin{displaymath}\left[ \begin{array}{c} 1 4 -1.5\sqrt{2} -1.5\sqrt{2} ... ...t] \left[ \begin{array}{c} 1 4 -3 0 \end{array} \right] \end{displaymath}$

Now the function can be expressed as a linear combination of these basis functions:

$\begin{displaymath}x[m]=\frac{1}{2}[W_{\varphi}[0,0]\varphi_{0,0}[m]+W_{\psi}[0,... ...[m]+W_{\varphi}[1,1]\psi_{1,1}[m] \;\;\;\;\;\;(m=0,\cdots,3) \end{displaymath}$

or in matrix form:

$\begin{displaymath}\left[ \begin{array}{r} 1 4 -3 0 \end{array} \right] =... ...y}{c} 1 4 -1.5\sqrt{2} -1.5\sqrt{2} \end{array} \right] \end{displaymath}$

Fast Wavelet Transform (FWT) and Filter Bank

As shown before, the discrete wavelet transform of a discrete signal ${\bf x}=[x[0],\cdots,x[N-1]]^T$ is the process of getting the coefficients:

$\begin{displaymath}W_{\varphi}[j_0,k]=\frac{1}{\sqrt{N}}\sum_{m=0}^{N-1} x[m] \v... ...m] 2^{j_0/2}\varphi[2^{j_0} m-k], \;\;\;\;\mbox{(for all $k$)} \end{displaymath}$

$\begin{displaymath}W_{\psi}[j,k]=\frac{1}{\sqrt{N}}\sum_{m=0}^{N-1} x[m] \psi_{j... ...2}\psi[2^j m-k], \;\;\;\;\mbox{(for all $k$ and all $j>j_0$)} \end{displaymath}$

where the basis scaling and wavelet functions are respectively

$\begin{displaymath}\varphi_{j,k}[m]=2^{j/2}\varphi[2^j m-k]\;\;\;\;\mbox{and}\;\;\;\; \psi_{j,k}[m]=2^{j/2}\psi[2^j m-k] \end{displaymath}$

Recall that both the scaling and wavelet functions can be expanded in terms of the basis scaling functions of the next higher resolution:

$\begin{displaymath}\begin{array}{l} \varphi[m]=\sum_l h_{\varphi}[l]\sqrt{2} \v... ... \psi[m]=\sum_l h_{\psi}[l]\sqrt{2} \varphi[2m-l] \end{array} \end{displaymath}$

We now consider a fast algorithm to obtain the coefficients $W_{\varphi}[j,k]$ and $W_{\psi}[j,k]$ of different scales . Consider first the scaling function $\varphi[m]$ . Replacing by (scaled by and translated by ), we get

$\begin{displaymath}\varphi[2^jm-k]=\sum_l h_{\varphi}[l]\sqrt{2} \varphi[2(2^jm-k)-l] =\sum_l h_{\varphi}[l]\sqrt{2} \varphi[2^{j+1}m-2k-l] \end{displaymath}$

We then let i.e. , the above becomes

$\begin{displaymath}\varphi[2^jm-k]=\sum_n h_{\varphi}[n-2k]\sqrt{2} \varphi[2^{j+1}m-n] \end{displaymath}$

Similarly, the wavelet function can also be expanded:

$\begin{displaymath}\psi[2^jm-k]=\sum_n h_{\psi}[n-2k]\sqrt{2} \varphi[2^{j+1}m-n] \end{displaymath}$

This wavelet function is identical to the one used in the discrete wavelet transform above, which can be replaced by the right-hand side of the equation:

$\displaystyle W_{\psi}[j,k]$	$\textstyle =$	$\displaystyle \frac{1}{\sqrt{N}}\sum_{m=0}^{N-1} x[m] 2^{j/2}\psi[2^j m-k] =\fr... ...um_{m=0}^{N-1} x[m] 2^{j/2} [\sum_n h_{\psi}[n-2k]\sqrt{2} \varphi[2^{j+1}m-n]]$
	$\textstyle =$	$\displaystyle \sum_n h_{\psi}[n-2k][\frac{1}{\sqrt{N}}\sum_{m=0}^{N-1} x[m] 2^{(j+1)/2} \varphi(2^{j+1}m-n)]$

Note that the expression inside the brackets happens to be the wavelet transform for the coefficient of scale :

$\begin{displaymath} W_{\psi}[j+1,k]=\frac{1}{\sqrt{N}}\sum_{m=0}^{N-1}x[m]2^{(j+1)/2}\varphi(2^{j+1}m-n) \end{displaymath}$

therefore we have a recursive relation between the wavelet transform coefficients of two consecutive scale levels and :

$\begin{displaymath}W_{\psi}[j,k]=\sum_n h_{\psi}[n-2k]W_{\varphi}[j+1,n] \end{displaymath}$

and the same is true to the scaling function:

$\begin{displaymath}W_{\varphi}[j,k]=\sum_n h_{\varphi}[n-2k]W_{\varphi}[j+1,n] \end{displaymath}$

Comparing these equations with a discrete convolution:

$\begin{displaymath}y[k]=h[k]*x[k]=\sum_n h[k-n]x[n] \end{displaymath}$

we see that the wavelet transform coefficients $W_{\varphi}[j,k]$ and $W_{\psi}[j,k]$ at the jth scale can be obtained from the coefficients $W_{\varphi}[j+1,k]$ and $W_{\psi}[j+1,k]$ at the (j+1)th scale by:

Convolution with time-reversed $h_{\varphi}$ or $h_{\psi}$ ;
Sub-sampling to get every other samples in the convolution.

We can therefore write

$\begin{displaymath}\begin{array}{l} W_{\psi}[j,k]=h_{\psi}[-n]*W_{\varphi}[j+1,... ...rphi}[-n]*W_{\varphi}[j+1,n]\vert _{n=2k,k\le 0} \end{array} \end{displaymath}$

Based on these two equations, all wavelet and scaling coefficients $W_{\psi}[j,k]$ and $W_{\varphi}[j,k]$ of a given signal ${\bf x}$ can be obtained recursively from the coefficients $W_{\psi}[J,k]$ and $W_{\varphi}[j,k]$ at the highest resolution level (with maximum details), i.e., the data points ( $m=0,\cdots,N-1$ ) directly sampled from the signal . As a member of space , these discrete samples can be written as a linear combination of the scaling basis functions $\varphi_{J,k}[m]$ :

$\begin{displaymath}x[m]=\sum_k W_{\varphi}[J,k] \varphi_{J,k}[m] \end{displaymath}$

If we let the kth basis function be a unit pulse at the kth sampling time, i.e., $\varphi_{J,k}[m]=\delta[k-m]$ (same as the ith component of a unit vector ${\bf e}_j$ in N-dimensional vector space is $e_{ij}=\delta[i-j]$ ), then the kth coefficient $W_{\varphi}[J,k]$ is the same as the kth sample of the function . I.e., $W_{\varphi}[J,k]=x(k)$ , from which the scaling and wavelet coefficients of the lower scales can be obtained by the subsequent filter banks.

Inverse Wavelet Transform

As the forward wavelet transform - finding the transform coefficients $W_{\varphi}$ and $W_{\psi}$ from a given function - can be implemented by the analysis filter bank, the inverse wavelet transform - reconstructing the function from the coefficients $W_{\varphi}$ and $W_{\psi}$ - can be implemented by the synthesis filter bank.

Complexity of FWT

The computation cost of the fast wavelet transform (FWT) is the convolutions carried out in each of the filters. The number of data samples in the convolution is halved after each sub-sampling, therefore the total complexity is:

$\begin{displaymath}O(N+\frac{N}{2}+\frac{N}{4}+\frac{N}{8}+\cdots+1)=O(N) \end{displaymath}$

i.e., the FWT has linear complexity.

The system shown in the figure above is called a filter bank and is further discussed here

A detailed discussion about Matlab implementaton can be found on this MathWorks webpage.

2D DWT

Matlab code for 2DWT (forward)

Matlab code for 2DWT (inverse)

function [c, s] = wavefast(x, n, varargin)

%WAVEFAST Perform multi-level 2-dimensional fast wavelet transform.
%   [C, L] = WAVEFAST(X, N, LP, HP) performs a 2D N-level FWT of
%   image (or matrix) X with respect to decomposition filters LP and
%   HP.
%
%   [C, L] = WAVEFAST(X, N, WNAME) performs the same operation but
%   fetches filters LP and HP for wavelet WNAME using WAVEFILTER.
%
%   Scale parameter N must be less than or equal to log2 of the
%   maximum image dimension.  Filters LP and HP must be even. To
%   reduce border distortion, X is symmetrically extended. That is,
%   if X = [c1 c2 c3 ... cn] (in 1D), then its symmetric extension
%   would be [... c3 c2 c1 c1 c2 c3 ... cn cn cn-1 cn-2 ...].
%
%   OUTPUTS:
%     Matrix C is a coefficient decomposition vector:
%
%      C = [ a(n) h(n) v(n) d(n) h(n-1) ... v(1) d(1) ]
%
%     where a, h, v, and d are columnwise vectors containing
%     approximation, horizontal, vertical, and diagonal coefficient
%     matrices, respectively.  C has 3n + 1 sections where n is the
%     number of wavelet decompositions. 
%
%     Matrix S is an (n+2) x 2 bookkeeping matrix:
%
%      S = [ sa(n, :); sd(n, :); sd(n-1, :); ... ; sd(1, :); sx ]
%
%     where sa and sd are approximation and detail size entries.
%
%   See also WAVEBACK and WAVEFILTER.

%   Copyright 2002-2004 R. C. Gonzalez, R. E. Woods, & S. L. Eddins
%   Digital Image Processing Using MATLAB, Prentice-Hall, 2004
%   $Revision: 1.5 $  $Date: 2003/10/13 01:14:17 $

% Check the input arguments for reasonableness.
error(nargchk(3, 4, nargin));

if nargin == 3
   if ischar(varargin{1})   
      [lp, hp] = wavefilter(varargin{1}, 'd');
   else 
      error('Missing wavelet name.');   
   end
else
      lp = varargin{1};     hp = varargin{2};   
end

fl = length(lp);      sx = size(x);

if (ndims(x) ~= 2) | (min(sx) < 2) | ~isreal(x) | ~isnumeric(x)
   error('X must be a real, numeric matrix.');     
end
  
if (ndims(lp) ~= 2) | ~isreal(lp) | ~isnumeric(lp) ...
       | (ndims(hp) ~= 2) | ~isreal(hp) | ~isnumeric(hp) ...
       | (fl ~= length(hp)) | rem(fl, 2) ~= 0
   error(['LP and HP must be even and equal length real, ' ...
          'numeric filter vectors.']); 
end
  
if ~isreal(n) | ~isnumeric(n) | (n < 1) | (n > log2(max(sx)))
   error(['N must be a real scalar between 1 and ' ...
          'log2(max(size((X))).']);    
end
  
% Init the starting output data structures and initial approximation.
c = [];       s = sx;     app = double(x);

% For each decomposition ...
for i = 1:n
   % Extend the approximation symmetrically.
   [app, keep] = symextend(app, fl);
   
   % Convolve rows with HP and downsample. Then convolve columns
   % with HP and LP to get the diagonal and vertical coefficients.
   rows = symconv(app, hp, 'row', fl, keep);
   coefs = symconv(rows, hp, 'col', fl, keep);
   c = [coefs(:)' c];    s = [size(coefs); s];
   coefs = symconv(rows, lp, 'col', fl, keep);
   c = [coefs(:)' c];
   
   % Convolve rows with LP and downsample. Then convolve columns
   % with HP and LP to get the horizontal and next approximation
   % coeffcients.
   rows = symconv(app, lp, 'row', fl, keep);
   coefs = symconv(rows, hp, 'col', fl, keep);
   c = [coefs(:)' c];
   app = symconv(rows, lp, 'col', fl, keep);
end

% Append final approximation structures.
c = [app(:)' c];       s = [size(app); s];

%-------------------------------------------------------------------%
function [y, keep] = symextend(x, fl)
% Compute the number of coefficients to keep after convolution
% and downsampling. Then extend x in both dimensions.

keep = floor((fl + size(x) - 1) / 2);
y = padarray(x, [(fl - 1) (fl - 1)], 'symmetric', 'both');

%-------------------------------------------------------------------%
function y = symconv(x, h, type, fl, keep)
% Convolve the rows or columns of x with h, downsample,
% and extract the center section since symmetrically extended.

if strcmp(type, 'row')
   y = conv2(x, h);
   y = y(:, 1:2:end);
   y = y(:, fl / 2 + 1:fl / 2 + keep(2));
else
   y = conv2(x, h');
   y = y(1:2:end, :);
   y = y(fl / 2 + 1:fl / 2 + keep(1), :);
end

******************************************************************

function [varargout] = waveback(c, s, varargin)
%WAVEBACK Performs a multi-level two-dimensional inverse FWT.
%   [VARARGOUT] = WAVEBACK(C, S, VARARGIN) computes a 2D N-level
%   partial or complete wavelet reconstruction of decomposition
%   structure [C, S]. 
%
%   SYNTAX:
%   Y = WAVEBACK(C, S, 'WNAME');  Output inverse FWT matrix Y 
%   Y = WAVEBACK(C, S, LR, HR);   using lowpass and highpass 
%                                 reconstruction filters (LR and 
%                                 HR) or filters obtained by 
%                                 calling WAVEFILTER with 'WNAME'.
%
%   [NC, NS] = WAVEBACK(C, S, 'WNAME', N);  Output new wavelet 
%   [NC, NS] = WAVEBACK(C, S, LR, HR, N);   decomposition structure
%                                           [NC, NS] after N step 
%                                           reconstruction.
%
%   See also WAVEFAST and WAVEFILTER.

%   Copyright 2002-2004 R. C. Gonzalez, R. E. Woods, & S. L. Eddins
%   Digital Image Processing Using MATLAB, Prentice-Hall, 2004
%   $Revision: 1.5 $  $Date: 2003/10/13 01:29:36 $

% Check the input and output arguments for reasonableness.
error(nargchk(3, 5, nargin));
error(nargchk(1, 2, nargout));
   
if (ndims(c) ~= 2) | (size(c, 1) ~= 1)
  error('C must be a row vector.');   
end
  
if (ndims(s) ~= 2) | ~isreal(s) | ~isnumeric(s) | (size(s,2) ~= 2)
  error('S must be a real, numeric two-column array.');   
end
  
elements = prod(s, 2);
if (length(c) < elements(end)) | ...
      ~(elements(1) + 3 * sum(elements(2:end - 1)) >= elements(end))
  error(['[C S] must be a standard wavelet ' ...
         'decomposition structure.']); 
end

% Maximum levels in [C, S].
nmax = size(s, 1) - 2;        
% Get third input parameter and init check flags.
wname = varargin{1};  filterchk = 0;   nchk = 0;    
  
switch nargin
case 3
   if ischar(wname)   
      [lp, hp] = wavefilter(wname, 'r');   n = nmax;
   else 
      error('Undefined filter.');  
   end
   if nargout ~= 1 
      error('Wrong number of output arguments.');  
   end
case 4
   if ischar(wname)
      [lp, hp] = wavefilter(wname, 'r');   
      n = varargin{2};    nchk = 1;
   else
      lp = varargin{1};   hp = varargin{2};   
      filterchk = 1;   n = nmax;
      if nargout ~= 1 
         error('Wrong number of output arguments.');  
      end
   end
case 5
    lp = varargin{1};   hp = varargin{2};   filterchk = 1;
    n = varargin{3};    nchk = 1;
otherwise
    error('Improper number of input arguments.');     
end
  
fl = length(lp);
if filterchk                                        % Check filters.
  if (ndims(lp) ~= 2) | ~isreal(lp) | ~isnumeric(lp) ...
        | (ndims(hp) ~= 2) | ~isreal(hp) | ~isnumeric(hp) ...
        | (fl ~= length(hp)) | rem(fl, 2) ~= 0
      error(['LP and HP must be even and equal length real, ' ...
             'numeric filter vectors.']); 
  end
end

if nchk & (~isnumeric(n) | ~isreal(n))          % Check scale N.
    error('N must be a real numeric.'); 
end
if (n > nmax) | (n < 1)
   error('Invalid number (N) of reconstructions requested.');    
end
if (n ~= nmax) & (nargout ~= 2) 
   error('Not enough output arguments.'); 
end
  
nc = c;    ns = s;    nnmax = nmax;             % Init decomposition.
for i = 1:n
   % Compute a new approximation.
   a = symconvup(wavecopy('a', nc, ns), lp, lp, fl, ns(3, :)) + ...
       symconvup(wavecopy('h', nc, ns, nnmax), ...
                 hp, lp, fl, ns(3, :)) + ...
       symconvup(wavecopy('v', nc, ns, nnmax), ...
                 lp, hp, fl, ns(3, :)) + ...
       symconvup(wavecopy('d', nc, ns, nnmax), ...
                 hp, hp, fl, ns(3, :));
      
    % Update decomposition.
    nc = nc(4 * prod(ns(1, :)) + 1:end);     nc = [a(:)' nc];
    ns = ns(3:end, :);                       ns = [ns(1, :); ns];
    nnmax = size(ns, 1) - 2;
end

% For complete reconstructions, reformat output as 2-D.
if nargout == 1
    a = nc;   nc = repmat(0, ns(1, :));     nc(:) = a;    
end
  
varargout{1} = nc;
if nargout == 2 
   varargout{2} = ns;  
end

%-------------------------------------------------------------------%
function z = symconvup(x, f1, f2, fln, keep)
% Upsample rows and convolve columns with f1; upsample columns and
% convolve rows with f2; then extract center assuming symmetrical
% extension.

y = zeros([2 1] .* size(x));      y(1:2:end, :) = x;
y = conv2(y, f1');
z = zeros([1 2] .* size(y));      z(:, 1:2:end) = y;
z = conv2(z, f2);
z = z(fln - 1:fln + keep(1) - 2, fln - 1:fln + keep(2) - 2);

Applications

The Haar transform is a wavelet transform. For example, when , the transform matrix is

$\begin{displaymath}H_8=\frac{1}{\sqrt{8}}\left[ \begin{array}{cccccccc} 1 & 1 & ... ...\\psi_{2,1}(t) \psi_{2,2}(t) \psi_{2,3}(t) \end{array}\end{displaymath}$

The idea of Laplacian pyramid discussed before is very similar to wavelet transform. The forward transform is the Gaussian (low-pass) filtering and down-sampling process to obtain the Gaussian pyramid (the scaling coefficients) composed of images $f_1, f_2, \cdots$ , and the Laplacian pyramid (the wavelet coefficients) compsed of images $h_1, h_2,\cdots$ as the difference of two consecutive Gaussian images. The inverse transform is the reconstruction of the image from the last Gaussian image (representing the approximation at the lowest scale level) and all Laplacian images ( $i=1,2,\cdots,n$ ) at different scale levels (representing details at different scale levels).

Similar to other orthogonal trnasforms (such as Fourier, discrete cosine, Walsh-Hardamard, etc.) and the Laplacian pyramid image coding, wavelet transform can also be used for signal/image compression and feature extraction.

It is in general much more effective to carry out compression in the transform domain as the signal is likely to be more decorrelated and the energy (information) more concentrated in a small number of components, compared to the original signal (in time or space domain), so that only a small subset of transform coefficients is needed to represent most of the information (energy) contained in the signal. This is the most essential reason why orthogonal transforms are widely used.

However, different from and usually better than other orthogonal transforms, the wavelet transform can represent information of the signal in time (or space), such as the location or translation (represented by the subscript k of $\psi_{j,k}(t)$ for position), as well as in frequency, such as the different levels of details or resolution (represented by the subsript j for different scales). This added property makes it more likely that wavelet transform can represent a given signal more effecitvely and efficiently, in the sense that fewer number of coefficients than other transforms may be needed to approximate the signal.

For example, consider a time signal containing only an isolated narrow shape at some location . In Fourier domain, this signal has a wide spectrum, i.e., a large number of frequency components are necessary to represent the signal. However, when wavelet transform is used, it is possible that a small number of coefficients associated with wavelet functions corresponding to the location of the shape in the signal can closely approximate (represent) the signal, thereby a high compression rate can be achieved.

Due to these advantages of wavelet transform compared to discrete fourier and discrete cosine transforms, it is used to replace the DCT in the image compression standard JPEG (1992) in the new version JPEG 2000.