# Audio source separation using independent component analysis and beam formation

Project Report, 2013, 25 pages

## Excerpt

TABLE OF CONTENTS:

ABSTRACT

LIST OF FIGURES

1. MOTIVATION

2. INDEPENDENT COMPONENT ANALYSIS

2.1 DEFINITION OF ICA

2.2 ESTIMATION OF ICA

2.3 AMBIGUITIES OF ICA

2.4 ILLUSTRATION

3. ASSUMPTIONS

4. ICA ESTIMATION USING NON-GAUSSIANITY MAXIMIZATION

4.1 PRINCIPLE

4.2 KURTOSIS AS A MEASURE OF NON-GAUSSIANITY

4.3 NEGENTROPY AS A MEASURE OF NON-GAUSSIANITY

4.3.1 APPROXIMATION OF NEGENTROPY

5. FREQUENCY DOMAIN APPROACH

6. ACOUSTIC BEAM FORMATION

7. RESULTS

7.1 WHITENING AND PCA

7.2 NEGENTROPY METHOD

7.2.1 TIME DOMAIN ANALYSIS

7.2.2 FREQUENCY DOMAIN ANALYSIS

7.3 ACOUSTIC BEAM FORMING

8. REFERENCES

## ABSTRACT:

Audio source separation is the problem of automatically separating the audio sources present in a room, using a set of microphones placed at different locations to capture the auditory scene. The problem resembles the task a human solves in a cocktail-party situation, where, using two sensors (the ears), the brain can focus on a specific source of interest while suppressing all other sources present (the cocktail-party problem).

For computational and conceptual simplicity, this problem is often represented as a linear transformation of the original audio signals: each component of the observed multivariate signal is a linear combination of the original source signals.

In signal processing, independent component analysis (ICA) is a computational method for separating a multivariate signal into additive subcomponents by assuming that the subcomponents are non-Gaussian signals and that they are all statistically independent from each other. Such a representation seems to capture the essential structure of the data in many applications.

Here we separate audio sources using several criteria suggested for ICA: PCA (principal component analysis), non-Gaussianity maximization using the kurtosis and negentropy methods, a frequency-domain approach based on non-Gaussianity maximization, and acoustic beamforming.

## LIST OF FIGURES:

Figure 1 Mixing of source signals

Figure 2 The general noiseless audio source separation problem

Figure 3 Original unmixed signals

Figure 4 Mixed signals of original signals

Figure 5 The estimates of the original source signals

Figure 6 The joint distribution of the independent components

Figure 7 The joint distribution of the observed mixtures

Figure 8 The joint distribution of the whitened mixtures

Figure 9 The distribution of the super and sub-gaussian functions

Figure 10 PCA distribution plot

Figure 11 Mixed signals for time domain method

Figure 12 Separated signals for time domain method

Figure 13 Mixed signals for frequency domain method

Figure 14 Separated signals for frequency domain method

Figure 15 Beam formed in the direction of source

## 1. MOTIVATION

Imagine that you are in a room where two people are speaking simultaneously. You have two microphones, which you hold in different locations. The microphones give you two recorded time signals, which we denote by x1(t) and x2(t), with x1 and x2 the amplitudes and t the time index. Each of these recorded signals is a weighted sum of the speech signals emitted by the two speakers, which we denote by s1(t) and s2(t). We can express this as a pair of linear equations:

x1(t) = a11 s1(t) + a12 s2(t)

x2(t) = a21 s1(t) + a22 s2(t)

where a11, a12, a21, and a22 are some parameters that depend on the distances of the microphones from the speakers. It would be very useful if you could now estimate the two original speech signals s1 (t) and s2 (t), using only the recorded signals x1(t) and x2(t). This is called the cocktail-party problem.

[Figure not included in this excerpt]

Figure 1: Mixing of source signals.

Here both the parameters aij and the original source signals si are unknown. If the parameters aij were known, the signals si could easily be recovered as the solution of a system of linear equations. Since the aij are unknown, the problem can instead be solved by using some information on the statistical properties of the signals si(t) to estimate the aij. It turns out that it is enough to assume that s1(t) and s2(t) are, at each time instant t, statistically independent and non-Gaussian. This is not an unrealistic assumption in many cases, and it need not hold exactly in practice. The recently developed technique of independent component analysis (ICA) can estimate the aij based on the independence of the sources, which allows us to separate the two original source signals s1(t) and s2(t) from their mixtures x1(t) and x2(t).
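
The two-microphone mixing described above can be sketched numerically. The sources and the mixing parameters a11..a22 below are illustrative values chosen for the example, not measured room parameters:

```python
import numpy as np

# Two illustrative source signals s1(t), s2(t): a sine and a square wave.
t = np.linspace(0, 1, 1000)
s1 = np.sin(2 * np.pi * 5 * t)            # speaker 1
s2 = np.sign(np.sin(2 * np.pi * 3 * t))   # speaker 2

# Assumed mixing parameters (in reality set by room geometry and unknown).
x1 = 0.6 * s1 + 0.4 * s2   # microphone 1: x1(t) = a11*s1(t) + a12*s2(t)
x2 = 0.3 * s1 + 0.7 * s2   # microphone 2: x2(t) = a21*s1(t) + a22*s2(t)
```

The separation task is to recover s1 and s2 from x1 and x2 alone, without knowing the mixing parameters.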

## 2. INDEPENDENT COMPONENT ANALYSIS

### 2.1 Definition of ICA (ICA model)

Assume that there are n linear mixtures x1, ..., xn of n independent components:

xj = aj1 s1 + aj2 s2 + ... + ajn sn, for all j

Dropping the time index t in the ICA model, we assume that each mixture xj, as well as each independent component sk, is a random variable rather than a proper time signal. The observed values xj(t), e.g., the microphone signals in the cocktail-party problem, are then a sample of this random variable. Without loss of generality, we can assume that both the mixture variables and the independent components have zero mean: if this is not true, the observable variables xi can always be centered by subtracting the sample mean, which makes the model zero-mean.
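
Centering is just the subtraction of the sample mean from each mixture; a minimal sketch, with the data shapes and values chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
# x: 2 mixtures (rows) observed over 500 samples (columns), with nonzero mean.
x = rng.uniform(1.0, 2.0, size=(2, 500))

# Center each mixture by subtracting its sample mean, making the model zero-mean.
x_centered = x - x.mean(axis=1, keepdims=True)
```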

It is convenient to use vector-matrix notation instead of the sums like in the previous equation. Let us denote by x the random vector whose elements are the mixtures x1, ...,xn, and likewise by s the random vector with elements s1, ..., sn. Let us denote by A the matrix with elements aij. Bold lower case letters indicate vectors and bold upper-case letters denote matrices. All vectors are understood as column vectors: thus xT , or the transpose of x, is a row vector. Using this vector-matrix notation, the above mixing model is written as

x = As (4)

The statistical model in Eq. 4 is called independent component analysis, or ICA model. The ICA model is a generative model, which means that it describes how the observed data are generated by a process of mixing the components si. The independent components are latent variables, meaning that they cannot be directly observed. Also the mixing matrix is assumed to be unknown. All we observe is the random vector x, and we must estimate both A and s using it. This must be done under as general assumptions as possible.
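In this notation the whole mixing process is a single matrix product. A sketch of the generative model, with an assumed 2×2 mixing matrix and uniform (hence non-Gaussian) components:

```python
import numpy as np

rng = np.random.default_rng(0)
# s: two zero-mean, non-Gaussian (uniform) independent components.
s = rng.uniform(-1.0, 1.0, size=(2, 1000))

# Assumed mixing matrix A (unknown in practice; values are illustrative).
A = np.array([[0.6, 0.4],
              [0.3, 0.7]])

x = A @ s   # observed random vector of mixtures, as in Eq. 4
```

Only x is observed; A and s on the right-hand side are exactly what ICA must estimate.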

### 2.2 Estimation approach

The starting point for ICA is the very simple assumption that the components si are statistically independent. Another assumption is that the independent components must have non-Gaussian distributions. For simplicity, we also assume that the unknown mixing matrix is square, although this assumption can sometimes be relaxed. In many applications it would be more realistic to assume that there is some noise in the measurements, which would mean adding a noise term to the model; for simplicity, we omit any noise terms here, and the noiseless model is sufficient for many applications.
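
The non-Gaussianity assumption can be checked empirically with the excess kurtosis (discussed in Section 4.2): it is zero for a Gaussian and nonzero for most non-Gaussian distributions, e.g. about -1.2 for a uniform (sub-Gaussian) source. A sketch using SciPy, with illustrative synthetic data:

```python
import numpy as np
from scipy.stats import kurtosis

rng = np.random.default_rng(0)
uniform_src = rng.uniform(-1, 1, 100_000)   # sub-Gaussian source
gaussian_src = rng.normal(0, 1, 100_000)    # Gaussian reference

k_u = kurtosis(uniform_src)   # excess kurtosis, close to -1.2 for uniform
k_g = kurtosis(gaussian_src)  # close to 0 for Gaussian
```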

Then, after estimating the matrix A, we can compute its inverse W and obtain the independent components simply by Eq. 5; alternatively, W can be estimated directly without computing A.

s = Wx (5)

[Figure not included in this excerpt]

Figure 2: The general noiseless audio source separation problem.

Figure 2 pictorially represents the ICA model (ignoring noise) for the audio source separation problem, showing the unmixing process. The system outputs ui are the estimated versions of the sources si.
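
The whole mix-then-unmix pipeline can be sketched with scikit-learn's FastICA, which implements the fixed-point non-Gaussianity-maximization algorithm discussed later. The sources and mixing matrix below are illustrative; note that the estimates ui come back only up to order, scale, and sign (the ICA ambiguities of Section 2.3):

```python
import numpy as np
from sklearn.decomposition import FastICA

# Two illustrative non-Gaussian sources: a sine and a square wave.
t = np.linspace(0, 1, 2000)
s = np.vstack([np.sin(2 * np.pi * 7 * t),
               np.sign(np.sin(2 * np.pi * 3 * t))])

A = np.array([[0.6, 0.4],    # assumed mixing matrix, unknown in practice
              [0.3, 0.7]])
x = A @ s                    # observed microphone mixtures

# Estimate the sources; FastICA expects data as (samples, features).
ica = FastICA(n_components=2, random_state=0)
u = ica.fit_transform(x.T).T  # u_i: estimates of s_i, up to order/scale/sign
```

Each row of u should be strongly correlated (positively or negatively) with one of the original sources, even though neither A nor s was given to the algorithm.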

**[...]**

## Details

- Pages: 25
- Year: 2013
- ISBN (eBook): 9783656588870
- ISBN (Book): 9783656588863
- Language: English
- Grade: 10