Saturday, February 26, 2011

Quantized Redshifts. VII. Common Power Spectral Density "Bloopers"

A surprising number of researchers seem to regard the PSD (power spectral density) function as some kind of magical tool that 'finds frequencies'.
The PSD is a very powerful tool, but one with which you can 'shoot your own foot off' if you are not careful.

I've laid out the basic mathematics of the PSD, and this will enable us to explore a few basic claims that have been posed in the creationist and other literature.

Can the Power Spectral Density identify a "Center" of a distribution?

First we should make clear that
  1. In Big Bang cosmology (BBC), in the underlying Friedmann-Robertson-Walker metric,  every point is a center.  No matter where the observer is located, they will interpret the universe around them as being centered on them.  See Misconceptions about the Big Bang (Scientific American), "Is it possible to point to a direction in the sky and say "that way is the center of the universe, where the Big Bang started?"
  2. In addition, because of the finite speed of light (which many creationists try to ignore), when we look into the distant cosmos, we are looking into the distant past.  Since BBC initiates a starting point from which structures grow and evolve, the further out we observe, the universe will begin to look different as we see younger and younger cosmological structures.
The popular retort that the universe will look different from different locations works only in a case where the universe is static (not evolving) and the speed of light is very large or infinite (which violates loads of atomic spectra we observe in distant galaxies as well as using redshift as a measure of anything).  Myself and others have explored the 'decaying speed of light' claims of Barry Setterfield elsewhere (A Changing Speed of Light?).

I have no doubt that many Big Bang opponents will keep repeating this lie.

It has been suggested that the very existence of periodicities in extragalactic Power Spectra is evidence of a geocentric distribution.  Russ Humphreys explored this, suggesting that broader peaks in the PSD implied that the Earth was actually slightly offset from the center of the Universe in Humphreys' strangely titled "Our galaxy is the centre of the universe, ‘quantized’ redshifts show".

The major problem with such a claim is that it can be rigorously demonstrated that the PSD is independent of the location of the 'center' of a dataset, and therefore cannot identify the location of the 'center'.   I hinted at this by my question in my previous post (Quantized Redshifts. V. Exploring the PSD with Simple Datasets).  How do you prove this?

Consider the 1-dimensional case of a some function, f(x) with a 'center' defined as x=0.  Now displace that function on the x-axis by the distance, a, so the 'center' is now at x=a.  This defines a new function, f':
f'(x') = f(x-a)
because x'=x-a.
To see how this affects the PSD, first, take the Fourier transform of both of these functions:

So we see that the Fourier transform of the original function gets multiplied by a complex (real+imaginary) factor
when displaced in position.  But the Power Spectral Density is the Fourier transform multiplied by its own complex conjugate

Therefore the PSD of the original function is EXACTLY the same as the PSD of the function with the displaced center.  It is impossible for the PSD of a distribution to be used to identify its center.  Potentially, one could use the Fourier transform itself to identify a value for the complex offset value found above, but the only way to do this would be to compare the Fourier transforms with a specific model distribution.

This proof is easily extended to three- and four-dimensional transforms.

The bottom line is the PSD cannot identify the center of a distribution.

Adding & Subtracting PSDs

Consider a signal, s(t), that is the sum of two signals, s1(t) and s2(t)
Because the Fourier Transform is linear, it is easy to show
But what about the PSD?  From the definition of the PSD:
We see that the PSD of two summed signals are not necessarily equal to the sums of the individual PSDs.  There is an additional component which is a type of mixed product of the Fourier transforms of the two input signals. There are useful cases where you CAN demonstrate the mixed term is zero, for example, if you are collecting signal from a source and there is an uncorrelated (random) background signal (see Effects of background counts in RMS normalization).

But in general, this mixed product is not automatically zero.

A similar relationship holds if we wish to subtract one signal from another.

Some researchers attempt to split the PSD into two additive components to justify subtraction of the components they perceive to be part of the distribution of the data.  This is an attempt to leave only the oscillating components.  The problem is the distribution they wish to subtract often has significant variation as well. It is the responsibility of the researcher to demonstrate that the input components have no correlations, or are correlated in a simple way, but I have not seen this done.

The errors described above are made in a number of papers claiming quantized redshifts and represent a basic misunderstanding of how the Fourier transform and PSD works.  These are basic errors that anyone who has who taken a course in mathematical methods for physicists (wikipedia) should not make.  One might expect a first-year graduate student or a neophyte to make these types of errors, but not someone with long experience with the PSD. 

But what should really be an embarrassment to some journals is that their peer-reviewers did not catch these errors either!

Next Weekend: An Uncommon Power Spectral Density "Blooper"

Wednesday, February 23, 2011

SpaceMath @ NASA

In this blog, I have repeatedly brought attention to the fact that the real language of science is mathematics (see Mathematics: The Language of Science), not the rhetorical word games that drive pseudo-science 'arguments'. 

Sten Odenwald, who also operates The Astronomy Cafe, also develops math-based lessons with a science & engineering focus for NASA education projects.  The lessons are freely available at SpaceMath @ NASA.

SpaceMath hosts, at the time of this writing, about 400 sample problems covering a range of topics from astronomy to space physics to spacecraft engineering, selected for a variety of educational levels.  Many use REAL astronomical data.  In many cases, the problems are the 'back-of-the-envelope' calculations that lay the ground-work for designing and developing real missions.

Many of these problems are at levels of high school mathematics and science, yet many are more sophisticated than anything I have seen from the pseudo-scientists I have dealt with in this blog.  I wonder how many of those pseudo-scientists could even do the problems at SpaceMath@NASA.

Mathematics enables us to plan, build, and launch missions to other planets.  It enables us to plot the course to a destination and determine the fuel requirements, years before a launch.  The theory of gravity and motion spent over two hundred years applied to the motions of distant planets and stars before the development of rocketry and space flight made them an engineering reality.  Mathematical models of space weather enable us to estimate the radiation environments around the solar system so we know how much radiation shielding is needed for instruments or even a crew.

A theory that can't make even close to reasonable numerical predictions  in these environments, such as the Electric Sun models or neo-geocentrism, is not only useless (see Crank Science: Worse than Wrong), it is, by definition, a pseudo-science.

But it's worse than that.

If we teach such nonsense as the Electric Universe or neo-gencentrism to our children, who will maintain the national infrastructure of space-based communication and weather satellites?  Supporters of these 'alternative sciences' can't provide info on how to do even fundamental computations of things such as the fuel requirements of a rocket to make it to a given orbit, or even estimate how much radiation the crew might receive.  If our children can't do that, then how can they maintain these critical national infrastructures, much less be leaders in the exploration of space?

Saturday, February 19, 2011

Quantized Redshifts. V. Exploring the PSD with Simple Datasets

The ready availability of the Fast Fourier Transform (FFT) in numerics libraries on a variety of computing platforms has turned the FFT into a tool available to many researchers.  However, the availability of a tool does not mean that its behavior is understood.  Too many researchers seem to use the FFT as if it blindly 'finds frequencies' and that these frequencies are physically relevant to their problem of interest.

As I've demonstrated in earlier posts in this series, the fundamental idea behind the Fourier series & transforms is that ALL functions can be decomposed in waves of various frequencies - therefore EVERY SIGNAL will exhibit frequencies in a PSD.  The question becomes which frequencies are physically relevant and not an artifact of the data selection and distribution.

An important issue in working with the PSD is to understand just how it behaves with different types of known data as input.  Generally, if your data contains a clean, sinusoidal signal, the PSD will generate nice, sharp peaks at the frequency of the signal.  But as we will see below, even a little bit of deviation from the perfect, infinite signal changes the PSD.

A note on normalization:  For comparisons, I want to maintain a matching scale between analytically computed power spectra using Fourier transforms with the power spectra using the FFT.  I have adopted the convention of the previous post (Quantized Redshifts. IV. The Fourier Transform, FFT & PSD) for analytic computations.  I have also included a multiplicative factor for normalizing the PSD generated by the FFT.  In Python using the numpy library, this can be expressed as

>>> import numpy as np
>>> from numpy import fft           

FFT computation.  nbins=number of data bins
>>> ff=fft.fft(dataf,nbins)
>>> psd=np.abs(ff)**2/nbins

Analytic computation
>>> psdref=(nbins/(2.0*np.pi))*np.abs(fftref)**2

PSD of Pure Sinusoidal Signal:
First we'll consider the basic case of the sine wave.  The ideal sine wave has infinite in duration in time, t, with some well-defined frequency, \omega_0.  It is expressed mathematically by

But in the real world, we cannot have a sine wave of infinite duration.  All data runs are of finite duration in time and space.  Therefore, we will define a sine wave of finite duration, starting at a time \tau and lasting for time, T.  Mathematically, this can be expressed as:

We can compute the Fourier transform, and compute the FFT of this system, and we plot them below.
Click to enlarge

In this example, the infinite sine wave produces a PSD that is a delta function (wikipedia) at the frequency of the wave, which I've designated with red in the graphic above.  Note that in the above plot, we are plotting the frequency, f, not the angular frequency, \omega.  But the PSD generated by the FFT (blue) and the analytic function (green) are slightly different.  We see this more clearly in the plot below where we plot the amplitude of the PSD on a logarithmic scale.

Click to enlarge
Here we see that the peak is still at 10Hz, but power leaks into other frequencies, above and below the input frequency.  We also see that the two different methods of generating the PSD generate slightly different results.  With the normalization chosen, the amplitudes at the higher frequencies differ by a factor of ten.

PSD of A Couple of (Independent) Sine Waves
Next we'll consider the superposition of two overlapping sine waves of finite duration.  Because the Fourier Transform is a linear transformation, the Fourier transform of the sum of the Fourier transform of the two input waves.  However, when we compute the PSD, we must still compute the product with the total Fourier transform.  This means that the PSD of two signals is not necessarily equal to the sum of the PSDs of the individual signals. 
Click to Enlarge
Again, we will plot the power on a logarithmic scale so we can see how the two computational techniques vary. 

Note that while in this example, the PSD is equal to the sum of the PSDs of the two input signals, this will not always the case for reasons we will see next week. 

PSD of a Square Pulse
Finally, we consider a rectangular pulse, which has a constant amplitude, different from zero, for some finite duration, T.  To make the inquiry more general, I'll also consider offset from time t=0 by a time \tau. 

For the analytic part of the analysis, we compute the Fourier Transform of this function:
And now we compute the power spectrum using the function above, and the FFT.  The results are plotted below.
Click to Enlarge
Rather than plotting the full range of the frequency scale for this test, I've zoomed in on the lower frequencies to reveal how the power is distributed in a series of broad peaks.  We even see a slight difference in the locations of the peaks when comparing the analytic PSD and the PSD computed from the FFT. 

Here again are frequencies in the power spectrum from an input that is clearly perfectly uniform within its boundaries, it has no 'internal structure'. 

This is a very counter-intuitive result.  What generates all these peaks?

Whatever the origin of the peaks, it means we must be very careful when claiming peaks in a PSD represent actual internal structure in a dataset.  To make sure we understand what our analysis techniques are doing, we must test the null hypothesis (Wikipedia) - what kind of PSD is produced if we have a uniform region subjected to the same sampling as our data?

Here's another question.  Suppose we change the offset value, \tau.  How will the Fourier Transform change?  How will the PSD change?  Suppose the signal is not a square pulse, will the answers change?  How is this question relevant to cosmological power spectral analysis?

Some General Observations
  • We see that the power spectrum computed by the Fourier Transform and the FFT are indeed different, but still exhibit many of the same qualitative features, such as the locations of local peak values.  Some of this difference is due to aliasing, in the Discrete Fourier Transform (DFT).
  • Narrow features in time or space are broad in frequency and wave number.  Well-defined frequencies correspond to infinite waves.  Broad structures in spatial dimensions generate low-frequencies in the power spectra.
I've laid the basic mathematical groundwork on what the power spectral density (PSD) is, and how it is computed.

Enough of the introductions.  I've now covered enough material to explore some specific errors.  Time to get to the meat of our story...

Next Weekend: Common Power Spectral Density "Bloopers"

Saturday, February 12, 2011

Quantized Redshifts. IV. The Fourier Transform, FFT & PSD

Continuing my series on Fourier analysis and its application to power spectra...

Fourier Series begat the Fourier Transform...
The Fourier series is a powerful technique, but had the limitation that it required waves whose wavelengths changed in integral fractions of your scale distance, L or 2L.  What happened if you needed wavelengths not subject to that limitation?  Some simple manipulations from the calculus enabled the Fourier series to be extended to the realm of continuous functions in the form of the Fourier Transform. 

In this section, we'll make some notational changes to bring our conventions more in line with usage in the physics community.

1) We adopt the physics custom of presenting the frequency as angular frequency,
related to the traditional frequency, f,
.  Omega is a frequency measured in radians per unit time (there are 2*pi ~6.28  radians in a circle). 

2) While studying the Fourier Series, we separated the sine and cosine terms.  Now we will combine them into the exponential term which is a complex number (it contains real and imaginary parts) using De Moivre's formula  (Wikipedia). 

With these revisions, we can then write the Fourier transform as:

The inverse transform, which converts the transform back into the original function, f(t), becomes

3) We can also do the Fourier transform in a spatial (position) coordinate.   For this, we will choose x (it could easily be y or z).  In this case, the 'conjugate coordinate' is called the wave number, usually designated by the letter k.  The wave number is related to the wavelength, lambda, of the wave by the equation
The wave number is the number of waves (or radians of a wave) per unit of distance.  Note that in frequency and wavelength, their product is the speed of the wave.  Similarly, if we use angular frequency and wave number, it is basic algebra, using the definitions above, to demonstrate the relationship
In the case of spatial dimensions, the Fourier Transforms and its inverse can be written:
As with the Fourier series, these the normalization of these equations can change slightly based on the conventions chosen.  This is generally not a problem so long as the conventions are used consistently.

The Discrete Fourier Transform

But the Fourier transform has a problem when applying it to real data because real data generally does not make a truly continuous function.  When a measurement is made, it is determined at a specific time, t, and a position, (x,y,z), in a given coordinate system. A full dataset would consist of many of such measurements at discrete times and/or positions.

If we take measurements uniformly sampled in time, at intervals T, then each measurement takes place at a time t_k, where k is an integer.
We would say that T defines our bin size.  From this, we can compute the Discrete Fourier Transform (DFT)
 where the value of the transform, F, is computed in bins corresponding to discrete frequencies, fn,
We can also define the inverse transformation,
Because of the discrete nature of the time sampling and the finite size of the dataset, this yields a finite frequency resolution.  That is, there is a minimum frequency that can be resolved by the DFT.  This minimum frequency also defines the minimum frequency that you can resolve for a given dataset.  You can make this frequency resolution smaller by increasing the duration of the total amount of data. 

The DFT also an upper limit to the frequency which can be identified in a given dataset, called the Nyquist frequency.  This frequency can be increased by increasing the the number of samples, N, in a fixed duration of time.

In one-dimensional cases, we can generally freely interchange the pairs of time/frequency and position/wave number.

...begat the Fast-Fourier Transform (FFT)...

With the advent of digital computers, the hunt was on for algorithms that could efficiently use the computational power.  While the Fourier transforms were computationally simple, the fact was that researchers wanted to apply these transforms to large amounts of data.  Using the standard recipe above, computational time required grew as N^2, where N was the number of samples.  If you wanted to compute a transform for four times as many data points, the computation generally took 4^2 = 16 times longer.  This would quickly become computationally expensive as dataset became larger.

Enter the Fast-Fourier Transform (FFT), the best known of the algorithms developed by Cooley & Tukey (wikipedia) in the mid-1960s.  This technique sped up the computation of the DFT, but with a couple of trade-offs.  The speedup was achieved by efficiently allocating the computation in powers of two, and using shift registers inside the processor which could do powers of two operations very quickly.  The most notable trade-off was that the algorithm was most optimized with data samples with integral powers of two, such as 2, 4, 8, 16,...1024, 2048, ...16384,... 2^n.

The major implication of this is that the FFT is not identically equal to the Fourier transform of a given function.

...and Power Spectral Density (PSD)

The Fourier transform also has an analog to the Parseval Relation (wikipedia), as noted in Quantized Redshifts. III.  How is Fourier Series Related to the PSD?  For the Fourier Transform, the relation is
For the Discrete Fourier Transform (DFT) and the Fast-Fourier transform (FFT), this relationship can be written:
By this analogy, we define the Power Spectral Density (PSD), or Power Density Spectra (PDS), or Power Spectra (PS) as

By these definitions, the PSD is always positive (negative power doesn't really make sense).  Phase information about the waves in the dataset, which is determined by comparing the real and imaginary components, is partially lost in the PSD. 

General Properties and Insights
  • The DFT and FFT are symmetric functions, so the transform value at frequencies greater than zero is equal to the value at frequencies less than zero.  For this reason, we generally only plot the positive frequency side.  In the case of the FFT, the array bins corresponding to the negative frequencies are actually stored in the 'upper' half of the array.  Often, researchers are only interested in the location of any structures, but in cases where we want to measure the relative power, the researcher may multiply amiplitude of the positive frequencies by two to account for the power in the negative frequencies (and satisfy the Parseval relation).
  • In the DFT and the FFT, the Fourier transformed term in bin zero (corresponding to zero frequency or a constant), contains information about the symmetry of the function above and below zero.  If the term is zero, it means the function contains equal positive and negative amplitudes.  For most real datasets where the data are all positive measurements, this quantity is kind of redundant, and often very large compared to the frequency distribution the researcher is interested in.  Therefore it is sometimes the custom to not plot the zero frequency bin.
  • Another problem one encounters in the transition of Fourier transforms on functions to DFT and FFTs on discrete data points is the problem of aliasing.  In the case of Fourier transforms on functions, we can generally find that the transform has a no-zero amplitude at frequencies that can range from zero to infinite (and negative infintity).  This happens even if the original function is non-zero over a finite range of time.  But what happens in the DFT and FFT where there is a maximum frequency (the Nyquist frequency) that can be analyzed?  The answer is that the power from those frequencies above the Nyquist frequency get aliased into frequencies below the Nyquist frequency.  This means that some of the power measured at a given frequency is not necessarily from that frequency and care must be exercised in the interpretation.
Extending the Fourier Transform to Three Dimensions

We examined the one-dimensional Fourier transform above.  However, we can expand the tranform from 1-dimension to 3-dimensions.  This is easier when done using the wave number, k, because it acts as a 3-dimensional vector.

Using the wave number, we express the represents the direction of the wave with  k = 2pi/lambda * direction vector, or
With a 3-dimensional form of the wave number vector, we can define the phase as the projection of the wave number vector on the direction vector,

With this definition of the phase, the Fourier Transform can then be extended to three dimensions as

with the inverse transformation defined as
where the integration is performed over the volume of the system. 

This is the proper 3-dimensional form of the Fourier transform that preserves the orthogonality characteristics (that the 'waves' identified are linearly independent).  I shall show in a future post that the radial form used by many quantized redshift supporters is NOT equivalent to this form and therefore cannot reliably identify periodicites in a 3-dimensional space.

The Fourier transform can also be extended to 4-dimensions (spatial + temporal) where the function is integrated over a volume and time.  The term in the complex exponential term becomes wt-k*x, which is an invariant in relativity.  This form is used heavily in electromagnetism in design of antennas and waveguides.

Next, we explore the power spectral density of some known datasets.

Next Weekend: Playing with the PSD

Saturday, February 5, 2011

Quantized Redshifts. III. How is Fourier Series Related to the PSD?

Continuing with the introduction to Fourier analysis...

The amplitude of each term satisfies a relationship with the original function called the Parseval equality (Wikipedia).
Mathematical Note: some readers may not have the mathematical background to make sense of some f the symbols used in this equation or others in this blog (and particularly this series of posts).  The vertically stretched 'S' on the left hand side of the equation is the calculus symbol for integration (wikipedia), basically a method of summing little chunks of 'area' under a smoothly varying function.  The funky-looking 'E' on the right side of the equation is the Greek letter Sigma, and represents summation of a series of values (wikipedia).  On a personal note, it was curiosity about these symbols when I was in the eight-grade that motivated me to learn calculus on my own.  It's value extends far beyond science and mathematics.

Physically, we can think of this as an expression of conservation of energy, that the total energy in the signal, the square of the amplitude of the function, f(x), must be equal to the total energy in each frequency composing the signal.  If we plot the square of amplitude of each frequency (an^2+bn^2), we obtain a plot which is the equivalent of the power spectral density (PSD).
Note that the PSD for the square wave displays power not just in the zero frequency bin, but also has smaller peaks of power at higher frequencies in between frequencies of zero power!  Beyond the fact that the square wave repeats with a wavelength of 20, why would it have these additional peaks?!  It is our first clue that one must be very careful when interpreting the PSD.

We'll examine a few more cases, such as the sawtooth wave, were the Fourier coefficients are computed to be
A plot like those above yields:
Click to Enlarge
Due to the sharp discontinuity at L=20, we observe a small overshoot that is part of the Gibbs phenomenon.

Let's also look at the half-sine (rectifier) wave with Fourier coefficients

Click to Enlarge
Notice that in all these graphs, there is power in frequencies other than the zero (no frequency) bin.  With appropriate scaling, one would see there are non-zero amplitudes for values of n up to infinity.

Are the frequencies real?

One should be careful to define what one means by 'real'.  I'd be inclined to define 'real' in this case as if the frequencies found are representative of the underlying physical mechanism for forming the original signal as opposed to an artifact of the sampling.  And since all non-zero signals exhibit some sampling, there will always be sampling artifacts in the resulting PSD.

The Fourier series demonstrates that there is an alternative way to represent almost any finite function.  In many ways, it is as indistinguishable as saying that you could represent the number 10 by 3+2+5 or 4+6.  Think about the way you do pen-and-paper arithmetic, especially subtraction and division, and you'll realize these alternative representations are done all the time.  The Fourier series becomes useful by providing an additional set of mathematical 'tricks' which may facilitate solving some other problem.

For example, in the case of the half-sine (rectifier) example above, such as signal is usually generating by running a sine wave signal (such as A/C) through a non-linear device like a rectifier, which chops off the negative (reverse) part of the current.  In this case, the signal does not have to be formed by a summation of the frequencies in the Fourier series (though that would be an alternative way to form it).  In this case, I would be inclined to say the frequency components of the signal are not 'real', but simply an alternative representation.

But suppose you're examining a process where you expect to have some regular structures or variation?  Then the peaks might tell you more about the structure under examination, but one must exercise extreme care.

The major point of this section is to introduce the fact that non-zero power in the frequency bin of a given signal profile is NOT evidence of a real frequency indicating some intrinsic structure.  I will demonstrate more examples of this in the coming sections.

More online Fourier resources
Models were generated using Python v2.6, v2.7, numpy 1.5, scipy 0.8.  Plots generated using matplotlib 1.0.0

Next Weekend: The Fourier Transform, FFT and PSD, Oh My!