Difference between revisions of "Calc/To-Dos/Statistics/Miscellaneous Data Analysis"

From Apache OpenOffice Wiki

Revision as of 18:01, 5 August 2006

Front matter....

Goal

One of the most important tasks in data analysis is to describe the data optimally. Other important tasks include extracting all the useful information from the original data set and describing complex relationships and variability. Some of these are accomplished using statistical techniques (especially in the biomedical sciences, see http://wiki.services.openoffice.org/wiki/Statistical_Data_Analysis_Tool), yet most rely on different techniques, which are described here.

Unfortunately, I lost all contact with mathematics more than 10 years ago. Therefore, my comments will be very brief, and I hope that people with interest and knowledge will develop this page further.

...

Specific Techniques

Summarizing Data

Methods that summarize the information in a limited number of components, e.g. linear dimension reduction:

  • Principal Component Analysis:
    • most of the variability is extracted from the original data;
    • the resulting variables are uncorrelated;
    • disadvantage: the resulting variables might be difficult to interpret (they need not have any logical meaning)
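  • Varimax
    • see http://de.wikipedia.org/wiki/Varimax
    • see also: http://sekhon.berkeley.edu/stats/html/varimax.html (an R implementation: package stats)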
  • Simple Component Analysis:
    • variables are not necessarily uncorrelated
    • but are easier to understand and interpret
    • see Rousson V, Gasser T. Simple component analysis. Appl. Statist. 2004; 53:539–555
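
As an illustration of the Principal Component Analysis properties listed above, here is a minimal sketch in Python with NumPy. It is not a proposal for how Calc should implement the feature; the function name `pca`, the synthetic data and the choice of the singular value decomposition are assumptions made only for this example.

```python
import numpy as np

def pca(data, n_components):
    """Project the rows of `data` onto its leading principal components.

    `pca` is a hypothetical helper written for this example only.
    """
    # Center each column so that the components describe variability
    # around the mean.
    centered = data - data.mean(axis=0)
    # Singular value decomposition of the centered data: the rows of `vt`
    # are the principal axes, ordered by decreasing singular value.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    # The scores are the new, uncorrelated variables.
    scores = centered @ components.T
    # Sample variance carried by each retained component.
    explained_variance = singular_values[:n_components] ** 2 / (data.shape[0] - 1)
    return scores, components, explained_variance

# Synthetic data (an assumption for illustration): the second column is
# almost a multiple of the first, the third column is independent noise.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
data = np.column_stack([
    x,
    2 * x + rng.normal(scale=0.1, size=200),
    rng.normal(size=200),
])

scores, components, explained_variance = pca(data, 2)
score_covariance = np.cov(scores, rowvar=False)
```

In this sketch the off-diagonal entries of `score_covariance` are zero up to rounding error, which is the "uncorrelated" property, and `explained_variance` is sorted in decreasing order. Each row of `components` mixes all original columns, which is why the resulting variables can be hard to interpret.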

Energy-Frequency Analysis

  • Fourier Transform (limited to stationary and linear data)
  • wavelet analysis
  • Wigner-Ville distribution
  • Empirical Mode Decomposition: more robust for nonlinear and non-stationary data (see Huang et al. The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proc. R. Soc. Lond. A 1998; 454:903–995; a detailed description is freely available online)
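
To make the Fourier Transform entry above concrete, the following is a small sketch in Python with NumPy that computes an energy spectrum for a stationary test signal. The sampling rate, the two test frequencies and all variable names are assumptions chosen for this example; it says nothing about how such a function would be exposed in Calc.

```python
import numpy as np

# Hypothetical sampling setup: 200 samples at 100 Hz (2 seconds of data).
sample_rate = 100.0
n_samples = 200
t = np.arange(n_samples) / sample_rate

# Stationary test signal: a strong 5 Hz sine plus a weaker 12 Hz sine.
signal = np.sin(2 * np.pi * 5.0 * t) + 0.5 * np.sin(2 * np.pi * 12.0 * t)

# Real-input FFT and the matching frequency axis in Hz.
spectrum = np.fft.rfft(signal)
freqs = np.fft.rfftfreq(n_samples, d=1.0 / sample_rate)

# Energy per frequency bin (squared magnitude of the spectrum).
energy = np.abs(spectrum) ** 2

# Frequency that carries the most energy, and the two strongest bins.
dominant_frequency = freqs[np.argmax(energy)]
top_two = sorted(freqs[np.argsort(energy)[-2:]])
```

Because the signal is stationary, the spectrum shows two sharp peaks at the test frequencies. For a signal whose frequency content changes over time, this single spectrum would smear that change across all bins, which is the limitation noted in the list above.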

Resources

Links

  • ...