Difference between revisions of "Calc/To-Dos/Statistics/Miscellaneous Data Analysis"

From Apache OpenOffice Wiki
(initial page creation (stub))
 
m (Goal: Corrected Link 2)
 
(15 intermediate revisions by 2 users not shown)
Front matter....
 
= Goal =

One of the most important tasks in data analysis is to describe the data optimally. Other important issues include extracting all the useful information from the original data set and describing complex relationships and variability. Some of these are accomplished using statistical techniques (especially in the biomedical sciences, see [[Calc/To-Dos/Statistical Data Analysis Tool]]), yet most require different techniques, which are described here.

Unfortunately, I lost all contact with mathematics more than 10 years ago. Therefore, my comments will be very brief, and I hope that people with interest and knowledge will develop this page further.
 
 
...
 
= Specific Techniques =

== Summarizing Data ==
Methods that summarize the information in a limited number of components, e.g. linear dimension reduction:
* Principal Component Analysis:
** extracts most of the variability from the original data;
** the resulting variables are uncorrelated;
** is the optimal linear transformation (in the least-squares sense);
** disadvantage: the resulting variables might be difficult to interpret (they need not have any logical meaning);
** see http://en.wikipedia.org/wiki/Principal_components_analysis
* Varimax:
** see http://de.wikipedia.org/wiki/Varimax
** see also: http://sekhon.berkeley.edu/stats/html/varimax.html (an R implementation: package ''stats'')
* Simple Component Analysis:
** the variables are not necessarily uncorrelated,
** but are easier to understand and interpret;
** see Rousson V, Gasser T. Simple component analysis. Appl. Statist. 2004; 53:539–555, http://www.unizh.ch/biostat/Manuscripts/simpcomp.pdf
** R implementation: http://www.maths.lth.se/help/R/.R/library/sca/html/00Index.html (package ''sca'')

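To illustrate the Principal Component Analysis item above, here is a minimal sketch in plain Python. It is not a proposal for the Calc implementation: the data set, the function names (<code>covariance_matrix</code>, <code>mat_vec</code>, <code>first_component</code>) and the use of power iteration to find only the first component are all assumptions made for this example.

```python
import math

def covariance_matrix(rows):
    """Return (sample covariance matrix, mean-centered rows)."""
    n, dim = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dim)]
    centered = [[r[j] - means[j] for j in range(dim)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(dim)]
           for i in range(dim)]
    return cov, centered

def mat_vec(m, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def first_component(cov, iterations=100):
    """Power iteration: largest eigenvalue of cov and its unit eigenvector.

    Assumes the start vector is not orthogonal to the leading eigenvector.
    """
    dim = len(cov)
    v = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iterations):
        w = mat_vec(cov, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    variance = sum(a * b for a, b in zip(v, mat_vec(cov, v)))
    return variance, v

# Hypothetical two-variable data set lying roughly on the line y = 2x.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.1)]

cov, centered = covariance_matrix(data)
variance, direction = first_component(cov)

# Share of the total variability captured by the first component.
total_variance = sum(cov[i][i] for i in range(len(cov)))
explained = variance / total_variance

# Coordinates of each observation along the first component.
scores = [sum(c * d for c, d in zip(row, direction)) for row in centered]

print("direction:", direction)
print("explained share:", explained)
```

For this data the first component points roughly along (1, 2) and captures almost all of the variability, so the two original columns could be summarized by the single column <code>scores</code>.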
== Energy-Frequency Analysis ==

* Fourier Transform (limited to stationary and linear data)
* Wavelet analysis
* Wigner-Ville distribution
* Empirical Mode Decomposition: more robust (also applicable to nonlinear and non-stationary time series)
** a detailed description is freely available here: [http://keck.ucsf.edu/~schenk/Huang_etal98.pdf The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis]. Huang et al. Proc. R. Soc. Lond. A 1998; 454:903-995
** see also [http://perso.ens-lyon.fr/patrick.flandrin/NSIP03.pdf this document] for another good description of the algorithm and [http://perso.ens-lyon.fr/patrick.flandrin/emd.html this] accompanying PowerPoint presentation (see ''emd.ppt'')
** further articles to download can be found [http://perso.ens-lyon.fr/patrick.flandrin/publis.html here], like [http://perso.ens-lyon.fr/patrick.flandrin/emd_spl.pdf this one]
** search also for '''"Empirical Mode Decomposition"''' to find additional material

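As an illustration of the Fourier Transform item above, the following sketch computes the energy spectrum of a stationary test signal with a naive discrete Fourier transform. The signal, the sample rate and the function name <code>dft</code> are assumptions made for this example; a real implementation would more likely use a fast Fourier transform.

```python
import cmath
import math

def dft(signal):
    """Naive O(n^2) discrete Fourier transform of a sequence of numbers."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n))
            for k in range(n)]

# Hypothetical stationary signal: a 5 Hz sine plus a weaker 12 Hz sine,
# sampled for one second at 64 samples per second.
sample_rate = 64
n_samples = 64
signal = [math.sin(2 * math.pi * 5 * t / sample_rate)
          + 0.5 * math.sin(2 * math.pi * 12 * t / sample_rate)
          for t in range(n_samples)]

spectrum = dft(signal)

# Energy per frequency bin; only the first half is needed for a real signal.
energy = [abs(c) ** 2 for c in spectrum[: n_samples // 2]]

# Convert the strongest bin back to a frequency in Hz.
dominant_bin = max(range(len(energy)), key=energy.__getitem__)
dominant_hz = dominant_bin * sample_rate / n_samples

print("dominant frequency (Hz):", dominant_hz)
```

The spectrum shows which frequencies carry energy but not when they occur, which is one reason the other methods in this list are of interest for non-stationary data.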
== Data Mining ==

See also http://en.wikipedia.org/wiki/Data_mining
  
 
= Resources =
 
== Links ==
 
* ...
 
[[Category:Calc|Statistics Tool]]
[[Category:To-Do]]
[[Category:Statistics Tool|Miscellaneous]]

Latest revision as of 18:56, 1 August 2007
