Statistical Analysis

Introduction: What is Statistical Analysis?

Statistical analysis is a collection of methods for processing large amounts of data and reporting overall trends. It is particularly useful for dealing with noisy data, and it provides ways to report objectively on how unusual an event is, based on historical data. Statistics has two main areas, descriptive statistics and inferential statistics, which are related but distinct.

Descriptive Statistics

Descriptive statistics is the process of summarizing the characteristics of a set of measurements. Generally speaking, it involves an observational study of a population. Charts and graphs play an important role, as do standard measurements such as averages, percentiles, and measures of variation such as the standard deviation. For example, in a paper reporting on a study involving human subjects, a table typically gives the overall sample size, the sample sizes in important subgroups, and demographic or clinical characteristics such as the average age, the proportion of subjects of each gender, and the proportion of subjects with related co-morbidities.
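
As a rough illustration of the standard measurements mentioned above, the following Python sketch computes a few of them with the standard library's statistics module. The ages list and the describe helper are hypothetical, made up for this example only; they do not come from any real study.

```python
import statistics

# Hypothetical ages of study participants (illustrative data only).
ages = [23, 35, 41, 29, 52, 47, 35, 60, 38, 44]


def describe(values):
    """Return a dictionary of common descriptive statistics for a sample."""
    # quantiles(..., n=4) returns the three cut points that split the data into quarters.
    quartiles = statistics.quantiles(values, n=4)
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "q1": quartiles[0],                  # 25th percentile
        "q3": quartiles[2],                  # 75th percentile
        "stdev": statistics.stdev(values),   # sample standard deviation
    }


summary = describe(ages)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

A summary like this is the kind of information that would typically be reported in the first table of a study, alongside charts of the same data.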

Inferential Statistics

Inferential statistics measures the trustworthiness of conclusions about a population parameter based on information from a subset of that population, called a random sample. It has many possible uses, for example political predictions. To predict the winner of a presidential election, a sample of Americans is chosen and asked which way they will be voting. From the answers given, statisticians are able to predict how the general population will vote with a high level of confidence. The keys to inferential statistics are choosing which members of the general population will be polled and which questions will be asked.
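
The polling example can be sketched in Python as follows. This is a minimal illustration, assuming a simple random sample and using the normal-approximation (Wald) interval for a proportion, which is only one of several possible methods. The 52% support level, the sample size of 1000, and the function name proportion_confidence_interval are all assumptions chosen for the example.

```python
import math
import random


def proportion_confidence_interval(successes, n, z=1.96):
    """Normal-approximation (Wald) interval for a population proportion.

    z=1.96 corresponds to roughly 95% confidence.
    Returns (estimate, lower bound, upper bound).
    """
    p_hat = successes / n
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z * standard_error
    return p_hat, max(0.0, p_hat - margin), min(1.0, p_hat + margin)


# Hypothetical population in which 52% of voters favor candidate A (assumed value).
TRUE_SUPPORT = 0.52
SAMPLE_SIZE = 1000

# Simulate polling a simple random sample; the seed makes the run repeatable.
random.seed(0)
votes_for_a = sum(random.random() < TRUE_SUPPORT for _ in range(SAMPLE_SIZE))

estimate, low, high = proportion_confidence_interval(votes_for_a, SAMPLE_SIZE)
print(f"Estimated support: {estimate:.3f}")
print(f"Approximate 95% interval: [{low:.3f}, {high:.3f}]")
```

The width of the interval shrinks as the sample size grows, which is why the choice of who is polled, and how many, matters so much for the confidence of the prediction.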

Software Programs: Open source [source: more info can be found in wiki]

  • ADMB – a software suite for non-linear statistical modeling based on C++ which uses automatic differentiation.
  • Apophenia – a library of statistical functions for C, on the same level of abstraction as most stats packages.
  • Bayesian Filtering Library
  • Chronux – for neurobiological time series data
  • DAP – A free replacement for SAS
  • ELKI – a software framework for development of data mining algorithms in Java.
  • gretl – GNU Regression, Econometrics and Time-series Library
  • JAGS – Just another Gibbs sampler (JAGS) is a program for analysis of Bayesian hierarchical models using Markov Chain Monte Carlo (MCMC) developed by Martyn Plummer. It is similar to WinBUGS.
  • JHepWork – Java-based data analysis framework for scientists and engineers. It includes an advanced IDE and Jython shell.
  • JMulTi
  • Octave – programming language (very similar to Matlab) with statistical features
  • OpenBUGS
  • OpenEpi – A web-based, open source, operating system-independent series of programs for use in epidemiology and statistics based on JavaScript and HTML
  • Ploticus – software for generating a variety of graphs from raw data
  • PSPP – A free software replacement for SPSS
  • R
  • R Commander – a GUI for R
  • RapidMiner – a machine learning toolbox
  • Shogun – an open source large-scale machine learning toolbox that provides several SVM (Support Vector Machine) implementations (such as libSVM and SVMlight) under a common framework, with interfaces to Octave, Matlab, Python, and R
  • Simfit – Simulation, curve fitting, statistics, and plotting
  • SOCR
  • SOFA Statistics – a desktop GUI program focused on ease of use, learn as you go, and beautiful output.
  • Statistical Lab – R-based and focusing on educational purposes

[Figure: Gretl screenshot]