Analysis of variance (ANOVA)¶
ANOVA can mean several things: an actual decomposition of variance, a comparison of group means, or a way of presenting regression results
Boils down to a regression with dummy (categorical) variables
Terminology is carried over heavily from design of experiments (see definitions section here)
Standardised result tables with SS (sum of squares), DF (degrees of freedom), MSS (mean sum of squares), F, p
Frightening multitude of R packages
May want to look at a simple reference case
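The table columns listed above can be computed by hand for the one-way case. A minimal sketch, assuming made-up data and a hypothetical helper `one_way_anova` (neither comes from the linked references); it uses only the Python standard library, so the p column is left out:

```python
from statistics import fmean


def one_way_anova(groups):
    """Return the rows of a one-way ANOVA table as nested dicts."""
    values = [x for g in groups.values() for x in g]
    n, k = len(values), len(groups)
    grand_mean = fmean(values)

    # Between-group SS: spread of the group means around the grand mean.
    ss_between = sum(
        len(g) * (fmean(g) - grand_mean) ** 2 for g in groups.values()
    )
    # Within-group SS: spread of observations around their own group mean.
    ss_within = sum((x - fmean(g)) ** 2 for g in groups.values() for x in g)

    df_between, df_within = k - 1, n - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "between": {
            "SS": ss_between,
            "DF": df_between,
            "MSS": ms_between,
            "F": ms_between / ms_within,
        },
        "within": {"SS": ss_within, "DF": df_within, "MSS": ms_within},
        "total": {"SS": ss_between + ss_within, "DF": n - 1},
    }


# Illustrative data only: three groups of three observations each.
data = {"a": [1, 2, 3], "b": [2, 3, 4], "c": [6, 7, 8]}
table = one_way_anova(data)
for source, row in table.items():
    print(source, row)

# The p column needs the F distribution's survival function, for example
# scipy.stats.f.sf(F, df_between, df_within) if SciPy is installed.
```

For this toy data the decomposition is 48 = 42 (between) + 6 (within), so F = (42/2) / (6/6) = 21.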
Quote:
ANOVA can be seen as “syntactic sugar” for a special subgroup of linear regression models. ANOVA is regularly used by researchers who are not statisticians by training. They are now “institutionalized” and its hard to convert them back to using the more general representation
(suncoolsu)
Tweet:
I remember learning about ANOVA and thinking this is just linear regression with factors. Why does it even have its own name?! 😂
— Abhinav Maurya (@ahmaurya) June 21, 2019
Code examples¶
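One way to see that ANOVA is a regression with dummy variables is to fit that regression and read the F statistic off its sums of squares. A minimal sketch, assuming made-up data and hypothetical helpers `solve` and `ols` written here for illustration (plain normal equations, standard library only, not a numerically robust fitting routine):

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(X, y):
    """Least-squares coefficients from the normal equations X'X b = X'y."""
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)


# Illustrative data only: one categorical factor with three levels.
labels = ["a"] * 3 + ["b"] * 3 + ["c"] * 3
y = [1, 2, 3, 2, 3, 4, 6, 7, 8]

levels = sorted(set(labels))  # the first level, "a", is the reference
# Design matrix: intercept plus one 0/1 dummy per non-reference level.
X = [
    [1.0] + [1.0 if lab == lev else 0.0 for lev in levels[1:]]
    for lab in labels
]

beta = ols(X, y)  # intercept = mean of "a"; others = differences from it
fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]

n, p = len(y), len(beta)
y_bar = sum(y) / n
ss_total = sum((yi - y_bar) ** 2 for yi in y)
ss_resid = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # "within" SS
ss_model = ss_total - ss_resid                               # "between" SS

# Overall regression F test, identical to the one-way ANOVA F.
f_stat = (ss_model / (p - 1)) / (ss_resid / (n - p))
print("coefficients:", beta)
print("F:", f_stat)
```

The fitted values are just the group means, so the residual sum of squares is the within-group SS and the model sum of squares is the between-group SS; the regression F test and the ANOVA F test are the same number (21 for this toy data, up to floating-point rounding).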
Links¶
Intro by Jim
Cross-Validated has several general discussions:
… followed by ANOVA vs regression:
NIST Handbook deals with ANOVA assumptions and interpretations, and provides reference datasets:
A very simple and illustrative NIST reference case is here.
Compact Julia package ANOVA.jl, at about 150 lines of code, but without much documentation yet.
ANOVA is again a case where Russian Wikipedia is more concise and clearer on the subject.
‘Types’ of sum of squares and associated confusion:
References¶
Gelman, A. (2005). Analysis of variance: why it is more important than ever (with discussion). Annals of Statistics 33, 1–53. doi:10.1214/009053604000001048