Tests of distributions - Installing R (CRAN) packages

Gordon Haverland

Like a lot of things in statistics, you cannot prove that two
distributions are the same. What you can show is that the p-value
from some metric you calculate is larger than a chosen threshold, so
the data give no reason to say they differ.

The F-test compares variances. The t-test compares means. An older
one for distributions is the Kolmogorov–Smirnov test (K–S test or KS
test).

A nice thing about the K-S test is that it is non-parametric. You can
use it for arbitrary distributions of data.
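
As far as I understand it, the two-sample K-S statistic is just the
largest vertical gap between the two empirical cumulative distribution
functions. A minimal sketch of that arithmetic, in plain Python rather
than R and with made-up data, might look like the following. The
function name ks_statistic is my own and not from any library, and it
only computes the statistic, not a p-value.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Largest vertical gap between the two empirical CDFs."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    largest_gap = 0.0
    # Both empirical CDFs are step functions that only change at observed
    # values, so it is enough to check the gap at those values.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)  # fraction of a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # fraction of b that is <= x
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


# Hypothetical data, only for illustration.
identical = ks_statistic([1, 2, 3, 4], [1, 2, 3, 4])
shifted = ks_statistic([1, 2, 3, 4], [11, 12, 13, 14])
print(identical, shifted)  # prints: 0.0 1.0
```

Nothing here assumes a particular shape for the data, which is what
non-parametric means in practice: only the ordering of the values
matters.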

Is it the best test? It's old, so probably not. A little looking
around produced an astrophysics comment which said that the K-S test
was bad for a bunch of well-thought-out reasons, and suggested using
the Anderson-Darling test instead. That test was invented in 1952,
which is also old, so I went looking for improvements on it. There
was mention that Shapiro-Wilk was better.

Well, it seems there is an R package (kSamples) which runs a few
different tests against data. Part of the tarball needs to be
compiled.

While I have installed modules for Perl in the past, I don't know
whether the way Debian sets up R allows this to happen easily.

Has anyone installed modules "manually" in R? How manual is it? Do
you really need to know what you are doing?



