ARTool

Align-and-rank data for a nonparametric ANOVA

Jacob O. Wobbrock, University of Washington [contact]

Leah Findlater, University of Washington

Darren Gergle, Northwestern University

James J. Higgins, Kansas State University

Matthew Kay, University of Michigan

Download

Current Version 1.6.2

Windows executable: ARToolExe.zip

Source code: ARTool.zip

[R] version: ARTool package

The Windows version of ARTool requires the Microsoft .NET Framework 4.6.1.

This software is distributed under the New BSD License agreement.

Publication

Wobbrock, J.O., Findlater, L., Gergle, D. and Higgins, J.J. (2011). The Aligned Rank Transform for nonparametric factorial analyses using only ANOVA procedures. Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI '11). Vancouver, British Columbia (May 7-12, 2011). New York: ACM Press, pp. 143-146. Honorable Mention Paper.

Related Statistics Study Guide

*Practical Statistics for HCI*, a self-guided independent study guide for human-computer interaction
researchers and students. Freely available online.

Related Coursera Course

Designing, Running & Analyzing Experiments, a course offered through Coursera.org and taught by Prof. Wobbrock.

Purpose

The need for a general nonparametric factorial analysis is acute
for many types of data obtained in human-computer interaction (HCI) studies,
especially for repeated measures data.
The Kruskal-Wallis and Friedman tests handle only **one** factor of *N* levels,
and therefore cannot be used to examine interaction effects.
Examples of data warranting nonparametric factorial analyses are those obtained from
ordinal Likert-type scales, error rates that occur in human performance studies,
or preference tallies. These measures often cannot be transformed
for suitability to ANOVA, e.g., with the popular log transform
(Aitchison & Brown 1957; Berry 1987).

But isn't there a nonparametric equivalent to the factorial ANOVA? Surely there must be a nonparametric equivalent to the *F*-test!? Surprisingly, such an analysis is elusive. Although researchers have, of course, worked on nonparametric factorial analyses, those methods remain relatively uncommon, obscure, or only partially vetted. For a review of some methods, see, e.g., Sawilowsky (1990).

To illustrate the point, consider this useful table of analyses from U.C.L.A.; you will see that no entry is given for two or more independent variables with dependent groups (e.g., repeated measures). A parametric analysis would be, of course, the repeated measures ANOVA, but an equivalent nonparametric analysis is unstated.

The popular Rank Transform (RT) method of Conover and Iman (1981) assigns ranks, averaged in the case of ties, over an entire data set, and then uses a parametric ANOVA on the ranks, resulting in a nonparametric factorial procedure. However, researchers determined that this process produces reliable results only for main effects; interactions are subject to large inflations of Type I error rates (i.e., claiming statistical significance where there is none) (Salter & Fawcett 1993; Higgins & Tashtoush 1994).
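To make the RT method concrete, here is a minimal sketch in base [R]. The data frame and its column names (X1, X2, Y) are mock values invented for illustration; ranks are assigned over the entire response, with midranks for ties, and a parametric ANOVA is run on the ranks:

```r
# a mock two-factor data set; names and values are illustrative only
df <- data.frame(
  X1 = c("a", "a", "b", "b", "a", "a", "b", "b"),
  X2 = c("x", "y", "x", "y", "x", "y", "x", "y"),
  Y  = c(12, 7, 14, 8, 19, 16, 14, 10)
)
df$Yrank <- rank(df$Y, ties.method = "average")  # midranks over the whole data set
m <- aov(Yrank ~ X1 * X2, data = df)             # parametric ANOVA on the ranks
summary(m)  # per the text above, only the main effects here are trustworthy
```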

The *Aligned Rank Transform (ART)* procedure was devised to correct this problem.
For each main effect or interaction, all responses (Y_{i}) are "aligned," a
process that strips from Y all effects but the one for which alignment is being done (main effect
or interaction effect). This aligned response
we'll call Y_{aligned}. The aligned responses are then assigned ranks, averaged
in the case of ties, and the new response we'll call Y_{art}. Then a factorial
ANOVA is run on the Y_{art} responses, **but only the effect for
which Y was aligned is examined in the ANOVA table**. Thus, for each possible main or interaction effect,
one new aligned column (Y_{aligned}) and one new aligned-and-ranked column (Y_{art}) are produced.

In a word: Yay!

How ARTool Works

Most modern statistical packages lack a built-in feature for aligning data. (Many do have features for assigning averaged ranks, also called midranks.) Aligning data is extremely tedious and error-prone to do by hand, especially when more than two factors are involved.

ARTool takes a character-delimited CSV file as input (*.csv). Any text character, including a space or a tab, can serve as the delimiter, and different delimiters can be used for reading in and writing out data tables. The default delimiter is a comma, but European number formats can be handled by telling ARTool to use a delimiter other than a comma (e.g., a semicolon) and to treat commas as decimal points.

The file read in by ARTool must represent
a *long-format* data table (one measure Y_{i} per row, in the right-most column). The first row
should be delimited column names. The first column
should be the experimental unit, *Subject* (i.e., s01,
s02, s03, etc.). This column, which we'll call S, is *not* used in ARTool's mathematical calculations, but
is useful for clarity in the output table, and is essential anyway for long-format repeated measures
designs where the same experimental unit must be listed on multiple rows. As noted, the last column
must be the sole numeric measure (Y). (This measure is also called the dependent variable.)
Every column between S and Y represents one factor (X1, X2, X3, etc.) from the experiment. (Factors are also called
independent variables.) Each possible main effect and interaction is given a new aligned column and a new ranked
column in the output table produced by ARTool.
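To make the expected format concrete, a minimal long-format input file for a two-factor experiment (using the mock factors X1 and X2 and values from the example table later on this page) would look like this:

```
Subject,X1,X2,Y
s01,a,x,12
s02,a,y,7
s03,b,x,14
s04,b,y,8
s05,a,x,19
s06,a,y,16
s07,b,x,14
s08,b,y,10
```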

The alignment process used is that for a completely randomized design. This can
result in reduced power for other designs like split-plots, as described in
Higgins *et al.* (1990). But this is the simplest and most easily generalized alignment algorithm
to implement. As it may only reduce power, any significant results can be trusted.
For more on this issue, see Higgins *et al.* (1990) and Higgins & Tashtoush (1994).

The output of ARTool is a new character-delimited CSV file, by default with extension *.art.csv.
This new file will have, for each effect X1, X2, X3, ..., an "aligned" column showing the aligned
data (Y_{aligned}) and an ART column (Y_{art}), showing the averaged ranks applied
to the corresponding aligned column. As the original table's columns are retained, the output data table will have,
for *N* factors, (2+*N*) + 2*(2^{N}-1) columns. Thus, if the original table has 2 factors,
the output table will have (2+2) + 2*(2^{2} - 1) = 10 columns. If the original table has 3 factors,
the output table will have (2+3) + 2*(2^{3} - 1) = 19 columns.
A verification step is automatically performed by ARTool to ensure that each aligned column
sums to zero. Users of ARTool can perform a further sanity check
by running a full-factorial ANOVA on the aligned columns. All effects other than the one
for which the column was aligned should be close to F=0.00 and *p*=1.00. This is the "stripping out" of
effects mentioned above.
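The column-count arithmetic above can be captured in a one-line helper (purely illustrative, not part of ARTool):

```r
# number of columns in an ARTool output table for a design with N factors:
# the (2 + N) original columns plus one aligned and one ranked column for
# each of the (2^N - 1) possible main and interaction effects
art_columns <- function(n_factors) (2 + n_factors) + 2 * (2^n_factors - 1)
art_columns(2)  # 10
art_columns(3)  # 19
```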

The long-format output file produced by ARTool can be opened directly by Microsoft Excel; most statistics packages also open CSV files directly. There, you can run ANOVA analyses on the aligned-and-ranked columns. Always use a full-factorial model, but only interpret the main effect or interaction for which the response was aligned-and-ranked. For example, for two factors X1 and X2, you would run three full-factorial ANOVAs: one for the main effect of X1, one for the main effect of X2, and one for the X1*X2 interaction.

Note that by using the [R] version of ARTool, these multiple factorial ANOVAs are run invisibly behind the scenes, and the proper result from each ANOVA is assembled into a single results table.

*Warning.* In general, because the aligning process strips out all but one effect from the
data, the ANOVA on aligned-ranks should also show close to F=0.00 and *p*=1.00 for all other
effects except the one corresponding to the given column. (However, it will rarely be
exactly F=0.00 and *p*=1.00.) If this is not the case, then it may be
that the data is not suitable somehow for the ART procedure. Proceed with caution in
this case, and perhaps consider an alternative approach (e.g., a robust rank-based approach,
a bootstrap approach, Generalized Linear Mixed Models (GLMM), or Generalized Estimating Equations (GEE)).
See, e.g., Sawilowsky (1990) or Higgins (2004).

Important: *post hoc* Comparisons

There are some important details about *post hoc* comparisons that you might wish to perform after a
significant main effect or interaction, as is common practice. For this discussion, we'll use the
[R] version of ARTool, but the same concepts apply to how you might manually do *post hoc*
comparisons after using the Windows version of ARTool, above.

The ARTool package is available at CRAN. The source code for the package is available on GitHub. You can install the latest released version from CRAN with this [R] command:

```
# install the ARTool package
install.packages("ARTool")
```

Using the ARTool package, the ART procedure is run on a long-format data table in CSV format using the following code:

```
# read the data table into variable 'df'
df <- read.csv("df.csv")

# load the ARTool library
library(ARTool)

# perform the ART procedure on 'df'
# assume 'Y' is the name of the response column
# assume 'X1' is the name of the first factor column
# assume 'X2' is the name of the second factor column
# assume 'S' is the name of the subjects column
m = art(Y ~ X1 * X2 + (1|S), data=df) # uses a linear mixed model
anova(m)
```

Here is fictitious output in the format produced:

```
Analysis of Variance of Aligned Rank Transformed Data

Table Type: Analysis of Deviance Table (Type III Wald F tests with Kenward-Roger df)
Model: Mixed Effects (lmer)
Response: art(Y)

            F Df Df.res  Pr(>F)
X1     4.3132  2    144 0.01516 *
X2     0.0165  2     72 0.98361
X1:X2  2.0173  4    144 0.09510 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

*Post hoc* pairwise comparisons of levels __within__ single factors
can be conducted. For example, you can conduct pairwise comparisons
among all three levels of X1 as follows:

```
# load the lsmeans library
library(lsmeans)

# m is the model returned by the call to art() above
# lsmeans reports p-values Tukey-corrected for multiple comparisons
# assume levels of 'X1' are 'a', 'b', and 'c'
lsmeans(artlm(m, "X1"), pairwise ~ X1)
```

```
$lsmeans
 X1   lsmean       SE  df  lower.CL upper.CL
 a  108.5600 7.359735 216  94.05391 123.0661
 b  130.0133 7.359735 216 115.50724 144.5194
 c  100.4267 7.359735 216  85.92057 114.9328

Results are averaged over the levels of: X2
Confidence level used: 0.95

$contrasts
 contrast   estimate       SE  df t.ratio p.value
 a - b    -21.453333 10.40824 144  -2.061  0.1017
 a - c      8.133333 10.40824 144   0.781  0.7150
 b - c     29.586667 10.40824 144   2.843  0.0141

Results are averaged over the levels of: X2
P value adjustment: tukey method for a family of 3 means
```

*Caution.* With the above approach, you
cannot safely conduct pairwise comparisons *involving multiple factors*.
For example, if X2 has three levels (d, e, f), you cannot safely use the above approach to
compare, say, the (X1=a, X2=d) condition to the (X1=b, X2=e) condition. The reason is that
the data aligned for the X1:X2 interaction has had the main effects of X1 and X2 stripped out,
so differences among its cell means no longer reflect differences among the original conditions.

Thus, in terms of [R] code, do not do this:

```
# do not do this!
lsmeans(artlm(m, "X1 : X2"), pairwise ~ X1 : X2) # not ok with art!
```

So what are your options for cross-factor *post hoc* comparisons in the ART paradigm?

One option is to use the `testInteractions` function from the `phia` package
to perform interaction contrasts, which look at *differences of differences* (Marascuilo & Levin 1970; Boik 1979).
Assume now X1 has two levels (a, b) and X2 has three levels (d, e, f). The code and output are:

```
# for cross-factor comparisons, one approach with art is an interaction contrast
library(phia)
testInteractions(artlm(m, "X1:X2"), pairwise=c("X1","X2"), adjustment="holm")
```

```
Chisq Test:
P-value adjustment method: holm
             Value Df    Chisq Pr(>Chisq)
a-b : d-e   -5.083  1   0.5584     0.4549
a-b : d-f  -76.250  1 125.6340     <2e-16 ***
a-b : e-f  -71.167  1 109.4412     <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

In the above output, a-b : d-e is interpreted as a difference-of-differences, i.e.,
the difference between (a-b) within level d and (a-b) within level e of X2. For more on `testInteractions`, see
the [R] vignette,
Contrast tests with ART.

Another option for cross-factor pairwise comparisons is to simply use either
nonparametric Mann-Whitney *U* tests or Wilcoxon signed-rank tests on the original data.
Use the former when subjects were in only one of the conditions being compared (i.e., between-subjects),
and use the latter when subjects were in both of the conditions being compared (i.e., within-subjects).

For example, if the interaction between X1 and X2 is statistically significant from the
`art` and `anova` calls above, and if X1 now, for simplicity, has
levels a and b, and X2 has levels d and e, then we can do:

```
# cross-factor pairwise comparisons using Mann-Whitney U tests.
# we don't have to do all six possible comparisons over X1=(a,b),
# X2=(d,e), but we do so below for completeness. Note each comparison
# assumes subjects were in only one of the compared conditions.
ad_vs_ae = wilcox.test(df[df$X1 == "a" & df$X2 == "d",]$Y, df[df$X1 == "a" & df$X2 == "e",]$Y)$p.value
ad_vs_bd = wilcox.test(df[df$X1 == "a" & df$X2 == "d",]$Y, df[df$X1 == "b" & df$X2 == "d",]$Y)$p.value
ad_vs_be = wilcox.test(df[df$X1 == "a" & df$X2 == "d",]$Y, df[df$X1 == "b" & df$X2 == "e",]$Y)$p.value
ae_vs_bd = wilcox.test(df[df$X1 == "a" & df$X2 == "e",]$Y, df[df$X1 == "b" & df$X2 == "d",]$Y)$p.value
ae_vs_be = wilcox.test(df[df$X1 == "a" & df$X2 == "e",]$Y, df[df$X1 == "b" & df$X2 == "e",]$Y)$p.value
bd_vs_be = wilcox.test(df[df$X1 == "b" & df$X2 == "d",]$Y, df[df$X1 == "b" & df$X2 == "e",]$Y)$p.value

# correct for multiple comparisons using Holm's sequential Bonferroni procedure (Holm 1979)
p.adjust(c(ad_vs_ae, ad_vs_bd, ad_vs_be, ae_vs_bd, ae_vs_be, bd_vs_be), method="holm")
```

```
# cross-factor pairwise comparisons using Wilcoxon signed-rank tests.
# we don't have to do all six possible comparisons over X1=(a,b),
# X2=(d,e), but we do so below for completeness. Note each comparison
# assumes subjects were in both of the compared conditions.
library(reshape2)
df2 <- dcast(df, S ~ X1 + X2, value.var="Y") # make wide-format table
ad_vs_ae = wilcox.test(df2$a_d, df2$a_e, paired=TRUE)$p.value
ad_vs_bd = wilcox.test(df2$a_d, df2$b_d, paired=TRUE)$p.value
ad_vs_be = wilcox.test(df2$a_d, df2$b_e, paired=TRUE)$p.value
ae_vs_bd = wilcox.test(df2$a_e, df2$b_d, paired=TRUE)$p.value
ae_vs_be = wilcox.test(df2$a_e, df2$b_e, paired=TRUE)$p.value
bd_vs_be = wilcox.test(df2$b_d, df2$b_e, paired=TRUE)$p.value

# correct for multiple comparisons using Holm's sequential Bonferroni procedure (Holm 1979)
p.adjust(c(ad_vs_ae, ad_vs_bd, ad_vs_be, ae_vs_bd, ae_vs_be, bd_vs_be), method="holm")
```

ART Mathematics

The mathematics for the general ART nonparametric factorial analysis were
worked out by Higgins & Tashtoush (1994). Prof. Higgins was kind enough
to explain the mathematics of his article in a personal communication to me.
To the best of our knowledge, the literature on
the Aligned Rank Transform does not present a general formulation for *N* factors;
most publications deal with only two factors.
It was for the purpose of creating ARTool that Prof. Higgins kindly worked out
the mathematics for *N* factors. The following are the steps that ARTool
goes through to align and rank your data. (You'll see why you wouldn't want to
do this by hand.)

**Step 1 - Residuals.** For each raw response Y_{i}, compute its residual as

```
residual = Y - cell mean
```

The cell mean is the mean response Y̅_{i} for that *cell*, i.e., over all Y_{i}'s whose
levels of their factors (X*N*_{i}'s) match that of the Y response for which we're computing this residual.

The example table below has two factors (X1, X2), each with two levels {a,b} and {x,y}, and one response column (Y), and shows the calculation of cell means:

Subject | X1 | X2 | Y | cell mean
------- | -- | -- | - | ---------
s01 | a | x | 12 | (12+19)/2
s02 | a | y | 7 | (7+16)/2
s03 | b | x | 14 | (14+14)/2
s04 | b | y | 8 | (8+10)/2
s05 | a | x | 19 | (12+19)/2
s06 | a | y | 16 | (7+16)/2
s07 | b | x | 14 | (14+14)/2
s08 | b | y | 10 | (8+10)/2
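Step 1 can be reproduced on the example table above with base [R]'s `ave()` function. This is a sketch of the calculation, not ARTool's actual implementation:

```r
# the example table above, in long format
df <- data.frame(
  Subject = sprintf("s%02d", 1:8),
  X1 = c("a", "a", "b", "b", "a", "a", "b", "b"),
  X2 = c("x", "y", "x", "y", "x", "y", "x", "y"),
  Y  = c(12, 7, 14, 8, 19, 16, 14, 10)
)
df$cell.mean <- ave(df$Y, df$X1, df$X2)  # mean Y within each (X1, X2) cell
df$residual  <- df$Y - df$cell.mean      # Step 1: residual = Y - cell mean
df$cell.mean                             # 15.5 11.5 14 9 15.5 11.5 14 9
```

Note that the residuals within each cell necessarily sum to zero, since each cell's mean has been subtracted out.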

**Step 2 - Estimated Effects.** Compute the "estimated effects." This is best illustrated with an example.
Let A, B, C, D be factors with levels

A_{i}, i = 1...a,

B_{j}, j = 1...b,

C_{k}, k = 1...c,

D_{ℓ}, ℓ = 1...d.

Let A_{i} indicate the mean
response Y̅_{i} only for rows where factor A
is at level i. Let A_{i}B_{j} indicate the mean
response Y̅_{ij} only for rows where factor A is at level i and factor B is at level j. And so on.
Let *μ* be the grand mean of Y̅ over all rows.

*Main effects*

The estimated effect for factor A with response Y_{i} is

estimated effect = A_{i} - *μ*.

*Two-way effects*

The estimated effect for the A×B interaction with response Y_{ij} is

estimated effect = A_{i}B_{j} - A_{i} - B_{j} + *μ*.

*Three-way effects*

The estimated effect for the A×B×C interaction with response Y_{ijk} is

estimated effect = A_{i}B_{j}C_{k}
- A_{i}B_{j} - A_{i}C_{k} - B_{j}C_{k}
+ A_{i} + B_{j} + C_{k}
- *μ*.

*Four-way effects*

The estimated effect for the A×B×C×D interaction with response Y_{ijkℓ} is

estimated effect = A_{i}B_{j}C_{k}D_{ℓ}
- A_{i}B_{j}C_{k} - A_{i}B_{j}D_{ℓ} - A_{i}C_{k}D_{ℓ} - B_{j}C_{k}D_{ℓ}
+ A_{i}B_{j} + A_{i}C_{k} + A_{i}D_{ℓ} + B_{j}C_{k} + B_{j}D_{ℓ} + C_{k}D_{ℓ}
- A_{i} - B_{j} - C_{k} - D_{ℓ}
+ *μ*.

*N-way effects*

The estimated effect for an *N*-way interaction is

estimated effect = (the *N*-way cell mean)
- Σ(*N*-1 way means)
+ Σ(*N*-2 way means)
- Σ(*N*-3 way means)
+ Σ(*N*-4 way means)
...
- Σ(*N*-*h* way means) // if *h* is odd, or
+ Σ(*N*-*h* way means) // if *h* is even
...
- *μ* // if *N* is odd, or
+ *μ* // if *N* is even.
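To make the formulas concrete, here is the two-way estimated effect computed with base [R] on the example data from Step 1. This is an illustrative sketch of the formula; ARTool performs these calculations internally:

```r
# estimated effect for the X1×X2 interaction: A_iB_j - A_i - B_j + mu
df <- data.frame(
  X1 = c("a", "a", "b", "b", "a", "a", "b", "b"),
  X2 = c("x", "y", "x", "y", "x", "y", "x", "y"),
  Y  = c(12, 7, 14, 8, 19, 16, 14, 10)
)
mu   <- mean(df$Y)               # grand mean of Y over all rows
AiBj <- ave(df$Y, df$X1, df$X2)  # cell means A_iB_j
Ai   <- ave(df$Y, df$X1)         # marginal means A_i
Bj   <- ave(df$Y, df$X2)         # marginal means B_j
df$est.effect <- AiBj - Ai - Bj + mu
df$est.effect  # -0.25 0.25 0.25 -0.25 -0.25 0.25 0.25 -0.25
```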

**Step 3 - Alignment.** Compute the aligned data point Y_{aligned} as the replacement for raw data point Y_{i}
for the effect of interest as

Y_{aligned} = residual + estimated effect,

i.e., the result from step (1) plus the result from step (2).
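Putting steps 1-3 together for the X1×X2 interaction on the example data from Step 1, and checking the sum-to-zero property that ARTool verifies on each aligned column (again a base-[R] sketch, not ARTool's code):

```r
df <- data.frame(
  X1 = c("a", "a", "b", "b", "a", "a", "b", "b"),
  X2 = c("x", "y", "x", "y", "x", "y", "x", "y"),
  Y  = c(12, 7, 14, 8, 19, 16, 14, 10)
)
mu <- mean(df$Y)
residual   <- df$Y - ave(df$Y, df$X1, df$X2)              # Step 1: residuals
est.effect <- ave(df$Y, df$X1, df$X2) -
              ave(df$Y, df$X1) - ave(df$Y, df$X2) + mu    # Step 2: estimated effects
df$Y.aligned <- residual + est.effect                     # Step 3: alignment
stopifnot(isTRUE(all.equal(sum(df$Y.aligned), 0)))        # aligned column sums to zero
```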

**Step 4 - Ranking.** Assign averaged ranks to all aligned observations Y_{aligned} within each new
aligned column, thereby turning Y_{aligned} into Y_{art}. With averaged ranks,
"if a value is unique, its
averaged rank is the same as its rank. If a value occurs *k* times,
the average rank is computed as the sum of the value's ranks divided
by *k*" (SAS JMP 7.0 help documentation).
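Base [R]'s `rank()` function computes exactly these averaged ranks (midranks), as a quick check illustrates:

```r
# averaged ranks: the two tied 3.2s occupy ranks 3 and 4, so each gets (3+4)/2
x <- c(3.2, 1.5, 3.2, 0.7)
rank(x, ties.method = "average")  # 3.5 2.0 3.5 1.0
```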

As noted above, ARTool computes aligned data columns (for inspection) and the averaged ranks for each of these columns (for ANOVA).

**Step 5 - ANOVA on Ranks.** This step is __not__ performed by the Windows version of ARTool, but is performed
by the [R] version of ARTool: Perform full-factorial ANOVAs, or fit linear mixed models, on the aligned ranks data
(Y_{art}) produced by ARTool. Using the same factors (X*N*'s) as model input, perform a separate ANOVA
for each main effect or interaction, being careful to interpret the results only for the factor or interaction
for which the response was aligned and ranked. (Note: Again, if you're using the [R] version of ARTool, then it
does this for you.)

__Example__: If you have two factors (X1 and X2) and a response (Y), you will
run three ANOVAs, each using the same input model
(X1, X2, X1*X2), but using a different response variable, one for each aligned-and-ranked Y.
That is, one ANOVA will use the response for which Y was aligned-and-ranked for X1.
The second ANOVA will use the response for which Y was aligned-and-ranked
for X2. The third ANOVA will use the response for which Y was aligned-and-ranked for
X1*X2. When interpreting the results in each ANOVA's output,
only look at the main effect or interaction for which Y was aligned-and-ranked. So you would
extract one result from each of three ANOVAs, for three total results.

Sample Data

Four example data sets are included in the ARTool\data folder. The first two are from
Higgins *et al.* (1990). The first of these, named Higgins1990-Table1.csv, shows a mock data
set with two between-subjects factors named *Row* and *Column*. Each factor has
3 levels. Although in Higgins *et al.* (1990) this table is represented in wide-format,
*ARTool* requires long-format tables, so it has been rendered as such. After using *ARTool*
on it, an output file named Higgins1990-Table1.art.csv is created. This data has also been put in
an SAS JMP table, Higgins1990-Table1.art.JMP, which contains saved analyses of variance for
inspection. One can verify that the aligned ranks and the test results agree with those found in
Higgins *et al.* (1990).

A second example is in Higgins1990-Table5.csv. The output file created by *ARTool* is
Higgins1990-Table5.art.csv. This data is from a real study of how moisture
levels and fertilizer affect the dry matter produced in peat. It has two factors,
*Moisture* and *Fertilizer*. *Moisture* is a between-subjects factor of 3 levels,
while *Fertilizer*
is a within-subjects factor of 4 levels. Twelve trays, each containing four
pots of peat, were put in different moisture conditions. Each peat pot on a tray was
subjected to a separate fertilizer. The *Tray* is therefore regarded as the experimental
unit (the "Subject"), and each peat pot on each tray is a "trial." The response variable is
the amount of dry matter produced in the pot. In agricultural-statistical terminology, this is a
classic split-plot design, with *Moisture* as the whole-plot factor and *Fertilizer* as the subplot factor. It
is instructive to compare the layout of Table 5 in Higgins *et al.* (1990) to the long-format
layout in Higgins1990-Table5.csv. The aligned data has been put in an SAS JMP table
named Higgins1990-Table5.art.JMP. Analyses have been saved to the table and match the
results in Higgins *et al.* (1990).

A third example is HigginsABC.csv, which is a mock data set with two between-subjects factors,
*A* and *B*, and a third within-subjects factor, *C*. The aligned table is
HigginsABC.art.csv, and in SAS JMP it is HigginsABC.art.JMP. An analysis
of variance will show that all main effects and the A*B interaction are significant. An
analysis of variance on the aligned-ranks data (i.e., the "ART" columns) will show that the
same significance conclusions are drawn.

A fourth example is HigginsABC.csv renamed to 'Produces Error.csv' and given an invalid non-numeric response ("X") on the third row of data. When analyzed by ARTool, a red-text error is produced. In general, ARTool produces descriptive error messages, identifying where errors occur so they can be remedied.

A trial version of SAS JMP can be downloaded from http://www.jmp.com/software/.

Further Reading

- Aitchison, J. and Brown, J. A. C. (1957). The Lognormal Distribution. Cambridge, England: Cambridge University Press.
- Akritas, M. G. and Brunner, E. (1997). A unified approach to rank tests for mixed models. Journal of Statistical Planning and Inference 61 (2), pp. 249-277.
- Akritas, M. G. and Osgood, D. W. (2002). Guest editors' introduction to the special issue on nonparametric models. Sociological Methods and Research 30 (3), pp. 303-308.
- Beasley, T. M. (2002). Multivariate aligned rank test for interactions in multiple group repeated measures designs. Multivariate Behavioral Research 37 (2), pp. 197-226.
- Berry, D. A. (1987). Logarithmic transformations in ANOVA. Biometrics 43 (2), pp. 439-456.
- Boik, R. J. (1979). Interactions, partial interactions, and interaction contrasts in the analysis of variance. Psychological Bulletin 86 (5), pp. 1084-1089.
- Conover, W. J. and Iman, R. L. (1981). Rank transformations as a bridge between parametric and nonparametric statistics. The American Statistician 35 (3), pp. 124-129.
- Fawcett, R. F. and Salter, K. C. (1984). A Monte Carlo study of the F test and three tests based on ranks of treatment effects in randomized block designs. Communications in Statistics: Simulation and Computation 13 (2), pp. 213-225.
- Frederick, B. N. (1999). Fixed-, random-, and mixed-effects ANOVA models: A user-friendly guide for increasing the generalizability of ANOVA results. In Advances in Social Science Methodology, B. Thompson (ed). Stamford, Connecticut: JAI Press, pp. 111-122.
- Friedman, M. (1937). The use of ranks to avoid the assumption of normality implicit in the analysis of variance. Journal of the American Statistical Association 32 (200), pp. 675-701.
- Higgins, J. J., Blair, R. C. and Tashtoush, S. (1990). The aligned rank transform procedure. Proceedings of the Conference on Applied Statistics in Agriculture. Manhattan, Kansas: Kansas State University, pp. 185-195.
- Higgins, J. J. and Tashtoush, S. (1994). An aligned rank transform test for interaction. Nonlinear World 1 (2), pp. 201-211.
- Higgins, J. J. (2004). Introduction to Modern Nonparametric Statistics. Pacific Grove, California: Duxbury Press.
- Hodges, J. L. and Lehmann, E. L. (1962). Rank methods for combination of independent experiments in the analysis of variance. Annals of Mathematical Statistics 33 (2), pp. 482-497.
- Holm, S. (1979). A simple sequentially rejective multiple test procedure. Scandinavian Journal of Statistics 6 (2), pp. 65-70.
- Kaptein, M., Nass, C. and Markopoulos, P. (2010). Powerful and consistent analysis of Likert-type rating scales. Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI '10). New York: ACM Press, pp. 2391-2394.
- Lehmann, E. L. (2006). Nonparametrics: Statistical Methods Based on Ranks. New York: Springer.
- Littell, R. C., Henry, P. R. and Ammerman, C. B. (1998). Statistical analysis of repeated measures data using SAS procedures. Journal of Animal Science 76 (4), pp. 1216-1231.
- Mansouri, H. (1999). Aligned rank transform tests in linear models. Journal of Statistical Planning and Inference 79 (1), pp. 141-155.
- Mansouri, H. (1999). Multifactor analysis of variance based on the aligned rank transform technique. Computational Statistics and Data Analysis 29 (2), pp. 177-189.
- Mansouri, H., Paige, R. L. and Surles, J. G. (2004). Aligned rank transform techniques for analysis of variance and multiple comparisons. Communications in Statistics: Theory and Methods 33 (9), pp. 2217-2232.
- Marascuilo, L.A. and Levin, J.R. (1970). Appropriate post hoc comparisons for interaction and nested hypotheses in analysis of variance designs: The elimination of Type IV errors. American Educational Research Journal 7 (3), pp. 397-421.
- Richter, S. J. (1999). Nearly exact tests in factorial experiments using the aligned rank transform. Journal of Applied Statistics 26 (2), pp. 203-217.
- Salter, K. C. and Fawcett, R. F. (1985). A robust and powerful rank test of treatment effects in balanced incomplete block designs. Communications in Statistics: Simulation and Computation 14 (4), pp. 807-828.
- Salter, K. C. and Fawcett, R. F. (1993). The ART test of interaction: A robust and powerful rank test of interaction in factorial models. Communications in Statistics: Simulation and Computation 22 (1), pp. 137-153.
- Sawilowsky, S. S. (1990). Nonparametric tests of interaction in experimental design. Review of Educational Research 60 (1), pp. 91-126.
- Schuster, C. and von Eye, A. (2001). The relationship of ANOVA models with random effects and repeated measurement designs. Journal of Adolescent Research 16 (2), pp. 205-220.
- Wilcoxon, F. (1945). Individual comparisons by ranking methods. Biometrics Bulletin 1 (6), pp. 80-83.

Acknowledgements

This work was supported in part by the National Science Foundation under grants IIS-0811884 and IIS-0811063. Any opinions, findings, conclusions or recommendations expressed in this work are those of the authors and do not necessarily reflect those of the National Science Foundation.

Copyright © 2011-2018 Jacob O. Wobbrock. All rights reserved.

Last updated October 27, 2018.