Analysis tips for types of data we frequently work with in the lab
Guys, feel free to fill in. As for the suggested outlier criteria below: I think it's best to find a transformation of the data that makes it normally distributed, rather than to exclude outliers based on distributional criteria. Doing the latter is one of the reasons why so many studies proudly report interactions on, e.g., RT data, where the difference for condition A is larger when the mean for condition B is larger. Those differences might all be spurious: RTs are bounded on one side, and their logs or reciprocals are closer to normally distributed than the raw values are, which means that such interactions often arise trivially. -- Florian Jaeger
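To make the point about trivially arising interactions concrete, here is a small sketch in Python with made-up numbers. It assumes that condition A slows reading by the same proportion (20%) at a fast and a slow baseline; the baselines of 300 ms and 600 ms, the factor 1.2, and the function name are all hypothetical. On the raw scale the effect of A comes out as roughly 60 ms versus 120 ms, which looks like an interaction, while on the log scale the effect is the same at both baselines.

```python
import math


def condition_effect(baseline_ms, slowdown):
    """Return the effect of a proportional slowdown on the raw and log scales.

    baseline_ms: hypothetical mean RT without the manipulation (in ms)
    slowdown:    hypothetical multiplicative effect of condition A (1.2 = 20% slower)
    """
    slowed_ms = baseline_ms * slowdown
    raw_effect = slowed_ms - baseline_ms                      # difference in ms
    log_effect = math.log(slowed_ms) - math.log(baseline_ms)  # difference in log ms
    return raw_effect, log_effect


# Same proportional slowdown at a fast and a slow baseline (made-up values).
raw_fast, log_fast = condition_effect(300.0, 1.2)
raw_slow, log_slow = condition_effect(600.0, 1.2)

print("raw effects (ms):", round(raw_fast, 1), round(raw_slow, 1))
print("log effects:     ", round(log_fast, 3), round(log_slow, 3))
```

The raw effects differ only because the baselines differ, which is exactly the pattern described above; whether real data behave multiplicatively like this has to be checked for each data set.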
1. Self-paced reading
1.1. Common exclusion criteria
- Exclusions based on values that would not make sense and hence are likely due to technical errors:
  - RT < 100 ms
  - RT > 2000 ms
- Exclusions based on evidence that the participant was not paying (sufficient) attention:
  - abs(scale(RT)) > 3, i.e. the RT is more than 3 standard deviations from the mean; a cutoff of 2.5 is also frequently used.
  - answer not correct, i.e. Correct == 0
- Participant-based exclusions (could also be applied to items, though I haven't seen that much):
  - mean(Correct) < 0.75, i.e. fewer than 75% of answers correct
  - data loss > 50%
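The criteria above could be applied in sequence roughly as follows. This is only a sketch in plain Python, not the lab's actual pipeline, and it makes several assumptions that the list leaves open: trials are dicts with the hypothetical keys 'participant', 'RT' (in ms) and 'Correct' (0 or 1); the z-scores are computed over all remaining trials pooled rather than per participant; accuracy is computed over all of a participant's original trials; and "data loss" is the proportion of a participant's trials removed by the trial-level criteria.

```python
from collections import defaultdict
from statistics import mean, stdev


def apply_exclusions(trials, z_cutoff=3.0, min_accuracy=0.75, max_loss=0.5):
    """Apply the trial- and participant-level exclusions to a list of trial dicts."""
    # 1. Implausible RTs, likely due to technical errors.
    kept = [t for t in trials if 100 <= t["RT"] <= 2000]

    # 2. z-score cutoff, analogous to abs(scale(RT)) > 3.
    #    Assumption: pooled over all remaining trials; stdev() uses n - 1.
    if len(kept) > 1:
        rts = [t["RT"] for t in kept]
        m, s = mean(rts), stdev(rts)
        if s > 0:
            kept = [t for t in kept if abs((t["RT"] - m) / s) <= z_cutoff]

    # 3. Trials with an incorrect answer.
    kept = [t for t in kept if t["Correct"] == 1]

    # 4. Participant-level criteria: accuracy and data loss.
    n_total = defaultdict(int)
    n_correct = defaultdict(int)
    n_kept = defaultdict(int)
    for t in trials:
        n_total[t["participant"]] += 1
        n_correct[t["participant"]] += t["Correct"]
    for t in kept:
        n_kept[t["participant"]] += 1

    good = {
        p for p in n_total
        if n_correct[p] / n_total[p] >= min_accuracy
        and 1 - n_kept[p] / n_total[p] <= max_loss
    }
    return [t for t in kept if t["participant"] in good]
```

Passing z_cutoff=2.5 gives the stricter variant. Given the note at the top of this page, it is probably worth running the analysis on transformed RTs as well, and comparing, before relying on the z-score step.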