HLP Lab Mini Course on Regression Methods
May 27 2008 - June 9 2008
Texts
[http://www.amazon.com/Analysis-Regression-Multilevel-Hierarchical-Models/dp/0521867061/ref=sr_1_1?ie=UTF8&s=books&qid=1211219851&sr=8-1 Data Analysis Using Regression and Multilevel/Hierarchical Models] by Gelman & Hill (2007). [http://www.stat.columbia.edu/~gelman/arm/ Online resources]. G&H07.
[http://www.amazon.com/Analyzing-Linguistic-Data-Introduction-Statistics/dp/0521882591/ref=sr_1_1?ie=UTF8&s=books&qid=1211219948&sr=8-1 Analyzing Linguistic Data: A Practical Introduction to Statistics using R] by Harald Baayen (2008). [attachment:baayen_analyzing_08.pdf Complete electronic draft]. Baa08.
[http://www.amazon.com/Introductory-Statistics-R-Peter-Dalgaard/dp/0387954759/ref=sr_1_1?ie=UTF8&s=books&qid=1211228905&sr=8-1 Introductory Statistics with R] by Peter Dalgaard (2004). [http://staff.pubhealth.ku.dk/~pd/ISwR.html Online resources]. Dal04.
[http://www.amazon.com/Categorical-Analysis-Wiley-Probability-Statistics/dp/0471360937/ref=pd_bbs_1?ie=UTF8&s=books&qid=1211231537&sr=8-1 Categorical Data Analysis] by Alan Agresti (2002). [http://www.stat.ufl.edu/~aa/cda/cda.html Online resources]. Agr02.
R packages
[http://cran.r-project.org/web/packages/Design/index.html Design]. Linear and generalized linear regression.
[http://cran.r-project.org/web/packages/lme4/index.html lme4]. Multilevel modeling.
[http://cran.r-project.org/web/packages/arm/index.html arm]. Companion package for Gelman & Hill (2007).
[http://cran.r-project.org/web/packages/languageR/index.html languageR]. Companion package for Baayen (2008).
How to read
One goal of this course is to make sure we're all comfortable with the same terminology and methods. Another goal is to make sure that as new people enter the community, we can bring them up to speed pretty quickly. To help with both of these goals, we're asking that you take some additional steps when you're doing the reading for this class.
- Keep an eye out for redundancy. If multiple pieces of assigned reading cover the same topic, and you find a single one of the treatments to be superior and sufficient, please make a note describing the nature of the redundant content, which source you preferred, and why. This will help us develop a set of "canonical" readings on these topics.
- Record and investigate unexplained or unclear terminology. Because we're cherry-picking chapters from multiple sources, an author will likely at some point use a term that was introduced in an earlier section we haven't read. Alternatively, an author might simply assume knowledge that we don't have. In either case, when you come across a term in the reading that you believe is not explained well enough, please make a note of the term and where you found it. Then, please go one step further: do your best to find a simple definition of the term, and record it for others to use ([http://en.wikipedia.org/wiki/Statistics Wikipedia] and [http://mathworld.wolfram.com/topics/ProbabilityandStatistics.html MathWorld] are likely to be good resources for this, but also feel free to consult your favorite stats textbooks).
Session 0: Basics
Understanding of this material will be assumed throughout the course. Please read these introductory materials and make sure you understand them before beginning the readings for the first session.
If there is sufficient interest, a short primer on using R will be offered during the first week of the course (somewhere between May 26 and May 30). If you're interested in this, or if you have requests on what to cover during the primer, please write to AustinFrank.
Reading
| Baa08 | Chapter 1 (pp. 1-20) | Intro to R. |
| G&H07 | Chapter 2 (pp. 13-26) | Intro to probability theory. |
| Dal04 | ??? | ??? |
Session 1: Linear regression
Tuesday, May 27 2008.
Reading
| G&H07 | Chapter 3 (pp. 29-49) | Linear regression: the basics |
| Baa08 | Section 4.3.2 (pp. 91-105) | Functional relations: linear regression |
| | Sections 6-6.2.1 (pp. 181-198) | Regression Modeling (Introduction and Ordinary Least Squares Regression) |
| | Section 6.6 (pp. 258-259) | General considerations |
Assignments
| G&H07 | Section 3.9 (pp. 50-51) | Exercises 3 and 5 |
| Baa08 | Section 4.7 (p. 126) | Exercises 3 and 7* |
* For Exercise 7, note that Baayen treats linear regression using lm or ols as equivalent to analysis of covariance (see Section 4.4.1, pp. 117-119).
Session 2: Issues in linear regression
Thursday, May 29 2008.
Reading
| G&H07 | Chapter 4 (pp. 53-74) | Linear regression: before and after fitting the model |
| Baa08 | Sections 6.2.2-6.2.4 (pp. 198-212) | Collinearity, Model criticism, and Validation |
| | Section 6.4 (pp. 234-239) | Regression with breakpoints |
Assignments
| G&H07 | Section 4.9 (p. 76) | Exercise 4 |
| Baa08 | Section 6.7 (p. 260) | Exercise 1 |
In addition to the book problems, we will distribute a data set from the ongoing ngrams project.
Session 3: Multilevel (a.k.a. Hierarchical, a.k.a. Mixed) Linear Models
Tuesday, June 3 2008.
Reading
| G&H07 | Sections 1.1-1.3 (pp. 1-3) | Intro, examples, motivation |
| | Chapter 11 (pp. 237-248) | Multilevel structures |
| | Chapter 12 (pp. 251-277) | Multilevel linear models: the basics |
Assignments
Session 4: Logistic regression, Generalized Linear Multilevel Models
Thursday, June 5 2008.
Reading
| G&H07 | Chapter 5 (pp. 79-105) | Logistic regression |
| Baa08 | Section 6.3 (pp. 214-234) | Generalized Linear Models |
| | Section 6.4 (pp. 239-243) | End of Regression with breakpoints |
| Agr02 | Section 16.3 (pp. 624-625) | ??? |
Assignments
Session 5: Mixed logit models
Tuesday, June 10 2008.
Reading
| G&H07 | Chapter 14 (pp. 301-321) | Multilevel logistic regression* |
* In this chapter, Gelman & Hill define some multilevel models in BUGS rather than in R. We will either provide translations for you, work through the translations together in class, or assign them as an exercise.
Assignments
Session 6: Computational methods for model fitting
Thursday, June 12 2008.
Reading
| G&H07 | Chapter 18 (pp. 387-413) | Likelihood and Bayesian inference and computation |
| Agr02 | Section 15.2 (pp. 604-611) | ??? |
| lme4 | implementation vignettes | attachment:Implementation.pdf attachment:Theory.pdf attachment:Notes.pdf |