HLP Lab Mini Course on Regression Methods
May 27 2008 - June 19 2008
|| Session 0 || May 27 || Basics and R primer (attendance optional, reading required) ||
|| Session 1 || May 29 || Linear regression ||
|| Session 2 || June 5 || Issues in linear regression ||
|| Session 3 || June 10 || Multilevel linear regression ||
|| Session 4 || June 12 || Logistic regression, GLM ||
|| Session 5 || June 17 || Multilevel logistic regression, GLMM ||
|| Session 6 || June 19 || Computational methods for model fitting ||
|| Session 7 || ??? || R wrap-up ||
Texts
Master copies of the texts are available in the HLP lab (Meliora 123).
- [http://www.amazon.com/Analysis-Regression-Multilevel-Hierarchical-Models/dp/0521867061/ Data Analysis Using Regression and Multilevel/Hierarchical Models] by Gelman & Hill (2007). [http://www.stat.columbia.edu/~gelman/arm/ Online resources]. G&H07.
- Analyzing Linguistic Data: A Practical Introduction to Statistics using R by Harald Baayen (2008). [http://www.amazon.com/Analyzing-Linguistic-Data-Introduction-Statistics/dp/0521882591/ hardback ($97)] [http://www.amazon.com/Analyzing-Linguistic-Data-Introduction-Statistics/dp/0521709180/ paperback ($35)] [attachment:baayen_analyzing_08.pdf Complete electronic draft]. Baa08.
- [http://www.amazon.com/Introductory-Statistics-R-Peter-Dalgaard/dp/0387954759/ Introductory Statistics with R] by Peter Dalgaard (2004). [http://staff.pubhealth.ku.dk/~pd/ISwR.html Online resources]. [http://site.ebrary.com/lib/rochester/Doc?id=10047812 Electronic copy through U of R libraries]. Dal04.
- [http://www.amazon.com/Categorical-Analysis-Wiley-Probability-Statistics/dp/0471360937/ Categorical Data Analysis] by Alan Agresti (2002). [http://www.stat.ufl.edu/~aa/cda/cda.html Online resources]. Agr02.
R packages
- [http://cran.r-project.org/web/packages/Design/index.html Design]. Linear and generalized linear regression.
- [http://cran.r-project.org/web/packages/lme4/index.html lme4]. Multilevel modeling.
- [http://cran.r-project.org/web/packages/arm/index.html arm]. Companion package for Gelman & Hill (2007).
- [http://cran.r-project.org/web/packages/languageR/index.html languageR]. Companion package for Baayen (2008).
Datasets
attachment:attention-r-data.csv
How to read
One goal of this course is to make sure we're all comfortable with the same terminology and methods. Another goal is to make sure that as new people enter the community, we can bring them up to speed pretty quickly. To help with both of these goals, we're asking that you take some additional steps when you're doing the reading for this class.
- Keep an eye out for redundancy. If multiple pieces of assigned reading cover the same topic, and you find a single one of the treatments to be superior and sufficient, please make a note describing the nature of the redundant content, which source you preferred, and why. This will help us develop a set of "canonical" readings on these topics.
- Record and investigate unexplained or unclear terminology. Because we're cherry-picking chapters from multiple sources, an author will likely at some point use a term that was introduced in an earlier section we haven't read. Alternatively, an author might simply assume knowledge that we don't have. Either way, when you come across a term in the reading that you believe is not explained well enough, please make a note of the term and where you found it. Then go one step further: do your best to find a simple definition of the term and record it for others to use ([http://en.wikipedia.org/wiki/Statistics Wikipedia] and [http://mathworld.wolfram.com/topics/ProbabilityandStatistics.html MathWorld] are likely to be good resources for this, but feel free to consult your favorite stats textbooks as well).