Differences between revisions 29 and 65 (spanning 36 versions)
Revision 29 as of 2008-05-19 21:40:20
Size: 6873
Editor: colossus
Comment:
Revision 65 as of 2011-08-09 18:01:46
Size: 4395
Editor: echidna
Comment:
Line 1: Line 1:
#acl HlpLabGroup:read,write,delete,revert All:read
## page was renamed from HlpLab/StatsCourses/HLPCourse
## page was renamed from HlpLab/StatsMiniCourse
#acl HlpLabGroup,TanenhausLabGroup:read,write,delete,revert,admin All:read
Line 7: Line 9:
May 27 2008 - June 9 2008
May 27 2008 - June 19 2008


|| [[HLPMiniCourseSession0 |Session 0]] || May 27 || Basics and R primer (attendance optional, reading required) ||
|| [[HLPMiniCourseSession1 |Session 1]] || May 29 || Linear regression ||
|| [[HLPMiniCourseSession2 |Session 2]] || June 5 || Issues in linear regression ||
|| [[HLPMiniCourseSession3 |Session 3]] || June 10 || Multilevel linear regression ||
|| [[HLPMiniCourseSession4 |Session 4]] || June 12 || Logistic regression, GLM ||
|| [[HLPMiniCourseSession5 |Session 5]] || June 17 || Multilevel logistic regression, GLMM ||
|| [[HLPMiniCourseSession6 |Session 6]] || June 19 || Computational methods for model fitting ||
|| [[HLPMiniCourseSession7 |Session 7]] || ??? || R wrap-up ||
Line 10: Line 22:
 * [http://www.amazon.com/Analysis-Regression-Multilevel-Hierarchical-Models/dp/0521867061/ref=sr_1_1?ie=UTF8&s=books&qid=1211219851&sr=8-1 Data Analysis Using Regression and Multilevel/Hierarchical Models] by Gelman & Hill (2007). [http://www.stat.columbia.edu/~gelman/arm/ Online resources]. G&H07.
 * [http://www.amazon.com/Analyzing-Linguistic-Data-Introduction-Statistics/dp/0521882591/ref=sr_1_1?ie=UTF8&s=books&qid=1211219948&sr=8-1 Analyzing Linguistic Data: A Practical Introduction to Statistics using R] by Harald Baayen (2008). [attachment:baayen_analyzing_08.pdf Complete electronic draft]. Baa08.
 * [http://www.amazon.com/Introductory-Statistics-R-Peter-Dalgaard/dp/0387954759/ref=sr_1_1?ie=UTF8&s=books&qid=1211228905&sr=8-1 Introductory Statistics with R] by Peter Dalgaard (2004). [http://staff.pubhealth.ku.dk/~pd/ISwR.html Online resources]. Dal04.
 * [http://www.amazon.com/Categorical-Analysis-Wiley-Probability-Statistics/dp/0471360937/ref=pd_bbs_1?ie=UTF8&s=books&qid=1211231537&sr=8-1 Categorical Data Analysis] by Alan Agresti (2002). [http://www.stat.ufl.edu/~aa/cda/cda.html Online resources]. Agr02.
'''Master copies of the texts are available in the HLP lab (Meliora 123).'''

 * [[http://www.amazon.com/Analysis-Regression-Multilevel-Hierarchical-Models/dp/0521867061/|Data Analysis Using Regression and Multilevel/Hierarchical Models]] by Gelman & Hill (2007). [[http://www.stat.columbia.edu/~gelman/arm/|Online resources]]. G&H07.
 * Analyzing Linguistic Data: A Practical Introduction to Statistics using R by Harald Baayen (2008). [[http://www.amazon.com/Analyzing-Linguistic-Data-Introduction-Statistics/dp/0521882591/|hardback ($97)]] [[http://www.amazon.com/Analyzing-Linguistic-Data-Introduction-Statistics/dp/0521709180/|paperback ($35)]] [[attachment:baayen_analyzing_08.pdf|Complete electronic draft]]. Baa08.
 * [[http://www.amazon.com/Introductory-Statistics-R-Peter-Dalgaard/dp/0387954759/|Introductory Statistics with R]] by Peter Dalgaard (2004). [[http://staff.pubhealth.ku.dk/~pd/ISwR.html|Online resources]]. [[http://site.ebrary.com/lib/rochester/Doc?id=10047812|Electronic copy through U of R libraries]]. Dal04.
 * [[http://www.amazon.com/Categorical-Analysis-Wiley-Probability-Statistics/dp/0471360937/|Categorical Data Analysis]] by Alan Agresti (2002). [[http://www.stat.ufl.edu/~aa/cda/cda.html|Online resources]]. Agr02.
Line 16: Line 30:
 * [http://cran.r-project.org/web/packages/Design/index.html Design]. Linear and generalized linear regression.
 * [http://cran.r-project.org/web/packages/lme4/index.html lme4]. Multilevel modeling.
 * [http://cran.r-project.org/web/packages/arm/index.html ARM]. Companion package for Gelman & Hill (2007).
 * [http://cran.r-project.org/web/packages/languageR/index.html languageR]. Companion package for Baayen (2008).
 * [[http://cran.r-project.org/web/packages/Design/index.html|Design]]. Linear and generalized linear regression.
 * [[http://cran.r-project.org/web/packages/lme4/index.html|lme4]]. Multilevel modeling.
 * [[http://cran.r-project.org/web/packages/arm/index.html|ARM]]. Companion package for Gelman & Hill (2007).
 * [[http://cran.r-project.org/web/packages/languageR/index.html|languageR]]. Companion package for Baayen (2008).

== Datasets ==
[[attachment:attention-r-data.csv]]
Line 24: Line 41:
 2. Record and investigate unexplained or unclear terminology. Because we're cherry-picking chapters from multiple sources, it's likely that at some point an author will use a term that was originally presented in some (unread by us) earlier section of the text. Alternatively, an author might just assume knowledge that we don't have. In any case, when you come across a term in the reading that you believe is not explained well enough, please make a note of the term and where you found it. Then, please go one step further. Do your best to find a simple definition of the term, and record it for others to use ([http://en.wikipedia.org/wiki/Statistics Wikipedia] and [http://mathworld.wolfram.com/topics/ProbabilityandStatistics.html MathWorld] are likely to be good resources for this, but also feel free to consult your favorite stats textbooks).

== Session 0: Basics ==
Understanding of this material will be assumed throughout the course. Please read these introductory materials and make sure you understand them before beginning the readings for the first session.

If there is sufficient interest, a short primer on using R will be offered during the first week of the course (somewhere between May 26 and May 30). If you're interested in this, or if you have requests on what to cover during the primer, please write to AustinFrank.

=== Reading ===
|| Baa08 || Chapter 1 (pp. 1-20) || Intro to R. ||
|| G&H07 || Chapter 2 (pp. 13-26) || Intro to probability theory.||
|| Dal04 || ??? || ??? ||
 
== Session 1: Linear regression ==
Tuesday, May 27 2008.

=== Reading ===
|| G&H07 || Chapter 3 (pp. 29-49) || Linear regression: the basics ||
|| Baa08 || Section 4.3.2 (pp. 91-105) || Functional relations: linear regression ||
|| || Sections 6-6.2.1 (pp. 181-198) || Regression Modeling (Introduction and Ordinary Least Squares Regression) ||
|| || Section 6.6 (pp. 258-259) || General considerations ||

=== Assignments ===
|| G&H07 || Section 3.9 (pp. 50-51) || Exercises 3 and 5 ||
|| Baa08 || Section 4.7 (p. 126) || Exercises 3 and 7* ||
* For Exercise 7, note that Baayen treats linear regression using {{{lm}}} or {{{ols}}} as equivalent to analysis of covariance (see Section 4.4.1, pp. 117-119).

== Session 2: Issues in linear regression ==
Thursday, May 29 2008.

=== Reading ===
|| G&H07 || Chapter 4 (pp. 53-74) || Linear regression: before and after fitting the model ||
|| Baa08 || Sections 6.2.2-6.2.4 (pp. 198-212) || Collinearity, Model criticism, and Validation ||
|| || Section 6.4 (pp. 234-239) || Regression with breakpoints ||

=== Assignments ===
|| G&H07 || Section 4.9 (p. 76) || Exercise 4 ||
|| Baa08 || Section 6.7 (p. 260) || Exercise 1 ||

In addition to the book problems, we will distribute a data set from the ongoing ngrams project.

== Session 3: Multilevel (a.k.a. Hierarchical, a.k.a. Mixed) Linear Models ==
Tuesday, June 3 2008.

=== Reading ===
|| G&H07 || Sections 1.1-1.3 (pp. 1-3) || Intro, examples, motivation ||
|| || Chapter 11 (pp. 237-248) || Multilevel structures ||
|| || Chapter 12 (pp. 251-277) || Multilevel linear models: the basics ||

=== Assignments ===

== Session 4: Logistic regression, Generalized Linear Multilevel Models ==
Thursday, June 5 2008.

=== Reading ===
|| G&H07 || Chapter 5 (pp. 79-105) || Logistic regression ||
|| Baa08 || Section 6.3 (pp. 214-234) || Generalized Linear Models ||
|| || Section 6.4 (pp. 239-243) || End of Regression with breakpoints ||
|| Agr02 || Section 16.3 (pp. 624-625) || ??? ||

=== Assignments ===

== Session 5: Mixed logit models ==
Tuesday, June 10 2008.

=== Reading ===
|| G&H07 || Chapter 14 (pp. 301-321) || Multilevel logistic regression* ||
* In this chapter, Gelman & Hill define some multilevel models in BUGS rather than in R. We will either provide translations for you, do the translations together in class, or set the translations as an assignment.

=== Assignments ===

== Session 6: Computational methods for model fitting ==
Thursday, June 12 2008.

=== Reading ===
|| G&H07 || Chapter 18 (pp. 387-413) || Likelihood and Bayesian inference and computation ||
|| Agr02 || Section 15.2 (pp. 604-611) || ??? ||
|| lme4 || implementation vignettes || attachment:Implementation.pdf attachment:Theory.pdf attachment:Notes.pdf ||
 2. Record and investigate unexplained or unclear terminology. Because we're cherry-picking chapters from multiple sources, it's likely that at some point an author will use a term that was originally presented in some (unread by us) earlier section of the text. Alternatively, an author might just assume knowledge that we don't have. In any case, when you come across a term in the reading that you believe is not explained well enough, please make a note of the term and where you found it. Then, please go one step further. Do your best to find a simple definition of the term, and record it for others to use ([[http://en.wikipedia.org/wiki/Statistics|Wikipedia]] and [[http://mathworld.wolfram.com/topics/ProbabilityandStatistics.html|MathWorld]] are likely to be good resources for this, but also feel free to consult your favorite stats textbooks).

= HLP Lab Mini Course on Regression Methods =

May 27 2008 - June 19 2008

|| Session 0 || May 27 || Basics and R primer (attendance optional, reading required) ||
|| Session 1 || May 29 || Linear regression ||
|| Session 2 || June 5 || Issues in linear regression ||
|| Session 3 || June 10 || Multilevel linear regression ||
|| Session 4 || June 12 || Logistic regression, GLM ||
|| Session 5 || June 17 || Multilevel logistic regression, GLMM ||
|| Session 6 || June 19 || Computational methods for model fitting ||
|| Session 7 || ??? || R wrap-up ||

== Texts ==

Master copies of the texts are available in the HLP lab (Meliora 123).

== R packages ==

 * Design. Linear and generalized linear regression.
 * lme4. Multilevel modeling.
 * ARM. Companion package for Gelman & Hill (2007).
 * languageR. Companion package for Baayen (2008).

== Datasets ==

attention-r-data.csv

== How to read ==

One goal of this course is to make sure we're all comfortable with the same terminology and methods. Another goal is to make sure that as new people enter the community, we can bring them up to speed pretty quickly. To help with both of these goals, we're asking that you take some additional steps when you're doing the reading for this class.

  1. Keep an eye out for redundancy. If multiple pieces of assigned reading cover the same topic, and you find a single one of the treatments to be superior and sufficient, please make a note describing the nature of the redundant content, which source you preferred, and why. This will help us develop a set of "canonical" readings on these topics.
  2. Record and investigate unexplained or unclear terminology. Because we're cherry-picking chapters from multiple sources, it's likely that at some point an author will use a term that was originally presented in some (unread by us) earlier section of the text. Alternatively, an author might just assume knowledge that we don't have. In any case, when you come across a term in the reading that you believe is not explained well enough, please make a note of the term and where you found it. Then, please go one step further. Do your best to find a simple definition of the term, and record it for others to use (Wikipedia and MathWorld are likely to be good resources for this, but also feel free to consult your favorite stats textbooks).

HLPMiniCourse (last edited 2011-08-09 18:01:46 by echidna)
