Session 1: Linear regression

May 29, 2008

This session will cover the basics of linear regression. See below for a [#Topics list of topics]. Please make sure to do the readings, and post below any terminology you'd like clarified or other questions you have. You can also suggest further topics, but keep in mind that Session2 also covers aspects of linear regression modeling, specifically typical issues that come up during modeling. The goal of this first session is to work through the basic steps of building a linear regression model and understanding its output; Session 2 focuses on validating how good that model is.

We've also posted some [#assignments assignments] below that you should hand in by Friday, so that we can post your solutions on this wiki page. There is only one way to learn the methods we will talk about, and that is to apply them yourself to a data set you understand. The tutorial is intended to get you to the level where you can do that.

Reading

G&H07
  • Chapter 3 (pp. 29-49): Linear regression: the basics

Baa08
  • Section 4.3.2 (pp. 91-105): Functional relations: linear regression
  • Sections 6-6.2.1 (pp. 181-198): Regression Modeling (Introduction and Ordinary Least Squares Regression)
  • Section 6.6 (pp. 258-259): General considerations

Notes on the readings

Additional terminology

Feel free to add terms you want clarified in class:

Questions

  • Q:

[[Anchor(assignments)]]

Assignments

Send your solutions to Andrew Watts, who will upload them here. Please send them by Friday 3:30pm.

G&H07
  • Section 3.9 (pp. 50-51): Exercises 3 and 5

Baa08
  • Section 4.7 (p. 126): Exercises 3 and 7*

* For Exercise 7, note that Baayen treats linear regression using {{{lm()}}} or {{{ols()}}} as equivalent to analysis of covariance (see Section 4.4.1, pp. 117-119).

Suggested topics

If you have any material that you would like to cover that isn't included in the list below, please make note of it here.

[[Anchor(Topics)]]

Topics

  • Interacting with R and R files

  • Quick recap: Formulating your research questions; Hypothesis testing; a "model"
    • Dependence on assumptions
    • Dependence on sample
    • Dependence on available outcome and input measures
    • Goal:
      • Find generalizations that hold beyond the sample
      • Predicting an outcome based on a set of predictors

  • Understanding your data set: predictors, outcome, and available information (see the R sketches after this topic list)
    • {{{str(), summary(), names()}}}

  • Understanding the distributions of your variables
    • {{{plot(), points(), lines(), barplot()}}}
    • {{{hist(), histogram(), densityplot()}}}

  • Understanding dependencies between your variables
    • {{{pairs(), cor(), abline(), loess()}}}

  • The Linear Model (LM)
    • Geometric interpretation
    • Ordinary least squares (and how they are "optimal" for the purpose of predicting an outcome Y)

  • Building a linear model (for data analysis) (see sketch below)
    • {{{lm(), ols()}}}
    • Structure and class of these objects
      • {{{coef(), summary(), resid()}}}
    • Standard output
  • Interpreting the output of a linear model
    • What hypotheses are we testing? What are the coefficients and how do we read them? Coding?
      • {{{contrasts()}}}
    • Transformations and other non-linearities
      • {{{log(), sqrt()}}}
      • {{{rcs(), pol()}}}

  • Using a model to predict unseen data (see example below)
    • {{{predict()}}}

  • Understanding the influence of individual cases, identifying outliers
    • {{{boxplot(), lm.influence()}}}
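
Illustrative R sketches

The sketches below walk through the topics above in order. They are minimal examples, not the course materials: they use R's built-in {{{mtcars}}} data set purely as a stand-in for whatever data set you end up analyzing. First, getting an overview of a data set with {{{str()}}}, {{{summary()}}}, and {{{names()}}}:

{{{
data(mtcars)      # built-in example data: fuel use and design of 32 cars
str(mtcars)       # structure: variable names, types, and the first few values
summary(mtcars)   # per-variable summaries (min, quartiles, mean, max)
names(mtcars)     # just the variable names
}}}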
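
Looking at the distributions of individual variables (still on the stand-in {{{mtcars}}} data), with base graphics and, if you prefer, the {{{lattice}}} package, which ships with a standard R installation:

{{{
hist(mtcars$mpg, xlab = "Miles per gallon", main = "")    # base-graphics histogram
barplot(table(mtcars$cyl), xlab = "Number of cylinders")  # counts of a discrete variable
plot(density(mtcars$mpg))                                 # kernel density estimate

library(lattice)                 # lattice versions of the same plots
histogram(~ mpg, data = mtcars)
densityplot(~ mpg, data = mtcars)
}}}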
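
Looking at dependencies between variables: pairwise scatterplots and correlations, plus a straight-line fit added with {{{abline()}}} and a nonparametric {{{loess()}}} smoother drawn with {{{lines()}}} (the choice of variables here is just for illustration):

{{{
vars <- c("mpg", "wt", "hp")
pairs(mtcars[, vars])          # all pairwise scatterplots
cor(mtcars[, vars])            # the corresponding correlation matrix

plot(mpg ~ wt, data = mtcars)                # one scatterplot in detail
abline(lm(mpg ~ wt, data = mtcars))          # straight-line (least squares) fit
lo <- loess(mpg ~ wt, data = mtcars)         # local smoother
o  <- order(mtcars$wt)
lines(mtcars$wt[o], fitted(lo)[o], lty = 2)  # smoother drawn over the points
}}}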
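
What {{{lm()}}} actually computes: the ordinary least squares estimates, i.e. the coefficients that minimize the sum of squared residuals. As a sanity check (for illustration only; in practice you would just use {{{lm()}}}), the textbook matrix formula gives the same numbers:

{{{
fit <- lm(mpg ~ wt + hp, data = mtcars)

X <- model.matrix(~ wt + hp, data = mtcars)        # design matrix: intercept, wt, hp
y <- mtcars$mpg
beta.hat <- drop(solve(t(X) %*% X, t(X) %*% y))    # OLS solution of (X'X) beta = X'y

rbind(by.hand = beta.hat, lm = coef(fit))          # the two sets of estimates agree
}}}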
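
Building a linear model and looking inside the fitted object. {{{lm()}}} is in base R; {{{ols()}}} is the analogous function from Harrell's Design (now rms) package and is only sketched in a comment, in case you don't have that package installed:

{{{
fit <- lm(mpg ~ wt + hp, data = mtcars)

class(fit)         # "lm"
names(fit)         # components stored in the fitted object
coef(fit)          # estimated coefficients
head(resid(fit))   # residuals, one per observation
summary(fit)       # standard output: estimates, std. errors, t values, R^2, ...

## Roughly equivalent with Harrell's package, if installed:
## library(Design)   # or library(rms) in newer versions
## ols(mpg ~ wt + hp, data = mtcars)
}}}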
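
Interpreting coefficients for a categorical predictor depends on its coding, which {{{contrasts()}}} displays; non-linear relations can be handled by transforming predictors. {{{log()}}}, {{{sqrt()}}}, and {{{poly()}}} are base R; {{{rcs()}}} and {{{pol()}}} come from the Design/rms package and are not shown here:

{{{
mtcars$cyl.f <- factor(mtcars$cyl)   # treat number of cylinders as categorical
contrasts(mtcars$cyl.f)              # default treatment (dummy) coding

fit.f <- lm(mpg ~ cyl.f + wt, data = mtcars)
summary(fit.f)   # cyl.f coefficients are differences from the baseline level (4 cylinders)

## Transformations and other non-linearities
fit.log  <- lm(mpg ~ log(wt), data = mtcars)      # log-transformed predictor
fit.quad <- lm(mpg ~ poly(wt, 2), data = mtcars)  # quadratic term via poly()
}}}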
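
Using a fitted model to predict cases it has not seen. The new data frame below is made up for illustration; its columns just have to match the predictors in the model:

{{{
fit <- lm(mpg ~ wt + hp, data = mtcars)

new.cars <- data.frame(wt = c(2.5, 3.5),   # hypothetical unseen cases
                       hp = c(100, 200))

predict(fit, newdata = new.cars)                           # point predictions
predict(fit, newdata = new.cars, interval = "prediction")  # with prediction intervals
}}}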
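
Checking whether individual cases drive the fit. A boxplot of the residuals flags unusually large ones, and {{{lm.influence()}}} reports leverage (hat values) and how much each coefficient would change if a case were dropped:

{{{
fit <- lm(mpg ~ wt + hp, data = mtcars)

boxplot(resid(fit), main = "Residuals")   # outlying residuals show up as points

infl <- lm.influence(fit)
names(infl)                               # "hat", "coefficients", "sigma", "wt.res"
sort(infl$hat, decreasing = TRUE)[1:3]    # the three highest-leverage cases
head(infl$coefficients)                   # change in each coefficient when a case is dropped
}}}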
