Tutorial: Non-Linear Regression Models in R
Motivation:
The tutorial aims to illustrate how to use R to fit non-linear regression models that consist of several curves. How to fit such models is a recurring theme on the R-help mailing list. A few relevant inquiries from the list give an idea of the type of problems encountered:
The extension package drc (Ritz & Streibig, 2005) can be used to resolve these problems. Thus it can be seen as a useful supplement to the base R function nls.
Experiments, for example in biology, often produce a response measured at several positions on a scale, for instance a range of concentrations or a set of time points. Data from such experiments often exhibit curvature as the concentrations vary, be it exponential decrease/increase or horizontal asymptotes, and consequently non-linear regression models are required.
One example is dose-response experiments with different curves for different toxic substances. These experiments are modelled using symmetric or asymmetric sigmoidal functions. The fitted curves are used to derive measures of toxicity, often the dose required to produce a given effect. These measures are then used to compare toxicity between the substances.
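As a rough sketch of such an analysis, the following assumes that the drc package is installed and uses its bundled S.alba data set, which contains dose-response data for two herbicides. The four-parameter log-logistic model LL.4() and the 50% effect level are choices made here for illustration only.

```r
library(drc)  # assumption: the drc package is installed

# One four-parameter log-logistic curve per herbicide;
# the curveid argument names the factor that groups the data into curves
fit <- drm(DryMatter ~ Dose, curveid = Herbicide, data = S.alba, fct = LL.4())

# Estimated dose giving a 50% effect for each curve,
# with confidence intervals based on the delta method
ed50 <- ED(fit, 50, interval = "delta", display = FALSE)
print(ed50)
```

The estimated doses, one row per herbicide, are the kind of toxicity measure that can then be compared between substances.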
Another example is found in R under ?Puromycin, where the rate of an enzymatic reaction is measured over a range of substrate concentrations for two treatments. The data are modelled using the Michaelis-Menten equation. Here interest could lie in investigating whether the two curves are identical, which would imply no difference between the treatments.
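The question of identical curves can be sketched with the base R function nls alone. The example below fits one common Michaelis-Menten curve and one curve per treatment, then compares the two nested fits with an F-test. The starting values are rough guesses chosen for this illustration, and the object names are hypothetical.

```r
# Puromycin ships with R: columns conc, rate and state (treated / untreated)
data(Puromycin)

# Reduced model: a single Michaelis-Menten curve shared by both treatments
fit_common <- nls(rate ~ Vm * conc / (K + conc),
                  data = Puromycin,
                  start = list(Vm = 200, K = 0.05))

# Full model: separate Vm and K for each level of the factor 'state'
fit_separate <- nls(rate ~ Vm[state] * conc / (K[state] + conc),
                    data = Puromycin,
                    start = list(Vm = c(200, 160), K = c(0.05, 0.05)))

# F-test of the reduced model against the full model
print(anova(fit_common, fit_separate))
```

A small p-value in the resulting table would speak against identical curves, that is, against the hypothesis of no treatment difference.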
Outline:
A variety of functional relationships will be covered in the tutorial, including both built-in functions in drc and user-defined functions.
The factor grouping data into curves may be different herbicides, crops or treatments. In general, differences between curves, if present, need not manifest themselves in all parameters. Sometimes this is convenient for interpreting differences; one such situation is generalised parallelism. The tutorial will cover test procedures for comparing models as well as for assessing which parameters agree across curves within a model. Hypotheses that can be explored may involve one or more of the curves and one or more of the parameters in the non-linear function. More general model structures with several factors influencing the parameters may occur in large experiments. Specification of such models is also possible in drc.
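To make the idea of parameters agreeing across curves concrete, here is a minimal sketch using base R nls on the Puromycin data rather than drc. It tests whether the two treatments can share the Michaelis-Menten constant K while keeping separate maximal rates Vm. Starting values and object names are assumptions made for this example.

```r
data(Puromycin)

# Full model: both Vm and K differ between the two levels of 'state'
fit_full <- nls(rate ~ Vm[state] * conc / (K[state] + conc),
                data = Puromycin,
                start = list(Vm = c(200, 160), K = c(0.05, 0.05)))

# Reduced model: separate Vm per curve, but one K common to both curves
fit_shared_K <- nls(rate ~ Vm[state] * conc / (K + conc),
                    data = Puromycin,
                    start = list(Vm = c(200, 160), K = 0.05))

# F-test of the hypothesis that K agrees across the two curves
print(anova(fit_shared_K, fit_full))
```

If the test does not reject, the difference between the curves can be summarised by the Vm parameters alone, which simplifies interpretation.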
Several model checking techniques are available. A common problem is that the variation in the response decreases or increases as concentrations vary. In drc, adjustment for variance heterogeneity is possible by means of a transform-both-sides Box-Cox approach or by modelling the variance as a function of the mean.
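The transform-both-sides idea can be sketched in base R without drc. The example below applies the log transformation, which is the Box-Cox transformation with parameter 0, to both the response and the mean function, so that the parameters keep their original interpretation. The fixed choice of the log transformation, the starting values and the object names are assumptions made here for illustration.

```r
# Use the treated cells from the Puromycin data shipped with R
treated <- subset(Puromycin, state == "treated")

# Ordinary fit on the original scale
fit_raw <- nls(rate ~ Vm * conc / (K + conc),
               data = treated,
               start = list(Vm = 200, K = 0.05))

# Transform-both-sides fit: the same transformation (here log) is applied
# to the response and to the mean function
fit_log <- nls(log(rate) ~ log(Vm * conc / (K + conc)),
               data = treated,
               start = list(Vm = 200, K = 0.05))

# Parameter estimates remain comparable because the mean function is unchanged
print(rbind(original_scale = coef(fit_raw), log_both_sides = coef(fit_log)))
```

Plotting the residuals of each fit against the fitted values is one way to judge whether the transformation has stabilised the variance.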
Through a series of small case studies the tutorial will illustrate the use of drc. The time schedule for the tutorial with thematic keywords covered in each lesson is displayed below.
Model specification: Built-in models and user-defined models. Self starter facility. Specification of several curves in one model. Estimation. Model diagnostics. Remedies for model deviations: transformation and variance modelling.
Model reduction: Comparison of nested models by means of significance tests. Comparison of relevant parameters across curves within a model. Visualization of results.
General structures: Models with several covariates, that is several quantitative independent variables. Models with several categorical factors influencing one or more of the parameters.
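The difference between a self-starting built-in model and a user-defined model can be illustrated with a base R analogue: SSmicmen is a self-starting Michaelis-Menten model supplied with R, whereas the function michaelis_menten below is a hypothetical user-defined equivalent that needs explicit starting values. This is a sketch of the concept, not of the self starter facility in drc itself.

```r
treated <- subset(Puromycin, state == "treated")

# Built-in self-starting model: starting values are computed automatically
fit_self_start <- nls(rate ~ SSmicmen(conc, Vm, K), data = treated)

# User-defined mean function (hypothetical name): starting values are required
michaelis_menten <- function(conc, Vm, K) Vm * conc / (K + conc)
fit_user <- nls(rate ~ michaelis_menten(conc, Vm, K),
                data = treated,
                start = list(Vm = 200, K = 0.05))

# Both routes should arrive at essentially the same estimates
print(rbind(self_start = coef(fit_self_start), user_defined = coef(fit_user)))
```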
Who can participate?
Researchers and statisticians who use non-linear regression models will benefit greatly from the tutorial. Such models appear in fields such as agricultural sciences (crop or weed experiments), chemistry, ecology, environmental sciences and toxicology (algae and daphnid tests).
Analysts of dose-response data, such as ecotoxicologists, toxicologists and scientists, for instance in agrochemical and pharmaceutical companies, will find that there is special functionality available in the package drc for analysing this type of data. See Ritz & Streibig (2005) and the web page www.bioassay.dk for more details.
Participants should have basic knowledge of statistics and an understanding of linear and (ideally also) non-linear regression models. Moreover they should bring their own laptop so that they can follow the case studies step by step.