Past Spring 2001 Meetings

Informal Seminars Presented In Spring 2001

March 27, 2001
Title:	Problems in ecological inference and methods for resolving them
Speaker:	Sebastien Haneuse, Graduate student of Biostatistics
Abstract:	The talk that I plan on giving is based upon reading that I have been doing with Jon Wakefield over the last several weeks. In particular, it is centered around his JRSS A (2001) paper with Ruth Salway, and the references provided within. Most epidemiological studies approach the exposure-disease association with the use of either cohort or case-control studies. These individual-level methods suffer from diminished power and reliability when there is a lack of exposure variability within the study population or when there is substantial measurement error. Ecological studies investigate the association of interest at the population (or group) level, exploiting the variation across populations and alleviating some of the above issues. However, the inference that we generally want to draw is at the individual level, not at the group level. The different biases that arise from group-level studies have been given the general term 'Ecological bias', and lead to what has been termed the Ecological Fallacy. In the talk I hope to outline the various types of bias that occur, as well as review some of the methods that have been proposed to overcome these problems.

April 3, 2001
Title:	Distinguishing confounding from noncollapsibility
Speaker:	Dan Gillen, Graduate student of Biostatistics
Abstract:	Much of the statistics literature does not distinguish between the concepts of confounding and noncollapsibility. In fact, it is often taught that during the modeling process, one method for determining whether or not a covariate is a confounder is to examine the change in the parameter associated with a predictor of interest before and after adjustment for the potential confounder. While this approach is reasonable in the setting of linear regression, it is not valid in the context of logistic regression because the odds ratio is noncollapsible. This talk will focus on the 1999 Statistical Science paper entitled 'Confounding and Collapsibility in Causal Inference,' by Greenland et al. During the talk I will discuss various definitions of confounding, the concept of collapsibility, and give examples to highlight the differences between confounding and noncollapsibility.
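
As a small numerical illustration of the distinction drawn in this abstract, the sketch below uses hypothetical risks (not taken from the talk or from the Greenland et al. paper). The covariate Z has the same distribution in both exposure groups, so it is not a confounder. Even so, the marginal odds ratio differs from the common stratum-specific odds ratio, while the risk difference collapses exactly.

```python
def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)


def odds_ratio(p_exposed, p_unexposed):
    """Odds ratio comparing exposed to unexposed risk."""
    return odds(p_exposed) / odds(p_unexposed)


# Hypothetical risks P(Y=1 | X=x, Z=z), keyed by (x, z).
risk = {
    (0, 0): 0.2, (1, 0): 0.5,
    (0, 1): 0.5, (1, 1): 0.8,
}
# P(Z=1) is assumed identical for exposed and unexposed,
# so Z is independent of exposure and cannot confound.
p_z1 = 0.5


def marginal_risk(x):
    """Average the stratum-specific risks over the distribution of Z."""
    return (1.0 - p_z1) * risk[(x, 0)] + p_z1 * risk[(x, 1)]


or_z0 = odds_ratio(risk[(1, 0)], risk[(0, 0)])
or_z1 = odds_ratio(risk[(1, 1)], risk[(0, 1)])
or_marginal = odds_ratio(marginal_risk(1), marginal_risk(0))

rd_z0 = risk[(1, 0)] - risk[(0, 0)]
rd_z1 = risk[(1, 1)] - risk[(0, 1)]
rd_marginal = marginal_risk(1) - marginal_risk(0)

print(f"odds ratio: Z=0 {or_z0:.3f}, Z=1 {or_z1:.3f}, marginal {or_marginal:.3f}")
print(f"risk diff:  Z=0 {rd_z0:.3f}, Z=1 {rd_z1:.3f}, marginal {rd_marginal:.3f}")
```

The change in the odds ratio after adjusting for Z here reflects noncollapsibility alone, which is why a change-in-estimate check can mislead in logistic regression.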

April 10, 2001
Title:	Predictive Capability Operating Characteristic curves: Evaluating accuracy of continuous diagnostic tests
Speaker:	Jacquee Williamson, Graduate student of Biostatistics
Abstract:	In many clinical situations it is necessary to distinguish between two groups of individuals, for example those who have a disease and those who do not. It is also important to know how accurately one can make this distinction. Methods of evaluating the accuracy of a continuous diagnostic test through true and false positive rates are readily available (ROC analysis). No analogous methods are available for predictive values. I will be proposing the Predictive Capability Operating Characteristic (PCOC) curve as a way of evaluating accuracy and showing how to estimate it. This will cover my thesis work with Margaret Pepe.
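
For readers unfamiliar with the contrast between true/false positive rates and predictive values, the following sketch shows the standard Bayes' rule relationship between them. It is not the PCOC curve proposed in the talk; the function name and the numerical inputs are hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary test via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# The same test (fixed sensitivity and specificity) evaluated at
# three hypothetical disease prevalences.
for prevalence in (0.01, 0.10, 0.50):
    ppv, npv = predictive_values(0.90, 0.90, prevalence)
    print(f"prevalence {prevalence:.2f}: PPV {ppv:.3f}, NPV {npv:.3f}")
```

Unlike the true and false positive rates, the predictive values change with disease prevalence even when the test itself is unchanged.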

April 17, 2001
Title:	Abstract definition of prediction error for statistical models with applications to survival analysis
Speaker:	Thomas Gerds, Visiting student of Statistics
Abstract:	In survival analysis, good point predictions for the survival time are rarely available. Instead, the characteristics of predictions are often similar to those of weather forecasts. In fact, scoring rules developed by meteorologists have proved useful for the assessment of prognostic classification schemes for time-to-event data. Here it is only required that the conditional survival function is specified given the patient's covariates. In this talk I introduce a general definition of prediction error for probability forecasts that can be appropriately modified for missing data problems. I discuss the difficulties with estimation of the resulting parameter, considered as a functional defined on a model for the potentially incomplete observations. Finally, an application in breast cancer illustrates the potential use of the prediction error also for model selection and goodness-of-fit testing. Unfortunately, estimation is still based on an inconsistent and inefficient procedure.
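
One well-known scoring rule from meteorology is the Brier score; whether the talk uses exactly this rule is an assumption here. The sketch below computes it for predicted survival probabilities at a fixed time point, assuming every patient's status at that time is observed. It deliberately ignores censoring, which is the missing data problem the abstract refers to, and all names and data are hypothetical.

```python
def brier_score(still_event_free, predicted_survival):
    """Mean squared difference between the observed status (1 = event-free
    at the time point, 0 = event occurred) and the predicted survival
    probability. Assumes complete, uncensored observations."""
    if len(still_event_free) != len(predicted_survival):
        raise ValueError("inputs must have the same length")
    squared_errors = [
        (status - prob) ** 2
        for status, prob in zip(still_event_free, predicted_survival)
    ]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical status at the chosen time point and two sets of forecasts.
status = [1, 1, 0, 1, 0]
informative = [0.9, 0.8, 0.3, 0.7, 0.2]
uninformative = [0.5] * 5

print(brier_score(status, informative))
print(brier_score(status, uninformative))
```

Lower values indicate better probability forecasts; a constant forecast of 0.5 always scores 0.25.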

April 24, 2001
Title:	Analysis of Repeated Pre-Post Measurements in Randomized Clinical Trials
Speaker:	Jon Schildcrout, Graduate student of Biostatistics
Abstract:	Randomized clinical trials are the most conclusive experiments we have for asserting a causal relationship between a treatment and an outcome. In many clinical trials, multiple pre- and post-randomization/treatment measurements are taken to describe the potential treatment effect. However, what exactly do we mean when we say that treatment A is effective? Do we believe the treatment acts quickly to produce a response that remains constant over time, or do we believe it produces an effect that increases over time? Careful consideration should be paid to the type of treatment effect one expects to see, as there are many ways in which a treatment effect can manifest itself. I will compare methods for capturing two types of treatment effects. First, I will compare methods that utilize mean summary statistics to capture the average effect of treatment over time. Then, I will compare methods for analyzing a treatment effect that is linearly divergent over time. I will address questions of bias and efficiency, and will briefly discuss robustness of these methods when the treatment effect is misspecified (i.e. when you attempt to answer the wrong question).

May 1, 2001
Title:	How do we incorporate death into the statistical analysis of longitudinal health related quality of life data?
Speaker:	Laura Lee Johnson, Graduate student of Biostatistics
Abstract:	I am looking for ideas and opinions on topics related to my dissertation. My work has focused on how to include death in the longitudinal assessment of health related quality of life data. In studies involving the elderly or chronic diseases, a number of study participants may die while a study is still collecting information. The first part of the seminar will be focused on identifying the appropriate research questions. Next we will discuss different approaches to answering the questions, some of them involving data management methods (complete case analysis, imputation, etc.) and others involving applications of statistical methods (GEE, pattern mixture models, etc.) to the data at hand. We need to keep in mind not only statistical accuracy but also ease of interpretation. Some of the methods discussed will be 570's material, but most of what we will be talking about does not require a background in statistics.

May 8, 2001
Title:	A discussion concerning the problems associated with background noise in membrane microarray data for ovarian cancer
Speaker:	Erinn Hade, Graduate student of Biostatistics
Abstract:	As part of ongoing work for my master's thesis, Jon Wakefield, Martin McIntosh and I have been investigating the problems associated with the background measurements taken along with microarray expression level data. I will describe the microarray experiment data that we are using and the findings that lead us to believe there is a better way of estimating expression levels.

May 15, 2001
Title:	Logistic regression in two-phase sampling
Speaker:	Sebastien Haneuse, Graduate student of Biostatistics
Abstract:	We all know that the duality in the odds ratio permits the direct application of a prospective logistic regression model to the retrospective data provided by a case-control study. The resulting estimates are maximum likelihood. The formal justification is subtle, however. When disease is rare, case-control sampling results in an increase in efficiency. But what if the exposure is rare as well? Two-phase case-control sampling can lead to further efficiency gains. It can also be appropriate when faced with a missing data or measurement error problem. I plan on presenting the two-phase likelihood and showing how three methods approach the estimation problem. These include weighted likelihood, pseudolikelihood and maximum likelihood.
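
The odds ratio property mentioned at the start of this abstract can be checked with simple arithmetic. The sketch below uses hypothetical population counts and sampling fractions, works with expected counts rather than random samples, and does not attempt the two-phase likelihood discussed in the talk.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Cross-product odds ratio from a 2x2 exposure-by-disease table."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


def risk_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Risk ratio computed as if the table were a cohort."""
    risk_exposed = exposed_cases / (exposed_cases + exposed_controls)
    risk_unexposed = unexposed_cases / (unexposed_cases + unexposed_controls)
    return risk_exposed / risk_unexposed


# Hypothetical source population with a rare disease.
population = dict(exposed_cases=30, unexposed_cases=70,
                  exposed_controls=1970, unexposed_controls=7930)

# Case-control sampling: keep every case, sample 2% of non-cases,
# with sampling that depends on disease status only (assumed fractions).
case_fraction, control_fraction = 1.0, 0.02
sample = dict(
    exposed_cases=population["exposed_cases"] * case_fraction,
    unexposed_cases=population["unexposed_cases"] * case_fraction,
    exposed_controls=population["exposed_controls"] * control_fraction,
    unexposed_controls=population["unexposed_controls"] * control_fraction,
)

print("odds ratio:", odds_ratio(**population), odds_ratio(**sample))
print("risk ratio:", risk_ratio(**population), risk_ratio(**sample))
```

The sampling fractions cancel in the cross-product, so the odds ratio is preserved, whereas the naive risk ratio from the sampled table is not.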

May 22, 2001
Title:	Reducing the need for electrocardiograms in heart attack diagnosis
Speaker:	Clemma Jacobsen, Graduate student of Biostatistics
Abstract:	Serial electrocardiogram (EKG) measurements are considered indispensable to diagnosis of heart attacks. EKGs are expensive and sometimes unavailable, however, so researchers have developed diagnostic algorithms based on cheaper and more readily available data. Though these algorithms largely reduce the need for EKGs, they rely on subjective threshold criteria, and an algorithm based on statistical analysis might be more accurate and efficient. For my thesis I am verifying the already existing algorithms and developing an improved version using classification and regression tree analysis.

May 29, 2001
Title:	A new framework for nonparametric estimation of the bivariate survivor function
Speaker:	Zoe Moodie, Graduate student of Biostatistics
Abstract:	Nonparametric estimation of the bivariate survivor function is a problem that has long haunted statisticians. Parametric approaches to this problem often require strong distributional assumptions that may not hold and are difficult to assess. A bivariate analogue to the Kaplan-Meier estimator is highly desirable since many settings provide bivariate survival data. Although many nonparametric estimators have been proposed, a proper, consistent, efficient estimator has yet to be defined. In this talk, I will illustrate the inherent difficulties of the nonparametric estimation problem, then compare our new representation of the bivariate survivor function to an existing representation. Theoretical results will be briefly discussed. To illustrate the methods, data from a bladder cancer study will be analyzed.

Last Modification: 8 October 2001