Informal Seminars Presented In Autumn 2000

September 25, 2000
Title:	A hierarchical aggregate data model with allowance for spatially correlated disease rates
Speaker:	Katherine Guthrie, Graduate Student, Biostatistics
Abstract:	The aggregate data study design (Prentice and Sheppard, 1995) aims to estimate exposure effects by regressing population-based disease rates on covariate data from survey samples in each population group. The design is motivated by the need to accurately estimate individual-level associations between exposures with limited within-population variability, such as dietary fat intake, and the risk of chronic diseases. By incorporating individual-level exposure and confounder data, the aggregate data study design can overcome many of the sources of bias that are inherent to ecological studies. In this work, we further develop the aggregate data model in the context of Bayesian disease mapping in order to allow for spatial correlation among disease rates across populations. Disease mapping is a process of describing geographical variation in disease incidence and mortality. Our model differs from the standard disease-mapping model by focusing on the exposure effect, instead of on prediction of disease outcomes. We show how we can integrate the aggregate and disease-mapping models in order to provide an intuitive and generalizable approach to the modeling of spatial effects while retaining the efficiency of the exposure effect estimation.

October 2, 2000
Title:	A comparative study of the Genentech dynamic randomization method
Speaker:	Hao Liu, Graduate Student, Biostatistics
Abstract:	In a controlled clinical trial, treatment groups of equal size within prognostic groupings are usually desired for both statistical and clinical reasons. Dynamic randomization methods that are based on the patient covariates are usually used to achieve this goal, especially when the total sample size is small and the number of prognostic groupings is large. In this talk, I review a dynamic randomization method used at Genentech Inc. The statistical properties of the Genentech dynamic randomization method are studied via simulations. Furthermore, the performance of the Genentech method is compared with three classical randomization methods: Pocock and Simon's method, Efron's method, and the randomly permuted blocks design. Recommendations based on the simulation studies are provided.

October 9, 2000
Title:	Semiparametric regression models and spatial binary data
Speaker:	Chuan Zhou, Graduate Student, Biostatistics
Abstract:	Every year large areas of forest in the United States are sprayed to prevent defoliation caused by gypsy moths. Without intervention, the impact of gypsy moth infestation can be devastating, leaving complete stands of host trees barren. Therefore it is important to predict defoliation in order to develop intervention strategies. Our goals are to develop and refine regression models that use both the current season's environmental and geographical data and the historical defoliation data to predict defoliation in the current year. We propose to fit spatial-temporal transition models to the data. Point estimates of the parameters can be obtained using estimating equations. Standard error estimates can be obtained through use of the window subsampling method for dependent data developed in Heagerty and Lumley (JASA 2000). The transition models will be evaluated by ROC curves. This is part of my NRCSE RA work under the supervision of Dr. Heagerty and Dr. Lumley.

October 16, 2000
Title:	Finding a dissertation/thesis advisor
Speaker:	Panel Discussion
Abstract:	A panel discussion will be led by four students, two Master's level students and two Ph.D. students, on their experiences of finding a thesis/dissertation advisor and their recommendations. The panel will also briefly discuss their thesis/dissertation topics to give students an idea of what qualifies as an appropriate topic. The remaining time will be devoted to a question and answer session.

October 23, 2000
Title:	Uncertainty in outcomes for survival analysis
Speaker:	Amalia Meier, Graduate Student, Biostatistics
Abstract:	Estimation of failure time has mainly been done in the context in which failure is observed with certainty. This project describes a failure time estimation technique that addresses information loss due to imperfect sensitivity/specificity in the outcome measure. Following the methods of Richardson and Hughes (2000) that handle discrete time data with uncertain outcomes, estimation methods are developed that include other types of data. The methods described are appropriate for data with missed visits and/or covariate measures which influence survival rates. The final goal of this project is to adapt the methods to interval-censored survival analysis with uncertain outcomes. Most of the methods described make use of the EM algorithm. This talk is being given in preparation for a general exam the following week.

October 30, 2000
Title:	Variability in diagnostic mammography accuracy
Speaker:	Yingye Zheng, Graduate Student, Biostatistics
Abstract:	I will present an analysis of correlated ROC data using ordinal regression models. The study is based on diagnostic mammography outcomes from the Breast Cancer Surveillance Consortium (1994-2000). The goal of the study is to provide an overall summary of accuracy for diagnostic mammography in a community setting, identify systematic variation in accuracy, and characterize random variation due to radiologists. Results from generalized estimating equation, random effects model, and Bayesian hierarchical model approaches will be discussed. This is part of my RA work under the supervision of Dr. William Barlow.

November 13, 2000
Title:	Screening based on the risk of cancer calculation from Bayesian hierarchical change-point models of longitudinal markers
Speaker:	Donna Pauler, Assistant Professor, Biostatistics
Abstract:	The standard approach to early detection of disease with a quantitative marker is to set a population-based fixed reference level for making individual screening or referral decisions. For many types of disease, additional information is contained in the subject-specific temporal behavior of the marker, which exhibits a characteristic alteration early in the course of the disease. In this talk I present a Bayesian approach to screening based on calculation of the posterior probability of disease from longitudinal biomarker levels. The method is motivated by a randomized ovarian cancer screening trial in the UK comprising 22,000 women screened over four years with an additional five years of follow-up on average (Jacobs et al., 1993). Levels of the antigen CA125 were recorded annually in the screened arm. I model CA125 trajectories with hierarchical mixtures of change-point models, using data from related screening trials to inform the prior change-point distributions and reversible jump Markov chain Monte Carlo to estimate parameters from the posterior distribution. I then show how samples from the posterior distribution can be combined with prior risk to calculate the posterior risk of ovarian cancer for a new subject given her longitudinal CA125 levels. A screening strategy based on the risk calculation is proposed and evaluated using data from an independent screening trial of 5,550 women performed in Sweden (Einhorn and Sjovall, 1992) and via simulation of a prospective seven-year screening trial. Both applications indicate large increases in sensitivity for fixed specificity compared to the standard approach, and form the basis for implementation of the Bayesian screening method in one arm of a large upcoming screening trial in the UK.

November 27, 2000
Title:	Parametric identifiability and related problems
Speaker:	Abhijit Dasgupta, Graduate Student, Statistics
Abstract:	Identifiability is often the first assumption made in methods of statistical inference. I consider the case when we do not have even local identifiability of the model. This results in the information matrix being singular. I present a method of reparametrization so that we can get a transformed parameter set under which the model is at least locally identifiable. I also present a sufficient condition so that the model will be globally identifiable under the new parameters. Often constraints are placed on the parameters so that the constrained model is then identifiable (e.g. ANOVA). The problem of finding the constraints is, in a sense, the dual problem to the reparametrization problem. I suggest a method of constructing constraints, and describe what properties constraints need to have to be model-preserving. The objective of reparametrization here is to be able to make inference on the original parameter space. I suggest a method for obtaining likelihood-based confidence regions on the original parameter space.

Last Modification: 29 November 2001