Informal Seminars Presented In Winter 2004

January 21, 2004
Speaker: Rebecca Nugent, graduate student UW Statistics

January 28, 2004
Speaker: Fadoua Balabdaoui, graduate student UW Statistics

February 4, 2004
Title: Marginal Regression Modeling of Longitudinal, Categorical Response Data
Speaker: Jonathan Schildcrout, graduate student UW Statistics
Abstract: Longitudinal regression analysis is important in a variety of settings when the goal is to characterize changes that occur over time. The focus of this talk is on marginal regression models for longitudinal, categorical response data. I will first discuss a consistency-efficiency tradeoff with semi-parametric modeling when the goal is to estimate the cross-sectional relationship between the response and an exposure, E[Y(t) | X(t)]. Next, I will describe the "marginalized" model class, which permits likelihood-based estimation of marginal regression parameters. I will extend this class to accommodate response dependence that I have seen with long series of response data (the functional form of response dependence has both serial and long-range components). Finally, I will discuss prospective inference with retrospective, outcome-dependent sampling. One situation where such a sampling scheme might be important is a study where interest is in estimating the relationship between a response and a time-varying exposure, the exposure is expensive to measure, and a number of subjects exhibited no response variation during the study period (e.g., never had symptoms). With this sampling design, under certain conditions, we can make valid, efficient inference while excluding subjects without response variation, as long as we account for the covariate ascertainment mechanism.


February 11, 2004
Title: Two-stage multiple imputation
Speaker: Ofer Harel, postdoctoral fellow UW Biostatistics
Abstract: Conventional multiple imputation (MI) replaces the missing values in a dataset by m > 1 sets of simulated values. I explore a two-stage extension of MI in which the missing data are partitioned into two parts and imputed N = mn times in a nested fashion. Two-stage MI divides the missing information into two components of variability, lending insight when the missing values are of two qualitatively different types. Point estimates and standard errors from the N complete-data analyses are consolidated by simple rules derived by analogy to nested analysis of variance. I present simple examples of two-stage MI and discuss a variety of potential applications. I also clarify the inferential role of the missingness indicators, extending Rubin's concept of ignorability to accommodate two types of missing values.
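
The abstract does not spell out the combining rules, so the following is only a minimal sketch of how N = mn complete-data results might be consolidated. It assumes the nested-MI rule as it is commonly stated in the literature (total variance = average within-imputation variance + (1 + 1/m) x between-nest variance + (1 - 1/n) x within-nest variance). The function name and the input layout are hypothetical.

```python
from statistics import mean


def combine_two_stage(estimates, variances):
    """Combine results from a nested (two-stage) multiple imputation.

    estimates[i][j] is the point estimate from nest i, imputation j;
    variances[i][j] is the matching squared standard error.
    Assumed rule (not taken from the abstract):
        T = U_bar + (1 + 1/m) * B + (1 - 1/n) * W
    """
    m = len(estimates)       # number of first-stage imputations (nests)
    n = len(estimates[0])    # second-stage imputations per nest
    if m < 2 or n < 2:
        raise ValueError("need at least 2 nests and 2 imputations per nest")

    nest_means = [mean(row) for row in estimates]
    q_bar = mean(nest_means)                       # overall point estimate
    u_bar = mean([mean(row) for row in variances])  # average complete-data variance

    # Between-nest and within-nest variability, as in nested ANOVA.
    b = sum((qm - q_bar) ** 2 for qm in nest_means) / (m - 1)
    w = sum(
        (q - qm) ** 2
        for row, qm in zip(estimates, nest_means)
        for q in row
    ) / (m * (n - 1))

    total_var = u_bar + (1 + 1 / m) * b + (1 - 1 / n) * w
    return q_bar, total_var


# Made-up numbers: m = 2 nests, n = 2 imputations per nest.
q_bar, total_var = combine_two_stage(
    [[1.0, 3.0], [5.0, 7.0]],
    [[1.0, 1.0], [1.0, 1.0]],
)
```

The square root of `total_var` would serve as the combined standard error; reference distributions and degrees of freedom are deliberately left out of this sketch.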


February 18, 2004
Title: Bayesian image analysis of cDNA microarray data
Speaker: Raphael Gottardo, graduate student UW Statistics
Abstract: DNA microarrays are an increasingly important tool that allows biologists to gain insight into the function of thousands of genes in a single experiment. By using an array containing many DNA samples, scientists can measure the expression levels of hundreds or thousands of genes within a cell by measuring the amount of labelled cDNA bound to each site on the array. In a typical two-color microarray experiment, two mRNA samples, from control and treatment situations, are compared for gene expression. Both mRNA samples, or targets, are reverse-transcribed into cDNA, labeled using different fluorescent dyes (red and green dyes), then mixed and hybridized with the arrayed DNA sequences. The hybridized arrays are then imaged to measure the red and green intensities for each spot on the glass slide. Image analysis is an important aspect of microarray experiments; its purpose is to provide estimates of the foreground and background intensities for both the red and green channels. In this talk, I will take a Bayesian approach to the problem. In broad terms, the Bayesian approach treats the recorded raw images as numerical data, generated by a statistical model involving both a stochastic component (to accommodate the effects of noise due to the environment and imperfect sensing) and a systematic component (to describe the true scene under view). Using Bayes' theorem, the corresponding likelihood is combined with a prior distribution on the true scene description to allow inference about the scene on the basis of the recorded image. I will not assume any knowledge of cDNA microarrays. I will give a brief introduction to the technology and review the main statistical issues involved in the analysis of the images.
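
As a toy illustration of combining a likelihood with a prior through Bayes' theorem, the sketch below computes the posterior probability that a single pixel belongs to a spot's foreground. It is not the speaker's model: the Gaussian intensity distributions, their parameters, and the prior are all assumptions made up for the example.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def posterior_foreground(intensity, prior_fg=0.3,
                         fg=(1000.0, 200.0), bg=(200.0, 100.0)):
    """Posterior P(foreground | intensity) for one pixel.

    fg and bg are hypothetical (mean, sd) pairs for the intensity
    distribution of foreground and background pixels.
    """
    like_fg = normal_pdf(intensity, *fg)   # likelihood under "foreground"
    like_bg = normal_pdf(intensity, *bg)   # likelihood under "background"
    numerator = prior_fg * like_fg
    return numerator / (numerator + (1.0 - prior_fg) * like_bg)


bright = posterior_foreground(1000.0)  # pixel near the assumed foreground mean
dim = posterior_foreground(200.0)      # pixel near the assumed background mean
```

A realistic scene model would also encode spatial structure (neighbouring pixels tend to share a label), which this per-pixel sketch ignores.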


March 3, 2004
Title: Optimal Dynamic Treatment Regimes
Speaker: Erica Moodie, graduate student UW Biostatistics
Abstract: Dynamic treatment regimes offer an ethical and flexible protocol for studying the effects of treatments which are adjusted over time according to response to treatment. A dynamic treatment regime is a list of decision rules - one for each time interval - for how levels of treatment should be allocated. Consequently, dynamic regimes may reduce non-compliance due to toxicity or under-treatment.

Until recently, few methods existed to study these regimes. I will examine the traditional approach of dynamic programming to solving these problems as well as recent advances in the area which rely on least squares methods and estimating equations.
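
The dynamic programming approach mentioned above can be sketched as backward induction over a small, fully specified decision problem. Everything in the example is hypothetical: the two patient states, the two dose levels, the transition probabilities, and the toxicity cost are invented for illustration. In practice these quantities are unknown and must be estimated from data.

```python
def optimal_regime(states, actions, transition, reward, horizon):
    """Backward induction for a finite-horizon decision problem.

    transition(t, s, a) returns {next_state: probability};
    reward(t, s, a, s_next) returns the payoff of that step.
    Returns one decision rule (state -> action) per interval, plus the
    expected total payoff from each starting state.
    """
    value = {s: 0.0 for s in states}   # nothing is earned after the last interval
    rules = []
    for t in reversed(range(horizon)):
        new_value, rule = {}, {}
        for s in states:
            best_action, best_q = None, float("-inf")
            for a in actions:
                q = sum(
                    p * (reward(t, s, a, s2) + value[s2])
                    for s2, p in transition(t, s, a).items()
                )
                if q > best_q:
                    best_action, best_q = a, q
            rule[s] = best_action
            new_value[s] = best_q
        value = new_value
        rules.insert(0, rule)          # keep rules in chronological order
    return rules, value


STATES = ["responding", "not_responding"]
ACTIONS = ["low", "high"]

# Hypothetical probability of responding at the next visit.
P_RESPOND = {
    ("responding", "low"): 0.9,
    ("responding", "high"): 0.95,
    ("not_responding", "low"): 0.3,
    ("not_responding", "high"): 0.8,
}


def transition(t, s, a):
    p = P_RESPOND[(s, a)]
    return {"responding": p, "not_responding": 1.0 - p}


def reward(t, s, a, s_next):
    benefit = 1.0 if s_next == "responding" else 0.0
    toxicity = 0.2 if a == "high" else 0.0   # hypothetical cost of the high dose
    return benefit - toxicity


rules, value = optimal_regime(STATES, ACTIONS, transition, reward, horizon=2)
```

With these made-up numbers, the rule at both intervals is to give the high dose only to patients who are not responding, which mirrors the idea of adjusting treatment according to response.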


March 10, 2004
Title: Empirical Evaluation of Data Transformation and Ranking Statistics for Microarray Analysis
Speaker: Lixuan Qin, graduate student UW Biostatistics
Abstract: Many choices in the analysis of a microarray dataset affect the results, such as normalization, background adjustment, and test statistics. Some procedures (e.g., background adjustment) are common practice now, but whether they truly benefit the analysis has not been fully evaluated. We used ten spike-in microarray experiments to evaluate the relative effectiveness of analysis choices in three categories: background adjustment, normalization, and ranking statistic. Findings support the use of an intensity-based normalization procedure and also indicate that local background adjustment is harmful. We find that t-statistics perform poorly in identifying differentially expressed genes; more robust statistics are preferred. Microarray experiments and some commonly used terms will be briefly introduced during the talk. This is joint work with Dr. Katie Kerr.