|
|
Title: |
Markov Chain |
Speakers: |
Sebastien Haneuse, Graduate Student, Biostatistics |
Abstract: |
Markov chain Monte Carlo (MCMC) is often mentioned in statistical analyses, but these powerful inferential methods are seldom presented. In this talk I will present two of the most widely used Markov chain simulation methods, the Metropolis-Hastings algorithm and the Gibbs sampler, and in particular their role in a Bayesian analysis. I will first go through the Bayesian paradigm (treating the unknown parameter as a random variable, then combining prior information and data to form a posterior distribution) and then the principle of duality between a density and samples generated from that density. MCMC is a technique for simulating samples from posterior densities, even when the model is very complex. I will also sketch a proof of the convergence of MCMC algorithms and go through a method for monitoring convergence in practice. |
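As a minimal illustration of the Metropolis-Hastings idea named in this abstract, the sketch below runs a random-walk sampler on a toy posterior. It is not taken from the talk: the binomial example with a uniform prior, the function names, the step size, the burn-in length, and the seed are all assumptions chosen for illustration.

```python
import math
import random


def log_posterior(theta, successes=7, trials=10):
    """Unnormalized log posterior for a binomial proportion under a
    uniform Beta(1, 1) prior (toy example, not from the talk)."""
    if not 0.0 < theta < 1.0:
        return -math.inf  # zero posterior density outside (0, 1)
    return (successes * math.log(theta)
            + (trials - successes) * math.log(1.0 - theta))


def metropolis_hastings(log_target, start, n_samples, step=0.2, seed=0):
    """Random-walk Metropolis-Hastings with a symmetric normal proposal."""
    rng = random.Random(seed)
    current = start
    current_lp = log_target(current)
    samples = []
    for _ in range(n_samples):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_target(proposal)
        # Symmetric proposal: the acceptance ratio is just the target ratio.
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        samples.append(current)  # a rejected move repeats the current state
    return samples


draws = metropolis_hastings(log_posterior, start=0.5, n_samples=20000)
kept = draws[2000:]  # discard an arbitrary burn-in period
posterior_mean = sum(kept) / len(kept)
```

In this toy case the exact posterior is Beta(8, 4), so the mean of the retained draws can be compared with the exact posterior mean 8/12. That comparison is one informal way to see the duality between a density and samples drawn from it.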
|
|
Title: |
Semiparametric Efficient and Useful Inefficient Estimation for the Auxiliary Outcome Problem with the Conditional Mean Model |
Speakers: |
Jinbo Chen, Graduate Student, Biostatistics |
Abstract: |
The auxiliary outcome problem involves study of the regression relationship between an outcome Y and a set of covariates X, where Y is observed only for a subset of subjects but a correlate of Y, S, is observed for all subjects. For example, a binary outcome may be subject to misclassification, but a validation subsample exists. In this talk, I will present semiparametric efficient estimation and useful inefficient estimation for the auxiliary outcome problem when the relationship between Y and X is restricted by the conditional mean model. The semiparametric efficient estimator we consider is a one-step estimator based on the efficient score function. I will present an algebraic approach to calculating the efficient score, which rests on an insight obtained by connecting semiparametric efficient estimation theory with Godambe's optimal estimating function theory. We also consider two useful inefficient estimators. First, we extend the robust imputation method of Yi-hau Chen (Biometrika, 2000) to the situation where the probability that an outcome is validated depends on the auxiliary outcome. Second, we propose an estimator based on the conditional expectation of unbiased estimating functions, conditioning on the observed data. Some simulation results on the finite-sample performance of the proposed estimators will be presented. |
|
|
Title: |
"My Dissertation" |
Speakers: |
Do Peterson, Graduate Student, Biostatistics |
Abstract: |
I will be presenting a talk given through 13 original music recordings with lyrics and 13 corresponding supplemental slides detailing the distribution theory for the recurrence risk ratio that I developed as part of my dissertation work in statistical genetics. This 45-minute work in progress, entitled "My Dissertation," is inspired by my interest in using multimedia and the arts to communicate deeper scientific material to more people. My primary objective in this presentation is to get feedback, i.e., to find out what works and what needs work. Please come if you are curious! |
|
|
Title: |
Statistical Analysis of the Comet Assay Using a Mixture of Gamma Distributions |
Speakers: |
Bryan Shepherd, Graduate Student, Biostatistics |
Abstract: |
The single cell gel electrophoresis (comet) assay is an increasingly popular method for detecting and comparing nuclear DNA damage and repair. In this technique, a measurement called the "tail moment" quantifies DNA damage for an individual cell. The distribution of tail moments among a group of cells on a slide (experimental unit) often follows a skewed bimodal distribution, perhaps because cells are at different stages of the cell cycle when exposed to treatments. To better examine DNA damage, the distribution of tail moments on a slide was modeled using a mixture of two gamma distributions. Maximum likelihood, modified to accommodate left-censored data, can be used to estimate the five parameters of the gamma mixture distribution for each slide. A weighted analysis of variance on the parameter estimates for the gamma mixtures can be performed to determine differences in DNA damage between treatments. These methods were applied to an experiment on the effect of thymidine kinase on DNA damage and repair. Analysis based on a mixture of gamma distributions was found to be more statistically valid, more powerful, and more scientifically informative than an analysis of the log-transformed tail moments. |
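The likelihood described in this abstract can be sketched as follows. This is one plausible reading, not the speaker's implementation: the five parameters are assumed to be a mixing weight plus a shape and a scale for each component, left-censored cells are assumed to be known only to lie at or below a detection limit, and all function names are hypothetical.

```python
import math


def gamma_pdf(x, shape, scale):
    """Gamma density with the given shape and scale, for x > 0."""
    log_pdf = ((shape - 1.0) * math.log(x) - x / scale
               - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_pdf)


def gamma_cdf(x, shape, scale, terms=500):
    """Gamma CDF via the power series for the regularized lower incomplete
    gamma function; adequate for moderate x / scale, not tuned for large ones."""
    z = x / scale
    if z <= 0.0:
        return 0.0
    term = 1.0 / shape
    total = term
    for n in range(1, terms):
        term *= z / (shape + n)
        total += term
        if term < 1e-15 * total:
            break
    log_front = shape * math.log(z) - z - math.lgamma(shape)
    return min(1.0, total * math.exp(log_front))


def mixture_log_likelihood(params, observed, n_censored, limit):
    """Log-likelihood of a two-component gamma mixture with left censoring.

    params = (p, shape1, scale1, shape2, scale2), five parameters in total.
    observed: tail moments measured above the detection limit.
    n_censored: number of cells known only to be at or below `limit`.
    """
    p, shape1, scale1, shape2, scale2 = params
    loglik = 0.0
    for x in observed:
        density = (p * gamma_pdf(x, shape1, scale1)
                   + (1.0 - p) * gamma_pdf(x, shape2, scale2))
        loglik += math.log(density)
    if n_censored > 0:
        # A left-censored value contributes P(X <= limit) to the likelihood.
        mass = (p * gamma_cdf(limit, shape1, scale1)
                + (1.0 - p) * gamma_cdf(limit, shape2, scale2))
        loglik += n_censored * math.log(mass)
    return loglik
```

Passing the negative of this function to a numerical optimizer, one slide at a time, would give per-slide parameter estimates of the kind the abstract feeds into the weighted analysis of variance. The choice of optimizer and starting values is left open here.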
|
|
Title: |
CARTscans: A Tool for Visualizing Tree-Based Models |
Speakers: |
Martha Nason, Graduate Student, Biostatistics |
Abstract: |
Tree-based models, including classification and regression trees (CART), provide a useful alternative to more standard regression techniques that use linear predictors. These models are especially useful for forming diagnostic or prognostic rules. However, the predictive models obtained from trees typically involve complex, high-order interactions between the modeled covariates, and it is therefore difficult to visualize the results of a tree model succinctly. In particular, reports of the fitted tree may obscure similarity of response distribution among regions that are actually adjacent in Cartesian space. We present CARTscans, a graphical tool providing a view of the structure of a tree model. Predicted values are displayed across a four-dimensional subspace of the covariates, with smoothing of effects due to other covariates. Using these graphs, a user can take advantage of the flexibility of tree-based models to find complex interactions and pick out interesting regions while still being able to visualize main effects. |
|
|
Title: |
Bias-Reduced Variance Estimators for GEE? |
Speakers: |
Kristian Lynch, Graduate Student, Biostatistics |
Abstract: |
Generalized estimating equations (GEE) methodology provides an alternative regression tool for analyzing correlated response data. However, several studies have shown the 'sandwich' empirical covariance estimator (the estimated covariance matrix of the regression coefficients in GEE) to be biased in certain settings. For example, in small samples with binary response data, the empirical covariance estimator tends to underestimate the variance of the regression coefficients and thus gives liberal confidence intervals. In this talk, I will first review previous studies on the performance of GEE for continuous and binary response data. I will next mention alternative covariance estimators for GEE and compare their performance to that of the sandwich estimator. Finally, I will present a small simulation study that examines the covariance estimators on clustered count data. The simulations include overdispersion, various covariate patterns, balanced or unbalanced cluster sizes, and other misspecifications. The talk is part of an ongoing Master's thesis with Dr. Lumley and should be suitable for all levels. |
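To make the 'sandwich' estimator concrete, the sketch below computes it in the simplest case: an intercept-only model with identity link and independence working correlation, where the GEE estimate is the overall mean. This simplified setting, the function name, and the toy data are assumptions for illustration; they are not the simulation design from the talk.

```python
def mean_with_variances(clusters):
    """Return (estimate, naive_variance, sandwich_variance) for an
    intercept-only GEE with identity link and independence working
    correlation, given a list of clusters of responses."""
    values = [y for cluster in clusters for y in cluster]
    n_obs = len(values)
    estimate = sum(values) / n_obs

    # Model-based ("naive") variance treats all observations as independent.
    residual_ss = sum((y - estimate) ** 2 for y in values)
    naive = residual_ss / (n_obs - 1) / n_obs

    # Sandwich: the bread is n_obs; the meat sums the squared
    # within-cluster residual totals over clusters.
    meat = sum(sum(y - estimate for y in cluster) ** 2 for cluster in clusters)
    sandwich = meat / n_obs ** 2
    return estimate, naive, sandwich


# Two clusters whose members are identical within each cluster.
estimate, naive, sandwich = mean_with_variances([[1, 1, 1], [3, 3, 3]])
```

For this toy data the naive variance is 0.2 and the sandwich variance is 0.5, reflecting the strong within-cluster correlation that the naive formula ignores.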
|
|
Title: |
Short and |
Speakers: |
Jon Schildcrout, Graduate Student, Biostatistics |
Abstract: |
Recent studies in the field of pediatric asthma have collected daily exposure and outcome data on cohorts of children with the aim of estimating the association between ambient pollution levels and daily indicators of symptoms. Analysis using regression methods such as GEE requires specification of a correlation structure; common choices for short longitudinal series are temporal autoregressive and/or exchangeable dependence structures. I will discuss a useful method for estimating the dependence structure of longitudinal, categorical data, and the impact that dependence misspecification has on the efficiency of estimates. Specifically, I evaluate three working correlation structures (independence, exchangeable, and autoregressive) under several true autoregressive/exchangeable data-generating structures when the predictor of interest is a time-varying covariate. Finally, I will discuss an analysis of a panel of 134 subjects participating in the Child Asthma Management Program (CAMP) during the pre-randomization phase of the study. |
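For readers unfamiliar with the three working correlation structures named in this abstract, the sketch below builds each one as a plain matrix. The first-order autoregressive, AR(1), form is an assumption about what "autoregressive" means here, and the function names are hypothetical.

```python
def independence(n):
    """Identity matrix: repeated measures treated as uncorrelated."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def exchangeable(n, rho):
    """Every pair of distinct time points shares the same correlation rho."""
    return [[1.0 if i == j else rho for j in range(n)] for i in range(n)]


def ar1(n, rho):
    """Correlation decays geometrically with the lag |i - j|."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


# Three daily measurements with correlation parameter 0.5.
working = {
    "independence": independence(3),
    "exchangeable": exchangeable(3, 0.5),
    "autoregressive": ar1(3, 0.5),
}
```

The exchangeable and AR(1) matrices agree at lag one and differ at longer lags, which is where a wrongly chosen working structure departs from the true dependence.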
|
|
Title: |
Estimating Lifetime Medical Costs Using a Joint Frailty Model of Survival Time and Cost as a Mark Variable |
Speakers: |
Kristin Berry, Graduate Student, Biostatistics |
Abstract: |
The analysis of lifetime medical costs with censored data presents statistical challenges. The assumption of independent censoring may be valid on the time scale, but it is not reasonable on the cost scale: the censoring pattern induced on the cost scale is typically dependent. Of more concern is the fact that the cost distribution is potentially nowhere identifiable in a nonparametric setting owing to the censoring. Methods to date have avoided this problem by arbitrarily estimating costs only up to the time of the final failure. We propose a semiparametric joint gamma frailty model for costs and survival. This model assumes a common frailty for an individual's costs and survival time. We will develop maximum likelihood estimates (MLEs) of the baseline hazards and the gamma frailty parameter using the EM algorithm. These MLEs can be combined to obtain the marginal cost distribution and mean. We will discuss the existence and consistency of these estimates. We will also present results of these methods as applied to both simulated and real data. |