Estimation of direct effect for survival data using the Aalen additive hazards model
Friday, 7.5.10, 09:30-10:30, IMBI, Stefan-Meier-Str.26
We are interested in estimating the direct effect of an exposure variable X on a survival outcome T. In the presence of an intermediate variable K and an unobserved confounder U for the effect of K on T, standard regression techniques yield a biased estimate of the direct effect of X on T. This problem may be solved by including additional information, L, that removes the effect of U on K. However, if L is also affected by X, then standard methods are still not appropriate. Marginal structural models have been suggested to tackle this problem, but they require the estimation of specific weights that may be quite unstable. To overcome this problem, Goetgeluk et al. (JRSSB, 2009) suggested a so-called G-estimation approach for the case of an uncensored response variable. In this talk I show how to generalize their approach to the setting of survival data. I start out by describing the dynamic path analysis approach and point out that it may give wrong answers in the presence of an unmeasured confounder.
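As a point of reference, the following is a minimal sketch (not the speaker's G-estimation method) of fitting an Aalen additive hazards model in Python with the `lifelines` package, using simulated exposure/mediator data; the data-generating mechanism, variable names and penalizer value are illustrative assumptions only.

```python
# Minimal sketch: naive additive-hazards adjustment for a mediator K.
# With an unmeasured confounder U of K and T (not simulated here) this
# naive estimate of the direct effect of X would be biased, which is
# the problem the talk addresses.
import numpy as np
import pandas as pd
from lifelines import AalenAdditiveFitter

rng = np.random.default_rng(1)
n = 2000
X = rng.binomial(1, 0.5, n)                # exposure
K = rng.binomial(1, 0.3 + 0.3 * X)         # intermediate variable affected by X
hazard = 0.1 + 0.05 * X + 0.05 * K         # additive (constant) hazard
T = rng.exponential(1.0 / hazard)          # event times
C = rng.exponential(10.0, n)               # independent censoring times
df = pd.DataFrame({
    "X": X,
    "K": K,
    "time": np.minimum(T, C),
    "event": (T <= C).astype(int),
})

aaf = AalenAdditiveFitter(coef_penalizer=0.1)
aaf.fit(df, duration_col="time", event_col="event")
print(aaf.cumulative_hazards_.tail())      # cumulative regression functions B_j(t)
```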
Detecting the Emergence of a Signal in a Noisy Image
Friday, 7.5.10, 11:15-12:15, Raum 404, Eckerstr. 1
We study sequential change-point detection when observations form a sequence of independent Gaussian random fields, and the change-point is the time at which a signal of known functional form, involving a finite number of unknown parameters, appears. We first identify a detection procedure of Shiryayev-Roberts type that is asymptotically minimax up to terms that vanish as the false detection rate converges to zero. We then compare approximations to the Shiryayev-Roberts detection procedure with comparatively simple approximations to CUSUM-type procedures. Although the CUSUM-type procedures are suboptimal, our numerical studies indicate that they compare favorably to the asymptotically optimal procedures.
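To illustrate the CUSUM idea in its simplest form, here is a sketch for detecting a known mean shift in an i.i.d. Gaussian sequence; this is a scalar simplification of the random-field setting of the talk, and the shift size delta and threshold b are illustrative choices.

```python
# CUSUM stopping rule for a mean shift from 0 to delta (unit variance).
import numpy as np

def cusum_detect(x, delta, b):
    """Return the first time the CUSUM statistic exceeds threshold b, or None."""
    s = 0.0
    for t, xt in enumerate(x, start=1):
        # log-likelihood ratio increment for N(delta,1) vs N(0,1)
        s = max(0.0, s + delta * (xt - delta / 2.0))
        if s > b:
            return t
    return None

rng = np.random.default_rng(0)
pre = rng.normal(0.0, 1.0, 200)            # no signal
post = rng.normal(1.0, 1.0, 200)           # signal of size 1 appears at t = 201
print(cusum_detect(np.concatenate([pre, post]), delta=1.0, b=8.0))
```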
From genomes to phenotypes - statistical applications in transcriptomics, high-throughput RNAi and microscopy-image-based phenotyping
Friday, 21.5.10, 11:15-12:15, Raum 404, Eckerstr. 1
How do variations in the genomes of individuals shape their phenotypes?
Recent technological progress in high-throughput sequencing, genetic tools and automated microscopy imaging enables powerful experiments to address this question and poses exciting challenges for data analysis and modelling.

The talk will have two parts.

First, I will report a statistical error model for high-throughput nucleotide sequencing data. This technology provides quantitative readouts in assays for RNA expression (RNA-Seq) and protein-DNA binding (ChIP-Seq). Statistical inference of differential signal in these data needs to take into account their natural variability throughout the dynamic range. When the number of replicates is small, error modelling is needed to achieve statistical power. We propose an error model that uses the negative binomial distribution, with variance and mean linked by local regression, to model the null distribution of the count data. The method controls type-I error and provides good detection power. A free open-source R/Bioconductor software package, “DESeq”, is available.

Second, I will describe some aspects of the statistical modelling of large-scale RNAi experiments, where the response of cellular populations to the RNAi perturbations is monitored by live-cell microscopy. The data are analysed by automated image analysis, fitting of dynamic models of cell cycle progression, extraction of multivariate phenotypes, and definition of a multivariate phenotypic landscape.
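A toy sketch of the distributional assumption described in the first part, i.e. counts whose variance is linked to the mean as var = mu + alpha*mu^2; this illustrates the negative binomial parameterization only and is not the DESeq package itself (which additionally estimates the mean-variance relationship by local regression and performs the differential test). The values of mu and alpha are arbitrary.

```python
# Negative binomial counts with a quadratic mean-variance relationship.
import numpy as np
from scipy import stats

def nb_params(mu, alpha):
    """Convert mean mu and dispersion alpha (var = mu + alpha*mu^2)
    to scipy's (n, p) parameterization of the negative binomial."""
    var = mu + alpha * mu ** 2
    p = mu / var
    n = mu * p / (1.0 - p)
    return n, p

rng = np.random.default_rng(0)
mu, alpha = 100.0, 0.2
n, p = nb_params(mu, alpha)
counts = stats.nbinom.rvs(n, p, size=10_000, random_state=rng)
print(counts.mean(), counts.var())   # close to mu = 100 and mu + alpha*mu^2 = 2100
```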
Electrochemical oscillations and chaos on macro- and microscales: phase diffusion, synchronization, and pacemaker design
Monday, 7.6.10, 14:00-15:00, Raum 404, Eckerstr. 1
Complex chemical and biological systems exhibit dynamic self-organization with emergent properties that depend both on the behavior of the constituent parts and on the types and extent of their interactions. We introduce the subject of chemical complexity through the description of macroscale and microscale electrochemical oscillations that occur on multi-particle electrodes.

In the presentation, electrochemical oscillations are described using the concept of cycle phase, and we determine how the frequency and precision of the current oscillations on a single electrode depend on resistance and temperature, and how the extent of interactions affects periodic and chaotic synchronization patterns on electrode arrays.

For optimal pacemaker design, a theory is presented for obtaining the waveform that effectively entrains a weakly forced oscillator. Phase model analysis is combined with the calculus of variations to derive a waveform with which entrainment of an oscillator is achieved with minimum forcing power. The theory is tested in chemical entrainment experiments in which oscillations close to and further away from a Hopf bifurcation exhibited sinusoidal and higher-harmonic nontrivial optimal waveforms, respectively.
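For readers unfamiliar with phase models, here is a minimal simulation of a single oscillator with natural frequency omega and phase sensitivity Z(phi) = sin(phi), weakly forced by a sinusoidal waveform at frequency Omega. All functions and parameter values are illustrative stand-ins; the talk derives the optimal (generally non-sinusoidal) waveform via the calculus of variations, which is not attempted here.

```python
# Weakly forced phase oscillator: dphi/dt = omega + eps * Z(phi) * u(psi).
import numpy as np

omega, Omega, eps = 1.00, 1.02, 0.1       # natural freq., forcing freq., strength
dt, steps = 1e-3, 200_000

phi, psi = 0.0, 0.0                        # oscillator phase, forcing phase
for _ in range(steps):
    u = np.cos(psi)                        # forcing waveform u(psi)
    phi += dt * (omega + eps * np.sin(phi) * u)
    psi += dt * Omega

# If the oscillator is entrained, the phase difference settles near a constant.
print((phi - psi) % (2 * np.pi))
```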
Model selection based on FDR-thresholding - optimizing the area under the receiver operating characteristic curve
Friday, 11.6.10, 11:15-12:15, Raum 404, Eckerstr. 1
In gene expression or proteomic studies large numbers of variables are investigated. We generally cannot assume that a few of the investigated variables show large effects. Instead we often hope that there is at least a combination of several variables which, e.g., allows prediction of the response of an individual patient to a particular therapy. The task of selecting useful variables with rather moderate effects from a very large number of candidates, and of estimating suitable scores to be used for the prediction of a clinical outcome in future patients, is a hard exercise.

We evaluate variable selection by multiple tests controlling the false discovery rate (FDR) to build a linear score for prediction of a clinical outcome in high-dimensional data. Quality of prediction is assessed by the receiver operating characteristic (ROC) curve for prediction in independent patients. Thus we try to combine both goals: prediction and controlled structure estimation. We show that the FDR threshold which provides the ROC curve with the largest area under the curve (AUC) varies considerably over the different parameter constellations, which are not known in advance.

Hence, we investigated a cross-validation procedure based on the maximum rank correlation estimator to determine the optimal selection threshold. This procedure (i) allows an appropriate selection criterion to be chosen, (ii) provides an estimate of the FDR close to the true FDR, and (iii) is simple and computationally feasible also for moderate to small sample sizes. Low estimates of the cross-validated AUC (the estimates generally being positively biased) and large estimates of the cross-validated FDR may indicate a lack of sufficiently prognostic variables and/or too small sample sizes. The method is applied to an example dataset.
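A minimal sketch of the FDR-thresholding idea on simulated data: select variables by a Benjamini-Hochberg cut-off on univariate t-tests, combine the selected variables into a simple linear score, and assess the score by the ROC AUC on held-out patients. The simulated data, the FDR level q and the sign-based score weights are illustrative assumptions, not the procedure of the talk (which in particular uses a maximum-rank-correlation-based cross-validation to pick the threshold).

```python
import numpy as np
from scipy import stats
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n, p, q = 200, 1000, 0.1                   # samples, variables, FDR level
y = rng.binomial(1, 0.5, n)                # binary clinical outcome
X = rng.normal(size=(n, p))
X[:, :20] += 0.5 * y[:, None]              # 20 variables carry a moderate effect

train, test = np.arange(n) < n // 2, np.arange(n) >= n // 2
t, pval = stats.ttest_ind(X[train][y[train] == 1], X[train][y[train] == 0])

# Benjamini-Hochberg: take the largest k with p_(k) <= k*q/p.
order = np.argsort(pval)
passed = pval[order] <= (np.arange(1, p + 1) * q / p)
k = np.max(np.nonzero(passed)[0]) + 1 if passed.any() else 0
selected = order[:k]

# Linear score: selected variables weighted by the sign of their t-statistic.
score = X[test][:, selected] @ np.sign(t[selected])
print(f"selected {k} variables, test AUC = {roc_auc_score(y[test], score):.2f}")
```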
Risk aggregation: New techniques and new problems
Friday, 18.6.10, 11:15-12:15, Raum 404, Eckerstr. 1
Quantitative Risk Management often starts with a vector X of one-period loss random variables. We introduce a general mathematical framework which interpolates between different levels of information on the distribution of X and illustrate some basic issues on how to aggregate and risk measure the random position X. In particular, we study Risk Aggregation under different mathematical set-ups, for different aggregating functionals and risk measures, focusing on Value-at-Risk. We show how the theory of Mass Transportations and tools originally developed to solve so-called Monge-Kantorovich problems turn out to be useful in this context. Finally, we introduce some new numerical integration techniques which solve some open aggregation problems and raise new interesting research issues.
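A small Monte Carlo sketch of why the dependence structure matters in risk aggregation: the Value-at-Risk of X1 + X2 with fixed lognormal margins is computed under independence and under comonotonicity. The margins and the confidence level are illustrative assumptions; the talk treats far more general information set-ups and mass-transportation-based bounds.

```python
# Aggregate VaR under two dependence structures with identical margins.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha, n = 0.99, 1_000_000

def q(u):
    """Quantile function of a standard lognormal margin."""
    return stats.lognorm.ppf(u, s=1.0)

u1, u2 = rng.uniform(size=n), rng.uniform(size=n)
var_indep = np.quantile(q(u1) + q(u2), alpha)   # independent copula
var_comon = np.quantile(q(u1) + q(u1), alpha)   # comonotonic copula
print(var_indep, var_comon)   # same margins, different aggregate VaR
```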