NUS: Department of Statistics and Applied Probability

Seminar Details

Title: On The Discounted Penalty Function In A Discrete Time Renewal Risk Model With General Interclaim Times

Speaker: Prof Xueyuan Wu, Centre for Actuarial Studies, Department of Economics, The University of Melbourne

Date: 13 December 2007 (Thursday)

Time: 4:00pm - 5:00pm

Venue: S16-06-118 (Seminar Room)

Abstract

In this paper a discrete time renewal risk model with arbitrary interclaim times is discussed. We show that the expected discounted penalty function satisfies a recursive formula. In particular, the probability generating function of the time of ruin, as a function of the initial surplus, has a compound geometric tail. When the claim amounts follow a geometric distribution, an explicit expression for the Gerber-Shiu function can be obtained for a specially chosen penalty function. Constant claim amounts and mixed geometric claim amounts are also examined.

Title: Some Bayesian Solutions for Zero-Inflated Poisson Model Selection

Speaker: Prof Gauri S. Datta, University of Georgia

Date: 12 December 2007 (Wednesday)

Time: 4:00pm - 5:00pm

Venue: S16-06-118 (Seminar Room)

Abstract

Count data are often encountered in agriculture, biology, economics, engineering, health service, industry, meteorology, sociology, to name a few. Number of insurance claims, product defects, traffic fatalities, terrorist incidents, hurricanes, infections, and deaths from AIDS or some other disease are some among many examples dealing with count data. The Poisson distribution, which is usually considered to describe a model for such datasets, sometimes does not work well if there are too many zeros. To account for excessive zeros in count data, a zero-inflated Poisson (ZIP) distribution is suggested in the literature. A ZIP distribution is a mixture of a standard Poisson distribution and a degenerate Poisson distribution with zero mean.

The ZIP distribution has been used both for independent and identically distributed (i.i.d.) observations and for non-i.i.d. observations where suitable auxiliary variables are available to model the mean. In the latter case, which is referred to as a ZIP regression model, each count is assumed to have a different distribution depending on some explanatory variable(s) and suitable generalized linear models are fitted to the Poisson parameter and/or to the mixing probability. Although there are a number of frequentist solutions discussing statistical inference for such models, Bayesian contribution to this area is rather limited. In this talk, we propose two Bayesian solutions to this problem. In our first solution, treating it as a model selection problem, we rewrite the ZIP model as a mixture of a zero-truncated Poisson distribution and a degenerate distribution at zero. We justify an objective prior for the new parameters. Using this prior and the standard Jeffreys' prior for the Poisson mean we obtain the Bayes factor for the ZIP model versus the standard Poisson model. In the second approach, for the i.i.d. setup we embed the ZIP model into a larger class of models by suitable extension of the parameter space. Our Bayesian test depends on the posterior probability of the hypothesis of zero inflation. Some applications of both solutions and suitable extension to the regression case will be discussed.
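The rewriting of the ZIP model mentioned in the abstract can be illustrated with a small sketch. The names `p` (probability of a structural zero), `lam` (Poisson mean) and `q` (probability of a positive count) are assumptions chosen for illustration; the abstract does not state the parameterization used in the talk. The sketch only checks that the two forms give the same probability mass function.

```python
import math


def poisson_pmf(k, lam):
    """Standard Poisson probability mass function."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def zip_pmf(k, p, lam):
    """ZIP pmf: a structural zero with probability p, otherwise Poisson(lam)."""
    base = (1.0 - p) * poisson_pmf(k, lam)
    return p + base if k == 0 else base


def zip_pmf_truncated_form(k, p, lam):
    """Same pmf written as a mixture of a point mass at zero and a
    zero-truncated Poisson distribution (q is an illustrative name)."""
    positive = 1.0 - math.exp(-lam)      # P(Poisson count > 0)
    q = (1.0 - p) * positive             # P(ZIP count > 0)
    if k == 0:
        return 1.0 - q
    return q * poisson_pmf(k, lam) / positive
```

For example, `zip_pmf(0, 0.3, 2.0)` and `zip_pmf_truncated_form(0, 0.3, 2.0)` agree, and the same holds for every positive count.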



Title: Cross-Profile Shrinkage in Multivariate Bayesian Variable Selection - With Applications to Gene Set Enrichment Analysis

Speaker: Dr Sierra M. Li, Division of Oncology Biostatistics, Sidney Kimmel Cancer Center, Johns Hopkins School of Medicine, Baltimore, MD

Date: 04 December 2007 (Tuesday)

Time: 4:00pm - 5:00pm

Venue: S16-06-118 (Seminar Room)

Abstract

Variable selection is an important problem in statistical modelling in both association and prediction studies. Previous research on multivariate Bayesian variable selection has the same definition of latent variable types as in univariate regressions. We focus on the multivariate nature of the problem and raise the new concept of 3 latent types that distinguishes variables by common and differential magnitude in their regression coefficients. We propose a Bayesian hierarchical model that is flexible for both conjugate and non-conjugate structure. The dimension of the model is not fixed when the prior puts a point mass at zero. We are able to integrate out parameters that affect the dimensionality of the model and obtain the marginal posterior of the latent variable types.

Simulation studies show that the 3-type model outperforms the traditional 2-type model when there are heterogeneous signals, as measured by the sensitivity and specificity in variable classification. The model framework is general enough for a wide range of applied problems. We demonstrate the model with two case studies. The first study is to find gene-phenotype association in random recombinant yeast segregants treated by diverse small molecules. The second example is to analyze the enrichment of KEGG pathways in a breast cancer study. Gene set enrichment analysis (GSEA) examines the pre-defined, biologically meaningful sets of genes with increased power and robustness to find subtle changes. With gene expression data measured by a profile of multiple responses, such as in different cell lines and under different drug treatments, it is of great interest to elucidate which and how the enrichment differs among these multiple responses. The 3-type Bayesian variable selection model leads to the new concept of common and differential enrichment modelling cross-profile mean and variance of regression coefficients. The case studies show that the 3-type hierarchical model is generally applicable at both gene and gene-set levels.



Title: Cure Model with Current Status Data

Speaker: Prof Shuangge Ma, Steven, Department of Epidemiology and Public Health, Yale University

Date: 21 November 2007 (Wednesday)

Time: 4:00pm - 5:00pm

Venue: S16-06-118 (Seminar Room)

Abstract

Current status data arise when only the random censoring time and the event status at censoring are available. We consider current status data under the cure model, where a proportion of the subjects are not susceptible to the event of interest and the cure probability satisfies a generalized linear model. We assume Cox proportional hazards models for the event time of susceptible subjects. We investigate the maximum likelihood estimate for the linear Cox model and the penalized maximum likelihood estimate for the partly linear Cox model. It is shown that estimates of the parametric regression coefficients are root-n consistent, asymptotically normal and efficient. The nonparametric baseline function and nonparametric covariate effect can be estimated with an $n^{1/3}$ convergence rate. We propose inference for estimates of the regression coefficients using the weighted bootstrap. Simulation studies are used to assess finite sample performance of the proposed estimates. We also analyze the Calcification data for demonstration.



Title: Another Look at the Moment Method for Large Dimensional Random Matrices - I & II

Speaker: Prof Arup Bose, Indian Statistical Institute, Calcutta

Date: 14 November 2007 (Wednesday) - Part I
Date: 16 November 2007 (Friday) - Part II

Time: 4:00pm - 5:00pm - Part I & II

Venue: S16-06-118 (Seminar Room) - Part I
Venue: S16-05-101, Computer Lab 1 - Part II

Abstract

The methods to establish the limiting spectral distribution (LSD) of large dimensional random matrices include the well known moment method, which invokes the trace formula. Its success has been demonstrated for several types of matrices such as the Wigner matrix and the sample variance covariance matrix. In a recent article Bryc, Dembo and Jiang (Annals of Probability, 2006) establish the LSD for the random Toeplitz and Hankel matrices using the moment method. They perform the necessary counting of terms in the trace by splitting the relevant sets into equivalence classes and relating the limits of the counts to certain volume calculations.

We build on their work and present a unified approach. This helps provide relatively short and easy proofs for the LSD of several common matrices while at the same time providing insight into the nature of different LSD and their interrelations. By extending these methods we are also able to deal with matrices with appropriate dependent entries.

[This work is joint with Dr. Anindya Roy, University of Maryland, Baltimore County, U.S.A.]
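As a rough illustration of the trace formula behind the moment method, the sketch below computes (1/n) tr(W^k), which equals the k-th moment of the empirical spectral distribution of W, for a simulated Wigner matrix. For Wigner matrices the even moments of the limit (the semicircle law) are the Catalan numbers, a standard fact. The function names, the Gaussian entries and the matrix size are assumptions made for the example; this is not the unified approach of the talk.

```python
import math
import random


def catalan(k):
    """k-th Catalan number, the 2k-th moment of the semicircle law."""
    return math.comb(2 * k, k) // (k + 1)


def matmul(a, b):
    """Plain-Python product of two square matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def trace_moment(w, k):
    """(1/n) tr(W^k) for k >= 1: the k-th moment of the empirical spectral distribution."""
    power = w
    for _ in range(k - 1):
        power = matmul(power, w)
    return sum(power[i][i] for i in range(len(w))) / len(w)


def wigner(n, rng):
    """Symmetric matrix with N(0, 1) entries scaled by 1/sqrt(n) (illustrative choice)."""
    w = [[0.0] * n for _ in range(n)]
    scale = 1.0 / math.sqrt(n)
    for i in range(n):
        for j in range(i, n):
            w[i][j] = w[j][i] = rng.gauss(0.0, 1.0) * scale
    return w


if __name__ == "__main__":
    w = wigner(60, random.Random(1))
    for k in (1, 2):
        # Empirical 2k-th moment next to its limiting value.
        print(2 * k, trace_moment(w, 2 * k), catalan(k))
```

For moderate n the empirical moments should be close to, but not equal to, the Catalan numbers; the gap shrinks as n grows.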



Title: On Coverage of Generalized Confidence Intervals

Speaker: Prof Arup Bose, Indian Statistical Institute, Calcutta

Date: 07 November 2007 (Wednesday)

Time: 4:00pm - 5:00pm

Venue: S16-06-118 (Seminar Room)

Abstract

Generalized confidence intervals do not have exact frequentist coverage, but often provide coverage close to the nominal value and have the correct asymptotic coverage.

Many articles have shown that for messy parametric problems with certain pivotal structure, the generalized intervals perform adequately in the repeated sampling set up (even though the generalized intervals are not motivated from a repeated sampling argument).

Generalized procedures have been successfully applied to several problems of practical importance. Several simulation studies have demonstrated the success of the generalized procedure in many problems where the classical approach fails to yield adequate confidence intervals.

There has been some theoretical investigation of the success of generalized intervals in the frequentist sense. Hannig, Iyer and Patterson (2006) have shown that asymptotically the generalized intervals maintain the target coverage level for a large class of problems. Hannig (2006) has also investigated the connection between the generalized procedures and fiducial inference.

The focus of this talk is to provide theoretical explanation of the observed empirical behavior of the generalized intervals and to suggest ways of improving the finite sample performance of the generalized intervals.

We derive expansions of coverage probabilities of one-sided generalized confidence intervals and use the expansions to explain the nonuniform performance of the generalized intervals. We establish that in general the generalized confidence intervals are not first order accurate, i.e., accurate only up to the $n^{-1/2}$ term. We provide a necessary and sufficient condition for the generalized intervals to be first order accurate.

We then show how to use these expansions to obtain improved coverage by suitable calibration. The benefits of the proposed modification are illustrated in the context of several examples.

[This work is joint with Dr. Anindya Roy, University of Maryland, Baltimore County, U.S.A.]



Title: Random Continued Fractions

Speaker: Prof Alok Goswami

Date: 11 October 2007 (Thursday)

Time: 4:00pm - 5:00pm

Venue: S16-06-118 (Seminar Room)

Abstract

Given a terminating or non-terminating sequence of positive integers, the continued fraction determined by this sequence gives a positive real number. Moreover, every positive real can be represented this way. Research on properties of this continued fraction representation has been a significant part of classical mathematics. The most important of these has been the study of the Gauss dynamical system. A stochastic counterpart of this is when the continued fractions are generated by sequences of random variables, giving rise to Random Continued Fractions. For the case of a sequence of i.i.d. non-negative random variables, the random continued fraction converges almost surely. A related Markov chain and its ergodic properties play a critical role in deriving interesting properties of this limit random variable. Some special cases give rise to interesting distributions for the limit random variable. These ideas extend in a natural way to higher dimensions.
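A small sketch may help fix ideas. It evaluates a terminating continued fraction exactly and then looks at the successive convergents of a sequence of i.i.d. draws. The restriction to positive integer draws, the convention [a0; a1, ..., an] = a0 + 1/(a1 + 1/(... + 1/an)), and the function names are assumptions made for the example; the talk treats more general non-negative random variables.

```python
from fractions import Fraction
import random


def continued_fraction(terms):
    """Evaluate [a0; a1, ..., an] exactly, for positive integer terms."""
    terms = list(terms)
    value = Fraction(terms[-1])
    # Work from the innermost term outwards.
    for a in reversed(terms[:-1]):
        value = a + 1 / value
    return value


def random_convergents(n, rng, draw):
    """Convergents of a random continued fraction built from n i.i.d. draws."""
    seq = [draw(rng) for _ in range(n)]
    return [continued_fraction(seq[:k]) for k in range(1, n + 1)]


if __name__ == "__main__":
    convergents = random_convergents(12, random.Random(0), lambda r: r.randint(1, 5))
    for c in convergents:
        # Successive convergents settle down quickly.
        print(float(c))
```

With positive integer terms the gap between consecutive convergents shrinks at least geometrically, which is the deterministic fact behind the almost sure convergence in this special case.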



Title: On Distribution Estimation and Prediction for Bivariate Extreme-Value Distributions

Speaker: Prof Nader Tajvidi, Mathematical Statistics, Centre for Mathematical Sciences, Lund Institute of Technology, Lund, Sweden

Date: 10 October 2007 (Wednesday)

Time: 3:00pm - 4:00pm

Venue: S16-06-118 (Seminar Room)

Abstract

Two new methods are suggested for estimating the dependence function of a bivariate extreme-value distribution. One is based on a multiplicative modification of an earlier technique suggested by Pickands, and the other employs spline smoothing under constraints.

Both produce estimators that satisfy all the conditions that define a dependence function, including convexity and the restriction that its curve lie within a certain triangular region. The first approach does not require selection of smoothing parameters; the second does, and for that purpose we suggest explicit tuning methods, one of them based on cross-validation. Applications of our dependence function estimators to estimating the full bivariate distribution, and its density, are described, as too are applications to prediction. Indeed, the cross-validation algorithm is designed to provide near-optimal performance when estimating the bivariate density, and is particularly useful for constructing compact prediction regions by the method of profiling.
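For orientation, the classical Pickands estimator that the first method modifies can be sketched as follows. The sketch assumes the margins have already been transformed to unit exponential and uses the convention in which the joint survival function is exp(-(x + y) A(y / (x + y))), so that min(X / (1 - t), Y / t) is exponential with rate A(t). The function name and the convention are assumptions; the multiplicative modification and the spline method of the talk are not reproduced here.

```python
def pickands_estimator(xs, ys, t):
    """Classical Pickands estimate of the dependence function A(t), 0 < t < 1.

    Assumes (xs, ys) are paired observations with unit exponential margins.
    """
    if not 0.0 < t < 1.0:
        raise ValueError("t must lie strictly between 0 and 1")
    # min(X/(1-t), Y/t) is exponential with rate A(t); invert the sample mean.
    total = sum(min(x / (1.0 - t), y / t) for x, y in zip(xs, ys))
    return len(xs) / total
```

The raw estimate need not be convex, nor stay inside the triangular region max(t, 1 - t) <= A(t) <= 1, which is the kind of defect that the two methods in the abstract are designed to avoid.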

 

Title: Estimating the Error Distribution In Multivariate Heteroscedastic Time Series Models

Speaker: Prof M.J. Silvapulle, Department of Econometrics and Business Statistics, Monash University, Australia

Date: 28 September 2007 (Friday)

Time: 4:00pm - 5:00pm

Venue: S16-06-118 (Seminar Room)

Abstract

Copulas have attracted considerable interest for modelling multivariate observations and for stress testing in quantitative finance. In this paper, a semiparametric method is studied for estimating the copula parameter and the joint distribution of the error term in a class of multivariate time series models when the marginal distributions of the errors are unknown. The proposed method first obtains √n-consistent estimates of the parameters of each univariate marginal time series, and computes the corresponding residuals. These are then used to estimate the joint distribution of the multivariate error terms, which is specified using a copula. The proposed estimator of the copula parameter of the multivariate error term is asymptotically normal, and a consistent estimator of its large sample variance is also given so that confidence intervals may be constructed. A simulation study was carried out to compare the estimators particularly when the error distributions are unknown. In this simulation study, our proposed semiparametric method performed better than the well-known parametric methods. An example on exchange rates is used to illustrate the method.

Title: Skew Hedging of the Barrier Options

Speaker: Dr Szymon Borak, Humboldt University, Berlin

Date: 19 September 2007 (Wednesday)

Time: 4:00pm - 5:00pm

Venue: S16-06-118 (Seminar Room)

Abstract

The price of barrier options depends on the shape of the implied volatility surface. A barrier option can be understood, for instance, as an option on the implied volatility skew. The implied volatility surface, however, is a highly dynamic object that is subject to considerable deformations as time passes. Consequently, the hedging performance of these options crucially depends on the strategy used to extract the key factors of the implied volatility surface dynamics. We extract these factors by applying a dynamic semiparametric factor model and study the hedging performance of knock-out options.



Title: A Bayes Method of a Monotone Hazard Rate Via S-paths

Speaker: Dr Ho Man Wai

Date: 12 September 2007 (Wednesday)

Time: 4:00pm - 5:00pm

Venue: S16-06-118 (Seminar Room)

Abstract

A class of random hazard rates, which is defined as a mixture of an indicator kernel convoluted with a completely random measure, is of interest. We provide an explicit characterization of the posterior distribution of this mixture hazard rate model via S-paths. A closed-form and tractable Bayes estimator for the hazard rate is derived to be a finite sum over S-paths. The path characterization or the estimator is proved to be a Rao-Blackwellization of an existing partition characterization or partition-sum estimator. This accentuates the importance of S-paths in Bayesian modeling of monotone hazard rates. An efficient Markov chain Monte Carlo method for sampling the S-paths is proposed to approximate this class of estimates. Numerical studies show that it performs better than existing popularly used partition-based sampling methods.



Title: Parameter Estimation Techniques for Statistical Process Monitoring in the Presence of Data Autocorrelation

Speaker: Prof Thaung Lwin, CSIRO Mathematical and Information Sciences, Australia

Date: 05 September 2007 (Wednesday)

Time: 4:00pm - 5:00pm

Venue: S16-06-118 (Seminar Room)

Abstract

The present paper considers an application of the first-order autoregressive (AR(1)) model to realizations of an unobservable variable representing a quality characteristic of a process monitored at a sequence of 'time' intervals in mineral processing or manufacturing production. The unknown realizations are observed subject to errors, implying an errors-in-variables model (AR(1)_EIV) for the observed sequence of data. The model has a reasonably wide range of applications in process monitoring with autocorrelated data.

Application of such a model to process data requires both estimation of the unobservables, in constructing one-step-ahead predictions, and estimation of all the underlying model parameters.

For given values of the underlying model parameters, estimation of the unobservables can be carried out most efficiently by the Kalman filter technique. Estimation of the model parameters can be handled by a number of techniques. Specific contributions of the present paper are: (i) a parametric approach comprising a comprehensive development of the full maximum likelihood technique for estimation of the model parameters in the presence of random effects, the number of which increases with the number of observations, and (ii) a semi-parametric approach combining a direct or indirect fitting of a variogram with the method of moments and minimum prediction error sum of squares techniques for estimation of the model parameters.
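The Kalman filter step for known parameters can be sketched for the scalar case. The sketch assumes a zero-mean stationary AR(1) state x_t = phi * x_{t-1} + w_t with innovation variance q, observed as y_t = x_t + v_t with error variance r. The symbols phi, q, r and the function name are assumptions for illustration, since the abstract's own notation did not survive; parameter estimation is not covered.

```python
def kalman_ar1(ys, phi, q, r):
    """Scalar Kalman filter for an AR(1) state observed with error.

    Returns (predictions, filtered): one-step-ahead predictions of the
    unobservable given past data, and filtered estimates given data up to now.
    Assumes |phi| < 1 so that the stationary variance q / (1 - phi**2) exists.
    """
    m_pred = 0.0                       # prior mean of the first state
    p_pred = q / (1.0 - phi ** 2)      # stationary prior variance
    predictions, filtered = [], []
    for y in ys:
        predictions.append(m_pred)
        gain = p_pred / (p_pred + r)           # Kalman gain
        m = m_pred + gain * (y - m_pred)       # update with the new observation
        p = (1.0 - gain) * p_pred
        filtered.append(m)
        m_pred = phi * m                       # propagate to the next time point
        p_pred = phi ** 2 * p + q
    return predictions, filtered
```

When r is zero the filter reproduces the observations and the prediction reduces to phi times the last observation; as r grows, the filtered values are shrunk more strongly towards the predictions.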



Title: Nonparametric Monotone Regression for Generalized Linear Models

Speaker: Prof Jyh-Jen Horng Shiau, National Chiao Tung University

Date: 08 August 2007 (Wednesday)

Time: 3:00pm - 4:00pm

Venue: S16-06-118 (Seminar Room)

Abstract

In this study, motivated by the WAT-EC problem in semiconductor manufacturing, we develop a new nonparametric monotone smoothing spline smoother for analyzing responses from exponential families. The new method modifies the monotone smoothing spline smoother developed by Zhang (2004) and combines it with the methodology developed by Gu (2002) for data from exponential families. An algorithm with implementation details is provided. Computation is efficient because we utilize the characteristics of the natural cubic splines. The effectiveness of the proposed method is studied by simulation and the results demonstrate that the proposed method performs well in regression models with both Bernoulli and Poisson responses. In terms of the average squared error, the proposed monotone estimator outperforms the unconstrained smoother when the latter produces non-monotone estimates, while retaining about the same performance when the latter produces monotone estimates. As an illustrative example for applications, we demonstrate that the proposed method can be used in screening WAT test items for more stringent engineering control and in setting appropriate control limits.



Title: Analysis of Least Absolute Deviation

Speaker: Prof Ying Zhiliang, Columbia University

Date: 25 June 2007 (Monday)

Time: 3:00pm - 4:00pm

Venue: S16-06-118 (Seminar Room)

Abstract

In this talk, I will describe a least absolute deviation-based method for testing linear hypotheses. Like ANOVA, this method is coordinate-free, and admits singular design matrices. A simple approximation using stochastic perturbation is developed to obtain cut-off values for the resulting tests. Theoretical justification, computer implementation and simulation will be presented. Focus will be given to the special cases of one and multi-way layouts.

Title: The Markovian Frame of the Bayesian Inference Upon the Missing Data Models

Speaker: Prof Gang Wei, School of Mathematics and System Sciences, Shandong University, Jinan, Shandong

Date: 22 June 2007 (Friday)

Time: 4:00pm - 5:00pm

Venue: S16-06-118 (Seminar Room)

Abstract

The EM algorithm and the Data Augmentation scheme have been taken as the most fundamental approaches to statistical inference for missing data models. Although it has been recognized since 1995 that the exact posterior probability density function might be obtained for some missing data models, statisticians do not seem to have paid enough attention to this idea. In this talk we will demonstrate that in most missing data models with a low dimensional parameter space, Bayesian/likelihood inference can be performed more efficiently than with the iterative/sampling schemes of the EM algorithm and Data Augmentation. The Markovian frame for this not so well-known approach, the so-called Inverse Bayesian Formula, will be briefly introduced and discussed.

Title: Spectral Analysis of Faint Astronomical Objects: Bayesian Modeling, Computation, and Inference

Speaker: Prof David A. Van Dyk, Department of Statistics, University of California, Irvine

Date: 25 April 2007 (Wednesday)

Time: 4:00pm - 5:00pm

Venue: S16-06-118 (Seminar Room)

Abstract

The development of ever more sophisticated space-based telescopes brings forth richer astronomical data that are opening a new window on the cosmos. Instruments designed to record high-energy electromagnetic radiation (X-rays and gamma-rays), for example, are clarifying our understanding of some of the most energetic events in the universe. Matter falling into black holes, the birth and death of stars, and the collisions of galaxies all can be explored through their high-energy spectra. This cosmic exploration, however, requires careful quantitative analysis of sometimes very limited photon counts.

Statistical methods must account not only for the complexity of the astronomical objects themselves, but also of the instruments and the scientific questions that are posed. In this talk we discuss the search for narrow emission lines in spectra. The spectra are the distribution of photon energies and the emission lines are narrow ranges of energy with excess photon emission. The search for lines involves constructing a multi-level model that accounts for data degradation, instrumental effects, and the structure in the astronomical sources. The complexity of the model leads to highly multimodal likelihoods and complicated inferential questions. Standard test statistics cannot be directly used to evaluate the evidence for including lines and computational methods must be specially tailored to the problem. In the talk I will emphasize the use of profile methods for exploratory data analysis and a generalization of the Gibbs sampler that samples incompatible conditional distributions but is guaranteed to have the target posterior distribution as its stationary distribution.

This is joint work with Taeyoung Park and the California-Harvard Statistics Collaboration.


Title: Kernel Methods For Optimal Change-Points Estimation In Derivatives

Speaker: Prof Ming-Yen Cheng, National Taiwan University, and Marc Raimondo, University of Sydney

Date: 18 April 2007 (Wednesday)

Time: 4:00pm - 5:00pm

Venue: S16-06-118 (Seminar Room)

Abstract

We propose an implementation of the so-called zero-crossing-time detection technique specifically designed for estimating the location of jump-points in the first derivative (kinks) of a regression function. Our algorithm relies on a new class of kernel functions having a second derivative with vanishing moments and an asymmetric first derivative steep enough near the origin. We provide a software package which, for a sample of size $n$, produces estimators with an accuracy of order at least $O(n^{-2/5})$. This contrasts with current algorithms for kink estimation, which at best provide an accuracy of order $O(n^{-1/3})$. In the software, the kernel statistic is standardised and compared to the universal threshold to assess the significance of the kink scenario. A simulation study shows that our algorithm enjoys very good finite sample properties even at large noise levels. The method reveals kink features in real data sets with high noise levels where modern regression methods tend to oversmooth the data.

Title: Cluster Identification via Projection Pursuit

Speaker: Prof Yannis Yatracos, Department of Statistics & Applied Probability, NUS

Date: 11 April 2007 (Wednesday)

Time: 4:00pm - 5:00pm

Venue: S16-06-118 (Seminar Room)

Abstract

In a sample variance decomposition, the largest component "I" (for index) determines two least homogeneous sample clusters. For multivariate data, "I" can be used in the pursuit of two clusters with the least homogeneous one-dimensional data projection.

The properties of "I", of its population counterpart and of the associated projection pursuit index are examined. Applications include the determination of:

a) clusters from a mixture distribution,

b) remote observations in regression,

c) a separating hyperplane in support vector machines,

d) data structures.

With the proposed method the "curse of dimensionality" turns into an advantage in cluster detection.

Title: Bayesian Functional Mapping of Complex Dynamic Traits

Speaker: Dr. Liu Tian, Department of Statistics, University of Florida

Date: 4 April 2007 (Wednesday)

Time: 4:00pm - 5:00pm

Venue: S16-06-118 (Seminar Room)

Abstract

Understanding the genetic control of complex dynamic traits is fundamental to agricultural, evolutionary, and biomedical genetic research. In the past, the so-called functional mapping model was derived within the maximum likelihood context to characterize the genetic and developmental mechanisms for many biological processes. However, when dealing with such a high-dimensional problem, identifiability problems tend to occur for the maximum likelihood method. Moreover, the computational load of performing the significance tests and obtaining the confidence interval estimators by repeated sampling techniques is substantial. To cope with those problems, we propose a Bayesian approach that can identify multiple QTLs for a dynamic complex trait simultaneously within the functional mapping framework. Bayesian parameter estimation and hypothesis testing, in our approach, are implemented via Markov chain Monte Carlo algorithms. Mouse body mass data from an F2 population are used to demonstrate the effectiveness of the proposed method.

Title: Global and Local Stationary Modelling in Finance: Theory and Empirical Evidence

Speaker: Prof Guegen Dominique, Department of d'Economie et Gestio, E.N.S, Cachan, France

Date: 3 April 2007 (Tuesday)

Time: 4:00pm - 5:00pm

Venue: S16-05-101, Computer Lab 1


Abstract

Modelling real data sets with second order stochastic processes requires that the data sets satisfy the second order stationarity condition. This stationarity condition concerns the unconditional moments of the process. It is in that context that most of the models developed since the sixties have been studied; we refer to the ARMA processes (Brockwell and Davis, 1988), the ARCH, GARCH and EGARCH models (Engle, 1982, Bollerslev, 1986, Nelson, 1990), the SETAR process (Lim and Tong, 1980 and Tong, 1990), the bilinear model (Granger and Andersen, 1978, Guégan, 1994), the EXPAR model (Haggan and Ozaki, 1980), the long memory process (Granger and Joyeux, 1980, Hosking, 1981, Gray, Zang and Woodward, 1989, Beran, 1994, Giraitis and Leipus, 1995, Guégan, 2000), and the switching process (Hamilton, 1988). For all these models, we get an invertible causal solution under specific conditions on the parameters, so that the forecast points and the forecast intervals are available.

Thus, the stationarity assumption is the basis for a general asymptotic theory for identification, estimation and forecasting. It guarantees that the increase of the sample size leads to more and more information of the same kind which is basic for an asymptotic theory to make sense.

Now non-stationarity modelling has also a long tradition in econometrics. This one is based on the conditional moments of the data generating process. It appears mainly in the heteroscedastic and volatility models, like the GARCH and related models, and stochastic volatility processes (Ghysels, Harvey and Renault (1997)). This non stationarity appears also in a different way with structural changes models like the switching models (Hamilton, 1988), the stopbreak model (Diebold and Inoue, 2001, Breidt and Hsu, 2002, Granger and Hyung, 2004) and the SETAR models, for instance. It can also be observed from linear models with time varying coefficients (Nicholls and Quinn, 1982, Tsay, 1987).

Thus, using stationary unconditional moments suggests a global stationarity for the model, but using non-stationary unconditional moments or non-stationary conditional moments, or assuming the existence of states, suggests that this global stationarity fails and that we only observe a local stationary behavior.

The growing evidence of instability in the stochastic behavior of stocks, of exchange rates, of some economic data sets like growth rates for instance, characterized by existence of volatility or existence of jumps in the variance or on the levels of the prices imposes to discuss the assumption of global stationarity and its consequence in modelling, particularly in forecasting. Thus we can address several questions with respect to these remarks.

1. What kinds of non-stationarity affect the major financial and economic data sets? How to detect them?

2. Local and global stationarities: How are they defined?

3. What is the impact of evidence of non stationarity on the statistics computed from the global non stationary data sets?

4. How can we analyze data sets in the non stationary global framework? Does the asymptotic theory work in non-stationary framework?

5. What kind of models create local stationarity instead of global stationarity? How can we use them to develop a modelling and a forecasting strategy?

These questions have begun to be discussed in some papers in the economic literature. For some of these questions the answers are known; for others, very few works exist. In this paper we discuss all these problems and we propose new strategies and modelling to solve them. Several interesting topics in empirical finance awaiting future research are also discussed.

Title: Asymptotics of Eigenvectors of Large Sample Covariance Matrices

Speaker: Dr. Pan Guangming, University of Science & Technology of China

Date: 21 March 2007 (Wednesday)

Time: 4:00pm - 5:00pm

Venue: S16-06-118, Seminar Room


Abstract

The eigenvectors of sample covariance matrices play an important role in principal component analysis, wireless communication and some other fields. However, relatively little work has been done on the asymptotic behavior of eigenvectors in the research on large dimensional sample covariance matrices, compared to the eigenvalues. In this talk, we define a new form of empirical spectral distribution, which involves the eigenvectors and the eigenvalues. It is shown that this empirical spectral distribution and the classical empirical spectral distribution converge to the same limiting spectral distribution. Based on this new empirical spectral distribution, the central limit theorem for linear spectral statistics involving the eigenvectors and eigenvalues is also established. Finally, we demonstrate how large sample covariance matrix theory works in the wireless communication area.

Title: Marginal Models For Analyzing Data On Recurrent And Terminal Events

Speaker: Prof John D. Kalbfleisch, Saw Swee Hock Professor of Statistics, Department of Statistics & Applied Probability (NUS) and University of Michigan

Date: 14 March 2007 (Wednesday)

Time: 4:00pm - 5:00pm

Venue: S16-06-118, Seminar Room


Abstract

In clinical and observational studies, recurrent event data (e.g. repeated hospitalizations) are often encountered and, in some important applications, the recurrent events are censored by a terminal event (e.g. death). In such situations, the terminal and recurrent event rates are often strongly correlated. We review models and methods of analysis for data on recurrent and terminal events, which have for the most part been based on complete intensity models with strong Poisson type assumptions for the recurrent event process. We develop a new approach that retains the use of shared frailties to build in correlations, but relaxes the complete intensity assumption. Specifically, methods based on estimating functions with nonparametric components are used to assess dependence on covariates and to estimate the correlations between the recurrent and terminal processes. Asymptotic results and approximations parallel closely those available in the analysis of semiparametric models. The approach is compared with others in the literature and illustrated on data on recurrent hospitalizations and failure of treatment that arise in a Canada/USA study of peritoneal dialysis as a treatment for end stage renal disease.

Title: The General Dynamic Factor Model: Determining the Number of Factors

Speaker: Prof Marc Hallin, Institut de Statistique, E.C.A.R.E.S., and Department of Mathematics, Université Libre de Bruxelles

Date: 7 March 2007 (Wednesday)

Time: 4:00pm - 5:00pm

Venue: S16-06-118, Seminar Room


Abstract

In this talk we briefly review estimation methods in the dynamic factor model, and propose an information criterion for determining the number q of factors in the general model developed by Forni et al. (2000), as opposed to the static and restricted dynamic models considered in Bai and Ng (2002, 2005) or Amengual and Watson (2006). Our criterion is based on the fact that this number q is also the number of diverging eigenvalues of the spectral density matrix of the observations as the cross-sectional dimension n goes to infinity. We provide sufficient conditions for consistency of the criterion for large n and T (where T is the series length). We show how the method can be implemented, and provide simulations and empirics illustrating its excellent finite sample performance. Application to real data brings some new empirical contribution in the ongoing debate on the number of factors driving the US economy.
*This is joint work with Roman Liska.
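The idea that the number of factors equals the number of diverging eigenvalues can be seen in a much simpler setting than the one treated in the talk. The sketch below is a static simplification, not the proposed criterion: it uses the sample covariance matrix rather than the spectral density matrix, and the loadings, factors, noise and sizes are all assumed for illustration.

```python
import numpy as np

def covariance_eigenvalues(n, T, q, rng):
    """Eigenvalues (descending) of the sample covariance of a static q-factor model."""
    loadings = rng.standard_normal((n, q))   # assumed Gaussian loadings
    factors = rng.standard_normal((q, T))    # assumed i.i.d. factors
    noise = rng.standard_normal((n, T))      # idiosyncratic component
    x = loadings @ factors + noise
    cov = x @ x.T / T
    return np.sort(np.linalg.eigvalsh(cov))[::-1]

rng = np.random.default_rng(0)
q, T = 2, 500
eig_small = covariance_eigenvalues(20, T, q, rng)
eig_large = covariance_eigenvalues(80, T, q, rng)

# The first q eigenvalues grow with the cross-sectional dimension n,
# while the (q+1)-th eigenvalue stays bounded.
print("n=20:", np.round(eig_small[: q + 1], 2))
print("n=80:", np.round(eig_large[: q + 1], 2))
```

Counting the eigenvalues that keep growing as n increases recovers q in this toy example; the talk's criterion addresses the harder dynamic case, where the same counting is done on the spectral density matrix.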


Title: A Covariate-Adjusted Adaptive Design For Two-Stage Clinical Trials With Survival Data

Speaker: Dr. Atanu Biswas, Indian Statistical Institute

Date: 27 February 2007 (Tuesday)

Time: 4:00pm - 5:00pm

Venue: S16-05-101, Computer Lab 1

Abstract

A new two-stage response-adaptive design for phase III clinical trials is proposed with survival data in the presence of covariates. Several exact and asymptotic properties of the design are studied. The procedure is illustrated by using some real data.

(Joint work with Uttam Bandyopadhyay and Rahul Bhattacharya)

Title: A New Approach to Singular Stochastic Control in Optimal Hedging and Investment-Consumption Under Transaction Costs

Speaker: Dr. Lim Tiong Wee, Department of Statistics and Applied Probability, NUS

Date: 28 February 2007 (Wednesday)

Time: 4:00pm - 5:00pm

Venue: S16-06-118, Seminar Room

Abstract

The problems of optimal investment and consumption and of option pricing and hedging in the presence of proportional transaction costs can be formulated as singular stochastic control problems. Up till now, numerical computation of the optimal trading or hedging strategy has been based on the method of Markov chain approximation and discrete-time dynamic programming applied directly to the control problem, which necessitates the comparison of maximum attainable utilities from buying stock, selling stock, or doing nothing. This approach is computationally intensive. In this talk, we propose a new approach. Beginning with a class of singular stochastic control problems that can be transformed to optimal stopping problems, we use the equivalence to optimal stopping to develop an efficient backward induction algorithm. We then use the method of finite differences to modify the backward induction algorithm for much more general stochastic control problems, including those that arise in applications to finite-horizon optimal investment and consumption and to option pricing and hedging in the presence of transaction costs. Specific algorithms and numerical results are provided for these applications.
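For readers unfamiliar with backward induction for optimal stopping, the following generic sketch prices an American put on a binomial tree. It is a textbook illustration, not the algorithm of the talk: there are no transaction costs, and the tree construction and all parameter values are assumptions chosen for the example.

```python
import math

def american_put_binomial(s0, strike, rate, sigma, maturity, steps):
    """Backward induction on a Cox-Ross-Rubinstein tree for an American put."""
    dt = maturity / steps
    u = math.exp(sigma * math.sqrt(dt))       # up factor
    d = 1.0 / u                               # down factor
    disc = math.exp(-rate * dt)               # one-step discount factor
    p = (math.exp(rate * dt) - d) / (u - d)   # risk-neutral up probability

    # Terminal payoffs; j counts the number of up moves.
    values = [max(strike - s0 * u**j * d**(steps - j), 0.0)
              for j in range(steps + 1)]

    # At each earlier node, compare stopping (exercise) with continuing.
    for i in range(steps - 1, -1, -1):
        values = [
            max(strike - s0 * u**j * d**(i - j),                    # stop now
                disc * (p * values[j + 1] + (1 - p) * values[j]))   # continue
            for j in range(i + 1)
        ]
    return values[0]

price = american_put_binomial(s0=100.0, strike=100.0, rate=0.05,
                              sigma=0.2, maturity=1.0, steps=200)
print(f"American put value: {price:.4f}")
```

Each node needs only one comparison between stopping and continuing, which is the kind of saving an optimal stopping formulation offers over comparing several control actions at every node.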

Title: High-Dimensional Data Analysis

Speaker: Prof Bai Zhidong, Department of Statistics and Applied Probability, NUS

Dates: 7 February 2007 (Talks 1 & 2), 9 February 2007 (Talk 3) & 14 February 2007 (Talks 4 & 5)

Time: 3:00pm - 4:00pm (7 February 2007), 12:00pm - 1:00pm (9 February 2007), 3:00pm - 4:00pm (14 February 2007)

Venue: S16-06-118, Seminar Room

Summary

Talk 1: In the first talk, I will introduce some examples to show the difference between large and small data analysis, and how seriously classical limit theorems can err in statistical inference.

Talk 2: I will introduce some methodologies used in random matrix theory (RMT), in particular how the moment method and Stieltjes transforms are used.

Talk 3: I will introduce some results on Wigner's semicircular law and the Marcenko-Pastur law.

Talk 4: I will introduce spectral analysis of products of two random matrices and the limit spectral law of large F-matrices.

Talk 5: I will introduce the CLT of Linear Spectral Statistics constructed by eigenvalues and those by eigenvectors.
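A small simulation can make the contrast between classical and large dimensional behavior concrete. The sketch below, with assumed sizes and Gaussian data, compares the extreme eigenvalues of a sample covariance matrix with the edges of the Marcenko-Pastur support, (1 - sqrt(y))^2 and (1 + sqrt(y))^2 with y = n/N, for data whose true covariance is the identity.

```python
import numpy as np

rng = np.random.default_rng(0)
n, N = 200, 800            # dimension and sample size (assumed values)
y = n / N                  # dimension-to-sample-size ratio

# Data with identity covariance: every population eigenvalue equals 1.
X = rng.standard_normal((n, N))
S = X @ X.T / N
eigenvalues = np.linalg.eigvalsh(S)

# Edges of the Marcenko-Pastur support for unit variance.
lower_edge = (1 - np.sqrt(y)) ** 2
upper_edge = (1 + np.sqrt(y)) ** 2

print(f"smallest eigenvalue {eigenvalues.min():.3f}, Marcenko-Pastur edge {lower_edge:.3f}")
print(f"largest eigenvalue  {eigenvalues.max():.3f}, Marcenko-Pastur edge {upper_edge:.3f}")
```

Fixed-dimension theory would suggest that all sample eigenvalues are close to 1, whereas here they spread over roughly the interval from 0.25 to 2.25.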


Title: Rounded Data Analysis

Speaker: Prof Bai Zhidong, Department of Statistics and Applied Probability, NUS

Dates: 31 January 2007 (Talks 1 & 2) & 2 February 2007 (Talk 3)

Time: 3:00pm - 4:00pm (31 January 2007), 12:00pm - 1:00pm (2 February 2007)

Venue: S16-06-118, Seminar Room

Abstract

Unlike categorical data, all continuous data must be rounded when they are collected and recorded. Rounding errors inevitably affect the accuracy of statistical inference. In the past this was not treated seriously because statistical problems were mostly met with small samples. However, with the wide application of advanced computers, statisticians increasingly have to deal with data of large sample size and large dimension, which forces us to pay serious attention to rounding errors. It has been found in the literature that the usual t-test will reject a true null hypothesis with probability close to 1 when the sample size is large enough. This shows that new methodologies are needed for the analysis of data when the observations are rounded and the rounding scale is large relative to the sample size. In a series of recent research works, we have proposed some new methods for the analysis of rounded data.

In the first talk, I will give certain examples to show how seriously the rounding errors affect the statistical inferences.

In the second talk, I will revisit an example discussed by Dempster and Rubin in their 1982 JRSSB paper. We will show that Sheppard corrections are superior to the BRB corrections only in their example, and that this is not the case in general. Further, we propose a new method to find consistent and asymptotically normal estimates in rounded linear models. The new method is called two-stage estimation.

In the third talk, I will discuss the estimation problem for a rounded time series model. We call the method snake-cutting (the sc method). We show that the sc-estimates are consistent and asymptotically normal.
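The claim in the abstract that the usual t-test rejects a true null hypothesis with probability close to 1 for large samples can be checked with a small simulation. The sketch below is not taken from the talks: the normal model, its mean 0.3 and standard deviation 0.3, rounding to the nearest integer, and the normal critical value 1.96 are all assumptions chosen so that the rounding scale is large relative to the spread of the data.

```python
import numpy as np

def rejection_rate(n, reps, mu=0.3, sigma=0.3, seed=0):
    """Share of simulated samples in which the t-test rejects the TRUE mean mu."""
    rng = np.random.default_rng(seed)
    x = rng.normal(mu, sigma, size=(reps, n))   # unrounded observations
    r = np.round(x)                             # what is actually recorded
    mean = r.mean(axis=1)
    sd = r.std(axis=1, ddof=1)
    # A zero sample standard deviation gives an infinite statistic (a rejection).
    with np.errstate(divide="ignore", invalid="ignore"):
        t = (mean - mu) / (sd / np.sqrt(n))
    # Two-sided test at nominal level 5%, using the normal critical value.
    return float(np.mean(np.abs(t) > 1.96))

rate_small = rejection_rate(n=50, reps=200)
rate_large = rejection_rate(n=10000, reps=200)
print(f"rejection rate with n = 50:    {rate_small:.2f}")
print(f"rejection rate with n = 10000: {rate_large:.2f}")
```

Rounding shifts the mean of the recorded data slightly away from the true mean. The shift is fixed while the standard error shrinks with n, so the test statistic eventually becomes large even though the null hypothesis is true.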

Title: Testing for Threshold Moving Average with Conditional Heteroscedasticity

Speaker: Dr. Li Guodong, Department of Statistics and Actuarial Science, University of Hong Kong

Date: 22 January 2007 (Monday)

Time: 10:30am - 11:30am

Venue: S16-06-118, Seminar Room

Abstract

The recent paper by Ling and Tong (2005) considered a quasi-likelihood ratio test for the threshold in moving average models with i.i.d. errors. This article generalizes their results to the case with GARCH errors and a new quasi-likelihood ratio test is derived. The generalization is not direct since the techniques developed for TMA models heavily depend on the property of p-dependence, which is no longer satisfied by time series models with conditional heteroscedasticity. The new test statistic in this article is shown to converge weakly to a functional of a centered Gaussian process under the null hypothesis of no threshold, and it is also proved that the test has nontrivial asymptotic power under local alternatives. Monte Carlo experiments demonstrate the necessity of our test when a moving average time series has a time-varying conditional variance. As further support, two real data examples are also reported.

Title: Informative Transmission Disequilibrium Test (i-TDT): Combined Linkage and Association Mapping that Includes Unaffected Offspring as well as Affected Offspring

Speaker: Dr Guo Chao-Yu, Department of Mathematics & Statistics, Boston University

Date: 18 January 2007 (Thursday)

Time: 3:00pm - 4:00pm

Venue: S16-05-101, Computer Lab 1

Abstract

To date, there is no test valid for the composite null hypothesis of no linkage or no association that utilizes transmission information from heterozygous parents to their unaffected offspring as well as the affected offspring from ascertained nuclear families. Since the unaffected siblings also provide information about linkage and association, we introduce a new strategy called the informative-transmission disequilibrium test (i-TDT), which uses transmission information from heterozygous parents to all of the affected and unaffected offspring in ascertained nuclear families and provides a valid chi-square test for both linkage and association. The i-TDT can be used in various study designs and can accommodate all types of independent nuclear families with at least one affected offspring. We show that the transmission/disequilibrium test (TDT) [Spielman et al., 1993] is a special case of the i-TDT, if the study sample contains only case-parent trios. If the sample contains only affected and unaffected offspring without parental genotypes, the i-TDT is equivalent to the sibship disequilibrium test (SDT) [Horvath and Laird, 1998]. In addition, the test statistic of i-TDT is simple, explicit and can be implemented easily without intensive computing. Through computer simulations, we demonstrate that power of the i-TDT can be higher in many circumstances compared to a method that uses affected offspring only. Applying the i-TDT to the Framingham Heart Study data, we found that the apolipoprotein E (APOE) gene is significantly linked and associated with cross-sectional measures and longitudinal changes in total cholesterol.
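For orientation, the classical TDT of Spielman et al., which the abstract identifies as a special case of the i-TDT for case-parent trios, has a very simple form. The sketch below computes that classical statistic only, not the i-TDT itself, and the transmission counts are hypothetical numbers chosen for illustration.

```python
import math

def tdt_statistic(transmitted, not_transmitted):
    """Classical TDT: (b - c)^2 / (b + c), where b and c count how often
    heterozygous parents did or did not transmit the allele of interest
    to an affected offspring."""
    b, c = transmitted, not_transmitted
    return (b - c) ** 2 / (b + c)

def chi_square_1df_p_value(statistic):
    """Upper tail probability of a chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))

# Hypothetical counts, not taken from the Framingham Heart Study analysis.
statistic = tdt_statistic(transmitted=60, not_transmitted=40)
p_value = chi_square_1df_p_value(statistic)
print(f"TDT statistic = {statistic:.2f}, p-value = {p_value:.4f}")
```

Under the null hypothesis each heterozygous parent transmits either allele with probability one half, which is why the statistic is referred to a chi-square distribution with one degree of freedom.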

Title: Changing Patterns of Myopia and Eye Growth in Singapore Children - a Cohort Study

Speaker: Assoc. Prof Saw Seang Mei, Department of Community, Occupational and Family Medicine, Yong Loo Lin School of Medicine, NUS

Date: 17 January 2007 (Wednesday)

Time: 3:00pm - 4:00pm

Venue: S16-06-118, Seminar Room

Abstract

The Singapore Cohort study Of the Risk factors for Myopia was conducted to determine the longitudinal patterns of refractive error and risk factors for incident myopia. A total of 1,979 children from 3 schools have been examined yearly. There were 1,478 Chinese, 349 Malays and 152 children who were Indian or of other races; amongst them, 851 children were aged 7 years, 630 were aged 8 years and 498 were aged 9 years. During the first visit, the parents completed a questionnaire that included questions about possible risk factors for myopia such as the number of books read per week and whether the parents were myopic. Yearly eye examinations, including vision chart testing, eye testing using an autorefractor machine and biometry tests of eye size and shape have been conducted and continue to be conducted in the schools. We have examined the children in Yio Chu Kang Primary School and Tao Nan School every year for the past 8 years and Rulang Primary School yearly for the previous 6 years. We plan to continue the yearly examinations of the children even after the commencement of secondary school education. The prevalence rates of myopia in Singapore school children are among the highest in the world: 28% in 7 year olds, 34% in 8 year olds, 43% in 9 year olds, 62.5% in 10 year olds, 67.1% in 11 year olds, and 63% in 12 year olds. The 3-year increases in axial length, anterior chamber depth, lens thickness, vitreous chamber depth and corneal curvature were 0.89 mm, -0.02 mm, -0.01 mm, 0.92 mm and 0.01 mm, respectively. Children who were younger, female and who had a parental history of myopia were more likely to have greater increases in axial length. In a cohort analysis of three-year data, the relative risk (RR) of myopia was 1.37 [95% confidence interval (CI) 1.05 to 1.80] for two versus no myopic parents, after controlling for school, age, gender, income, books read per week and intelligence quotient (IQ). The multivariate RR of myopia for IQ in the third versus first tertile was 1.47 (95% CI 1.16 to 1.87). Among children with IQ in the highest tertile, the RR of high myopia was 2.72 (95% CI 1.26 to 5.84) for those reading more than 2 books per week as compared to those reading 2 books or less per week. This cohort provides valuable data about the aetiology of the incidence and progression of myopia.
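The reported relative risks and confidence intervals can be read on the log scale. The sketch below assumes, as is common practice but not stated in the abstract, that each interval is symmetric around log(RR); under that assumption the geometric mean of the two limits should reproduce the reported RR, and the interval width gives the standard error of log(RR). The short labels are paraphrases introduced for the example.

```python
import math

# (label, reported RR, lower 95% limit, upper 95% limit), as quoted in the abstract.
reported = [
    ("two versus no myopic parents", 1.37, 1.05, 1.80),
    ("IQ, third versus first tertile", 1.47, 1.16, 1.87),
    ("high myopia, more than 2 books per week", 2.72, 1.26, 5.84),
]

def log_scale_summary(lower, upper, z=1.96):
    """Midpoint and standard error implied by an interval symmetric in log(RR)."""
    midpoint = math.sqrt(lower * upper)                        # exp of the log midpoint
    se_log = (math.log(upper) - math.log(lower)) / (2.0 * z)   # standard error of log(RR)
    return midpoint, se_log

summaries = {}
for label, rr, lower, upper in reported:
    midpoint, se_log = log_scale_summary(lower, upper)
    summaries[label] = (midpoint, se_log)
    print(f"{label}: reported RR {rr:.2f}, implied midpoint {midpoint:.2f}, "
          f"SE of log RR {se_log:.3f}")
```

The implied midpoints agree with the reported values to within rounding, and the much larger standard error for the high myopia comparison reflects its wide interval.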

Title: Higher Order Semiparametric Frequentist Inference Based on the Profile Sampler

Speaker: Mr Cheng Guang, Institute of Statistics & Decision Sciences, Duke University

Date: 15 January 2007 (Monday)

Time: 10:30am - 11:30am

Venue: S16-06-118, Seminar Room

Abstract

In this talk, we systematically construct a higher order frequentist validation of semiparametric estimation procedures through easy-to-implement Bayesian MCMC methodology. Specifically, inference for the parametric component of a semiparametric model based on sampling from the posterior profile distribution, called "the profile sampler", is thoroughly investigated from a frequentist viewpoint. We first derive the second order asymptotic frequentist properties of the profile sampler in terms of distributions, moments and confidence intervals. Further, by a delicate analysis of the entropy of the semiparametric models involved, we find that the accuracy of inferences based on the profile sampler improves as the convergence rate of the nuisance parameter increases.

From the above analysis, we notice that the estimation accuracy of the profile sampler method is intrinsically determined by the semiparametric model specifications. Therefore it is natural to question how to control the degree of accuracy. In the last section, we address this by proposing the penalized profile sampler method, in which we profile the penalized likelihood rather than the full likelihood. Thus, we can achieve the desired estimation accuracy for the parameter of interest by tuning the associated smoothing parameters.

Our theory is verified in several popular semiparametric models arising from Survival Analysis, Epidemiology and Econometrics. As far as we are aware, the above results are the first higher order frequentist inferences obtained for semiparametric estimation.

Title: Testing Hypothesis of Errors / Innovations in Non-parametric Regression

Speaker: Prof Estate V. Khmaladze, Victoria University of Wellington

Date: 11 January 2007 (Thursday)

Time: 3:00pm - 4:00pm

Venue: S16-05-101, Computer Lab 1

Title: Approximating the Variance of the Conditional Probability of the state of a Hidden Markov Model

Speaker: Prof David Siegmund, Stanford University

Date: 10 January 2007 (Wednesday)

Time: 3:00pm - 4:00pm

Venue: S16-06-118, Seminar Room

Abstract

For a hidden Markov model the variance of the conditional probability of the underlying state given the observations measures the information lost by failure to observe directly the state of the hidden process. In the case when changes of state occur slowly relative to the speed at which information about the underlying state accumulates in the observed data, the variance of this conditional probability is computed approximately in terms of functionals of Brownian motion that arise in change-point analysis. Applications in gene-mapping, where this variance plays a role in standardizing the score statistic and in evaluating the loss of noncentrality due to incomplete information, are discussed. Numerical examples illustrate the range of validity and limitations of our results.
