Advisor: Dr. Nagaraj Neerchal
Date & Time
October 29, 2021, 12:30 pm – 1:30 pm
Title: Statistical Modeling using Conditionally Specified Joint Distributions with Applications
Often in practice, conditional distributions are easier to model and interpret, while the joint distribution itself is either intractable or not available in closed form. When the observed response consists of both continuous and discrete components, specifying conditionals is more convenient. There are many real-world applications where the conditional specification approach is intuitively appealing, and knowing the conditional distributions makes it easier to understand and visualize the joint distribution. Furthermore, the researcher can gain better insight by investigating and interpreting the conditional distributions. In this thesis, we propose a joint distribution that can be specified through its conditionals and that can handle continuous and discrete data together. In the literature, such models are referred to as conditionally specified models. We explore the theoretical aspects of conditionally specified models whose conditionals belong to the exponential family of distributions, including parameter estimation, data generation, and uniqueness of the joint distribution.
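To make the idea of data generation from a conditionally specified model concrete, the following is a minimal sketch, not the thesis's actual model or algorithm. It assumes a simple compatible pair of conditionals, X | Y=y ~ Normal(mu + sigma^2 * beta * y, sigma^2) and Y | X=x ~ Bernoulli(expit(alpha + beta * x)), and draws from the implied joint distribution with a Gibbs sampler. The function name `gibbs_normal_logistic` and the parameters `mu`, `sigma`, `alpha`, `beta`, and `burn_in` are all illustrative assumptions.

```python
import numpy as np

def gibbs_normal_logistic(n, mu, sigma, alpha, beta, burn_in=200, seed=0):
    """Draw n (x, y) pairs by alternating between two assumed conditionals.

    Assumed (illustrative) conditionals:
      X | Y=y ~ Normal(mu + sigma^2 * beta * y, sigma^2)
      Y | X=x ~ Bernoulli(expit(alpha + beta * x))
    Successive draws come from a Markov chain, so they are correlated.
    """
    rng = np.random.default_rng(seed)
    xs = np.empty(n)
    ys = np.empty(n, dtype=int)
    y = 0  # arbitrary starting state, discarded with the burn-in
    for t in range(burn_in + n):
        # Continuous component given the current discrete component.
        x = rng.normal(mu + sigma**2 * beta * y, sigma)
        # Discrete component given the freshly drawn continuous component.
        p = 1.0 / (1.0 + np.exp(-(alpha + beta * x)))
        y = int(rng.random() < p)
        if t >= burn_in:
            xs[t - burn_in] = x
            ys[t - burn_in] = y
    return xs, ys
```

For example, `gibbs_normal_logistic(1000, mu=0.0, sigma=1.0, alpha=-0.5, beta=1.0)` returns two arrays of length 1000. Only the two conditionals are needed to run the sampler; the normalizing constant of the joint density never appears.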
The Maximum Likelihood (ML) method, which is the preferred estimation method for parametric models, turns out to be difficult to implement for conditionally specified joint distributions because the joint density contains an awkward normalizing constant. Thus, Composite Likelihood (CL) was used as an alternative method of estimation. We used numerical methods to obtain the parameter estimates, since closed-form expressions for the estimates under the proposed density are not available. A simulation study was conducted for different sample sizes to investigate the properties of the ML and CL estimates. It showed that the ML method has less bias than the CL method (nearly zero in some cases); however, the CL method involves a relatively lower computational burden. In both methods, the variances of the estimates decrease as the sample size increases. Further, the joint asymptotic relative efficiency (JARE) of the CL method relative to the ML method was calculated for different sample sizes using the Godambe information matrix. In addition, we conducted a performance analysis of the two methods. The results showed that, for larger sample sizes, the CL method quickly gains a computational advantage over the ML method. Thus, choosing the CL method over the ML method is a trade-off between efficiency and computational cost. The proposed normal-logistic joint density was applied to stock prices (continuous data) and expert recommendations (categorical data) for buying or selling specific stocks. Parameters of the model were estimated using both the ML and CL methods.
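As an illustration of why a composite likelihood avoids the normalizing constant, here is a minimal sketch of a composite conditional log-likelihood. It assumes a simplified normal-logistic pair, X | Y=y ~ Normal(mu + sigma^2 * beta * y, sigma^2) and logit P(Y=1 | X=x) = alpha + beta * x, which may differ from the density proposed in the thesis. The function name `composite_loglik` and the parameterization `(mu, log_sigma, alpha, beta)` are hypothetical choices made for this example.

```python
import numpy as np

def composite_loglik(params, x, y):
    """Sum over observations of log f(x_i | y_i) + log f(y_i | x_i).

    params = (mu, log_sigma, alpha, beta); working with log_sigma keeps
    sigma positive under unconstrained numerical optimization.
    Assumed (illustrative) conditionals:
      X | Y=y ~ Normal(mu + sigma^2 * beta * y, sigma^2)
      Y | X=x ~ Bernoulli(expit(alpha + beta * x))
    """
    mu, log_sigma, alpha, beta = params
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    sigma = np.exp(log_sigma)

    # Normal conditional log-density of x given y.
    resid = x - (mu + sigma**2 * beta * y)
    ll_x_given_y = -0.5 * np.log(2.0 * np.pi) - log_sigma - 0.5 * (resid / sigma) ** 2

    # Bernoulli conditional log-mass of y given x; logaddexp(0, eta)
    # is a numerically stable log(1 + exp(eta)).
    eta = alpha + beta * x
    ll_y_given_x = y * eta - np.logaddexp(0.0, eta)

    return float(np.sum(ll_x_given_y + ll_y_given_x))
```

Each term is a fully normalized conditional density, so the joint normalizing constant is never evaluated. In practice one would pass the negative of this function to a numerical optimizer to obtain CL estimates; the price, as the abstract notes, is some loss of efficiency relative to ML.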