Tweedie densities are exponential dispersion models characterised by a variance of the form var[Y] = φμ^p, where μ = E[Y] is the mean of the distribution, φ > 0 is the dispersion parameter and p is the index of the distribution. Of particular interest is the case 1 < p < 2, for which the distributions are continuous for Y > 0 with a discrete mass at Y = 0. Although there are notable special cases of the Tweedie distributions (the normal (p = 0), Poisson (p = 1), gamma (p = 2) and inverse Gaussian (p = 3) distributions), the Tweedie densities do not generally have a closed form. This means that maximum likelihood estimation techniques are difficult to use, and consequently the distributions have been used little in practice. In this thesis, two numerical methods for evaluating the densities are examined: infinite series expansions and Fourier inversions. Algorithms for evaluating the cumulative distribution function are also studied. Programs have been written in S-Plus to implement these techniques. These programs are then used for a number of analysis problems, such as fitting generalized linear models to data, residual analysis, and modelling the deviance.
Chapter 1 provides the motivation for the research. Background information on dispersion models, exponential dispersion models and the Tweedie distributions follows in Chapter 2.
Then, the infinite series expansions for the Tweedie densities are considered in Chapter 3. The series are evaluated by carefully considering the region where the terms in the series contribute to the infinite summation. The result is a technique for accurate evaluation for most regions in the parameter space. Using similar techniques, derivatives of the density with respect to the dispersion parameter φ are also calculated.
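The idea of summing only the terms that matter can be sketched for 1 < p < 2 as follows. This is a minimal illustration, not the series used in the thesis: it assumes the standard compound Poisson-gamma representation of the Tweedie distribution for 1 < p < 2, and the function names, the tolerance `rel_tol` and the simple upward search for the largest term are all hypothetical choices made for the sketch.

```python
import math


def compound_poisson_gamma_params(mu, phi, p):
    """Poisson rate, gamma shape and gamma scale matching mean mu and variance phi * mu**p."""
    lam = mu ** (2.0 - p) / (phi * (2.0 - p))
    shape = (2.0 - p) / (p - 1.0)
    scale = phi * (p - 1.0) * mu ** (p - 1.0)
    return lam, shape, scale


def log_term(n, y, lam, shape, scale):
    """Log of the n-th series term: Poisson(n; lam) * Gamma(y; shape n*shape, scale)."""
    log_poisson = -lam + n * math.log(lam) - math.lgamma(n + 1.0)
    k = n * shape
    log_gamma_pdf = ((k - 1.0) * math.log(y) - y / scale
                     - k * math.log(scale) - math.lgamma(k))
    return log_poisson + log_gamma_pdf


def tweedie_density_series(y, mu, phi, p, rel_tol=1e-16):
    """Density for y > 0 (or the point mass at y == 0) when 1 < p < 2."""
    if not 1.0 < p < 2.0:
        raise ValueError("this sketch covers 1 < p < 2 only")
    lam, shape, scale = compound_poisson_gamma_params(mu, phi, p)
    if y < 0.0:
        return 0.0
    if y == 0.0:
        return math.exp(-lam)  # probability mass at zero, not a density value

    # The log-terms are concave in n, so walk upward until they stop increasing.
    n_max = 1
    log_max = log_term(1, y, lam, shape, scale)
    while True:
        candidate = log_term(n_max + 1, y, lam, shape, scale)
        if candidate <= log_max:
            break
        n_max, log_max = n_max + 1, candidate

    # Sum outward from the largest term until terms are negligible relative to it.
    cutoff = log_max + math.log(rel_tol)
    total = 1.0  # the largest term, scaled by exp(-log_max)
    n = n_max + 1
    while True:
        t = log_term(n, y, lam, shape, scale)
        if t < cutoff:
            break
        total += math.exp(t - log_max)
        n += 1
    n = n_max - 1
    while n >= 1:
        t = log_term(n, y, lam, shape, scale)
        if t < cutoff:
            break
        total += math.exp(t - log_max)
        n -= 1
    return math.exp(log_max) * total
```

Working on the log scale and factoring out the largest term avoids overflow in the individual terms. The linear search for the largest term is chosen for clarity only and becomes slow when the largest term sits at a very high index.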
In Chapter 4, Fourier inversion of the cumulant generating function is used to evaluate the Tweedie densities. This requires a large amount of analysis to ensure that the algorithms converge with certainty. The Fourier inversion method works well for most parts of the parameter space. Fortunately, the regions where the Fourier inversion technique fails or is very slow are the regions where the series expansions perform well. Likewise, where the series expansions do not work well, the Fourier inversion method can be used.
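As a rough illustration of density evaluation by Fourier inversion, the sketch below inverts the characteristic function exp{K(it)} for p > 2, where the distribution is continuous with no mass at zero. It is a sketch under stated assumptions rather than the algorithm developed in the thesis: it uses the standard exponential dispersion model form K(s) = [κ(θ + φs) − κ(θ)]/φ with the Tweedie cumulant function κ, a plain trapezoidal rule, and hypothetical names and defaults (`step`, `tol`, `max_steps`). None of the convergence analysis referred to above is reproduced.

```python
import cmath
import math


def tweedie_cgf(s, mu, phi, p):
    """Cumulant generating function K(s) = [kappa(theta + phi*s) - kappa(theta)] / phi for p > 2."""
    theta = mu ** (1.0 - p) / (1.0 - p)   # canonical parameter (negative for p > 2)
    alpha = (2.0 - p) / (1.0 - p)

    def kappa(z):
        # Tweedie cumulant function; the real part of (1 - p) * z stays positive here,
        # so the principal branch of the complex power is continuous.
        return ((1.0 - p) * z) ** alpha / (2.0 - p)

    return (kappa(theta + phi * s) - kappa(theta)) / phi


def tweedie_density_fourier(y, mu, phi, p, step=0.02, tol=1e-14, max_steps=2_000_000):
    """f(y) = (1/pi) * integral over t >= 0 of Re[exp(K(it) - i*t*y)], trapezoidal rule."""
    if p <= 2.0:
        raise ValueError("this sketch covers p > 2 only")
    total = 0.5  # integrand equals 1 at t = 0; half weight at the endpoint
    for k in range(1, max_steps):
        t = k * step
        value = cmath.exp(tweedie_cgf(1j * t, mu, phi, p) - 1j * t * y)
        total += value.real
        if abs(value) < tol:  # modulus of the characteristic function is negligible
            break
    else:
        raise RuntimeError("integrand did not decay within max_steps")
    return total * step / math.pi
```

For p = 3 the result can be compared with the closed-form inverse Gaussian density. The integrand decays slowly when φ is large and oscillates rapidly when y is large, which gives some sense of why a naive rule like this one is not adequate across the whole parameter space.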
Chapter 5 discusses the problem of computing the cumulative distribution function for the Tweedie distributions. The focus is on using a Fourier inversion technique similar to that used in Chapter 4 for the density. For the case 1 < p < 2, a series expansion also exists for the cumulative distribution function and is examined.
In Chapter 6, the accuracy of the saddlepoint approximation to the Tweedie densities is explored. This assessment is possible only because the Tweedie densities can be evaluated accurately using the results of Chapters 3 and 4.
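For reference, the saddlepoint approximation in question can be sketched as below. The code assumes the usual form for an exponential dispersion model, (2πφ V(y))^(-1/2) exp{−d(y, μ)/(2φ)}, with variance function V(y) = y^p and the Tweedie unit deviance d for p not equal to 1 or 2; the function names are hypothetical. The inverse Gaussian density is included only as a check, since for p = 3 the approximation is known to be exact.

```python
import math


def tweedie_unit_deviance(y, mu, p):
    """Unit deviance d(y, mu) for a Tweedie model with index p (p != 1, 2) and y > 0."""
    if p in (1.0, 2.0):
        raise ValueError("p = 1 and p = 2 need the limiting forms, not covered here")
    return 2.0 * (y ** (2.0 - p) / ((1.0 - p) * (2.0 - p))
                  - y * mu ** (1.0 - p) / (1.0 - p)
                  + mu ** (2.0 - p) / (2.0 - p))


def tweedie_saddlepoint_density(y, mu, phi, p):
    """Saddlepoint approximation (2*pi*phi*y**p)**(-1/2) * exp(-d(y, mu) / (2*phi))."""
    if y <= 0.0:
        raise ValueError("the approximation is defined for y > 0 only")
    d = tweedie_unit_deviance(y, mu, p)
    return math.exp(-d / (2.0 * phi)) / math.sqrt(2.0 * math.pi * phi * y ** p)


def inverse_gaussian_density(y, mu, phi):
    """Closed-form density of the p = 3 special case, used as a reference."""
    return (math.exp(-(y - mu) ** 2 / (2.0 * phi * mu ** 2 * y))
            / math.sqrt(2.0 * math.pi * phi * y ** 3))
```

Dividing an accurately computed density by `tweedie_saddlepoint_density` at the same arguments gives the ratio whose departure from 1 measures the error of the approximation.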
A fast but accurate method of evaluating the Tweedie densities for practically any parameter values is then developed in Chapter 7 using a polynomial interpolation technique. The interpolation is based on the ratio of the actual density (computed using the results of Chapters 3 and 4) to the saddlepoint approximation to the Tweedie densities.
In Chapter 8, the ability to compute the Tweedie densities accurately is used to evaluate the log-likelihood function, which enables maximum likelihood estimates to be found. Estimation of parameters is discussed, and a number of data examples are used to demonstrate the procedures. The likelihood calculations allow efficient estimation of the index p and the dispersion parameter φ (Chapter 8); allow model checking related to the distributional form through randomized quantile residuals (Chapter 9); and allow investigation of the domain of adequacy of the extended quasi-likelihood (EQL) and saddlepoint approximations (Chapters 6 and 10).
To assess the suitability of fitted generalised linear models, it is standard to use Pearson or deviance residuals. In Chapter 9, however, a new type of residual is developed that can be used to better judge the suitability of fitted models. These residuals are called randomized quantile residuals, and can be used for any distribution. They utilize the cumulative distribution function, so they can be used for the Tweedie distributions using the algorithms that were developed in Chapter 5. Using the quantile residuals, some of the data examples from Chapter 8 are considered and the quality of the fitted models is discussed. In addition, other examples are used to demonstrate the advantages of using quantile residuals.
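The construction of a randomized quantile residual can be sketched as follows, assuming the usual definition: the residual is the standard normal quantile of a value drawn uniformly between the fitted cumulative distribution function just below the observation and at the observation. For a continuous response the two values coincide and no randomization occurs. The Poisson case (p = 1) is used here only because its distribution function is easy to write down; the function names and the clamping constant are hypothetical.

```python
import math
import random
from statistics import NormalDist


def poisson_cdf(k, mu):
    """P(Y <= k) for a Poisson(mu) variable, by direct summation."""
    if k < 0:
        return 0.0
    term = math.exp(-mu)
    total = term
    for j in range(1, k + 1):
        term *= mu / j
        total += term
    return min(total, 1.0)


def randomized_quantile_residual(cdf_below, cdf_at, rng):
    """Normal quantile of a uniform draw between F(y-) and F(y)."""
    u = cdf_below + (cdf_at - cdf_below) * rng.random()
    # inv_cdf needs 0 < u < 1, so keep u away from the endpoints.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return NormalDist().inv_cdf(u)


# Usage: residual for an observed count of 4 under a fitted mean of 3.
rng = random.Random(1)
residual = randomized_quantile_residual(poisson_cdf(3, 3.0), poisson_cdf(4, 3.0), rng)
```

For a Tweedie response with 1 < p < 2, the same function would be called with `cdf_below = 0` and `cdf_at = P(Y = 0)` for zero observations, and with both arguments equal to the fitted distribution function at y for positive observations.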
In Chapter 10, quasi-likelihood and extended quasi-likelihood functions are considered. For exponential dispersion models such as the Tweedie distributions, the extended quasi-likelihood function is equivalent to using the saddlepoint approximation to the densities. The accuracy of the extended quasi-likelihood estimates is therefore directly related to the accuracy of the saddlepoint approximation. Examples are used to demonstrate the use of extended quasi-likelihood for parameter estimation, and the results are compared to the maximum likelihood estimates found in Chapter 8. Using extended quasi-likelihood also implies distributional assumptions about the unit deviances. Using the results of Chapters 3 and 4, these assumptions are examined for the Tweedie distributions.
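The equivalence between extended quasi-likelihood and the saddlepoint approximation can be written out as follows. The notation is an assumption made for this sketch and may differ from that used in the thesis: d(y, μ) denotes the unit deviance, V(μ) = μ^p the variance function, and f̃ the saddlepoint approximation to the density.

```latex
% Extended quasi-likelihood for observations y_1, ..., y_n
Q^{+}(\mu, \phi; y)
  = -\frac{1}{2} \sum_{i=1}^{n}
    \left[ \frac{d(y_i, \mu_i)}{\phi} + \log\{ 2\pi\phi\, V(y_i) \} \right],
  \qquad V(\mu) = \mu^{p}.

% Saddlepoint approximation to a single density, on the log scale
\log \tilde{f}(y; \mu, \phi)
  = -\frac{1}{2} \log\{ 2\pi\phi\, V(y) \} - \frac{d(y, \mu)}{2\phi}.

% Summing the second expression over i reproduces the first
Q^{+}(\mu, \phi; y) = \sum_{i=1}^{n} \log \tilde{f}(y_i; \mu_i, \phi).
```

Maximizing the extended quasi-likelihood is therefore the same as maximizing a likelihood in which each density has been replaced by its saddlepoint approximation.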