To join this seminar via Zoom please click here.
If you would like to join the seminar and are not currently affiliated with ANU, please contact Kenneth Duru at firstname.lastname@example.org.
Markov models and Hidden Markov Models (HMMs) have pervaded a wide range of technical literature for the last 60 years. The first basic Markov model was inspired by Andrei Andreyevich Markov's analysis of Russian text (Pushkin's "Eugene Onegin"). The introduction of the HMM is generally attributed to Leonard Esau Baum in the 1960s, and the first real application was in speech processing in the 1970s.
In this seminar we will consider semi-Markov models. When a mathematical object is prefixed with "semi", this usually means that one or more of the defining conditions is relaxed, while some of the original character of the primary object is retained; well-known examples are the semi-norm and the semi-group. So what does "semi" mean for a Markov process? What do we keep and what do we relax? To address that question we must state precisely what we mean by a Markov process and how we define our process. We shall see that memoryless random variables are central to Markov processes; indeed, this property is implicitly encoded in a Markov process's transition matrix. This innocuous fact is both a strength (it brings computational simplicity) and a weakness. One such weakness is that calibrating/fitting an HMM to data is usually a maximum-likelihood task, so all an HMM can ever do is compute the "best"-fitting memoryless model, nothing more. In contrast, semi-Markov models allow arbitrary sojourn models while keeping some form of the convenient conditional independence known (loosely) as the "Markov property". Moreover, we construct our "new" model in such a way that it includes the classical Markov model as a degenerate special case: semi-Markov models offer a much richer class of infinite-state stochastic processes, and this class contains the standard Markov model.
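The memorylessness encoded in the transition matrix can be seen directly in the sojourn (holding) times: a self-transition probability forces a geometric sojourn distribution, whereas a semi-Markov chain may hold a state for a time drawn from any distribution. A minimal sketch (all function names and the example distributions are illustrative assumptions, not from the seminar):

```python
import random

def markov_sojourn(p_stay, rng):
    """Sojourn time in a state with self-transition probability p_stay.
    Memorylessness forces this to be geometric on {1, 2, ...}."""
    t = 1
    while rng.random() < p_stay:
        t += 1
    return t

def semi_markov_sojourn(sojourn_pmf, rng):
    """Semi-Markov holding time: drawn from an arbitrary pmf over
    {1, 2, ...}; no geometric constraint applies."""
    u, cum = rng.random(), 0.0
    for t, prob in enumerate(sojourn_pmf, start=1):
        cum += prob
        if u < cum:
            return t
    return len(sojourn_pmf)

rng = random.Random(0)
n = 100_000

# Markov case: p_stay = 0.8 gives Geometric(0.2), mean 1 / 0.2 = 5.
markov_times = [markov_sojourn(0.8, rng) for _ in range(n)]
print(sum(markov_times) / n)  # close to 5

# Semi-Markov case: a deterministic sojourn of exactly 3 steps,
# which no Markov chain's geometric sojourn can reproduce.
semi_times = [semi_markov_sojourn([0.0, 0.0, 1.0], rng) for _ in range(n)]
print(sum(semi_times) / n)
```

Recovering the Markov model as the degenerate special case then amounts to choosing the geometric pmf as the sojourn distribution.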
We will introduce and explain the basic notions of semi-Markov models and consider a first estimation task: deriving a recursive filter for a partially observed semi-Markov chain. The technique we will use to compute the state-estimation filter is the well-known change-of-probability-measure technique. Ultimately, we would like an EM algorithm for calibrating a hidden semi-Markov model (HsMM), but such a scheme can be shown to depend fundamentally on a set of recursive filters, or on filters and smoothers in offline data scenarios. Consequently, deriving a state-estimation recursive filter is a good place to start. In fact, we have two filters for semi-Markov models to discuss: one is joint work with Professor John van der Hoek (University of South Australia) and the other is joint work with Professor Robert Elliott (University of Calgary). Computational issues arise in both estimation schemes, but in different forms. In the filter with Prof. van der Hoek we must deal with a memory-management issue, as the exact filter needs its history. In contrast, the filter with Prof. Elliott takes a different approach via a lattice-based state space, which results in potentially large matrices that must be approximated to meet a column-stochastic requirement.
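For orientation, the classical recursive filter for a plain (geometric-sojourn) hidden Markov chain is the baseline that the semi-Markov filters above generalise. A minimal predict/update sketch, assuming illustrative matrices and discrete observation symbols (this is the standard HMM forward filter, not the seminar's semi-Markov construction):

```python
def hmm_filter(A, B, pi, obs):
    """Recursive state-estimation filter for a hidden Markov chain.
    A[i][j] = P(x_{k+1} = j | x_k = i)   (transition matrix)
    B[i][y] = P(y_k = y | x_k = i)       (observation likelihoods)
    pi      = prior distribution over states
    Returns the normalised posterior P(x_k | y_1..y_k) after each step."""
    n = len(pi)
    p = pi[:]
    history = []
    for y in obs:
        # predict: propagate the posterior through the transition matrix
        pred = [sum(p[i] * A[i][j] for i in range(n)) for j in range(n)]
        # update: weight by the observation likelihood, then normalise
        upd = [pred[j] * B[j][y] for j in range(n)]
        z = sum(upd)
        p = [u / z for u in upd]
        history.append(p)
    return history

# Illustrative two-state model with observation symbols {0, 1}.
A = [[0.9, 0.1], [0.2, 0.8]]
B = [[0.7, 0.3], [0.1, 0.9]]
posteriors = hmm_filter(A, B, [0.5, 0.5], [0, 0, 1])
print(posteriors[-1])
```

The semi-Markov filters replace the single transition-matrix prediction step with machinery that tracks sojourn information, which is precisely where the memory-management and large-matrix issues mentioned above arise.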