Abstracts
July 1 - 9.30-10.30
S.Amari
- Information Geometry of Singular Statistical Models
Information geometry studies manifolds of probability distributions, or statistical models. The intrinsic structure of regular finite-dimensional statistical models has been well investigated, and many applications have emerged in a wide range of fields such as information theory, control systems theory, optimization, neural networks, and belief propagation. Many of these models are hierarchical, in the sense that smaller models are included in larger models as submanifolds. Typical examples are multilayer perceptrons, ARMA time series models, and Gaussian mixtures. In such a model there exist critical regions corresponding to the smaller models, on which the parameters become unidentifiable and the Fisher metric degenerates. Geometrically, such models include algebraic singularities. The present talk analyzes the structure of singularities by using a simple model of Gaussian mixtures. Cusp-type singularities are found in this case, and the accuracy of parameter estimation is analyzed when the true distribution is close to a singularity. We also use a simple toy model (the cone model) to show some other properties of singularities and their effects on learning.
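The degeneracy can be seen in the smallest example of this kind; as an illustrative sketch (not taken from the talk), consider the two-component Gaussian mixture

```latex
p(x;\,t,\mu) \;=\; (1-t)\,\varphi(x) \;+\; t\,\varphi(x-\mu),
\qquad \varphi(x)=\frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}.
```

On the critical set {t = 0} ∪ {mu = 0} the density reduces to the standard normal phi(x) regardless of the other parameter, so the parameters are unidentifiable there; for instance at mu = 0 the score ∂p/∂t = phi(x - mu) - phi(x) vanishes identically, and the Fisher metric degenerates.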
|
July 3 - 11.40-12.30
N.Ay
- On the Geometry of Complexity: an Approach to Neural Information Processing
I am following the general concept that complexity should
somehow quantify the deviation of a composed system from being
the unrelated collection of its individual constituents.
Information geometry provides a powerful framework for a
mathematical elaboration of this concept. The aim of my talk
is to present analytical results on complex systems and
illustrate them by computer simulations. Applied to the field of neural networks, my approach leads to a generalized version of the infomax principle of Linsker.
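One standard way to make the "deviation from an unrelated collection of constituents" concrete, offered here as an illustrative sketch rather than the speaker's own measure, is the multi-information: the KL divergence from a joint distribution to the product of its marginals.

```python
import numpy as np

def multi_information(p):
    """KL divergence from the joint p(x, y) to the product of its marginals.

    Zero iff the two variables are independent, so it quantifies how far a
    composed system is from an unrelated collection of its constituents.
    """
    px = p.sum(axis=1, keepdims=True)   # marginal of the first variable
    py = p.sum(axis=0, keepdims=True)   # marginal of the second variable
    q = px * py                          # fully factorized reference model
    mask = p > 0                         # avoid log(0) on zero-probability cells
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

# Two fair bits, perfectly correlated: the complexity is log 2 nats.
coupled = np.array([[0.5, 0.0], [0.0, 0.5]])
# Two independent fair bits: the complexity is 0.
independent = np.full((2, 2), 0.25)
print(multi_information(coupled), multi_information(independent))
```

For the coupled pair this prints log 2 (about 0.693) against 0.0 for the independent pair.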
|
July 4 - 11.40-12.30
D.C. Brody, G.W. Gibbons
- Cone Geometry in Statistical Mechanics
The idea of a convex cone is a very simple one, but it has a surprisingly
large number of applications in
both mathematics and physics. This talk will cover an elementary
introduction to the geometry of convex
homogeneous cones. A natural connection to the information geometry of
statistical mechanics arises
in this context, which will also be discussed.
|
July 1 - 15.50-16.20
A. Cena
- Christoffel symbols of the alpha-connections on the alpha-bundles over the Exponential Statistical Manifolds.
Gibilisco and Pistone (1998) showed that the pretangent and tangent bundles over the Exponential Statistical Manifold are the natural domains on which to define, respectively, the mixture and the exponential connection in the non-parametric case. They then defined the infinite-dimensional version of the alpha-connections on a suitable family of vector bundles. In this context we evaluate the Christoffel symbols of the exponential and mixture connections. After studying the regularity of the sphere in the Lebesgue spaces and the natural connection on its tangent bundle, we are able to give the Christoffel symbols of the alpha-connections.
|
July 2 - 11.40-12.30
J.M. Corcuera, F. Giummolè
- Simultaneous prediction
In the present work the problem of prediction is considered in a multidimensional setting. Extending an idea presented in Barndorff-Nielsen and Cox (1996), a predictive density for the future multivariate random variable is proposed. This density has the form of an estimative density plus a correction term which is easily calculated. It gives simultaneous prediction regions with coverage error of smaller asymptotic order than the estimative density. Several examples with a simulation study are presented, showing how the proposed solution improves on the estimative one.
|
July 2 - 17.10-17.40
A. De Sanctis
- Exact asymptotics on Zoll surfaces
One of the procedures used to derive asymptotic expansions of the characteristic function is the method of stationary phase (Barndorff-Nielsen and Cox, 1989). Using Morse theory, this method requires that we locate the critical points of the original function and then approximate the characteristic function by certain sums depending on the values of the function and its higher derivatives at the critical points.
The problem of finding random variables for which the method of stationary phase produces the exact value of the characteristic function involves topological properties of the statistical manifold. For this reason, the spheres of even dimension are privileged with respect to those of odd dimension (Donald St. P. Richards, 1995).
We prove that the classical exactness result on the two-dimensional sphere also holds for particular perturbations of the metric, and hence of the random variable, the so-called "Zoll metrics".
|
July 1 - 11.50-12.40
S.Eguchi
- Information Geometry of Bregman Divergences
The class of Bregman divergences and its applications to statistical methods including PCA, ICA, Gaussian mixtures and so forth have been proposed. It is shown that this class offers a special structure on the information geometry, in contrast with that associated with the alpha-divergences. Within the class, one of the dual connections is always the mixture connection, which enables us to obtain easily the empirical form of the divergence. Thus the objective function to be optimised becomes a linear functional of the empirical distribution. This structure determines the statistical performance of the proposed methods. We also apply this discussion to classification problems. By using the dual form of the optimisation problem for the empirical Bregman distance over a linear combination of weak learners, we propose the class of U-boost methods, including AdaBoost, and investigate their performance structure from the statistical point of view.
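As a minimal illustration of the class (the generator and distributions below are hypothetical examples, not taken from the talk), the Bregman divergence of the negative Shannon entropy recovers the Kullback-Leibler divergence on the probability simplex:

```python
import numpy as np

def bregman(f, grad_f, x, y):
    """Bregman divergence D_f(x, y) = f(x) - f(y) - <grad f(y), x - y>."""
    return f(x) - f(y) - np.dot(grad_f(y), x - y)

# Generator: negative Shannon entropy (a standard example; any strictly
# convex differentiable f defines a member of the class).
f = lambda p: float(np.sum(p * np.log(p)))
grad_f = lambda p: np.log(p) + 1.0

p = np.array([0.2, 0.3, 0.5])
q = np.array([0.4, 0.4, 0.2])
kl = float(np.sum(p * np.log(p / q)))   # Kullback-Leibler divergence
print(bregman(f, grad_f, p, q), kl)     # the two values agree
```

The agreement is exact because on the simplex the linear terms of the Bregman expansion cancel, leaving sum p log(p/q).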
|
July 1 - 11.00-11.50
K. Fukumizu
- Singularities of statistical models: from the estimation viewpoint
It is known that some important statistical models such as finite mixture
models and multilayer neural networks are not necessarily smooth
manifolds. These models have singularities at the points corresponding to
density functions of a smaller size. In the Gaussian mixture model with
two components, for example, a point of the standard normal distribution
can be represented by high dimensional subsets of the parameter space, and
it is a singularity of the model if the model is considered in a
functional space. This work discusses singularities of a model from the
viewpoint of statistical estimation. If the true density is located at
such a singularity, the behavior of an estimator for a sample from the
density does not follow the standard theory. This problem has been known as unidentifiability of a parameter and has been studied extensively.
However, little has been clarified on the general asymptotics of the
maximum likelihood estimator (MLE) around a singularity. It has been
known that in some cases the likelihood ratio test statistics (LRTS)
diverges to infinity as the sample size goes to infinity, which shows a
clear difference from the ordinary chi-square asymptotics. This
divergence result implies also that the degree of freedom around a
singularity can be infinite, if it is measured by the dimensionality of
the fluctuation of the MLE around the singularity. The main results of this work concern the asymptotic order of the MLE for an i.i.d. sample of size n, assuming that the true density is located at a singularity. I focus in particular on nonlinear regression models and multilayer neural networks. First, as an extension of Hartigan's idea [1], a simple but useful sufficient condition for the divergence of the LRTS is derived from a geometric viewpoint, using the framework of locally conic models proposed by Dacunha-Castelle and Gassiat [2]. Next, a universal upper bound O_p(log n) on the LRTS is derived under the assumptions that the model is nonlinear regression with binary output or with Gaussian noise, that each regression function is bounded, and that the family of functions has finite Vapnik-Chervonenkis dimension.
Finally, these results are applied to multilayer perceptrons, one of the most successful neural network models, showing that the LRTS is of larger order than O_p(1) if the model has surplus hidden units to realize the true function. I also derive a log n lower bound in the case where the model has at least two surplus hidden units for the true function, which means the asymptotic order of the LRTS is exactly log n in such cases.
References
[1] Hartigan, J. A. (1985). A failure of likelihood asymptotics for normal mixtures. In Proceedings of the Berkeley Conference in Honor of Jerzy Neyman and Jack Kiefer, Vol. II.
|
July 1 - 17.40-18.10
D. P.K. Ghikas
- Killing Symmetries in Information Geometry
We address the question of the possible interpretation and usefulness of the existence of Killing symmetries of information manifolds, both classical and quantum. These symmetries are isometries under the action of Lie transport on the Fisher information metric. In the classical case we conjecture that they are related to the transformation models of Barndorff-Nielsen, while in the quantum case we expect them to be related to isoentropic transformations. As first results towards a general proof, we show that for the normal family the Killing symmetry is generated by sl(2,R), which is in fact the symmetry of the hyperbolic geometry of this family, while for two models of quantum information geometry, the SO(3) and SL(2,R) models of Nencka and Streater, these isometries give the isoentropic directions. Finally we discuss some possible applications of these results.
References :
- M.K. Murray, J.W. Rice: "Differential Geometry and Statistics"
- H. Nencka, R.F. Streater: "Information Geometry for some Lie Algebras", Infinite-Dimensional Analysis, Quantum Probability and Related Topics, 2, pp. 441-460, World Scientific.
- B. Schutz: "Geometrical Methods of Mathematical Physics"
|
July 5 - 11.40-12.30
P. Gibilisco, T. Isola
- Some open problems in noncommutative Information Geometry
We discuss some open problems in the theory of noncommutative alpha-connections and of noncommutative monotone metrics. As an example, we show how it is possible to calculate the geodesic distance associated with the Wigner-Yanase information by an approach that mimics the classical pull-back approach to Fisher information (note that only one other explicit formula of this kind exists, namely the formula for the geodesic distance of the Bures metric).
References
[1] P. Gibilisco, T. Isola, Monotone metrics on statistical manifolds of density matrices by geometry of noncommutative L^2-spaces, in Disordered and Complex Systems, eds. A.C. Coolen, L. Hughston, P. Sollich, R.F. Streater (AIP, 2001), 129-140.
[2] P. Gibilisco, T. Isola, A characterisation of Wigner-Yanase skew information among statistically monotone metrics, Infinite Dimensional Analysis, Quantum Probability and Related Topics, Vol. 4, No. 4 (2001), 553-557.
|
July 4 - 10.50-11.40
M.Grasselli, R.F. Streater
- Monotonicity, Duality and Uniqueness of the WYD Metrics
In a previous work, we found that the Bogoliubov-Kubo-Mori (BKM) metric is the only monotone metric on finite-dimensional quantum systems for which the exponential and mixture connections are mutually dual.
It is well established that both the $\pm$-connections and the BKM metric
are limiting cases of the more general class of $\alpha$-connections and
Wigner-Yanase-Dyson metrics.
The present paper extends the uniqueness result mentioned above to this more general class: namely, for each value of $\alpha \in (-1,1)$, we prove that the only monotone metrics for which the $\pm\alpha$-connections are mutually dual are scalar multiples of the Wigner-Yanase-Dyson metric.
|
July 5 - 10.50-11.40
H.Hasegawa
- On the Dual Geometry of Wigner-Yanase-Dyson Information Quantities
The Wigner-Yanase-Dyson conjecture appeared about forty years ago as a subject
of mathematical physics concerning the convexity of a matrix-valued
information quantity. Lieb gave an affirmative answer to the conjecture
in 1973 in the more general context of operator algebras. Another proof of
the so-called Wigner-Yanase-Dyson-Lieb concavity was given by Uhlmann in
1977. What interests us about this well-established subject is its
information-geometrical significance: it provides us with a typical
example of quantum Fisher information, and furthermore this example
carries Amari's concept of duality. In the present talk I wish to show
that this concept: (a) enables us to sharpen Petz's classification theorem
of monotone metrics; (b) characterizes the associated quasi-entropy; (c)
introduces naturally (in the framework of matrix analysis) a connection
that conforms to Amari's dual connection.
|
July 3 - 10.50-11.40
S.Ikeda, T. Tanaka, S. Amari
- Information Geometry of Turbo Codes and Low-density Parity-Check Codes
Since the proposal of turbo codes in 1993, many studies have appeared on these simple new error-correcting codes, which give a powerful and practical method for error correction. The essential point of turbo codes is their iterative decoding algorithm; however, the main properties of the decoding algorithm obtained so far are mostly empirical. The essence of turbo decoding has not been fully understood theoretically.
Beyond the experimental studies, a clue has been sought in other iterative methods closely related to turbo codes. One of these is another class of error-correcting codes called low-density parity-check (LDPC) codes, originally proposed by Gallager in the 1960s. Related ideas are found even in different fields, one in artificial intelligence and another in statistical physics. McEliece et al. showed that the turbo decoding algorithm is equivalent to belief propagation applied to a belief diagram with loops; MacKay showed that LDPC decoding is also equivalent to belief propagation; and Kabashima and Saad showed that the iterative process of the Bethe approximation in statistical physics is the same as that of belief propagation. However, the efficiency of these methods also remains something of a mystery, and they have not helped clarify the mathematical structure of turbo codes.
In this presentation, we focus on turbo and LDPC decoding and investigate the mathematical structure of the iterative decoding methods from the information-geometrical viewpoint. We first formulate the problem of error-correcting codes as an m-projection of a given distribution to an e-flat submanifold consisting of factorizable distributions. Since the exact m-projection is usually computationally intractable, it is approximated through iterative algorithms. We then express the turbo and LDPC decoding algorithms as combinations of m-projections and e-projections.
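A toy sketch of the m-projection step (the joint distribution below is an arbitrary example; the closed form used is the standard fact that the m-projection onto the factorizable distributions is the product of the marginals):

```python
import numpy as np

rng = np.random.default_rng(0)

def kl(p, q):
    """Kullback-Leibler divergence between strictly positive arrays."""
    return float(np.sum(p * np.log(p / q)))

# A joint distribution over two binary symbols (strictly positive).
p = np.array([[0.30, 0.10], [0.15, 0.45]])

# m-projection of p onto the e-flat family of factorizable distributions:
# known in closed form as the product of the marginals of p.
proj = np.outer(p.sum(axis=1), p.sum(axis=0))

# Any other factorizable q is at least as far from p in KL divergence.
for _ in range(200):
    a, b = rng.uniform(0.05, 0.95, size=2)
    q = np.outer([a, 1 - a], [b, 1 - b])
    assert kl(p, q) >= kl(p, proj) - 1e-12

print(kl(p, proj))  # the minimum divergence, attained at the projection
```

In the decoding problem the exact projection is intractable because the joint distribution lives on exponentially many codewords; the iterative algorithms approximate this minimization.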
|
July 4 - 16.40-17.30
A. Jencova
- Information geometry in the standard representation of matrix spaces
The algebra of operators acting on a Hilbert space is standardly
represented on the space W of Hilbert-Schmidt
operators. The aim of the present contribution is to show how (in finite
dimensions) the basic structures
of quantum information geometry are lifted to W. It was shown by Dittmann
and Uhlmann that the monotone
Riemannian metrics are related to certain real vector subspaces in W. We
show that there is a natural duality of such subspaces, which suggests a
duality of the corresponding metrics. We also introduce dual parallel
transports, related to the exponential and mixture connections. As
examples, we treat the smallest (Bures) and the largest monotone metric
and the smallest WYD metric. In these cases, we also show that the
corresponding one-dimensional exponential families are related to positive
cones in W.
|
July 2 - 14.30-15.20
P. Jupp
- Yoke Geometry in Parametric Inference
A basic structure in the differential-geometric approach to higher-order
statistical asymptotics is that of a yoke. The role of yoke geometry will
be illustrated by three topics:
(i) cubic modifications of score tests,
(ii) parameterisation-invariant versions of Wald tests,
(iii) modifications of likelihood functions.
|
July 2 - 10.50-11.40
F. Komaki
- Information Geometry of Statistical Prediction
Bayesian predictive distributions are investigated from the viewpoint of information geometry. The Kullback-Leibler divergence from the true distribution to a predictive distribution is adopted as the loss function. We show that there are many examples where the Bayesian predictive distribution based on the Jeffreys prior is dominated by Bayesian predictive distributions based on other priors. It is shown that the Bayesian predictive distribution based on the right invariant measure is the best invariant predictive distribution when a model has a group structure. Furthermore, we show that there exist shrinkage predictive distributions asymptotically dominating Bayesian predictive distributions based on the Jeffreys prior or other vague priors if the model manifold satisfies some differential geometric conditions. We show several examples where shrinkage predictive distributions exactly dominate Bayesian predictive distributions based on vague priors.
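A minimal worked example of how a Bayesian predictive distribution can dominate a plug-in one under KL loss, using assumptions not taken from the talk (normal location model with known unit variance and a flat prior; the risks below follow from the closed-form KL divergence between normals):

```python
import math

# Normal location model N(mu, 1), i.i.d. sample of size n, KL loss from
# the true density to the prediction.
# Estimative (plug-in) density N(xbar, 1):         risk E[KL] = 1/(2n)
# Bayes predictive under a flat prior,
# N(xbar, 1 + 1/n):                                risk E[KL] = log(1 + 1/n)/2
# Since log(1 + x) < x for x > 0, the predictive risk is always smaller.
for n in (1, 5, 50, 500):
    risk_estimative = 1.0 / (2 * n)
    risk_predictive = 0.5 * math.log(1.0 + 1.0 / n)
    assert risk_predictive < risk_estimative  # predictive dominates
    print(n, risk_estimative, risk_predictive)
```

The widened predictive variance 1 + 1/n accounts for estimation uncertainty in the mean, which is exactly what the plug-in density ignores.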
|
July 3 - 9.30-10.20
F.Matus, I. Csiszar
- Information Projections and MLE in Exponential Families Revisited
The goal of this contribution is to complete results available about I-projections, reverse I-projections, and their generalized versions, with focus on linear and exponential families. Pythagorean-like identities and inequalities are revisited and generalized, and generalized maximum likelihood estimates for exponential families are introduced. Regularity conditions that have frequently been imposed can be removed. The main tool is a new concept of extension of exponential families, based on our earlier results on convex cores of measures. Given a sample from an unknown distribution in an exponential family, the maximum likelihood estimate (MLE) exists if and only if the sample mean of the directional statistic belongs to the relative interior of the domain of the convex conjugate of the cumulant generating function. We show for each point of that domain that `approximate MLEs' converge to a unique member of an information closure of the exponential family. This follows from a new refinement of the Fenchel inequality. The MLE in that closure and in extensions of exponential families will be related to minimization of the information divergence in the second coordinate.
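The existence criterion can be sketched in the simplest exponential family; the Bernoulli example below is an illustration of the general statement, not an example taken from the contribution:

```python
import math

# Bernoulli as an exponential family: p(x; theta) ∝ exp(theta * x) for
# x in {0, 1}, with cumulant function psi(theta) = log(1 + e^theta) and
# mean parameter psi'(theta) = e^theta / (1 + e^theta) in (0, 1).
# The MLE solves psi'(theta) = sample mean, and exists exactly when the
# sample mean lies in the relative interior (0, 1) of the mean domain.
def mle_theta(xbar):
    if not 0.0 < xbar < 1.0:
        return None  # boundary sample mean: no MLE within the family
    return math.log(xbar / (1.0 - xbar))  # the logit inverts psi'

print(mle_theta(0.75))   # interior point: the MLE exists
print(mle_theta(1.0))    # all successes: the MLE does not exist
```

A sample of all successes drives theta to infinity; the limiting distribution (the point mass at 1) lies only in a closure or extension of the family, which is where the generalized MLE of the contribution lives.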
|
July 4 - 15.20-16.10
H. Nagaoka
- Quantum Information Geometry and Statistical Inference on Quantum States
Statistical inference problems such as parameter estimation and hypothesis
testing on quantum states bring strong motivations to differential
geometrical study of a quantum state space just as in the classical
information geometry. I would like to talk about how such geometrical
concepts as Riemannian metric, duality of affine connections, autoparallel
submanifold (geodesic in particular), etc. are related to several basic
problems concerning statistical inference on quantum states. The talk is
partly based on a joint work with Akio Fujiwara.
|
July 1 - 14.30-15.20
A.Ohara
- Dualistic Differential Geometry on Symmetric Cones and its Applications
We discuss dually flat structures on symmetric (i.e., homogeneous and self-dual) cones associated with Euclidean Jordan algebras. First we exploit relations between dual connections on symmetric cones and Euclidean Jordan algebras. In particular, we introduce the property called "doubly autoparallelism" and show how doubly autoparallel submanifolds are characterized by Jordan subalgebras. Next we define means on symmetric cones in an axiomatic way following the Kubo-Ando theory, and then discuss them from the viewpoint of the dualistic differential structure. We show that various means are expressed by the midpoints of geodesics with respect to the corresponding dualistic structures, by elucidating the relation between the geodesics and the operator monotone functions that generate means.
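In the one-dimensional symmetric cone of positive reals (a toy case chosen for illustration, not an example from the talk), the geodesic midpoints with respect to the two flat connections of the standard dually flat structure recover familiar means:

```python
import math

# On the cone of positive reals, the e-geodesic is linear in the affine
# coordinate theta = log x, while the m-geodesic is linear in eta = x.
def e_midpoint(a, b):
    """Midpoint of the e-geodesic: the geometric mean."""
    return math.exp(0.5 * (math.log(a) + math.log(b)))

def m_midpoint(a, b):
    """Midpoint of the m-geodesic: the arithmetic mean."""
    return 0.5 * (a + b)

a, b = 2.0, 8.0
print(e_midpoint(a, b), m_midpoint(a, b))  # 4.0 and 5.0
```

For positive definite matrices the analogous e-midpoint gives the matrix geometric mean, which is the kind of correspondence between means and geodesics the talk develops.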
|
July 5 - 14.30-15.30
D.Petz
- Covariance and Fisher information in quantum mechanics
Variance and Fisher information are ingredients of the Cramer-Rao
inequality. We regard Fisher information as a Riemannian metric on a
quantum statistical manifold and choose monotonicity under coarse graining
as the fundamental property of variance and Fisher information. In this
approach we show that there is a kind of dual one-to-one correspondence
between the candidates for the two concepts. We emphasize that Fisher informations are obtained from relative entropies as contrast functions on the state space, and argue that the scalar curvature might be interpreted as an uncertainty density on a statistical manifold.
|
July 1 - 15.20-15.50
G.Pistone
- Recent Results on Exponential Statistical Manifolds
In a paper published in 1995 with C. Sempi, a definition of the manifold structure of the positive probability densities was introduced. Such a manifold is modeled on Orlicz spaces with an exponential Young function and is based on the representation of probabilities as non-parametric exponential models. The idea was further developed in a paper with M.-P. Rogantin (1999), with improvements of the basic construction and a few results on the expectation parameterization on submanifolds. The theory still lacks important features, and the basic approach, e.g. the use of Banach spaces of Orlicz type as local models in the framework of standard manifold theory, has been questioned.
On the positive side, a number of new results have been derived recently and old results have been improved; it is expected that some of these improvements will be presented by the author during the meeting.
We will give a short presentation of the basic theory as we know it now, recalling what is already known and adding the new features, especially on the regularity of the change of coordinates, the cumulant function, submanifolds, and alternative structures. Other important chapters, e.g. the theory of the tangent bundle with submanifolds and connections, or the relation with information theory, will be presented by other authors.
|
July 4 - 14.30-15.20
M. B. Ruskai
- Monotone Metrics on Density Matrices
The distance between two density matrices in quantum information theory can be measured in many ways, including the trace norm, the relative entropy (which is not a true metric) and the Bures metric. All of these contract under completely positive, trace-preserving maps. We describe a general framework for monotone metrics using convex operator functions. Each function in the class defines a symmetric relative entropy pseudo-distance, a Riemannian metric on the tangent space, and a geodesic distance.
[Contraction of Relative Entropy, Riemannian Metrics and Related Measures of Distance between States on Non-commutative Probability Spaces (PDF)]
[Examples of monotone metrics and related quantities (PDF)]
|
July 2 - 9.30-10.20
A. Salvan, L. Pace
- The geometric structure of likelihood expansions in the presence of nuisance parameters
Stochastic expansions of likelihood quantities are usually derived through ordinary Taylor expansions, rearranging terms according to their asymptotic order. The most convenient form for such expansions involves the score function, the expected information, higher-order log-likelihood derivatives and their expectations. Expansions of this form are called expected/observed. If the quantity expanded is a tensor under a group of transformations on the parameter space, the entire contribution of a given asymptotic order to the expected/observed expansion will follow the same transformation law. When there are no nuisance parameters, explicit representations through appropriate tensors are available. In this contribution, we analyse the geometric structure of expected/observed likelihood expansions when nuisance parameters are present. We outline the derivation of likelihood quantities which behave as tensors under interest-respecting reparameterisations. This allows us to write the usual stochastic expansions of profile likelihood quantities in an explicitly tensorial form.
|
July 4 - 9.30-10.20
R. F. Streater
- Dual structures on a quantum information manifold.
We find conditions on a manifold M of states of the W*-algebra B(H) such that
both the (+1) and the (-1) affine structures are defined on the tangent
space. Sufficient conditions are that M consists of density operators D
such that D^p is of trace class for all p>0, and that the topology is such
that a neighbourhood of a point D(0) consists of all points D of M such
that there exist c, C such that 0 < cD < D(0) < CD holds. An equivalent
condition in terms of the Connes cocycle is derived.
|
July 1 - 16.50-17.40
J.Takeuchi, S. Amari
- Alpha-parallel prior and its properties
It is known that the Jeffreys prior plays an important role in statistical inference. In this paper, we generalize the Jeffreys prior from the point of view of information geometry, introducing a one-parameter family of prior distributions which we name alpha-parallel priors. The alpha-parallel prior is defined as the parallel volume element with respect to the alpha-connection and coincides with the Jeffreys prior when alpha=0. Further, we analyze the asymptotic behavior of various estimators, such as the projected Bayes estimator (the estimator obtained by projecting the Bayes predictive density onto the original class of distributions) and the MDL estimator, when the alpha-parallel prior is used. The correction term due to the alpha-prior is shown to be regulated by an invariant vector field of the statistical model. Although the Jeffreys prior always exists, the existence of the alpha-parallel prior with non-zero alpha is not always guaranteed. Hence we consider conditions for the existence of the alpha-parallel prior, elucidating the conjugate symmetry in a statistical model.
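For a concrete anchor point of the family, here is the alpha = 0 member (the Jeffreys prior) for a Bernoulli model; an illustrative sketch, not an example from the paper:

```python
import math

# The alpha = 0 member of the family is the Jeffreys prior, proportional
# to sqrt(det Fisher information). For a Bernoulli(p) model the Fisher
# information is 1/(p(1-p)), so the Jeffreys prior is the Beta(1/2, 1/2)
# density, with normalizing constant pi.
def jeffreys_unnormalized(p):
    fisher = 1.0 / (p * (1.0 - p))
    return math.sqrt(fisher)

# Midpoint-rule estimate of the normalizing constant over (0, 1).
n = 200_000
z = sum(jeffreys_unnormalized((k + 0.5) / n) / n for k in range(n))
print(z)  # numerically close to pi, since B(1/2, 1/2) = pi
```

The prior puts extra mass near p = 0 and p = 1, where the Fisher information diverges; the alpha-parallel priors of the talk deform this volume element using the alpha-connection instead of the Levi-Civita one.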
|
July 5 - 9.30-10.20
A.Uhlmann
- The Bures Distance and its Riemannian Metric
In the classification of monotone metrics by D. Petz, the Bures metric seems to be the simplest. I follow the way from the Bures distance to its metric form, and try to explain what is physically important. A few open problems will also be presented.
|
July 2 - 15.20-16.10
P. Vos
- Dual geometries in statistics
An overview of the role of dual geometries in statistics is given,
beginning with the classical result on the relative
information loss of a statistic expressed in terms of two curvatures.
This important result is used to illustrate
the various contributions dual geometry can make in statistics. Other
topics, including maximum likelihood estimation,
sufficiency, and generalized linear models, are also discussed.
|
July 2 - 16.40-17.10
J. Zhang
- Information Divergence and Convex Analysis
An observation is made that information divergences in various forms (Amari, 1985; Zhu and Rohwer, 1995; Kass and Vos, 1997) arise naturally from basic inequalities and duality in convex analysis. Some new families of divergences can be introduced that include the alpha-divergence as a special case.
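A minimal sketch of the mechanism (the generator below is an arbitrary convex function chosen for illustration, not one from the talk): the Fenchel-Young inequality f(x) + f*(y) >= xy turns any convex f into a nonnegative divergence.

```python
import math

# D(x, y) = f(x) + f*(y) - x*y is >= 0 by the Fenchel-Young inequality,
# with equality iff y = f'(x), so it behaves as a divergence.
f = lambda x: x * math.log(x) - x      # convex on x > 0
f_star = lambda y: math.exp(y)         # its convex conjugate

def D(x, y):
    return f(x) + f_star(y) - x * y

print(D(2.0, math.log(2.0)))  # ~0 (equality at y = f'(x) = log x)
print(D(2.0, 0.5) > 0.0)      # strictly positive elsewhere
```

With this particular generator, writing y = log q recovers the generalized KL term x log x - x + q - x log q, hinting at how divergence families fall out of the duality.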
|