Research topics
The following is a list of our current and past research domains. Our recent
focus is mainly on symbolic-statistical modeling and structural
learning (inductive logic programming and grammar induction).
For a full list of international publications,
please visit here.
- Symbolic-Statistical Modeling / Statistical Abduction
- Knowledge Acquisition / Structural Learning
- Inductive Logic Programming
- Quantified decision trees
- Knowledge Acquisition by Evolutionary Approach (genetic algorithm/programming)
- Grammar Induction
- Learning Automata
- Logic Programming
- First-Order Compiler
- Program Transformation
- Reinforcement Learning
- Multi-Agent Reinforcement Learning
- Relational Reinforcement Learning
Programming Language for Symbolic-Statistical Modeling
PRISM is a logic-based programming language for
symbolic-statistical modeling. It is designed as a general-purpose
device for statistical abduction, which integrates abductive reasoning
and statistical inference seamlessly. Thanks to PRISM's first-order
expressive power, we can easily mix symbolic (structural) knowledge
with statistical knowledge, and it is also easy to describe
various extensions of, and models beyond, hidden Markov models,
probabilistic context-free grammars, and Bayesian networks.
In addition, to support various statistical inference
tasks (e.g. probability computation or parameter estimation via the
EM algorithm), the programming system provides efficient built-in
routines. Currently, we are concentrating on statistical language
modeling and user modeling.
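To illustrate the kind of parameter estimation PRISM's built-in routines perform, here is a minimal Python sketch (not PRISM itself) of EM for a hypothetical two-coin mixture model: each trial is ten tosses of one of two coins with unknown biases, the coin choice is the hidden (abduced) variable, and only the head count per trial is observed. The data and initial guesses are made up for illustration.

```python
import math

# Hypothetical data: each trial is 10 tosses of one of two unknown coins;
# we observe only the number of heads per trial.
trials = [9, 8, 7, 2, 1, 9, 8, 2, 1, 1]
n_tosses = 10

def binom(k, n, p):
    """P(k heads in n tosses of a coin with head probability p)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

p_a, p_b = 0.6, 0.5          # initial guesses for the two coin biases
for _ in range(100):
    # E-step: posterior responsibility of coin A for each trial
    heads_a = heads_b = tot_a = tot_b = 0.0
    for k in trials:
        la, lb = binom(k, n_tosses, p_a), binom(k, n_tosses, p_b)
        w = la / (la + lb)
        heads_a += w * k;       tot_a += w * n_tosses
        heads_b += (1 - w) * k; tot_b += (1 - w) * n_tosses
    # M-step: re-estimate each bias from expected counts
    p_a, p_b = heads_a / tot_a, heads_b / tot_b

print(round(p_a, 2), round(p_b, 2))
```

In PRISM the hidden coin choice would be declared as a probabilistic switch and the same estimates obtained by the built-in EM learning routine, without writing the E- and M-steps by hand.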
Related publications
- Sato, T.: Inside-outside probability computation for belief propagation. Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI-2007), pp.2605-2610, 2007. [PDF]
- Sato, T.: A Generic Approach to EM Learning for Symbolic-Statistical Models. Proceedings of the 4th Learning Language in Logic Workshop (LLL05), 2005. [PDF]
- Sato, T., Kameya, Y. and Zhou, N.-F.: Generative modeling with failure in PRISM. Proceedings of the 19th International Joint Conference on Artificial Intelligence (IJCAI2005), pp.847-852, 2005. [PDF]
- Sato, T. and Kameya, Y.: A dynamic programming approach to parameter learning of generative models with failure. Proceedings of the ICML Workshop on Statistical Relational Learning and its Connection to the Other Fields (SRL2004), 2004. [PS] [PDF]
- Sato, T. and Kameya, Y.: Parameter Learning of Logic Programs for Symbolic-Statistical Modeling. Journal of Artificial Intelligence Research (JAIR), Vol.15, pp.391-454, 2001. [PS] [PS+.Z] [PDF]
- Sato, T. and Kameya, Y.: PRISM: A symbolic-statistical modeling language. Proceedings of the 15th International Joint Conference on Artificial Intelligence (IJCAI97), pp.1330-1335, 1997. [PS] [PS+gz] [PDF]
- Sato, T.: A statistical learning method for logic programs with distribution semantics. Proceedings of the 12th International Conference on Logic Programming (ICLP95), Tokyo, pp.715-729, 1995. Extended version: [PS] [PS+gz] [PDF]
Statistical Language Modeling
In recent years, statistical language models have been widely used to
analyze symbol sequences in statistical natural language
processing and in bioinformatics. In particular, hidden Markov models
(HMMs), probabilistic context-free grammars (PCFGs) and their
extensions (history-based models) are well known. In our laboratory,
the following studies have been done.
- We have found that, for parameter estimation of moderately
ambiguous PCFGs, the combination of a CFG parser and the
graphical EM algorithm (proposed during the research on PRISM) runs faster than the
Inside-Outside algorithm by one or more orders of magnitude. An
implementation based on the Earley parser, called Gemini, is
downloadable from here*.
- We have proposed a simple EM learning method for hierarchical
hidden Markov models.
- We have applied variational Bayesian learning to PCFGs and
derived an efficient learning algorithm. We have also found
empirically that the algorithm achieves better prediction than
the Inside-Outside algorithm.
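The inside probabilities at the heart of both the Inside-Outside algorithm and the graphical EM algorithm can be sketched in a few lines of Python. This is a CYK-style dynamic program for a hypothetical toy PCFG in Chomsky normal form (the grammar and sentence are made up for illustration):

```python
from collections import defaultdict

# Hypothetical toy PCFG in Chomsky normal form.
binary = {('S', 'NP', 'VP'): 1.0,     # (lhs, left, right) -> probability
          ('VP', 'V', 'NP'): 1.0}
lexical = {('NP', 'she'): 0.5,        # (lhs, word) -> probability
           ('NP', 'fish'): 0.5,
           ('V', 'eats'): 1.0}

def inside(words):
    """Inside probabilities: beta[(i, j, A)] = P(A =>* words[i:j])."""
    n = len(words)
    beta = defaultdict(float)
    for i, w in enumerate(words):                 # width-1 spans (lexical)
        for (a, word), p in lexical.items():
            if word == w:
                beta[(i, i + 1, a)] += p
    for span in range(2, n + 1):                  # wider spans, bottom-up
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):             # split point
                for (a, b, c), p in binary.items():
                    beta[(i, j, a)] += p * beta[(i, k, b)] * beta[(k, j, c)]
    return beta

b = inside(['she', 'eats', 'fish'])
print(b[(0, 3, 'S')])  # sentence probability: 0.5 * 1.0 * 0.5 = 0.25
```

The Inside-Outside algorithm repeats this computation (plus the symmetric outside pass) over the whole corpus at each EM iteration; the speedup reported above comes from separating parsing from the EM iterations, so the expensive chart construction is done only once.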
* The documentation of Gemini is available only in Japanese.
Related publications
- Kurihara, K. and Sato, T.: An Application of the Variational Bayesian Approach to Probabilistic Context-Free Grammars. IJCNLP-04 Workshop "Beyond Shallow Analyses", 2004. [PDF]
- Sato, T. and Zhou, N.-F.: A New Perspective of PRISM Relational Modeling. Proceedings of the IJCAI-03 Workshop on Learning Statistical Models from Relational Data (SRL2003), pp.133-139, 2003. [PS] [PS+gz] [PDF]
- Ueda, N. and Sato, T.: Simplified training algorithms for hierarchical hidden Markov models. Proceedings of the 4th International Conference on Discovery Science (DS2001), LNCS Vol.2226, pp.401-415, Springer, 2001.
- Sato, T., Abe, S., Kameya, Y. and Shirai, K.: A separate-and-learn approach to EM learning of PCFGs. Proceedings of the 6th Natural Language Processing Pacific Rim Symposium (NLPRS-2001), pp.255-262, 2001. Revised version: [PS] [PS+gz] [PDF]
- Kameya, Y., Mori, T. and Sato, T.: Efficient EM learning of probabilistic CFGs and their extensions by using WFSTs. Journal of Natural Language Processing, Vol.8, No.1, pp.49-84, 2001 (in Japanese). [PS] [PS+gz] [PDF]
- Ueda, N., Kameya, Y. and Sato, T.: A parameter updating of stochastic context-free grammars in linear time on the number of productions. Proceedings of the 1st IMC Workshop, 1999.
Inductive Logic Programming
Decision trees and association rules are powerful tools
for data mining, but they are limited to representing
propositional knowledge. In contrast, inductive logic programming
allows us to extract first-order rules from data. As a natural
first-order extension of decision trees and their divide-and-conquer
induction strategy, we proposed a learning method for quantified
decision trees (QDTs), in which existential and universal
quantifiers are treated mathematically.
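For contrast with QDTs, the propositional baseline they generalize can be sketched in Python: one information-gain split choice in the spirit of ID3, the core step of divide-and-conquer decision-tree induction. The attributes and data below are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def best_split(rows, labels, attrs):
    """Pick the attribute whose split maximizes information gain (ID3-style)."""
    base = entropy(labels)
    def gain(a):
        counts = Counter(r[a] for r in rows)
        remainder = sum(cnt / len(rows) *
                        entropy([l for r, l in zip(rows, labels) if r[a] == v])
                        for v, cnt in counts.items())
        return base - remainder
    return max(attrs, key=gain)

# Hypothetical propositional data: each row maps attribute -> value.
rows = [{'outlook': 'sunny', 'windy': True},
        {'outlook': 'sunny', 'windy': False},
        {'outlook': 'rain',  'windy': True},
        {'outlook': 'rain',  'windy': False}]
labels = ['no', 'yes', 'no', 'yes']

print(best_split(rows, labels, ['outlook', 'windy']))  # -> windy
```

Each row here is a fixed tuple of attribute values; QDTs lift this setting to first-order data, where a node may test a quantified formula (e.g. "there exists a part of the object with property P") rather than a single attribute value.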
Related publications
- Sato, T.: Program extraction from quantified decision trees. Proc. of Machine Intelligence 17, Bury St Edmunds, pp.78-80, 2000. [PS] [PS+gz] [PDF]
Last Update: Jan. 23, 2007