Unsupervised Learning of Probabilistic Context-Free Grammar using Iterative Biclustering (Extended Version)



Tu, Kewei and Honavar, Vasant (2008) Unsupervised Learning of Probabilistic Context-Free Grammar using Iterative Biclustering (Extended Version). Technical Report 572, Computer Science, Iowa State University.


This is the latest version of this eprint.

Abstract

This paper presents PCFG-BCL, an unsupervised algorithm that learns a probabilistic context-free grammar (PCFG) from positive samples. The algorithm acquires the rules of an unknown PCFG through iterative biclustering of bigrams in the training corpus. Our analysis shows that this procedure is greedy: each set of rules added to the grammar is the one that yields the largest increase in the posterior of the grammar given the training corpus. Experiments on several benchmark datasets show that PCFG-BCL is competitive with existing methods for unsupervised CFG learning.
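
As a rough illustration of the data that biclustering of bigrams operates on, the sketch below builds a bigram count table from a toy corpus and sums the counts covered by one candidate bicluster (a set of left symbols paired with a set of right symbols). The function names `bigram_table` and `bicluster_sum`, the toy corpus, and the use of a raw count sum are assumptions made for illustration; the sketch does not reproduce PCFG-BCL's posterior-based scoring or its rule-construction steps.

```python
from collections import Counter


def bigram_table(corpus):
    """Count adjacent symbol pairs (left, right) over all sentences."""
    counts = Counter()
    for sentence in corpus:
        for left, right in zip(sentence, sentence[1:]):
            counts[(left, right)] += 1
    return counts


def bicluster_sum(counts, rows, cols):
    """Total count of the bigrams covered by a candidate bicluster rows x cols.

    Hypothetical helper: a real learner would score the bicluster by the
    change in grammar posterior, not by this raw sum.
    """
    return sum(counts[(r, c)] for r in rows for c in cols)


# Toy corpus, invented for this example.
corpus = [
    ["the", "dog", "runs"],
    ["a", "dog", "runs"],
    ["the", "cat", "sleeps"],
    ["a", "cat", "runs"],
]

table = bigram_table(corpus)
covered = bicluster_sum(table, {"the", "a"}, {"dog", "cat"})
print(covered)
```

In this toy table every pairing of {the, a} with {dog, cat} occurs, which is the kind of dense block a biclustering step would look for; such a block could then be summarized by a new nonterminal whose two children expand to the row symbols and the column symbols respectively.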

Keywords: Probabilistic context-free grammar, grammar induction, grammar learning
Subjects: Computing Methodologies: ARTIFICIAL INTELLIGENCE: Learning (K.3.2)
ID code: 00000581
Deposited by: Kewei Tu on 15 July 2008
