
Learning Classifiers from Distributed, Ontology-Extended Data Sources



Caragea, Doina, Zhang, Jun, Pathak, Jyotishman and Honavar, Vasant (2005) Learning Classifiers from Distributed, Ontology-Extended Data Sources. Technical Report, Computer Science, Iowa State University.

Full text available as: Adobe PDF

Abstract

There is an urgent need for sound approaches to integrative and collaborative analysis of large, autonomous (and hence, inevitably semantically heterogeneous) data sources in several increasingly data-rich application domains. In this paper, we precisely formulate and solve the problem of learning classifiers from such data sources, in a setting where each data source has a hierarchical ontology associated with it and the user supplies semantic correspondences between the data source ontologies and a user ontology. The proposed approach yields algorithms for learning a broad class of classifiers (including Bayesian networks, decision trees, etc.) from semantically heterogeneous distributed data with strong performance guarantees relative to their centralized counterparts. We illustrate the application of the proposed approach in the case of learning Naive Bayes classifiers from distributed, ontology-extended data sources.
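As a rough illustration of the setting described in the abstract, the following minimal Python sketch shows one way Naive Bayes counts could be gathered from two data sources whose attribute values are first translated into user-ontology terms. It is not the report's algorithm: the source names, the value mappings, the toy rows, and the helper functions `local_counts`, `combine`, and `conditional` are all hypothetical, and the sketch assumes every source value maps to exactly one user-ontology term at the same level of abstraction.

```python
from collections import Counter

# Hypothetical correspondences from each source's values to user-ontology terms.
MAPPINGS = {
    "source_a": {"drizzle": "rain", "downpour": "rain", "clear": "sun"},
    "source_b": {"wet": "rain", "sunny": "sun"},
}

# Hypothetical toy data: (attribute value, class label) pairs held at each source.
DATA = {
    "source_a": [("drizzle", "stay"), ("downpour", "stay"), ("clear", "go")],
    "source_b": [("wet", "stay"), ("sunny", "go"), ("sunny", "go")],
}


def local_counts(rows, mapping):
    """Computed at one source: counts of (label, mapped value) and of labels."""
    joint, classes = Counter(), Counter()
    for value, label in rows:
        joint[(label, mapping[value])] += 1
        classes[label] += 1
    return joint, classes


def combine(stats):
    """Add up the per-source counts; no raw rows leave the sources."""
    joint, classes = Counter(), Counter()
    for source_joint, source_classes in stats:
        joint.update(source_joint)
        classes.update(source_classes)
    return joint, classes


def conditional(joint, classes, value, label, n_values, alpha=1.0):
    """Laplace-smoothed estimate of P(value | label) from combined counts."""
    return (joint[(label, value)] + alpha) / (classes[label] + alpha * n_values)


joint, classes = combine(
    local_counts(DATA[name], MAPPINGS[name]) for name in DATA
)
p_rain_given_stay = conditional(joint, classes, "rain", "stay", n_values=2)
```

Because the combined counts are plain sums, they match the counts that would be obtained by mapping all rows into the user ontology and pooling them in one place, which is the sense in which a distributed learner of this kind can agree with a centralized one.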

Keywords: Machine learning, knowledge discovery, semantically heterogeneous data, ontologies, attribute value taxonomies, naive Bayes algorithm.
Subjects: Computing Methodologies: ARTIFICIAL INTELLIGENCE
Computing Methodologies: ARTIFICIAL INTELLIGENCE: Learning (K.3.2)
ID code: 00000395
Deposited by: Doina Caragea on 13 December 2005


