Digital Library
The project runs a Digital Library, using the JeromeDL software developed by our Irish partner.
Click on the image to go to the digital library, or browse the recently added documents below.

Recently published resources (RSS)
The Semantic Web: a view on data integration, reasoning, human factors, collective intelligence a...
Third Asian Semantic Web Conference (ASWC 2008), Bangkok, Thailand, December 2008, Workshops Proceedings
Machine Learning-Based Keywords Extraction for Scientific Literature
With the currently growing interest in the Semantic Web, keyword/metadata extraction is coming to play an increasingly important role. Keyword extraction from documents is a complex task in natural language processing. Ideally this task calls for sophisticated semantic analysis, but the complexity of the problem makes current semantic analysis techniques insufficient. Machine learning methods can support the initial phases of keyword extraction and can thus improve the input to later semantic analysis phases. In this paper we propose a machine learning-based keyword extraction method for a given document domain, namely scientific literature. More specifically, the least-squares support vector machine is used as the machine learning method. The proposed method takes advantage of machine learning techniques and moves the complexity of the task into the process of learning from appropriate samples obtained within a domain. Preliminary experiments show that the proposed method is capable of extracting keywords from the domain of scientific literature with promising results.
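As an illustration of the approach, the sketch below casts keyword extraction as binary classification over candidate phrases. The feature set, the toy training data, and the use of scikit-learn's LinearSVC in place of a least-squares SVM are all assumptions made for the sake of a short runnable example.

    # Hedged sketch: keyword extraction as binary classification.
    # The paper uses a least-squares SVM; LinearSVC is substituted
    # here as a readily available stand-in.
    from sklearn.svm import LinearSVC

    # Toy training set: one feature vector per candidate phrase,
    # e.g. [term frequency, first-occurrence position, phrase length].
    # Real features and labels would come from annotated papers.
    X_train = [[12, 0.05, 2], [1, 0.90, 1], [8, 0.10, 3], [2, 0.75, 1]]
    y_train = [1, 0, 1, 0]          # 1 = keyword, 0 = not a keyword

    clf = LinearSVC()
    clf.fit(X_train, y_train)

    # Score unseen candidate phrases from a new document.
    candidates = {"semantic web": [9, 0.02, 2], "however": [3, 0.50, 1]}
    for phrase, feats in candidates.items():
        if clf.predict([feats])[0] == 1:
            print("keyword:", phrase)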
An enhanced text categorization method based on improved text frequency approach and mutual information
Text categorization plays an important role in data mining, and feature selection is its most important step. Focusing on feature selection, we present an improved text frequency method that filters out low-frequency features during data preprocessing, propose an improved mutual information algorithm for feature selection, and develop an improved tf.idf method for evaluating feature weights. The proposed method is applied to the benchmark test set Reuters-21578 Top 10 to examine its effectiveness. Numerical results show that the precision, recall, and F1 value of the proposed method are all superior to those of existing conventional methods.
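The pipeline can be pictured with a minimal sketch: filter out low-frequency terms, then rank the survivors by mutual information with a class. The toy corpus, the frequency threshold, and the pointwise form of MI are illustrative assumptions, not the paper's exact improved algorithms.

    import math
    from collections import Counter

    docs = [("sports", "the match ended in a draw"),
            ("sports", "the team won the match"),
            ("finance", "the market closed higher"),
            ("finance", "stocks and the market fell")]

    # Step 1: drop very low-frequency terms (the threshold of 2
    # is an assumed stand-in for the paper's improved filter).
    tf = Counter(w for _, text in docs for w in text.split())
    vocab = {w for w, c in tf.items() if c >= 2}

    # Step 2: rank remaining terms by pointwise mutual information
    # with a class: MI(t, c) = log( P(t, c) / (P(t) * P(c)) ).
    N = len(docs)
    def mi(term, cls):
        n_t = sum(1 for _, text in docs if term in text.split())
        n_c = sum(1 for c, _ in docs if c == cls)
        n_tc = sum(1 for c, text in docs
                   if c == cls and term in text.split())
        if n_tc == 0:
            return float("-inf")
        return math.log((n_tc / N) / ((n_t / N) * (n_c / N)))

    print(sorted(vocab, key=lambda t: mi(t, "finance"), reverse=True))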
Canonical Huffman code based full-text index
Full-text indices are data structures that can be used to find any substring of a given string. Many full-text indices require more space than the original string. In this paper, we introduce the canonical Huffman code to the wavelet tree of a string T[1..n]. Compared with a Huffman-code-based wavelet tree, no memory is needed to represent the shape of the wavelet tree; in the case of a large alphabet, this part of the memory is not negligible. The operations of the wavelet tree are also simpler and more efficient due to the canonical Huffman code. Based on the resulting structure, the multi-key rank and select functions can be performed using at most nH0 + |R|(lg lg n + lg n - lg |R|) + O(nH0) bits and in O(H0) time in the average case, where H0 is the zeroth-order empirical entropy of T. Finally, we present an efficient construction algorithm for this index, which is on-line and linear.
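The property being exploited is that canonical Huffman codewords are fully determined by their lengths, so the tree shape need not be stored. Below is a minimal sketch of the standard canonical code assignment, not the paper's index itself.

    # Canonical Huffman: only the code lengths need storing; the
    # codewords are reconstructed in sorted (length, symbol) order,
    # which is why the wavelet tree's shape needs no extra memory.
    def canonical_codes(lengths):
        # lengths: dict symbol -> Huffman code length in bits
        code, prev_len, codes = 0, 0, {}
        for sym, ln in sorted(lengths.items(), key=lambda kv: (kv[1], kv[0])):
            code <<= (ln - prev_len)    # left-shift when length grows
            codes[sym] = format(code, "0{}b".format(ln))
            code += 1
            prev_len = ln
        return codes

    print(canonical_codes({"a": 1, "b": 2, "c": 3, "d": 3}))
    # {'a': '0', 'b': '10', 'c': '110', 'd': '111'}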
From Web Directories to Ontologies: Natural Language Processing Challenges
Hierarchical classifications are used pervasively by humans as a means to organize their data and knowledge about the world. One of their main advantages is that the natural language labels used to describe their contents are easily understood by human users. At the same time, however, this is also one of their main disadvantages, as these same labels are ambiguous and very hard for software agents to reason about. This fact creates an insuperable hindrance to embedding classifications in the Semantic Web infrastructure. This paper presents an approach to converting classifications into lightweight ontologies, and it makes the following contributions: (i) it identifies the main NLP problems related to the conversion process and shows how they differ from the classical problems of NLP; (ii) it proposes heuristic solutions to these problems, which are especially effective in this domain; and (iii) it evaluates the proposed solutions by testing them on DMoz data.
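To make the conversion concrete, the sketch below turns a DMoz-style path into subclass axioms of a lightweight ontology; the naive label handling is a placeholder for the heuristic NLP solutions the paper actually proposes.

    # A web directory path read as a chain of increasingly specific
    # concepts. The hard NLP work (tokenizing, disambiguating and
    # formalizing each label) is reduced here to naive lowercasing.
    def path_to_axioms(path):
        labels = [p.replace("_", " ").lower() for p in path.split("/")]
        # Each label becomes a concept subsumed by its parent.
        return ["({}) subClassOf ({})".format(child, parent)
                for parent, child in zip(labels, labels[1:])]

    for axiom in path_to_axioms("Top/Arts/Music/Bands_and_Artists"):
        print(axiom)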
A Description Method of Ontology Change Management Using Pi-Calculus
In an open and dynamic environment, domain knowledge changes over time and ontologies evolve continually, due to changes in the application's domain or the user's requirements. Pi-calculus is a mobile process algebra that can be used for modeling concurrent and dynamic systems. Based on the pi-calculus, this paper proposes an ontology process model for solving the change implementation and propagation problems in the ontology evolution process. The solution is discussed at three levels: the change implementation of a single evolving ontology, push-based synchronization for change propagation in the evolution of multiple dependent ontologies within a single node, and pull-based synchronization for change propagation in the evolution of distributed ontologies.
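A rough intuition for the push- and pull-based levels, with plain Python objects standing in for pi-calculus processes (the class names and callback mechanics are assumptions, not the paper's model):

    # Push- vs pull-based propagation of ontology changes; the
    # channel passing of the calculus is approximated by callbacks.
    class Ontology:
        def __init__(self, name):
            self.name, self.version, self.dependents = name, 0, []
        def change(self):                    # change implemented locally...
            self.version += 1
            for dep in self.dependents:      # ...and pushed to dependents
                dep.on_source_changed(self)

    class DependentOntology(Ontology):
        def on_source_changed(self, src):    # push: synchronize at once
            print(self.name, "re-synchronized with", src.name, "v", src.version)

    class RemoteOntology(Ontology):
        def sync(self, src):                 # pull: ask source when needed
            if src.version > self.version:
                self.version = src.version
                print(self.name, "pulled", src.name, "v", src.version)

    core = Ontology("core")
    local = DependentOntology("local-view"); core.dependents.append(local)
    remote = RemoteOntology("remote-copy")
    core.change()          # push reaches local-view immediately
    remote.sync(core)      # remote-copy pulls on demand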
A Pi-Calculus Based Ontology Change Management
Based on the pi-calculus, this paper proposes an ontology process model for solving the change implementation and propagation problems of the ontology evolution process. The solution is discussed at three levels: the change implementation of a single evolving ontology, push-based synchronization for change propagation in the evolution of multiple dependent ontologies within a single node, and pull-based synchronization for change propagation in the evolution of distributed ontologies.
An Ontology Slicing Method Based on Ontology Definition Metamodel
Slicing is a method that extracts required segments from data according to some specified criteria. Program slicing and model slicing are two familiar slicing techniques. By introducing the slicing technique into the ontology engineering domain, this paper provides an ontology slicing method. In this method, an Ontology Dependency Graph (ODG) is derived from OMG's Ontology Definition Metamodel (ODM), and ontology slices are then generated automatically according to slicing criteria. The method has many applications where large-scale ontology processing is needed.
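The core of such a method can be sketched as reachability over the ODG: starting from the elements named in the slicing criterion, everything they depend on is kept. The toy graph below is invented for illustration.

    # Slicing as reachability over an Ontology Dependency Graph.
    from collections import deque

    odg = {  # element -> elements it depends on (invented example)
        "Car": ["Vehicle", "Engine"], "Engine": ["Part"],
        "Vehicle": ["Thing"], "Part": ["Thing"],
        "Boat": ["Vehicle"], "Thing": [],
    }

    def slice_ontology(odg, criterion):
        seen, todo = set(), deque(criterion)
        while todo:
            node = todo.popleft()
            if node not in seen:
                seen.add(node)
                todo.extend(odg.get(node, []))
        return seen

    print(sorted(slice_ontology(odg, {"Car"})))
    # ['Car', 'Engine', 'Part', 'Thing', 'Vehicle'] -- 'Boat' is sliced away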
Semantic Web Service Offer Discovery
Semantic Web Services are a research effort to automate the usage of Web services, a necessary component for the Semantic Web. Traditionally, Web service discovery depends on detailed formal semantic descriptions of the available services. Since a complete, detailed service description is not always feasible, the client software cannot select the best service offer for a given user goal from the static service descriptions alone. The client therefore needs to interact automatically with the discovered Web services to find information about the available concrete offers, after which it can select the best offer that will fulfill the user's goal. This paper shows when and why a complete semantic description is infeasible, defines the role and position of offer discovery, and suggests how it can be implemented and evaluated.
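One way to picture offer discovery is as a two-phase process: static descriptions narrow the candidate services, then each candidate is queried at run time for its concrete offers. The Service class, the goal format, and the price-based ranking below are assumptions, not the paper's design.

    # Two-phase discovery: static filter, then dynamic offer query.
    class Service:
        def __init__(self, name, categories, offers):
            self.name, self.categories, self._offers = name, categories, offers
        def get_offers(self, goal):         # dynamic interaction step
            return [o for o in self._offers if o["item"] == goal["item"]]

    services = [
        Service("BooksRUs", {"books"}, [{"item": "ISBN-1", "price": 12.0}]),
        Service("MediaHub", {"books", "music"}, [{"item": "ISBN-1", "price": 9.5}]),
        Service("TicketCo", {"events"}, []),
    ]
    goal = {"category": "books", "item": "ISBN-1"}

    # Phase 1: static descriptions alone cannot pick the best offer...
    candidates = [s for s in services if goal["category"] in s.categories]
    # Phase 2: ...so interact with each candidate and rank concrete offers.
    offers = [(o["price"], s.name)
              for s in candidates for o in s.get_offers(goal)]
    print(min(offers))   # (9.5, 'MediaHub')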
Efficient Discovery of Services Specified in Description Logics Languages
Semantic service descriptions are frequently given using expressive ontology languages based on description logics. The expressiveness of these languages, however, often causes problems for efficient service discovery, especially as increasing numbers of services become available in large organizations and on the Web. To remedy this problem, we propose an efficient service discovery/retrieval method grounded in a conceptual clustering approach, where services are specified in Description Logics as class definitions [10] and are retrieved by defining a class expression as a query and computing the subsumption relationship between the query and the available descriptions. We present a new conceptual clustering method that constructs tree indices for the clustered services, where the available descriptions are the leaf nodes and the inner nodes are intensional descriptions (generalizations) of their children. Matchmaking is performed by following the tree branches whose nodes might satisfy the query. The query answering time may improve considerably, since the number of retrieval steps may decrease from O(n) to O(log n) for concise queries. We also show that the proposed method is sound and complete.
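A simplified sketch of the tree-based matchmaking, with DL class descriptions flattened to attribute sets and generalizations approximated by attribute unions (the real method computes intensional DL descriptions and reasons with subsumption):

    # Tree-indexed matchmaking: a service matches when it offers
    # every attribute the query asks for. Inner nodes carry the
    # union of their children's attributes, so a whole branch is
    # pruned as soon as it cannot possibly satisfy the query.
    class Node:
        def __init__(self, attrs, children=(), name=None):
            self.attrs, self.children, self.name = attrs, list(children), name

    def search(node, query):
        if not query <= node.attrs:        # branch cannot satisfy query
            return []
        if not node.children:              # leaf: an actual description
            return [node.name]
        return [m for c in node.children for m in search(c, query)]

    leaves = [Node({"pay", "card", "eu"}, name="PayEU"),
              Node({"pay", "card", "us"}, name="PayUS"),
              Node({"ship", "eu"}, name="ShipEU")]
    pay = Node(leaves[0].attrs | leaves[1].attrs, leaves[:2])
    root = Node(pay.attrs | leaves[2].attrs, [pay, leaves[2]])

    print(search(root, {"pay", "eu"}))     # ['PayEU']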