Search Process
Query Interface
Entity Search
Query Construction
Ranking
 

The following figure illustrates a typical search process in SemSearch, which can be briefly described as User query -> formal queries -> results. Core components include: Query Interface, Semantic Entity Search Engine, Query Construction Engine, Query Ranking Engine, and Results Ranking Engine.


The proposed query language extends traditional keyword search languages by letting users explicitly specify the query scope and the relevant keywords. SemSearch proposes two heuristic terms to support this specification:
  • Symbol ":", which supports the specification of query scope.
  • Term "and", which indicates that the system would favour those results that are relevant to all the specified keywords.
A user query in SemSearch has the form "subject:keyword1 and keyword2 and keyword3 ...", which looks for data in the specified scope that have semantic relations to all the keywords.
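
The query form above can be split into its scope and keywords with a few lines of code. The following is a minimal sketch, not SemSearch's own parser: the names `UserQuery` and `parse_user_query` are hypothetical, and it assumes that "and" is matched case-insensitively and that a keyword may contain several words.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class UserQuery:
    subject: str         # the query scope, i.e. the text before ":"
    keywords: List[str]  # the keywords that were joined by "and"


def parse_user_query(text: str) -> UserQuery:
    """Parse 'subject:keyword1 and keyword2 ...' into scope and keywords."""
    # Split on the first ":" only; everything after it is the keyword part.
    subject, separator, rest = text.partition(":")
    if not separator:
        raise ValueError("expected 'subject:keyword1 and keyword2 ...'")
    # Split on the word "and" surrounded by whitespace (assumed case-insensitive).
    parts = re.split(r"\s+and\s+", rest.strip(), flags=re.IGNORECASE)
    keywords = [part.strip() for part in parts if part.strip()]
    if not subject.strip() or not keywords:
        raise ValueError("both a subject and at least one keyword are required")
    return UserQuery(subject.strip(), keywords)


print(parse_user_query("news:phd students and semantic web"))
# UserQuery(subject='news', keywords=['phd students', 'semantic web'])
```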

Based upon the Apache Lucene text search technology, semantic entity search is responsible for locating the right semantic entities in the pre-indexed underlying domain ontologies and semantic data repositories.

The search engine looks into the labels and short literal values of semantic entities to find matches for keywords. The matches are then used to interpret user queries by constructing formal queries, which can be executed by formal query engines.

As shown in the figure above, SemSearch indexes all the semantic entities contained in the specified domain ontologies and data repositories for the purpose of entity search.
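
To make the idea of matching keywords against labels and short literal values concrete, here is a small in-memory sketch. It is a stand-in for illustration only: it does not use Apache Lucene, and the `Entity` and `EntityIndex` names, the `kind` field, and the whitespace tokenisation are all assumptions rather than parts of SemSearch.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class Entity:
    uri: str
    kind: str                # assumed categories: "class", "property", "instance"
    texts: Tuple[str, ...]   # labels and short literal values of the entity


class EntityIndex:
    """Toy inverted index from lower-cased tokens to entities."""

    def __init__(self) -> None:
        self._postings: Dict[str, Set[Entity]] = defaultdict(set)

    def add(self, entity: Entity) -> None:
        # Index every whitespace-separated token of every label or literal.
        for text in entity.texts:
            for token in text.lower().split():
                self._postings[token].add(entity)

    def search(self, keyword: str) -> List[Entity]:
        # An entity matches when every token of the keyword occurs
        # somewhere in its indexed texts.
        tokens = keyword.lower().split()
        if not tokens:
            return []
        matches = set(self._postings.get(tokens[0], set()))
        for token in tokens[1:]:
            matches &= self._postings.get(token, set())
        return sorted(matches, key=lambda entity: entity.uri)
```

A real deployment would rely on the text search engine's own analysis and scoring; the sketch only shows the shape of the lookup from a keyword to candidate entities.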



Query construction takes semantic entity matches as input and produces formal queries. To facilitate the task, a set of templates has been developed to describe the common patterns that appear in real-world usage scenarios.
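
As an illustration of template-based construction, the sketch below fills one hypothetical template for the case where the subject keyword matched a class and another keyword matched an instance. The template text, the SPARQL-style syntax, and the `build_query` helper are assumptions made for this example; the source does not specify the actual templates or the formal query language.

```python
from string import Template

# Hypothetical template: find things of the matched class that are
# related, through any property, to the matched instance.
CLASS_INSTANCE_TEMPLATE = Template(
    "SELECT DISTINCT ?s WHERE { "
    "?s a <$subject_class> . "
    "?s ?p <$keyword_instance> . }"
)


def build_query(subject_class_uri: str, keyword_instance_uri: str) -> str:
    """Instantiate the template with the URIs of two entity matches."""
    return CLASS_INSTANCE_TEMPLATE.substitute(
        subject_class=subject_class_uri,
        keyword_instance=keyword_instance_uri,
    )


print(build_query("http://example.org/Student", "http://example.org/SemanticWeb"))
```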

An important issue in query construction is that a large number of formal queries may be constructed, as each keyword may have a number of matches. Such a large number significantly slows down the search process, as each query needs to be fed into the querying engine. A query ranking mechanism has been developed to address this issue. It computes and selects the most relevant queries from those that are constructed.
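
The growth in candidate queries is multiplicative: choosing one match per keyword yields as many combinations as the product of the match counts. The sketch below shows that count and a simple top-k cut. The scoring rule (the product of per-match scores) and the function names are assumptions for illustration; the source does not describe how SemSearch scores queries.

```python
import heapq
from itertools import product
from math import prod
from typing import List, Sequence, Tuple

Match = Tuple[str, float]  # (entity identifier, assumed match score)


def count_combinations(matches_per_keyword: Sequence[Sequence[Match]]) -> int:
    """Number of ways to pick exactly one match for every keyword."""
    return prod(len(matches) for matches in matches_per_keyword)


def top_k_combinations(
    matches_per_keyword: Sequence[Sequence[Match]], k: int
) -> List[Tuple[Match, ...]]:
    """Keep only the k highest-scoring combinations of matches."""
    return heapq.nlargest(
        k,
        product(*matches_per_keyword),
        # Assumed heuristic: a combination is as good as the product
        # of the scores of the matches it uses.
        key=lambda combination: prod(score for _, score in combination),
    )


matches = [
    [("A", 0.9), ("B", 0.5)],              # matches for keyword 1
    [("C", 0.8), ("D", 0.4), ("E", 0.1)],  # matches for keyword 2
]
print(count_combinations(matches))     # 6
print(top_k_combinations(matches, 2))
```

Only the combinations that survive the cut would then be turned into formal queries and sent to the query engine.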

Ranking serves as a backbone to drive the search process. More specifically, two types of ranking mechanisms have been developed:

  • Query ranking is used to determine the relevance of the derived formal queries, thus enabling the filtering out of the irrelevant ones and speeding up the search process.
  • Results ranking is developed to compute the significance of the results retrieved from the formal queries selected by the query ranking step. This is crucial for helping end users quickly locate the right information, especially in large-scale data environments where one query may result in a large number of hits.
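
One simple way to picture the second step is to let every result accumulate the scores of the selected queries that returned it. This is a sketch under that assumption only: the aggregation rule and the `rank_results` name are hypothetical, and the source does not state how SemSearch computes the significance of a result.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def rank_results(
    query_hits: Iterable[Tuple[float, Iterable[str]]]
) -> List[Tuple[str, float]]:
    """Rank result identifiers by the summed scores of the queries that hit them.

    query_hits holds one (query_score, result_ids) pair per executed query.
    """
    totals: Dict[str, float] = defaultdict(float)
    for query_score, result_ids in query_hits:
        # Count each result once per query, even if it is listed twice.
        for result_id in set(result_ids):
            totals[result_id] += query_score
    # Highest total first; ties are broken by identifier for stable output.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


ranked = rank_results([
    (0.72, ["r1", "r2"]),
    (0.40, ["r2", "r3"]),
])
print([result_id for result_id, _ in ranked])  # ['r2', 'r1', 'r3']
```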