Power Aqua Ok-Evaluation : Knowledge Media Institute : The Open University

The evaluation of PowerAqua as a standalone system focuses on its capability to answer queries by relying on information provided by multiple ontologies. As such, the evaluation primarily assesses the mapping capabilities of the system (i.e., its ability to map a user query into ontological triples in real time) rather than its linguistic coverage or its merging and ranking capabilities, which are still quite limited in this version.

The queries used in this evaluation were selected so that they can be answered using data provided by at least one ontology. Precision is the percentage of correctly answered questions in a given corpus of questions. An answer is judged correct with respect to a query over an ontology or set of ontologies: to produce a correct answer, PowerAqua has to align the vocabulary of the query with that of the answering ontologies. A valid answer is therefore one that is correct in the ontology world. PowerAqua fails to give an answer if the knowledge is in the ontology (or ontologies) but it cannot find it. A conceptual failure (the knowledge is not in the ontology) is not counted as a PowerAqua failure, because the ontology does not cover the information needed to map all the terms or relations in the user query. Moreover, a correct answer, corresponding to a complete ontology-compliant representation, may return no results at all if the ontology is not populated.

Total recall cannot be measured in this open scenario, as we do not know in advance how many ontologies can potentially answer the user's query. Recall is therefore measured in terms of whether or not at least one answer is obtained.

We have tested our prototype on a collection of ontologies stored in online repositories and indexed by PowerMap. The collection includes high-level ontologies, such as ATO, TAP, SUMO and DOLCE, and very large ontologies such as SWETO_DBLP or SWETO [1], with around 800,000 entities and 1,600,000 relations. In total, we collected around 2 GB of data stored in 130 Sesame repositories, accessible through http://kmi-web07.open.ac.uk:8080/sesame. (The running times are approximate and can vary depending on the load on the server and on network connections.)

The questions used during the evaluation were selected as follows. We asked seven members of KMi, familiar with the Semantic Web and ontologies, to generate questions for the system that were covered by at least one ontology in our collection. We pointed out to our colleagues that the system is limited in handling temporal information, and therefore asked them not to design questions that required temporal reasoning (e.g. today, last month, between 2004 and 2005, last year, etc.). Because no quality control was carried out on the questions, they may contain spelling mistakes and even grammatical errors. We also pointed out that PowerAqua is not a conversational system: each query is resolved on its own, with no reference to previous queries. We collected a total of 69 questions, listed below.

The current version of PowerAqua correctly answers 69.5% of the queries (48 of 69). The average time is 15 secs per query. All the repositories used in this evaluation are online at: Sesame repositories (no longer available)
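As a quick check on the headline figure, the outcomes in the question listing that follows can be tallied by hand and turned into a precision value. The sketch below is only illustrative: the counts were transcribed from the listing, and the labels and variable names are our own choices, not part of PowerAqua.

```python
from collections import Counter

# Outcome counts transcribed by hand from the question listing in this report.
# The labels mirror the result lines; PowerMap sub-categories are merged.
outcomes = Counter({
    "OK": 48,
    "PowerMap failure": 13,
    "Linguistic Component failure": 5,
    "Triple Similarity Service failure": 3,
})

total = sum(outcomes.values())

# Precision as defined above: correctly answered questions over all questions.
precision = outcomes["OK"] / total

print(f"{outcomes['OK']} of {total} questions answered correctly")
print(f"precision = {precision:.4f}")

# Share of each failure type over the whole corpus.
for label, count in outcomes.most_common():
    if label != "OK":
        print(f"{label}: {count} ({count / total:.1%})")
```

The ratio 48/69 is roughly 0.6957, which is consistent with the reported 69.5% if that figure was truncated rather than rounded.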

Questions

Q1: Give me all papers authored by Enrico Motta in 2007
OK:
23.305 secs.

Q2: What is the title of the paper authored by Enrico Motta in 2007?
Linguistic Component failure:
Out of coverage.

Q3: Give me all albums of Metallica
OK:
1.817 secs

Q4: Which are the albums of the rock group Metallica?
OK:
18.428 secs

Q5: Give me all californian dry wines.
PowerMap failure (ontology discovery):
It fails to map "Californian" to "CaliforniaRegion", and therefore returns all dry wines as the answer.

Q6: Which californian wines are dry?
OK:
6.123 secs

Q7: Which religion does Easter belong to?
OK:
5.337 secs

Q8: Which are the main fasting periods in Islam?
OK:
12 secs

Q9: Is there a railway connection between Amsterdam and Berlin?
Linguistic Component failure:
Out of scope

Q10: What are the main rivers of Russia?
OK:
6.108 secs

Q11: Which russian rivers flow into the Black Sea?
OK:
8.867 secs

Q12: Which prizes have been won by Laura Linney?
OK:
9.302 secs

Q13: What are the symptoms of Parkinson?
OK:
4.916 secs

Q14: Who stars in Bruce Almighty?
OK:
12.89 secs

Q15: Who presented a poster at ESWC 2006?
Triple Similarity Service failure:
The relevant ontology has no ontological element representing the event "ESWC 2006", because the ontology as a whole is assumed to be about that event.

Q16: What is the capital of Turkey?
OK:
14.19 secs (Ranking is not accurate: in the main ranked triple, "capital" returns the whole list of local capitals as an answer. This is because PowerMap identifies both an exact mapping, "Capital", and an approximate one, "CountryCapital". It then rules out the approximate mapping, which is the one that would have led to the correct answer; the exact mapping "Capital" generates regional capital cities instead.)

Q17: Who are the people working in the same place of Paolo Buquet?
Linguistic Component failure:
out of coverage

Q18: Give me all the articles written by people from KMi.
PowerMap failure (mapping discovery):
The term "people" is not mapped to "person" in the relevant ontology. Moreover, the term "KMi" does not appear anywhere as an alternative name for "knowledge media institute", and therefore it cannot be found.

Q19: Give me the list of restaurants which provide italian food in San Francisco
OK:
28.414 secs

Q20: Which restaurants are located in San Pablo Ave?
OK:
22.287 secs

Q21: Which cities are located in the region of Sacramento Area?
OK:
13.461 secs

Q22: What is the apex lab??
OK:
0.501 secs

Q23: Who believe in the apocalypse?
OK:
12.016 secs

Q24: What is skoda doing?
Linguistic Component failure:
Query not correctly classified; it can be reformulated as "What is skoda?".

Q25: Where are Sauternes produced?
PowerMap failure (filtering heuristics):
The system correctly maps the linguistic term "Sauternes" to the wine "Sauterne". However, this entity is not related to a region, leading to a failure at the next stage of the process. PowerMap had in fact identified the mapping that would have led to the answer ("SauterneRegion", which is located in "Bordeaux"), but this mapping was discarded because PowerMap considered it less likely to be correct than the exact mapping to "Sauterne".

Q26: Give me the papers written by Marta Sabou.
OK:
3.822 secs (However, it cannot find all relevant ontologies because the term "papers" is not mapped to the entity "publication".)

Q27: which organization employs Enrico Motta?
OK:
10.785 secs

Q28: where is Odessa?
OK:
58.301 secs

Q29: which russian cities are close to the Black Sea?
OK:
15.533 secs

Q30: give me oil industries in Russia.
OK:
4.066 secs

Q31: Which sea is close to Volgograd?
OK:
8.782 secs

Q32: Name the president of Russia.
OK:
3.491 secs

Q33: which countries are members of the EU?
OK:
27.855 secs

Q34: What is the religion in Russia?
OK:
3.816 secs

Q35: Which sea does the russian rivers flow to?
OK:
79.197 secs

Q36: what is activated by the Lacasse enzyme?
OK:
12.896 secs

Q37: What enzymes are activated by adrenaline?
OK:
5.212 secs

Q38: what enzymes are used in wine making?
OK:
7.391 secs

Q39: give me the actors in "the break up"
OK:
6.013 secs

Q40: Give me fishes in salt water
OK:
7.606 secs

Q41: Give me the main companies in India.
OK:
23.646 secs

Q42: What is the birthplace of Albert Einstein
PowerMap failure (ontology discovery):
It fails to map "birthplace", and it splits the compound "Albert Einstein".

Q43: Show me museums in Chicago.
OK:
5.075 secs

Q44: Give me types of birds.
OK:
0.647 secs

Q45: In which country is Mizoguchi?
Triple Similarity Service failure:
It cannot chain the fact that Riichiro Mizoguchi is affiliated to Osaka University (from ontology 1) with the fact that Osaka is in Japan (from ontology 2). PowerAqua cannot do a "double cross-ontology jump" to link two linguistic terms within the same linguistic triple.

Q46: Find all rivers in Asia
OK:
15.501 secs (partial answer: for efficiency reasons, because there is a direct relation between rivers and Asia in the ontology, it does not look for the indirect relation river-country-Asia, which would have produced more answers within the same ontology)

Q47: Find all Asian cities
PowerMap failure (ontology discovery):
It fails to map "Asian" to "Asia". The query can be reformulated as "find me cities in Asia".

Q48: Which universities are in Pennsylvania?
OK:
10.52 secs

Q49: Who won the best actress award in the toronto film festival?
OK:
57.246 secs

Q50: Which desserts contain fruits and pastry?
PowerMap failure (ontology discovery):
It cannot map "dessert" to "dessertDishes" and "fruit" to "fruitDishes".

Q51: Are there any rivers in Russia that flow to Caspian and Black Sea?
Linguistic Component failure:
Out of coverage.

Q52: What ways are there for weight management?
OK:
6.311 secs

Q53: What kinds of brain tumour are there?
PowerMap failure (ontology discovery):
It cannot find the literal "Brain Tumor SY NCI", which is a synonym of the class "Brain Neoplasm".

Q54: Where is Lake Erie?
OK:
14.448 secs

Q55: Which islands belong to Spain?
OK:
40.61 secs

Q56: List some Spanish islands.
PowerMap failure (ontology discovery):
It cannot map "Spanish" to "Spain". The query can be reformulated as "list me some islands in Spain".

Q57: Which islands are situated in the Mediterranean sea?
OK:
12.44 secs

Q58: Which Spanish islands lie in the Mediterranean sea?
OK:
8.12 secs

Q59: How many airports exist in Canada?
OK:
17.839 secs

Q60: Which terrorist organization performed attacks in London?
PowerMap failure (filtering heuristics):
The ontology is not well modelled (redundant terms are not connected to each other). The literal "London, United Kingdom" is discarded in favour of the exact instance mapping "London", which is related neither to the instance "United Kingdom" nor to the literal "London, United Kingdom"; these last two ontological terms are the only ones that link to the class "terrorist organization".

Q61: Which are the main attacks that took place in the United Kingdom?
OK:
35.729 secs

Q62: What are the terrorist organizations that are active in Spain?
OK:
12.077 secs

Q63: What type of buildings exist?
PowerMap failure (ontology discovery):
It does not find the term "building" in the SWETO ontology.

Q64: Which RBA banks are situated in Switzerland?
OK:
24.252 secs

Q65: What are the bank types that exist in Switzerland?
OK:
5.789 secs

Q66: How many tutorials were given at iswc-aswc2007?
PowerMap failure (ontology discovery):
It cannot find "tutorialEvent" as a mapping for "tutorials". The query can be reformulated as "how many tutorial events were given at iswc-aswc 2007?" (in this case it only maps the term "2007", which is the local name of the entity "/iswc-aswc/2007", which has no label).

Q67: How can you treat acne?
PowerMap failure (ontology filtering):
It finds the class "acne", which is not connected to any other entity, while it discards the approximate mapping "acneTreatment" that would have led to the answer.

Q68: What drugs can be used for reducing fever?
Triple Similarity Service failure:
It tries to find mappings for the terms “drugs”, “reducing” and “fever”, while the answer is contained in a single class, namely “FeverReducingDrug”.

Q69: Which drug can treat colds and reduce headache?
PowerMap failure:
It cannot map the term “drug” to the class “drugMedicine” and the term “colds” to the class “coldCoughCareProduct” to obtain, for example, (ibuprofen, is-a, coldCoughCareProduct) (ibuprofen, is-a, drugMedicine). It does map “cold” to “CommonCold”, but the class “CommonCold” is not connected to “drugMedicine” or to “coldCoughCareProduct”.

Many of PowerAqua's failures occur because the relevant mappings cannot be found or, when they are found, are discarded by PowerMap's filtering heuristics, which are a compromise between good performance and recall (exploring all possible mappings that can lead to a solution). In many cases these errors are also the consequence of badly modelled ontologies (ontologies with redundant terms that are not connected to each other). The linguistic coverage should also be extended by augmenting the grammars or the types of queries PowerAqua can understand.
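The trade-off behind the filtering failures can be illustrated with a toy sketch modelled on Q67. This is not PowerMap's implementation: the `Mapping` class, the `connected` flag and both functions are hypothetical names introduced here, and the "keep exact mappings only" rule is a deliberate simplification of whatever heuristics PowerMap actually applies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    term: str        # term taken from the user query
    entity: str      # candidate ontology entity
    exact: bool      # True if the entity name matches the term exactly
    connected: bool  # True if the entity links to the rest of the query


def filter_exact_first(candidates):
    """Toy heuristic: if any exact mapping exists, drop the approximate ones."""
    exact = [m for m in candidates if m.exact]
    return exact if exact else list(candidates)


def answerable(mappings):
    """A query can be answered only if some surviving mapping is connected."""
    return any(m.connected for m in mappings)


# Candidates for the term "acne" in Q67, as described in the listing:
# the exact class is isolated, the approximate one leads to the answer.
candidates = [
    Mapping("acne", "acne", exact=True, connected=False),
    Mapping("acne", "acneTreatment", exact=False, connected=True),
]

kept = filter_exact_first(candidates)
print([m.entity for m in kept])                   # ['acne']
print(answerable(kept), answerable(candidates))   # False True
```

Filtering reduces the number of mappings to explore, but here it removes the only candidate that leads to an answer; keeping every candidate would recover it at the cost of a larger search.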