These experiments show the performance of PowerAqua with the Watson semantic search engine. All queries produce an answer in one or more ontologies in Watson. Recall cannot be measured in this open scenario; we therefore measure, on the aggregated results, precision (the fraction of all results that are valid), precision@1 (the fraction of the results ranked in first position that are valid), and recall@1 (the fraction of all valid results that are ranked in first position). The ranking criterion used here is the combination ranking (based on both confidence and popularity measures; it takes the highest score or any score > 3). The precision and recall of the fusion algorithm have already been measured in evaluation 3 (94% precision and 93% recall). These experiments were performed in November 2010; because of the dynamic nature of the Watson search engine, they may not be reproducible.
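
The three measures can be illustrated with a minimal Python sketch. The function name `evaluate` and the answer identifiers `a0` to `a19` are hypothetical; only the counts are taken from the "What borders Croatia?" query in this report (20 answers, 6 valid, 1 answer ranked first).

```python
def evaluate(answers, valid, ranked_first):
    """Return each measure as an unreduced (numerator, denominator) pair.

    answers      -- every answer returned for the query
    valid        -- the answers judged to be correct
    ranked_first -- the answers placed in the first rank position
    """
    answers, valid, ranked_first = set(answers), set(valid), set(ranked_first)
    valid_returned = answers & valid      # valid answers among all results
    valid_first = ranked_first & valid    # valid answers in first position
    return {
        "precision": (len(valid_returned), len(answers)),
        "precision@1": (len(valid_first), len(ranked_first)),
        "recall@1": (len(valid_first), len(valid_returned)),
    }


# Hypothetical identifiers standing in for the 20 answers of one query.
answers = [f"a{i}" for i in range(20)]
valid = answers[:6]          # 6 of the 20 answers are valid
ranked_first = answers[:1]   # a single valid answer is ranked first

scores = evaluate(answers, valid, ranked_first)
for name, (num, den) in scores.items():
    print(f"{name} = {num}/{den}")
```

Returning pairs rather than floats keeps the fractions in the same unreduced form used in the per-query results (for example 6/20 rather than 0.3).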


Which russian rivers flow into the Black sea?

1 answer from 2 semantic sources (Dnepr)
precision = 1/1, precision@1 = 1/1, recall@1 = 1/1
21.6 secs for mapping the onto-triples, 0 secs for fusion.

Which animals are reptiles?

13 answers from 12 sources (Aligator, Crocodile, Gecko, Lizard, Snake, Tortoise_Reptile, Turtle, etc.)
precision = 13/13, precision@1 = 13/13, recall@1 = 13/13
51.7 secs for mapping the onto-triples, 15.5 secs for fusion.

Where is kazakhstan?

19 answers from 13 sources (Asia and Kazakhstan itself are ranked first, followed by countries, cities or seas that border Kazakhstan; we only consider "Asia" as the valid precise answer, although some of the other answers, such as Caspian_sea, Turkmenistan, Russia and People's Republic of China, are not strictly incorrect).
precision = 1/19, precision@1 = 1/2, recall@1 = 1/1
45.5 secs for mapping the onto-triples, 4.3 secs for fusion.

What borders Croatia?

20 answers from 19 sources (Europe, Bosnia_and_Herzegovina, Hungary, Montenegro, Serbia, Slovenia...and noisy answers such as Zagreb, war-and-revolution and duplicated terms for Croatia).
precision = 6/20, precision@1 = 1/1, recall@1 = 1/6
11.0 secs for mapping the onto-triples, 7.4 secs for fusion.

where is islam practiced?

20 answers from 10 sources (iran, iraq, christians-and-christianity, japan, mosques, new-york-city, nigeria, etc.).
precision = 6/20, precision@1 = 6/20, recall@1 = 6/6
14.7 secs for mapping the onto-triples, 22.2 secs for fusion.

who knows Enrico Motta?

24 answers from 27 sources (Barry_Norton, Carlos_Pedrinaci, Alan_Ruttenberg, Deborah_McGuinness, Denny_Vrandecic, Diana_Maynard, York_Sure, Stefan_Decker, etc.).
precision = 24/24, precision@1 = 2/2, recall@1 = 2/24
88.9 secs for mapping the onto-triples, 9.3 secs for fusion.

who has an interest in ontology evaluation?

2 answers from 11 sources (York_Sure, Denny_Vrandecic; all the sources are in fact from ontoworld.org or semanticweb.org, e.g.: http://ontoworld.org/index.php/Special:ExportRDF/ESWC2006?xmlmime=rdf, http://ontoworld.org/index.php/Special:ExportRDF/OntoClean?xmlmime=rdf, etc.).
precision = 2/2, precision@1 = 2/2, recall@1 = 2/2
31.2 secs for mapping the onto-triples, 2.76 secs for fusion.

who works at AIFB?

31 answers from 37 sources (AIFB, Denny_Vrandecic, Anupriya_Ankolekar, Peter_Haase, Andreas_Eberhart, etc.).
precision = 30/31, precision@1 = 3/4, recall@1 = 3/30
74.0 secs for mapping the onto-triples, 28.2 secs for fusion.

which organizations are working on the Semantic Web?

17 answers from 13 sources (AIFB, DERI_Galway, DERI_Innsbruck, Salzburg_Research, etc.).
precision = 17/17, precision@1 = 3/3, recall@1 = 3/17
31.2 secs for mapping the onto-triples, 8.0 secs for fusion.

who attended both at ISWC2006 and ESWC2006?

72 answers from 53 sources (Abraham_Bernstein, Boris_Motik, Dieter_Fensek, Steffen_Staab, Tom_Heath, Aldo_Gangemi, etc.).
precision = 72/72, precision@1 = 5/5, recall@1 = 5/72
169.3 secs for mapping the onto-triples, 8.0 secs for fusion.

Which countries are in Europe?

9 answers from 3 sources (Belgium, Czech, France, Germany, Holland, EastEurope, etc.).
precision = 7/9, precision@1 = 7/7, recall@1 = 7/7
12.5 secs for mapping the onto-triples, 8.6 secs for fusion.

does Jones Marion takes steroids?

1 answer from 2 sources.
precision = 1/1, precision@1 = 1/1, recall@1 = 1/1
7.1 secs for mapping the onto-triples, 0 secs for fusion.

Who attended ESWC2007 ?

25 answers from 27 sources (Stefan_Decker, Abraham_Bernstein, Alexander_Loeser, etc.).
precision = 25/25, precision@1 = 1/1, recall@1 = 1/25
30.2 secs for mapping the onto-triples, 8.9 secs for fusion.

where is Tom Heath working?

32 answers from 30 sources (KMi, Knowledge_Media_Institute, Talis, Talis_Information_Ltd, The_Open_University, Chris_Bizer, ESWC2007, foaf.rdf, http://kmi.open.ac.uk/people/tom, technical_girls_and_guys).
precision = 7/32, precision@1 = 0/2, recall@1 = 0/25
36.4 secs for mapping the onto-triples, 8.1 secs for fusion.

What is Sierra Leone?

15 answers from 11 sources (Africa, lessDevelopedCountry, Atlantic_Ocena, Guinea, Liberia, Sierra Leone).
precision = 1/15, precision@1 = 1/2, recall@1 = 1/1
8.4 secs for mapping the onto-triples, 4.32 secs for fusion.

in which city is Barajas international airport?

1 answer from 1 source (Madrid)
precision = 1/1, precision@1 = 1/1, recall@1 = 1/1
3.7 secs for mapping the onto-triples, 0 secs for fusion.

what airport is in washington?

1 answer from 1 source (Dulles_International_Airport)
precision = 1/1, precision@1 = 1/1, recall@1 = 1/1
4.8 secs for mapping the onto-triples, 0 secs for fusion.

who works at the Open University?

5 answers from 3 sources (DAndriy_Nikolov, Liliana_Cabral, Marta_Sabou, etc.)
precision = 5/5, precision@1 = 5/5, recall@1 = 5/5
28.0 secs for mapping the onto-triples, 0.8 secs for fusion.

Which sea is close to Volgograd?

2 answers from 4 sources (Azov_sea, Caspia_sea)
precision = 2/2, precision@1 = 1/1, recall@1 = 1/2
4.4 secs for mapping the onto-triples, 0.3 secs for fusion.

what is the religion in Russia?

3 answers from 2 sources (Animist, Islam, Russian_Orthodox)
precision = 3/3, precision@1 = 3/3, recall@1 = 3/3
4.6 secs for mapping the onto-triples, 0 secs for fusion.

Which are the universities based in Madrid

2 answers from 1 source (Universidad_Complutense_de_Madrid, Universad_Politecnica_de_Madrid)
precision = 2/2, precision@1 = 0/2, recall@1 = 0/2 (merging failed)
3.9 secs for mapping the onto-triples, 0 secs for fusion.

where is Odessa?

1 answer from 2 sources (Russia)
precision = 1/1, precision@1 = 1/1, recall@1 = 1/1
2.8 secs for mapping the onto-triples, 0 secs for fusion.

Find me all cities in California

20 answers from 11 sources (Arizona, California, San Diego, Sacramento, San Franciso, Mexico, Santa Cruz, Pacific_Ocean)
precision = 6/20, precision@1 = 6/18, recall@1 = 6/6
14.7 secs for mapping the onto-triples, 6.2 secs for fusion.

In which countries are earthquakes

39 answers from 4 sources (alaska, seismology, earthquakes, food_contamination-and-poisoning)
precision = 1/39, precision@1 = 1/39, recall@1 = 1/39
5.9 secs for mapping the onto-triples, 2.2 secs for fusion.

Name the president of Russia

1 answer from 2 sources (Vladimir_Vladimirovich_Putin)
precision = 1/1, precision@1 = 1/1, recall@1 = 1/1
4.8 secs for mapping the onto-triples, 0 secs for fusion.

Give me oil industries in Russia

1 answer from 2 sources (Lukoil)
precision = 1/1, precision@1 = 1/1, recall@1 = 1/1
12.8 secs for mapping the onto-triples, 0 secs for fusion.

Give me all oceans and seas

11 answers from 10 sources (Arafura_See, Azov_Sea, Caspian_Sea, Gulf of Mexico, PacificOcean)
precision = 11/11, precision@1 = 6/6, recall@1 = 6/11
23.8 secs for mapping the onto-triples, 1.7 secs for fusion.

  • Average seconds for mapping = 27.7 secs
  • Average seconds for fusion = 5.43 secs
  • Average seconds total = 33.1 secs
  • Average precision = 77 %
  • Average precision@1 = 83 %
  • Average recall@1 = 69 %
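
The report does not state how the per-query figures were aggregated. One plausible reading is a macro-average, i.e. the plain mean of the per-query ratios. The sketch below assumes that reading; the function name `macro_average` is hypothetical, and the sample uses only the precision of the first three queries, so it is not expected to reproduce the averages listed above.

```python
from fractions import Fraction


def macro_average(ratios):
    """Mean of per-query ratios, each given as a (numerator, denominator) pair."""
    values = [Fraction(num, den) for num, den in ratios]
    return float(sum(values) / len(values))


# Precision of the first three queries in this report: 1/1, 13/13 and 1/19.
sample_precision = [(1, 1), (13, 13), (1, 19)]
average = macro_average(sample_precision)
print(f"Average precision over the sample = {average:.1%}")
```

Under a macro-average, a single low-precision query (such as 1/19) weighs as much as a query with many correct answers, which is one reason the aggregation method matters when interpreting the averages.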

Problems and Limitations:
  • Semantic sources on the open Web are of very poor quality: duplicated, very small (FOAF files) and very noisy.
  • Large ontologies are missing (many of them are available in .zip format and cannot be crawled by search engines).
  • Lack of schema: no domain and range for properties, no type defined for instances or classes (or defined as Thing).
  • Non-intuitive labels (string algorithms fail to match the correct term), e.g.: the user keyword "pc member" could not be mapped to the property with label "Property-3AHas_PC_member".
  • Domain sparseness (geography, semantic conferences, publications, FOAF files) and many unpopulated ontologies (taxonomies only).
  • The main problem is that ontologies are split across different files, and those files are not recognized as part of the same graph or ontology (as with Virtuoso or Sesame); thus the schema definition is not part of the instantiated triples (types are missing or defined elsewhere).
  • Different graphs (URIs) for the same source. For example, for the query "where is islam practiced?" there are many files (with no associated schema) that contain mappings for islam. Most of these sources originate from the same domain, http://aaronland.info/nytimes, but they have different URIs and are not considered part of the same graph:
    - Triples in http://aaronland.info/nytimes/knows/related/2004/05/13/index.rdf [3]
      (nigeria, place, islam) (christians-and-christianity, idea, islam) (violence, idea, islam)
    - Triples in http://aaronland.info/nytimes/knows/related/2004/05/17/index.rdf [2]
      (iraq, place, islam) (politics-and-government, idea, islam)
    - Triples in http://aaronland.info/nytimes/knows/related/2004/05/15/index.rdf [2]
      (roman-catholic-church, organization, islam) (marriages, idea, islam)
    - Triples in http://aaronland.info/nytimes/knows/related/2004/05/28/index.rdf [6]
      (mosques, idea, islam) (new-york-city, idea, islam) (taxi-and-limousine-commission, organization, islam)
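
The last point can be illustrated with a small Python sketch that groups source URIs by host and leading path segment, which is one simple heuristic for treating such files as a single source. This is an illustration only, not part of PowerAqua or Watson; the function name `group_by_site` and the `path_depth` parameter are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlparse


def group_by_site(uris, path_depth=1):
    """Group URIs by host plus the first `path_depth` path segments."""
    groups = defaultdict(list)
    for uri in uris:
        parsed = urlparse(uri)
        segments = parsed.path.strip("/").split("/")
        key = "/".join([parsed.netloc] + segments[:path_depth])
        groups[key].append(uri)
    return dict(groups)


# The four source files listed above for the islam query.
sources = [
    "http://aaronland.info/nytimes/knows/related/2004/05/13/index.rdf",
    "http://aaronland.info/nytimes/knows/related/2004/05/17/index.rdf",
    "http://aaronland.info/nytimes/knows/related/2004/05/15/index.rdf",
    "http://aaronland.info/nytimes/knows/related/2004/05/28/index.rdf",
]

grouped = group_by_site(sources)
for site, files in grouped.items():
    print(f"{site}: {len(files)} files")
```

With `path_depth=1` all four files fall under the single key `aaronland.info/nytimes`, whereas a search engine that indexes each URI as its own graph sees four unrelated sources.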
