User:Joal/WDQS Queries Analysis
Analysis of WDQS queries on the public cluster. The charts and data have been computed with Jupyter notebooks running Spark on the Analytics Hadoop cluster, and with Google Sheets. The processed data consists of events sourced through the Modern Event Platform. Written in January 2021 using data from November 2020. For a more general analysis of WDQS query traffic, see User:Joal/WDQS Traffic Analysis.
Important note: The original idea of this analysis was to categorize queries into buckets defined by the graph structure of the queries: for instance, queries with no hop, queries with multiple one-hops from a single node (star-pattern) but without filtering, or star-pattern queries with filtering (sparkly queries). I failed to implement this general approach, mostly due to the high variability in the ways filtering and joining can be expressed in queries. I have reoriented the analysis toward providing meaningful information on query-structure usage for the most-used structures.
TL;DR: Shared features of top WDQS query-classes
Here are the findings of the detailed analysis of the top 23 query-classes (see below), representing 60% of all requests and 25% of the total time taken to answer the queries. In this section we assume that the paths seen in the query-classes (P31/P279*, P131*, P31*/P279*, etc.) have been precomputed to single-hop links.
- 21 query-classes out of 23 use the truthy subgraph only (54% of all requests).
- 18 query-classes out of 23 are one-hop queries, not counting label extraction as a hop (47% of all requests).
- 4 query-classes are two-hop queries and 1 is a three-hop query, again not counting label extraction (respectively 8% and 5% of all queries).
- Among the 17 one-hop query-classes using the truthy subgraph (46% of all requests):
  - 8 have defined Subject and Predicate (20% of all requests); none has a defined Subject without a defined Predicate. One thing to consider: some queries use functions to filter/refine the data.
  - 6 have defined Predicate and Object (21% of all requests). Some query-patterns use functions.
  - 2 have defined Object only (5% of all requests).
  - 1 has defined Subject, Predicate and Object, and uses a function over Object (5% of all requests).
- All 4 two-hop query-classes have defined Subject, and 2 of them have defined Predicate.
- The three-hop query-class has defined Subject and Predicate.
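To make the "defined Subject / Predicate / Object" wording concrete, here is a minimal Python sketch that labels a single triple pattern by the positions that are not variables. The function names and the example triples are hypothetical and only illustrate the vocabulary used above; this is not the code used for the analysis.

```python
from typing import Tuple


def is_variable(term: str) -> bool:
    """A SPARQL term is a variable when it starts with '?' or '$'."""
    return term.startswith(("?", "$"))


def bound_positions(triple: Tuple[str, str, str]) -> str:
    """Return the defined (non-variable) positions among S, P and O."""
    labels = ("S", "P", "O")
    bound = [label for label, term in zip(labels, triple) if not is_variable(term)]
    return "".join(bound) or "none"


# Hypothetical triple patterns, one per shape discussed in the list above.
examples = [
    ("wd:Q42", "wdt:P569", "?born"),  # Subject and Predicate defined
    ("?item", "wdt:P31", "wd:Q5"),    # Predicate and Object defined
    ("?item", "?prop", "wd:Q5"),      # Object only
]
for triple in examples:
    print(triple, "->", bound_positions(triple))
```

A real query usually contains several triple patterns plus filters, which is part of why a fully general bucketing proved hard.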
Detailed analysis
All Queries
Query-time classes
query_time_class | requests | query_time | requests % | query_time % |
---|---|---|---|---|
less_10ms | 9358686 | 46929172 | 5.63% | 0.10% |
10ms_to_100ms | 123061327 | 3695019785 | 74.07% | 7.70% |
100ms_to_1s | 30229454 | 8196933696 | 18.20% | 17.09% |
1s_to_10s | 2378012 | 8121611518 | 1.43% | 16.93% |
more_10s | 1106493 | 27903053789 | 0.67% | 58.18% |
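As an illustration of how such a table can be derived, the following Python sketch buckets individual query times into the same five classes and computes both percentage columns. It assumes query times are in milliseconds and that each upper bound is exclusive; both points are assumptions, and the original computation ran on Spark rather than plain Python.

```python
from collections import defaultdict

# Upper bounds in milliseconds (exclusive). Whether the original analysis
# treated the boundaries as inclusive or exclusive is an assumption here.
BOUNDS_MS = [
    (10, "less_10ms"),
    (100, "10ms_to_100ms"),
    (1_000, "100ms_to_1s"),
    (10_000, "1s_to_10s"),
]


def query_time_class(query_time_ms):
    """Map one query time to its class name."""
    for upper, name in BOUNDS_MS:
        if query_time_ms < upper:
            return name
    return "more_10s"


def summarize(query_times_ms):
    """Return {class: (requests, query_time, requests %, query_time %)}."""
    requests = defaultdict(int)
    total_time = defaultdict(int)
    for t in query_times_ms:
        name = query_time_class(t)
        requests[name] += 1
        total_time[name] += t
    all_requests = sum(requests.values())
    all_time = sum(total_time.values()) or 1  # avoid dividing by zero
    return {
        name: (
            requests[name],
            total_time[name],
            100 * requests[name] / all_requests,
            100 * total_time[name] / all_time,
        )
        for name in requests
    }


# Hypothetical query times, only to show the shape of the result.
for name, row in summarize([4, 25, 60, 350, 12_000]).items():
    print(name, row)
```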
Top Queries - raw string + UA (not disclosed)
In this section we look at the most frequent queries, grouped by query-string and user-agent. This analysis is useful to find queries that are repeated over and over again without change, whether due to clients genuinely repeating queries (weird) or to monitoring systems querying to check the system (normal).
Queries
query | user-agent | requests | query_time | requests % | query_time % |
---|---|---|---|---|---|
ASK{ ?x ?y ?z } | ### | 13123149 | 328704767 | 7.90% | 0.69% |
ASK{ ?x ?y ?z } | | 1085848 | 64177327 | 0.65% | 0.13% |
SELECT ?isdead WHERE { | | 492865 | 18447350 | 0.30% | 0.04% |
#Tool: wdi_core fastrun | | 449705 | 10325156 | 0.27% | 0.02% |
#Tool: wdi_core fastrun | | 449574 | 9794984 | 0.27% | 0.02% |
The first two rows are queries that don't generate results based on data. They only ask the query system whether some data exists (and thereby whether it answers queries). The related user-agents also tell us that those requests are issued by monitoring systems. The cost of those requests is negligible in terms of computation time, but not in terms of number of requests! We remove those requests from the rest of the analysis.
The other rows (I looked at more than the top 5, but the rest was not interesting enough to list here) show requests that actually compute some results and that are issued repeatedly. They represent relatively small numbers, both in terms of request count and query-time.
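One possible way to drop the monitoring requests is sketched below in Python. It matches on the query string only, ignoring whitespace differences; this is an assumption for illustration, since the text above also relies on the user-agent to identify monitoring systems, and the helper names are hypothetical.

```python
import re

# The monitoring query shown in the table above.
MONITORING_QUERY = "ASK{ ?x ?y ?z }"


def normalize(query: str) -> str:
    """Drop all whitespace so that formatting differences do not matter."""
    return re.sub(r"\s+", "", query)


def is_monitoring(query: str) -> bool:
    """True when the query is the monitoring ASK query, up to whitespace."""
    return normalize(query) == normalize(MONITORING_QUERY)


def without_monitoring(queries):
    """Keep only the queries that are not monitoring checks."""
    return [q for q in queries if not is_monitoring(q)]


# Hypothetical input: two spellings of the check and one real query.
sample = [
    "ASK{ ?x ?y ?z }",
    "ASK { ?x ?y ?z }",
    "SELECT ?x WHERE { ?x ?y ?z }",
]
print(without_monitoring(sample))
```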
Query-time classes without monitoring queries
query_time_class | requests | query_time | requests % | query_time % |
---|---|---|---|---|
less_10ms | 1949377 | 12097708 | 1.28% | 0.03% |
10ms_to_100ms | 116495975 | 3561119568 | 76.68% | 7.49% |
100ms_to_1s | 29998482 | 8153852457 | 19.75% | 17.14% |
1s_to_10s | 2376773 | 8119145049 | 1.56% | 17.07% |
more_10s | 1104368 | 27724451084 | 0.73% | 58.28% |
It is expected and interesting to notice that the removal of monitoring queries mostly removes ultra-fast queries (less than 10ms).
Top Queries - Operators sequence + variables + UA (not disclosed)
Query processing explanation
In order to go deeper into query analysis, we need a formal representation of the queries. I have used the Jena ARQ SPARQL parser to parse each query and generate its abstract algebra. The abstract query representations are then processed to generate the following structures:
- List of SPARQL operators used in the query, in processing order (depth-first)
- Map of variable-names used in the query with their count (named variable-usage below)
- Map of URIs used in the query with their count
- Map of literals (values) used in the query with their count
Those structures allow grouping similar queries with a high enough degree of confidence: queries sharing both the same operators-list and the same variable-names (when non-empty) are very likely to also share URIs and/or literals, and therefore to perform similar data computation in terms of query semantics.
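The grouping idea can be sketched in Python as follows. This is a rough approximation under stated assumptions: the operators-list is taken as given (in the analysis it comes from the Jena ARQ algebra), and variables are counted with a regular expression over the query text rather than over the algebra, so the counts will generally not match those in the tables of this page. All function names are hypothetical.

```python
import re
from collections import Counter

# Approximate variable matcher: '?name' or '$name' anywhere in the text.
# It would also match such tokens inside string literals or comments.
VAR_RE = re.compile(r"[?$]([A-Za-z_][A-Za-z0-9_]*)")


def variable_usage(query: str) -> Counter:
    """Count textual occurrences of each variable name in the query."""
    return Counter(VAR_RE.findall(query))


def class_key(operators, query, user_agent):
    """Build a hashable grouping key: operators, variable usage, user-agent."""
    usage = variable_usage(query)
    return (tuple(operators), tuple(sorted(usage.items())), user_agent)


# Two hypothetical queries differing only by a literal share the same key.
ops = ["bgp", "project"]
q1 = "SELECT ?q { ?q wdt:P225 'Aus bus' }"
q2 = "SELECT ?q { ?q wdt:P225 'Cus dus' }"
print(class_key(ops, q1, "ua") == class_key(ops, q2, "ua"))
```

Keying on variable names rather than on URIs or literals is what lets queries generated from one template land in the same class even when their parameters change.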
Query-classes by Operators-list, variables-usage and user-agent
I analysed top queries using various grouping fields, and the grouping (operators-list, variables-usage, user-agent) provides high coherence within query-classes. The following table shows the top-100 query-classes, representing 82% of queries made to WDQS in November 2020, for 30% of the total query-time.
The next section contains a detailed analysis of the top 23 query-classes from that list, providing a deeper understanding of the most-used query patterns.
index | Operators-list | Variable-names | Variables-usage-count | User-agent | requests | sum_query_time | Requests % | sum_query_time % | Cumulative requests % | Cumulative sum_query_time % |
---|---|---|---|---|---|---|---|---|---|---|
1 | [path, table, bgp, join, bgp, union, join, project] | [NODE_VAR[prop], NODE_VAR[q]] | [1, 3] | ### | 12992308 | 329238149 | 8.99% | 0.80% | 8.99% | 0.80% |
2 | [path, table, bgp, join, bgp, union, bgp, union, join, project] | [NODE_VAR[prop], NODE_VAR[q]] | [1, 4] | ### | 8649136 | 224082155 | 5.99% | 0.55% | 14.98% | 1.35% |
3 | [table, bgp, join, filter, project, distinct] | [NODE_VAR[died], NODE_VAR[q], NODE_VAR[born]] | [2, 2, 2] | ### | 7543103 | 313493743 | 5.22% | 0.76% | 20.20% | 2.11% |
4 | [table, table, bgp, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, bgp, leftjoin, bgp, join, extend, filter, union, bgp, union, bgp, union, bgp, union, bgp, union, bgp, union, join, bgp, service, join, project] | [NODE_VAR[industry], NODE_VAR[bloombergCompanyID], NODE_VAR[DUNSnumber], NODE_VAR[inception], NODE_VAR[australianRegisteredBodyNumber], NODE_VAR[subsidiary], NODE_VAR[hqLS], NODE_VAR[ISIN], NODE_VAR[exchangeStm], NODE_VAR[company], NODE_VAR[GS1code], NODE_VAR[streetAddress], NODE_VAR[hqStreet], NODE_VAR[website], NODE_VAR[e], NODE_VAR[permID], NODE_VAR[australianCompanyNumber], NODE_VAR[expediaHotelID], NODE_VAR[legalEntityIdentifier], NODE_VAR[czechRegistrationID], NODE_VAR[ownerOf], NODE_VAR[t], NODE_VAR[openCorporatesID], NODE_VAR[c], NODE_VAR[hqPostalCode], NODE_VAR[hqStreetDep], NODE_VAR[australianBusinessNumber], NODE_VAR[legalName], NODE_VAR[UNSPSCCode], NODE_VAR[hungarianCompanyID], NODE_VAR[companySize], NODE_VAR[germanTaxAuthorityID], NODE_VAR[danishP_number], NODE_VAR[centralIndexKey], NODE_VAR[l], NODE_VAR[country], NODE_VAR[OKPO_ID], NODE_VAR[ownedBy], NODE_VAR[hqlon], NODE_VAR[companiesHouseID], NODE_VAR[austrianFirmenbuchnummer], NODE_VAR[hqlat], NODE_VAR[dataGouvFrOrganizationID], NODE_VAR[EUTransparencyRegisterID], NODE_VAR[legalForm], NODE_VAR[parent], NODE_VAR[hq]] | [1, 1, 1, 1, 1, 1, 6, 1, 3, 36, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1] | ###
|
7392382 | 2923532705 | 5.12% | 7.12% | 25.32% | 9.23% |
5 | [bgp, filter, project] | [NODE_VAR[pid], NODE_VAR[prop]] | [2, 1] | ### | 7045783 | 182124475 | 4.88% | 0.44% | 30.20% | 9.67% |
6 | [table, table, join, path, bgp, sequence, join, project] | [NODE_VAR[prop], NODE_VAR[?0], NODE_VAR[class], NODE_VAR[base], NODE_VAR[parent]] | [2, 1, 1, 1, 2] | ### | 5766765 | 1174741371 | 3.99% | 2.86% | 34.19% | 12.53% |
7 | [table, bgp, join, bgp, leftjoin, bgp, leftjoin, bgp, join, filter, project] | [NODE_VAR[givenNameLabel], NODE_VAR[familyNameLabel], NODE_VAR[familyName], NODE_VAR[countryLabel], NODE_VAR[personDesc], NODE_VAR[givenName], NODE_VAR[article], NODE_VAR[country], NODE_VAR[personLabel], NODE_VAR[person]] | [1, 1, 2, 2, 2, 2, 3, 2, 2, 6] | ### | 3230619 | 220658372 | 2.24% | 0.54% | 36.42% | 13.07% |
8 | [path, bgp, sequence, filter, project] | [NODE_VAR[wiki], NODE_VAR[wiki_description]] | [2, 2] | ### | 3181130 | 86229303 | 2.20% | 0.21% | 38.63% | 13.28% |
9 | [bgp, bgp, leftjoin, filter, bgp, extend, filter, union, bgp, extend, filter, union, project] | [NODE_VAR[directClaimP], NODE_VAR[pname], NODE_VAR[p], NODE_VAR[o], NODE_VAR[olabel]] | [2, 2, 2, 2, 5] | ### | 3144489 | 578750284 | 2.18% | 1.41% | 40.80% | 14.69% |
10 | [table, extend, bgp, join, filter, project] | [NODE_VAR[sitelink], NODE_VAR[wikipedia]] | [2, 1] | ### | 2929966 | 77008332 | 2.03% | 0.19% | 42.83% | 14.87% |
11 | [path, bgp, sequence, project] | [NODE_VAR[x], NODE_VAR[q]] | [1, 2] | ### | 2770152 | 229766903 | 1.92% | 0.56% | 44.75% | 15.43% |
12 | [bgp, path, sequence] | [NODE_VAR[x]] | [2] | ### | 2536016 | 660258400 | 1.76% | 1.61% | 46.50% | 17.04% |
13 | [table, bgp, join, bgp, leftjoin, bgp, service, join, extend, extend, order, project] | [NODE_VAR[property], NODE_VAR[formatter_url], NODE_VAR[propertyType]] | [2, 1, 1] | ### | 2437160 | 104672105 | 1.69% | 0.25% | 48.19% | 17.29% |
14 | [bgp, bgp, service, join, filter, project] | [NODE_VAR[p], NODE_VAR[o]] | [2, 1] | ### | 2292337 | 58555392 | 1.59% | 0.14% | 49.78% | 17.44% |
15 | [bgp, bgp, service, join, filter, project] | [NODE_VAR[p], NODE_VAR[o]] | [2, 1] | ### | 2263726 | 55421994 | 1.57% | 0.13% | 51.34% | 17.57% |
16 | [table, bgp, join, bgp, leftjoin, bgp, service, join, extend, extend, order, project] | [NODE_VAR[property], NODE_VAR[formatter_url], NODE_VAR[propertyType]] | [2, 1, 1] | ### | 2110409 | 93752110 | 1.46% | 0.23% | 52.80% | 17.80% |
17 | [bgp, project, distinct] | [NODE_VAR[x]] | [1] | ### | 2064573 | 45530564 | 1.43% | 0.11% | 54.23% | 17.91% |
18 | [bgp, table, join, table, join, path, bgp, sequence, join, project] | [NODE_VAR[prop], NODE_VAR[?0], NODE_VAR[class], NODE_VAR[base], NODE_VAR[parent]] | [2, 1, 1, 1, 2] | ### | 1593350 | 2307436146 | 1.10% | 5.62% | 55.34% | 23.53% |
19 | [bgp, project, distinct] | [NODE_VAR[subject]] | [1] | ### | 1519964 | 55957477 | 1.05% | 0.14% | 56.39% | 23.66% |
20 | [table, bgp, join, bgp, service, join, order, project] | [NODE_VAR[ps], NODE_VAR[p], NODE_VAR[ps_], NODE_VAR[wd], NODE_VAR[statement], NODE_VAR[person]] | [2, 2, 1, 2, 2, 1] | ### | 1470991 | 189636109 | 1.02% | 0.46% | 57.41% | 24.13% |
21 | [bgp, project] | [NODE_VAR[wt]] | [1] | ### | 1470980 | 36523716 | 1.02% | 0.09% | 58.42% | 24.22% |
22 | [table, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, project] | [NODE_VAR[art_sv], NODE_VAR[label_en], NODE_VAR[art_en], NODE_VAR[label_sv], NODE_VAR[person]] | [2, 1, 2, 1, 4] | ### | 1470971 | 87940435 | 1.02% | 0.21% | 59.44% | 24.43% |
23 | [bgp] | [NODE_VAR[item]] | [1] | ### | 1239501 | 44601346 | 0.86% | 0.11% | 60.30% | 24.54% |
24 | [bgp, bgp, service, join, filter, project, distinct] | [NODE_VAR[o]] | [2] | ### | 947721 | 18770539 | 0.66% | 0.05% | 60.96% | 24.58% |
25 | [bgp, bgp, service, join, project, slice] | [NODE_VAR[wdpage], NODE_VAR[pic], NODE_VAR[name]] | [3, 1, 1] | ### | 838280 | 33244965 | 0.58% | 0.08% | 61.54% | 24.66% |
26 | [bgp, project, distinct] | [NODE_VAR[subject]] | [1] | ### | 804741 | 28872394 | 0.56% | 0.07% | 62.09% | 24.73% |
27 | [bgp, project, distinct] | [NODE_VAR[author]] | [1] | ### | 771186 | 18084601 | 0.53% | 0.04% | 62.63% | 24.78% |
28 | [bgp, filter, distinct] | [NODE_VAR[label]] | [2] | ### | 757628 | 26588253 | 0.52% | 0.06% | 63.15% | 24.84% |
29 | [bgp, service, bgp, service, join, order, project, slice] | [NODE_VAR[place], NODE_VAR[location], NODE_VAR[distance]] | [1, 1, 1] | ### | 740574 | 37999646 | 0.51% | 0.09% | 63.66% | 24.94% |
30 | [bgp, project, distinct, slice] | [NODE_VAR[subject]] | [1] | ### | 739547 | 24965605 | 0.51% | 0.06% | 64.18% | 25.00% |
31 | [table, bgp, join, bgp, leftjoin, bgp, service, join, order, project] | [NODE_VAR[company], NODE_VAR[ps], NODE_VAR[p], NODE_VAR[pq_], NODE_VAR[ps_], NODE_VAR[wd], NODE_VAR[statement], NODE_VAR[pq], NODE_VAR[wdpq]] | [1, 2, 2, 1, 1, 2, 3, 2, 1] | ### | 681593 | 63120844 | 0.47% | 0.15% | 64.65% | 25.15% |
32 | [path, bgp, sequence, project] | [NODE_VAR[x], NODE_VAR[q]] | [1, 2] | ### | 670919 | 60075798 | 0.46% | 0.15% | 65.11% | 25.30% |
33 | [bgp, path, sequence, bgp, service, join, project] | [NODE_VAR[item]] | [2] | ### | 670825 | 52379041 | 0.46% | 0.13% | 65.58% | 25.42% |
34 | [table, extend, extend, bgp, join, project] | [NODE_VAR[taxonName], NODE_VAR[taxonRank], NODE_VAR[taxonRank1], NODE_VAR[item]] | [1, 1, 2, 2] | ### | 668840 | 23638864 | 0.46% | 0.06% | 66.04% | 25.48% |
35 | [bgp, filter, project] | [NODE_VAR[o]] | [2] | ### | 646445 | 19577115 | 0.45% | 0.05% | 66.49% | 25.53% |
36 | [bgp, filter, project] | [NODE_VAR[subject], NODE_VAR[wppage]] | [2, 2] | ### | 645645 | 11463313 | 0.45% | 0.03% | 66.93% | 25.56% |
37 | [bgp] | [NODE_VAR[item]] | [1] | ### | 625061 | 14297969 | 0.43% | 0.03% | 67.37% | 25.59% |
38 | [bgp] | [NODE_VAR[item]] | [1] | ### | 624907 | 13626412 | 0.43% | 0.03% | 67.80% | 25.63% |
39 | [bgp, project] | [NODE_VAR[s]] | [1] | ### | 584844 | 18176197 | 0.40% | 0.04% | 68.20% | 25.67% |
40 | [table, extend, bgp, join, extend, extend, filter, project] | [NODE_VAR[s], NODE_VAR[p], NODE_VAR[o]] | [1, 2, 2] | ### | 567561 | 35223250 | 0.39% | 0.09% | 68.60% | 25.76% |
41 | [bgp, project, slice] | [NODE_VAR[id]] | [1] | ### | 559198 | 18457293 | 0.39% | 0.04% | 68.98% | 25.80% |
42 | [table, bgp, service, join, project] | [] | [] | ### | 534856 | 17434797 | 0.37% | 0.04% | 69.35% | 25.84% |
43 | [bgp, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, join, project] | [NODE_VAR[website], NODE_VAR[article], NODE_VAR[image], NODE_VAR[netflixId], NODE_VAR[item], NODE_VAR[IMDB_ID]] | [1, 2, 1, 1, 6, 1] | ### | 531577 | 22971004 | 0.37% | 0.06% | 69.72% | 25.90% |
44 | [bgp, extend, table, join, group, extend, project, distinct] | [NODE_VAR[itemLabel], NODE_VAR[item]] | [1, 1] | ### | 524802 | 28529302 | 0.36% | 0.07% | 70.09% | 25.97% |
45 | [bgp, bgp, service, join, project] | [NODE_VAR[lastName]] | [1] | ### | 518799 | 18013009 | 0.36% | 0.04% | 70.45% | 26.01% |
46 | [bgp, service, bgp, leftjoin, extend, project] | [NODE_VAR[dod]] | [1] | ### | 492865 | 18447350 | 0.34% | 0.04% | 70.79% | 26.06% |
47 | [path, bgp, sequence, project] | [NODE_VAR[x], NODE_VAR[q]] | [1, 2] | ### | 490079 | 37818478 | 0.34% | 0.09% | 71.13% | 26.15% |
48 | [table, extend, bgp, join, filter, project] | [NODE_VAR[sitelink], NODE_VAR[wikipedia]] | [2, 1] | ### | 485946 | 13035666 | 0.34% | 0.03% | 71.46% | 26.18% |
49 | [bgp, bgp, service, join, project, distinct] | [NODE_VAR[s], NODE_VAR[p], NODE_VAR[o]] | [2, 1, 1] | ### | 470442 | 12192693 | 0.33% | 0.03% | 71.79% | 26.21% |
50 | [table, bgp, join, bgp, service, join, group, extend, extend, project, distinct] | [NODE_VAR[P27], NODE_VAR[item], NODE_VAR[p27llabel]] | [2, 1, 1] | ### | 464364 | 45010719 | 0.32% | 0.11% | 72.11% | 26.32% |
51 | [bgp, bgp, leftjoin, filter, project] | [NODE_VAR[wikipedia], NODE_VAR[wiki_description]] | [4, 2] | ### | 459908 | 18499624 | 0.32% | 0.05% | 72.43% | 26.36% |
52 | [bgp, bgp, minus, bgp, join, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, service, join, group, extend, extend, extend, extend, project, distinct] | [NODE_VAR[inception], NODE_VAR[label], NODE_VAR[countryLabel], NODE_VAR[itemLabel], NODE_VAR[lcnaf], NODE_VAR[country], NODE_VAR[item], NODE_VAR[enttypeLabel], NODE_VAR[enttype]] | [1, 1, 1, 1, 1, 2, 7, 1, 2] | ### | 445923 | 18940199 | 0.31% | 0.05% | 72.74% | 26.41% |
53 | [table, extend, path, join, project, distinct] | [NODE_VAR[item]] | [1] | ### | 440056 | 691103158 | 0.30% | 1.68% | 73.04% | 28.09% |
54 | [path, path, union, path, union, path, union, bgp, join, filter, project] | [NODE_VAR[wiki], NODE_VAR[wiki_description]] | [5, 2] | ### | 395180 | 15766839 | 0.27% | 0.04% | 73.31% | 28.13% |
55 | [bgp, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, service, join, project, slice] | [NODE_VAR[sex], NODE_VAR[nationality], NODE_VAR[occupationlbl], NODE_VAR[birthplace], NODE_VAR[abstract_es], NODE_VAR[deathplace], NODE_VAR[agent], NODE_VAR[article], NODE_VAR[image], NODE_VAR[language], NODE_VAR[occupation]] | [1, 1, 1, 1, 1, 1, 10, 3, 1, 1, 2] | ### | 392598 | 35486007 | 0.27% | 0.09% | 73.59% | 28.22% |
56 | [bgp] | [NODE_VAR[work]] | [1] | ### | 375526 | 10486739 | 0.26% | 0.03% | 73.85% | 28.24% |
57 | [bgp] | [NODE_VAR[x]] | [2] | ### | 363927 | 9325004 | 0.25% | 0.02% | 74.10% | 28.27% |
58 | [table, bgp, join, bgp, service, join, project] | [NODE_VAR[item], NODE_VAR[o]] | [1, 1] | ### | 357627 | 28071493 | 0.25% | 0.07% | 74.35% | 28.33% |
59 | [bgp, bgp, leftjoin] | [NODE_VAR[s], NODE_VAR[item_id], NODE_VAR[mrt]] | [3, 1, 1] | ### | 348452 | 10521964 | 0.24% | 0.03% | 74.59% | 28.36% |
60 | [bgp, bgp, join, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, service, join, project, slice] | [NODE_VAR[coords], NODE_VAR[street], NODE_VAR[location], NODE_VAR[rank], NODE_VAR[company_country], NODE_VAR[hq_node1], NODE_VAR[post_code], NODE_VAR[hq_node], NODE_VAR[country]] | [1, 1, 1, 1, 1, 8, 1, 1, 1] | ### | 336273 | 15553097 | 0.23% | 0.04% | 74.82% | 28.40% |
61 | [bgp, bgp, join, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, service, join, project] | [NODE_VAR[coords], NODE_VAR[street], NODE_VAR[location], NODE_VAR[rank], NODE_VAR[company_country], NODE_VAR[hq_node1], NODE_VAR[post_code], NODE_VAR[hq_node], NODE_VAR[country]] | [1, 1, 1, 1, 1, 8, 1, 1, 1] | ### | 335217 | 10803186 | 0.23% | 0.03% | 75.05% | 28.42% |
62 | [bgp, bgp, join, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, service, join, project] | [NODE_VAR[coords], NODE_VAR[street], NODE_VAR[location], NODE_VAR[rank], NODE_VAR[hq_node1], NODE_VAR[post_code], NODE_VAR[hq_node], NODE_VAR[country]] | [1, 1, 1, 1, 8, 1, 1, 2] | ### | 334863 | 9790175 | 0.23% | 0.02% | 75.28% | 28.45% |
63 | [bgp, project] | [NODE_VAR[death_date]] | [1] | ### | 317858 | 9132426 | 0.22% | 0.02% | 75.50% | 28.47% |
64 | [bgp, bgp, leftjoin] | [NODE_VAR[s], NODE_VAR[item_id], NODE_VAR[mrt]] | [3, 1, 1] | ### | 307953 | 9690912 | 0.21% | 0.02% | 75.72% | 28.49% |
65 | [bgp, project] | [NODE_VAR[article]] | [3] | ### | 301979 | 9273491 | 0.21% | 0.02% | 75.93% | 28.52% |
66 | [bgp, project] | [NODE_VAR[class]] | [1] | ### | 298715 | 8628555 | 0.21% | 0.02% | 76.13% | 28.54% |
67 | [bgp, bgp, leftjoin, project] | [NODE_VAR[h], NODE_VAR[Z]] | [1, 1] | ### | 297585 | 9102832 | 0.21% | 0.02% | 76.34% | 28.56% |
68 | [bgp, bgp, leftjoin, bgp, service, join, project] | [NODE_VAR[birthDateStatement], NODE_VAR[deathDate], NODE_VAR[placeStatement], NODE_VAR[occupationStatement], NODE_VAR[nameStatement], NODE_VAR[birthPlace], NODE_VAR[name], NODE_VAR[deathDateStatement], NODE_VAR[birthDate], NODE_VAR[occupation]] | [2, 1, 2, 2, 2, 1, 1, 2, 1, 1] | ### | 292052 | 11227994 | 0.20% | 0.03% | 76.54% | 28.59% |
69 | [bgp, bgp, service, join, group, extend, project] | [NODE_VAR[subject], NODE_VAR[instance]] | [2, 1] | ### | 284607 | 5391083 | 0.20% | 0.01% | 76.74% | 28.60% |
70 | [bgp, service, bgp, leftjoin, bgp, leftjoin, project] | [NODE_VAR[state], NODE_VAR[country]] | [1, 1] | ### | 283387 | 10865222 | 0.20% | 0.03% | 76.93% | 28.63% |
71 | [bgp, bgp, service, join, project] | [NODE_VAR[entity]] | [1] | ### | 279825 | 6211688 | 0.19% | 0.02% | 77.13% | 28.64% |
72 | [bgp, table, extend, extend, extend, bgp, union, join, bgp, join, bgp, leftjoin, filter, order, project, slice] | [NODE_VAR[property], NODE_VAR[ref], NODE_VAR[picture], NODE_VAR[valUrl], NODE_VAR[propUrl], NODE_VAR[valLabel], NODE_VAR[propLabel]] | [3, 1, 1, 3, 2, 2, 2] | ### | 258792 | 44587116 | 0.18% | 0.11% | 77.31% | 28.75% |
73 | [bgp, bgp, leftjoin] | [NODE_VAR[s], NODE_VAR[item_id], NODE_VAR[mrt]] | [3, 1, 1] | ### | 255795 | 7374888 | 0.18% | 0.02% | 77.48% | 28.77% |
74 | [bgp, extend, filter, project] | [NODE_VAR[longitude], NODE_VAR[pid], NODE_VAR[node], NODE_VAR[statement], NODE_VAR[latitude]] | [1, 2, 3, 2, 1] | ### | 253751 | 8916121 | 0.18% | 0.02% | 77.66% | 28.79% |
75 | [bgp, bgp, leftjoin, project] | [NODE_VAR[_image], NODE_VAR[q]] | [1, 2] | ### | 250481 | 32285219 | 0.17% | 0.08% | 77.83% | 28.87% |
76 | [bgp, table, join, bgp, path, sequence, join, extend, filter, project] | [NODE_VAR[propP], NODE_VAR[prop], NODE_VAR[value], NODE_VAR[wikitype], NODE_VAR[stmt], NODE_VAR[?0], NODE_VAR[base]] | [2, 2, 2, 2, 2, 1, 1] | ### | 247932 | 19087912 | 0.17% | 0.05% | 78.00% | 28.92% |
77 | [table, extend, bgp, join, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, service, join, group, extend, project, group, extend, project] | [NODE_VAR[category], NODE_VAR[death], NODE_VAR[birth], NODE_VAR[altname], NODE_VAR[gender], NODE_VAR[person]] | [1, 1, 1, 1, 1, 5] | ### | 247780 | 17959249 | 0.17% | 0.04% | 78.18% | 28.96% |
78 | [bgp, table, join, bgp, path, sequence, join, extend, filter, project] | [NODE_VAR[propP], NODE_VAR[prop], NODE_VAR[value], NODE_VAR[?1], NODE_VAR[wikitype], NODE_VAR[stmt], NODE_VAR[?0], NODE_VAR[base]] | [2, 3, 2, 1, 2, 2, 1, 1] | ### | 245412 | 19885444 | 0.17% | 0.05% | 78.35% | 29.01% |
79 | [bgp, path, sequence, bgp, bgp, leftjoin, bgp, leftjoin, leftjoin, bgp, service, join, project] | [NODE_VAR[ps], NODE_VAR[valueLabel], NODE_VAR[property], NODE_VAR[qualifierValue], NODE_VAR[object], NODE_VAR[value], NODE_VAR[predicate], NODE_VAR[qualifier], NODE_VAR[pq]] | [2, 1, 2, 1, 3, 2, 2, 1, 2] | ### | 239165 | 78077698 | 0.17% | 0.19% | 78.51% | 29.20% |
80 | [table, extend, bgp, join, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, extend, extend, bgp, service, join, filter, group, extend, extend, extend, extend, project] | [NODE_VAR[citizenship], NODE_VAR[Human], NODE_VAR[Description], NODE_VAR[BirthCountry], NODE_VAR[birthplace], NODE_VAR[birth_date], NODE_VAR[DeathCountry], NODE_VAR[bplace], NODE_VAR[deathplace], NODE_VAR[dplace], NODE_VAR[Human_Name], NODE_VAR[Citizenship_Name], NODE_VAR[death_date]] | [2, 7, 2, 1, 2, 1, 1, 2, 2, 2, 1, 1, 1] | ### | 232815 | 15491661 | 0.16% | 0.04% | 78.67% | 29.24% |
81 | [table, bgp, join, project, distinct] | [NODE_VAR[subject], NODE_VAR[viaf]] | [1, 1] | ### | 231259 | 16340131 | 0.16% | 0.04% | 78.83% | 29.27% |
82 | [table, extend, bgp, join] | [NODE_VAR[w], NODE_VAR[q]] | [2, 1] | ### | 230313 | 10224156 | 0.16% | 0.02% | 78.99% | 29.30% |
83 | [table, bgp, join, project] | [NODE_VAR[property], NODE_VAR[propertyType], NODE_VAR[template]] | [2, 1, 1] | ### | 219169 | 7547007 | 0.15% | 0.02% | 79.14% | 29.32% |
84 | [table, path, join] | [NODE_VAR[classes]] | [1] | ### | 217544 | 92234241 | 0.15% | 0.22% | 79.29% | 29.54% |
85 | [bgp, filter, project] | [NODE_VAR[object_label], NODE_VAR[p], NODE_VAR[propType], NODE_VAR[object], NODE_VAR[predicate]] | [2, 2, 2, 3, 3] | ### | 215816 | 27559086 | 0.15% | 0.07% | 79.44% | 29.61% |
86 | [bgp, bgp, service, join, project] | [NODE_VAR[p]] | [1] | ### | 210595 | 4796195 | 0.15% | 0.01% | 79.59% | 29.62% |
87 | [table, table, extend, bgp, bgp, union, join, extend, table, extend, bgp, join, union, table, extend, bgp, join, union, table, extend, bgp, join, union, table, extend, bgp, join, union, table, extend, bgp, join, union, table, extend, bgp, join, union, table, extend, bgp, join, union, table, extend, bgp, join, union, table, extend, bgp, join, union, bgp, table, extend, table, join, bgp, join, table, extend, bgp, join, union, table, extend, bgp, join, union, bgp, table, extend, bgp, join, table, extend, bgp, join, union, join, union, join, union, join, project, distinct] | [NODE_VAR[stock_exchange], NODE_VAR[hqcity], NODE_VAR[hqlocation], NODE_VAR[has_street_address], NODE_VAR[hqcitycountry], NODE_VAR[value], NODE_VAR[country], NODE_VAR[prospect], NODE_VAR[hqcountry]] | [2, 3, 5, 1, 2, 16, 2, 12, 2] | ### | 209769 | 31203046 | 0.15% | 0.08% | 79.73% | 29.70% |
88 | [table, bgp, join, bgp, service, join, filter, group, project, distinct] | [NODE_VAR[instanceLabel], NODE_VAR[instance], NODE_VAR[entity], NODE_VAR[item]] | [2, 2, 2, 1] | ### | 209271 | 23151188 | 0.14% | 0.06% | 79.88% | 29.75% |
89 | [bgp, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, bgp, leftjoin, bgp, leftjoin, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, service, join, group, extend, extend, extend, extend, extend, extend, extend, extend, extend, extend, extend, extend, extend, extend, extend, extend, extend, extend, extend, project, slice] | [NODE_VAR[rottenTomatoesId], NODE_VAR[narrativeLocation_pt], NODE_VAR[duration], NODE_VAR[productionStudio_label], NODE_VAR[subject], NODE_VAR[universe], NODE_VAR[officialWebSite], NODE_VAR[label_en], NODE_VAR[tmdbId], NODE_VAR[language_label], NODE_VAR[genre], NODE_VAR[metacriticId], NODE_VAR[subject_label], NODE_VAR[distributor], NODE_VAR[color], NODE_VAR[genre_label], NODE_VAR[distributor_label], NODE_VAR[pubDate], NODE_VAR[narrativeLocation_en], NODE_VAR[country_label], NODE_VAR[label_pt], NODE_VAR[entity], NODE_VAR[narrativeLocation], NODE_VAR[netflixId], NODE_VAR[country], NODE_VAR[productionStudio], NODE_VAR[language], NODE_VAR[boxOffice], NODE_VAR[alt_pt], NODE_VAR[instanceOf], NODE_VAR[cost], NODE_VAR[alt_en], NODE_VAR[originalTitle], NODE_VAR[freebaseId]] | [1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 2, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 26, 3, 1, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1] | ### | 208889 | 90420970 | 0.14% | 0.22% | 80.02% | 29.97% |
90 | [bgp, filter, slice] | [NODE_VAR[label]] | [2] | ### | 206073 | 7944906 | 0.14% | 0.02% | 80.17% | 29.99% |
91 | [bgp, project] | [NODE_VAR[WDid]] | [1] | ### | 206005 | 3232184 | 0.14% | 0.01% | 80.31% | 30.00% |
92 | [bgp] | [NODE_VAR[item]] | [1] | ### | 203920 | 6495422 | 0.14% | 0.02% | 80.45% | 30.02% |
93 | [path, project] | [NODE_VAR[class]] | [1] | ### | 201191 | 12882148 | 0.14% | 0.03% | 80.59% | 30.05% |
94 | [bgp, path, bgp, sequence, project] | [NODE_VAR[article], NODE_VAR[q], NODE_VAR[parent]] | [2, 2, 3] | ### | 198908 | 10122749 | 0.14% | 0.02% | 80.73% | 30.07% |
95 | [bgp, bgp, leftjoin, project] | [NODE_VAR[image], NODE_VAR[product]] | [1, 2] | ### | 191888 | 7765579 | 0.13% | 0.02% | 80.86% | 30.09% |
96 | [bgp, project, slice] | [NODE_VAR[image]] | [1] | ### | 191231 | 9528936 | 0.13% | 0.02% | 80.99% | 30.11% |
97 | [table, bgp, join, project] | [NODE_VAR[sitelink], NODE_VAR[lemma], NODE_VAR[item]] | [3, 1, 1] | ### | 190557 | 5629812 | 0.13% | 0.01% | 81.12% | 30.13% |
98 | [bgp, filter, project] | [NODE_VAR[name], NODE_VAR[item]] | [3, 1] | ### | 189693 | 92460024 | 0.13% | 0.23% | 81.26% | 30.35% |
99 | [bgp, bgp, leftjoin, filter, bgp, filter, union] | [NODE_VAR[p], NODE_VAR[property], NODE_VAR[constraint], NODE_VAR[val], NODE_VAR[conflictValue], NODE_VAR[?0]] | [2, 2, 4, 2, 3, 1] | ### | 183585 | 8264402 | 0.13% | 0.02% | 81.38% | 30.37% |
100 | [bgp, bgp, service, join, project, distinct] | [NODE_VAR[label], NODE_VAR[article], NODE_VAR[item]] | [1, 3, 2] | ### | 181158 | 8469655 | 0.13% | 0.02% | 81.51% | 30.39% |
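The two cumulative columns are running sums of the per-class shares, with classes sorted by decreasing request count. A minimal Python sketch of that computation follows; the function name and the toy numbers are hypothetical, and the total is passed in explicitly because it covers all requests, not only the listed classes.

```python
from itertools import accumulate


def cumulative_shares(request_counts, total_requests):
    """Return (share %, cumulative share %) per class, largest class first."""
    ordered = sorted(request_counts, reverse=True)
    running = accumulate(ordered)
    return [
        (100 * count / total_requests, 100 * cumulative / total_requests)
        for count, cumulative in zip(ordered, running)
    ]


# Hypothetical toy numbers, not taken from the table above.
for share, cumulative in cumulative_shares([30, 50, 20], total_requests=200):
    print(f"{share:.2f}% | {cumulative:.2f}%")
```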
Detailed analysis of query classes
In this section we provide a detailed analysis of the top 23 query-classes from the previous table. They are the query-classes counting more than 1 million queries each, cumulatively representing 60 percent of all queries made to WDQS in November 2020.
Query class 1
Example query:
SELECT ?q
{
?q wdt:P31/wdt:P279* wd:Q16521 .
{
{
VALUES ?prop { wdt:P225 wdt:P1420 } .
?q ?prop 'Tradescantia aff. pallida Bradely 24980'
} UNION {
?q skos:altLabel 'Tradescantia aff. pallida Bradely 24980'@en
}
}
}
- The query searches for an entity that is an instance/subclass (wdt:P31/wdt:P279*) of taxon (Q16521), by taxon-name (P225), taxon-synonym (P1420) or altLabel.
- 8.99% of requests for 0.80% of query-time: queries are efficient, which is good for the most repeated query-class on WDQS.
- The query pattern always reuses the same properties and items.
- The query pattern uses the truthy subgraph.
- Except for the path wdt:P31/wdt:P279*, the query pattern is a one-hop query.
Query class 2
Example query:
SELECT ?q
{
?q wdt:P31/wdt:P279* wd:Q16521 .
{
{
VALUES ?prop { wdt:P225 wdt:P1420 } .
?q ?prop 'Cloning vector M13plex07'
} UNION {
?q skos:altLabel 'Cloning vector M13plex07'@en
} UNION {
?q skos:altLabel 'Cloning vector var. M13plex07'@en
}
}
}
- This is a variant of the class 1 query, with an additional altLabel search component.
- 5.99% of requests for 0.55% of query-time (1.35% is the cumulative value for classes 1 and 2): queries are efficient, roughly as efficient as in the previous class.
- The query pattern always reuses the same properties and items.
- The query pattern uses the truthy subgraph.
- Except for the path wdt:P31/wdt:P279*, the query pattern is a one-hop query.
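The efficiency remarks can be checked by dividing sum_query_time by requests for a class. The Python sketch below does this for classes 1, 2 and 4, using figures copied from the query-class table. It assumes that sum_query_time is expressed in milliseconds, which is consistent with the query-time class names but is not stated explicitly; the function name is hypothetical.

```python
# (requests, sum_query_time) copied from the query-class table above.
# Assumption: sum_query_time is expressed in milliseconds.
CLASSES = {
    1: (12992308, 329238149),
    2: (8649136, 224082155),
    4: (7392382, 2923532705),
}


def mean_query_time_ms(requests, sum_query_time):
    """Average time per request for one query-class."""
    return sum_query_time / requests


for index, (requests, sum_query_time) in CLASSES.items():
    mean = mean_query_time_ms(requests, sum_query_time)
    print(f"class {index}: {mean:.1f} ms per request")
```

Under that assumption, classes 1 and 2 both average roughly 25 ms per request, while class 4 is more than ten times slower per request.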
Query class 3
Example query:
SELECT DISTINCT ?q
{
VALUES ?q { wd:Q95665608 } .
?q wdt:P569 ?born ; wdt:P570 ?died.
FILTER ( year(?born)=1860).
FILTER ( year(?died)=1884 )
}
- This query validates that some defined entities (here Q95665608; there could be multiple values) have the truthy properties born (P569) and died (P570) set to given years.
- 5.22% of requests for 0.76% of query-time: queries are efficient.
- The query pattern always reuses the truthy born (P569) and died (P570) properties, as well as the year function for both values.
- The query pattern uses the truthy subgraph.
- The query pattern is a multi-value one-hop query (one-hop pattern over multiple defined items).
Query class 4
Example query:
SELECT
?company
?companyLabel ?countryLabel ?ownerOf ?industryLabel ?hqLabel
?hqPostalCode ?hqStreet ?hqStreetDep ?hqlon ?hqlat ?extckr
?legalFormLabel ?parent ?ownedBy
?ISIN ?legalEntityIdentifier ?openCorporatesID ?OKPO_ID ?hungarianCompanyID
?companiesHouseID ?germanTaxAuthorityID ?EUTransparencyRegisterID ?DUNSnumber ?danishP_number ?GS1code ?dataGouvFrOrganizationID
?permID ?bloombergCompanyID ?australianBusinessNumber ?australianCompanyNumber ?australianRegisteredBodyNumber
?czechRegistrationID ?austrianFirmenbuchnummer ?expediaHotelID ?centralIndexKey
?companySize ?UNSPSCCode ?inception
?legalName ?streetAddress ?website ?subsidiary
WHERE
{
VALUES (?company) {
(wd:Q911347) (wd:Q1472987) (wd:Q1486934) (wd:Q1550912) (wd:Q1632461) (wd:Q1770909)
(wd:Q1771942) (wd:Q2132023) (wd:Q2300932) (wd:Q3519156) (wd:Q5337066) (wd:Q6537850)
(wd:Q17073302) (wd:Q22101796) (wd:Q22799638) (wd:Q23134166) (wd:Q28971310) (wd:Q28971308)
(wd:Q28971309) (wd:Q28974630) (wd:Q28974631) (wd:Q28974632) (wd:Q29790564) (wd:Q41598511)
(wd:Q45208078) (wd:Q97927899) (wd:Q98456097) (wd:Q7529) (wd:Q26989) (wd:Q43449)
(wd:Q53222) (wd:Q89070794) (wd:Q89070792) (wd:Q89070798) (wd:Q89070800) (wd:Q89070805)
(wd:Q89070808) (wd:Q89070812) (wd:Q89071138) (wd:Q89071143) (wd:Q89071140) (wd:Q89071147)
(wd:Q89071149) (wd:Q89071153) (wd:Q89071158) (wd:Q89071163) (wd:Q89071170) (wd:Q89071178)
(wd:Q89071176) (wd:Q89071182) (wd:Q89071185) (wd:Q89071189) (wd:Q89071193) (wd:Q89071198)
(wd:Q89071196) (wd:Q89071200) (wd:Q89071228) (wd:Q89071229) (wd:Q89071234) (wd:Q89071239)
(wd:Q89071247) (wd:Q89071244) (wd:Q89071250) (wd:Q89071253) (wd:Q89071259) (wd:Q89071256)
(wd:Q89071263) (wd:Q89071266) (wd:Q89071268) (wd:Q89071272) (wd:Q89071276) (wd:Q89071280)
(wd:Q89071286) (wd:Q89071284) (wd:Q89071290) (wd:Q89071293) (wd:Q89071296) (wd:Q89071303)
(wd:Q89071300) (wd:Q89071306) (wd:Q89071315) (wd:Q89071312) (wd:Q89071318) (wd:Q89071319)
(wd:Q89071321) (wd:Q89071326) (wd:Q89071331) (wd:Q89071336) (wd:Q89071340) (wd:Q89071347)
(wd:Q89071344) (wd:Q89071351) (wd:Q89071354) (wd:Q89071359) (wd:Q89071362) (wd:Q89071367)
(wd:Q89071365) (wd:Q89071369) (wd:Q89071372) (wd:Q89071376)
}
{
OPTIONAL {
?company p:P159 ?hqLS.
OPTIONAL { ?hqLS pq:P281 ?hqPostalCode. }
OPTIONAL { ?hqLS pq:P6375 ?hqStreet. }
OPTIONAL { ?hqLS pq:P969 ?hqStreetDep. }
#todo: located on street (P669)
OPTIONAL { ?hqLS ps:P159 ?hq. }
OPTIONAL {
?hqLS pqv:P625 ?c.
?c wikibase:geoLongitude ?hqlon.
?c wikibase:geoLatitude ?hqlat. }
}
OPTIONAL { ?company wdt:P946 ?ISIN. }
OPTIONAL { ?company wdt:P1278 ?legalEntityIdentifier. }
OPTIONAL { ?company wdt:P1320 ?openCorporatesID. }
OPTIONAL { ?company wdt:P2391 ?OKPO_ID. }
OPTIONAL { ?company wdt:P2619 ?hungarianCompanyID. }
OPTIONAL { ?company wdt:P2622 ?companiesHouseID. }
OPTIONAL { ?company wdt:P2628 ?germanTaxAuthorityID. }
OPTIONAL { ?company wdt:P2657 ?EUTransparencyRegisterID. }
OPTIONAL { ?company wdt:P2771 ?DUNSnumber. }
OPTIONAL { ?company wdt:P2814 ?danishP_number. }
OPTIONAL { ?company wdt:P3193 ?GS1code. }
OPTIONAL { ?company wdt:P3206 ?dataGouvFrOrganizationID. }
OPTIONAL { ?company wdt:P3347 ?permID. }
OPTIONAL { ?company wdt:P3377 ?bloombergCompanyID. }
OPTIONAL { ?company wdt:P3548 ?australianBusinessNumber. }
OPTIONAL { ?company wdt:P3549 ?australianCompanyNumber. }
OPTIONAL { ?company wdt:P3551 ?australianRegisteredBodyNumber. }
OPTIONAL { ?company wdt:P4156 ?czechRegistrationID. }
OPTIONAL { ?company wdt:P5285 ?austrianFirmenbuchnummer. }
OPTIONAL { ?company wdt:P5651 ?expediaHotelID. }
OPTIONAL { ?company wdt:P1128 ?companySize. }
OPTIONAL { ?company wdt:P1454 ?legalForm. }
OPTIONAL { ?company wdt:P2167 ?UNSPSCCode. }
OPTIONAL { ?company wdt:P749 ?parent. }
OPTIONAL { ?company wdt:P571 ?inception. }
OPTIONAL { ?company wdt:P127 ?ownedBy. }
OPTIONAL { ?company wdt:P1448 ?legalName. }
OPTIONAL { ?company wdt:P6375 ?streetAddress. }
OPTIONAL { ?company wdt:P5531 ?centralIndexKey. }
}
UNION {
?company p:P414 ?exchangeStm.
OPTIONAL { ?exchangeStm pq:P249 ?t. }
?exchangeStm ps:P414 ?e.
?e rdfs:label ?l
FILTER (LANG(?l) = "en")
BIND(IF (BOUND(?t), CONCAT(?l, ":", ?t), ?l) as ?extckr).
}
UNION
{ ?company wdt:P1830 ?ownerOf. }
UNION
{ ?company wdt:P452 ?industry. }
UNION
{ ?company wdt:P856 ?website. }
UNION
{ ?company wdt:P17 ?country. }
UNION
{ ?company wdt:P355 ?subsidiary. }
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
- This query extracts company information (a lot!) from a predefined list of entities (here Q911347, Q1472987, etc.).
- 5.12% of requests for 7.12% of query-time: these queries take more time than the average query and are repeated a lot.
- The properties and variables used to extract related information are the same across the query class.
- The query pattern mostly uses the truthy subgraph, but also uses statements in two places (for the headquarters location and for the stock exchange on which the company is traded).
- The query pattern is a multi-value one-hop query (one-hop pattern over multiple defined items) except for the headquarters and stock-exchange information.
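One way to make the efficiency remarks comparable across classes is to divide a class's share of query-time by its share of requests: a ratio above 1 means that its queries cost more than the average query. The Python sketch below uses only the percentages quoted on this page; the helper name relative_cost is hypothetical and is not part of the original analysis.

```python
def relative_cost(request_share: float, time_share: float) -> float:
    """Ratio of query-time share to request share (both in percent).

    A value of 1.0 means the class costs as much per request as the
    average WDQS query. Illustrative helper only.
    """
    if request_share <= 0:
        raise ValueError("request_share must be positive")
    return time_share / request_share


# Shares quoted for query class 4: 5.12% of requests, 7.12% of query-time.
class_4 = relative_cost(5.12, 7.12)
# Shares quoted for query class 3, for comparison: 5.22% and 0.76%.
class_3 = relative_cost(5.22, 0.76)

print(f"class 4: {class_4:.2f}x the average per-request cost")
print(f"class 3: {class_3:.2f}x the average per-request cost")
```

With these figures class 4 lands above 1 and class 3 well below it, which matches the qualitative remarks made for the two classes.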
Query class 5
Example query:
SELECT ?pid ?prop
WHERE {
?pid wdt:P2250 ?prop.
FILTER (?pid = wd:Q2910370)
}
- Queries from this class perform a single-property lookup for a given entity.
- 4.88% of requests for 0.44% of query-time: queries are efficient.
- The query pattern reuses properties a lot: P2250, P2131, P2132 and P1081 were each used in more than 1M requests, for instance.
- The query pattern doesn't reuse items much: a maximum of 3206 repetitions, for Q1998791 for instance.
- The query pattern filters all queries with the formula ?pid = ITEM.
- The query pattern uses the truthy subgraph.
- The query pattern is a one-hop query.
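The formula ?pid = ITEM means that queries of this class differ only by the item (and the property) they mention. A rough way to see this is to mask entity IDs before comparing query texts. The Python snippet below is only a sketch of that idea: the regular expressions and the helper name mask_entities are assumptions made for illustration, and this is not the method used to build the query classes.

```python
import re

# Hypothetical patterns for items and truthy properties.
ITEM_RE = re.compile(r"\bwd:Q\d+\b")
PROP_RE = re.compile(r"\bwdt:P\d+\b")


def mask_entities(query: str) -> str:
    """Replace concrete item and truthy-property IDs with placeholders."""
    masked = ITEM_RE.sub("wd:ITEM", query)
    masked = PROP_RE.sub("wdt:PROP", masked)
    # Collapse whitespace so that formatting differences do not matter.
    return " ".join(masked.split())


q1 = "SELECT ?pid ?prop WHERE { ?pid wdt:P2250 ?prop. FILTER (?pid = wd:Q2910370) }"
q2 = "SELECT ?pid ?prop WHERE { ?pid wdt:P2131 ?prop.  FILTER (?pid = wd:Q1998791) }"

# Both queries reduce to the same masked text.
print(mask_entities(q1))
print(mask_entities(q1) == mask_entities(q2))
```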
Query class 6
Example query:
SELECT ?base ?prop ?parent
WHERE {
# hint:Query hint:optimizer "None".
VALUES ?base { wd:Q72903266 wd:Q9312 wd:Q46139 wd:Q797892 wd:Q17280087 }
VALUES ?class { wd:Q451553 }
?parent (wdt:P31|wdt:P279) ?class .
?parent ?prop ?base .
[] wikibase:directClaim ?prop .
}
- Queries from this class check that the entity parent is linked by P31 or P279 to the class value(s), and has truthy links to the base entities.
- 3.99% of requests for 2.86% of query-time: queries are not very efficient.
- The query pattern reuses the properties P31 and P279 as well as wikibase:directClaim.
- The query pattern uses the truthy subgraph but in a non-direct way: it filters out non-truthy links, as the property is looked for and not defined.
- The query pattern is a one-hop query.
Query class 7
Example query:
SELECT ?person ?personLabel ?givenNameLabel ?familyNameLabel ?countryLabel ?personDesc ?article
WHERE
{
VALUES ?person { wd:Q1291170 wd:Q1291179 wd:Q1291185 wd:Q1291204 wd:Q1291224 wd:Q1291583 wd:Q1292140 wd:Q1293389 wd:Q1294306 wd:Q1300083 wd:Q1301016 wd:Q1304154 wd:Q1898836 wd:Q1898893 wd:Q516722 wd:Q516909 wd:Q3293896 wd:Q3294127 wd:Q3295144 wd:Q2628217 }
?person wdt:P27 ?country;
#wdt:P734 ?familyName;
#wdt:P735 ?givenName;
rdfs:label ?personLabel;
schema:description ?personDesc.
OPTIONAL {
?person wdt:P734 ?familyName.
?familyName rdfs:label ?familyNameLabel FILTER(LANG(?familyNameLabel) = "pt").
}
OPTIONAL {
?person wdt:P735 ?givenName.
?givenName rdfs:label ?givenNameLabel FILTER(LANG(?givenNameLabel) = "pt").
}
?country rdfs:label ?countryLabel.
#?givenName rdfs:label ?givenNameLabel.
?article schema:about ?person;
schema:inLanguage "pt";
schema:isPartOf <https://pt.wikipedia.org/> .
FILTER(LANG(?personLabel) = "pt").
FILTER(LANG(?countryLabel) = "pt").
#FILTER(LANG(?givenNameLabel) = "en").
FILTER(LANG(?personDesc) = "pt").
}
- Queries from this class extract information (country of citizenship P27, family name P734 and given name P735, specifying the label language and the description language) about persons in a list who are the subject of a Wikipedia article.
- 2.24% of requests for 0.54% of query-time: queries are efficient.
- The query pattern reuses the properties P27, P734 and P735 as well as rdfs:label, schema:description, schema:about, schema:inLanguage and schema:isPartOf.
- Literals in the query pattern vary for languages (fr, pt, it, es, de, en).
- The query pattern uses the truthy subgraph.
- The query pattern is a two-hop query from a defined subset of entities.
Query class 8
Example query:
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX schema: <http://schema.org/>
PREFIX ebsco: <http://ebscohost.com/ontology/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX sesame: <http://www.openrdf.org/schema/sesame#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX fn: <http://www.w3.org/2005/xpath-functions#>
SELECT ?wiki_description ?wiki
WHERE {
?wiki <http://www.w3.org/2000/01/rdf-schema#label> | skos:altLabel "Dussumieria"@stq.
?wiki schema:description ?wiki_description.
filter(lang(?wiki_description)='en')
}
- Queries from this class get items from their label or alternative-label and return their id and description.
- 2.24% of requests for 0.54% of query-time: queries are efficient.
- The query pattern reuses the properties rdfs:label, skos:altLabel and schema:description.
- The query pattern is used only for the en language (both labels and description).
- The query pattern uses the truthy subgraph (actually label, altLabel and description only).
- The query pattern is a one-hop query.
Query class 9
Example query:
SELECT ?pname ?o ?olabel WHERE
{
{
wd:Q4379890 ?directClaimP ?o . # Get the truthy triples.
?p wikibase:directClaim ?directClaimP . # Find the Wikibase properties linked
?p rdfs:label ?pname . # to the truthy triples' predicates
FILTER ( lang(?pname) = "en" ) # and their labels, in English.
OPTIONAL {
?o rdfs:label ?olabel
FILTER ( lang(?olabel) = "en" )
}
} UNION {
wd:Q4379890 schema:description ?olabel
FILTER ( lang(?olabel) = "en" )
BIND('_description' AS ?pname)
} UNION {
wd:Q4379890 rdfs:label ?olabel
FILTER ( lang(?olabel) = "en" )
BIND('_name' AS ?pname)
}
}
- Queries from this class get, from a given item (here Q4379890), all truthy links (?p ?o) with the property label and the object label (if any). They also gather the item description and label.
- 2.18% of requests for 1.41% of query-time: queries are somewhat efficient.
- The query pattern reuses the properties wikibase:directClaim, rdfs:label and schema:description.
- The query pattern uses the en language only.
- The query pattern uses the truthy subgraph but in a non-direct way: it filters out non-truthy links, as the property is looked for and not defined.
- The query pattern is a one-hop query from a single entity if not considering labels.
Query class 10
Example query:
SELECT ?sitelink
WHERE {
BIND(wd:Q7430400 AS ?wikipedia)
?sitelink schema:about ?wikipedia .
FILTER REGEX(STR(?sitelink), '.wikipedia.org/wiki/') .
}
- Queries from this class check that an item (here Q7430400) has a related Wikipedia article.
- 2.03% of requests for 0.19% of query-time: queries are efficient.
- The query pattern reuses the property schema:about and the literal '.wikipedia.org/wiki/'.
- The query pattern uses the truthy subgraph with the schema:about property only.
- The query pattern is a one-hop query to a single entity.
Query class 11
Example query:
SELECT ?q ?x
{
wd:Q37938621 wdt:P131* ?q .
?q wdt:P300 ?x
}
- Queries from this class check that an item (here Q37938621) is located in the administrative territorial entity (wdt:P131*) of a country subdivision (wdt:P300).
- 1.92% of requests for 0.56% of query-time: queries are efficient.
- The query pattern reuses the properties P131 and P300.
- The query pattern uses the truthy subgraph.
- The query is a two-hop query with a path from a single entity.
Query class 12
Example query:
SELECT *
WHERE {
wd:Q57612 wdt:P166 ?x.
?x wdt:P31*/wdt:P279* wd:Q684511
}
- Queries from this class check that an item (here Q57612) has an object by a defined property (here P166), and that this object is an instance (or subtype) of a defined entity (here Q684511).
- 1.76% of requests for 1.61% of query-time: queries are average-efficient.
- The query pattern reuses the property path P31*/P279* on almost all queries (99.96%), and the property of the first pattern varies (P106 is used for 11.4% of the requests, P136 for 7.5%, P27 for 5.7%, etc.).
- When the property path P31*/P279* is not used, the path is either P31* or P279*.
- The query pattern uses the truthy subgraph.
- The query is a two-hop query with a path from a single entity to a defined entity (one-hop filtering if edges are not oriented).
Query class 13
Example query:
#Tool: wdi_core fastrun
SELECT
(STRAFTER(STR(?property), 'entity/') as ?id)
?property
?propertyType
?propertyLabel
?propertyDescription
?propertyAltLabel
(STRAFTER(STR(?propertyType), '#') as ?value_type)
?formatter_url
WHERE {
VALUES (?property) { (wd:P7314) }
?property wikibase:propertyType ?propertyType .
OPTIONAL {
?property wdt:P1630 ?formatter_url.
}
SERVICE wikibase:label {
bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
ORDER BY ASC(xsd:integer(STRAFTER(STR(?property), 'P')))
- Queries from this class extract information on defined properties (here P7314).
- 1.69% of requests for 0.25% of query-time: queries are efficient.
- The query pattern reuses the properties wikibase:propertyType and P1630.
- The query pattern uses only the 'entity/', '#' and [AUTO_LANGUAGE],en literals.
- The query pattern uses the truthy subgraph.
- The query pattern is a single-hop query from a single entity.
Query class 14
Example query:
SELECT ?p ?oLabel
WHERE {
wd:Q241961 ?p ?o.
FILTER (?p IN (wdt:P31, wdt:P279, wdt:P361, wdt:P527, wdt:P138, wdt:P21,
wdt:P569, wdt:P3150, wdt:P1477, wdt:P570, wdt:P276, wdt:P664,
wdt:P710, wdt:P832, wdt:P1110, wdt:P144, wdt:P136, wdt:P135,
wdt:P179, wdt:P840))
SERVICE wikibase:label {
bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en".
}
}
- Queries from this class get the properties and objects associated with a given item (here Q241961), restricted to a predefined set of possible properties.
- 1.59% of requests for 0.14% of query-time: queries are efficient.
- The query pattern always and only reuses the properties defined in the filter (P31, P279...).
- The query pattern uses only the [AUTO_LANGUAGE],en literal.
- The query pattern uses the truthy subgraph.
- The query pattern is a one-hop query from a single entity, with labels.
Query class 15
- Exactly the same query pattern as in query class 14 (just above); the only change is the user-agent, which is a different version of the tool.
- 1.57% of requests for 0.13% of query-time: queries are efficient.
Query class 16
- Exactly the same query pattern as in query class 13 (3 sections above); the only change is the user-agent, which is a different version of the tool.
- 1.46% of requests for 0.23% of query-time: queries are efficient.
Query class 17
Example query:
SELECT DISTINCT ?x
WHERE {
wd:Q7320857 wdt:P172 ?x
}
- Queries from this class get the distinct objects of links from a defined subject (here Q7320857) and property (here P172).
- 1.43% of requests for 0.11% of query-time: queries are efficient.
- The query pattern property varies, with 7 main properties each used in ~10% of requests: P21, P19, P172, P136, P27, P31, P279.
- The query pattern uses the truthy subgraph.
- The query is a one-hop query from a single entity.
Query class 18
Example query:
SELECT ?base ?prop ?parent
WHERE {
hint:Query hint:optimizer "None".
VALUES ?base { wd:Q4985551 wd:Q20205579 wd:Q4985508 wd:Q20146615 wd:Q28360255 }
VALUES ?class { wd:Q1549591 }
?parent (wdt:P31|wdt:P279) ?class .
?parent ?prop ?base .
[] wikibase:directClaim ?prop .
}
- Queries from this class are a variation of query class 6, from the same user-agent, that includes a hint telling the query executor not to optimize the query. The query semantics are identical.
- 1.10% of requests for 5.62% of query-time: queries are inefficient.
- There is a difference in query efficiency between queries run with the optimizer (query class 6) and without it (this query class): queries run with the optimizer are on average 7 times faster than those run without.
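The request and query-time shares of the two classes give a rough cross-check of this factor. Taking the shares quoted for query class 6 (3.99% of requests, 2.86% of query-time) and for this class (1.10% and 5.62%), the per-request cost ratio can be estimated as follows. This is a back-of-the-envelope sketch in Python with an illustrative helper name, not the computation used in the analysis.

```python
def time_per_request(request_share: float, time_share: float) -> float:
    """Query-time share spent per unit of request share (dimensionless)."""
    return time_share / request_share


with_optimizer = time_per_request(3.99, 2.86)     # query class 6
without_optimizer = time_per_request(1.10, 5.62)  # query class 18

# How much more time a request costs when the optimizer is disabled.
slowdown = without_optimizer / with_optimizer
print(f"without the optimizer: about {slowdown:.1f}x more time per request")
```

The estimate comes out slightly above 7, in line with the average factor stated above.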
Query class 19
Example query:
SELECT DISTINCT ?subject
WHERE {
?subject wdt:P846 '8250806' .
}
- Queries from this class get distinct items from their Global Biodiversity Information Facility ID (P846).
- 1.05% of requests for 0.14% of query-time: queries are efficient.
- The query pattern uses only P846.
- The query pattern uses the truthy subgraph.
- The query is a one-hop query with a single object.
Query class 20
Example query:
SELECT ?wd ?wdLabel ?ps_ ?ps_Label {
VALUES (?person) {(wd:Q2664524)}
?person ?p ?statement .
?statement ?ps ?ps_ .
?wd wikibase:claim ?p.
?wd wikibase:statementProperty ?ps.
# ?wd rdfs:label ?wdLabel.
# FILTER(LANG(?wdLabel) = ""||LANGMATCHES(LANG(?wdLabel), "en"))
# ?ps_ rdfs:label ?ps_Label.
# FILTER(LANG(?ps_Label) = ""||LANGMATCHES(LANG(?ps_Label), "en"))
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
} ORDER BY ?wd ?statement
- Queries from this class extract all statements having a specified entity as subject (here Q2664524), getting the statements' property and object as well as their labels.
- 1.02% of requests for 0.46% of query-time: queries are efficient.
- The query pattern always and only reuses the properties wikibase:claim and wikibase:statementProperty.
- The query pattern uses only the en literal.
- The query pattern is NOT restricted to the truthy subgraph.
- The query pattern is a one-hop query from a single entity, with labels.
Query class 21
Example query:
SELECT ?wt
WHERE {
wd:Q1348015 wdt:P2949 ?wt
}
- Queries from this class retrieve the WikiTree person ID (P2949) for a given item.
- 1.02% of requests for 0.09% of query-time: queries are very efficient.
- The query pattern uses only the property P2949.
- The query pattern uses only the truthy subgraph.
- The query pattern is a one-hop query from a single entity.
Query class 22
Example query:
SELECT ?label_sv ?label_en ?art_sv ?art_en {
VALUES (?person) {(wd:Q954668)}
OPTIONAL {
?person rdfs:label ?label_en.
FILTER(LANG(?label_en) = ""||LANGMATCHES(LANG(?label_en), "en")).
}
OPTIONAL {
FILTER(LANG(?label_sv) = ""||LANGMATCHES(LANG(?label_sv), "sv")).
?person rdfs:label ?label_sv.
}
OPTIONAL {
?art_sv schema:about ?person ; schema:isPartOf <https://sv.wikipedia.org/> .
}
OPTIONAL {
?art_en schema:about ?person ; schema:isPartOf <https://en.wikipedia.org/> .
}
}
- Queries from this class retrieve the label of an item (here Q954668) in both the sv and en languages, and also its Wikipedia article in both the sv and en languages.
- 1.02% of requests for 0.21% of query-time: queries are efficient.
- The query pattern always and only uses the properties rdfs:label, schema:about and schema:isPartOf.
- The query pattern uses only the literals sv and en.
- The query pattern uses only the truthy subgraph.
- The query pattern is a one-hop query to a single object.
Query class 23
Example query:
prefix schema: <http://schema.org/>
SELECT * WHERE {
<https://en.wikipedia.org/wiki/Saint_Cera> schema:about ?item .
}
- Queries from this class retrieve the item referenced from a Wikipedia article.
- 0.86% of requests for 0.11% of query-time: queries are efficient.
- The query pattern always and only uses the property schema:about.
- The query pattern queries only en.wikipedia.org articles.
- The query pattern uses only the truthy subgraph.
- The query pattern is a one-hop query from a single entity.
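As a sanity check, the shares listed for the 23 query classes can be summed and compared with the totals given in the TL;DR (about 60% of all requests and 25% of all query-time). The Python sketch below simply copies the percentages from this page; the variable names are illustrative.

```python
# (request share %, query-time share %) for query classes 1 to 23,
# copied from the per-class bullets on this page.
SHARES = [
    (8.99, 0.80), (5.99, 1.35), (5.22, 0.76), (5.12, 7.12), (4.88, 0.44),
    (3.99, 2.86), (2.24, 0.54), (2.24, 0.54), (2.18, 1.41), (2.03, 0.19),
    (1.92, 0.56), (1.76, 1.61), (1.69, 0.25), (1.59, 0.14), (1.57, 0.13),
    (1.46, 0.23), (1.43, 0.11), (1.10, 5.62), (1.05, 0.14), (1.02, 0.46),
    (1.02, 0.09), (1.02, 0.21), (0.86, 0.11),
]

total_requests = sum(req for req, _ in SHARES)
total_time = sum(t for _, t in SHARES)

print(f"{len(SHARES)} classes: {total_requests:.2f}% of requests, "
      f"{total_time:.2f}% of query-time")
```

Both sums land close to the rounded 60% and 25% figures of the TL;DR.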