Query–subquery nets for Horn knowledge bases in first-order logic

ABSTRACT We formulate query–subquery nets and use them to create the first framework for developing algorithms for evaluating queries to Horn knowledge bases with the properties that: the approach is goal-directed; each subquery is processed only once and each supplement tuple, if desired, is transferred only once; operations are done set-at-a-time; and any control strategy can be used. Our intention is to increase efficiency of query processing by eliminating redundant computation, increasing adjustability (i.e. easiness in adopting advanced control strategies) and reducing the number of accesses to the secondary storage. For this purpose, we transform a logic program into an equivalent net structure and use it to determine which set of tuples or subqueries should be evaluated at each step, in an efficient way. The framework forms a generic evaluation method called QSQN, which is sound and complete and has polynomial time data complexity when the term-depth bound is fixed. The experimental results confirm the efficiency and usefulness of this method.


Introduction
Query processing is an important research area in computer science and information technology. Huang, Green, and Loo (2011) wrote that 'we are witnessing an exciting revival of interest in recursive Datalog queries in a variety of emerging application domains such as data integration, information extraction, networking, program analysis, security, and cloud computing'. During the last decade, rule-based query languages, including languages related to Datalog, were also intensively studied for the Semantic Web (e.g. in Cao, Nguyen, & Szalas, 2014; Eiter, Ianni, Lukasiewicz, & Schindlauer, 2011; Ruckhaus, Ruiz, & Vidal, 2008). Since deductive databases and knowledge bases are widely used in practical applications, improvements in processing recursive queries are always desirable, and the topic merits further research.
Horn knowledge bases are extensions of Datalog deductive databases without the range-restrictedness and function-free conditions. As argued by Madalińska-Bugaj and Nguyen (2012), the Horn fragment of first-order logic plays an important role in knowledge representation and reasoning. A Horn knowledge base consists of a positive logic program for defining intensional predicates and an instance of extensional predicates. When the knowledge base is too big, the extensional and intensional relations cannot all be kept in the computer memory, and query evaluation cannot be done entirely in memory. In such cases, the system usually has to load (resp. unload) relations from (resp. to) the secondary storage. Thus, in contrast to logic programming, efficient access to the secondary storage is a very important aspect for Horn knowledge bases.
This work studies query processing for Horn knowledge bases, a topic that has not been studied as well as query processing for Datalog-like deductive databases or the theory and techniques of logic programming. The survey by Ramakrishnan and Ullman (1995) provides a good overview of deductive database systems, with a focus on implementation techniques. The book by Abiteboul, Hull, and Vianu (1995) is also a good source for references. We refer the reader to Madalińska-Bugaj and Nguyen (2012) for a discussion on query processing for Horn knowledge bases.
The most well-known methods for evaluating queries to Datalog deductive databases or Horn knowledge bases are QSQR (Madalińska-Bugaj & Nguyen, 2012; Vieille, 1989) and Magic-Sets (Bancilhon, Maier, Sagiv, & Ullman, 1986; Beeri & Ramakrishnan, 1991; Rohmer, Lescouer, & Kerisit, 1986). By Magic-Sets we mean the evaluation method that combines the magic-set transformation with the improved semi-naive bottom-up evaluation method. Both of these methods are goal-directed. As observed by Vieille (1989), the QSQR approach is like iterative deepening search. It allows redundant recomputations (Madalińska-Bugaj & Nguyen, 2012, Remark 3.2). On the other hand, the Magic-Sets method applies breadth-first search. The following example shows that the breadth-first approach is not always efficient.
Example 1.1 The order of program clauses and the order of atoms in the bodies of program clauses may be essential, for example, when the positive logic program that defines intensional predicates is specified using the Prolog programming style. In such cases, the top-down depth-first approach may be much more efficient than the breadth-first approach. Here is such an example, in which $p$, $q_1$ and $q_2$ are intensional predicates, $r_1$ and $r_2$ are extensional predicates, $x$, $y$ and $z$ are variables, and $a_i$ and $b_{i,j}$ are constant symbols:
• the positive logic program: […]
• the extensional instance (illustrated in Figure 1): […]
Notice that the depth-first approach needs only $\Theta(m)$ steps for evaluating the query, while the breadth-first approach performs $\Theta(m \cdot n)$ steps. When $n$ is comparable to $m$, the difference is substantial. The magic-sets transformation does not help in this case.
Our postulate is that the breadth-first approach (including the Magic-Sets evaluation method) is inflexible and not always efficient. Of course, depth-first search is not always good either. The aim of this work is to develop an evaluation method for Horn knowledge bases that is more efficient than the QSQR evaluation method and more adjustable than the Magic-Sets evaluation method. In particular, a good method should be not only set-oriented and goal-directed but should also reduce computational redundancy as much as possible and allow various control strategies. This paper is a revised and extended version of our conference paper (Nguyen & Cao, 2012) and forms a chapter of the Ph.D. dissertation (Cao, 2016). In this work, we formulate query-subquery nets and use them to develop the first framework for developing algorithms for evaluating queries to Horn knowledge bases with the following properties:
• the approach is goal-directed,
• each subquery is processed only once,
• each supplement tuple, if desired, is transferred only once,
• operations are done set-at-a-time,
• any control strategy can be used.
The intention of our framework is to increase efficiency of query processing by eliminating redundant computation, increasing adjustability (see Note 1) and reducing the number of accesses to the secondary storage. The framework forms a generic evaluation method called QSQN. As a supplement, the Ph.D. dissertation (Cao, 2016) also contains:
• proofs of soundness and completeness of the QSQN method,
• data complexity analysis for the QSQN method,
• a control strategy called the Improved Depth-First Control Strategy (IDFS), which together with QSQN forms the QSQN-IDFS method,
• experiments for comparing the QSQN-IDFS, Magic-Sets and QSQR methods w.r.t.
  ◦ the number of read/write operations on relations,
  ◦ the maximum number of tuples/subqueries kept in the computer memory,
  ◦ the number of accesses to the secondary storage when the memory is limited.
To deal with function symbols, we use a term-depth bound for atoms and substitutions occurring in the computation and propose to use iterative deepening search, which iteratively increases the term-depth bound. Similar to the work by Madalińska-Bugaj and Nguyen (2012), but in contrast to the QSQ framework for Datalog queries (Abiteboul et al., 1995), our framework for Horn knowledge bases does not use adornments and annotations, but uses substitutions instead. This is natural for the case with function symbols and without the range-restrictedness condition.
Our experiments show that the QSQN-IDFS evaluation method is more efficient than the QSQR evaluation method and as competitive as the Magic-Sets evaluation method. In the case when the order of program clauses and the order of atoms in the bodies of program clauses are essential as in Prolog programming, the QSQN-IDFS evaluation method usually outperforms the Magic-Sets method. As QSQN-IDFS is just an instance of the generic QSQN evaluation method, we claim that this generic method is useful.
The rest of this paper is structured as follows. Section 2 recalls the most important notation and definitions of first-order logic, logic programming and Horn knowledge bases. Section 3 presents our QSQN evaluation method for Horn knowledge bases. The preliminary experiments are discussed in Section 4. Conclusions are given in Section 5.

Preliminaries
First-order logic is considered in this work and we assume that the reader is familiar with it. We recall only the most important definitions for our work and refer the reader to Lloyd (1987) and Madalińska-Bugaj and Nguyen (2012) for further reading.
A signature for first-order logic consists of constant symbols, function symbols, variable symbols and predicate symbols. Terms, atoms and formulas are defined in the usual way. An expression is either a term, a tuple of terms, a formula without quantifiers or a list of formulas without quantifiers. A simple expression is either a term or an atom.
A substitution is a finite set of the form $\theta = \{x_1/t_1, \ldots, x_k/t_k\}$, where $x_1, \ldots, x_k$ are pairwise distinct variables, $t_1, \ldots, t_k$ are terms, and $t_i \neq x_i$ for all $1 \le i \le k$. We denote the empty substitution by $\varepsilon$.
The domain of a substitution $\theta$ is the set $\mathit{dom}(\theta) = \{x_1, \ldots, x_k\}$, and the range of $\theta$ is the set $\mathit{range}(\theta) = \{t_1, \ldots, t_k\}$. The restriction of a substitution $\theta$ to a set $X$ of variables is the substitution $\theta_{|X} = \{(x/t) \in \theta \mid x \in X\}$.
Let $\theta = \{x_1/t_1, \ldots, x_k/t_k\}$ be a substitution and $E$ be an expression. Then $E\theta$, the instance of $E$ by $\theta$, is the expression obtained from $E$ by simultaneously replacing all occurrences of the variable $x_i$ in $E$ by the term $t_i$, for $1 \le i \le k$.
Let $\theta = \{x_1/t_1, \ldots, x_k/t_k\}$ and $\delta = \{y_1/s_1, \ldots, y_h/s_h\}$ be substitutions (where $x_1, \ldots, x_k$ are pairwise distinct variables, and $y_1, \ldots, y_h$ are also pairwise distinct variables). Then the composition $\theta\delta$ of $\theta$ and $\delta$ is the substitution obtained from the sequence $\{x_1/(t_1\delta), \ldots, x_k/(t_k\delta), y_1/s_1, \ldots, y_h/s_h\}$ by deleting any binding $x_i/(t_i\delta)$ for which $x_i = (t_i\delta)$ and deleting any binding $y_j/s_j$ for which $y_j \in \{x_1, \ldots, x_k\}$.
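The definitions of instance and composition can be sketched in code as follows. This is only an illustration under an assumed term encoding that is not part of the paper: a variable is a string starting with "?", a constant is any other string, and a compound term is a tuple whose first element is the function symbol. The names `is_var`, `apply` and `compose` are hypothetical.

```python
# Assumed encoding (not from the paper): "?x" is a variable, "a" a constant,
# and ("f", t1, ..., tn) the compound term f(t1, ..., tn).

def is_var(t):
    return isinstance(t, str) and t.startswith("?")

def apply(t, theta):
    """Return the instance of t by theta (simultaneous replacement)."""
    if is_var(t):
        return theta.get(t, t)
    if isinstance(t, tuple):
        return (t[0],) + tuple(apply(a, theta) for a in t[1:])
    return t  # constant

def compose(theta, delta):
    """Return the composition of theta and delta, following the definition."""
    result = {}
    for x, t in theta.items():
        t_delta = apply(t, delta)
        if t_delta != x:        # delete bindings x/(t delta) with x = t delta
            result[x] = t_delta
    for y, s in delta.items():
        if y not in theta:      # delete bindings y/s with y in dom(theta)
            result[y] = s
    return result

theta = {"?x": ("f", "?y"), "?y": "?z"}
delta = {"?x": "a", "?y": "b", "?z": "?y"}
composed = compose(theta, delta)
expr = ("g", "?x", "?y", "?z")
```

With these inputs, applying `theta` and then `delta` to `expr` gives the same result as applying `composed` once, which is the property that motivates the two deletion rules in the definition.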
If $\theta$ and $\delta$ are substitutions such that $\theta\delta = \delta\theta = \varepsilon$, then we call them renaming substitutions. We say that an expression $E$ is a variant of an expression $E'$ if there exist substitutions $\theta$ and $\gamma$ such that $E = E'\theta$ and $E' = E\gamma$.
A substitution $\theta$ is more general than a substitution $\delta$ if there exists a substitution $\gamma$ such that $\delta = \theta\gamma$. Let $\Gamma$ be a set of simple expressions. A substitution $\theta$ is called a unifier for $\Gamma$ if $\Gamma\theta$ is a singleton. If $\Gamma\theta = \{\varphi\}$ then we say that $\theta$ unifies $\Gamma$ (into $\varphi$). A unifier $\theta$ for $\Gamma$ is called a most general unifier (mgu) for $\Gamma$ if $\theta$ is more general than every unifier of $\Gamma$.
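As an illustration of the notion of an mgu, the following is a minimal sketch of a standard unification procedure with occurs check for two simple expressions. It is not taken from the paper; it assumes a hypothetical encoding in which a variable is a string starting with "?", a constant is any other string, and an atom or compound term is a tuple headed by its predicate or function symbol.

```python
def is_var(t):
    return isinstance(t, str) and t.startswith("?")

def walk(t, theta):
    """Follow bindings in theta until reaching a non-variable or unbound variable."""
    while is_var(t) and t in theta:
        t = theta[t]
    return t

def occurs(x, t, theta):
    """True if variable x occurs in t under the bindings collected so far."""
    t = walk(t, theta)
    if t == x:
        return True
    return isinstance(t, tuple) and any(occurs(x, a, theta) for a in t[1:])

def resolve(t, theta):
    """Apply the collected bindings to t exhaustively."""
    t = walk(t, theta)
    if isinstance(t, tuple):
        return (t[0],) + tuple(resolve(a, theta) for a in t[1:])
    return t

def mgu(s, t):
    """Return an mgu of s and t as a dict, or None if they are not unifiable."""
    theta = {}
    stack = [(s, t)]
    while stack:
        a, b = stack.pop()
        a, b = walk(a, theta), walk(b, theta)
        if a == b:
            continue
        if is_var(a):
            if occurs(a, b, theta):
                return None
            theta[a] = b
        elif is_var(b):
            if occurs(b, a, theta):
                return None
            theta[b] = a
        elif (isinstance(a, tuple) and isinstance(b, tuple)
              and a[0] == b[0] and len(a) == len(b)):
            stack.extend(zip(a[1:], b[1:]))
        else:
            return None  # clash of symbols or arities
    return {x: resolve(x, theta) for x in theta}

unifier = mgu(("p", "?x", ("f", "?y")), ("p", "a", ("f", "?x")))
```

The occurs check is what makes `mgu("?x", ("f", "?x"))` fail, since no finite term can equal a term that properly contains it.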
The term-depth of an expression (resp. a substitution) is the maximal nesting depth of function symbols occurring in that expression (resp. substitution). If $E$ is an expression or a substitution then by $\mathit{Vars}(E)$ we denote the set of variables occurring in $E$. If $\varphi$ is a formula then by $\forall(\varphi)$ we denote the universal closure of $\varphi$, which is the formula obtained by adding a universal quantifier for every variable having a free occurrence in $\varphi$.
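The term-depth can be computed by a short recursion. The sketch below assumes a hypothetical encoding (not from the paper) in which variables and constants are strings and a compound term is a tuple headed by its function symbol; a tuple of terms and a substitution are handled by taking the maximum over their components.

```python
def term_depth(t):
    """Nesting depth of function symbols in a term (0 for variables and constants)."""
    if isinstance(t, tuple):  # compound term ("f", t1, ..., tn)
        return 1 + max((term_depth(a) for a in t[1:]), default=0)
    return 0

def tuple_depth(terms):
    """Term-depth of a tuple of terms, e.g. the arguments of an atom."""
    return max((term_depth(t) for t in terms), default=0)

def subst_depth(theta):
    """Term-depth of a substitution given as a dict from variables to terms."""
    return tuple_depth(theta.values())

nested = ("f", ("g", "?x"))
```

Under this encoding, function-free tuples have term-depth 0, which matches the use of the bound $l = 0$ for queries and instances without function symbols.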
A (positive or definite) program clause is a formula of the form $\forall(A \lor \neg B_1 \lor \ldots \lor \neg B_k)$ (with $k \ge 0$), written as $A \leftarrow B_1, \ldots, B_k$, where $A, B_1, \ldots, B_k$ are atoms (i.e. atomic formulas). $A$ is called the head, and $(B_1, \ldots, B_k)$ the body of the program clause. If $p$ is the predicate of $A$ then the program clause is called a program clause defining $p$.
A positive (or definite) logic program is a finite set of program clauses. A goal (also called a negative clause) is a formula of the form $\forall(\neg B_1 \lor \ldots \lor \neg B_k)$, written as $\leftarrow B_1, \ldots, B_k$, where $B_1, \ldots, B_k$ are atoms. If $k = 1$ then the goal is called a unary goal. If $k = 0$ then the goal stands for falsity and is called the empty goal (or the empty clause) and denoted by $\Box$.
A fresh variant of a formula $\varphi$, where $\varphi$ can be an atom, a goal $\leftarrow A$ or a program clause $A \leftarrow B_1, \ldots, B_k$ (written without quantifiers), is a formula $\varphi\theta$, where $\theta$ is a renaming substitution such that $\mathit{dom}(\theta) = \mathit{Vars}(\varphi)$ and $\mathit{range}(\theta)$ consists of variables that were not used in the computation.
As with deductive databases, we classify each predicate as either intensional or extensional. A generalized tuple is a tuple of terms, which may contain function symbols and variables. A generalized relation is a set of generalized tuples of the same arity. A Horn knowledge base is defined to be a pair consisting of a positive logic program for defining intensional predicates and a generalized extensional instance, which is a function mapping each extensional n-ary predicate to an n-ary generalized relation. Note that intensional predicates are defined by a positive logic program which may contain function symbols and not be range-restricted. From now on, we use the term 'relation' to mean a generalized relation, and the term 'extensional instance' to mean a generalized extensional instance.
Given a Horn knowledge base specified by a positive logic program $P$ and an extensional instance $I$, a query to the knowledge base is a positive formula $\varphi(\bar{x})$ without quantifiers, where $\bar{x}$ is a tuple of all the variables of $\varphi$. A (correct) answer for the query is a tuple $\bar{t}$ of terms of the same length as $\bar{x}$ such that $P \cup I \models \forall(\varphi(\bar{t}))$. When measuring data complexity, we assume that $P$ and $\varphi$ are fixed, while $I$ varies. Thus, the pair $(P, \varphi(\bar{x}))$ is treated as a query to the extensional instance $I$. We will use the term 'query' in this meaning.
It can be shown that every query $(P, \varphi(\bar{x}))$ can be transformed in polynomial time to an equivalent query of the form $(P', q(\bar{x}))$ over a signature extended with new intensional predicates, including $q$. The equivalence means that, for every extensional instance $I$ and every tuple $\bar{t}$ of terms of the same length as $\bar{x}$, $P \cup I \models \forall(\varphi(\bar{t}))$ if and only if $P' \cup I \models \forall(q(\bar{t}))$. The transformation is based on introducing new predicates for defining complex subformulas occurring in the query. For example, if $\varphi = p(x) \land r(x, y)$, then $P' = P \cup \{q(x, y) \leftarrow p(x), r(x, y)\}$, where $q$ is a new intensional predicate. Without loss of generality, we will consider only queries of the form $(P, q(\bar{x}))$, where $q$ is an intensional predicate. Answering such a query on an extensional instance $I$ is to find (correct) answers for $P \cup I \cup \{\leftarrow q(\bar{x})\}$.

The query-subquery net evaluation method
In this section, we generalize the QSQ approach for Horn knowledge bases. Given a positive logic program, we make a query-subquery net structure and use it as a flow control network to determine which subqueries in which nodes should be processed next. We show how the data are transferred through edges of the net. We also propose an algorithm together with related procedures and functions for this framework. The algorithm repeatedly selects an active edge and fires the operation for the edge to transfer unprocessed data. Such a selection is decided by the adopted control strategy, which can be arbitrary. In addition, the processing is divided into smaller steps which can be delayed to maximize adjustability and allow various control strategies. The intention is to increase efficiency of query processing by eliminating redundant computation, increasing adjustability and reducing the number of accesses to the secondary storage.
In what follows, $P$ is a positive logic program and $\varphi_1, \ldots, \varphi_m$ are all the program clauses of $P$, with $\varphi_i = (A_i \leftarrow B_{i,1}, \ldots, B_{i,n_i})$, for $1 \le i \le m$ and $n_i \ge 0$. The following definition shows how to make a QSQ-net structure from the given logic program $P$.
Definition 3.1 (Query-Subquery Net Structure): A query-subquery net structure (QSQ-net structure for short) of $P$ is a tuple $(V, E, T)$ such that:
• $V$ is a set of nodes that consists of:
  ◦ $\mathit{input\_}p$ and $\mathit{ans\_}p$, for each intensional predicate $p$ of $P$,
  ◦ $\mathit{pre\_filter}_i$, $\mathit{filter}_{i,1}$, …, $\mathit{filter}_{i,n_i}$, $\mathit{post\_filter}_i$, for each $1 \le i \le m$.
• $E$ is a set of edges that consists of:
  ◦ $(\mathit{input\_}p, \mathit{pre\_filter}_i)$ and $(\mathit{post\_filter}_i, \mathit{ans\_}p)$, for each $1 \le i \le m$, where $p$ is the predicate of $A_i$,
  ◦ $(\mathit{filter}_{i,j}, \mathit{input\_}p)$ and $(\mathit{ans\_}p, \mathit{filter}_{i,j})$, for each intensional predicate $p$ and each $1 \le i \le m$ and $1 \le j \le n_i$ such that $B_{i,j}$ is an atom of $p$.
• $T$ is a function, called the memorizing type of the net structure, mapping each node $\mathit{filter}_{i,j} \in V$ such that the predicate of $B_{i,j}$ is extensional to true or false. If $T(\mathit{filter}_{i,j}) = \mathit{false}$ (and the predicate of $B_{i,j}$ is extensional) then subqueries for $\mathit{filter}_{i,j}$ are always processed immediately, without being accumulated at $\mathit{filter}_{i,j}$.
If $(v, w) \in E$ then we call $w$ a successor of $v$, and $v$ a predecessor of $w$. Note that $V$ and $E$ are uniquely specified by $P$. We call the pair $(V, E)$ the QSQ topological structure of $P$.
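The construction of the QSQ topological structure can be sketched as follows. This is a simplified illustration, not the authors' implementation: clauses are reduced to predicate names only, node names are plain strings, and the function name `qsq_topology` is hypothetical. The sketch also assumes that each clause contributes a chain of edges from its pre_filter node through its filter nodes to its post_filter node, which is how the successors of these nodes are described later in this section. The sample program is the one used in Example 3.5.

```python
def qsq_topology(clauses):
    """Build (nodes, edges) from clauses given as (head_pred, [body_preds])."""
    # Assumption: the intensional predicates are exactly those with a defining clause.
    intensional = {head for head, _ in clauses}
    nodes, edges = [], []
    for p in sorted(intensional):
        nodes += [f"input_{p}", f"ans_{p}"]
    for i, (head, body) in enumerate(clauses, start=1):
        chain = ([f"pre_filter_{i}"]
                 + [f"filter_{i},{j}" for j in range(1, len(body) + 1)]
                 + [f"post_filter_{i}"])
        nodes += chain
        edges.append((f"input_{head}", chain[0]))
        edges += list(zip(chain, chain[1:]))   # assumed chain inside the clause
        edges.append((chain[-1], f"ans_{head}"))
        for j, p in enumerate(body, start=1):
            if p in intensional:               # B_{i,j} is an atom of an intensional p
                edges.append((f"filter_{i},{j}", f"input_{p}"))
                edges.append((f"ans_{p}", f"filter_{i},{j}"))
    return nodes, edges

# p(x,y) <- q(x,y);  p(x,y) <- q(x,z), p(z,y);  s(x) <- p(b,x)
program = [("p", ["q"]), ("p", ["q", "p"]), ("s", ["p"])]
nodes, edges = qsq_topology(program)
```

For this program the sketch produces 17 edges, which agrees with the edge names $E_1$ to $E_{17}$ used in Example 3.5.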
Example 3.2 Consider the following (recursive) positive logic program, where x, y and z are variables, p is an intensional predicate, and q is an extensional predicate: q(x, z), p(z, y).
Its QSQ topological structure is illustrated in Figure 2.
Example 3.3 Consider the following positive logic program, where x, y and z are variables, p and r are intensional predicates, q, s and t are extensional predicates: This program is a modified version of an example from Zhou and Sato (2003). Figure 3 illustrates the QSQ topological structure of this program.
Definition 3.4 (Query-Subquery Net): A query-subquery net (QSQ-net for short) of $P$ is a tuple $N = (V, E, T, C)$ such that $(V, E, T)$ is a QSQ-net structure of $P$, $C$ is a mapping that associates each node $v \in V$ with a structure called the contents of $v$, and the following conditions are satisfied:
• $C(v)$, where $v = \mathit{input\_}p$ or $v = \mathit{ans\_}p$ for an intensional predicate $p$ of $P$, consists of:
  ◦ $\mathit{tuples}(v)$: a set of generalized tuples of the same arity as $p$,
  […]
For $v = \mathit{filter}_{i,j}$ and $p$ being the predicate of $A_i$, the meaning of a subquery $(\bar{t}, \delta) \in \mathit{subqueries}(v)$ is that: for processing a goal $\leftarrow p(\bar{s})$ with $\bar{s} \in \mathit{tuples}(\mathit{input\_}p)$ using the program clause $\varphi_i = (A_i \leftarrow B_{i,1}, \ldots, B_{i,n_i})$, unification of $p(\bar{s})$ and $A_i$ as well as processing of the subgoals $B_{i,1}, \ldots, B_{i,j-1}$ were done, amongst others, by using a sequence of mgu's $\gamma_0, \ldots, \gamma_{j-1}$ with the property that $\bar{t} = \bar{s}\gamma_0 \ldots \gamma_{j-1}$ and $\delta = (\gamma_0 \ldots \gamma_{j-1})_{|\mathit{Vars}((B_{i,j}, \ldots, B_{i,n_i}))}$.
An empty QSQ-net of $P$ is a QSQ-net of $P$ such that all the sets of the form $\mathit{tuples}(v)$, $\mathit{unprocessed}(v, w)$, $\mathit{subqueries}(v)$, $\mathit{unprocessed\_subqueries}(v)$, $\mathit{unprocessed\_subqueries}_2(v)$ or $\mathit{unprocessed\_tuples}(v)$ are empty.
In a QSQ-net, if $v = \mathit{pre\_filter}_i$ or $v = \mathit{post\_filter}_i$ or ($v = \mathit{filter}_{i,j}$ and $\mathit{kind}(v) = \mathit{extensional}$) then $v$ has exactly one successor, which we denote by $\mathit{succ}(v)$.
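If $v$ is $\mathit{filter}_{i,j}$ with $\mathit{kind}(v) = \mathit{intensional}$ and $\mathit{pred}(v) = p$ then $v$ has exactly two successors. In that case, let $\mathit{succ}(v) = \mathit{filter}_{i,j+1}$ if $n_i > j$, and $\mathit{succ}(v) = \mathit{post\_filter}_i$ otherwise,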

JOURNAL OF INFORMATION AND TELECOMMUNICATION
and $\mathit{succ}_2(v) = \mathit{input\_}p$. The set $\mathit{unprocessed\_subqueries}(v)$ is used for (i.e. corresponds to) the edge $(v, \mathit{succ}(v))$, while $\mathit{unprocessed\_subqueries}_2(v)$ is used for the edge $(v, \mathit{succ}_2(v))$.
Note that if $\mathit{succ}(v) = w$ then $\mathit{post\_vars}(v) = \mathit{pre\_vars}(w)$. In particular, $\mathit{post\_vars}(\mathit{filter}_{i,n_i}) = \mathit{pre\_vars}(\mathit{post\_filter}_i) = \emptyset$.
The formats of data transferred through edges of a QSQ-net are specified as follows:
• data transferred through an edge of the form $(\mathit{input\_}p, v)$, $(v, \mathit{input\_}p)$, $(v, \mathit{ans\_}p)$ or $(\mathit{ans\_}p, v)$ is a finite set of generalized tuples of the same arity as $p$,
• data transferred through an edge $(u, v)$ with $v = \mathit{filter}_{i,j}$ and $u$ not being of the form $\mathit{ans\_}p$ is a finite set of subqueries that can be added to $\mathit{subqueries}(v)$,
• data transferred through an edge $(v, \mathit{post\_filter}_i)$ is a set of subqueries $(\bar{t}, \varepsilon)$ such that $\bar{t}$ is a generalized tuple of the same arity as the predicate of $A_i$.
If $(\bar{t}, \delta)$ and $(\bar{t}', \delta')$ are subqueries that can be transferred through an edge to $v$ then we say that $(\bar{t}, \delta)$ is more general than […]
Informally, a subquery $(\bar{t}, \delta)$ transferred through an edge to $v$ is processed as follows:
• […] to add a fresh variant of it to $\mathit{tuples}(\mathit{input\_}p)$,
• for each currently existing $\bar{t}' \in \mathit{tuples}(\mathit{ans\_}p)$, if $\mathit{atom}(v)\delta = B_{i,j}\delta$ is unifiable with a fresh variant of $p(\bar{t}')$ by an mgu $\gamma$ then transfer the subquery $(\bar{t}\gamma, (\delta\gamma)_{|\mathit{post\_vars}(v)})$ through $(v, \mathit{succ}(v))$,
• store the subquery $(\bar{t}, \delta)$ in $\mathit{subqueries}(v)$, and later, for each new $\bar{t}'$ added to $\mathit{tuples}(\mathit{ans\_}p)$, if $\mathit{atom}(v)\delta = B_{i,j}\delta$ is unifiable with a fresh variant of $p(\bar{t}')$ by an mgu $\gamma$ then transfer the subquery $(\bar{t}\gamma, (\delta\gamma)_{|\mathit{post\_vars}(v)})$ through $(v, \mathit{succ}(v))$,
• if $v = \mathit{post\_filter}_i$ and $p$ is the predicate of $A_i$ then transfer the tuple $\bar{t}$ through $(\mathit{post\_filter}_i, \mathit{ans\_}p)$ to add it to $\mathit{tuples}(\mathit{ans\_}p)$.
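The step that combines a subquery with answer tuples at an intensional filter node can be sketched as below. This is only an illustration under assumptions that are not part of the paper: variables are strings starting with "?", constants are other strings, atoms and compound terms are tuples headed by their symbol, and all function names (`process`, `fresh_variant`, and so on) are hypothetical. The sketch covers only the case "unify $\mathit{atom}(v)\delta$ with a fresh variant of $p(\bar{t}')$ and transfer $(\bar{t}\gamma, (\delta\gamma)_{|\mathit{post\_vars}(v)})$"; subsumption checks, term-depth bounds and the accumulation of unprocessed data are omitted.

```python
import itertools

def is_var(t):
    return isinstance(t, str) and t.startswith("?")

def apply(t, th):
    if is_var(t):
        return th.get(t, t)
    if isinstance(t, tuple):
        return (t[0],) + tuple(apply(a, th) for a in t[1:])
    return t

def compose(th, de):
    out = {x: apply(t, de) for x, t in th.items() if apply(t, de) != x}
    out.update({y: s for y, s in de.items() if y not in th})
    return out

def variables(t):
    if is_var(t):
        return {t}
    if isinstance(t, tuple):
        return set().union(*(variables(a) for a in t[1:]))
    return set()

def mgu(a, b, th=None):
    """Return an idempotent mgu extending th, or None if unification fails."""
    th = {} if th is None else th
    a, b = apply(a, th), apply(b, th)
    if a == b:
        return th
    if is_var(b) and not is_var(a):
        a, b = b, a
    if is_var(a):
        return None if a in variables(b) else compose(th, {a: b})
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and a[0] == b[0] and len(a) == len(b)):
        for x, y in zip(a[1:], b[1:]):
            th = mgu(x, y, th)
            if th is None:
                return None
        return th
    return None

_counter = itertools.count(1)

def fresh_variant(atom):
    """Rename the variables of atom to names not used before."""
    n = next(_counter)
    return apply(atom, {x: f"{x}_{n}" for x in variables(atom)})

def process(subquery, atom, post_vars, pred, answers):
    """Return the subqueries to transfer through (v, succ(v))."""
    t, delta = subquery
    goal = apply(atom, delta)                        # atom(v) delta
    out = []
    for ans in answers:                              # each t' in tuples(ans_p)
        gamma = mgu(goal, fresh_variant((pred,) + tuple(ans)))
        if gamma is not None:
            dg = compose(delta, gamma)
            restricted = {x: s for x, s in dg.items() if x in post_vars}
            out.append((tuple(apply(a, gamma) for a in t), restricted))
    return out

# Node for the body atom p(b, x) of the clause s(x) <- p(b, x); post_vars is empty
# because this is the last filter node of its clause.
out = process((("?x",), {}), ("p", "b", "?x"), set(), "p",
              [("b", "c"), ("b", "d"), ("c", "e")])
```

In this small run, only the answer tuples whose first component is `b` unify with the goal, so two subqueries are produced and the tuple `(c, e)` contributes nothing.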
Formally, the processing of a subquery is more sophisticated, so that:
• every subquery or input/answer tuple that is subsumed by another one or has a term-depth greater than a fixed bound $l$ is ignored,
• the processing is divided into smaller steps which can be delayed at each node to maximize adjustability and allow various control strategies,
• the processing is done set-at-a-time (e.g. for all the unprocessed subqueries accumulated in a given node).
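The first condition can be illustrated with a small filter for tuples. The sketch assumes, as an interpretation not spelled out in this section, that a tuple is subsumed by another one when it is an instance of it, and it uses one-way matching to test this. The term encoding (variables as strings starting with "?", compound terms as tuples headed by a function symbol) and the names `match`, `subsumes` and `add_tuple` are hypothetical.

```python
def is_var(t):
    return isinstance(t, str) and t.startswith("?")

def term_depth(t):
    if isinstance(t, tuple):
        return 1 + max((term_depth(a) for a in t[1:]), default=0)
    return 0

def match(pattern, target, theta):
    """Extend theta so that pattern instantiated by theta equals target, else None."""
    if is_var(pattern):
        if pattern in theta:
            return theta if theta[pattern] == target else None
        return {**theta, pattern: target}
    if isinstance(pattern, tuple):
        if not (isinstance(target, tuple) and pattern[0] == target[0]
                and len(pattern) == len(target)):
            return None
        for p, t in zip(pattern[1:], target[1:]):
            theta = match(p, t, theta)
            if theta is None:
                return None
        return theta
    return theta if pattern == target else None

def subsumes(general, specific):
    """True if the tuple `specific` is an instance of the tuple `general`."""
    wrap = lambda ts: ("tuple",) + tuple(ts)
    return match(wrap(general), wrap(specific), {}) is not None

def add_tuple(relation, new, bound):
    """Add `new` to `relation` unless it is too deep or already subsumed."""
    if max((term_depth(t) for t in new), default=0) > bound:
        return False
    if any(subsumes(old, new) for old in relation):
        return False
    relation.append(new)
    return True

rel = []
results = [
    add_tuple(rel, ("b", "?x"), 0),         # new, accepted
    add_tuple(rel, ("b", "c"), 0),          # instance of (b, ?x), ignored
    add_tuple(rel, ("c", "e"), 0),          # new, accepted
    add_tuple(rel, (("f", "a"), "e"), 0),   # term-depth 1 exceeds the bound 0
]
```

Ignoring subsumed and over-deep tuples at the point of insertion is what keeps the accumulated sets finite for a fixed bound in this simplified setting.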
The procedure $\mathit{transfer}(D, u, v)$ specifies the effects of transferring data $D$ through an edge $(u, v)$ of a QSQ-net. If $v$ is of the form $\mathit{pre\_filter}_i$ or $\mathit{post\_filter}_i$ or ($v = \mathit{filter}_{i,j}$ and $\mathit{kind}(v) = \mathit{extensional}$ and $T(v) = \mathit{false}$) then the input $D$ for $v$ is processed immediately and appropriate data $\Gamma$ are produced and transferred through $(v, \mathit{succ}(v))$. Otherwise, the input $D$ for $v$ is not processed immediately, but accumulated into the structure of $v$ in an appropriate way.
The function $\mathit{active\text{-}edge}(u, v)$ returns true for an edge $(u, v)$ if data accumulated in $u$ can be processed to produce some data to transfer through $(u, v)$, and returns false otherwise. If $\mathit{active\text{-}edge}(u, v)$ is true then the procedure $\mathit{fire}(u, v)$ processes the data accumulated in $u$ that have not been processed before, in order to transfer appropriate data through the edge $(u, v)$. This procedure uses the procedure $\mathit{transfer}(D, u, v)$. Both procedures $\mathit{fire}(u, v)$ and $\mathit{transfer}(D, u, v)$ use a parameter $l$ as a term-depth bound for tuples and substitutions.
Algorithm 1 presents our QSQN evaluation method for Horn knowledge bases. It repeatedly selects an active edge and fires the operation for the edge. Such a selection is decided by the adopted control strategy, which can be arbitrary.
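The control skeleton described here (repeatedly select an active edge and fire it, with the selection delegated to a control strategy) can be sketched as follows. The class `ToyNet` is a deliberately simplified stand-in invented for this illustration: its nodes only accumulate items and forward new ones, so it models neither unification nor filtering, and it is not the authors' Algorithm 1. The names `evaluate`, `active_edges` and `fire` are hypothetical.

```python
import random

class ToyNet:
    """Toy net: every edge (u, v) carries a set of not yet processed items of u."""

    def __init__(self, edges):
        self.succ = {}
        for u, v in edges:
            self.succ.setdefault(u, []).append(v)
            self.succ.setdefault(v, [])
        self.data = {u: set() for u in self.succ}
        self.unprocessed = {(u, v): set() for u, v in edges}

    def add(self, node, items):
        new = set(items) - self.data[node]      # each item is accumulated only once
        self.data[node] |= new
        for w in self.succ[node]:
            self.unprocessed[(node, w)] |= new

    def active_edges(self):
        return [e for e, pending in self.unprocessed.items() if pending]

    def fire(self, edge):
        batch, self.unprocessed[edge] = self.unprocessed[edge], set()
        self.add(edge[1], batch)                # set-at-a-time transfer


def evaluate(net, select):
    """Fire edges chosen by the control strategy `select` until none is active."""
    while True:
        active = net.active_edges()
        if not active:
            return net
        net.fire(select(active))


edges = [("a", "b"), ("b", "c"), ("c", "a"), ("a", "d")]

net1 = ToyNet(edges)
net1.add("a", {1, 2})
evaluate(net1, lambda active: active[0])        # a fixed-order strategy

net2 = ToyNet(edges)
net2.add("a", {1, 2})
evaluate(net2, random.Random(0).choice)         # a randomized strategy
```

In this toy setting both strategies reach the same final contents, which mirrors the design goal that the choice of control strategy affects the order of work but not the result.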
Example 3.5 This example illustrates Algorithm 1 step by step. Consider the following Horn knowledge base $(P, I)$ and the query $s(x)$, where $p$ and $s$ are intensional predicates, $q$ is an extensional predicate, $x$, $y$, $z$ are variables, and $a$–$o$, $u$ are constant symbols:
• the positive logic program $P$:
  $p(x, y) \leftarrow q(x, y)$
  $p(x, y) \leftarrow q(x, z), p(z, y)$
  $s(x) \leftarrow p(b, x)$,
• the extensional instance $I$ (illustrated in Figure 5): […]
The QSQ topological structure of $P$ is presented in Figure 6. We give below a trace of a run of Algorithm 1 that evaluates the query $(P, s(x))$ on the extensional instance $I$, using term-depth bound $l = 0$ and the memorizing type $T$ that maps each node $v$ such that $\mathit{kind}(v) = \mathit{extensional}$ (i.e. $\mathit{filter}_{1,1}$ and $\mathit{filter}_{2,1}$) to false. For convenience, we denote the edges of the net with names $E_1$–$E_{17}$ as shown in Figure 6.
Algorithm 1 starts with an empty QSQ-net. It then adds a fresh variant $(x_1)$ of $(x)$ to the empty sets $\mathit{tuples}(\mathit{input\_}s)$ and $\mathit{unprocessed}(E_{14})$. Next, it repeatedly selects an active edge and fires the edge. Assume that the selection is done as follows.
After processing $\mathit{unprocessed\_subqueries}_2(\mathit{filter}_{3,1})$, the algorithm empties this set and transfers $\{(b, x_1)\}$ through $E_{13}$. This adds a fresh variant $(b, x_2)$ of the tuple $(b, x_1)$ to the empty sets $\mathit{tuples}(\mathit{input\_}p)$, $\mathit{unprocessed}(E_1)$ and $\mathit{unprocessed}(E_7)$.

Figure 6. The QSQ topological structure of the program given in Example 3.5.

Table 1. A summary of the steps at which the data (i.e. tuples) were added to $\mathit{input\_}s$, $\mathit{ans\_}s$, $\mathit{input\_}p$ and $\mathit{ans\_}p$, respectively.

After processing $\mathit{unprocessed\_subqueries}_2(\mathit{filter}_{2,2})$, the algorithm empties this set and transfers $\{(e, x_6)\}$ through the edge $E_6$. This adds a fresh variant $(e, x_8)$ of the tuple $(e, x_6)$ to the sets $\mathit{tuples}(\mathit{input\_}p)$, $\mathit{unprocessed}(E_1)$ and $\mathit{unprocessed}(E_7)$. After these steps, we have: […]
After processing $\mathit{unprocessed}(E_1)$, the algorithm empties this set and transfers […] $e), \varepsilon)\}$, which in turn is then transferred through the edge […]
After processing $\mathit{unprocessed\_tuples}(\mathit{filter}_{2,2})$ and $\mathit{unprocessed\_subqueries}(\mathit{filter}_{2,2})$, the algorithm empties these sets and transfers $\{((b, d), \varepsilon), ((b, g), \varepsilon), ((c, e), \varepsilon)\}$ through the edge $E_{10}$. This produces $\{(b, d), (b, g), (c, e)\}$, which is then transferred through the edge $E_{11}$ and added to the sets $\mathit{tuples}(\mathit{ans\_}p)$, $\mathit{unprocessed}(E_5)$ and $\mathit{unprocessed}(E_{12})$. After these steps, we have: […]
After processing $\mathit{unprocessed}(E_5)$, the algorithm empties this set and transfers $\{(b, d), (b, g), (c, e)\}$ through the edge $E_5$ and adds these tuples to the empty set $\mathit{unprocessed\_tuples}(\mathit{filter}_{2,2})$.
(13) $E_{10} - E_{11}$: After processing $\mathit{unprocessed\_tuples}(\mathit{filter}_{2,2})$, the algorithm empties this set and transfers $\{((b, e), \varepsilon)\}$ through the edge $E_{10}$. This produces $\{(b, e)\}$, which is then transferred through the edge $E_{11}$ and added to the sets $\mathit{tuples}(\mathit{ans\_}p)$, $\mathit{unprocessed}(E_5)$ and $\mathit{unprocessed}(E_{12})$. After these steps, we have: […]
After processing $\mathit{unprocessed}(E_{12})$, the algorithm empties this set and transfers $\{(b, c),$ […]$, (c, e), (b, e)\}$ through the edge $E_{12}$ and adds these tuples to the empty set $\mathit{unprocessed\_tuples}(\mathit{filter}_{3,1})$.
After processing $\mathit{unprocessed\_tuples}(\mathit{filter}_{3,1})$ and $\mathit{unprocessed\_subqueries}($ […]
The edges $E_5$ and $E_7$ are still active, with $\mathit{unprocessed}(E_5) = \{(b, e)\}$ and $\mathit{unprocessed}(E_7) = \{(e, x_8)\}$. Firing the edge $E_5$ causes the edge $E_{10}$ to become active, but after that, firing the edges $E_7$ and $E_{10}$ does not create data to be transferred.
At this point, no edges are active (in particular, all the attributes $\mathit{unprocessed}$, $\mathit{unprocessed\_subqueries}$, $\mathit{unprocessed\_subqueries}_2$ and $\mathit{unprocessed\_tuples}$ of the nodes in the net are empty sets). The algorithm terminates and returns the set $\mathit{tuples}(\mathit{ans\_}s) = \{(c), (f), (h), (d), (g), (e)\}$. Table 1 summarizes the effects of the steps of this trace. The numbers in bold font indicate the corresponding steps of the trace, which are listed in Example 3.5.
We present below properties of Algorithm 1. Due to lack of space, we refer the reader to Cao (2016) for their proofs.
Soundness: After a run of Algorithm 1 on a query $(P, q(\bar{x}))$ and an extensional instance $I$, for every intensional predicate $p$ of $P$, every tuple $\bar{t} \in \mathit{tuples}(\mathit{ans\_}p)$ is a correct answer in the sense that $P \cup I \models \forall(p(\bar{t}))$.
Completeness: After a run of Algorithm 1 (using parameter $l$) on a query $(P, q(\bar{x}))$ and an extensional instance $I$, for every SLD-refutation of $P \cup I \cup \{\leftarrow q(\bar{x})\}$ that uses the leftmost selection function, does not contain any goal with term-depth greater than $l$ and has a computed answer $\theta$ with term-depth not greater than $l$, there exists $\bar{s} \in \mathit{tuples}(\mathit{ans\_}q)$ such that $\bar{x}\theta$ is an instance of a variant of $\bar{s}$.
Together with the completeness of SLD-resolution (Clark, 1979), this property establishes a relationship between correct answers for $P \cup I \cup \{\leftarrow q(\bar{x})\}$ and the answers computed by Algorithm 1 for the query $(P, q(\bar{x}))$ on the extensional instance $I$.
For queries and extensional instances without function symbols, we take term-depth bound $l = 0$ and obtain the following completeness result, which immediately follows from the above property: After a run of Algorithm 1 using $l = 0$ on a query $(P, q(\bar{x}))$ and an extensional instance $I$ that do not contain function symbols, for every computed answer $\theta$ of an SLD-refutation of $P \cup I \cup \{\leftarrow q(\bar{x})\}$ that uses the leftmost selection function, there exists $\bar{t} \in \mathit{tuples}(\mathit{ans\_}q)$ such that $\bar{x}\theta$ is an instance of a variant of $\bar{t}$.
Data complexity: For a fixed query and a fixed bound l on term-depth, Algorithm 1 runs in polynomial time in the size of the extensional instance.

Preliminary experiments
In Cao (2016), we presented three control strategies, namely DAR (Disk Access Reduction), DFS (Depth-First Strategy) and IDFS (Improved Depth-First Strategy), and implemented QSQN together with these strategies to obtain the corresponding evaluation methods QSQN-DAR, QSQN-DFS and QSQN-IDFS. The intention of DAR is to reduce the number of accesses to the secondary storage. Because our current implementation of the DAR control strategy is not advanced enough and the implemented QSQN-DAR method is not more efficient than the implemented QSQN-IDFS method, we used QSQN-IDFS for comparison with the Magic-Sets and QSQR methods. We compared the QSQN-IDFS, Magic-Sets and QSQR evaluation methods with respect to:
• the number of read/write operations on relations,
• the maximum number of tuples/subqueries kept in the computer memory,
• the number of accesses to the secondary storage when the memory is limited.
Our experiments consider different kinds of logic programs, including non-recursive, tail-recursive and non-tail-recursive programs, as well as logic programs with or without function symbols. We used typical examples from well-known articles related to deductive databases. We also provided new examples. Due to lack of space, we refer the reader to Cao (2016) for more details on control strategies, experimental settings, test cases and experimental results. We report below only one test.
For the Datalog database and the query given in Example 1.1 with $m = n = 100$, the QSQN-IDFS method reads data from relations 361 times, writes data to relations 154 times and keeps at most 204 tuples in the memory, while the corresponding numbers for the Magic-Sets method are 721, 301 and 10,105, and those for the QSQR method are 410, 358 and 356. When the number of tuples kept in the memory is restricted to 5052 (about 50% of the mentioned number 10,105), the Magic-Sets method needs to write relations to the secondary storage 29 times and read them from the secondary storage 60 times (using a certain unloading strategy), while the QSQN-IDFS method reads data from the secondary storage only once and does not need to write data to the secondary storage. When the number of tuples kept in the memory is restricted to 2021 (i.e. 20% of the mentioned number 10,105), the Magic-Sets method fails to evaluate the query, while the QSQN-IDFS method does not. This test illustrates that, when the positive logic program defining intensional predicates is specified using the Prolog programming style, the QSQN-IDFS and QSQR methods (which use depth-first search) are usually more efficient than the Magic-Sets method (which uses breadth-first search).
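The figures reported for this test can be put side by side with a few lines of arithmetic. The snippet below only restates the numbers quoted above and derives ratios from them; the dictionary layout and the helper `ratio` are hypothetical conveniences, not part of the experimental setup.

```python
# Numbers quoted in the text for Example 1.1 with m = n = 100.
results = {
    "QSQN-IDFS":  {"reads": 361, "writes": 154, "max_tuples": 204},
    "Magic-Sets": {"reads": 721, "writes": 301, "max_tuples": 10105},
    "QSQR":       {"reads": 410, "writes": 358, "max_tuples": 356},
}

def ratio(metric, method, baseline="QSQN-IDFS"):
    """How many times larger `metric` is for `method` than for the baseline."""
    return results[method][metric] / results[baseline][metric]

# Memory limits used in the restricted-memory runs, as fractions of 10,105 tuples.
limit_fractions = {limit: limit / results["Magic-Sets"]["max_tuples"]
                   for limit in (5052, 2021)}

for method in ("Magic-Sets", "QSQR"):
    print(method, {m: round(ratio(m, method), 2) for m in ("reads", "writes", "max_tuples")})
print({limit: round(frac, 2) for limit, frac in limit_fractions.items()})
```

The ratios make the main contrast explicit: Magic-Sets performs roughly twice as many reads and writes as QSQN-IDFS in this test, while its peak memory footprint is larger by a much wider margin.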
As can be seen in Cao (2016, Tables 6.1-6.3), the QSQR method is often worse than the QSQN-IDFS and Magic-Sets methods w.r.t. the number of accesses to the secondary storage. As discussed by Madalińska-Bugaj and Nguyen (2012), QSQR uses iterative deepening search and clears input relations at the beginning of each iteration of the main loop, thus it allows redundant recomputations. In addition, the formulation of QSQR in Madalińska-Bugaj and Nguyen (2012) is at a logical level and uses the same relation for the whole sequence of supplements. This requires more relation loading/unloading when the recursive depth is high and no more memory is available.

Conclusions
We have provided the first framework for developing algorithms for evaluating queries to Horn knowledge bases with the properties that: the approach is goal-directed; each subquery is processed only once and each supplement tuple, if desired (see Note 2), is transferred only once; operations are done set-at-a-time; and any control strategy can be used.
Our framework is an adaptation and a generalization of the QSQ approach of Datalog for Horn knowledge bases. One of the key differences is that we do not use adornments and annotations, but use substitutions instead. This is natural for the case with function symbols and without the range-restrictedness condition. When restricted to Datalog queries, it groups operations on the same relation together regardless of adornments and makes it possible to reduce the number of accesses to the secondary storage, although 'joins' would be more complicated.
Our framework forms a generic evaluation method called QSQN. This method is designed so that the query processing is divided into appropriate steps which can be delayed to maximize adjustability and allow various control strategies. In comparison with the most well-known evaluation methods, the generic QSQN evaluation method does not perform redundant recomputations, unlike the QSQR evaluation method, and it is more adjustable than the Magic-Sets evaluation method and thus has essential advantages over it. The QSQN method is sound and complete, and has polynomial time data complexity when the term-depth bound is fixed. Notice the significance of this: one can develop and use any control strategy for QSQN, and the resulting evaluation method is always guaranteed to be sound and complete. Our proofs (Cao, 2016) are important in the context that, without proofs, the methods proposed in Vieille (1986), Abiteboul et al. (1995) and Madalińska-Bugaj and Nguyen (2008) were wrongly claimed to be complete.
Our experiments presented in Cao (2016) show that the QSQN-IDFS evaluation method is more efficient than the QSQR evaluation method and as competitive as the Magic-Sets evaluation method. In the case when the order of program clauses and the order of atoms in the bodies of program clauses are essential as in Prolog programming, the QSQN-IDFS evaluation method usually outperforms the Magic-Sets method. As QSQN-IDFS is just an instance of the generic QSQN evaluation method, we conclude that this generic method is useful.
QSQ-nets are a more intuitive representation than the description of the QSQ approach of Datalog given in Abiteboul et al. (1995). Our notion of QSQ-net makes a connection to flow networks and is intuitive for developing efficient evaluation algorithms. For example, we have incorporated tail-recursion elimination into QSQ-nets (Cao, 2016;Cao & Nguyen, 2015) to obtain the QSQN-TRE method, as well as stratified negation into QSQ-nets (Cao, 2016) to obtain the QSQN-STR method for evaluating queries to stratified knowledge bases.
Notes
1. By 'adjustability' we mean ease of adopting advanced control strategies.
2. When $T(v) = \mathit{false}$ for all nodes $v$ of the form $\mathit{filter}_{i,j}$ with $\mathit{kind}(v) = \mathit{extensional}$.

Disclosure statement
No potential conflict of interest was reported by the authors.

Notes on contributors
Son Thanh Cao is a lecturer at Faculty of Information Technology, Vinh University. He obtained the Ph.D. degree in Computer Science in 2016 from the University of Warsaw. His research interests include logic programming, deductive databases and semantic web.
Linh Anh Nguyen is an associate professor of Computer Science at the Institute of Informatics, University of Warsaw. Since 2014 he has been cooperating with Faculty of Information Technology, Ton Duc Thang University in doing research. He has published more than 90 papers in international scientific journals and proceedings of international conferences/workshops.