
Fixing QTYPE=* Recursion (was Re: Issue: Add a new QTYPE)



Prior to the Thanksgiving break, I received a number of replies from various WG members in the "Add a new QTYPE" thread, going in different directions, and it occurs to me that the branching of the thread is largely due to insufficient specificity on my part. I apologize for that. I've been throwing around terms like "complete response" without really attempting to define them. Truth be told, I'm still thinking QTYPE=* recursion through. This message is an attempt to give some form to my current thinking on the subject, to see whether my clarification proposal has any chance of gaining traction in the WG.

The behavior I'm trying to crystallize with the term "complete response" is that the results of a recursive QTYPE=* query should at all times be identical to the combined results of (theoretically) issuing separate but simultaneous recursive queries for each record type that is known to the querier.

Corollaries:

1) When I say "at all times", I mean that in the case of an intermediate caching server, not only should the *initial* QTYPE=* query yield the same results as if it were actually separate queries, but any *subsequent* QTYPE=* queries of the same name should yield the same results as if all previous QTYPE=* queries had actually been composed of separate queries too, taking into account whatever lingering effects those separate queries would have had on the cache contents. When attempting to answer from cached data, this corollary naturally implies that both positive *and* negative caching must be factors in the overall result. (For the nitpickers out there: CNAMEs, at least, make it impossible for a name to own records of *all* known record types, even if an implementation only has "knowledge" of the 1034/1035 RR definitions. So at least one queried record type for a given name will always get a NODATA response, assuming that the responder is RFC 2308-aware.)
2) If any of the RRsets for record types seen in the previous QTYPE=* query have expired, then a conforming responder cannot give a "complete" answer from cache and *must* recurse to answer the query, although it would be up to the responder whether it recurses the whole QTYPE=* query, or just one or more type-specific queries to fill in the missing RRset(s).
3) One of the necessary rules of this "combining" process is that if negative-caching records with differing expiration timestamps are encountered, the combined result will reflect the earliest of those timestamps. Admittedly, this devalues negative caching somewhat in this particular situation, but it is necessary to maintain the integrity of the mechanism, much as minimizing TTLs within a positively-cached RRset was perceived (RFC 2181) as necessary to maintain cache integrity.
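
To make corollaries 2 and 3 a little more concrete, here is a rough Python sketch of the "combining" step as I read it. It is only an illustration, not anything taken from a real resolver: the CacheEntry type, the combine_from_cache function and the flat per-type cache dictionary are all hypothetical names invented for this example, and timestamps are plain numbers of seconds.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple


@dataclass
class CacheEntry:
    """Hypothetical per-type cache entry for one owner name."""
    expires_at: float                # absolute expiry time, in seconds
    negative: bool                   # True for a cached NODATA result
    rdata: Tuple[str, ...] = ()      # record data for positive entries


def combine_from_cache(
    cache: Dict[str, CacheEntry],
    known_types: Iterable[str],
    now: float,
) -> Optional[Tuple[Dict[str, Tuple[str, ...]], Optional[float]]]:
    """Combine per-type cache entries into one QTYPE=* result.

    Returns (answers, negative_expiry), or None when any known type is
    missing or expired, meaning the responder has to recurse (corollary 2).
    """
    answers: Dict[str, Tuple[str, ...]] = {}
    negative_expiry: Optional[float] = None
    for rtype in known_types:
        entry = cache.get(rtype)
        if entry is None or entry.expires_at <= now:
            return None
        if entry.negative:
            # Corollary 3: the combined result carries the earliest
            # negative-caching expiry seen.
            if negative_expiry is None or entry.expires_at < negative_expiry:
                negative_expiry = entry.expires_at
        else:
            answers[rtype] = entry.rdata
    return answers, negative_expiry


# Example: one positive RRset and two NODATA entries with different expiries.
cache = {
    "A": CacheEntry(expires_at=300.0, negative=False, rdata=("192.0.2.1",)),
    "MX": CacheEntry(expires_at=100.0, negative=True),
    "NS": CacheEntry(expires_at=50.0, negative=True),
}
result = combine_from_cache(cache, ["A", "MX", "NS"], now=10.0)
```

In the example, the combined result carries the A RRset and a negative expiry of 50.0, the earlier of the two NODATA entries. Once that time has passed, or if a known type has no entry at all, the function returns None and the responder would have to recurse.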


Ultimately, my thinking is that reconstructability of "complete" QTYPE=* responses from cache gives rise to a requirement that all QTYPE=* responses, whether from cache or authoritative data, include a "complementary" negative caching record (Authority Section SOA RR) in addition to any answers that are given in the Answer Section. I call this "complementary" because it implies a NODATA for all record types which are absent from the actual answer, e.g. if the answer to a particular QTYPE=* query consists of an A RRset and an MX RRset, then the "complementary" negative caching record would imply NODATA for all other potential RRsets (NS, PTR, SRV, CNAME and so forth). As long as the "complementary" negative caching record and all of the relevant "positively" cached RRsets are still unexpired, a "complete" answer -- which might include updated RDATAs and/or TTLs for the "positive" RRsets, but will not include any RRsets still precluded by the "complementary" negative caching record -- can be constructed from a cache for any subsequent QTYPE=* queries of the same name without having to resort to recursing the query. Particularly-clever resolvers may, under some low-TTL circumstances, still choose to pre-emptively recurse the QTYPE=* query or a type-specific query, in order to improve the quality of their cache and/or their response and therefore avoid unnecessary queries in the near future from downstream resolvers.
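
The cache-side check described above could be sketched roughly as follows. Again, this is only a hedged illustration under my own assumptions: AnyCache, answer_any_from_cache and the dictionary layout of the returned "response" are hypothetical, and the single complementary_expires_at field stands in for the cached Authority Section SOA RR.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class AnyCache:
    """Hypothetical cache state for one owner name."""
    # rtype -> (absolute expiry time, rdata) for positively cached RRsets
    positive: Dict[str, Tuple[float, Tuple[str, ...]]] = field(default_factory=dict)
    # expiry of the "complementary" negative caching record, if one is cached
    complementary_expires_at: Optional[float] = None


def answer_any_from_cache(cache: AnyCache, now: float) -> Optional[dict]:
    """Build a "complete" QTYPE=* answer from cache, or return None.

    None means the answer cannot be reconstructed and the query (or a
    type-specific query) has to be recursed.
    """
    comp = cache.complementary_expires_at
    if comp is None or comp <= now:
        return None  # no unexpired complementary negative record
    answer = {}
    for rtype, (expires_at, rdata) in cache.positive.items():
        if expires_at <= now:
            return None  # a positive RRset has expired
        answer[rtype] = {"ttl": int(expires_at - now), "rdata": rdata}
    # The complementary record implies NODATA for every type not in `answer`.
    return {"answer": answer, "authority_soa_ttl": int(comp - now)}


state = AnyCache(
    positive={
        "A": (300.0, ("192.0.2.1",)),
        "MX": (120.0, ("10 mail.example.",)),
    },
    complementary_expires_at=60.0,
)
fresh = answer_any_from_cache(state, now=10.0)
stale = answer_any_from_cache(state, now=70.0)
```

With these example values, `fresh` holds the A and MX RRsets with decremented TTLs plus the remaining lifetime of the complementary record, while `stale` is None because the complementary record has expired by then.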

"Complementary" negative caching solves, I believe, the "completeness" issue, as defined, but raises the possibility of interoperability problems with modern-but-pre-clarification resolvers which will not be expecting negative-caching records to accompany answerful (ANCOUNT>0) responses. This lack-of-expectation could arise because RFC 2308 defined NODATA negative-caching responses quite narrowly as having "no relevant answers in the answer section". That definition would need to be loosened in order for "complementary" negative caching to become legal. The interoperability question this change raises is, do any implementations actually *ignore* or *reject* relevant contents of the Answer Section solely because there is also an SOA in the Authority Section? A quick code check of BIND 9 implies that the Answer Section RRs of a response to a recursed query are processed before it even looks at the Authority Section, and even if it subsequently finds an SOA record there, it will simply ignore it (in accordance with RFC 1034's "optional" negative caching pseudo-specification). So initial indications are that BIND 9 would not have a problem with this, and, although I am less sure about them, it would surprise me if any other implementations would have a problem either -- such a restrictive implementation would never have been able to interoperate with pre-2308 forms of "optional" negative caching. Pre-clarification resolvers are more likely by far to simply ignore the "illegal" negative-caching record, than to choke on it. But the proof is in the pudding, of course; that's what interoperability testing is all about...

Requiring a "complementary" negative caching record should also have the beneficial effect of definitively (again, this would need to be confirmed in field interoperability tests) marking a QTYPE=* answer as coming from a conforming implementation or not. If the response contains a "complementary" negative caching record, then the querier can be confident that the responder understands the newly-clarified semantics and either recursed for the answer or was able to properly construct it from cache or authoritative data. Conversely, if the response is non-authoritative and lacks the "complementary" negative caching record, then the querier is put on notice that the answer may be incomplete, and it should take appropriate backup measures, as configured, much as it does for malformed responses: e.g. trying other upstream forwarders, if configured, or bypassing forwarders altogether and querying authoritative servers directly. If all other measures for dealing with the backlevel response have failed, it could return the possibly-partial answer _as_is_ and hope that the client/caller is sophisticated enough to realize that the absence of the "complementary" negative caching record from the response implies that the answer may be incomplete. Perhaps there could even be a new flag in resolver APIs to indicate whether the caller would accept a possibly-incomplete answer to a QTYPE=* query...
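
The querier-side fallback described above might look something like the sketch below. Everything in it is hypothetical: the response dictionaries, the may_be_incomplete and resolve_any names, and the accept_incomplete flag (standing in for the possible new resolver API flag) are invented for illustration. It also assumes that an authoritative answer without the "complementary" record is acceptable, which goes slightly beyond what is spelled out above.

```python
from typing import Callable, Iterable, Optional


def may_be_incomplete(response: dict) -> bool:
    """True for a non-authoritative response lacking the complementary SOA."""
    return not response.get("authoritative", False) and not response.get(
        "authority_soa", False
    )


def resolve_any(
    upstreams: Iterable[Callable[[], Optional[dict]]],
    accept_incomplete: bool = False,
) -> Optional[dict]:
    """Try each upstream in order (forwarders first, then direct queries).

    Returns the first response not flagged as possibly incomplete. If none
    qualifies, returns the last possibly-partial response as is, but only
    when the caller has said it would accept one.
    """
    last_partial: Optional[dict] = None
    for query in upstreams:
        response = query()
        if response is None:
            continue  # upstream failed; try the next backup measure
        if may_be_incomplete(response):
            last_partial = response
            continue
        return response
    return last_partial if accept_incomplete else None
```

A caller would pass the backup measures in its configured order, for example a list of forwarder queries followed by a direct query to the authoritative servers, each wrapped as a zero-argument callable.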

- Kevin


