[OpenAFS-devel] Re: freezes accessing /afs/.git

Jeffrey Hutzelman jhutz@cmu.edu
Fri, 15 Aug 2014 12:06:53 -0400


On Fri, 2014-08-15 at 07:35 -0400, chas williams - CONTRACTOR wrote:
> On Thu, 14 Aug 2014 14:46:51 -0500
> Andrew Deason <adeason@sinenomine.net> wrote:
> 
> > But we already cache the positive results; you just said in the next
> > sentence that we do. The subsystem that remembers cell information has
> > logic for timeouts and has the structure for remembering the cells; you
> > just duplicated it in an entirely separate place except only for
> > negative results for some reason. So now we'd have two separate caches
> > to deal with, which seems rather confusing and error prone, and I see no
> > reason to do it like that.
> 
> For the reasons I stated earlier -- the negative name space is much
> larger than the positive name space.  I don't see a pressing reason to
> modify the kernel's cache of cell name to address mappings to add
> hashing support to efficiently deal with a large negative name space.
> 
> The kernel is caching cell information.  The new code is caching DNS
> information (because sadly your local resolver doesn't do a very good
> job).  We could cache positive DNS results and it wouldn't be hard to
> do but since that currently isn't an issue, I choose not to do it.

FWIW, I really dislike the notion of caching DNS lookup results for a
time other than the time-to-live provided by the DNS.  If the problem is
looking for a top-level 'git' domain, then it ought to be solved simply
by obeying the 1-day TTL advertised by the root servers for negative
responses (this is the last number in the SOA record for the root zone).
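
To illustrate the TTL point: per RFC 2308, the lifetime of a negative answer
is the smaller of the SOA record's own TTL and the SOA MINIMUM field (the
last number in the record). A minimal sketch, assuming the SOA rdata is
available as a text string; the function names are hypothetical and the
serial number in the sample is only illustrative:

```python
def parse_soa_minimum(rdata):
    """Return the MINIMUM field, i.e. the last number in SOA rdata text."""
    fields = rdata.split()
    return int(fields[-1])


def negative_ttl(soa_record_ttl, soa_minimum):
    """RFC 2308: cache a negative answer for min(SOA TTL, SOA MINIMUM)."""
    return min(soa_record_ttl, soa_minimum)


# Root zone SOA rdata; the serial (third field) is illustrative only.
ROOT_SOA_RDATA = (
    "a.root-servers.net. nstld.verisign-grs.com. "
    "2014081500 1800 900 604800 86400"
)

# The root SOA record itself is served with a TTL of 86400 seconds.
ttl = negative_ttl(86400, parse_soa_minimum(ROOT_SOA_RDATA))
print(ttl)
```

With those values a "no such TLD" answer for 'git' would be remembered for
one day, with no locally invented timeout involved.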

That said, I think it may be a whole lot simpler to extend the
cache manager's existing cache to reflect the possibility of negative
entries.  The only real difficulty here is that such entries need
eventually to be removed, and I don't think we currently can ever remove
entries from the cell table.
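
As a rough sketch of what a single table holding both kinds of entries could
look like (this is illustrative Python, not the cache manager's actual data
structures; the CellTable class and its methods are hypothetical), negative
entries carry an expiry time and are dropped when a lookup finds them stale:

```python
import time


class CellTable:
    """Toy cell table: positive entries persist, negative entries expire."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        # name -> (servers or None, expires_at or None)
        self._entries = {}

    def add_positive(self, name, servers):
        self._entries[name] = (list(servers), None)

    def add_negative(self, name, ttl):
        # Never let a negative result clobber a known cell.
        existing = self._entries.get(name)
        if existing is not None and existing[0] is not None:
            return
        self._entries[name] = (None, self._clock() + ttl)

    def lookup(self, name):
        """Return ("hit", servers), ("negative", None) or ("miss", None)."""
        entry = self._entries.get(name)
        if entry is None:
            return ("miss", None)
        servers, expires_at = entry
        if servers is not None:
            return ("hit", servers)
        if self._clock() >= expires_at:
            # Stale negative entry: this is the removal step.
            del self._entries[name]
            return ("miss", None)
        return ("negative", None)
```

Only a "miss" would need an upcall; a "negative" result can be answered
immediately from the table.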

Finally, I would point out that upcalls are expensive, or can be,
depending on the platform.  You may think upcalls are no big deal on
your fast, lightly loaded quad-core x86 Linux box, but you should also
consider the heavily loaded single-processor SPARC box, or even some
more modern low-power CPUs.  An awful lot of what's wrong with computing
today can be traced to people deciding it's OK to do expensive things
because resources are certainly always going to be cheap and plentiful.
Needing an upcall when a real new cell is discovered is a reasonable
thing.  Needing frequent upcalls because something is walking its way up
the directory tree looking for magic directories is a different thing.


OK, now, really finally: are there some kludges^Wheuristics we can
apply?  For example, don't try AFSDB lookups for single-label cell
names?  Or perhaps some kind of blacklist mechanism?
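
A minimal sketch of such a filter, assuming the name arrives as the
directory component looked up under /afs (so ".git" has its leading dot
stripped first); the function name and the blacklist argument are
hypothetical:

```python
def should_try_afsdb(name, blacklist=frozenset()):
    """Decide whether a name under /afs is worth a DNS (AFSDB) lookup."""
    # Drop the leading dot of ".cell" style names and any trailing root dot.
    cell = name.strip(".").lower()
    if not cell:
        return False
    # Heuristic 1: single-label names are never looked up.
    if "." not in cell:
        return False
    # Heuristic 2: explicitly blacklisted names are never looked up.
    if cell in blacklist:
        return False
    return True


print(should_try_afsdb(".git"))
print(should_try_afsdb("cmu.edu"))
```

Here ".git" is rejected without any lookup, while "cmu.edu" still goes
through the normal path.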

-- Jeff