[OpenAFS-devel] Re: freezes accessing /afs/.git

Andrew Deason adeason@sinenomine.net
Fri, 15 Aug 2014 11:32:39 -0500


On Fri, 15 Aug 2014 12:06:53 -0400
Jeffrey Hutzelman <jhutz@cmu.edu> wrote:

> FWIW, I really dislike the notion of caching DNS lookup results for a
> time other than the time-to-live provided by the DNS.  If the problem
> is looking for a top-level 'git' domain, then it ought to be solved
> simply by obeying the 1-day TTL advertised by the root servers for
> negative responses (this is the last number in the SOA record for the
> root zone).

I don't think anything here prevents that. The kernel afsdb entries have
a timeout, so the DNS TTL could always be honored for negative
entries... Though given how rarely a cell is added to DNS, that seems
like a much more minor problem than the one we're addressing. (I assume
we'd have to query the SOA record ourselves separately? That's
annoying.)
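
For reference, RFC 2308 defines the lifetime of a negative answer as the
smaller of the SOA record's own TTL and its MINIMUM field (the "last
number" mentioned above). A minimal sketch of that rule, with hypothetical
function and parameter names that are not taken from the OpenAFS source:

```python
def negative_ttl(soa_record_ttl: int, soa_minimum: int) -> int:
    """Seconds a negative answer may be cached, per the RFC 2308 rule.

    soa_record_ttl: TTL of the SOA record as returned in the response.
    soa_minimum:    the MINIMUM field (last number) of that SOA record.
    """
    return min(soa_record_ttl, soa_minimum)


def is_expired(cached_at: float, ttl: int, now: float) -> bool:
    """True once a cached negative entry has outlived its TTL."""
    return now - cached_at >= ttl
```

With the 1-day value quoted above for both numbers, `negative_ttl(86400,
86400)` gives 86400 seconds.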

> That said, I think it may be a whole lot simpler to simply extend the
> cache manager's existing cache to reflect the possibility of negative
> entries.  The only real difficulty here is that such entries need
> eventually to be removed, and I don't think we currently can ever
> remove entries from the cell table.

I don't think you necessarily need to remove entries. Instead, an entry
can just be repurposed for a newly-added entry (if we've hit some space
limit). That gets a little harder of course if it's a hash table
instead.

But also, to make it really simple we don't even necessarily need a hash
table. Just using the dumb list that's already there, you could add
negative entries and cap their number at something small like 50; any
performance difference should not be noticeable.
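
Roughly what I mean, as a user-space sketch rather than actual cache
manager code. The class name, the 50-entry cap as a constant, and the
"repurpose whichever slot expires soonest" policy are all assumptions made
for illustration:

```python
import time

MAX_NEGATIVE = 50  # small cap on negative entries, as suggested above


class NegativeCellList:
    """Flat list of [cell name, expiry time] slots for failed lookups."""

    def __init__(self, max_entries=MAX_NEGATIVE):
        self.max_entries = max_entries
        self.entries = []

    def add(self, name, ttl, now=None):
        """Record that looking up `name` failed; remember it for `ttl` seconds."""
        now = time.monotonic() if now is None else now
        for entry in self.entries:
            if entry[0] == name:
                entry[1] = now + ttl  # already known: just refresh the expiry
                return
        if len(self.entries) < self.max_entries:
            self.entries.append([name, now + ttl])
            return
        # List is full: nothing is removed, a slot is repurposed instead.
        victim = min(self.entries, key=lambda e: e[1])
        victim[0] = name
        victim[1] = now + ttl

    def is_negative(self, name, now=None):
        """True if `name` has an unexpired negative entry (linear scan)."""
        now = time.monotonic() if now is None else now
        for entry in self.entries:
            if entry[0] == name:
                return entry[1] > now
        return False
```

A linear scan over at most 50 short strings is cheap next to a DNS round
trip, which is why the plain list seems good enough here.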

> OK, now, really finally: are there some kludges^Wheuristics we can
> apply?  For example, don't try AFSDB lookups for single-label cell
> names?  Or perhaps some kind of blacklist mechanism?

Yes, this was discussed on the -info thread for this. But my opinion (or
the impression I got) is that those heuristics alone are not enough: you
get more benefit from DNS negative caching by itself than you get from
blacklisting/whitelisting/etc. by itself. So, while we could still do
both, DNS caching seems like the bigger benefit, and pursuing it first
seems more worthwhile. And of course the heuristics/blacklists
potentially mean more configuration knobs, and it's easier for things to
go wrong (but probably easier to fix when they do).

In particular, Markus did mention restricting lookups for names with no
dots, but that doesn't always work, since plenty of bogus names contain
dots. He captured real examples of attempted accesses to things like
LaTeX .sty files, and one thing that even tried to load library files
(/afs/libX11.so).
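
To illustrate why the no-dots check falls short, here is a small sketch.
The function name is made up, and "article.sty" is a hypothetical stand-in
for the .sty accesses (the actual captured file names aren't given here):

```python
def skip_afsdb_lookup(name: str) -> bool:
    """Hypothetical heuristic: skip the DNS lookup for single-label names."""
    return "." not in name


# "git" is filtered out, but the dotted names still trigger a lookup.
for name in ("git", "libX11.so", "article.sty"):
    action = "skip" if skip_afsdb_lookup(name) else "lookup"
    print(f"{name}: {action}")
```

So the heuristic catches /afs/.git but still lets names like libX11.so
through to DNS, which is where negative caching would help.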

-- 
Andrew Deason
adeason@sinenomine.net