[AFS3-std] Quota for max. number of files per volumes
Hartmut Reuter
reuter@rzg.mpg.de
Thu, 23 Jul 2009 15:40:27 +0200
I have submitted a patch which implements handling of new tags. Could
you please have a look at it?
Hartmut
Jeffrey Hutzelman wrote:
> Yes, that's the idea. I had intended that the length would be
> encoded in as few bytes as possible, but that's not required, and
> you're correct that for things which might be large, always encoding
> it the same way is easier. However, the receiver needs to be
> prepared to decode the length however it appears.
>
> It would probably be reasonable to impose a maximum encoded length
> for all TLV tags (maybe 2^64-1) to make the receiver's job easier.
>
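A minimal sketch of a receiver that accepts the length however it appears. It assumes the long form is a byte with bit 7 set whose low 7 bits count the length bytes that follow, most significant byte first, as in the 0x82 and 0x84 examples quoted below. The name tlv_decode_length is hypothetical and not part of OpenAFS; the 8-byte cap mirrors the suggested 2^64-1 maximum.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Decode a TLV length field starting at buf[0] (hypothetical helper).
 * Short form: a single byte 0x00-0x7f is the length itself.
 * Long form:  bit 7 set; the low 7 bits give the number of length
 *             bytes that follow, most significant byte first.
 * Returns the number of bytes consumed, or 0 if the input is truncated
 * or the length field is empty or longer than 8 bytes (assumed limits).
 */
size_t tlv_decode_length(const unsigned char *buf, size_t avail,
                         uint64_t *len_out)
{
    if (avail < 1)
        return 0;
    if ((buf[0] & 0x80) == 0) {
        *len_out = buf[0];
        return 1;
    }
    size_t n = buf[0] & 0x7f;
    if (n == 0 || n > 8 || avail < 1 + n)
        return 0;
    uint64_t len = 0;
    for (size_t i = 0; i < n; i++)
        len = (len << 8) | buf[1 + i];
    *len_out = len;
    return 1 + n;
}
```

With this, a 2-byte and a 4-byte encoding of the same length decode to the same value, so the sender is free to always use a fixed width.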
> -----Original Message-----
> From: Hartmut Reuter <reuter@rzg.mpg.de>
> Date: Thursday, Jul 16, 2009 9:13 am
> Subject: Re: [AFS3-std] Quota for max. number of files per volumes
> To: Jeffrey Hutzelman <jhutz@cmu.edu>, afs3-standardization@openafs.org
>
> Jeffrey Hutzelman wrote:
>> --On Wednesday, July 15, 2009 09:53:31 AM +0200 Hartmut Reuter
>> <reuter@rzg.mpg.de> wrote:
>>
>>> Felix Frank wrote:
>>>>>> Bear in mind that not only are additional RPCs needed, but
>>>>>> the volume dump format must be extended as well. I expect
>>>>>> such an extension to be on the agenda for the upcoming
>>>>>> protocol upgrade?
>>>>> If it's not, well, we should be talking about it. This
>>>>> wouldn't be the only proposal that will require the ability
>>>>> to be backed up!
>>>> As always, I'm not sure whether I'm reading your hints
>>>> correctly ;-p
>>>>
>>>> Regardless, here goes for RxOSD (or should I repost in a
>>>> separate thread?):
>>>> In the header:
>>>> - tag 'P' (Int32) = OSD policy
>>>> In any vnodes:
>>>> - tag 'u' (Int32) = last usage time (timestamp)
>>> The "last usage time" is not really necessary. In MR-AFS
>>> wipe-decisions were based on a last usage time kept in the access
>>> history vnode and when I started development of AFS/OSD I thought
>>> it would be necessary to have something similar also here. But
>>> now I use the atime on the OSD's partition to find out which
>>> objects haven't been used for the longest time and this technique
>>> is much better. So we could drop 'u' and also the usageTime in
>>> the vnode when merging into git.
>>>
>>>> In directory vnodes:
>>>> - tag 'P' (Int32) = Policy Index
>>>> In file vnodes:
>>>> - tag 'z' (ByteStringWithLength) = string representation of OSD
>>>>   metadata
>>>> - tag 'x' (Int32) = "OSD file is online" flag
>>>> - tag 'y' (Int64 via DumpDouble() in OpenAFS) = file length, for
>>>>   OSD-files only
>>> In addition I obviously abused 'm' in the volheader section to
>>> transmit maxfiles rather than minquota. But this can be changed:
>>> 'm' could remain minquota and 'Q' could be maxfiles ('Q' because
>>> it is a quota for the number of files)
>>>
>>> I can change that in the running cell by 1st making sure all
>>> volservers accept 'Q' in the incoming dump and in a 2nd round
>>> updating them to actually send 'Q' and in a 3rd round updating
>>> them to do the original thing with 'm'.
>> At the end of October, 2007, Jeff Altman posted a message titled
>> "Compression support for AFS vos dump - Specification". That
>> message and my reply the next day include detailed information on
>> extensibility of the volume dump format, including specifying the
>> format of future tags so that readers can successfully process
>> unknown tags. That specification was the result of extensive
>> discussion, some of which probably took place in OpenAFS RT #17947
>> and some probably offline. I know that's less than ideal, but lots
>> of things were in those days; nonetheless it was discussed and
>> there has never since been an objection, so I'm taking the proposal
>> (and really, the whole compressed-dump format proposal) as if
>> adopted, and I don't think I'm the only one.
>>
>> I'll be happy to write up a full internet-draft version at some
>> point in my copious free time, but in the meantime here are the key
>> points:
>>
>> - Existing tags are grandfathered
>> - New tags 0x61-0x7a ('a' - 'z') are 32-bit integers
>> - New tags 0x05-0x60 ('A' - 'Z') are TLV; that is, they are followed
>>   by a length (0-127) followed by that many bytes of tag-specific
>>   data (see the spec for lengths > 127).
>>
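The two rules for new tags can be sketched as a classifier on the tag byte, using the hex ranges exactly as given above. This is only an illustration: the enum and function names are hypothetical, and it assumes the caller has already matched the grandfathered existing tags, which are not enumerated here.

```c
/* Hypothetical classes for tags that are not grandfathered. */
enum tag_class { TAG_OTHER, TAG_TLV, TAG_INT32 };

/*
 * Classify a new tag by its value alone.
 * 0x61-0x7a: followed by a 32-bit integer.
 * 0x05-0x60: followed by a length and that many bytes of data (TLV).
 * Anything else is outside the two ranges named above.
 */
enum tag_class classify_new_tag(unsigned char tag)
{
    if (tag >= 0x61 && tag <= 0x7a)
        return TAG_INT32;
    if (tag >= 0x05 && tag <= 0x60)
        return TAG_TLV;
    return TAG_OTHER;
}
```

A reader that meets an unknown tag could use the class to skip it: 4 bytes for TAG_INT32, or the decoded length for TAG_TLV.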
>>
>> Please read those messages and align the rxosd proposal to this
>> model.
>>
>> -- Jeff
>>
>> _______________________________________________
>> AFS3-standardization mailing list AFS3-standardization@openafs.org
>> http://michigan-openafs-lists.central.org/mailman/listinfo/afs3-standardization
>>
>>
>> Ok, I looked into the mail-archive and found both messages. But I
>> still do not really understand how it should be used.
>>
>> Example: the OSD metadata can have a length between about 100 and
>> 2000 bytes.
>>
>> So the tag should be in the range 'A' - 'Z', say 'Y', ok? It should
>> then be followed by a single length byte which, in this case (the
>> expected length can be > 127), holds the size of the length field
>> with bit 7 set. That would be 0x82 for a two-byte (short integer)
>> length field? Then come the length field and <length> bytes of
>> data, ok?
>>
>> So the data stream should be either
>>
>> 'Y' 0x82 0x01 0xb4 436 bytes of contents
>>
>> or
>>
>> 'Y' 0x84 0x00 0x00 0x01 0xb4 436 bytes of contents
>>
>> The last one could be done by
>>
>> DumpTag(iodp, 'Y');
>> DumpBytesStringWithLength(iodp, 0x84, buf, len);
>>
>> Is that correct?
>>
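To check the two byte streams above, here is a small sketch that writes a long-form length field of a chosen width. It assumes the layout described in the question (a byte with bit 7 set holding the size of the length field, then the length with the most significant byte first). tlv_put_length is a hypothetical name, not an existing OpenAFS function.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Write a long-form length field (hypothetical helper):
 * one byte 0x80|nbytes, then len in nbytes bytes, big-endian.
 * Returns the number of bytes written, or 0 if nbytes is not in 1..8,
 * len does not fit in nbytes bytes, or the output buffer is too small.
 */
size_t tlv_put_length(unsigned char *out, size_t cap, uint64_t len,
                      unsigned nbytes)
{
    if (nbytes < 1 || nbytes > 8 || cap < 1 + (size_t)nbytes)
        return 0;
    if (nbytes < 8 && (len >> (8 * nbytes)) != 0)
        return 0;
    out[0] = (unsigned char)(0x80 | nbytes);
    for (unsigned i = 0; i < nbytes; i++)
        out[1 + i] = (unsigned char)((len >> (8 * (nbytes - 1 - i))) & 0xff);
    return 1 + nbytes;
}
```

For len = 436 this yields 0x82 0x01 0xb4 with nbytes = 2 and 0x84 0x00 0x00 0x01 0xb4 with nbytes = 4, matching both variants; always using 4 bytes keeps the sender simple for data that may grow.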
>> Thanks, Hartmut
>>
>>
>
>
>
--
-----------------------------------------------------------------
Hartmut Reuter e-mail reuter@rzg.mpg.de
phone +49-89-3299-1328
fax +49-89-3299-1301
RZG (Rechenzentrum Garching) web http://www.rzg.mpg.de/~hwr
Computing Center of the Max-Planck-Gesellschaft (MPG) and the
Institut fuer Plasmaphysik (IPP)
-----------------------------------------------------------------