Re: [OpenAFS] a noobs question and problems on a new cell

Kim Kimball dhk@ccre.com
Thu, 10 May 2007 09:06:51 -0600


Christopher D. Clausen wrote:
> Adnoh <adnoh@users.sourceforge.net> wrote:
>> Christopher D. Clausen wrote:
>>>> I don't want to have all the volumes in our headquarters, so every
>>>> time a user opens his Word doc or similar it would be completely
>>>> transferred over our VPN -- and I can hear the people crying "our
>>>> fileservers are too slow!"  So separate fileservers in every
>>>> district would be a good choice, I think -- wouldn't they?
>>>
>>> That is an option.  There are of course problems with doing either.
>>> Remember that the AFS clients themselves cache read-only data.  So if
>>> most of your data is only being read and not written back that
>>> often, it might make sense to have only centrally located AFS
>>> servers.

Read/write data is also cached.  If not changed it remains cached.  If
changed and pushed back to the file server, it is marked stale in the
other client caches and is fetched again by another client only if that
client requests it again.

Caching is whole-file, so if a large file is changed on client A and
was previously cached on client B, then when client B requests the file
again it notices its cached copy is marked stale and fetches the entire
file from the file server.

(Since client A made the changes, the file is already cached on client
A.  Writes to cached AFS files go first to the AFS cache and then back
to the AFS file server on flushes, closes, or syncs.)
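
(As a side note, you can see how full a client's cache is with 'fs
getcacheparms', and discard the cached copy of a single file with 'fs
flush' so the next access refetches it.  The path below is just an
example:

fs getcacheparms
fs flush /afs/ccre.com/user/k/kim/report.doc
)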

> [...]
>>> By default, the AFS client prefers to use readonly volumes, so if you
>>> create a replica of a volume, the data will immediately become
>>> readonly.

Two other factors affect whether a given replicated volume will be
accessed as RO or RW.

Given mount points /a/b/c/d, the volume mounted at d/ will not be
accessed as RO if

a) any of the mount points above it are RW mount points, or
b) any of the volumes mounted above it are not replicated.  (This is
determined by the VLDB entry for the volume.  If I have five replicas
on five different file servers but the VLDB entry for the volume does
not show any RO sites, then the volume is treated as unreplicated.
It's important that the VLDB be in the correct state.)
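
You can check what the VLDB thinks with 'vos listvldb' (the server and
partition names below are invented, and the exact output format varies
a bit between versions); a properly replicated volume shows at least
one RO site:

vos listvldb -name root.cell

root.cell
    RWrite: 536870915     ROnly: 536870916
    number of sites -> 3
       server fs1.example.com partition /vicepa RW Site
       server fs1.example.com partition /vicepa RO Site
       server fs2.example.com partition /vicepb RO Site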

It's also easy to check from the client -- assume that /afs mounts
root.afs and root.afs is replicated, that cellname mounts root.cell and
root.cell is replicated, and that <next> is in a volume that is also
replicated, where <next> might be a directory in /afs/cellname or
perhaps a mount point to a volume called '<next>':

cd /afs
fs lq
cd cellname
fs lq
cd <next>
fs lq

Each 'fs lq' command should return a .readonly volume name.  Example:

[kim@satchmo afs]$ cd /afs
[kim@satchmo afs]$ fs lq
Volume Name                   Quota      Used %Used   Partition
root.afs.readonly              5000         7    0%         48%
[kim@satchmo afs]$ cd ccre.com
[kim@satchmo ccre.com]$ fs lq
Volume Name                   Quota      Used %Used   Partition
root.cell.readonly             5000        24    0%         48%
[kim@satchmo ccre.com]$ cd user
[kim@satchmo user]$ fs lq
Volume Name                   Quota      Used %Used   Partition
root.cell.readonly             5000        24    0%         48%
[kim@satchmo user]$ cd k
[kim@satchmo k]$ fs lq
Volume Name                   Quota      Used %Used   Partition
root.cell.readonly             5000        24    0%         48%

-- Up to here all directory nodes are in a replicated volume and all
mount points are 'regular' (not RW) mount points.
-- The next volume is not replicated, and no .readonly suffix is
returned by 'fs lq':

[kim@satchmo k]$ cd kim
[kim@satchmo kim]$ fs lq
Volume Name                   Quota      Used %Used   Partition
user.k.kim                 no limit   1074945    0%          4%
[kim@satchmo kim]$

-- To further illustrate: we know that root.afs is replicated ('fs lq'
returned .readonly at /afs).  However, the rule is "from a RW to a RW,"
so if I make a mount point to root.afs here (/afs/ccre.com/user/k/kim),
in the RW user volume, it will take us to the RW root.afs even though I
don't use the -rw switch when I create the mount point:

[kim@satchmo kim]$ pwd
/afs/ccre.com/user/k/kim
[kim@satchmo kim]$ fs mkm AFSroot root.afs

--- A RW mount point is indicated by a % sign in front of the volume
name.
--- Here the regular mount point is indicated by the # in front of the
volume name.

[kim@satchmo kim]$ fs lsm AFSroot
'AFSroot' is a mount point for volume '#root.afs'
[kim@satchmo kim]$ cd AFSroot
[kim@satchmo AFSroot]$ fs lq
Volume Name                   Quota      Used %Used   Partition
root.afs                       5000         7    0%         23%
[kim@satchmo AFSroot]$

--- No .readonly suffix; we're in the RW root.afs instance.

>>> You can however manually force the mount point to be RW
>>> (-rw option to fs mkm) and this way you can have an RW volume in
>>> each local district and still be able to clone the data to other
>>> servers using vos release.

This -rw mount point affects all clients, of course, regardless of
their location, since the mount point is in the AFS file system and not
local to each client.

That is, there is still only one instance of the RW volume, not a RW
volume in each location.

Only one instance of the RW is allowed -- that is, only one file server
and only one partition will have the RW volume of a given name.
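
(If the one RW ends up on the wrong side of the WAN you can relocate
it, but you still have exactly one of it.  The server and partition
names here are made up:

vos move software fs-hq.example.com /vicepa fs-branch.example.com /vicepa
)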

>>> All volume writes must go directly to
>>> the RW volume.  The AFS client does not detect when you want to make
>>> a write and find the proper RW volume.  You can modify the code to
>>> make it behave that way, but there are
>>> reasons for not doing that.

The convention for writing to replicated volumes is to create the
so-called 'dot path' and to use the dot path for writes.

The 'dot path' is conventionally created under /afs and is a RW mount
point.

/afs/cellname  --> follows the 'volume traversal rules' (basically,
from a RO to a RO, with some caveats)
/afs/.cellname --> the RW mount point causes the client to ignore
replicas from this node on down

The dot path is created with 'fs mkmount' using the -rw switch:

fs mkm .cellname root.cell -rw
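
You can confirm you got a RW mount point the same way as above -- 'fs
lsm' shows the % prefix (the cell name here is just a placeholder):

fs lsm /afs/.cellname
'/afs/.cellname' is a mount point for volume '%root.cell'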

>> I tried that this way and didn't get it:
>> a volume called software (~1 Gig);
>> in our headquarters, the RW volume on the AFS server;
>> in a district, the (nightly) RO snapshot of that volume;
>> mounted into AFS like:
>> /afs/domain/.software (-rw)
>> /afs/domain/software (RO)
>> So if I understand that right, I should now be able to access the data
>> under /afs/domain/.software on both sides.
>> In the headquarters it should always use the RW instance, and in the
>> district it should use the RW instance (over VPN) on a write,
>> and on a read it should prefer the local RO instance.  But that
>> doesn't work for me.
>> Every time I accessed some software in the district it was transferred
>> completely over the VPN from our headquarters.
>> Did I misunderstand something or have I done something wrong?
>
> What commands did you use to set this up?  And physically where are the
> servers that you used to do it?  It should be possible to do what you
> want, but users will need to understand the difference between the
> paths and open the appropriate folder for either writing or reading.
> You can't have just writes go to the RW volume.
>
>> The idea of this behaviour (take the local RO if available and just
>> get what you still need over VPN) was the coolest feature of AFS, I
>> thought, and is the main reason why I was looking at the whole AFS
>> thing -- and not at something like NFS.

> You might need to use fs setserverprefs to have the clients on each side
> use the correct server.  Also, note that the AFS client will NOT switch
> between using the RO and RW automatically (well, if the RO goes down,
> the RW will be used, but that isn't likely what you want to happen in
> this case.)

This isn't correct.  If the RO goes down the RW is not used.  If another
RO site is available the client will automatically fail over to another
RO, but it will not fail over from any RO to the RW.  This is by design.
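
On the server-preference point: each client ranks file servers and
fetches RO data from the most preferred (lowest-ranked) one it can
reach, so a branch client can be nudged toward its local RO site.  A
rough example -- the hostnames and ranks are made up, lower rank wins:

fs setserverprefs -servers fs-branch.example.com 5000 fs-hq.example.com 40000
fs getserverprefs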

> If you are using the "dot path" all reads and writes will
> be to the RW volume.
>
> Generally, it's a "best practice" to have an RO clone on the same server
> as the RW as well.  Not sure if you did that or not.

I got tired of accidentally removing the RW when I intended to remove a
RO, so even though it is considered 'best practice' I don't follow it.
YMMV.

>>> However, you might simply be better off using a more common network
>>> filesystem like NFS or Samba and using something like rsync to back up
>>> the data nightly.  You mentioned a VPN.  Since the network link is
>>> already encrypted, you don't require filesystem encryption?  Or do
>>> you?
>>
>> I'm not sure about the encryption thing.  The VPN is a line from a
>> large provider in Germany, so I think the line is secure, but I'm a
>> little bit paranoid ;-)
>
> AFS has built-in encryption.  It's not the best, but it's better than
> nothing.  Since you already have a secured VPN, that is not an issue for
> you though.
>
>>> It seems as though you are trying to use AFS like NFS or Samba,
>>> creating a single large share point and allowing everyone to write in
>>> it.  This is not the best way to use AFS, although it mostly works.
>>> Replicating single large volumes can take a long time, especially over
>>> slow links.

When a replica is updated, only changed files are copied from the RW
site to the RO sites.

If the '-f' switch is used on the 'vos release' command, the entire RO
volume is copied over.

If all of the files in a RO volume are changed, then the entire RO
volume is copied over (file by file).
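
So, using the 'software' volume from the earlier example, the routine
nightly update is just the first command below; the second forces a
full copy out to every RO site:

vos release software
vos release software -f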

>> Yes and no.  We have our Samba fileservers in every district completely
>> separated from each other.
>> So if user A from district A wants to give a file to user B from
>> district B to work on, he uses email.  When
>> user B has completed his work on that file he uses the same way to get
>> the file back to user A -- and if someone in district
>> A has altered the file in the meantime, they have a problem...
>> So yes, I would like one big namespace -- something like
>> /afs/domain/data/it
>>                 /controlling
>>                 /bookkeeping
>> and so on -- so every user in an organisation unit can access his data
>> from whichever district he is in at the moment and easily share it with
>> someone else who is maybe not in the same district.
>> I thought this is something AFS wants to give me.

If the files are relatively small then limited bandwidth isn't going to
be a 'killer.'  Limited bandwidth applies to AFS files as well as files
from Samba or NFS servers.

If volume size is an issue, try using smaller volumes.

Under /afs/domain/data/it/controlling, for example, do you have other
directories?  If so, you can create a volume and mount it at the
directory node.  This keeps volumes smaller.
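
A rough sketch of that, assuming the parent directory lives in a
replicated volume called data.it and using invented server, partition,
and volume names:

vos create fs-hq.example.com /vicepa data.it.reports
fs mkmount /afs/.domain/data/it/reports data.it.reports
fs setacl /afs/.domain/data/it/reports system:authuser rl
vos release data.it

(The final 'vos release' of the parent volume pushes the new mount
point out to its RO sites.)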

> AFS can do what you want, but the performance over the WAN links is
> likely going to be poor.  And since the RW volume can only be on a
> single server, someone is going to be stuck with the slow connection.

True also for Samba and NFS, unless you use multiple instances of the RW
file and merge changes from the different Samba/NFS servers.

>> Can you describe a "district office" in more detail?  How many users?
>> -> This differs -- let's say 10 districts, 5 with ~100 users, 60 GB of
>> data and a "data change" of 100 MB/day,
>> and the other 5 with half of the above.
>
> If your data change rate is only 100 MB, that should be okay to just use
> a client from each district.  Yes, opening and closing files will be
> slow, but try to use large client caches to minimize the impact.
>
>> Is there technical staff there to diagnose problems with an AFS
>> server, if they occur?  Are the offices always connected to the
>> network?  What type of connection do they have?  Bandwidth?  Latency?
>> -> No -- the only technical staff is in our headquarters.  We have a VPN
>> from a large provider with an offline time of maybe 10 minutes per year
>> in all, so it is very good.  The bandwidth differs, from 512k to
>> 2 Mbit.  They are connected 24h/day.
>
> You do not likely want to try to have AFS servers in each remote
> location if you do not have the staff there to fix problems.

We had file servers all over the planet.  Remote administration is easy
enough.  Aircraft mechanics could power cycle a server if required.

>> Do you use Kerberos 5 currently within your organization?  A single
>> realm?  Or a realm per district?
>> -> We use a Windows 2003 ADS for authentication of the Windows
>> workstations and the Samba servers.
>
> Ah, ok.
>
> Have you looked into using Microsoft's Dfs?  It might provide the
> namespace that you want, but not require you to completely switch your
> infrastructure to use AFS.
>
>> Do you have any off-site backup or disaster recovery requirements?
>> -> I would like to have a backup on the local USB HDD in each district
>> and a centralized backup in our headquarters, with a full backup per
>> week and a diff backup per day.
>
> Okay.  It's pretty easy to clone or copy volumes with AFS.  The exact
> details would depend upon lots of factors and should probably be
> addressed in a separate thread.
>
>> Any specific features that the project MUST do?  Any features that the
>> project SHOULD do?  Anything else that
>> would be nice to do?
>> -> Yes -- what I mentioned above ;-) -- the "global"
>> namespace would be nice.  Maybe it is
>> interesting to tell you that we want to migrate the workstations to
>> Linux in the next 2-3 years.
>
> You can do a similar "global" type namespace using Dfs and Windows AD.
> I strongly suggest you look at it first, especially for a mostly Windows
> environment.
>
>> How much data are we talking about here?  Total and at each district?
>> What is the "change rate" of your data?  How much
>> data is modified per day or per week as a percentage of the total
>> data?
>> -> Mentioned above -- all together, maybe ~500 GB at the moment,
>> but I don't know how much duplicate data is around -- you know,
>> that "I need my files in every district, on my local HDD, and ideally
>> on my USB drive as well" ;-)
>
> Yeah, getting accurate disk space counts across lots of different
> machines isn't easy.
>
> -----
>
> If you want some specific help with trying out AFS, it might be worth
> asking the good folks on the #openafs IRC channel on the Freenode
> network.  For instance, the RO vs RW stuff isn't easy to fully grasp at
> first.
>
> <<CDC