[OpenAFS-devel] tuning underlying filesystems for afs

Martin MOKREJŠ mmokrejs@ribosome.natur.cuni.cz
Thu, 14 Oct 2004 17:50:03 +0200


Hi,
  I'm installing a new AFS cell and configuring a large AFS server: a 1 TB
RAID5 array on a dual-controller setup based on a U160 adapter. It has 6 GB
of RAM and two 3 GHz Xeon CPUs, and runs a linux-2.4.28-pre3 kernel.

  AFS has inode-based and namei-based fileservers. How do they differ in
terms of performance?

  Filesystems are usually tuned for either large or small files. Which case
applies to a fileserver? The /vicepX partitions are mostly filled with a few
small files, which correspond to volumes if I'm right. That has nothing to do
with the size of files stored in AFS volumes, I know ... but should I tune for
"huge" or "small" files?

  I expect to have several files above 1 GB in AFS volumes, and in general
more huge files than small ones.

  Does AFS or the kernel mount the /vicepX partitions automatically? xfs, for
example, offers several mount options that affect performance. How can I take
advantage of such options under AFS?

  I'm attaching my current results from bonnie++ tests. In general, xfs is as
fast as reiserfs, except for random operations. For random operations,
reiserfs is the best, then come ext2, ext3, and finally xfs. At least if
I interpret the numbers correctly.
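
To make that interpretation easier to check, here is a minimal Python sketch
that names the columns of the attached result lines and ranks the filesystems
by one column. The column names are an assumption: they follow what I believe
is the bonnie++ 1.0x CSV order with the leading machine-name column dropped
(the attached lines start at the file size). The helpers parse_bonnie and
rank_by are hypothetical names, not part of bonnie++.

```python
# Column names ASSUME the bonnie++ 1.0x CSV order, minus the machine-name column.
FIELDS = [
    "size",
    "putc", "putc_cpu", "block_write", "block_write_cpu",
    "rewrite", "rewrite_cpu",
    "getc", "getc_cpu", "block_read", "block_read_cpu",
    "seeks", "seeks_cpu",
    "num_files",
    "seq_create", "seq_create_cpu", "seq_stat", "seq_stat_cpu",
    "seq_delete", "seq_delete_cpu",
    "ran_create", "ran_create_cpu", "ran_stat", "ran_stat_cpu",
    "ran_delete", "ran_delete_cpu",
]


def parse_bonnie(line):
    """Map one result line to a dict; '+++++' (too fast to time) becomes None."""
    values = line.strip().split(",")
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    row = {}
    for name, raw in zip(FIELDS, values):
        if name == "size":
            row[name] = raw
        elif set(raw) == {"+"}:
            row[name] = None
        else:
            row[name] = float(raw)
    return row


# First result line of each filesystem, copied from testovani-fs.txt.
RESULTS = {
    "reiserfs (r5)": "12G,30112,98,62671,30,27288,9,23914,72,61855,11,415.2,0,16,31936,99,+++++,+++,26624,99,31115,98,+++++,+++,24817,99",
    "xfs": "12G,32632,99,64567,15,29972,8,24055,72,63201,10,426.3,0,16,5322,28,+++++,+++,4577,22,5162,27,+++++,+++,3733,26",
    "ext3": "12G,26711,99,58659,32,24896,9,24016,71,63737,9,376.2,0,16,2968,89,+++++,+++,+++++,+++,3423,99,+++++,+++,9267,97",
    "ext2": "12G,28317,97,65698,15,28110,8,23912,71,63786,10,443.7,0,16,6992,99,+++++,+++,+++++,+++,7094,100,+++++,+++,18971,99",
}


def rank_by(metric, results=RESULTS):
    """Return filesystem names sorted from highest to lowest value of metric."""
    rows = {fs: parse_bonnie(line) for fs, line in results.items()}
    measured = {fs: r[metric] for fs, r in rows.items() if r[metric] is not None}
    return sorted(measured, key=measured.get, reverse=True)


if __name__ == "__main__":
    for metric in ("seeks", "ran_create", "ran_delete"):
        print(metric, rank_by(metric))
```

If the column order assumed here is right, the random-delete column gives the
ordering reiserfs, ext2, ext3, xfs, while the random-create column places xfs
ahead of ext3.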

Any thoughts? Thanks
Martin

Attachment: testovani-fs.txt

12 files of size 1073737728 bytes created.

reiserfs:
Blocksize: 4096
Journal Size 8193 blocks
Hash function used to sort names: "r5"
12G,30112,98,62671,30,27288,9,23914,72,61855,11,415.2,0,16,31936,99,+++++,+++,26624,99,31115,98,+++++,+++,24817,99

Blocksize: 8192
Journal Size 8193 blocks
Hash function used to sort names: "r5" 
Unsupported reiserfs blocksize: 8192 on 08:21, only 4096 bytes blocksize is supported.
sh-2021: reiserfs_read_super: can not find reiserfs on sd(8,33)

Blocksize: 4096
Journal Size 8193 blocks
Hash function used to sort names: "rupasov"
12G,30020,98,62855,31,27270,9,23909,71,61800,11,421.8,1,16,29762,99,+++++,+++,25914,99,28914,100,+++++,+++,24587,100

Blocksize: 4096
Journal Size 8193 blocks
Hash function used to sort names: "tea"
12G,30019,98,62512,29,27401,9,23911,72,61771,11,436.9,0,16,30382,100,+++++,+++,24815,93,30516,100,+++++,+++,25036,99


xfs:
# mkfs.xfs -f /dev/sdc1
meta-data=/dev/sdc1              isize=256    agcount=16, agsize=2039501 blks
         =                       sectsz=512  
data     =                       bsize=4096   blocks=32632016, imaxpct=25
         =                       sunit=0      swidth=0 blks, unwritten=1
naming   =version 2              bsize=4096  
log      =internal log           bsize=4096   blocks=15933, version=1
         =                       sectsz=512   sunit=0 blks
realtime =none                   extsz=65536  blocks=0, rtextents=0
12G,32632,99,64567,15,29972,8,24055,72,63201,10,426.3,0,16,5322,28,+++++,+++,4577,22,5162,27,+++++,+++,3733,26

mount -o logbufs=8,osyncisdsync /dev/sdc1 /scratch/
12G,32400,99,65564,16,29843,8,24024,72,63198,9,428.5,0,16,5138,29,+++++,+++,4714,20,5179,36,+++++,+++,3794,23

mount -o logbufs=8,osyncisdsync,biosize=12 /dev/sdc1 /scratch/
12G,31964,99,65432,17,29976,8,24042,71,63249,9,421.8,1,16,5211,28,+++++,+++,4740,23,5159,33,+++++,+++,3823,19

mount -o logbufs=2,osyncisdsync,biosize=16 /dev/sdc1 /scratch
12G,32356,99,65364,15,29973,8,24023,71,63241,9,421.9,1,16,3919,21,+++++,+++,3565,16,3910,20,+++++,+++,2988,14

mount -o logbufs=8,osyncisdsync,biosize=16 /dev/sdc1 /scratch
12G,32688,99,65614,15,29923,8,23990,72,63265,9,429.5,1,16,6009,31,+++++,+++,5572,35,5480,35,+++++,+++,4214,24

mount -o logbufs=8,osyncisdsync,biosize=16,logbsize=32768 /dev/sdc1 /scratch
12G,32670,99,65413,16,29994,8,23992,71,63235,9,432.5,0,16,6049,41,+++++,+++,5610,25,5246,32,+++++,+++,4190,22

mount -o logbufs=8,osyncisosync,biosize=16,logbsize=32768 /dev/sdc1 /scratch
12G,32676,99,65634,15,29799,8,23958,71,63252,10,419.9,0,16,6100,40,+++++,+++,5548,22,5577,31,+++++,+++,4169,26

mount -o logbufs=8,osyncisdsync,biosize=16,logbsize=32768,noatime /dev/sdc1 /scratch
12G,32494,99,65647,15,29882,8,24024,72,63526,9,415.9,0,16,5736,36,+++++,+++,5660,27,5488,29,+++++,+++,4252,23
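
The XFS runs above differ mostly in the file-creation columns. A small Python
sketch can express each run's sequential-create rate relative to the first
run. Which column holds the sequential-create rate is an assumption (bonnie++
CSV layout, 15th value on each line), and the first run is treated as the
baseline because no mount options are listed for it.

```python
# Sequential-create rates (files/s) copied from the XFS result lines above.
# ASSUMPTION: the 15th comma-separated value is bonnie++'s sequential create.
SEQ_CREATE = [
    ("(no mount options listed)", 5322),
    ("logbufs=8,osyncisdsync", 5138),
    ("logbufs=8,osyncisdsync,biosize=12", 5211),
    ("logbufs=2,osyncisdsync,biosize=16", 3919),
    ("logbufs=8,osyncisdsync,biosize=16", 6009),
    ("logbufs=8,osyncisdsync,biosize=16,logbsize=32768", 6049),
    ("logbufs=8,osyncisosync,biosize=16,logbsize=32768", 6100),
    ("logbufs=8,osyncisdsync,biosize=16,logbsize=32768,noatime", 5736),
]


def relative_change(baseline, value):
    """Percentage change of value with respect to baseline."""
    return 100.0 * (value - baseline) / baseline


if __name__ == "__main__":
    baseline = SEQ_CREATE[0][1]
    for options, rate in SEQ_CREATE:
        change = relative_change(baseline, rate)
        print(f"{options:<58} {rate:>5d} {change:+6.1f}%")
```

These are single runs, so small differences between option sets may well be
within measurement noise.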



ext3:
# mkfs.ext3 /dev/sdc1
mke2fs 1.35 (28-Feb-2004)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
16318464 inodes, 32632023 blocks
1631601 blocks (5.00%) reserved for the super user
First data block=0
996 block groups
32768 blocks per group, 32768 fragments per group
16384 inodes per group
Superblock backups stored on blocks: 
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208, 
        4096000, 7962624, 11239424, 20480000, 23887872

Writing inode tables: done                            
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 39 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.

mount /dev/sdc1 /scratch
12G,26711,99,58659,32,24896,9,24016,71,63737,9,376.2,0,16,2968,89,+++++,+++,+++++,+++,3423,99,+++++,+++,9267,97

mount -o data=writeback /dev/sdc1 /scratch
12G,27194,97,59360,30,28820,9,23915,71,63629,9,438.5,0,16,3250,95,+++++,+++,+++++,+++,3454,99,+++++,+++,9807,98



ext2:
# mkfs.ext2 /dev/sdc1 
mke2fs 1.35 (28-Feb-2004)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
16318464 inodes, 32632023 blocks
1631601 blocks (5.00%) reserved for the super user
First data block=0
996 block groups
32768 blocks per group, 32768 fragments per group
16384 inodes per group
Superblock backups stored on blocks: 
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208, 
        4096000, 7962624, 11239424, 20480000, 23887872

Writing inode tables: done                            
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 37 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.

mount /dev/sdc1 /scratch
12G,28317,97,65698,15,28110,8,23912,71,63786,10,443.7,0,16,6992,99,+++++,+++,+++++,+++,7094,100,+++++,+++,18971,99


# mkfs.ext2 -T largefile4 /dev/sdc1 
mke2fs 1.35 (28-Feb-2004)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
31872 inodes, 32632023 blocks
1631601 blocks (5.00%) reserved for the super user
First data block=0
996 block groups
32768 blocks per group, 32768 fragments per group
32 inodes per group
Superblock backups stored on blocks: 
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208, 
        4096000, 7962624, 11239424, 20480000, 23887872

Writing inode tables: done                            
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 20 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.

mount /dev/sdc1 /scratch
12G,28396,97,65782,14,28029,8,23962,71,64147,10,444.7,1,16,3167,50,+++++,+++,+++++,+++,6759,99,+++++,+++,16907,100
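
The inode counts printed by mke2fs above can be reproduced with simple
arithmetic. The Python sketch below is a simplification, not mke2fs's actual
algorithm. The two bytes-per-inode ratios are assumptions inferred from the
output: 8192 for the default run and 4 MiB for -T largefile4.

```python
# Geometry taken from the mke2fs output above.
BLOCK_SIZE = 4096
BLOCKS_PER_GROUP = 32768
BLOCK_GROUPS = 996

# ASSUMED bytes-per-inode ratios, inferred from the reported inode counts.
DEFAULT_RATIO = 8192
LARGEFILE4_RATIO = 4 * 1024 * 1024


def inodes_per_group(bytes_per_inode):
    """Inodes allocated to one block group of 128 MiB."""
    group_bytes = BLOCK_SIZE * BLOCKS_PER_GROUP
    return group_bytes // bytes_per_inode


def total_inodes(bytes_per_inode):
    """Inodes for the whole filesystem, counting every group as full."""
    return inodes_per_group(bytes_per_inode) * BLOCK_GROUPS


if __name__ == "__main__":
    print("default:   ", inodes_per_group(DEFAULT_RATIO), total_inodes(DEFAULT_RATIO))
    print("largefile4:", inodes_per_group(LARGEFILE4_RATIO), total_inodes(LARGEFILE4_RATIO))
```

Under these assumptions the sketch yields 16384 inodes per group (16318464 in
total) for the default run and 32 per group (31872 in total) for largefile4,
matching the figures mke2fs reported.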
