[OpenAFS] Stability of AFS
Hartmut Reuter
reuter@rzg.mpg.de
Thu, 24 Oct 2002 11:01:14 +0200
To me this looks like a problem with your (RAID?) filesystem. The
fileserver itself is a pure userland process, and it doesn't look as if
the AFS client on this machine is involved either. Your fileserver should
be able to run without the AFS kernel extensions loaded and without the
client running.
The process name "kjournald" points to the filesystem layer: the oops
occurs in journal_commit_transaction in the jbd module, which is the ext3
journalling code. You also have the 3w-xxxx kernel module loaded, which
suggests you are using a 3ware RAID controller. We had problems with the
Escalade 6000 series and have since replaced them with 7850 controllers.
Does the web interface of the 3ware controller show any events?
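
The way to read such an oops (look at the process name and at the module
named next to the faulting function and each call-trace frame) can be
sketched in a few lines of Python. This is only an illustration, assuming
the wrapped syslog lines have first been rejoined into single lines; the
function name summarize_oops and the regular expressions are made up for
this example and are not part of any kernel or OpenAFS tool.

```python
import re

# Lines taken from the oops quoted below, with the wrapped lines rejoined.
OOPS_SAMPLE = """\
Oct 23 15:18:31 porsacz kernel: EIP is at journal_commit_transaction [jbd] 0x7c3 (2.4.18-10smp)
Oct 23 15:18:31 porsacz kernel: Process kjournald (pid: 149, stackpage=f69bd000)
Oct 23 15:18:31 porsacz kernel: Call Trace: [<f883e7e6>] kjournald [jbd] 0x136
Oct 23 15:18:31 porsacz kernel: [<f883e690>] commit_timeout [jbd] 0x0
Oct 23 15:18:31 porsacz kernel: [<c0107286>] kernel_thread [kernel] 0x26
Oct 23 15:18:31 porsacz kernel: [<f883e6b0>] kjournald [jbd] 0x0
"""

# Hypothetical patterns for the 2.4-style oops format shown in this thread.
EIP_RE = re.compile(r"EIP is at (\w+) \[([\w-]+)\]")
PROC_RE = re.compile(r"Process (\S+) \(pid: (\d+)")
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\] (\w+) \[([\w-]+)\]")


def summarize_oops(text):
    """Return the faulting function/module, the process, and trace modules."""
    eip = EIP_RE.search(text)
    proc = PROC_RE.search(text)
    modules = []
    for _function, module in FRAME_RE.findall(text):
        if module not in modules:  # keep first-seen order, no duplicates
            modules.append(module)
    return {
        "faulting_function": eip.group(1) if eip else None,
        "faulting_module": eip.group(2) if eip else None,
        "process": proc.group(1) if proc else None,
        "trace_modules": modules,
    }


if __name__ == "__main__":
    for key, value in summarize_oops(OOPS_SAMPLE).items():
        print(f"{key}: {value}")
```

If libafs showed up as the faulting module or in the call trace, the AFS
kernel extension would be a suspect; here only jbd and the core kernel
appear.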
Hartmut
makowskm@chemia.uj.edu.pl wrote:
> We have been using AFS in our organization for a few months. For the past
> two weeks we have had constant problems with the stability of the file
> system. Every 2-3 days it collapses, producing system logs like these:
>
> Oct 23 15:18:31 porsacz kernel: Unable to handle kernel paging request at
> virtual address 0f3c8b21
> Oct 23 15:18:31 porsacz kernel: printing eip:
> Oct 23 15:18:31 porsacz kernel: f883bde3
> Oct 23 15:18:31 porsacz kernel: *pde = 00000000
> Oct 23 15:18:31 porsacz kernel: Oops: 0002
> Oct 23 15:18:31 porsacz kernel: libafs-2.4.18-10-athlon.mp soundcore
> eepro100 ext3 jbd 3w-xxxx sd_mod scsi_mod
> Oct 23 15:18:31 porsacz kernel: CPU: 1
> Oct 23 15:18:31 porsacz kernel: EIP: 0010:[<f883bde3>] Tainted: PF
> Oct 23 15:18:31 porsacz kernel: EFLAGS: 00010246
> Oct 23 15:18:31 porsacz kernel:
> Oct 23 15:18:31 porsacz kernel: EIP is at journal_commit_transaction [jbd]
> 0x7c3 (2.4.18-10smp)
> Oct 23 15:18:31 porsacz kernel: eax: 0f3c8b11 ebx: f6488c90 ecx:
> 00000b5c edx: f6837840
> Oct 23 15:18:31 porsacz kernel: esi: 00000000 edi: f6946600 ebp:
> e3787f90 esp: f69bde80
> Oct 23 15:18:31 porsacz kernel: ds: 0018 es: 0018 ss: 0018
> Oct 23 15:18:31 porsacz kernel: Process kjournald (pid: 149,
> stackpage=f69bd000)
> Oct 23 15:18:31 porsacz kernel: Stack: 00003016 00000000 00000f9c c5363064
> 0000000a cc065ac0 cd977bd0 00000d77
> Oct 23 15:18:31 porsacz kernel: 00000001 ec274700 ec7e15c0 00000000
> d7bbc3c0 cb1c1240 cb1c11c0 cb1c1140
> Oct 23 15:18:31 porsacz kernel: cb1c10c0 cb5d3f40 cb5d3ec0 cb5d3e40
> cb5d3dc0 cb5d3d40 cb1c1d40 cb1c1cc0
> Oct 23 15:18:31 porsacz kernel: Call Trace: [<f883e7e6>] kjournald [jbd]
> 0x136
> Oct 23 15:18:31 porsacz kernel: [<f883e690>] commit_timeout [jbd] 0x0
> Oct 23 15:18:31 porsacz kernel: [<c0107286>] kernel_thread [kernel] 0x26
> Oct 23 15:18:31 porsacz kernel: [<f883e6b0>] kjournald [jbd] 0x0
> Oct 23 15:18:31 porsacz kernel:
> Oct 23 15:18:31 porsacz kernel:
> Oct 23 15:18:31 porsacz kernel: Code: f0 ff 40 10 8b 03 f0 0f ba 68 18 0a
> 8b 44 24 1c 50 8d 44 24
>
> Checking the server status after such events doesn't show anything wrong,
> but in fact none of the AFS clients can reach the file system. All that
> can be done is to obtain a token. The only way to restore functionality is
> to restart the server machine.
>
> We are using OpenAFS 1.2.6 on Red Hat 7.3 with OpenAFS modules
> compiled for our kernel (2.4.18-10smp). The server runs SMP with two
> Athlon 1800+ CPUs. The file system is located on a RAID5 array with an
> ext3 partition. The machine acts as both AFS server and client, and
> the client cache is located on a separate ext2 partition.
>
> Could anyone help us explain the instability of AFS in this configuration?
>
> Yours,
>
> Marcin Makowski
> Department of the Theoretical Chemistry
> Jagiellonian University
> makowskm@chemia.uj.edu.pl
--
-----------------------------------------------------------------
Hartmut Reuter e-mail reuter@rzg.mpg.de
phone +49-89-3299-1328
RZG (Rechenzentrum Garching) fax +49-89-3299-1301
Computing Center of the Max-Planck-Gesellschaft (MPG) and the
Institut fuer Plasmaphysik (IPP)
-----------------------------------------------------------------