[OpenAFS] Crash in volserver when restoring volume from backup.

Anders Magnusson ragge@ltu.se
Wed, 26 Aug 2009 16:13:51 +0200

Hartmut Reuter wrote:
> Anders Magnusson wrote:
>> Hi,
>> I have a problem that I need some advice on how to go on with.
>> I have a volume dump file, but when trying to read it back volserver
>> crashes.
>> The dump was generated under 1.4.8, and the volserver segv appears with
>> both 1.4.8 and 1.4.11.
>> VolserLog.old says:
>> Wed Aug 26 14:56:34 2009 Starting AFS Volserver 2.0
>> (/usr/afs/bin/volserver -p 16)
>> Wed Aug 26 15:00:12 2009 1 Volser: CreateVolume: volume 537998421
>> (students.waqazi-4) created
>> BosLog says:
>> Wed Aug 26 15:00:14 2009: fs:vol exited on signal 11
>> Any hints where to go from here?  I can provide the dump file on
>> request, but since it's a
>> student home directory I don't want it to be public.
>> -- Ragge
> Attach the volserver with gdb before running the command. Then you may
> be able to see where it crashes and why.
# gdb /usr/afs/bin/volserver
GNU gdb Fedora (6.8-27.el5)
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...
(gdb) symb
Discard symbol table from `/usr/afs/bin/volserver'? (y or n) y
No symbol file now.
(gdb) help file
Use FILE as program to be debugged.
It is read for its symbols, for getting the contents of pure memory,
and it is the program executed when you use the `run' command.
If FILE cannot be found as specified, your execution directory path
($PATH) is searched for a command of that name.
No arg means to have no executable file and no symbols.
(gdb) help symb
Load symbol table from executable file FILE.
The `file' command can also load symbol tables, as well as setting the file
to execute.
(gdb) symb volserver.debug
Reading symbols from /usr/afs/bin/volserver.debug...done.
(gdb) attach 30720
Attaching to program: /usr/afs/bin/volserver, process 30720
Reading symbols from /lib64/libpthread.so.0...done.
[Thread debugging using libthread_db enabled]
[New Thread 0x2ab6a9d3be90 (LWP 30720)]
[New Thread 0x4dd02940 (LWP 30740)]
[New Thread 0x4d301940 (LWP 30739)]
[New Thread 0x4c900940 (LWP 30738)]
[New Thread 0x4beff940 (LWP 30737)]
[New Thread 0x4b4fe940 (LWP 30736)]
[New Thread 0x4aafd940 (LWP 30735)]
[New Thread 0x4a0fc940 (LWP 30734)]
[New Thread 0x496fb940 (LWP 30733)]
[New Thread 0x48cfa940 (LWP 30732)]
[New Thread 0x482f9940 (LWP 30731)]
[New Thread 0x478f8940 (LWP 30730)]
[New Thread 0x46ef7940 (LWP 30729)]
[New Thread 0x464f6940 (LWP 30728)]
[New Thread 0x45af5940 (LWP 30727)]
[New Thread 0x450f4940 (LWP 30726)]
[New Thread 0x446f3940 (LWP 30725)]
[New Thread 0x43cf2940 (LWP 30724)]
[New Thread 0x432f1940 (LWP 30723)]
[New Thread 0x428f0940 (LWP 30722)]
[New Thread 0x41eef940 (LWP 30721)]
Loaded symbols for /lib64/libpthread.so.0
Reading symbols from /lib64/libresolv.so.2...done.
Loaded symbols for /lib64/libresolv.so.2
Reading symbols from /lib64/libc.so.6...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
0x0000003c13c0a899 in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib64/libpthread.so.0
(gdb) c
[New Thread 0x4e703940 (LWP 30748)]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x4aafd940 (LWP 30735)]
0x0000003c13078d60 in strlen () from /lib64/libc.so.6
(gdb) bt
#0  0x0000003c13078d60 in strlen () from /lib64/libc.so.6
#1  0x0000000000430092 in afs_vsnprintf (p=0x4aafc3ba "4BF+0", avail=999,
    fmt=<value optimized out>, ap=0x4aafc7c0) at ../util/snprintf.c:395
#2  0x0000000000416a60 in vFSLog (
    format=0x467838 "1 Volser: ReadVnodes: IH_CREATE: %s - restore 
    args=0x4) at ../util/serverLog.c:135
#3  0x000000000042550e in Log (
    format=0x1311e7ec <Address 0x1311e7ec out of bounds>) at 
#4  0x000000000040e60c in RestoreVolume (call=<value optimized out>,
    avp=0x2aaaac03e780, incremental=<value optimized out>,
    cookie=<value optimized out>) at ../volser/dumpstuff.c:1214
#5  0x00000000004067f4 in VolRestore (acid=0x41ff510,
    atrans=<value optimized out>, aflags=1, cookie=0x4aafd000)
    at ../volser/volprocs.c:1406
#6  0x0000000000406850 in SAFSVolRestore (acid=0x1311e7ec, atrans=999,
    aflags=319940588, cookie=0x4) at ../volser/volprocs.c:1378
#7  0x0000000000413822 in AFSVolExecuteRequest (z_call=0x41ff510)
    at ../volser/volint.ss.c:104
#8  0x000000000044d0de in rxi_ServerProc (threadID=5, newcall=0x3e7,
    socketp=0x4aafd10c) at ../rx/rx.c:1445
#9  0x000000000042afe4 in rx_ServerProc (dummy=<value optimized out>)
    at ../rx/rx_pthread.c:303
#10 0x000000000042a6a8 in server_entry (argp=0x1311e7ec)
    at ../rx/rx_pthread.c:101
#11 0x0000003c13c06367 in start_thread () from /lib64/libpthread.so.0
#12 0x0000003c130d30ad in clone () from /lib64/libc.so.6

> You also could try to analyze the dump file with dumptool which is built
> under sudirectory src/tests.
# ./dumptool -v /tmp/students.waqazi-4
Dump is for volume 537980786 (students.waqazi-4.backup)
Volume information:
        id = 537980786
        parent id = 537980784
        name = students.waqazi-4.backup
        flags = inService blessed
        uniquifier = 1721
        Creation date = Fri Aug  7 12:38:37 2009
        Last access date = (null)       Last update date = Thu Aug  6 
16:32:52 2009
        Volume owner = 21361
Scanning vnodes (this may take a while)
Unknown dump tag "2"
First vnode scan failed, aborting

More hints?

-- Ragge

> Hartmut
>> _______________________________________________
>> OpenAFS-info mailing list
>> OpenAFS-info@openafs.org
>> https://lists.openafs.org/mailman/listinfo/openafs-info