[OpenAFS] Serious trouble, mounting /afs, ptserver, database rebuilding

kanou kanou@gmx.ch
Thu, 24 Jul 2008 06:46:27 +0200


So, I think there will be no help.
So please tell me:
Is it possible to build a system from scratch and then import all our
user data, i.e. all of our users' files?
I have clean backups of all files.

cheers
kanou

On 23.07.2008 at 22:45, kanou wrote:

> So, I thank you all for your help. I'm sorry for not being so
> experienced with AFS/Kerberos. It's pretty new to me.
> By now, my first server, the one that was broken, is running pretty
> well and everybody gets their files, but the database still needs
> rebuilding and I don't know how.
>
> The second one, which worked well until the other began running
> again, is in a bad state.
> Nobody can reach their files and I'm not able to get a ticket
> from Kerberos.
>
> bos status -server myserver2
> bos: no such entry (getting tickets)
> bos: running unauthenticated
> Instance fs, currently running normally.
>    Auxiliary status is: file server running.
> Instance ptserver, currently running normally.
> Instance vlserver, currently running normally.
>
> and logs:
> ==> /var/log/openafs/PtLog <==
> ptserver: Unknown code pt 11 (267275) Can't rebuild database because  
> it is not empty
>
> Please tell me how to rebuild the database.
> kanou
>
> On 23.07.2008 at 20:06, kanou wrote:
>
>> Well, I did:
>> udebug myserver 7002
>> Host's addresses are:  ip-myserver
>> Host's  ip-myserver time is Wed Jul 23 19:47:26 2008
>> Local time is Wed Jul 23 19:47:29 2008 (time differential 3 secs)
>> Last yes vote for ip-myserver2 was 24 secs ago (sync site);
>> Last vote started 24 secs ago (at Wed Jul 23 19:47:05 2008)
>> Local db version is 1214806124.7
>> I am not sync site
>> Lowest host ip-myserver2 was set 24 secs ago
>> Sync host ip-myserver2 was set 24 secs ago
>> Sync site's db version is 1214806124.7
>> 0 locked pages, 0 of them for write
>>
>> udebug myserver 7003
>> Host's addresses are: ip-myserver
>> Host's ip-myserver time is Wed Jul 23 19:49:34 2008
>> Local time is Wed Jul 23 19:49:35 2008 (time differential 1 secs)
>> Last yes vote for ip-myserver was 44 secs ago (not sync site);
>> Last vote started 44 secs ago (at Wed Jul 23 19:48:51 2008)
>> Local db version is 1216833173.3
>> I am not sync site
>> Lowest host ip-myserver2 was set 5 secs ago
>> Sync host 0.0.0.0 was set 154 secs ago
>> Sync site's db version is 1216833173.3
>> 0 locked pages, 0 of them for write
>>
>> and on myserver2:
>> udebug myserver2 7002
>> Host's addresses are: ip-myserver2
>> Host's ip-myserver2 time is Wed Jul 23 19:56:38 2008
>> Local time is Wed Jul 23 19:56:39 2008 (time differential 1 secs)
>> Last yes vote for ip-myserver2 was 6 secs ago (sync site);
>> Last vote started 6 secs ago (at Wed Jul 23 19:56:33 2008)
>> Local db version is 1214806124.7
>> I am sync site until 51 secs from now (at Wed Jul 23 19:57:30 2008)  
>> (2 servers)
>> Recovery state 1f
>> Sync site's db version is 1214806124.7
>> 0 locked pages, 0 of them for write
>>
>> Server (ip-myserver): (db 1214806124.7)
>>   last vote rcvd 9 secs ago (at Wed Jul 23 19:56:30 2008),
>>   last beacon sent 6 secs ago (at Wed Jul 23 19:56:33 2008), last  
>> vote was yes
>>   dbcurrent=1, up=1 beaconSince=1
>>
>> udebug myserver2 7003
>> Host's addresses are: ip-myserver2
>> Host's ip-myserver2 time is Wed Jul 23 19:57:50 2008
>> Local time is Wed Jul 23 19:57:50 2008 (time differential 0 secs)
>> Last yes vote for ip-myserver2 was 3 secs ago (sync site);
>> Last vote started 3 secs ago (at Wed Jul 23 19:57:47 2008)
>> Local db version is 1216835658.20
>> I am sync site until 56 secs from now (at Wed Jul 23 19:58:46 2008)  
>> (2 servers)
>> Recovery state 1f
>> Sync site's db version is 1216835658.20
>> 0 locked pages, 0 of them for write
>> Last time a new db version was labelled was:
>> 	 212 secs ago (at Wed Jul 23 19:54:18 2008)
>>
>> Server (ip-myserver): (db 1216835658.20)
>>   last vote rcvd 4 secs ago (at Wed Jul 23 19:57:46 2008),
>>   last beacon sent 3 secs ago (at Wed Jul 23 19:57:47 2008), last  
>> vote was yes
>>   dbcurrent=1, up=1 beaconSince=1
>>
>> The system on myserver (the first one) is now running again, but if
>> I try to create a user I still get:
>> pts: database needs rebuilding ; unable to create user TESTUSER  
>> with id 3563
>> Volume 536872154 created on partition /vicepa of myserver
>> Released volume cell.user successfully
>> fs: Invalid argument, possible reasons include:
>> 	-File not in AFS
>> 	-Too many users on access control list
>> 	-Tried to add non-existent user to access control list
>> pts: User or group doesn't exist ; unable to add user TESTUSER to  
>> group TESTGROUP
>> pts: User or group doesn't exist ; unable to add user TESTUSER to  
>> group TESTAFSUSER
>> Added replication site myserver /vicepa for volume user. TESTUSER
>>
>> Could someone please tell me how to rebuild the protection database?
>> I can't run bos against myserver2 because I don't get a ticket from
>> Kerberos.
>> Or do "/etc/init.d/openafs-client stop" and
>> "/etc/init.d/openafs-fileserver stop" do the job of stopping the
>> whole system?
>>
>> thanks all for your help
>> kanou
>>
>>
>> On 23.07.2008 at 19:30, Hartmut Reuter wrote:
>>
>>> kanou wrote:
>>>> My logs on the second machine tell me:
>>>> ==> /var/log/openafs/FileLog.old <==
>>>> Wed Jul 23 19:03:37 2008 File server starting
>>>> Wed Jul 23 19:03:37 2008 afs_krb_get_lrealm failed, myserver2.
>>>> Wed Jul 23 19:03:37 2008 VL_RegisterAddrs rpc failed; will retry   
>>>> periodically (code=5376, err=4)
>>>
>>>
>>> code 5376 means no quorum elected. Are you sure your database  
>>> servers are all running?
>>>
>>> Try "udebug <server> 7002" for the ptserver
>>> and "udebug <server> 7003" for the vldb
>>>
>>>> Wed Jul 23 19:03:37 2008 Couldn't get CPS for AnyUser, will try  
>>>> again  in 30 seconds; code=267275.
>>>> ==> /var/log/openafs/SalvageLog <==
>>>> 07/23/2008 19:08:27 SALVAGING OF PARTITION /vicepa COMPLETED
>>>> and aklog gives me:
>>>> aklog: Couldn't get hrf.uni-koeln.de AFS tickets:
>>>> aklog: Cannot contact any KDC for requested realm while getting  
>>>> AFS  tickets
>>>> Damn! I did not do anything on that second one!
>>>>> Just to make sure you're working on the correct file:
>>>>> As I understand it, you first deleted the file
>>>>> /var/lib/openafs/db/prdb.DB0.
>>>>> This file was then probably recreated when you restarted the
>>>>> ptserver.
>>>>> Run this command on the backup file you made first (or better, on
>>>>> a copy of the backup file).
>>>>>
>>>>> T/Christof
>>>>> ________________________________________
>>>>> From: openafs-info-admin@openafs.org [openafs-info-admin@openafs.org] On Behalf Of kanou [kanou@gmx.ch]
>>>>> Sent: Wednesday, July 23, 2008 6:46 PM
>>>>> To: openafs-info@openafs.org
>>>>> Subject: Re: [OpenAFS] Serious trouble, mounting /afs, ptserver, database rebuilding
>>>>>
>>>>> Thanks for your answer.
>>>>> Well, I found the file prdb_check. It doesn't print any errors. The
>>>>> only thing I can find, with
>>>>> ./prdb_check -database /var/lib/openafs/db/prdb.DB0 -uheader -verbose
>>>>> is this line:
>>>>> Ubik header size is 0 (should be 64)
>>>>>
>>>>> So there are no errors! I can start the server and everything runs
>>>>> fine, but the machine won't mount /afs!
>>>>> kanou
>>>>>
>>>>> On 23.07.2008 at 17:26, Steven Jenkins wrote:
>>>>>
>>>>>> On Wed, Jul 23, 2008 at 10:51 AM, kanou <kanou@gmx.ch> wrote:
>>>>>>
>>>>>>> Hello,
>>>>>>> well, there is a file called db_verify.c in the folder
>>>>>>> /usr/src/modules/openafs/ptserver but I don't know how to build
>>>>>>> it.
>>>>>>
>>>>>>
>>>>>> If I recall correctly, db_verify gets renamed to 'prdb_check'  
>>>>>> during
>>>>>> the install, so you should check for the existence of that file.
>>>>>>
>>>>>> If you can't find it, you'll need to build it from source code:  
>>>>>> the
>>>>>> directions on the AFSLore wiki are a good place to start:
>>>>>>
>>>>>> http://www.dementia.org/twiki/bin/view/AFSLore/HowToBuildOpenAFSFromSource
>>>>>>
>>>>>> If you have problems building openafs-stable-1_4_x, you could get
>>>>>> openafs-stable-1_4_7 instead, as that is the latest official  
>>>>>> release.
>>>>>>
>>>>>> Once you have built the tree, src/ptserver/db_verify should get
>>>>>> built, so you can simply copy it out of the source tree for your
>>>>>> use. If it doesn't get built automatically for you, you can cd
>>>>>> into src/ptserver and do a 'make db_verify' manually.
>>>>>>
>>>>>> Also, feel free to ask for help here or on the IRC channel.
>>>>>>
>>>>>> Steven Jenkins
>>>>>> End Point Corporation
>>>>>> http://www.endpoint.com/
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> OpenAFS-info mailing list
>>>>> OpenAFS-info@openafs.org
>>>>> https://lists.openafs.org/mailman/listinfo/openafs-info
>>>
>>>
>>> -- 
>>> -----------------------------------------------------------------
>>> Hartmut Reuter                  e-mail 		reuter@rzg.mpg.de
>>> 			   	phone 		 +49-89-3299-1328
>>> 			   	fax   		 +49-89-3299-1301
>>> RZG (Rechenzentrum Garching)   	web    http://www.rzg.mpg.de/~hwr
>>> Computing Center of the Max-Planck-Gesellschaft (MPG) and the
>>> Institut fuer Plasmaphysik (IPP)
>>> -----------------------------------------------------------------
>>
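The numeric codes in the logs quoted above (5376 and 267275) appear to follow the usual com_err layout, in which the low 8 bits give the position of a message inside its error table and the remaining bits identify the table. The Python sketch below, written under that assumption, shows why PtLog prints 267275 as "pt 11". The helper name split_com_err is hypothetical and not part of OpenAFS.

```python
def split_com_err(code):
    """Split a com_err-style code into (table base, offset within table).

    Assumes the common com_err layout: the low 8 bits are the message
    index, the remaining bits identify the error table.
    """
    offset = code & 0xFF      # low 8 bits: index of the message in its table
    base = code - offset      # remaining bits: identify the error table
    return base, offset


# Codes that appear in the logs quoted in this thread.
for code in (5376, 267275):
    base, offset = split_com_err(code)
    print(f"{code}: table base {base}, offset {offset}")
```

For 267275 this gives offset 11, matching the "pt 11" in the PtLog line, and for 5376 it gives offset 0, i.e. the first entry of its table.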
>
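When comparing several udebug outputs like the ones quoted above, the fields that matter most are the local database version, whether the server considers itself the sync site, and which host it believes is the sync site. The following Python sketch pulls those three fields out of the text. It assumes the line wording shown in this thread; the function name parse_udebug and the SAMPLE text (an excerpt of the "udebug myserver 7003" output above) are for illustration only.

```python
import re


def parse_udebug(text):
    """Extract a few fields from the text printed by `udebug <server> <port>`.

    Assumes the line wording seen in this thread; other OpenAFS versions
    may print slightly different text.
    """
    info = {"db_version": None, "is_sync_site": None, "sync_host": None}
    for line in text.splitlines():
        line = line.strip()
        m = re.match(r"Local db version is (\S+)", line)
        if m:
            info["db_version"] = m.group(1)
        elif line.startswith("I am not sync site"):
            info["is_sync_site"] = False
        elif line.startswith("I am sync site"):
            info["is_sync_site"] = True
        else:
            m = re.match(r"Sync host (\S+) was set", line)
            if m:
                info["sync_host"] = m.group(1)
    return info


# Excerpt of the "udebug myserver 7003" output quoted in this thread.
SAMPLE = """\
Local db version is 1216833173.3
I am not sync site
Lowest host ip-myserver2 was set 5 secs ago
Sync host 0.0.0.0 was set 154 secs ago
Sync site's db version is 1216833173.3
"""

result = parse_udebug(SAMPLE)
print(result)
```

Running the same function over the output from every database server makes it easier to spot a server that reports a sync host of 0.0.0.0 or a database version that differs from the others.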