On 22.08.2016 11:38, Radek Krotil wrote:

Thanks Stefan and Daniel for your effort in analyzing this. Unfortunately, I missed your replies, as I expected they would be delivered to my mailbox as well. So let me restart the thread now.

No worries, I'm on travels right now, so replies
will be delayed as well ...

Recently I hit this problem on another production repository from one of my customers. My test machine, where I work with the repositories, has a 512 GB SSD drive, so I need to keep the repositories as small as possible; therefore I migrate them to the latest format with deltification enabled and pack them. I usually take the following steps:

1) Unpack the zipped repository the customer sends me into a folder named after the customer
2) Rename it to repo-x, where x is the format of the repository
3) Dump the repository
4) Create a new repository named repo-1.9
5) Load the dump into the new repository
6) Pack the repo-1.9 repository
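A minimal sketch of steps 3-6 as a shell script (the customer name and paths here are hypothetical placeholders; 'svnadmin create' from a 1.9 client produces the FSFS format 7 layout by default):

```shell
#!/bin/sh
# Hypothetical layout; adjust CUSTOMER and BASE to the real paths.
CUSTOMER=acme
BASE=/opt/repositories/$CUSTOMER
OLD=$BASE/repo-6                  # repository in its original format
NEW=$BASE/repo-1.9                # fresh 1.9-format repository

svnadmin dump -q "$OLD" > "$BASE/$CUSTOMER.dump"    # step 3
svnadmin create "$NEW"                              # step 4
svnadmin load -q "$NEW" < "$BASE/$CUSTOMER.dump"    # step 5
svnadmin pack "$NEW"                                # step 6 (where E160056 appears)
```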

Up to this point, the repo-1.9 should be fine and
any "handling mistakes" should only affect later
revisions.

7) Configure Polarion ALM to use the repository

8) Start Polarion – only at this point is the repository first used by Apache and svnserve

Assuming that those servers have never seen a
repository called repo-1.9 since they were started,
there should be no caching-related issues.

There is an svnserve process serving all repositories under /opt/repositories/, where the customer folders are stored. However, the repositories are not accessed until I'm fully done with the migration. The test server is dedicated to my use only, so I'm confident there are no other users reading or writing the repositories.

Sounds good. From what you describe here, I think
your conversion / upgrade process is correct.

On this particular repository, I ran the dump/load cycle twice, and in both cases the 'svnadmin pack' command failed.

So, the freshly loaded repository (between step 5
and 6) can already not be read. This might either
be due to corrupted data on disk or a problem in
the reader code.

I'll retry the dump/load on the other repository later as well. 'svnadmin verify' confirmed that the current repository is not corrupted.

Questions:
* To what degree is the pack problem reproducible?
  (sometimes / always with the same repo, same /
   different revision, same / different position in
   revision)
* Does 'svnadmin verify' complete w/o error on the
  repo that won't pack a minute later?
* Does retrying the pack result in the same error
  or does it complete the process?
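The last two checks can be sketched as follows (the repository path is a hypothetical placeholder and the revision number is the one from the failing pack run; 'svnadmin verify' has accepted a revision range via -r since Subversion 1.7):

```shell
#!/bin/sh
# Hypothetical path; REV is the revision named in the pack error.
REPO=/opt/repositories/customer/repo-1.9
REV=203908

svnadmin verify -q -r "$REV" "$REPO"   # does verify pass on just this revision?
svnadmin pack "$REPO"                  # retry the pack: same error, or does it proceed?
```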

The repository contains 334243 revisions in total. As suggested by Stefan, I ran the grep on the problematic rev file. Revision 203908 is about 231 MB in size. This confirms my suspicion that the problem is related to big revision data.

Well, that is at least conceivable: larger revisions
with many changed nodes come with larger index
information. There might be bugs along the lines of
"this time we need one more page than usual".

[root@babybear svn]# svnadmin pack repo-1.9/

Packing revisions in shard 203...svnadmin: E160056: Offset 232966338 too large in revision 203908

[root@babybear 203]# grep -oba L2P-INDEX 203908

231917762:L2P-INDEX

O.k. that information is helpful: 232966338 is not
a valid data location. I'll try to trace back where
that number might come from.

Question:
* What is the exact size of the revision in bytes?
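One way to answer this, and to see where the bad offset lands, is with stat (GNU stat on Linux), run inside the shard directory as in the grep above. The numbers below are copied from the error message and the grep output; the interpretation that offsets at or past the L2P-INDEX marker fall into index rather than item data is an assumption about the FSFS 7 file layout:

```shell
#!/bin/sh
# Run inside the repository's db/revs/203 directory.
REV=203908
BAD_OFFSET=232966338       # offset reported by 'svnadmin pack'
L2P_OFFSET=231917762       # where grep found the L2P-INDEX marker

SIZE=$(stat -c %s "$REV")  # exact revision file size in bytes (GNU stat)
echo "rev $REV is $SIZE bytes; L2P-INDEX starts at $L2P_OFFSET"
if [ "$BAD_OFFSET" -ge "$SIZE" ]; then
    echo "offset $BAD_OFFSET points past the end of the file"
elif [ "$BAD_OFFSET" -ge "$L2P_OFFSET" ]; then
    echo "offset $BAD_OFFSET points into the index region, not revision data"
fi
```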

I tried restarting Apache, svnserve, and even the whole box. The problem still persists. Unlike the other occurrences reported by users, e.g. in https://issues.apache.org/jira/browse/SVN-4588, this does not seem to be related to an invalid server cache, because I'm only using the svnadmin command to work with the repository.

I, too, think that it is a separate problem and would
very much like to track it down.

-- Stefan^2.

Looking forward to further suggestions.

Best regards,

Radek Krotil

On 2016-06-04 18:57 (+0200), Daniel Shahaf <d...@daniel.shahaf.name> wrote:

> Stefan Fuhrmann wrote on Sat, Jun 04, 2016 at 08:04:42 -0000:
> > On 2016-06-03 09:36 (+0200), Radek Krotil <radek.kro...@polarion.com> wrote:
> > > Hello.
> > >
> > > Today, I encountered a problem when trying to pack a repository after
> > > migrating it to the FSFS 7 format by performing a full dump/load sequence.
> >
> > I assume you ran 'svnadmin load' onto a repository
> > that was not accessible to the server at that time,
> > so no remote user could accidentally write to it?
>
> Why would that matter?  What could happen if somebody makes a commit or
> a propedit in parallel to an 'svnadmin load'?  A concurrent commit will
> cause mergeinfo in later revisions to have off-by-one errors,
> but shouldn't cause FS corruption.
>
> > > Shortly, I get the following error:
> > > “Packing revisions in shard 5...svnadmin: E160056: Offset 391658998 too
> > > large in revision 5102”
> >
> > This is basically an "invalid access" error message.
> > Typical causes include repository corruption and
> > admins tinkering with the repository without informing
> > the server process. A maybe similar issue:
> >
> > https://issues.apache.org/jira/browse/SVN-4588
> >
> > In your case, however, the corruption is probably in
> > the repository itself. Please run 'svnadmin verify' on it.
> >
> > > I was not able to understand from the documentation what settings in
> > > fsfs.conf should be modified to work around this problem. Neither did a search
> > > on the Internet bring any light into this. Is it even possible?
> >
> > This is most definitely not a configuration issue like
> > "your data is too large". Maybe we should prefix the
> > error message with "invalid access" to prevent
> > confusion.
>
> How about being even more specific:
>
>     svnadmin: E1600NN: failed to locate representation of %s at revision %ld, offset %lld
>
> where %s identifies the origin of the offset value or the object that
> was expected to be located at that offset.
>
> ?
>
> Cheers,
>
> Daniel

