Re: More debug infos - was: Re: iMac G5 "windfarm"

2021-01-04 Thread John Paul Adrian Glaubitz
Hello!

On 1/4/21 2:32 AM, John Paul Adrian Glaubitz wrote:
> On 1/4/21 2:02 AM, Cameron MacPherson wrote:
>> #define oprintf(...) \
>> ({ \
>>         char *msg_oprintf; \
>>         fprintf(outfifo,__VA_ARGS__); \
>>         fflush(outfifo); \
>>         msg_oprintf = xasprintf(__VA_ARGS__); \
>>         log("OUT: %s\n", msg_oprintf); \
>>         free(msg_oprintf); \
>> })
>>
>> i would add fsync(fileno(outfifo)) after fflush(outfifo), otherwise it's not
>> guaranteed that anything actually gets written to the file, which is a fifo,
>> so if something (the client?) is reading that fifo it will block until the
>> fsync happens.  if glibc used to fsync on fflush and doesn't any longer (it's
>> not required to) i imagine this could cause the problem.
> 
> OK, I'll try that tomorrow.

I have tried this now. It didn't help, unfortunately.
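
For what it's worth, here is a minimal sketch of why the extra fsync() is probably a no-op in this case, assuming Linux semantics: fsync() on a pipe or FIFO fails with EINVAL because there is nothing to synchronize to storage, while fflush() alone already hands the data to the kernel. The helper name fsync_errno_on_pipe() is made up for illustration, and an anonymous pipe stands in for the named FIFO used by parted_server:

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: returns the errno set by fsync() on the write end
 * of an anonymous pipe, 0 if fsync() unexpectedly succeeds, or -1 if the
 * pipe could not be created at all. */
int fsync_errno_on_pipe(void)
{
    int fds[2];
    int err = 0;

    if (pipe(fds) != 0)
        return -1;

    errno = 0;
    if (fsync(fds[1]) != 0)
        err = errno;            /* EINVAL on Linux for pipes and FIFOs */

    close(fds[0]);
    close(fds[1]);
    return err;
}
```

If this returns EINVAL on the installer's kernel as well, the fsync() line in oprintf() can be dropped again without changing behaviour.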

I have, however, added several debug breakpoints now using oprintf():

--- /tmp/partman-base-214/parted_server.c   2019-06-02 05:29:29.0 -0700
+++ partman-base/partman-base-213/parted_server.c   2021-01-04 00:43:24.696811596 -0800
@@ -124,6 +124,7 @@
 char *msg_oprintf; \
 fprintf(outfifo,__VA_ARGS__); \
 fflush(outfifo); \
+fsync(fileno(outfifo)); \
 msg_oprintf = xasprintf(__VA_ARGS__); \
 log("OUT: %s\n", msg_oprintf); \
 free(msg_oprintf); \
@@ -1219,14 +1220,19 @@
 oprintf("OK\n");
 if (NULL != device_named(device_name)) {
 oprintf("OK\n");
+oprintf("Debug1\n");
 deactivate_exception_handler();
+oprintf("Debug2\n");
 set_disk_named(device_name,
ped_disk_new(device_named(device_name)));
+oprintf("Debug3\n");
 unchange_named(device_name);
+oprintf("Debug4\n");
 activate_exception_handler();
 } else
 oprintf("failed\n");
 free(device);
+free(device_name);
 }
 
 void

The resulting logfile is (ignore the noise in the beginning, scroll to the end):

> https://people.debian.org/~glaubitz/partman/powerpc/partman.debug.log

Thus, it stops at "set_disk_named(device_name, 
ped_disk_new(device_named(device_name)))".

Adrian

-- 
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer - glaub...@debian.org
`. `'   Freie Universitaet Berlin - glaub...@physik.fu-berlin.de
  `-GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913



Re: More debug infos - was: Re: iMac G5 "windfarm"

2021-01-04 Thread John Paul Adrian Glaubitz
On 1/4/21 10:01 AM, John Paul Adrian Glaubitz wrote:
> The resulting logfile is (ignore the noise in the beginning, scroll to the end):
> 
>> https://people.debian.org/~glaubitz/partman/powerpc/partman.debug.log
> 
> Thus, it stops at "set_disk_named(device_name, 
> ped_disk_new(device_named(device_name)))".

Another debug change:

--- /tmp/partman-base-214/parted_server.c   2019-06-02 05:29:29.0 -0700
+++ partman-base/partman-base-213/parted_server.c   2021-01-04 01:21:27.489651278 -0800
@@ -124,6 +124,7 @@
 char *msg_oprintf; \
 fprintf(outfifo,__VA_ARGS__); \
 fflush(outfifo); \
+fsync(fileno(outfifo)); \
 msg_oprintf = xasprintf(__VA_ARGS__); \
 log("OUT: %s\n", msg_oprintf); \
 free(msg_oprintf); \
@@ -558,6 +559,8 @@
 void
 set_disk_named(const char *name, PedDisk *disk)
 {
+
+log("Debug in set_disk_named()");
 PedDisk *old_disk;
 int index = index_of_name(name);
 assert(device_opened(name));
@@ -1219,14 +1222,19 @@
 oprintf("OK\n");
 if (NULL != device_named(device_name)) {
 oprintf("OK\n");
+log("Debug1\n");
 deactivate_exception_handler();
+log("Debug2\n");
 set_disk_named(device_name,
ped_disk_new(device_named(device_name)));
+log("Debug3\n");
 unchange_named(device_name);
+log("Debug4\n");
 activate_exception_handler();
 } else
 oprintf("failed\n");
 free(device);
+free(device_name);
 }
 
 void

Which results in this log:

/bin/partman: ***
/lib/partman/init.d/25md-devices: 
***
/lib/partman/init.d/30parted: 
***
parted_server: === Starting the server
parted_server: main_loop: iteration 1
parted_server: Opening infifo
/lib/partman/init.d/30parted: IN: OPEN =dev=sda /dev/sda
parted_server: Read command: OPEN
parted_server: command_open()
parted_server: Request to open =dev=sda
parted_server: Opening outfifo
parted_server: OUT: OK


parted_server: OUT: OK


parted_server: Debug1

parted_server: Debug2

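Debug2 is logged, but neither the message from set_disk_named() nor Debug3
ever appears. So the crash occurs in ped_disk_new(), which is part of libparted
and is evaluated before set_disk_named() is entered.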
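
The bracketing used in the diff above could be wrapped in a small macro so that each suspect call gets a marker before and after it. This is only a sketch: TRACE_CALL, trace_log, suspect_call() and trace_demo() are names invented here and are not part of parted_server.c, whose own log() function may buffer differently. The stream is flushed after every marker so that the last line written identifies the call that never returned:

```c
#include <stdio.h>

/* Hypothetical log stream; parted_server.c has its own log() function. */
static FILE *trace_log;

/* Write a marker, flush it immediately, then evaluate the expression.
 * If the expression hangs or crashes, "enter" is the last line logged. */
#define TRACE_CALL(expr) \
    do { \
        fprintf(trace_log, "enter: %s\n", #expr); \
        fflush(trace_log); \
        (void)(expr); \
        fprintf(trace_log, "leave: %s\n", #expr); \
        fflush(trace_log); \
    } while (0)

/* Stand-in for a call under suspicion, e.g. ped_disk_new(). */
static int suspect_call(int x)
{
    return x + 1;
}

/* Run one traced call, writing the markers to the given stream. */
void trace_demo(FILE *out)
{
    trace_log = out;
    TRACE_CALL(suspect_call(41));
}
```

When the result of the call is needed, the assignment can go inside the macro argument, e.g. TRACE_CALL(disk = ped_disk_new(dev)).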

Might be an idea to update src:parted to the latest upstream version which
hasn't happened yet in Debian.

Adrian




Re: More debug infos - was: Re: iMac G5 "windfarm"

2021-01-04 Thread John Paul Adrian Glaubitz
On 1/4/21 10:39 AM, John Paul Adrian Glaubitz wrote:
> Might be an idea to update src:parted to the latest upstream version which
> hasn't happened yet in Debian.

My guess is that this issue can be worked around by zeroing the partition
table of the system before installing Debian.

Can someone give it a try?
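
As a rough sketch of what "zeroing" means here, assuming the partition table lives in the first blocks of the disk (as the Mac partition map does): the function below overwrites the first bytes of a file or device with zeros. The name zero_head() is made up for illustration, and running it against a real disk destroys the data on it.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: overwrite the first `count` bytes of `path` with
 * zeros. Returns 0 on success, -1 on any error.
 * WARNING: this destroys data when pointed at a real disk. */
int zero_head(const char *path, size_t count)
{
    char buf[4096];
    int fd = open(path, O_WRONLY);

    if (fd < 0)
        return -1;
    memset(buf, 0, sizeof buf);

    while (count > 0) {
        size_t chunk = count < sizeof buf ? count : sizeof buf;
        ssize_t n = write(fd, buf, chunk);

        if (n <= 0) {
            close(fd);
            return -1;
        }
        count -= (size_t)n;
    }

    /* On a regular file or block device, push the zeros to the medium. */
    if (fsync(fd) != 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

For example, zero_head("/dev/sda", 10 * 1024 * 1024) would clear the first 10 MiB of the first disk, assuming /dev/sda really is the disk to be wiped.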

Adrian




Re: More debug infos - was: Re: iMac G5 "windfarm"

2021-01-04 Thread Dennis Clarke
On 1/3/21 11:40 PM, John Paul Adrian Glaubitz wrote:
> On 1/3/21 10:30 PM, John Paul Adrian Glaubitz wrote:
>> FWIW, the source code is here:
>>
>>> https://salsa.debian.org/installer-team/partman-base/-/blob/master/parted_server.c
>>
>> If anyone has any clever idea, please let me know.
>>

Good morning, good day. I am just now sitting down with a coffee and
getting ready to look over this situation. Firstly I want to say that
you have done more than most people have ever done here to ensure we
the people have Debian on ports type machines. You have been doing all
the heavy lifting for years. That is not even remotely fair. Having
said that I must also say it is difficult to step into your shoes as
there is nothing easy about what you do. I have long wanted to be able
to create my own installer images but the process is mystical, magical
and not well documented. Let's not concern ourselves with that at this
time.

I have ppc64 and ppc64le and sparc64 here in my lab and I want to look
at the sparc64 case where everything "just works" but I have not seen
that yet. I want to use the installer image from 2021-01-03 with the
hope that both ppc64 and sparc64 are based on the same sources for the
parted/partman tools.

>> And if someone wants to debug the issue with the hanging partitioner themselves,
>> check out the log in /var/log/partman on a second terminal while the partitioner
>> is running.

I will take a look at both PPC64 and PPC64le and then for extra
information I will take a stare at SPARC64 but I am not sure to
what degree that will be reasonable data.

>>
>> As one can see, it just stops after /lib/init.d/30_parted and I'm afraid I have
>> absolutely no idea why. There is no crash, no error, nothing.
> 
> Some more debug information.
> 
> Here is /var/lib/partman from the broken powerpc system:
> 
>> https://people.debian.org/~glaubitz/partman/powerpc/partman.dir/
> 
> Here the log file:
> 
>> https://people.debian.org/~glaubitz/partman/powerpc/partman.log
> 
> And here the same for a sparc64 installation where everything works:
> 
>> https://people.debian.org/~glaubitz/partman/sparc64/partman.dir/
>> https://people.debian.org/~glaubitz/partman/sparc64/partman.log
> 
> So it's crashing after the second "OK" in line 1221:
> 

Right. I will start the investigation here in my little lab and also see
if I can bring in some smart people to look at the issue. To be fair I
am not very well informed about the boot process but I am


>> https://salsa.debian.org/installer-team/partman-base/-/blob/master/parted_server.c#L1221
> 
> I wonder what's special on 32-bit PowerPC that it started crashing there.
> 

I am going to review the traffic here on the mailing list and also begin an
install on SPARC64 and see if I can at least look at where things
differ.

I have a ppc64 partman log file already from the 2021-01-03 installer:

https://beta.genunix.com/debian_boot/ppc64/partman_2021-01-03.txt

Currently working on the SPARC64 and then I will start to review the
sources and see if I can drag in some other smart people on this also.


-- 
Dennis Clarke
RISC-V/SPARC/PPC/ARM/CISC
UNIX and Linux spoken
GreyBeard and suspenders optional



Re: More debug infos - was: Re: iMac G5 "windfarm"

2021-01-04 Thread John Paul Adrian Glaubitz
Hello!

On 1/4/21 6:38 PM, Dennis Clarke wrote:
> Firstly I want to say that you have done more than most people have ever
> done here to ensure we the people have Debian on ports type machines.
> You have been doing all the heavy lifting for years. That is not even
> remotely fair.

Thanks for the praise.

> Having said that I must also say it is difficult to step into your shoes as
> there is nothing easy about what you do. I have long wanted to be able
> to create my own installer images but the process is mystical, magical
> and not well documented. Let's not concern ourselves with that at this
> time.

I think I have explained before how that works and it's not actually that
complicated per se. The main requirement is that you set up a local mirror
with reprepro.

It's explained on the PA-RISC kernel wiki:

> https://parisc.wiki.kernel.org/index.php/How_to_create_Debian_unstable_iso_images

In practice, you will just have to perform steps 2.1 and 2.2, ignore 2.3 and 2.4
(except for powerpc and ppc64) and use my debian-cd configuration (the one on the
wiki contains some errors).

For powerpc and ppc64, you will need to patch grub-installer and partman-auto
locally because I haven't upstreamed the necessary changes yet, as I'm still
not happy with the current approach.

> I have ppc64 and ppc64le and sparc64 here in my lab and I want to look
> at the sparc64 case where everything "just works" but I have not seen
> that yet. I want to use the installer image from 2021-01-03 with the
> hope that both ppc64 and sparc64 are based on the same sources for the
> parted/partman tools.

Well, installing sparc64 works out of the box. I'm not sure what issues there
are from the installer side. If there are any, please report them to the
debian-sparc mailing list.

>>> And if someone wants to debug the issue with the hanging partitioner themselves,
>>> check out the log in /var/log/partman on a second terminal while the partitioner
>>> is running.
> 
> I will take a look at both PPC64 and PPC64le and then for extra
> information I will take a stare at SPARC64 but I am not sure to
> what degree that will be reasonable data.

The issue shows on 32-bit PowerPC only, so you won't see anything on these
platforms.

>> So it's crashing after the second "OK" in line 1221:
>>
> 
> Right. I will start the investigation here in my little lab and also see
> if I can bring in some smart people to look at the issue. To be fair I
> am not very well informed about the boot process but I am

It's been localized to "ped_disk_new(device_named(device_name))" in line
1224. So it's a crash in libparted. And one theory is that this crash occurs
due to the Mac partitioning scheme.

So, in theory the issue could go away when starting with a disk with no
partition table (or an MS-DOS partition table).

>>> https://salsa.debian.org/installer-team/partman-base/-/blob/master/parted_server.c#L1221
>>
>> I wonder what's special on 32-bit PowerPC that it started crashing there.
>>
> 
> I am going to review the traffic here on the mailing list and also begin an
> install on SPARC64 and see if I can at least look at where things
> differ.

If you want to produce useful data that will help with this investigation,
try installing the 32-bit PowerPC variant on a completely empty disk. You
can even do that on your PowerMac G5. But you should verify first that you
can reproduce the issue on your machine.

Thus:

- Download the current 32-bit PowerPC ISO
- Try installing on your G5, which should have an existing Mac partition table
- If it hangs when the partitioning tool starts, you have reproduced the problem
- Then wipe the beginning of the disk, including the partition table,
  e.g. dd if=/dev/zero of=/dev/sda bs=1M count=10
- Now try again with the wiped disk and see if the partitioning tool starts
  normally; if it does, it's either a bug in the Mac label [1] code or the HFS
  fs code [2].

Adrian

> [1] https://git.savannah.gnu.org/cgit/parted.git/tree/libparted/labels/mac.c
> [2] https://git.savannah.gnu.org/cgit/parted.git/tree/libparted/fs/hfs




Re: More debug infos - was: Re: iMac G5 "windfarm"

2021-01-04 Thread John Paul Adrian Glaubitz
On 1/5/21 2:53 AM, John Paul Adrian Glaubitz wrote:
> It's been localized to "ped_disk_new(device_named(device_name))" in line
> 1224. So it's a crash in libparted. And one theory is that this crash occurs
> due to the Mac partitioning scheme.

Next I will build a libparted with debug printf() calls added to ped_disk_new() [1].

If there is a bug, it should occur inside this function.
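
Since the log alone does not say whether the function crashes or simply never returns, one way to tell the two apart is to run the suspect call in a child process with a timeout. The following is a sketch only: classify_call(), the outcome enum and the three stand-in functions are invented for illustration, and in a real test the callback would call ped_disk_new() on the affected device instead.

```c
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum outcome { RETURNED, CRASHED, HUNG, FORK_FAILED };

/* Run fn() in a child process and report how it ended. alarm() makes the
 * kernel send SIGALRM to the child after timeout_s seconds, so a hang
 * shows up as termination by that specific signal. */
enum outcome classify_call(void (*fn)(void), unsigned timeout_s)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return FORK_FAILED;
    if (pid == 0) {
        signal(SIGALRM, SIG_DFL);   /* default action: terminate the child */
        alarm(timeout_s);
        fn();
        _exit(0);
    }
    if (waitpid(pid, &status, 0) != pid)
        return FORK_FAILED;
    if (WIFEXITED(status))
        return RETURNED;
    if (WIFSIGNALED(status) && WTERMSIG(status) == SIGALRM)
        return HUNG;
    return CRASHED;
}

/* Stand-ins for the three possible behaviours of the suspect call. */
void returns_ok(void) { }
void crashes(void)    { abort(); }
void hangs(void)      { for (;;) pause(); }
```

A result of HUNG would point towards an endless loop or a blocking read inside the function, CRASHED towards a segfault or a failed assertion.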

Adrian

> [1] https://git.savannah.gnu.org/cgit/parted.git/tree/libparted/disk.c#n170
