Hi Brian

On Sun, 13 Oct 2024 10:41:58 -0600
Brian Inglis wrote:
> On 2024-10-12 17:14, Takashi Yano via Cygwin wrote:
> > Hi Brian,
> > 
> > On Tue, 8 Oct 2024 10:37:14 -0600
> > Brian Inglis wrote:
> >> On 2024-10-08 10:14, Brian Inglis via Cygwin wrote:
> >>> On 2024-10-08 05:20, Takashi Yano via Cygwin wrote:
> >>>> On Mon, 7 Oct 2024 15:11:52 +0200
> >>>> Christian Franke wrote:
> >>>>> $ gcc -o sigtest -O2 sigtest.c
> >>>>>
> >>>>> $ ./sigtest > out.txt
> >>>>> (press ^C 42x :-)
> >>>>>
> >>>>> $ sort out.txt | uniq -c
> >>>>>          3 x = 0x1.23456789p+0, y = -nan, d = -nan
> >>>>>          6 x = 0x1.23456789p+0, y = 0x1.23456789p+0, d = -nan
> >>>>>         33 x = 0x1.23456789p+0, y = 0x1.23456789p+0, d = 0x0p+0
> >>>>>
> >>>>> The problem also occurs if compiled without -O2, but less often. No
> >>>>> problem occurs if compiled with -DWORKS which suggests that only 'long
> >>>>> double' is affected.
> >>>>
> >>>> Thanks for the report. I looked into this problem and may have found
> >>>> the cause. It seems to be a bug in scripts/gendef, which generates the
> >>>> signal handler caller (sigfe.s) that stores/restores the registers.
> >>>>
> >>>> In sigdelayed, the control word is stored/restored by the fnstcw/fldcw
> >>>> instructions; however, the fninit instruction destroys some status
> >>>> registers in the FPU (x87).
> >>>>
> >>>> I think we should use fnstenv/fldenv rather than fnstcw/fldcw and fninit.
> >>>> However, I'm not familiar with x87 instructions, so I may be overlooking
> >>>> something.
> >>>>
> >>>> Could anyone who is an expert on x87 instructions and the sigfe code
> >>>> comment?
> >>>
> >>> AIUI x87 FP handling is outdated and mainly unused on current systems, as
> >>> current systems do more and use more than the legacy x87 instructions and 
> >>> stack.
> >>>
> >>> See https://en.cppreference.com/w/c/numeric/fenv and related docs for more
> >>> modern approaches.
> >>>
> >>> You would have to look into the AMD/Intel/IEEE docs for lower level 
> >>> details.
> >>
> >> This is basically what ISTR:
> >>
> >> https://beta.boost.org/doc/libs/1_82_0/libs/context/doc/html/context/rationale/x86_and_floating_point_env.html
> >>
> >> where legacy x87 and MMX registers are not used or preserved on
> >> x86_64/amd64, as SSE... instructions and XMM registers are used.
> > 
> > Thanks for the advice. I read through the web pages and related documents
> > and posted a patch which uses fxsave/fxrstor and xsave/xrstor to the
> > cygwin-patc...@cygwin.com mailing list.
> > https://cygwin.com/pipermail/cygwin-patches/2024q4/012804.html
> > 
> > Is this as you intended?
> 
> That seems to be the preferred approach now, as long as you can correctly
> determine adequate space for fxsave and xsave, given the varying feature
> sets, register counts, and register sizes of recent processors:
> sse/2/3/4.1/4.2/4a/5/ssse3 avx2/512 128/256/512 bits X/Y/ZMM registers.

Thanks for checking.

According to https://cdrdv2.intel.com/v1/dl/getContent/671110 ,
fxsave uses a fixed-length 512-byte memory area to save the current
state of the x87 FPU, MMX technology, XMM, and MXCSR registers.

The patch allocates 0x238 bytes:
 0x200 (512 bytes): fxsave area
 0x008 (  8 bytes): for 16-byte alignment
 0x010 ( 16 bytes): work area
 0x020 ( 32 bytes): reserved for later processing
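
As a quick check of the arithmetic, the four parts above can be summed in
C. This is only a sketch: the constant and function names are made up for
illustration and do not appear in the patch.

```c
#include <stddef.h>

/* Sizes quoted in this message; all names here are hypothetical. */
enum {
    FXSAVE_AREA  = 0x200, /* 512-byte fxsave image */
    ALIGN_PAD    = 0x008, /* padding for 16-byte alignment */
    WORK_AREA    = 0x010, /* work area */
    RESERVED     = 0x020, /* reserved for later processing */
    FXSAVE_FRAME = FXSAVE_AREA + ALIGN_PAD + WORK_AREA + RESERVED
};

/* Compile-time check that the parts add up to the quoted total. */
_Static_assert(FXSAVE_FRAME == 0x238, "parts must sum to 0x238 bytes");

/* Total number of bytes reserved for the fxsave case. */
static size_t fxsave_frame_size(void)
{
    return FXSAVE_FRAME;
}
```

The static assertion fails at build time if any of the four sizes is
changed without updating the total.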

According to https://cdrdv2.intel.com/v1/dl/getContent/671436 ,
the cpuid instruction with eax=0dh and ecx=00h returns in ebx the
maximum size required by xsave for the features currently enabled
in XCR0. So the patch allocates ebx + 0x048 bytes:
 0x018 ( 24 bytes): for 64-byte alignment
 0x010 ( 16 bytes): work area
 0x020 ( 32 bytes): reserved for later processing
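
For illustration, the same query can be made from C. This is a minimal
sketch, assuming GCC or Clang on x86 with the <cpuid.h> helper header;
the function names are hypothetical, and the patch itself does this in
the assembly generated by scripts/gendef, not in C.

```c
#include <cpuid.h>   /* GCC/Clang helper header, x86 only (assumption) */
#include <stddef.h>

/* Returns the xsave area size reported in ebx by cpuid leaf 0dh,
   sub-leaf 0, or 0 if that leaf is not available. */
static size_t xsave_area_size(void)
{
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid_count(0x0d, 0, &eax, &ebx, &ecx, &edx))
        return 0;
    return ebx;
}

/* Adds the extra 0x048 bytes listed above to a given xsave area size:
   0x018 alignment + 0x010 work area + 0x020 reserved. */
static size_t xsave_frame_size(size_t area)
{
    return area + 0x018 + 0x010 + 0x020;
}
```

For example, if cpuid reported 0x340 in ebx, xsave_frame_size(0x340)
would give 0x388 bytes.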

-- 
Takashi Yano <takashi.y...@nifty.ne.jp>
