Hi Richard, > On 26 Oct 2023, at 20:49, Richard Sandiford <richard.sandif...@arm.com> wrote: > > Iain Sandoe <iains....@gmail.com> writes: >> This was written before Thomas' modification to the ELF-handling to allow >> a config-based change for target details. I did consider updating this >> to try and use that scheme, but I think that it would sit a little >> awkwardly, since there are some differences in the start-up scanning for >> Mach-O. I would say that in all probability we could improve things but >> I'd like to put this forward as a well-tested initial implementation. > > Sorry, I would prefer to extend the existing function instead. > E.g. there's already some divergence between the Mach-O version > and the default version, in that the Mach-O version doesn't print > verbose messages. I also don't think that the current default code > is so watertight that it'll never need to be updated in future.
Fair enough, will explore what can be done (as I recall last I looked the primary difference was in the initial start-up scan). > Some questions/suggestions below. > >> It has been in use for more than 2 years on the aarch64 Darwin devt. >> branch. Tested also in configurations on x86_64-linux and >> aarch64-linux. Obviously, this is not aarch64-specific but that is >> out initial use-case. OK for trunk? >> thanks >> Iain >> >> --- 8< --- >> >> We need to process the source slightly differently from ELF, especially >> in that we have __USER_LABEL_PREFIX__ and there are no function start >> and end assembler directives. This means we cannot delineate functions >> when frame output is switched off. >> >> TODO: consider adding -mtest-markers or something similar to inject >> assembler comments that can be scanned for. >> >> gcc/testsuite/ChangeLog: >> >> * lib/scanasm.exp: Initial handling for Mach-O function body scans. >> >> Signed-off-by: Iain Sandoe <i...@sandoe.co.uk> >> --- >> gcc/testsuite/lib/scanasm.exp | 55 ++++++++++++++++++++++++++++++++++- >> 1 file changed, 54 insertions(+), 1 deletion(-) >> >> diff --git a/gcc/testsuite/lib/scanasm.exp b/gcc/testsuite/lib/scanasm.exp >> index 5df80325dff..fab4b0669ec 100644 >> --- a/gcc/testsuite/lib/scanasm.exp >> +++ b/gcc/testsuite/lib/scanasm.exp >> @@ -851,6 +851,44 @@ proc parse_function_bodies { config filename result } { >> close $fd >> } >> >> +proc parse_MACHO_function_bodies { filename result } { >> + upvar $result up_result >> + >> + # Regexp for the start of a function definition (name in \1). >> + set label {^(_[a-zA-Z_]\S+):$} >> + set start {^LFB[0-9]+} > > Do you need to single out the start label like this? Yes, as of now we do; Mach-O has not .function or .size markers to delineate the code. So, without modifying the asm output to emit something new that we can detect, and given that unwind frames are the default on Darwin, the LFB/LFE markers are the only reliable mechanism at present > Couldn't we just skip LFB labels, by extending the fluff regexp? in this instance, they are not fluff. The idea I had recently was to emit some well-defined markers into to asm comments (say in response to -memit-test-markers or similar) so that we could scan for those and avoid depending on the LFB/E. However, that means modifying the port output code as well. >> + >> + # Regexp for the end of a function definition. >> + set terminator {^LFE[0-9]+} >> + >> + # Regexp for lines that aren't interesting. >> + set fluff {^\s*(?:\.|//|@)} >> + set fluff3 {^L[0-9ACESV]} > > Is there any need for two regexps here? It looks like we could combine > them as: > > {^\s*(?:\.|//|@)|^L[0-9ACESV]} Will examine that again (I think I only inherited two matches from the original). > Or, with the above suggestion too: > > {^\s*(?:\.|//|@)|^L[0-9ACESV]|^LFB[0-9]} > > If that works, it seems to fit within the existing config scheme. OK - will see how far I get. thanks Iain > Thanks, > Richard > >> + >> + set fd [open $filename r] >> + set in_function 0 >> + while { [gets $fd line] >= 0 } { >> + if { !$in_function && [regexp $label $line dummy function_name] } { >> + set in_function 1 >> + set function_body "" >> + } elseif { $in_function == 1 } { >> + if { [regexp $start $line] } { >> + set in_function 2 >> + } else { >> + set in_function 0 >> + } >> + } elseif { $in_function == 2 } { >> + if { [regexp $terminator $line] } { >> + set up_result($function_name) $function_body >> + set in_function 0 >> + } elseif { ![regexp $fluff $line] && ![regexp $fluff3 $line] } { >> + append function_body $line "\n" >> + } >> + } >> + } >> + close $fd >> +} >> + >> # FUNCTIONS is an array that maps function names to function bodies. >> # Return true if it contains a definition of function NAME and if >> # that definition matches BODY_REGEXP. >> @@ -883,6 +921,14 @@ proc check-function-bodies { args } { >> error "too many arguments to check-function-bodies" >> } >> >> + set isELF 1 >> + # some targets have a __USER_LABEL_PREFIX__ >> + set needsULP 0 >> + if { [istarget *-*-darwin*] } { >> + set isELF 0 >> + set needsULP 1 >> + } >> + >> if { [llength $args] >= 3 } { >> set required_flags [lindex $args 2] >> >> @@ -944,7 +990,11 @@ proc check-function-bodies { args } { >> remote_upload host "$filename" >> } >> if { [file exists $output_filename] } { >> - parse_function_bodies config $output_filename functions >> + if { $isELF } { >> + parse_function_bodies config $output_filename functions >> + } else { >> + parse_MACHO_function_bodies $output_filename functions >> + } >> set have_bodies 1 >> } else { >> verbose -log "$testcase: output file does not exist" >> @@ -988,6 +1038,9 @@ proc check-function-bodies { args } { >> if { $xfail_all || [string equal $selector "F"] } { >> setup_xfail "*-*-*" >> } >> + if { $needsULP } { >> + set function_name "_$function_name" >> + } >> set testname "$testcase check-function-bodies $function_name" >> if { !$have_bodies } { >> unresolved $testname