Hi Rowan Tommins,

> > var_representation  may be useful to a user when any of the following apply:
> >
> > - You are generating a snippet of code to eval() in a situation where the 
> > snippet will occasionally or frequently be read by a human  (If the output 
> > never needs to be read by a human, `return unserialize(' . 
> > var_export(serialize($data), true) . ');` can be used)
> 
> 
> As far as I know I have never had any reason to generate code and then 
> eval() it, and can't think of a situation where I ever would. If I 
> wanted a machine-readable output from a variable, I would use 
> serialize() or json_encode().
>
> That's not to say that there aren't cases where those requirements do 
> happen, but I think it is a very niche use case to dedicate two 
> different built-in functions to.

Even if a developer such as yourself doesn't need to generate code and doesn't 
expect to directly use it,
**some of the applications and libraries they use every day do need to 
generate code that is both human and machine readable.**
That output is then shown to users of those libraries/applications,
or saved to files that users need to read when submitting bug reports or 
trying to understand an issue,
e.g. trying to understand why a unit test mock isn't doing what they'd expect.
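
To make the two forms of generated code concrete, here is a minimal sketch using only existing functions (var_export, serialize, eval). The sample data is made up for illustration:

```php
<?php
// Sample data; the values are arbitrary.
$data = ['name' => 'example', 'flags' => [true, null], 'ratio' => 1.5];

// Readable form: a PHP expression a human can inspect in a generated file.
$readable = 'return ' . var_export($data, true) . ';';

// Opaque form: still valid PHP, but the payload is a serialized string.
$opaque = 'return unserialize(' . var_export(serialize($data), true) . ');';

// Both snippets evaluate back to the original value.
$fromReadable = eval($readable);
$fromOpaque = eval($opaque);
```

Both round-trip correctly, but only the first is something a user could usefully read in a bug report.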

For example, some composer autoload files are generated using var_export,
and composer also uses var_export for generating some exception messages,
where it may be useful to have $e->getMessage() be a single line.
https://github.com/composer/composer/blob/master/src/Composer/Repository/FilesystemRepository.php#L205-L208
(If users were trying to diagnose composer not autoloading a class, having 
these files be more readable would be useful
many years from now, if composer's minimum version became php 8.1.)

And a subset of the uses of var_export in the dependencies of a project I'm 
using:

```
vendor/sebastian/global-state/src/CodeExporter.php
// in protected function recursiveExport
67:            return \var_export($variable, true);

70:        return 'unserialize(' . \var_export(\serialize($variable), true) . ')';

vendor/phpunit/php-code-coverage/src/Report/PHP.php
16: * Uses var_export() to write a SebastianBergmann\CodeCoverage\CodeCoverage object to a file.

37:            \var_export($coverage->getData(true), true),

vendor/phpspec/prophecy/src/Prophecy/Doubler/Generator/ClassCodeGenerator.php
104:                $php .= ' = '.var_export($argument->getDefault(), true);

vendor/symfony/console/Descriptor/MarkdownDescriptor.php
62:            .'* Default: `'.str_replace("\n", '', var_export($argument->getDefault(), true)).'`'
```

If you want to distinguish between array, stdClass, and MyClass, json_encode 
isn't adequate.
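
A small sketch of that limitation, assuming a hypothetical class named MyClass with one public property:

```php
<?php
// Hypothetical class, used only for this illustration.
final class MyClass
{
    public $a = 1;
}

$values = [
    'array'    => ['a' => 1],
    'stdClass' => (object) ['a' => 1],
    'MyClass'  => new MyClass(),
];

// json_encode() produces the same text, {"a":1}, for all three values.
$json = array_map('json_encode', $values);

// var_export() produces three different strings
// (the exact text depends on the PHP version).
$exported = array_map(function ($v) {
    return var_export($v, true);
}, $values);
```

Once encoded as JSON, the type information is gone, so a reader of the output cannot tell which of the three values was passed in.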

> > - You are writing unit tests for applications supporting PHP 8.1+ (or a 
> > var_representation polyfill) that test the exact string representation of 
> > the output (e.g. phpt tests of php-src and PECL extensions)
> 
> 
> Since test output doesn't need to be executable, I would have thought 
> var_dump would be more appropriate than var_export here.

https://www.php.net/var_dump does not have an option to return its result as 
a string - it always writes to standard output.
That output can be captured with output buffering, but if __debugInfo or an 
error handler echoes anything, that also gets captured and interferes with 
the test.
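
A minimal sketch of that problem. The class Noisy and the helper dump_to_string are hypothetical names made up for this example:

```php
<?php
// Hypothetical class whose __debugInfo() has an echo side effect.
final class Noisy
{
    public function __debugInfo(): array
    {
        echo "side effect\n";
        return ['x' => 1];
    }
}

// Hypothetical helper: capture var_dump() output via output buffering.
function dump_to_string($value): string
{
    ob_start();
    var_dump($value);
    return (string) ob_get_clean();
}

// The echoed text ends up inside the captured dump.
$captured = dump_to_string(new Noisy());

// var_export() returns a string directly and does not call __debugInfo().
$exported = var_export(new Noisy(), true);
```

Here $captured contains the "side effect" line mixed in with the dump, while $exported does not.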

Additionally, humans need to update the test expectations and to read the test 
output when it fails.
If that output contains control characters or unexpectedly mixes line endings, 
it is inconvenient to work with files containing the raw output of 
var_export/var_dump.
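
For illustration, a short sketch showing that var_export leaves control characters raw in its output. json_encode is included only as a contrast, since it escapes them; the sample string is made up:

```php
<?php
// A string with a Windows line ending and a tab.
$value = "line1\r\nline2\ttabbed";

// var_export() only escapes single quotes and backslashes (and NUL bytes),
// so the carriage return, newline and tab are emitted as raw bytes.
$exported = var_export($value, true);
$hasRawCr = strpos($exported, "\r") !== false;

// For contrast: json_encode() writes them as \r, \n and \t escape sequences.
$escaped = json_encode($value);
```

An expectation file containing $exported therefore has a stray carriage return that editors and version control tools may silently normalize.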

> > As I mentioned before, var_export suffers from many shortcomings such as 
> > the fact
> > that it can have more lines of output than var_dump for complex 
> > datastructures,
> > and doesn't escape control characters.
> 
> Checking on php-src master, this does seem to be the case in the 
> majority of tests:
> 
> - var_export appears 703 times in 136 different *.phpt files (0.8% of files)
> - print_r appears 827 times in 342 different *.phpt files (2.1% of files)
> - var_dump appears 33503 times in 9599 different *.phpt files (59.7% of files)
> 
> So if we want to improve anything for that use case, we need to improve 
> or replace var_dump, not var_export.

**php-src phpt tests are tests of php itself, which puts strict and 
atypical limitations on the test framework.**
php-src's phpt test framework may represent the needs of php-src and some PECL 
maintainers, but not those of userland.
In a userland project, I could easily choose to add a whole lot of utility 
functions/methods such as
`function dump_repr($value) { echo var_representation($value), "\n"; }` and use 
that to replace var_dump.
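
A slightly fuller sketch of such a helper. Note the assumption: var_representation() is the function proposed in this RFC, so it only exists if the running build, an extension or a polyfill provides it. The fallback to var_export() is there just so the sketch runs anywhere:

```php
<?php
// Assumption: var_representation() is the function proposed in the RFC and
// is only defined when a build, extension or polyfill provides it.
function dump_repr($value): void
{
    $text = function_exists('var_representation')
        ? var_representation($value)
        : var_export($value, true); // fallback so the sketch runs anywhere
    echo $text, "\n";
}

dump_repr(['a', 'b']);
```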

I consider adding those helper methods **to php-src itself** impractical 
because tests of php-src should be self-contained.
If you encounter an issue with the engine, opcache or JIT, it's much, much 
harder to diagnose if dozens of userland helper functions
were loaded and invoked before the snippet in question was invoked.

So I do use var_dump in phpt tests themselves, mainly because:

1. Currently, control characters such as `\r` are not escaped, but var_dump 
prints the string length, so I'm more confident `string(4) "test"` has no 
control characters
2. var_export does not append a newline, so with var_dump it's more convenient 
to copy the .out file into the `--OUT--` section
3. I'm avoiding adding reusable helpers in a self-contained test case
4. var_export output makes the overall test output longer for arrays of arrays

Still, I would prefer using var_representation over var_dump in phpt for many 
use cases, especially with VAR_REPRESENTATION_SINGLE_LINE available.

> > - You need to copy the output into a codebase that's following a modern 
> > coding style guideline such as modern coding guidelines such as PSR-2. It 
> > also saves time if you don't have to remove array keys of lists and convert 
> > array() to [].
> 
> 
> Trying to match any particular coding style seems rather outside the 
> remit of a built-in function - do we need flags for tabs vs spaces, 
> trailing commas, etc, etc? Surely it's simpler for users to take the 
> existing var_export format and use their IDE or dev scripts to re-format 
> it to taste.


Again, the IDE may have issues with the control characters in strings, or may 
unexpectedly remove or add them
(e.g. Windows vs Unix newlines).

For tabs vs spaces, Sara Golemon suggested an `'indent'` option
where users could choose what string prefix to use as spaces/tabs,
but I feel this would increase the initial scope of the RFC too much.

A new developer may not have those features or plugins installed in 
their IDE,
or may not be aware of the shortcut/command to invoke those 
features on a range.
(phpcbf, or for reindenting, Ctrl+I in Eclipse, `=` in vim, etc.)
Additionally, scripts to reindent may not remove the `0 =>`, `1 =>`, etc.,
either because they do not contain a php parser or because they assume the 
user deliberately added those keys.

It would be much easier to reindent the output of var_representation than to 
rewrite+reindent the output of var_export.
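
For reference, this is what the existing var_export output looks like for a two-element list (the list itself is just an example):

```php
<?php
$list = ['a', 'b'];

// var_export() uses the long array ( ... ) syntax and spells out list keys.
$exported = var_export($list, true);

echo $exported, "\n";
// Prints:
// array (
//   0 => 'a',
//   1 => 'b',
// )
```

Turning that into `['a', 'b']` means dropping the keys and converting the array syntax, which a plain reindent does not do.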


Regards,
Tyson
--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php
