On Sun, Dec 22, 2024 at 5:12 AM Larry Garfield <la...@garfieldtech.com>
wrote:
> On Sat, Dec 21, 2024, at 10:43 AM, Jakub Zelenka wrote:
> > On Fri, Dec 20, 2024 at 8:29 PM Larry Garfield <la...@garfieldtech.com>
> wrote:
> >> Background: PHP has a not-often-considered feature, the stat-cache.
> That is, the runtime caches the OS stat() call for files, so that
> subsequent reads on the same file can be faster.  However, it's even less
> realized that it's a single-file cache.  It literally only applies when you
> try to do two file-infomation operations on the same file in rapid
> succession, without any other file reads in between.
>
> > I would prefer to disable it by default but keep some option (INI) to
> > re-enable it. I think that for most users the perf impact will be
> > negligible. However, it is quite likely that there are some user
> > workflows and platforms where benefiting from the stat cache can be
> > still significant in terms of performance. So those users should have
> > the option to re-enable it if they see some significant regression
> > rather then force them to update their code to make it faster or
> > implement their own cache which would just make their migration to the
> > next version much harder / potentially impossible. There is not such a
> > huge maintenance that we would really need to get rid of it completely.
> > I would really prefer having such option and tell to users to re-enable
> > it rather than not be able to deal with potentially reported future
> > perf regressions.
> >
> > I think the main issue with the cache is that is just not convenient
> > for use cases where it doesn't get flushed during some different access
> > methods that don't trigger flush. We could probably improve the stream
> > situation a bit but it still leaves external (e.g. shell) access
> > problem in place which we just cannot fix. On the other hand it is
> > possible to use it in a way that users can profit from it but they
> > really need to know how it works. That's way it should be an optional
> > feature IMO. We should also improve documentation in that regards.
> >
> > In terms of voting, if there was no option to re-enable it, I would
> > probably vote against this proposal as I'm a bit worried about those
> > possible regression reports.
>
> I really don't like the idea of another ini toggle.  That actually creates
> more work, as people writing code that works with the file system now have
> one more invisible context they have to think about.  Which means they
> probably won't, until it bites them.  (They'll either never bother clearing
> the cache, so their code may malfunction on the rare system where it's
> enabled, or always clear it, which 99.9% of the time will actually be
> slower as we have to invoke the function for it to do nothing.  Both are
> bad.)
>
>
Well it's much less likely to bite anyone than if it's always on. I think
if we document it well and there is a good switch note, it should be clear
enough for users and only users that understand what it does should enable
it.

I can see that if anyone enables it just on prod, then they will have hard
time to recreate the issues on local setup but that's already the case with
some other option. You just need to get the right settings from prod to be
able to recreate things on local setup.

I don't really have a better idea how to minimize impact on the users if
they see significant regression from this change. Changing the functions
signature is just not viable IMO.


> I suppose a possible alternative would be to modify all file system
> mutation functions (file_put_contents(), touch(), etc.) to flush the cache,
> which for whatever reason doesn't happen now.  That would be above my skill
> level, though, so someone else would need to do it.  Also, I don't know if
> there's a good reason those functions don't clear the cache currently or if
> it was just an oversight.
>
>
As I said we could probably handle some stream cases more aggressively but
it won't resolve the problem completely. We still have things like
system("touch /file/path") which we cannot flush the stat cache for. And
it's not just shell access - there might be some 3rd party extensions that
operate on files or there might be other programs accessing files at the
same time. So there are many places which we just cannot control.

Regards

Jakub

Reply via email to