Wenjun, I feel we're just not communicating. Your suggestion seems to be a solution in search of a problem. And now you're making more super speculative suggestions. How much do you actually know about Python's internals? It's not at all like C++, where I could see the distinction between user allocations and system allocations making sense.
--Guido

On Mon, Jul 20, 2020 at 7:25 PM Wenjun Huang <[email protected]> wrote:

> Hi Guido,
>
> Thank you for bearing with me. I wasn't trying to say you guys are mean
> btw.
>
> I thought that the interpreter might allocate some memory for its own use.
> Perhaps I was wrong, but I'll work with your examples here just to be sure.
>
> Stack frames would be considered as interpreter objects here, as they
> aren't created because a user object is created. Instead, they are the
> results of function calls. Following that, empty spaces in hash tables and
> string hashes would be considered as user allocations, as they are created
> through explicitly created objects. I think a transitive relation would
> work here (i.e. if an explicit object allocation triggers an implicit
> allocation, then the latter is considered a user allocation).
>
> Now, maybe getting this to work doesn't benefit profiler users so much,
> but there are other potential uses as well. Hopefully they can be more
> compelling. I didn't bring these up earlier because I thought the profiling
> case was easier to discuss.
>
> For example, provenance of data can be tracked through taint analysis, but
> if all objects are lumped together then we have to taint the entire
> interpreter.
>
> Another example would be partial GIL sidestepping. The approach would be
> blowing up threads into processes and allocating all user objects in shared
> memory (accesses would be synchronized). This way we get parallel execution
> and threading semantics. However, this is not possible if we can't isolate
> user objects, as there's no sensible default to synchronize interpreter
> states. This design has been done before for C/C++
> (https://people.cs.umass.edu/~emery/pubs/dthreads-sosp11.pdf), but for
> different reasons.
>
> On Mon, Jul 20, 2020 at 8:16 PM Guido van Rossum <[email protected]> wrote:
>
>> On Mon, Jul 20, 2020 at 4:09 PM Wenjun Huang <[email protected]>
>> wrote:
>>
>>> Hi Barry,
>>>
>>> It's not just about leaks. You might want to know if certain objects are
>>> occupying a lot of memory by themselves. Then you can optimize the memory
>>> usage of these objects.
>>>
>>> Another possibility is to do binary instrumentation and see how the user
>>> code is interacting with objects. If we can't tell which objects are
>>> created by the interpreter internals, then interpreter accesses and user
>>> accesses would be mixed together. It's likely that some accesses would be
>>> connected of course, but I don't think this should be outright labeled as
>>> useless.
>>>
>>
>> I have to side with Barry -- I don't understand why the difference
>> between "interpreter internals" and "user objects" matters. Can you give
>> some examples of interpreter internals that aren't being allocated in
>> direct response to user code? For example you might call stack frames
>> internals. But a stack frame is only created when a user calls a function,
>> so maybe that's a user object too? Or take dictionaries. These contain hash
>> tables with empty spaces in them. Are the empty spaces internals? Or
>> strings. These cache the hash value. Are the 8 bytes for the hash value
>> interpreter internals?
>>
>> So, here's my request -- can you clarify your need for the
>> differentiation? Other than just pointing to Scalene. If Scalene has a
>> reason for making this differentiation can you explain what Scalene users
>> get out of this? Suppose Scalene tells me "your objects take 84.3% of the
>> memory and interpreter internals take the other 15.7%" what can I as a user
>> do with that information?
>>
>>
>>> Also, I'm not saying "we must implement this because it's so useful."
>>> My original intention is to understand:
>>> (1) is the differentiation being done at all?
>>>
>>
>> It's not.
>> We're not being mean here. If it was being done someone would
>> have told you after your first message.
>>
>>
>>> (2) if it's not being done, why?
>>>
>>
>> Because nobody saw a need for it. In fact, apart from you, there still
>> isn't anyone who sees the need for it, since you haven't explained your
>> need. (This, too, should have been obvious to you given the responses you've
>> gotten so far. :-)
>>
>>
>>> (3) does it make sense to implement it?
>>>
>>
>> Probably not. I certainly don't expect it to be easy. So it won't "make
>> sense" unless you have actually explained your reason for wanting this and
>> convinced some folks that this is a good reason. See the answer for (1) and
>> (2) above.
>>
>>
>>> So far I think I've got the answers to 1 & 2--it's not being done
>>> because people don't find it useful. The answer to 3 is most likely "no"
>>> due to the costs, but it would be nice if someone could weigh in on this
>>> part. Maybe there's some workaround.
>>>
>>
>> If you were asking me to weigh in *now* I'd say "no", if only because you
>> haven't explained the reason why this is needed. And if you have an
>> implementation idea in mind, please don't be shy.
>>
>> --
>> --Guido van Rossum (python.org/~guido)
>> *Pronouns: he/him **(why is my pronoun here?)*
>> <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
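
A minimal sketch of the dictionary example quoted above, assuming CPython. The helper `build_table` and the variable names are made up for illustration and do not come from the thread. It uses `sys.getsizeof` to show that a dict's reported size already includes its spare hash-table slots, and `tracemalloc` to show that traced blocks are grouped by the Python source line that was running when they were allocated (the documented grouping keys are `filename`, `lineno`, and `traceback`).

```python
import sys
import tracemalloc


def build_table(n):
    """Hypothetical helper: build an ordinary user-level dict with n entries."""
    return {i: str(i) for i in range(n)}


# sys.getsizeof reports the dict's own footprint, spare slots included,
# so on CPython the size moves in steps instead of changing with every key.
sizes = [sys.getsizeof(build_table(n)) for n in range(20)]
distinct_sizes = sorted(set(sizes))
print("distinct dict sizes for 0..19 keys:", distinct_sizes)

# Trace allocations while one larger dict is built and kept alive.
tracemalloc.start()
table = build_table(10_000)
snapshot = tracemalloc.take_snapshot()
tracemalloc.stop()

# Group the traced blocks by source line and show the largest groups.
top = snapshot.statistics("lineno")[:3]
for stat in top:
    print(stat)
```

On a typical CPython build the first line of output lists far fewer than 20 distinct sizes, and the `tracemalloc` report charges the hash table and the strings alike to the line inside `build_table`. Exact byte counts vary by version and platform.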
_______________________________________________
Python-ideas mailing list -- [email protected]
Message archived at https://mail.python.org/archives/list/[email protected]/message/6ISATRBVG3R6TLN2QXSC3PEY3EJ53IKW/
