Thank you, Dmitry.

One question before I describe what we are doing with NJS. I did read about the 
VM handling process before switching from Lua to NJS and it sounded very 
practical but my current understanding is that there could be multiple VMs 
instantiated for a single request. A js_set, js_content, and js_header_filter 
directive that applies to a single request, for example, would instantiate 3 
VMs. And if you needed to set multiple variables with js_set, each one would 
add to that number of VMs. My original understanding was that those VMs 
would be destroyed once they exited so even if you had multiple VMs 
instantiated per request, the memory impact would not be cumulative in a single 
request. Is that understanding correct? Or are you saying that each VM 
accumulates more and more memory until the entire request completes?

As far as how we’re using NJS, we’re mostly using it for header filters, 
internal redirection, and access control. So there really shouldn’t be a threat 
to memory in most instances unless we’re not just dealing with a single request 
memory leak inside of a VM but also a memory leak that involves every VM that 
NJS instantiates just accumulating memory until the request completes.

Right now, my working theory about what is most likely to be creating the 
memory spikes has to do with POST body analysis. Unfortunately, some of the 
requests that I have to deal with are POSTs that have to either be denied 
access or routed differently depending on the contents of the POST body. 
Unfortunately, these same routes can vary in the size of the POST body and I 
have no control over how any of that works because the way it works is 
controlled by third parties. One of those third parties has significant market 
share on the internet so we can’t really avoid dealing with it.

In any case, before we switched to NJS, we were using Lua to do the same things 
and that gave us the advantage of doing both memory cleanup if needed and also 
doing easy analysis of POST body args. I was able to do this sort of thing with 
Lua before:
local post_args, post_err = ngx.req.get_post_args()
if post_args.arg_name == something then
    -- deny access or route the request differently here
end

But in NJS, there’s no such POST body utility so I had to write my own. The 
code that I use to parse out the POST body works for both URL encoded POST 
bodies and multipart POST bodies, but it has to read the entire POST into a 
variable before I can use it. For small POSTs, that’s not a problem. For larger 
POSTs that contain a big attachment, it would be. Ultimately, I only care about 
the string key/value pairs for my purposes (not file attachments) so I was 
hoping to discard attachment data while parsing the body. I think that that is 
actually how Lua’s version of this works too. So my next thought was that I 
could use a Buffer and fs.readSync to read the POST body in buffer frames to 
keep memory minimal so that I could discard any file attachments from 
the POST body and just evaluate the key/value data that uses simple strings. 
But from what you’re saying, it sounds like there’s basically no difference 
between fs.readSync with a Buffer and fs.readFileSync in terms of actual memory 
use. So either way, with a large POST body, you’d be steamrolling the memory 
use in a single Nginx worker process. When I had to deal with stuff like this in 
Lua, I’d just run collectgarbage() to clean up memory and it seemed to work 
fine. But then I also wasn’t having to parse out the POST body myself in Lua 
either.
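
To make the idea of discarding attachment data concrete, here is a rough sketch of that kind of parser in plain JavaScript. It assumes the whole body is already available as a string, so it does not lower peak memory by itself; it only avoids copying file parts into the result. The function name parseMultipartFields and its boundary argument are hypothetical and not part of NJS.

```javascript
// Hypothetical helper (not an NJS API): collect only plain string fields
// from a multipart/form-data body and skip any part that carries a filename.
// Assumes CRLF line endings and that `boundary` comes from the Content-Type header.
function parseMultipartFields(body, boundary) {
    const fields = {};
    const delimiter = '--' + boundary;
    let pos = body.indexOf(delimiter);

    while (pos !== -1) {
        const start = pos + delimiter.length;
        // "--boundary--" marks the end of the body.
        if (body.startsWith('--', start)) break;

        const next = body.indexOf(delimiter, start);
        if (next === -1) break; // truncated body: stop rather than guess

        const headerEnd = body.indexOf('\r\n\r\n', start);
        if (headerEnd !== -1 && headerEnd < next) {
            const headers = body.slice(start, headerEnd);
            const name = /[;\s]name="([^"]*)"/i.exec(headers);
            // Only copy the value for parts that are not file uploads.
            if (name && !/filename=/i.test(headers)) {
                // The value ends at the CRLF that precedes the next delimiter.
                fields[name[1]] = body.slice(headerEnd + 4, next - 2);
            }
        }
        pos = next;
    }
    return fields;
}
```

Scanning with indexOf instead of splitting the body means the attachment bytes are never copied into intermediate strings, although the original body still has to be held in memory.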

It’s possible that something else is going on other than that. qs.parse seems 
like it could get us into some trouble if the query_string that was passed was 
unusually long too from what you’re saying about how memory is handled. None of 
the situations that I’m handling are for long running requests. They’re all 
designed for very fast requests that come into the servers that I manage on a 
constant basis.
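
One way to limit the qs.parse exposure described above is to refuse to parse a query string past some length. This is only a sketch: the parseQueryBounded name and the 2048 limit are hypothetical, and the raw query string is assumed to be passed in by the caller.

```javascript
// Uses the built-in querystring module.
const qs = require('querystring');

// Hypothetical limit; pick whatever fits the traffic you actually expect.
const MAX_QUERY_LENGTH = 2048;

// Returns the parsed arguments, or null when the query string is missing
// or longer than maxLength (so nothing is allocated for oversized input).
function parseQueryBounded(rawQuery, maxLength) {
    if (typeof rawQuery !== 'string' || rawQuery.length > maxLength) {
        return null;
    }
    return qs.parse(rawQuery);
}
```

A caller would treat a null result as "do not evaluate the arguments" and fall back to a default route or a denial.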

If you can shed some light on the way that VMs and their memory are handled 
per my question above and any insights into what to do about this type of 
situation, that would help a lot. I don’t know if there are any plans to offer 
a POST body parsing feature in NJS for those that need to evaluate POST body 
data like how Lua did it, but if there was some way to be able to do that at 
the Nginx layer instead of at the NJS layer, it seems like that could be a lot 
more sensitive to memory use. Right now, if my understanding is correct, the 
only option that I’d even have would be to just stop doing POST body handling 
if the POST body is above a certain total size. I guess if there was some way 
to forcibly free memory, that would help too. But I don’t think that that is as 
common of a problem as having to deal with very large query strings that some 
third party appends to a URL (probably maliciously) and/or a very large file 
upload attached to a multipart POST. So if parsing a larger body were no longer 
a memory problem, the only remaining concern that I’d have would be whether 
multiple js_set directives and the like just keep spawning VMs 
and accumulating memory during a single request.
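
The "stop handling above a certain size" fallback mentioned above could look roughly like this. It is a sketch under assumptions: shouldInspectBody is a hypothetical name, and the Content-Length value is assumed to be passed in as the raw header string (or undefined when the header is absent, as with chunked uploads).

```javascript
// Hypothetical guard: decide whether a POST body is small enough to parse.
// contentLength is the raw Content-Length header value, or undefined.
function shouldInspectBody(contentLength, maxBytes) {
    // Reject missing or malformed values instead of guessing a size.
    if (typeof contentLength !== 'string' || !/^\d+$/.test(contentLength)) {
        return false;
    }
    return Number(contentLength) <= maxBytes;
}
```

Requests that fail the check would skip body analysis entirely and take whatever default route or denial makes sense, so a large attachment never gets read into a variable.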

Any thoughts?

—
Lance Dockins

> On Thursday, Sep 21, 2023 at 1:45 AM, Dmitry Volyntsev <xei...@nginx.com 
> (mailto:xei...@nginx.com)> wrote:
>
> On 20.09.2023 20:37, Lance Dockins wrote:
> > So I guess my question at the moment is whether endless memory use
> > growth being reported by njs.memoryStats.size after file writes is
> > some sort of false positive tied to quirks in how memory use is being
> > reported or whether this is indicative of a memory leak? Any insight
> > would be appreciated.
>
> Hi Lance,
> The reason njs.memoryStats.size keeps growing is because NJS uses arena
> memory allocator linked to a current request and a new object
> representing memoryStats structure is returned every time
> njs.memoryStats is accessed. Currently NJS does not free most of the
> internal objects and structures until the current request is destroyed
> because it is not intended for a long running code.
>
> Regarding the sudden memory spikes, please share some details about JS
> code you are using.
> One place to look is to analyze the amount of traffic that goes to NJS
> locations and what exactly those locations do.
>