﻿id	summary	reporter	owner	description	type	status	priority	milestone	component	version	resolution	keywords	cc	uname	nginx_version
63	Error while allocating	www.google.com/accounts/o8/id?id=AItOawnbO_Bp0gq-BdFo1JSwu_TT7SOWGIFncw4	somebody	"We are running nginx 1.0.10.

The traffic consists of large POST requests (bodies of more than 4K, up to 16K; hundreds to thousands of requests per second).
  
nginx kept crashing within seconds, during a free:
#4  0x000000000041ac41 in ngx_destroy_pool (pool=0x1f5e3f0) at core/ngx_palloc.c:64

Below are the steps we took to track down this crash:

1. We commented out our code and responded with a dummy predefined message instead.

After this step I noticed that nginx crashes and generates cores easily (e.g. started at Nov 29 08:06:08; 1st core was created at 2011/11/29 08:06:30, 2nd at 2011/11/29 08:06:33, 3rd at 08:07:16, 4th at 2011/11/29 08:07:22).
I also noticed that it always crashes in the same function and line, ngx_destroy_pool, while it frees a large pool (the same large pool for all the cores from the same run).

#4  0x000000000041ac41 in ngx_destroy_pool (pool=0x1f5e3f0) at core/ngx_palloc.c:64

2. After running strlen on the allocated buffer I noticed that the allocated size is the same as the length returned by strlen.
This means the allocation size is content_ln and not content_ln+1 (+1 for the end-of-string character); the allocation does not include room for the null char.
ngx_alloc 0x125e8a0 size 4376

(gdb) p strlen((char *)(l->alloc))
$1 = 4376
(gdb) p l->alloc
$2 = (void *) 0x125e8a0

3. I added +1 to the allocation size in ngx_palloc_large and reran:
static void *
ngx_palloc_large(ngx_pool_t *pool, size_t size)
{
    void              *p;
    ngx_uint_t         n;
    ngx_pool_large_t  *large;

    p = ngx_alloc(size+1, pool->log); //THE CHANGE IS HERE
....
}

It ran for a while and no cores were generated, whereas before I could easily generate cores within seconds.

4. I restored our original code and reran.

So far no cores have been created, and it has been running for almost 12 hours.
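
As a rough illustration of the off-by-one suspected in step 2 (a standalone sketch, not nginx code; copy_body_as_cstring is a hypothetical helper, and it assumes the caller wants to treat a length-delimited body as a C string): storing len bytes plus a terminator needs len + 1 bytes.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of nginx: copy a length-delimited
 * body into a freshly allocated, null-terminated C string.
 */
static char *
copy_body_as_cstring(const char *body, size_t len)
{
    char  *s;

    s = malloc(len + 1);    /* + 1 leaves room for the terminator */
    if (s == NULL) {
        return NULL;
    }

    memcpy(s, body, len);
    s[len] = '\0';          /* with malloc(len) this write would be one byte out of bounds */

    return s;
}
```

If the buffer is only ever handled together with its length, as nginx does with ngx_str_t (len plus data), no terminator and no extra byte are needed.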

There is probably a better/cleaner fix for this.

I marked this as major because the crash can lead to a deadlock: all the workers end up waiting for a malloc lock that was never released by a thread killed on a false ""double-free"" diagnostic (memory corruption).

Please let me know if you need any core files.

Thanks,
Rafit 

"	defect	closed	major		nginx-core	1.0.x	worksforme		rafit.izhak.ratzin@…		1.0.10
