Discussion:
Understanding ioBufAllocator behaviour
Kapil Sharma (kapsharm)
2017-05-24 15:29:25 UTC
On plateauing - not necessarily; we do see the memory consumption increasing continuously in our deployments as well. It depends on the pattern of segment sizes over time.

ATS uses power-of-2 allocators for its memory pools - there are 15 of those, ranging from 128 bytes to 2M if my memory serves me right - and these are per thread! ATS will choose an optimal allocator for the segments.
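To make the size classes concrete, here is a minimal sketch of a power-of-2 class table. It assumes only what is stated above (15 pools, 128 bytes up to 2M); the function names are hypothetical and are not the ATS API.

```python
# Sketch only: assumes 15 power-of-2 size classes starting at 128 bytes,
# as described in this thread. Names are illustrative, not ATS identifiers.
NUM_POOLS = 15
SMALLEST_CHUNK = 128


def pool_size(index):
    """Chunk size in bytes for the given pool index (0..14)."""
    if not 0 <= index < NUM_POOLS:
        raise ValueError("pool index out of range")
    return SMALLEST_CHUNK << index


def pool_index_for(nbytes):
    """Smallest pool whose chunk size can hold nbytes."""
    for index in range(NUM_POOLS):
        if pool_size(index) >= nbytes:
            return index
    raise ValueError("request is larger than the biggest pool")


if __name__ == "__main__":
    for i in (0, 4, 5, 14):
        print(f"ioBufAllocator[{i}] -> {pool_size(i)} bytes")
```

Under that assumption, the [0], [4] and [5] allocators discussed in this thread would be the 128-byte, 2K and 4K pools.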

As Alan mentioned, once chunks are allocated, they are never freed.

Here is a totally artificial example just to make the point (please correct if my understanding is flawed):
* The traffic pattern is such that initially only the 2M allocator is used; ATS will keep allocating 2M chunks until the RAM cache limit (let's say it is 64GB) is reached.
* Now the traffic pattern changes (smaller fragment requests) and only the 1M allocator is used; ATS will now keep allocating 1M chunks, again capping at 64GB. In the end ATS would have allocated 128GB, well over the RAM cache size limit.
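The arithmetic of this artificial example can be replayed with a toy model. This is a sketch under stated assumptions, not ATS code: in-use bytes are capped by the RAM cache limit, evicted chunks go back to the free list of their own size class, and nothing is ever returned to the OS. The limit is scaled down from 64GB to 64 MiB so that it runs instantly.

```python
# Toy model of the two-phase example above. All names are hypothetical.
MiB = 1024 * 1024


def simulate(phase_chunk_sizes, ram_cache_limit):
    allocated = {}  # size class -> bytes ever obtained (never shrinks)
    in_use = {}     # size class -> bytes currently held by the RAM cache
    for size in phase_chunk_sizes:
        allocated.setdefault(size, 0)
        in_use.setdefault(size, 0)
        # Keep caching chunks of this size until they alone fill the cache.
        while in_use[size] + size <= ram_cache_limit:
            if sum(in_use.values()) + size > ram_cache_limit:
                # Cache is full: evict one chunk of another size class.
                # It lands on that class's free list and stays allocated.
                victim = next(s for s in in_use if s != size and in_use[s] > 0)
                in_use[victim] -= victim
                continue
            # Reuse a free chunk of this class if there is one, else grow.
            if allocated[size] - in_use[size] < size:
                allocated[size] += size
            in_use[size] += size
    return allocated, in_use


if __name__ == "__main__":
    allocated, in_use = simulate([2 * MiB, 1 * MiB], ram_cache_limit=64 * MiB)
    print(sum(allocated.values()) // MiB, "MiB allocated")
    print(sum(in_use.values()) // MiB, "MiB in use")
```

With these inputs the model ends with 128 MiB allocated against 64 MiB in use, which is the 2x overshoot described above.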


In the past, a prototype of reclaimable buffer support was added to ATS, but I believe it was removed in 7.0? Also, there has been recent discussion of adding jemalloc?



On May 24, 2017, at 11:01 AM, Alan Carroll <***@yahoo-inc.com> wrote:

One issue is that memory never moves between the iobuf sizes. Once a chunk of memory is used for a specific iobuf slot, it's there forever. But unless something is leaking, the total size should eventually plateau, certainly within less than a day if you have a basically constant load. There will be some growth due to blocks being kept in thread local allocation pools, but again that should level in less time than you've run.


On Wednesday, May 24, 2017, 9:50:39 AM CDT, Dunkin, Nick <***@ccur.com> wrote:

Hi Alan,



This is 7.0.0



I only see this behavior on ioBufAllocator[0], [4] and [5]. The other ioBufAllocators’ usage looks as I would expect (i.e. allocated goes up then flat), so I was thinking it was more likely something to do with my configuration or use-case.



I’d also just like to understand, at a high level, how the ioBufAllocators are used.



Thanks,



Nick



From: Alan Carroll <***@yahoo-inc.com>
Reply-To: "***@trafficserver.apache.org" <***@trafficserver.apache.org>
Date: Wednesday, May 24, 2017 at 10:33 AM
To: "***@trafficserver.apache.org" <***@trafficserver.apache.org>
Subject: Re: Understanding ioBufAllocator behaviour



Honestly it sounds like a leak. Can you specify which version of Traffic Server this is?




On Wednesday, May 24, 2017, 8:22:46 AM CDT, Dunkin, Nick <***@ccur.com> wrote:

Hi



I have a load test that I’ve been running for a number of days now. I’m using the memory dump logging in traffic.out and I’m trying to understand how Traffic Server allocates and reuses memory. I’m still quite new to Traffic Server.



Nearly all of the memory traces look as I would expect, i.e. memory is allocated and reused over the lifetime of the test. However my readings from ioBufAllocator[0] show a continual increase in allocated AND used. I am attaching a graph. (FYI – This graph covers approximately 3 days of continual load test.)



I would have expected to start seeing reuse in ioBufAllocator by now, like I do in the other ioBufAllocators. Can someone help me understand what I’m seeing?
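For anyone wanting to graph readings like these, here is a rough sketch of pulling per-allocator numbers out of a memory dump. The column layout (allocated | in-use | type size | free list name) is an assumption about how the dump looks in traffic.out, and the sample values are invented for illustration; adjust the pattern to whatever your log actually contains.

```python
import re

# Assumed layout of one dump row: three numeric columns, then "memory/<name>".
LINE_RE = re.compile(r"^\s*(\d+)\s*\|\s*(\d+)\s*\|\s*(\d+)\s*\|\s*memory/(\S+)\s*$")


def parse_mem_dump(text):
    """Return {allocator name: {allocated, in_use, type_size}} in bytes."""
    rows = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            allocated, in_use, type_size, name = match.groups()
            rows[name] = {
                "allocated": int(allocated),
                "in_use": int(in_use),
                "type_size": int(type_size),
            }
    return rows


# Invented sample data, not taken from a real traffic.out.
SAMPLE = """\
     allocated      |        in-use      | type size  |   free list name
--------------------|--------------------|------------|----------------------------------
            1048576 |             917504 |        128 | memory/ioBufAllocator[0]
            8388608 |            2097152 |       4096 | memory/ioBufAllocator[5]
"""

if __name__ == "__main__":
    for name, row in parse_mem_dump(SAMPLE).items():
        free = row["allocated"] - row["in_use"]
        print(name, "allocated:", row["allocated"], "in-use:", row["in_use"], "free:", free)
```

Sampling this over time and plotting allocated against in-use per allocator shows whether a pool is plateauing or still growing.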



Many thanks,



Nick Dunkin



<image001.png>

<image001.png>
Dunkin, Nick
2017-05-24 15:37:49 UTC
Hi Kapil,

This is very interesting, thanks.

Excuse this newbie question though:

What is the lifetime of “in use” objects in the ioBufs? I don’t understand why we are not seeing reuse in these structures. In your artificial example, why would ATS “keep allocating 2M chunks until RAM cache limit is reached”? At what point are those allocated 2M chunks eligible for reuse?

Thanks,

Nick

Kapil Sharma (kapsharm)
2017-05-24 15:49:37 UTC
On May 24, 2017, at 11:37 AM, Dunkin, Nick <***@ccur.com> wrote:

Hi Kapil,

This is very interesting, thanks.

Excuse this newbie question though:

What is the lifetime of “in use” objects in the ioBufs?
That is determined by your RAM cache size and algorithm (CLFUS or LRU).

I don’t understand why we are not seeing reuse in these structures. In your artificial example, why would ATS “keep allocating 2M chunks until RAM cache limit is reached”. At what point are those allocated 2M chunks eligible for reuse?
Once they are de-allocated by the RAM cache as part of LRU/CLFUS eviction, they are put back into the free/reclaimable pool and can be re-used. But the important point is that the low-level memory management code doesn’t actually free the memory, so the memory allocated for chunks can only increase, not shrink. My example was an extreme case to make a point.
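The reuse behaviour described here can be sketched as a toy free-list pool. This is an illustration under the assumptions in this reply (freed chunks go back to a free list, nothing is returned to the OS), not the ATS implementation; the class and attribute names are hypothetical.

```python
class FreeListPool:
    """Toy fixed-size chunk pool: freed chunks are kept for reuse, never released."""

    def __init__(self, chunk_size):
        self.chunk_size = chunk_size
        self.free_list = []
        self.allocated_chunks = 0  # only ever grows
        self.in_use_chunks = 0

    def alloc(self):
        if self.free_list:
            chunk = self.free_list.pop()         # reuse a previously freed chunk
        else:
            chunk = bytearray(self.chunk_size)   # grow the pool
            self.allocated_chunks += 1
        self.in_use_chunks += 1
        return chunk

    def free(self, chunk):
        self.free_list.append(chunk)             # back to the free list, not the OS
        self.in_use_chunks -= 1


if __name__ == "__main__":
    pool = FreeListPool(chunk_size=4096)
    burst = [pool.alloc() for _ in range(10)]    # peak demand of 10 chunks
    for chunk in burst:
        pool.free(chunk)
    steady = [pool.alloc() for _ in range(3)]    # later demand of only 3 chunks
    print(pool.allocated_chunks, pool.in_use_chunks, len(pool.free_list))
```

After the burst, the pool stays at its peak of 10 allocated chunks even though only 3 are in use; the later allocations are served from the free list.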


Thanks,

Nick

Alan Carroll
2017-05-24 15:39:42 UTC
That can certainly happen and is a known problem, but it looked like Nick's scenario was a constant load via a test application and he saw unbounded growth in a single iobuf bucket.
For threads, there is a single global pool and each thread keeps a smaller pool drawn from the global one (via the ProxyAllocator instances). The ProxyAllocator has a high and a low water mark - when the number of items held by the thread exceeds the high water mark, they are released back to the global pool until only low water mark items are left. The values for these are in the 128-512 range, so not on the same scale as this memory growth.
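A minimal sketch of the high/low water mark behaviour described above. The class names, method names and default values are hypothetical (the defaults are simply picked from the 128-512 range mentioned); this is not the ProxyAllocator code.

```python
class GlobalPool:
    """Shared pool: hands out items and takes back what thread caches release."""

    def __init__(self):
        self.free_list = []
        self.allocated = 0

    def alloc(self):
        if self.free_list:
            return self.free_list.pop()
        self.allocated += 1
        return object()

    def release(self, item):
        self.free_list.append(item)


class ThreadCache:
    """Per-thread cache in front of the global pool, trimmed by water marks."""

    def __init__(self, global_pool, low_water=128, high_water=512):
        self.global_pool = global_pool
        self.low_water = low_water
        self.high_water = high_water
        self.local = []

    def alloc(self):
        if self.local:
            return self.local.pop()
        return self.global_pool.alloc()

    def free(self, item):
        self.local.append(item)
        if len(self.local) > self.high_water:
            # Over the high water mark: drain down to the low water mark.
            while len(self.local) > self.low_water:
                self.global_pool.release(self.local.pop())


if __name__ == "__main__":
    shared = GlobalPool()
    cache = ThreadCache(shared, low_water=2, high_water=4)
    items = [cache.alloc() for _ in range(6)]
    for item in items:
        cache.free(item)
    print(len(cache.local), len(shared.free_list), shared.allocated)
```

The per-thread cache is bounded by the high water mark, which is why this mechanism alone cannot account for growth on the scale seen in the graph.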
There's been lots of discussion about jemalloc. What we lack is production performance data to see what the impact would be. We're working on that. As far as I understand it (Phil and Leif know more) we would keep the ProxyAllocators but instead of releasing to a global pool the memory would be released to jemalloc for re-use, thereby strongly bounding the amount of memory in a particular iobuf bucket.


Dunkin, Nick
2017-05-24 15:49:32 UTC
Just to clarify.

Yes, this is a constant load via a test application. The test scenario is live segmented video content (Apple HLS), so we have small, plain text, manifest files and large(ish) video files (3MB). There is also constant cache churn because the content is live video. We have set RAM cache and disk cache deliberately small (1GB each) to observe behavior as we churn cache.

The graph I attached was for ioBufAllocator[0], but we do see similar trends on [4] and [5]. We do see expected behavior, i.e. plateauing, from the other ioBufAllocators. Sorry for not making that clear in my initial email.

Thanks,

Nick

Sudheer Vinukonda
2017-05-24 21:28:22 UTC
AFAIR, I don't think the RAM cache includes all ioBuf pools. It's typically the 1M (or maybe higher) pools only.

The rest of the smaller ioBuf pools (especially ioBuf-0, which is referenced here) are generally used for various purposes such as session contexts, connection objects etc. And I don't believe these pools are directly constrained or affected by the RAM cache size parameter. They are simply proportional to the peak number of concurrent sessions that ATS handles.

Disclaimer: It's possible I may not be remembering correctly, and it's been a while since I looked at this.

- Sudheer
Hi again,
This is great stuff, but it leads me to believe that I’ve totally overestimated my ram_cache.size setting. And in fact, totally misunderstood the parameter.
Let me see if I understand what you’ve explained:
If I expect 5 of my ioBufAllocators to be in use during normal activity, then potentially I could see memory allocated to the level of (5 x ram_cache.size)? Because each ioBufAllocator is bounded by ram_cache.size?
No, not really. I guess my example, which was intended as a worst-case example, may have confused things :)
Let's differentiate between:
* “Allocated” chunks: IO buffer chunks that have been allocated by the ioBufAllocator; not all of them are actually being used in the RAM cache.
* “In-use” chunks: chunks that are in use in the RAM cache; these are a subset of the “allocated” chunks.
* “Free” chunks: the difference between the above two. When the ioBufAllocator needs a chunk from a particular size pool, it will try to get it from the free list. Only if none is available are new chunks allocated from memory.
Allocated Chunks = In-use Chunks + Free Chunks
When a buffer chunk is “de-alloced” from the RAM cache, it is put back into the free chunk pool.
The RAM cache size parameter limits the total “in-use” chunks, and this includes the sum total size of the “in-use” chunks from all 15 pools. In general your traffic pattern should fall into a steady-state “plateau” such that the “allocated” chunks don’t need to grow. But yes, once the RAM cache is full, the sum total size of allocated chunks is >= the RAM cache size parameter. So it is best to keep some headroom in RAM.
I remember there was a way to dump the mem pools information to traffic.out - maybe someone in the list can help.
Hope this doesn’t confuse things more :)
(Terminology I used above may not reflect what’s in the code)
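The accounting above (Allocated = In-use + Free, with the RAM cache size limiting only the in-use total) can be worked through with made-up numbers. The pool figures and names below are hypothetical and only illustrate the bookkeeping as described in this reply.

```python
from dataclasses import dataclass

MiB = 1024 * 1024


@dataclass
class PoolStats:
    chunk_size: int        # bytes per chunk in this pool
    allocated_chunks: int  # chunks ever obtained for this pool
    in_use_chunks: int     # chunks currently held by the RAM cache

    @property
    def free_chunks(self):
        return self.allocated_chunks - self.in_use_chunks

    @property
    def allocated_bytes(self):
        return self.allocated_chunks * self.chunk_size

    @property
    def in_use_bytes(self):
        return self.in_use_chunks * self.chunk_size


def summarize(pools, ram_cache_size):
    in_use = sum(p.in_use_bytes for p in pools)
    allocated = sum(p.allocated_bytes for p in pools)
    return {
        "in_use_bytes": in_use,
        "allocated_bytes": allocated,
        "free_bytes": allocated - in_use,
        "within_ram_cache_limit": in_use <= ram_cache_size,
        "overshoot_bytes": max(0, allocated - ram_cache_size),
    }


if __name__ == "__main__":
    # Invented figures: a 1M pool and a 2M pool under a 1 GiB RAM cache.
    pools = [
        PoolStats(chunk_size=1 * MiB, allocated_chunks=600, in_use_chunks=400),
        PoolStats(chunk_size=2 * MiB, allocated_chunks=400, in_use_chunks=300),
    ]
    summary = summarize(pools, ram_cache_size=1024 * MiB)
    for key, value in summary.items():
        print(key, value)
```

In this invented case the in-use total (1000 MiB) respects the 1 GiB limit while the allocated total (1400 MiB) exceeds it by 376 MiB, which is the headroom the reply recommends planning for.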
In which case I need to reduce, or tune, my ram_cache.size by a factor of 5?
I have a large ram_cache.size (100gb), assuming it was allocated to one large reserve of memory, so I assume this understanding is naive?
Thanks again for all your assistance,
Nick
Alan Carroll
2017-05-24 21:32:38 UTC
I'll have to check the code, but I think it's possible for ram cache to use other sizes, if the objects are small. The cache knows the approximate size of the object before reading and IIRC gets an iobuf of roughly that size for the disk read. If the object goes in ram cache, the iobuf is simply moved over to the ram cache.


On Wednesday, May 24, 2017, 4:28:46 PM CDT, Sudheer Vinukonda <***@yahoo.com> wrote:
AFAIR, I don't think RAM Cache includes all ioBuf pools. It's typically the 1M (or maybe higher) pools only.
The rest of the lower sized ioBuf pools (especially ioBuf-0, which is referenced here) are generally used for various purposes such as session contexts, connection objects, etc. And I don't believe these pools are directly constrained or affected by the RAM cache size parameter. They are simply proportional to the peak number of concurrent sessions that ATS handles.
Disclaimer: It's possible I may not be remembering correctly, and it's been a while since I looked at this.
- Sudheer 
On May 24, 2017, at 12:50 PM, Kapil Sharma (kapsharm) <***@cisco.com> wrote:





On May 24, 2017, at 2:52 PM, Dunkin, Nick <***@ccur.com> wrote:
Hi again,

This is great stuff, but it leads me to believe that I’ve totally overestimated my ram_cache.size setting. And in fact, totally misunderstood the parameter.

Let me see if I understand what you’ve explained:

If I expect 5 of my ioBufAllocators to be in use during normal activity, then potentially I could see memory allocated to the level of (5 x ram_cache.size)? Because each ioBufAllocator is bounded by ram_cache.size?
No, not really. I guess my example, which was intended as a worst case, may have confused things :)

Let's differentiate between:
* “Allocated” chunks: IO buffer chunks that have been allocated by the ioBufAllocator. Not all of them are necessarily in use in the RAM cache.
* “In-use” chunks: chunks that are in use in the RAM cache. These are a subset of the “allocated” chunks.
* “Free” chunks: the difference between the above two. When the ioBufAllocator needs a chunk from a particular size pool, it first tries to get one from the free list. Only if none is available is a new chunk allocated from memory.

Allocated Chunks = In-use Chunks + Free Chunks

When a buffer chunk is “de-alloced” from the RAM cache, it is put back into the free chunk pool.

The RAM cache size parameter limits the total size of the “in-use” chunks, summed across all 15 pools. In general your traffic pattern should settle into a steady-state “plateau” in which the “allocated” chunks don't need to grow. But yes, the total size of allocated chunks can end up >= the RAM cache size parameter, so it is best to keep some headroom in RAM.

I remember there was a way to dump the mem pools information to traffic.out - maybe someone on the list can help.
Hope this doesn’t confuse things more :)
(Terminology I used above may not reflect what’s in the code)
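
The allocated / in-use / free distinction above can be illustrated with a small toy model. This is only a sketch of the accounting as described in this message, not the actual Traffic Server code; the class name `ChunkPool` and its methods are hypothetical and chosen purely for illustration.

```python
class ChunkPool:
    """Toy model of one fixed-size buffer pool with a free list.

    Hypothetical illustration only; not the real ioBufAllocator.
    """

    def __init__(self, chunk_size):
        self.chunk_size = chunk_size
        self.allocated = 0  # chunks ever taken from memory (never given back)
        self.in_use = 0     # chunks currently handed out

    @property
    def free(self):
        # Free chunks are whatever was allocated but is not handed out.
        return self.allocated - self.in_use

    def alloc(self):
        # Reuse a free chunk if one exists; only grow the pool otherwise.
        if self.free == 0:
            self.allocated += 1
        self.in_use += 1

    def release(self):
        # A released chunk goes back to the free list, not back to the OS.
        if self.in_use == 0:
            raise ValueError("release without a matching alloc")
        self.in_use -= 1


pool = ChunkPool(chunk_size=1 << 20)
for _ in range(4):
    pool.alloc()      # pool grows to 4 chunks
for _ in range(3):
    pool.release()    # 3 chunks move to the free list
pool.alloc()          # served from the free list, no growth
print(pool.allocated, pool.in_use, pool.free)  # 4 2 2
```

Note that `allocated` never decreases in this model, which matches the statement earlier in the thread that chunks, once allocated, are never freed.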




In which case I need to reduce, or tune, my ram_cache.size by a factor of 5?

I have a large ram_cache.size (100gb), assuming it was allocated to one large reserve of memory, so I assume this understanding is naive?

Thanks again for all your assistance,

Nick

From: "Kapil Sharma (kapsharm)" <***@cisco.com>
Reply-To: "***@trafficserver.apache.org" <***@trafficserver.apache.org>
Date: Wednesday, May 24, 2017 at 11:29 AM
To: "***@trafficserver.apache.org" <***@trafficserver.apache.org>
Subject: Re: Understanding ioBufAllocator behvaiour

On plateauing - not necessarily; we do see the memory consumption increasing continuously in our deployments as well. It depends on the pattern of segment sizes over time.

ATS uses power of 2 allocators for memory pool - there are 15 of those, ranging from 128bytes to 2M if my memory serves me right - and these are per thread! ATS will choose an optimal allocator for the segments.

As Alan mentioned, once chunk are allocated, they are never freed.

Here is a totally artificial example just to make the point (please correct if my understanding is flawed):
* the traffic pattern was such that initially only 2M allocators were used then ATS will keep allocating 2M chunks until RAM cache limit (lets say it is 64GB) is reached.
* Now traffic pattern changed (smaller fragment requests), and only 1M allocators are used, ATS will now keep allocating 1M chunks, again capping at 64GB. But in the end ATS would have allocated 128GB well over RAM cache size limit.

In the past a there was some prototype of reclaimable buffer support added in ATS, but I believe it was removed in 7.0? Also there is recent discussion of adding jmalloc?
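
The power-of-2 size classes and the artificial worst case above can be worked through in a few lines. This is a sketch under the figures quoted in this message (15 classes, 128 bytes up to 2M, a 64GB RAM cache limit); the function `size_class_index` is a hypothetical helper and not the lookup Traffic Server actually uses.

```python
BASE = 128        # smallest size class in bytes (figure quoted above)
NUM_CLASSES = 15  # number of pools (figure quoted above)


def size_class_index(nbytes):
    """Return the index of the smallest power-of-2 class that fits nbytes."""
    if nbytes <= 0:
        raise ValueError("size must be positive")
    index, size = 0, BASE
    while size < nbytes:
        size *= 2
        index += 1
    if index >= NUM_CLASSES:
        raise ValueError("request larger than the biggest size class")
    return index


sizes = [BASE << i for i in range(NUM_CLASSES)]
print(sizes[0], sizes[-1])     # 128 2097152
print(size_class_index(3000))  # 5, i.e. the 4096-byte class

# Worst case from the example: chunks never move between size classes,
# so each class can independently grow up to the RAM cache limit.
GiB = 1 << 30
ram_cache_limit = 64 * GiB
allocated_by_class = {
    2 << 20: ram_cache_limit,  # phase 1: only the 2M class is used
    1 << 20: ram_cache_limit,  # phase 2: only the 1M class is used
}
total = sum(allocated_by_class.values())
print(total // GiB)            # 128
```

With these figures, index 0 is the 128-byte class and indices 4 and 5 are the 2K and 4K classes.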
On May 24, 2017, at 11:01 AM, Alan Carroll <***@yahoo-inc.com> wrote:

One issue is that memory never moves between the iobuf sizes. Once a chunk of memory is used for a specific iobuf slot, it's there forever. But unless something is leaking, the total size should eventually plateau, certainly within less than a day if you have a basically constant load. There will be some growth due to blocks being kept in thread local allocation pools, but again that should level in less time than you've run.

On Wednesday, May 24, 2017, 9:50:39 AM CDT, Dunkin, Nick <***@ccur.com> wrote:
Hi Alan,
 
This is 7.0.0
 
I only see this behavior on ioBufAllocator[0], [4] and [5].  The other ioBufAllocators’ usage looks as I would expect (i.e. allocated goes up then flat), so I was thinking it was more likely something to do with my configuration or use-case.
 
I’d also just like to understand, at a high level, how the ioBufAllocators are used.
 
Thanks,
 
Nick
 
From: Alan Carroll <***@yahoo-inc.com>
Reply-To: "***@trafficserver.apache.org" <***@trafficserver.apache.org>
Date: Wednesday, May 24, 2017 at 10:33 AM
To: "***@trafficserver.apache.org" <***@trafficserver.apache.org>
Subject: Re: Understanding ioBufAllocator behvaiour
 
Honestly it sounds like a leak. Can you specify which version of Traffic Server this is?
   
On Wednesday, May 24, 2017, 8:22:46 AM CDT, Dunkin, Nick <***@ccur.com> wrote:

Hi
 
I have a load test that I’ve been running for a number of days now.  I’m using the memory dump logging in traffic.out and I’m trying to understand how Traffic Server allocates and reuses memory.  I’m still quite new to Traffic Server.
 
Nearly all of the memory traces look as I would expect, i.e. memory is allocated and reused over the lifetime of the test.  However my readings from ioBufAllocator[0] show a continual increase in allocated AND used.  I am attaching a graph.  (FYI – This graph covers approximately 3 days of continual load test.)
 
I would have expected to start seeing reuse in ioBufAllocator by now, like I do in the other ioBufAllocators.  Can someone help me understand what I’m seeing?
 
Many thanks,
 
Nick Dunkin
 
Nick Dunkin

Principal Engineer

o:   678.258.4071

e:   ***@curr.com 

4375 River Green Pkwy # 100, Duluth, GA 30096, USA
