Chou, Peter
2017-06-23 21:22:23 UTC
Hi,
I am evaluating the collapsed_forwarding plugin as a way to address the "thundering herd" problem. One scenario I am looking at is a thundering herd caused by slow DNS response time. In that scenario the plugin either has no effect (stale object) or returns 500 errors to most clients (object not in the cache).
Setup --
* Use Linux tc to add 500ms of delay to the test DNS server's network port egress.
* Use curl to send ten requests from the client to ATS (both on the same machine).
The curl instances are backgrounded so that they run simultaneously.
All requests can be initiated within 200ms of each other.
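For anyone who wants to reproduce the client side without shell backgrounding, the burst can be sketched in Python as below. This is only an illustration of the setup described above, not the exact test that was run: `fetch_status` and `run_burst` are hypothetical helper names, and the URL passed in would be whatever address ATS listens on in your environment.

```python
import threading
import time
import urllib.error
import urllib.request


def fetch_status(url):
    """Return the HTTP status code for a GET of url (4xx/5xx included)."""
    try:
        with urllib.request.urlopen(url, timeout=30) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx; the status code is what we want to record.
        return err.code


def run_burst(fetch, url, count=10):
    """Issue `count` concurrent requests and return (launch_offset_s, status) pairs."""
    results = [None] * count
    start = time.monotonic()

    def worker(i):
        launched = time.monotonic() - start
        results[i] = (launched, fetch(url))

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling `run_burst(fetch_status, url)` with the ATS URL returns one `(launch offset, status)` pair per request, which makes it easy to check that all ten requests started within the 200ms window and to count the 200 and 500 responses.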
Results --
* If the requested object is in the cache but stale, then I see 1x 304 and 9x 200 responses from the origin server to ATS to the client.
* If the requested object is not in the cache (purge first), then I see 1x 200 from the origin server, and 1x 200 and 9x 500 responses from ATS to the client.
The 500 error is "InkAPI Error" and traffic.out says "request delayed, but unsuccessful".
I did try setting the plugin's delay to 500 and retries to 10 with no change in result (I believe this is 5s total).
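The 5s figure is simple arithmetic, sketched below under the assumption that the plugin waits a fixed delay (in milliseconds) before each retry; the plugin's actual retry semantics may differ, and `total_wait_ms` is a hypothetical helper used only for illustration.

```python
def total_wait_ms(delay_ms, retries):
    """Upper bound on waiting time, assuming one fixed delay per retry."""
    return delay_ms * retries


# Settings from the test: delay 500, retries 10.
budget_ms = total_wait_ms(500, 10)   # 5000 ms, i.e. 5 s
dns_delay_ms = 500                   # delay added with tc

# The retry budget is ten times the injected DNS delay, so under this
# assumption the waiting requests should not run out of time.
budget_exceeds_dns_delay = budget_ms > dns_delay_ms
```

If this assumption holds, the 500 errors are unlikely to be explained by the retry budget being too short.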
Is it possible to get this working, i.e., wait 500ms, then send one request to the origin, then serve all ten responses to the clients? I suspect this is a limitation of the hook point being associated with cache-write locking, which occurs after the DNS lookup. I would appreciate any comments.
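To make the desired behaviour concrete, here is a minimal sketch of generic request coalescing ("single flight") in Python. It is not how collapsed_forwarding or ATS is implemented; `SingleFlight`, `simulate`, and the sleep that stands in for the slow DNS lookup plus origin fetch are all assumptions made for illustration.

```python
import threading
import time


class SingleFlight:
    """Coalesce concurrent calls for the same key into one upstream fetch."""

    def __init__(self):
        self._lock = threading.Lock()
        self._inflight = {}  # key -> shared entry for the fetch in progress

    def do(self, key, fetch):
        with self._lock:
            entry = self._inflight.get(key)
            leader = entry is None
            if leader:
                entry = {"done": threading.Event(), "value": None, "error": None}
                self._inflight[key] = entry
        if leader:
            try:
                entry["value"] = fetch()  # the only upstream request
            except Exception as exc:
                entry["error"] = exc
            finally:
                with self._lock:
                    del self._inflight[key]
                entry["done"].set()  # wake every waiting follower
        else:
            entry["done"].wait()  # followers block until the leader finishes
        if entry["error"] is not None:
            raise entry["error"]
        return entry["value"]


def simulate(count=10, delay_s=0.2):
    """Run `count` simultaneous clients; return (origin call count, results)."""
    flight = SingleFlight()
    origin_calls = []
    results = [None] * count
    barrier = threading.Barrier(count)  # release all clients together

    def slow_fetch():
        origin_calls.append(1)
        time.sleep(delay_s)  # stands in for slow DNS plus the origin fetch
        return "body"

    def client(i):
        barrier.wait()
        results[i] = flight.do("object", slow_fetch)

    threads = [threading.Thread(target=client, args=(i,)) for i in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(origin_calls), results
```

In this sketch the first client becomes the leader and performs the single slow fetch, while the other nine wait on the same entry and receive the same response, which is the outcome asked about above.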
Thanks,
Peter