How one extra slash sent CDN back-to-origin requests soaring

Background

One quiet night, a friend suddenly alerted me to abnormal CDN back-to-origin requests in production: the request volume had surged, accompanied by a small number of 404 responses. The surge had already lasted two days without anyone noticing. It only came to light when the 404s triggered an alarm and the logs were analyzed.
The cause was quickly found, fixed, and the fix deployed. Here I would like to share what I learned while following up, such as how a CDN handles back-to-origin requests and how the 301 responses were triggered.

The traffic surge caused by extra slashes (/)

A large number of 301 responses

Checking the nginx log of the origin site showed that the volume of back-to-origin requests had soared to nearly 1,000 times that of the same day the previous week, to the point that I even began to suspect a problem on the CDN vendor’s side. More than 95% of these requests received a 301 (Moved Permanently) response. 301 means a permanent redirect, so most of the back-to-origin requests were being redirected to another address by the origin site.
After consulting the CDN vendor, we confirmed that by default a CDN node does not follow a 301 response from the origin site. In other words, the node does not fetch the final address on the client’s behalf, cache the result locally, and then return it. Instead it passes the 301 response straight back to the client, and the client performs the redirect itself.
So where did this sudden flood of 301 responses come from? Looking at the request paths that nginx logged with a 301, I quickly found that the path of every CDN image resource address started with an extra /. For example, http://cdn.demo.com/image/test.jpg had become http://cdn.demo.com//image/test.jpg. For this kind of request the origin site directly returns 301 with Location: http://cdn.demo.com/image/test.jpg, that is, it asks the client to redirect to the version without the redundant /.
This raises a question. As far as I remembered, the file server at the origin site had no logic that would produce such a 301. Was the slash merging happening in nginx? Yet nginx did not seem to have been configured to do anything like that either.

nginx merge_slashes, the apparent culprit behind the 301s

I checked the file server’s code again and confirmed that it contained no logic for merging / or returning 301. After some testing, it seemed safe to conclude that nginx was the one producing the 301 (note: this was the wrong conclusion of the first version of this post). nginx has a merge_slashes directive, which the official documentation describes as follows:

<span style="color:#333333"><span style="background-color:#ffffff"><code class="language-csharp">Syntax: merge_slashes on | off;
Default: merge_slashes on;
Context: http, server
Enables or disables compression of two or more adjacent slashes in a URI into a single slash.
Note that compression is essential for the correct matching of prefix string and regular expression locations. Without it, the “//scripts/one.php” request would not match
location /scripts/ {
    ...
}
and might be processed as a static file. So it gets converted to “/scripts/one.php”.
Turning the compression off can become necessary if a URI contains base64-encoded names, since base64 uses the “/” character internally. However, for security considerations, it is better to avoid turning the compression off.
</code></span></span>

As shown above, merge_slashes is enabled by default, and it merges two or more consecutive / characters in a URI into a single /. In other words, both cdn.demo.com//a.jpg and cdn.demo.com///a.jpg are merged into cdn.demo.com/a.jpg.
At this point in my investigation I assumed that it was nginx’s merge_slashes feature that returned the 301 response, and I originally published this post based on that conclusion. Two days later, however, I saw a comment from blogger @唐大侠灵感执话: “Since nginx merging of / is enabled by default, why doesn’t it work?”
That set me thinking. I had spent a while reading the nginx source code looking for the logic by which merge_slashes would trigger a 301, but I could not untangle it from the complicated source. So I had simply verified that calling the backend server directly over HTTP did not trigger a 301, and then pinned the 301 on nginx. That was only indirect confirmation, and I could not be 100% sure. Clearly I needed to regroup and investigate the 301 again, and it turned out that a different truth was hiding behind the fog.
Let’s restate what merge_slashes does. When merge_slashes is enabled, it does merge redundant / characters in the path, but the merged form is only used in the location matching step. Assume the following configuration:

<span style="color:#333333"><span style="background-color:#ffffff"><code class="language-kotlin">location / {
    return 404;
}
location /scripts/ {
    proxy_pass http://127.0.0.1:6666;
}
</code></span></span>

With merge_slashes on: //scripts/one.php has its redundant / merged and is treated as /scripts/one.php for matching, so it hits the location /scripts/ rule. After the match, however, the path that proxy_pass sends to the upstream service is still the original //scripts/one.php, not the merged version.
With merge_slashes off: //scripts/one.php does not match location /scripts/, so it can only hit location / and gets a 404.
So merge_slashes only merges / for location matching and then hands the request to the processing configured in that location. It is not the real culprit behind the 301!
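The behavior just described can be modeled with a small Go toy. It is not nginx code and not how nginx is implemented: `mergeSlashes` and `route` are hypothetical names, and `route` hard-codes the two locations from the configuration above. It only shows that matching uses the merged URI while the upstream still receives the original path.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// repeatedSlashes matches any run of two or more "/" characters.
var repeatedSlashes = regexp.MustCompile(`/{2,}`)

// mergeSlashes collapses every run of "/" into a single "/".
func mergeSlashes(uri string) string {
	return repeatedSlashes.ReplaceAllString(uri, "/")
}

// route is a toy model of the two-location configuration above. It returns the
// location that would be selected and the path forwarded to the upstream
// (empty when the request is answered with 404 and nothing is proxied).
func route(uri string, mergeOn bool) (location, upstreamPath string) {
	matchURI := uri
	if mergeOn {
		// Only the copy used for matching is normalized.
		matchURI = mergeSlashes(uri)
	}
	if strings.HasPrefix(matchURI, "/scripts/") {
		// The upstream still sees the path exactly as the client sent it.
		return "/scripts/", uri
	}
	return "/", ""
}

func main() {
	for _, on := range []bool{true, false} {
		loc, up := route("//scripts/one.php", on)
		fmt.Printf("merge_slashes=%v location=%q upstream=%q\n", on, loc, up)
	}
}
```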
The road is long, so the search goes on. Since the real culprit behind the 301 cannot be found in nginx, it is time to go back and interrogate the file server.

golang DefaultServeMux

The origin site’s file server is a golang service. Previously I had only checked the business code without considering the underlying libraries it depends on. This time I went back and studied the relevant net/http source code.
The default ServeMux implementation has a Handler method:

<span style="color:#333333"><span style="background-color:#ffffff"><code class="language-yaml">2322 func (mux *ServeMux) Handler(r *Request) (h Handler, pattern string) {
...
2335
2336 	// All other requests have any port stripped and path cleaned
2337 	// before passing to mux.handler.
2338 	host := stripHostPort(r.Host)
2339 	path := cleanPath(r.URL.Path)
2340
2341 	// If the given path is /tree and its handler is not registered,
2342 	// redirect for /tree/.
2343 	if u, ok := mux.redirectToPathSlash(host, path, r.URL); ok {
2344 		return RedirectHandler(u.String(), StatusMovedPermanently), u.Path
2345 	}
2346
2347 	if path != r.URL.Path {
2348 		_, pattern = mux.handler(host, path)
2349 		url := *r.URL
2350 		url.Path = path
2351 		return RedirectHandler(url.String(), StatusMovedPermanently), pattern
2352 	}
2353
2354 	return mux.handler(host, r.URL.Path)
2355 }
</code></span></span>

Line 2339 of the code above calls cleanPath, which in turn calls path.Clean. The official path.Clean documentation describes it as follows:

<span style="color:#333333"><span style="background-color:#ffffff"><code class="language-sql">Clean returns the shortest path name equivalent to path by purely lexical processing. It applies the following rules iteratively until no further processing can be done:

1. Replace multiple slashes with a single slash.
2. Eliminate each . path name element (the current directory).
3. Eliminate each inner .. path name element (the parent directory) along with the non-.. element that precedes it.
4. Eliminate .. elements that begin a rooted path: that is, replace "/.." by "/" at the beginning of a path.
</code></span></span>

Rule 1 states clearly that multiple slashes are merged into a single slash. At line 2347 of the Handler code, if the original r.URL.Path is not equal to the path returned by cleanPath, the method returns RedirectHandler(url.String(), StatusMovedPermanently), pattern, that is, a 301 permanent redirect.
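The comparison can be tried in isolation. The following is a minimal sketch, assuming a hypothetical helper named `needsRedirect`. It calls path.Clean directly and ignores the extra details that net/http’s cleanPath takes care of, such as preserving a trailing slash, so it is a simplified stand-in rather than the library’s own logic.

```go
package main

import (
	"fmt"
	"path"
)

// needsRedirect cleans requestPath with path.Clean and reports whether the
// cleaned form differs from the original, which is the condition under which
// the Handler method quoted above answers with a redirect.
func needsRedirect(requestPath string) (cleaned string, redirect bool) {
	cleaned = path.Clean(requestPath)
	return cleaned, cleaned != requestPath
}

func main() {
	paths := []string{
		"/image/test.jpg",        // already clean
		"//image/test.jpg",       // extra leading slash, as in the incident
		"/image/./a/../test.jpg", // "." and ".." elements
	}
	for _, p := range paths {
		cleaned, redirect := needsRedirect(p)
		fmt.Printf("%q -> %q redirect=%v\n", p, cleaned, redirect)
	}
}
```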
The file server starts its HTTP service with http.ListenAndServe(settings.HttpBind, nil). Since handler is nil, DefaultServeMux is used, and the truth is finally revealed.
As for why my earlier check (calling the backend server directly over HTTP) did not trigger a 301 and led me to blame nginx: when I made that lazy direct HTTP call, the server I called was not the golang file server but the golang log server. The log server defines its own handler, so the default handler’s 301 logic never runs, which led me to the wrong conclusion. Laziness always has a price sooner or later ==

Where did the extra / come from

That’s right, it was a careless mistake. A few days earlier I had modified the module that assembles the CDN URLs of image files. I quickly went back to the diff and found that a line of local test code in the image resource model’s URL generation code had been committed by a slip of the hand (embarrassing ==!). The end result was an extra / at the start of the URL path of every CDN image resource. Because this extra / only makes images load more slowly rather than fail to load, nobody noticed a problem while using the app.
I immediately commented out the test code that produced the redundant /, deployed the fix, and cleared the related caches in production. Looking back at the origin site’s nginx log afterwards, the request volume dropped sharply, visibly to the naked eye, and the small number of 404 requests gradually disappeared as well.

Why did a small number of resources return 404

Although the 404 requests also disappeared, their cause has still not been located. Overall, the number of 404 requests was only about 1% of the number of 301 requests. Log analysis shows that the 404 requests appeared and disappeared at basically the same times as the flood of 301 requests, so the two seem to be directly related.
A careful look at the logs of the 404 requests shows that their paths are also very regular, for example:

<span style="color:#333333"><span style="background-color:#ffffff"><code class="language-bash">GET /demo/image/icon.png, /demo/image/icon.png
GET /demo/image/profile.png, /demo/image/profile.png
GET /demo/image/head.png, /demo/image/head.jpg
</code></span></span>

Assembling a complete URL from one of these paths, such as http://cdn.demo.com/demo/image/icon.png, does successfully fetch the resource, which means the resources themselves exist. For some unexplained reason, though, the path part was concatenated a second time when the client made the request, and the resulting request, http://cdn.demo.com/demo/image/icon.png, /demo/image/icon.png, naturally returns 404.
Looking at the whole CDN file request flow, there are four possibilities:

  1. The first suspect is the server: it might have made an error when assembling the CDN URL, causing the path to be concatenated twice. But there was really nothing suspicious in the code.
  2. The CDN node might have built the Location header incorrectly when forwarding the origin site’s 301 response to the client. Considering that the CDN is a widely used product serving so many users, the chance of the CDN node making this error is very low.
  3. The Location in the 301 response given by nginx after the / merge might be wrong. Considering that nginx is such a tried-and-tested product, this possibility is extremely low.
  4. I can’t help but recall that odd Android models can misbehave in all kinds of odd ways (I have fallen into plenty of those pits). Could the redirect handling of some niche models go wrong when they follow a 301? Unfortunately, the origin site cannot directly receive the original request from the client, so the client-side details of these 404 requests cannot be analyzed further.

So the timing only lets me infer that the 404s are related to the flood of 301 requests; I have no way to pin down the exact step where things went wrong (new ideas are welcome). Since the 404 requests disappeared once the 301 problem was fixed, this will have to remain an open case for now.

The mystery of soaring requests without soaring traffic

After the problem was fixed, I suddenly wondered: why was the origin site’s bandwidth not saturated even though the number of requests had been up by 1000x or more for several days? The bandwidth purchased for the origin site could never handle 1000x the traffic!
On reflection it becomes clear. The additional requests were answered directly with a 301 from the origin site rather than with the actual target files. The client’s follow-up request to the correct URL then goes through the normal CDN flow (back-to-origin -> cache -> return), so it adds no extra load on the origin site. A complete 301 response is only a few hundred bytes, so the bandwidth actually consumed is very small: 1,000 301 responses take about as much bandwidth as one image of a few hundred KB. So, fortunately, bandwidth was not a problem here.
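The estimate can be written out as a quick calculation. The 300-byte response size below is an assumed value picked from the “few hundred bytes” range, not a number measured in the incident, and `redirectBytes` is a hypothetical helper.

```go
package main

import "fmt"

// redirectBytes estimates the total bytes sent for n redirect responses of
// bytesPerResponse each. Both inputs are illustrative assumptions.
func redirectBytes(n, bytesPerResponse int) int {
	return n * bytesPerResponse
}

func main() {
	const assumed301Size = 300 // bytes per 301 response; assumed, not measured
	total := redirectBytes(1000, assumed301Size)
	fmt.Printf("1000 x %d B = %d B (about %.0f KB)\n",
		assumed301Size, total, float64(total)/1000)
}
```

With the assumed 300 bytes per response, 1,000 redirects come to 300,000 bytes, which is in the range of a single image of a few hundred KB.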

Summary

Looking back on this incident, a slip of the hand really was the root of all evil. I was only adding a new piece of code, but some earlier test code was accidentally committed along with it, which sent the CDN 301 requests soaring. The concrete impact on users:

  1. Every image CDN request incurred an additional 301 redirect. Depending on the distance between the user’s location and the origin site, loading time increased by anywhere from a few milliseconds to several hundred milliseconds.
  2. About 1% of image requests received a 404 response because of the doubled path.

After so many years of writing code, I still fell into one of the classic programmer pits: test code reaching production and causing an incident. Let this be a warning!

