How I found and fixed a bug in PHP's standard library

栏目: IT技术 · 发布时间: 4年前

内容简介:I’ve been a software developer at EBANX since 2016. There we have a mechanism calledI was deploying some code once and I noticed something weird in the logs. Some requests had two Host headers once they arrived in the release candidate server. Normally we

I’ve been a software developer at EBANX since 2016. There we have a mechanism called Release Candidate . Every time we deploy something new to our monolith PHP system we end up deploying it first to the release candidate server. Then we elect a few random requests and proxy it from our live normal servers to the release candidate server, if those requests go well with no errors then the deploy occurs in the normal API servers, if an error occurs the proxying mechanism is disabled and the responsible for the code can better debug the problem in the release candidate server and rollback the code if needed.

I was deploying some code once and I noticed something weird in the logs. Some requests had two Host headers once they arrived in the release candidate server. Normally we simply copy the headers that came from the original request, so the request should arrive in the release candidate with the host header containing the url of the original server. In the cases where two Hosts were present the top one was the url of the release candidate server and the second one was the original one containing the original api server. This puzzled me, the code I was modifying had nothing to do with the proxying mechanism, and after looking at older logs I found out that this always occurred a few times. Since this mechanism is critical to the safety of our deploys I dropped everything and started to investigate.

Naturally I assumed that the bug was somewhere in our proxying code, after digging for a while I couldn’t find anything wrong with the code. This is the function we use to rewrite the Header of the request to the release candidate with the original host:

private static function adjustRequestTarget(RequestInterface $request): RequestInterface {
  $original_host = $request->getHeaderLine('Host');
  $new_uri = $request->getUri()->withHost(static::getReleaseCandidateInstanceHost());
  return $request
    ->withUri($new_uri)
    ->withHeader('Host', $original_host);
}

We are using Guzzle as the http client. Internally the headers are represented as an array inside the class GuzzleHttp\Psr7\Request. I noticed that when calling the withHeader function the Host header is shifted to the end of array, but regardless of this detail I couldn’t reproduce the issue when testing locally with the exact same requests that caused the problem in production.

I ended up noticing something pretty interesting. The stream option being passed during the request was different on my tests than the production code, this made guzzle use a different http handler, with stream set as true it uses the PHP stream handler, with it set as false it uses cURL, I was testing it with false, changing it to true made me able to reproduce the issue when replaying a request that caused the problem in production, now I just needed to drill down enough in the code that was called to determine what was causing the problem.

After cloning the PHP source and reading for a while I finally found the culprit, the problem is caused by a combination of the change of header array order that the previously referenced adjustRequestTarget function causes, and the way the http fopen wrapper handles the http headers.

Inside the http_fopen_wrapper.c code we have these two interesting snippets:

if ((s = strstr(t, "host:")) &&
                (s == t || *(s-1) == '\r' || *(s-1) == '\n' ||
                             *(s-1) == '\t' || *(s-1) == ' ')) {
                 have_header |= HTTP_HEADER_HOST;
            }
/* Send Host: header so name-based virtual hosts work */
if ((have_header & HTTP_HEADER_HOST) == 0) {
    smart_str_appends(&req_buf, "Host: ");
    smart_str_appends(&req_buf, ZSTR_VAL(resource->host));
    if ((use_ssl && resource->port != 443 && resource->port != 0) ||
        (!use_ssl && resource->port != 80 && resource->port != 0)) {
        smart_str_appendc(&req_buf, ':');
        smart_str_append_unsigned(&req_buf, resource->port);
    }
    smart_str_appends(&req_buf, "\r\n");
}

The first snippet searches in the header string for an instance of the host: string in the beginning of a line, if it finds one there then it enables the have_header HTTP_HEADER_HOST flag, later the second snippet checks that flag and if it’s not enabled then it appends a Host header with the target address.

Turns out that all the affected requests had a header with the string host: somewhere in them before the Host header (which was shifted to the end of array due to us changing it back to the original host), this made the check fail and the wrapper thought we didn’t have a host header because strstr only returns the first instance of a string. So the wrapper ended up appending a new Host header with the target url and it still sends the one that we were trying to send originally causing the issue.

The following test script:

<?php
$opts = array(
  'http'=>array(
    'method'=>"GET",
    'header'=>"RandomHeader: foobar\r\n" .
              "Cookie: foo=bar\r\n" .
              "Host: c764dd43.ngrok.io  \r\n"
  )
);
$context = stream_context_create($opts);
$fp = fopen('http://c764dd43.ngrok.io', 'r', false, $context);
fpassthru($fp);

Will generate the following raw headers:

GET / HTTP/1.0
RandomHeader: foobar
Cookie: foo=bar
Host: c764dd43.ngrok.io
X-Forwarded-For: 2804:14c:8781:8c3a:3805:2fbe:1845:15e6

Now if we change the RandomHeader value to something that contains the host: string:

<?php
$opts = array(
  'http'=>array(
    'method'=>"GET",
    'header'=>"RandomHeader: localhost:8080\r\n" .
              "Cookie: foo=bar\r\n" .
              "Host: testcustomheader  \r\n"
  )
);
$context = stream_context_create($opts);
$fp = fopen('http://c764dd43.ngrok.io', 'r', false, $context);
fpassthru($fp); (edited)

We’ll get two Host headers in the raw request:

GET / HTTP/1.0
Host: c764dd43.ngrok.io
RandomHeader: localhost:8080
Cookie: foo=bar
Host: testcustomhostheader
X-Forwarded-For: 2804:14c:8781:8c3a:3805:2fbe:1845:15e6

According to the RFC 7320 a server must respond with a 400 (bad request) to requests that contain more than one Host header, so every request affected by this bug should break on a properly implemented server.

With this information I decided to open a bug on the PHP issue tracker. After a few days of inactivity I decided I had enough knowledge to try to implement a fix myself!

I cloned the PHP git repository from github and compiled it from the master branch.

I changed the snippet that checks if the host header is present to check for every occurence of the string:

while ((s = strstr(s, "host:"))) {
	if (s == t || *(s-1) == '\r' || *(s-1) == '\n' ||
	                 *(s-1) == '\t' || *(s-1) == ' ') {
		 have_header |= HTTP_HEADER_HOST;
	}
	s++;
}

I recompiled, ran the test script again and it worked! Now I needed to find out the project contributing guidelines.

I found out that PHP has a test suite composed by phpt files, these files are a mixture of plaintext and php with different sections. The most important sections are FILE and EXPECTED , inside the file section there should be a PHP script that should output something to stdout. The EXPECTED section contains the expected output, the test script then compares the output produced by the script inside FILE to the output on the EXPECTED section, if they match the test passes, if they don’t then it fails. It’s very simple and effective. I changed the script I was using to reproduce the issue to become a .phpt file so we could have automated tests for this bug:

--TEST--
Bug #79265 (Improper injection of Host header when using fopen for http requests)
--INI--
allow_url_fopen=1
--SKIPIF--

--FILE--
<?php
require 'server.inc';

$responses = array(
    "data://text/plain,HTTP/1.0 200 OK\r\n\r\n",
);

$pid = http_server("tcp://127.0.0.1:12342", $responses, $output);

$opts = array(
  'http'=>array(
    'method'=>"GET",
    'header'=>"RandomHeader: localhost:8080\r\n" .
              "Cookie: foo=bar\r\n" .
              "Host: userspecifiedvalue\r\n"
  )
);
$context = stream_context_create($opts);
$fd = fopen('http://127.0.0.1:12342/', 'rb', false, $context);
fseek($output, 0, SEEK_SET);
echo stream_get_contents($output);
fclose($fd);

http_server_kill($pid);

?>
--EXPECT--
GET / HTTP/1.0
Connection: close
RandomHeader: localhost:8080
Cookie: foo=bar
Host: userspecifiedvalue

I commited the fix alongside with the test script and opened a pull request . It was merged a few hours later.


以上所述就是小编给大家介绍的《How I found and fixed a bug in PHP's standard library》,希望对大家有所帮助,如果大家有任何疑问请给我留言,小编会及时回复大家的。在此也非常感谢大家对 码农网 的支持!

查看所有标签

猜你喜欢:

本站部分资源来源于网络,本站转载出于传递更多信息之目的,版权归原作者或者来源机构所有,如转载稿涉及版权问题,请联系我们

光线跟踪算法技术

光线跟踪算法技术

萨芬 / 刘天慧 / 清华大学出版社 / 2011-3 / 98.00元

《光线跟踪算法技术》详细阐述了与光线跟踪问题相关的高效解决方案及相应的数据结构和算法,主要包括采样技术、投影视图、视见系统、景深、非线性投影、立体视觉、光照与材质、镜面反射、光泽反射、全局光照、透明度、阴影、环境遮挡、区域光照、光线与对象间的相交计算、对象变换、栅格技术以及纹理映射技术等内容。此外,《光线跟踪算法技术》还提供了相应的算法、代码以及伪代码,以帮助读者进一步理解计算方案的实现过程。 ......一起来看看 《光线跟踪算法技术》 这本书的介绍吧!

XML、JSON 在线转换
XML、JSON 在线转换

在线XML、JSON转换工具

RGB HSV 转换
RGB HSV 转换

RGB HSV 互转工具