How I found and fixed a bug in PHP's standard library

栏目: IT技术 · 发布时间: 4年前

内容简介:I’ve been a software developer at EBANX since 2016. There we have a mechanism calledI was deploying some code once and I noticed something weird in the logs. Some requests had two Host headers once they arrived in the release candidate server. Normally we

I’ve been a software developer at EBANX since 2016. There we have a mechanism called Release Candidate . Every time we deploy something new to our monolith PHP system we end up deploying it first to the release candidate server. Then we elect a few random requests and proxy it from our live normal servers to the release candidate server, if those requests go well with no errors then the deploy occurs in the normal API servers, if an error occurs the proxying mechanism is disabled and the responsible for the code can better debug the problem in the release candidate server and rollback the code if needed.

I was deploying some code once and I noticed something weird in the logs. Some requests had two Host headers once they arrived in the release candidate server. Normally we simply copy the headers that came from the original request, so the request should arrive in the release candidate with the host header containing the url of the original server. In the cases where two Hosts were present the top one was the url of the release candidate server and the second one was the original one containing the original api server. This puzzled me, the code I was modifying had nothing to do with the proxying mechanism, and after looking at older logs I found out that this always occurred a few times. Since this mechanism is critical to the safety of our deploys I dropped everything and started to investigate.

Naturally I assumed that the bug was somewhere in our proxying code, after digging for a while I couldn’t find anything wrong with the code. This is the function we use to rewrite the Header of the request to the release candidate with the original host:

private static function adjustRequestTarget(RequestInterface $request): RequestInterface {
  $original_host = $request->getHeaderLine('Host');
  $new_uri = $request->getUri()->withHost(static::getReleaseCandidateInstanceHost());
  return $request
    ->withUri($new_uri)
    ->withHeader('Host', $original_host);
}

We are using Guzzle as the http client. Internally the headers are represented as an array inside the class GuzzleHttp\Psr7\Request. I noticed that when calling the withHeader function the Host header is shifted to the end of array, but regardless of this detail I couldn’t reproduce the issue when testing locally with the exact same requests that caused the problem in production.

I ended up noticing something pretty interesting. The stream option being passed during the request was different on my tests than the production code, this made guzzle use a different http handler, with stream set as true it uses the PHP stream handler, with it set as false it uses cURL, I was testing it with false, changing it to true made me able to reproduce the issue when replaying a request that caused the problem in production, now I just needed to drill down enough in the code that was called to determine what was causing the problem.

After cloning the PHP source and reading for a while I finally found the culprit, the problem is caused by a combination of the change of header array order that the previously referenced adjustRequestTarget function causes, and the way the http fopen wrapper handles the http headers.

Inside the http_fopen_wrapper.c code we have these two interesting snippets:

if ((s = strstr(t, "host:")) &&
                (s == t || *(s-1) == '\r' || *(s-1) == '\n' ||
                             *(s-1) == '\t' || *(s-1) == ' ')) {
                 have_header |= HTTP_HEADER_HOST;
            }
/* Send Host: header so name-based virtual hosts work */
if ((have_header & HTTP_HEADER_HOST) == 0) {
    smart_str_appends(&req_buf, "Host: ");
    smart_str_appends(&req_buf, ZSTR_VAL(resource->host));
    if ((use_ssl && resource->port != 443 && resource->port != 0) ||
        (!use_ssl && resource->port != 80 && resource->port != 0)) {
        smart_str_appendc(&req_buf, ':');
        smart_str_append_unsigned(&req_buf, resource->port);
    }
    smart_str_appends(&req_buf, "\r\n");
}

The first snippet searches in the header string for an instance of the host: string in the beginning of a line, if it finds one there then it enables the have_header HTTP_HEADER_HOST flag, later the second snippet checks that flag and if it’s not enabled then it appends a Host header with the target address.

Turns out that all the affected requests had a header with the string host: somewhere in them before the Host header (which was shifted to the end of array due to us changing it back to the original host), this made the check fail and the wrapper thought we didn’t have a host header because strstr only returns the first instance of a string. So the wrapper ended up appending a new Host header with the target url and it still sends the one that we were trying to send originally causing the issue.

The following test script:

<?php
$opts = array(
  'http'=>array(
    'method'=>"GET",
    'header'=>"RandomHeader: foobar\r\n" .
              "Cookie: foo=bar\r\n" .
              "Host: c764dd43.ngrok.io  \r\n"
  )
);
$context = stream_context_create($opts);
$fp = fopen('http://c764dd43.ngrok.io', 'r', false, $context);
fpassthru($fp);

Will generate the following raw headers:

GET / HTTP/1.0
RandomHeader: foobar
Cookie: foo=bar
Host: c764dd43.ngrok.io
X-Forwarded-For: 2804:14c:8781:8c3a:3805:2fbe:1845:15e6

Now if we change the RandomHeader value to something that contains the host: string:

<?php
$opts = array(
  'http'=>array(
    'method'=>"GET",
    'header'=>"RandomHeader: localhost:8080\r\n" .
              "Cookie: foo=bar\r\n" .
              "Host: testcustomheader  \r\n"
  )
);
$context = stream_context_create($opts);
$fp = fopen('http://c764dd43.ngrok.io', 'r', false, $context);
fpassthru($fp); (edited)

We’ll get two Host headers in the raw request:

GET / HTTP/1.0
Host: c764dd43.ngrok.io
RandomHeader: localhost:8080
Cookie: foo=bar
Host: testcustomhostheader
X-Forwarded-For: 2804:14c:8781:8c3a:3805:2fbe:1845:15e6

According to the RFC 7320 a server must respond with a 400 (bad request) to requests that contain more than one Host header, so every request affected by this bug should break on a properly implemented server.

With this information I decided to open a bug on the PHP issue tracker. After a few days of inactivity I decided I had enough knowledge to try to implement a fix myself!

I cloned the PHP git repository from github and compiled it from the master branch.

I changed the snippet that checks if the host header is present to check for every occurence of the string:

while ((s = strstr(s, "host:"))) {
	if (s == t || *(s-1) == '\r' || *(s-1) == '\n' ||
	                 *(s-1) == '\t' || *(s-1) == ' ') {
		 have_header |= HTTP_HEADER_HOST;
	}
	s++;
}

I recompiled, ran the test script again and it worked! Now I needed to find out the project contributing guidelines.

I found out that PHP has a test suite composed by phpt files, these files are a mixture of plaintext and php with different sections. The most important sections are FILE and EXPECTED , inside the file section there should be a PHP script that should output something to stdout. The EXPECTED section contains the expected output, the test script then compares the output produced by the script inside FILE to the output on the EXPECTED section, if they match the test passes, if they don’t then it fails. It’s very simple and effective. I changed the script I was using to reproduce the issue to become a .phpt file so we could have automated tests for this bug:

--TEST--
Bug #79265 (Improper injection of Host header when using fopen for http requests)
--INI--
allow_url_fopen=1
--SKIPIF--

--FILE--
<?php
require 'server.inc';

$responses = array(
    "data://text/plain,HTTP/1.0 200 OK\r\n\r\n",
);

$pid = http_server("tcp://127.0.0.1:12342", $responses, $output);

$opts = array(
  'http'=>array(
    'method'=>"GET",
    'header'=>"RandomHeader: localhost:8080\r\n" .
              "Cookie: foo=bar\r\n" .
              "Host: userspecifiedvalue\r\n"
  )
);
$context = stream_context_create($opts);
$fd = fopen('http://127.0.0.1:12342/', 'rb', false, $context);
fseek($output, 0, SEEK_SET);
echo stream_get_contents($output);
fclose($fd);

http_server_kill($pid);

?>
--EXPECT--
GET / HTTP/1.0
Connection: close
RandomHeader: localhost:8080
Cookie: foo=bar
Host: userspecifiedvalue

I commited the fix alongside with the test script and opened a pull request . It was merged a few hours later.


以上所述就是小编给大家介绍的《How I found and fixed a bug in PHP's standard library》,希望对大家有所帮助,如果大家有任何疑问请给我留言,小编会及时回复大家的。在此也非常感谢大家对 码农网 的支持!

查看所有标签

猜你喜欢:

本站部分资源来源于网络,本站转载出于传递更多信息之目的,版权归原作者或者来源机构所有,如转载稿涉及版权问题,请联系我们

微信民族志、自媒体时代的知识生产与文化实践

微信民族志、自媒体时代的知识生产与文化实践

赵旭东 / 中国社会科学出版社 / 2017-9 / 98.00元

进入二十一世纪以来,随着网络技术的发展,自媒体的悄然登场深度影响着我们的日常生活。中国社会中自媒体通讯方式的普及以及随之而有的一种文化书写的新形式——微信民族志的出现使原有文化秩序中时空意义发生转变的同时,也在重新塑造着以研究异文化为己任的人类学学科自身的成长、转型与发展。在此种情境之下,由中国人民大学人类学研究所、中国人民大学国家发展与战略研究院、中国人民大学社会学理论与方法研究中心、《探索与争......一起来看看 《微信民族志、自媒体时代的知识生产与文化实践》 这本书的介绍吧!

HTML 压缩/解压工具
HTML 压缩/解压工具

在线压缩/解压 HTML 代码

JSON 在线解析
JSON 在线解析

在线 JSON 格式化工具

HSV CMYK 转换工具
HSV CMYK 转换工具

HSV CMYK互换工具