The Subtle Dangers of the Comma Operator (C++)

Category: IT Technology · Published: 5 years ago

With its powerful features, the C++ language allows us to do many things.

But as a philosopher who was also the uncle of a superhero once said, with great power comes great responsibility.

Translated into C++, this means that if you're not careful, some C++ features that let you write expressive code can turn around and create buggy code that doesn't do what it's supposed to.

One beautiful example (for some definition of beautiful) is overloading the comma operator. As we're going to see, a very subtle change in working code can make it go horribly wrong.

A big thanks goes to Fluent C++ reader Nope for showing me this example.

Overloading the Comma Operator Is Powerful

First, the comma operator is a thing. Its default implementation for all types does this: a,b evaluates a, then evaluates b, then returns b. For example, 1,2 returns 2.

It's generally not recommended, but C++ allows for overloading the comma operator. Here is a detailed article on overloading the comma operator, which you can read to get more familiar with the topic.

Overloading the comma operator allows us to do nice things. For example, this is what Boost Assign uses to allow us to append data to an existing vector:

v += 1,2,3,4,5;

Without overloading the comma operator, we can't write this expression in standard C++, even in C++20. Once the vector is constructed, the standard interface lets us add elements one at a time with the push_back member function, or several at once with the more verbose insert member function.

The preceding code allows us to add elements to an existing vector with very expressive code.

Here is a very simplified implementation of Boost Assign, which allows us to write the preceding line of code (thanks to Nope for this implementation):

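#include <iostream>
#include <vector>

// Small helper that remembers which vector is being filled.
template <typename Vector>
struct appender
{
    Vector& vec;

    // Member comma operator: appends e and returns the appender for chaining.
    template<typename T>
    appender<Vector>& operator,(const T& e)
    {
        vec.push_back(e);
        return *this;
    }
};

// Appends the first element and returns an appender (by value) for the rest.
template <typename T>
appender<std::vector<T>> operator+=(std::vector<T>& v, const T& e)
{
    v.push_back(e);
    return {v};
}

int main()
{
    auto data = std::vector<int>{};
    data += 1,2,3,4,5;

    for (auto&& e: data)
        std::cout << ' ' << e;
    std::cout << '\n';
}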

Here is how this code works:

  • Since the comma operator has the lowest precedence, data += 1 executes first.
  • This adds 1 to the vector and returns an appender referencing that vector.
  • The appender overloads the comma operator. When this appender is combined with 2 through the comma operator, it adds 2 to the vector and returns itself.
  • The appender is then combined with 3 in the same way and adds it too, and then 4, and then 5.

Running the program with data += 1,2,3,4,5 prints this (run the code yourself here):

 1 2 3 4 5

All good.

At least, so far.

Overloading the Comma Operator Is Dangerous

Now let's make a small change in our code. Instead of defining the comma operator as a member function, let's define it as a free function. For example, this could be desirable as it allows implicit conversions, as explained in item 24 of Effective C++.

#include <iostream>
#include <vector>

template <typename Vector>
struct appender
{
    Vector& vec;
};

// Comma operator as a free function, taking the appender by lvalue reference.
template <typename Vector, typename T>
appender<Vector>& operator,(appender<Vector>& v, const T& e)
{
    v.vec.push_back(e);
    return v;
}

// Appends the first element and returns an appender (by value) for the rest.
template <typename T>
appender<std::vector<T>> operator+=(std::vector<T>& v, const T& e)
{
    v.push_back(e);
    return {v};
}

int main()
{
    auto data = std::vector<int>{};
    data += 1,2,3,4,5;

    for (auto&& e: data)
        std::cout << ' ' << e;
    std::cout << '\n';
}

This shouldn't change anything, right?

Let's run the program (run it yourself here). It outputs this:

 1

If you're like me, you're staring at the screen in disbelief. Run the program that works and the one that doesn't work if you'd like to see it with your own eyes.

Maybe the worst thing is not that the program doesn't behave the way we would naturally expect, but that it compiles at all.

Can you see why this is happening?

I recommend you search on your own. This is highly instructive!

...

Seriously, try to find what's wrong on your own. I'll tell you in a bit, but it's more fun and rewarding to find it yourself.

...

You're on mobile and it's not convenient? No worries, bookmark this page or send it to yourself by email so you can come back to it later on your computer.

...

Found it yet?

...

Ok, I'll show you now.

What the C*mm* Is Happening?

The problem has to do with lvalues and rvalues. If we look again at the free function operator, it takes an lvalue reference as an input:

template <typename Vector, typename T>
appender<Vector>& operator,(appender<Vector>& v, const T& e)
{
    v.vec.push_back(e);
    return v;
}

The calling code is this:

data += 1,2,3,4,5;

data += 1 is an rvalue: it is the temporary appender that operator+= returns by value. A non-const lvalue reference cannot bind to it. Therefore, this overload of the comma operator is not viable and is not called.

If it were any other operator, the code would not have compiled. But as we saw at the beginning of this post, the comma operator has a default implementation for all types. Therefore, the default implementation is executed: it discards the appender and yields the second operand, here 2. The same thing then happens with 3, then 4, and then 5. None of these values reaches the vector, which ends up containing only 1.

Incorrectly overloading the comma operator results in a silent failure. The code compiles and runs, but doesn't do what you want.

To make this implementation work, we need to provide an overload of the comma operator that can accept rvalues:

template <typename Vector, typename T>
appender<Vector>& operator,(appender<Vector>& v, const T& e)
{
    v.vec.push_back(e);
    return v;
}

template <typename Vector, typename T>
appender<Vector>& operator,(appender<Vector>&& v, const T& e)
{
    v.vec.push_back(e);
    return v;
}

Note that a const lvalue reference would also bind to lvalues and rvalues.

To use this in our case, the operator needs to return a copy of the appender, since it cannot hand back a non-const reference to its const parameter. The const reference of the next call in the chain then binds to that temporary copy. This still appends to the vector, because all the copies of the appender contain a reference to the same vector (thanks to Patrice Dalesme for showing me this solution):

template <typename Vector, typename T>
appender<Vector> operator,(appender<Vector> const& v, const T& e)
{
	v.vec.push_back(e);
	return v;
}

Lessons Learned

There are at least two lessons we can learn from that example.

The first one is that if we overload the comma operator, we need to be extra careful to cover all cases and think about lvalues and rvalues. Otherwise we end up with buggy code.

The second one is independent of the comma operator. We see that in C++, member functions are easier to get right than free functions. Member functions can, by default, be called on both lvalues and rvalues of a type, whereas free functions taking references may work for only one of the two.

The opposite also exists (member functions explicitly defined for lvalues or rvalues, and free functions taking by copy or const reference), but the most common prototypes have the properties we discussed earlier.

