Triaging Firefox Nightly Bugs - It's fun!


You may have noticed that typing about:crashes in Firefox brings up all the crashes that have occurred in your browser. Have you ever wondered how Firefox engineers look into these crash reports and analyze them? In this post, I'll talk about the crash triage process that I've been involved in for the past few months; hopefully you'll find it interesting.

The triage rotation is a group of people responsible for filing bugs for valid Nightly crashes. It's a small group of engineers, so we rotate weekly to respect everyone's time while still ensuring coverage every day.

As you're probably already aware, two Nightly builds are released every day, and both of them need to be triaged. However, we don't triage crashes the day right after a release, because it usually takes time for users to update their Nightly build and for crashes to occur.

Crash reports come with a lot of information; pretty much everything related to the crash is included, such as the crash reason, the crash address, and the crash stack. Note that no personally identifiable information is included in crash reports, so it's impossible to correlate crash reports with individual users. Most of the time, we don't need to check every field to tell what's going on.

Bugs We File

There are some guidelines that we try to follow to avoid filing bugs that are not actionable, while also making sure we don't miss any potential bugs.

  • File bugs when more than one installation hits the same crash
    • Everyone has different machines and setups. We try to file bugs for crashes that are reproducible on at least two different machines; otherwise, the crash is very likely only reproducible on that particular machine and not actionable for engineers. That said, we still file a bug if we think a crash is valid even when it comes from a single installation
  • Don't file IPC shutdown crashes if you can't figure out what's going on
    • I'll talk about IPC shutdown crashes in detail later. TL;DR: they are not real crashes and are rarely actionable, so we only file them when we have a specific reason to
  • Don't file bugs for crashes without symbols
    • Crash reports usually aren't actionable without symbolicated stacks
  • For crashes in third-party libraries, file a bug if the same crash pattern shows up across multiple installations
    • There isn't a lot we can do when a crash occurs in third-party code. However, it can still be useful to file bugs for them, so that the third-party library's developers notice. For instance, keychain-pkcs11's developers jumped into bug 1668593 to help with the investigation
  • File crashes with valid crash reasons such as MOZ_CRASH or MOZ_DIAGNOSTIC_ASSERT
    • These are usually real crashes, and we should file them regardless of how many reports there are; the sketch below shows what these macros look like in code
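To give a feel for those crash reasons, here's a minimal sketch of how the two macros typically appear inside the Gecko tree. The surrounding function and type are hypothetical, for illustration only; only the macros and the mozilla/Assertions.h header are real.

#include "mozilla/Assertions.h"

struct Thing;  // hypothetical type, for illustration only

void ProcessThing(Thing* aThing) {
  // Fires in Nightly and other diagnostic builds, but is compiled out of
  // Release builds, so regressions are caught early without shipping the check.
  MOZ_DIAGNOSTIC_ASSERT(aThing, "aThing must be non-null here");

  if (!aThing) {
    // Deliberately crashes in all builds; the string becomes the crash
    // reason shown in the crash report.
    MOZ_CRASH("Reached an impossible state");
  }
}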

Common Crashes

OOM

Out-of-Memory crashes are common and usually not actionable. However, we should still validate them.

Generally, there are two things we want to look at:

  • Available Memory
    • This means checking the Available Page File on Windows. It's a lot harder on Linux and macOS because of the OOM killer, which may kill the Firefox process due to memory allocations made by other programs.
  • OOM Allocation Size
    • If it's very large, then something might be wrong; see the sketch after this list.
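As a toy example of how that happens (my own sketch, not taken from a real report): an integer underflow can turn a small length into a near-SIZE_MAX allocation request, which then shows up as an absurd OOM allocation size.

#include <cstddef>
#include <cstdio>

// Hypothetical helper: if aLen is 0, aLen - 1 wraps around to SIZE_MAX,
// so the allocator is asked for an impossibly large buffer.
char* AllocateBuffer(size_t aLen) {
  return new char[aLen - 1];
}

int main() {
  printf("requesting %zu bytes\n", (size_t)0 - 1);  // 18446744073709551615 on 64-bit
  return 0;
}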

One day I noticed a bizarre crash report where the allocation size was huge, so I filed bug 167475 for it.

Bad Memory Bits

Some crashes are simply caused by bad hardware, such as bad memory bits. Crash reports like these are actually quite common.

The way to identify them is to expand the crash report to show the other crashes that share the same signature. If every crash has a different crash reason and a different crash address, then it is very likely a bad-memory-bits crash.

Here's an example of a bad-memory-bits crash. If we expand the report to view the other crashes, we can see that they all have different crash addresses. In this particular example, some addresses are very close to 0, which looks like a null-pointer crash, while others are far away from 0.


In addition to the scattered addresses, the call stack of this crash shows that it is in garbage collector code. The garbage collector sweeps a lot of memory, so users with bad memory bits are very likely to crash there.
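As a toy illustration of why the addresses scatter (my own sketch, not Gecko code): flipping a single bit in an otherwise valid pointer can move a dereference anywhere, from near 0 to deep into unmapped memory.

#include <cinttypes>
#include <cstdint>
#include <cstdio>

int main() {
  int value = 42;
  uintptr_t addr = reinterpret_cast<uintptr_t>(&value);
  // One flipped high bit sends the "same" pointer to a wildly different
  // address; a flipped low bit produces a near-miss instead.
  uintptr_t highFlip = addr ^ (uintptr_t(1) << 40);
  uintptr_t lowFlip = addr ^ (uintptr_t(1) << 3);
  printf("original:         0x%" PRIxPTR "\n", addr);
  printf("high bit flipped: 0x%" PRIxPTR "\n", highFlip);
  printf("low bit flipped:  0x%" PRIxPTR "\n", lowFlip);
  return 0;
}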

Shutdown Hang

Shutdown hangs are also quite common in crash reports, and they drove me nuts in the beginning. The signature usually looks like [@ IPCError-browser | ShutDownKill ]. The fact is, they are not real crashes; they are reports of slow shutdowns. The reported call stack doesn't indicate where anything crashed; it is a snapshot of a content process that was told to shut down but didn't manage to do so within a 20-second limit.

There are two possible causes here.

  1. The content process was slow to respond
    • We don't want to file bugs for this, because there isn't much we can do other than making shutdown faster
  2. The content process deadlocked while trying to shut down
    • This is what we care about, and it's actionable

Here are the steps that I follow to investigate these reports.

  1. Click the More Reports button to expand the date range to at least a month, which pulls in more reports.
  2. Check the ipc shutdown state table under the aggregations tab. While a content process is shutting down, the value will be either RecvShutdown or SendFinishShutdown.
    • If the table has multiple values, then it's likely a slow shutdown.
  3. Check the call stack for the mozilla::dom::ContentChild::RecvShutdown() method. Its presence indicates that the content process received the shutdown IPC message and started shutting down. If you see this method, the process was most likely just slow to shut down, although it's still possible that it got stuck after receiving the message.
  4. If the content process never received the shutdown message, that's a clear sign of a potential deadlock. Look at the call stack and see if you can find an issue there.

bug 1658429 is an example of a potential deadlock: the content process was waiting on the parent process, while the parent process had already started to shut down.

The Nightly Triage rotation has taught me a lot about bug triaging and crash report analysis, and I've found that knowledge extremely useful. It has not only helped me do my work better, but it has also expanded my understanding of systems programming, which is something I've always wanted to know more about.

Hope you find this article helpful!

Sun, 25 Oct 2020 15:19:27 -0400


Debugging with rr on Linux - The Power to Go Back in Time!


Imagine your program crashes occasionally but you can't reproduce the crash reliably. How would you debug it?

The problem with a traditional debugger is that you don't know whether a given run will reproduce the bug. You suspect some piece of code is the problem, so you inspect it, but if the bug doesn't reproduce in that run, all that work is wasted. gdb addresses this with its reverse-debugging feature, but gdb's reverse debugging is very expensive and struggles on large projects. That's fine if the bug reproduces often, but what if it only occurs 1 in 100 runs, or even 1 in 1000?

Alternatively, you can debug with lots of printf calls and read the printed data. The advantage is that re-running the program is cheap: you can write a script that runs it repeatedly until it crashes, then go back and read the log. But it has the same efficiency problem; even if you reproduce the crash, the existing printf output often isn't enough to solve the problem, so you add more printfs and run the whole reproduction loop again.

rr exists to solve these problems. rr is a debugger that records a program's execution and then provides a debugging environment where you can replay that run, including in reverse, as many times as you want.

Advantages of rr

  • Its reverse debugging is more mature than gdb's, with lower overhead
  • It works well with multi-threaded programs, and you can debug just the process you care about
  • It supports gdb commands

Installation

I recommend building rr locally and installing it by following the official documentation. Note that rr currently only runs on Linux and has certain CPU requirements; see the official documentation for the details.

Basic Usage

Using rr takes two steps:

  1. First, rr record - run the program and record its execution state.
  2. Second, rr replay - replay the recorded run.

Example

  #include <cstdio>
  #include <thread>
  #include <stdlib.h>

  void inc(int& x, int id) {
    id = id + 1; // This line is a bit redundant; it's mainly here to demonstrate rr's reverse-continue feature
    if (x == 2 && id == 3) {
      abort();
    }
    ++x;
    printf("x=%d\n", x);
  }

  int main () {
    int x = 0;
    std::thread t1(inc, std::ref(x), 1);
    std::thread t2(inc, std::ref(x), 2);
    std::thread t3(inc, std::ref(x), 3);
    std::thread t4(inc, std::ref(x), 4);

    t1.join();
    t2.join();
    t3.join();
    t4.join();
  }

The program above starts several threads and crashes when the thread t2 runs into x == 2 && id == 3. Because thread scheduling is nondeterministic, the program doesn't always crash.

First, let's compile the program:

  • Run g++ -g main.cpp -o main -lpthread. Once compiled, we get the binary main.

Then let's record it:

  • Here we run rr record --chaos ./main. Normally rr record ./main would be enough, but --chaos makes rr schedule threads more randomly, which increases the odds of reproducing this bug. We can also automate this step with while rr record --chaos ./main; do :; done, a loop that re-runs the program until main crashes.

[screenshots: repeated rr record runs]

It took me 32 runs to reproduce the crash this time.

rr stores each recording under ~/.local/share/rr, so let's ls that directory.

[screenshot: ls of ~/.local/share/rr]

Each main-<number> directory is one stored recording. rr can replay any of them; rr replay defaults to the most recent one, so plain rr replay is all we need.

[screenshot: the rr replay prompt]

The screenshot above shows what the rr debugger looks like. Because rr speaks the gdb protocol, we can run all kinds of gdb commands.

Next, we use gdb's continue command (continue runs the program until it exits, crashes, or hits a breakpoint) to let the program stop right at the crash.

[screenshot: continue stopping at the crash]

We can see that rr's replay also printed x=1 and x=2, exactly as the original run did. Typing bt here lets us inspect the call stack, where we can see that abort() is what terminated the program.

[screenshot: bt output at the abort]

Typing up 2 takes us to the inc() frame, and list then shows the exact code that caused the crash. Typing p id prints the current value of id, confirming that it was indeed id == 3 that caused the crash.

[screenshot: up 2, list, and p id]

Now we need to find out why id is 3. In this code, it's obviously caused by id = id + 1, but suppose we didn't know where id was modified: we could type watch -l id to watch the variable, then use rc (reverse-continue) to travel back to the code that changed it.

[screenshot: watch -l id and reverse-continue]

Note that we typed rc twice: the first time, rr stopped because it received the program's crash signal, so we had to type it once more to continue. rr then took us right back to the line where id changed, and typing p id now shows that id is 2. By chasing id's changes backwards, we can make the necessary fixes to the code.

This is what rr gives you: not only are we back at that line of code, the program's entire state has been rewound to that moment. rr's great power is the ability it gives us to go back in time.

Sun, 18 Oct 2020 19:11:41 -0400

A Browser Engineer's Experience Implementing a Web Element


Over the past two months, I've spent a lot of time adding support for the <dialog> element to Gecko. It now conforms to most of the spec and is enabled by default in Nightly, but it can't be enabled in Release yet because the current implementation isn't perfect. This was my first time implementing a web standard, so I'm writing this post as a retrospective; hopefully it also helps you understand how browser engineers work.

Understanding the spec is the first and most important step. Browser engineers spend a lot of time reading web standards, and often discussing and revising them with other engine teams. Taking <dialog> as an example, the first thing I did was read the documentation related to <dialog>. A standard is usually intertwined with other standards; <dialog> is closely tied to the top layer spec, so top layer was also material I had to master. The spec also explicitly states which element should be focused when a <dialog> enters its modal state, so the focus-related documentation needed to be read as well.

A standard supported by multiple engine teams at the same time tends to be more reliable and more polished. <dialog> had only been implemented in Blink, so the work I did for Gecko was also something of a re-review of the relevant specs. Of course, standards aren't perfect; they often have ambiguities and places that need improvement. For example, <dialog>'s centered alignment mode is very hard to understand and cumbersome, and <dialog>'s focus logic doesn't support accessibility very well. Issues like these take a lot of discussion to pick a solution everyone agrees on, which then goes into the web standard, and finally the browsers follow up with changes.

We often need to look at other browsers' implementations to maintain consistency. <dialog> was already implemented in Blink, so while implementing it in Gecko, I consulted Blink whenever I faced ambiguous or uncertain choices. For example, while implementing 'close a modal dialog with the Escape key', I wasn't sure which should be closed first if a modal dialog and a fullscreen element were present at the same time. Since the spec didn't explicitly say, I checked Blink's behaviour and made Gecko behave the same way. Other times, Blink's behaviour doesn't match the spec; in those cases I would ask Blink's engineers to find out whether it was a bug or an intentional deviation.

That's roughly what working with web standards looks like; in short, a process of understanding and revising. The next step is to implement the spec's requirements in the browser.

Implementing a web element mostly means dealing with two modules: Layout and DOM. Layout is responsible for how the element looks and is laid out on the page, while DOM is responsible for the element's logic. If the element's appearance can be expressed in CSS, then the work is mainly the DOM logic, because a few lines of CSS can put the element in the right place.

Unfortunately, dialog's layout can't be expressed in CSS, and implementing the spec's requirements meant adding hacks to the Layout module. At first we thought a small number of hacks would satisfy the layout requirements, but more and more hacks turned out to be needed, and the headaches grew with them, so we stepped back and rethought modal dialog layout. In the end, we decided to first implement an interim scheme that differs from the spec but is more reasonable, and then try to get that scheme into the spec. So sometimes an engine team chooses a different approach based on its own situation and then discusses it with the other engine teams; this, too, is how everyone collaborates on a unified standard and makes the web better.

Compared with the Layout side, the DOM-side changes were much more direct: mostly <dialog>'s behaviour code, such as the focus behaviour, returning the correct returnValue, and support for closing the dialog with Escape.

Compared with understanding the spec, the code that actually needs to be written is much simpler.

The final step is confirming that the code you wrote is correct. Web standards come with a large number of tests, all in a repository called web-platform-tests (WPT), maintained jointly by the engine vendors. Any test that can go into WPT should preferably go into WPT, because it then tests the other engines' behaviour too, maintaining consistency for everyone's benefit. Gecko has a copy of WPT that picks up the latest changes every day, and changes we make to Gecko's WPT copy are automatically upstreamed to the main WPT repository. Because Blink had long since added tests for <dialog>, my work was much easier this time. Of course, I also found bugs in the existing tests, so sometimes modifying them is unavoidable.

That's the whole process of how a web element gets implemented in a browser.

Sun, 18 Oct 2020 18:34:35 -0400

Some Notes About Linking in C++


One thing that came to my mind frequently while working with C++ was how to link external libraries into my projects. A lot of the time, when I see a project on GitHub, I see the source code and a makefile, but I don't know what I'm supposed to do with them. Do I need to compile them into some special format? How do I compile them? So today, I decided to resolve this mystery as a gift to myself.

Linking is the part of the C++ compilation process whose goal is to combine object files into an executable. Libraries (except header-only libraries) are pre-compiled object files, so there are two things we need to do to use one: obtain the pre-compiled object files and link them into our project.

What Linking does

The first thing I did was figure out what linking does. Below is an example.

helper.cpp

int sum(int x, int y) {
  return x + y;
}

Let's create the shared library with g++ -fPIC -c -Wall helper.cpp && ld -shared helper.o -o libhelper.so. This command compiles helper.cpp and creates a shared library called libhelper.so.

main.cpp


#include "stdio.h"
int sum(int x, int y);

int main() {
  printf("sum %d\n", sum(1, 2));
  return 0;
}

Compile it with g++ -Wall -L/home/saber/learn/cpp/linking -Wl,-rpath=/home/saber/learn/cpp/linking main.cpp -lhelper.

This is the output of the executable.

sum 3

One very interesting thing, which neatly illustrates what linking does: we didn't have the definition of sum in main.cpp, we only provided the declaration, yet the program worked correctly because the definition of sum was linked in. The compiler emits a call to an undefined symbol (on Linux, the mangled name _Z3sumii for sum(int, int)), and the linker resolves it against libhelper.so.

Dynamic Library

Dynamic libraries are shared libraries. As the name implies, dynamic libraries can be shared between multiple applications. Linking happens at run time, so the executable doesn't contain the library's object code. Only a single copy of the library is loaded into memory and shared among all the applications that use it, which saves memory.

Static Library

As the opposite of a dynamic library, a static library is embedded into the executable at link time, and each executable gets its own copy of the library. The executable is therefore larger, and another copy of the same library code is loaded into memory for every program that uses it. On Linux, static libraries are .a archives, typically created with ar.

Header-only Library

A header-only library is just a header file that you can include in your source files; the header contains the library's implementation. The advantage of this approach is that the library is very easy to embed into a project. Header-only libraries are common when the library is small; otherwise, the compilation cost increases a lot.
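For instance, a header-only "library" can be as small as this hypothetical helper.hpp:

// helper.hpp - the entire library lives in this header.
#ifndef HELPER_HPP
#define HELPER_HPP

// 'inline' allows multiple translation units to include this header
// without causing multiple-definition errors at link time.
inline int sum(int x, int y) {
  return x + y;
}

#endif // HELPER_HPP

Users just #include "helper.hpp" in their sources; there is no separate compile or link step for the library.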

Use them with CMake

This was the real question that I had. When I find a library on GitHub, what should I be looking at in order to integrate it into my project?

  • Find the build options the project provides
    • What we want to find out is whether the project is meant to be compiled into a shared library, a static library, or used as a header-only library.

Things are easy if it's a header-only library: just #include it. For shared or static libraries, we need to compile the library first and then link it into our project.

Assume this is our project structure


├── helper
│   └── helper.cpp
└── main.cpp

main.cpp is our program, and helper.cpp is the library source file. The first thing we want to do is create a CMakeLists.txt for the library so that it compiles into a static library. Let's create a minimal CMakeLists.txt in the helper directory.

# CMake instructions to make the static lib
ADD_LIBRARY( HelperLib STATIC
	         helper.cpp )

Note that to make it a shared library, all we need to do is to change STATIC to SHARED.

Then we need another CMakeLists.txt for the main program. Let's create one under the root directory:

cmake_minimum_required(VERSION 3.10)

# set the project name
project(test)

add_executable(test main.cpp)

# CMake builds the static library
add_subdirectory(${CMAKE_SOURCE_DIR}/helper)

# Use the static library
TARGET_LINK_LIBRARIES(${PROJECT_NAME} HelperLib)

The above setup lets us create a binary with HelperLib linked in. From the root directory, running mkdir build && cd build && cmake .. && make builds the static library and the test executable.

Sun, 18 Oct 2020 17:42:33 -0400

Order Of Evaluation


Today I read the Order of Evaluation section in C++ Primer, and I wish I could have learned this earlier!

This section states that the order of evaluation in C++ is unspecified (with a few exceptions); it's entirely up to the compiler to decide which operand is evaluated first.

For instance, given int a = f1() * f2();, there is no guarantee whether f1() or f2() will be evaluated first.
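Here's a minimal sketch that makes this observable; which line prints first is up to the compiler and may vary between toolchains and optimization levels.

#include <cstdio>

int f1() { printf("f1 evaluated\n"); return 2; }
int f2() { printf("f2 evaluated\n"); return 3; }

int main() {
  // The multiplication's result is well-defined, but the order in which
  // f1() and f2() run is not.
  int a = f1() * f2();
  printf("a = %d\n", a);
  return 0;
}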

Right, and now I recall a bug that took me some time to fix.

While I was working on the DOM Arena bug, I wrote something like this:

HTMLElement* element = new(nodeInfo->NodeInfoManager()) HTMLElement(nodeInfo.forget());

A brief explanation of the code: nodeInfo is a ref-counted smart pointer that owns a raw pointer, and nodeInfo.forget() makes nodeInfo give up its ownership so that it can be passed to the caller.

Do you see the problem? Yeah, if nodeInfo.forget() gets evaluated first, nodeInfo->NodeInfoManager() becomes a null-pointer dereference, which is invalid.
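A minimal sketch of one way to fix it, assuming the Gecko types involved: split the expression into separate statements, since statements are sequenced even though operands are not.

// Fetch the manager before forget() can null out nodeInfo.
nsNodeInfoManager* manager = nodeInfo->NodeInfoManager();
HTMLElement* element = new (manager) HTMLElement(nodeInfo.forget());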

It was driving me nuts, because I couldn't understand why; I had the impression that the order of evaluation was left to right!

Happy Coding!

Thu, 15 Oct 2020 22:03:12 -0400

Perfect Forwarding And Universal Reference


While I was studying mtrebi's threadpool implementation, I noticed a peculiar usage of &&, which looked like this:

template<typename F, typename...Args>
auto submit(F&& f, Args&&... args) -> std::future<decltype(f(args...))> {

&& is known as the rvalue reference in C++, but what does it mean in this context? To answer that question, I did some research and found that it is called a universal reference.

The term universal reference was coined by Scott Meyers; it describes a reference that may be either an lvalue reference or an rvalue reference, and may therefore bind to anything. To understand it precisely, we need to understand the problem it solves.

Perfect Forwarding Problem

void Bar(int& b) {
  printf("Bar called\n");
  b = 123;
}

Given the function Bar, we want to implement a wrapper function that lets us pass arguments through to Bar. We could do it as:

template <typename T1>
void wrapper(T1& a) {
  Bar(a);
}

Then we could call it as

int k = 0;
wrapper(k);

It works, but what if we want to call it as wrapper(0)? A generic wrapper shouldn't care whether we pass an lvalue or an rvalue. However, this fails to compile, with the compiler complaining that it expects an l-value for the argument.

To fix it, we could change the wrapper to take its parameter by value:

template <typename T1>
void wrapper(T1 a) {
  Bar(a);
}

That fixes the compile error, but it changes the original meaning: we wanted to pass a reference to our variable through to Bar, and this is pass by value, so Bar's modification of b never reaches the caller's variable.

Universal Reference

Before we give a solution to the previous problem, let's look at what universal references bring us.

In a context like this

template <typename T>
void wrapper(T&& a) {
}

&& here becomes a universal reference, which means it can be either an lvalue reference or an rvalue reference depending on the type of T. The compiler works this out through type deduction, and the rules that define the resulting reference type are called the reference collapsing rules.

The rules are simple: & && collapses to &, while && && collapses to &&. In deduction terms: if the argument is an lvalue, T is deduced as an lvalue reference and T&& collapses to an lvalue reference; if the argument is an rvalue, T is deduced as a plain type and T&& is an rvalue reference.
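Here's a small sketch that lets the compiler check the collapsing rules for us:

#include <type_traits>

// Appending && to T through an alias template triggers reference collapsing.
template <typename T>
using add_rvalue_ref = T&&;

static_assert(std::is_same<add_rvalue_ref<int&>, int&>::value,
              "& && collapses to &");
static_assert(std::is_same<add_rvalue_ref<int&&>, int&&>::value,
              "&& && collapses to &&");
static_assert(std::is_same<add_rvalue_ref<int>, int&&>::value,
              "a non-reference T yields a true rvalue reference");

int main() { return 0; }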

By using a universal reference, we can change our previous example to:

template <typename T1>
void wrapper(T1&& a) {
  Bar(a);
}

When wrapper(k) is called, k is an lvalue, so T1 is deduced as int& and the declaration of wrapper effectively becomes wrapper(T1& a).

When wrapper(0) is called, 0 is an rvalue, so T1 is deduced as int and the declaration of wrapper effectively becomes wrapper(T1&& a).

One Last Problem

Universal references are great; they solve the wrong-argument-type problem we had before. However, they are not perfect on their own. Let's look at this code snippet.

void func(int& b) {
  printf("1\n");
}

void func(int&& b) {
  printf("2\n");
}

template <typename T1>
void wrapper(T1&& e1) {
    func(e1);
}

int main() {
  int k = 0;
  wrapper(0);
  return 0;
}

I expected the above example to output 2, since T1&& should be an rvalue reference given that we passed an rvalue (0) to wrapper. However, this was not the case. Surprisingly, 1 was the output: once the argument 0 was passed into the function, it got a name, e1, and anything with a name is an lvalue, so the lvalue-reference version of func was invoked.

To fix this, we have to cast the argument back to an rvalue reference, but we need to be careful: we don't want to always cast to an rvalue reference. What we want is: if T1 is an rvalue reference, cast e1 to an rvalue reference; otherwise, cast it to an lvalue reference. Fortunately, std::forward does exactly that.

This is, roughly, how std::forward is defined:

template<class T>
T&& forward(typename std::remove_reference<T>::type& t) noexcept {
  return static_cast<T&&>(t);
}

template <class T>
T&& forward(typename std::remove_reference<T>::type&& t) noexcept {
  return static_cast<T&&>(t);
}

Let's apply both cases and see what std::forward returns.

When T is an lvalue reference, std::forward<T> instantiates as

template<class T>
T& && forward(typename std::remove_reference<T>::type& t) noexcept {
  return static_cast<T& &&>(t);
}

By applying the reference collapsing rules here, it is

template<class T>
T& forward(typename std::remove_reference<T>::type& t) noexcept {
  return static_cast<T&>(t);
}

When T is an rvalue reference, std::forward<T> instantiates as

template<class T>
T&& && forward(typename std::remove_reference<T>::type& t) noexcept {
  return static_cast<T&& &&>(t);
}

which is equivalent to

template<class T>
T&& forward(typename std::remove_reference<T>::type& t) noexcept {
  return static_cast<T&&>(t);
}

So the final solution for wrapper is

template <typename T1>
void wrapper(T1&& e1) {
    func(std::forward<T1>(e1));
}

Note that std::forward needs the explicit template argument <T1>: it cannot deduce T on its own, because T appears only in a non-deduced context in its parameter.

Problem solved!
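Putting it all together, here's a minimal runnable version of the fixed example:

#include <cstdio>
#include <utility>

void func(int& b) { printf("1\n"); }   // chosen for lvalues
void func(int&& b) { printf("2\n"); }  // chosen for rvalues

template <typename T1>
void wrapper(T1&& e1) {
  // std::forward<T1> restores the value category of the original argument.
  func(std::forward<T1>(e1));
}

int main() {
  int k = 0;
  wrapper(k);  // prints 1
  wrapper(0);  // prints 2
  return 0;
}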

Function Ambiguity

This is something I noticed while I was studying this.

Consider this code snippet

void func(int& b) {
  printf("1");
}

void func(int b) {
  printf("2");
}

template <typename T1>
void wrapper(T1&& e1) {
    func(e1);
}

int main() {
  int k = 0;
  wrapper(k);
  return 0;
}

When we compile it, we see

perf.cpp:17:5: error: call to 'func' is ambiguous
    func(e1);
    ^~~~
perf.cpp:22:3: note: in instantiation of function template specialization
      'wrapper<int &>' requested here
  wrapper(k);
  ^
perf.cpp:7:6: note: candidate function
void func(int& b) {
     ^
perf.cpp:11:6: note: candidate function
void func(int b) {
     ^
1 error generated.

The reason for the compile error is that wrapper, in this case, is equivalent to

void wrapper(int& e1) {  // wrapper<int&> after reference collapsing
    func(e1);
}

and both func(int& b) and func(int b) are viable for the lvalue e1, so the compiler cannot decide which function is the correct one.

References

Perfect forwarding and universal references in C++, by Eli Bendersky

Answers to a stackoverflow question, by Kerrek SB

Thu, 15 Oct 2020 22:03:11 -0400