15 Obscure Features of the C++ Language

Posted by 小哥 on 2023-05-24

Reposted from: http://developer.51cto.com/art/201312/425995.htm

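This post is a list of obscure features of the C++ language that I have collected over years of exploring the language's many facets. C++ is huge, and I always learn something new. Even if you know C++ like the back of your hand, I hope you will learn something from this list. The features below are ordered roughly from least to most obscure.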

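1. What square brackets really mean
2. Most vexing parse
3. Alternate operator tokens
4. Redefining keywords
5. Placement new
6. Branching on a variable declaration
7. Ref-qualifiers on member functions
8. Turing-complete template metaprogramming
9. Pointer-to-member operators
10. Static methods on instances
11. Overloading ++ and --
12. Operator overloading and evaluation order
13. Functions as template parameters
14. Template template parameters
15. Try blocks as functions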

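What square brackets really mean

Accessing an array element with ptr[3] is really just shorthand for *(ptr + 3), which is equivalent to *(3 + ptr). The reverse form 3[ptr] is therefore equivalent as well, and it is completely valid code.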

Most vexing parse

The term "most vexing parse" was coined by Scott Meyers to describe an ambiguity in C++ declaration syntax that leads to counterintuitive behavior:

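// Which interpretation is correct?
// 1) A variable of type std::string initialized with a std::string() instance?
// 2) A declaration of a function that returns std::string and takes one
//    parameter: a pointer to a function that takes no arguments and also
//    returns std::string?
std::string foo(std::string());
// And what about this one?
// 1) A variable of type int initialized with int(x)?
// 2) A declaration of a function that returns int and takes one
//    parameter: an int named x?
int bar(int(x));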

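In both cases the C++ standard requires the second interpretation, even though the first looks more intuitive. The programmer can remove the ambiguity by wrapping the variable's initial value in parentheses: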

// Extra parentheses remove the ambiguity
std::string foo((std::string()));
int bar((int(x)));

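The second case is ambiguous because int y = 3; is equivalent to int(y) = 3;

Translator's note: I was a little confused by this point. Here is my g++ test case for the interpretation described above: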

#include <iostream>
#include <string>
using namespace std;
int bar(int(x)); // equivalent to int bar(int x)
string foo(string()); // equivalent to string foo(string (*)())
string test() {
    return "test";
}
int main()
{
    cout << bar(2) << endl; // prints 2
    cout << foo(test); // prints test
    return 0;
}
int bar(int(x)) {
    return x;
}
string foo(string (*fun)()) {
    return (*fun)();
}

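This prints the expected output. But if I add the parentheses as the author suggests before compiling, the compiler reports a pile of errors such as "was not declared in this scope" and "redefinition". It is not clear to me what the author intended.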

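Alternate operator tokens

The tokens and, and_eq, bitand, bitor, compl, not, not_eq, or, or_eq, xor, xor_eq, <%, %>, <: and :> can be used in place of the usual &&, &=, &, |, ~, !, !=, ||, |=, ^, ^=, {, }, [ and ]. They let you write these operators on keyboards that lack the necessary symbols.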

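Redefining keywords

Redefining keywords through the preprocessor is technically supposed to be an error, but in practice tools allow it. That means you can play pranks with things like #define true false or #define else. Sometimes, however, it is legitimate and useful. For example, if you are using a large library and need to get around C++ access protection, the alternative to patching the library is to turn off access protection before including the library headers. Just remember to turn it back on afterwards!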

#define class struct
#define private public
#define protected public
#include "library.h"
#undef class
#undef private
#undef protected

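Note that this trick does not always work; it depends on your compiler. C++ only requires instance variables to be laid out in declaration order when no access specifier separates them, so a compiler is free to reorder the groups delimited by access specifiers and change the memory layout. For example, a compiler is allowed to move all private members after the public ones. Another potential problem is name mangling: Microsoft's C++ compiler encodes access specifiers into its mangled names, so changing access specifiers breaks compatibility with existing compiled code.

Translator's note: in C++, name mangling is a technique introduced to support overloading. The compiler rewrites the names it emits into the object file, so the names used in the symbol table and during linking differ from the names in the source program. That is how overloading is implemented.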

Placement new

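Placement new is an alternate syntax for the new operator that constructs an object in memory that has already been allocated with the correct size and alignment. This includes setting up the virtual function table and calling the constructor.

Translator's note: placement new builds a new object at a memory location chosen by the user, and the construction needs no additional memory allocation; it only calls the object's constructor. In effect, placement new separates the two steps of ordinary new: first you allocate the memory yourself, then you call the class constructor to build the object in that memory. Benefits of placement new: 1) constructing an object in pre-allocated memory is fast; 2) the allocated memory can be reused, which helps avoid memory fragmentation.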

#include <cstdlib>
#include <iostream>
#include <new>
using namespace std;
struct Test {
    int data;
    Test() { cout << "Test::Test()" << endl; }
    ~Test() { cout << "Test::~Test()" << endl; }
};
int main() {
    // Must allocate our own memory
    Test *ptr = (Test *)malloc(sizeof(Test));
    // Use placement new
    new (ptr) Test;
    // Must call the destructor ourselves
    ptr->~Test();
    // Must release the memory ourselves
    free(ptr);
    return 0;
}

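Placement new is useful when a custom allocator is needed in performance-critical situations. For example, a slab allocator starts with a single large block of memory and uses placement new to allocate objects sequentially within the block. This not only avoids memory fragmentation but also saves the overhead of malloc's heap traversal.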

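Branching on a variable declaration

C++ includes a syntactic shorthand for declaring a variable and branching on its value at the same time. It looks like a single variable declaration, and it can serve as the condition of an if or while.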

#include <iostream>
struct Event { virtual ~Event() {} };
struct MouseEvent : Event { int x, y; };
struct KeyboardEvent : Event { int key; };
void log(Event *event) {
    // Each pointer is in scope only inside its own if/else chain
    if (MouseEvent *mouse = dynamic_cast<MouseEvent *>(event))
        std::cout << "MouseEvent " << mouse->x << " " << mouse->y << std::endl;
    else if (KeyboardEvent *keyboard = dynamic_cast<KeyboardEvent *>(event))
        std::cout << "KeyboardEvent " << keyboard->key << std::endl;
    else
        std::cout << "Event" << std::endl;
}

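Ref-qualifiers on member functions

C++11 allows member functions to be overloaded on the value category of the object they are called on, using a ref-qualifier that applies to the object this points to. The ref-qualifier goes in the same position as the cv-qualifiers (translator's note: there are three kinds of cv-qualifier: const, volatile, and const volatile), and it affects overload resolution depending on whether the this object is an lvalue or an rvalue: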

#include <iostream>
struct Foo {
    void foo() & { std::cout << "lvalue" << std::endl; }
    void foo() && { std::cout << "rvalue" << std::endl; }
};
int main() {
    Foo foo;
    foo.foo(); // Prints "lvalue"
    Foo().foo(); // Prints "rvalue"
    return 0;
}

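Turing-complete template metaprogramming

C++ templates exist to implement compile-time metaprogramming, that is, programs that generate other programs. The template system was originally designed for simple type substitution, but during the C++ standardization process it was discovered that templates are actually powerful enough to perform arbitrary computation. Clumsy and inefficient as it is, template specialization really can carry out computations: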

// Recursive template for general case
template <int N>
struct factorial {
    enum { value = N * factorial<N - 1>::value };
};
// Template specialization for base case
template <>
struct factorial<0> {
    enum { value = 1 };
};
enum { result = factorial<5>::value }; // 5 * 4 * 3 * 2 * 1 == 120

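C++ templates can be viewed as a functional programming language, since they use recursion instead of iteration and their state is immutable. You can use typedef to create a variable of any type and enum to create a variable of type int, and data structures are embedded in the types themselves.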

// Compile-time list of integers
template <int D, typename N>
struct node {
    enum { data = D };
    typedef N next;
};
struct end {};
// Compile-time sum function
template <typename L>
struct sum {
    enum { value = L::data + sum<typename L::next>::value };
};
// Base case: the end marker contributes nothing
template <>
struct sum<end> {
    enum { value = 0 };
};
// Data structures are embedded in types
typedef node<1, node<2, node<3, end> > > list123;
enum { total = sum<list123>::value }; // 1 + 2 + 3 == 6

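Of course these examples are not useful in themselves, but template metaprogramming can do useful things, such as manipulating lists of types. As a programming language, however, C++ templates have extremely poor usability, so use them sparingly and with care. Template code is hard to read, slow to compile, and hard to debug because its error messages are long and confusing.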

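Pointer-to-member operators

Pointer-to-member operators let you describe a pointer to a member that can be applied to any instance of a class. There are two pointer-to-member operators: .* for values and ->* for pointers: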

#include <iostream>
using namespace std;
struct Test {
    int num;
    void func() {}
};
// Notice the extra "Test::" in the pointer type
int Test::*ptr_num = &Test::num;
void (Test::*ptr_func)() = &Test::func;
int main() {
    Test t;
    Test *pt = new Test;
    // Call the stored member function
    (t.*ptr_func)();
    (pt->*ptr_func)();
    // Set the variable in the stored member slot
    t.*ptr_num = 1;
    pt->*ptr_num = 2;
    delete pt;
    return 0;
}

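This feature is actually quite useful, especially when writing libraries. For example, Boost::Python, a library for binding C++ to Python objects, uses pointer-to-member operators so that members can be referred to easily when wrapping objects.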

#include <iostream>
#include <string>
#include <boost/python.hpp>
using namespace boost::python;
struct World {
    std::string msg;
    void greet() { std::cout << msg << std::endl; }
};
BOOST_PYTHON_MODULE(hello) {
    class_<World>("World")
        .def_readwrite("msg", &World::msg)
        .def("greet", &World::greet);
}

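Keep in mind that using member function pointers is different from using ordinary function pointers. Casting between member function pointers and ordinary function pointers is not valid. For example, member functions in Microsoft's compiler use an optimized calling convention called thiscall, which puts the this parameter in the ecx register, whereas the normal calling convention passes all arguments on the stack.

In addition, member function pointers can be up to about four times larger than ordinary pointers. The compiler may need to store the address of the function body, the offset to the correct base (multiple inheritance), the index of another offset in the virtual function table (virtual inheritance), and perhaps even the offset of the virtual function table inside the object itself (for forward-declared types).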

#include <iostream>
struct A {};
struct B : virtual A {};
struct C {};
struct D : A, C {};
struct E;
int main() {
    std::cout << sizeof(void (A::*)()) << std::endl;
    std::cout << sizeof(void (B::*)()) << std::endl;
    std::cout << sizeof(void (D::*)()) << std::endl;
    std::cout << sizeof(void (E::*)()) << std::endl;
    return 0;
}
// 32-bit Visual C++ 2008: A = 4, B = 8, D = 12, E = 16
// 32-bit GCC 4.2.1: A = 8, B = 8, D = 8, E = 8
// 32-bit Digital Mars C++: A = 4, B = 4, D = 4, E = 4

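In the Digital Mars compiler all member function pointers are the same size, thanks to a clever design that generates "thunk" functions to apply the right offsets instead of storing the offsets inside the pointer itself.

Static methods on instances

C++ static methods can be called through an instance as well as directly through the class. This lets you change an instance method into a static method without updating any call sites.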

struct Foo {
    static void foo() {}
};
int main() {
    // These are equivalent
    Foo::foo();
    Foo().foo();
    return 0;
}

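Overloading ++ and --

C++ is designed so that the function name of a custom operator is the operator symbol itself, which works well in most cases. For example, the unary - and binary - operators (negation and subtraction) can be told apart by the number of parameters. This does not work for the unary increment and decrement operators, though, because their signatures would look identical. The C++ language has a rather clumsy hack to work around this: the postfix ++ and -- operators must take an unused int parameter that serves as a marker telling the compiler to generate the postfix operator (and yes, only the type int works).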

struct Number {
    Number &operator ++ (); // Generate a prefix ++ operator
    Number operator ++ (int); // Generate a postfix ++ operator
};

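Operator overloading and evaluation order

Overloading the , (comma), || or && operators can cause confusion because it breaks the normal evaluation rules. Normally the comma operator evaluates its entire left side before starting on the right side, and the || and && operators short-circuit: they evaluate the right side only when necessary. The overloaded versions of these operators, however, are just function calls, and function calls evaluate their arguments in an unspecified order.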

Overloading these operators is just a way to abuse C++ syntax. As an example, here is a C++ implementation of Python's parenthesis-free print statement:

#include <iostream>
namespace __hidden__ {
    struct print {
        bool space;
        print() : space(false) {}
        // The newline is written when the temporary dies at the end of the statement
        ~print() { std::cout << std::endl; }
        template <typename T>
        print &operator , (const T &t) {
            if (space) std::cout << ' ';
            else space = true;
            std::cout << t;
            return *this;
        }
    };
}
#define print __hidden__::print(),
int main() {
    int a = 1, b = 2;
    print "this is a test";
    print "the sum of", a, "and", b, "is", a + b;
    return 0;
}

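Functions as template parameters

It is well known that template parameters can be specific integers, but they can also be specific functions. This lets the compiler inline calls to that function when it instantiates the template code, for more efficient execution. In the following example, the template parameter of memoize is itself a function, and that function is called only for new argument values (results for old argument values come from the cache):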

#include <map>
template <int (*f)(int)>
int memoize(int x) {
    // One cache per instantiation, i.e. per wrapped function
    static std::map<int, int> cache;
    std::map<int, int>::iterator y = cache.find(x);
    if (y != cache.end()) return y->second;
    return cache[x] = f(x);
}
int fib(int n) {
    if (n < 2) return n;
    return memoize<fib>(n - 1) + memoize<fib>(n - 2);
}

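Template template parameters

Template parameters can actually be templates themselves, which lets you pass a template type without its template arguments when instantiating a template. Consider the following code: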

template <typename T>
struct Cache { ... };
template <typename T>
struct NetworkStore { ... };
template <typename T>
struct MemoryStore { ... };
template <typename Store, typename T>
struct CachedStore {
    Store store;
    Cache<T> cache;
};
CachedStore<NetworkStore<int>, int> a;
CachedStore<MemoryStore<int>, int> b;

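The cache in CachedStore stores the same data type as the store. When instantiating a CachedStore, however, we have to write the data type twice (int in the code above): once for the store and once for CachedStore itself, and nothing guarantees that the two are consistent. What we really want is to specify the data type only once so that consistency is enforced, but leaving off the type parameter list causes a compile error: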