C 性能优化实践（二）

InfoRich 2021-08-14

展开全文

字符串性能优化

在平时的开发工作中，操作字符串是十分常见和频繁的，比如下面这个个简单的字符串操作函数：

std::string log_str(const char* filename, int line_number) { std::string_view s(filename); auto pos = s.rfind('/'); auto name = (pos == std::string_view::npos) ? filename : s.substr(pos + 1); std::stringstream ss; ss << name << ':' << line_number << ':'; return ss.str();}

auto s = log_str(__FILE__, __LINE__);std::cout << 'runtime log ' << s << '\n';//main.cpp:12

这里log_str将文件名和行号通过:连接在一起返回给用户，看似很普通的一个字符串操作，实际上是比较低效的，在前文中指出了尽量不使用stringstream去转换数字，这里可以换成std::to_chars以获取更好的性能。除此之外还能不能进一步提高这个log_str的性能呢？答案是肯定的，我希望能彻底消除运行期生成字符串的开销。

消除运行期开销的方法无疑问的是使用编译期生成字符串，如果我有一个编译期字符串的话，上述代码就可以改成这样了？

auto static_log_str(const compile_time_string &filename, const compile_time_string &line_number) {  auto pos = filename.rfind('/');  return str.substr(pos + 1) + ':' + line_number + ':';}

由于这个static_log_str操作的编译期字符串，所以可以彻底消除之前log_str函数的运行期开销。但是C++目前并没没有提供一个编译期字符串供我们操作，据说在c++23中会有编译期字符串以及容器，既然现在没有就造一个出来用以优化字符串操作的性能。

编译期字符串

我希望编译期字符串和运行期字符串的用法是类似的，也有各种算法以及+，=字符串操作符，只不过都是在编译期完成。既然是编译期字符串，那么内存肯定是在栈上分配的，内存大小是固定的，因此一个编译期字符串其实是一个固定长度的字符串，另外既然是一个编译期字符串那么它的方法都是constexpr方法，让我们来看看一个编译期固定长度的字符串长什么样。

template <class Char, std::size_t N>class FixedString{...};

void test_str() { constexpr auto s1 = 'hello'_fs; //FixedString<char, 5> constexpr auto s = s1 + ' ' + 'world'; static_assert(s=='hello world'_fs); static_assert(s.substr(6)=='world'_fs);}

可以看到编译期字符串操作和普通字符串操作没什么分别，唯一有差别的地方是多了一个constexpr限定符，这个constexpr的作用是告诉编译器，这个constexpr变量是在编译期生成的，既然是在编译期生成的，那么编译器也会有要求，要求生成这个变量的表达式或者函数的参数必须是编译期常量，否则会给出一个编译错误。我们操作的是字符串常量是编译期常量，所以生成一个constexpr的编译期字符串是没问题的。接下来我们来看看怎么实现这个编译期字符串的。

编译期字符串的实现

成员

template <class Char, std::size_t N>class FixedString {  Char data_[N + 1u];  // +1 是为了放字符串终止符'0'  std::size_t size_;   // 这个size是不包含字符串终止符的长度, size_ <= N.  //...

有两个成员，一个是含字符串终止符'0'的char数组，一个是字符串实际长度，除此之外没有其它成员了，看起来非常简单。接下来需要解决的一个问题是如何在编译期去构造这个string，即如何在编译期填满这个data_以及赋值size_。

构造

template <class Char, std::size_t N>class FixedString { Char data_[N + 1u]; // +1 是为了放字符串终止符'0' std::size_t size_; // 这个size是不包含字符串终止符的长度, size_ <= N.public: template <class That, std::size_t... Is> constexpr FixedString(const That& that, std::size_t size, std::index_sequence<Is...>, std::size_t pos = 0, std::size_t count = npos) noexcept : data_{ (Is < (size - pos) && Is < count ? that[Is + pos] : Char(0))..., Char(0) }, size_{ std::min(size - pos, count) } {}

constexpr std::size_t size() const noexcept { return size_; }

这个构造函数看起来有点复杂，我们可以先忽略这个构造函数后面的两个参数，这里借助了std::index_sequence和逗号表达式来初始化以及列表初始化来在编译期填充这个data_。

关键在这里：

data_{ (Is < (size - pos) && Is < count ? that[Is + pos] : Char(0))..., Char(0) }

首先逗号表达式用于展开变参Is..., 在展开变参的过程中我们就能获取0，1，2，...的编译期索引了，有了这个索引接着就可以将that中的字符一个个取出并利用列表初始化来初始化data_了，这里还是有一点技巧的。这里也可以考虑使用c++17的fold experssion来替代逗号表达式。

来测试一下构造编译期字符串。

void test_construct() { constexpr const char* str = 'hello'; constexpr FixedString<char, 5> fixed_str(str, 5, std::make_index_sequence<5>{}); static_assert(fixed_str.size() == 5);//ok}

编译期断言通过，在内存中可以看到fixedstring是着这样的{{'h','e','l','l','o','\0'}, 5}, 说明我们在编译期已经构造完成了一个字符串了。虽然我们可以通过auto来消除FixedString<char, 5>类型的编写，但是仍然不够方便，因为构造这个string的时候还需要先计算出字符串的长度，还需要调用std::make_index_sequence, 用起来不方便。我们当然不应该给用户提供这样原始的接口，应该提供的是非常简单易用的接口，就像文章开头介绍的那样，我们可以通过这种方式去构造编译期字符串：

constexpr auto str = 'hello'_fs; //FixedString<char, 5>constexpr auto str1 = MakeFixedString('hello');static_assert(str==str1);

为了能更方便地构造编译期字符串，我们需要增加几个重载的构造函数：

//支持char数组 template <std::size_t M, class = typename std::enable_if<(M - 1u <= N)>::type> constexpr FixedString(const Char(&that)[M]) noexcept : FixedString{ that, M - 1u, std::make_index_sequence<M - 1u>{} } {}

//指出char* constexpr FixedString(const Char* that, std::size_t count) noexcept : FixedString(that, count, std::make_index_sequence<N>{}) {}

有了这两个构造函数我们就可以通过编译期字符串常量去构造fixedstring了。

来看看这两个构造编译期字符串的辅助函数是怎么样的。

//Only for linux, not supported in msvc.template <class Char, Char... Cs>constexpr FixedString<Char, sizeof...(Cs)> operator'' _fs() noexcept {  const Char a[] = {Cs..., Char(0)};  return {a, sizeof...(Cs)};//调用支持char*的构造函数}
template <class Char, std::size_t N>constexpr FixedString<Char, N - 1u> MakeFixedString(const Char (&a)[N]) noexcept {  return {a}; //调用支持char数组的构造函数}

有了这两个辅助函数我们就可以很轻松地创建编译期字符串了。注意_fs literal操作符只能在linux中使用，在msvc中并没有支持，所以在win上是用不了，可以使用都支持的MakeFixedString。

到此我们已经有了一个基本能用的编译期字符串了，但是还缺少一些操作符，比如连接字符串的操作符+以及赋值操作符=，没有这些操作符，我们使用fixed string的时候是不方便的，接下来我们再来看看怎么实现这些操作符。

操作符

字符串+操作符实际上是一个concat，将两个字符串合并成一个新的字符串，既然是生成一个新的字符串那仍然需要借助构造函数来实现，再来增加一个构造函数，该构造函数可以接受两个字符串。

template <class Left, class Right, std::size_t... Is> constexpr FixedString(const Left& left, std::size_t left_size, const Right& right, std::size_t right_size, std::index_sequence<Is...>) noexcept : data_{ CharAt<Char>(left, left_size, right, right_size, Is)..., Char(0) }, size_{ left_size + right_size } {}

这个构造函数和之前那个构造函数是类似的，也是通过逗号表达式去构造data_，不同的是这里是通过一个CharAt而不是一个表达式去填充data_的，再看看这个CharAt

template <class Char, class Left, class Right>constexpr Char CharAt(const Left& left, std::size_t left_count, const Right& right,  std::size_t right_count, std::size_t i) noexcept {  return i < left_count    ? left[i]    : i < (left_count + right_count) ? right[i - left_count] : Char(0);}

这个CharAt实际上将left和right的字符一个个取出来用于初始化data_，这样我们就可以通过传入两个常量字符串来构造一个新的字符串了，有了这个构造函数我们再实现+连接操作符就简单了：

template <class Char, class Left, class Right, std::size_t... Is>static constexpr FixedString<Char, sizeof...(Is)> Concat( const Left& left, std::size_t left_count, const Right& right, std::size_t right_count, std::index_sequence<Is...> is) noexcept { return { left, left_count, right, right_count, is };}

template <std::size_t M>friend constexpr FixedString<Char, N + M - 1u> operator+( const FixedString& a, const Char(&b)[M]) noexcept { return Concat<Char>(a.data_, a.size_, b, M - 1u, std::make_index_sequence<N + M - 1u>{});}

template <std::size_t M>friend constexpr FixedString<Char, N + M - 1u> operator+( const Char(&a)[M], const FixedString& b) noexcept { return Concat<Char>(a, M - 1, b.data_, b.size_, std::make_index_sequence<N + M - 1u>{});}

==比较操作符的实现比较简单：

template <class Left, class Right>constexpr bool Equal(const Left& left, std::size_t left_size, const Right& right,  std::size_t right_size) noexcept {  if (left_size != right_size) return false;
  for (size_t i = 0; i < left_size; ++i) {    if (left[i] != right[i]) {      return false;    }  }
  return true;}
template <class Char, std::size_t A, std::size_t B>constexpr bool operator==(const FixedString<Char, A>& a,  const FixedString<Char, B>& b) noexcept {  return Equal(a, a.size(), b, b.size());}
template <class Char, std::size_t A, std::size_t B>constexpr bool operator!=(const FixedString<Char, A>& a, const FixedString<Char, B>& b) {  return !(a == b);}

来测试一下这个连接操作符和比较操作符：

void test_concat() { constexpr auto s1 = MakeFixedString('hello'); constexpr auto s = s1 + ' ' + 'world'; constexpr auto s2 = MakeFixedString('hello world'); static_assert(s == s2);}

当然，我们还缺少算法，比如append, push_back, find, rfind等等，有了constexpr，这一切都变得简单了。

算法

constexpr std::size_t size() const noexcept { return size_; } constexpr std::size_t length() const noexcept { return size_; } constexpr bool empty() const noexcept { return 0u == size_; } static constexpr std::size_t capacity() noexcept { return N; } static constexpr std::size_t max_size() noexcept { return N; }

constexpr Char& at(std::size_t i) noexcept(false) { return data_[i]; } constexpr const Char& at(std::size_t i) const noexcept(false) { return data_[i]; } constexpr Char& operator[](std::size_t i) noexcept { return data_[i]; } constexpr const Char& operator[](std::size_t i) const noexcept { return data_[i]; }

constexpr Char& front() noexcept { return (*this)[0u]; } constexpr const Char& front() const noexcept { return (*this)[0u]; } constexpr Char& back() noexcept { return data_[size_ - 1]; } constexpr const Char& back() const noexcept { return data_[size_ - 1]; } constexpr void clear() noexcept { data_[0u] = Char(0); size_ = 0u; }

constexpr void push_back(Char ch) noexcept(false) { // detail::fixedstring::checkOverflow(1u, N - size_); data_[size_] = ch; data_[++size_] = Char(0); }

constexpr void pop_back() noexcept(false) { // detail::fixedstring::checkOverflow(1u, size_); --size_; data_[size_] = Char(0); }

template <std::size_t M> constexpr FixedString& append(const FixedString<Char, M>& that) noexcept(false) { return append(that, 0u, that.size_); }

template <std::size_t M> constexpr FixedString& append(const FixedString<Char, M>& that, std::size_t pos, std::size_t count) noexcept(false) { for (std::size_t i = 0u; i < count; ++i) { data_[size_ + i] = that.data_[pos + i]; } size_ += count; data_[size_] = Char(0); return *this; }

constexpr FixedString& append(const Char* that) noexcept(false) { return append(that, strlen(that)); }

constexpr FixedString& append(const Char* that, std::size_t count) noexcept(false) { for (std::size_t i = 0u; i < count; ++i) { data_[size_ + i] = that[i]; } size_ += count; data_[size_] = Char(0); return *this; }

constexpr size_t find(Char ch) { for (size_t i = 0; i < N; ++i) { if (data_[i] == ch) { return i; } }

return npos; }

constexpr size_t rfind(Char ch) { for (size_t i = N - 1; i >= 0; --i) { if (data_[i] == ch) { return i; } }

return npos; }

constexpr FixedString substr(std::size_t pos) const noexcept(false) { return { *this, pos }; }

constexpr FixedString substr(std::size_t pos, std::size_t count) const noexcept(false) { return { *this, pos, count }; }

constexpr std::size_t copy(Char* dest, std::size_t count, std::size_t pos) const noexcept(false) { for (std::size_t i = 0u; i < count; ++i) { if (i + pos == size_) { return size_; } dest[i] = data_[i + pos]; } return count; }

这些算法实现一目了然。注：为了让代码简单省掉了一些索引越界的判断。

to_string, to_string_view

为了让fixed string能转换为std::string和std::string_view，我们需要添加几个辅助方法：

  operator std::basic_string<Char>() const noexcept(false) {    return std::basic_string<Char>{begin(), end()};  }
  std::basic_string<Char> to_string() const noexcept(false) {    return std::basic_string<Char>{begin(), end()};  }  constexpr operator std::basic_string_view<Char>() const {    return std::basic_string_view<Char>{begin(), size()};  }  std::basic_string<Char> to_stringview() const noexcept(false) {    return std::basic_string_view<Char>{begin(), end()};  }

测试代码

void test_to_string() { constexpr auto str = MakeFixedString('hello'); std::string s = str; std::string_view s1 = str; assert(s == 'hello's); assert(s1 == 'hello'sv);}

编译期字符串操作性能对比

前面花了很大精力去实现一个编译期字符串，那么这个编译期字符串到底怎么用来优化性能呢，优化效果如何呢？让我们来改造文章最开始的那个log_str函数，看看优化效果如何。

template<size_t N, size_t M>constexpr auto static_log_str(const char(&filename)[N], const char(&line_number)[M]) {  auto str = MakeFixedString(filename);  auto pos = str.rfind('/'); //注： win上用'\\'  return str.substr(pos + 1) + ':' + line_number + ':';}
#define STR(x) #x#define TOSTRING(x) STR(x)
constexpr auto s = static_log_str(__FILE__, TOSTRING(__LINE__));std::cout << s.data() << '\n'; //main.cpp:20:

性能比较

在gcc8.2 -O2下编译测试

void test_log_str() { ScopedTimer timer('ss log str'); for (int i = 0; i < 10000; ++i) { log_str(__FILE__, __LINE__); }}

#define STR(x) #x#define TOSTRING(x) STR(x)

void test_static_log_str() { ScopedTimer timer('static log str'); for (int i = 0; i < 10000; ++i) { static_log_str(__FILE__, TOSTRING(__LINE__)); }}

int main() { test_log_str(); test_static_log_str(); test_static_log_str(); test_log_str();}

输出：

ss log str : 4556868 nsstatic log str : 44 nsstatic log str : 19 nsss log str : 4421673 ns

性能测试的结果是4556868 vs 44!说明我们彻底消除了运行期的开销！这就是编译期字符优化的威力！！

(注：如果读者想看完整的源码，可以去看folly的FixedString: https://github.com/facebook/folly/blob/master/folly/FixedString.h)

这里也许有人会说编译期字符串优化只适用于编译期常量字符串，有一些局限性，如果我的字符串是运行期的该怎么优化呢？Good question，我在后面会继续介绍优化运行期字符串的方法，点赞超过100更新性能优化第三篇:)

本站是提供个人知识管理的网络存储空间，所有内容均由用户发布，不代表本站观点。请注意甄别内容中的联系方式、诱导购买等信息，谨防诈骗。如发现有害或侵权内容，请点击一键举报。

转藏分享

QQ空间 QQ好友新浪微博微信

献花（0） +1

来自： InfoRich > 《c 》

举报/认领

0条评论

发表

请遵守用户评论公约

类似文章 更多

InfoRich

关注对话

TA的最新馆藏

最新综述！万字长文彻底搞懂单目3D车道线检测
浅谈容器网络原理
细说真4K内窥ISP硬件架构如何实现（Tj375)
为什么工程上选择VINS而不是ORB-SLAM？
何为网络协议？一图知晓网络过程。
当要求功能安全时，我们在要求什么？

喜欢该文的人也喜欢更多

热门阅读换一换