我想转换一个std::字符串小写。我知道tolower()函数。然而,在过去,我有这个函数的问题,它几乎不是理想的无论如何使用std::string将需要迭代每个字符。

有没有一种替代方案能100%有效?


当前回答

这可能是另一个简单的版本,将大写字母转换为小写字母,反之亦然。我使用VS2017社区版来编译这个源代码。

#include <iostream>
#include <string>
using namespace std;

int main()
{
    std::string _input = "lowercasetouppercase";
#if 0
    // My idea is to use the ascii value to convert
    char upperA = 'A';
    char lowerA = 'a';

    cout << (int)upperA << endl; // ASCII value of 'A' -> 65
    cout << (int)lowerA << endl; // ASCII value of 'a' -> 97
    // 97-65 = 32; // Difference of ASCII value of upper and lower a
#endif // 0

    cout << "Input String = " << _input.c_str() << endl;
    for (int i = 0; i < _input.length(); ++i)
    {
        _input[i] -= 32; // To convert lower to upper
#if 0
        _input[i] += 32; // To convert upper to lower
#endif // 0
    }
    cout << "Output String = " << _input.c_str() << endl;

    return 0;
}

注意:如果有特殊字符,则需要使用条件检查来处理。

其他回答

如果字符串包含ASCII范围之外的UTF-8字符,则boost::algorithm::to_lower将不会转换这些字符。当涉及UTF-8时,最好使用boost::locale::to_lower。参见http://www.boost.org/doc/libs/1_51_0/libs/locale/doc/html/conversions.html

这是Stefan Mai的回应的后续:如果你想把转换的结果放在另一个字符串中,你需要在调用std::transform之前预先分配它的存储空间。由于STL将转换后的字符存储在目标迭代器中(在每次循环迭代时递增),因此目标字符串不会自动调整大小,并且可能会占用内存。

#include <string>
#include <algorithm>
#include <iostream>

int main (int argc, char* argv[])
{
  std::string sourceString = "Abc";
  std::string destinationString;

  // Allocate the destination space
  destinationString.resize(sourceString.size());

  // Convert the source string to lower case
  // storing the result in destination string
  std::transform(sourceString.begin(),
                 sourceString.end(),
                 destinationString.begin(),
                 ::tolower);

  // Output the result of the conversion
  std::cout << sourceString
            << " -> "
            << destinationString
            << std::endl;
}

由于没有一个答案提到即将到来的Ranges库,它从c++ 20开始就在标准库中可用,目前在GitHub上单独可用为range-v3,我想添加一种使用它执行转换的方法。

就地修改字符串:

str |= action::transform([](unsigned char c){ return std::tolower(c); });

生成一个新的字符串:

auto new_string = original_string
    | view::transform([](unsigned char c){ return std::tolower(c); });

(不要忘记#include <cctype>和所需的Ranges头。)

注意:使用unsigned char作为lambda的参数是受cppreference的启发,它声明:

Like all other functions from <cctype>, the behavior of std::tolower is undefined if the argument's value is neither representable as unsigned char nor equal to EOF. To use these functions safely with plain chars (or signed chars), the argument should first be converted to unsigned char: char my_tolower(char ch) { return static_cast<char>(std::tolower(static_cast<unsigned char>(ch))); } Similarly, they should not be directly used with standard algorithms when the iterator's value type is char or signed char. Instead, convert the value to unsigned char first: std::string str_tolower(std::string s) { std::transform(s.begin(), s.end(), s.begin(), // static_cast<int(*)(int)>(std::tolower) // wrong // [](int c){ return std::tolower(c); } // wrong // [](char c){ return std::tolower(c); } // wrong [](unsigned char c){ return std::tolower(c); } // correct ); return s; }

std::ctype::tolower()从标准c++本地化库将正确地为您做这件事。下面是一个例子,从下面的参考页面提取

#include <locale>
#include <iostream>

int main () {
  std::locale::global(std::locale("en_US.utf8"));
  std::wcout.imbue(std::locale());
  std::wcout << "In US English UTF-8 locale:\n";
  auto& f = std::use_facet<std::ctype<wchar_t>>(std::locale());
  std::wstring str = L"HELLo, wORLD!";
  std::wcout << "Lowercase form of the string '" << str << "' is ";
  f.tolower(&str[0], &str[0] + str.size());
  std::wcout << "'" << str << "'\n";
}

复制是因为不允许改进答案。谢谢所以


string test = "Hello World";
for(auto& c : test)
{
   c = tolower(c);
}

解释:

For (auto& c: test)是一个基于范围的For循环,类似于For (range_declaration:range_expression)loop_statement:

Range_declaration: auto& c 这里auto说明符用于自动类型推断。类型从变量初始化式中扣除。 range_expression:测试 本例中的范围是字符串test的字符。

字符串test的字符可以在for循环中通过标识符c作为引用。