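How do I get a consistent byte representation of strings in C# without manually specifying an encoding?

How do I convert a string to a byte[] in .NET (C#) without manually specifying a specific encoding?

I'm going to encrypt the string. I can encrypt it without converting it, but I'd still like to know why encoding comes into play here.

Also, why should encoding be taken into consideration at all? Can't I simply get the bytes the string is stored in? Why is there a dependency on character encodings?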

Current answer

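It depends on what you want the bytes for

This is because, as Tyler so aptly said, "Strings aren't pure data. They also have information." In this case, the information is the encoding that was assumed when the string was created.

Assuming that you have binary data (rather than text) stored in a string

This is based on the OP's comment on his own question, and it is the correct question if I understand the OP's hints about the use case.

Storing binary data in strings is probably the wrong approach because of the assumed encoding mentioned above! Whatever program or library stored that binary data in a string (instead of a byte[] array, which would have been more appropriate) has already lost the battle before it has begun. If they are sending the bytes to you in a REST request/response, or anything else that must transmit strings, Base64 would be the right approach.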
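As a minimal sketch of the Base64 suggestion: the class and method names below (BinaryAsText, ToText, FromText) are hypothetical and chosen only for illustration; the Convert.ToBase64String and Convert.FromBase64String calls are the standard .NET ones.

```csharp
using System;

public static class BinaryAsText
{
    // Encode arbitrary bytes as a Base64 string that survives text-only channels.
    public static string ToText(byte[] data) => Convert.ToBase64String(data);

    // Decode the Base64 string back into the exact original bytes.
    public static byte[] FromText(string text) => Convert.FromBase64String(text);
}
```

For example, BinaryAsText.ToText(new byte[] { 0x00, 0xFF, 0x80, 0x7F }) yields "AP+Afw==", and FromText returns the same four bytes. No character encoding is involved because the bytes are never interpreted as text.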

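If you have a text string with an unknown encoding

Everybody else answered this incorrect question incorrectly.

If the string looks good as-is, just pick an encoding (preferably one starting with UTF), use the corresponding System.Text.Encoding.???.GetBytes() function, and tell whoever you give the bytes to which encoding you picked.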
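A short sketch of "pick an encoding and tell the receiver", assuming UTF-8 is the agreed choice. UTF-8 is only an illustrative pick here, and the class and method names are hypothetical.

```csharp
using System.Text;

public static class ExplicitEncoding
{
    // The sender encodes with the agreed encoding (UTF-8 in this sketch).
    public static byte[] Encode(string text) => Encoding.UTF8.GetBytes(text);

    // The receiver decodes with the same encoding it was told about.
    public static string Decode(byte[] bytes) => Encoding.UTF8.GetString(bytes);

    // Decoding with a different encoding (UTF-16LE here) generally produces different text.
    public static string DecodeWithWrongEncoding(byte[] bytes) => Encoding.Unicode.GetString(bytes);
}
```

Decode(Encode(s)) returns the original text, while DecodeWithWrongEncoding(Encode("hello!")) does not return "hello!". This is why the receiver has to know which encoding was picked.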

2017-11-08 18:21:31

Other answers

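// Requires: using System.IO; using System.Runtime.Serialization.Formatters.Binary;
// and using System.Windows.Forms; (for MessageBox).
// Note: BinaryFormatter is marked obsolete in .NET 5 and later and is unsafe for
// untrusted input, so treat this snippet as a .NET Framework example.
BinaryFormatter bf = new BinaryFormatter();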
byte[] bytes;
MemoryStream ms = new MemoryStream();

string orig = "喂 Hello 谢谢 Thank You";
bf.Serialize(ms, orig);
ms.Seek(0, 0);
bytes = ms.ToArray();

MessageBox.Show("Original bytes Length: " + bytes.Length.ToString());

MessageBox.Show("Original string Length: " + orig.Length.ToString());

for (int i = 0; i < bytes.Length; ++i) bytes[i] ^= 168; // pseudo encrypt
for (int i = 0; i < bytes.Length; ++i) bytes[i] ^= 168; // pseudo decrypt

BinaryFormatter bfx = new BinaryFormatter();
MemoryStream msx = new MemoryStream();            
msx.Write(bytes, 0, bytes.Length);
msx.Seek(0, 0);
string sx = (string)bfx.Deserialize(msx);

MessageBox.Show("Still intact :" + sx);

MessageBox.Show("Deserialize string Length(still intact): " 
    + sx.Length.ToString());

BinaryFormatter bfy = new BinaryFormatter();
MemoryStream msy = new MemoryStream();
bfy.Serialize(msy, sx);
msy.Seek(0, 0);
byte[] bytesy = msy.ToArray();

MessageBox.Show("Deserialize bytes Length(still intact): " 
   + bytesy.Length.ToString());

2009-01-23 16:36:07

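If you really want a copy of the underlying bytes of a string, you can use a function like the one below. However, you shouldn't; read on to find out why.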

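// Requires "using System.Runtime.InteropServices;" and the /unsafe compiler option.
// msvcrt.dll is a Windows library, so this P/Invoke works only on Windows.
[DllImport(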
        "msvcrt.dll",
        EntryPoint = "memcpy",
        CallingConvention = CallingConvention.Cdecl,
        SetLastError = false)]
private static extern unsafe void* UnsafeMemoryCopy(
    void* destination,
    void* source,
    uint count);

public static byte[] GetUnderlyingBytes(string source)
{
    var length = source.Length * sizeof(char);
    var result = new byte[length];
    unsafe
    {
        fixed (char* firstSourceChar = source)
        fixed (byte* firstDestination = result)
        {
            var firstSource = (byte*)firstSourceChar;
            UnsafeMemoryCopy(
                firstDestination,
                firstSource,
                (uint)length);
        }
    }

    return result;
}

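This function will get you a copy of the bytes underlying your string, pretty quickly. You'll get those bytes in whatever encoding your system uses. That encoding is almost certainly UTF-16LE, but that is an implementation detail you shouldn't have to care about.

It would be safer, simpler and more reliable to just call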

System.Text.Encoding.Unicode.GetBytes()

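This will, in all likelihood, give the same result, is easier to type, and the bytes will round-trip, as well as a byte representation in Unicode can, with a call to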

System.Text.Encoding.Unicode.GetString()
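The round trip described above can be sketched as follows. The wrapper class and method names are hypothetical. As far as I know, Encoding.Unicode replaces unpaired surrogates with U+FFFD by default, so the round trip is exact only for well-formed strings.

```csharp
using System.Text;

public static class Utf16RoundTrip
{
    // Two bytes per char: the UTF-16 code units in little-endian order, without a BOM.
    public static byte[] ToBytes(string s) => Encoding.Unicode.GetBytes(s);

    // Rebuild the string from UTF-16LE bytes.
    public static string FromBytes(byte[] bytes) => Encoding.Unicode.GetString(bytes);
}
```

ToBytes("Hi") gives the four bytes 0x48, 0x00, 0x69, 0x00, and FromBytes turns them back into "Hi".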

2014-11-25 10:29:12

You can use the following code to convert between a string and a byte array.

string s = "Hello World";

// String to Byte[]

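// UTF-8 is named explicitly so the result does not depend on the machine's default code page.
byte[] byte1 = System.Text.Encoding.UTF8.GetBytes(s);

// OR, only if the text is pure ASCII (any other character becomes '?')

byte[] byte2 = System.Text.Encoding.ASCII.GetBytes(s);

// Byte[] to string (decode with the same encoding that produced the bytes)

string str = System.Text.Encoding.UTF8.GetString(byte1);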

2014-09-09 11:30:51

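The key issue is that a glyph in a string can take up to 32 bits (each char code unit is 16 bits), while a byte has only 8 bits to spare. A one-to-one mapping doesn't exist unless you restrict yourself to strings that contain only ASCII characters. System.Text.Encoding offers many ways to map a string to byte[]; you need to pick one that avoids loss of information and that is easy for your client to use when they need to map the byte[] back to a string.

UTF-8 is a popular encoding; it is compact and lossless.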
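To make the size and loss trade-off concrete, here is a small sketch that compares encodings. The helper names are hypothetical; the Encoding members used are standard.

```csharp
using System.Text;

public static class EncodingComparison
{
    // Number of bytes the string needs in UTF-8 (1 to 4 bytes per code point).
    public static int Utf8ByteCount(string s) => Encoding.UTF8.GetByteCount(s);

    // Number of bytes the string needs in UTF-16 (2 bytes per char code unit).
    public static int Utf16ByteCount(string s) => Encoding.Unicode.GetByteCount(s);

    // ASCII cannot represent characters above U+007F; by default they come back as '?'.
    public static string ThroughAscii(string s) => Encoding.ASCII.GetString(Encoding.ASCII.GetBytes(s));

    // UTF-8 can represent every well-formed string, so the text survives the trip.
    public static string ThroughUtf8(string s) => Encoding.UTF8.GetString(Encoding.UTF8.GetBytes(s));
}
```

"Hello" takes 5 bytes in UTF-8 and 10 in UTF-16, while a single character such as U+8C22 takes 3 bytes in UTF-8 and is lost entirely when sent through ASCII.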

2009-01-23 14:15:26

The code is as follows:

// Input string.
const string input = "Dot Net Perls";

// Invoke GetBytes method.
// ... You can store this array as a field!
byte[] array = Encoding.ASCII.GetBytes(input);

// Loop through contents of the array.
foreach (byte element in array)
{
    Console.WriteLine("{0} = {1}", element, (char)element);
}

2013-01-23 06:21:33
