How do I get a consistent byte representation of strings in C# without manually specifying an encoding?

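How do I convert a string to a byte[] in .NET (C#) without manually specifying a specific encoding?

I'm going to encrypt the string. I can encrypt it without converting it, but I'd still like to know why encoding comes into play here.

Also, why should encoding be taken into account at all? Can't I simply get the bytes the string is stored in? Why is there a dependency on character encodings?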

Current answer

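A string can be converted to a byte array in several different ways because .NET supports Unicode, and Unicode standardizes several different encodings called UTFs. Their byte representations differ in length, but they are equivalent in the sense that a string encoded with one of them can be decoded back to the same string. If, however, a string is encoded with one UTF and decoded under the assumption of a different UTF, the result can be garbled.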
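As a rough illustration of the point above, the following sketch compares the encoded length of one string under three UTFs and shows what happens when the decoder assumes a different UTF than the encoder. The class name `UtfLengthDemo` and its method names are made up for this example; only the `System.Text.Encoding` members are real .NET APIs.

```csharp
using System;
using System.Text;

static class UtfLengthDemo
{
    // Encoded length of the same text in UTF-8, UTF-16 and UTF-32.
    public static (int Utf8, int Utf16, int Utf32) EncodedLengths(string text) =>
        (Encoding.UTF8.GetByteCount(text),
         Encoding.Unicode.GetByteCount(text),   // Encoding.Unicode is UTF-16 (little-endian)
         Encoding.UTF32.GetByteCount(text));

    // Encodes with one encoding and decodes with a possibly different one.
    public static string Transcode(string text, Encoding encodeWith, Encoding decodeWith) =>
        decodeWith.GetString(encodeWith.GetBytes(text));
}
```

For "h\u00E9llo" (five characters, one of them non-ASCII), `EncodedLengths` gives 6, 10 and 20 bytes. `Transcode` returns the original text when both encodings match, and a different string when the text is encoded as UTF-8 but decoded as UTF-16.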

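In addition, .NET supports non-Unicode encodings, but they are not valid in the general case: they work only when the actual string uses a limited subset of Unicode code points, such as ASCII. Internally, .NET uses UTF-16, but for stream representation UTF-8 is usually used. UTF-8 is also the de facto standard on the Internet.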
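A minimal sketch of why a non-Unicode encoding is only safe for a limited character set: the helper below (the names `LossyEncodingDemo` and `RoundTrip` are hypothetical) encodes and immediately decodes a string with the given encoding, so the result can be compared with the input.

```csharp
using System;
using System.Text;

static class LossyEncodingDemo
{
    // Encodes the text and decodes it again with the same encoding.
    public static string RoundTrip(string text, Encoding encoding) =>
        encoding.GetString(encoding.GetBytes(text));
}
```

With `Encoding.ASCII`, "Hello" survives unchanged, while "谢谢" comes back as "??" because the default ASCII encoder substitutes a question mark for each character it cannot represent. With `Encoding.UTF8`, both strings survive.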

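Not surprisingly, serialization of strings into byte arrays and deserialization back are supported by System.Text.Encoding, which is an abstract class; its derived classes support concrete encodings: ASCIIEncoding and four UTFs (System.Text.UnicodeEncoding supports UTF-16).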

See this link.

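To serialize a string to a byte array, use System.Text.Encoding.GetBytes. For the reverse operation, use System.Text.Encoding.GetChars. This function returns an array of characters, so to get a string, use the string constructor System.String(char[]). See this page.

Example: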

string myString = "some string"; // any string

System.Text.Encoding encoding = System.Text.Encoding.UTF8; // or some other, but prefer a UTF if Unicode is used
byte[] bytes = encoding.GetBytes(myString);

// the next lines were written in response to follow-up questions:

myString = new string(encoding.GetChars(bytes));
bytes = encoding.GetBytes(myString); // reuse the variable; redeclaring it would not compile
myString = new string(encoding.GetChars(bytes));
bytes = encoding.GetBytes(myString);

// how many times shall I repeat it to show there is a round-trip? :-)
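For readers who only want the bytes the string already occupies in memory, as the question asks, one possible sketch is to copy the string's UTF-16 code units directly, without naming an encoding. This is not part of the answer above; the class `RawStringBytes` and its methods are hypothetical names, and the byte order of the result follows the machine's endianness, so it is only suitable when the same kind of machine reads the bytes back.

```csharp
using System;

static class RawStringBytes
{
    // Copies the string's in-memory UTF-16 code units into a byte array.
    public static byte[] GetBytes(string text)
    {
        byte[] bytes = new byte[text.Length * sizeof(char)];
        Buffer.BlockCopy(text.ToCharArray(), 0, bytes, 0, bytes.Length);
        return bytes;
    }

    // Reverses GetBytes: reinterprets each pair of bytes as one char.
    public static string GetString(byte[] bytes)
    {
        char[] chars = new char[bytes.Length / sizeof(char)];
        Buffer.BlockCopy(bytes, 0, chars, 0, chars.Length * sizeof(char));
        return new string(chars);
    }
}
```

This round-trips any string, including one with unpaired surrogates, because no character interpretation takes place. For data that leaves the process, naming an encoding explicitly, as in the example above, is usually the safer choice.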

2014-06-11 11:29:06

Other answers

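// Requires: using System.IO; using System.Runtime.Serialization.Formatters.Binary; using System.Windows.Forms;
// Note: BinaryFormatter is marked obsolete in .NET 5 and later; it is shown here as in the original answer.
BinaryFormatter bf = new BinaryFormatter();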
byte[] bytes;
MemoryStream ms = new MemoryStream();

string orig = "喂 Hello 谢谢 Thank You";
bf.Serialize(ms, orig);
ms.Seek(0, 0);
bytes = ms.ToArray();

MessageBox.Show("Original bytes Length: " + bytes.Length.ToString());

MessageBox.Show("Original string Length: " + orig.Length.ToString());

for (int i = 0; i < bytes.Length; ++i) bytes[i] ^= 168; // pseudo encrypt
for (int i = 0; i < bytes.Length; ++i) bytes[i] ^= 168; // pseudo decrypt

BinaryFormatter bfx = new BinaryFormatter();
MemoryStream msx = new MemoryStream();            
msx.Write(bytes, 0, bytes.Length);
msx.Seek(0, 0);
string sx = (string)bfx.Deserialize(msx);

MessageBox.Show("Still intact :" + sx);

MessageBox.Show("Deserialize string Length(still intact): " 
    + sx.Length.ToString());

BinaryFormatter bfy = new BinaryFormatter();
MemoryStream msy = new MemoryStream();
bfy.Serialize(msy, sx);
msy.Seek(0, 0);
byte[] bytesy = msy.ToArray();

MessageBox.Show("Deserialize bytes Length(still intact): " 
   + bytesy.Length.ToString());

2009-01-23 16:36:07

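You need to take the encoding into account because one character can be represented by one or more bytes (up to 4 per code point in UTF-8 as currently standardized; the original UTF-8 design allowed up to 6), and different encodings treat these bytes differently.

Joel has a post on this:

The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)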
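To make the variable width concrete, here is a small sketch that reports how many bytes a piece of text occupies in UTF-8 and in UTF-16. The class and method names are invented for the example.

```csharp
using System;
using System.Text;

static class BytesPerCharacterDemo
{
    // Number of bytes the text occupies when encoded as UTF-8.
    public static int Utf8Bytes(string text) => Encoding.UTF8.GetByteCount(text);

    // Number of bytes the text occupies when encoded as UTF-16.
    public static int Utf16Bytes(string text) => Encoding.Unicode.GetByteCount(text);
}
```

In UTF-8, "A" takes 1 byte, "\u00E9" (é) takes 2, "\u8C22" (谢) takes 3, and "\uD83D\uDE00" (an emoji outside the Basic Multilingual Plane) takes 4. In UTF-16 the first three take 2 bytes each and the emoji takes 4, since it is stored as a surrogate pair.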

2009-01-23 14:03:30

// 'something' is an existing string variable
byte[] utf8Buffer = System.Text.Encoding.UTF8.GetBytes(something); // convert to UTF-8, then get its bytes

byte[] asciiBuffer = System.Text.Encoding.ASCII.GetBytes(something); // convert to ASCII, then get its bytes (non-ASCII characters become '?')

2012-01-02 11:07:00

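To convert a string to a byte[], use the following solution:

string s = "abcdefghijklmnopqrstuvwxyz";
byte[] b = System.Text.Encoding.UTF32.GetBytes(s); // GetBytes is an instance method, so call it on Encoding.UTF32

I hope this helps.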

2014-04-09 12:39:54

From byte[] to string:

        // Note: this yields hyphen-separated hex such as "61-62-63", not the decoded text.
        return BitConverter.ToString(bytes);
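The difference between a hex dump of the bytes and the decoded text can be sketched as follows. The wrapper class `BytesToTextDemo` is hypothetical, and UTF-8 is only an assumption about how the bytes were produced.

```csharp
using System;
using System.Text;

static class BytesToTextDemo
{
    // Hex representation of the bytes, e.g. "61-62-63".
    public static string ToHex(byte[] bytes) => BitConverter.ToString(bytes);

    // Decodes the bytes back to text, assuming they were encoded as UTF-8.
    public static string Decode(byte[] bytes) => Encoding.UTF8.GetString(bytes);
}
```

For the bytes 0x61, 0x62, 0x63, `ToHex` returns "61-62-63" while `Decode` returns "abc". The hex form is useful for logging or inspection; recovering the original string requires decoding with the encoding that produced the bytes.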

2015-01-21 14:05:34
