如何从字符串中删除所有非字母的字符?
非字母数字呢?
这必须是一个自定义函数还是也有更通用的解决方案?
如何从字符串中删除所有非字母的字符?
非字母数字呢?
这必须是一个自定义函数还是也有更通用的解决方案?
当前回答
这个解决方案受到Allen先生的解决方案的启发,需要一个整数的Numbers表(如果您想进行具有良好性能的严肃查询操作,您应该手头有这个表)。它不需要CTE。您可以更改NOT IN(…)表达式以排除特定字符,或将其更改为IN(…)只保留某些字符的OR LIKE表达式。
SELECT (
SELECT SUBSTRING([YourString], N, 1)
FROM dbo.Numbers
WHERE N > 0 AND N <= CONVERT(INT, LEN([YourString]))
AND SUBSTRING([YourString], N, 1) NOT IN ('(',')',',','.')
FOR XML PATH('')
) AS [YourStringTransformed]
FROM ...
其他回答
虽然这篇文章有点老了,但我想说以下几点。 我有上述解决方案的问题是,它没有过滤出字符,如ç, ë, ï等。我调整了一个函数如下(我只使用80 varchar字符串来节省内存):
create FUNCTION dbo.udf_Cleanchars (@InputString varchar(80))
RETURNS varchar(80)
AS
BEGIN
declare @return varchar(80) , @length int , @counter int , @cur_char char(1)
SET @return = ''
SET @length = 0
SET @counter = 1
SET @length = LEN(@InputString)
IF @length > 0
BEGIN WHILE @counter <= @length
BEGIN SET @cur_char = SUBSTRING(@InputString, @counter, 1) IF ((ascii(@cur_char) in (32,44,46)) or (ascii(@cur_char) between 48 and 57) or (ascii(@cur_char) between 65 and 90) or (ascii(@cur_char) between 97 and 122))
BEGIN SET @return = @return + @cur_char END
SET @counter = @counter + 1
END END
RETURN @return END
我把它放在调用PatIndex的两个地方。
PatIndex('%[^A-Za-z0-9]%', @Temp)
为上面的自定义函数RemoveNonAlphaCharacters并重命名为RemoveNonAlphaNumericCharacters
信不信由你,在我的系统中,这个丑陋的函数比G masters的优雅函数表现得更好。
CREATE FUNCTION dbo.RemoveSpecialChar (@s VARCHAR(256))
RETURNS VARCHAR(256)
WITH SCHEMABINDING
BEGIN
IF @s IS NULL
RETURN NULL
DECLARE @s2 VARCHAR(256) = '',
@l INT = LEN(@s),
@p INT = 1
WHILE @p <= @l
BEGIN
DECLARE @c INT
SET @c = ASCII(SUBSTRING(@s, @p, 1))
IF @c BETWEEN 48 AND 57
OR @c BETWEEN 65 AND 90
OR @c BETWEEN 97 AND 122
SET @s2 = @s2 + CHAR(@c)
SET @p = @p + 1
END
IF LEN(@s2) = 0
RETURN NULL
RETURN @s2
使用CTE生成的数字表来检查每个字符,然后FOR XML连接到一个保留值的字符串,您可以…
CREATE FUNCTION [dbo].[PatRemove](
@pattern varchar(50),
@expression varchar(8000)
)
RETURNS varchar(8000)
AS
BEGIN
WITH
d(d) AS (SELECT d FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) digits(d)),
nums(n) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM d d1, d d2, d d3, d d4),
chars(c) AS (SELECT SUBSTRING(@expression, n, 1) FROM nums WHERE n <= LEN(@expression))
SELECT
@expression = (SELECT c AS [text()] FROM chars WHERE c NOT LIKE @pattern FOR XML PATH(''));
RETURN @expression;
END
SQL Server 2017+的另一个可能的选项,没有循环和/或递归,是使用TRANSLATE()和REPLACE()的基于字符串的方法。
t - sql声明:
DECLARE @pattern varchar(52) = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
SELECT
v.[Text],
REPLACE(
TRANSLATE(
v.[Text],
REPLACE(TRANSLATE(v.[Text], @pattern, REPLICATE('a', LEN(@pattern))), 'a', ''),
REPLICATE('0', LEN(REPLACE(TRANSLATE(v.[Text], @pattern, REPLICATE('a', LEN(@pattern))), 'a', '')))
),
'0',
''
) AS AlphabeticCharacters
FROM (VALUES
('abc1234def5678ghi90jkl#@$&'),
('1234567890'),
('JAHDBESBN%*#*@*($E*sd55bn')
) v ([Text])
或作为一个函数:
CREATE FUNCTION dbo.RemoveNonAlphabeticCharacters (@Text varchar(1000))
RETURNS varchar(1000)
AS BEGIN
DECLARE @pattern varchar(52) = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
SET @text = REPLACE(
TRANSLATE(
@Text,
REPLACE(TRANSLATE(@Text, @pattern, REPLICATE('a', LEN(@pattern))), 'a', ''),
REPLICATE('0', LEN(REPLACE(TRANSLATE(@Text, @pattern, REPLICATE('a', LEN(@pattern))), 'a', '')))
),
'0',
''
)
RETURN @Text
END