我如何找到两个子字符串('123STRINGabc' -> '字符串')之间的字符串?

我现在的方法是这样的:

>>> start = 'asdf=5;'
>>> end = '123jasd'
>>> s = 'asdf=5;iwantthis123jasd'
>>> print((s.split(start))[1].split(end)[0])
iwantthis

然而,这似乎非常低效且不符合python规则。有什么更好的方法来做这样的事情吗?

忘了说: 字符串可能不是以start和end开始和结束的。他们可能会有更多的字符之前和之后。


当前回答

source='your token _here0@df and maybe _here1@df or maybe _here2@df'
start_sep='_'
end_sep='@df'
result=[]
tmp=source.split(start_sep)
for par in tmp:
  if end_sep in par:
    result.append(par.split(end_sep)[0])

print result

必须显示: 这里0,这里1,这里2

regex更好,但它需要额外的库,你可能只想使用python

其他回答

from timeit import timeit
from re import search, DOTALL


def partition_find(string, start, end):
    return string.partition(start)[2].rpartition(end)[0]


def re_find(string, start, end):
    # applying re.escape to start and end would be safer
    return search(start + '(.*)' + end, string, DOTALL).group(1)


def index_find(string, start, end):
    return string[string.find(start) + len(start):string.rfind(end)]


# The wikitext of "Alan Turing law" article form English Wikipeida
# https://en.wikipedia.org/w/index.php?title=Alan_Turing_law&action=edit&oldid=763725886
string = """..."""
start = '==Proposals=='
end = '==Rival bills=='

assert index_find(string, start, end) \
       == partition_find(string, start, end) \
       == re_find(string, start, end)

print('index_find', timeit(
    'index_find(string, start, end)',
    globals=globals(),
    number=100_000,
))

print('partition_find', timeit(
    'partition_find(string, start, end)',
    globals=globals(),
    number=100_000,
))

print('re_find', timeit(
    're_find(string, start, end)',
    globals=globals(),
    number=100_000,
))

结果:

index_find 0.35047444528454114
partition_find 0.5327825636197754
re_find 7.552149639286381

在这个例子中,Re_find几乎比index_find慢20倍。

这本质上是cji的答案——7月30日10日5:58。 我更改了try except结构,以便更清楚地说明导致异常的原因。

def find_between( inputStr, firstSubstr, lastSubstr ):
'''
find between firstSubstr and lastSubstr in inputStr  STARTING FROM THE LEFT
    http://stackoverflow.com/questions/3368969/find-string-between-two-substrings
        above also has a func that does this FROM THE RIGHT   
'''
start, end = (-1,-1)
try:
    start = inputStr.index( firstSubstr ) + len( firstSubstr )
except ValueError:
    print '    ValueError: ',
    print "firstSubstr=%s  -  "%( firstSubstr ), 
    print sys.exc_info()[1]

try:
    end = inputStr.index( lastSubstr, start )       
except ValueError:
    print '    ValueError: ',
    print "lastSubstr=%s  -  "%( lastSubstr ), 
    print sys.exc_info()[1]

return inputStr[start:end]    
source='your token _here0@df and maybe _here1@df or maybe _here2@df'
start_sep='_'
end_sep='@df'
result=[]
tmp=source.split(start_sep)
for par in tmp:
  if end_sep in par:
    result.append(par.split(end_sep)[0])

print result

必须显示: 这里0,这里1,这里2

regex更好,但它需要额外的库,你可能只想使用python

s[len(start):-len(end)]

要提取STRING,请尝试:

myString = '123STRINGabc'
startString = '123'
endString = 'abc'

mySubString=myString[myString.find(startString)+len(startString):myString.find(endString)]