How to Split a String on Multiple Delimiters in Python

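In Python, a delimiter is a character or sequence of characters that marks the boundary between parts of a string. Delimiters can be used to break a string into smaller pieces, or to locate and extract specific information from a larger string.

For example, the most common delimiter is whitespace (spaces, tabs, newlines), which separates the words in a sentence. Other characters or character sequences can serve as delimiters too, such as commas, semicolons, hyphens, and colons.

Delimiters are useful when working with text data because they let you split, join, and otherwise manipulate strings in a variety of ways.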
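
As a point of comparison, the built-in str.split() accepts only one separator at a time, which is why handling several delimiters takes extra work. A minimal sketch, with sample strings made up for illustration:

```python
# str.split() with no argument splits on runs of whitespace
words = "red green\tblue\nyellow".split()
print(words)  # ['red', 'green', 'blue', 'yellow']

# str.split(sep) splits on exactly one separator string
fields = "red,green;blue".split(",")
print(fields)  # ['red', 'green;blue']  (the semicolon is left untouched)
```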

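Using re.split()

Example

In this example, we first import the re module so we can use regular expressions. We define a sample string that contains several delimiters (comma, space, semicolon, hyphen), then call re.split() with a regular expression that matches one or more consecutive occurrences of any of them. The character class also contains a literal |, which does not occur in this string. The output is the list of pieces: ['apples', 'oranges', 'bananas', 'grapes'].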

import re
# sample string
text = "apples, oranges; bananas - grapes"
# split the string using multiple delimiters with regular expressions
split_text = re.split(r'[,\s;|-]+', text)
# print the result
print(split_text)

Output

['apples', 'oranges', 'bananas', 'grapes']
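
One edge case worth knowing: when the string begins or ends with a delimiter, re.split() returns an empty string at that end. The sketch below uses a hypothetical input to show the effect and one possible way to drop the empty pieces:

```python
import re

# hypothetical input that starts and ends with delimiters
text = ", apples, oranges; bananas;"

raw = re.split(r'[,\s;|-]+', text)
print(raw)  # ['', 'apples', 'oranges', 'bananas', '']

# drop the empty strings produced at the edges
cleaned = [part for part in raw if part]
print(cleaned)  # ['apples', 'oranges', 'bananas']
```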

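Using split() and replace()

Example

In this example, we first define a sample string. We then use replace() to turn each delimiter into a space, and call split() with no arguments, which splits on runs of whitespace and so yields a list of words. The output is the list of pieces: ['apples', 'oranges', 'bananas', 'grapes'].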

# sample string
text = "apples, oranges; bananas - grapes"
# replace all delimiters with a single delimiter, then split
split_text = text.replace(",", " ").replace(";", " ").replace("-", " ").split()
# print the result
print(split_text)

Output

['apples', 'oranges', 'bananas', 'grapes']
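
Chained replace() calls become unwieldy as the number of delimiters grows. One possible variation, sketched here with an illustrative delimiters list, applies the same replacement in a loop:

```python
text = "apples, oranges; bananas - grapes"
delimiters = [",", ";", "-"]  # illustrative list; extend as needed

# replace every delimiter with a space, one delimiter at a time
normalized = text
for delimiter in delimiters:
    normalized = normalized.replace(delimiter, " ")

# split() with no arguments collapses runs of whitespace
split_text = normalized.split()
print(split_text)  # ['apples', 'oranges', 'bananas', 'grapes']
```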

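Using re.findall()

Example

In this example, we use a regular expression to extract every word from the sample string instead of splitting on the delimiters. re.findall() is called with a pattern that matches one or more word characters (letters, digits, and underscores). The output is the list of words: ['apples', 'oranges', 'bananas', 'grapes']. Note that \w+ treats every non-word character as a boundary, so this approach breaks up words that themselves contain punctuation, such as apostrophes or internal hyphens.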

import re
# sample string
text = "apples, oranges; bananas - grapes"
# extract all words using regular expressions
words = re.findall(r'\w+', text)
# print the result
print(words)

Output

['apples', 'oranges', 'bananas', 'grapes']
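
To illustrate that limitation, the sketch below runs \w+ on a made-up string containing an apostrophe. It then shows one possible alternative: a negated character class that matches runs of anything other than the chosen delimiters. The choice of delimiters in that class is an assumption for this example.

```python
import re

text = "don't stop, well; known"  # made-up sample

# \w+ stops at the apostrophe, so "don't" is broken in two
print(re.findall(r'\w+', text))
# ['don', 't', 'stop', 'well', 'known']

# match runs of characters that are NOT a comma, semicolon, whitespace, or hyphen
tokens = re.findall(r'[^,;\s-]+', text)
print(tokens)
# ["don't", 'stop', 'well', 'known']
```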

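Using re.split() with join()

Example

In this example, we define a sample string and a list of delimiters. We build a regular expression that matches any one of the delimiters by joining them with |, using map() with re.escape to escape any characters that are special in regular expressions. We then split the string with re.split() and that pattern. Because the pattern matches one delimiter at a time, adjacent delimiters such as a comma followed by a space produce empty strings, which filter(None, ...) removes. The output is the list of pieces: ['apples', 'oranges', 'bananas', 'grapes'].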

import re
# sample string
text = "apples, oranges; bananas - grapes"
# define the delimiters
delimiters = [',', ';', '-', ' ']
# create a regular expression pattern with the delimiters
pattern = '|'.join(map(re.escape, delimiters))
# split the string using the pattern and remove empty strings
split_text = list(filter(None, re.split(pattern, text)))
# print the result
print(split_text)

Output

['apples', 'oranges', 'bananas', 'grapes']
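
The same idea can be wrapped in a small helper that also copes with multi-character delimiters. This is a sketch only: split_on is a hypothetical name, and sorting the delimiters longest-first is a design choice made here so that "->" is tried before "-".

```python
import re

def split_on(text, delimiters):
    """Split text on any of the given delimiter strings (hypothetical helper)."""
    # longest first so that "->" is tried before "-"
    ordered = sorted(delimiters, key=len, reverse=True)
    # (?:...)+ consumes a run of adjacent delimiters in one match
    pattern = "(?:" + "|".join(map(re.escape, ordered)) + ")+"
    # drop empty strings left by leading or trailing delimiters
    return [piece for piece in re.split(pattern, text) if piece]

print(split_on("a->b-c, d", ["-", "->", ",", " "]))
# ['a', 'b', 'c', 'd']
```

The helper assumes a non-empty list of non-empty delimiter strings.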

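Using re.split() with lookahead and lookbehind assertions

Example

In this example, we use re.split() with lookbehind and lookahead assertions to split the string at the position just after each delimiter. The regular expression r'(?<=[,;\-\s])(?=[^\s])' matches any position that is preceded by a comma, semicolon, hyphen, or whitespace character and followed by a non-whitespace character. Because the match is zero-width, nothing is consumed and the delimiters stay attached to the pieces: ['cricket, ', 'baseball; ', 'football ', '- ', 'volleyball']. Splitting on an empty match requires Python 3.7 or later.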

import re
# sample string
text = "cricket, baseball; football - volleyball"
# split the string using lookahead and lookbehind assertions
split_text = re.split(r'(?<=[,;\-\s])(?=[^\s])', text)
# print the result
print(split_text)

Output

['cricket, ', 'baseball; ', 'football ', '- ', 'volleyball']
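
If only the words are wanted, the pieces can be cleaned up afterwards. The sketch below shows one possible post-processing step (it assumes Python 3.7 or later): strip the delimiter characters from each piece, then discard whatever ends up empty.

```python
import re

text = "cricket, baseball; football - volleyball"
pieces = re.split(r'(?<=[,;\-\s])(?=[^\s])', text)

# strip delimiter characters from both ends of every piece
words = [piece.strip(" ,;-") for piece in pieces]
# drop pieces that consisted only of delimiters, such as '- '
words = [word for word in words if word]
print(words)  # ['cricket', 'baseball', 'football', 'volleyball']
```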

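Using split() with a regular expression

Example

This example also uses re.split(). We define a sample string and a regular expression that matches one or more consecutive occurrences of any delimiter, then use re.split() with that expression to break the string into a list of words. The output is the list of pieces: ['dogs', 'foxes', 'hyenas', 'jackals'].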

import re
# sample string
text = "dogs, foxes; hyenas, jackals"
# split the string using regular expressions
split_text = re.split(r'[,\s;|-]+', text)
# print the result
print(split_text)
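
When the same delimiter pattern is applied to many strings, it can be compiled once and reused. The following is a sketch under that assumption; DELIMITERS and the sample lines are made up for illustration.

```python
import re

# compile the delimiter pattern once, reuse it for every line
DELIMITERS = re.compile(r'[,\s;|-]+')

lines = ["dogs, foxes; hyenas, jackals", "lions - tigers | bears"]  # made-up data
rows = [DELIMITERS.split(line) for line in lines]
print(rows)
# [['dogs', 'foxes', 'hyenas', 'jackals'], ['lions', 'tigers', 'bears']]
```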