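How do I compare two strings in Python if order doesn't matter?

I have two strings

string1="abc def ghi"

and

string2="def ghi abc"

How can I determine that these two strings are the same, without breaking up the words?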


If you want to know if both the strings are equal, you can simply do

print(string1 == string2)

But if you want to know whether they both contain the same characters, each occurring the same number of times, you can use collections.Counter, like this

>>> string1, string2 = "abc def ghi", "def ghi abc"
>>> from collections import Counter
>>> Counter(string1) == Counter(string2)
True
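
Counting characters also treats "abc def" and "abd cef" as equal, so if the words must stay intact, the same Counter idea can be applied to whole words instead. This is only a minimal sketch, assuming words are separated by whitespace; the helper name same_words is hypothetical:

```python
from collections import Counter


def same_words(a, b):
    """Return True if a and b contain the same words with the same counts."""
    # split() with no argument splits on any run of whitespace.
    return Counter(a.split()) == Counter(b.split())


print(same_words("abc def ghi", "def ghi abc"))  # True
print(same_words("abc def", "abd cef"))          # False
```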

Something like this:

if string1 == string2:
    print('they are the same')

Update: if you want to see which words of one string also appear in the other:

elem1 = string1.split()
elem2 = string2.split()


for item in elem1:
    if item in elem2:
        print(item)

>>> s1="abc def ghi"
>>> s2="def ghi abc"
>>> s1 == s2  # For string comparison
False
>>> sorted(list(s1)) == sorted(list(s2)) # For comparing if they have same characters.
True
>>> sorted(list(s1))
[' ', ' ', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i']
>>> sorted(list(s2))
[' ', ' ', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i']

Try to convert both strings to upper or lower case. Then you can use the == comparison operator.
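
Combined with a word-level comparison, lowercasing gives a check that ignores both case and word order. A small sketch, assuming whitespace-separated words; same_words_ignore_case is a hypothetical helper name:

```python
def same_words_ignore_case(a, b):
    # Lowercase first, then compare the sorted word lists.
    return sorted(a.lower().split()) == sorted(b.lower().split())


print(same_words_ignore_case("ABC def Ghi", "def ghi abc"))  # True
```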

It seems the question is not about string equality but about set equality. You can compare the strings that way by splitting them and converting the words to sets:

s1 = 'abc def ghi'
s2 = 'def ghi abc'
set1 = set(s1.split(' '))
set2 = set(s2.split(' '))
print(set1 == set2)

Result will be

True

Open both files, then compare them by splitting their contents into words:

log_file_A = 'file_A.txt'
log_file_B = 'file_B.txt'


# Read each file fully; the with block closes the file afterwards.
with open(log_file_A, 'r') as f:
    read_A = f.read()
print(read_A)


with open(log_file_B, 'r') as f:
    read_B = f.read()
print(read_B)


# split() with no argument splits on any whitespace, including newlines.
File_A_set = set(read_A.split())
File_B_set = set(read_B.split())
print(File_A_set == File_B_set)

I am going to provide several solutions and you can choose the one that meets your needs:

1) If you only care about the characters, i.e., the same characters with equal frequencies in both strings, regardless of how many spaces each contains, then use:

''.join(sorted(string1)).strip() == ''.join(sorted(string2)).strip()

2) If you are also concerned with the number of spaces (white space characters) in both strings, then simply use the following snippet:

sorted(string1) == sorted(string2)

3) If you care about words but not their order, and want to check that both strings contain the same words with equal frequencies, you can use:

sorted(string1.split()) == sorted(string2.split())

4) Extending the above, if you are not concerned with the frequency count, but just need to make sure that both the strings contain the same set of words, then you can use the following:

set(string1.split()) == set(string2.split())
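
The difference between options 3 and 4 only shows up when a word is repeated. A short illustration, using made-up input strings that are not from the question:

```python
string1 = "abc abc def"
string2 = "def abc"

# Option 3: word frequencies matter, so the extra "abc" makes them differ.
same_counts = sorted(string1.split()) == sorted(string2.split())
print(same_counts)  # False

# Option 4: only the set of distinct words matters.
same_set = set(string1.split()) == set(string2.split())
print(same_set)  # True
```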

Equality by direct comparison:

string1 = "sample"
string2 = "sample"


if string1 == string2:
    print("Strings are equal with text : ", string1, " & ", string2)
else:
    print("Strings are not equal")

Equality of word sets:

string1 = 'abc def ghi'
string2 = 'def ghi abc'


set1 = set(string1.split(' '))
set2 = set(string2.split(' '))


print(set1 == set2)


if string1 == string2:
    print("Strings are equal with text : ", string1, " & ", string2)
else:
    print("Strings are not equal")

I think difflib is a good library to do this job

>>> import difflib
>>> diff = difflib.Differ()
>>> a='he is going home'
>>> b='he is goes home'
>>> list(diff.compare(a,b))
['  h', '  e', '   ', '  i', '  s', '   ', '  g', '  o', '+ e', '+ s', '- i', '- n', '- g', '   ', '  h', '  o', '  m', '  e']
>>> list(diff.compare(a.split(),b.split()))
['  he', '  is', '- going', '+ goes', '  home']

For that, you can use the standard-library difflib module in Python:

from difflib import SequenceMatcher


def similar(a, b):
    return SequenceMatcher(None, a, b).ratio()

Then call similar() as

similar(string1, string2)

It returns a similarity ratio between 0 and 1; compare it against a threshold (ratio >= threshold) to decide whether the strings match.
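
Because SequenceMatcher is order-sensitive, reordered words lower the raw ratio. One possible way to apply the threshold idea while ignoring word order is to sort the words first. This is only a sketch: the is_match helper and the 0.8 threshold are assumptions for illustration, not part of difflib.

```python
from difflib import SequenceMatcher


def similar(a, b):
    return SequenceMatcher(None, a, b).ratio()


def is_match(a, b, threshold=0.8):
    # Hypothetical helper: sort the words so word order does not lower the ratio.
    a_sorted = " ".join(sorted(a.split()))
    b_sorted = " ".join(sorted(b.split()))
    return similar(a_sorted, b_sorted) >= threshold


string1 = "abc def ghi"
string2 = "def ghi abc"
print(similar(string1, string2) < 1.0)  # True: order affects the raw ratio
print(is_match(string1, string2))       # True: identical once words are sorted
```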

If you want a really simple answer:

s_1 = "abc def ghi"
s_2 = "def ghi abc"
flag = 0
# Set the flag if any character of s_1 is missing from s_2.
for i in s_1:
    if i not in s_2:
        flag = 1
if flag == 0:
    print("a == b")
else:
    print("a != b")

This is a pretty basic example, but beyond the logical comparison (==) or string1.lower() == string2.lower(), it can be useful to try some of the basic distance metrics between two strings.

Examples of these and other metrics are easy to find. You can also try the fuzzywuzzy package (https://github.com/seatgeek/fuzzywuzzy).

import Levenshtein
import difflib


print(Levenshtein.ratio('String1', 'String2'))
print(difflib.SequenceMatcher(None, 'String1', 'String2').ratio())

If you just need to check whether the two strings are exactly the same,

text1 = 'apple'


text2 = 'apple'


text1 == text2

The result will be

True

If you need the matching percentage,

import difflib


text1 = 'Since 1958.'


text2 = 'Since 1958'


output = str(int(difflib.SequenceMatcher(None, text1, text2).ratio()*100))

Matching percentage output will be,

'95'

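You can use a simple loop to check whether two strings are equal, but ideally you would just use something like return s1 == s2.

def strings_equal(s1, s2):
    # Strings of different lengths can never be equal.
    if len(s1) != len(s2):
        return False
    # Compare the strings character by character.
    for i in range(len(s1)):
        if s1[i] != s2[i]:
            return False
    return True


s1 = 'hello'
s2 = 'hello'
print(strings_equal(s1, s2))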