小开

Best answer

The problem is that your <a> tag, which has an <i> tag inside it, doesn't have the string attribute you expect. First, let's look at what the text="" argument to find() does.

NOTE: text is the old name for this argument. Since BeautifulSoup 4.4.0 it is called string.

From the docs:

Although string is for finding strings, you can combine it with arguments that find tags: Beautiful Soup will find all tags whose .string matches your value for string. This code finds the tags whose .string is “Elsie”:
soup.find_all("a", string="Elsie")
# [<a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>]

Now let's look at what a Tag's .string attribute is (from the docs again):

If a tag has only one child, and that child is a NavigableString, the child is made available as .string:
title_tag.string
# u'The Dormouse's story'

(...)

If a tag contains more than one thing, then it’s not clear what .string should refer to, so .string is defined to be None:
print(soup.html.string)
# None

This is exactly your case. Your <a> tag contains both a text node and an <i> tag, so its .string is None. find() compares your search value against None, and the match fails.
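To see this for yourself, here is a minimal sketch using the markup from the question. I'm assuming the built-in "html.parser" backend, and the variable names are only for illustration:

```python
from bs4 import BeautifulSoup

# Same shape as the markup in the question: a nested <i> tag plus text.
html = '<a href="/customer-menu/1/accounts/1/update"><i class="fa fa-edit"></i> Edit</a>'
soup = BeautifulSoup(html, "html.parser")
link = soup.find("a")

# The tag has two children (<i> and a text node), so .string is None.
string_value = link.string

# Because .string is None, searching by string finds nothing.
exact_matches = soup.find_all("a", string="Edit")

# The combined text of the tag is still available through get_text().
full_text = link.get_text()

print(string_value, list(exact_matches), repr(full_text))
```

So the text is there, it just isn't reachable through .string once the tag has more than one child.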

How to solve this?

Maybe there is a better solution but I would probably go with something like this:

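import re
from bs4 import BeautifulSoup as BS

soup = BS("""
<a href="/customer-menu/1/accounts/1/update">
<i class="fa fa-edit"></i> Edit
</a>
""", "html.parser")

# Narrow the candidates down by href first.
links = soup.find_all('a', href="/customer-menu/1/accounts/1/update")

thelink = None  # stays None if no link contains "Edit"
for link in links:
    # string= (formerly text=) searches the text nodes inside the tag.
    if link.find(string=re.compile("Edit")):
        thelink = link
        break

print(thelink)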

I think there are not too many links pointing to /customer-menu/1/accounts/1/update so it should be fast enough.

小开

You can pass .find a function that returns True if the tag's text contains "Edit":

In [51]: def Edit_in_text(tag):
....:     return tag.name == 'a' and 'Edit' in tag.text
....:


In [52]: soup.find(Edit_in_text, href="/customer-menu/1/accounts/1/update")
Out[52]:
<a href="/customer-menu/1/accounts/1/update">
<i class="fa fa-edit"></i> Edit
</a>

EDIT:

You can use the .get_text() method instead of the .text attribute in your function; it gives the same result:

def Edit_in_text(tag):
    return tag.name == 'a' and 'Edit' in tag.get_text()
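One caveat with a substring test is that it would also match a link labelled "Edited" or "Editor". If that matters, a stricter variant could compare the stripped text exactly. This is only a sketch: the has_edit_label name and the second "delete" link are made up for the example, and I'm assuming the "html.parser" backend.

```python
from bs4 import BeautifulSoup

# The first link is from the question; the second one is a hypothetical
# sibling added only to show that the filter skips it.
html = """
<a href="/customer-menu/1/accounts/1/update"><i class="fa fa-edit"></i> Edit</a>
<a href="/customer-menu/1/accounts/1/delete"><i class="fa fa-trash"></i> Delete</a>
"""
soup = BeautifulSoup(html, "html.parser")


def has_edit_label(tag):
    # get_text(strip=True) strips whitespace around each text node,
    # so the label can be compared exactly instead of by substring.
    return tag.name == "a" and tag.get_text(strip=True) == "Edit"


edit_link = soup.find(has_edit_label)

# .text is a property that returns the same value as get_text().
same_text = edit_link.text == edit_link.get_text()

print(edit_link["href"], same_text)
```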

小开

In one line, using a lambda:

soup.find(lambda tag:tag.name=="a" and "Edit" in tag.text)

小开

With soupsieve 2.1.0 or later, you can use the :-soup-contains CSS pseudo-class selector to target a node's text. It replaces the deprecated :contains() form.

from bs4 import BeautifulSoup as BS


soup = BS("""
<a href="/customer-menu/1/accounts/1/update">
Edit
</a>
""", "html.parser")
single = soup.select_one('a:-soup-contains("Edit")').text.strip()
multiple = [i.text.strip() for i in soup.select('a:-soup-contains("Edit")')]
print(single, '\n', multiple)
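The same selector should also handle the markup from the original question, where the <a> tag holds an <i> tag next to the text, because :-soup-contains looks at the text of the element including its descendants. A small sketch, assuming soupsieve 2.1.0 or later is installed and using the "html.parser" backend:

```python
from bs4 import BeautifulSoup

# Markup shaped like the question's: nested <i> tag plus text.
html = '<a href="/customer-menu/1/accounts/1/update"><i class="fa fa-edit"></i> Edit</a>'
soup = BeautifulSoup(html, "html.parser")

# Combine an attribute selector with the text selector to narrow the match.
matched = soup.select_one('a[href$="/update"]:-soup-contains("Edit")')

# A label that is not present yields None from select_one.
missing = soup.select_one('a:-soup-contains("Delete")')

print(matched["href"], missing)
```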

小开

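Method 1: Checking the string attribute. This only matches when the <a> tag contains nothing but text. With a nested <i> tag, .string is None, find_all returns an empty list, and the [0] raises IndexError.

    import re
    pattern = re.compile('Edit')  # a regex tolerates surrounding whitespace
    a2 = soup.find_all('a', string=pattern)[0]

Method 2: Using a lambda to test every element. This also works when the tag has nested children.

    a2 = soup.find(lambda tag: tag.name == "a" and "Edit" in tag.text)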

BeautifulSoup - search by text inside a tag

Good Luck