How to find elements by class

I'm having trouble using BeautifulSoup to parse HTML elements that have a "class" attribute. The code looks like this:

soup = BeautifulSoup(sdata)
mydivs = soup.findAll('div')
for div in mydivs: 
    if (div["class"] == "stylelistrow"):
        print div

I get an error on that same line after the script finishes.

File "./beautifulcoding.py", line 130, in getlanguage
  if (div["class"] == "stylelistrow"):
File "/usr/local/lib/python2.6/dist-packages/BeautifulSoup.py", line 599, in __getitem__
   return self._getAttrMap()[key]
KeyError: 'class'

How do I get rid of this error?

Current answer

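This lets me access the class attribute (on BeautifulSoup 4, contrary to what the documentation says). The KeyError arises because a list is returned rather than a dictionary.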

for hit in soup.findAll(name='span'):
    print(hit.contents[1]['class'])
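
To illustrate the point about lists: in BeautifulSoup 4, class is treated as a multi-valued attribute, so indexing a tag with "class" yields a list of class names, while a tag without the attribute still raises KeyError. The following is a minimal sketch, assuming the bs4 package is installed; the sample markup and the variable names are made up for illustration and are not from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, for illustration only.
sample_html = """
<div class="stylelistrow button">first</div>
<div>no class attribute</div>
<div class="other">third</div>
"""

soup = BeautifulSoup(sample_html, "html.parser")

matches = []
for div in soup.find_all("div"):
    # .get() returns the default instead of raising KeyError
    # when the tag has no class attribute.
    classes = div.get("class", [])
    # In BeautifulSoup 4 the value is a list such as ['stylelistrow', 'button'],
    # so test membership rather than equality with a string.
    if "stylelistrow" in classes:
        matches.append(div)

print([div.get_text() for div in matches])  # ['first']
```

Using .get() with a default avoids the need for a try/except around the attribute lookup.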

2014-07-29 07:03:36

Other answers

Alternatively, we can use lxml, which supports XPath and is very fast!

from lxml import html, etree

# Parse the raw HTML string
attr = html.fromstring(html_text)
# XPath expression selecting divs whose class attribute is exactly "stylelistrow"
handles = attr.xpath('//div[@class="stylelistrow"]')

for each in handles:
    # Print each matching element as an HTML string
    print(etree.tostring(each))

2020-04-18 08:03:38

Specifically for BeautifulSoup 3:

soup.findAll('div',
             {'class': lambda x: x 
                       and 'stylelistrow' in x.split()
             }
            )

will find all of these:

<div class="stylelistrow">
<div class="stylelistrow button">
<div class="button stylelistrow">
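
The lambda above can also be read as a standalone predicate. The sketch below restates it as a named function so the matching rule can be checked without parsing any HTML. Note that has_class is a hypothetical helper name, not part of BeautifulSoup, and the sketch assumes the attribute value arrives as a whitespace-separated string, or as None when the attribute is missing.

```python
def has_class(value, wanted="stylelistrow"):
    """Return True if `wanted` is one of the whitespace-separated class names."""
    # `value` may be None (or empty) when the tag has no class attribute,
    # which is why the original lambda guards with `x and ...`.
    return bool(value) and wanted in value.split()


print(has_class("stylelistrow"))         # True
print(has_class("stylelistrow button"))  # True
print(has_class("button stylelistrow"))  # True
print(has_class("stylelistrows"))        # False: whole names only, no substring match
print(has_class(None))                   # False
```

Splitting on whitespace is what lets the rule match multi-class elements while rejecting names that merely contain the target as a substring.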

2014-12-09 21:48:51

Use class_= if you want to find elements without specifying the HTML tag.

For a single element:

soup.find(class_='my-class-name')

For multiple elements:

soup.find_all(class_='my-class-name')

2021-02-16 10:47:43

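Update (2016): in recent versions of BeautifulSoup, the method "findAll" has been renamed to "find_all" (the old name remains available as an alias). See the official documentation.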

So the answer would be

soup.find_all("html_element", class_="your_class_name")
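
As a rough sketch of how this call behaves, assuming the bs4 package is installed and using hypothetical sample markup: class_ is matched against each individual class name, so elements carrying several classes are found as well, and leaving out the tag name widens the search to every tag.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, for illustration only.
sample_html = """
<div class="stylelistrow">a</div>
<div class="stylelistrow button">b</div>
<span class="stylelistrow">c</span>
<div class="button">d</div>
"""

soup = BeautifulSoup(sample_html, "html.parser")

# Restrict the search to <div> tags that have the class.
divs = soup.find_all("div", class_="stylelistrow")
# Omit the tag name to match any element that has the class.
any_tag = soup.find_all(class_="stylelistrow")

print([t.get_text() for t in divs])     # ['a', 'b']
print([t.get_text() for t in any_tag])  # ['a', 'b', 'c']
```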

2016-07-20 02:26:21

This worked for me:

for div in mydivs:
    try:
        clazz = div["class"]
    except KeyError:
        clazz = ""
    if clazz == "stylelistrow":
        print(div)

2013-06-11 02:10:29
