Channel: Universal Feed Parser issue - Stack Overflow

Answer by Hai Vu for Universal Feed Parser issue


I found out that the problem is in how the two feeds declare their namespaces.

For FreeBSD's RSS feed:

    <rss xmlns:atom="http://www.w3.org/2005/Atom" xmlns="http://www.w3.org/1999/xhtml" version="2.0">

For Ubuntu's feed:

    <rss xmlns:atom="http://www.w3.org/2005/Atom" version="2.0">

When I remove the extra default namespace declaration (xmlns="http://www.w3.org/1999/xhtml") from FreeBSD's feed, everything works as expected.
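To illustrate what that extra declaration changes, here is a small sketch using the standard library's ElementTree on Python 3 (not feedparser itself, so it only hints at the cause). The two strings are cut-down stand-ins for the real feeds, and root_tag() is a hypothetical helper name:

```python
import xml.etree.ElementTree as ET

# Cut-down stand-ins for the two feeds; only the <rss> attributes matter here.
freebsd_style = (
    '<rss xmlns:atom="http://www.w3.org/2005/Atom" '
    'xmlns="http://www.w3.org/1999/xhtml" version="2.0"><channel/></rss>'
)
ubuntu_style = (
    '<rss xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel/></rss>'
)


def root_tag(xml_text):
    """Return the root element's name as a namespace-aware parser sees it."""
    return ET.fromstring(xml_text).tag


print(root_tag(freebsd_style))  # {http://www.w3.org/1999/xhtml}rss
print(root_tag(ubuntu_style))   # rss
```

With the default namespace in place, every unprefixed element (rss, channel, item, and so on) is placed in the XHTML namespace, so a namespace-aware parser no longer sees plain RSS element names.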

So what does it mean for you? I can think of a couple of different approaches:

  1. Use something else, such as BeautifulSoup. I tried it and it seems to work.
  2. Download the whole RSS feed, apply some search/replace to fix up the namespaces, then use feedparser.parse() afterward. This approach is a big hack; I would not use it myself.
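For completeness, the second approach could look roughly like the sketch below (Python 3). It assumes the offending declaration is exactly the default XHTML namespace seen in FreeBSD's feed; strip_xhtml_default_namespace() is a hypothetical helper name, and the sample string is just the opening tag quoted above:

```python
import re

# Assumption: the only thing to remove is the default XHTML namespace
# declaration; other feeds may need a different pattern.
_XHTML_DEFAULT_NS = re.compile(r'\s+xmlns="http://www\.w3\.org/1999/xhtml"')


def strip_xhtml_default_namespace(xml_text):
    """Remove the first default XHTML namespace declaration from the text."""
    return _XHTML_DEFAULT_NS.sub("", xml_text, count=1)


sample = (
    '<rss xmlns:atom="http://www.w3.org/2005/Atom" '
    'xmlns="http://www.w3.org/1999/xhtml" version="2.0">'
)
cleaned = strip_xhtml_default_namespace(sample)
print(cleaned)  # <rss xmlns:atom="http://www.w3.org/2005/Atom" version="2.0">

# The cleaned feed text could then be handed to feedparser, which also
# accepts a string:
#     import feedparser
#     feed = feedparser.parse(cleaned_feed_text)
```

The pattern requires xmlns to be followed directly by the equals sign, so the xmlns:atom declaration is left alone. It is still a text-level hack and will break if the feed changes its quoting or the namespace URI.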

Update

Here is sample code for rss_get_items(), a generator that yields the items from an RSS feed. Each item is a dictionary with some standard keys such as title, pubdate, link, and guid.

    # Python 2 code (urllib2 and the print statement).
    from bs4 import BeautifulSoup
    import urllib2

    def rss_get_items(url):
        # Download the feed and let BeautifulSoup parse it leniently.
        request = urllib2.Request(url)
        response = urllib2.urlopen(request)
        soup = BeautifulSoup(response)

        for item_node in soup.find_all('item'):
            # Collect each child element of <item> as key -> text.
            item = {}
            for subitem_node in item_node.findChildren():
                key = subitem_node.name
                value = subitem_node.text
                item[key] = value
            yield item

    if __name__ == '__main__':
        url = 'http://www.freebsd.org/security/rss.xml'
        for item in rss_get_items(url):
            print item['title']
            print item['pubdate']
            print item['link']
            print item['guid']
            print '---'

Output:

    FreeBSD-SA-14:04.bind
    Tue, 14 Jan 2014 00:00:00 PST
    http://security.FreeBSD.org/advisories/FreeBSD-SA-14:04.bind.asc
    http://security.FreeBSD.org/advisories/FreeBSD-SA-14:04.bind.asc
    ---
    FreeBSD-SA-14:03.openssl
    Tue, 14 Jan 2014 00:00:00 PST
    http://security.FreeBSD.org/advisories/FreeBSD-SA-14:03.openssl.asc
    http://security.FreeBSD.org/advisories/FreeBSD-SA-14:03.openssl.asc
    ---
    ...

Notes:

  • I omitted error checking for the sake of brevity.
  • I recommend using the BeautifulSoup approach only when feedparser fails, because feedparser is the right tool for the job. Hopefully, it will be updated to be more forgiving in the future.
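As a rough idea of what the omitted error checking might look like, here is a Python 3 sketch that swaps in the standard library's ElementTree for BeautifulSoup and works on feed text that has already been downloaded. The helper names local_name() and rss_items_from_text() are hypothetical, and the sample is a trimmed stand-in built from the output above. Unlike the BeautifulSoup version, it only looks at direct children of each item and keeps the original tag case, so the key is pubDate rather than pubdate.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop a leading '{namespace}' part from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def rss_items_from_text(xml_text):
    """Yield one dict per <item>, ignoring whatever namespace it ends up in."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # Report malformed feeds with a clearer message than a raw parser error.
        raise ValueError("feed is not well-formed XML: %s" % exc)
    for node in root.iter():
        if local_name(node.tag) != "item":
            continue
        yield {local_name(child.tag): (child.text or "").strip() for child in node}


# Trimmed stand-in for the FreeBSD feed, including the extra default namespace.
sample = """<rss xmlns:atom="http://www.w3.org/2005/Atom"
     xmlns="http://www.w3.org/1999/xhtml" version="2.0">
  <channel>
    <item>
      <title>FreeBSD-SA-14:04.bind</title>
      <pubDate>Tue, 14 Jan 2014 00:00:00 PST</pubDate>
    </item>
  </channel>
</rss>"""

for item in rss_items_from_text(sample):
    print(item["title"], "|", item["pubDate"])
```

Because rss_items_from_text() is a generator, the ValueError for a malformed feed is raised when iteration starts, not when the function is called.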
