Love them or hate them, the top Python scraping libraries have some hidden gems and tricks that you can use to enhance, update and diversify your Django models. This talk covers more advanced techniques for aggregating content from RSS feeds, Twitter, Tumblr and plain old websites for your Django projects.
- lxml fu: etree vs html
- lxml faves: iterlinks, prev/next, strip_tags, linepos
- incorporating XPath
- building your XML views/templates with lxml (this bullet is optional: there may not be time, but I would love to hear whether folks would find it useful)
- learning how to build a good JSON API handler: what you can learn from some amazing API handlers when you have to build your own
- feedparser, HTMLParser, re: the quick & dirty ways to parse when lxml isn't fast enough
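
As a taste of the "quick & dirty" end of that list, here is a minimal sketch of link extraction using only the standard library's HTMLParser (imported from `html.parser` on Python 3). The `LinkCollector` class, the `extract_links` helper, and the sample markup and URLs are all hypothetical names made up for illustration; they are not taken from the talk itself.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect (absolute_url, link_text) pairs from <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page URL.
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open link.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html, base_url):
    """Hypothetical helper: parse an HTML string and return its links."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


# Made-up sample markup, standing in for a fetched page.
SAMPLE = (
    '<p>Read <a href="/posts/1">the <b>first</b> post</a> '
    'or <a href="http://example.org/feed">the feed</a>.</p>'
)
links = extract_links(SAMPLE, "http://example.com/")
```

Each resulting pair could then be stored on a Django model field or passed to a fetcher. For the same job, lxml's `iterlinks` requires far less code, which is part of the trade-off the talk explores.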