Headless Browser and scraping – solutions – Stack Overflow
I’m trying to put together a list of possible solutions for automated browser test suites and headless browser platforms capable of scraping.
BROWSER TESTING / SCRAPING:
- SlimerJS – similar to PhantomJS, uses Gecko (Firefox) instead of WebKit
- new PhantomCSS – CSS regression testing. A CasperJS module for automating visual regression testing with PhantomJS and Resemble.js.
- new WebdriverCSS – plugin for Webdriver.io for automating visual regression testing
- new PhantomFlow – Describe and visualize user flows through tests. An experimental approach to Web user interface testing.
- new trifleJS – ports the PhantomJS API to use the Internet Explorer engine.
- new CasperJS IDE (commercial)
- Node-phantom – bridges the gap between PhantomJS and node.js
- WebDriverJs – Selenium WebDriver bindings for node.js by Selenium Team
- WD.js – node module for WebDriver/Selenium 2
- yiewd – WD.js wrapper using latest Harmony generators! Get rid of the callback pyramid with yield
- ZombieJs – Insanely fast, headless full-stack testing using node.js
- NightwatchJs – Node JS based testing solution using Selenium Webdriver
- Chimera – can do everything PhantomJS does, but in a full JS environment
- Webdriver.io – better implementation of WebDriver bindings with 50+ predefined actions
- new Nightmare – PhantomJS bridge with a high-level API. It uses PhantomJS-Node under the hood.
WEB SCRAPING / MINING
- Scrapy – Python, mainly a scraper/miner – fast and well documented; can be linked with Django Dynamic Scraper for nice mining deployments, or with Scrapy Cloud for PaaS (server-less) deployment; works in a terminal or as a stand-alone server process; can be used with Celery; built on top of Twisted
- Snailer – node.js module, untested yet.
- Node-Crawler – node.js module, untested yet.
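For anyone new to the topic, here is a rough sketch of the core step that the scraping tools above automate: pulling structured data (here, links) out of a page's HTML. It uses only the Python standard library and is not how Scrapy or any tool in this list is implemented. The names `LinkExtractor` and `extract_links`, the sample markup, and the `example.com` URLs are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects absolute URLs from every <a href="..."> tag it sees."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url  # used to resolve relative hrefs
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Parse an HTML string and return the links it contains, in order."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical sample page; a real scraper would download this first.
SAMPLE_HTML = """
<html><body>
  <a href="/docs">Docs</a>
  <a href="http://example.org/blog">Blog</a>
  <a>No href here</a>
</body></html>
"""

if __name__ == "__main__":
    for link in extract_links(SAMPLE_HTML, "http://example.com/"):
        print(link)
```

This only sees the HTML as it was delivered. Pages that build their content with JavaScript need one of the headless browsers from the first list to render the page before anything can be extracted.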
RELATED LINKS & RESOURCES
- Any pure Node.js solution or Node.js-to-PhantomJS/CasperJS module that actually works and is documented?
Answer: Chimera seems to go in that direction, check out Chimera
Answer: Check out the list created by rjk with Ruby-based solutions
- Do you know any related tech or solution?
Feel free to re-edit this question and add content as you wish! Thank you for your contributions!
- added SlimerJS to the list
- added Snailer and Node-Crawler and Node-phantom
- added Yiewd WebDriver wrapper
- added WebDriverJs and WD.js
- added Ghost Driver
- added Comparison of web scraping software on Screen Scraper Blog
- added ZombieJs
- added Resemble.js, PhantomCSS and PhantomFlow; categorised and re-edited content
- 04.01.2014, added Chimera, answered 2 questions
- added NightWatchJs
- added DalekJS
- added WebdriverCSS
- added CasperBox
- added trifleJS
- added CasperJS IDE
- added Nightmare