Your mission is to write a bot (some code) to transform a website into open data
This is the site in question: http://siotec.bancaditalia.it/sportelli/jsp/layout/home.jsp?detail=download
This mission is for people with technical skills, specifically coding in Ruby or Python (other languages are in the pipeline).
The website is built on top of the Adobe Flex framework, and all communication is handled via the AMF protocol. There is hope, though: if you look at the detail of an entry, there is an XML export.
The difficulty with this example is inspecting their protocol to scrape the entries we are interested in. Definitely a hard mission, but doable.
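Once you have found the request that returns the XML export, turning it into records is the easier part. Here is a minimal Python sketch of that step using only the standard library. The element names (`entries`, `entry`, `name`, `code`, `city`) and the sample values are invented for illustration, so copy the real ones from an actual export. Printing one JSON object per line is also an assumption about what a bot should emit, so check the Turbot getting started guide for the exact output format.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical export: these tags and values are NOT taken from the real site.
SAMPLE_XML = """<?xml version="1.0" encoding="UTF-8"?>
<entries>
  <entry>
    <name>Banca Esempio S.p.A.</name>
    <code>01234</code>
    <city>Roma</city>
  </entry>
  <entry>
    <name>Cassa Esempio</name>
    <code>05678</code>
    <city>Milano</city>
  </entry>
</entries>
"""


def parse_export(xml_text):
    """Return one dict per <entry>, mapping child tag names to stripped text."""
    # Encode to bytes so the XML declaration's encoding is honoured.
    root = ET.fromstring(xml_text.encode("utf-8"))
    records = []
    for entry in root.iter("entry"):
        record = {child.tag: (child.text or "").strip() for child in entry}
        records.append(record)
    return records


if __name__ == "__main__":
    # Assumed output format: one JSON object per line on stdout.
    for record in parse_export(SAMPLE_XML):
        print(json.dumps(record, ensure_ascii=False))
```

Keeping the parsing in its own function means you can test it against a saved export before wiring it up to live requests.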
Charles Proxy is a good man-in-the-middle proxy for inspecting their protocol, because it can read both AMF and XML.
For Ruby there are several AMF gems (like this serializer/deserializer) that you can use to parse their communication.
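If you work in Python instead, it can help to see what the outer envelope of an AMF packet looks like before reaching for a library. The sketch below reads only that envelope (version, headers, and each message's target URI, response URI and body length). It follows our reading of the packet layout in the AMF0 specification and does not decode the message bodies, which is the job of a real AMF library. The helper names and the sample packet are made up for illustration and are not captured from the site.

```python
import struct

UNKNOWN_LENGTH = 0xFFFFFFFF  # a length of -1 means "unknown"; not handled here


def read_utf8(buf, pos):
    """Read a U16 length-prefixed UTF-8 string; return (text, new_pos)."""
    (length,) = struct.unpack_from(">H", buf, pos)
    pos += 2
    return buf[pos:pos + length].decode("utf-8"), pos + length


def read_length(buf, pos):
    """Read a U32 byte length; return (length, new_pos)."""
    (length,) = struct.unpack_from(">I", buf, pos)
    if length == UNKNOWN_LENGTH:
        raise ValueError("unknown body length: a full AMF decoder is needed")
    return length, pos + 4


def read_envelope(buf):
    """Return (version, [(target_uri, response_uri, body_length), ...])."""
    version, header_count = struct.unpack_from(">HH", buf, 0)
    pos = 4
    for _ in range(header_count):
        _name, pos = read_utf8(buf, pos)
        pos += 1  # skip the must-understand flag
        length, pos = read_length(buf, pos)
        pos += length  # skip the header value
    (message_count,) = struct.unpack_from(">H", buf, pos)
    pos += 2
    messages = []
    for _ in range(message_count):
        target, pos = read_utf8(buf, pos)
        response, pos = read_utf8(buf, pos)
        length, pos = read_length(buf, pos)
        messages.append((target, response, length))
        pos += length  # skip the undecoded body
    return version, messages


def utf8(text):
    """Encode a string with a U16 length prefix (used to build the sample)."""
    data = text.encode("utf-8")
    return struct.pack(">H", len(data)) + data


# Invented packet: version 3, no headers, one message with a 1-byte body.
SAMPLE = (struct.pack(">HH", 3, 0) + struct.pack(">H", 1)
          + utf8("null") + utf8("/1") + struct.pack(">I", 1) + b"\x05")

if __name__ == "__main__":
    print(read_envelope(SAMPLE))
```

Comparing the target and response URIs printed here with what Charles shows for a real request is a quick way to confirm you are replaying the right call.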
Here's how we suggest you go about it:
- Start by clicking 'Accept this mission' on this page. Don't worry, you can always give up if you can't finish it.
- You'll write the scraper using our "Turbot" framework. Head over to the Turbot website and click "Start contributing" to read a getting started guide.
- If you have any questions, whether they are technical or about the data, get in touch and ask!
- When you think you've written a suitable bot, submit it for review using the Turbot command line tool.
- Once we've checked over the data, we'll either tell you what needs to be fixed or accept the bot, which means your mission will be complete!
Still not sure? Don't worry!
Whilst this does require you to be able to code, it's probably not as hard as you think. Take a look at our example bots to get a feel for what's required.