Your mission is to write a bot (some code) to transform a website into open data.
This is the site in question: http://www.cnbs.gob.hn/index.php/sistema-financiero/manuales-contables
This mission is for people with technical skills, specifically coding in Ruby or Python (other languages are in the pipeline).
The site contains links to PDF files and to ZIP archives of PDF files. This is a rather difficult mission for two reasons. First, choosing the right PDF files requires a lot of domain knowledge: you should be able to read and understand Spanish, or know the subject matter of the documents. Second, the PDF files hold textual data that needs to be parsed correctly; that is doable, with considerable effort, once the right documents have been chosen.
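As a starting point, here is a minimal Python sketch of the first step: finding the PDF and ZIP links on the page and listing the PDF files inside a ZIP archive. It uses only the standard library and is not part of the Turbot framework. The names `LinkCollector`, `find_document_links` and `pdf_names_in_zip` are made up for illustration, and the sketch assumes the documents are linked through ordinary `<a href>` tags whose paths end in `.pdf` or `.zip`, which you would need to confirm against the real page.

```python
import io
import zipfile
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

PAGE_URL = "http://www.cnbs.gob.hn/index.php/sistema-financiero/manuales-contables"


class LinkCollector(HTMLParser):
    """Collect absolute URLs of <a href> links whose path ends in .pdf or .zip."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the page URL.
        url = urljoin(self.base_url, href)
        # Assumption: documents are identified by their file extension.
        if urlparse(url).path.lower().endswith((".pdf", ".zip")):
            self.links.append(url)


def find_document_links(html, base_url=PAGE_URL):
    """Return the PDF and ZIP links found in an HTML string, in page order."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


def pdf_names_in_zip(zip_bytes):
    """Return the names of the PDF members of a ZIP archive given as bytes."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return [name for name in archive.namelist() if name.lower().endswith(".pdf")]
```

Both functions work on data you have already downloaded (an HTML string and the raw bytes of a ZIP file), so you can try them on saved copies before adding any network code. Choosing which documents matter and extracting their text still needs the domain knowledge described above.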
Here's how we suggest you go about it:
- Start by clicking 'Accept this mission' on this page. Don't worry, you can always give up if you can't finish it.
- You'll write the scraper using our "Turbot" framework. Head over to the Turbot website and click "Start contributing" to read a getting started guide.
- If you have any questions, whether they are technical or about the data, get in touch and ask!
- When you think you've written a suitable bot, submit it for review using the Turbot command line tool.
- Once we've checked over the data, we'll either tell you what needs to be fixed or accept the bot, which means your mission is complete!
Still not sure? Don't worry!
Whilst this does require you to be able to code, it's probably not as hard as you think. Take a look at our example bots to get a feel for what's required.