I can bring up some new hands-on ideas that we (@ada, @georgiana-b and I) could probably cover:
- cleaning up data sets with Python, OpenRefine (and others)
- easy data transformations with the open source Pentaho Data Integration apps
I personally would like to go a bit into the system administration side and introduce a FLOSS web administration panel for self-hosted infrastructure (LAMP stack and more), developed with love and care by a dedicated team. Think of it as a "host your own domains" kind of workshop.
Here are some more ideas I could help with:
- how we are building a platform for interactively visualising public procurement data (network graphs and more); I can cover pretty much any aspect of it
- deployment mini-hackathon, to experiment with containers, services, IaaS, CI and more - in the form of lightning talks and workshops
- free and open source tools for team collaboration (@ion can probably cover the journalistic workflow; I can show some pair programming techniques for programmers and sysadmins)
And some more that I can't help much with, but I might know people coming to DH who could:
- identifying entities in data sets (NER) and efficiently detecting languages - this could also get into machine learning
- parsing scanned text documents
I'm not entirely sure how technical we could get. That'd be one of the purposes of this board: to get to know who's interested in what, and adjust our activities based on some real input.
@stefanw, any thoughts? You started this topic; is this the kind of conversation you were aiming for?