Translating as Community Work


This week, I am doing some serious community volunteering in the production of the Arduino Reference in Spanish. In the early days of the Arduino project, back in 2006, and thanks to the work of mostly three people, we translated the Arduino page as it existed then into Spanish. Since then the system, thanks to the contributions of hundreds of users around the globe, has evolved through an almost endless series of iterations into what it is today.

How we get and use people’s feedback in Arduino

Both the IDE (Integrated Development Environment) and the website reflect the effort of a lot of users (Arduino’s forum has almost 25,000 registered users, and there are more than 150,000 boards out there) who have been trying the system, making projects with it, and reporting back to the developers what they felt should be improved. The development team, made up of 376 registered members (as of 2010-05-31), actively decides what is worth implementing. Those issues are listed, and the core team (5 people) implements them for the next version of the system.

With such a dynamic system, translations are always left behind. I feel that one of the most complex aspects of actively developing any open source project or product is maintaining meaningful documentation in whatever languages the users need. In March 2010 we arranged a community meeting at NYU, at ITP’s facilities. We invited a select group of people based on the relevance of their contributions to the project. It is not possible to bring all 376 developers together in one place, simply because Arduino does not have that kind of budget, not because we wouldn’t love to.

Something that came out of that meeting was the need to create better translations of the system. This put on my agenda the need for a method to produce the translations in the best way possible. I came up with a series of alternatives, which I describe below.

The semi-automatic system

My first idea was to use an online service like Google Translate to create a first version of the translations and then let the community evaluate their accuracy, changing only the pages that didn’t make any sense at all. This would, for example, allow very quick translation into as many languages as we liked (as long as the translation tool supported them).

This option was rejected by a majority of the people who volunteered to help out with the translations. Without any further study, they decided that correcting whatever the system got wrong is much more time-consuming than translating from scratch. My personal perception is the opposite. However, this is just an opinion, and as a community moderator I need to follow the rules established by the community if I want them to collaborate in making this happen.

The reduced translation

One discussion we have had within the team for quite some time is that we feel we should put some serious effort into getting the most basic parts of our documentation translated into a series of languages, such as Chinese, Spanish, German, French, Italian, and Portuguese, that reach the majority of our audience. This minimal part of the documentation should include:

  • the Reference: where you find the most important commands and explanations on how to program
  • the Extended Reference: where to get information about specialized commands
  • the Libraries Section: with explanations on how to use community-contributed blocks of code
  • the Quick Start Guide (also known as “Getting Started”)

This approach would involve reserving some kind of funding to have the translation done professionally at regular intervals. However, I see a lot of issues with this. It raises the question of who does what and when. Doing things this way means that we first need to make sure the website is completely stable, then hire someone to translate and prepare a brief of the things that need to be translated, and every time there is a small change we need to pay again. I think community-driven translation is a much more powerful option.

The community translation

The options presented above bring us to the conclusion that, for a project like Arduino, a community-driven translation is the way to go. In any case, the community needs to be properly supported if this goal is to be achieved. The tools of choice for us are a wiki with a community-shared password, an email list, and a limited time frame of one week. I am experimenting to see whether it is possible to translate over 400 documents in that time, including the creation of a style guide and a glossary of terms that will help others translate this website into other languages.

We use PmWiki in Arduino, a well-known wiki system that keeps all its information in files rather than in a database. To achieve respectable speeds when serving pages, we installed an HTML cache system. There are other tools that allow, for example, creating a list of all the pages linked in the site. That is the tool we are using for the translations: I generated a list of all the pages our spider could find, published the list, and let people self-assign the tasks and document the errors. Beyond that, I keep my email open as much as I can to answer people’s questions on how to proceed with the translation.
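To give an idea of what such a spider does, here is a minimal sketch in Python. It is not the tool we actually run: the page names, the in-memory SITE dictionary, and the task-line format are all made up for illustration, and a real spider would fetch the pages from the wiki instead of reading them from a dictionary.

```python
from collections import deque
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(pages, start):
    """Breadth-first walk over `pages` (name -> HTML), following internal links.

    Returns the reachable page names in the order they were discovered.
    """
    seen = [start]
    queue = deque([start])
    while queue:
        current = queue.popleft()
        parser = LinkCollector()
        parser.feed(pages.get(current, ""))
        for link in parser.links:
            # Only follow links that point to pages of this site.
            if link in pages and link not in seen:
                seen.append(link)
                queue.append(link)
    return seen


def task_list(page_names):
    """One line per page, with an empty slot for the volunteer's name."""
    return [f"{name} | translator: | status: todo" for name in page_names]


# Hypothetical miniature site, standing in for the real wiki pages.
SITE = {
    "Reference/HomePage": '<a href="Reference/Setup">setup</a> '
                          '<a href="Reference/Loop">loop</a>',
    "Reference/Setup": '<a href="Reference/HomePage">back</a>',
    "Reference/Loop": '<a href="Reference/PinMode">pinMode</a>',
    "Reference/PinMode": "",
    "Reference/Orphan": "",  # never linked, so the spider cannot find it
}

found = crawl(SITE, "Reference/HomePage")
for line in task_list(found):
    print(line)
```

Note that a page nobody links to (the "Orphan" page in this made-up site) never shows up in the list, which is one way an automatically generated index can end up incomplete.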

The first hours

After 3 hours we had translated 5% of the site. Our tools aren’t perfect: some of the links were missing from our automatically generated index, and some of the translators found out pretty early in the process that pages were missing. I keep monitoring the process; we have four days to go.
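One simple way to catch this kind of gap is to compare the index against the links translators actually come across while working. The following Python sketch shows the idea under stated assumptions: the index entries and the reported links are invented examples, and the function name is my own, not part of any tool we use.

```python
def missing_from_index(index, links_by_page):
    """Return the link targets that do not appear in the index, sorted."""
    indexed = set(index)
    referenced = set()
    for links in links_by_page.values():
        referenced.update(links)
    return sorted(referenced - indexed)


# Hypothetical data: the generated index and the links seen in translated pages.
INDEX = ["Reference/HomePage", "Reference/Setup", "Reference/Loop"]
LINKS_SEEN = {
    "Reference/HomePage": ["Reference/Setup", "Reference/Loop"],
    "Reference/Loop": ["Reference/PinMode", "Reference/Setup"],
}

missing = missing_from_index(INDEX, LINKS_SEEN)
print(missing)
```

Every name this reports is a page that some translated page links to but that nobody could self-assign, so it can be appended to the published list.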