platform specification Vs. 0.2
perldoc 2.0 – Platform specification
| Version | 0.2 |
| Date | November 30, 2006 |
| Author | Jørgen W. Lang |
| Email | perldoc2@joergen-lang.com |
| Website | http://perldoc2.joergen-lang.com |
Note
This is version 0.2 of the perldoc 2.0 platform specification.
Although most of the general concepts are expected to stay, details in implementation, etc. might change.
Thanks to everybody who helped directly or indirectly in the creation of the project and this specification.
You know who you are!
Joergen W. Lang
Feedback
Your feedback, comments, ideas are needed!
If you have anything to say that might be useful for this project, please do not hesitate to contact me via the email address above.
You might also consider joining the developers’ mailing list at:
https://lists.sourceforge.net/lists/listinfo/perldoc2-developers
ABSTRACT
This document describes the components of the perldoc 2.0 translation platform and repository and the workflow between its parts.
GENERAL CONCEPTS
Goals
The final goal is to provide complete translations of
- the core documentation
- the documentation of the core modules
Once finished, the translations could be made available via language-specific subdomains of perldoc.perl.org, such as fr.perldoc.perl.org, and as part of the actual perl distribution, e.g. in the ./pod/ directory.
In the meantime translated documents could and should be available via the platform website. This way they can be used and reviewed as soon as single documents are finished.
The main focus of the platform is the translation of the documentation for the programming language Perl into other natural languages. Since Perl6 is already in the making, the platform should be ready for this.
Although it is neither the primary goal nor a prerequisite, the platform might support the translation of documentation for other projects/programming languages in the future.
Audience
- translators (translating the docs)
- end-users (looking up documentation in their languages)
- developers (who can help improve the platform)
The term end-users is used to distinguish developers, (P|p)erl hackers, etc. who read the documentation from those actively involved in the creation, maintenance and improvement of the platform itself (although these will sometimes be the same people).
Multilingual
Since this is a translation platform, the contents and interfaces of the website should be available in as many natural languages as possible.
Framework
We should not reinvent the wheel. There already is a good choice of web application frameworks out there.
Choosing one written in Perl might help us – and Perl at the same time. ;o)
Adoption
The adoption method could be an essential part of the project. By assigning a complete document to one physical person, this person is encouraged to make the translation a personal effort instead of feeling like an anonymous gear within the big translation machine.
This method does not exclude the possibility of splitting up one document between multiple individuals.
Maybe the platform should have support for this. Using po4a might help a lot.
[added in Vs. 0.2]
The adoption concept is also used to assign responsible persons to one or more languages.
These people have a role similar to an overall editor in a publishing company. Needed skills are:
- good to very good knowledge of Perl
- good to very good knowledge of the natural language they’re adopting
- the usual team-leading/compromising, etc. soft skills
Language adopters will have to decide if the quality of a given translation is good enough to be released.
They will also co-ordinate the work of a specific language team.
Quality
To ensure the best possible quality of translations, a certain set of guidelines should be followed.
A common glossary of terms should be used for each language.
COMPONENTS
The following key components are needed:
Repository
The repository is the storage area for documents to be translated. This could be an SVN repository, a database, a directory structure or whatever.
For the moment we use an SVN repository on sourceforge to collect and store the documents to be translated and the already existing translations. This might change during the development of the project as it might be more practical to store the documents within the database.
Database
The database is used to store information about the documents to be translated like the perl version they are based on, their translation status, timestamps, and other meta information.
The database will also keep track of ‘available’ languages. This could mean a general table of languages that are spoken on the planet today.
(The DB could also be used to store the documents themselves.)
[added in Vs. 0.2:]
The way documents should be stored is currently under discussion. The storage of the actual documents and their translations in a database appears to be more flexible and extensible but lacks versioning unless it was reimplemented within the DB. Using a versioning system like SVN (already there) means another layer to take care of.
[added in Vs. 0.2:]
Access control
An extensible role-based access control system will be used to control access to several parts of the application.
Possible roles might include:
- Administrators (should be clear)
- Language adopters
  - can “administrate” one language
  - are the primary contacts for language global issues
  - can influence all documents of one language
- Document adopters
  - should be per language
  - can influence one document in a specific language
- Core Translators
  - can check-out, check-in, etc.
  - these are the “trusted” translators in the project
  - can approve translations
- Wingman/Reviewer
  - a core translator assigned to another one for the purpose of QA
- Translators
  - cannot approve
  - can only check out a limited number of document “parts”
  - can be promoted to core translators if found trustworthy enough
- Users
  - can create accounts (so they can keep public/private notes, store bookmarks, glossaries…)
The administrator role would be the only language-independent role.
An adopter role can only be carried by a core translator.
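The roles and the two rules above could be sketched as follows. This is only an illustration under stated assumptions: the permission names, the `Account` class and the per-language role mapping are hypothetical and are not part of the specification.

```python
from dataclasses import dataclass, field
from enum import Enum


class Role(Enum):
    ADMINISTRATOR = "administrator"
    LANGUAGE_ADOPTER = "language_adopter"
    DOCUMENT_ADOPTER = "document_adopter"
    CORE_TRANSLATOR = "core_translator"
    REVIEWER = "reviewer"  # the "wingman"
    TRANSLATOR = "translator"
    USER = "user"


# Hypothetical permission names; the specification describes them only in prose.
PERMISSIONS = {
    Role.ADMINISTRATOR: {"administrate", "approve", "check_out", "check_in"},
    Role.LANGUAGE_ADOPTER: {"administrate_language", "approve", "check_out", "check_in"},
    Role.DOCUMENT_ADOPTER: {"approve", "check_out", "check_in"},
    Role.CORE_TRANSLATOR: {"approve", "check_out", "check_in"},
    Role.REVIEWER: {"approve", "check_out", "check_in"},
    Role.TRANSLATOR: {"check_out", "check_in"},  # cannot approve
    Role.USER: {"keep_notes"},
}

ADOPTER_ROLES = {Role.LANGUAGE_ADOPTER, Role.DOCUMENT_ADOPTER}


@dataclass
class Account:
    name: str
    # Maps a language code to the roles held for it.
    # The key None stands for the language-independent scope.
    roles: dict = field(default_factory=dict)

    def grant(self, role, language=None):
        if role is Role.ADMINISTRATOR:
            language = None  # administrators are language-independent
        elif language is None:
            raise ValueError("only administrators are language-independent")
        held = self.roles.setdefault(language, set())
        if role in ADOPTER_ROLES and Role.CORE_TRANSLATOR not in held:
            raise PermissionError("an adopter must already be a core translator")
        held.add(role)

    def can(self, permission, language):
        roles = self.roles.get(None, set()) | self.roles.get(language, set())
        return any(permission in PERMISSIONS[role] for role in roles)
```

Keeping roles per language means the same person can be a core translator for one language and a plain translator for another.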
Interface(s)
The platform might have several different interfaces.
One for translators, one for end users and one for administrators.
The primary interface is web-based.
Access to documents and information about them will be done via a website.
The interface(s) provide the following key features:
[several points added in Vs. 0.2:]
- create and manage user accounts
- login/logout
- overview of documents available for translation
- translation status of these documents (see ‘document specific status’ for details)
- check-out of ‘vacant’ documents
- check-in of translated documents (marks document as ‘pending’)
- show various forms of statistics about translation status, available documents, etc.
- check-out for review of translated documents
- checking in after review to mark documents as ‘finished’
- maintain database/repository
- submit errata
  These can have one of the following states:
  - new
  - rejected (with reason)
  - acknowledged (or open)
  - closed (with comment)
- submit comments
  Here, the following states are possible:
  - unread
  - acknowledged (with response)
A publicly viewable list of comments/errata per document and per language would also help to prevent duplicate issues.
The platform could also provide secondary interfaces in the form of web services, maybe for communication with an editor program installed on a translator’s local machine.
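The erratum states listed under the key features could be modeled roughly like this. The states and the required reason or comment come from the list; the allowed transitions between states and the `Erratum` class are assumptions made only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class ErratumState(Enum):
    NEW = "new"
    REJECTED = "rejected"          # needs a reason
    ACKNOWLEDGED = "acknowledged"  # also called "open"
    CLOSED = "closed"              # needs a comment


# Assumed transitions; the specification only lists the states.
ERRATUM_TRANSITIONS = {
    ErratumState.NEW: {ErratumState.REJECTED, ErratumState.ACKNOWLEDGED},
    ErratumState.ACKNOWLEDGED: {ErratumState.CLOSED},
    ErratumState.REJECTED: set(),
    ErratumState.CLOSED: set(),
}

NOTE_REQUIRED = {ErratumState.REJECTED, ErratumState.CLOSED}


@dataclass
class Erratum:
    document: str
    language: str
    text: str
    state: ErratumState = ErratumState.NEW
    note: str = ""  # the reason (rejected) or comment (closed)

    def move_to(self, new_state, note=""):
        if new_state not in ERRATUM_TRANSITIONS[self.state]:
            raise ValueError(f"cannot go from {self.state.value} to {new_state.value}")
        if new_state in NOTE_REQUIRED and not note:
            raise ValueError(f"state {new_state.value} needs a reason or comment")
        self.state = new_state
        self.note = note
```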
[subtle changes in Vs. 0.2:]
People
Although most of the processing is probably done more or less automatically, there are certain parts of the workflow that involve review and steering by ‘real people’.
The obvious is the actual translation itself.
This needs – well – translators.
Additionally some editorial staff might be needed to review and correct the translation.
Sometimes decisions have to be made whether a certain word should be translated one way or the other or not at all.
This might need one or more ‘referees’ of some kind. This might become part of a “language adopter’s” job.
It also takes real people to review feedback and submitted errata.
To support the interaction between people, forums and other means of communication should be available.
After all, the whole platform is powered by mutual help.
Design
Usage of the platform should be
- simple
- fun
- cool
Additional features
Tools/Helpers/Guidelines
- lists of available translation tools
- like editors with .po mode
- download links
- scripts and programs to ease translators’ lives
RSS feeds for:
- statistics
- news
- documents with newly translated parts
- documents with newly approved parts
- documents with newly abandoned parts
- comments/errata
Resources
- glossaries
- dictionaries
Mutual help
- Forums
- IRC-Channel(s)
- Mailing lists
  - developers (English)
  - language-teams (in their native language)
  - language-adopters (English)
- (These features can be run from anywhere (freenode…) and linked into the site. They do not have to be part of the actual application.)
Multilevel adoptions
- one key person is the adopter for one document
- the document could then be shared among multiple translators who take care of various parts of the doc.
Multiple formats
- RSS
- (X)HTML
- POD
- …
- Download the whole documentation for one language as one ‘book’.
Sponsorships
A possibility for individuals or companies to sponsor the translation of one or more particular documents.
These documents could have a special marker that identifies the sponsor to the reader.
Workflow
From the translator’s point of view:
- (register/create account)
- check for available (‘vacant’) documents/languages
- login
- pick a document to adopt
- check-out the document (marks the document as ‘adopted’)
- translate the document
- check-in the document (marks the document as ‘updated’)
- get the document reviewed
From a reviewer’s point of view:
- (register/create account)
- check for ‘updated’ documents
- login
- pick a document to review
- check-out the document
- review the document (remove ‘fuzzy’ markers)
- check-in the document (a document with no remaining fuzzy markers is marked as ‘finished’)
From a user’s point of view:
- check for documents/languages
- read the document in maybe one of several available formats
- leave feedback/errata
DETAILS
Database
The database stores the following information:
- translation projects and their status (as we expect at least one more project in the future)
- registered translators (and some details about them)
For a project:
- list of ‘supported’ documents (plus the necessary details)
- maybe the documents themselves
- which languages the document was or is being translated to
For a document:
- status
- meta information
[added in Vs. 0.2:]
For parts of a document:
- status
- meta information
What else?
Furthermore, this or another database will very likely contain everything that’s needed to run the web application itself.
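The information listed above could map onto tables roughly as follows. This is a minimal sketch using SQLite from Python; every table and column name is an assumption chosen for illustration, and the specification does not prescribe a schema or a database engine.

```python
import sqlite3

# Hypothetical schema: projects (with subprojects), translators, documents,
# one translation per document and language, and parts of a translation.
SCHEMA = """
CREATE TABLE project (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE,
    parent_id INTEGER REFERENCES project(id)  -- set for subprojects
);
CREATE TABLE translator (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE document (
    id INTEGER PRIMARY KEY,
    project_id INTEGER NOT NULL REFERENCES project(id),
    name TEXT NOT NULL,
    perl_version TEXT
);
CREATE TABLE translation (
    id INTEGER PRIMARY KEY,
    document_id INTEGER NOT NULL REFERENCES document(id),
    language TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'vacant',
    adopter_id INTEGER REFERENCES translator(id),
    UNIQUE (document_id, language)
);
CREATE TABLE translation_part (
    id INTEGER PRIMARY KEY,
    translation_id INTEGER NOT NULL REFERENCES translation(id),
    position INTEGER NOT NULL,
    status TEXT NOT NULL DEFAULT 'vacant'
);
"""


def create_database(path=":memory:"):
    """Create the tables and return the open connection."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def languages_for(conn, document_name):
    """Return (language, status) pairs for one document, sorted by language."""
    rows = conn.execute(
        "SELECT t.language, t.status FROM translation t "
        "JOIN document d ON d.id = t.document_id "
        "WHERE d.name = ? ORDER BY t.language",
        (document_name,),
    )
    return rows.fetchall()
```

Storing one `translation` row per document and language answers the question of which languages a document was or is being translated to with a single query.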
Status of project – Details
- Meta information like maintainer, contained subprojects, statistical information, etc.
- For the translation of the Perl5 documentation, two subprojects are the translation of the core documents and the translation of the core modules.
- The translation of the core documents could be further split by ‘importance’ of translation.
Status of document
A document (or a part thereof), stored in an SVN repository, a database or wherever, has a certain status attached to it.
Depending on its state of translation this could be one of the following:
| Status | Explanation |
|---|---|
| vacant | the document has not yet been assigned to a translator |
| adopted | the document has been assigned to a translator |
| updated | a partially translated document |
| pending | the initial translation of the document has been finished but the document has not yet been reviewed |
| finished | the document has been translated and reviewed |
| abandoned | documents that have been adopted but haven’t been worked upon for a given time will be marked as ‘abandoned’. (The translator will have to be informed of this. If she does not update the document within a given time the status will be changed back to ‘vacant’, maybe automatically.) This mode is to give the translator time to change the document’s status to ‘updated’. |
[added in Vs. 0.2:]
If a translator has problems with a document or certain parts that neither he nor the wingman can resolve, the document can be put back into ‘vacant’ mode so other people can have a go at it. Questions in the forums/mailing lists should be encouraged before revoking.
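The statuses in the table could be treated as a small state machine. The status names are taken from the table; the set of allowed transitions is only one possible reading of the workflow and should be seen as an assumption, as should the helper name `change_status`.

```python
from enum import Enum


class Status(Enum):
    VACANT = "vacant"
    ADOPTED = "adopted"
    UPDATED = "updated"
    PENDING = "pending"
    FINISHED = "finished"
    ABANDONED = "abandoned"


# Assumed transitions, derived loosely from the workflow description.
TRANSITIONS = {
    Status.VACANT: {Status.ADOPTED},                      # check-out
    Status.ADOPTED: {Status.UPDATED, Status.PENDING,      # check-in
                     Status.ABANDONED, Status.VACANT},    # timeout, revoke
    Status.UPDATED: {Status.UPDATED, Status.PENDING,
                     Status.ABANDONED, Status.VACANT},
    Status.PENDING: {Status.FINISHED, Status.UPDATED},    # review passed or rejected
    Status.ABANDONED: {Status.ADOPTED, Status.UPDATED, Status.VACANT},
    Status.FINISHED: set(),
}


def change_status(current, new):
    """Return the new status, or raise ValueError if the step is not allowed."""
    if new not in TRANSITIONS[current]:
        raise ValueError(f"cannot go from {current.value} to {new.value}")
    return new
```

Listing the transitions in one table keeps rules such as "a vacant document cannot become finished without a translation and a review" in a single place.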
[several parts added in Vs. 0.2:]
Meta information
The documents can be organized in a hierarchical structure. For example:
- Perl 5
  - Core Documentation
    - 5.6.x
    - 5.8.x
    - …
  - Core Module Documentation
    - 5.6.x
    - 5.8.x
    - …
- Perl 6
- other projects
For a document or part thereof certain meta information will be stored:
- project this document is part of
- time of adoption
- time of update
- name of adopter
- name of assigned wingman
- translation status (as described above)
- all necessary information that’s needed to roll back to an older version of the translation
User registration
For reasons of security and consistency, it is probably necessary for users to register with the project as a translator/reviewer/…
Check-out
A person with the role of “core translator” will be able to assign one (or more?) documents or parts thereof to himself. This person will be the “adopter” for this document for a certain amount of time (the “time to live”, “TTL”, see “Abandoned documents”).
[some changes/additions in Vs. 0.2:]
Review
Probably the only way to achieve good quality and correctness of the translations (orthography, spelling, language and content) is mutual help, unless Mark Shuttleworth wants to sponsor this project.
A concept of peer review similar to that of Wikipedia could be used. Other users are encouraged to review and to correct. Maybe this could follow the “buddy principle” as practiced by divers. In this project the buddy is called a “wingman” (see glossary at the end of this document).
To mark a document as ‘finished’ it has to be reviewed by at least one person other than the translator. Maybe the review should involve marking the several parts of the translation as ‘reviewed’ (maybe using the ‘fuzzy’ flag?)
To review a document the ‘buddy’ needs to check out the document and actually read it. When it is re-submitted, the document will be marked as ‘finished’.
Using the ‘fuzzy’ marker avoids having to review the whole document, which can be a great help, especially with lengthy and complex documents. This implies that the fuzzy marker is set by default.
When the translator stores a part of her checkout as “finished”, her wingman would get it on his todo list for review. He can then either approve the translation or reject it (with a reason). The latter drops it onto the translator’s todo list for her checkout again.
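The part-wise review described above could look roughly like this. It is a sketch under assumptions: the class and method names are hypothetical, the fuzzy flag is modeled as a plain boolean that is set by default, and a rejected part is simply flagged so that it shows up on the translator’s todo list.

```python
from dataclasses import dataclass, field


@dataclass
class Part:
    text: str
    fuzzy: bool = True          # set by default: not yet reviewed
    rejection_reason: str = ""  # filled in when the wingman rejects the part


@dataclass
class Translation:
    translator: str
    parts: list = field(default_factory=list)

    def _check_reviewer(self, reviewer):
        if reviewer == self.translator:
            raise PermissionError("a translation cannot be reviewed by its own translator")

    def approve(self, index, reviewer):
        """Remove the fuzzy marker from one part."""
        self._check_reviewer(reviewer)
        part = self.parts[index]
        part.fuzzy = False
        part.rejection_reason = ""

    def reject(self, index, reviewer, reason):
        """Keep the part fuzzy and record why it was rejected."""
        self._check_reviewer(reviewer)
        if not reason:
            raise ValueError("a rejection needs a reason")
        part = self.parts[index]
        part.fuzzy = True
        part.rejection_reason = reason

    def translator_todo(self):
        """Indexes of the parts that were rejected and need rework."""
        return [i for i, part in enumerate(self.parts) if part.rejection_reason]

    def is_finished(self):
        """Finished once no part carries a fuzzy marker any more."""
        return bool(self.parts) and not any(part.fuzzy for part in self.parts)
```

Because only parts that are still fuzzy need attention, a reviewer can return to a long document without rereading what has already been approved.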
Abandoned documents
Sometimes people adopt a document but do not have the time/motivation/resources to update it. To ensure that these documents do not become “zombies”, they will have a certain ‘time to live’ (TTL) based on their length and (maybe) complexity (perlopentut is more complex than perl588delta, etc.).
If an adopted document exceeds its TTL the following could happen:
- the adopter will be informed that the document has not been updated within the given TTL. She will then be given a certain time to react.
- In this first stage it should not be necessary to submit actual changes to the document but to merely ‘touch’ the document. This is to confirm that the translator is still willing to work on the document. This will re-initialize the time to live, maybe with a flag that indicates that this document has one ‘reminder’ to it.
- The second stage might require an actual update of the document. TTL keeps running, does not get reset.
- If the TTL has been exceeded with no reaction on the translator’s side the document will be marked as ‘vacant’ again.
- If a partially translated document was abandoned, this needs to be marked in the meta information.
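The stages above could be sketched as follows. Several details are assumptions, not part of the specification: the reaction time after the notice (`grace`), the class and field names, and the choice to measure the TTL from the last activity. In particular, the text leaves open how the TTL continues after a real update in the second stage.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class AdoptedDocument:
    last_activity: datetime       # check-out, touch or update
    ttl: timedelta
    grace: timedelta              # hypothetical time to react after the notice
    status: str = "adopted"
    reminders: int = 0
    has_translated_parts: bool = False
    abandoned_partially_translated: bool = False  # meta information marker
    notified_at: Optional[datetime] = None

    def check(self, now):
        """Periodic check, for example run by a scheduled job."""
        if self.status in ("adopted", "updated"):
            if now > self.last_activity + self.ttl:
                self.status = "abandoned"
                self.notified_at = now  # the adopter would be informed here
        elif self.status == "abandoned":
            if now > self.notified_at + self.grace:
                self.status = "vacant"
                self.abandoned_partially_translated = self.has_translated_parts
        return self.status

    def touch(self, now):
        """Stage one: confirm willingness without submitting changes."""
        if self.status != "abandoned":
            return
        if self.reminders >= 1:
            raise ValueError("stage two requires an actual update")
        self.reminders = 1
        self.last_activity = now  # re-initializes the TTL
        self.status = "updated" if self.has_translated_parts else "adopted"

    def update(self, now):
        """An actual check-in of translated text."""
        if self.status == "vacant":
            raise ValueError("the document is no longer adopted")
        self.has_translated_parts = True
        self.last_activity = now  # assumption, see the note above
        self.status = "updated"
```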
[additions in Vs. 0.2:]
Check-in/out
The check-in or ‘submit’ of translated documents could be achieved in one of the following ways:
- Upload via HTML-Form
- email (to a special address handling the integration of the document into the database).
Using the database approach would enable us to allow check-ins of partially translated documents.
The check-out process could be done by simply downloading the document (or its parts) from a special address that is tracked by the DB.
Terms
This document uses the following terms as follows:
- Project
  - The translation of the perl documentation (perldoc 2.0) is a project. The translation of the documentation for Catalyst could be another.
- Subproject
  - The translation of the core documentation is a subproject of perldoc 2.0. The translation of the documentation for the core modules is another.
- Adoption
  - The process of assigning a project, a document or parts thereof to a particular person.
- Core documentation (of perl)
  - A typical installation of perl from source creates the directory ‘perl-[version_number]’. Contained in this is a directory named ‘pod’. All documents contained in this directory and ending in ‘.pod’ are part of the core documentation.
- Core modules
  - Modules that are installed with a typical perl installation from source by default.
- Wingman
  - We use the principle of peer review to ensure correctness and quality. In our project the peer is called a “wingman”.
- Time to live (TTL)
  - The time an adopted but untranslated document takes to get its status changed back to ‘vacant’.