Okapi Framework Changes Log - Feb-14-2010
Note: this document is common to both the okapi-lib distribution and
the okapi-apps distribution. The information pertaining to
applications other than Tikal is relevant only for the okapi-apps
distribution.
Changes from M5 (0.5) to 0.5.1
- Rainbow:
- Translation Package Creation:
- Fixed the bug where the encoder manager for RTF output was not properly
set, which caused some formats like HTML, TMX, etc. to have un-escaped
characters.
- Changed the RTF writer to allow skeleton writers other than
GenericSkeletonWriter.
- Replaced the Search and Replace utility by the "Search and Replace with
Filter" and the "Search and Replace without Filter" pre-defined pipelines.
- Replaced the Text Rewriting utility by the "Text Rewriting" pre-defined
pipeline.
- Tikal:
- Fixed the issue of not having the HTML filter mapped when using the
Vignette filter.
- Added support for accessing the Microsoft MT engine (-ms option).
- Translation resources:
- Added a connector for Microsoft MT Web services
(http://api.microsofttranslator.com/V1/SOAP.svc). A Microsoft Bing AppID is
needed to use it. You can obtain one at
http://www.bing.com/developers/appids.aspx.
- Google MT: Made it consistent with the other connectors: when the result
is the same as the target, the result is now returned.
- SimpleTM: Made the feature "penalize exact matches when target has
different codes than the query" an option (default is true, backward
compatible).
- Libraries:
- Fixed issue with GenericSkeletonWriter and in-line codes in segmented
text units that were outside any segment.
- Fixed issue with GenericFilterWriter output stream not nullified in
close() (causing for example no output using FilterEventsToRawDocument).
- Steps / Pipeline:
- Added MULTI_EVENT (new resource and Event) handling to pipeline.
- Changed step handlers to return Event by default.
- Fixed the parameter-setting bug that prevented the parameters of
pre-defined pipelines from being saved from one session to the next.
- Leveraging step:
- Fixed the bug that prevented entering a TMX path.
- Made adding an MT! prefix to the TMX entries an option.
- Added an option to enable/disable the step.
- Search and Replace step: Improved the behavior of the dialog box for
add/edit item.
- Format Conversion step: Fixed bug where the table-delimited output was
not closed properly for the "one output per input" use case.
- Added Text Modification step.
- Filters:
- PHP Content filter: Added UI for the localization directives options
(default behaviour is the same).
- OpenXML filter: Changed the parameters editor to use GridLayout instead
of BorderLayout.
- TMX filter: Fixed losing original line-breaks between <tu> when
re-writing.
- Vignette filter: Fixed bug of un-escaped and non-CDATA RTF output.
- Properties filter: Added the option "Convert \n and \t to line-break and
tab".
- Table filter:
- Fixed issue #119 where the CSV action "Exclude leading/trailing..." was
not updated properly in the parameters editor.
- Fixed issue #118 where some CSV cases were not extracted properly.
- Installation:
- Updated licence information for third-party packages.
- Removed all the dependencies on swing2swt.
Changes from M4 to M5
- Libraries:
- Changed minimum requirement to Java 1.6 instead of Java 1.5.
- Removed distribution for Mac Carbon, added distribution for Mac
Cocoa-64-bit.
- Updated to Lucene 3.0.0
- Refactored Pensieve TM engine, added new API.
- Rainbow:
- Added the duration of the process in the log.
- Updated the UI of the Pipeline Edit / Execute facility to make the
panels of each step accessible without clicking.
- Replaced the utility "Generate SimpleTM Database" by the pre-defined
pipeline "Import Into Pensieve TM" (the previous utility's functionality is
still available using a custom pipeline).
- Replaced the utility "Export SimpleTM Database" by the pre-defined
pipeline "Convert File Format" (the previous utility's functionality is
still available using a custom pipeline).
- Fixed issue with Text Rewriting and empty <target> for XLIFF input.
- Replaced the utility "Translation Comparison" by the pre-defined
pipeline "Translation Comparison".
- Added the pre-defined pipeline "Create Translations in Batch Mode"
- Replaced the utility "XSL Transformation" by the pre-defined pipeline
"XSL Transformation".
- Replaced the utility "Used Characters Listing" by the pre-defined
pipeline "Used Characters Listing".
- Ratel:
- Fixed selection bug in UI.
- Updated the default segmentation rules.
- Steps:
- Added Batch Translation step (tested with ProMT and Apertium).
- Added Codes Removal step
- Added Leveraging step
- Completed initial Tokenization and Word-Count steps.
- Added the Sentence Alignment step.
- Translation resources:
- Fixed issue with score > 100 in Pensieve TM.
- Added NCR support for Apertium connector.
- Filters:
- In the Properties Filter: Added pre-defined configuration for Skype's
.lang format.
- In the RTF parser:
- Fixed the issue with \'HHc being read as \'HH\'HH in some cases.
- Added support for additional DBCS encodings.
- Added TTX Filter for Trados TagEditor documents (Beta).
- In the HTML Filter: Added pre-defined configuration for well-formed
files, providing groups and extra meta-data.
- In the XML Filter: Changed the ITS extension idPointer to idValue and
modified its behavior to allow ID values to be generated from the
expression, not just from the content pointed to by the expression. The
values are backward compatible, but any reference to idPointer in existing
parameters files must be renamed to idValue.
- Added the Vignette Filter for Vignette export XML documents (Alpha).
- Added the Pensieve Filter for reading and writing Pensieve translation
memories.
Changes from M3 to M4
- Filters:
- XLIFF filter: Added property for target-language and option to add it.
Changed some of the language selection behaviors and set fall-back to ID
option to false.
- Fixed several bugs in the OpenXML filter (MS Office 2007 documents)
- The JSON Filter has been added, to support for example AJAX or Palm
WebOS applications.
- The PHP Content Filter has been added, to support PHP include files.
- Added default DITA configuration to the XML Filter.
- Fixed several issues with the TS, Table, TMX, and XLIFF filters.
- Added whiteSpaces ITS extension support in the XML Filter.
- Library, Translation resources:
- All the TM and MT connectors have been moved to the package
net.sf.okapi.connectors.
- Modified the OpenTran connector to use the REST interface instead of
RPC.
- Added the connector to the MyMemory server (http://mymemory.translated.net)
- Improved Google MT connector.
- Improved GlobalSight TM connector for inline codes, and adjusted it for
GS version 7.1.6.
- Added Pensieve TM engine and its connector.
- Added the connector for the open-source Apertium MT system web
service (http://wiki.apertium.org/wiki/Main_Page)
- Changed language identification from String to LocaleId objects
across the whole framework.
- Steps and Rainbow utilities:
- Added the SimpleTM2TMX step.
- Added Import and Export utilities for SimpleTM files.
- Continued improving the Tokenization and WordCount steps.
- Implemented an option to select the XSLT processor to use with the
XSL Transformation utility.
- Updated the Translation Package Creation utility to select from several
resources for the pre-translation options, and to allow specifying a
threshold instead of "exact match only".
- Updated the Text Rewriting utility to select from several resources for
the translation options.
- Added the FormatConversion step.
- Improved inline compatibility in projects generated for OmegaT.
- Tikal:
- Added support for accessing the MyMemory repository (-mm option).
- Corrected display of extended characters on the console for some
languages/platforms.
- Added threshold and max-hits options for TM query command (-opt option).
- Added a command to create PO files from any input (-2po command).
- Added a command to create TMX files from any input (-2tmx command).
- Added a command to create Table files from any input (-2tbl command).
- Added capability to query a Pensieve TM (-pen option).
- Added support for accessing GlobalSight TM servers (-gs option).
- Added support for accessing Apertium MT servers (-apertium option).
- Added segmentation and leveraging options for the extraction command.
- Added a command to import any file into a Pensieve TM (-imp command).
- Added a command to export a Pensieve TM to a TMX file (-exp command).
Changes from M2 to M3
- The build system has been completely redone and now uses Maven as its
main builder. This has resulted in several changes in the structure of the
Okapi classes, and in the way the files are distributed.
- Filters:
- Added the TS Filter (beta) for Qt translation files.
- Fixed handling of fuzzy flag for plural entries in the PO filter.
- Fixed handling of approved, state and coord properties in the XLIFF
Filter.
- Improved XML Filter:
- Improved rewriting of document type subset declaration.
- Added support for protecting custom entity references.
- Added support for IDs defined using xml:id or the idPointer ITS
extension feature.
- Properties Filter:
- Changed the default configuration to always escape output.
- Added pre-defined configuration for non-escaped output.
- Fixed various issues in the OpenXML Filter (docx, pptx, etc.), and
PO Filter.
- Libraries:
- The Google MT connector has been enhanced to have the inline codes taken
into account, not simply pushed to the end of the text.
- Fixed one error in default segmentation rules.
- Added a connector component for the Translate Toolkit TM server.
- Added steps such as Word-count and Tokenizer.
- The command-line tool Tikal has been added.
- Rainbow (okapi-apps distribution only):
- Improved handling of un-approved translations in TMX generated
during a translation package creation.
- Added option to choose to merge only approved translations in
translation package post-processing.
Changes from M1 to M2
- Filters:
- The DTD Filter has been added.
- The PlainText Filter has been added.
- The Table Filter has been added.
- Several pre-defined filter configurations have been added or updated:
Mozilla-RDF, XML Android Strings, XML Java properties, RESX, Monolingual PO,
SRT (Sub-titles), plain-text lines, plain-text paragraphs, CSV, etc.
- The OpenXML Filter (DOCX, PPTX, XLSX files) has been improved and now
provides much inline code simplification.
- The definition of the parameters for the RegEx Filter has been modified
to allow the support of target text, ID, etc. This new format is not
compatible with that of M1.
- Other filters (HTML, Properties, XLIFF, TMX, PO, and OpenDocument
filters) have been improved.
- Libraries:
- A new TM connector to query remote GlobalSight TM servers has been
added. (See the Java Example05 of the okapi-lib distribution for an
illustration on how to use this component).
- A connector to query the remote OpenTran server has been added. (See the
Java Example05 of the okapi-lib distribution for an illustration on how to
use this component).
- New RawDocument object model.
- The events mechanism has been augmented to work with batch items in the
pipeline.
- The encoding detection and handling of BOM has been modified in most
filters and utilities.
- The pipeline mechanism has been extensively re-written.
- Many steps for the pipeline have been created; they are experimental for
now.
- Rainbow:
- Ratel:
- Better preservation of comments in SRX files; and capability to add
comments from within Ratel.
- Uses the latest libraries.