ICU Custom UText Providers Guide

UText is an abstraction facility that separates the underlying native text storage from its consumption by ICU library facilities. It provides a mechanism for iterating over code points, extracting UTF-16 encoded text, and manipulating the underlying native storage without accessing it directly.
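The separation of storage from consumption can be illustrated with a toy model. The sketch below is not ICU's UText API: CodePointSource, Utf8Source, and toUtf16 are hypothetical names invented for this illustration, and the decoder assumes well-formed UTF-8. It only shows the shape of the idea, in which a consumer iterates code points and extracts UTF-16 without ever touching the native UTF-8 bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical interface: the consumer sees code points, never the storage.
struct CodePointSource {
    virtual ~CodePointSource() = default;
    // Returns the next code point, or -1 at the end of the text.
    virtual std::int32_t next() = 0;
};

// One possible provider: text stored natively as UTF-8.
// Assumes well-formed UTF-8 for brevity; no validation is performed.
class Utf8Source : public CodePointSource {
public:
    explicit Utf8Source(std::string bytes) : bytes_(std::move(bytes)) {}

    std::int32_t next() override {
        if (pos_ >= bytes_.size()) return -1;
        const unsigned char lead = static_cast<unsigned char>(bytes_[pos_++]);
        if (lead < 0x80) return lead;                  // single byte (ASCII)
        // Number of trail bytes implied by the lead byte.
        int trail = lead >= 0xF0 ? 3 : lead >= 0xE0 ? 2 : 1;
        // Keep only the payload bits of the lead byte.
        std::int32_t cp = lead & (0x3F >> trail);
        while (trail-- > 0 && pos_ < bytes_.size()) {
            cp = (cp << 6) | (static_cast<unsigned char>(bytes_[pos_++]) & 0x3F);
        }
        return cp;
    }

private:
    std::string bytes_;
    std::size_t pos_ = 0;
};

// A consumer written only against the interface: extracts UTF-16 text,
// emitting a surrogate pair for code points above U+FFFF.
std::u16string toUtf16(CodePointSource& src) {
    std::u16string out;
    for (std::int32_t cp = src.next(); cp >= 0; cp = src.next()) {
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

In ICU itself the corresponding roles are played by calls such as utext_openUTF8(), utext_next32(), and utext_extract(); a custom provider supplies the storage-specific functions behind them.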

ICU UBiDi and UText Enhancements

Currently there are few complete implementations of Unicode Standard Annex #9, the Unicode Bidirectional Algorithm. The most popular include International Components for Unicode (ICU), FriBiDi, Uniscribe, DirectWrite, and Core Text. Uniscribe and DirectWrite are Windows only, and Core Text is OS X only. Of the remaining two, ICU’s implementation (UBiDi) is the most desirable due to the additional library functionality available in a single package. One disadvantage of UBiDi is that it only operates on UTF-16 (UChar) arrays. Other parts of the ICU library use the UText abstraction facility to allow any text storage and encoding provider to be used; UBiDi, however, currently does not support UText. One reason for the lack of support is that the ubidi_writeReordered() and ubidi_writeReverse() functions write to caller-provided UChar arrays. The implementation of the UChar UText provider currently does not support write operations.

ICU Building ICU using Visual Studio with Boost Versioned Library Layout

These instructions describe how to create a proper out-of-source build of ICU, which the default ICU Visual Studio solution and project files do not support well. The goal is to create versioned ICU library names compatible with Boost, which ensures that multiple builds can co-exist in the same stage directory. Everything else depends on this premise.

Visual Studio MSBuild support for Boost Versioned Library Layout

MSBuild provides a flexible way to programmatically discover build variables and apply them to your Visual Studio build. The Boost libraries are built with the Boost.Build system, a variant of Jam, and part of that system allows for versioned library names, a very powerful naming convention for libraries. Replicating this within Visual Studio allows multiple builds to co-exist within the same OutDir, making library management much easier.
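To make the naming convention concrete, here is a minimal sketch of how a Boost-style versioned name is assembled from build properties. It is not the MSBuild logic the article describes: BuildVariant and versionedName are hypothetical names, and the ICU base name and version string used in the usage note are illustrative assumptions. The tag order (toolset, threading, ABI, architecture, version) follows Boost's versioned layout on Windows.

```cpp
#include <string>

// Hypothetical description of one build variant (names are illustrative).
struct BuildVariant {
    std::string toolset;    // toolset tag, e.g. "vc142"
    bool multithreaded;     // adds the "-mt" threading tag
    bool staticRuntime;     // adds "s" to the ABI tag
    bool debug;             // adds "gd" (debug runtime + debug code) to the ABI tag
    std::string arch;       // architecture tag, e.g. "x64"
    std::string version;    // version tag with underscores, e.g. "1_75"
};

// Builds a name of the form: base-toolset[-mt][-abi]-arch-version
std::string versionedName(const std::string& base, const BuildVariant& v) {
    std::string name = base + "-" + v.toolset;
    if (v.multithreaded) name += "-mt";

    std::string abi;
    if (v.staticRuntime) abi += "s";
    if (v.debug) abi += "gd";
    if (!abi.empty()) name += "-" + abi;   // the ABI tag is omitted when empty

    name += "-" + v.arch + "-" + v.version;
    return name;
}
```

For example, versionedName("icuuc", {"vc142", true, false, true, "x64", "68_1"}) yields "icuuc-vc142-mt-gd-x64-68_1", while the matching release variant drops the "-gd" tag. Because every property that distinguishes the builds is encoded in the file name, both can sit in the same output directory without colliding.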