The Unicode Consortium
Industry: Computer; Software
Number of terms: 11048
Number of blossaries: 0
Company Profile:
The Unicode Consortium or Unicode Inc. is a not-for-profit organization that coordinates the development of the Unicode standard. Its stated goal is to eventually enable computers to operate in all languages from around the world. The consortium develops and publishes a list of freely-available ...
A maximal character sequence consisting of either a base character followed by a sequence of one or more characters where each is a combining character, zero width joiner, or zero width non-joiner; or a sequence of one or more characters where each is a combining character, zero width joiner, or zero width non-joiner.
* When identifying a combining character sequence in Unicode text, the definition of the combining character sequence is applied maximally. For example, in the sequence <c, dot-below, caron, acute, a>, the entire sequence <c, dot-below, caron, acute> is identified as the combining character sequence, rather than the alternative of identifying <c, dot-below> as a combining character sequence followed by a separate (defective) combining character sequence <caron, acute>.
Industry:Computer; Software
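The maximal-matching rule in the definition above can be illustrated with a minimal Python sketch. It assumes that "combining character" means General_Category Mn, Mc, or Me, and it treats every other character as a potential base, which is a simplification (control characters, for instance, are not base characters). The names `is_extender` and `segments` are hypothetical helpers, not part of any Unicode API.

```python
import unicodedata

ZWJ, ZWNJ = "\u200d", "\u200c"

def is_extender(ch):
    # Combining characters have General_Category Mn, Mc, or Me;
    # the definition also admits ZWJ and ZWNJ.
    return unicodedata.category(ch).startswith("M") or ch in (ZWJ, ZWNJ)

def segments(text):
    """Split text so each base keeps every extender that follows it.

    A run of extenders with no preceding base becomes its own
    (defective) segment. A base with no extenders is returned as a
    one-character segment, although by the definition above it is not
    itself a combining character sequence.
    """
    out = []
    for ch in text:
        if out and is_extender(ch):
            out[-1] += ch      # extend the current segment maximally
        else:
            out.append(ch)     # start a new segment
    return out

# <c, dot-below, caron, acute, a> from the example in the definition
sample = "c\u0323\u030c\u0301a"
for seg in segments(sample):
    print([f"U+{ord(ch):04X}" for ch in seg])
```

Because extenders are always appended to the segment in progress, the sketch never splits <c, dot-below, caron, acute> into a shorter sequence followed by a defective one.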
A numeric value in the range 0..254 given to each Unicode code point, formally defined as the property Canonical_Combining_Class.
* The combining class for each encoded character in the standard is specified in the file UnicodeData.txt in the Unicode Character Database. Any code point not listed in that data file defaults to \p{Canonical_Combining_Class=0} (or \p{ccc=0} for short).
* An extracted listing of combining classes, sorted by numeric value, is provided in the file DerivedCombiningClass.txt in the Unicode Character Database.
* Only combining marks have a combining class other than zero. Almost all combining marks with a class other than zero are also nonspacing marks, with a few exceptions. Also, not all nonspacing marks have a non-zero combining class. Thus, while the correlation between ^\p{ccc=0} and \p{gc=Mn} is close, it is not exact, and implementations should not depend on the two concepts being identical.
Industry:Computer; Software
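As a rough illustration of the point about combining class versus nonspacing marks, the following Python sketch reads both properties from the standard-library `unicodedata` module. The helper name `describe` is hypothetical, and the values reported depend on the Unicode version bundled with the Python build in use.

```python
import unicodedata

def describe(ch):
    """Return (General_Category, Canonical_Combining_Class) for one character."""
    return unicodedata.category(ch), unicodedata.combining(ch)

# U+0301 COMBINING ACUTE ACCENT: a nonspacing mark with a non-zero class.
# U+0323 COMBINING DOT BELOW: a nonspacing mark with a different class.
# U+034F COMBINING GRAPHEME JOINER: a nonspacing mark whose class is zero.
# "a": not a combining mark, so its class is zero.
for ch in ("\u0301", "\u0323", "\u034f", "a"):
    gc, ccc = describe(ch)
    print(f"U+{ord(ch):04X} gc={gc} ccc={ccc}")
```

U+034F shows why a test for gc=Mn cannot stand in for a test for a non-zero combining class.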
A commonly used synonym for combining character.
Industry:Computer; Software
(1) Consistency with existing practice or preexisting character encoding standards. (2) Characteristic of a normative mapping and form of equivalence.
Industry:Computer; Software
A character that would not have been encoded except for compatibility and round-trip convertibility with other standards.
Industry:Computer; Software
Synonym for compatibility decomposable character.
Industry:Computer; Software
A character whose compatibility decomposition is not identical to its canonical decomposition. It may also be known as a compatibility precomposed character or a compatibility composite character.
* For example, U+00B5 micro sign has no canonical decomposition mapping, so its canonical decomposition is the same as the character itself. It has a compatibility decomposition to U+03BC greek small letter mu. Because micro sign has a compatibility decomposition that is not equal to its canonical decomposition, it is a compatibility decomposable character.
* For example, U+03D3 greek upsilon with acute and hook symbol canonically decomposes to the sequence <U+03D2 greek upsilon with hook symbol, U+0301 combining acute accent>. That sequence has a compatibility decomposition of <U+03A5 greek capital letter upsilon, U+0301 combining acute accent>. Because greek upsilon with acute and hook symbol has a compatibility decomposition that is not equal to its canonical decomposition, it is a compatibility decomposable character.
* This term should not be confused with the term compatibility character.
* Many compatibility decomposable characters are included in the Unicode Standard solely to represent distinctions in other base standards. They support transmission and processing of legacy data. Their use is discouraged other than for legacy data or other special circumstances.
* Some widely used and indispensable characters, such as NBSP, are compatibility decomposable characters for historical reasons. Their use is not discouraged.
* A large number of compatibility decomposable characters are used in phonetic and mathematical notation, where their use is not discouraged.
* For historical reasons, some characters that might have been given a compatibility decomposition were not, in fact, decomposed. The Normalization Stability Policy prohibits adding decompositions for such cases in the future, so that normalization forms will stay stable.
* Replacing a compatibility decomposable character by its compatibility decomposition may lose round-trip convertibility with a base standard.
Industry:Computer; Software
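The test described in the definition above (compatibility decomposition differs from canonical decomposition) can be sketched in Python by comparing the NFKD and NFD forms of a single character. The function name `is_compatibility_decomposable` is a hypothetical helper, and the sketch assumes that `unicodedata.normalize` yields the full decompositions for the Unicode version shipped with the interpreter.

```python
import unicodedata

def is_compatibility_decomposable(ch):
    """True if the character's full compatibility decomposition (NFKD)
    differs from its full canonical decomposition (NFD)."""
    return unicodedata.normalize("NFKD", ch) != unicodedata.normalize("NFD", ch)

def code_points(text):
    return " ".join(f"U+{ord(c):04X}" for c in text)

# U+00B5 micro sign, U+03D3 greek upsilon with acute and hook symbol,
# U+00A0 no-break space, and U+00E9 (canonical decomposition only).
for ch in ("\u00b5", "\u03d3", "\u00a0", "\u00e9"):
    print(
        code_points(ch),
        "NFD:", code_points(unicodedata.normalize("NFD", ch)),
        "NFKD:", code_points(unicodedata.normalize("NFKD", ch)),
        "compatibility decomposable:", is_compatibility_decomposable(ch),
    )
```

U+00E9 is included as a contrast: it decomposes canonically, but its two decompositions are identical, so it does not qualify.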
The decomposition of a character or character sequence that results from recursively applying both the compatibility mappings and the canonical mappings found in the Unicode Character Database, together with those described under Conjoining Jamo Behavior, until no characters can be further decomposed, and then reordering nonspacing marks. Some compatibility decompositions remove formatting information.
Industry:Computer; Software
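A minimal Python sketch of the process described above, assuming that normalization form NFKD as exposed by the standard-library `unicodedata` module corresponds to the full compatibility decomposition. The helper names are hypothetical.

```python
import unicodedata

def compatibility_decomposition(text):
    # NFKD applies compatibility and canonical mappings recursively,
    # decomposes Hangul syllables into conjoining jamo, and then
    # reorders combining marks by combining class.
    return unicodedata.normalize("NFKD", text)

def code_points(text):
    return " ".join(f"U+{ord(ch):04X}" for ch in text)

samples = (
    "\ufb01",         # latin small ligature fi: the ligature is lost
    "\u00b2",         # superscript two: the superscript formatting is lost
    "\uac00",         # a Hangul syllable: decomposed into conjoining jamo
    "a\u0301\u0323",  # acute (ccc 230) then dot below (ccc 220): reordered
)
for sample in samples:
    print(code_points(sample), "->", code_points(compatibility_decomposition(sample)))
```

The first two samples show formatting information being removed, the third shows the jamo decomposition, and the last shows the reordering step.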
Two character sequences are said to be compatibility equivalents if their full compatibility decompositions are identical.
Industry:Computer; Software
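The equivalence test above reduces to a single comparison once the full compatibility decompositions are available. The sketch below again assumes that NFKD from Python's `unicodedata` module provides them; `compatibility_equivalent` is a hypothetical name.

```python
import unicodedata

def compatibility_equivalent(a, b):
    """True if the two strings have identical full compatibility decompositions."""
    return unicodedata.normalize("NFKD", a) == unicodedata.normalize("NFKD", b)

pairs = (
    ("\ufb01", "fi"),       # ligature fi versus the letters f and i
    ("\u00b5", "\u03bc"),   # micro sign versus greek small letter mu
    ("\u212b", "A\u030a"),  # angstrom sign versus A plus combining ring above
    ("a", "b"),             # unrelated letters
)
for left, right in pairs:
    print(repr(left), repr(right), compatibility_equivalent(left, right))
```

Strings that are canonically equivalent, such as the angstrom sign pair, are also compatibility equivalent, because the compatibility decomposition includes the canonical mappings.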