The Unicode Consortium
Industry: Computer; Software
Number of terms: 11048
Number of blossaries: 0
Company Profile:
The Unicode Consortium or Unicode Inc. is a not-for-profit organization that coordinates the development of the Unicode standard. Its stated goal is to eventually enable computers to operate in all languages from around the world. The consortium develops and publishes a list of freely-available ...
A non-empty subsequence of a Unicode code unit sequence X that does not contain any code units that also belong to any minimal well-formed subsequence of X.
* In other words, an ill-formed code unit subsequence cannot overlap with a minimal well-formed subsequence.
Industry:Computer; Software
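As a rough illustration of the ill-formed code unit subsequence definition above, the sketch below locates ill-formed spans in a UTF-8 byte string. The function name `ill_formed_spans` is hypothetical, and the approach relies on the error offsets reported by Python's strict UTF-8 decoder, which is an assumption about one implementation and not a procedure taken from the standard.

```python
def ill_formed_spans(data: bytes) -> list[tuple[int, int]]:
    """Return (start, end) byte offsets of ill-formed runs in UTF-8 data.

    Hypothetical helper: it repeatedly decodes the remaining bytes and uses
    the offsets carried by UnicodeDecodeError to find the rejected bytes.
    """
    spans: list[tuple[int, int]] = []
    pos = 0
    while pos < len(data):
        try:
            data[pos:].decode("utf-8")
            break  # the rest of the input is well-formed
        except UnicodeDecodeError as err:
            start, end = pos + err.start, pos + err.end
            if spans and spans[-1][1] == start:
                # Adjacent rejected bytes are merged into one longer span.
                spans[-1] = (spans[-1][0], end)
            else:
                spans.append((start, end))
            pos = end  # resume scanning right after the rejected bytes
    return spans


# "A", then the ill-formed bytes C0 AF, then "B".
print(ill_formed_spans(b"A\xC0\xAFB"))
```

The reported spans never include the bytes of "A" or "B", which matches the rule that an ill-formed subsequence does not overlap a minimal well-formed one.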
A fixed property that is also subject to a stability guarantee preventing any change in the published listing of property values other than assignment of new values to formerly unassigned code points.
* An immutable property is trivially stable with respect to all algorithms.
* An example of an immutable property is the Unicode character name itself. Because character names are values of an immutable property, misspellings and incorrect names will never be corrected clerically. Any errata will be noted in a comment in the character names list and, where needed, an informative character name alias will be provided.
* When an encoded character property representing a code point property is immutable, none of its values can ever change. This follows from the fact that the code points themselves do not change, and the status of the property is unaffected by whether a particular abstract character is encoded at a code point later. An example of such a property is the Pattern_Syntax property; all values of that property are unchangeable for all code points, forever.
* In the more typical case of an immutable property, the values for existing encoded characters cannot change, but when a new character is encoded, the formerly unassigned code point changes from having a default value for the property to having one of its nondefault values. Once that nondefault value is published, it can no longer be changed.
Industry:Computer; Software
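The character-name example above can be observed with Python's standard `unicodedata` module. The choice of U+FE18 is an illustration added here and does not come from the entry: its formal name was published with "BRACKET" misspelled, and the corrected spelling exists only as a name alias. The sketch assumes a Python version (3.3 or later) whose `unicodedata.lookup` accepts name aliases.

```python
import unicodedata

# The formal name of U+FE18 still carries the published misspelling,
# because the character name is an immutable property.
formal_name = unicodedata.name("\uFE18")
print(formal_name)

# The corrected spelling is available only as a name alias. lookup()
# accepts aliases, so both spellings resolve to the same code point.
corrected = "PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRACKET"
same_char = unicodedata.lookup(corrected) == unicodedata.lookup(formal_name)
print(same_char)
```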
A value for an encoded character property that is given by a generic rule or by an “otherwise” clause in one of the data files of the Unicode Character Database.
* Implicit property values are used to avoid having to explicitly list values for more than 1 million code points (most of them unassigned) for every property.
Industry:Computer; Software
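A minimal sketch of the "otherwise" idea described above: only a few code points are listed explicitly, and every other code point receives a default. The table `EXPLICIT_GC` and the function `general_category` are hypothetical and heavily abbreviated; they only mimic the shape of a data file, not its actual contents.

```python
import unicodedata

# Hypothetical, heavily abbreviated listing of explicit General_Category values.
EXPLICIT_GC = {0x0030: "Nd", 0x0041: "Lu", 0x0061: "Ll"}

# The "otherwise" value: any code point not listed is treated as unassigned.
DEFAULT_GC = "Cn"


def general_category(cp: int) -> str:
    """Return the explicit value if listed, else the implicit default."""
    return EXPLICIT_GC.get(cp, DEFAULT_GC)


print(general_category(0x0041))  # explicitly listed
print(general_category(0xFFFF))  # not listed, so the default applies

# Python's own database reports the same default for the noncharacter U+FFFF.
print(unicodedata.category("\uFFFF"))
```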
A Unicode string is said to be in a particular Unicode encoding form if and only if it consists of a well-formed Unicode code unit sequence of that Unicode encoding form.
* A Unicode string consisting of a well-formed UTF-8 code unit sequence is said to be in UTF-8. Such a Unicode string is referred to as a valid UTF-8 string, or a UTF-8 string for short.
* A Unicode string consisting of a well-formed UTF-16 code unit sequence is said to be in UTF-16. Such a Unicode string is referred to as a valid UTF-16 string, or a UTF-16 string for short.
* A Unicode string consisting of a well-formed UTF-32 code unit sequence is said to be in UTF-32. Such a Unicode string is referred to as a valid UTF-32 string, or a UTF-32 string for short.
Industry:Computer; Software
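One way to test the condition above in practice is to attempt a strict decode. The helper name `is_well_formed` is hypothetical. The sketch works on bytes, so it uses the little-endian byte serializations (`utf-16-le`, `utf-32-le`) as a stand-in for sequences of 16-bit and 32-bit code units; that substitution is an assumption made for illustration.

```python
def is_well_formed(data: bytes, codec: str) -> bool:
    """Return True if data decodes without error under the given codec."""
    try:
        data.decode(codec, errors="strict")
    except UnicodeDecodeError:
        return False
    return True


# A valid UTF-8 string versus an ill-formed byte pair.
print(is_well_formed("\u00e9".encode("utf-8"), "utf-8"))
print(is_well_formed(b"\xC0\xAF", "utf-8"))

# A surrogate pair is well-formed UTF-16; a lone high surrogate is not.
print(is_well_formed("\U0001D11E".encode("utf-16-le"), "utf-16-le"))
print(is_well_formed(b"\x00\xD8", "utf-16-le"))

# A value above U+10FFFF is not a valid UTF-32 code unit.
print(is_well_formed((0x110000).to_bytes(4, "little"), "utf-32-le"))
```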
An in-band channel conveys information about text by embedding that information within the text itself, with special syntax to distinguish it. In-band information is encoded in the same character set as the text, and is interspersed with and carried along with the text data. Examples are XML and HTML markup.
Industry:Computer; Software
In Indic scripts, certain vowels are depicted using independent letter symbols that stand on their own. This is often the case when a word starts with a vowel or consists of only a vowel.
Industry:Computer; Software
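To make the distinction concrete, the sketch below compares a Devanagari independent vowel letter with the corresponding dependent vowel sign using Python's `unicodedata` module. The choice of the vowel I (U+0907 and U+093F) is an example added here, not one given in the entry.

```python
import unicodedata

independent_i = "\u0907"  # stands on its own, e.g. at the start of a word
dependent_i = "\u093F"    # attaches to a preceding consonant letter

for ch in (independent_i, dependent_i):
    # Print the code point, its character name, and its General_Category.
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))
```

The independent form is classified as a letter, while the dependent form is a combining mark.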
Forms of decimal digits used in various Indic scripts (for example, Devanagari: U+0966, U+0967, U+0968, U+0969). Arabic digits (and, eventually, European digits) derive historically from these forms.
Industry:Computer; Software
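The Devanagari digits cited above carry ordinary decimal values, which can be checked with Python's standard library. This is a small sketch; the variable names are illustrative only.

```python
import unicodedata

# U+0966..U+0969: DEVANAGARI DIGIT ZERO through DEVANAGARI DIGIT THREE.
devanagari_digits = "\u0966\u0967\u0968\u0969"

# Each character has a decimal digit value, just like "0" through "3".
values = [unicodedata.digit(ch) for ch in devanagari_digits]
print(values)

# int() accepts any Unicode decimal digits, so the string parses as a number.
as_int = int(devanagari_digits)
print(as_int)
```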
Information in this standard that is not normative but that contributes to the correct use and implementation of the standard.
Industry:Computer; Software
A Unicode character property whose values are provided for information only.
Industry:Computer; Software
In writing systems based on a script in the Brahmi family of Indic scripts, a consonant letter symbol normally has an inherent vowel, unless otherwise indicated. The phonetic value of this vowel differs among the various languages written with these writing systems. An inherent vowel is overridden either by indicating another vowel with an explicit vowel sign or by using virama to create a dead consonant.
Industry:Computer; Software
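The two ways of overriding an inherent vowel can be sketched as code point sequences. Devanagari and the consonant KA are chosen here purely as an example; the entry itself does not name a specific script or letter.

```python
import unicodedata

KA = "\u0915"            # consonant letter, read with its inherent vowel
VOWEL_SIGN_I = "\u093F"  # explicit vowel sign that replaces the inherent vowel
VIRAMA = "\u094D"        # suppresses the inherent vowel (dead consonant)

sequences = {
    "inherent vowel": KA,
    "explicit vowel sign": KA + VOWEL_SIGN_I,
    "dead consonant": KA + VIRAMA,
}

for label, text in sequences.items():
    # List the character names that make up each sequence.
    names = [unicodedata.name(ch) for ch in text]
    print(f"{label}: {text} = {' + '.join(names)}")
```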