Oracle® Database Globalization Support Guide
10g Release 2 (10.2)

Part Number B14225-01

Contents

Title and Copyright Information

Send Us Your Comments

Preface

Intended Audience
Documentation Accessibility
Structure
Related Documents
Conventions

What's New in Globalization Support?

Oracle Database 10g Release 2 (10.2) New Features in Globalization
Oracle Database 10g Release 1 (10.1) New Features in Globalization

1 Overview of Globalization Support

Globalization Support Architecture
Locale Data on Demand
Architecture to Support Multilingual Applications
Using Unicode in a Multilingual Database
Globalization Support Features
Language Support
Territory Support
Date and Time Formats
Monetary and Numeric Formats
Calendars Feature
Linguistic Sorting
Character Set Support
Character Semantics
Customization of Locale and Calendar Data
Unicode Support

2 Choosing a Character Set

Character Set Encoding
What is an Encoded Character Set?
Which Characters Are Encoded?
Phonetic Writing Systems
Ideographic Writing Systems
Punctuation, Control Characters, Numbers, and Symbols
Writing Direction
What Characters Does a Character Set Support?
ASCII Encoding
How are Characters Encoded?
Single-Byte Encoding Schemes
Multibyte Encoding Schemes
Naming Convention for Oracle Character Sets
Length Semantics
Choosing an Oracle Database Character Set
Current and Future Language Requirements
Client Operating System and Application Compatibility
Character Set Conversion Between Clients and the Server
Performance Implications of Choosing a Database Character Set
Restrictions on Database Character Sets
Restrictions on Character Sets Used to Express Names
Database Character Set Statement of Direction
Choosing Unicode as a Database Character Set
Choosing a National Character Set
Summary of Supported Datatypes
Changing the Character Set After Database Creation
Monolingual Database Scenario
Character Set Conversion in a Monolingual Scenario
Multilingual Database Scenarios
Restricted Multilingual Support
Unrestricted Multilingual Support

3 Setting Up a Globalization Support Environment

Setting NLS Parameters
Choosing a Locale with the NLS_LANG Environment Variable
Specifying the Value of NLS_LANG
Overriding Language and Territory Specifications
Locale Variants
Should the NLS_LANG Setting Match the Database Character Set?
NLS Database Parameters
NLS Data Dictionary Views
NLS Dynamic Performance Views
OCINlsGetInfo() Function
Language and Territory Parameters
NLS_LANGUAGE
NLS_TERRITORY
Overriding Default Values for NLS_LANGUAGE and NLS_TERRITORY During a Session
Date and Time Parameters
Date Formats
NLS_DATE_FORMAT
NLS_DATE_LANGUAGE
Time Formats
NLS_TIMESTAMP_FORMAT
NLS_TIMESTAMP_TZ_FORMAT
Calendar Definitions
Calendar Formats
First Day of the Week
First Calendar Week of the Year
Number of Days and Months in a Year
First Year of Era
NLS_CALENDAR
Numeric Separator and List Separator Parameters
Numeric Formats
NLS_NUMERIC_CHARACTERS
NLS_LIST_SEPARATOR
Monetary Parameters
Currency Formats
NLS_CURRENCY
NLS_ISO_CURRENCY
NLS_DUAL_CURRENCY
Oracle Support for the Euro
NLS_MONETARY_CHARACTERS
NLS_CREDIT
NLS_DEBIT
Linguistic Sort Parameters
NLS_SORT
NLS_COMP
Character Set Conversion Parameter
NLS_NCHAR_CONV_EXCP
Length Semantics
NLS_LENGTH_SEMANTICS

4 Datetime Datatypes and Time Zone Support

Overview of Datetime and Interval Datatypes and Time Zone Support
Datetime and Interval Datatypes
Datetime Datatypes
DATE Datatype
TIMESTAMP Datatype
TIMESTAMP WITH TIME ZONE Datatype
TIMESTAMP WITH LOCAL TIME ZONE Datatype
Inserting Values into Datetime Datatypes
Choosing a TIMESTAMP Datatype
Interval Datatypes
INTERVAL YEAR TO MONTH Datatype
INTERVAL DAY TO SECOND Datatype
Inserting Values into Interval Datatypes
Datetime and Interval Arithmetic and Comparisons
Datetime and Interval Arithmetic
Datetime Comparisons
Explicit Conversion of Datetime Datatypes
Datetime SQL Functions
Datetime and Time Zone Parameters and Environment Variables
Datetime Format Parameters
Time Zone Environment Variables
Daylight Saving Time Session Parameter
Choosing a Time Zone File
Upgrading the Time Zone File
Setting the Database Time Zone
Setting the Session Time Zone
Converting Time Zones With the AT TIME ZONE Clause
Support for Daylight Saving Time
Examples: The Effect of Daylight Saving Time on Datetime Calculations

5 Linguistic Sorting and String Searching

Overview of Oracle's Sorting Capabilities
Using Binary Sorts
Using Linguistic Sorts
Monolingual Linguistic Sorts
Multilingual Linguistic Sorts
Multilingual Sorting Levels
Primary Level Sorts
Secondary Level Sorts
Tertiary Level Sorts
Linguistic Sort Features
Base Letters
Ignorable Characters
Contracting Characters
Expanding Characters
Context-Sensitive Characters
Canonical Equivalence
Reverse Secondary Sorting
Character Rearrangement for Thai and Laotian Characters
Special Letters
Special Combination Letters
Special Uppercase Letters
Special Lowercase Letters
Case-Insensitive and Accent-Insensitive Linguistic Sorts
Examples of Case-Insensitive and Accent-Insensitive Sorts
Specifying a Case-Insensitive or Accent-Insensitive Sort
Linguistic Sort Examples
Performing Linguistic Comparisons
Linguistic Comparison Examples
Using Linguistic Indexes
Linguistic Indexes for Multiple Languages
Requirements for Using Linguistic Indexes
Set NLS_SORT Appropriately
Specify NOT NULL in a WHERE Clause If the Column Was Not Declared NOT NULL
Example: Setting Up a French Linguistic Index
Searching Linguistic Strings
SQL Regular Expressions in a Multilingual Environment
Character Range '[x-y]' in Regular Expressions
Collation Element Delimiter '[. .]' in Regular Expressions
Character Class '[: :]' in Regular Expressions
Equivalence Class '[= =]' in Regular Expressions
Examples: Regular Expressions

6 Supporting Multilingual Databases with Unicode

Overview of Unicode
What is Unicode?
Supplementary Characters
Unicode Encodings
UTF-8 Encoding
UCS-2 Encoding
UTF-16 Encoding
Examples: UTF-16, UTF-8, and UCS-2 Encoding
Oracle's Support for Unicode
Implementing a Unicode Solution in the Database
Enabling Multilingual Support with Unicode Databases
Enabling Multilingual Support with Unicode Datatypes
How to Choose Between a Unicode Database and a Unicode Datatype Solution
When Should You Use a Unicode Database?
When Should You Use Unicode Datatypes?
Comparing Unicode Character Sets for Database and Datatype Solutions
Unicode Case Studies
Designing Database Schemas to Support Multiple Languages
Specifying Column Lengths for Multilingual Data
Storing Data in Multiple Languages
Store Language Information with the Data
Select Translated Data Using Fine-Grained Access Control
Storing Documents in Multiple Languages in LOB Datatypes
Creating Indexes for Searching Multilingual Document Contents
Creating Multilexers
Creating Indexes for Documents Stored in the CLOB Datatype
Creating Indexes for Documents Stored in the BLOB Datatype

7 Programming with Unicode

Overview of Programming with Unicode
Database Access Product Stack and Unicode
SQL and PL/SQL Programming with Unicode
SQL NCHAR Datatypes
The NCHAR Datatype
The NVARCHAR2 Datatype
The NCLOB Datatype
Implicit Datatype Conversion Between NCHAR and Other Datatypes
Exception Handling for Data Loss During Datatype Conversion
Rules for Implicit Datatype Conversion
SQL Functions for Unicode Datatypes
Other SQL Functions
Unicode String Literals
Using the UTL_FILE Package with NCHAR Data
OCI Programming with Unicode
OCIEnvNlsCreate() Function for Unicode Programming
OCI Unicode Code Conversion
Data Integrity
OCI Performance Implications When Using Unicode
OCI Unicode Data Expansion
Setting UTF-8 to the NLS_LANG Character Set in OCI
Binding and Defining SQL CHAR Datatypes in OCI
Binding and Defining SQL NCHAR Datatypes in OCI
Handling SQL NCHAR String Literals in OCI
Binding and Defining CLOB and NCLOB Unicode Data in OCI
Pro*C/C++ Programming with Unicode
Pro*C/C++ Data Conversion in Unicode
Using the VARCHAR Datatype in Pro*C/C++
Using the NVARCHAR Datatype in Pro*C/C++
Using the UVARCHAR Datatype in Pro*C/C++
JDBC Programming with Unicode
Binding and Defining Java Strings to SQL CHAR Datatypes
Binding and Defining Java Strings to SQL NCHAR Datatypes
Using the SQL NCHAR Datatypes Without Changing the Code
Using SQL NCHAR String Literals in JDBC
Data Conversion in JDBC
Data Conversion for the OCI Driver
Data Conversion for Thin Drivers
Data Conversion for the Server-Side Internal Driver
Using oracle.sql.CHAR in Oracle Object Types
oracle.sql.CHAR
Accessing SQL CHAR and NCHAR Attributes with oracle.sql.CHAR
Restrictions on Accessing SQL CHAR Data with JDBC
Character Integrity Issues in a Multibyte Database Environment
ODBC and OLE DB Programming with Unicode
Unicode-Enabled Drivers in ODBC and OLE DB
OCI Dependency in Unicode
ODBC and OLE DB Code Conversion in Unicode
OLE DB Code Conversions
ODBC Unicode Datatypes
OLE DB Unicode Datatypes
ADO Access
XML Programming with Unicode
Writing an XML File in Unicode with Java
Reading an XML File in Unicode with Java
Parsing an XML Stream in Unicode with Java

8 Oracle Globalization Development Kit

Overview of the Oracle Globalization Development Kit
Designing a Global Internet Application
Deploying a Monolingual Internet Application
Deploying a Multilingual Internet Application
Developing a Global Internet Application
Locale Determination
Locale Awareness
Localizing the Content
Getting Started with the Globalization Development Kit
GDK Quick Start
Modifying the HelloWorld Application
GDK Application Framework for J2EE
Making the GDK Framework Available to J2EE Applications
Integrating Locale Sources into the GDK Framework
Getting the User Locale From the GDK Framework
Implementing Locale Awareness Using the GDK Localizer
Defining the Supported Application Locales in the GDK
Handling Non-ASCII Input and Output in the GDK Framework
Managing Localized Content in the GDK
Managing Localized Content in JSPs and Java Servlets
Managing Localized Content in Static Files
GDK Java API
Oracle Locale Information in the GDK
Oracle Locale Mapping in the GDK
Oracle Character Set Conversion (JDK 1.4 and Later) in the GDK
Oracle Date, Number, and Monetary Formats in the GDK
Oracle Binary and Linguistic Sorts in the GDK
Oracle Language and Character Set Detection in the GDK
Oracle Translated Locale and Time Zone Names in the GDK
Using the GDK for E-Mail Programs
The GDK Application Configuration File
locale-charset-maps
page-charset
application-locales
locale-determine-rule
locale-parameter-name
message-bundles
url-rewrite-rule
Example: GDK Application Configuration File
GDK for Java Supplied Packages and Classes
oracle.i18n.lcsd
oracle.i18n.net
oracle.i18n.servlet
oracle.i18n.text
oracle.i18n.util
GDK for PL/SQL Supplied Packages
GDK Error Messages

9 SQL and PL/SQL Programming in a Global Environment

Locale-Dependent SQL Functions with Optional NLS Parameters
Default Values for NLS Parameters in SQL Functions
Specifying NLS Parameters in SQL Functions
Unacceptable NLS Parameters in SQL Functions
Other Locale-Dependent SQL Functions
The CONVERT Function
SQL Functions for Different Length Semantics
LIKE Conditions for Different Length Semantics
Character Set SQL Functions
Converting from Character Set Number to Character Set Name
Converting from Character Set Name to Character Set Number
Returning the Length of an NCHAR Column
The NLSSORT Function
NLSSORT Syntax
Comparing Strings in a WHERE Clause
Using the NLS_COMP Parameter to Simplify Comparisons in the WHERE Clause
Controlling an ORDER BY Clause
Miscellaneous Topics for SQL and PL/SQL Programming in a Global Environment
SQL Date Format Masks
Calculating Week Numbers
SQL Numeric Format Masks
Loading External BFILE Data into LOB Columns

10 OCI Programming in a Global Environment

Using the OCI NLS Functions
Specifying Character Sets in OCI
Getting Locale Information in OCI
Mapping Locale Information Between Oracle and Other Standards
Manipulating Strings in OCI
Classifying Characters in OCI
Converting Character Sets in OCI
OCI Messaging Functions
lmsgen Utility

11 Character Set Migration

Overview of Character Set Migration
Data Truncation
Additional Problems Caused by Data Truncation
Character Set Conversion Issues
Replacement Characters that Result from Using the Export and Import Utilities
Invalid Data That Results from Setting the Client's NLS_LANG Parameter Incorrectly
Changing the Database Character Set of an Existing Database
Migrating Character Data Using a Full Export and Import
Migrating a Character Set Using the CSALTER Script
Using the CSALTER Script in an Oracle Real Application Clusters Environment
Migrating Character Data Using the CSALTER Script and Selective Imports
Migrating to NCHAR Datatypes
Migrating Oracle8i Database NCHAR Columns to Oracle9i Database and Later
Changing the National Character Set
Migrating CHAR Columns to NCHAR Columns
Using the ALTER TABLE MODIFY Statement to Change CHAR Columns to NCHAR Columns
Using Online Table Redefinition to Migrate a Large Table to Unicode
Tasks to Recover Database Schema After Character Set Migration

12 Character Set Scanner Utilities

The Language and Character Set File Scanner
Syntax of the LCSSCAN Command
Examples: Using the LCSSCAN Command
Getting Command-Line Help for the Language and Character Set File Scanner
Supported Languages and Character Sets
LCSSCAN Error Messages
The Database Character Set Scanner
Conversion Tests on Character Data
Scan Modes in the Database Character Set Scanner
Full Database Scan
User Scan
Table Scan
Column Scan
Installing and Starting the Database Character Set Scanner
Access Privileges for the Database Character Set Scanner
Installing the Database Character Set Scanner System Tables
Starting the Database Character Set Scanner
Creating the Database Character Set Scanner Parameter File
Getting Command-Line Help for the Database Character Set Scanner
Database Character Set Scanner Parameters
Database Character Set Scanner Sessions: Examples
Full Database Scan: Examples
Example: Parameter-File Method
Example: Command-Line Method
Database Character Set Scanner Messages
User Scan: Examples
Example: Parameter-File Method
Example: Command-Line Method
Database Character Set Scanner Messages
Single Table Scan: Examples
Example: Parameter-File Method
Example: Command-Line Method
Database Character Set Scanner Messages
Example: Parameter-File Method
Example: Command-Line Method
Database Character Set Scanner Messages
Column Scan: Examples
Example: Parameter-File Method
Example: Command-Line Method
Database Character Set Scanner Messages
Database Character Set Scanner Reports
Database Scan Summary Report
Database Size
Database Scan Parameters
Scan Summary
Data Dictionary Conversion Summary
Application Data Conversion Summary
Application Data Conversion Summary Per Column Size Boundary
Distribution of Convertible Data Per Table
Distribution of Convertible Data Per Column
Indexes To Be Rebuilt
Truncation Due To Character Semantics
Character Set Detection Result
Language Detection Result
Database Scan Individual Exception Report
Database Scan Parameters
Data Dictionary Individual Exceptions
Application Data Individual Exceptions
How to Handle Convertible or Lossy Data in the Data Dictionary
Storage and Performance Considerations in the Database Character Set Scanner
Storage Considerations for the Database Character Set Scanner
CSM$TABLES
CSM$COLUMNS
CSM$ERRORS
Performance Considerations for the Database Character Set Scanner
Using Multiple Scan Processes
Setting the Array Fetch Buffer Size
Optimizing the QUERY Clause
Suppressing Exception and Convertible Log
Recommendations and Restrictions for the Database Character Set Scanner
Scanning Database Containing Data Not in the Database Character Set
Scanning Database Containing Data from Two or More Character Sets
Database Character Set Scanner CSALTER Script
Checking Phase of the CSALTER Script
Updating Phase of the CSALTER Script
Database Character Set Scanner Views
CSMV$COLUMNS
CSMV$CONSTRAINTS
CSMV$ERRORS
CSMV$INDEXES
CSMV$TABLES
Database Character Set Scanner Error Messages

13 Customizing Locale

Overview of the Oracle Locale Builder Utility
Configuring Unicode Fonts for the Oracle Locale Builder
Font Configuration on Windows
Font Configuration on Other Platforms
The Oracle Locale Builder User Interface
Oracle Locale Builder Windows and Dialog Boxes
Existing Definitions Dialog Box
Session Log Dialog Box
Preview NLT Tab Page
Open File Dialog Box
Creating a New Language Definition with the Oracle Locale Builder
Creating a New Territory Definition with the Oracle Locale Builder
Customizing Time Zone Data
Customizing Calendars with the NLS Calendar Utility
Displaying a Code Chart with the Oracle Locale Builder
Creating a New Character Set Definition with the Oracle Locale Builder
Character Sets with User-Defined Characters
Oracle Character Set Conversion Architecture
Unicode 4.0 Private Use Area
User-Defined Character Cross-References Between Character Sets
Guidelines for Creating a New Character Set from an Existing Character Set
Example: Creating a New Character Set Definition with the Oracle Locale Builder
Creating a New Linguistic Sort with the Oracle Locale Builder
Changing the Sort Order for All Characters with the Same Diacritic
Changing the Sort Order for One Character with a Diacritic
Generating and Installing NLB Files
Transportable NLB Data

A Locale Data

Languages
Translated Messages
Territories
Character Sets
Recommended Database Character Sets
Other Character Sets
Character Sets that Support the Euro Symbol
Client-Only Character Sets
Universal Character Sets
Character Set Conversion Support
Subsets and Supersets
Language and Character Set Detection Support
Linguistic Sorts
Calendar Systems
Time Zone Names
Obsolete Locale Data
Obsolete Linguistic Sorts
Obsolete Territories
Obsolete Languages
New Names for Obsolete Character Sets
AL24UTFFSS Character Set Desupported
Updates to the Oracle Language and Territory Definition Files

B Unicode Character Code Assignments

Unicode Code Ranges
UTF-16 Encoding
UTF-8 Encoding

Glossary

Index