Oracle9i Application Server Globalization Support Guide
Release 2 (9.0.2)

Part Number A92110-02
2
Developing Global Internet Applications for Oracle9iAS

This chapter contains the following topics:

Overview of Developing Global Internet Applications

Building an Internet application for Oracle9iAS that supports different locales requires good development practices. The application itself must be aware of the user's locale and be able to present locale-appropriate content to the user. Clients must be able to communicate with the application server regardless of the client's locale, with minimal character set conversion. The application server must be able to access the database server with data in many languages, again with minimal character set conversion. Character set conversion decreases performance and increases the chance of data loss because some characters may not be available in the target character set.

See Also:

Oracle9i Globalization Support Guide in the Oracle Database Documentation Library for more information about character set conversion

Oracle9iAS supports the following programming languages and corresponding Web development environments:

Developing Locale-Aware Internet Applications

Global Internet applications need to be aware of the user's locale.

A monolingual application by definition serves users with the same locale. A user's locale is fixed in a monolingual application and is usually the same as the default runtime locale of the programming environment.

In a multilingual application, the user's locale can vary. Multilingual applications should:

Locale-sensitive functions, such as date formatting, are built into programming environments such as C/C++, Java, and PL/SQL. Applications can use locale-sensitive functions to format the HTML pages according to the cultural conventions of the user's locale.

Different programming environments represent locales in different ways. For example, the French (Canada) locale is represented as follows:

| Environment | Representation | Locale | Explanation |
|---|---|---|---|
| Various | ISO standard | fr-CA | fr is the language code defined in the ISO 639 standard. CA is the country code defined in the ISO 3166 standard. |
| Java | Java locale object | fr_CA | Java uses the ISO language and country code. |
| C/C++ | POSIX locale name | fr_CA on UNIX | UNIX locale names may include a character set that overrides the default character set. For example, the de.ISO8859-15 locale is used to support the Euro symbol. |
| PL/SQL and SQL | NLS_LANGUAGE and NLS_TERRITORY parameters | NLS_LANGUAGE="CANADIAN FRENCH", NLS_TERRITORY="CANADA" | See Also: "Setting NLS_LANG in a Multilingual Application Architecture". |

Table 2-1 shows how different programming environments represent some commonly used locales.

Table 2-1 Locale Representations in Different Programming Environments

| Locale | ISO | Java | UNIX | NLS_LANGUAGE, NLS_TERRITORY |
|---|---|---|---|---|
| Arabic (U.A.E.) | ar | ar | ar | ARABIC, UNITED ARAB EMIRATES |
| German (Germany) | de-DE | de_DE | de | GERMAN, GERMANY |
| English (U.S.A) | en | en_US | en_US | AMERICAN, AMERICA |
| English (United Kingdom) | en-GB | en_GB | en_UK | ENGLISH, UNITED KINGDOM |
| Greek | el | el | el | GREEK, GREECE |
| Spanish (Spain) | es-ES | es_ES | es | SPANISH, SPAIN |
| French (France) | fr | fr_FR | fr | FRENCH, FRANCE |
| French (Canada) | fr-CA | fr_CA | fr_CA | CANADIAN FRENCH, CANADA |
| Hebrew | he | he | he | HEBREW, ISRAEL |
| Italian (Italy) | it | it | it | ITALIAN, ITALY |
| Japanese | ja-JP | ja_JP | ja_JP | JAPANESE, JAPAN |
| Korean | ko-KR | ko_KR | ko_KR | KOREAN, KOREA |
| Portuguese (Portugal) | pt | pt | pt | PORTUGUESE, PORTUGAL |
| Portuguese (Brazil) | pt-BR | pt_BR | pt_BR | BRAZILIAN PORTUGUESE, BRAZIL |
| Turkish | tr | tr | tr | TURKISH, TURKEY |
| Thai | th | th | th | THAI, THAILAND |
| Chinese (Taiwan) | zh-TW | zh_TW | zh_TW | TRADITIONAL CHINESE, TAIWAN |
| Chinese (P.R.C) | zh-CN | zh_CN | zh_CN | SIMPLIFIED CHINESE, CHINA |

If you write applications for more than one programming environment, then locales must be synchronized between environments. For example, Java applications that call PL/SQL procedures should map the Java locales to the corresponding NLS_LANGUAGE and NLS_TERRITORY values and change the parameter values to match the user's locale before calling the PL/SQL procedures.
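The synchronization described above can be sketched as a lookup from a Java locale to the Oracle values listed in Table 2-1. This is a minimal sketch rather than a definitive implementation: the class name NlsLocaleMapper, the method toNls(), and the fallback to AMERICAN and AMERICA are assumptions made for illustration, and only a few rows of the table are included.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper (not part of any Oracle API): maps a Java locale to the
// NLS_LANGUAGE and NLS_TERRITORY values shown in Table 2-1.
final class NlsLocaleMapper {
    private static final Map<String, String[]> NLS = new HashMap<String, String[]>();
    static {
        NLS.put("en_US", new String[] {"AMERICAN", "AMERICA"});
        NLS.put("fr_FR", new String[] {"FRENCH", "FRANCE"});
        NLS.put("fr_CA", new String[] {"CANADIAN FRENCH", "CANADA"});
        NLS.put("ja_JP", new String[] {"JAPANESE", "JAPAN"});
        NLS.put("pt_BR", new String[] {"BRAZILIAN PORTUGUESE", "BRAZIL"});
    }

    // Returns {NLS_LANGUAGE, NLS_TERRITORY} for the locale.
    // Unknown locales fall back to en_US (an assumption of this sketch).
    static String[] toNls(Locale locale) {
        String[] nls = NLS.get(locale.toString());
        return nls != null ? nls : NLS.get("en_US");
    }

    public static void main(String[] args) {
        String[] nls = toNls(Locale.CANADA_FRENCH);
        // Statements an application could issue before calling PL/SQL.
        System.out.println("alter session set nls_language = '" + nls[0] + "'");
        System.out.println("alter session set nls_territory = '" + nls[1] + "'");
    }
}
```

An application would typically run the resulting ALTER SESSION statements on the database connection before it calls the PL/SQL procedure, so that both environments work with the same user locale.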

This section contains the following topics:

Determining the User's Locale in Monolingual Applications

A monolingual application by definition serves users with the same locale. A user's locale is fixed in a monolingual application and is usually the same as the default runtime locale of the programming environment.

Determining the User's Locale in Multilingual Applications

Multilingual applications can determine a user's locale dynamically in the following ways:

You can use these methods of determining the user's locale together or separately. After the application determines the locale, the locale should be:

Locale Awareness in Java Applications

A Java locale object represents the corresponding user's locale in Java. The Java encoding used for the locale is required to properly convert Java strings to byte data and vice versa.

Consider the Java encoding for the locale when you make the Java code aware of a user's locale. There are two ways to make a Java method sensitive to the Java locale and the Java encoding:

These approaches are discussed in the following sections:

Locale Awareness in Monolingual Java Applications

Monolingual applications should run implicitly with the default Java locale and default Java encoding so that the applications can be configured easily for a different locale. For example, to create a date formatter using the default Java locale, use the following method call:

DateFormat df = DateFormat.getDateTimeInstance(DateFormat.FULL, DateFormat.FULL);
dateString = df.format(date); /* Format a date */

Locale Awareness in Multilingual Java Applications

You should develop multilingual applications such that they are independent of fixed default locales or encodings. Explicitly specify the Java locale and Java encoding that correspond to the current user's locale. For example, specify the Java locale object that corresponds to the user's locale, identified by user_locale, in the getDateTimeInstance() method:

DateFormat df = DateFormat.getDateTimeInstance(DateFormat.FULL, DateFormat.FULL,
    user_locale);
dateString = df.format(date); /* Format a date */

Note that the only difference between the example code for the monolingual application and the multilingual application is the inclusion of user_locale.
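For readers who want to try the multilingual form, the following is a small self-contained sketch. The class name LocaleDateDemo and the method formatFor() are hypothetical names chosen for this illustration. The same date is formatted for two locales without changing the default locale of the Java runtime.

```java
import java.text.DateFormat;
import java.util.Date;
import java.util.Locale;

// Hypothetical demo class: formats one date for different user locales.
final class LocaleDateDemo {
    // Formats the date for the given locale; the JVM default locale is not used.
    static String formatFor(Date date, Locale userLocale) {
        DateFormat df = DateFormat.getDateTimeInstance(
                DateFormat.FULL, DateFormat.FULL, userLocale);
        return df.format(date);
    }

    public static void main(String[] args) {
        Date now = new Date();
        System.out.println(formatFor(now, Locale.US));
        System.out.println(formatFor(now, Locale.CANADA_FRENCH));
    }
}
```

The exact text printed depends on the locale data shipped with the Java runtime, so no expected output is shown here.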

Similarly, do not use encoding-sensitive methods that assume the default Java encoding. For example, you should not use the String.getBytes() method in a multilingual application because it is encoding-sensitive. Instead, use the method that accepts encoding as an argument, which is String.getBytes(String encoding). Be sure to specify the encoding used for the user's locale.
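The effect of naming the encoding explicitly can be seen in the following sketch. The class ExplicitEncodingDemo and its encode() method are hypothetical, and the encoding names would normally come from the user's locale rather than being hard-coded as they are here.

```java
import java.io.UnsupportedEncodingException;

// Hypothetical demo class: converts a string to bytes in a named encoding.
final class ExplicitEncodingDemo {
    // Uses String.getBytes(String encoding) so that the result does not
    // depend on the default Java encoding of the runtime.
    static byte[] encode(String text, String encoding) {
        try {
            return text.getBytes(encoding);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported encoding: " + encoding, e);
        }
    }

    public static void main(String[] args) {
        String name = "Schlo\u00df"; // the German word "Schloß", written with an escape
        System.out.println(encode(name, "ISO-8859-1").length); // prints 6
        System.out.println(encode(name, "UTF-8").length);      // prints 7
    }
}
```

The same string occupies a different number of bytes in each encoding, which is why a method that silently uses the default encoding can produce different data on different servers.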

Do not use the Locale.setDefault() method to change the default locale because:

Locale Awareness in Perl and C/C++ Applications

Perl and C/C++ use the POSIX locale model for internationalized applications. The implementation for monolingual and multilingual applications is discussed in the following sections:

Locale Awareness in Monolingual Perl and C/C++ Applications

Monolingual applications should be sensitive to the default POSIX locale, which is configured by changing the value of the LC_ALL environment variable or changing the operating system locale from the Control Panel in Windows.

See Also:

Table 2-1 for a list of commonly used POSIX locales

To run on the default POSIX locale, applications should call the setlocale() function to set the default locale to the one defined by LC_ALL, and thereafter use POSIX locale-sensitive functions such as strftime(). Note that the setlocale() function affects the current process and all of its threads, so a multithreaded application should assume the same POSIX locale in every thread. The following Perl example gets the current time in the format specific to the default locale:

use locale;
use POSIX qw (locale_h);
...
$old_locale = setlocale( LC_ALL, "" );
$dateString = POSIX::strftime( "%c", localtime());
...

Locale Awareness in Multilingual Perl and C/C++ Applications

Multilingual applications should be sensitive to dynamically determined locales. Call the setlocale() function to initialize the locale before calling locale-sensitive functions. For example, the following C code gets the local time in the format of the user locale identified by user_locale:

#include <locale.h>
#include <time.h>
    ...
    const char *user_locale = "fr";
    time_t ltime;
    struct tm *thetime;
    char dateString[100];
    ...
    setlocale(LC_ALL, user_locale);   /* switch to the user's POSIX locale */
    time(&ltime);
    thetime = localtime(&ltime);      /* local time rather than UTC */
    strftime(dateString, sizeof(dateString), "%c", thetime);
    ...

You must map user locales to POSIX locale names for applications to initialize the correct locale dynamically in C/C++ and Perl. The POSIX locales depend on the operating system.

Locale Awareness in SQL and PL/SQL Applications

PL/SQL procedures run in the context of a database session whose locale is initialized by the NLS_LANG parameter in the database access descriptor (DAD). The NLS_LANG parameter specifies top-level NLS parameters, NLS_LANGUAGE and NLS_TERRITORY, for the database session. Other NLS parameters, such as NLS_SORT and NLS_DATE_LANGUAGE, inherit their values from these top-level parameters. These NLS parameters define the locale of a database session.

See Also:

  • "Configuring the NLS_LANG Environment Variable"

  • PL/SQL User's Guide and Reference

  • Oracle9i Database Reference in the Oracle Database Documentation Library

  • Oracle9i Globalization Support Guide in the Oracle Database Documentation Library

for more information about NLS parameters

There are two ways to make SQL and PL/SQL functions locale sensitive:

This section contains the following topics:

Locale Awareness in Monolingual SQL and PL/SQL Applications

Generally speaking, the initial values of the NLS parameters inherited from NLS_LANG are sufficient for monolingual PL/SQL procedures. For example, the following PL/SQL code calls the TO_CHAR() function to get the formatted date, which uses the current values of the NLS_DATE_FORMAT and NLS_DATE_LANGUAGE parameters:

mydate date;
dateString varchar2(100);
...
select sysdate into mydate from dual;
dateString := TO_CHAR(mydate);

If the initial values of the NLS parameters are not appropriate, then use an ALTER SESSION statement to overwrite them for the current database session. You can use the ALTER SESSION statement with the DBMS_SQL package. For example:

cur integer;
status integer;
...
cur := dbms_sql.open_cursor;
dbms_sql.parse(cur, 'alter session set nls_date_format = ''Day Month, YYYY''',
       dbms_sql.native);
status := dbms_sql.execute(cur);

Locale Awareness in Multilingual SQL and PL/SQL Applications

Multilingual applications should use ALTER SESSION statements to change the locale of the database session to the user's locale before calling any locale-sensitive SQL or PL/SQL functions. You can use the ALTER SESSION statement with the DBMS_SQL package. For example:

cur integer;
status integer;
...
cur := dbms_sql.open_cursor;
dbms_sql.parse(cur, 'alter session set nls_language = ''NLS_LANGUAGE_of_user_locale''',
       dbms_sql.native);
status := dbms_sql.execute(cur);
dbms_sql.parse(cur, 'alter session set nls_territory = ''NLS_TERRITORY_of_user_locale''',
       dbms_sql.native);
status := dbms_sql.execute(cur);

Alternatively, applications can specify the NLS parameters in every SQL function that accepts an NLS parameter as an argument. For example, the following PL/SQL code gets a date string based on the language of the user's locale:

mydate date;
dateString varchar2(100);
...
select sysdate into mydate from dual;
dateString := TO_CHAR(mydate, 'DD-MON-YYYY HH24:MI:SS',
                    'NLS_DATE_LANGUAGE=language' );
...

language specifies the Oracle language name for the user's locale.

Encoding HTML Pages

The encoding of an HTML page is important information for both the browser and the Internet application. You can think of the page encoding as the character set used for the locale that the Internet application is serving. The browser needs to know the page encoding so that it can use the correct fonts and character set mapping tables to display pages. Internet applications need to know the HTML page encoding so that they can process input data from an HTML form. To correctly specify the page encoding for HTML pages, Internet applications must:

This section contains the following topics:

Choosing an HTML Page Encoding

This section contains the following topics:

Choosing an HTML Page Encoding for Monolingual Applications

The HTML page encoding is based on the user's locale. If the application is monolingual, it supports only one locale per instance. Therefore, you should encode HTML pages in the native encoding for that locale. The encoding should be equivalent to the Oracle character set specified by the NLS_LANG parameter in the Oracle HTTP Server configuration file.

Table 2-2 lists the Oracle character set names for the native encodings of the most commonly used locales, along with the corresponding Internet Assigned Numbers Authority (IANA) encoding names and Java encoding names. Use these character sets for monolingual applications.

See Also:

"Setting NLS_LANG in a Monolingual Application Architecture"

Table 2-2 Native Encodings for Commonly Used Locales

| Language | Oracle Character Set Name | IANA Encoding Name | Java Encoding Name |
|---|---|---|---|
| Western European | WE8MSWIN1252 | ISO-8859-1 | ISO8859_1 |
| Central European | EE8MSWIN1250 | ISO-8859-2 | ISO8859_2 |
| Japanese | JA16SJIS | Shift_JIS | MS932 |
| Traditional Chinese | ZHT16MSWIN950 | Big5 | MS950 |
| Simplified Chinese | ZHS16GBK | GB2312 | GBK |
| Korean | KO16MSWIN949 | EUC-KR | MS949 |
| Arabic | AR8MSWIN1256 | ISO-8859-6 | ISO8859_6 |
| Hebrew | IW8MSWIN1255 | ISO-8859-8 | ISO8859_8 |
| Cyrillic | CL8MSWIN1251 | ISO-8859-5 | ISO8859_5 |
| Baltic | BLT8MSWIN1257 | ISO-8859-4 | ISO8859_4 |
| Greek | EL8MSWIN1253 | ISO-8859-7 | ISO8859_7 |
| Thai | TH8TISASCII | TIS-620 | TIS620 |
| Turkish | TR8MSWIN1254 | ISO-8859-9 | ISO8859_9 |
| Universal | UTF8 | UTF-8 | UTF8 |

Choosing an HTML Page Encoding for Multilingual Applications

Multilingual applications need to determine the encoding used for the current user's locale at runtime and map the locale to the encoding as shown in Table 2-2.
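One way to sketch this runtime mapping in Java is a lookup keyed on the language of the user's locale. The class PageEncodingChooser, the method ianaEncodingFor(), and the UTF-8 fallback for unlisted languages are assumptions made for this illustration. The encoding names are taken from the IANA column of Table 2-2, and only some rows are covered.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: picks an IANA page encoding for a user locale,
// following the IANA column of Table 2-2.
final class PageEncodingChooser {
    private static final Map<String, String> BY_LANGUAGE = new HashMap<String, String>();
    static {
        BY_LANGUAGE.put("fr", "ISO-8859-1");
        BY_LANGUAGE.put("de", "ISO-8859-1");
        BY_LANGUAGE.put("ja", "Shift_JIS");
        BY_LANGUAGE.put("ko", "EUC-KR");
        BY_LANGUAGE.put("ar", "ISO-8859-6");
        BY_LANGUAGE.put("el", "ISO-8859-7");
        BY_LANGUAGE.put("th", "TIS-620");
        BY_LANGUAGE.put("tr", "ISO-8859-9");
    }

    static String ianaEncodingFor(Locale userLocale) {
        // Chinese needs the country to tell Traditional from Simplified.
        if ("zh".equals(userLocale.getLanguage())) {
            return "TW".equals(userLocale.getCountry()) ? "Big5" : "GB2312";
        }
        String encoding = BY_LANGUAGE.get(userLocale.getLanguage());
        return encoding != null ? encoding : "UTF-8"; // fallback is an assumption
    }

    public static void main(String[] args) {
        System.out.println(ianaEncodingFor(Locale.JAPAN));  // prints Shift_JIS
        System.out.println(ianaEncodingFor(Locale.TAIWAN)); // prints Big5
    }
}
```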

Instead of using different native encodings for different locales, you can use UTF-8 for all page encodings. Using the UTF-8 encoding not only simplifies the coding for multilingual applications but also supports multilingual content. In fact, if a multilingual Internet application is written in Perl, the best choice for the HTML page encoding is UTF-8 because this programming environment does not provide an intuitive and efficient way to convert HTML content from UTF-8 to the native encodings of various locales.

There are limitations to using UTF-8 with the Netscape 4.x browser:

Netscape 6 resolves the second and third limitations.

Specifying the Page Encoding for HTML Pages

The best practice for monolingual and multilingual applications is to specify the encoding of HTML pages returned to the client browser. The encoding of HTML pages can tell the browser to:

There are two ways to specify the encoding of an HTML page:

If you use both methods, then specifying the encoding in the HTTP header takes precedence.

Specifying the Encoding in the HTTP Header

Include the Content-Type header in the HTTP response. It specifies the content type and character set. The most commonly used browsers, such as Netscape 4.0 or later and Internet Explorer 4.0 or later, correctly interpret this header. For an HTML page, the Content-Type HTTP header has the following form:

Content-Type: text/html; charset=iso-8859-4

The charset parameter specifies the encoding for the HTML page. The possible values for the charset parameter are the IANA names for the character encoding that the browser supports. Table 2-2 shows commonly used IANA names.

Specifying the Encoding in the HTML Page Header

Use this method primarily for static HTML pages. Specify the character encoding in the HTML header as follows:

<meta http-equiv="Content-Type" content="text/html;charset=utf-8">

The charset parameter specifies the encoding for the HTML page. The possible values for the charset parameter are the IANA names for the character encoding that the browser supports. Table 2-2 shows commonly used IANA names.

Specifying the Page Encoding in Java Servlets and Java Server Pages

For both monolingual and multilingual applications, you can specify the encoding of an HTML page in the Content-Type HTTP header in a Java Server Page (JSP) using the contentType page directive. For example:

<%@ page contentType="text/html; charset=utf-8" %>

This is the MIME type and character encoding that the JSP file uses for the response it sends to the client. You can use any MIME type or IANA character set name that is valid for the JSP container. The default MIME type is text/html, and the default character set is ISO-8859-1. In the example, the character set is set to UTF-8. The character set of the contentType page directive directs the JSP engine to encode the dynamic HTML page and set the HTTP Content-Type header with the specified character set.

For Java Servlets, you can call the setContentType() method of the Servlet API to specify a page encoding in the HTTP header. The following doGet() function shows how you should call this method:

public void doGet(HttpServletRequest req, HttpServletResponse res)
    throws ServletException, IOException
{
    // generate the MIME type and character set header
    res.setContentType("text/html; charset=utf-8");
    ...
    // generate the HTML page
    PrintWriter out = res.getWriter();
    out.println("<HTML>");
    ...
    out.println("</HTML>");
}

You should call the setContentType() method before the getWriter() method because the getWriter() method initializes an output stream writer that uses the character set that the setContentType() method call specifies. Any HTML content written to the writer and eventually to a browser is encoded in the encoding that the setContentType() call specifies.

Specifying the Page Encoding in SQL and PL/SQL Server Pages

You can specify a page encoding for PL/SQL front-end applications and PL/SQL Server Pages (PSP) in two ways:

The specified page encoding tells the mod_plsql module and the Web Toolkit to tag the corresponding charset parameter in the Content-Type header of an HTML page and to convert the page content to the corresponding character set.

This section includes the following topics:

Specifying the Page Encoding in PL/SQL and PSPs for Monolingual Environments

In order for monolingual applications to take the page encoding from the NLS_LANG parameter, the Content-Type HTTP header should not specify a page encoding. For PL/SQL procedures, the call to mime_header(), if any, should be similar to the following:

owa_util.mime_header('text/html',false);

For PSPs, the content type directive should be similar to the following:

<%@ page contentType="text/html"%>

Without the page encoding specified in the mime_header() function call or the content type directive, the Web Toolkit API uses the NLS_LANG character set as the page encoding by default, and converts HTML content to the NLS_LANG character set. Also, the Web Toolkit API automatically adds the default page encoding to the charset parameter of the Content-Type header.

Specifying the Page Encoding in PL/SQL and PSPs for Multilingual Environments

You can specify a page encoding in a PSP the same way that you specify it in a JSP page. The following directive tells the PSP compiler to generate code to set the page encoding in the HTTP Content-Type header for this page:

<%@ page contentType="text/html; charset=utf-8" %>

To specify the encoding in the Content-Type HTTP header for PL/SQL procedures, use the Web Toolkit API in the PL/SQL procedures. The Web Toolkit API includes the OWA_UTIL package, which allows you to specify the Content-Type header as follows:

owa_util.mime_header('text/html', false, 'utf-8');

You should call the mime_header() function in the context of the HTTP header. It generates the following Content-Type header in the HTTP response:

Content-Type: text/html; charset=utf-8

After you specify a page encoding, the Web Toolkit API converts HTML content to the specified page encoding.

Specifying the Page Encoding in Perl

For Perl scripts running in the mod_perl environment, specify the encoding of an HTML page in the HTTP Content-Type header as follows:

$page_encoding = 'utf-8';
$r->content_type("text/html; charset=$page_encoding");
$r->send_http_header;
return OK if $r->header_only;

This section contains the following topics:

Specifying the Page Encoding in Perl for Monolingual Applications

For monolingual applications, the encoding of an HTML page should be equivalent to:

Specifying the Page Encoding in Perl for Multilingual Applications

For multilingual applications, Perl scripts should run in an environment where:

This environment allows the scripts to process data in any language in UTF-8. The page encoding of the dynamic HTML pages generated from the scripts, however, could be different from UTF-8. If so, then use the Unicode::MapUTF8 Perl module to convert data from UTF-8 to the page encoding.

See Also:

http://www.cpan.org to download the Unicode::MapUTF8 Perl module

The following example illustrates how to use the Unicode::MapUTF8 Perl module to generate HTML pages in the Shift_JIS encoding:

use Unicode::MapUTF8 qw(from_utf8);
# This shows how the UTF8 Perl pragma is specified
# but is NOT required by the from_utf8 function.
use utf8;
...
$page_encoding = 'Shift_JIS';
$r->content_type("text/html; charset=$page_encoding");
$r->send_http_header;
return OK if $r->header_only;
...
# $html_lines contains HTML content in UTF-8
print (from_utf8({ -string=>$html_lines, -charset=>$page_encoding}));
...

The from_utf8() function converts dynamic HTML content from UTF-8 to the character set specified in the -charset argument.

Specifying the Page Encoding in Oracle9iAS Reports Services Applications

This section includes the following topics:

Specifying the Page Encoding in JSP Reports for the Web

You can specify the page encoding in JSP or HTML with the Web Source Editor in Reports Builder.

See Also:

"Specifying the Encoding in the HTML Page Header" and "Specifying the Page Encoding in Java Servlets and Java Server Pages" for more information.

Specifying the Page Encoding in HTML and XML Output for Paper Layout

In an Oracle9iAS Reports Services architecture, you must ensure that the output data is correctly converted and displayed in the appropriate character set. Oracle Net manages the conversion between the customer database and the Reports Server. Report output is generated using the Reports Server character set.

Specifying the Page Encoding in HTML for Reports Services

Specify the HTML page encoding in the page header. For example, to specify a Japanese character set, include the following tag in the page header:

<META http-equiv="Content-Type" content="text/html;charset=SHIFT_JIS">

See Also:

"Specifying the Encoding in the HTML Page Header"

Reports Builder puts this tag in your report via the Before Report Value and Before Form Value properties. The default values for these properties are similar to the following:

<html><head><meta http-equiv="Content-Type" content="text/html;charset=&Encoding"></head>

The IANA encoding name that is equivalent to the character set in the NLS_LANG setting for Oracle9iAS Reports Services is assigned to &Encoding dynamically at runtime. Thus you do not need to modify your report or Oracle9iAS Reports Services settings to include the proper encoding.

See Also:

Oracle9iAS Reports Services online help for more information

Specifying the Page Encoding in XML for Reports Services

Generally, you specify the encoding of an XML document by including a statement similar to the following as the Prolog on the first line of the generated XML file:

<?xml version="1.0" encoding="SHIFT_JIS"?>

To set this Prolog in your report, you can specify the XML Prolog Value property of your report in Reports Builder or use the SRW.SET_XML_PROLOG built-in.


Note:

Currently, some Oracle NLS_CHARSET values have no equivalent IANA character set. The XML saved by Oracle9i Reports Developer for reports with these character sets cannot be opened by some XML viewers, such as Internet Explorer, unless you set REPORTS_NLS_XML_CHARSETS to the following:

WINDOWS-950=BIG5;CSEUCKR=EUC-KR;


See Also:

Oracle9iAS Reports Builder online help for more information

Handling HTML Form Input

Applications generate HTML forms to get user input. For Netscape and Internet Explorer browsers, the encoding of the input always corresponds to the encoding of the forms for both POST and GET requests. In other words, if the encoding of a form is UTF-8, input text that the browser returns is encoded in UTF-8. Thus Internet applications can control the encoding of the form input by specifying the corresponding encoding in the HTML form that requests information.

How a browser passes input in a POST request is different from how it passes input in a GET request:

HTML standards allow named and numbered entities. These special codes allow users to specify characters. For example, &aelig; and &#230; both refer to the character æ. Tables of these entities are available at

http://www.w3.org/TR/REC-html40/sgml/entities.html

Some browsers generate numbered or named entities for any input character that cannot be encoded in the encoding of an HTML form. For example, the Euro character and the character à (Unicode values 8364 and 224 respectively) cannot be encoded in Big5 encoding and are sent as &#8364; and &agrave; when the HTML encoding is Big5. However, the browser does not need to generate numbered or named entities if the page encoding of the HTML form is UTF-8 because all characters can be encoded in UTF-8. Internet applications that support page encoding other than UTF-8 need to be able to handle numbered and named entities.
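As an illustration of what handling numbered entities can involve, the following Java sketch replaces decimal references such as &#8364; with the characters they denote. The class NumericEntityDecoder and its decode() method are hypothetical names. Named entities such as &agrave; would need an additional lookup table and are deliberately left unchanged by this sketch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: decodes decimal numbered entities in form input.
final class NumericEntityDecoder {
    // Matches references such as &#8364; (at most 7 digits, so parseInt cannot overflow).
    private static final Pattern NUMERIC = Pattern.compile("&#(\\d{1,7});");

    static String decode(String input) {
        Matcher m = NUMERIC.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int codePoint = Integer.parseInt(m.group(1));
            // Leave references that are not valid Unicode code points untouched.
            String replacement = Character.isValidCodePoint(codePoint)
                    ? new String(Character.toChars(codePoint))
                    : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints true: &#8364; is decoded to the Euro character U+20AC.
        System.out.println(decode("&#8364;100").equals("\u20ac100"));
    }
}
```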

Handling HTML Form Input in Java

In most JSP and Servlet containers, including Apache JServ, the Servlet API implementation assumes that incoming form input is in ISO-8859-1 encoding. As a result, when the HttpServletRequest.getParameter() API is called, all embedded %XX data in the input text is decoded, and the decoded input is converted from ISO-8859-1 to Unicode and returned as a Java string. The Java string returned is incorrect if the encoding of the HTML form is not ISO-8859-1. However, you can work around this problem by converting the form input data. When a JSP or Java Servlet receives form input in a Java string, it needs to convert it back to the original form in bytes, and then convert the original form to a Java string based on the correct encoding.

The following code converts a Java string to the correct encoding. The Java string real is initialized to store the correct characters from a UTF-8 form:

String original = request.getParameter("name");
String real;   // declared outside the try block so it can be used afterward
try
{
    real = new String(original.getBytes("8859_1"), "UTF8");
}
catch (UnsupportedEncodingException e)
{
    real = original;
}
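The conversion above can be tried outside a servlet container with the following sketch, which simulates the container's behavior. The class FormInputRoundTrip and the method redecode() are hypothetical names, and the simulated input assumes that the form was submitted in UTF-8.

```java
import java.io.UnsupportedEncodingException;

// Hypothetical demo class: reproduces and repairs the ISO-8859-1 misdecoding.
final class FormInputRoundTrip {
    // Turns the misdecoded string back into its original bytes, then decodes
    // those bytes with the real encoding of the HTML form.
    static String redecode(String misdecoded, String formEncoding) {
        try {
            return new String(misdecoded.getBytes("ISO-8859-1"), formEncoding);
        } catch (UnsupportedEncodingException e) {
            return misdecoded; // fall back to the string as received
        }
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String typed = "Schlo\u00df"; // what the user entered in the form
        // Simulate the container: UTF-8 bytes interpreted as ISO-8859-1.
        String misdecoded = new String(typed.getBytes("UTF-8"), "ISO-8859-1");
        System.out.println(typed.equals(misdecoded));                    // prints false
        System.out.println(typed.equals(redecode(misdecoded, "UTF-8"))); // prints true
    }
}
```

The round trip works because ISO-8859-1 maps every byte value to exactly one character, so no bytes are lost when the container performs the wrong conversion.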

In addition to Java encoding names, you can use IANA encoding names as aliases in Java functions.

See Also:

Table 2-2 for mapping between commonly used IANA and Java encoding names
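A quick way to check whether two names refer to the same encoding is the java.nio.charset.Charset class, which is available in JDK 1.4 and later. The sketch below uses the hypothetical class EncodingAliasDemo and compares a few names from Table 2-2. Which aliases are recognized depends on the Java runtime, so treat the names shown as examples.

```java
import java.nio.charset.Charset;

// Hypothetical demo class: checks whether two encoding names are aliases.
final class EncodingAliasDemo {
    // Charset.forName() accepts canonical names and aliases alike and
    // throws an unchecked exception if a name is not supported.
    static boolean sameCharset(String nameA, String nameB) {
        return Charset.forName(nameA).equals(Charset.forName(nameB));
    }

    public static void main(String[] args) {
        System.out.println(sameCharset("ISO-8859-1", "ISO8859_1")); // prints true
        System.out.println(sameCharset("UTF-8", "UTF8"));           // prints true
    }
}
```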

OC4J implements Servlet API 2.3, from which you can get the correct input by setting the CharEncoding attribute of the HTTP request object before calling the getParameter() function. Use the following code:

request.setCharacterEncoding("UTF8");
String real = request.getParameter("name");

Handling HTML Form Input in PL/SQL

The browser passes form input to PL/SQL procedures as PL/SQL procedure arguments. When a browser issues a POST or a GET request, it first sends the form input to the mod_plsql module in the encoding of the requesting HTML form. The mod_plsql module then decodes all %XX escape sequences in the input to their actual binary representations. It then passes the input to the PL/SQL procedure serving the request.

You should construct PL/SQL arguments you use to accept form input with the VARCHAR2 datatype. Data in VARCHAR2 are always encoded in the database character set. For example, the following PL/SQL procedure accepts two parameters in VARCHAR2:

procedure test(name VARCHAR2, gender VARCHAR2)
is
begin
...
end;

By default, the mod_plsql module assumes that the arguments of a PL/SQL procedure are in VARCHAR2 datatype when it binds them. Using VARCHAR2 as the argument datatype means that the module uses Oracle Character Set Conversion facility provided in Oracle Callable Library to convert form input data properly from the NLS_LANG character set, which is also your page encoding, to the database character set. The corresponding DAD specifies the NLS_LANG character set. As a result, the arguments passed as VARCHAR2 should already be encoded in the database character set and be ready to use within the PL/SQL procedures.

Handling HTML Form Input in PL/SQL for Monolingual Applications

For monolingual application deployment, the NLS_LANG character set specified in the DAD is the same as the character set of the form input and the page encoding chosen for the locale. As a result, form input passed as VARCHAR2 arguments should be transparently converted to the database character set and ready for use.

Handling HTML Form Input in PL/SQL for Multilingual Applications

For multilingual application deployment, form input can be encoded in different character sets depending on the page encodings you choose for the corresponding locales. You can no longer rely on the Oracle Character Set Conversion facility because the character set of the form input is not always the same as the NLS_LANG character set, and relying on this conversion corrupts the input. To resolve this problem, disable the conversion by specifying the same NLS_LANG character set in the corresponding DAD as the database character set. Once you disable the conversion, PL/SQL procedures receive form input as VARCHAR2 arguments that are still in the form input encoding. You must convert the arguments from the form input encoding to the database character set before using them. The following code converts the arguments from the WE8MSWIN1252 character set, which corresponds to the ISO-8859-1 page encoding in Table 2-2, to UTF8. Because IN parameters cannot be assigned to, the converted values are stored in local variables:

procedure test(name VARCHAR2, gender VARCHAR2)
is
   conv_name   VARCHAR2(4000);
   conv_gender VARCHAR2(4000);
begin
   -- CONVERT(string, destination character set, source character set)
   conv_name   := CONVERT(name, 'UTF8', 'WE8MSWIN1252');
   conv_gender := CONVERT(gender, 'UTF8', 'WE8MSWIN1252');
...
end;

See Also:

Configuring the NLS_LANG Environment Variable

Handling HTML Form Input in Perl

In the Oracle HTTP Server mod_perl environment, GET requests pass input to a Perl script differently than POST requests. It is good practice to handle both types of requests in the script. The following code gets the input value of the name parameter from an HTML form:

my $r = shift;
my %params = $r->method eq 'POST' ? $r->content : $r->args ;
my $name = $params{'name'} ;

For multilingual Perl scripts, the page encoding of an HTML form may be different from the UTF-8 encoding used in the Perl scripts. In this case, input data should be converted from the page encoding to UTF-8 before being processed. The following example illustrates how the Unicode::MapUTF8 Perl module converts strings from Shift_JIS to UTF-8:

use Unicode::MapUTF8 qw(to_utf8);
# This is to show how the UTF8 Perl pragma is specified,
# and is NOT required by the to_utf8 function.
use utf8; 
...
my $page_encoding = 'Shift_JIS';
my $r = shift;
my %params = $r->method eq 'POST' ? $r->content : $r->args ;
my $name = to_utf8({-string=>$params{'name'}, -charset=>$page_encoding});
...

The to_utf8() function converts any input string from the specified encoding to UTF-8.

Encoding URLs

If HTML pages contain URLs with embedded query strings, you must escape any non-ASCII bytes in the query strings in the %XX format, where XX is the hexadecimal representation of the binary value of the byte. For example, if an Internet application embeds a URL that points to a UTF-8 JSP page containing the German name "Schloß," then the URL should be encoded as follows:

http://host.domain/actionpage.jsp?name=Schlo%c3%9f

Here, c3 and 9f represent the binary value in hexadecimal of the ß character in the UTF-8 encoding.
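The %XX rule can be illustrated with a minimal Java sketch that escapes the UTF-8 bytes of a query-string value by hand. The class and method names (PercentEscape, escapeUtf8) are hypothetical, and the sketch assumes a JDK that provides java.nio.charset.StandardCharsets (Java 7 or later). For simplicity it leaves only ASCII letters and digits unescaped, which is stricter than the rule requires.

```java
import java.nio.charset.StandardCharsets;

public class PercentEscape {
    // Hypothetical helper: writes every byte that is not an ASCII letter
    // or digit as %xx, where xx is the byte value in hexadecimal.
    static String escapeUtf8(String value) {
        StringBuilder out = new StringBuilder();
        for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xff;
            boolean plain = (v >= 'A' && v <= 'Z')
                         || (v >= 'a' && v <= 'z')
                         || (v >= '0' && v <= '9');
            if (plain) {
                out.append((char) v);
            } else {
                out.append('%');
                out.append(Character.forDigit(v >> 4, 16));
                out.append(Character.forDigit(v & 0x0f, 16));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // \u00df is the German sharp s; in UTF-8 it is the two bytes c3 9f
        System.out.println("http://host.domain/actionpage.jsp?name="
                + escapeUtf8("Schlo\u00df"));
    }
}
```

Only the parameter value is passed to the helper, so the separators of the URL itself (such as ? and =) are left untouched.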

To encode a URL, be sure to complete the following tasks:

Most programming environments provide APIs to encode and decode URLs. The following sections describe URL encoding in various environments:

Encoding URLs in Java

If you construct a URL in a JSP or Java Servlet, you must escape all 8-bit bytes using their hexadecimal values prefixed by a percent sign as described in "Encoding URLs". The URLEncoder.encode() function provided in JDK 1.1 and JDK 1.2 works only if you encode a URL in the Java default encoding. To make it work for URLs in any encoding, add code to escape non-ASCII characters in a URL into their hexadecimal representation based on the encoding of your choice.

The following code shows an example of how to encode a URL based on the UTF-8 encoding:

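// Requires: import java.io.UnsupportedEncodingException;
// Characters that are copied through unchanged (a space becomes '+')
String unreserved = "/\\- _.!~*'()"
    + "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
StringBuffer out = new StringBuffer(url.length());
for (int i = 0; i < url.length(); i++)
{
   int c = (int) url.charAt(i);
   if (unreserved.indexOf(c) != -1) {
        if (c == ' ') c = '+';
        out.append((char)c);
        continue;
   }
   byte [] ba;
   try {
        ba = url.substring(i, i+1).getBytes("UTF8");
   } catch (UnsupportedEncodingException e) {
        // Fall back to the default encoding for this character only
        ba = url.substring(i, i+1).getBytes();
   }
   for (int j=0; j < ba.length; j++)
   {
        // Always emit two hexadecimal digits for each byte
        String hex = Integer.toHexString(ba[j] & 0xff).toUpperCase();
        out.append('%');
        if (hex.length() == 1) out.append('0');
        out.append(hex);
   }
}
String encodedUrl = out.toString();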

Encoding URLs in PL/SQL

In Oracle9i, you can call the ESCAPE() function in the UTL_URL package to encode a URL in PL/SQL. You can call the ESCAPE() function as follows:

encodedURL varchar2(100);
url varchar2(100); 
charset varchar2(40); 
...
encodedURL := UTL_URL.ESCAPE(url, FALSE, charset);

The url argument is the URL that you want to encode. The charset argument specifies the character encoding used for the encoded URL. Use a valid Oracle character set name for the charset argument. To encode a URL in the database character set, always specify the charset argument as NULL.

See Also:

Table 2-2 for a list of commonly used Oracle character set names

Encoding URLs in Perl

You can encode a URL in Perl by using the escape_uri() function of the Apache::Util module as follows:

use Apache::Util qw(escape_uri);
...
$escaped_url   = escape_uri( $url );
...

The escape_uri() function takes the bytes from the $url input argument and encodes them into the %XX format. If you want to encode a URL in a different character encoding, you need to convert the URL to the target encoding before calling the escape_uri() function. Perl provides some modules for character conversion.

See Also:

http://www.cpan.org for Perl character conversion modules

Formatting HTML Pages to Accommodate Text in Different Languages

Design the format of HTML pages according to the following guidelines:

It is good practice to provide Cascading Style Sheets (CSS) for different locales or groups of locales and use them to control HTML page rendering. Using a CSS isolates the locale-specific formatting information from HTML pages. Applications should dynamically generate CSS references in HTML pages corresponding to the user's locale so that the pages can be rendered with the corresponding locale-specific formats. Locale-specific information in the CSS file should include:

Accessing the Database Server

There are several methods by which Internet applications can access the database server through Oracle9iAS. Any Java-based Internet application that uses technologies such as Java Servlets, JSPs, and EJBs can use the Oracle JDBC drivers for database connectivity.

Because Java strings are always Unicode-encoded, JDBC transparently converts text data from the database character set to Unicode and vice versa. Java Servlets and JSPs that interact with an Oracle database should make sure that:

For non-Java Internet applications that use programming technologies such as Perl, PL/SQL, and C/C++, text data retrieved from or inserted into a database is encoded in the character set specified by the NLS_LANG parameter. The character set used for the POSIX locale should match the NLS_LANG character set so that data from the database can be directly processed with the POSIX locale-sensitive functions in the applications.

See Also:

"Configuring the NLS_LANG Environment Variable"

For multilingual applications, the NLS_LANG character set and the page encoding should both be UTF-8 to avoid character set conversion and possible data loss.

This section includes the following topics:

Using JDBC to Access the Database

Use the Oracle JDBC drivers provided in Oracle9iAS for Oracle9i database access when you use JSPs and Java Servlets. Oracle9iAS provides two client-side JDBC drivers that you can deploy with middle-tier applications:

Oracle JDBC drivers transparently convert character data from the database character set to Unicode for the SQL CHAR datatypes and the SQL NCHAR datatypes. As a result of this transparent conversion, JSPs and Java Servlets calling Oracle JDBC drivers can bind and define database columns with Java strings and fetch data into Java strings from the result set of a SQL execution.

You can use a Java string to bind the NAME and ADDRESS columns of a customer table. Define the columns as VARCHAR2 and NVARCHAR2 datatypes, respectively. For example:

String cname = request.getParameter("cname");
String caddr = request.getParameter("caddress");
// Cast to the Oracle-specific type so that setFormOfUse() is available
OraclePreparedStatement pstmt = (OraclePreparedStatement)
        conn.prepareStatement("insert into " +
        "CUSTOMERS (NAME, ADDRESS) values (?, ?) ");
pstmt.setString(1, cname);
pstmt.setFormOfUse(2, OraclePreparedStatement.FORM_NCHAR);
pstmt.setString(2, caddr);
pstmt.execute();

To bind a Java string variable to the ADDRESS column defined as NVARCHAR2, you should call the setFormOfUse() method before the setString() method.

See Also:

Oracle9i JDBC Developer's Guide and Reference in the Oracle Database Documentation Library

The Oracle JDBC drivers set the values for the NLS_LANGUAGE and NLS_TERRITORY session parameters to the values corresponding to the default Java locale when the database session was initialized. For monolingual applications, the Java default locale is configured to match the user's locale. Hence the database connection is always synchronized with the user's locale.

Using PL/SQL to Access the Database

PL/SQL procedures and PSPs use SQL to access data in the local Oracle9i database. They can also use SQL and database links to access data from a remote Oracle9i database.

For example, you can call the following PL/SQL procedure from the mod_plsql module. It inserts a record into a customer table with the customer name column defined as VARCHAR2 and the customer address column defined as NVARCHAR2:

procedure addcustomer( cname varchar2 default NULL, caddress varchar2 default 
NULL) is
   caddr nvarchar2(2000);
begin
   ....
   if (cname is not null) then
        -- convert the VARCHAR2 argument to NVARCHAR2 before the insert
        caddr := TO_NCHAR(caddress);
        insert into customers (name, address) values (cname, caddr);
        commit;
   end if;
end;

Note that Apache mod_plsql does not support NVARCHAR2 argument passing. As a result, PL/SQL procedures must declare their arguments as VARCHAR2 and convert them to NVARCHAR2 explicitly before executing the INSERT statement.

The example uses static SQL to access the customer table. You can also use the DBMS_SQL PL/SQL package to access data in the database, using dynamic SQL.

See Also:

Oracle9i Supplied PL/SQL Packages Reference in the Oracle Database Documentation Library

Using Perl to Access the Database

Perl scripts access Oracle9i databases using the DBI/DBD driver for Oracle. The DBI/DBD driver is part of Oracle9iAS. It calls the Oracle Call Interface (OCI) to access the databases. The data retrieved from or inserted into the databases is encoded in the NLS_LANG character set. Perl scripts should:

This allows you to process data retrieved from the databases with POSIX string manipulation functions.

The following code shows how to insert a row into a customer table in an Oracle9i database through the DBI/DBD driver.

use Apache::DBI;
...
# Connect to the database
$constr = 'host=dlsun1304.us.oracle.com;sid=icachedb;port=1521' ; 
$usr = 'system' ; 
$pwd = 'manager' ; 
$dbh = DBI->connect("dbi:Oracle:$constr", $usr, $pwd, {AutoCommit=>1} ) || 
       $r->print("Failed to connect to Oracle: " . DBI->errstr ); 

# prepare the statement 
$sql = 'insert into customers (name, address) values (:n, :a)'; 
$sth = $dbh->prepare( $sql ); 
$sth->bind_param(':n' , $cname);
$sth->bind_param(':a', $caddress);
$sth->execute(); 
$sth->finish(); 
$dbh->disconnect(); 

If the target columns are of the SQL NCHAR datatypes, then you need to specify the form of use flag for each bind variable. For example, if the address column is of the NVARCHAR2 datatype, you need to add the $sth->func() function call before executing the SQL statement:

use DBD::Oracle qw(:ora_forms);
...
$sql = 'insert into customers (name, address) values (:n, :a)';
$sth = $dbh->prepare($sql);
$sth->bind_param(':n', $cname);
$sth->bind_param(':a', $caddress);
$sth->func( { ':a' => ORA_NCHAR }, 'set_form');
$sth->execute();
$sth->finish();
$dbh->disconnect();

To properly process UTF-8 data in a multilingual application, Perl scripts should:

Using C/C++ to Access the Database

C/C++ applications access the Oracle9i databases with OCI or Pro*C/C++. You can call OCI directly or use the Pro*C/C++ interface to retrieve and store Unicode data in a UTF-8 database and in SQL NCHAR datatypes.

Generally, data retrieved from and inserted into the database is encoded in the NLS_LANG character set. C/C++ programs should use the same character set for their POSIX locale as the NLS_LANG character set. Otherwise, the POSIX string functions cannot be used on the character data retrieved from the database, and the character data encoded in the POSIX locale may be corrupted when it is inserted into the database.

For multilingual applications, you may want to use the Unicode API provided in the OCI library instead of relying on the NLS_LANG character set. This alternative is good for applications written for platforms such as Windows NT/2000, which implement the wchar_t datatype using UTF-16 Unicode. Using the Unicode API for those platforms bypasses some unnecessary data conversions that using the regular OCI API requires.

This section includes the following topics:

Using the OCI API to Access the Database

This example shows how to bind and define the VARCHAR2 and NVARCHAR2 columns of a customer table in C/C++. It uses OCI and is based on the NLS_LANG character set. Note that the text datatype is a macro for unsigned char.

text *sqlstmt= (text *)"SELECT name, address FROM customers "
                       "WHERE id = :custid";
text cname[100];                      /* Customer Name */
text caddr[200];                      /* Customer Address */
text custid[10] = "9876";             /* Customer ID */
ub1 cform = SQLCS_NCHAR;              /* Form of Use for NCHAR types */
...
OCIStmtPrepare (stmthp, errhp, sqlstmt,
                (ub4)strlen ((char *)sqlstmt),
                (ub4)OCI_NTV_SYNTAX, (ub4)OCI_DEFAULT);
/* Bind the custid buffer; no indicator variable is used */
OCIBindByName(stmthp, &bnd1p, errhp, (text*)":custid",
              (sb4)strlen((char *)":custid"),
              (dvoid *) custid, (sb4)sizeof(custid), SQLT_STR,
              (dvoid *)0, (ub2 *) 0, (ub2 *) 0,
              (ub4) 0,(ub4 *)0, OCI_DEFAULT);

/* Define the cname buffer for the name column in VARCHAR2 */
OCIDefineByPos (stmthp, &dfn1p, errhp, (ub4)1, (dvoid *)cname,
                (sb4)sizeof(cname), SQLT_STR,
                (dvoid *)0, (ub2 *)0, (ub2 *)0, (ub4)OCI_DEFAULT);

/* Define the caddr buffer for the address column in NVARCHAR2 */
OCIDefineByPos (stmthp, &dfn2p, errhp, (ub4)2, (dvoid *)caddr,
                (sb4)sizeof(caddr), SQLT_STR,
                (dvoid *)0, (ub2 *)0, (ub2 *)0, (ub4)OCI_DEFAULT);
/* Mark the second define as NCHAR form of use */
OCIAttrSet((void *) dfn2p, (ub4) OCI_HTYPE_DEFINE, (void *) &cform, (ub4) 0,
           (ub4)OCI_ATTR_CHARSET_FORM, errhp);
...

Using the Unicode API Provided with OCI to Access the Database

You can use the Unicode API that the OCI library provides for multilingual applications.

Turn on the Unicode API by specifying Unicode mode when you create an OCI environment handle. Any handle inherited from the OCI environment handle is set to Unicode mode automatically. In Unicode mode, all text data arguments to the OCI functions are assumed to be in the Unicode text (utext*) datatype and in UTF-16 encoding. For binding and defining, the data buffers are assumed to be utext buffers in UTF-16 encoding.

The program code for the Unicode API is similar to the code for the non-Unicode OCI API except that:

The following Windows program shows how you can:

Using Unicode Bind and Define in Pro*C/C++ to Access the Database

You can use Unicode bind and define in Pro*C/C++ for multilingual applications.

Pro*C/C++ lets you specify UTF-16 Unicode buffers for bind and define operations. There are two ways to specify UTF-16 buffers in Pro*C/C++:

In the following example, there are two host variables: cname and caddr. The cname host variable is declared as a utext buffer containing 100 UTF-16 code units (unsigned short) for the customer name column in the VARCHAR2 datatype. The caddr host variable is declared as a uvarchar buffer containing 200 UTF-16 code units for the customer address column in the NVARCHAR2 datatype. The len and arr fields of caddr are accessible as fields of a struct.

#include <sqlca.h>
#include <sqlucs2.h>

main()
{
   ...
   /* Change to STRING datatype:    */
   EXEC ORACLE OPTION (CHAR_MAP=STRING) ;
   utext cname[100] ;              /* unsigned short type */
   uvarchar caddr[200] ;           /* Pro*C/C++ uvarchar type */
   ...
   EXEC SQL SELECT name, address INTO :cname, :caddr FROM customers;
   /* cname is NULL-terminated */
   wprintf(L"ENAME = %s, ADDRESS = %.*s\n", cname, caddr.len, caddr.arr);
   ...
}

Organizing the Content of HTML Pages for Translation

You should have the user interface (UI) and content presented in HTML pages translated. Translatable sources for the content of an HTML page belong to the following categories:

This section contains the following topics:

Translation Guidelines for HTML Page Content

When creating translatable content, developers should follow these translation guidelines:

Organizing Static Files for Translation

You should organize translatable HTML, images, and CSS files into different directories from non-translatable static files so that you can zip files under the locale-specific directory for translation. There are many possible ways to define the directory structure to hold these files. For example:

/docroot/images         - Non-translatable images
/docroot/html           - HTML common to all languages
/docroot/css            - Style sheets common to all languages
/docroot/<lang>         - Locale directory such as en, fr, ja etc.
/docroot/<lang>/images  - Images specific for <lang>
/docroot/<lang>/html    - HTMLs specific for <lang>
/docroot/<lang>/css     - Style sheets specific for <lang>

You can replace the <lang> placeholder with the ISO locale names. Based on the above structure, you must write a utility function called getLocalizedURL() to take a URL as a parameter and look for the available language file from this structure. Whenever you reference an HTML, image, or CSS file in an HTML page, the Internet application should call this function to construct the path of the translated file corresponding to the current locale and fall back appropriately if the translation does not exist. For example, if the path /docroot/html/welcome.html is passed to the getLocalizedURL() function and the current locale is fr_CA, then the function looks for the following files in the order shown:

/docroot/fr_CA/html/welcome.html
/docroot/fr/html/welcome.html
/docroot/en/html/welcome.html
/docroot/html/welcome.html

The function returns the first file that exists. This function always reverts to English when the translated version corresponding to the current locale does not exist.
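The lookup order described above can be sketched in Java as follows. This is only an illustration under stated assumptions, not a supplied API: the class name LocalizedUrl, the candidates() helper, and the extra docroot, language, and country parameters are all hypothetical, and the sketch assumes that the path passed in starts with the document root.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class LocalizedUrl {
    // Builds the candidate paths, most specific first, for a path such as
    // /docroot/html/welcome.html. Assumes that path starts with docroot.
    static List<String> candidates(String docroot, String path,
                                   String language, String country) {
        // Part of the path below the document root, e.g. "/html/welcome.html"
        String rest = path.substring(docroot.length());
        List<String> result = new ArrayList<String>();
        if (country != null && country.length() > 0) {
            result.add(docroot + "/" + language + "_" + country + rest);
        }
        result.add(docroot + "/" + language + rest);
        // English is the fallback language
        String english = docroot + "/en" + rest;
        if (!result.contains(english)) {
            result.add(english);
        }
        // Finally, the file that is common to all languages
        result.add(path);
        return result;
    }

    // Returns the first candidate that exists on disk, or the original path.
    static String getLocalizedURL(String docroot, String path,
                                  String language, String country) {
        for (String candidate : candidates(docroot, path, language, country)) {
            if (new File(candidate).exists()) {
                return candidate;
            }
        }
        return path;
    }

    public static void main(String[] args) {
        for (String c : candidates("/docroot", "/docroot/html/welcome.html",
                                   "fr", "CA")) {
            System.out.println(c);
        }
    }
}
```

Keeping the candidate list separate from the file-system check makes the fallback order easy to verify without creating any files.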

For Internet applications that use UTF-8 as the page encoding, the encoding of the static HTML files should also be UTF-8. However, translators usually encode translated HTML files in the native encoding of the target language. To convert the translated HTML into UTF-8, you can use the JDK native2ascii utility shipped with Oracle9iAS.

For example, to convert a Japanese HTML file encoded in Shift_JIS into UTF-8:

  1. Replace the value of the charset parameter in the Content-Type HTML header in the <meta> tag with UTF-8.

  2. Use the native2ascii utility to convert the Japanese HTML file into a new file called japanese.unicode, in which all non-ASCII characters are written as \uXXXX Unicode escape sequences:

        native2ascii -encoding MS932 japanese.html japanese.unicode

  3. Use the native2ascii utility with the -reverse option to convert the escaped file to UTF-8:

        native2ascii -reverse -encoding UTF8 japanese.unicode japanese.html

    See Also:

    • Oracle9i SQLJ Developer's Guide and Reference in the Oracle Database Documentation Library

    • JDK documentation at http://www.javasoft.com

    for more information about the native2ascii utility
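The two native2ascii passes amount to decoding the file from its native encoding and re-encoding the text as UTF-8. A minimal Java sketch of the same idea follows; the class and method names (Transcode, transcode) are hypothetical. The demonstration uses ISO-8859-1 input so that it runs on any JVM. For the Japanese file above you would pass a Shift_JIS encoding name instead, assuming the JVM in use supports it.

```java
import java.nio.charset.Charset;

public class Transcode {
    // Decodes the bytes using fromEncoding, then encodes the resulting
    // Unicode string using toEncoding.
    static byte[] transcode(byte[] input, String fromEncoding, String toEncoding) {
        String text = new String(input, Charset.forName(fromEncoding));
        return text.getBytes(Charset.forName(toEncoding));
    }

    public static void main(String[] args) {
        // "Schloss" written with the sharp s (0xdf) in ISO-8859-1
        byte[] latin1 = { 0x53, 0x63, 0x68, 0x6c, 0x6f, (byte) 0xdf };
        byte[] utf8 = transcode(latin1, "ISO-8859-1", "UTF-8");
        System.out.println(latin1.length + " bytes in, " + utf8.length + " bytes out");
    }
}
```

As with the manual procedure, the charset value in the <meta> tag of the file still has to be changed separately.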

Organizing Translatable Static Strings for Java Servlets and Java Server Pages

You should externalize translatable strings within Java Servlets and JSPs into Java resource bundles so that these resource bundles can be translated independent of the Java code. After translation, the resource bundles carry the same base class names as the English bundles, but with the Java locale name as the suffix. You should place the bundles in the same directory as the English resource bundles for the Java resource bundle look-up mechanism to function properly.

See Also:

JDK documentation at http://www.javasoft.com

for more information about Java resource bundles

Some people may disagree with externalizing JSP strings to resource bundles because it seems to defeat the purpose of using JSPs. There are two reasons for externalizing JSP strings:

Java supports two types of resource bundles: the list resource bundle and the property resource bundle. It is good practice to use list resource bundles instead of property resource bundles. The main reasons are:

The following is an example of a list resource bundle:

import java.util.ListResourceBundle;
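// The class name is case-sensitive and must match the base name
// that is passed to getBundle(), which is "resource" in this chapter.
public class resource extends ListResourceBundle {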
    public Object[][] getContents() {
        return contents;
    }
    static final Object[][] contents =
    {
       {"hello", "Hello World"},  
       ...
    
    };
}

Translators usually translate list resource bundles in the native encoding of the target language. Japanese list resource bundles encoded in Shift_JIS cannot be compiled on an English system because the Java compiler expects source files that are encoded in ISO-8859-1. In order to build translated list resource bundles in a platform-independent manner, you need to run the JDK native2ascii utility to escape all non-ASCII characters to Unicode escape sequences in the \uXXXX format, where XXXX is the Unicode value in hexadecimal. For example:

native2ascii -encoding MS932 resource_ja.java resource_ja.tmp

Java provides a default fallback mechanism for resource bundles when translated resource bundles are not available. An application only needs to make sure that a base resource bundle without any locale suffix always exists in the same directory. The base resource bundle should contain strings in the fallback language. As an example, Java looks for a resource bundle in the following order when the fr_CA Java locale is specified to the getBundle() function:

resource_fr_CA
resource_fr
resource_en_US /* where en_US is the default Java locale */
resource_en
resource (base resource bundle)
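The search order above can be reproduced with a short Java sketch. It is an illustration only: the BundleLookupOrder class and its lookupOrder() method are hypothetical, and the sketch relies on the ResourceBundle.Control class, which exists only in Java 6 and later and is therefore an assumption beyond the JDK releases this guide discusses. The default locale is passed in explicitly so that the result does not depend on the machine where the code runs.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.ResourceBundle;
import java.util.Set;

public class BundleLookupOrder {
    // Lists the bundle names in the order they are searched: the requested
    // locale, then the default locale, then the base bundle.
    static List<String> lookupOrder(String baseName, Locale requested,
                                    Locale defaultLocale) {
        ResourceBundle.Control control =
            ResourceBundle.Control.getControl(ResourceBundle.Control.FORMAT_DEFAULT);
        Set<String> names = new LinkedHashSet<String>();
        for (Locale l : control.getCandidateLocales(baseName, requested)) {
            if (!l.equals(Locale.ROOT)) {
                names.add(control.toBundleName(baseName, l));
            }
        }
        for (Locale l : control.getCandidateLocales(baseName, defaultLocale)) {
            if (!l.equals(Locale.ROOT)) {
                names.add(control.toBundleName(baseName, l));
            }
        }
        // The base resource bundle without a locale suffix comes last
        names.add(baseName);
        return new ArrayList<String>(names);
    }

    public static void main(String[] args) {
        for (String name : lookupOrder("resource", Locale.CANADA_FRENCH, Locale.US)) {
            System.out.println(name);
        }
    }
}
```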

Retrieving Strings in Monolingual Applications

At runtime, monolingual applications can get strings from a resource bundle of the default Java locale as follows:

ResourceBundle rb = ResourceBundle.getBundle("resource");
String helloStr = rb.getString("hello");

Retrieving Strings in Multilingual Applications

Because the user's locale is not fixed in multilingual applications, they should call the getBundle() method by explicitly specifying a Java locale object that corresponds to the user's locale. The Java locale object is called user_locale in the following example:

ResourceBundle rb = ResourceBundle.getBundle("resource", user_locale);
String helloStr = rb.getString("hello");

Organizing Translatable Static Strings in C/C++ and Perl

For C/C++ programs and Perl scripts running on UNIX platforms, externalize static strings in C/C++ or Perl scripts to POSIX message files. For programs running on Windows platforms, externalize static strings to message tables in a database because Windows does not support POSIX message files.

See Also:

"Organizing Translatable Static Strings in Message Tables"

Message files (with the .po file extension) associated with a POSIX locale are identified by their domain names. You need to compile them into binary objects (with the .mo file extension) and place them into the directory corresponding to the POSIX locale. The path name for the POSIX locale is implementation-specific. For example, the UNIX msgfmt utility compiles a Canadian French message file, resource.po, and places it into the /usr/lib/locale/fr_CA/LC_MESSAGES directory on UNIX.

See Also:

Operating system documentation for gettext, msgfmt, and xgettext

The following is an example of a resource.po message file:

domain "resource"
msgid "hello"
msgstr "Hello World"
...

Note that the encoding used for the message files must match the encoding used for the corresponding POSIX locale.

Instead of putting binary message files into an implementation-specific directory, you should put them into an application-specific directory and use the bindtextdomain() function to associate a domain with a directory. The following Perl script fragment uses the Locale::gettext Perl module to get a string from a POSIX message file:

use Locale::gettext;
use POSIX;
...
setlocale( LC_ALL, "fr_CA" );
textdomain( "resource" );
bindtextdomain( "resource", "/usr/local/share" );
print gettext( "hello" );

The domain name for the resource file is resource, the ID of the string to be retrieved is hello, the translation to be used is Canadian French (fr_CA), and the directory for the binary .mo files is /usr/local/share/fr_CA/LC_MESSAGES.

See Also:

http://www.cpan.org to download the Locale::gettext Perl module

Organizing Translatable Static Strings in Message Tables

Message tables mainly store static translatable strings used by PL/SQL procedures and PSPs. You can also use them for some C/C++ programs and Perl scripts. The tables should have a language column to identify the language of static strings so that accessing applications can retrieve messages based on the user's locale. The table structure should be similar to the one below:

CREATE TABLE messages 
( msgid   VARCHAR2(30)   -- string identifier such as 'hello'
, langid  VARCHAR2(10)
, message VARCHAR2(4000)
);

The primary key for this table consists of the msgid and langid columns. One good choice for the values in the langid column is the Oracle language abbreviation of the corresponding locale. Using the Oracle language abbreviation allows applications to retrieve translated information transparently by issuing a query on the message table.

See Also:

Oracle9i Globalization Support Guide in the Oracle Database Documentation Library for a list of Oracle language abbreviations

To provide a fallback mechanism when the translation of a message is not available, create the following views on top of the message table defined in the previous example:

-- fallback language is English which is abbreviated as 'US'.
CREATE VIEW default_message_view AS
 SELECT msgid, message
 FROM messages
 WHERE langid = 'US';

-- create view for services, with fall-back mechanism
CREATE VIEW messages_view AS
SELECT d.msgid,
       CASE WHEN t.message IS NOT NULL
            THEN t.message
            ELSE d.message
       END AS message
FROM default_message_view d,
     messages             t
WHERE t.msgid (+) = d.msgid AND
      t.langid (+) = sys_context('USERENV', 'LANG');

Messages should be retrieved from the messages_view view, which supplies a fallback message in English by joining the default_message_view view with the messages table. The sys_context() SQL function returns the Oracle language abbreviation of the locale for the current database session. This locale should be initialized to the user's locale at the time when the session is created.

To retrieve a message, an application should use the following query:

SELECT message FROM messages_view WHERE msgid = 'hello';

The NLS_LANGUAGE parameter of a database session defines the language of the message that the query retrieves. Note that there is no language information needed for the query with this message table schema.
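The fallback rule that the view implements (use the row for the session language if one exists, otherwise use the 'US' row) can be modeled in memory with a short Java sketch. This is only an illustration of the logic, not code that accesses the database; the MessageFallback class and its methods are hypothetical, and the sample rows use the Oracle language abbreviations US (American English), F (French), and D (German).

```java
import java.util.HashMap;
import java.util.Map;

public class MessageFallback {
    // Models the rows of the messages table, keyed by langid and msgid
    private final Map<String, String> rows = new HashMap<String, String>();

    void put(String langid, String msgid, String message) {
        rows.put(langid + "/" + msgid, message);
    }

    // Mirrors the CASE expression in messages_view: the translated message
    // if one exists for the session language, otherwise the 'US' message.
    String get(String sessionLang, String msgid) {
        String translated = rows.get(sessionLang + "/" + msgid);
        return translated != null ? translated : rows.get("US/" + msgid);
    }

    public static void main(String[] args) {
        MessageFallback messages = new MessageFallback();
        messages.put("US", "hello", "Hello World");
        messages.put("F", "hello", "Bonjour le monde");
        System.out.println(messages.get("F", "hello"));  // French row exists
        System.out.println(messages.get("D", "hello"));  // no German row
    }
}
```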

In order to minimize the load to the database, you should set up all message tables and their associated views on an Oracle9iAS instance as a front end to the database where PL/SQL procedures and PSPs run.

Organizing Translatable Dynamic Content in Application Schema

An application schema stores translatable dynamic information that the application uses, such as product names and product descriptions. The following shows an example of a table that stores all the products of an Internet store. The translatable information for the table is the product description and the product name.

CREATE TABLE product_information
    ( product_id          NUMBER(6)
    , product_name        VARCHAR2(50)
    , product_description VARCHAR2(2000)
    , category_id         NUMBER(2)
    , warranty_period     INTERVAL YEAR TO MONTH
    , supplier_id         NUMBER(6)
    , product_status      VARCHAR2(20)
    , list_price          NUMBER(8,2)
    );

To store product names and product descriptions in different languages, create the following table so that the primary key consists of the product_id and language_id columns:

CREATE TABLE product_descriptions
    ( product_id             NUMBER(6)
    , language_id            VARCHAR2(3)
    , translated_name        NVARCHAR2(50)
    , translated_description NVARCHAR2(2000)
    );

Create a view on top of the tables to provide fallback when information is not available in the language that the user requests. For example:

CREATE VIEW product AS
SELECT i.product_id
,      d.language_id
,      CASE WHEN d.language_id IS NOT NULL
            THEN d.translated_name
            ELSE i.product_name
       END    AS product_name
,      i.category_id
,      CASE WHEN d.language_id IS NOT NULL
            THEN d.translated_description
            ELSE i.product_description
       END    AS product_description
,      i.warranty_period
,      i.supplier_id
,      i.product_status
,      i.list_price
FROM   product_information  i
,      product_descriptions d
WHERE  d.product_id  (+) = i.product_id
AND    d.language_id (+) = sys_context('USERENV','LANG');

This view performs an outer join on the product_information and product_descriptions tables and selects the rows with the language_id equal to the Oracle language abbreviation of the current database session.

To retrieve a product name and product description from the product view, an application should use the following query:

SELECT product_name, product_description FROM product 
      WHERE product_id = 1234;

This query retrieves the translated product name and product description corresponding to the value of the NLS_LANGUAGE session parameter. Note that you do not need to specify any language information in the query.

Locale Awareness in Oracle9iAS Forms Services

The Oracle9iAS Forms Services architecture includes:

The Java Client is dynamically downloaded from Oracle9iAS when a user runs a Forms Services session. The Java Client provides the user interface for the Forms Services Runtime Engine. It also handles user interaction and visual feedback for actions such as navigating between items or checking a checkbox.

Oracle9iAS Forms Services consists of the Forms Services Runtime Engine and the Forms Listener Servlet. The Forms Services Runtime Engine is the process that maintains a connection to the database on behalf of the Java Client. The Forms Listener Servlet acts as a broker, taking connection requests from the Java Client processes and initiating a Forms Services runtime process on their behalf.

The NLS_LANG parameter for Forms Services initializes the locale of Oracle9iAS Forms Services. The NLS_LANGUAGE parameter derives its value from NLS_LANG and determines the language of Forms messages. The NLS_TERRITORY parameter also derives its value from NLS_LANG and determines conventions such as date and currency formats.

By default, the NLS_LANG parameter for Oracle9iAS Forms Services initializes the Java Client locale. The locale of the Java Client determines such things as button labels on default messages and parts of strings in menus.

See Also:

Oracle9iAS Forms Services Deployment Guide

This section includes the following topics:

Locale Awareness in a Monolingual Oracle9iAS Forms Services Application

A user's locale is fixed in a monolingual Oracle9iAS Forms Services application and is usually the same as the default Forms Services locale. When you develop a monolingual Forms Services application, you must develop it to conform to the intended user's locale. The database character set should be a superset of the Forms Services character set.

For example, a monolingual Forms Services application for a Japanese locale should include Japanese text, Japanese button labels, and Japanese menus. The application should also connect to a database whose character set is JA16SJIS, JA16EUC, or UTF-8.

Alternatively, you can configure Forms Services to read the preferred language settings of the browser. For example, if you have a human resources application translated into 24 languages, then add an application entry in the formsweb.cfg file like the following:

[HR]
default.env
[HR.DE]
DE.env
[HR.FR]
FR.env
.
.
.

When the Forms Servlet detects a language preference in the browser, it checks the formsweb.cfg file to see if there is a translated version of the application.

For example, suppose the request is http://myserver.mydomain/servlet/f90servlet?config=HR and the preferred language is German (DE). The Forms Servlet tries to read from the application definitions in the following order:

HR.DE
HR.IT
HR.FR
.
.
.
HR

If the Forms Servlet cannot find any of those configurations, then it uses the HR configuration (default.env).

This means that you can configure Forms to support multiple languages with one URL. Each application definition can have its own environment file that contains the NLS language parameter definition. You can also specify separate working directory information and path information for each application.
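The lookup described above can be pictured with a small Java sketch. It is not the Forms Servlet's actual implementation; the FormsConfigLookup class, the resolve() method, and the way the browser languages and section names are passed in are all assumptions made for illustration. The sketch tries each preferred language in order and falls back to the base configuration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class FormsConfigLookup {
    // Returns the first section named <config>.<LANGUAGE> that is defined,
    // trying the preferred languages in order, or the base section.
    static String resolve(String config, List<String> preferredLanguages,
                          Set<String> definedSections) {
        for (String language : preferredLanguages) {
            String candidate = config + "." + language.toUpperCase(Locale.ROOT);
            if (definedSections.contains(candidate)) {
                return candidate;
            }
        }
        return config;
    }

    public static void main(String[] args) {
        Set<String> sections =
            new HashSet<String>(Arrays.asList("HR", "HR.DE", "HR.FR"));
        // Browser prefers German, then Italian, then French
        System.out.println(resolve("HR", Arrays.asList("de", "it", "fr"), sections));
        // No translated section for Japanese, so the base section is used
        System.out.println(resolve("HR", Arrays.asList("ja"), sections));
    }
}
```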

Locale Awareness in a Multilingual Oracle9iAS Forms Services Application

In a multilingual environment, the application can dynamically determine the locale of Oracle9iAS Forms Services in two ways:

When you develop a Forms Services application you must choose one of these methods.

You can dynamically change the Forms Services locale that the NLS_LANG parameter initializes by using an ALTER SESSION statement. To issue an ALTER SESSION statement in a Forms Services application, you can use the FORMS_DDL built-in from the WHEN-NEW-FORM-INSTANCE trigger. For example, the following statement dynamically changes the NLS_CALENDAR setting:

FORMS_DDL('ALTER SESSION SET NLS_CALENDAR=''JAPANESE IMPERIAL''');

However, changing the Forms Services locale with an ALTER SESSION statement does not change the text, labels, and menus of a Forms Services application. It also does not confirm that the runtime character set is a superset of the Forms Services character set.

You can configure multilingual Forms Services applications by using multiple environment configuration files (EnvFile). For example, you can create a form called form.fmx and translate it into Japanese and into Arabic using Oracle9i Translator. Then save them as d:\form\ja\form.fmx (Japanese) and d:\form\ar\form.fmx (Arabic). Finally, create two environment configuration files, ja.env and ar.env, and specify the following in the appropriate environment file:

Form                   Environment File   NLS_LANG                    FORMS90_PATH
d:\form\ja\form.fmx    ja.env             JAPANESE_JAPAN.JA16SJIS     d:\form\ja
d:\form\ar\form.fmx    ar.env             ARABIC_EGYPT.AR8MSWIN1256   d:\form\ar
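As a rough sketch, the contents of the two environment files could be generated as follows. The helper name render_env is hypothetical, and the sketch assumes that an environment file consists of NAME=value lines. The character set names use the standard Oracle spellings JA16SJIS and AR8MSWIN1256.

```python
def render_env(nls_lang, forms_path):
    """Build the text of a Forms Services environment file (assumed NAME=value format)."""
    return f"NLS_LANG={nls_lang}\nFORMS90_PATH={forms_path}\n"


# One entry per translated form: file name -> (NLS_LANG, FORMS90_PATH)
env_files = {
    "ja.env": ("JAPANESE_JAPAN.JA16SJIS", r"d:\form\ja"),
    "ar.env": ("ARABIC_EGYPT.AR8MSWIN1256", r"d:\form\ar"),
}

for name, (nls_lang, forms_path) in env_files.items():
    print(f"--- {name} ---")
    print(render_env(nls_lang, forms_path), end="")
```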

See Also:

"Configuring Oracle9iAS Forms Services for Multilingual Support"

The locale of the Java Client depends on the nlsLang parameter that you set in the HTML file that invokes the Forms Services Application.

The Forms Applet defaults to the client's operating system locale. If the locale of a user's client is set to en-US but the application server is set so that the NLS_LANG parameter is FRENCH, then the user interface includes French button labels and window titles because the language of the resource bundles depends on the server setting. Use the NLSLANG Forms Applet parameter to change the user interface to English without interfering with the server settings. If the NLSLANG Forms Applet parameter is set to TRUE in the file that invokes the Forms Services application, then the locale of Forms Applet is initialized as the Forms Services locale. If NLSLANG is set to FALSE, then you can set the NLSLANG parameter separately for each application that is configured in the formsweb.cfg file.

The Java client initially chooses font properties from font.properties.xx when the JVM is initialized, where xx is the locale of the Java client. If the Forms Services locale is different from the Java client locale and nlsLang is set to true in the HTML file, then Forms Services overwrites the Java client locale with the NLS_LANG setting of Forms Services.

Locale Awareness in Oracle9iAS Reports Services

The Oracle9iAS Reports Services architecture includes:

Oracle9iAS Reports Services can run multiple reports simultaneously upon users' requests. Reports Services enters requests for reports into a job queue and dispatches them to a dynamic, configurable number of pre-spawned runtime engines. The runtime engine connects to the database, retrieves data, and formats output for the client.

The NLS_LANG setting for Reports Server initializes the locale of Reports Services. The NLS_LANGUAGE parameter derives its value from the NLS_LANG parameter and determines the language of the Reports Server messages. The NLS_TERRITORY parameter derives its value from the NLS_LANG parameter and determines the date and currency formats. For example, if NLS_LANG is set to JAPANESE_JAPAN.JA16SJIS, then Reports Server messages are in Japanese and reports use the Japanese date format and currency symbol.
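The way NLS_LANGUAGE and NLS_TERRITORY derive from NLS_LANG follows from the LANGUAGE_TERRITORY.CHARACTERSET layout of the parameter. The following minimal sketch splits a value into its three parts; the function name parse_nls_lang is hypothetical and the sketch does not validate the names against Oracle's supported values.

```python
def parse_nls_lang(value):
    """Split an NLS_LANG value of the form LANGUAGE_TERRITORY.CHARACTERSET."""
    language_territory, _, charset = value.partition(".")
    # Language names may contain spaces but not underscores,
    # so the first underscore separates language from territory.
    language, _, territory = language_territory.partition("_")
    return {"language": language, "territory": territory, "charset": charset}


parts = parse_nls_lang("JAPANESE_JAPAN.JA16SJIS")
print(parts["language"])   # JAPANESE  (language of Reports Server messages)
print(parts["territory"])  # JAPAN     (date and currency formats)
print(parts["charset"])    # JA16SJIS  (character set of the report output)
```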

Report output is generated in the Reports Services character set. The client needs to be aware of the character set in which Reports Services generated the HTML or XML.
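To illustrate why the client must know the character set, the following sketch decodes the same bytes with the correct and with a wrong encoding. The mapping from Oracle character set names to Python codec names is an assumption made for this example, and decode_report is a hypothetical helper.

```python
# Assumed mapping from Oracle character set names to Python codec names.
ORACLE_TO_PYTHON_CODEC = {"JA16SJIS": "shift_jis", "JA16EUC": "euc_jp"}


def decode_report(body, oracle_charset):
    """Decode report output bytes using the character set that generated them."""
    return body.decode(ORACLE_TO_PYTHON_CODEC[oracle_charset])


# Simulate report output generated in the JA16SJIS character set.
text = "日本語"
report_bytes = text.encode("shift_jis")

# Decoding with the character set that generated the output recovers the text.
print(decode_report(report_bytes, "JA16SJIS") == text)  # True

# Decoding with a different encoding fails, or in other cases yields garbled text.
try:
    report_bytes.decode("utf-8")
except UnicodeDecodeError:
    print("the bytes are not valid UTF-8")
```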

See Also:

"Specifying the Page Encoding in Oracle9iAS Reports Services Applications"

This section contains the following topics:

Locale Awareness in a Monolingual Oracle9iAS Reports Services Application

A user's locale is fixed in a monolingual Oracle9iAS Reports Services application and is usually the same as the locale of the Reports Server. The database character set should be a superset of the Reports Server character set.

Locale Awareness in a Multilingual Oracle9iAS Reports Services Application

In a multilingual report, the application can dynamically determine the locale of the Reports Server in two ways:

When you develop a report you must choose one of these methods.

You can dynamically change the Reports Server locale that the NLS_LANG parameter initializes by using an ALTER SESSION statement. To execute an ALTER SESSION statement from a report, you can use the srw.do_sql built-in procedure from the BeforeReport trigger. For example, you can change the setting for the NLS_CALENDAR parameter as follows:

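function BeforeReport return boolean is
     -- Holds the ALTER SESSION statement to execute
     str varchar2(100);
begin
     str := 'alter session set nls_calendar=''JAPANESE IMPERIAL''';
     srw.do_sql(str);
     return (TRUE);
end;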

See Also:

Locale Awareness in Oracle9iAS Discoverer

Oracle9iAS Discoverer can simultaneously support users with different locales. Discoverer always uses UTF-8 encoding for communication between the client and middle-tier services. Users may explicitly control the locale used for the user interface, or they may allow Oracle9iAS Discoverer to automatically determine a default. The order of precedence is:

  1. Language and locale settings included in the URL for Oracle9iAS Discoverer

  2. Language and locale settings specified in the Discoverer Connection (this is part of the Oracle9iAS Discoverer integration with Oracle9iAS Single Sign-On).

  3. Language and locale setting specified in the user's browser

  4. Language and locale of Oracle9i Application Server

For example, suppose a user goes to a URL for Oracle9iAS Discoverer that does not specify the language or locale. Oracle9i Application Server is installed with a default language and locale of Traditional Chinese - Hong Kong, so the HTML page returned to the user is written in Traditional Chinese. That page prompts the user to select a Discoverer Connection to use, and the connection has the language and locale specified as English - U.S. Because the Discoverer Connection settings take precedence over the Oracle9iAS settings, the Oracle9iAS Discoverer user interface appears in the English language.
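The order of precedence can be sketched as a function that returns the first setting that is present. The function and parameter names are hypothetical, and the locale strings are placeholders rather than values that Oracle9iAS Discoverer uses.

```python
def effective_locale(url=None, connection=None, browser=None, server=None):
    """Return (source, locale) for the highest-precedence setting that is present."""
    candidates = (
        ("URL", url),
        ("Discoverer Connection", connection),
        ("browser", browser),
        ("application server", server),
    )
    for source, locale in candidates:
        if locale:
            return source, locale
    raise ValueError("no language or locale setting is available")


# Before a connection is chosen, only the server default applies.
print(effective_locale(server="zh-HK"))
# After the user selects a connection that specifies English - U.S.
print(effective_locale(connection="en-US", server="zh-HK"))
```

The two calls mirror the example above: the first page uses the application server default, and the connection setting takes over once it is known.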

Locale Awareness in Oracle9iAS Clickstream Intelligence Applications

Oracle9iAS Clickstream Intelligence installs language-dependent data into its schema at installation. ClickStream does not support multiple languages on a single instance. The user determines the language setting at installation.

The following restrictions apply to Oracle9iAS:

Table 2-3 describes the locale awareness of ClickStream components.

Table 2-3 Locale Awareness of ClickStream Components

Component: ClickStream Configurator
Locale Awareness: Uses UTF-8 for encoding JSPs.

Component: HTML Viewer by Discoverer
Locale Awareness: The language preference of the initial window is based on the preferred language setting of the browser. The user is then required to choose a language in the login window. After the user chooses a language, it determines the preferred locale.

Component: Oracle Warehouse Builder Bridge
Locale Awareness: Oracle Warehouse Builder metadata for ClickStream is provided in English. Advanced users who want to extend the dimensions that are provided should note the restriction. When users import a newly created End User Layer (EUL) into Discoverer, they should not overwrite the existing data because the English data may be overwritten.


Copyright © 2002 Oracle Corporation.

All Rights Reserved.