<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
</head>
<body bgcolor="#ffffff" text="#000000">
<font face="Consolas">Hi folks!<br>
<br>
I am playing around with a C library and ran into <br>
some issues related to Unicode in C. In the Unicode-enabled <br>
version of this library, which implements UCS-2 (i.e. only the BMP),<br>
the "Unicode character type" is defined in the header file <br>
as follows:<br>
======================================<br>
<font color="#000066"><small>/*<br>
* ZAB_CHAR<br>
* ZAB_UC<br>
*/<br>
<br>
#ifdef ZABonAIX<br>
#if defined(_AIX51) && defined(ZABwith64_BIT)<br>
#define ZABonAIX_wchar_is_4B<br>
#else<br>
#define ZABonAIX_wchar_is_2B<br>
#endif<br>
#endif<br>
<br>
#ifdef ZABonAIX<br>
#if defined(_AIX51) && defined(ZABwith64_BIT)<br>
#define ZABonAIX_wchar_is_4B<br>
#elif defined(ZABccQ)<br>
#define ZABonAIX_wchar_is_4B<br>
#else<br>
#define ZABonAIX_wchar_is_2B<br>
#endif<br>
#endif<br>
<br>
#if defined(ZABonNT) || \<br>
defined(ZABonOS400) || \<br>
(defined(ZABonOS390) && !defined(_LP64)) || \<br>
defined(ZABonAIX) && defined(ZABonAIX_wchar_is_2B)<br>
#define WCHAR_is_2B<br>
#else<br>
#define WCHAR_is_4B<br>
#endif<br>
<br>
#if defined(ZABonLIN) && defined(GCC_UTF16_PATCH)<br>
#if __GNUC_PREREQ (4,3)<br>
#include &lt;uchar.h&gt;<br>
#define ZAB_UC_is_char16<br>
#endif<br>
#endif<br>
<br>
#ifndef ZABwithUNICODE<br>
#define ZAB_UC_is_1B<br>
typedef char ZAB_CHAR;<br>
typedef char ZAB_UC;<br>
#else /* ZABwithUNICODE */<br>
#if defined(WCHAR_is_2B)<br>
#define ZAB_UC_is_wchar<br>
typedef wchar_t ZAB_CHAR;<br>
typedef wchar_t ZAB_UC;<br>
#elif defined(ZAB_UC_is_char16)<br>
typedef char16_t ZAB_CHAR;<br>
typedef char16_t ZAB_UC;<br>
#else<br>
#define ZAB_UC_is_UTF16_without_wchar<br>
typedef unsigned short ZAB_CHAR;<br>
typedef unsigned short ZAB_UC;<br>
#endif<br>
#endif /* ZABwithUNICODE or not */<br>
<br>
/*<br>
* CFRSDKwith(out)UTF16_LITERALS<br>
* for CFR SDK applications: controls use of UTF-16<br>
* literal enabled compilers.<br>
*/<br>
#if defined(CFRSDKwithUTF16_LITERALS)<br>
#elif defined(CFRSDKwithoutUTF16_LITERALS)<br>
#define ZABwithoutUTF16_LITERALS<br>
#elif defined(WCHAR_is_2B) || \<br>
defined(ZABonHP_UX) || \<br>
(defined(ZABonLIN) && defined(__i386__) &&
defined(__GNUC__) && (__GNUC__<3)) || \<br>
(defined(ZABonLIN) && defined(GCC_UTF16_PATCH)) ||
\<br>
defined(ZABonSUN) || defined(ZABonAIX)<br>
/* we have literals for UTF-16 */<br>
#else<br>
#define ZABwithoutUTF16_LITERALS<br>
#endif</small></font><br>
</font><font face="Consolas">======================================</font><br>
<style type="text/css">p, li { white-space: pre-wrap; }</style><font
face="Consolas"><br>
All this boils down to<br>
<br>
</font>
<p style="margin: 0px; text-indent: 0px;"><small><font face="Consolas">                          +---------------------------+</font></small></p>
<p style="margin: 0px; text-indent: 0px;"><small><font face="Consolas">                 +------>-| typedef wchar_t ZAB_CHAR; |</font></small></p>
<p style="margin: 0px; text-indent: 0px;"><small><font face="Consolas">                 |        | typedef wchar_t ZAB_UC;   |</font></small></p>
<p style="margin: 0px; text-indent: 0px;"><small><font face="Consolas">                 |        +---------------------------+</font></small></p>
<p style="margin: 0px; text-indent: 0px;"><small><font face="Consolas">                 |</font></small></p>
<p style="margin: 0px; text-indent: 0px;"><small><font face="Consolas">+---------+      |        +----------------------------+</font></small></p>
<p style="margin: 0px; text-indent: 0px;"><small><font face="Consolas">| Unicode |------+------>-| typedef char16_t ZAB_CHAR; |</font></small></p>
<p style="margin: 0px; text-indent: 0px;"><small><font face="Consolas">+---------+      |        | typedef char16_t ZAB_UC;   |</font></small></p>
<p style="margin: 0px; text-indent: 0px;"><small><font face="Consolas">                 |        +----------------------------+</font></small></p>
<p style="margin: 0px; text-indent: 0px;"><small><font face="Consolas">                 |</font></small></p>
<p style="margin: 0px; text-indent: 0px;"><small><font face="Consolas">                 |        +----------------------------------+</font></small></p>
<p style="margin: 0px; text-indent: 0px;"><small><font face="Consolas">                 +------>-| typedef unsigned short ZAB_CHAR; |</font></small></p>
<p style="margin: 0px; text-indent: 0px;"><small><font face="Consolas">                          | typedef unsigned short ZAB_UC;   |</font></small></p>
<p style="margin: 0px; text-indent: 0px;"><small><font face="Consolas">                          +----------------------------------+</font></small></p>
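<font face="Consolas"><br>
As far as I can tell from the header, all three Unicode branches should give a 16-bit type. <br>
Here is a minimal C11 sketch for checking that on a given platform. It is not part of the library: <br>
demo_uc is a hypothetical stand-in for ZAB_UC (I picked the unsigned short branch), and the helper <br>
names wchar_bits and demo_uc_bits are made up for illustration.<br>
</font>
<pre>
```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>   /* wchar_t */

/* Hypothetical stand-in for ZAB_UC (the "unsigned short" branch of the header). */
typedef unsigned short demo_uc;

/* Compile-time check: the type holds exactly the values 0..0xFFFF. */
_Static_assert(USHRT_MAX == 0xFFFF, "demo_uc is not a 16-bit unsigned type");

/* Width of wchar_t in bits: 16 corresponds to WCHAR_is_2B, 32 to WCHAR_is_4B. */
int wchar_bits(void)
{
    return (int)(sizeof(wchar_t) * CHAR_BIT);
}

/* Width of the stand-in type in bits. A Lisp-side
 * (defctype zab-uc :uint16) would rely on this being 16. */
int demo_uc_bits(void)
{
    int bits = (int)(sizeof(demo_uc) * CHAR_BIT);
    assert(bits == 16);   /* runtime echo of the static check above */
    return bits;
}
```
</pre>
<font face="Consolas">If wchar_bits() reports 32, the header would not pick the wchar_t branch anyway, <br>
so the 16-bit assumption would still seem to hold for whichever branch is selected.<br>
</font>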
<small><font face="Consolas"><br>
</font><big><font face="Consolas">The question now is: is it
correct (and safe) <br>
to <i><b>defctype</b></i> ZAB_UC simply as :uint16 and
interpret <br>
each value as a UTF-16 code unit (with the appropriate endianness)?<br>
<br>
How can types like wchar_t and char16_t be defined (and
used) in CFFI?<br>
<br>
Regards<br>
Nik<br>
</font></big></small><font face="Consolas"><br>
</font>
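<font face="Consolas">P.S. To make the :uint16 question above more concrete, here is a minimal C sketch of what <br>
"interpreting" the 16-bit values would involve. It assumes the data really is UTF-16/UCS-2; the <br>
function names utf16_decode and read_u16 are made up for illustration and do not come from the library. <br>
For BMP-only (UCS-2) data each unit is already the code point, so only the byte order matters.<br>
</font>
<pre>
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one code point from a buffer of 16-bit units in native byte order.
 * Returns the number of units consumed (1 or 2), or 0 on empty input
 * or an unpaired surrogate. */
size_t utf16_decode(const uint16_t *s, size_t n, uint32_t *out)
{
    assert(s != NULL && out != NULL);
    if (n == 0)
        return 0;

    uint16_t u = s[0];
    if (u < 0xD800 || u > 0xDFFF) {
        *out = u;                 /* BMP: the unit is the code point */
        return 1;
    }
    /* High surrogate followed by a low surrogate: combine the pair. */
    if (u <= 0xDBFF && n >= 2 && s[1] >= 0xDC00 && s[1] <= 0xDFFF) {
        *out = 0x10000u + (((uint32_t)(u - 0xD800) << 10)
                           | (uint32_t)(s[1] - 0xDC00));
        return 2;
    }
    return 0;                     /* unpaired surrogate */
}

/* Assemble a 16-bit unit from two raw bytes with an explicit byte order. */
uint16_t read_u16(const unsigned char *p, int big_endian)
{
    return big_endian ? (uint16_t)((p[0] << 8) | p[1])
                      : (uint16_t)((p[1] << 8) | p[0]);
}
```
</pre>
<font face="Consolas">If the library hands back buffers in the host's native byte order, read_u16 would not be needed <br>
and the :uint16 values could be used directly; that is an assumption I have not verified.<br>
</font>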
</body>
</html>