UNICODE input for CGI using C


Dear All,
            I'm trying to accept a multilingual (Unicode) string in a
form and parse it. What I am getting is %XX (which is a single byte,
not 2 bytes). So, is the data getting lost? If it is not getting
lost, what format is it in?

Thanks in advance,
Punit.

In article <1180444998.814728.246@a26g2000pre.googlegroups.com>,

 <puneet.p.s@gmail.com> wrote:
>            I'm trying to accept a multi-lingual string (UNICODE) in a
>form and am trying to parse it. What i am getting is %XX (which is a
>single byte, not 2 bytes). So, is the data getting lost? What format
>is it, if it is not getting lost.

You should be getting 2 or more successive %XXs.  HTML form data sent
using GET is part of the URL.  Non-ASCII characters are represented in
UTF-8, then each byte of the UTF-8 sequence is encoded in hex as %XX.

See

   http://www.ietf.org/rfc/rfc3986.txt
   http://www.ietf.org/rfc/rfc2279.txt  (obsoleted by RFC 3629)
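
To make the two layers concrete, here is a minimal sketch in C, assuming
the form data really is UTF-8 that has been percent-encoded.  The names
form_decode and utf8_decode are hypothetical helpers written for this
illustration, not part of any library.  The first turns each %XX back
into one byte (and '+' into a space, as form encoding does); the second
combines the resulting bytes into one code point.  It does not reject
overlong forms or surrogates, so treat it as a starting point only.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_value(int c)
{
    static const char digits[] = "0123456789abcdef";
    const char *p;

    if (c == '\0')
        return -1;              /* strchr would match the terminator */
    p = strchr(digits, tolower(c));
    return p ? (int)(p - digits) : -1;
}

/*
 * Decode form-urlencoded text from src into dst.
 * dst needs room for strlen(src) + 1 bytes; decoding never grows the data.
 * A '%' not followed by two hex digits is copied through unchanged.
 * Returns the number of decoded bytes, not counting the final '\0'.
 */
size_t form_decode(const char *src, unsigned char *dst)
{
    size_t n = 0;
    int hi, lo;

    assert(src != NULL && dst != NULL);
    while (*src != '\0') {
        if (src[0] == '%'
            && (hi = hex_value((unsigned char)src[1])) >= 0
            && (lo = hex_value((unsigned char)src[2])) >= 0) {
            dst[n++] = (unsigned char)(hi * 16 + lo);
            src += 3;
        } else if (*src == '+') {
            dst[n++] = ' ';
            src++;
        } else {
            dst[n++] = (unsigned char)*src++;
        }
    }
    dst[n] = '\0';
    return n;
}

/*
 * Decode one UTF-8 sequence from s, where n bytes are available.
 * Stores the code point in *cp and returns the sequence length (1 to 4),
 * or 0 if the lead byte or a continuation byte is malformed.
 */
size_t utf8_decode(const unsigned char *s, size_t n, unsigned long *cp)
{
    size_t len, i;
    unsigned long c;

    if (n == 0)
        return 0;
    if (s[0] < 0x80)                { len = 1; c = s[0]; }
    else if ((s[0] & 0xE0) == 0xC0) { len = 2; c = s[0] & 0x1F; }
    else if ((s[0] & 0xF0) == 0xE0) { len = 3; c = s[0] & 0x0F; }
    else if ((s[0] & 0xF8) == 0xF0) { len = 4; c = s[0] & 0x07; }
    else
        return 0;
    if (len > n)
        return 0;
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;
        c = (c << 6) | (s[i] & 0x3F);
    }
    *cp = c;
    return len;
}
```

With these, the query fragment %E0%A4%A8 decodes to the three bytes
E0 A4 A8, which utf8_decode combines into the single code point U+0928.
So one non-ASCII character shows up as several %XX groups, and nothing
is lost.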

For POST data, I can't find up-to-date documentation.  The very old
http://www.w3.org/TR/html4/interact/forms.html describes the
application/x-www-form-urlencoded mime type, but it does not mention
non-ASCII characters.  I think you'll find that it uses the same
method as GET, but it's possible that it might use the encoding
specified by the HTTP charset declaration rather than UTF-8.  You'll
need to ask about that somewhere other than comp.lang.c.
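
If the charset declaration does matter for POST, a CGI program could
look for it before decoding.  The sketch below is only an illustration
under assumptions: content_type_charset is a hypothetical helper, it
handles just the common "; charset=xxx" and quoted forms, and many
browsers send no charset parameter at all, in which case it reports
that none was found.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive comparison of the first n characters of a and b. */
static int same_prefix(const char *a, const char *b, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i]))
            return 0;
        if (a[i] == '\0')
            return 1;
    }
    return 1;
}

/*
 * Copy the charset parameter of a Content-Type value into out,
 * truncating it to fit outsz bytes including the final '\0'.
 * Returns 1 if a charset was found, 0 otherwise (out is then "").
 */
int content_type_charset(const char *ct, char *out, size_t outsz)
{
    const char *p;
    size_t n = 0;

    assert(out != NULL && outsz > 0);
    out[0] = '\0';
    if (ct == NULL)
        return 0;
    /* Walk the ';'-separated parameters after the media type. */
    for (p = strchr(ct, ';'); p != NULL; p = strchr(p, ';')) {
        p++;
        while (*p == ' ' || *p == '\t')
            p++;
        if (!same_prefix(p, "charset=", 8))
            continue;
        p += 8;
        if (*p == '"')
            p++;
        while (*p != '\0' && *p != '"' && *p != ';' && *p != ' '
               && n + 1 < outsz)
            out[n++] = *p++;
        out[n] = '\0';
        return n > 0;
    }
    return 0;
}
```

Under CGI the value to pass in would come from getenv("CONTENT_TYPE"),
and the POST body itself is read from stdin, CONTENT_LENGTH bytes of it.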

-- Richard
--
"Consideration shall be given to the need for as many as 32 characters
in some alphabets" - X3.4, 1963.

puneet.p.s@gmail.com wrote:

> I'm trying to accept a multi-lingual string (UNICODE) in a form
> and am trying to parse it. What i am getting is %XX (which is a
> single byte, not 2 bytes). So, is the data getting lost? What
> format is it, if it is not getting lost.

I suspect a coding error.  It must be line 42.

--
 <http://www.cs.auckland.ac.nz/~pgut001/pubs/vista_cost.txt>
 <http://www.securityfocus.com/columnists/423>
 <http://www.aaxnet.com/editor/edit043.html>
 <http://kadaitcha.cx/vista/dogsbreakfast/index.html>
                        cbfalconer at maineline dot net

