TCL(Tool Command Language) Scripting
Hebrew in starkit
Hi all, can you please help me with this? I wrote a script that uses Hebrew strings, and it worked as I wanted. But after converting it to a starkit (a starpack, actually), the Hebrew strings came out as Latin/Greek/some other encoding... How can I fix it? Thanks
On May 30, 9:58 pm, iu2 <isra@elbit.co.il> wrote:
> Hi all,
> Can you please help me with this?
> I wrote a script that uses Hebrew strings, and I got what I wanted.
> But converting it to a starkit (a starpack actually) the Hebrew
> strings became Latin/Greek/some other encoding...
> How can I fix it?
> Thanks
Without more information I can't be sure, but Starkits by default include only a small subset of the encodings available in Tcl, so the problem is probably that your Starkit does not contain the appropriate encoding file. What encoding is your Hebrew text in? Is it in cp862 or iso8859-8?
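One way to check this suspicion is to ask Tcl, from inside the running starkit, which encodings it can actually load. The following is a minimal sketch: the helper name `hasEncoding` is made up for illustration, while `encoding names` and `encoding system` are standard Tcl commands.

```tcl
# Hypothetical helper: report whether a named encoding is known to this
# interpreter (built in, or loadable from its encoding directory).
proc hasEncoding {name} {
    return [expr {[lsearch -exact [encoding names] $name] >= 0}]
}

# Show the encoding Tcl uses when talking to the operating system.
puts "system encoding: [encoding system]"

# Check the Hebrew encodings mentioned in this thread.
foreach enc {cp1255 cp862 iso8859-8} {
    puts [format "%-10s available: %s" $enc [hasEncoding $enc]]
}
```

Running the same lines once under a plain tclsh/wish and once inside the starkit should show whether the two environments differ.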
On May 31, 11:26 am, billpo@alum.mit.edu wrote:
> Without more information I can't be sure, but Starkits by default
> include only a small subset of the encodings available in Tcl, so the
> problem is probably that your Starkit does not contain the appropriate
> encoding file. What encoding is your Hebrew text in? Is it in cp862 or
> iso8859-8?
It is cp1255. What's weird is that when I test the encoding from within the starkit I get cp1252, which is also good for Hebrew. I managed to include all the encodings in the starkit's lib and changed the system encoding to cp1255, but that didn't help either.
On May 31, 8:31 am, iu2 <isra@elbit.co.il> wrote:
> It is cp1255
> What's weird is that testing the encoding from the starkit I get
> cp1252, which is also good for Hebrew. I managed to include all the
> encodings in the starkits' lib and changed the system encoding to
> cp1255 but that didn't help either.
Hunh? The cp1252 that I know doesn't include any Hebrew characters. It is an extended Latin character set. What is the native encoding of your system? And how is the Hebrew text included? That is, are you reading cp1255-encoded text from a file, or is your tcl script itself in cp1255?
On May 31, 6:44 pm, billpo@alum.mit.edu wrote:
> Hunh? The cp1252 that I know doesn't include any Hebrew characters. It
> is an extended Latin character set.
> What is the native encoding of your system? And how is the Hebrew text
> included? That is, are you reading cp1255-encoded text from a file, or
> is your tcl script itself in cp1255?
I just use "puts [encoding system]" at the beginning of my script and it prints cp1255. I don't read a file; the script itself is in cp1255 (probably...). The Hebrew text is inside my script, e.g.,
label .lbl -text שלום
My editor doesn't even show Hebrew letters, but Tk does!
On May 31, 10:59 am, iu2 <isra@elbit.co.il> wrote:
> I just use "puts [encoding system]" at the beginning of my script and
> it prints cp1255.
> I don't read a file, the script itself is in cp1255 (probably...) The
> Hebrew text is inside my script, e.g.,
> label .lbl -text שלום
> My editor doesn't even show Hebrew letters, but Tk does!
I think that the text in your script is in UTF-8. When I read your post I get "shalom" written in UTF-8, which is the native internal encoding of Tk. The encoding returned by [encoding system] is the encoding that Tcl uses when passing strings to the operating system.
On May 31, 10:59 am, iu2 <isra@elbit.co.il> wrote:
> The Hebrew text is inside my script, e.g.,
> label .lbl -text שלום
I'm not sure why this problem arises in the Starkit, but my experiments indicate that you can solve the problem by using \u escapes for the Hebrew text. In your example, instead of
label .lbl -text שלום
put
label .lbl -text \u05E9\u05DC\u05D5\u05DD
and it should work.
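Typing \u escapes by hand gets tedious for longer strings, so a small helper can generate them. This is only a sketch: the proc name `toUnicodeEscapes` is invented here, it assumes the input string has already been decoded correctly (for example, pasted into a console where the Hebrew displays properly), and it assumes all characters fit in four hex digits.

```tcl
# Hypothetical helper: rewrite every non-ASCII character of a string as a
# Tcl \uXXXX escape, leaving plain ASCII characters untouched.
proc toUnicodeEscapes {text} {
    set out ""
    foreach ch [split $text ""] {
        # scan %c yields the Unicode code point of the character
        scan $ch %c code
        if {$code > 126} {
            append out [format "\\u%04X" $code]
        } else {
            append out $ch
        }
    }
    return $out
}

# The input is itself written with escapes, so this file stays pure ASCII.
puts [toUnicodeEscapes "\u05E9\u05DC\u05D5\u05DD"]
```

The printed text can be pasted into the script in place of the literal Hebrew, which keeps the source file ASCII-only and therefore independent of whichever encoding is used to read it.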
I think I may know what is happening. Someone with greater expertise in Starkit-ology can confirm or refute this. The problem is that, the way Starkits work, your original program ends up being sourced by a higher-level Tcl script that is part of the Starkit infrastructure. When it is sourced, the interpreter assumes that the source text is in the system encoding, which on your system is cp1255. Since your Hebrew text is written directly, the sequence of bytes that should be interpreted as UTF-8 is actually interpreted as cp1255, which produces gibberish. If you write the Hebrew using \u escapes instead, that forces the interpreter to interpret it as Unicode and all is well.
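The kind of mismatch described here can be reproduced in a plain tclsh, without any starkit. The sketch below assumes the cp1255 encoding is available (it ships with a standard Tcl installation); the variable names are illustrative only. It encodes the Hebrew word as UTF-8 bytes and then decodes those same bytes with the wrong and the right encoding.

```tcl
# The word from the thread, written with escapes so this file is ASCII-only.
set hebrew "\u05E9\u05DC\u05D5\u05DD"

# The bytes that a UTF-8 encoded script file would contain for this word.
set utf8Bytes [encoding convertto utf-8 $hebrew]
puts "UTF-8 byte count: [string length $utf8Bytes]"

# Decoding those bytes with the wrong encoding yields different characters.
set garbled [encoding convertfrom cp1255 $utf8Bytes]
puts "decoded as cp1255 matches original: [expr {$garbled eq $hebrew}]"

# Decoding them with the encoding they were written in restores the text.
set restored [encoding convertfrom utf-8 $utf8Bytes]
puts "decoded as utf-8 matches original:  [expr {$restored eq $hebrew}]"
```

The same reasoning applies in the other direction: bytes written in cp1255 but read as cp1252 also come out wrong, which is consistent with what is reported later in the thread.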
On Jun 1, 12:23 am, billpo@alum.mit.edu wrote:
> I think I may know what is happening. Someone with greater expertise
> in Starkit-ology can confirm or refute this. The problem is that the
> way Starkits work, your original program ends up being sourced by a
> higher-level Tcl script that is part of the Starkit infrastructure.
> When it is sourced, the interpreter assumes that the source text is in
> the system encoding, which on your system is cp1255. Since your Hebrew
> text is written directly, the sequence of bytes that should be
> interpreted as UTF-8 is actually interpreted as cp1255, which produces
> gibberish. If you write the Hebrew using \u escapes, instead, that
> forces the interpreter to interpret it as Unicode and all is well.
Thanks, that gave me an idea which actually worked! I wrote
label .lbl -text [encoding convertfrom cp1255 "שלום"]
and it showed OK in the starkit. Admittedly, when run as a plain Tcl script (not from the starkit) it now looks like gibberish, but that can be "if"-ed in the script itself by testing for a starkit.
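The "if" mentioned here could look roughly like the sketch below. It rests on assumptions: that the application uses the standard `starkit` package, whose `starkit::startup` sets the variable `::starkit::topdir`, and that cp1255 is the right encoding for the literals. The proc name `hebrewText` is invented for illustration.

```tcl
# Hypothetical wrapper: re-decode a string literal only when the script is
# running from inside a starkit/starpack; otherwise return it unchanged.
proc hebrewText {raw} {
    # Assumption: ::starkit::topdir exists only after starkit::startup ran.
    if {[info exists ::starkit::topdir]} {
        return [encoding convertfrom cp1255 $raw]
    }
    return $raw
}

# Intended use, with the Hebrew literal typed directly in the script:
#   label .lbl -text [hebrewText "..."]
puts [hebrewText "plain text passes through outside a starkit"]
```

Keeping the test in one proc means the rest of the script does not need to know how it was launched.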
On Jun 2, 10:35 pm, iu2 <isra@elbit.co.il> wrote:
> Thanks, that gave me an idea which actually worked!
> I wrote
> label .lbl -text [encoding convertfrom cp1255 "שלום"]
> And it showed Ok in the starkit.
> Indeed now when run as a tcl script (not from the starkit) it looks
> gibberish, but that can be "if"-ed in the script itself, testing for a
> starkit.
I'm glad that works. You might try to abstract away from the system encoding of the target system by writing:
label .lbl -text [encoding convertfrom [encoding system] "שלום"]
Bill
On Jun 3, 8:24 am, billpo@alum.mit.edu wrote:
> I'm glad that works. You might try to abstract away from the system
> encoding of the target system by writing:
> label .lbl -text [encoding convertfrom [encoding system] "שלום"]
> Bill
I think there's a problem here, since the default system encoding inside the starkit might differ from the default system encoding of the OS. For example, in my case the OS system encoding is cp1255 but the starkit's default encoding is cp1252. I got gibberish until I explicitly used cp1255.
On Jun 3, 1:31 am, iu2 <isra@elbit.co.il> wrote:
> I think there's a problem here, since the default encoding system for
> the starkit might differ from the default system encoding of the OS.
> For example in my case, the OS system encoding is cp1255 but the
> starkit's default encoding is cp1252. I got gibberish until I
> explicitly used cp1255.
Ah, right. I guess there's no need to abstract then since the problematic encoding is associated with the Starkit and so will remain constant.