Esperanto mojibake & CS+ font editing

Jul 17, 2017 at 6:54 AM
Neophyte Member
"Fresh from the Bakery"
Join Date: Jul 17, 2017
Location:
Posts: 6
I'm considering making an Esperanto translation of Cave Story, and I have been testing some stuff to see if I will be able to make it work.

The main problem right now is Esperanto letters. Courier New supports them, but when I try to use them in-game, I get mojibake like this:

[screenshot: p359462-0-kpsfefn.png]


Those first two lines should say:
eĥoŝanĝo ĉiuĵaŭde
EĤOŜANĜO ĈIUĴAŬDE
(it's a short phrase that includes all 6 letters unique to Esperanto)

Does anyone know a way I can get these letters to display correctly? ASCII substitutes exist (e.g. ehxosxangxo cxiujxauxde), but I'd rather not use them if at all possible. They look a bit ugly. :p
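
For reference, the ASCII substitution mentioned above (each accented letter becomes its base letter followed by x) can be sketched in a few lines of Python. This is only an illustration: the name to_x_system is made up, and writing capitals as "Cx" rather than "CX" is an assumption, since the example above only shows lowercase.

```python
# Sketch of the "x-system" ASCII substitution for Esperanto.
# Assumption: capitals become an uppercase base letter plus a lowercase x.
X_SYSTEM = {
    "ĉ": "cx", "ĝ": "gx", "ĥ": "hx", "ĵ": "jx", "ŝ": "sx", "ŭ": "ux",
    "Ĉ": "Cx", "Ĝ": "Gx", "Ĥ": "Hx", "Ĵ": "Jx", "Ŝ": "Sx", "Ŭ": "Ux",
}

# str.maketrans accepts a dict whose keys are single characters and
# whose values are replacement strings of any length.
_TABLE = str.maketrans(X_SYSTEM)


def to_x_system(text):
    """Replace every Esperanto accented letter with its ASCII digraph."""
    return text.translate(_TABLE)


print(to_x_system("eĥoŝanĝo ĉiuĵaŭde"))  # ehxosxangxo cxiujxauxde
```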


Also, does anyone know a way to edit the font files of Cave Story+? If there's a way I could add characters, that would be awesome. I'm guessing csfont.fnt tells the game which character is found where on the csfont_00.bmp bitmap, but I can't find a way to read and edit it.
 
Jul 17, 2017 at 7:20 AM
In my body, in my head
Forum Moderator
"Life begins and ends with Nu."
Join Date: Aug 28, 2009
Location: The Purple Zone
Posts: 5998
cave story only recognizes multi-byte (unicode etc) characters that have the high bit set in the high byte. Specifically it does this because that will capture japanese characters, and the rest of the text is 7-bit ascii
haven't got a good solution for you at this moment as it's quite late for me
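
A rough Python model of that rule, for illustration only. It is not the game's actual code: split_units is a made-up helper, and "a byte with the high bit set starts a two-byte character" is just one reading of the description above. The last two lines assume the bytes end up being drawn through a single-byte code page such as Windows-1252, which would be one way to get mojibake.

```python
def split_units(data: bytes) -> list[bytes]:
    """Split raw text bytes into 1-byte and 2-byte units.

    Assumed rule: a byte with the high bit (0x80) set is the first
    byte of a two-byte character; every other byte is 7-bit ASCII.
    """
    units = []
    i = 0
    while i < len(data):
        if data[i] & 0x80 and i + 1 < len(data):
            units.append(data[i:i + 2])
            i += 2
        else:
            units.append(data[i:i + 1])
            i += 1
    return units


# "ĉ" is U+0109, which UTF-8 stores as the two bytes C4 89.
utf8 = "aĉb".encode("utf-8")
print(split_units(utf8))  # [b'a', b'\xc4\x89', b'b']

# The same two bytes read as Windows-1252 are two unrelated characters.
as_cp1252 = "ĉ".encode("utf-8").decode("cp1252")
print(as_cp1252)  # Ä‰
```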
 
Jul 17, 2017 at 5:24 PM
Neophyte Member
"Fresh from the Bakery"
Join Date: Jul 17, 2017
Location:
Posts: 6
Aha, I see.

I was actually a little surprised to run into this, because I've browsed through other threads about translation, and there wasn't any mention of it there. I can understand it for French, because its characters are supported in ANSI, but Polish should be having the same problem:

[screenshot: p359470-0-y0uflhp.png]

That bottom line should read "ć ń ó ś ź ż ą ę ł", the Polish letters.

I can only get the ó to work, by converting to ANSI encoding. I can't figure out how they got the other letters to work.
 
Jul 17, 2017 at 6:37 PM
In my body, in my head
Forum Moderator
"Life begins and ends with Nu."
Join Date: Aug 28, 2009
Location: The Purple Zone
Posts: 5998
have you tried using one of those international versions as a base instead of english?
 
Jul 17, 2017 at 8:52 PM
Neophyte Member
"Fresh from the Bakery"
Join Date: Jul 17, 2017
Location:
Posts: 6
I tried taking a look at the Russian one, but Russian texts look like this when I run it:

[screenshot: p359474-0-rwbo9yz.png]


So it looks like the translations rely on the Windows setting for non-Unicode programs. Which I should have known, because Japanese does this too. I forgot about that.
This is a bit of a problem, because Esperanto does not have a local encoding option like this.

I guess I could make a custom font that replaces unused ASCII symbols with Esperanto letters and hack it into the game. But I'd have to figure out how to do all that, and I'd prefer not having to require players to install a weird font.

*sigh* I thought this would be easier.
 
Jul 18, 2017 at 5:38 AM
Administrator
Forum Administrator
"Life begins and ends with Nu."
Join Date: Jul 15, 2007
Location: Australia
Posts: 6210
Age: 38
You clearly have a non-unicode codepage or you wouldn't get anything.

I suspect that your default non-unicode codepage is Windows-1250. Just save as ANSI.

https://en.wikipedia.org/wiki/Windows-1250

Actually I think I might try a bit of a proof of concept later.
 
Jul 18, 2017 at 7:26 PM
Neophyte Member
"Fresh from the Bakery"
Join Date: Jul 17, 2017
Location:
Posts: 6
> Just save as ANSI.
That is the problem. ANSI doesn't support the characters I need. ĉ ĝ ĥ ĵ ŝ ŭ are not used in any language marketable enough for Windows to make a code page for them.

I'm pretty sure my computer uses Windows-1252, btw.
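
A quick way to check which letters a given Windows code page can store, sketched in Python. The helper name encodable is made up; cp1250 and cp1252 are Python's built-in codec names for Windows-1250 and Windows-1252.

```python
ESPERANTO = "ĉĝĥĵŝŭ"
POLISH = "ćńóśźżąęł"


def encodable(letters, codec):
    """Return the subset of letters that the given codec can encode."""
    kept = []
    for ch in letters:
        try:
            ch.encode(codec)
        except UnicodeEncodeError:
            continue  # this code page has no slot for the letter
        kept.append(ch)
    return "".join(kept)


for codec in ("cp1250", "cp1252"):
    print(codec, "Polish:", encodable(POLISH, codec) or "(none)")
    print(codec, "Esperanto:", encodable(ESPERANTO, codec) or "(none)")
```

Windows-1250 covers all nine Polish letters and Windows-1252 covers only ó, which matches what I saw when saving as ANSI. Neither covers any of the six Esperanto letters.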
 
Jul 19, 2017 at 2:07 AM
Administrator
Forum Administrator
"Life begins and ends with Nu."
Join Date: Jul 15, 2007
Location: Australia
Posts: 6210
Age: 38
q=ĉ
w=ĝ
x=ĥ
y=ĵ
{=ŝ
[=ŭ

Q=Ĉ
W=Ĝ
X=Ĥ
Y=Ĵ
}=Ŝ
]=Ŭ

Would this work as a font mapping?

I can build a custom font. So all we need is some way to link it to CS without someone needing to install it first.
 
Jul 19, 2017 at 3:39 AM
Neophyte Member
"Fresh from the Bakery"
Join Date: Jul 17, 2017
Location:
Posts: 6
As long as <> and {} are not used in the game and they don't mess with the code if I use them, that would work, yes. :)

But I'd make it:
q=ĉ
w=ŭ
x=ŝ
y=ĝ
<=ĵ
{=ĥ

ŭ is basically a w sound, and ĉ and ŝ correspond to q and x in Chinese pinyin. For the other three, ĝ is common, ĵ is uncommon, and ĥ is very rare, so ĝ can have the normal letter (which also happens to be a little visually similar to a handwritten g), and ĥ can have the weird one that is hardest to reach on a keyboard.

That way it's a little easier for me if I want to edit the text after converting my translations to the ASCII substitutes.
 
Jul 19, 2017 at 7:15 AM
Administrator
Forum Administrator
"Life begins and ends with Nu."
Join Date: Jul 15, 2007
Location: Australia
Posts: 6210
Age: 38
Actually I probably shouldn't use <> or [] for that exact reason, but {} should be fine.

What about \=ĵ and |=Ĵ (backslash for lowercase and pipe for uppercase)? Would that work?
 
Jul 19, 2017 at 3:41 PM
In my body, in my head
Forum Moderator
"Life begins and ends with Nu."
Join Date: Aug 28, 2009
Location: The Purple Zone
Posts: 5998
iirc the only "special character" in vanilla TSC is < and #, and sometimes =
anything else is fair game
 
Jul 20, 2017 at 2:28 AM
Administrator
Forum Administrator
"Life begins and ends with Nu."
Join Date: Jul 15, 2007
Location: Australia
Posts: 6210
Age: 38
> iirc the only "special character" in vanilla TSC is < and #, and sometimes =
> anything else is fair game
We need to account for the credits as well. So square brackets are out.
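
Putting the thread's suggestions together, converting a translated script to the font slots could look something like the Python sketch below. The mapping is an assumption, not a settled decision: lowercase follows the q/w/x/y proposal with ĵ moved to backslash and ĥ kept on {, while the uppercase slots (Q, W, X, Y, pipe and }) are guesses by analogy. The names SUBSTITUTES, RESERVED and to_font_slots are made up for illustration.

```python
# Assumed mapping from Esperanto letters to the ASCII slots that a
# custom font would redraw. Nothing here is final.
SUBSTITUTES = {
    "ĉ": "q", "ŭ": "w", "ŝ": "x", "ĝ": "y", "ĵ": "\\", "ĥ": "{",
    "Ĉ": "Q", "Ŭ": "W", "Ŝ": "X", "Ĝ": "Y", "Ĵ": "|", "Ĥ": "}",
}

# Characters the thread says to avoid: <, # and = in TSC,
# plus square brackets because of the credits.
RESERVED = set("<#=[]")


def to_font_slots(text):
    """Swap each Esperanto letter for its assumed ASCII font slot."""
    clash = RESERVED & set(SUBSTITUTES.values())
    if clash:
        raise ValueError(f"mapping uses reserved characters: {sorted(clash)}")
    return text.translate(str.maketrans(SUBSTITUTES))


print(to_font_slots("eĥoŝanĝo ĉiuĵaŭde"))  # e{oxanyo qiu\awde
print(to_font_slots("EĤOŜANĜO ĈIUĴAŬDE"))  # E}OXANYO QIU|AWDE
```

One catch with any mapping like this: a real q, w, x or y in the text (in a foreign name, say) would be drawn as the Esperanto letter instead.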
 