Bug: Other charsets

Postby Tomi on Fri Dec 16, 2005 8:17 am

The system replaces all non-English characters with a ? (for example, čž, etc.).

If hostip.info is to become a recognised community-based project, then this must be improved.

Tomi
Tomi
Expert
 

Umlauts

Postby paulhaeder on Wed Jun 07, 2006 3:49 am

I agree. I'd love to see Zürich instead of Z?rich, etc. Please try to solve this.

Paul
paulhaeder
n00b
 
Posts: 4
Joined: Wed Jun 07, 2006 3:47 am

Postby robocoder on Thu Jun 08, 2006 8:01 am

Is that an actual example? It looks like Zürich is stored as Zurich in the database.

I did, however, notice that Beaupré, QC is coming up as Beaupr, QC.
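The two symptoms reported so far (a "?" in place of the character, and the character silently dropped) match what happens when text is forced into ASCII with different error handlers. The thread does not say how hostip.info processes names, so the sketch below only illustrates the behaviour using Python's built-in codecs; `to_ascii` is a hypothetical helper name.

```python
def to_ascii(name: str, errors: str) -> str:
    """Force a city name into ASCII using the given codec error handler."""
    return name.encode("ascii", errors=errors).decode("ascii")


# "replace" substitutes a question mark for each unencodable character.
print(to_ascii("Zürich", "replace"))   # Z?rich

# "ignore" drops unencodable characters entirely.
print(to_ascii("Beaupré", "ignore"))   # Beaupr
```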

There are a number of ways to approach this, including:
  • store the umlauts natively in the database
  • add an encode/decode step
  • use lookup tables (e.g., Zurich -> Zürich)
Also, the soundex (used in "Make a correction") would need to handle umlauts (if it doesn't already do so).
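A minimal sketch of two of these ideas, assuming the database keeps ASCII spellings: a lookup table that maps a stored name to its display form, and an accent-folding step that could run before a soundex comparison. The table contents and the function names `display_name` and `fold_accents` are hypothetical and not taken from the hostip.info codebase.

```python
import unicodedata

# Hypothetical lookup table: ASCII spelling in the database -> display spelling.
DISPLAY_NAMES = {
    "Zurich": "Zürich",
    "Beaupre": "Beaupré",
}


def display_name(stored: str) -> str:
    """Return the accented display form, or the stored name if none is known."""
    return DISPLAY_NAMES.get(stored, stored)


def fold_accents(name: str) -> str:
    """Strip combining marks so accented and plain spellings compare equal."""
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(display_name("Zurich"))   # Zürich
print(fold_accents("Zürich"))   # Zurich
```

Folding only removes marks that Unicode treats as combining characters. Letters with no decomposition, such as the Polish ł, pass through unchanged and would need an explicit mapping.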
robocoder
Veteran
 
Posts: 105
Joined: Sat Mar 25, 2006 8:10 pm
Location: Ontario, Canada

Umlauts

Postby paulhaeder on Mon Jun 12, 2006 6:50 am

That is an actual example.

"It looks like Zürich is stored as Zurich in the database."

Yes and no. That's another problem I've come across. Cities are stored with different spellings: Zurich and Zürich. Rome is also known as Roma (Italian) and Rom (German). Which one is returned depends on the IP.
paulhaeder
n00b
 
Posts: 4
Joined: Wed Jun 07, 2006 3:47 am

What's the status of this bug?

Postby sneba on Tue Jan 16, 2007 2:36 pm

I would like to know the status of this bug before I consider integrating the hostip system into my project. Are special characters in city names still not shown correctly? Can anyone tell me?

Regards,
sneba
n00b
 
Posts: 1
Joined: Tue Jan 16, 2007 2:30 pm

Postby Coval on Tue Jul 03, 2007 10:56 am

Thanks for a great free service! Great job!

There is a huge problem with Polish city names. Most of those that include our "national characters" are corrupted.

ś, Ś, Ł, ł, ą, Ą, ź, Ź, ż, Ż, ę, Ę

In what charset is the hostip.info database encoded?
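The thread does not answer this question, but the choice of charset decides whether these letters can be stored at all. As a rough illustration using Python's standard codecs (the helper name `can_encode` is hypothetical), the characters listed above fit in ISO-8859-2 and UTF-8 but not in ASCII or Latin-1:

```python
def can_encode(text: str, charset: str) -> bool:
    """Report whether every character in text is representable in charset."""
    try:
        text.encode(charset)
    except UnicodeEncodeError:
        return False
    return True


POLISH = "śŚŁłąĄźŹżŻęĘ"

for charset in ("ascii", "latin-1", "iso-8859-2", "utf-8"):
    print(charset, can_encode(POLISH, charset))
```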
Coval
n00b
 
Posts: 1
Joined: Tue Jul 03, 2007 10:44 am

