Author |
Message |
michalk
Joined: 29 Aug 2014 Posts: 211
|
|
SA Editor - charset |
|
I'm connecting SA to our PostgreSQL databases using pglib.
It seems the SA Editor data grid works with the local operating system character set (win1250 in my case), but doesn't request it from the database when connecting.
As a result, SA Editor retrieves UTF8-encoded data and tries to display it as cp1250, which ends up as scrambled letters.
A temporary solution is to send SET NAMES 'utf8' to the database before retrieving data, but it works for the current connection only. It also doesn't help when previewing data in pgAdmin or Notepad++ by clicking on a table in the source code.
Is there a way to set the character set for a connection?
IMO SA should set the client encoding on its own, since it knows which character set it is working with (based on the OS charset).
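To make the suspected mechanism concrete, here is a minimal Python sketch of what happens when bytes sent in one encoding are displayed in another. It is only an illustration of the general effect, not SA's actual code: the `render` helper and the sample text are made up for this example, and the assumption is that the grid decodes whatever bytes arrive as cp1250.

```python
def render(text: str, wire: str, display: str = "cp1250") -> str:
    """Encode text as sent over the connection, then decode it as the grid would.

    Assumption: the grid always decodes with the OS code page (cp1250 here).
    Bytes that have no meaning in that code page become U+FFFD.
    """
    return text.encode(wire).decode(display, errors="replace")


SAMPLE = "Žluťoučký kůň"  # hypothetical Czech sample text

mismatched = render(SAMPLE, wire="utf-8")   # client encoding left at UTF8
matched = render(SAMPLE, wire="cp1250")     # client encoding set to WIN1250

print(mismatched)          # scrambled: each accented letter becomes two wrong characters
print(matched == SAMPLE)   # True
```

If this is what happens, the text only displays correctly when the encoding requested from the server matches the one used for display, which is why setting the client encoding per connection matters.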
with regards
|
|
Fri Jul 13, 2018 8:21 am |
|
 |
SysOp
Site Admin
Joined: 26 Nov 2006 Posts: 7948
|
|
|
|
Thank you. I think I would need to submit that as an issue. I don't know what settings to use to resolve it.
May I ask you for a favor? Would you please attach a screenshot demonstrating what happens in your environment?
|
|
Fri Jul 13, 2018 10:24 am |
|
 |
michalk
Joined: 29 Aug 2014 Posts: 211
|
|
|
Fri Jul 13, 2018 10:30 am |
|
 |
SysOp
Site Admin
Joined: 26 Nov 2006 Posts: 7948
|
|
|
|
Here is the response I've got from the team.
All query results are retrieved and displayed using the UTF-8 charset. No explicit conversion to other charsets is applied on the SQL Editor side. They are asking you to check the same query results in pgAdmin to see whether the data saved in the database table already contains extra symbols. When you convert that data to win1250 explicitly, the extra symbols cannot be seen because they get removed.
|
|
Sun Jul 15, 2018 9:52 pm |
|
 |
michalk
Joined: 29 Aug 2014 Posts: 211
|
|
|
|
So... as you can see in the attached screenshot, it obviously doesn't work the way the devs think it does.
The screenshot shows that national characters are displayed correctly after setting the win1250 client charset. This indicates that the data grid itself uses the win1250 charset too. It looks like the charset is derived from the operating system.
Two additional pieces of information:
1. The database is configured for UTF-8, including the default client encoding.
2. We work with the Czech version of Windows 10.
One way or another, I'm looking for a solution to my issue: a way to display diacritics properly.
|
|
Mon Jul 16, 2018 4:36 am |
|
 |
SysOp
Site Admin
Joined: 26 Nov 2006 Posts: 7948
|
|
|
|
Would you please attach a screenshot of the same results from pgAdmin?
Could you also try an ODBC connection with SQL Editor?
They believe the grid itself doesn't perform any character set conversions; if a conversion happens, it happens somewhere in a middle layer, or the data is already saved that way. That's what they told me.
|
|
Mon Jul 16, 2018 8:00 am |
|
 |
michalk
Joined: 29 Aug 2014 Posts: 211
|
|
|
|
PgAdmin shows the diacritics correctly by default.
Executing in pgAdmin
SHOW client_encoding
returns 'UNICODE', which is equivalent to UTF8.
The ODBC case is trickier, but it may shed some light on the issue.
There are two versions of the ODBC driver: UNICODE and ANSI.
With the UNICODE version, SHOW client_encoding returns UTF8 and the data is shown properly. When I force a change of character set by calling SET NAMES 'win1250', I get empty strings in place of the ones that originally contained national characters.
With the ANSI driver, SHOW client_encoding returns WIN1250. Again, the data is displayed correctly. When I change the client encoding by calling SET NAMES 'utf8', I get mangled strings, the same way as with pglib.
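One possible reading of these two ODBC observations, sketched in Python. This is a guess at the mechanism, not something confirmed in this thread: the `decode_as` helper and the sample word are hypothetical, and returning an empty string on a decoding failure is an assumption made here to mimic the empty cells.

```python
# Hypothetical sample text; any Czech word with diacritics behaves similarly.
SAMPLE = "Žluťoučký"


def decode_as(text: str, wire: str, client: str) -> str:
    """Encode text in the wire encoding, then decode it as the client would.

    Returns an empty string when the bytes are not valid in the client
    encoding (an assumption, chosen to mimic the empty strings reported
    with the UNICODE driver).
    """
    raw = text.encode(wire)
    try:
        return raw.decode(client)
    except UnicodeDecodeError:
        return ""


# UNICODE driver after SET NAMES 'win1250': cp1250 bytes read as UTF-8.
unicode_driver_result = decode_as(SAMPLE, wire="cp1250", client="utf-8")

# ANSI driver after SET NAMES 'utf8': UTF-8 bytes read as cp1250.
ansi_driver_result = decode_as(SAMPLE, wire="utf-8", client="cp1250")

print(repr(unicode_driver_result))  # empty: cp1250 bytes are not valid UTF-8 here
print(ansi_driver_result)           # non-empty but mangled
```

The asymmetry comes from the encodings themselves: almost any byte sequence decodes as cp1250, so a mismatch yields mangled text, while cp1250 bytes for accented letters are usually invalid UTF-8, so a strict decoder has nothing sensible to return.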
|
|
Mon Jul 16, 2018 10:02 am |
|
 |
SysOp
Site Admin
Joined: 26 Nov 2006 Posts: 7948
|
|
|
|
Do I understand correctly that with the Unicode version of the ODBC driver, the data in the SQL Editor result grid appears correctly,
and with the ANSI version of the ODBC driver it's the same as with pglib and it's mangled?
|
|
Mon Jul 16, 2018 10:06 am |
|
 |
michalk
Joined: 29 Aug 2014 Posts: 211
|
|
|
|
Actually (which is a bit strange to me, but I'm not familiar with the ODBC layer), both versions of the ODBC driver show proper strings by default, even the ANSI version, which automatically sets the WIN1250 character set for communication with the database.
The ANSI version returns mangled text after SET NAMES 'utf8'.
|
|
Mon Jul 16, 2018 10:10 am |
|
 |
SysOp
Site Admin
Joined: 26 Nov 2006 Posts: 7948
|
|
|
|
Thank you. I will forward your replies to the team.
|
|
Mon Jul 16, 2018 10:21 am |
|
 |
SysOp
Site Admin
Joined: 26 Nov 2006 Posts: 7948
|
|
|
|
Could you please provide us with the CREATE DATABASE script for your database and the CREATE TABLE script for the table in question, so that we can reproduce your environment with the same encoding and collation settings?
|
|
Tue Jul 17, 2018 1:51 am |
|
 |
michalk
Joined: 29 Aug 2014 Posts: 211
|
|
|
|
 |
 |
CREATE DATABASE mydb
    WITH OWNER = mydb
         ENCODING = 'UTF8'
         TABLESPACE = pg_default
         LC_COLLATE = 'cs_CZ.UTF-8'
         LC_CTYPE = 'cs_CZ.UTF-8'
         CONNECTION LIMIT = -1;

CREATE TABLE public.szn_stav
(
    id_stav serial NOT NULL,
    stav character(30),
    CONSTRAINT pkey_stav PRIMARY KEY (id_stav)
)
WITH (
    OIDS=TRUE
);
|
Here are a few values that might matter, retrieved by running SELECT * FROM pg_settings in pgAdmin:
'lc_collate';'cs_CZ.UTF-8'
'lc_ctype';'cs_CZ.UTF-8'
'lc_messages';'en_US.UTF-8'
'lc_monetary';'en_US.UTF-8'
'lc_numeric';'en_US.UTF-8'
'lc_time';'en_US.UTF-8'
'client_encoding';'UNICODE'
|
|
Tue Jul 17, 2018 5:28 am |
|
 |
SysOp
Site Admin
Joined: 26 Nov 2006 Posts: 7948
|
|
|
|
Thank you. We will try to reproduce your setup.
|
|
Tue Jul 17, 2018 9:14 am |
|
 |
SysOp
Site Admin
Joined: 26 Nov 2006 Posts: 7948
|
|
|
|
Thank you. We reproduced this issue. It appears to be specific to the combination of the following factors:
1. PostgreSQL
2. Use of the pqLib connection interface
3. Use of the bpchar internal data type mapped to the character column data type
With certain collations, this combination results in incorrect character conversion, which produces the observed effect.
As a temporary workaround, we recommend using an ODBC connection with the PostgreSQL Unicode driver.
|
|
Tue Jul 17, 2018 11:03 am |
|
 |
michalk
Joined: 29 Aug 2014 Posts: 211
|
|
|
|
Nice job!
|
|
Tue Jul 17, 2018 11:13 am |
|
 |
|