We discussed this with Pavel and Anton just a moment ago. Summary follows.

- New byte "flag" is to be added (ENCODED_STRING)
- 'Encoding' property is to be added at
  -- global level (BinaryConfiguration)
  -- per-class level (BinaryTypeConfiguration)
  -- per-field level (BinaryTypeConfiguration)

2017-07-28 14:15 GMT+03:00 Vladimir Ozerov [via Apache Ignite Developers] <[hidden email]>:

> As Pavel mentioned, Marshaller should not be tied to cache, BinaryObject

--
Best regards,
Andrey Kuznetsov.
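To make the summary above concrete, here is a minimal sketch of what a self-describing ENCODED_STRING value could look like on the wire. Everything in it is an assumption made for illustration: the flag value 0x7F, the one-byte charset ids, the layout [flag][charset id][length][payload] and the class name EncodedStringCodec are not taken from Ignite.

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical layout: [type flag][charset id][int length][payload]. Not Ignite code. */
final class EncodedStringCodec {
    /** Made-up flag value; the real constant would be chosen by the marshaller. */
    static final byte ENCODED_STRING = 0x7F;

    /** Made-up table mapping a one-byte id to a charset. */
    static final Charset[] CHARSETS = {
        StandardCharsets.UTF_8,
        StandardCharsets.ISO_8859_1,
        StandardCharsets.UTF_16LE
    };

    /** Writes the flag, the charset id (the "one extra byte"), the length and the payload. */
    static byte[] write(String s, int charsetId) {
        byte[] payload = s.getBytes(CHARSETS[charsetId]);

        return ByteBuffer.allocate(2 + Integer.BYTES + payload.length)
            .put(ENCODED_STRING)
            .put((byte) charsetId)
            .putInt(payload.length)
            .put(payload)
            .array();
    }

    /** Reads the value back using only what is stored in the bytes themselves. */
    static String read(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);

        if (buf.get() != ENCODED_STRING)
            throw new IllegalArgumentException("Not an ENCODED_STRING value");

        int charsetId = buf.get() & 0xFF;

        if (charsetId >= CHARSETS.length)
            throw new IllegalArgumentException("Unknown charset id: " + charsetId);

        byte[] payload = new byte[buf.getInt()];
        buf.get(payload);

        return new String(payload, CHARSETS[charsetId]);
    }
}
```

Because the charset id travels with the value, the reader needs no cache or type configuration to decode it.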
Encoding *must not* be added to per-class or per-field level, this is wrong.

It should be added to per-cache level, and to per-cache-column level in future.

Fri, 28 Jul 2017 at 14:27, Andrey Kuznetsov <[hidden email]>:

> We discussed this with Pavel and Anton just a moment ago. Summary follows.
>
> - New byte "flag" is to be added (ENCODED_STRING)
> - 'Encoding' property is to be added at
>   -- global level (BinaryConfiguration)
>   -- per-class level (BinaryTypeConfiguration)
>   -- per-field level (BinaryTypeConfiguration)
>
> 2017-07-28 14:15 GMT+03:00 Vladimir Ozerov [via Apache Ignite Developers] <[hidden email]>:
>
> > As Pavel mentioned, Marshaller should not be tied to cache, BinaryObject
> > should be self-explanatory, i.e. containing all information necessary for
> > unmarshalling. This is an absolute requirement.
> >
> > We will have one extra byte in serialized form, meaning that the advantage
> > of custom encoding will become evident for all strings with length >= 1,
> > which is perfectly fine. I do not quite understand what we are arguing
> > about.
> >
> > As far as configuration, we can do it as follows:
> >
> > 1) Add global encoding, UTF8 by default.
> > 2) Add per-cache encoding.
> > 3) Add encoding to JDBC and ODBC driver properties.
> >
> > This should be enough.
>
> --
> Best regards,
> Andrey Kuznetsov.
>
> --
> View this message in context:
> http://apache-ignite-developers.2346864.n4.nabble.com/Non-UTF-8-string-encoding-support-in-BinaryMarshaller-IGNITE-5655-tp20024p20161.html
> Sent from the Apache Ignite Developers mailing list archive at Nabble.com.
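The "one extra byte" trade-off mentioned in the quoted message can be checked with plain JDK charsets. The sketch below is only arithmetic on byte lengths; the class name EncodingOverhead is made up, and windows-1251 is assumed to be available in the running JDK.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Compares payload sizes; the class name is hypothetical. */
final class EncodingOverhead {
    /** Payload size in UTF-8, with no extra encoding byte. */
    static int utf8Size(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    /** Payload size in a custom charset plus the one extra encoding byte. */
    static int customSize(String s, Charset cs) {
        return 1 + s.getBytes(cs).length;
    }

    public static void main(String[] args) {
        // Assumption: the JDK ships the windows-1251 charset.
        Charset cp1251 = Charset.forName("windows-1251");

        // Cyrillic sample, written with Unicode escapes to stay source-encoding neutral.
        String name = "\u0418\u0432\u0430\u043d";

        for (int len = 1; len <= name.length(); len++) {
            String prefix = name.substring(0, len);

            System.out.println(len + " chars: utf8=" + utf8Size(prefix)
                + ", custom=" + customSize(prefix, cp1251));
        }
    }
}
```

For Cyrillic text UTF-8 needs two bytes per character, so with the extra byte the single-byte encoding breaks even at one character and is strictly smaller from two characters on.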
> As Pavel mentioned, Marshaller should not be tied to cache
> should be added to per-cache level

Not sure if I follow. Marshalling and caching are two separate mechanisms. Defining binary format in CacheConfiguration violates separation of concerns.

> Encoding *must not* be added to per-class or per-field level, this is wrong

What is wrong with this? BinaryTypeConfiguration looks like the right place for such a setting. Are we talking from an SQL standpoint here, so you want this to be defined somehow via DDL in future?

On Fri, Jul 28, 2017 at 2:30 PM, Vladimir Ozerov <[hidden email]> wrote:

> Encoding *must not* be added to per-class or per-field level, this is wrong.
>
> It should be added to per-cache level, and to per-cache-column level in future.
String encoding is a concept similar to "collation" in RDBMS. You can define it either globally or on a per-table basis. The same should be done for Ignite. We do not define behavior of a type. We define behavior of a *storage*.

Two cases where the proposed per-type and per-type-field approach doesn't work:

1) I have a class Person with field "name". I have two caches/tables - one for US persons, where name is in Latin, another for RU persons with Cyrillic names. How can I achieve optimal encoding formats for both tables?
2) I have an empty grid. Now I want to create a cache/table with custom encoding. How can I do that without a cluster restart? There is no way, because BinaryTypeConfiguration is configured statically, while caches/tables can be created at runtime.

On Fri, Jul 28, 2017 at 2:38 PM, Pavel Tupitsyn <[hidden email]> wrote:

> > As Pavel mentioned, Marshaller should not be tied to cache
> > should be added to per-cache level
>
> Not sure if I follow. Marshalling and caching are two separate mechanisms.
> Defining binary format in CacheConfiguration violates separation of concerns.
>
> > Encoding *must not* be added to per-class or per-field level, this is wrong
>
> What is wrong with this? BinaryTypeConfiguration looks like the right place for
> such a setting. Are we talking from an SQL standpoint here, so you want this to
> be defined somehow via DDL in future?
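The two cases above can be illustrated with a small sketch in which the encoding is looked up by cache name and can be registered at any time. CacheEncodings, its methods and the cache names are hypothetical and are not Ignite APIs; the point is only that one Person.name value ends up encoded differently depending on the target cache.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical per-cache encoding registry; not an Ignite API. */
final class CacheEncodings {
    private final Map<String, Charset> perCache = new ConcurrentHashMap<>();

    /** Can be called at any time, e.g. when a cache is created dynamically. */
    void register(String cacheName, Charset cs) {
        perCache.put(cacheName, cs);
    }

    /** Encodes a field value with the target cache's charset (UTF-8 if none is registered). */
    byte[] encode(String cacheName, String value) {
        return value.getBytes(perCache.getOrDefault(cacheName, StandardCharsets.UTF_8));
    }
}
```

A usage example under the same assumptions: after register("persons_ru", Charset.forName("windows-1251")), a four-letter Cyrillic name written to "persons_ru" takes four bytes, while the same value written to a cache without a registered encoding takes eight bytes in UTF-8.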
Currently, the marshaller determines the type of a field (BYTE, INT, STRING etc.) only by the Class of the data being serialized. It seems rather non-trivial to manage marshalling parameters at the cache creation point. Alternatively, there is a simple and flexible way: just introduce a new Java type, say, StringWithEncoding, but it looks ugly to my mind.

2017-07-28 14:45 GMT+03:00 Vladimir Ozerov <[hidden email]>:

> String encoding is a concept similar to "collation" in RDBMS. You can
> define it either globally or on a per-table basis. The same should be done
> for Ignite. We do not define behavior of a type. We define behavior of a
> *storage*.
>
> Two cases where the proposed per-type and per-type-field approach doesn't work:
>
> 1) I have a class Person with field "name". I have two caches/tables - one
> for US persons, where name is in Latin, another for RU persons with
> Cyrillic names. How can I achieve optimal encoding formats for both tables?
> 2) I have an empty grid. Now I want to create a cache/table with custom
> encoding. How can I do that without a cluster restart? There is no way, because
> BinaryTypeConfiguration is configured statically, while caches/tables can be
> created at runtime.
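For illustration, the StringWithEncoding alternative could look roughly like the sketch below. Only the name StringWithEncoding comes from the message; its fields, the FieldTypes helper and the string type tags are assumptions, and the dispatch is a simplified stand-in for choosing a field type from the runtime class alone.

```java
import java.nio.charset.Charset;
import java.util.Objects;

/** Hypothetical wrapper that carries its charset with the value. */
final class StringWithEncoding {
    private final String value;
    private final Charset charset;

    StringWithEncoding(String value, Charset charset) {
        this.value = Objects.requireNonNull(value, "value");
        this.charset = Objects.requireNonNull(charset, "charset");
    }

    String value() {
        return value;
    }

    Charset charset() {
        return charset;
    }

    /** Bytes as they would be written for this field. */
    byte[] toBytes() {
        return value.getBytes(charset);
    }
}

/** Simplified stand-in for class-based field type detection; tags are made up. */
final class FieldTypes {
    static String typeOf(Object fieldValue) {
        if (fieldValue instanceof Byte)
            return "BYTE";
        if (fieldValue instanceof Integer)
            return "INT";
        if (fieldValue instanceof String)
            return "STRING";
        if (fieldValue instanceof StringWithEncoding)
            return "ENCODED_STRING";

        return "OBJECT";
    }
}
```

With such a wrapper the class-based dispatch stays unchanged, at the price of user classes having to declare their fields with the new type.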
> String encoding is a concept similar to "collation" in RDBMS. You can
> define it either globally, or on per-table basis.

Or on a per-column (per-field) basis. Though Oracle does not have per-column charset, some other databases provide this option.

MySQL:
- https://dev.mysql.com/doc/refman/5.7/en/create-table.html

  | CHAR[(length)] [BINARY]
      [CHARACTER SET charset_name] [COLLATE collation_name]
  | VARCHAR(length) [BINARY]
      [CHARACTER SET charset_name] [COLLATE collation_name]
  | TEXT [BINARY]
      [CHARACTER SET charset_name] [COLLATE collation_name]

SQL Server:
- https://docs.microsoft.com/en-us/sql/t-sql/statements/create-table-transact-sql

  <column_definition> ::=
  column_name <data_type>
      [ FILESTREAM ]
      [ COLLATE collation_name ]

Postgres:
- https://www.postgresql.org/docs/9.6/static/sql-createtable.html

  CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXISTS ] table_name ( [
    { column_name data_type [ COLLATE collation ]

> 1) I have a class Person with field "name". I have two caches/tables - one
> for US persons, where name is in Latin, another for RU persons with
> Cyrillic names. How can achieve optimal encoding formats for both tables?

You have to have two classes in this case, maybe with a common parent. Or you have to select a common denominator and settle on one encoding for both of them. Like Java did with UTF-16 java.lang.String-s.

--
Artem Schitow
[hidden email]

> On 28 Jul 2017, at 14:45, Vladimir Ozerov <[hidden email]> wrote:
>
> String encoding is a concept similar to "collation" in RDBMS. You can
> define it either globally or on a per-table basis. The same should be done
> for Ignite. We do not define behavior of a type. We define behavior of a
> *storage*.
>
> Two cases where the proposed per-type and per-type-field approach doesn't work:
>
> 1) I have a class Person with field "name". I have two caches/tables - one
> for US persons, where name is in Latin, another for RU persons with
> Cyrillic names. How can I achieve optimal encoding formats for both tables?
> 2) I have an empty grid. Now I want to create a cache/table with custom
> encoding. How can I do that without a cluster restart? There is no way, because
> BinaryTypeConfiguration is configured statically, while caches/tables can be
> created at runtime.
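The "two classes with a common parent" suggestion could be sketched as follows. The class names (PersonBase, UsPerson, RuPerson) and the TypeEncodings table are hypothetical; the table merely stands in for a per-type encoding setting and is not BinaryTypeConfiguration.

```java
import java.nio.charset.Charset;
import java.util.HashMap;
import java.util.Map;

/** Common parent holding the shared field. */
abstract class PersonBase {
    final String name;

    PersonBase(String name) {
        this.name = name;
    }
}

/** Person stored with Latin names. */
final class UsPerson extends PersonBase {
    UsPerson(String name) {
        super(name);
    }
}

/** Person stored with Cyrillic names. */
final class RuPerson extends PersonBase {
    RuPerson(String name) {
        super(name);
    }
}

/** Hypothetical per-class encoding table with a global fallback. */
final class TypeEncodings {
    private final Map<Class<?>, Charset> perType = new HashMap<>();
    private final Charset global;

    TypeEncodings(Charset global) {
        this.global = global;
    }

    void set(Class<?> type, Charset cs) {
        perType.put(type, cs);
    }

    /** Encodes the name with the charset configured for the concrete class. */
    byte[] encodeName(PersonBase p) {
        return p.name.getBytes(perType.getOrDefault(p.getClass(), global));
    }
}
```

In this variant the encoding follows the concrete class rather than the cache, which is exactly the point the thread is arguing about.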
Managing encoding on per-cache level is not that complex.

Essentially, when any cache message is prepared on the initiating node, we perform an Object -> BinaryObject transition. These places have a reference to the cache context ([1], [2]). This is where we should define the proper string encoding - either take the global encoding, or the cache-specific encoding.

As far as per-column encoding, let's put this fine-grained case aside for a while. This is not as widely used as the global or per-cache/per-table scenario.

[1] org.apache.ignite.internal.processors.cache.GridCacheContext#toCacheKeyObject(java.lang.Object)
[2] org.apache.ignite.internal.processors.cache.GridCacheContext#toCacheObject

On Fri, Jul 28, 2017 at 8:08 PM, Artem Schitow <[hidden email]> wrote:

> > String encoding is a concept similar to "collation" in RDBMS. You can
> > define it either globally, or on per-table basis.
>
> Or on a per-column (per-field) basis. Though Oracle does not have per-column
> charset, some other databases provide this option.
>
> MySQL:
> - https://dev.mysql.com/doc/refman/5.7/en/create-table.html
>
>   | CHAR[(length)] [BINARY]
>       [CHARACTER SET charset_name] [COLLATE collation_name]
>   | VARCHAR(length) [BINARY]
>       [CHARACTER SET charset_name] [COLLATE collation_name]
>   | TEXT [BINARY]
>       [CHARACTER SET charset_name] [COLLATE collation_name]
>
> SQL Server:
> - https://docs.microsoft.com/en-us/sql/t-sql/statements/create-table-transact-sql
>
>   <column_definition> ::=
>   column_name <data_type>
>       [ FILESTREAM ]
>       [ COLLATE collation_name ]
>
> Postgres:
> - https://www.postgresql.org/docs/9.6/static/sql-createtable.html
>
>   CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXISTS ] table_name ( [
>     { column_name data_type [ COLLATE collation ]
>
> > 1) I have a class Person with field "name". I have two caches/tables - one
> > for US persons, where name is in Latin, another for RU persons with
> > Cyrillic names. How can achieve optimal encoding formats for both tables?
>
> You have to have two classes in this case, maybe with a common parent. Or
> you have to select a common denominator and settle on one encoding for
> both of them. Like Java did with UTF-16 java.lang.String-s.
>
> --
> Artem Schitow
> [hidden email]
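The "either take global encoding, or cache-specific encoding" rule can be sketched as a small resolver invoked at the conversion point. CacheContextSketch and EncodingResolver are hypothetical stand-ins written for this illustration; they do not reproduce GridCacheContext or any other Ignite internals, and only the string payload is produced here.

```java
import java.nio.charset.Charset;

/** Hypothetical stand-in for a per-cache context; not GridCacheContext. */
final class CacheContextSketch {
    /** Null means that no encoding is configured for this cache. */
    private final Charset cacheEncoding;

    CacheContextSketch(Charset cacheEncoding) {
        this.cacheEncoding = cacheEncoding;
    }

    Charset cacheEncoding() {
        return cacheEncoding;
    }
}

/** Picks the cache-specific encoding when present, the global one otherwise. */
final class EncodingResolver {
    private final Charset globalEncoding;

    EncodingResolver(Charset globalEncoding) {
        this.globalEncoding = globalEncoding;
    }

    Charset resolve(CacheContextSketch ctx) {
        Charset cs = ctx.cacheEncoding();

        return cs != null ? cs : globalEncoding;
    }

    /** Stand-in for the Object -> binary step, limited to one string value. */
    byte[] toBinary(CacheContextSketch ctx, String value) {
        return value.getBytes(resolve(ctx));
    }
}
```

Keeping the fallback in one place means that a cache created at runtime only has to carry its own optional encoding.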
Vladimir, what about binary mode (IgniteCache.withKeepBinary)?

Two caches may have different encoding settings:

    BinaryObject obj = cache1.get(key); // Got fields in utf8
    cache2.put(key, obj); // Fields are expected to be in Windows-1251

What do we do here? Re-build the binary object?

Also, what about BinaryRawWriter - do we need encoding support there?

Pavel

On Tue, Aug 1, 2017 at 12:23 PM, Vladimir Ozerov <[hidden email]> wrote:

> Managing encoding on per-cache level is not that complex.
>
> Essentially, when any cache message is prepared on the initiating node, we
> perform an Object -> BinaryObject transition. These places have a reference to
> the cache context ([1], [2]). This is where we should define the proper string
> encoding - either take the global encoding, or the cache-specific encoding.
>
> As far as per-column encoding, let's put this fine-grained case aside for a
> while. This is not as widely used as the global or per-cache/per-table
> scenario.
>
> [1] org.apache.ignite.internal.processors.cache.GridCacheContext#toCacheKeyObject(java.lang.Object)
> [2] org.apache.ignite.internal.processors.cache.GridCacheContext#toCacheObject
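One possible answer to the "re-build the binary object?" question is to re-encode string payloads only when the source and target encodings differ. The sketch below shows that step for a single string payload using JDK charsets only; StringTranscoder is a hypothetical helper, and how this would fit into BinaryObject handling is left open, as in the thread.

```java
import java.nio.charset.Charset;

/** Hypothetical helper for re-encoding one string payload. */
final class StringTranscoder {
    /**
     * Returns the same array when both encodings match, otherwise decodes with
     * the source charset and re-encodes with the target one. Values that the
     * target charset cannot represent are rejected instead of being silently
     * replaced, which is what String.getBytes would do on its own.
     */
    static byte[] transcode(byte[] payload, Charset from, Charset to) {
        if (from.equals(to))
            return payload;

        String decoded = new String(payload, from);

        if (!to.newEncoder().canEncode(decoded))
            throw new IllegalArgumentException("Value is not representable in " + to.name());

        return decoded.getBytes(to);
    }
}
```

The fast path matters for the common case where both caches use the same encoding; the explicit check covers the case where a UTF-8 value simply has no Windows-1251 form.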