Hello!
It was recently discovered that there is an issue in H2 which leads to differing string encodings when strings arrive at the reducer from nodes with different system encodings. Even if we were able to fix this, I suspect that user code may also be bitten by a mismatch of this setting.

I propose to force UTF-8 as the system encoding at all times when we control how the JVM is launched. This includes ignite.sh, ignite.bat, Apache.Ignite.exe and C++'s ./ignite.

This will mainly affect Windows systems, as I expect that Linux will almost always use a UTF-8 locale and Mac OS X should always be UTF-8.

file.encoding is a somewhat misleading name, since it specifies the default string encoding, such as the one used for String.getBytes(). It is a common convention to set it to UTF-8; for example, IDEA will do that.

WDYT?

There's a pull request: https://github.com/apache/ignite/pull/5725
If somebody could contribute C++ and .Net tests I would also be grateful.

Regards,
--
Ilya Kasnacheev
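To make the point about String.getBytes() concrete, here is a minimal sketch (not taken from the pull request; the class and method names are made up for illustration). It shows that the no-argument getBytes() follows the JVM default charset, while passing an explicit charset gives the same bytes on every node. ISO-8859-1 is used only as a stand-in for a legacy single-byte encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of Ignite.
public class DefaultEncodingDemo {
    /** Encodes with an explicit charset, so the result does not depend on file.encoding. */
    public static byte[] encode(String s, Charset cs) {
        return s.getBytes(cs);
    }

    public static void main(String[] args) {
        // U+00E9 (e with acute accent), written as an escape so the source file encoding does not matter.
        String s = "\u00e9";

        // The default charset is derived from file.encoding / the OS locale on older JVMs.
        System.out.println("Default charset: " + Charset.defaultCharset());

        // No-argument getBytes() uses the default charset, so this line may differ between nodes.
        System.out.println("getBytes():      " + Arrays.toString(s.getBytes()));

        // Explicit charsets always give the same result.
        System.out.println("UTF-8:           " + Arrays.toString(encode(s, StandardCharsets.UTF_8)));
        System.out.println("ISO-8859-1:      " + Arrays.toString(encode(s, StandardCharsets.ISO_8859_1)));
    }
}
```

Running it once as is and once with -Dfile.encoding=UTF-8 should show whether the second line changes on a given machine.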
+1 from my side. I think it is reasonable
In reply to Ilya Kasnacheev <[hidden email]>, Wed, 26 Dec 2018 at 19:20.
Hello!
I've merged this to master. Nodes should now try to run in UTF-8 mode unless explicitly specified otherwise.

I also added a warning on start-up if another encoding is used, as it might lead to data corruption.

Regards,
--
Ilya Kasnacheev

In reply to Dmitriy Pavlov <[hidden email]>, Wed, 26 Dec 2018 at 19:25.
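For readers who want a similar check in their own code, a start-up warning of this kind could look roughly like the sketch below. This is an assumption about the general shape only; the class name, method name and message text are hypothetical and are not the code that was merged. The -Dfile.encoding=UTF-8 option mentioned in the message is the standard JVM system property.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not the actual Ignite implementation.
public class EncodingCheck {
    /** Returns a warning message if the given charset is not UTF-8, or null if it is. */
    public static String warningFor(Charset cs) {
        if (StandardCharsets.UTF_8.equals(cs))
            return null;

        // Illustrative wording only.
        return "Default character encoding is " + cs.name()
            + "; consider starting the JVM with -Dfile.encoding=UTF-8"
            + " so that all nodes encode strings the same way.";
    }

    public static void main(String[] args) {
        String warning = warningFor(Charset.defaultCharset());

        if (warning != null)
            System.err.println(warning);
    }
}
```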
Hello!
I have a concern about MetaStorage: it uses the default system encoding to serialize keys. This means that after this change Windows nodes won't be able to read previous MetaStorage files.

Is that OK? Is there a way to migrate old MetaStorages safely?

--
Sincerely yours,
Ivan Bessonov

In reply to Ilya Kasnacheev <[hidden email]>, Thu, 10 Jan 2019 at 16:57.
Hello!
I'm afraid not, but it is still possible to force the legacy encoding as a workaround.

Is there an expectation that MetaStorage contains I18N keys?

Regards,
--
Ilya Kasnacheev

In reply to Ivan Bessonov <[hidden email]>, Thu, 10 Jan 2019 at 17:11.
I believe that all current keys are just basic English strings, and it's too early to say that the current change breaks something, sorry.

We just have to ensure that old cp-1251 MetaStorages work fine, and maybe then force UTF-8 for MetaStorage in the source code to avoid such problems in the future. What do you think?

--
Sincerely yours,
Ivan Bessonov

In reply to Ilya Kasnacheev <[hidden email]>, Thu, 10 Jan 2019 at 17:13.
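The reasoning about basic English keys can be checked with a small sketch like the one below: ASCII-only strings serialize to the same bytes in cp-1251 and in UTF-8, while non-ASCII strings do not. This is only an illustration under assumptions; the class, the helper methods and the sample key names are hypothetical and say nothing about how MetaStorage actually serializes keys. If the runtime does not ship windows-1251, the sketch falls back to ISO-8859-1 as a stand-in legacy charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration, not MetaStorage code.
public class KeyEncodingCompat {
    /** Legacy charset: windows-1251 if the runtime provides it, otherwise ISO-8859-1 as a stand-in. */
    public static Charset legacyCharset() {
        return Charset.isSupported("windows-1251")
            ? Charset.forName("windows-1251")
            : StandardCharsets.ISO_8859_1;
    }

    /** Returns true if the key serializes to identical bytes under the legacy charset and UTF-8. */
    public static boolean sameBytes(String key) {
        return Arrays.equals(key.getBytes(legacyCharset()), key.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // ASCII-only key (made-up name): both encodings agree byte for byte.
        System.out.println(sameBytes("metastorage.sample.key"));

        // Cyrillic key, written with escapes: the encodings disagree.
        System.out.println(sameBytes("\u043a\u043b\u044e\u0447"));
    }
}
```

A compatibility test along these lines would still need to read a MetaStorage written under the old encoding, which this sketch does not attempt.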
Hello!
I'd force UTF-8 where possible. It should not be very hard to write a test (using multijvm) to ensure that an old MetaStorage still works.

Regards,
--
Ilya Kasnacheev

In reply to Ivan Bessonov <[hidden email]>, Thu, 10 Jan 2019 at 17:21.