Folks,
It looks like CacheEntry implementation (i.e., the entry that contains version) can't be properly serialized with the BinaryMarshaller. I created the test and the ticket: https://issues.apache.org/jira/browse/IGNITE-2203 Can someone take a look? -Val |
I'll take a look.
On Fri, Dec 18, 2015 at 4:37 AM, Valentin Kulichenko < [hidden email]> wrote: > Folks, > > It looks like CacheEntry implementation (i.e., the entry that contains > version) can't be properly serialized with the BinaryMarshaller. I created > the test and the ticket: https://issues.apache.org/jira/browse/IGNITE-2203 > > Can someone take a look? > > -Val > |
How am I supposed to handle this situation if the class comes from a 3d party I can't modify?
Thanks Andrey > Date: Fri, 18 Dec 2015 09:12:22 +0300 > Subject: Re: CacheEntry serialization failure > From: [hidden email] > To: [hidden email] > > I'll take a look. > > On Fri, Dec 18, 2015 at 4:37 AM, Valentin Kulichenko < > [hidden email]> wrote: > > > Folks, > > > > It looks like CacheEntry implementation (i.e., the entry that contains > > version) can't be properly serialized with the BinaryMarshaller. I created > > the test and the ticket: https://issues.apache.org/jira/browse/IGNITE-2203 > > > > Can someone take a look? > > > > -Val > > |
I fixed the problem, it was a bug actually.
By default classes which has some custom Java logic (e.g. Externalizable, or with writeObject/readObject methods) will be written using OptimizedMarshaller, so similar field names is not a problem. If you want to serialize such class in binary format and have duplicate field names, you should provide your own BinarySerializer, which will write these fields with different names. On Fri, Dec 18, 2015 at 8:07 PM, Andrey Kornev <[hidden email]> wrote: > How am I supposed to handle this situation if the class comes from a 3d > party I can't modify? > > Thanks > Andrey > > > Date: Fri, 18 Dec 2015 09:12:22 +0300 > > Subject: Re: CacheEntry serialization failure > > From: [hidden email] > > To: [hidden email] > > > > I'll take a look. > > > > On Fri, Dec 18, 2015 at 4:37 AM, Valentin Kulichenko < > > [hidden email]> wrote: > > > > > Folks, > > > > > > It looks like CacheEntry implementation (i.e., the entry that contains > > > version) can't be properly serialized with the BinaryMarshaller. I > created > > > the test and the ticket: > https://issues.apache.org/jira/browse/IGNITE-2203 > > > > > > Can someone take a look? > > > > > > -Val > > > > > |
Ah, I saw your problem with DirectedSpecifics. We need to think about how
to solve it. Here is the case: 1) Class is Serilzable and cannot be changed; 2) There are several duplicated field names; => BinaryMarshaller cannot handle it. Any thoughts? On Fri, Dec 18, 2015 at 8:34 PM, Vladimir Ozerov <[hidden email]> wrote: > I fixed the problem, it was a bug actually. > > By default classes which has some custom Java logic (e.g. Externalizable, > or with writeObject/readObject methods) will be written using > OptimizedMarshaller, so similar field names is not a problem. > If you want to serialize such class in binary format and have duplicate > field names, you should provide your own BinarySerializer, which will write > these fields with different names. > > On Fri, Dec 18, 2015 at 8:07 PM, Andrey Kornev <[hidden email]> > wrote: > >> How am I supposed to handle this situation if the class comes from a 3d >> party I can't modify? >> >> Thanks >> Andrey >> >> > Date: Fri, 18 Dec 2015 09:12:22 +0300 >> > Subject: Re: CacheEntry serialization failure >> > From: [hidden email] >> > To: [hidden email] >> > >> > I'll take a look. >> > >> > On Fri, Dec 18, 2015 at 4:37 AM, Valentin Kulichenko < >> > [hidden email]> wrote: >> > >> > > Folks, >> > > >> > > It looks like CacheEntry implementation (i.e., the entry that contains >> > > version) can't be properly serialized with the BinaryMarshaller. I >> created >> > > the test and the ticket: >> https://issues.apache.org/jira/browse/IGNITE-2203 >> > > >> > > Can someone take a look? >> > > >> > > -Val >> > > >> >> > > |
The most straightforward solution which comes to my mind - *do not ever use
BinaryMarshaller by default*. Always fallback to OptimizedMarshaller unless user explicitly asked us to use binary format (e.g. through package wildcards). BTW, we already do this for Externalizable and readObject/writeObject. On Fri, Dec 18, 2015 at 8:41 PM, Vladimir Ozerov <[hidden email]> wrote: > Ah, I saw your problem with DirectedSpecifics. We need to think about how > to solve it. Here is the case: > 1) Class is Serilzable and cannot be changed; > 2) There are several duplicated field names; > => BinaryMarshaller cannot handle it. > > Any thoughts? > > On Fri, Dec 18, 2015 at 8:34 PM, Vladimir Ozerov <[hidden email]> > wrote: > >> I fixed the problem, it was a bug actually. >> >> By default classes which has some custom Java logic (e.g. Externalizable, >> or with writeObject/readObject methods) will be written using >> OptimizedMarshaller, so similar field names is not a problem. >> If you want to serialize such class in binary format and have duplicate >> field names, you should provide your own BinarySerializer, which will write >> these fields with different names. >> >> On Fri, Dec 18, 2015 at 8:07 PM, Andrey Kornev <[hidden email]> >> wrote: >> >>> How am I supposed to handle this situation if the class comes from a 3d >>> party I can't modify? >>> >>> Thanks >>> Andrey >>> >>> > Date: Fri, 18 Dec 2015 09:12:22 +0300 >>> > Subject: Re: CacheEntry serialization failure >>> > From: [hidden email] >>> > To: [hidden email] >>> > >>> > I'll take a look. >>> > >>> > On Fri, Dec 18, 2015 at 4:37 AM, Valentin Kulichenko < >>> > [hidden email]> wrote: >>> > >>> > > Folks, >>> > > >>> > > It looks like CacheEntry implementation (i.e., the entry that >>> contains >>> > > version) can't be properly serialized with the BinaryMarshaller. I >>> created >>> > > the test and the ticket: >>> https://issues.apache.org/jira/browse/IGNITE-2203 >>> > > >>> > > Can someone take a look? >>> > > >>> > > -Val >>> > > >>> >>> >> >> > |
In this particular case, the class that fails is a non-static inner class that extends another non-static inner class, so they both end up having the compiler-generated "this$0" field.
Regards Andrey > Date: Fri, 18 Dec 2015 20:44:12 +0300 > Subject: Re: CacheEntry serialization failure > From: [hidden email] > To: [hidden email] > > The most straightforward solution which comes to my mind - *do not ever use > BinaryMarshaller by default*. Always fallback to OptimizedMarshaller unless > user explicitly asked us to use binary format (e.g. through package > wildcards). > > BTW, we already do this for Externalizable and readObject/writeObject. > > On Fri, Dec 18, 2015 at 8:41 PM, Vladimir Ozerov <[hidden email]> > wrote: > > > Ah, I saw your problem with DirectedSpecifics. We need to think about how > > to solve it. Here is the case: > > 1) Class is Serilzable and cannot be changed; > > 2) There are several duplicated field names; > > => BinaryMarshaller cannot handle it. > > > > Any thoughts? > > > > On Fri, Dec 18, 2015 at 8:34 PM, Vladimir Ozerov <[hidden email]> > > wrote: > > > >> I fixed the problem, it was a bug actually. > >> > >> By default classes which has some custom Java logic (e.g. Externalizable, > >> or with writeObject/readObject methods) will be written using > >> OptimizedMarshaller, so similar field names is not a problem. > >> If you want to serialize such class in binary format and have duplicate > >> field names, you should provide your own BinarySerializer, which will write > >> these fields with different names. > >> > >> On Fri, Dec 18, 2015 at 8:07 PM, Andrey Kornev <[hidden email]> > >> wrote: > >> > >>> How am I supposed to handle this situation if the class comes from a 3d > >>> party I can't modify? > >>> > >>> Thanks > >>> Andrey > >>> > >>> > Date: Fri, 18 Dec 2015 09:12:22 +0300 > >>> > Subject: Re: CacheEntry serialization failure > >>> > From: [hidden email] > >>> > To: [hidden email] > >>> > > >>> > I'll take a look. > >>> > > >>> > On Fri, Dec 18, 2015 at 4:37 AM, Valentin Kulichenko < > >>> > [hidden email]> wrote: > >>> > > >>> > > Folks, > >>> > > > >>> > > It looks like CacheEntry implementation (i.e., the entry that > >>> contains > >>> > > version) can't be properly serialized with the BinaryMarshaller. I > >>> created > >>> > > the test and the ticket: > >>> https://issues.apache.org/jira/browse/IGNITE-2203 > >>> > > > >>> > > Can someone take a look? > >>> > > > >>> > > -Val > >>> > > > >>> > >>> > >> > >> > > |
Folks,
It seems to me that issue here is not with 3rd-party libraries. We just don't properly support class hierarchy in binary format. Any class that extends another class and has the field with the same name as parent has will fail unless user provides custom serialization logic that will handle it. What if we prepend the field name with the simple class name in this case? Say, we have two classes: class A { private int id; } class B extends A { private int id; } In this case we will get two fields: "A.id" and "B.id". The only issue is that if there are no name conflict, we should be able to resolve by both names - with or without prepended type name. I.e., if A is serialized, you can get the field value by "id" or "A.id". This is similar to how it works if you join two SQL tables with the same column names. Any thoughts on whether it's doable or not? -Val On Fri, Dec 18, 2015 at 10:05 AM, Andrey Kornev <[hidden email]> wrote: > In this particular case, the class that fails is a non-static inner class > that extends another non-static inner class, so they both end up having the > compiler-generated "this$0" field. > > Regards > Andrey > > > Date: Fri, 18 Dec 2015 20:44:12 +0300 > > Subject: Re: CacheEntry serialization failure > > From: [hidden email] > > To: [hidden email] > > > > The most straightforward solution which comes to my mind - *do not ever > use > > BinaryMarshaller by default*. Always fallback to OptimizedMarshaller > unless > > user explicitly asked us to use binary format (e.g. through package > > wildcards). > > > > BTW, we already do this for Externalizable and readObject/writeObject. > > > > On Fri, Dec 18, 2015 at 8:41 PM, Vladimir Ozerov <[hidden email]> > > wrote: > > > > > Ah, I saw your problem with DirectedSpecifics. We need to think about > how > > > to solve it. Here is the case: > > > 1) Class is Serilzable and cannot be changed; > > > 2) There are several duplicated field names; > > > => BinaryMarshaller cannot handle it. > > > > > > Any thoughts? > > > > > > On Fri, Dec 18, 2015 at 8:34 PM, Vladimir Ozerov <[hidden email] > > > > > wrote: > > > > > >> I fixed the problem, it was a bug actually. > > >> > > >> By default classes which has some custom Java logic (e.g. > Externalizable, > > >> or with writeObject/readObject methods) will be written using > > >> OptimizedMarshaller, so similar field names is not a problem. > > >> If you want to serialize such class in binary format and have > duplicate > > >> field names, you should provide your own BinarySerializer, which will > write > > >> these fields with different names. > > >> > > >> On Fri, Dec 18, 2015 at 8:07 PM, Andrey Kornev < > [hidden email]> > > >> wrote: > > >> > > >>> How am I supposed to handle this situation if the class comes from a > 3d > > >>> party I can't modify? > > >>> > > >>> Thanks > > >>> Andrey > > >>> > > >>> > Date: Fri, 18 Dec 2015 09:12:22 +0300 > > >>> > Subject: Re: CacheEntry serialization failure > > >>> > From: [hidden email] > > >>> > To: [hidden email] > > >>> > > > >>> > I'll take a look. > > >>> > > > >>> > On Fri, Dec 18, 2015 at 4:37 AM, Valentin Kulichenko < > > >>> > [hidden email]> wrote: > > >>> > > > >>> > > Folks, > > >>> > > > > >>> > > It looks like CacheEntry implementation (i.e., the entry that > > >>> contains > > >>> > > version) can't be properly serialized with the BinaryMarshaller. > I > > >>> created > > >>> > > the test and the ticket: > > >>> https://issues.apache.org/jira/browse/IGNITE-2203 > > >>> > > > > >>> > > Can someone take a look? > > >>> > > > > >>> > > -Val > > >>> > > > > >>> > > >>> > > >> > > >> > > > > > |
May be we can use normal field names by default and add some prefix/suffix
if conflict is found? E.g.: class A { int a; // Write as "a"; } class B extends A { int a; // Write as "B_a"; } On Fri, Dec 18, 2015 at 10:34 PM, Valentin Kulichenko < [hidden email]> wrote: > Folks, > > It seems to me that issue here is not with 3rd-party libraries. We just > don't properly support class hierarchy in binary format. Any class that > extends another class and has the field with the same name as parent has > will fail unless user provides custom serialization logic that will handle > it. > > What if we prepend the field name with the simple class name in this case? > Say, we have two classes: > > class A { > private int id; > } > > class B extends A { > private int id; > } > > In this case we will get two fields: "A.id" and "B.id". The only issue is > that if there are no name conflict, we should be able to resolve by both > names - with or without prepended type name. I.e., if A is serialized, you > can get the field value by "id" or "A.id". This is similar to how it works > if you join two SQL tables with the same column names. > > Any thoughts on whether it's doable or not? > > -Val > > On Fri, Dec 18, 2015 at 10:05 AM, Andrey Kornev <[hidden email]> > wrote: > > > In this particular case, the class that fails is a non-static inner class > > that extends another non-static inner class, so they both end up having > the > > compiler-generated "this$0" field. > > > > Regards > > Andrey > > > > > Date: Fri, 18 Dec 2015 20:44:12 +0300 > > > Subject: Re: CacheEntry serialization failure > > > From: [hidden email] > > > To: [hidden email] > > > > > > The most straightforward solution which comes to my mind - *do not ever > > use > > > BinaryMarshaller by default*. Always fallback to OptimizedMarshaller > > unless > > > user explicitly asked us to use binary format (e.g. through package > > > wildcards). > > > > > > BTW, we already do this for Externalizable and readObject/writeObject. > > > > > > On Fri, Dec 18, 2015 at 8:41 PM, Vladimir Ozerov <[hidden email] > > > > > wrote: > > > > > > > Ah, I saw your problem with DirectedSpecifics. We need to think about > > how > > > > to solve it. Here is the case: > > > > 1) Class is Serilzable and cannot be changed; > > > > 2) There are several duplicated field names; > > > > => BinaryMarshaller cannot handle it. > > > > > > > > Any thoughts? > > > > > > > > On Fri, Dec 18, 2015 at 8:34 PM, Vladimir Ozerov < > [hidden email] > > > > > > > wrote: > > > > > > > >> I fixed the problem, it was a bug actually. > > > >> > > > >> By default classes which has some custom Java logic (e.g. > > Externalizable, > > > >> or with writeObject/readObject methods) will be written using > > > >> OptimizedMarshaller, so similar field names is not a problem. > > > >> If you want to serialize such class in binary format and have > > duplicate > > > >> field names, you should provide your own BinarySerializer, which > will > > write > > > >> these fields with different names. > > > >> > > > >> On Fri, Dec 18, 2015 at 8:07 PM, Andrey Kornev < > > [hidden email]> > > > >> wrote: > > > >> > > > >>> How am I supposed to handle this situation if the class comes from > a > > 3d > > > >>> party I can't modify? > > > >>> > > > >>> Thanks > > > >>> Andrey > > > >>> > > > >>> > Date: Fri, 18 Dec 2015 09:12:22 +0300 > > > >>> > Subject: Re: CacheEntry serialization failure > > > >>> > From: [hidden email] > > > >>> > To: [hidden email] > > > >>> > > > > >>> > I'll take a look. > > > >>> > > > > >>> > On Fri, Dec 18, 2015 at 4:37 AM, Valentin Kulichenko < > > > >>> > [hidden email]> wrote: > > > >>> > > > > >>> > > Folks, > > > >>> > > > > > >>> > > It looks like CacheEntry implementation (i.e., the entry that > > > >>> contains > > > >>> > > version) can't be properly serialized with the > BinaryMarshaller. > > I > > > >>> created > > > >>> > > the test and the ticket: > > > >>> https://issues.apache.org/jira/browse/IGNITE-2203 > > > >>> > > > > > >>> > > Can someone take a look? > > > >>> > > > > > >>> > > -Val > > > >>> > > > > > >>> > > > >>> > > > >> > > > >> > > > > > > > > > |
Vova,
Shouldn’t it be the other way around? Class B writes the field as “a”, and class A writes it with a prefix (possibly the hash code of the class name). Also, we should clearly document how the SQL queries are affected by this. AFAIK, we should be using field aliases here, no? D. On Sun, Dec 20, 2015 at 10:08 AM, Vladimir Ozerov <[hidden email]> wrote: > May be we can use normal field names by default and add some prefix/suffix > if conflict is found? E.g.: > > class A { > int a; // Write as "a"; > } > > class B extends A { > int a; // Write as "B_a"; > } > > On Fri, Dec 18, 2015 at 10:34 PM, Valentin Kulichenko < > [hidden email]> wrote: > > > Folks, > > > > It seems to me that issue here is not with 3rd-party libraries. We just > > don't properly support class hierarchy in binary format. Any class that > > extends another class and has the field with the same name as parent has > > will fail unless user provides custom serialization logic that will > handle > > it. > > > > What if we prepend the field name with the simple class name in this > case? > > Say, we have two classes: > > > > class A { > > private int id; > > } > > > > class B extends A { > > private int id; > > } > > > > In this case we will get two fields: "A.id" and "B.id". The only issue is > > that if there are no name conflict, we should be able to resolve by both > > names - with or without prepended type name. I.e., if A is serialized, > you > > can get the field value by "id" or "A.id". This is similar to how it > works > > if you join two SQL tables with the same column names. > > > > Any thoughts on whether it's doable or not? > > > > -Val > > > > On Fri, Dec 18, 2015 at 10:05 AM, Andrey Kornev < > [hidden email]> > > wrote: > > > > > In this particular case, the class that fails is a non-static inner > class > > > that extends another non-static inner class, so they both end up having > > the > > > compiler-generated "this$0" field. > > > > > > Regards > > > Andrey > > > > > > > Date: Fri, 18 Dec 2015 20:44:12 +0300 > > > > Subject: Re: CacheEntry serialization failure > > > > From: [hidden email] > > > > To: [hidden email] > > > > > > > > The most straightforward solution which comes to my mind - *do not > ever > > > use > > > > BinaryMarshaller by default*. Always fallback to OptimizedMarshaller > > > unless > > > > user explicitly asked us to use binary format (e.g. through package > > > > wildcards). > > > > > > > > BTW, we already do this for Externalizable and > readObject/writeObject. > > > > > > > > On Fri, Dec 18, 2015 at 8:41 PM, Vladimir Ozerov < > [hidden email] > > > > > > > wrote: > > > > > > > > > Ah, I saw your problem with DirectedSpecifics. We need to think > about > > > how > > > > > to solve it. Here is the case: > > > > > 1) Class is Serilzable and cannot be changed; > > > > > 2) There are several duplicated field names; > > > > > => BinaryMarshaller cannot handle it. > > > > > > > > > > Any thoughts? > > > > > > > > > > On Fri, Dec 18, 2015 at 8:34 PM, Vladimir Ozerov < > > [hidden email] > > > > > > > > > wrote: > > > > > > > > > >> I fixed the problem, it was a bug actually. > > > > >> > > > > >> By default classes which has some custom Java logic (e.g. > > > Externalizable, > > > > >> or with writeObject/readObject methods) will be written using > > > > >> OptimizedMarshaller, so similar field names is not a problem. > > > > >> If you want to serialize such class in binary format and have > > > duplicate > > > > >> field names, you should provide your own BinarySerializer, which > > will > > > write > > > > >> these fields with different names. > > > > >> > > > > >> On Fri, Dec 18, 2015 at 8:07 PM, Andrey Kornev < > > > [hidden email]> > > > > >> wrote: > > > > >> > > > > >>> How am I supposed to handle this situation if the class comes > from > > a > > > 3d > > > > >>> party I can't modify? > > > > >>> > > > > >>> Thanks > > > > >>> Andrey > > > > >>> > > > > >>> > Date: Fri, 18 Dec 2015 09:12:22 +0300 > > > > >>> > Subject: Re: CacheEntry serialization failure > > > > >>> > From: [hidden email] > > > > >>> > To: [hidden email] > > > > >>> > > > > > >>> > I'll take a look. > > > > >>> > > > > > >>> > On Fri, Dec 18, 2015 at 4:37 AM, Valentin Kulichenko < > > > > >>> > [hidden email]> wrote: > > > > >>> > > > > > >>> > > Folks, > > > > >>> > > > > > > >>> > > It looks like CacheEntry implementation (i.e., the entry that > > > > >>> contains > > > > >>> > > version) can't be properly serialized with the > > BinaryMarshaller. > > > I > > > > >>> created > > > > >>> > > the test and the ticket: > > > > >>> https://issues.apache.org/jira/browse/IGNITE-2203 > > > > >>> > > > > > > >>> > > Can someone take a look? > > > > >>> > > > > > > >>> > > -Val > > > > >>> > > > > > > >>> > > > > >>> > > > > >> > > > > >> > > > > > > > > > > > > > > |
Agree, prefixing parent class fields sound like a better option.
Regarding aliases - I need some time to understand internal mechanics. Will answer this a bit later. On Mon, Dec 21, 2015 at 11:18 AM, Dmitriy Setrakyan <[hidden email]> wrote: > Vova, > > Shouldn’t it be the other way around? Class B writes the field as “a”, and > class A writes it with a prefix (possibly the hash code of the class name). > > Also, we should clearly document how the SQL queries are affected by this. > AFAIK, we should be using field aliases here, no? > > D. > > On Sun, Dec 20, 2015 at 10:08 AM, Vladimir Ozerov <[hidden email]> > wrote: > > > May be we can use normal field names by default and add some > prefix/suffix > > if conflict is found? E.g.: > > > > class A { > > int a; // Write as "a"; > > } > > > > class B extends A { > > int a; // Write as "B_a"; > > } > > > > On Fri, Dec 18, 2015 at 10:34 PM, Valentin Kulichenko < > > [hidden email]> wrote: > > > > > Folks, > > > > > > It seems to me that issue here is not with 3rd-party libraries. We just > > > don't properly support class hierarchy in binary format. Any class that > > > extends another class and has the field with the same name as parent > has > > > will fail unless user provides custom serialization logic that will > > handle > > > it. > > > > > > What if we prepend the field name with the simple class name in this > > case? > > > Say, we have two classes: > > > > > > class A { > > > private int id; > > > } > > > > > > class B extends A { > > > private int id; > > > } > > > > > > In this case we will get two fields: "A.id" and "B.id". The only issue > is > > > that if there are no name conflict, we should be able to resolve by > both > > > names - with or without prepended type name. I.e., if A is serialized, > > you > > > can get the field value by "id" or "A.id". This is similar to how it > > works > > > if you join two SQL tables with the same column names. > > > > > > Any thoughts on whether it's doable or not? > > > > > > -Val > > > > > > On Fri, Dec 18, 2015 at 10:05 AM, Andrey Kornev < > > [hidden email]> > > > wrote: > > > > > > > In this particular case, the class that fails is a non-static inner > > class > > > > that extends another non-static inner class, so they both end up > having > > > the > > > > compiler-generated "this$0" field. > > > > > > > > Regards > > > > Andrey > > > > > > > > > Date: Fri, 18 Dec 2015 20:44:12 +0300 > > > > > Subject: Re: CacheEntry serialization failure > > > > > From: [hidden email] > > > > > To: [hidden email] > > > > > > > > > > The most straightforward solution which comes to my mind - *do not > > ever > > > > use > > > > > BinaryMarshaller by default*. Always fallback to > OptimizedMarshaller > > > > unless > > > > > user explicitly asked us to use binary format (e.g. through package > > > > > wildcards). > > > > > > > > > > BTW, we already do this for Externalizable and > > readObject/writeObject. > > > > > > > > > > On Fri, Dec 18, 2015 at 8:41 PM, Vladimir Ozerov < > > [hidden email] > > > > > > > > > wrote: > > > > > > > > > > > Ah, I saw your problem with DirectedSpecifics. We need to think > > about > > > > how > > > > > > to solve it. Here is the case: > > > > > > 1) Class is Serilzable and cannot be changed; > > > > > > 2) There are several duplicated field names; > > > > > > => BinaryMarshaller cannot handle it. > > > > > > > > > > > > Any thoughts? > > > > > > > > > > > > On Fri, Dec 18, 2015 at 8:34 PM, Vladimir Ozerov < > > > [hidden email] > > > > > > > > > > > wrote: > > > > > > > > > > > >> I fixed the problem, it was a bug actually. > > > > > >> > > > > > >> By default classes which has some custom Java logic (e.g. > > > > Externalizable, > > > > > >> or with writeObject/readObject methods) will be written using > > > > > >> OptimizedMarshaller, so similar field names is not a problem. > > > > > >> If you want to serialize such class in binary format and have > > > > duplicate > > > > > >> field names, you should provide your own BinarySerializer, which > > > will > > > > write > > > > > >> these fields with different names. > > > > > >> > > > > > >> On Fri, Dec 18, 2015 at 8:07 PM, Andrey Kornev < > > > > [hidden email]> > > > > > >> wrote: > > > > > >> > > > > > >>> How am I supposed to handle this situation if the class comes > > from > > > a > > > > 3d > > > > > >>> party I can't modify? > > > > > >>> > > > > > >>> Thanks > > > > > >>> Andrey > > > > > >>> > > > > > >>> > Date: Fri, 18 Dec 2015 09:12:22 +0300 > > > > > >>> > Subject: Re: CacheEntry serialization failure > > > > > >>> > From: [hidden email] > > > > > >>> > To: [hidden email] > > > > > >>> > > > > > > >>> > I'll take a look. > > > > > >>> > > > > > > >>> > On Fri, Dec 18, 2015 at 4:37 AM, Valentin Kulichenko < > > > > > >>> > [hidden email]> wrote: > > > > > >>> > > > > > > >>> > > Folks, > > > > > >>> > > > > > > > >>> > > It looks like CacheEntry implementation (i.e., the entry > that > > > > > >>> contains > > > > > >>> > > version) can't be properly serialized with the > > > BinaryMarshaller. > > > > I > > > > > >>> created > > > > > >>> > > the test and the ticket: > > > > > >>> https://issues.apache.org/jira/browse/IGNITE-2203 > > > > > >>> > > > > > > > >>> > > Can someone take a look? > > > > > >>> > > > > > > > >>> > > -Val > > > > > >>> > > > > > > > >>> > > > > > >>> > > > > > >> > > > > > >> > > > > > > > > > > > > > > > > > > > > |
Folks,
I thought about the solution a bit more and came to the following design. *Scenario:* class A { int x; } class B : extends A { int y; } *Solution:* 1) Field A.x is written as *"A.x"*, field B.x is written as *"B.x"*. I.e. *both conflicting fields are prefixed* with simple name of the owning class. 2) API is unchanged. User manipulates these fields on all public methods in exactly the same way: "A.x" and "B.x". *Rationale:* 1) We cannot prefix only some of conflicting fields. E.g. if decide to prefix only A.x, then it is not clear how to handle this case: class B extends A implements Binarylizable { void write(BinaryWriter writer) { writer.writeInt("B.x", x); // User intentionally written field as "B.x". } } BinaryObject obj = ...; Object val = obj.field("B.x"); // Should we lookup for "B.x" as user asked us, or just for "x"? Prefixing all conflicting fields with class name resolves the problem. 2) If we add methods to manipulate fields not only by name, but by (typeName + fieldName) as well, then we will have to add *9 new methods* to API: BinaryType.fieldTypeName(String typeName, String fieldName); BinaryType.field(String typeName, String fieldName); BinaryObject.field(String typeName, String fieldName); BinaryObject.hasField(String typeName, String fieldName); BinaryObjectBuilder.getField(String typeName, String fieldName); BinaryObjectBuilder.setField(String typeName, String fieldName, ...); // 3 overloads BinaryObjectBuilder.removeField(String typeName, String fieldName); This is definitely an overkill for such a rare scenario. Thoughts? On Mon, Dec 21, 2015 at 11:22 AM, Vladimir Ozerov <[hidden email]> wrote: > Agree, prefixing parent class fields sound like a better option. > Regarding aliases - I need some time to understand internal mechanics. > Will answer this a bit later. > > On Mon, Dec 21, 2015 at 11:18 AM, Dmitriy Setrakyan <[hidden email] > > wrote: > >> Vova, >> >> Shouldn’t it be the other way around? Class B writes the field as “a”, and >> class A writes it with a prefix (possibly the hash code of the class >> name). >> >> Also, we should clearly document how the SQL queries are affected by this. >> AFAIK, we should be using field aliases here, no? >> >> D. >> >> On Sun, Dec 20, 2015 at 10:08 AM, Vladimir Ozerov <[hidden email]> >> wrote: >> >> > May be we can use normal field names by default and add some >> prefix/suffix >> > if conflict is found? E.g.: >> > >> > class A { >> > int a; // Write as "a"; >> > } >> > >> > class B extends A { >> > int a; // Write as "B_a"; >> > } >> > >> > On Fri, Dec 18, 2015 at 10:34 PM, Valentin Kulichenko < >> > [hidden email]> wrote: >> > >> > > Folks, >> > > >> > > It seems to me that issue here is not with 3rd-party libraries. We >> just >> > > don't properly support class hierarchy in binary format. Any class >> that >> > > extends another class and has the field with the same name as parent >> has >> > > will fail unless user provides custom serialization logic that will >> > handle >> > > it. >> > > >> > > What if we prepend the field name with the simple class name in this >> > case? >> > > Say, we have two classes: >> > > >> > > class A { >> > > private int id; >> > > } >> > > >> > > class B extends A { >> > > private int id; >> > > } >> > > >> > > In this case we will get two fields: "A.id" and "B.id". The only >> issue is >> > > that if there are no name conflict, we should be able to resolve by >> both >> > > names - with or without prepended type name. I.e., if A is serialized, >> > you >> > > can get the field value by "id" or "A.id". This is similar to how it >> > works >> > > if you join two SQL tables with the same column names. >> > > >> > > Any thoughts on whether it's doable or not? >> > > >> > > -Val >> > > >> > > On Fri, Dec 18, 2015 at 10:05 AM, Andrey Kornev < >> > [hidden email]> >> > > wrote: >> > > >> > > > In this particular case, the class that fails is a non-static inner >> > class >> > > > that extends another non-static inner class, so they both end up >> having >> > > the >> > > > compiler-generated "this$0" field. >> > > > >> > > > Regards >> > > > Andrey >> > > > >> > > > > Date: Fri, 18 Dec 2015 20:44:12 +0300 >> > > > > Subject: Re: CacheEntry serialization failure >> > > > > From: [hidden email] >> > > > > To: [hidden email] >> > > > > >> > > > > The most straightforward solution which comes to my mind - *do not >> > ever >> > > > use >> > > > > BinaryMarshaller by default*. Always fallback to >> OptimizedMarshaller >> > > > unless >> > > > > user explicitly asked us to use binary format (e.g. through >> package >> > > > > wildcards). >> > > > > >> > > > > BTW, we already do this for Externalizable and >> > readObject/writeObject. >> > > > > >> > > > > On Fri, Dec 18, 2015 at 8:41 PM, Vladimir Ozerov < >> > [hidden email] >> > > > >> > > > > wrote: >> > > > > >> > > > > > Ah, I saw your problem with DirectedSpecifics. We need to think >> > about >> > > > how >> > > > > > to solve it. Here is the case: >> > > > > > 1) Class is Serilzable and cannot be changed; >> > > > > > 2) There are several duplicated field names; >> > > > > > => BinaryMarshaller cannot handle it. >> > > > > > >> > > > > > Any thoughts? >> > > > > > >> > > > > > On Fri, Dec 18, 2015 at 8:34 PM, Vladimir Ozerov < >> > > [hidden email] >> > > > > >> > > > > > wrote: >> > > > > > >> > > > > >> I fixed the problem, it was a bug actually. >> > > > > >> >> > > > > >> By default classes which has some custom Java logic (e.g. >> > > > Externalizable, >> > > > > >> or with writeObject/readObject methods) will be written using >> > > > > >> OptimizedMarshaller, so similar field names is not a problem. >> > > > > >> If you want to serialize such class in binary format and have >> > > > duplicate >> > > > > >> field names, you should provide your own BinarySerializer, >> which >> > > will >> > > > write >> > > > > >> these fields with different names. >> > > > > >> >> > > > > >> On Fri, Dec 18, 2015 at 8:07 PM, Andrey Kornev < >> > > > [hidden email]> >> > > > > >> wrote: >> > > > > >> >> > > > > >>> How am I supposed to handle this situation if the class comes >> > from >> > > a >> > > > 3d >> > > > > >>> party I can't modify? >> > > > > >>> >> > > > > >>> Thanks >> > > > > >>> Andrey >> > > > > >>> >> > > > > >>> > Date: Fri, 18 Dec 2015 09:12:22 +0300 >> > > > > >>> > Subject: Re: CacheEntry serialization failure >> > > > > >>> > From: [hidden email] >> > > > > >>> > To: [hidden email] >> > > > > >>> > >> > > > > >>> > I'll take a look. >> > > > > >>> > >> > > > > >>> > On Fri, Dec 18, 2015 at 4:37 AM, Valentin Kulichenko < >> > > > > >>> > [hidden email]> wrote: >> > > > > >>> > >> > > > > >>> > > Folks, >> > > > > >>> > > >> > > > > >>> > > It looks like CacheEntry implementation (i.e., the entry >> that >> > > > > >>> contains >> > > > > >>> > > version) can't be properly serialized with the >> > > BinaryMarshaller. >> > > > I >> > > > > >>> created >> > > > > >>> > > the test and the ticket: >> > > > > >>> https://issues.apache.org/jira/browse/IGNITE-2203 >> > > > > >>> > > >> > > > > >>> > > Can someone take a look? >> > > > > >>> > > >> > > > > >>> > > -Val >> > > > > >>> > > >> > > > > >>> >> > > > > >>> >> > > > > >> >> > > > > >> >> > > > > > >> > > > >> > > > >> > > >> > >> > > |
Sounds good for me. I would go for this approach.
In addition if to consider your example below and the user decides to look up a field by its simple name then he/she will get nothing or exception (depending on the API), correct? As an example for this case the method will return null BinaryObject obj = ...; Object val = obj.field("x"); // null will be returned cause we don't know what particular 'x' we have to return -- Denis On 12/21/2015 11:48 AM, Vladimir Ozerov wrote: > Folks, > > I thought about the solution a bit more and came to the following design. > > *Scenario:* > class A { > int x; > } > class B : extends A { > int y; > } > > *Solution:* > 1) Field A.x is written as *"A.x"*, field B.x is written as *"B.x"*. I.e. *both > conflicting fields are prefixed* with simple name of the owning class. > 2) API is unchanged. User manipulates these fields on all public methods in > exactly the same way: "A.x" and "B.x". > > *Rationale:* > 1) We cannot prefix only some of conflicting fields. E.g. if decide to > prefix only A.x, then it is not clear how to handle this case: > > class B extends A implements Binarylizable { > void write(BinaryWriter writer) { > writer.writeInt("B.x", x); // User intentionally written field as > "B.x". > } > } > > BinaryObject obj = ...; > Object val = obj.field("B.x"); // Should we lookup for "B.x" as user asked > us, or just for "x"? > > Prefixing all conflicting fields with class name resolves the problem. > > 2) If we add methods to manipulate fields not only by name, but by > (typeName + fieldName) as well, then we will have to add *9 new methods* to > API: > BinaryType.fieldTypeName(String typeName, String fieldName); > BinaryType.field(String typeName, String fieldName); > BinaryObject.field(String typeName, String fieldName); > BinaryObject.hasField(String typeName, String fieldName); > BinaryObjectBuilder.getField(String typeName, String fieldName); > BinaryObjectBuilder.setField(String typeName, String fieldName, ...); // 3 > overloads > BinaryObjectBuilder.removeField(String typeName, String fieldName); > > This is definitely an overkill for such a rare scenario. > > Thoughts? > > On Mon, Dec 21, 2015 at 11:22 AM, Vladimir Ozerov <[hidden email]> > wrote: > >> Agree, prefixing parent class fields sound like a better option. >> Regarding aliases - I need some time to understand internal mechanics. >> Will answer this a bit later. >> >> On Mon, Dec 21, 2015 at 11:18 AM, Dmitriy Setrakyan <[hidden email] >>> wrote: >>> Vova, >>> >>> Shouldn’t it be the other way around? Class B writes the field as “a”, and >>> class A writes it with a prefix (possibly the hash code of the class >>> name). >>> >>> Also, we should clearly document how the SQL queries are affected by this. >>> AFAIK, we should be using field aliases here, no? >>> >>> D. >>> >>> On Sun, Dec 20, 2015 at 10:08 AM, Vladimir Ozerov <[hidden email]> >>> wrote: >>> >>>> May be we can use normal field names by default and add some >>> prefix/suffix >>>> if conflict is found? E.g.: >>>> >>>> class A { >>>> int a; // Write as "a"; >>>> } >>>> >>>> class B extends A { >>>> int a; // Write as "B_a"; >>>> } >>>> >>>> On Fri, Dec 18, 2015 at 10:34 PM, Valentin Kulichenko < >>>> [hidden email]> wrote: >>>> >>>>> Folks, >>>>> >>>>> It seems to me that issue here is not with 3rd-party libraries. We >>> just >>>>> don't properly support class hierarchy in binary format. Any class >>> that >>>>> extends another class and has the field with the same name as parent >>> has >>>>> will fail unless user provides custom serialization logic that will >>>> handle >>>>> it. >>>>> >>>>> What if we prepend the field name with the simple class name in this >>>> case? >>>>> Say, we have two classes: >>>>> >>>>> class A { >>>>> private int id; >>>>> } >>>>> >>>>> class B extends A { >>>>> private int id; >>>>> } >>>>> >>>>> In this case we will get two fields: "A.id" and "B.id". The only >>> issue is >>>>> that if there are no name conflict, we should be able to resolve by >>> both >>>>> names - with or without prepended type name. I.e., if A is serialized, >>>> you >>>>> can get the field value by "id" or "A.id". This is similar to how it >>>> works >>>>> if you join two SQL tables with the same column names. >>>>> >>>>> Any thoughts on whether it's doable or not? >>>>> >>>>> -Val >>>>> >>>>> On Fri, Dec 18, 2015 at 10:05 AM, Andrey Kornev < >>>> [hidden email]> >>>>> wrote: >>>>> >>>>>> In this particular case, the class that fails is a non-static inner >>>> class >>>>>> that extends another non-static inner class, so they both end up >>> having >>>>> the >>>>>> compiler-generated "this$0" field. >>>>>> >>>>>> Regards >>>>>> Andrey >>>>>> >>>>>>> Date: Fri, 18 Dec 2015 20:44:12 +0300 >>>>>>> Subject: Re: CacheEntry serialization failure >>>>>>> From: [hidden email] >>>>>>> To: [hidden email] >>>>>>> >>>>>>> The most straightforward solution which comes to my mind - *do not >>>> ever >>>>>> use >>>>>>> BinaryMarshaller by default*. Always fallback to >>> OptimizedMarshaller >>>>>> unless >>>>>>> user explicitly asked us to use binary format (e.g. through >>> package >>>>>>> wildcards). >>>>>>> >>>>>>> BTW, we already do this for Externalizable and >>>> readObject/writeObject. >>>>>>> On Fri, Dec 18, 2015 at 8:41 PM, Vladimir Ozerov < >>>> [hidden email] >>>>>>> wrote: >>>>>>> >>>>>>>> Ah, I saw your problem with DirectedSpecifics. We need to think >>>> about >>>>>> how >>>>>>>> to solve it. Here is the case: >>>>>>>> 1) Class is Serilzable and cannot be changed; >>>>>>>> 2) There are several duplicated field names; >>>>>>>> => BinaryMarshaller cannot handle it. >>>>>>>> >>>>>>>> Any thoughts? >>>>>>>> >>>>>>>> On Fri, Dec 18, 2015 at 8:34 PM, Vladimir Ozerov < >>>>> [hidden email] >>>>>>>> wrote: >>>>>>>> >>>>>>>>> I fixed the problem, it was a bug actually. >>>>>>>>> >>>>>>>>> By default classes which has some custom Java logic (e.g. >>>>>> Externalizable, >>>>>>>>> or with writeObject/readObject methods) will be written using >>>>>>>>> OptimizedMarshaller, so similar field names is not a problem. >>>>>>>>> If you want to serialize such class in binary format and have >>>>>> duplicate >>>>>>>>> field names, you should provide your own BinarySerializer, >>> which >>>>> will >>>>>> write >>>>>>>>> these fields with different names. >>>>>>>>> >>>>>>>>> On Fri, Dec 18, 2015 at 8:07 PM, Andrey Kornev < >>>>>> [hidden email]> >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> How am I supposed to handle this situation if the class comes >>>> from >>>>> a >>>>>> 3d >>>>>>>>>> party I can't modify? >>>>>>>>>> >>>>>>>>>> Thanks >>>>>>>>>> Andrey >>>>>>>>>> >>>>>>>>>>> Date: Fri, 18 Dec 2015 09:12:22 +0300 >>>>>>>>>>> Subject: Re: CacheEntry serialization failure >>>>>>>>>>> From: [hidden email] >>>>>>>>>>> To: [hidden email] >>>>>>>>>>> >>>>>>>>>>> I'll take a look. >>>>>>>>>>> >>>>>>>>>>> On Fri, Dec 18, 2015 at 4:37 AM, Valentin Kulichenko < >>>>>>>>>>> [hidden email]> wrote: >>>>>>>>>>> >>>>>>>>>>>> Folks, >>>>>>>>>>>> >>>>>>>>>>>> It looks like CacheEntry implementation (i.e., the entry >>> that >>>>>>>>>> contains >>>>>>>>>>>> version) can't be properly serialized with the >>>>> BinaryMarshaller. >>>>>> I >>>>>>>>>> created >>>>>>>>>>>> the test and the ticket: >>>>>>>>>> https://issues.apache.org/jira/browse/IGNITE-2203 >>>>>>>>>>>> Can someone take a look? >>>>>>>>>>>> >>>>>>>>>>>> -Val >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>> >> |
Denis,
Yes, as we do not know which field to pick, we return null. On Mon, Dec 21, 2015 at 12:00 PM, Denis Magda <[hidden email]> wrote: > Sounds good for me. I would go for this approach. > > In addition if to consider your example below and the user decides to look > up a field by its simple name then he/she will get nothing or exception > (depending on the API), correct? > As an example for this case the method will return null > > BinaryObject obj = ...; > Object val = obj.field("x"); // null will be returned cause we don't know > what particular 'x' we have to return > > > -- > Denis > > On 12/21/2015 11:48 AM, Vladimir Ozerov wrote: > >> Folks, >> >> I thought about the solution a bit more and came to the following design. >> >> *Scenario:* >> class A { >> int x; >> } >> class B : extends A { >> int y; >> } >> >> *Solution:* >> 1) Field A.x is written as *"A.x"*, field B.x is written as *"B.x"*. I.e. >> *both >> conflicting fields are prefixed* with simple name of the owning class. >> 2) API is unchanged. User manipulates these fields on all public methods >> in >> exactly the same way: "A.x" and "B.x". >> >> *Rationale:* >> 1) We cannot prefix only some of conflicting fields. E.g. if decide to >> prefix only A.x, then it is not clear how to handle this case: >> >> class B extends A implements Binarylizable { >> void write(BinaryWriter writer) { >> writer.writeInt("B.x", x); // User intentionally written field as >> "B.x". >> } >> } >> >> BinaryObject obj = ...; >> Object val = obj.field("B.x"); // Should we lookup for "B.x" as user asked >> us, or just for "x"? >> >> Prefixing all conflicting fields with class name resolves the problem. >> >> 2) If we add methods to manipulate fields not only by name, but by >> (typeName + fieldName) as well, then we will have to add *9 new methods* >> to >> >> API: >> BinaryType.fieldTypeName(String typeName, String fieldName); >> BinaryType.field(String typeName, String fieldName); >> BinaryObject.field(String typeName, String fieldName); >> BinaryObject.hasField(String typeName, String fieldName); >> BinaryObjectBuilder.getField(String typeName, String fieldName); >> BinaryObjectBuilder.setField(String typeName, String fieldName, ...); // 3 >> overloads >> BinaryObjectBuilder.removeField(String typeName, String fieldName); >> >> This is definitely an overkill for such a rare scenario. >> >> Thoughts? >> >> On Mon, Dec 21, 2015 at 11:22 AM, Vladimir Ozerov <[hidden email]> >> wrote: >> >> Agree, prefixing parent class fields sound like a better option. >>> Regarding aliases - I need some time to understand internal mechanics. >>> Will answer this a bit later. >>> >>> On Mon, Dec 21, 2015 at 11:18 AM, Dmitriy Setrakyan < >>> [hidden email] >>> >>>> wrote: >>>> Vova, >>>> >>>> Shouldn’t it be the other way around? Class B writes the field as “a”, >>>> and >>>> class A writes it with a prefix (possibly the hash code of the class >>>> name). >>>> >>>> Also, we should clearly document how the SQL queries are affected by >>>> this. >>>> AFAIK, we should be using field aliases here, no? >>>> >>>> D. >>>> >>>> On Sun, Dec 20, 2015 at 10:08 AM, Vladimir Ozerov <[hidden email] >>>> > >>>> wrote: >>>> >>>> May be we can use normal field names by default and add some >>>>> >>>> prefix/suffix >>>> >>>>> if conflict is found? E.g.: >>>>> >>>>> class A { >>>>> int a; // Write as "a"; >>>>> } >>>>> >>>>> class B extends A { >>>>> int a; // Write as "B_a"; >>>>> } >>>>> >>>>> On Fri, Dec 18, 2015 at 10:34 PM, Valentin Kulichenko < >>>>> [hidden email]> wrote: >>>>> >>>>> Folks, >>>>>> >>>>>> It seems to me that issue here is not with 3rd-party libraries. We >>>>>> >>>>> just >>>> >>>>> don't properly support class hierarchy in binary format. Any class >>>>>> >>>>> that >>>> >>>>> extends another class and has the field with the same name as parent >>>>>> >>>>> has >>>> >>>>> will fail unless user provides custom serialization logic that will >>>>>> >>>>> handle >>>>> >>>>>> it. >>>>>> >>>>>> What if we prepend the field name with the simple class name in this >>>>>> >>>>> case? >>>>> >>>>>> Say, we have two classes: >>>>>> >>>>>> class A { >>>>>> private int id; >>>>>> } >>>>>> >>>>>> class B extends A { >>>>>> private int id; >>>>>> } >>>>>> >>>>>> In this case we will get two fields: "A.id" and "B.id". The only >>>>>> >>>>> issue is >>>> >>>>> that if there are no name conflict, we should be able to resolve by >>>>>> >>>>> both >>>> >>>>> names - with or without prepended type name. I.e., if A is serialized, >>>>>> >>>>> you >>>>> >>>>>> can get the field value by "id" or "A.id". This is similar to how it >>>>>> >>>>> works >>>>> >>>>>> if you join two SQL tables with the same column names. >>>>>> >>>>>> Any thoughts on whether it's doable or not? >>>>>> >>>>>> -Val >>>>>> >>>>>> On Fri, Dec 18, 2015 at 10:05 AM, Andrey Kornev < >>>>>> >>>>> [hidden email]> >>>>> >>>>>> wrote: >>>>>> >>>>>> In this particular case, the class that fails is a non-static inner >>>>>>> >>>>>> class >>>>> >>>>>> that extends another non-static inner class, so they both end up >>>>>>> >>>>>> having >>>> >>>>> the >>>>>> >>>>>>> compiler-generated "this$0" field. >>>>>>> >>>>>>> Regards >>>>>>> Andrey >>>>>>> >>>>>>> Date: Fri, 18 Dec 2015 20:44:12 +0300 >>>>>>>> Subject: Re: CacheEntry serialization failure >>>>>>>> From: [hidden email] >>>>>>>> To: [hidden email] >>>>>>>> >>>>>>>> The most straightforward solution which comes to my mind - *do not >>>>>>>> >>>>>>> ever >>>>> >>>>>> use >>>>>>> >>>>>>>> BinaryMarshaller by default*. Always fallback to >>>>>>>> >>>>>>> OptimizedMarshaller >>>> >>>>> unless >>>>>>> >>>>>>>> user explicitly asked us to use binary format (e.g. through >>>>>>>> >>>>>>> package >>>> >>>>> wildcards). >>>>>>>> >>>>>>>> BTW, we already do this for Externalizable and >>>>>>>> >>>>>>> readObject/writeObject. >>>>> >>>>>> On Fri, Dec 18, 2015 at 8:41 PM, Vladimir Ozerov < >>>>>>>> >>>>>>> [hidden email] >>>>> >>>>>> wrote: >>>>>>>> >>>>>>>> Ah, I saw your problem with DirectedSpecifics. We need to think >>>>>>>>> >>>>>>>> about >>>>> >>>>>> how >>>>>>> >>>>>>>> to solve it. Here is the case: >>>>>>>>> 1) Class is Serilzable and cannot be changed; >>>>>>>>> 2) There are several duplicated field names; >>>>>>>>> => BinaryMarshaller cannot handle it. >>>>>>>>> >>>>>>>>> Any thoughts? >>>>>>>>> >>>>>>>>> On Fri, Dec 18, 2015 at 8:34 PM, Vladimir Ozerov < >>>>>>>>> >>>>>>>> [hidden email] >>>>>> >>>>>>> wrote: >>>>>>>>> >>>>>>>>> I fixed the problem, it was a bug actually. >>>>>>>>>> >>>>>>>>>> By default classes which has some custom Java logic (e.g. >>>>>>>>>> >>>>>>>>> Externalizable, >>>>>>> >>>>>>>> or with writeObject/readObject methods) will be written using >>>>>>>>>> OptimizedMarshaller, so similar field names is not a problem. >>>>>>>>>> If you want to serialize such class in binary format and have >>>>>>>>>> >>>>>>>>> duplicate >>>>>>> >>>>>>>> field names, you should provide your own BinarySerializer, >>>>>>>>>> >>>>>>>>> which >>>> >>>>> will >>>>>> >>>>>>> write >>>>>>> >>>>>>>> these fields with different names. >>>>>>>>>> >>>>>>>>>> On Fri, Dec 18, 2015 at 8:07 PM, Andrey Kornev < >>>>>>>>>> >>>>>>>>> [hidden email]> >>>>>>> >>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> How am I supposed to handle this situation if the class comes >>>>>>>>>>> >>>>>>>>>> from >>>>> >>>>>> a >>>>>> >>>>>>> 3d >>>>>>> >>>>>>>> party I can't modify? >>>>>>>>>>> >>>>>>>>>>> Thanks >>>>>>>>>>> Andrey >>>>>>>>>>> >>>>>>>>>>> Date: Fri, 18 Dec 2015 09:12:22 +0300 >>>>>>>>>>>> Subject: Re: CacheEntry serialization failure >>>>>>>>>>>> From: [hidden email] >>>>>>>>>>>> To: [hidden email] >>>>>>>>>>>> >>>>>>>>>>>> I'll take a look. >>>>>>>>>>>> >>>>>>>>>>>> On Fri, Dec 18, 2015 at 4:37 AM, Valentin Kulichenko < >>>>>>>>>>>> [hidden email]> wrote: >>>>>>>>>>>> >>>>>>>>>>>> Folks, >>>>>>>>>>>>> >>>>>>>>>>>>> It looks like CacheEntry implementation (i.e., the entry >>>>>>>>>>>>> >>>>>>>>>>>> that >>>> >>>>> contains >>>>>>>>>>> >>>>>>>>>>>> version) can't be properly serialized with the >>>>>>>>>>>>> >>>>>>>>>>>> BinaryMarshaller. >>>>>> >>>>>>> I >>>>>>> >>>>>>>> created >>>>>>>>>>> >>>>>>>>>>>> the test and the ticket: >>>>>>>>>>>>> >>>>>>>>>>>> https://issues.apache.org/jira/browse/IGNITE-2203 >>>>>>>>>>> >>>>>>>>>>>> Can someone take a look? >>>>>>>>>>>>> >>>>>>>>>>>>> -Val >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>> >>> > |
Vova,
We cannot return null in case of a conflict, as user won’t be able to differentiate between a conflict and missing field. We should throw an exception. Also, I don’t like parsing field names looking for a dot for every field. It will introduce a performance overhead for the cases that do not have conflicts. Instead, we should add another API for this use case, something like field(typeName, fieldName). D. On Mon, Dec 21, 2015 at 1:01 AM, Vladimir Ozerov <[hidden email]> wrote: > Denis, > Yes, as we do not know which field to pick, we return null. > > On Mon, Dec 21, 2015 at 12:00 PM, Denis Magda <[hidden email]> wrote: > > > Sounds good for me. I would go for this approach. > > > > In addition if to consider your example below and the user decides to > look > > up a field by its simple name then he/she will get nothing or exception > > (depending on the API), correct? > > As an example for this case the method will return null > > > > BinaryObject obj = ...; > > Object val = obj.field("x"); // null will be returned cause we don't know > > what particular 'x' we have to return > > > > > > -- > > Denis > > > > On 12/21/2015 11:48 AM, Vladimir Ozerov wrote: > > > >> Folks, > >> > >> I thought about the solution a bit more and came to the following > design. > >> > >> *Scenario:* > >> class A { > >> int x; > >> } > >> class B : extends A { > >> int y; > >> } > >> > >> *Solution:* > >> 1) Field A.x is written as *"A.x"*, field B.x is written as *"B.x"*. > I.e. > >> *both > >> conflicting fields are prefixed* with simple name of the owning class. > >> 2) API is unchanged. User manipulates these fields on all public methods > >> in > >> exactly the same way: "A.x" and "B.x". > >> > >> *Rationale:* > >> 1) We cannot prefix only some of conflicting fields. E.g. if decide to > >> prefix only A.x, then it is not clear how to handle this case: > >> > >> class B extends A implements Binarylizable { > >> void write(BinaryWriter writer) { > >> writer.writeInt("B.x", x); // User intentionally written field > as > >> "B.x". > >> } > >> } > >> > >> BinaryObject obj = ...; > >> Object val = obj.field("B.x"); // Should we lookup for "B.x" as user > asked > >> us, or just for "x"? > >> > >> Prefixing all conflicting fields with class name resolves the problem. > >> > >> 2) If we add methods to manipulate fields not only by name, but by > >> (typeName + fieldName) as well, then we will have to add *9 new methods* > >> to > >> > >> API: > >> BinaryType.fieldTypeName(String typeName, String fieldName); > >> BinaryType.field(String typeName, String fieldName); > >> BinaryObject.field(String typeName, String fieldName); > >> BinaryObject.hasField(String typeName, String fieldName); > >> BinaryObjectBuilder.getField(String typeName, String fieldName); > >> BinaryObjectBuilder.setField(String typeName, String fieldName, ...); > // 3 > >> overloads > >> BinaryObjectBuilder.removeField(String typeName, String fieldName); > >> > >> This is definitely an overkill for such a rare scenario. > >> > >> Thoughts? > >> > >> On Mon, Dec 21, 2015 at 11:22 AM, Vladimir Ozerov <[hidden email] > > > >> wrote: > >> > >> Agree, prefixing parent class fields sound like a better option. > >>> Regarding aliases - I need some time to understand internal mechanics. > >>> Will answer this a bit later. > >>> > >>> On Mon, Dec 21, 2015 at 11:18 AM, Dmitriy Setrakyan < > >>> [hidden email] > >>> > >>>> wrote: > >>>> Vova, > >>>> > >>>> Shouldn’t it be the other way around? Class B writes the field as “a”, > >>>> and > >>>> class A writes it with a prefix (possibly the hash code of the class > >>>> name). > >>>> > >>>> Also, we should clearly document how the SQL queries are affected by > >>>> this. > >>>> AFAIK, we should be using field aliases here, no? > >>>> > >>>> D. > >>>> > >>>> On Sun, Dec 20, 2015 at 10:08 AM, Vladimir Ozerov < > [hidden email] > >>>> > > >>>> wrote: > >>>> > >>>> May be we can use normal field names by default and add some > >>>>> > >>>> prefix/suffix > >>>> > >>>>> if conflict is found? E.g.: > >>>>> > >>>>> class A { > >>>>> int a; // Write as "a"; > >>>>> } > >>>>> > >>>>> class B extends A { > >>>>> int a; // Write as "B_a"; > >>>>> } > >>>>> > >>>>> On Fri, Dec 18, 2015 at 10:34 PM, Valentin Kulichenko < > >>>>> [hidden email]> wrote: > >>>>> > >>>>> Folks, > >>>>>> > >>>>>> It seems to me that issue here is not with 3rd-party libraries. We > >>>>>> > >>>>> just > >>>> > >>>>> don't properly support class hierarchy in binary format. Any class > >>>>>> > >>>>> that > >>>> > >>>>> extends another class and has the field with the same name as parent > >>>>>> > >>>>> has > >>>> > >>>>> will fail unless user provides custom serialization logic that will > >>>>>> > >>>>> handle > >>>>> > >>>>>> it. > >>>>>> > >>>>>> What if we prepend the field name with the simple class name in this > >>>>>> > >>>>> case? > >>>>> > >>>>>> Say, we have two classes: > >>>>>> > >>>>>> class A { > >>>>>> private int id; > >>>>>> } > >>>>>> > >>>>>> class B extends A { > >>>>>> private int id; > >>>>>> } > >>>>>> > >>>>>> In this case we will get two fields: "A.id" and "B.id". The only > >>>>>> > >>>>> issue is > >>>> > >>>>> that if there are no name conflict, we should be able to resolve by > >>>>>> > >>>>> both > >>>> > >>>>> names - with or without prepended type name. I.e., if A is > serialized, > >>>>>> > >>>>> you > >>>>> > >>>>>> can get the field value by "id" or "A.id". This is similar to how it > >>>>>> > >>>>> works > >>>>> > >>>>>> if you join two SQL tables with the same column names. > >>>>>> > >>>>>> Any thoughts on whether it's doable or not? > >>>>>> > >>>>>> -Val > >>>>>> > >>>>>> On Fri, Dec 18, 2015 at 10:05 AM, Andrey Kornev < > >>>>>> > >>>>> [hidden email]> > >>>>> > >>>>>> wrote: > >>>>>> > >>>>>> In this particular case, the class that fails is a non-static inner > >>>>>>> > >>>>>> class > >>>>> > >>>>>> that extends another non-static inner class, so they both end up > >>>>>>> > >>>>>> having > >>>> > >>>>> the > >>>>>> > >>>>>>> compiler-generated "this$0" field. > >>>>>>> > >>>>>>> Regards > >>>>>>> Andrey > >>>>>>> > >>>>>>> Date: Fri, 18 Dec 2015 20:44:12 +0300 > >>>>>>>> Subject: Re: CacheEntry serialization failure > >>>>>>>> From: [hidden email] > >>>>>>>> To: [hidden email] > >>>>>>>> > >>>>>>>> The most straightforward solution which comes to my mind - *do not > >>>>>>>> > >>>>>>> ever > >>>>> > >>>>>> use > >>>>>>> > >>>>>>>> BinaryMarshaller by default*. Always fallback to > >>>>>>>> > >>>>>>> OptimizedMarshaller > >>>> > >>>>> unless > >>>>>>> > >>>>>>>> user explicitly asked us to use binary format (e.g. through > >>>>>>>> > >>>>>>> package > >>>> > >>>>> wildcards). > >>>>>>>> > >>>>>>>> BTW, we already do this for Externalizable and > >>>>>>>> > >>>>>>> readObject/writeObject. > >>>>> > >>>>>> On Fri, Dec 18, 2015 at 8:41 PM, Vladimir Ozerov < > >>>>>>>> > >>>>>>> [hidden email] > >>>>> > >>>>>> wrote: > >>>>>>>> > >>>>>>>> Ah, I saw your problem with DirectedSpecifics. We need to think > >>>>>>>>> > >>>>>>>> about > >>>>> > >>>>>> how > >>>>>>> > >>>>>>>> to solve it. Here is the case: > >>>>>>>>> 1) Class is Serilzable and cannot be changed; > >>>>>>>>> 2) There are several duplicated field names; > >>>>>>>>> => BinaryMarshaller cannot handle it. > >>>>>>>>> > >>>>>>>>> Any thoughts? > >>>>>>>>> > >>>>>>>>> On Fri, Dec 18, 2015 at 8:34 PM, Vladimir Ozerov < > >>>>>>>>> > >>>>>>>> [hidden email] > >>>>>> > >>>>>>> wrote: > >>>>>>>>> > >>>>>>>>> I fixed the problem, it was a bug actually. > >>>>>>>>>> > >>>>>>>>>> By default classes which has some custom Java logic (e.g. > >>>>>>>>>> > >>>>>>>>> Externalizable, > >>>>>>> > >>>>>>>> or with writeObject/readObject methods) will be written using > >>>>>>>>>> OptimizedMarshaller, so similar field names is not a problem. > >>>>>>>>>> If you want to serialize such class in binary format and have > >>>>>>>>>> > >>>>>>>>> duplicate > >>>>>>> > >>>>>>>> field names, you should provide your own BinarySerializer, > >>>>>>>>>> > >>>>>>>>> which > >>>> > >>>>> will > >>>>>> > >>>>>>> write > >>>>>>> > >>>>>>>> these fields with different names. > >>>>>>>>>> > >>>>>>>>>> On Fri, Dec 18, 2015 at 8:07 PM, Andrey Kornev < > >>>>>>>>>> > >>>>>>>>> [hidden email]> > >>>>>>> > >>>>>>>> wrote: > >>>>>>>>>> > >>>>>>>>>> How am I supposed to handle this situation if the class comes > >>>>>>>>>>> > >>>>>>>>>> from > >>>>> > >>>>>> a > >>>>>> > >>>>>>> 3d > >>>>>>> > >>>>>>>> party I can't modify? > >>>>>>>>>>> > >>>>>>>>>>> Thanks > >>>>>>>>>>> Andrey > >>>>>>>>>>> > >>>>>>>>>>> Date: Fri, 18 Dec 2015 09:12:22 +0300 > >>>>>>>>>>>> Subject: Re: CacheEntry serialization failure > >>>>>>>>>>>> From: [hidden email] > >>>>>>>>>>>> To: [hidden email] > >>>>>>>>>>>> > >>>>>>>>>>>> I'll take a look. > >>>>>>>>>>>> > >>>>>>>>>>>> On Fri, Dec 18, 2015 at 4:37 AM, Valentin Kulichenko < > >>>>>>>>>>>> [hidden email]> wrote: > >>>>>>>>>>>> > >>>>>>>>>>>> Folks, > >>>>>>>>>>>>> > >>>>>>>>>>>>> It looks like CacheEntry implementation (i.e., the entry > >>>>>>>>>>>>> > >>>>>>>>>>>> that > >>>> > >>>>> contains > >>>>>>>>>>> > >>>>>>>>>>>> version) can't be properly serialized with the > >>>>>>>>>>>>> > >>>>>>>>>>>> BinaryMarshaller. > >>>>>> > >>>>>>> I > >>>>>>> > >>>>>>>> created > >>>>>>>>>>> > >>>>>>>>>>>> the test and the ticket: > >>>>>>>>>>>>> > >>>>>>>>>>>> https://issues.apache.org/jira/browse/IGNITE-2203 > >>>>>>>>>>> > >>>>>>>>>>>> Can someone take a look? > >>>>>>>>>>>>> > >>>>>>>>>>>>> -Val > >>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>> > >>>>>>> > >>> > > > |
Dima,
Here is how proposed design works: class A { int x = 1; } class B { int x = 2; } BinaryObject obj = ...; Object val = obj.field("A.x"); // Returns "1"; Object val = obj.field("B.x"); // Returns "2"; Object val = obj.field("x"); // Returns null; boolean exists = obj.hasField("A.x"); // Returns "true"; boolean exists = obj.hasField("B.x"); // Returns "true"; boolean exists = obj.hasField("x"); // Returns "false"; Looks clean and consistent for me. Remember that we are talking about very specific use case. It is very unlikely that user will operate on objects conflicting fields in binary form. Also, there will be no parsing at all. We use field name passed by user directly. Vladimir. On Mon, Dec 21, 2015 at 12:10 PM, Dmitriy Setrakyan <[hidden email]> wrote: > Vova, > > We cannot return null in case of a conflict, as user won’t be able to > differentiate between a conflict and missing field. We should throw an > exception. > > Also, I don’t like parsing field names looking for a dot for every field. > It will introduce a performance overhead for the cases that do not have > conflicts. Instead, we should add another API for this use case, something > like field(typeName, fieldName). > > D. > > On Mon, Dec 21, 2015 at 1:01 AM, Vladimir Ozerov <[hidden email]> > wrote: > > > Denis, > > Yes, as we do not know which field to pick, we return null. > > > > On Mon, Dec 21, 2015 at 12:00 PM, Denis Magda <[hidden email]> > wrote: > > > > > Sounds good for me. I would go for this approach. > > > > > > In addition if to consider your example below and the user decides to > > look > > > up a field by its simple name then he/she will get nothing or exception > > > (depending on the API), correct? > > > As an example for this case the method will return null > > > > > > BinaryObject obj = ...; > > > Object val = obj.field("x"); // null will be returned cause we don't > know > > > what particular 'x' we have to return > > > > > > > > > -- > > > Denis > > > > > > On 12/21/2015 11:48 AM, Vladimir Ozerov wrote: > > > > > >> Folks, > > >> > > >> I thought about the solution a bit more and came to the following > > design. > > >> > > >> *Scenario:* > > >> class A { > > >> int x; > > >> } > > >> class B : extends A { > > >> int y; > > >> } > > >> > > >> *Solution:* > > >> 1) Field A.x is written as *"A.x"*, field B.x is written as *"B.x"*. > > I.e. > > >> *both > > >> conflicting fields are prefixed* with simple name of the owning class. > > >> 2) API is unchanged. User manipulates these fields on all public > methods > > >> in > > >> exactly the same way: "A.x" and "B.x". > > >> > > >> *Rationale:* > > >> 1) We cannot prefix only some of conflicting fields. E.g. if decide to > > >> prefix only A.x, then it is not clear how to handle this case: > > >> > > >> class B extends A implements Binarylizable { > > >> void write(BinaryWriter writer) { > > >> writer.writeInt("B.x", x); // User intentionally written > field > > as > > >> "B.x". > > >> } > > >> } > > >> > > >> BinaryObject obj = ...; > > >> Object val = obj.field("B.x"); // Should we lookup for "B.x" as user > > asked > > >> us, or just for "x"? > > >> > > >> Prefixing all conflicting fields with class name resolves the problem. > > >> > > >> 2) If we add methods to manipulate fields not only by name, but by > > >> (typeName + fieldName) as well, then we will have to add *9 new > methods* > > >> to > > >> > > >> API: > > >> BinaryType.fieldTypeName(String typeName, String fieldName); > > >> BinaryType.field(String typeName, String fieldName); > > >> BinaryObject.field(String typeName, String fieldName); > > >> BinaryObject.hasField(String typeName, String fieldName); > > >> BinaryObjectBuilder.getField(String typeName, String fieldName); > > >> BinaryObjectBuilder.setField(String typeName, String fieldName, ...); > > // 3 > > >> overloads > > >> BinaryObjectBuilder.removeField(String typeName, String fieldName); > > >> > > >> This is definitely an overkill for such a rare scenario. > > >> > > >> Thoughts? > > >> > > >> On Mon, Dec 21, 2015 at 11:22 AM, Vladimir Ozerov < > [hidden email] > > > > > >> wrote: > > >> > > >> Agree, prefixing parent class fields sound like a better option. > > >>> Regarding aliases - I need some time to understand internal > mechanics. > > >>> Will answer this a bit later. > > >>> > > >>> On Mon, Dec 21, 2015 at 11:18 AM, Dmitriy Setrakyan < > > >>> [hidden email] > > >>> > > >>>> wrote: > > >>>> Vova, > > >>>> > > >>>> Shouldn’t it be the other way around? Class B writes the field as > “a”, > > >>>> and > > >>>> class A writes it with a prefix (possibly the hash code of the class > > >>>> name). > > >>>> > > >>>> Also, we should clearly document how the SQL queries are affected by > > >>>> this. > > >>>> AFAIK, we should be using field aliases here, no? > > >>>> > > >>>> D. > > >>>> > > >>>> On Sun, Dec 20, 2015 at 10:08 AM, Vladimir Ozerov < > > [hidden email] > > >>>> > > > >>>> wrote: > > >>>> > > >>>> May be we can use normal field names by default and add some > > >>>>> > > >>>> prefix/suffix > > >>>> > > >>>>> if conflict is found? E.g.: > > >>>>> > > >>>>> class A { > > >>>>> int a; // Write as "a"; > > >>>>> } > > >>>>> > > >>>>> class B extends A { > > >>>>> int a; // Write as "B_a"; > > >>>>> } > > >>>>> > > >>>>> On Fri, Dec 18, 2015 at 10:34 PM, Valentin Kulichenko < > > >>>>> [hidden email]> wrote: > > >>>>> > > >>>>> Folks, > > >>>>>> > > >>>>>> It seems to me that issue here is not with 3rd-party libraries. We > > >>>>>> > > >>>>> just > > >>>> > > >>>>> don't properly support class hierarchy in binary format. Any class > > >>>>>> > > >>>>> that > > >>>> > > >>>>> extends another class and has the field with the same name as > parent > > >>>>>> > > >>>>> has > > >>>> > > >>>>> will fail unless user provides custom serialization logic that will > > >>>>>> > > >>>>> handle > > >>>>> > > >>>>>> it. > > >>>>>> > > >>>>>> What if we prepend the field name with the simple class name in > this > > >>>>>> > > >>>>> case? > > >>>>> > > >>>>>> Say, we have two classes: > > >>>>>> > > >>>>>> class A { > > >>>>>> private int id; > > >>>>>> } > > >>>>>> > > >>>>>> class B extends A { > > >>>>>> private int id; > > >>>>>> } > > >>>>>> > > >>>>>> In this case we will get two fields: "A.id" and "B.id". The only > > >>>>>> > > >>>>> issue is > > >>>> > > >>>>> that if there are no name conflict, we should be able to resolve by > > >>>>>> > > >>>>> both > > >>>> > > >>>>> names - with or without prepended type name. I.e., if A is > > serialized, > > >>>>>> > > >>>>> you > > >>>>> > > >>>>>> can get the field value by "id" or "A.id". This is similar to how > it > > >>>>>> > > >>>>> works > > >>>>> > > >>>>>> if you join two SQL tables with the same column names. > > >>>>>> > > >>>>>> Any thoughts on whether it's doable or not? > > >>>>>> > > >>>>>> -Val > > >>>>>> > > >>>>>> On Fri, Dec 18, 2015 at 10:05 AM, Andrey Kornev < > > >>>>>> > > >>>>> [hidden email]> > > >>>>> > > >>>>>> wrote: > > >>>>>> > > >>>>>> In this particular case, the class that fails is a non-static > inner > > >>>>>>> > > >>>>>> class > > >>>>> > > >>>>>> that extends another non-static inner class, so they both end up > > >>>>>>> > > >>>>>> having > > >>>> > > >>>>> the > > >>>>>> > > >>>>>>> compiler-generated "this$0" field. > > >>>>>>> > > >>>>>>> Regards > > >>>>>>> Andrey > > >>>>>>> > > >>>>>>> Date: Fri, 18 Dec 2015 20:44:12 +0300 > > >>>>>>>> Subject: Re: CacheEntry serialization failure > > >>>>>>>> From: [hidden email] > > >>>>>>>> To: [hidden email] > > >>>>>>>> > > >>>>>>>> The most straightforward solution which comes to my mind - *do > not > > >>>>>>>> > > >>>>>>> ever > > >>>>> > > >>>>>> use > > >>>>>>> > > >>>>>>>> BinaryMarshaller by default*. Always fallback to > > >>>>>>>> > > >>>>>>> OptimizedMarshaller > > >>>> > > >>>>> unless > > >>>>>>> > > >>>>>>>> user explicitly asked us to use binary format (e.g. through > > >>>>>>>> > > >>>>>>> package > > >>>> > > >>>>> wildcards). > > >>>>>>>> > > >>>>>>>> BTW, we already do this for Externalizable and > > >>>>>>>> > > >>>>>>> readObject/writeObject. > > >>>>> > > >>>>>> On Fri, Dec 18, 2015 at 8:41 PM, Vladimir Ozerov < > > >>>>>>>> > > >>>>>>> [hidden email] > > >>>>> > > >>>>>> wrote: > > >>>>>>>> > > >>>>>>>> Ah, I saw your problem with DirectedSpecifics. We need to think > > >>>>>>>>> > > >>>>>>>> about > > >>>>> > > >>>>>> how > > >>>>>>> > > >>>>>>>> to solve it. Here is the case: > > >>>>>>>>> 1) Class is Serilzable and cannot be changed; > > >>>>>>>>> 2) There are several duplicated field names; > > >>>>>>>>> => BinaryMarshaller cannot handle it. > > >>>>>>>>> > > >>>>>>>>> Any thoughts? > > >>>>>>>>> > > >>>>>>>>> On Fri, Dec 18, 2015 at 8:34 PM, Vladimir Ozerov < > > >>>>>>>>> > > >>>>>>>> [hidden email] > > >>>>>> > > >>>>>>> wrote: > > >>>>>>>>> > > >>>>>>>>> I fixed the problem, it was a bug actually. > > >>>>>>>>>> > > >>>>>>>>>> By default classes which has some custom Java logic (e.g. > > >>>>>>>>>> > > >>>>>>>>> Externalizable, > > >>>>>>> > > >>>>>>>> or with writeObject/readObject methods) will be written using > > >>>>>>>>>> OptimizedMarshaller, so similar field names is not a problem. > > >>>>>>>>>> If you want to serialize such class in binary format and have > > >>>>>>>>>> > > >>>>>>>>> duplicate > > >>>>>>> > > >>>>>>>> field names, you should provide your own BinarySerializer, > > >>>>>>>>>> > > >>>>>>>>> which > > >>>> > > >>>>> will > > >>>>>> > > >>>>>>> write > > >>>>>>> > > >>>>>>>> these fields with different names. > > >>>>>>>>>> > > >>>>>>>>>> On Fri, Dec 18, 2015 at 8:07 PM, Andrey Kornev < > > >>>>>>>>>> > > >>>>>>>>> [hidden email]> > > >>>>>>> > > >>>>>>>> wrote: > > >>>>>>>>>> > > >>>>>>>>>> How am I supposed to handle this situation if the class comes > > >>>>>>>>>>> > > >>>>>>>>>> from > > >>>>> > > >>>>>> a > > >>>>>> > > >>>>>>> 3d > > >>>>>>> > > >>>>>>>> party I can't modify? > > >>>>>>>>>>> > > >>>>>>>>>>> Thanks > > >>>>>>>>>>> Andrey > > >>>>>>>>>>> > > >>>>>>>>>>> Date: Fri, 18 Dec 2015 09:12:22 +0300 > > >>>>>>>>>>>> Subject: Re: CacheEntry serialization failure > > >>>>>>>>>>>> From: [hidden email] > > >>>>>>>>>>>> To: [hidden email] > > >>>>>>>>>>>> > > >>>>>>>>>>>> I'll take a look. > > >>>>>>>>>>>> > > >>>>>>>>>>>> On Fri, Dec 18, 2015 at 4:37 AM, Valentin Kulichenko < > > >>>>>>>>>>>> [hidden email]> wrote: > > >>>>>>>>>>>> > > >>>>>>>>>>>> Folks, > > >>>>>>>>>>>>> > > >>>>>>>>>>>>> It looks like CacheEntry implementation (i.e., the entry > > >>>>>>>>>>>>> > > >>>>>>>>>>>> that > > >>>> > > >>>>> contains > > >>>>>>>>>>> > > >>>>>>>>>>>> version) can't be properly serialized with the > > >>>>>>>>>>>>> > > >>>>>>>>>>>> BinaryMarshaller. > > >>>>>> > > >>>>>>> I > > >>>>>>> > > >>>>>>>> created > > >>>>>>>>>>> > > >>>>>>>>>>>> the test and the ticket: > > >>>>>>>>>>>>> > > >>>>>>>>>>>> https://issues.apache.org/jira/browse/IGNITE-2203 > > >>>>>>>>>>> > > >>>>>>>>>>>> Can someone take a look? > > >>>>>>>>>>>>> > > >>>>>>>>>>>>> -Val > > >>>>>>>>>>>>> > > >>>>>>>>>>>>> > > >>>>>>>>>>> > > >>>>>>>>>> > > >>>>>>> > > >>> > > > > > > |
Got it. I understand the reasoning now for not throwing an exception. Let’s
make sure we document this behavior. If there is no additional field-name-parsing overhead, then the proposed API looks very nice. D. On Mon, Dec 21, 2015 at 1:19 AM, Vladimir Ozerov <[hidden email]> wrote: > Dima, > > Here is how proposed design works: > > class A { > int x = 1; > } > class B { > int x = 2; > } > > BinaryObject obj = ...; > > Object val = obj.field("A.x"); // Returns "1"; > Object val = obj.field("B.x"); // Returns "2"; > Object val = obj.field("x"); // Returns null; > > boolean exists = obj.hasField("A.x"); // Returns "true"; > boolean exists = obj.hasField("B.x"); // Returns "true"; > boolean exists = obj.hasField("x"); // Returns "false"; > > Looks clean and consistent for me. Remember that we are talking about very > specific use case. It is very unlikely that user will operate on objects > conflicting fields in binary form. > Also, there will be no parsing at all. We use field name passed by user > directly. > > Vladimir. > > On Mon, Dec 21, 2015 at 12:10 PM, Dmitriy Setrakyan <[hidden email] > > > wrote: > > > Vova, > > > > We cannot return null in case of a conflict, as user won’t be able to > > differentiate between a conflict and missing field. We should throw an > > exception. > > > > Also, I don’t like parsing field names looking for a dot for every field. > > It will introduce a performance overhead for the cases that do not have > > conflicts. Instead, we should add another API for this use case, > something > > like field(typeName, fieldName). > > > > D. > > > > On Mon, Dec 21, 2015 at 1:01 AM, Vladimir Ozerov <[hidden email]> > > wrote: > > > > > Denis, > > > Yes, as we do not know which field to pick, we return null. > > > > > > On Mon, Dec 21, 2015 at 12:00 PM, Denis Magda <[hidden email]> > > wrote: > > > > > > > Sounds good for me. I would go for this approach. > > > > > > > > In addition if to consider your example below and the user decides to > > > look > > > > up a field by its simple name then he/she will get nothing or > exception > > > > (depending on the API), correct? > > > > As an example for this case the method will return null > > > > > > > > BinaryObject obj = ...; > > > > Object val = obj.field("x"); // null will be returned cause we don't > > know > > > > what particular 'x' we have to return > > > > > > > > > > > > -- > > > > Denis > > > > > > > > On 12/21/2015 11:48 AM, Vladimir Ozerov wrote: > > > > > > > >> Folks, > > > >> > > > >> I thought about the solution a bit more and came to the following > > > design. > > > >> > > > >> *Scenario:* > > > >> class A { > > > >> int x; > > > >> } > > > >> class B : extends A { > > > >> int y; > > > >> } > > > >> > > > >> *Solution:* > > > >> 1) Field A.x is written as *"A.x"*, field B.x is written as *"B.x"*. > > > I.e. > > > >> *both > > > >> conflicting fields are prefixed* with simple name of the owning > class. > > > >> 2) API is unchanged. User manipulates these fields on all public > > methods > > > >> in > > > >> exactly the same way: "A.x" and "B.x". > > > >> > > > >> *Rationale:* > > > >> 1) We cannot prefix only some of conflicting fields. E.g. if decide > to > > > >> prefix only A.x, then it is not clear how to handle this case: > > > >> > > > >> class B extends A implements Binarylizable { > > > >> void write(BinaryWriter writer) { > > > >> writer.writeInt("B.x", x); // User intentionally written > > field > > > as > > > >> "B.x". > > > >> } > > > >> } > > > >> > > > >> BinaryObject obj = ...; > > > >> Object val = obj.field("B.x"); // Should we lookup for "B.x" as user > > > asked > > > >> us, or just for "x"? > > > >> > > > >> Prefixing all conflicting fields with class name resolves the > problem. > > > >> > > > >> 2) If we add methods to manipulate fields not only by name, but by > > > >> (typeName + fieldName) as well, then we will have to add *9 new > > methods* > > > >> to > > > >> > > > >> API: > > > >> BinaryType.fieldTypeName(String typeName, String fieldName); > > > >> BinaryType.field(String typeName, String fieldName); > > > >> BinaryObject.field(String typeName, String fieldName); > > > >> BinaryObject.hasField(String typeName, String fieldName); > > > >> BinaryObjectBuilder.getField(String typeName, String fieldName); > > > >> BinaryObjectBuilder.setField(String typeName, String fieldName, > ...); > > > // 3 > > > >> overloads > > > >> BinaryObjectBuilder.removeField(String typeName, String fieldName); > > > >> > > > >> This is definitely an overkill for such a rare scenario. > > > >> > > > >> Thoughts? > > > >> > > > >> On Mon, Dec 21, 2015 at 11:22 AM, Vladimir Ozerov < > > [hidden email] > > > > > > > >> wrote: > > > >> > > > >> Agree, prefixing parent class fields sound like a better option. > > > >>> Regarding aliases - I need some time to understand internal > > mechanics. > > > >>> Will answer this a bit later. > > > >>> > > > >>> On Mon, Dec 21, 2015 at 11:18 AM, Dmitriy Setrakyan < > > > >>> [hidden email] > > > >>> > > > >>>> wrote: > > > >>>> Vova, > > > >>>> > > > >>>> Shouldn’t it be the other way around? Class B writes the field as > > “a”, > > > >>>> and > > > >>>> class A writes it with a prefix (possibly the hash code of the > class > > > >>>> name). > > > >>>> > > > >>>> Also, we should clearly document how the SQL queries are affected > by > > > >>>> this. > > > >>>> AFAIK, we should be using field aliases here, no? > > > >>>> > > > >>>> D. > > > >>>> > > > >>>> On Sun, Dec 20, 2015 at 10:08 AM, Vladimir Ozerov < > > > [hidden email] > > > >>>> > > > > >>>> wrote: > > > >>>> > > > >>>> May be we can use normal field names by default and add some > > > >>>>> > > > >>>> prefix/suffix > > > >>>> > > > >>>>> if conflict is found? E.g.: > > > >>>>> > > > >>>>> class A { > > > >>>>> int a; // Write as "a"; > > > >>>>> } > > > >>>>> > > > >>>>> class B extends A { > > > >>>>> int a; // Write as "B_a"; > > > >>>>> } > > > >>>>> > > > >>>>> On Fri, Dec 18, 2015 at 10:34 PM, Valentin Kulichenko < > > > >>>>> [hidden email]> wrote: > > > >>>>> > > > >>>>> Folks, > > > >>>>>> > > > >>>>>> It seems to me that issue here is not with 3rd-party libraries. > We > > > >>>>>> > > > >>>>> just > > > >>>> > > > >>>>> don't properly support class hierarchy in binary format. Any > class > > > >>>>>> > > > >>>>> that > > > >>>> > > > >>>>> extends another class and has the field with the same name as > > parent > > > >>>>>> > > > >>>>> has > > > >>>> > > > >>>>> will fail unless user provides custom serialization logic that > will > > > >>>>>> > > > >>>>> handle > > > >>>>> > > > >>>>>> it. > > > >>>>>> > > > >>>>>> What if we prepend the field name with the simple class name in > > this > > > >>>>>> > > > >>>>> case? > > > >>>>> > > > >>>>>> Say, we have two classes: > > > >>>>>> > > > >>>>>> class A { > > > >>>>>> private int id; > > > >>>>>> } > > > >>>>>> > > > >>>>>> class B extends A { > > > >>>>>> private int id; > > > >>>>>> } > > > >>>>>> > > > >>>>>> In this case we will get two fields: "A.id" and "B.id". The only > > > >>>>>> > > > >>>>> issue is > > > >>>> > > > >>>>> that if there are no name conflict, we should be able to resolve > by > > > >>>>>> > > > >>>>> both > > > >>>> > > > >>>>> names - with or without prepended type name. I.e., if A is > > > serialized, > > > >>>>>> > > > >>>>> you > > > >>>>> > > > >>>>>> can get the field value by "id" or "A.id". This is similar to > how > > it > > > >>>>>> > > > >>>>> works > > > >>>>> > > > >>>>>> if you join two SQL tables with the same column names. > > > >>>>>> > > > >>>>>> Any thoughts on whether it's doable or not? > > > >>>>>> > > > >>>>>> -Val > > > >>>>>> > > > >>>>>> On Fri, Dec 18, 2015 at 10:05 AM, Andrey Kornev < > > > >>>>>> > > > >>>>> [hidden email]> > > > >>>>> > > > >>>>>> wrote: > > > >>>>>> > > > >>>>>> In this particular case, the class that fails is a non-static > > inner > > > >>>>>>> > > > >>>>>> class > > > >>>>> > > > >>>>>> that extends another non-static inner class, so they both end up > > > >>>>>>> > > > >>>>>> having > > > >>>> > > > >>>>> the > > > >>>>>> > > > >>>>>>> compiler-generated "this$0" field. > > > >>>>>>> > > > >>>>>>> Regards > > > >>>>>>> Andrey > > > >>>>>>> > > > >>>>>>> Date: Fri, 18 Dec 2015 20:44:12 +0300 > > > >>>>>>>> Subject: Re: CacheEntry serialization failure > > > >>>>>>>> From: [hidden email] > > > >>>>>>>> To: [hidden email] > > > >>>>>>>> > > > >>>>>>>> The most straightforward solution which comes to my mind - *do > > not > > > >>>>>>>> > > > >>>>>>> ever > > > >>>>> > > > >>>>>> use > > > >>>>>>> > > > >>>>>>>> BinaryMarshaller by default*. Always fallback to > > > >>>>>>>> > > > >>>>>>> OptimizedMarshaller > > > >>>> > > > >>>>> unless > > > >>>>>>> > > > >>>>>>>> user explicitly asked us to use binary format (e.g. through > > > >>>>>>>> > > > >>>>>>> package > > > >>>> > > > >>>>> wildcards). > > > >>>>>>>> > > > >>>>>>>> BTW, we already do this for Externalizable and > > > >>>>>>>> > > > >>>>>>> readObject/writeObject. > > > >>>>> > > > >>>>>> On Fri, Dec 18, 2015 at 8:41 PM, Vladimir Ozerov < > > > >>>>>>>> > > > >>>>>>> [hidden email] > > > >>>>> > > > >>>>>> wrote: > > > >>>>>>>> > > > >>>>>>>> Ah, I saw your problem with DirectedSpecifics. We need to > think > > > >>>>>>>>> > > > >>>>>>>> about > > > >>>>> > > > >>>>>> how > > > >>>>>>> > > > >>>>>>>> to solve it. Here is the case: > > > >>>>>>>>> 1) Class is Serilzable and cannot be changed; > > > >>>>>>>>> 2) There are several duplicated field names; > > > >>>>>>>>> => BinaryMarshaller cannot handle it. > > > >>>>>>>>> > > > >>>>>>>>> Any thoughts? > > > >>>>>>>>> > > > >>>>>>>>> On Fri, Dec 18, 2015 at 8:34 PM, Vladimir Ozerov < > > > >>>>>>>>> > > > >>>>>>>> [hidden email] > > > >>>>>> > > > >>>>>>> wrote: > > > >>>>>>>>> > > > >>>>>>>>> I fixed the problem, it was a bug actually. > > > >>>>>>>>>> > > > >>>>>>>>>> By default classes which has some custom Java logic (e.g. > > > >>>>>>>>>> > > > >>>>>>>>> Externalizable, > > > >>>>>>> > > > >>>>>>>> or with writeObject/readObject methods) will be written using > > > >>>>>>>>>> OptimizedMarshaller, so similar field names is not a > problem. > > > >>>>>>>>>> If you want to serialize such class in binary format and > have > > > >>>>>>>>>> > > > >>>>>>>>> duplicate > > > >>>>>>> > > > >>>>>>>> field names, you should provide your own BinarySerializer, > > > >>>>>>>>>> > > > >>>>>>>>> which > > > >>>> > > > >>>>> will > > > >>>>>> > > > >>>>>>> write > > > >>>>>>> > > > >>>>>>>> these fields with different names. > > > >>>>>>>>>> > > > >>>>>>>>>> On Fri, Dec 18, 2015 at 8:07 PM, Andrey Kornev < > > > >>>>>>>>>> > > > >>>>>>>>> [hidden email]> > > > >>>>>>> > > > >>>>>>>> wrote: > > > >>>>>>>>>> > > > >>>>>>>>>> How am I supposed to handle this situation if the class > comes > > > >>>>>>>>>>> > > > >>>>>>>>>> from > > > >>>>> > > > >>>>>> a > > > >>>>>> > > > >>>>>>> 3d > > > >>>>>>> > > > >>>>>>>> party I can't modify? > > > >>>>>>>>>>> > > > >>>>>>>>>>> Thanks > > > >>>>>>>>>>> Andrey > > > >>>>>>>>>>> > > > >>>>>>>>>>> Date: Fri, 18 Dec 2015 09:12:22 +0300 > > > >>>>>>>>>>>> Subject: Re: CacheEntry serialization failure > > > >>>>>>>>>>>> From: [hidden email] > > > >>>>>>>>>>>> To: [hidden email] > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> I'll take a look. > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> On Fri, Dec 18, 2015 at 4:37 AM, Valentin Kulichenko < > > > >>>>>>>>>>>> [hidden email]> wrote: > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> Folks, > > > >>>>>>>>>>>>> > > > >>>>>>>>>>>>> It looks like CacheEntry implementation (i.e., the entry > > > >>>>>>>>>>>>> > > > >>>>>>>>>>>> that > > > >>>> > > > >>>>> contains > > > >>>>>>>>>>> > > > >>>>>>>>>>>> version) can't be properly serialized with the > > > >>>>>>>>>>>>> > > > >>>>>>>>>>>> BinaryMarshaller. > > > >>>>>> > > > >>>>>>> I > > > >>>>>>> > > > >>>>>>>> created > > > >>>>>>>>>>> > > > >>>>>>>>>>>> the test and the ticket: > > > >>>>>>>>>>>>> > > > >>>>>>>>>>>> https://issues.apache.org/jira/browse/IGNITE-2203 > > > >>>>>>>>>>> > > > >>>>>>>>>>>> Can someone take a look? > > > >>>>>>>>>>>>> > > > >>>>>>>>>>>>> -Val > > > >>>>>>>>>>>>> > > > >>>>>>>>>>>>> > > > >>>>>>>>>>> > > > >>>>>>>>>> > > > >>>>>>> > > > >>> > > > > > > > > > > |
From my standpoint Vova's appoach very close to SQL behavior if two joined
tables have same column names. On Mon, Dec 21, 2015 at 12:23 PM, Dmitriy Setrakyan <[hidden email]> wrote: > Got it. I understand the reasoning now for not throwing an exception. Let’s > make sure we document this behavior. > > If there is no additional field-name-parsing overhead, then the proposed > API looks very nice. > > D. > > On Mon, Dec 21, 2015 at 1:19 AM, Vladimir Ozerov <[hidden email]> > wrote: > > > Dima, > > > > Here is how proposed design works: > > > > class A { > > int x = 1; > > } > > class B { > > int x = 2; > > } > > > > BinaryObject obj = ...; > > > > Object val = obj.field("A.x"); // Returns "1"; > > Object val = obj.field("B.x"); // Returns "2"; > > Object val = obj.field("x"); // Returns null; > > > > boolean exists = obj.hasField("A.x"); // Returns "true"; > > boolean exists = obj.hasField("B.x"); // Returns "true"; > > boolean exists = obj.hasField("x"); // Returns "false"; > > > > Looks clean and consistent for me. Remember that we are talking about > very > > specific use case. It is very unlikely that user will operate on objects > > conflicting fields in binary form. > > Also, there will be no parsing at all. We use field name passed by user > > directly. > > > > Vladimir. > > > > On Mon, Dec 21, 2015 at 12:10 PM, Dmitriy Setrakyan < > [hidden email] > > > > > wrote: > > > > > Vova, > > > > > > We cannot return null in case of a conflict, as user won’t be able to > > > differentiate between a conflict and missing field. We should throw an > > > exception. > > > > > > Also, I don’t like parsing field names looking for a dot for every > field. > > > It will introduce a performance overhead for the cases that do not have > > > conflicts. Instead, we should add another API for this use case, > > something > > > like field(typeName, fieldName). > > > > > > D. > > > > > > On Mon, Dec 21, 2015 at 1:01 AM, Vladimir Ozerov <[hidden email] > > > > > wrote: > > > > > > > Denis, > > > > Yes, as we do not know which field to pick, we return null. > > > > > > > > On Mon, Dec 21, 2015 at 12:00 PM, Denis Magda <[hidden email]> > > > wrote: > > > > > > > > > Sounds good for me. I would go for this approach. > > > > > > > > > > In addition if to consider your example below and the user decides > to > > > > look > > > > > up a field by its simple name then he/she will get nothing or > > exception > > > > > (depending on the API), correct? > > > > > As an example for this case the method will return null > > > > > > > > > > BinaryObject obj = ...; > > > > > Object val = obj.field("x"); // null will be returned cause we > don't > > > know > > > > > what particular 'x' we have to return > > > > > > > > > > > > > > > -- > > > > > Denis > > > > > > > > > > On 12/21/2015 11:48 AM, Vladimir Ozerov wrote: > > > > > > > > > >> Folks, > > > > >> > > > > >> I thought about the solution a bit more and came to the following > > > > design. > > > > >> > > > > >> *Scenario:* > > > > >> class A { > > > > >> int x; > > > > >> } > > > > >> class B : extends A { > > > > >> int y; > > > > >> } > > > > >> > > > > >> *Solution:* > > > > >> 1) Field A.x is written as *"A.x"*, field B.x is written as > *"B.x"*. > > > > I.e. > > > > >> *both > > > > >> conflicting fields are prefixed* with simple name of the owning > > class. > > > > >> 2) API is unchanged. User manipulates these fields on all public > > > methods > > > > >> in > > > > >> exactly the same way: "A.x" and "B.x". > > > > >> > > > > >> *Rationale:* > > > > >> 1) We cannot prefix only some of conflicting fields. E.g. if > decide > > to > > > > >> prefix only A.x, then it is not clear how to handle this case: > > > > >> > > > > >> class B extends A implements Binarylizable { > > > > >> void write(BinaryWriter writer) { > > > > >> writer.writeInt("B.x", x); // User intentionally written > > > field > > > > as > > > > >> "B.x". > > > > >> } > > > > >> } > > > > >> > > > > >> BinaryObject obj = ...; > > > > >> Object val = obj.field("B.x"); // Should we lookup for "B.x" as > user > > > > asked > > > > >> us, or just for "x"? > > > > >> > > > > >> Prefixing all conflicting fields with class name resolves the > > problem. > > > > >> > > > > >> 2) If we add methods to manipulate fields not only by name, but by > > > > >> (typeName + fieldName) as well, then we will have to add *9 new > > > methods* > > > > >> to > > > > >> > > > > >> API: > > > > >> BinaryType.fieldTypeName(String typeName, String fieldName); > > > > >> BinaryType.field(String typeName, String fieldName); > > > > >> BinaryObject.field(String typeName, String fieldName); > > > > >> BinaryObject.hasField(String typeName, String fieldName); > > > > >> BinaryObjectBuilder.getField(String typeName, String fieldName); > > > > >> BinaryObjectBuilder.setField(String typeName, String fieldName, > > ...); > > > > // 3 > > > > >> overloads > > > > >> BinaryObjectBuilder.removeField(String typeName, String > fieldName); > > > > >> > > > > >> This is definitely an overkill for such a rare scenario. > > > > >> > > > > >> Thoughts? > > > > >> > > > > >> On Mon, Dec 21, 2015 at 11:22 AM, Vladimir Ozerov < > > > [hidden email] > > > > > > > > > >> wrote: > > > > >> > > > > >> Agree, prefixing parent class fields sound like a better option. > > > > >>> Regarding aliases - I need some time to understand internal > > > mechanics. > > > > >>> Will answer this a bit later. > > > > >>> > > > > >>> On Mon, Dec 21, 2015 at 11:18 AM, Dmitriy Setrakyan < > > > > >>> [hidden email] > > > > >>> > > > > >>>> wrote: > > > > >>>> Vova, > > > > >>>> > > > > >>>> Shouldn’t it be the other way around? Class B writes the field > as > > > “a”, > > > > >>>> and > > > > >>>> class A writes it with a prefix (possibly the hash code of the > > class > > > > >>>> name). > > > > >>>> > > > > >>>> Also, we should clearly document how the SQL queries are > affected > > by > > > > >>>> this. > > > > >>>> AFAIK, we should be using field aliases here, no? > > > > >>>> > > > > >>>> D. > > > > >>>> > > > > >>>> On Sun, Dec 20, 2015 at 10:08 AM, Vladimir Ozerov < > > > > [hidden email] > > > > >>>> > > > > > >>>> wrote: > > > > >>>> > > > > >>>> May be we can use normal field names by default and add some > > > > >>>>> > > > > >>>> prefix/suffix > > > > >>>> > > > > >>>>> if conflict is found? E.g.: > > > > >>>>> > > > > >>>>> class A { > > > > >>>>> int a; // Write as "a"; > > > > >>>>> } > > > > >>>>> > > > > >>>>> class B extends A { > > > > >>>>> int a; // Write as "B_a"; > > > > >>>>> } > > > > >>>>> > > > > >>>>> On Fri, Dec 18, 2015 at 10:34 PM, Valentin Kulichenko < > > > > >>>>> [hidden email]> wrote: > > > > >>>>> > > > > >>>>> Folks, > > > > >>>>>> > > > > >>>>>> It seems to me that issue here is not with 3rd-party > libraries. > > We > > > > >>>>>> > > > > >>>>> just > > > > >>>> > > > > >>>>> don't properly support class hierarchy in binary format. Any > > class > > > > >>>>>> > > > > >>>>> that > > > > >>>> > > > > >>>>> extends another class and has the field with the same name as > > > parent > > > > >>>>>> > > > > >>>>> has > > > > >>>> > > > > >>>>> will fail unless user provides custom serialization logic that > > will > > > > >>>>>> > > > > >>>>> handle > > > > >>>>> > > > > >>>>>> it. > > > > >>>>>> > > > > >>>>>> What if we prepend the field name with the simple class name > in > > > this > > > > >>>>>> > > > > >>>>> case? > > > > >>>>> > > > > >>>>>> Say, we have two classes: > > > > >>>>>> > > > > >>>>>> class A { > > > > >>>>>> private int id; > > > > >>>>>> } > > > > >>>>>> > > > > >>>>>> class B extends A { > > > > >>>>>> private int id; > > > > >>>>>> } > > > > >>>>>> > > > > >>>>>> In this case we will get two fields: "A.id" and "B.id". The > only > > > > >>>>>> > > > > >>>>> issue is > > > > >>>> > > > > >>>>> that if there are no name conflict, we should be able to > resolve > > by > > > > >>>>>> > > > > >>>>> both > > > > >>>> > > > > >>>>> names - with or without prepended type name. I.e., if A is > > > > serialized, > > > > >>>>>> > > > > >>>>> you > > > > >>>>> > > > > >>>>>> can get the field value by "id" or "A.id". This is similar to > > how > > > it > > > > >>>>>> > > > > >>>>> works > > > > >>>>> > > > > >>>>>> if you join two SQL tables with the same column names. > > > > >>>>>> > > > > >>>>>> Any thoughts on whether it's doable or not? > > > > >>>>>> > > > > >>>>>> -Val > > > > >>>>>> > > > > >>>>>> On Fri, Dec 18, 2015 at 10:05 AM, Andrey Kornev < > > > > >>>>>> > > > > >>>>> [hidden email]> > > > > >>>>> > > > > >>>>>> wrote: > > > > >>>>>> > > > > >>>>>> In this particular case, the class that fails is a non-static > > > inner > > > > >>>>>>> > > > > >>>>>> class > > > > >>>>> > > > > >>>>>> that extends another non-static inner class, so they both end > up > > > > >>>>>>> > > > > >>>>>> having > > > > >>>> > > > > >>>>> the > > > > >>>>>> > > > > >>>>>>> compiler-generated "this$0" field. > > > > >>>>>>> > > > > >>>>>>> Regards > > > > >>>>>>> Andrey > > > > >>>>>>> > > > > >>>>>>> Date: Fri, 18 Dec 2015 20:44:12 +0300 > > > > >>>>>>>> Subject: Re: CacheEntry serialization failure > > > > >>>>>>>> From: [hidden email] > > > > >>>>>>>> To: [hidden email] > > > > >>>>>>>> > > > > >>>>>>>> The most straightforward solution which comes to my mind - > *do > > > not > > > > >>>>>>>> > > > > >>>>>>> ever > > > > >>>>> > > > > >>>>>> use > > > > >>>>>>> > > > > >>>>>>>> BinaryMarshaller by default*. Always fallback to > > > > >>>>>>>> > > > > >>>>>>> OptimizedMarshaller > > > > >>>> > > > > >>>>> unless > > > > >>>>>>> > > > > >>>>>>>> user explicitly asked us to use binary format (e.g. through > > > > >>>>>>>> > > > > >>>>>>> package > > > > >>>> > > > > >>>>> wildcards). > > > > >>>>>>>> > > > > >>>>>>>> BTW, we already do this for Externalizable and > > > > >>>>>>>> > > > > >>>>>>> readObject/writeObject. > > > > >>>>> > > > > >>>>>> On Fri, Dec 18, 2015 at 8:41 PM, Vladimir Ozerov < > > > > >>>>>>>> > > > > >>>>>>> [hidden email] > > > > >>>>> > > > > >>>>>> wrote: > > > > >>>>>>>> > > > > >>>>>>>> Ah, I saw your problem with DirectedSpecifics. We need to > > think > > > > >>>>>>>>> > > > > >>>>>>>> about > > > > >>>>> > > > > >>>>>> how > > > > >>>>>>> > > > > >>>>>>>> to solve it. Here is the case: > > > > >>>>>>>>> 1) Class is Serilzable and cannot be changed; > > > > >>>>>>>>> 2) There are several duplicated field names; > > > > >>>>>>>>> => BinaryMarshaller cannot handle it. > > > > >>>>>>>>> > > > > >>>>>>>>> Any thoughts? > > > > >>>>>>>>> > > > > >>>>>>>>> On Fri, Dec 18, 2015 at 8:34 PM, Vladimir Ozerov < > > > > >>>>>>>>> > > > > >>>>>>>> [hidden email] > > > > >>>>>> > > > > >>>>>>> wrote: > > > > >>>>>>>>> > > > > >>>>>>>>> I fixed the problem, it was a bug actually. > > > > >>>>>>>>>> > > > > >>>>>>>>>> By default classes which has some custom Java logic (e.g. > > > > >>>>>>>>>> > > > > >>>>>>>>> Externalizable, > > > > >>>>>>> > > > > >>>>>>>> or with writeObject/readObject methods) will be written > using > > > > >>>>>>>>>> OptimizedMarshaller, so similar field names is not a > > problem. > > > > >>>>>>>>>> If you want to serialize such class in binary format and > > have > > > > >>>>>>>>>> > > > > >>>>>>>>> duplicate > > > > >>>>>>> > > > > >>>>>>>> field names, you should provide your own BinarySerializer, > > > > >>>>>>>>>> > > > > >>>>>>>>> which > > > > >>>> > > > > >>>>> will > > > > >>>>>> > > > > >>>>>>> write > > > > >>>>>>> > > > > >>>>>>>> these fields with different names. > > > > >>>>>>>>>> > > > > >>>>>>>>>> On Fri, Dec 18, 2015 at 8:07 PM, Andrey Kornev < > > > > >>>>>>>>>> > > > > >>>>>>>>> [hidden email]> > > > > >>>>>>> > > > > >>>>>>>> wrote: > > > > >>>>>>>>>> > > > > >>>>>>>>>> How am I supposed to handle this situation if the class > > comes > > > > >>>>>>>>>>> > > > > >>>>>>>>>> from > > > > >>>>> > > > > >>>>>> a > > > > >>>>>> > > > > >>>>>>> 3d > > > > >>>>>>> > > > > >>>>>>>> party I can't modify? > > > > >>>>>>>>>>> > > > > >>>>>>>>>>> Thanks > > > > >>>>>>>>>>> Andrey > > > > >>>>>>>>>>> > > > > >>>>>>>>>>> Date: Fri, 18 Dec 2015 09:12:22 +0300 > > > > >>>>>>>>>>>> Subject: Re: CacheEntry serialization failure > > > > >>>>>>>>>>>> From: [hidden email] > > > > >>>>>>>>>>>> To: [hidden email] > > > > >>>>>>>>>>>> > > > > >>>>>>>>>>>> I'll take a look. > > > > >>>>>>>>>>>> > > > > >>>>>>>>>>>> On Fri, Dec 18, 2015 at 4:37 AM, Valentin Kulichenko < > > > > >>>>>>>>>>>> [hidden email]> wrote: > > > > >>>>>>>>>>>> > > > > >>>>>>>>>>>> Folks, > > > > >>>>>>>>>>>>> > > > > >>>>>>>>>>>>> It looks like CacheEntry implementation (i.e., the > entry > > > > >>>>>>>>>>>>> > > > > >>>>>>>>>>>> that > > > > >>>> > > > > >>>>> contains > > > > >>>>>>>>>>> > > > > >>>>>>>>>>>> version) can't be properly serialized with the > > > > >>>>>>>>>>>>> > > > > >>>>>>>>>>>> BinaryMarshaller. > > > > >>>>>> > > > > >>>>>>> I > > > > >>>>>>> > > > > >>>>>>>> created > > > > >>>>>>>>>>> > > > > >>>>>>>>>>>> the test and the ticket: > > > > >>>>>>>>>>>>> > > > > >>>>>>>>>>>> https://issues.apache.org/jira/browse/IGNITE-2203 > > > > >>>>>>>>>>> > > > > >>>>>>>>>>>> Can someone take a look? > > > > >>>>>>>>>>>>> > > > > >>>>>>>>>>>>> -Val > > > > >>>>>>>>>>>>> > > > > >>>>>>>>>>>>> > > > > >>>>>>>>>>> > > > > >>>>>>>>>> > > > > >>>>>>> > > > > >>> > > > > > > > > > > > > > > > -- Sergey Kozlov |
Several additional problems appeared:
1) We must use fully-qualified class names because simple class names might also be the same. E.g: class package1.A extends package2.A 2) Our query engine doesn't like dots and "$" from class names. Because of this fields with these characters cannot be queried. To workaorund these problems I did the following: 1) Fully qualified class names are used; 2) "." is replaced with "_" and "$" is replaced with "__". E.g. field org.my.package.MyClass$MyInnerClass.x is written as " org_my_package_MyClass__MyInnerClass_x"; 3) If user would like to query the field, he can call special helper method BinaryUtils.qualifiedFieldName(Class cls, String fieldName) returning correct field name. As this problem is not likely to occur in real quering scenarios, I think this solution is more or less fine. Thoughts? On Mon, Dec 21, 2015 at 12:28 PM, Sergey Kozlov <[hidden email]> wrote: > From my standpoint Vova's appoach very close to SQL behavior if two joined > tables have same column names. > > On Mon, Dec 21, 2015 at 12:23 PM, Dmitriy Setrakyan <[hidden email] > > > wrote: > > > Got it. I understand the reasoning now for not throwing an exception. > Let’s > > make sure we document this behavior. > > > > If there is no additional field-name-parsing overhead, then the proposed > > API looks very nice. > > > > D. > > > > On Mon, Dec 21, 2015 at 1:19 AM, Vladimir Ozerov <[hidden email]> > > wrote: > > > > > Dima, > > > > > > Here is how proposed design works: > > > > > > class A { > > > int x = 1; > > > } > > > class B { > > > int x = 2; > > > } > > > > > > BinaryObject obj = ...; > > > > > > Object val = obj.field("A.x"); // Returns "1"; > > > Object val = obj.field("B.x"); // Returns "2"; > > > Object val = obj.field("x"); // Returns null; > > > > > > boolean exists = obj.hasField("A.x"); // Returns "true"; > > > boolean exists = obj.hasField("B.x"); // Returns "true"; > > > boolean exists = obj.hasField("x"); // Returns "false"; > > > > > > Looks clean and consistent for me. Remember that we are talking about > > very > > > specific use case. It is very unlikely that user will operate on > objects > > > conflicting fields in binary form. > > > Also, there will be no parsing at all. We use field name passed by user > > > directly. > > > > > > Vladimir. > > > > > > On Mon, Dec 21, 2015 at 12:10 PM, Dmitriy Setrakyan < > > [hidden email] > > > > > > > wrote: > > > > > > > Vova, > > > > > > > > We cannot return null in case of a conflict, as user won’t be able to > > > > differentiate between a conflict and missing field. We should throw > an > > > > exception. > > > > > > > > Also, I don’t like parsing field names looking for a dot for every > > field. > > > > It will introduce a performance overhead for the cases that do not > have > > > > conflicts. Instead, we should add another API for this use case, > > > something > > > > like field(typeName, fieldName). > > > > > > > > D. > > > > > > > > On Mon, Dec 21, 2015 at 1:01 AM, Vladimir Ozerov < > [hidden email] > > > > > > > wrote: > > > > > > > > > Denis, > > > > > Yes, as we do not know which field to pick, we return null. > > > > > > > > > > On Mon, Dec 21, 2015 at 12:00 PM, Denis Magda <[hidden email] > > > > > > wrote: > > > > > > > > > > > Sounds good for me. I would go for this approach. > > > > > > > > > > > > In addition if to consider your example below and the user > decides > > to > > > > > look > > > > > > up a field by its simple name then he/she will get nothing or > > > exception > > > > > > (depending on the API), correct? > > > > > > As an example for this case the method will return null > > > > > > > > > > > > BinaryObject obj = ...; > > > > > > Object val = obj.field("x"); // null will be returned cause we > > don't > > > > know > > > > > > what particular 'x' we have to return > > > > > > > > > > > > > > > > > > -- > > > > > > Denis > > > > > > > > > > > > On 12/21/2015 11:48 AM, Vladimir Ozerov wrote: > > > > > > > > > > > >> Folks, > > > > > >> > > > > > >> I thought about the solution a bit more and came to the > following > > > > > design. > > > > > >> > > > > > >> *Scenario:* > > > > > >> class A { > > > > > >> int x; > > > > > >> } > > > > > >> class B : extends A { > > > > > >> int y; > > > > > >> } > > > > > >> > > > > > >> *Solution:* > > > > > >> 1) Field A.x is written as *"A.x"*, field B.x is written as > > *"B.x"*. > > > > > I.e. > > > > > >> *both > > > > > >> conflicting fields are prefixed* with simple name of the owning > > > class. > > > > > >> 2) API is unchanged. User manipulates these fields on all public > > > > methods > > > > > >> in > > > > > >> exactly the same way: "A.x" and "B.x". > > > > > >> > > > > > >> *Rationale:* > > > > > >> 1) We cannot prefix only some of conflicting fields. E.g. if > > decide > > > to > > > > > >> prefix only A.x, then it is not clear how to handle this case: > > > > > >> > > > > > >> class B extends A implements Binarylizable { > > > > > >> void write(BinaryWriter writer) { > > > > > >> writer.writeInt("B.x", x); // User intentionally > written > > > > field > > > > > as > > > > > >> "B.x". > > > > > >> } > > > > > >> } > > > > > >> > > > > > >> BinaryObject obj = ...; > > > > > >> Object val = obj.field("B.x"); // Should we lookup for "B.x" as > > user > > > > > asked > > > > > >> us, or just for "x"? > > > > > >> > > > > > >> Prefixing all conflicting fields with class name resolves the > > > problem. > > > > > >> > > > > > >> 2) If we add methods to manipulate fields not only by name, but > by > > > > > >> (typeName + fieldName) as well, then we will have to add *9 new > > > > methods* > > > > > >> to > > > > > >> > > > > > >> API: > > > > > >> BinaryType.fieldTypeName(String typeName, String fieldName); > > > > > >> BinaryType.field(String typeName, String fieldName); > > > > > >> BinaryObject.field(String typeName, String fieldName); > > > > > >> BinaryObject.hasField(String typeName, String fieldName); > > > > > >> BinaryObjectBuilder.getField(String typeName, String fieldName); > > > > > >> BinaryObjectBuilder.setField(String typeName, String fieldName, > > > ...); > > > > > // 3 > > > > > >> overloads > > > > > >> BinaryObjectBuilder.removeField(String typeName, String > > fieldName); > > > > > >> > > > > > >> This is definitely an overkill for such a rare scenario. > > > > > >> > > > > > >> Thoughts? > > > > > >> > > > > > >> On Mon, Dec 21, 2015 at 11:22 AM, Vladimir Ozerov < > > > > [hidden email] > > > > > > > > > > > >> wrote: > > > > > >> > > > > > >> Agree, prefixing parent class fields sound like a better option. > > > > > >>> Regarding aliases - I need some time to understand internal > > > > mechanics. > > > > > >>> Will answer this a bit later. > > > > > >>> > > > > > >>> On Mon, Dec 21, 2015 at 11:18 AM, Dmitriy Setrakyan < > > > > > >>> [hidden email] > > > > > >>> > > > > > >>>> wrote: > > > > > >>>> Vova, > > > > > >>>> > > > > > >>>> Shouldn’t it be the other way around? Class B writes the field > > as > > > > “a”, > > > > > >>>> and > > > > > >>>> class A writes it with a prefix (possibly the hash code of the > > > class > > > > > >>>> name). > > > > > >>>> > > > > > >>>> Also, we should clearly document how the SQL queries are > > affected > > > by > > > > > >>>> this. > > > > > >>>> AFAIK, we should be using field aliases here, no? > > > > > >>>> > > > > > >>>> D. > > > > > >>>> > > > > > >>>> On Sun, Dec 20, 2015 at 10:08 AM, Vladimir Ozerov < > > > > > [hidden email] > > > > > >>>> > > > > > > >>>> wrote: > > > > > >>>> > > > > > >>>> May be we can use normal field names by default and add some > > > > > >>>>> > > > > > >>>> prefix/suffix > > > > > >>>> > > > > > >>>>> if conflict is found? E.g.: > > > > > >>>>> > > > > > >>>>> class A { > > > > > >>>>> int a; // Write as "a"; > > > > > >>>>> } > > > > > >>>>> > > > > > >>>>> class B extends A { > > > > > >>>>> int a; // Write as "B_a"; > > > > > >>>>> } > > > > > >>>>> > > > > > >>>>> On Fri, Dec 18, 2015 at 10:34 PM, Valentin Kulichenko < > > > > > >>>>> [hidden email]> wrote: > > > > > >>>>> > > > > > >>>>> Folks, > > > > > >>>>>> > > > > > >>>>>> It seems to me that issue here is not with 3rd-party > > libraries. > > > We > > > > > >>>>>> > > > > > >>>>> just > > > > > >>>> > > > > > >>>>> don't properly support class hierarchy in binary format. Any > > > class > > > > > >>>>>> > > > > > >>>>> that > > > > > >>>> > > > > > >>>>> extends another class and has the field with the same name as > > > > parent > > > > > >>>>>> > > > > > >>>>> has > > > > > >>>> > > > > > >>>>> will fail unless user provides custom serialization logic > that > > > will > > > > > >>>>>> > > > > > >>>>> handle > > > > > >>>>> > > > > > >>>>>> it. > > > > > >>>>>> > > > > > >>>>>> What if we prepend the field name with the simple class name > > in > > > > this > > > > > >>>>>> > > > > > >>>>> case? > > > > > >>>>> > > > > > >>>>>> Say, we have two classes: > > > > > >>>>>> > > > > > >>>>>> class A { > > > > > >>>>>> private int id; > > > > > >>>>>> } > > > > > >>>>>> > > > > > >>>>>> class B extends A { > > > > > >>>>>> private int id; > > > > > >>>>>> } > > > > > >>>>>> > > > > > >>>>>> In this case we will get two fields: "A.id" and "B.id". The > > only > > > > > >>>>>> > > > > > >>>>> issue is > > > > > >>>> > > > > > >>>>> that if there are no name conflict, we should be able to > > resolve > > > by > > > > > >>>>>> > > > > > >>>>> both > > > > > >>>> > > > > > >>>>> names - with or without prepended type name. I.e., if A is > > > > > serialized, > > > > > >>>>>> > > > > > >>>>> you > > > > > >>>>> > > > > > >>>>>> can get the field value by "id" or "A.id". This is similar > to > > > how > > > > it > > > > > >>>>>> > > > > > >>>>> works > > > > > >>>>> > > > > > >>>>>> if you join two SQL tables with the same column names. > > > > > >>>>>> > > > > > >>>>>> Any thoughts on whether it's doable or not? > > > > > >>>>>> > > > > > >>>>>> -Val > > > > > >>>>>> > > > > > >>>>>> On Fri, Dec 18, 2015 at 10:05 AM, Andrey Kornev < > > > > > >>>>>> > > > > > >>>>> [hidden email]> > > > > > >>>>> > > > > > >>>>>> wrote: > > > > > >>>>>> > > > > > >>>>>> In this particular case, the class that fails is a > non-static > > > > inner > > > > > >>>>>>> > > > > > >>>>>> class > > > > > >>>>> > > > > > >>>>>> that extends another non-static inner class, so they both > end > > up > > > > > >>>>>>> > > > > > >>>>>> having > > > > > >>>> > > > > > >>>>> the > > > > > >>>>>> > > > > > >>>>>>> compiler-generated "this$0" field. > > > > > >>>>>>> > > > > > >>>>>>> Regards > > > > > >>>>>>> Andrey > > > > > >>>>>>> > > > > > >>>>>>> Date: Fri, 18 Dec 2015 20:44:12 +0300 > > > > > >>>>>>>> Subject: Re: CacheEntry serialization failure > > > > > >>>>>>>> From: [hidden email] > > > > > >>>>>>>> To: [hidden email] > > > > > >>>>>>>> > > > > > >>>>>>>> The most straightforward solution which comes to my mind - > > *do > > > > not > > > > > >>>>>>>> > > > > > >>>>>>> ever > > > > > >>>>> > > > > > >>>>>> use > > > > > >>>>>>> > > > > > >>>>>>>> BinaryMarshaller by default*. Always fallback to > > > > > >>>>>>>> > > > > > >>>>>>> OptimizedMarshaller > > > > > >>>> > > > > > >>>>> unless > > > > > >>>>>>> > > > > > >>>>>>>> user explicitly asked us to use binary format (e.g. > through > > > > > >>>>>>>> > > > > > >>>>>>> package > > > > > >>>> > > > > > >>>>> wildcards). > > > > > >>>>>>>> > > > > > >>>>>>>> BTW, we already do this for Externalizable and > > > > > >>>>>>>> > > > > > >>>>>>> readObject/writeObject. > > > > > >>>>> > > > > > >>>>>> On Fri, Dec 18, 2015 at 8:41 PM, Vladimir Ozerov < > > > > > >>>>>>>> > > > > > >>>>>>> [hidden email] > > > > > >>>>> > > > > > >>>>>> wrote: > > > > > >>>>>>>> > > > > > >>>>>>>> Ah, I saw your problem with DirectedSpecifics. We need to > > > think > > > > > >>>>>>>>> > > > > > >>>>>>>> about > > > > > >>>>> > > > > > >>>>>> how > > > > > >>>>>>> > > > > > >>>>>>>> to solve it. Here is the case: > > > > > >>>>>>>>> 1) Class is Serilzable and cannot be changed; > > > > > >>>>>>>>> 2) There are several duplicated field names; > > > > > >>>>>>>>> => BinaryMarshaller cannot handle it. > > > > > >>>>>>>>> > > > > > >>>>>>>>> Any thoughts? > > > > > >>>>>>>>> > > > > > >>>>>>>>> On Fri, Dec 18, 2015 at 8:34 PM, Vladimir Ozerov < > > > > > >>>>>>>>> > > > > > >>>>>>>> [hidden email] > > > > > >>>>>> > > > > > >>>>>>> wrote: > > > > > >>>>>>>>> > > > > > >>>>>>>>> I fixed the problem, it was a bug actually. > > > > > >>>>>>>>>> > > > > > >>>>>>>>>> By default classes which has some custom Java logic > (e.g. > > > > > >>>>>>>>>> > > > > > >>>>>>>>> Externalizable, > > > > > >>>>>>> > > > > > >>>>>>>> or with writeObject/readObject methods) will be written > > using > > > > > >>>>>>>>>> OptimizedMarshaller, so similar field names is not a > > > problem. > > > > > >>>>>>>>>> If you want to serialize such class in binary format and > > > have > > > > > >>>>>>>>>> > > > > > >>>>>>>>> duplicate > > > > > >>>>>>> > > > > > >>>>>>>> field names, you should provide your own BinarySerializer, > > > > > >>>>>>>>>> > > > > > >>>>>>>>> which > > > > > >>>> > > > > > >>>>> will > > > > > >>>>>> > > > > > >>>>>>> write > > > > > >>>>>>> > > > > > >>>>>>>> these fields with different names. > > > > > >>>>>>>>>> > > > > > >>>>>>>>>> On Fri, Dec 18, 2015 at 8:07 PM, Andrey Kornev < > > > > > >>>>>>>>>> > > > > > >>>>>>>>> [hidden email]> > > > > > >>>>>>> > > > > > >>>>>>>> wrote: > > > > > >>>>>>>>>> > > > > > >>>>>>>>>> How am I supposed to handle this situation if the class > > > comes > > > > > >>>>>>>>>>> > > > > > >>>>>>>>>> from > > > > > >>>>> > > > > > >>>>>> a > > > > > >>>>>> > > > > > >>>>>>> 3d > > > > > >>>>>>> > > > > > >>>>>>>> party I can't modify? > > > > > >>>>>>>>>>> > > > > > >>>>>>>>>>> Thanks > > > > > >>>>>>>>>>> Andrey > > > > > >>>>>>>>>>> > > > > > >>>>>>>>>>> Date: Fri, 18 Dec 2015 09:12:22 +0300 > > > > > >>>>>>>>>>>> Subject: Re: CacheEntry serialization failure > > > > > >>>>>>>>>>>> From: [hidden email] > > > > > >>>>>>>>>>>> To: [hidden email] > > > > > >>>>>>>>>>>> > > > > > >>>>>>>>>>>> I'll take a look. > > > > > >>>>>>>>>>>> > > > > > >>>>>>>>>>>> On Fri, Dec 18, 2015 at 4:37 AM, Valentin Kulichenko < > > > > > >>>>>>>>>>>> [hidden email]> wrote: > > > > > >>>>>>>>>>>> > > > > > >>>>>>>>>>>> Folks, > > > > > >>>>>>>>>>>>> > > > > > >>>>>>>>>>>>> It looks like CacheEntry implementation (i.e., the > > entry > > > > > >>>>>>>>>>>>> > > > > > >>>>>>>>>>>> that > > > > > >>>> > > > > > >>>>> contains > > > > > >>>>>>>>>>> > > > > > >>>>>>>>>>>> version) can't be properly serialized with the > > > > > >>>>>>>>>>>>> > > > > > >>>>>>>>>>>> BinaryMarshaller. > > > > > >>>>>> > > > > > >>>>>>> I > > > > > >>>>>>> > > > > > >>>>>>>> created > > > > > >>>>>>>>>>> > > > > > >>>>>>>>>>>> the test and the ticket: > > > > > >>>>>>>>>>>>> > > > > > >>>>>>>>>>>> https://issues.apache.org/jira/browse/IGNITE-2203 > > > > > >>>>>>>>>>> > > > > > >>>>>>>>>>>> Can someone take a look? > > > > > >>>>>>>>>>>>> > > > > > >>>>>>>>>>>>> -Val > > > > > >>>>>>>>>>>>> > > > > > >>>>>>>>>>>>> > > > > > >>>>>>>>>>> > > > > > >>>>>>>>>> > > > > > >>>>>>> > > > > > >>> > > > > > > > > > > > > > > > > > > > > > > > > -- > Sergey Kozlov > |
Are you talking about SQL queries? I thought that we agreed to use field aliases in case of name conflicts, no?
D. > On Dec 21, 2015, at 4:57 AM, Vladimir Ozerov <[hidden email]> wrote: > > Several additional problems appeared: > 1) We must use fully-qualified class names because simple class names might > also be the same. E.g: class package1.A extends package2.A > 2) Our query engine doesn't like dots and "$" from class names. Because of > this fields with these characters cannot be queried. > > To workaorund these problems I did the following: > 1) Fully qualified class names are used; > 2) "." is replaced with "_" and "$" is replaced with "__". E.g. field > org.my.package.MyClass$MyInnerClass.x is written as " > org_my_package_MyClass__MyInnerClass_x"; > 3) If user would like to query the field, he can call special helper > method BinaryUtils.qualifiedFieldName(Class > cls, String fieldName) returning correct field name. > > As this problem is not likely to occur in real quering scenarios, I think > this solution is more or less fine. > > Thoughts? > > > On Mon, Dec 21, 2015 at 12:28 PM, Sergey Kozlov <[hidden email]> > wrote: > >> From my standpoint Vova's appoach very close to SQL behavior if two joined >> tables have same column names. >> >> On Mon, Dec 21, 2015 at 12:23 PM, Dmitriy Setrakyan <[hidden email] >>> >> wrote: >> >>> Got it. I understand the reasoning now for not throwing an exception. >> Let’s >>> make sure we document this behavior. >>> >>> If there is no additional field-name-parsing overhead, then the proposed >>> API looks very nice. >>> >>> D. >>> >>> On Mon, Dec 21, 2015 at 1:19 AM, Vladimir Ozerov <[hidden email]> >>> wrote: >>> >>>> Dima, >>>> >>>> Here is how proposed design works: >>>> >>>> class A { >>>> int x = 1; >>>> } >>>> class B { >>>> int x = 2; >>>> } >>>> >>>> BinaryObject obj = ...; >>>> >>>> Object val = obj.field("A.x"); // Returns "1"; >>>> Object val = obj.field("B.x"); // Returns "2"; >>>> Object val = obj.field("x"); // Returns null; >>>> >>>> boolean exists = obj.hasField("A.x"); // Returns "true"; >>>> boolean exists = obj.hasField("B.x"); // Returns "true"; >>>> boolean exists = obj.hasField("x"); // Returns "false"; >>>> >>>> Looks clean and consistent for me. Remember that we are talking about >>> very >>>> specific use case. It is very unlikely that user will operate on >> objects >>>> conflicting fields in binary form. >>>> Also, there will be no parsing at all. We use field name passed by user >>>> directly. >>>> >>>> Vladimir. >>>> >>>> On Mon, Dec 21, 2015 at 12:10 PM, Dmitriy Setrakyan < >>> [hidden email] >>>>> >>>> wrote: >>>> >>>>> Vova, >>>>> >>>>> We cannot return null in case of a conflict, as user won’t be able to >>>>> differentiate between a conflict and missing field. We should throw >> an >>>>> exception. >>>>> >>>>> Also, I don’t like parsing field names looking for a dot for every >>> field. >>>>> It will introduce a performance overhead for the cases that do not >> have >>>>> conflicts. Instead, we should add another API for this use case, >>>> something >>>>> like field(typeName, fieldName). >>>>> >>>>> D. >>>>> >>>>> On Mon, Dec 21, 2015 at 1:01 AM, Vladimir Ozerov < >> [hidden email] >>>> >>>>> wrote: >>>>> >>>>>> Denis, >>>>>> Yes, as we do not know which field to pick, we return null. >>>>>> >>>>>> On Mon, Dec 21, 2015 at 12:00 PM, Denis Magda <[hidden email] >>> >>>>> wrote: >>>>>> >>>>>>> Sounds good for me. I would go for this approach. >>>>>>> >>>>>>> In addition if to consider your example below and the user >> decides >>> to >>>>>> look >>>>>>> up a field by its simple name then he/she will get nothing or >>>> exception >>>>>>> (depending on the API), correct? >>>>>>> As an example for this case the method will return null >>>>>>> >>>>>>> BinaryObject obj = ...; >>>>>>> Object val = obj.field("x"); // null will be returned cause we >>> don't >>>>> know >>>>>>> what particular 'x' we have to return >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Denis >>>>>>> >>>>>>>> On 12/21/2015 11:48 AM, Vladimir Ozerov wrote: >>>>>>>> >>>>>>>> Folks, >>>>>>>> >>>>>>>> I thought about the solution a bit more and came to the >> following >>>>>> design. >>>>>>>> >>>>>>>> *Scenario:* >>>>>>>> class A { >>>>>>>> int x; >>>>>>>> } >>>>>>>> class B : extends A { >>>>>>>> int y; >>>>>>>> } >>>>>>>> >>>>>>>> *Solution:* >>>>>>>> 1) Field A.x is written as *"A.x"*, field B.x is written as >>> *"B.x"*. >>>>>> I.e. >>>>>>>> *both >>>>>>>> conflicting fields are prefixed* with simple name of the owning >>>> class. >>>>>>>> 2) API is unchanged. User manipulates these fields on all public >>>>> methods >>>>>>>> in >>>>>>>> exactly the same way: "A.x" and "B.x". >>>>>>>> >>>>>>>> *Rationale:* >>>>>>>> 1) We cannot prefix only some of conflicting fields. E.g. if >>> decide >>>> to >>>>>>>> prefix only A.x, then it is not clear how to handle this case: >>>>>>>> >>>>>>>> class B extends A implements Binarylizable { >>>>>>>> void write(BinaryWriter writer) { >>>>>>>> writer.writeInt("B.x", x); // User intentionally >> written >>>>> field >>>>>> as >>>>>>>> "B.x". >>>>>>>> } >>>>>>>> } >>>>>>>> >>>>>>>> BinaryObject obj = ...; >>>>>>>> Object val = obj.field("B.x"); // Should we lookup for "B.x" as >>> user >>>>>> asked >>>>>>>> us, or just for "x"? >>>>>>>> >>>>>>>> Prefixing all conflicting fields with class name resolves the >>>> problem. >>>>>>>> >>>>>>>> 2) If we add methods to manipulate fields not only by name, but >> by >>>>>>>> (typeName + fieldName) as well, then we will have to add *9 new >>>>> methods* >>>>>>>> to >>>>>>>> >>>>>>>> API: >>>>>>>> BinaryType.fieldTypeName(String typeName, String fieldName); >>>>>>>> BinaryType.field(String typeName, String fieldName); >>>>>>>> BinaryObject.field(String typeName, String fieldName); >>>>>>>> BinaryObject.hasField(String typeName, String fieldName); >>>>>>>> BinaryObjectBuilder.getField(String typeName, String fieldName); >>>>>>>> BinaryObjectBuilder.setField(String typeName, String fieldName, >>>> ...); >>>>>> // 3 >>>>>>>> overloads >>>>>>>> BinaryObjectBuilder.removeField(String typeName, String >>> fieldName); >>>>>>>> >>>>>>>> This is definitely an overkill for such a rare scenario. >>>>>>>> >>>>>>>> Thoughts? >>>>>>>> >>>>>>>> On Mon, Dec 21, 2015 at 11:22 AM, Vladimir Ozerov < >>>>> [hidden email] >>>>>>> >>>>>>>> wrote: >>>>>>>> >>>>>>>> Agree, prefixing parent class fields sound like a better option. >>>>>>>>> Regarding aliases - I need some time to understand internal >>>>> mechanics. >>>>>>>>> Will answer this a bit later. >>>>>>>>> >>>>>>>>> On Mon, Dec 21, 2015 at 11:18 AM, Dmitriy Setrakyan < >>>>>>>>> [hidden email] >>>>>>>>> >>>>>>>>>> wrote: >>>>>>>>>> Vova, >>>>>>>>>> >>>>>>>>>> Shouldn’t it be the other way around? Class B writes the field >>> as >>>>> “a”, >>>>>>>>>> and >>>>>>>>>> class A writes it with a prefix (possibly the hash code of the >>>> class >>>>>>>>>> name). >>>>>>>>>> >>>>>>>>>> Also, we should clearly document how the SQL queries are >>> affected >>>> by >>>>>>>>>> this. >>>>>>>>>> AFAIK, we should be using field aliases here, no? >>>>>>>>>> >>>>>>>>>> D. >>>>>>>>>> >>>>>>>>>> On Sun, Dec 20, 2015 at 10:08 AM, Vladimir Ozerov < >>>>>> [hidden email] >>>>>>>>>>> >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> May be we can use normal field names by default and add some >>>>>>>>>>> >>>>>>>>>> prefix/suffix >>>>>>>>>> >>>>>>>>>>> if conflict is found? E.g.: >>>>>>>>>>> >>>>>>>>>>> class A { >>>>>>>>>>> int a; // Write as "a"; >>>>>>>>>>> } >>>>>>>>>>> >>>>>>>>>>> class B extends A { >>>>>>>>>>> int a; // Write as "B_a"; >>>>>>>>>>> } >>>>>>>>>>> >>>>>>>>>>> On Fri, Dec 18, 2015 at 10:34 PM, Valentin Kulichenko < >>>>>>>>>>> [hidden email]> wrote: >>>>>>>>>>> >>>>>>>>>>> Folks, >>>>>>>>>>>> >>>>>>>>>>>> It seems to me that issue here is not with 3rd-party >>> libraries. >>>> We >>>>>>>>>>>> >>>>>>>>>>> just >>>>>>>>>> >>>>>>>>>>> don't properly support class hierarchy in binary format. Any >>>> class >>>>>>>>>>>> >>>>>>>>>>> that >>>>>>>>>> >>>>>>>>>>> extends another class and has the field with the same name as >>>>> parent >>>>>>>>>>>> >>>>>>>>>>> has >>>>>>>>>> >>>>>>>>>>> will fail unless user provides custom serialization logic >> that >>>> will >>>>>>>>>>>> >>>>>>>>>>> handle >>>>>>>>>>> >>>>>>>>>>>> it. >>>>>>>>>>>> >>>>>>>>>>>> What if we prepend the field name with the simple class name >>> in >>>>> this >>>>>>>>>>>> >>>>>>>>>>> case? >>>>>>>>>>> >>>>>>>>>>>> Say, we have two classes: >>>>>>>>>>>> >>>>>>>>>>>> class A { >>>>>>>>>>>> private int id; >>>>>>>>>>>> } >>>>>>>>>>>> >>>>>>>>>>>> class B extends A { >>>>>>>>>>>> private int id; >>>>>>>>>>>> } >>>>>>>>>>>> >>>>>>>>>>>> In this case we will get two fields: "A.id" and "B.id". The >>> only >>>>>>>>>>>> >>>>>>>>>>> issue is >>>>>>>>>> >>>>>>>>>>> that if there are no name conflict, we should be able to >>> resolve >>>> by >>>>>>>>>>>> >>>>>>>>>>> both >>>>>>>>>> >>>>>>>>>>> names - with or without prepended type name. I.e., if A is >>>>>> serialized, >>>>>>>>>>>> >>>>>>>>>>> you >>>>>>>>>>> >>>>>>>>>>>> can get the field value by "id" or "A.id". This is similar >> to >>>> how >>>>> it >>>>>>>>>>>> >>>>>>>>>>> works >>>>>>>>>>> >>>>>>>>>>>> if you join two SQL tables with the same column names. >>>>>>>>>>>> >>>>>>>>>>>> Any thoughts on whether it's doable or not? >>>>>>>>>>>> >>>>>>>>>>>> -Val >>>>>>>>>>>> >>>>>>>>>>>> On Fri, Dec 18, 2015 at 10:05 AM, Andrey Kornev < >>>>>>>>>>>> >>>>>>>>>>> [hidden email]> >>>>>>>>>>> >>>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>> In this particular case, the class that fails is a >> non-static >>>>> inner >>>>>>>>>>>>> >>>>>>>>>>>> class >>>>>>>>>>> >>>>>>>>>>>> that extends another non-static inner class, so they both >> end >>> up >>>>>>>>>>>>> >>>>>>>>>>>> having >>>>>>>>>> >>>>>>>>>>> the >>>>>>>>>>>> >>>>>>>>>>>>> compiler-generated "this$0" field. >>>>>>>>>>>>> >>>>>>>>>>>>> Regards >>>>>>>>>>>>> Andrey >>>>>>>>>>>>> >>>>>>>>>>>>> Date: Fri, 18 Dec 2015 20:44:12 +0300 >>>>>>>>>>>>>> Subject: Re: CacheEntry serialization failure >>>>>>>>>>>>>> From: [hidden email] >>>>>>>>>>>>>> To: [hidden email] >>>>>>>>>>>>>> >>>>>>>>>>>>>> The most straightforward solution which comes to my mind - >>> *do >>>>> not >>>>>>>>>>>>>> >>>>>>>>>>>>> ever >>>>>>>>>>> >>>>>>>>>>>> use >>>>>>>>>>>>> >>>>>>>>>>>>>> BinaryMarshaller by default*. Always fallback to >>>>>>>>>>>>>> >>>>>>>>>>>>> OptimizedMarshaller >>>>>>>>>> >>>>>>>>>>> unless >>>>>>>>>>>>> >>>>>>>>>>>>>> user explicitly asked us to use binary format (e.g. >> through >>>>>>>>>>>>>> >>>>>>>>>>>>> package >>>>>>>>>> >>>>>>>>>>> wildcards). >>>>>>>>>>>>>> >>>>>>>>>>>>>> BTW, we already do this for Externalizable and >>>>>>>>>>>>>> >>>>>>>>>>>>> readObject/writeObject. >>>>>>>>>>> >>>>>>>>>>>> On Fri, Dec 18, 2015 at 8:41 PM, Vladimir Ozerov < >>>>>>>>>>>>>> >>>>>>>>>>>>> [hidden email] >>>>>>>>>>> >>>>>>>>>>>> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> Ah, I saw your problem with DirectedSpecifics. We need to >>>> think >>>>>>>>>>>>>>> >>>>>>>>>>>>>> about >>>>>>>>>>> >>>>>>>>>>>> how >>>>>>>>>>>>> >>>>>>>>>>>>>> to solve it. Here is the case: >>>>>>>>>>>>>>> 1) Class is Serilzable and cannot be changed; >>>>>>>>>>>>>>> 2) There are several duplicated field names; >>>>>>>>>>>>>>> => BinaryMarshaller cannot handle it. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Any thoughts? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Fri, Dec 18, 2015 at 8:34 PM, Vladimir Ozerov < >>>>>>>>>>>>>>> >>>>>>>>>>>>>> [hidden email] >>>>>>>>>>>> >>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I fixed the problem, it was a bug actually. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> By default classes which has some custom Java logic >> (e.g. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Externalizable, >>>>>>>>>>>>> >>>>>>>>>>>>>> or with writeObject/readObject methods) will be written >>> using >>>>>>>>>>>>>>>> OptimizedMarshaller, so similar field names is not a >>>> problem. >>>>>>>>>>>>>>>> If you want to serialize such class in binary format and >>>> have >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> duplicate >>>>>>>>>>>>> >>>>>>>>>>>>>> field names, you should provide your own BinarySerializer, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> which >>>>>>>>>> >>>>>>>>>>> will >>>>>>>>>>>> >>>>>>>>>>>>> write >>>>>>>>>>>>> >>>>>>>>>>>>>> these fields with different names. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Fri, Dec 18, 2015 at 8:07 PM, Andrey Kornev < >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> [hidden email]> >>>>>>>>>>>>> >>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> How am I supposed to handle this situation if the class >>>> comes >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> from >>>>>>>>>>> >>>>>>>>>>>> a >>>>>>>>>>>> >>>>>>>>>>>>> 3d >>>>>>>>>>>>> >>>>>>>>>>>>>> party I can't modify? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks >>>>>>>>>>>>>>>>> Andrey >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Date: Fri, 18 Dec 2015 09:12:22 +0300 >>>>>>>>>>>>>>>>>> Subject: Re: CacheEntry serialization failure >>>>>>>>>>>>>>>>>> From: [hidden email] >>>>>>>>>>>>>>>>>> To: [hidden email] >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I'll take a look. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On Fri, Dec 18, 2015 at 4:37 AM, Valentin Kulichenko < >>>>>>>>>>>>>>>>>> [hidden email]> wrote: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Folks, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> It looks like CacheEntry implementation (i.e., the >>> entry >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> that >>>>>>>>>> >>>>>>>>>>> contains >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> version) can't be properly serialized with the >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> BinaryMarshaller. >>>>>>>>>>>> >>>>>>>>>>>>> I >>>>>>>>>>>>> >>>>>>>>>>>>>> created >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> the test and the ticket: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/IGNITE-2203 >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Can someone take a look? >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> -Val >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> >> >> >> -- >> Sergey Kozlov >> |
Free forum by Nabble | Edit this page |