Custom Java serialization and BinaryMarshaller


Vladimir Ozerov
Folks,

Currently BinaryMarshaller works in a very non-trivial way:
1) If a class is Serializable or Binarylizable, it is written in binary
format and can be used without deserialization.
2) If a class implements Externalizable, it is written in binary format, but
without field metadata.
3) If a class has writeObject/readObject methods, it is written with
OptimizedMarshaller, also without field metadata.

Classes from items 2 and 3 must always be deserialized on the server side to
allow queries.
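
To make the three cases concrete, here is a minimal sketch of how a class
could be sorted into them using only JDK reflection. It is an illustration,
not the BinaryMarshaller code: the names SerializationStyle, Kind and classify
are made up for this example, and the Binarylizable check is left out so the
snippet runs without Ignite on the classpath.

```java
import java.io.Externalizable;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;

// Hypothetical helper, not an Ignite class.
class SerializationStyle {
    enum Kind { EXTERNALIZABLE, CUSTOM_WRITE_READ, PLAIN_SERIALIZABLE, NOT_SERIALIZABLE }

    // Plain Serializable class: no custom hooks (case 1 in the list above).
    static class Plain implements Serializable {
        int x;
    }

    // Externalizable class (case 2 in the list above).
    static class Ext implements Externalizable {
        public Ext() { }
        @Override public void writeExternal(ObjectOutput out) { }
        @Override public void readExternal(ObjectInput in) { }
    }

    static Kind classify(Class<?> cls) {
        if (Externalizable.class.isAssignableFrom(cls))
            return Kind.EXTERNALIZABLE;
        if (!Serializable.class.isAssignableFrom(cls))
            return Kind.NOT_SERIALIZABLE;
        // writeObject/readObject are private, so every class in the
        // hierarchy has to be inspected separately (case 3).
        for (Class<?> c = cls; c != null; c = c.getSuperclass()) {
            if (declares(c, "writeObject", ObjectOutputStream.class)
                || declares(c, "readObject", ObjectInputStream.class))
                return Kind.CUSTOM_WRITE_READ;
        }
        return Kind.PLAIN_SERIALIZABLE;
    }

    private static boolean declares(Class<?> c, String name, Class<?> arg) {
        try {
            c.getDeclaredMethod(name, arg);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(classify(Plain.class));
        System.out.println(classify(Ext.class));
        System.out.println(classify(ArrayList.class));
    }
}
```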

There was an idea to ignore Externalizable/writeObject/readObject and
always write object fields directly in binary format with metadata, letting
the user fall back to the default Java logic if needed.
I tried this approach today and it appears to be very unstable. Lots of
classes from the JDK and other libraries have custom serialization logic.
When we ignore it, we get weird exceptions (such as NPE) which we cannot
handle, and we cannot give the user any advice on what is going on.
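
The kind of failure described here can be reproduced with a small sketch.
The Tags class below is invented for the example: it keeps a transient index
that is rebuilt in readObject. The fieldCopy method is only a stand-in for
"write the fields directly and skip the custom logic"; it is not how
BinaryMarshaller is implemented.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class, not part of Ignite.
class FieldCopyPitfall {
    static class Tags implements Serializable {
        private static final long serialVersionUID = 1L;
        private List<String> names;
        private transient Set<String> index; // derived state, rebuilt in readObject

        Tags(List<String> names) {
            this.names = new ArrayList<>(names);
            this.index = new HashSet<>(names);
        }

        private Tags() { } // stand-in for instantiation that bypasses the real constructor

        boolean has(String name) { return index.contains(name); }

        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            index = new HashSet<>(names); // the custom logic a field-only copy skips
        }
    }

    // Standard Java serialization: readObject runs, so the index is restored.
    static Tags javaRoundTrip(Tags src) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(src);
            }
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Tags) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Copies non-static, non-transient fields only and never calls readObject.
    static Tags fieldCopy(Tags src) {
        try {
            Tags copy = new Tags();
            for (Field f : Tags.class.getDeclaredFields()) {
                int mod = f.getModifiers();
                if (Modifier.isStatic(mod) || Modifier.isTransient(mod))
                    continue;
                f.setAccessible(true);
                f.set(copy, f.get(src));
            }
            return copy;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Tags tags = new Tags(Arrays.asList("a", "b"));
        System.out.println(javaRoundTrip(tags).has("a"));
        try {
            fieldCopy(tags).has("a");
        } catch (NullPointerException e) {
            System.out.println("NPE: transient index was never rebuilt");
        }
    }
}
```

The exception surfaces far from the marshalling code, inside the user's own
method, which is why it is hard to attach a helpful message to it.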

I think we should resolve this problem as follows:
1) Both the Externalizable and writeObject/readObject cases are handled *by
default* in the same way - with OptimizedMarshaller. *I.e. they are
deserialized on the server by default.*
2) The user can optionally fall back to binary format if he thinks it is
safe. But he must do it explicitly via configuration.
3) When we meet such a class for the first time, a warning is printed to the
user. Something like: "Binary format cannot be applied to the [class]
because it implements Externalizable interface; instances will be
deserialized on server (to enable binary format please implement
Binarylizable interface, or enable [property] or define custom serializer)".
4) Only one exception class is possible on the server -
ClassNotFoundException. We will throw it with a sensible error message as
well.
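
Items 1-3 could be sketched roughly as follows. This is only an outline under
stated assumptions: MarshalPolicy, Format and the opt-in set are hypothetical
names standing in for the proposed [property], and the wording of the warning
for the writeObject/readObject case is my own guess modelled on the
Externalizable text above.

```java
import java.io.Externalizable;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical policy object, not an Ignite API.
class MarshalPolicy {
    enum Format { BINARY, OPTIMIZED }

    private final Set<String> binaryOptIn;              // classes the user explicitly marked as safe
    private final Set<String> warned = new HashSet<>(); // classes already reported
    final List<String> warnings = new ArrayList<>();    // kept so the result can be inspected

    MarshalPolicy(Set<String> binaryOptIn) {
        this.binaryOptIn = binaryOptIn;
    }

    synchronized Format formatFor(Class<?> cls) {
        String reason = customLogic(cls);
        // No custom serialization, or an explicit opt-in: use binary format.
        if (reason == null || binaryOptIn.contains(cls.getName()))
            return Format.BINARY;
        // Default for custom serialization: fall back and warn once per class.
        if (warned.add(cls.getName())) {
            String msg = "Binary format cannot be applied to " + cls.getName()
                + " because it " + reason + "; instances will be deserialized on server";
            warnings.add(msg);
            System.err.println(msg);
        }
        return Format.OPTIMIZED;
    }

    // Returns a description of the custom logic, or null if there is none.
    static String customLogic(Class<?> cls) {
        if (Externalizable.class.isAssignableFrom(cls))
            return "implements Externalizable interface";
        for (Class<?> c = cls; c != null; c = c.getSuperclass()) {
            if (declares(c, "writeObject", ObjectOutputStream.class)
                || declares(c, "readObject", ObjectInputStream.class))
                return "defines writeObject/readObject methods";
        }
        return null;
    }

    private static boolean declares(Class<?> c, String name, Class<?> arg) {
        try {
            c.getDeclaredMethod(name, arg);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        MarshalPolicy byDefault = new MarshalPolicy(Collections.emptySet());
        System.out.println(byDefault.formatFor(ArrayList.class));
        MarshalPolicy optedIn = new MarshalPolicy(Collections.singleton("java.util.ArrayList"));
        System.out.println(optedIn.formatFor(ArrayList.class));
    }
}
```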

Thoughts?

Vladimir.

Re: Custom Java serialization and BinaryMarshaller

dsetrakyan
Vladimir,

I am not sure I like the approach you are suggesting. I am thinking that by
“unstable” classes you are referring to classes like HashMap or ArrayList,
in which case providing field metadata for them does not make sense, and
neither does deserializing them on the server side.

Let’s dig a bit deeper. Can you provide a list of the “unstable” classes?

D.

(In reply to Vladimir Ozerov's message of Mon, Dec 14, 2015 at 1:30 PM.)

Re: Custom Java serialization and BinaryMarshaller

Vladimir Ozerov
Dima,

I do not think it is possible to get a list of such classes, because this can
be any class from any library. Can you explain what your concerns are?

(In reply to Dmitriy Setrakyan's message of Tue, Dec 15, 2015 at 1:46 AM.)

Re: Custom Java serialization and BinaryMarshaller

dsetrakyan
Vova,

I think I am beginning to see your point. However, my concern is that users
may wish to ignore Externalizable and use the Binary protocol. Is it
possible to have a configuration flag on a per-cache basis?

Also, why delegate to OptimizedMarshaller for Externalizable classes? Can’t
we have this logic in the Binary marshaller?

D.

(In reply to Vladimir Ozerov's message of Mon, Dec 14, 2015 at 9:58 PM.)

Re: Custom Java serialization and BinaryMarshaller

Vladimir Ozerov
Dima,

As a class can "migrate" between caches, we should have a single write
logic across all caches and nodes. For this reason it is better to place
this flag in BinaryTypeConfiguration.
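
A toy model of why the flag is keyed by type rather than by cache, assuming
nothing about the real BinaryTypeConfiguration API: TypeWriteRegistry and its
methods are invented for this sketch, and the type names are placeholders.
The cache name is accepted by the lookup but deliberately ignored, so the
same class is always written the same way wherever it ends up.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical node-wide registry, not an Ignite class.
class TypeWriteRegistry {
    enum Format { BINARY, OPTIMIZED }

    // Types with custom serialization that the user opted in to binary format.
    private final Set<String> binaryTypes = new HashSet<>();

    void optInToBinary(String typeName) {
        binaryTypes.add(typeName);
    }

    // The decision depends on the type only; cacheName plays no role.
    Format formatFor(String cacheName, String typeName) {
        return binaryTypes.contains(typeName) ? Format.BINARY : Format.OPTIMIZED;
    }

    public static void main(String[] args) {
        TypeWriteRegistry registry = new TypeWriteRegistry();
        registry.optInToBinary("com.example.Order");
        // An Order written through one cache and read through another
        // resolves to the same format on both sides.
        System.out.println(registry.formatFor("cacheA", "com.example.Order"));
        System.out.println(registry.formatFor("cacheB", "com.example.Order"));
    }
}
```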

Regarding OptimizedMarshaller - the sole purpose is consistency. If a class
cannot be written with BinaryMarshaller, it is better to have a single
fallback scenario. But currently there are two such scenarios - one for
Externalizable (written as binary in raw mode) and another for
writeObject/readObject (written with OptimizedMarshaller). In the future we
will probably always write things using BinaryMarshaller, but it will require
some additional work to support the Object[Output/Input]Stream classes.



(In reply to Dmitriy Setrakyan's message of Tue, Dec 15, 2015 at 9:17 AM.)
