Igniters,
I have drafted an IEP on thin client serialization format [1], please review and let me know what you think. [1] https://cwiki.apache.org/confluence/display/IGNITE/IEP-75+Thin+Client+Serialization |
Hi, Pavel. Have you considered a format with a schema? Or was being schemaless a prerequisite for a candidate format?

As for me, msgpack is great, but I suppose that we should benchmark the formats thoroughly. And not only for Java.

In reply to Pavel Tupitsyn <[hidden email]>, Thu, Jun 17, 2021 at 15:29.

-- Sincerely yours, Ivan Daschinskiy |
Could you please share your code for benchmarks?
In reply to Ivan Daschinsky <[hidden email]>, Thu, Jun 17, 2021 at 15:56.

-- Sincerely yours, Ivan Daschinskiy |
Ivan,

> Have you considered format with schema?

1. We should be able to serialize arbitrary user data on the client side. I think we don't want to require extra steps from the user.
2. MsgPack can also be used in a schemaful way, where user objects are written as arrays rather than maps, so that field names are not included.

> we should benchmark formats thoroughly

Strictly speaking, the IEP is about the format (the spec), not about the implementation. The format itself is simple and efficient; there is nothing in it to make it slower than anything else. The C# implementation proves this by beating every competitor [1].

> Could you please share your code for benchmarks?

The code is linked in the IEP [2].

[1] https://aloiskraus.wordpress.com/2019/09/29/net-serialization-benchmark-2019-roundup/
[2] https://github.com/apache/ignite/pull/9178

In reply to Ivan Daschinsky <[hidden email]>, Thu, Jun 17, 2021 at 4:02 PM. |
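To make the "arrays, not maps" point concrete, here is a minimal sketch that hand-encodes a tiny subset of MessagePack (nil, booleans, integers 0..127, short strings, small lists and dicts) using the type tags from the public MessagePack spec. The `pack` helper and the sample `person` object are illustrative assumptions, not code from the IEP or from any MsgPack library.

```python
def pack(value):
    """Encode a small subset of MessagePack (illustrative helper, not a library API)."""
    if value is None:
        return b"\xc0"                                  # nil
    if value is True:
        return b"\xc3"                                  # true
    if value is False:
        return b"\xc2"                                  # false
    if isinstance(value, int) and 0 <= value <= 0x7F:
        return bytes([value])                           # positive fixint
    if isinstance(value, str):
        raw = value.encode("utf-8")
        if len(raw) <= 31:
            return bytes([0xA0 | len(raw)]) + raw       # fixstr
    if isinstance(value, list) and len(value) <= 15:
        return bytes([0x90 | len(value)]) + b"".join(pack(v) for v in value)  # fixarray
    if isinstance(value, dict) and len(value) <= 15:
        out = bytes([0x80 | len(value)])                # fixmap
        for key, val in value.items():
            out += pack(key) + pack(val)
        return out
    raise TypeError(f"value not covered by this sketch: {value!r}")


# Hypothetical user object.
person = {"id": 7, "name": "Ann", "age": 30}

# Schemaless: a map, field names travel with every object.
as_map = pack(person)

# "Schemaful": an array, field order is agreed out of band.
FIELD_ORDER = ["id", "name", "age"]
as_array = pack([person[f] for f in FIELD_ORDER])

print(len(as_map), len(as_array))
```

The array form is smaller because the field names are gone, but the reader must already know `FIELD_ORDER`; that shared knowledge is the "schema" in this usage.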
Also, msgpack has a well-known use case in the world of in-memory grids: tarantool.io uses msgpack for its client binary protocol [1]. So writing connectors for Tarantool is quite an easy task.

[1] https://www.tarantool.io/en/doc/latest/dev_guide/internals/box_protocol/

In reply to Pavel Tupitsyn <[hidden email]>, Thu, Jun 17, 2021 at 16:15.

-- Sincerely yours, Ivan Daschinskiy |
Hi Pavel,

What you suggest looks promising, in particular the arbitrary object graph and platform independence aspects.

In IEP-54 we support only flat objects and only some standard types, and we assume that inner objects of custom types will be serialized to byte[] somehow and that their schema will not be managed by Ignite. We also want to offer some default serializers to make things easier for the end user.

AFAIU, MsgPack is suitable for this purpose, as we don't want to invent yet another efficient binary format. With an additional code harness, we can write the object schema (field names) within the object itself. Is that right?

I am just confused by "a schemaful way" and "field names are not included" appearing in the same sentence.

In reply to Ivan Daschinsky <[hidden email]>, Thu, Jun 17, 2021 at 4:46 PM.

-- Best regards, Andrey V. Mashenkov |
Andrey, I'm sorry, but what do you mean by an additional code harness? Usually, a POJO is serialized simply as a map.

In reply to Andrey Mashenkov <[hidden email]>, Thu, Jun 17, 2021, 19:55. |
Ivan,

OK, I just thought that if "fields are not included", then we would need to take care of them ourselves.

In reply to Ivan Daschinsky <[hidden email]>, Thu, Jun 17, 2021, 20:10. |
Andrey,

>> arbitrary object graph

Also, that is not true: the msgpack format doesn't handle circular graphs. Think of msgpack as binary JSON. You cannot know the full structure of a message without deserializing it completely first; maps and arrays are serialized just as contiguous chunks of values/kv-pairs. Msgpack is a really dumb and simple format.

Also, as for me, I cannot understand why the current Ignite serialization (BinaryObjectBuilder or Binarylizable) is slower than the raw MessagePack serializer. I suppose that this is an issue and we should investigate it.

Pavel, why do you use PooledMessageBufferOutput in the benchmarks? I'm sorry, but is it fair to use it?

>> The code is linked in the IEP [2]

Double-checked: there are no links to the PR in either the IEP or the Jira issue. |
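The circular-graph limitation can be illustrated with a small sketch. The MessagePack spec defines no reference or back-pointer type, so a plain recursive encoder either duplicates shared objects or never terminates on a cycle. The `pack` helper below is a hypothetical encoder for small integers and short lists only; the cycle check is an addition of this sketch, not something the format provides.

```python
def pack(value, _path=None):
    """Encode ints 0..127 and lists of up to 15 items (illustrative subset only)."""
    path = set() if _path is None else _path
    if isinstance(value, int) and not isinstance(value, bool) and 0 <= value <= 0x7F:
        return bytes([value])                           # positive fixint
    if isinstance(value, list) and len(value) <= 15:
        if id(value) in path:
            # The format has no way to say "same object as before".
            raise ValueError("cycle detected: MessagePack has no reference type")
        path.add(id(value))
        try:
            body = b"".join(pack(item, path) for item in value)
        finally:
            path.discard(id(value))
        return bytes([0x90 | len(value)]) + body        # fixarray
    raise TypeError(f"value not covered by this sketch: {value!r}")


# A shared (non-circular) reference is simply written twice.
shared = [1, 2]
duplicated = pack([shared, shared])

# A circular reference cannot be expressed at all.
cyclic = [1]
cyclic.append(cyclic)
try:
    pack(cyclic)
    cycle_error = None
except ValueError as exc:
    cycle_error = str(exc)

print(duplicated.hex(), cycle_error)
```

Any support for object identity or cycles would therefore have to be layered on top of the format, for example through an application-defined convention.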
>> Double checked -- there is not any links to PR either in IEP or in jira issue

Sorry, there is a link in the IEP, but not in the Jira ticket.

In reply to Ivan Daschinsky <[hidden email]>, Thu, Jun 17, 2021 at 21:39.

-- Sincerely yours, Ivan Daschinskiy |
Ivan, thanks for the clarification.

The Binarylizable interface forces the user to write serialization code. We can support this or a similar interface. But I'd like Ignite to have some default serializer in addition. It can also be useful, e.g., in compute for parameter and result serialization.

BinaryObjectBuilder requires an Ignite node for object construction, but we are looking for a detached builder and won't care about schemas.

AFAIR, BinaryObject creates an objectReader on every single field read operation. So the BO solution produces a lot of garbage, and BO has a noticeable overhead that affects the object footprint.

In reply to Ivan Daschinsky <[hidden email]>, Thu, Jun 17, 2021, 21:41. |
Andrey, here we are discussing the serialization format, as far as I understand. The current implementation of Ignite binary object serialization can be rewritten. If we do not care about fast (O(1)) field lookup, schema validation and so on, msgpack is a really good option. It is also good for a client binary protocol, e.g. Tarantool uses it.

>> Binarilizable interface forces user to write serialization code

I am talking about a speed comparison. You can see from Pavel's data that jackson-msgpack shows pathetic performance compared with Ignite's default binary marshaller. If you want really fast serialization, the only option is to write the code yourself or to use code generation. The default packer from the msgpack-core Java package is similar to BinaryWriter. So I am wondering why the packer from msgpack-core shows better performance than BinaryWriter. And I suppose that the benchmark is not quite fair.

In reply to Andrey Mashenkov <[hidden email]>, Thu, Jun 17, 2021 at 22:19.

-- Sincerely yours, Ivan Daschinskiy |
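The remark about O(1) field lookup can be sketched as follows. MessagePack containers carry an element count but no offset table, so finding element `n` means walking over every value before it. The `skip` and `element_offset` helpers are hypothetical names, and they cover only a subset of the type tags (fixint, nil, booleans, fixstr, fixarray, fixmap).

```python
def skip(buf, pos):
    """Return the offset just past the value that starts at pos (subset of tags)."""
    tag = buf[pos]
    if tag <= 0x7F or tag in (0xC0, 0xC2, 0xC3):   # positive fixint, nil, false, true
        return pos + 1
    if 0xA0 <= tag <= 0xBF:                        # fixstr: length in the low 5 bits
        return pos + 1 + (tag & 0x1F)
    if 0x90 <= tag <= 0x9F:                        # fixarray: skip each element in turn
        pos += 1
        for _ in range(tag & 0x0F):
            pos = skip(buf, pos)
        return pos
    if 0x80 <= tag <= 0x8F:                        # fixmap: skip each key and each value
        pos += 1
        for _ in range(2 * (tag & 0x0F)):
            pos = skip(buf, pos)
        return pos
    raise ValueError(f"tag 0x{tag:02x} not covered by this sketch")


def element_offset(buf, index):
    """Offset of element `index` in a top-level fixarray: O(index) skips, not O(1)."""
    tag = buf[0]
    if not 0x90 <= tag <= 0x9F:
        raise ValueError("expected a fixarray at offset 0")
    if index >= (tag & 0x0F):
        raise IndexError(index)
    pos = 1
    for _ in range(index):
        pos = skip(buf, pos)
    return pos


# [7, "Ann", 30] encoded as a fixarray.
record = b"\x93\x07\xa3Ann\x1e"
offsets = [element_offset(record, i) for i in range(3)]
print(offsets)
```

A format that needs constant-time field access would have to add its own offset table or footer on top of the encoded values.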
Ivan,

> why do you use PooledMessageBufferOutput in benchmarks?

To make it fair. Ignite uses thread-local reusable buffers, see [1].

> why packer from msgpack-core show better performance than
> BinaryWriter. And I suppose that benchmark is not quite fair.

MsgPack writes and reads fewer bytes, so it should be faster. The benchmark is not 100% fair; there are some small extra things that BinaryWriter does.

However:
1. I don't think we care about super-precise benchmarks here, just the ballpark.
2. We are discussing the format, not the implementation.

The important takeaway is: the format does not prevent someone from implementing it efficiently.

[1] https://github.com/apache/ignite/blob/master/modules/core/src/main/java/org/apache/ignite/internal/binary/BinaryWriterExImpl.java#L101

In reply to Ivan Daschinsky <[hidden email]>, Thu, Jun 17, 2021 at 10:40 PM. |
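One source of the "fewer bytes" effect is MessagePack's variable-width integer encoding, sketched below for unsigned values using the tags from the public spec. The fixed-width comparison (`pack_fixed`, a 1-byte type code plus a 4-byte int) is a hypothetical layout chosen for illustration only; it is not a statement about Ignite's actual wire format or about the benchmark in the IEP.

```python
import struct


def pack_uint(n):
    """Encode a non-negative integer with the smallest MessagePack representation."""
    if n < 0:
        raise ValueError("this sketch covers unsigned values only")
    if n <= 0x7F:
        return bytes([n])                        # positive fixint, 1 byte
    if n <= 0xFF:
        return b"\xcc" + struct.pack(">B", n)    # uint 8, 2 bytes
    if n <= 0xFFFF:
        return b"\xcd" + struct.pack(">H", n)    # uint 16, 3 bytes
    if n <= 0xFFFFFFFF:
        return b"\xce" + struct.pack(">I", n)    # uint 32, 5 bytes
    if n <= 0xFFFFFFFFFFFFFFFF:
        return b"\xcf" + struct.pack(">Q", n)    # uint 64, 9 bytes
    raise OverflowError(n)


def pack_fixed(n):
    """Hypothetical fixed-width layout: 1-byte type code (3 is arbitrary) + 4-byte int."""
    return struct.pack(">Bi", 3, n)


values = [5, 200, 40000, 1_000_000]
msgpack_sizes = [len(pack_uint(v)) for v in values]
fixed_sizes = [len(pack_fixed(v)) for v in values]
print(msgpack_sizes, sum(msgpack_sizes), sum(fixed_sizes))
```

How much this matters in practice depends on the data: payloads dominated by small integers and short strings shrink the most, while large values or long strings gain little.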
>> To make it fair. Ignite uses thread-local reusable buffers, see [1].

I know, but PooledMessageBufferOutput is not about thread-locals, is it?

I'm not against MsgPack; I'm for a fair and unbiased comparison. I suppose that MsgPack is an ideal candidate for the thin client binary protocol, not only for serializing user data. I have analyzed some Tarantool connectors [1] [2] [3] and found that a msgpack-based protocol is a really good idea. There is also a really fast, single-header C library with a liberal BSD-2 licence: msgpuck [4]. It is used in Tarantool itself and in [1], and it is stable and bulletproof.

[1] https://github.com/igorcoding/asynctnt
[2] https://github.com/tarantool/tarantool-python/
[3] https://github.com/tarantool/go-tarantool
[4] https://github.com/rtsisyk/msgpuck

In reply to Pavel Tupitsyn <[hidden email]>, Fri, Jun 18, 2021 at 11:44.

-- Sincerely yours, Ivan Daschinskiy |