Dear sirs!
I want to resolve issue IGNITE-13: https://issues.apache.org/jira/browse/IGNITE-13

Is it still relevant?

Vadim Opolski
Hi Vadim,
I don't think it makes much sense to invest in OptimizedMarshaller. However, I would check whether this optimization is applicable to BinaryMarshaller and, if so, implement it.

-Val

On Thu, Feb 9, 2017 at 11:05 PM, Вадим Опольский <[hidden email]> wrote:
> I want to resolve issue IGNITE-13 -
> https://issues.apache.org/jira/browse/IGNITE-13
On Thu, Feb 9, 2017 at 11:26 PM, Valentin Kulichenko <[hidden email]> wrote:
> I don't think it makes much sense to invest into OptimizedMarshaller.
> However, I would check if this optimization is applicable to
> BinaryMarshaller, and if yes, implement it.

Val, in this case can you please update the ticket?
Vladimir,
Can you please take a look and share your thoughts? Can this be applied to the binary marshaller? From what I recall, it serializes strings a bit differently from the optimized marshaller, so I'm not sure.

-Val

On Fri, Feb 10, 2017 at 5:16 PM, Dmitriy Setrakyan <[hidden email]> wrote:
> Val, in this case can you please update the ticket?
Hi,
It is hard to say whether it makes sense or not. No doubt, it could speed up the marshalling process at the cost of 2x the memory required for strings. From my previous experience with marshalling micro-optimizations, we will hardly ever notice a speedup in a distributed environment.

But there is another side: it could speed up our queries, because we would not have to unmarshal a string on every field access. So I would try to make this optimization optional and then measure query performance with classes that have lots of strings. It could give us interesting results.

On Mon, Feb 13, 2017 at 5:37 AM, Valentin Kulichenko <[hidden email]> wrote:
> Can you please take a look and provide your thoughts? Can this be applied
> to binary marshaller?
Vladimir,
I think we misunderstood each other. My understanding of this optimization is the following.

Currently, string serialization is done in two steps (see BinaryWriterExImpl#doWriteString):

    strArr = BinaryUtils.strToUtf8Bytes(val); // Encode string into byte array.
    out.writeByteArray(strArr);               // Write byte array into stream.

What this ticket suggests is to write directly into the stream while the string is being encoded, without an intermediate array. This both reduces memory consumption and eliminates the array copy step.

I updated the ticket and added this explanation there.

Vadim, can you create a micro benchmark and check whether it gives any improvement?

-Val

On Sun, Feb 12, 2017 at 10:38 PM, Vladimir Ozerov <[hidden email]> wrote:
> It is hard to say whether it makes sense or not. No doubt, it could speed
> up marshalling process at the cost of 2x memory required for strings.
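The difference between the two approaches can be sketched in plain Java. This is only an illustration under stated assumptions: java.io.ByteArrayOutputStream stands in for Ignite's binary output stream, String.getBytes stands in for BinaryUtils.strToUtf8Bytes, and the class and method names are hypothetical. If the real format writes a length prefix before the string bytes, a direct writer would also need a counting pass or a back-patched length, which this sketch leaves out.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Illustrative only: ByteArrayOutputStream stands in for Ignite's binary output stream.
final class Utf8WriteSketch {
    /** Two-step approach: encode into a temporary array, then copy it into the stream. */
    static void writeViaArray(String val, ByteArrayOutputStream out) {
        byte[] arr = val.getBytes(StandardCharsets.UTF_8); // intermediate array
        out.write(arr, 0, arr.length);                     // extra copy into the stream
    }

    /** Direct approach: emit UTF-8 bytes into the stream while walking the string. */
    static void writeDirect(String val, ByteArrayOutputStream out) {
        for (int i = 0; i < val.length(); ) {
            int cp = val.codePointAt(i);
            i += Character.charCount(cp);

            // An unpaired surrogate cannot be encoded; String.getBytes substitutes '?'.
            if (cp >= Character.MIN_SURROGATE && cp <= Character.MAX_SURROGATE)
                cp = '?';

            if (cp < 0x80) {                       // 1 byte
                out.write(cp);
            } else if (cp < 0x800) {               // 2 bytes
                out.write(0xC0 | (cp >> 6));
                out.write(0x80 | (cp & 0x3F));
            } else if (cp < 0x10000) {             // 3 bytes
                out.write(0xE0 | (cp >> 12));
                out.write(0x80 | ((cp >> 6) & 0x3F));
                out.write(0x80 | (cp & 0x3F));
            } else {                               // 4 bytes
                out.write(0xF0 | (cp >> 18));
                out.write(0x80 | ((cp >> 12) & 0x3F));
                out.write(0x80 | ((cp >> 6) & 0x3F));
                out.write(0x80 | (cp & 0x3F));
            }
        }
    }
}
```

Both methods should leave the same bytes in the stream; only the direct one avoids allocating and copying the temporary array.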
Hello everybody!
Valentin, yes, I can create a micro benchmark and check if it gives any improvement.

Vadim

2017-02-15 0:52 GMT+03:00 Valentin Kulichenko <[hidden email]>:
> Vadim, can you create a micro benchmark and check if it gives any
> improvement?
In reply to this post by Valentin Kulichenko
Hello everybody!
https://issues.apache.org/jira/browse/IGNITE-13

Valentin, I have just finished a benchmark (with JMH): https://github.com/javaller/MyBenchmark.git

It collects serialization timing data. For example: https://github.com/javaller/MyBenchmark/blob/master/out200217.txt

To run it, do the following:

1) Clone it: git clone https://github.com/javaller/MyBenchmark.git

2) Install it: mvn install

3) Run the benchmarks: java -Xms1024m -Xmx4096m -jar target\benchmarks.jar

Vadim Opolski

2017-02-15 0:52 GMT+03:00 Valentin Kulichenko <[hidden email]>:
> Vadim, can you create a micro benchmark and check if it gives any
> improvement?
Hi Vadim,
I'm not sure I understand your benchmarks and how they verify the optimization discussed here. Basically, here is what needs to be done:

1. Create a benchmark for the BinaryWriterExImpl#doWriteString method.
2. Run the benchmark with the current implementation.
3. Make the change described in the ticket.
4. Run the benchmark with these changes.
5. Compare the results.

Does that make sense? Let me know if anything is unclear.

-Val

On Mon, Feb 20, 2017 at 8:51 AM, Вадим Опольский <[hidden email]> wrote:
> Valentin, I just have finished benchmark (with JMH) -
> https://github.com/javaller/MyBenchmark.git
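The before/after procedure in the steps above can be sketched as a small plain-Java harness. Everything here is an assumption made for illustration: the class name is hypothetical, before() and after() are stand-in encoders rather than Ignite code, and the nanoTime loop is far cruder than JMH, which remains the right tool for the real measurement. The sketch only shows the shape of the comparison: first confirm that both variants produce identical bytes, then time each of them on the same inputs.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.function.Function;

// Hypothetical harness: names and encoders are stand-ins, not Ignite code.
final class BeforeAfterHarness {
    static volatile long sink; // consumes results so the JIT cannot drop the work

    /** "Before" variant: the JDK's own UTF-8 encoder. */
    static byte[] before(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** "After" variant: placeholder for the changed implementation. */
    static byte[] after(String s) {
        ByteBuffer buf = StandardCharsets.UTF_8.encode(s);
        byte[] res = new byte[buf.remaining()];
        buf.get(res);
        return res;
    }

    /** Both variants must produce identical bytes before timings are worth comparing. */
    static boolean sameOutput(Function<String, byte[]> a, Function<String, byte[]> b, String[] inputs) {
        for (String s : inputs)
            if (!Arrays.equals(a.apply(s), b.apply(s)))
                return false;
        return true;
    }

    /** Encodes every input 'rounds' times and returns the total number of bytes produced. */
    static long run(Function<String, byte[]> enc, String[] inputs, int rounds) {
        long bytes = 0;
        for (int r = 0; r < rounds; r++)
            for (String s : inputs)
                bytes += enc.apply(s).length;
        return bytes;
    }

    /** Crude average time per call in nanoseconds, after one untimed warm-up pass. */
    static double avgNanos(Function<String, byte[]> enc, String[] inputs, int rounds) {
        sink = run(enc, inputs, rounds);   // untimed warm-up pass
        long start = System.nanoTime();
        sink = run(enc, inputs, rounds);   // timed pass
        long elapsed = System.nanoTime() - start;
        return (double) elapsed / ((long) rounds * inputs.length);
    }

    public static void main(String[] args) {
        String[] inputs = {"ignite", "binary marshaller", "\u0441\u0442\u0440\u043e\u043a\u0430"};
        if (!sameOutput(BeforeAfterHarness::before, BeforeAfterHarness::after, inputs))
            throw new IllegalStateException("variants disagree; timings would be meaningless");
        System.out.printf("before: %.1f ns/op%n", avgNanos(BeforeAfterHarness::before, inputs, 100_000));
        System.out.printf("after:  %.1f ns/op%n", avgNanos(BeforeAfterHarness::after, inputs, 100_000));
    }
}
```

Checking equivalence first matters because a faster variant that writes different bytes would change the serialized format rather than optimize it.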
Hi Valentin!
I compared the speed of different ways of getting bytes from a string and pushing them to an output stream. The third method is the fastest.

OK, I'm creating a benchmark for the BinaryWriterExImpl#doWriteString method and making the change described in the ticket.

https://issues.apache.org/jira/browse/IGNITE-13

Vadim

2017-02-21 1:06 GMT+03:00 Valentin Kulichenko <[hidden email]>:
> 1. Create a benchmark for BinaryWriterExImpl#doWriteString method.
In reply to this post by Valentin Kulichenko
Hi Valentin!
https://issues.apache.org/jira/browse/IGNITE-13

I created BinaryWriterExImplNew (it extends BinaryWriterExImpl) and added new methods with the changes described in the ticket:

https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/BinaryWriterExImplNew.java

I created a benchmark for BinaryWriterExImplNew:

https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/ExampleTest.java

I ran the benchmark and compared the results:

https://github.com/javaller/MyBenchmark/blob/master/totalstat.txt

# Run complete. Total time: 00:10:24

Benchmark                                    Mode  Cnt        Score       Error  Units
ExampleTest.binaryHeapOutputStream1          avgt   50  1114999,207 ± 16756,776  ns/op
ExampleTest.binaryHeapOutputStream2          avgt   50  1118149,320 ± 17515,961  ns/op
ExampleTest.binaryHeapOutputStream3          avgt   50  1113678,657 ± 17652,314  ns/op
ExampleTest.binaryHeapOutputStream4          avgt   50  1112415,051 ± 18273,874  ns/op
ExampleTest.binaryHeapOutputStream5          avgt   50  1111366,583 ± 18282,829  ns/op
ExampleTest.binaryHeapOutputStreamACSII      avgt   50  1112079,667 ± 16659,532  ns/op
ExampleTest.binaryHeapOutputStreamUTFCustom  avgt   50  1114949,759 ± 16809,669  ns/op
ExampleTest.binaryHeapOutputStreamUTFNIO     avgt   50  1121462,325 ± 19836,466  ns/op

Is it OK? What's the next step? Do I have to move this JMH benchmark into the Ignite project?

Vadim Opolski

2017-02-21 1:06 GMT+03:00 Valentin Kulichenko <[hidden email]>:
> I'm not sure I understand your benchmarks and how they verify the
> optimization discussed here.
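One way to read the numbers above is to check whether the reported intervals overlap. The sketch below is an illustration, not part of the original benchmark: the class name is hypothetical, the scores and errors are copied from the output above with decimal commas rewritten as dots, and the Error column is treated as the half-width of the confidence interval that JMH reports around each Score. Overlapping intervals do not prove that the variants are equally fast, but they do mean this run shows no clear winner.

```java
// Hypothetical helper for reading the benchmark output; not part of the original benchmark.
final class IntervalOverlap {
    /** Scores (ns/op) copied from the benchmark output above, in row order. */
    static final double[] SCORE = {1114999.207, 1118149.320, 1113678.657, 1112415.051,
                                   1111366.583, 1112079.667, 1114949.759, 1121462.325};

    /** Errors (ns/op) copied from the same rows. */
    static final double[] ERROR = {16756.776, 17515.961, 17652.314, 18273.874,
                                   18282.829, 16659.532, 16809.669, 19836.466};

    /** True when every pair of [score - error, score + error] intervals overlaps. */
    static boolean allOverlap(double[] score, double[] error) {
        double maxLower = Double.NEGATIVE_INFINITY;
        double minUpper = Double.POSITIVE_INFINITY;
        for (int i = 0; i < score.length; i++) {
            maxLower = Math.max(maxLower, score[i] - error[i]);
            minUpper = Math.min(minUpper, score[i] + error[i]);
        }
        // Intervals on a line overlap pairwise exactly when they share a common point.
        return maxLower <= minUpper;
    }

    /** Gap between the slowest and fastest score, as a percentage of the fastest. */
    static double spreadPercent(double[] score) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double s : score) {
            min = Math.min(min, s);
            max = Math.max(max, s);
        }
        return 100.0 * (max - min) / min;
    }

    public static void main(String[] args) {
        System.out.println("all intervals overlap: " + allOverlap(SCORE, ERROR));
        System.out.printf("spread between slowest and fastest: %.2f%%%n", spreadPercent(SCORE));
    }
}
```

For these eight rows every interval overlaps every other, and the slowest score is less than 1% above the fastest, which is smaller than any single row's error.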
Hi Vadim,
Thanks, I will review this week.

-Val

On Wed, Feb 22, 2017 at 2:28 AM, Вадим Опольский <[hidden email]> wrote:
> Is it OK? Whats the next step? Do I have to move this JMH benchmark to the
> Ignite project ?
Hi Vadim,
Which method implements the approach described in the ticket? From what I see, all writeToStringX versions still encode into an intermediate array and then call out.writeByteArray. What we need to test is the approach where bytes are written directly into the stream during encoding. The encoding algorithm itself should stay the same for now, otherwise we will not know how to interpret the result.

It looks like there is some misunderstanding here, so please let me know if anything is still unclear. I will be happy to answer your questions.

-Val |
Hi Vadim,
Looks like you accidentally removed the dev list from the thread, adding it back.

I think there is still a misunderstanding. What I propose is to modify BinaryUtils#strToUtf8Bytes so that it writes directly to BinaryOutputStream instead of an intermediate array. This should decrease memory consumption and can also increase performance, as we will avoid the 'writeByteArray' step at the end.

Does it make sense to you?

-Val

On Mon, Feb 27, 2017 at 6:55 AM, Вадим Опольский <[hidden email]> wrote:

> Hi, Valentin!
>
> What do you think about using the methods of BinaryOutputStream:
>
> 1) writeByteArray(byte[] val)
> 2) writeCharArray(char[] val)
> 3) write(byte[] arr, int off, int len)
>
> String val = "Test";
> out.writeByteArray(val.getBytes(UTF_8));
>
> String val = "Test";
> out.writeCharArray(val.toCharArray());
>
> String val = "Test";
> InputStream stream = new ByteArrayInputStream(val.getBytes(StandardCharsets.UTF_8));
> byte[] buffer = new byte[1024];
> int len;
> // Copy the encoded bytes chunk by chunk until the stream is exhausted.
> while ((len = stream.read(buffer)) != -1) {
>     out.write(buffer, 0, len);
> }
>
> What else can we use?
>
> Vadim |
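The difference between the two approaches discussed in this message can be sketched roughly as follows. This is only an illustration under stated assumptions: java.io.ByteArrayOutputStream stands in for Ignite's BinaryOutputStream, the class and method names (DirectUtf8Sketch, writeViaArray, writeDirect) are hypothetical, and the encoder is plain standard UTF-8 rather than the existing BinaryUtils#strToUtf8Bytes algorithm, which the real experiment should reuse unchanged.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical sketch: ByteArrayOutputStream stands in for BinaryOutputStream. */
final class DirectUtf8Sketch {
    /** Two-step approach: encode into a temporary array, then copy the array into the stream. */
    static void writeViaArray(String val, ByteArrayOutputStream out) {
        byte[] arr = val.getBytes(StandardCharsets.UTF_8); // Intermediate array.
        out.write(arr, 0, arr.length);                     // Extra copy into the stream.
    }

    /**
     * Direct approach: each encoded byte goes into the stream as soon as it is produced.
     * Assumes a well-formed string (no unpaired surrogates).
     */
    static void writeDirect(String val, ByteArrayOutputStream out) {
        for (int i = 0; i < val.length(); ) {
            int cp = val.codePointAt(i);
            i += Character.charCount(cp);

            if (cp < 0x80) {
                out.write(cp);                         // 1 byte: ASCII.
            } else if (cp < 0x800) {
                out.write(0xC0 | (cp >> 6));           // 2 bytes.
                out.write(0x80 | (cp & 0x3F));
            } else if (cp < 0x10000) {
                out.write(0xE0 | (cp >> 12));          // 3 bytes.
                out.write(0x80 | ((cp >> 6) & 0x3F));
                out.write(0x80 | (cp & 0x3F));
            } else {
                out.write(0xF0 | (cp >> 18));          // 4 bytes: supplementary code points.
                out.write(0x80 | ((cp >> 12) & 0x3F));
                out.write(0x80 | ((cp >> 6) & 0x3F));
                out.write(0x80 | (cp & 0x3F));
            }
        }
    }

    public static void main(String[] args) {
        ByteArrayOutputStream viaArray = new ByteArrayOutputStream();
        ByteArrayOutputStream direct = new ByteArrayOutputStream();

        writeViaArray("Test", viaArray);
        writeDirect("Test", direct);

        // Both paths should produce identical bytes; only the allocation pattern differs.
        System.out.println(Arrays.equals(viaArray.toByteArray(), direct.toByteArray()));
    }
}
```

The point of the direct variant is that no temporary byte[] is allocated per string. Whether that shows up as a measurable gain is exactly what the benchmark is meant to answer.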
Hi Valentin!
Thank you for the comments.

There is a new method which writes directly to BinaryOutputStream instead of an intermediate array:
https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/BinaryUtilsNew.java

Benchmark:
https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/MyBenchmark.java

Unit test:
https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/BinaryOutputStreamTest.java

Statistics:
https://github.com/javaller/MyBenchmark/blob/master/out_01_03_17.txt

Benchmark                                 Mode  Cnt    Score   Error  Units
MyBenchmark.binaryHeapOutputInDirect      avgt   50  111,337 ± 0,742  ns/op
MyBenchmark.binaryHeapOutputStreamDirect  avgt   50   23,847 ± 0,303  ns/op

Vadim |
Hi Valentin!
I've created:

A new method strToUtf8BytesDirect in BinaryUtilsNew:
https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/BinaryUtilsNew.java

A new method doWriteStringDirect in BinaryWriterExImplNew:
https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/BinaryWriterExImplNew.java

Benchmarks for BinaryWriterExImpl doWriteString and BinaryWriterExImplNew doWriteStringDirect:
https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/ExampleTest.java

This is the result of the comparison:

Benchmark                                   Mode  Cnt        Score       Error  Units
ExampleTest.binaryHeapOutputStreamDirect    avgt   50  1128448,743 ± 13536,689  ns/op
ExampleTest.binaryHeapOutputStreamInDirect  avgt   50  1127270,695 ± 17309,256  ns/op

Vadim

2017-03-02 1:02 GMT+03:00 Valentin Kulichenko <[hidden email]>:

> Hi Vadim,
>
> We're getting closer :) I would actually like to see the test for the actual implementation of the BinaryWriterExImpl#doWriteString method. The logic in binaryHeapOutputInDirect() confuses me a bit and I'm not sure the comparison is valid.
>
> Can you please do the following:
>
> 1. Create a new BinaryUtils#strToUtf8BytesDirect method, copy-paste the code from the existing BinaryUtils#strToUtf8Bytes and modify it so that it takes BinaryOutputStream as an argument and writes to it directly. Do not create a stream inside this method, as that is the same as creating a new array.
> 2. Create a new BinaryWriterExImpl#doWriteStringDirect, copy-paste the code from the existing BinaryWriterExImpl#doWriteString and modify it so that it uses BinaryUtils#strToUtf8BytesDirect and doesn't call out.writeByteArray.
> 3. Create a benchmark for the BinaryWriterExImpl#doWriteString method. I.e., create an instance of BinaryWriterExImpl and call doWriteString() in the benchmark method.
> 4. Similarly, create a benchmark for BinaryWriterExImpl#doWriteStringDirect.
> 5. Compare results.
>
> This will give us a clear picture of how these two approaches perform. Your current results are actually promising, but I would like to confirm them.
>
> -Val |
Vadim,
Looks better now. Can you also try to modify the benchmark so that the marshaller and writer are created outside of the measured method? I.e., the benchmark methods should be as simple as this:

    @Benchmark
    public void binaryHeapOutputStreamDirect() throws Exception {
        writer.doWriteStringDirect(message);
    }

    @Benchmark
    public void binaryHeapOutputStreamInDirect() throws Exception {
        writer.doWriteString(message);
    }

In any case, do I understand correctly that it didn't actually make any performance difference? If so, I think we can close the ticket.

Vova, can you also take a look and provide your thoughts?

-Val |
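The request above (create the writer once, keep the measured methods down to a single call) can be illustrated with a plain-Java sketch. This is not JMH and does not use Ignite classes: `SketchStream`, `WriteStringSketch`, `writeViaArray`, `writeDirect` and `avgNanos` are hypothetical names made up for this illustration, the type tag and length prefix are left out, and the direct encoder assumes BMP characters without surrogate pairs. The timing loop is deliberately crude and is no substitute for a real JMH run.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical growable heap buffer; a stand-in, not Ignite's BinaryOutputStream. */
final class SketchStream {
    private byte[] data = new byte[64];
    private int pos;

    private void ensure(int extra) {
        if (pos + extra > data.length)
            data = Arrays.copyOf(data, Math.max(data.length * 2, pos + extra));
    }

    void writeByte(int b) {
        ensure(1);
        data[pos++] = (byte) b;
    }

    void writeByteArray(byte[] arr) {
        ensure(arr.length);
        System.arraycopy(arr, 0, data, pos, arr.length);
        pos += arr.length;
    }

    void reset() { pos = 0; }

    byte[] toArray() { return Arrays.copyOf(data, pos); }
}

final class WriteStringSketch {
    /** Two-step path: encode into a temporary array, then copy it into the stream. */
    static void writeViaArray(SketchStream out, String val) {
        out.writeByteArray(val.getBytes(StandardCharsets.UTF_8));
    }

    /** Direct path: encode each char straight into the stream (assumes no surrogates). */
    static void writeDirect(SketchStream out, String val) {
        for (int i = 0; i < val.length(); i++) {
            int c = val.charAt(i);
            if (c <= 0x7F)
                out.writeByte(c);
            else if (c <= 0x7FF) {
                out.writeByte(0xC0 | (c >> 6));
                out.writeByte(0x80 | (c & 0x3F));
            }
            else {
                out.writeByte(0xE0 | (c >> 12));
                out.writeByte(0x80 | ((c >> 6) & 0x3F));
                out.writeByte(0x80 | (c & 0x3F));
            }
        }
    }

    /** Crude average time per call in nanoseconds; all setup stays outside this method. */
    static double avgNanos(Runnable op, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++)
            op.run();
        return (System.nanoTime() - start) / (double) iterations;
    }

    public static void main(String[] args) {
        // Created once, outside the measured calls.
        SketchStream out = new SketchStream();
        String message = "Hello, Ignite! Привет!";

        Runnable viaArray = () -> { out.reset(); writeViaArray(out, message); };
        Runnable direct = () -> { out.reset(); writeDirect(out, message); };

        // Warm-up rounds so the JIT has compiled both paths before timing.
        avgNanos(viaArray, 100_000);
        avgNanos(direct, 100_000);

        System.out.printf("via array: %.1f ns/op%n", avgNanos(viaArray, 1_000_000));
        System.out.printf("direct:    %.1f ns/op%n", avgNanos(direct, 1_000_000));
    }
}
```

Because the buffer and the message are built once, each measured call contains only the write itself, which is the property the simplified `@Benchmark` methods are meant to have. In a JMH benchmark the same effect would normally come from keeping the writer in benchmark state rather than constructing it inside the measured method.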
In reply to this post by vadopolski
Hello Colleagues!
I've just copied the benchmarks and got interesting results.

https://github.com/javaller/MyBenchmark/blob/master/03_03_17_out.txt

The time increases in iterations 5 and 6 in both cases. What do you think about it?

Vadim |
In reply to this post by Valentin Kulichenko
Valentin,
What do you think about the duplicated loop in strToBinaryOutputStream? How can StrLen be calculated for outBinaryHeap without this loop?

    public class BinaryUtilsNew extends BinaryUtils {
        public static int getStrLen(String val) {
            int strLen = val.length();
            int utfLen = 0;
            int c;

            // Determine length of resulting byte array.
            for (int cnt = 0; cnt < strLen; cnt++) {
                c = val.charAt(cnt);

                if (c >= 0x0001 && c <= 0x007F)
                    utfLen++;
                else if (c > 0x07FF)
                    utfLen += 3;
                else
                    utfLen += 2;
            }

            return utfLen;
        }

        public static void strToUtf8BytesDirect(BinaryOutputStream outBinaryHeap, String val) {
            int strLen = val.length();
            int c, cnt;
            int position = 0;

            // Write the type tag and the encoded length before the payload.
            outBinaryHeap.unsafeEnsure(1 + 4);
            outBinaryHeap.unsafeWriteByte(GridBinaryMarshaller.STRING);
            outBinaryHeap.unsafeWriteInt(getStrLen(val));

            // Second pass over the string: encode each char directly into the stream.
            for (cnt = 0; cnt < strLen; cnt++) {
                c = val.charAt(cnt);

                if (c >= 0x0001 && c <= 0x007F)
                    outBinaryHeap.writeByte((byte) c);
                else if (c > 0x07FF) {
                    outBinaryHeap.writeByte((byte)(0xE0 | (c >> 12) & 0x0F));
                    outBinaryHeap.writeByte((byte)(0x80 | (c >> 6) & 0x3F));
                    outBinaryHeap.writeByte((byte)(0x80 | (c & 0x3F)));
                }
                else {
                    outBinaryHeap.writeByte((byte)(0xC0 | ((c >> 6) & 0x1F)));
                    outBinaryHeap.writeByte((byte)(0x80 | (c & 0x3F)));
                }
            }
        }
    }

Vadim

2017-03-03 2:00 GMT+03:00 Valentin Kulichenko <[hidden email] >: > Vadim, > > Looks better now. Can you also try to modify the benchmark so that > marshaller and writer are created outside of the measured method? I.e. the > benchmark methods should be as simple as this: > > @Benchmark > public void binaryHeapOutputStreamDirect() throws Exception { > writer.doWriteStringDirect(message); > } > > @Benchmark > public void binaryHeapOutputStreamInDirect() throws Exception { > writer.doWriteString(message); > } > > In any case, do I understand correctly that it didn't actually make any > performance difference? If so, I think we can close the ticket. > > Vova, can you also take a look and provide your thoughts?
> > -Val > > On Thu, Mar 2, 2017 at 1:27 PM, Вадим Опольский <[hidden email]> > wrote: > >> Hi Valentin! >> >> I've created: >> >> new method strToUtf8BytesDirect in BinaryUtilsNew >> https://github.com/javaller/MyBenchmark/blob/master/src/main >> /java/org/sample/BinaryUtilsNew.java >> >> new method doWriteStringDirect in BinaryWriterExImplNew >> https://github.com/javaller/MyBenchmark/blob/master/src/main >> /java/org/sample/BinaryWriterExImplNew.java >> >> benchmarks for BinaryWriterExImpl doWriteString and BinaryWriterExImplNew >> doWriteStringDirect >> https://github.com/javaller/MyBenchmark/blob/master/src/main >> /java/org/sample/ExampleTest.java >> >> This is a result of comparing: >> >> Benchmark >> Mode Cnt Score Error UnitsExampleTest.binaryHeapOutputStreamDirect >> avgt 50 1128448,743 ± 13536,689 ns/opExampleTest.binaryHeapOutputStreamInDirect >> avgt 50 1127270,695 ± 17309,256 ns/op >> >> Vadim >> >> 2017-03-02 1:02 GMT+03:00 Valentin Kulichenko < >> [hidden email]>: >> >>> Hi Vadim, >>> >>> We're getting closer :) I would actually like to see the test for actual >>> implementation of BinaryWriterExImpl#doWriteString method. Logic in >>> binaryHeapOutputInDirect() confuses me a bit and I'm not sure comparison is >>> valid. >>> >>> Can you please do the following: >>> >>> 1. Create new BinaryUtils#strToUtf8BytesDirect method, copy-paste the >>> code from existing BinaryUtils#strToUtf8Bytes and modify it so that it >>> takes BinaryOutputStream as an argument and writes to it directly. Do not >>> create stream inside this method, as it's the same as creating new array. >>> 2. Create new BinaryWriterExImpl#doWriteStringDirect, copy-paste the >>> code from existing BinaryWriterExImpl#doWriteString and modify it so >>> that it uses BinaryUtils#strToUtf8BytesDirect and doesn't >>> call out.writeByteArray. >>> 3. Create benchmark for BinaryWriterExImpl#doWriteString method. 
I.e., >>> create an instance of BinaryWriterExImpl and call doWriteString() in >>> benchmark method. >>> 4. Similarly, create benchmark for BinaryWriterExImpl#doWriteStri >>> ngDirect. >>> 5. Compare results. >>> >>> This will give us clear picture of how these two approaches perform. >>> Your current results are actually promising, but I would like to confirm >>> them. >>> >>> -Val >>> >>> On Wed, Mar 1, 2017 at 6:17 AM, Вадим Опольский <[hidden email]> >>> wrote: >>> >>>> Hi Valentin! >>>> >>>> Thank you for comments. >>>> >>>> There is a new method which writes directly to BinaryOutputStream >>>> instead of intermediate array. >>>> https://github.com/javaller/MyBenchmark/blob/master/src/main >>>> /java/org/sample/BinaryUtilsNew.java >>>> >>>> There is benchmark. >>>> https://github.com/javaller/MyBenchmark/blob/master/src/main >>>> /java/org/sample/MyBenchmark.java >>>> >>>> Unit test >>>> https://github.com/javaller/MyBenchmark/blob/master/src/main >>>> /java/org/sample/BinaryOutputStreamTest.java >>>> >>>> Statistics >>>> https://github.com/javaller/MyBenchmark/blob/master/out_01_03_17.txt >>>> >>>> Benchmark >>>> Mode Cnt Score Error Units MyBenchmark.binaryHeapOutputIn >>>> Direct avgt 50 111,337 ± 0,742 ns/op >>>> MyBenchmark.binaryHeapOutputStreamDirect avgt 50 23,847 ± >>>> 0,303 ns/op >>>> >>>> >>>> Vadim >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> 2017-02-28 4:29 GMT+03:00 Valentin Kulichenko < >>>> [hidden email]>: >>>> >>>>> Hi Vadim, >>>>> >>>>> Looks like you accidentally removed dev list from the thread, adding >>>>> it back. >>>>> >>>>> I think there is still misunderstanding. What I propose is to modify >>>>> the BinaryUtils#strToUtf8Bytes so that it writes directly to BinaryOutputStream >>>>> instead of intermediate array. This should decrease memory consumption and >>>>> can also increase performance as we will avoid 'writeByteArray' step >>>>> at the end. >>>>> >>>>> Does it make sense to you? 
>>>>> >>>>> -Val >>>>> >>>>> On Mon, Feb 27, 2017 at 6:55 AM, Вадим Опольский <[hidden email] >>>>> > wrote: >>>>> >>>>>> Hi, Valentin! >>>>>> >>>>>> What do you think about using the methods of BinaryOutputStream: >>>>>> >>>>>> 1) writeByteArray(byte[] val) >>>>>> 2) writeCharArray(char[] val) >>>>>> 3) write (byte[] arr, int off, int len) >>>>>> >>>>>> String val = "Test"; >>>>>> out.writeByteArray( val.getBytes(UTF_8)); >>>>>> >>>>>> String val = "Test"; >>>>>> out.writeCharArray(str.toCharArray()); >>>>>> >>>>>> String val = "Test" >>>>>> InputStream stream = new ByteArrayInputStream( >>>>>> exampleString.getBytes(StandartCharsets.UTF_8)); >>>>>> byte[] buffer = new byte[1024]; >>>>>> while ((buffer = stream.read()) != -1) { >>>>>> out.writeByteArray(buffer); >>>>>> } >>>>>> >>>>>> What else can we use ? >>>>>> >>>>>> Vadim >>>>>> >>>>>> >>>>>> 2017-02-25 2:21 GMT+03:00 Valentin Kulichenko < >>>>>> [hidden email]>: >>>>>> >>>>>>> Hi Vadim, >>>>>>> >>>>>>> Which method implements the approach described in the ticket? From >>>>>>> what I see, all writeToStringX versions are still encoding into an >>>>>>> intermediate array and then call out.writeByteArray. What we need to test >>>>>>> is the approach where bytes are written directly into the stream during >>>>>>> encoding. Encoding algorithm itself should stay the same for now, otherwise >>>>>>> we will not know how to interpret the result. >>>>>>> >>>>>>> It looks like there is some misunderstanding here, so please let me >>>>>>> know anything is still unclear. I will be happy to answer your questions. >>>>>>> >>>>>>> -Val >>>>>>> >>>>>>> On Wed, Feb 22, 2017 at 7:22 PM, Valentin Kulichenko < >>>>>>> [hidden email]> wrote: >>>>>>> >>>>>>>> Hi Vadim, >>>>>>>> >>>>>>>> Thanks, I will review this week. >>>>>>>> >>>>>>>> -Val >>>>>>>> >>>>>>>> On Wed, Feb 22, 2017 at 2:28 AM, Вадим Опольский < >>>>>>>> [hidden email]> wrote: >>>>>>>> >>>>>>>>> Hi Valentin! 
>>>>>>>>> >>>>>>>>> https://issues.apache.org/jira/browse/IGNITE-13 >>>>>>>>> >>>>>>>>> I created BinaryWriterExImplNew (extended of BinaryWriterExImpl) and >>>>>>>>> added new methods with changes described in the ticket >>>>>>>>> >>>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main >>>>>>>>> /java/org/sample/BinaryWriterExImplNew.java >>>>>>>>> >>>>>>>>> I created a benchmark for BinaryWriterExImplNew >>>>>>>>> >>>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main >>>>>>>>> /java/org/sample/ExampleTest.java >>>>>>>>> >>>>>>>>> I run benchmark and compared results >>>>>>>>> >>>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/totalstat.txt >>>>>>>>> >>>>>>>>> # Run complete. Total time: 00:10:24 >>>>>>>>> Benchmark Mode Cnt >>>>>>>>> Score Error Units >>>>>>>>> ExampleTest.binaryHeapOutputStream1 avgt 50 >>>>>>>>> 1114999,207 ± 16756,776 ns/op >>>>>>>>> ExampleTest.binaryHeapOutputStream2 avgt 50 >>>>>>>>> 1118149,320 ± 17515,961 ns/op >>>>>>>>> ExampleTest.binaryHeapOutputStream3 avgt 50 >>>>>>>>> 1113678,657 ± 17652,314 ns/op >>>>>>>>> ExampleTest.binaryHeapOutputStream4 avgt 50 >>>>>>>>> 1112415,051 ± 18273,874 ns/op >>>>>>>>> ExampleTest.binaryHeapOutputStream5 avgt 50 >>>>>>>>> 1111366,583 ± 18282,829 ns/op >>>>>>>>> ExampleTest.binaryHeapOutputStreamACSII avgt 50 1112079,667 >>>>>>>>> ± 16659,532 ns/op >>>>>>>>> ExampleTest.binaryHeapOutputStreamUTFCustom avgt 50 >>>>>>>>> 1114949,759 ± 16809,669 ns/op >>>>>>>>> ExampleTest.binaryHeapOutputStreamUTFNIO avgt 50 >>>>>>>>> 1121462,325 ± 19836,466 ns/op >>>>>>>>> >>>>>>>>> Is it OK? Whats the next step? Do I have to move this >>>>>>>>> JMH benchmark to the Ignite project ? 
>>>>>>>>> >>>>>>>>> Vadim Opolski >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> 2017-02-21 1:06 GMT+03:00 Valentin Kulichenko < >>>>>>>>> [hidden email]>: >>>>>>>>> >>>>>>>>>> Hi Vadim, >>>>>>>>>> >>>>>>>>>> I'm not sure I understand your benchmarks and how they verify the >>>>>>>>>> optimization discussed here. Basically, here is what needs to be done: >>>>>>>>>> >>>>>>>>>> 1. Create a benchmark for BinaryWriterExImpl#doWriteString >>>>>>>>>> method. >>>>>>>>>> 2. Run the benchmark with current implementation. >>>>>>>>>> 3. Make the change described in the ticket. >>>>>>>>>> 4. Run the benchmark with these changes. >>>>>>>>>> 5. Compare results. >>>>>>>>>> >>>>>>>>>> Makes sense? Let me know if anything is unclear. >>>>>>>>>> >>>>>>>>>> -Val >>>>>>>>>> >>>>>>>>>> On Mon, Feb 20, 2017 at 8:51 AM, Вадим Опольский < >>>>>>>>>> [hidden email]> wrote: >>>>>>>>>> >>>>>>>>>>> Hello everybody! >>>>>>>>>>> >>>>>>>>>>> https://issues.apache.org/jira/browse/IGNITE-13 >>>>>>>>>>> >>>>>>>>>>> Valentin, I just have finished benchmark (with JMH) - >>>>>>>>>>> https://github.com/javaller/MyBenchmark.git >>>>>>>>>>> >>>>>>>>>>> It collect data about time working of serialization. >>>>>>>>>>> >>>>>>>>>>> For instance - https://github.com/javaller/My >>>>>>>>>>> Benchmark/blob/master/out200217.txt >>>>>>>>>>> >>>>>>>>>>> To start it you have to do next: >>>>>>>>>>> >>>>>>>>>>> 1) clone it - git colne https://github.com/javal >>>>>>>>>>> ler/MyBenchmark.git >>>>>>>>>>> >>>>>>>>>>> 2) install it - mvn install >>>>>>>>>>> >>>>>>>>>>> 3) run benchmarks - java -Xms1024m -Xmx4096m -jar >>>>>>>>>>> target\benchmarks.jar >>>>>>>>>>> >>>>>>>>>>> Vadim Opolski >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> 2017-02-15 0:52 GMT+03:00 Valentin Kulichenko < >>>>>>>>>>> [hidden email]>: >>>>>>>>>>> >>>>>>>>>>>> Vladimir, >>>>>>>>>>>> >>>>>>>>>>>> I think we misunderstood each other. 
My understanding of this >>>>>>>>>>>> optimization is the following. >>>>>>>>>>>> >>>>>>>>>>>> Currently string serialization is done in two steps (see >>>>>>>>>>>> BinaryWriterExImpl#doWriteString): >>>>>>>>>>>> >>>>>>>>>>>> strArr = BinaryUtils.strToUtf8Bytes(val); // Encode string >>>>>>>>>>>> into byte array. >>>>>>>>>>>> out.writeByteArray(strArr); // Write byte >>>>>>>>>>>> array into stream. >>>>>>>>>>>> >>>>>>>>>>>> What this ticket suggests is to write directly into stream >>>>>>>>>>>> while string is encoded, without intermediate array. This both reduces >>>>>>>>>>>> memory consumption and eliminates array copy step. >>>>>>>>>>>> >>>>>>>>>>>> I updated the ticket and added this explanation there. >>>>>>>>>>>> >>>>>>>>>>>> Vadim, can you create a micro benchmark and check if it gives >>>>>>>>>>>> any improvement? >>>>>>>>>>>> >>>>>>>>>>>> -Val >>>>>>>>>>>> >>>>>>>>>>>> On Sun, Feb 12, 2017 at 10:38 PM, Vladimir Ozerov < >>>>>>>>>>>> [hidden email]> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Hi, >>>>>>>>>>>>> >>>>>>>>>>>>> It is hard to say whether it makes sense or not. No doubt, it >>>>>>>>>>>>> could speed up marshalling process at the cost of 2x memory required for >>>>>>>>>>>>> strings. From my previous experience with marshalling micro-optimizations, >>>>>>>>>>>>> we will hardly ever notice speedup in distributed environment. >>>>>>>>>>>>> >>>>>>>>>>>>> But, there is another sied - it could speedup our queries, >>>>>>>>>>>>> because we will not have to unmarshal string on every field access. So I >>>>>>>>>>>>> would try to make this optimization optional and then measure query >>>>>>>>>>>>> performance with classes having lots of strings. It could give us >>>>>>>>>>>>> interesting results. >>>>>>>>>>>>> >>>>>>>>>>>>> On Mon, Feb 13, 2017 at 5:37 AM, Valentin Kulichenko < >>>>>>>>>>>>> [hidden email]> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Vladimir, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Can you please take a look and provide your thoughts? 
Can >>>>>>>>>>>>>> this be applied to binary marshaller? From what I recall, it serializes >>>>>>>>>>>>>> string a bit differently from optimized marshaller, so I'm not sure. >>>>>>>>>>>>>> >>>>>>>>>>>>>> -Val >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Fri, Feb 10, 2017 at 5:16 PM, Dmitriy Setrakyan < >>>>>>>>>>>>>> [hidden email]> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Thu, Feb 9, 2017 at 11:26 PM, Valentin Kulichenko < >>>>>>>>>>>>>>> [hidden email]> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> > Hi Vadim, >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > I don't think it makes much sense to invest into >>>>>>>>>>>>>>> OptimizedMarshaller. >>>>>>>>>>>>>>> > However, I would check if this optimization is applicable >>>>>>>>>>>>>>> to >>>>>>>>>>>>>>> > BinaryMarshaller, and if yes, implement it. >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Val, in this case can you please update the ticket? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > -Val >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > On Thu, Feb 9, 2017 at 11:05 PM, Вадим Опольский < >>>>>>>>>>>>>>> [hidden email]> >>>>>>>>>>>>>>> > wrote: >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > > Dear sirs! >>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>> > > I want to resolve issue IGNITE-13 - >>>>>>>>>>>>>>> > > https://issues.apache.org/jira/browse/IGNITE-13 >>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>> > > Is it actual? >>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>> > > Vadim Opolski >>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> > |
Hi Vadim,
What do you mean by "copied benchmarks"? What changed since the previous iteration, and why are the results so different?

As for the duplicated loop, you don't need it. BinaryOutputStream allows writing a value to a particular position (even before already written data). So you can reserve 4 bytes for the length, remember the position, calculate the length while encoding and writing the bytes, and then write the length.

-Val
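The single-pass idea described in this reply (reserve 4 bytes for the length, remember the position, encode, then write the length back) can be sketched roughly as follows. This is only an illustration under stated assumptions: GrowableBuffer is a hypothetical stand-in written for this example, not Ignite's BinaryOutputStream, and its method names (reserve, writeIntAt), the type tag value, and the little-endian length are all assumed for the sketch. The encoding branches are the same ones used in the code posted earlier in the thread.

```java
import java.util.Arrays;

/** Hypothetical growable heap buffer; a stand-in for illustration, not Ignite's BinaryOutputStream. */
final class GrowableBuffer {
    private byte[] data = new byte[16];
    private int pos;

    /** Current write position. */
    int position() { return pos; }

    /** Appends one byte, growing the array if needed. */
    void writeByte(int b) {
        ensure(1);
        data[pos++] = (byte) b;
    }

    /** Skips n bytes at the current position and returns where the gap starts. */
    int reserve(int n) {
        ensure(n);
        int start = pos;
        pos += n;
        return start;
    }

    /** Overwrites 4 bytes at an absolute position (little-endian, an assumption of this sketch). */
    void writeIntAt(int at, int v) {
        data[at] = (byte) v;
        data[at + 1] = (byte) (v >>> 8);
        data[at + 2] = (byte) (v >>> 16);
        data[at + 3] = (byte) (v >>> 24);
    }

    /** Copy of everything written so far. */
    byte[] toByteArray() { return Arrays.copyOf(data, pos); }

    private void ensure(int n) {
        if (pos + n > data.length)
            data = Arrays.copyOf(data, Math.max(data.length * 2, pos + n));
    }
}

final class DirectStringWriter {
    /** Placeholder type tag, assumed for illustration only. */
    static final byte TYPE_STRING = 9;

    /** Encodes val straight into out in one pass and returns the encoded payload length. */
    static int writeString(GrowableBuffer out, String val) {
        out.writeByte(TYPE_STRING);

        int lenPos = out.reserve(4);   // gap for the length, filled in after encoding
        int start = out.position();    // first payload byte

        for (int i = 0; i < val.length(); i++) {
            int c = val.charAt(i);

            if (c >= 0x0001 && c <= 0x007F)
                out.writeByte(c);
            else if (c > 0x07FF) {
                out.writeByte(0xE0 | ((c >> 12) & 0x0F));
                out.writeByte(0x80 | ((c >> 6) & 0x3F));
                out.writeByte(0x80 | (c & 0x3F));
            }
            else {
                out.writeByte(0xC0 | ((c >> 6) & 0x1F));
                out.writeByte(0x80 | (c & 0x3F));
            }
        }

        // The length falls out of the positions, so no separate counting loop is needed.
        int utfLen = out.position() - start;
        out.writeIntAt(lenPos, utfLen);

        return utfLen;
    }

    public static void main(String[] args) {
        GrowableBuffer out = new GrowableBuffer();
        int len = writeString(out, "Ignite \u00e9\u20ac");
        System.out.println(len + " payload bytes, " + out.toByteArray().length + " bytes in total");
    }
}
```

The point of the design is that the length becomes a by-product of the write positions, so the string is scanned once instead of twice. Whether that helps in practice is exactly what the benchmark in this thread is meant to show.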