Hi,
I have a cache with 5000 objects, 400 KB each, and I need to download all of them using the Binary Client Protocol. To do that I use a Scan Query request (and Load Next Page requests for pages 2, 3, etc.) without any filter.

I measure the time between two moments: when the request has been sent and when the result is ready (not when the page has been downloaded, only when it is ready to be downloaded).

If the page size is 1000 (~400 MB per page), the measured time is 1000 ms.
If the page size is 100 (~40 MB per page), the measured time is 100 ms.

As a result I conclude that Ignite spends ~2.5 ms per megabyte on preparing the response, and I correspondingly spend this time waiting. This leads to the fact that I can't get throughput of more than 200 MB/s even over a 10 Gbit/s network. It's very confusing.

So, the question: how can I reduce Scan Query execution time in such a configuration?
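For reference, a minimal sketch of this kind of scan-and-measure loop using the Java thin client is below. The cache name, address and value type are assumptions, and the actual reproducer talks to the Binary Client Protocol directly, so this is an approximation of the workload rather than the author's code.

import javax.cache.Cache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.query.QueryCursor;
import org.apache.ignite.cache.query.ScanQuery;
import org.apache.ignite.client.ClientCache;
import org.apache.ignite.client.IgniteClient;
import org.apache.ignite.configuration.ClientConfiguration;

public class ScanThroughput {
    public static void main(String[] args) throws Exception {
        // Thin client connection; the address is an assumption for this sketch.
        ClientConfiguration cfg = new ClientConfiguration().setAddresses("127.0.0.1:10800");

        try (IgniteClient client = Ignition.startClient(cfg)) {
            // Hypothetical cache holding 5000 values of ~400 KB each.
            ClientCache<Integer, byte[]> cache = client.cache("bigObjects");

            // Scan query without a filter; the page size controls how many entries
            // the server prepares and sends per response page.
            ScanQuery<Integer, byte[]> qry = new ScanQuery<>();
            qry.setPageSize(1000);

            long start = System.currentTimeMillis();
            long bytes = 0;

            try (QueryCursor<Cache.Entry<Integer, byte[]>> cursor = cache.query(qry)) {
                for (Cache.Entry<Integer, byte[]> e : cursor)
                    bytes += e.getValue().length;
            }

            long ms = System.currentTimeMillis() - start;

            System.out.printf("Read %d MB in %d ms (~%.0f MB/s)%n",
                bytes / (1024 * 1024), ms, bytes / 1048576.0 / (ms / 1000.0));
        }
    }
}

With 5000 entries of ~400 KB and a page size of 1000, the cursor above would fetch five pages of roughly 400 MB each.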
Hello!
Can you check if there's any difference with https://github.com/apache/ignite/pull/4592 ?

Regards,

--
Ilya Kasnacheev
I checked and it looks like the result is the same (or even worse: I get 1150 ms with page size 1000, but the reason might lie in other changes, since my previous measurements were done on 2.6).
Hi Dmitriy,
Why do you think that the results are not fetched to the client at this point? We respond to the client with the first page.
Reference: org.apache.ignite.internal.processors.platform.client.cache.ClientCacheScanQueryRequest#process
Hi,
I prepared an example that reproduces what I'm talking about. Please take a look: https://github.com/dmitrievanthony/slow-scan-query-reproducer/blob/master/src/main/java/Client.java

I calculate the time between when the request has been sent and when the result is ready to be received (not fully received). And I use localhost as well.
I am not sure what the purpose of measuring the receive time of the first byte is. Please try to measure the time to get the first page, or the whole result set you are interested in.

The main purpose of page size is to better utilize the network and decrease the number of requests and responses. If you are interested in getting the very first result ASAP, then it is better to keep the page size small. But if the goal is to get all results ASAP, then a greater page size will help. After some threshold a greater page size will not provide benefits and may even make performance worse. So it is better to experiment with different page sizes to find the optimal value.
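A minimal sketch of such a sweep, reusing the same thin-client setup as in the earlier sketch (the helper name, cache name and page-size candidates are hypothetical, and absolute timings will of course differ from the raw-protocol measurements in this thread):

import javax.cache.Cache;
import org.apache.ignite.cache.query.QueryCursor;
import org.apache.ignite.cache.query.ScanQuery;
import org.apache.ignite.client.ClientCache;
import org.apache.ignite.client.IgniteClient;

public class PageSizeSweep {
    /** Runs one full scan with the given page size and returns the elapsed time in ms. */
    static long measureTotalScanMs(IgniteClient client, String cacheName, int pageSize) throws Exception {
        ClientCache<Integer, byte[]> cache = client.cache(cacheName);

        ScanQuery<Integer, byte[]> qry = new ScanQuery<>();
        qry.setPageSize(pageSize);

        long start = System.currentTimeMillis();

        try (QueryCursor<Cache.Entry<Integer, byte[]>> cursor = cache.query(qry)) {
            for (Cache.Entry<Integer, byte[]> e : cursor)
                e.getValue(); // Touch the value so deserialization is included in the timing.
        }

        return System.currentTimeMillis() - start;
    }

    /** Sweeps several candidate page sizes and prints the total iteration time for each. */
    static void sweep(IgniteClient client, String cacheName) throws Exception {
        for (int pageSize : new int[] {10, 50, 100, 250, 500, 1000})
            System.out.printf("pageSize=%d -> %d ms%n",
                pageSize, measureTotalScanMs(client, cacheName, pageSize));
    }
}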
I have already experimented with different page sizes and found that the "downloading" time is relatively small compared to this "waiting" time, so I concluded that the "waiting" step is the bottleneck; that's why I'm talking about it and measuring it. On AWS the 10 Gbit/s network allows us to receive page content at a throughput of ~1.2 GB/s (I've checked it and it's true), but the "waiting" step is slower and that's the problem.

You suggested using a greater page size for my purpose (maximal throughput, all results), so:

Page size:        200 MB
Waiting time:     800 ms
Downloading time: 160 ms
----------------------------------
Total time:       960 ms
Throughput:       ~200 MB/s
BTW, measurements for the example I've been talking about above:

Page size   5 MB, waiting time 119.85 ± 6.72 ms
Page size  10 MB, waiting time 157.70 ± 15.35 ms
Page size  20 MB, waiting time 204.50 ± 19.18 ms
Page size  50 MB, waiting time 264.70 ± 22.30 ms
Page size 100 MB, waiting time 463.35 ± 17.12 ms
Page size 150 MB, waiting time 672.50 ± 21.98 ms
Anton,
I still struggle to understand the problem. A delay in getting the first page is not a problem on its own, but it might be a problem for your use case. My question is: what is the use case and what is your goal? Minimal latency? Maximal throughput? Getting the first result ASAP? Getting all results ASAP?
To be precise, it's not only about the first page, it's about getting the next pages as well.

Regarding the use case: in my client application I need to iterate over the dataset stored in Apache Ignite as fast as possible. It means I should provide maximal throughput for a simple "read all" operation.
In this case I would not advise measuring how long it takes to get the first byte. We may set the page size to 1 and get that byte very quickly, but that doesn't help you iterate over all results quickly. The correct measurement here would be the total iteration time. Provided that the object size is pretty big, I think the optimal page size might be pretty small.

At the moment the thin client SCAN implementation is really straightforward - just a request-response for every page. To get peak throughput it would be better to rewrite it to a streaming approach, where the server constantly pushes data to the client. Compression may also help in your case (trading CPU for network). But these are only ideas which are yet to be implemented in the product.

You may also want to look at the "partition" and "filter" parameters of ScanQuery. With a partition it is possible to start scanning in several threads. With a filter it is possible to pass less data over the wire.

Vladimir.
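As an illustration of the per-partition idea, a hedged sketch with the Java thin client follows. The cache name, partition count (1024 is the RendezvousAffinityFunction default) and thread count are assumptions, and whether several scans actually run in parallel over a single thin-client connection depends on the Ignite version, so one client connection per thread may be needed in practice.

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.cache.Cache;
import org.apache.ignite.cache.query.QueryCursor;
import org.apache.ignite.cache.query.ScanQuery;
import org.apache.ignite.client.ClientCache;
import org.apache.ignite.client.IgniteClient;

public class PartitionedScan {
    /** Scans every partition in its own task and returns the total number of bytes read. */
    static long parallelScan(IgniteClient client, String cacheName, int partitions, int threads)
        throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<Future<Long>> futures = new ArrayList<>();

        for (int p = 0; p < partitions; p++) {
            final int part = p;

            futures.add(pool.submit(() -> {
                ClientCache<Integer, byte[]> cache = client.cache(cacheName);

                // Restrict this scan to a single partition; each task pulls its own pages.
                ScanQuery<Integer, byte[]> qry = new ScanQuery<>();
                qry.setPartition(part);

                long bytes = 0;

                try (QueryCursor<Cache.Entry<Integer, byte[]>> cursor = cache.query(qry)) {
                    for (Cache.Entry<Integer, byte[]> e : cursor)
                        bytes += e.getValue().length;
                }

                return bytes;
            }));
        }

        long total = 0;

        for (Future<Long> f : futures)
            total += f.get();

        pool.shutdown();

        return total;
    }
}

Each task asks the server for the entries of a single partition via ScanQuery.setPartition, so the paging work is split across tasks instead of going through one sequential cursor.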
Yes, of course I started by measuring the total iteration time. That's how I found that the throughput is about 200 MB/s, and then I started looking for the bottleneck. Because the "downloading" time is less than the "waiting" time, I concluded that the "waiting" step is the bottleneck, and that's why this thread was started.

Regarding your suggestions: yes, the streaming approach will probably help us. At the same time, I don't think compression will help, because as I've already said it looks like the bottleneck is on the Apache Ignite side, not in the network transfer itself. I can't use "filter" because I need all the data.

Regarding the "partition" parameter, do I understand correctly that it means the number of the cache partition to be fetched? In that case I don't understand how it affects "scanning in several threads".