Optimize GridLongList serialization

Александр Меньшиков
I investigated network load and found that a large share of the internal data
inside messages is `GridLongList`.
It is part of `GridDhtTxFinishRequest`,
`GridDhtAtomicDeferredUpdateResponse`, `GridDhtAtomicUpdateRequest`,
`GridNearAtomicFullUpdateRequest` and `NearCacheUpdates`.

So I think it makes sense to optimize `GridLongList` serialization.


Here we serialize every element of the backing array and ignore the `idx`
value (the number of elements actually in use):

```
@Override public boolean writeTo(ByteBuffer buf, MessageWriter writer) {
    writer.setBuffer(buf);

    if (!writer.isHeaderWritten()) {
        if (!writer.writeHeader(directType(), fieldsCount()))
            return false;

        writer.onHeaderWritten();
    }

    // Cases fall through on purpose: the writer resumes from the saved state.
    switch (writer.state()) {
        case 0:
            // Writes the whole backing array, including unused slots past idx.
            if (!writer.writeLongArray("arr", arr))
                return false;

            writer.incrementState();

        case 1:
            if (!writer.writeInt("idx", idx))
                return false;

            writer.incrementState();
    }

    return true;
}
```



The other serialization method in the same class does not have this problem:

```
public static void writeTo(DataOutput out, @Nullable GridLongList list) throws IOException {
    // -1 marks a null list; otherwise the number of used elements is written.
    out.writeInt(list != null ? list.idx : -1);

    if (list != null) {
        // Only the first idx elements are written.
        for (int i = 0; i < list.idx; i++)
            out.writeLong(list.arr[i]);
    }
}
```


So we can reduce message size simply by sending only the used part of
the array.
If you don't mind I will create an issue in Jira for this.
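
To make the potential saving concrete, here is a rough standalone sketch (not Ignite code). It compares the number of bytes produced when the whole backing array is written with the number produced when only the first `idx` elements are written. The class name `TrimmedArrayDemo`, the 16-slot capacity and the length-prefixed layout are assumptions made purely for illustration; the real wire format of `MessageWriter` may differ.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

/** Hypothetical demo class; the layout below is an assumption, not Ignite's wire format. */
public class TrimmedArrayDemo {
    /** Writes a length prefix, every slot of the backing array, and then idx. */
    static int sizeWithFullArray(long[] arr, int idx) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);

        out.writeInt(arr.length);
        for (long v : arr)
            out.writeLong(v);
        out.writeInt(idx);
        out.flush();

        return bytes.size();
    }

    /** Writes idx and then only the first idx elements. */
    static int sizeWithUsedPrefix(long[] arr, int idx) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);

        out.writeInt(idx);
        for (int i = 0; i < idx; i++)
            out.writeLong(arr[i]);
        out.flush();

        return bytes.size();
    }

    public static void main(String[] args) throws IOException {
        long[] arr = new long[16]; // Backing array with spare capacity.
        arr[0] = 10;
        arr[1] = 20;
        arr[2] = 30;
        int idx = 3;               // Only three slots are in use.

        System.out.println("full array:  " + sizeWithFullArray(arr, idx) + " bytes");
        System.out.println("used prefix: " + sizeWithUsedPrefix(arr, idx) + " bytes");
    }
}
```

With 3 used elements in a 16-slot array, this toy layout produces 136 bytes for the full array (4 + 16 * 8 + 4) and 28 bytes for the used prefix (4 + 3 * 8).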


By the way, `long` is a wide type. As far as I can see, `GridLongList` is
mostly used for counters.
I also checked whether the `long` values could be compressed into smaller
types such as `int`, `short` or `byte` in the test
`GridCacheInterceptorAtomicRebalanceTest` (picked at random).
It turned out that every `long` in `GridLongList` fits into an `int`, and 70%
of them fit into a `short`.
Such a conversion is quite fast: about 1.1 ns per element (measured with a
JMH test).
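
The kind of check described above can be sketched as follows. This is only an illustration of testing whether a `long` survives a narrowing cast; the class `NarrowTypeDemo`, its method names and the sample values are hypothetical, and the sketch ignores the extra tag bytes a real encoding would need to record which width was chosen.

```java
/** Hypothetical demo class: finds the narrowest primitive type that holds each value. */
public class NarrowTypeDemo {
    /** Returns 1, 2, 4 or 8: the width in bytes of the narrowest type that holds v. */
    static int bytesNeeded(long v) {
        if (v == (byte) v)
            return 1;
        if (v == (short) v)
            return 2;
        if (v == (int) v)
            return 4;
        return 8;
    }

    /** Sums the narrowest widths of the first len elements (no per-element tag counted). */
    static int packedSize(long[] arr, int len) {
        int total = 0;
        for (int i = 0; i < len; i++)
            total += bytesNeeded(arr[i]);
        return total;
    }

    public static void main(String[] args) {
        long[] counters = {5L, 300L, 70_000L, 5_000_000_000L};

        for (long v : counters)
            System.out.println(v + " needs " + bytesNeeded(v) + " byte(s)");

        System.out.println("raw:    " + counters.length * Long.BYTES + " bytes");
        System.out.println("packed: " + packedSize(counters, counters.length) + " bytes");
    }
}
```

For these four sample values the raw size is 32 bytes and the packed payload is 15 bytes (1 + 2 + 4 + 8), before any bookkeeping overhead.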



Of course, there are many ways to compress data,
but I know that the proprietary GridGain plug-in has a different `MessageWriter`
implementation.
So maybe this is unnecessary and some compression already exists in that
proprietary plug-in.
Does anyone know anything about it?

Re: Optimize GridLongList serialization

dsetrakyan
Thanks, Alex.

GridGain automatically compresses all the internal types. Somehow it looks
like the GridLongList may have been missed. Can you please file a ticket for
the 2.5 release?

D.

On Mon, Mar 26, 2018 at 4:55 AM, Александр Меньшиков <[hidden email]>
wrote:

> If you don't mind I will create an issue in Jira for this.

Re: Optimize GridLongList serialization

Dmitriy Pavlov
Hi Dmitriy, did you mean GridList?

I don't understand what "GridGain compresses" means.

On Tue, Mar 27, 2018 at 3:06, Dmitriy Setrakyan <[hidden email]> wrote:

> Thanks, Alex.
>
> GridGain automatically compresses all the internal types. Somehow it looks
> like the GridLongList may have been missed. Can you please file a ticket for
> the 2.5 release?
>
> D.

Re: Optimize GridLongList serialization

dsetrakyan
Sorry, Dmitriy, I meant Ignite, not GridGain (memories of pre-Apache
coding). I am assuming that Alex Menshikov was referring to the GridLongList
class in Ignite.

D.

On Mon, Mar 26, 2018 at 11:52 PM, Dmitry Pavlov <[hidden email]>
wrote:

> Hi Dmitriy, did you mean GridList?
>
> I don't understand what "GridGain compresses" means.

Re: Optimize GridLongList serialization

Александр Меньшиков
The ticket is created:
https://issues.apache.org/jira/browse/IGNITE-8054

2018-03-27 10:04 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:

> Sorry, Dmitriy, I meant Ignite, not GridGain (memories of pre-Apache
> coding). I am assuming that Alex Menshikov was referring to the GridLongList
> class in Ignite.
>
> D.

Re: Optimize GridLongList serialization

Александр Меньшиков
Hi.
I have finished this task. Just replaced

`writer.writeLongArray("arr", arr)`
with
`writer.writeLongArray("arr", arr, idx)`

Please review and merge if okay.

Jira: https://issues.apache.org/jira/browse/IGNITE-8054
PR: https://github.com/apache/ignite/pull/3748
CI:
https://ci.ignite.apache.org/viewType.html?buildTypeId=IgniteTests24Java8_RunAll&branch_IgniteTests24Java8=pull%2F3748%2Fhead&tab=buildTypeStatusDiv

2018-03-27 11:50 GMT+03:00 Александр Меньшиков <[hidden email]>:

> The ticket is created:
> https://issues.apache.org/jira/browse/IGNITE-8054

Re: Optimize GridLongList serialization

Dmitriy Pavlov
Hi Alexander,

Would the new implementation of GridLongList be able to read a value
serialized by the old implementation?

Is it possible that the old implementation would not be able to read a value
written by the new one?

Sincerely,
Dmitriy Pavlov

On Mon, May 7, 2018 at 11:30, Александр Меньшиков <[hidden email]> wrote:

> Hi.
> I have finished this task. Just replaced
>
> `writer.writeLongArray("arr", arr)`
> with
> `writer.writeLongArray("arr", arr, idx)`
>
> Please review and merge if okay.
>
> Jira: https://issues.apache.org/jira/browse/IGNITE-8054
> PR: https://github.com/apache/ignite/pull/3748
> CI:
> https://ci.ignite.apache.org/viewType.html?buildTypeId=IgniteTests24Java8_RunAll&branch_IgniteTests24Java8=pull%2F3748%2Fhead&tab=buildTypeStatusDiv

Re: Optimize GridLongList serialization

Александр Меньшиков
Hi Dmitry,

It's 100% compatible. I changed just one line so that only the used part of
the array is serialized; it simply calls a different method of the writer.

I think that's enough for now. We can return to the discussion about a compact
GridLongList in the future, when the 3.0 release gets closer.
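
One way to picture the compatibility argument is the toy model below. It is a sketch under stated assumptions, not Ignite's actual reader or wire format: it assumes the array travels with a length prefix followed by `idx`, and that a reader only ever looks at the first `idx` elements. Under those assumptions a reader recovers the same logical list whether the writer sent the full backing array or only the used prefix. The class `CompatDemo` and its methods are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;

/** Hypothetical toy model of the old and new formats; not Ignite's wire format. */
public class CompatDemo {
    /** Writes the first len elements with a length prefix, followed by idx. */
    static byte[] write(long[] arr, int len, int idx) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);

        out.writeInt(len);
        for (int i = 0; i < len; i++)
            out.writeLong(arr[i]);
        out.writeInt(idx);
        out.flush();

        return bytes.toByteArray();
    }

    /** Reads the array and idx, then returns only the logically used elements. */
    static long[] readLogical(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));

        int len = in.readInt();
        long[] arr = new long[len];
        for (int i = 0; i < len; i++)
            arr[i] = in.readLong();
        int idx = in.readInt();

        return Arrays.copyOf(arr, idx);
    }

    public static void main(String[] args) throws IOException {
        long[] backing = {7, 8, 9, 0, 0, 0, 0, 0};
        int idx = 3;

        byte[] oldFormat = write(backing, backing.length, idx); // Full backing array.
        byte[] newFormat = write(backing, idx, idx);            // Used prefix only.

        System.out.println("old size: " + oldFormat.length + ", new size: " + newFormat.length);
        System.out.println("same logical content: "
            + Arrays.equals(readLogical(oldFormat), readLogical(newFormat)));
    }
}
```

In this model the old message is 72 bytes and the new one is 32 bytes, and both decode to the same three elements.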

2018-05-08 19:33 GMT+03:00 Dmitry Pavlov <[hidden email]>:

> Hi Alexander,
>
> Would the new implementation of GridLongList be able to read a value
> serialized by the old implementation?
>
> Is it possible that the old implementation would not be able to read a value
> written by the new one?
>
> Sincerely,
> Dmitriy Pavlov