DML data streaming

Re: DML data streaming

dsetrakyan
On Wed, Feb 15, 2017 at 4:28 AM, Vladimir Ozerov <[hidden email]>
wrote:

> Ok, let's put aside current fields configuration, I'll create separate
> thread for it. As far as _KEY and _VAL, proposed change is exactly about
> mappings:
>
> class QueryEntity {
>     ...
>     String keyFieldName;
>     String valFieldName;
>     ...
> }
>
> The key thing is that we will not require users to be aware of our system
> columns. Normally user should not bother about existence of hidden _KEY and
> _VAL columns. Instead, we just allow them to optionally reference the whole
> key and/or val through predefined name.
>
>
Vladimir, how will it work from the DDL perspective? Say, when a
user wants to create a table in Ignite?

Re: DML data streaming

al.psc
Folks,

Regarding INSERT semantics in JDBC DML streaming mode - I've left only
INSERT support, as we agreed before.

However, the current architecture of the streaming internals does not
give any clear way to intercept key duplicates and inform the user.
For example, I can't just throw an exception from the stream receiver
(which is, to my knowledge, the only place where we could filter
erroneous keys), because that would cause the whole batch to be
remapped, which is clearly not what we want here.

Printing a warning to the log from the receiver is of little to no use,
as it will happen on data nodes, so the end user won't see anything.

What I've introduced for now is an optional config param that turns on
allowOverwrite on the streamer used in the DML operation.

Does anyone have any thoughts about what could/should be done
to inform the user about key duplicates in streaming mode? Or
should we just leave it as it is now?
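To make the trade-off concrete, here is a minimal sketch in plain Java with no Ignite dependency. SketchReceiver and its methods are hypothetical names invented for illustration; they are not the Ignite streamer API. The sketch only shows the two behaviours under discussion: with allowOverwrite off, a duplicate key is skipped and recorded instead of failing the batch; with it on, the new value replaces the old one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a stream receiver; NOT the Ignite API. */
final class SketchReceiver<K, V> {
    private final Map<K, V> store = new ConcurrentHashMap<>();
    private final boolean allowOverwrite;
    // Keys skipped as duplicates. Getting these back to the client is the open question.
    private final List<K> duplicates = new ArrayList<>();

    SketchReceiver(boolean allowOverwrite) {
        this.allowOverwrite = allowOverwrite;
    }

    /** Applies one batch without throwing, so a duplicate never fails the whole batch. */
    synchronized void receive(Map<K, V> batch) {
        for (Map.Entry<K, V> e : batch.entrySet()) {
            if (allowOverwrite) {
                store.put(e.getKey(), e.getValue());
            } else if (store.putIfAbsent(e.getKey(), e.getValue()) != null) {
                duplicates.add(e.getKey()); // existing value is kept
            }
        }
    }

    V get(K key) {
        return store.get(key);
    }

    /** Returns a snapshot of the keys that were skipped so far. */
    synchronized List<K> duplicates() {
        return new ArrayList<>(duplicates);
    }
}
```

In this sketch the duplicate list lives on the receiving side, which mirrors the problem described above: the information exists on the data node, but nothing delivers it to the end user.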

Regards,
Alex

2017-02-15 23:42 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:

> Vladimir, how will it work from the DDL perspective. Let's say whenever
> user wants to create a table in Ignite?

Re: DML data streaming

dsetrakyan
On Wed, Feb 15, 2017 at 2:41 PM, Alexander Paschenko <
[hidden email]> wrote:

> Folks,
>
> Regarding INSERT semantics in JDBC DML streaming mode - I've left only
> INSERTs supports as we'd agreed before.
>
> However, current architecture of streaming related internals does not
> give any clear way to intercept key duplicates and inform the user -
> say, I can't just throw an exception from stream receiver (which is to
> my knowledge the only place where we could filter erroneous keys) as
> long as it will lead to whole batch remap and it's clearly not what we
> want here.
>
> Printing warning to log from the receiver is of little to no use as it
> will happen on data nodes so the end user won't see anything.
>

However, you still must do it. You should try throttling the identical log
messages, so we don't flood the log.
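One possible shape for such throttling, as a minimal sketch in plain Java: ThrottledLog is a hypothetical helper (not an existing Ignite class), the in-memory list stands in for a real logger, and the clock is injected only so that the behaviour can be checked deterministically.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.LongSupplier;

/** Hypothetical helper: writes each distinct message at most once per interval. */
final class ThrottledLog {
    private final long intervalMillis;
    private final LongSupplier clock; // injectable time source, e.g. System::currentTimeMillis
    private final Map<String, Long> lastLogged = new HashMap<>();
    private final List<String> emitted = new ArrayList<>(); // stand-in for a real logger

    ThrottledLog(long intervalMillis, LongSupplier clock) {
        this.intervalMillis = intervalMillis;
        this.clock = clock;
    }

    /** Returns true if the message was written, false if it was suppressed. */
    synchronized boolean warn(String msg) {
        long now = clock.getAsLong();
        Long prev = lastLogged.get(msg);
        if (prev != null && now - prev < intervalMillis) {
            return false; // identical message seen too recently
        }
        lastLogged.put(msg, now);
        emitted.add(msg);
        return true;
    }

    synchronized int emittedCount() {
        return emitted.size();
    }
}
```

Keying the throttle on the full message text means that warnings which embed the duplicate key would each count as distinct; a real implementation would probably throttle on a message template instead.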


>
> What I've introduced for now is optional config param that turns on
> allowOverwrite on the streamer used in DML operation.
>

Agree, sounds like a good use of the flag. Are you setting it via a
JDBC/ODBC connection flag?


> Does anyone have any thoughts about what could/should be done
> regarding informing user about key duplicates in streaming mode? Or
> probably we should just let it be as it is now?
>

In my view, we should introduce some generic error trap callback, e.g.
onSqlError(...), for all unhandled SQL errors. The user should provide it in
the configuration before startup. What do you think?
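A rough sketch of what such a callback could look like, in plain Java. Only the name onSqlError(...) comes from this thread; SqlErrorListener, SketchSqlConfig, SketchSqlExecutor and all signatures are assumptions made up for illustration, not an existing Ignite API.

```java
/** Hypothetical callback; only the method name is taken from the proposal. */
interface SqlErrorListener {
    void onSqlError(String sql, Exception error);
}

/** Hypothetical configuration object, populated before "startup". */
final class SketchSqlConfig {
    private SqlErrorListener listener = (sql, error) -> { /* default: ignore */ };

    SketchSqlConfig setSqlErrorListener(SqlErrorListener listener) {
        this.listener = listener;
        return this;
    }

    SqlErrorListener sqlErrorListener() {
        return listener;
    }
}

/** Hypothetical executor that routes otherwise unhandled errors to the listener. */
final class SketchSqlExecutor {
    private final SqlErrorListener listener;

    SketchSqlExecutor(SketchSqlConfig cfg) {
        // The listener is read once at startup, matching "provide it before startup".
        this.listener = cfg.sqlErrorListener();
    }

    void execute(String sql, Runnable action) {
        try {
            action.run();
        } catch (RuntimeException e) {
            listener.onSqlError(sql, e); // trap instead of propagating
        }
    }
}
```

A single callback taking the statement text and the exception keeps the sketch small; whether the real hook should also receive the failing keys or node information is left open here.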


>
> Regards,
> Alex
>

Re: DML data streaming

Vladimir Ozerov
In reply to this post by dsetrakyan
Dima,

At this point we require the following additional data which is outside of
standard SQL:
- Key type
- Value type
- Set of key columns

I do not know yet how we will define these values. At the very least, we can
calculate them automatically in some cases. For "keyFieldName" and
"valFieldName" things are easier, as we can always derive them from the table
definition.

Example 1 - primitives:

CREATE TABLE (
    pk_id BIGINT PRIMARY KEY,
    val   BIGINT
)

keyFieldName = "pk_id", valFieldName = "val"

Example 2 - composites:

CREATE TABLE (
    pk_id BIGINT PRIMARY KEY,
    val1  BIGINT,
    val2  VARCHAR
)

keyFieldName = "pk_id", valFieldName = null (because value is complex and
is composed of two attributes).
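The derivation in the two examples can be sketched as follows, in plain Java. SketchTable is a hypothetical, simplified table definition and not an Ignite class; returning null for a composite key is an assumption of the sketch, since the examples above only cover a single-column key.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified table definition; NOT an Ignite class. */
final class SketchTable {
    private final List<String> columns;    // all column names, in declaration order
    private final List<String> keyColumns; // columns that make up the primary key

    SketchTable(List<String> columns, List<String> keyColumns) {
        this.columns = new ArrayList<>(columns);
        this.keyColumns = new ArrayList<>(keyColumns);
    }

    /** Single key column name, or null for a composite key (assumption of this sketch). */
    String keyFieldName() {
        return keyColumns.size() == 1 ? keyColumns.get(0) : null;
    }

    /** Single non-key column name, or null when the value has several attributes. */
    String valFieldName() {
        List<String> valueColumns = new ArrayList<>(columns);
        valueColumns.removeAll(keyColumns);
        return valueColumns.size() == 1 ? valueColumns.get(0) : null;
    }
}
```

For Example 1 this yields keyFieldName "pk_id" and valFieldName "val"; for Example 2 it yields "pk_id" and null.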

Vladimir.


On Wed, Feb 15, 2017 at 11:42 PM, Dmitriy Setrakyan <[hidden email]>
wrote:

> Vladimir, how will it work from the DDL perspective. Let's say whenever
> user wants to create a table in Ignite?
>

Re: DML data streaming

dsetrakyan
Vladimir, I am not sure I understand your point. The value type name should
be the table name, no?

On Thu, Feb 16, 2017 at 12:13 AM, Vladimir Ozerov <[hidden email]>
wrote:

> Dima,
>
> At this point we require the following additional data which is outside of
> standard SQL:
> - Key type
> - Value type
> - Set of key columns
>
> I do not know yet how we will define these values. At the very least we can
> calculate them automatically in some cases. For "keyFieldName" and
> "valFieldName" things are easier, as we can always derive them from table
> definition.
>
> Example 1 - primitives:
>
> CREATE TABLE (
>     pk_id BIGINT PRIMARY KEY,
>     val   BIGINT
> )
>
> keyFieldName = "pk_id", valFieldName = "val"
>
> Example 2 - composites:
>
> CREATE TABLE (
>     pk_id BIGINT PRIMARY KEY,
>     val1  BIGINT,
>     val2  VARCHAR
> )
>
> keyFieldName = "pk_id", valFieldName = null (because value is complex and
> is composed of two attributes).
>
> Vladimir.
>

Re: DML data streaming

Vladimir Ozerov
Dima,

The value type name doesn't necessarily map to the table name. For instance,
what if I have two tables like this? They both have "java.lang.Long" as the
value type name.

CREATE TABLE t1 (
    pk_id BIGINT PRIMARY KEY,
    val BIGINT
)

CREATE TABLE t2 (
    pk_id BIGINT PRIMARY KEY,
    val BIGINT
)
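The collision can be illustrated with a small sketch in plain Java. SketchTypeRegistry is a hypothetical lookup structure invented for this example; it does not describe how Ignite actually stores table metadata. It only shows that a lookup keyed by value type name cannot tell t1 and t2 apart.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical registry that identifies a table by its value type name. */
final class SketchTypeRegistry {
    private final Map<String, String> tableByValueType = new HashMap<>();

    /**
     * Registers a table under its value type name.
     * Returns the table previously registered under that name, or null if none.
     */
    String register(String tableName, String valueTypeName) {
        return tableByValueType.put(valueTypeName, tableName);
    }

    /** Looks up the table for a value type name; ambiguous when types repeat. */
    String tableFor(String valueTypeName) {
        return tableByValueType.get(valueTypeName);
    }
}
```

Registering t1 and then t2, both with "java.lang.Long", makes the second call return "t1" as the displaced entry, and a later lookup by type name only finds "t2".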

On Fri, Feb 17, 2017 at 12:40 AM, Dmitriy Setrakyan <[hidden email]>
wrote:

> Vladimir, I am not sure I understand your point. The value type name should
> be the table name, no?

Re: DML data streaming

dsetrakyan
If we adopt a table-per-cache policy, then the table name should be equal to
the cache name, especially when the table is created via SQL.

For complex types, the type name should also be equal to the table name. If
the value type is primitive, then you can still use the table name in SQL and
use the table name as the cache name in code.
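Read literally, the naming rule above could be sketched like this in plain Java. SketchNaming and both method names are hypothetical, and passing null to mean "the value is a complex type" is a convention of the sketch only.

```java
/** Hypothetical naming helper for a table-per-cache policy; NOT an Ignite class. */
final class SketchNaming {
    private SketchNaming() {
    }

    /** Under table-per-cache, the cache name is simply the table name. */
    static String cacheName(String tableName) {
        return tableName;
    }

    /**
     * Value type name: the table name for complex values, or the primitive's
     * own type name when one is given (null means "complex" in this sketch).
     */
    static String valueTypeName(String tableName, String primitiveTypeOrNull) {
        return primitiveTypeOrNull == null ? tableName : primitiveTypeOrNull;
    }
}
```

With this rule, two tables that both store java.lang.Long values still share a value type name, but they are distinguished by table name in SQL and by cache name in code.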

In my view, the design works. Do you agree?

D.

On Thu, Feb 16, 2017 at 11:58 PM, Vladimir Ozerov <[hidden email]>
wrote:

> Dima,
>
> Value type name doesn't necessarily maps to table name. For instance, what
> if I have two tables like this? They both have "java.lang.Long" as type
> name.
>
> CREATE TABLE t1 (
>     pk_id BIGINT PRIMARY KEY,
>     val BIGINT
> )
>
> CREATE TABLE t2 (
>     pk_id BIGINT PRIMARY KEY,
>     val BIGINT
> )