Hi,
Lets discuss SQL DML (INSERT/UPDATE) current behavior specific: Ignite doesn't check a type of input objects when hidden columns _key, _value is used in a DML statements. I describe the current behavior for example: 1. Cache configuration: 'setIndexedTypes(PersonKey.class, Person.class))' 2. PersonKey type contains 'int id' field. 3. SQL statement: 'INSERT INTO test (_val, _key) VALUES (?, ?)' Cases: 1. Invalid value object type: - Any value object may be passed as a query parameter - Query is executed without an error and returns '1' (one row updated); - There is not inserted row at the 'SELECT * FROM test' results. - cache.get(key) returns inserted object; 2. Invalid key object type: 2.1 Non-primitive object is passed and binary representation doesn't contain 'id' field. - Query is executed without error and returns '1' (one row updated); - The inserted row is available by 'SELECT *' and the row contains id = null; 2.2 Non-primitive object is passed and binary representation contains 'id' field. - The inserted row is available by 'SELECT *' and the row contains expected 'id' field; - The cache entry cannot be gathered by 'cache.get' operation with the corresponding 'PersonKey(id)' (keys differ). I propose to check type of the user's input object. I guess that using _key/_val columns works close to 'cache.put()' but it looks like significant usability issue. To confuse the 'PersonKey.class.getName()' and 'node.binary().builder("PersonKey")' is a typical mistake of Ignite newcomers. One more argument for check: SQL INSERT sematic means the row is inserted into the specified TABLE, not into the cache. So, throw IgniteSQLException is expected behavior in this case, i think. [1]. https://issues.apache.org/jira/browse/IGNITE-5250 -- Taras Ledkov Mail-To: [hidden email] |
Folks,
Do we want to preserve the annotation-based configuration? There are too many ways to configure SQL indexes/fields. For instance, if our new SQL API could see and access all of the fields out-of-the-box (without any extra settings) and DDL will be used to define indexed fields then that would be a huge usability improvement. - Denis On Thu, Feb 21, 2019 at 5:27 AM Taras Ledkov <[hidden email]> wrote: > Hi, > > Lets discuss SQL DML (INSERT/UPDATE) current behavior specific: > > Ignite doesn't check a type of input objects when hidden columns _key, > _value is used in a DML statements. > I describe the current behavior for example: > > 1. Cache configuration: 'setIndexedTypes(PersonKey.class, Person.class))' > 2. PersonKey type contains 'int id' field. > 3. SQL statement: 'INSERT INTO test (_val, _key) VALUES (?, ?)' > > Cases: > 1. Invalid value object type: > - Any value object may be passed as a query parameter > - Query is executed without an error and returns '1' (one row updated); > - There is not inserted row at the 'SELECT * FROM test' results. > - cache.get(key) returns inserted object; > > 2. Invalid key object type: > 2.1 Non-primitive object is passed and binary representation doesn't > contain 'id' field. > - Query is executed without error and returns '1' (one row updated); > - The inserted row is available by 'SELECT *' and the row contains id = > null; > 2.2 Non-primitive object is passed and binary representation contains > 'id' field. > - The inserted row is available by 'SELECT *' and the row contains > expected 'id' field; > - The cache entry cannot be gathered by 'cache.get' operation with the > corresponding 'PersonKey(id)' (keys differ). > > I propose to check type of the user's input object. > > I guess that using _key/_val columns works close to 'cache.put()' but it > looks like significant usability issue. > To confuse the 'PersonKey.class.getName()' and > 'node.binary().builder("PersonKey")' is a typical mistake of Ignite > newcomers. > > One more argument for check: SQL INSERT sematic means the row is > inserted into the specified TABLE, not into the cache. > So, throw IgniteSQLException is expected behavior in this case, i think. > > [1]. https://issues.apache.org/jira/browse/IGNITE-5250 > > -- > Taras Ledkov > Mail-To: [hidden email] > > |
Denis,
SQL is a language with strict schema what was one of significant factors of it's worldwide success. I doubt we will ever have SQL without configuration/definiton, because otherwise it will be not SQL, but something else (e.g. document-oriented, JSON, whatever). On Fri, Feb 22, 2019 at 1:52 AM Denis Magda <[hidden email]> wrote: > Folks, > > Do we want to preserve the annotation-based configuration? There are too > many ways to configure SQL indexes/fields. > > For instance, if our new SQL API could see and access all of the fields > out-of-the-box (without any extra settings) and DDL will be used to define > indexed fields then that would be a huge usability improvement. > > - > Denis > > > On Thu, Feb 21, 2019 at 5:27 AM Taras Ledkov <[hidden email]> wrote: > > > Hi, > > > > Lets discuss SQL DML (INSERT/UPDATE) current behavior specific: > > > > Ignite doesn't check a type of input objects when hidden columns _key, > > _value is used in a DML statements. > > I describe the current behavior for example: > > > > 1. Cache configuration: 'setIndexedTypes(PersonKey.class, > Person.class))' > > 2. PersonKey type contains 'int id' field. > > 3. SQL statement: 'INSERT INTO test (_val, _key) VALUES (?, ?)' > > > > Cases: > > 1. Invalid value object type: > > - Any value object may be passed as a query parameter > > - Query is executed without an error and returns '1' (one row updated); > > - There is not inserted row at the 'SELECT * FROM test' results. > > - cache.get(key) returns inserted object; > > > > 2. Invalid key object type: > > 2.1 Non-primitive object is passed and binary representation doesn't > > contain 'id' field. > > - Query is executed without error and returns '1' (one row updated); > > - The inserted row is available by 'SELECT *' and the row contains id = > > null; > > 2.2 Non-primitive object is passed and binary representation contains > > 'id' field. > > - The inserted row is available by 'SELECT *' and the row contains > > expected 'id' field; > > - The cache entry cannot be gathered by 'cache.get' operation with the > > corresponding 'PersonKey(id)' (keys differ). > > > > I propose to check type of the user's input object. > > > > I guess that using _key/_val columns works close to 'cache.put()' but it > > looks like significant usability issue. > > To confuse the 'PersonKey.class.getName()' and > > 'node.binary().builder("PersonKey")' is a typical mistake of Ignite > > newcomers. > > > > One more argument for check: SQL INSERT sematic means the row is > > inserted into the specified TABLE, not into the cache. > > So, throw IgniteSQLException is expected behavior in this case, i think. > > > > [1]. https://issues.apache.org/jira/browse/IGNITE-5250 > > > > -- > > Taras Ledkov > > Mail-To: [hidden email] > > > > > |
Vladimir,
That's understood. I'm just thinking of a use case different from the DDL approach where the schema is defined initially. Let's say that someone configured caches with CacheConfiguration and now puts an Object in the cache. For that person, it would be helpful to skip the Annotations or QueryEntities approaches for queryable fields definitions (not even indexes). For instance, the person might simply query some fields with the primary index in the WHERE clause and this shouldn't require any extra settings. Yes, it's clear that it might be extremely challenging to support but imagine how usable the API could become if we can get rid of Annotations and QueryEntities. Basically, my idea is that all of the objects and their fields stored in the caches should be visible to SQL w/o extra settings. If someone wants to create indexes then use DDL which was designed for this. - Denis On Fri, Feb 22, 2019 at 2:27 AM Vladimir Ozerov <[hidden email]> wrote: > Denis, > > SQL is a language with strict schema what was one of significant factors of > it's worldwide success. I doubt we will ever have SQL without > configuration/definiton, because otherwise it will be not SQL, but > something else (e.g. document-oriented, JSON, whatever). > > On Fri, Feb 22, 2019 at 1:52 AM Denis Magda <[hidden email]> wrote: > > > Folks, > > > > Do we want to preserve the annotation-based configuration? There are too > > many ways to configure SQL indexes/fields. > > > > For instance, if our new SQL API could see and access all of the fields > > out-of-the-box (without any extra settings) and DDL will be used to > define > > indexed fields then that would be a huge usability improvement. > > > > - > > Denis > > > > > > On Thu, Feb 21, 2019 at 5:27 AM Taras Ledkov <[hidden email]> > wrote: > > > > > Hi, > > > > > > Lets discuss SQL DML (INSERT/UPDATE) current behavior specific: > > > > > > Ignite doesn't check a type of input objects when hidden columns _key, > > > _value is used in a DML statements. > > > I describe the current behavior for example: > > > > > > 1. Cache configuration: 'setIndexedTypes(PersonKey.class, > > Person.class))' > > > 2. PersonKey type contains 'int id' field. > > > 3. SQL statement: 'INSERT INTO test (_val, _key) VALUES (?, ?)' > > > > > > Cases: > > > 1. Invalid value object type: > > > - Any value object may be passed as a query parameter > > > - Query is executed without an error and returns '1' (one row updated); > > > - There is not inserted row at the 'SELECT * FROM test' results. > > > - cache.get(key) returns inserted object; > > > > > > 2. Invalid key object type: > > > 2.1 Non-primitive object is passed and binary representation doesn't > > > contain 'id' field. > > > - Query is executed without error and returns '1' (one row updated); > > > - The inserted row is available by 'SELECT *' and the row contains id = > > > null; > > > 2.2 Non-primitive object is passed and binary representation contains > > > 'id' field. > > > - The inserted row is available by 'SELECT *' and the row contains > > > expected 'id' field; > > > - The cache entry cannot be gathered by 'cache.get' operation with the > > > corresponding 'PersonKey(id)' (keys differ). > > > > > > I propose to check type of the user's input object. > > > > > > I guess that using _key/_val columns works close to 'cache.put()' but > it > > > looks like significant usability issue. > > > To confuse the 'PersonKey.class.getName()' and > > > 'node.binary().builder("PersonKey")' is a typical mistake of Ignite > > > newcomers. > > > > > > One more argument for check: SQL INSERT sematic means the row is > > > inserted into the specified TABLE, not into the cache. > > > So, throw IgniteSQLException is expected behavior in this case, i > think. > > > > > > [1]. https://issues.apache.org/jira/browse/IGNITE-5250 > > > > > > -- > > > Taras Ledkov > > > Mail-To: [hidden email] > > > > > > > > > |
Denis,
Yes, this is what my answer was about - you cannot have SQL without defining fields in advance. Because it breaks a lot of standard SQL invariants and virtually makes the whole language unusable. For instance, think of product behavior in the following cases: 1) User queries an empty cache with a query "SELECT a FROM table" - what should happen - exception or empty result? How would I know whether field "a" will appear in future? 2) User executed a command "ALTER TABLE ... ADD COLUMN b" - how can I understand whether it is possible or not to add a column without strict schema? 3) "ALTER TABLE ... DROP COLUMN c" - what should happen if user will add an object with field "c" after that? 4) User connects to Ignite from Tableau and navigates through schema - what should be shown? That is, you cannot have SQL without schema because it is at the very heart of the technology. But you can have schema-less noSQL database. Let's do not invent a hybrid with tons of corner cases and separate learning curve. It should be enough just to rethink and simplify our configuration - reshape QueryEntity, deprecate all SQL annotations, allow only one table per cache, allow to define SQL script to be executed on cache start or so. As far as schemaless - it is viable approach for sure, but should be considered either outside of SQL (e.g. a kind of predicate/criteria API which can be merged with ScanQuery) or as a special datatype in SQL ecosystem (like is is done with JSON in many RDBMS databases). Vladimir. On Fri, Feb 22, 2019 at 11:01 PM Denis Magda <[hidden email]> wrote: > Vladimir, > > That's understood. I'm just thinking of a use case different from the DDL > approach where the schema is defined initially. Let's say that someone > configured caches with CacheConfiguration and now puts an Object in the > cache. For that person, it would be helpful to skip the Annotations or > QueryEntities approaches for queryable fields definitions (not even > indexes). For instance, the person might simply query some fields with the > primary index in the WHERE clause and this shouldn't require any extra > settings. Yes, it's clear that it might be extremely challenging to support > but imagine how usable the API could become if we can get rid of > Annotations and QueryEntities. > > Basically, my idea is that all of the objects and their fields stored in > the caches should be visible to SQL w/o extra settings. If someone wants to > create indexes then use DDL which was designed for this. > > > - > Denis > > > On Fri, Feb 22, 2019 at 2:27 AM Vladimir Ozerov <[hidden email]> > wrote: > > > Denis, > > > > SQL is a language with strict schema what was one of significant factors > of > > it's worldwide success. I doubt we will ever have SQL without > > configuration/definiton, because otherwise it will be not SQL, but > > something else (e.g. document-oriented, JSON, whatever). > > > > On Fri, Feb 22, 2019 at 1:52 AM Denis Magda <[hidden email]> wrote: > > > > > Folks, > > > > > > Do we want to preserve the annotation-based configuration? There are > too > > > many ways to configure SQL indexes/fields. > > > > > > For instance, if our new SQL API could see and access all of the fields > > > out-of-the-box (without any extra settings) and DDL will be used to > > define > > > indexed fields then that would be a huge usability improvement. > > > > > > - > > > Denis > > > > > > > > > On Thu, Feb 21, 2019 at 5:27 AM Taras Ledkov <[hidden email]> > > wrote: > > > > > > > Hi, > > > > > > > > Lets discuss SQL DML (INSERT/UPDATE) current behavior specific: > > > > > > > > Ignite doesn't check a type of input objects when hidden columns > _key, > > > > _value is used in a DML statements. > > > > I describe the current behavior for example: > > > > > > > > 1. Cache configuration: 'setIndexedTypes(PersonKey.class, > > > Person.class))' > > > > 2. PersonKey type contains 'int id' field. > > > > 3. SQL statement: 'INSERT INTO test (_val, _key) VALUES (?, ?)' > > > > > > > > Cases: > > > > 1. Invalid value object type: > > > > - Any value object may be passed as a query parameter > > > > - Query is executed without an error and returns '1' (one row > updated); > > > > - There is not inserted row at the 'SELECT * FROM test' results. > > > > - cache.get(key) returns inserted object; > > > > > > > > 2. Invalid key object type: > > > > 2.1 Non-primitive object is passed and binary representation doesn't > > > > contain 'id' field. > > > > - Query is executed without error and returns '1' (one row updated); > > > > - The inserted row is available by 'SELECT *' and the row contains > id = > > > > null; > > > > 2.2 Non-primitive object is passed and binary representation contains > > > > 'id' field. > > > > - The inserted row is available by 'SELECT *' and the row contains > > > > expected 'id' field; > > > > - The cache entry cannot be gathered by 'cache.get' operation with > the > > > > corresponding 'PersonKey(id)' (keys differ). > > > > > > > > I propose to check type of the user's input object. > > > > > > > > I guess that using _key/_val columns works close to 'cache.put()' but > > it > > > > looks like significant usability issue. > > > > To confuse the 'PersonKey.class.getName()' and > > > > 'node.binary().builder("PersonKey")' is a typical mistake of Ignite > > > > newcomers. > > > > > > > > One more argument for check: SQL INSERT sematic means the row is > > > > inserted into the specified TABLE, not into the cache. > > > > So, throw IgniteSQLException is expected behavior in this case, i > > think. > > > > > > > > [1]. https://issues.apache.org/jira/browse/IGNITE-5250 > > > > > > > > -- > > > > Taras Ledkov > > > > Mail-To: [hidden email] > > > > > > > > > > > > > > |
Vladimir,
Ok, agreed, let's not boil the ocean...at least for now ;) -- Denis Magda On Sat, Feb 23, 2019 at 12:50 AM Vladimir Ozerov <[hidden email]> wrote: > Denis, > > Yes, this is what my answer was about - you cannot have SQL without > defining fields in advance. Because it breaks a lot of standard SQL > invariants and virtually makes the whole language unusable. For instance, > think of product behavior in the following cases: > 1) User queries an empty cache with a query "SELECT a FROM table" - what > should happen - exception or empty result? How would I know whether field > "a" will appear in future? > 2) User executed a command "ALTER TABLE ... ADD COLUMN b" - how can I > understand whether it is possible or not to add a column without strict > schema? > 3) "ALTER TABLE ... DROP COLUMN c" - what should happen if user will add an > object with field "c" after that? > 4) User connects to Ignite from Tableau and navigates through schema - what > should be shown? > > That is, you cannot have SQL without schema because it is at the very heart > of the technology. But you can have schema-less noSQL database. > > Let's do not invent a hybrid with tons of corner cases and separate > learning curve. It should be enough just to rethink and simplify our > configuration - reshape QueryEntity, deprecate all SQL annotations, allow > only one table per cache, allow to define SQL script to be executed on > cache start or so. > > As far as schemaless - it is viable approach for sure, but should be > considered either outside of SQL (e.g. a kind of predicate/criteria API > which can be merged with ScanQuery) or as a special datatype in SQL > ecosystem (like is is done with JSON in many RDBMS databases). > > Vladimir. > > > > > On Fri, Feb 22, 2019 at 11:01 PM Denis Magda <[hidden email]> wrote: > > > Vladimir, > > > > That's understood. I'm just thinking of a use case different from the DDL > > approach where the schema is defined initially. Let's say that someone > > configured caches with CacheConfiguration and now puts an Object in the > > cache. For that person, it would be helpful to skip the Annotations or > > QueryEntities approaches for queryable fields definitions (not even > > indexes). For instance, the person might simply query some fields with > the > > primary index in the WHERE clause and this shouldn't require any extra > > settings. Yes, it's clear that it might be extremely challenging to > support > > but imagine how usable the API could become if we can get rid of > > Annotations and QueryEntities. > > > > Basically, my idea is that all of the objects and their fields stored in > > the caches should be visible to SQL w/o extra settings. If someone wants > to > > create indexes then use DDL which was designed for this. > > > > > > - > > Denis > > > > > > On Fri, Feb 22, 2019 at 2:27 AM Vladimir Ozerov <[hidden email]> > > wrote: > > > > > Denis, > > > > > > SQL is a language with strict schema what was one of significant > factors > > of > > > it's worldwide success. I doubt we will ever have SQL without > > > configuration/definiton, because otherwise it will be not SQL, but > > > something else (e.g. document-oriented, JSON, whatever). > > > > > > On Fri, Feb 22, 2019 at 1:52 AM Denis Magda <[hidden email]> wrote: > > > > > > > Folks, > > > > > > > > Do we want to preserve the annotation-based configuration? There are > > too > > > > many ways to configure SQL indexes/fields. > > > > > > > > For instance, if our new SQL API could see and access all of the > fields > > > > out-of-the-box (without any extra settings) and DDL will be used to > > > define > > > > indexed fields then that would be a huge usability improvement. > > > > > > > > - > > > > Denis > > > > > > > > > > > > On Thu, Feb 21, 2019 at 5:27 AM Taras Ledkov <[hidden email]> > > > wrote: > > > > > > > > > Hi, > > > > > > > > > > Lets discuss SQL DML (INSERT/UPDATE) current behavior specific: > > > > > > > > > > Ignite doesn't check a type of input objects when hidden columns > > _key, > > > > > _value is used in a DML statements. > > > > > I describe the current behavior for example: > > > > > > > > > > 1. Cache configuration: 'setIndexedTypes(PersonKey.class, > > > > Person.class))' > > > > > 2. PersonKey type contains 'int id' field. > > > > > 3. SQL statement: 'INSERT INTO test (_val, _key) VALUES (?, ?)' > > > > > > > > > > Cases: > > > > > 1. Invalid value object type: > > > > > - Any value object may be passed as a query parameter > > > > > - Query is executed without an error and returns '1' (one row > > updated); > > > > > - There is not inserted row at the 'SELECT * FROM test' results. > > > > > - cache.get(key) returns inserted object; > > > > > > > > > > 2. Invalid key object type: > > > > > 2.1 Non-primitive object is passed and binary representation > doesn't > > > > > contain 'id' field. > > > > > - Query is executed without error and returns '1' (one row > updated); > > > > > - The inserted row is available by 'SELECT *' and the row contains > > id = > > > > > null; > > > > > 2.2 Non-primitive object is passed and binary representation > contains > > > > > 'id' field. > > > > > - The inserted row is available by 'SELECT *' and the row contains > > > > > expected 'id' field; > > > > > - The cache entry cannot be gathered by 'cache.get' operation with > > the > > > > > corresponding 'PersonKey(id)' (keys differ). > > > > > > > > > > I propose to check type of the user's input object. > > > > > > > > > > I guess that using _key/_val columns works close to 'cache.put()' > but > > > it > > > > > looks like significant usability issue. > > > > > To confuse the 'PersonKey.class.getName()' and > > > > > 'node.binary().builder("PersonKey")' is a typical mistake of Ignite > > > > > newcomers. > > > > > > > > > > One more argument for check: SQL INSERT sematic means the row is > > > > > inserted into the specified TABLE, not into the cache. > > > > > So, throw IgniteSQLException is expected behavior in this case, i > > > think. > > > > > > > > > > [1]. https://issues.apache.org/jira/browse/IGNITE-5250 > > > > > > > > > > -- > > > > > Taras Ledkov > > > > > Mail-To: [hidden email] > > > > > > > > > > > > > > > > > > > > |
Hi Taras,
As far as your original question :-) I would say that user should have only one way to update data with DML - through plain attributes. That is, if we have a composite value with attributes "a" and "b", then we should: UPDATE table SET a=?, b=? WHERE ... // Allow UPDATE table SET _VAL=? WHERE ... // Disallow But if the value is an attribute itself (e.g. in case of primitive), then DML should be allowed on it for sure: UPDATE table SET _VAL=? WHERE ... // Allow What do you think? On Sat, Feb 23, 2019 at 6:50 PM Denis Magda <[hidden email]> wrote: > Vladimir, > > Ok, agreed, let's not boil the ocean...at least for now ;) > > -- > Denis Magda > > > On Sat, Feb 23, 2019 at 12:50 AM Vladimir Ozerov <[hidden email]> > wrote: > > > Denis, > > > > Yes, this is what my answer was about - you cannot have SQL without > > defining fields in advance. Because it breaks a lot of standard SQL > > invariants and virtually makes the whole language unusable. For instance, > > think of product behavior in the following cases: > > 1) User queries an empty cache with a query "SELECT a FROM table" - what > > should happen - exception or empty result? How would I know whether field > > "a" will appear in future? > > 2) User executed a command "ALTER TABLE ... ADD COLUMN b" - how can I > > understand whether it is possible or not to add a column without strict > > schema? > > 3) "ALTER TABLE ... DROP COLUMN c" - what should happen if user will add > an > > object with field "c" after that? > > 4) User connects to Ignite from Tableau and navigates through schema - > what > > should be shown? > > > > That is, you cannot have SQL without schema because it is at the very > heart > > of the technology. But you can have schema-less noSQL database. > > > > Let's do not invent a hybrid with tons of corner cases and separate > > learning curve. It should be enough just to rethink and simplify our > > configuration - reshape QueryEntity, deprecate all SQL annotations, allow > > only one table per cache, allow to define SQL script to be executed on > > cache start or so. > > > > As far as schemaless - it is viable approach for sure, but should be > > considered either outside of SQL (e.g. a kind of predicate/criteria API > > which can be merged with ScanQuery) or as a special datatype in SQL > > ecosystem (like is is done with JSON in many RDBMS databases). > > > > Vladimir. > > > > > > > > > > On Fri, Feb 22, 2019 at 11:01 PM Denis Magda <[hidden email]> wrote: > > > > > Vladimir, > > > > > > That's understood. I'm just thinking of a use case different from the > DDL > > > approach where the schema is defined initially. Let's say that someone > > > configured caches with CacheConfiguration and now puts an Object in the > > > cache. For that person, it would be helpful to skip the Annotations or > > > QueryEntities approaches for queryable fields definitions (not even > > > indexes). For instance, the person might simply query some fields with > > the > > > primary index in the WHERE clause and this shouldn't require any extra > > > settings. Yes, it's clear that it might be extremely challenging to > > support > > > but imagine how usable the API could become if we can get rid of > > > Annotations and QueryEntities. > > > > > > Basically, my idea is that all of the objects and their fields stored > in > > > the caches should be visible to SQL w/o extra settings. If someone > wants > > to > > > create indexes then use DDL which was designed for this. > > > > > > > > > - > > > Denis > > > > > > > > > On Fri, Feb 22, 2019 at 2:27 AM Vladimir Ozerov <[hidden email]> > > > wrote: > > > > > > > Denis, > > > > > > > > SQL is a language with strict schema what was one of significant > > factors > > > of > > > > it's worldwide success. I doubt we will ever have SQL without > > > > configuration/definiton, because otherwise it will be not SQL, but > > > > something else (e.g. document-oriented, JSON, whatever). > > > > > > > > On Fri, Feb 22, 2019 at 1:52 AM Denis Magda <[hidden email]> > wrote: > > > > > > > > > Folks, > > > > > > > > > > Do we want to preserve the annotation-based configuration? There > are > > > too > > > > > many ways to configure SQL indexes/fields. > > > > > > > > > > For instance, if our new SQL API could see and access all of the > > fields > > > > > out-of-the-box (without any extra settings) and DDL will be used to > > > > define > > > > > indexed fields then that would be a huge usability improvement. > > > > > > > > > > - > > > > > Denis > > > > > > > > > > > > > > > On Thu, Feb 21, 2019 at 5:27 AM Taras Ledkov <[hidden email] > > > > > > wrote: > > > > > > > > > > > Hi, > > > > > > > > > > > > Lets discuss SQL DML (INSERT/UPDATE) current behavior specific: > > > > > > > > > > > > Ignite doesn't check a type of input objects when hidden columns > > > _key, > > > > > > _value is used in a DML statements. > > > > > > I describe the current behavior for example: > > > > > > > > > > > > 1. Cache configuration: 'setIndexedTypes(PersonKey.class, > > > > > Person.class))' > > > > > > 2. PersonKey type contains 'int id' field. > > > > > > 3. SQL statement: 'INSERT INTO test (_val, _key) VALUES (?, ?)' > > > > > > > > > > > > Cases: > > > > > > 1. Invalid value object type: > > > > > > - Any value object may be passed as a query parameter > > > > > > - Query is executed without an error and returns '1' (one row > > > updated); > > > > > > - There is not inserted row at the 'SELECT * FROM test' results. > > > > > > - cache.get(key) returns inserted object; > > > > > > > > > > > > 2. Invalid key object type: > > > > > > 2.1 Non-primitive object is passed and binary representation > > doesn't > > > > > > contain 'id' field. > > > > > > - Query is executed without error and returns '1' (one row > > updated); > > > > > > - The inserted row is available by 'SELECT *' and the row > contains > > > id = > > > > > > null; > > > > > > 2.2 Non-primitive object is passed and binary representation > > contains > > > > > > 'id' field. > > > > > > - The inserted row is available by 'SELECT *' and the row > contains > > > > > > expected 'id' field; > > > > > > - The cache entry cannot be gathered by 'cache.get' operation > with > > > the > > > > > > corresponding 'PersonKey(id)' (keys differ). > > > > > > > > > > > > I propose to check type of the user's input object. > > > > > > > > > > > > I guess that using _key/_val columns works close to 'cache.put()' > > but > > > > it > > > > > > looks like significant usability issue. > > > > > > To confuse the 'PersonKey.class.getName()' and > > > > > > 'node.binary().builder("PersonKey")' is a typical mistake of > Ignite > > > > > > newcomers. > > > > > > > > > > > > One more argument for check: SQL INSERT sematic means the row is > > > > > > inserted into the specified TABLE, not into the cache. > > > > > > So, throw IgniteSQLException is expected behavior in this case, i > > > > think. > > > > > > > > > > > > [1]. https://issues.apache.org/jira/browse/IGNITE-5250 > > > > > > > > > > > > -- > > > > > > Taras Ledkov > > > > > > Mail-To: [hidden email] > > > > > > > > > > > > > > > > > > > > > > > > > > > |
Hello!
> UPDATE table SET _VAL=? WHERE ... // Disallow Breaking change and as such should be deferred to 3.0. All of our tables have types, so we can disallow doing _VAL=? where parameter object is not of table's type, and semantics break down here - you INSERT object in cache, get "1" rows updated but can't select this row from table. But we probably should not disallow _VAL=? where parameter object IS of table's type, since there may be users whose workflow depends on that and it isn't fixable easily. For example, they can have objects of which only subset of fields is indexed, the rest is not. Then they are inserting them via SQL as shown. Regards, -- Ilya Kasnacheev ср, 27 февр. 2019 г. в 12:10, Vladimir Ozerov <[hidden email]>: > Hi Taras, > > As far as your original question :-) I would say that user should have only > one way to update data with DML - through plain attributes. That is, if we > have a composite value with attributes "a" and "b", then we should: > UPDATE table SET a=?, b=? WHERE ... // Allow > UPDATE table SET _VAL=? WHERE ... // Disallow > > But if the value is an attribute itself (e.g. in case of primitive), then > DML should be allowed on it for sure: > UPDATE table SET _VAL=? WHERE ... // Allow > > What do you think? > > On Sat, Feb 23, 2019 at 6:50 PM Denis Magda <[hidden email]> wrote: > > > Vladimir, > > > > Ok, agreed, let's not boil the ocean...at least for now ;) > > > > -- > > Denis Magda > > > > > > On Sat, Feb 23, 2019 at 12:50 AM Vladimir Ozerov <[hidden email]> > > wrote: > > > > > Denis, > > > > > > Yes, this is what my answer was about - you cannot have SQL without > > > defining fields in advance. Because it breaks a lot of standard SQL > > > invariants and virtually makes the whole language unusable. For > instance, > > > think of product behavior in the following cases: > > > 1) User queries an empty cache with a query "SELECT a FROM table" - > what > > > should happen - exception or empty result? How would I know whether > field > > > "a" will appear in future? > > > 2) User executed a command "ALTER TABLE ... ADD COLUMN b" - how can I > > > understand whether it is possible or not to add a column without strict > > > schema? > > > 3) "ALTER TABLE ... DROP COLUMN c" - what should happen if user will > add > > an > > > object with field "c" after that? > > > 4) User connects to Ignite from Tableau and navigates through schema - > > what > > > should be shown? > > > > > > That is, you cannot have SQL without schema because it is at the very > > heart > > > of the technology. But you can have schema-less noSQL database. > > > > > > Let's do not invent a hybrid with tons of corner cases and separate > > > learning curve. It should be enough just to rethink and simplify our > > > configuration - reshape QueryEntity, deprecate all SQL annotations, > allow > > > only one table per cache, allow to define SQL script to be executed on > > > cache start or so. > > > > > > As far as schemaless - it is viable approach for sure, but should be > > > considered either outside of SQL (e.g. a kind of predicate/criteria API > > > which can be merged with ScanQuery) or as a special datatype in SQL > > > ecosystem (like is is done with JSON in many RDBMS databases). > > > > > > Vladimir. > > > > > > > > > > > > > > > On Fri, Feb 22, 2019 at 11:01 PM Denis Magda <[hidden email]> > wrote: > > > > > > > Vladimir, > > > > > > > > That's understood. I'm just thinking of a use case different from the > > DDL > > > > approach where the schema is defined initially. Let's say that > someone > > > > configured caches with CacheConfiguration and now puts an Object in > the > > > > cache. For that person, it would be helpful to skip the Annotations > or > > > > QueryEntities approaches for queryable fields definitions (not even > > > > indexes). For instance, the person might simply query some fields > with > > > the > > > > primary index in the WHERE clause and this shouldn't require any > extra > > > > settings. Yes, it's clear that it might be extremely challenging to > > > support > > > > but imagine how usable the API could become if we can get rid of > > > > Annotations and QueryEntities. > > > > > > > > Basically, my idea is that all of the objects and their fields stored > > in > > > > the caches should be visible to SQL w/o extra settings. If someone > > wants > > > to > > > > create indexes then use DDL which was designed for this. > > > > > > > > > > > > - > > > > Denis > > > > > > > > > > > > On Fri, Feb 22, 2019 at 2:27 AM Vladimir Ozerov < > [hidden email]> > > > > wrote: > > > > > > > > > Denis, > > > > > > > > > > SQL is a language with strict schema what was one of significant > > > factors > > > > of > > > > > it's worldwide success. I doubt we will ever have SQL without > > > > > configuration/definiton, because otherwise it will be not SQL, but > > > > > something else (e.g. document-oriented, JSON, whatever). > > > > > > > > > > On Fri, Feb 22, 2019 at 1:52 AM Denis Magda <[hidden email]> > > wrote: > > > > > > > > > > > Folks, > > > > > > > > > > > > Do we want to preserve the annotation-based configuration? There > > are > > > > too > > > > > > many ways to configure SQL indexes/fields. > > > > > > > > > > > > For instance, if our new SQL API could see and access all of the > > > fields > > > > > > out-of-the-box (without any extra settings) and DDL will be used > to > > > > > define > > > > > > indexed fields then that would be a huge usability improvement. > > > > > > > > > > > > - > > > > > > Denis > > > > > > > > > > > > > > > > > > On Thu, Feb 21, 2019 at 5:27 AM Taras Ledkov < > [hidden email] > > > > > > > > wrote: > > > > > > > > > > > > > Hi, > > > > > > > > > > > > > > Lets discuss SQL DML (INSERT/UPDATE) current behavior specific: > > > > > > > > > > > > > > Ignite doesn't check a type of input objects when hidden > columns > > > > _key, > > > > > > > _value is used in a DML statements. > > > > > > > I describe the current behavior for example: > > > > > > > > > > > > > > 1. Cache configuration: 'setIndexedTypes(PersonKey.class, > > > > > > Person.class))' > > > > > > > 2. PersonKey type contains 'int id' field. > > > > > > > 3. SQL statement: 'INSERT INTO test (_val, _key) VALUES (?, ?)' > > > > > > > > > > > > > > Cases: > > > > > > > 1. Invalid value object type: > > > > > > > - Any value object may be passed as a query parameter > > > > > > > - Query is executed without an error and returns '1' (one row > > > > updated); > > > > > > > - There is not inserted row at the 'SELECT * FROM test' > results. > > > > > > > - cache.get(key) returns inserted object; > > > > > > > > > > > > > > 2. Invalid key object type: > > > > > > > 2.1 Non-primitive object is passed and binary representation > > > doesn't > > > > > > > contain 'id' field. > > > > > > > - Query is executed without error and returns '1' (one row > > > updated); > > > > > > > - The inserted row is available by 'SELECT *' and the row > > contains > > > > id = > > > > > > > null; > > > > > > > 2.2 Non-primitive object is passed and binary representation > > > contains > > > > > > > 'id' field. > > > > > > > - The inserted row is available by 'SELECT *' and the row > > contains > > > > > > > expected 'id' field; > > > > > > > - The cache entry cannot be gathered by 'cache.get' operation > > with > > > > the > > > > > > > corresponding 'PersonKey(id)' (keys differ). > > > > > > > > > > > > > > I propose to check type of the user's input object. > > > > > > > > > > > > > > I guess that using _key/_val columns works close to > 'cache.put()' > > > but > > > > > it > > > > > > > looks like significant usability issue. > > > > > > > To confuse the 'PersonKey.class.getName()' and > > > > > > > 'node.binary().builder("PersonKey")' is a typical mistake of > > Ignite > > > > > > > newcomers. > > > > > > > > > > > > > > One more argument for check: SQL INSERT sematic means the row > is > > > > > > > inserted into the specified TABLE, not into the cache. > > > > > > > So, throw IgniteSQLException is expected behavior in this > case, i > > > > > think. > > > > > > > > > > > > > > [1]. https://issues.apache.org/jira/browse/IGNITE-5250 > > > > > > > > > > > > > > -- > > > > > > > Taras Ledkov > > > > > > > Mail-To: [hidden email] > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > |
I do not think this should be deferred, even though it changes default
behavior. Clean and simple semantics is much more important. In this regards DML was created incorrectly in the first place. We will fix it, leaving hidden fallback mode for those users who use this strange semantics. ср, 27 февр. 2019 г. в 12:57, Ilya Kasnacheev <[hidden email]>: > Hello! > > > UPDATE table SET _VAL=? WHERE ... // Disallow > > Breaking change and as such should be deferred to 3.0. > > All of our tables have types, so we can disallow doing _VAL=? where > parameter object is not of table's type, and semantics break down here - > you INSERT object in cache, get "1" rows updated but can't select this row > from table. > But we probably should not disallow _VAL=? where parameter object IS of > table's type, since there may be users whose workflow depends on that and > it isn't fixable easily. > > For example, they can have objects of which only subset of fields is > indexed, the rest is not. Then they are inserting them via SQL as shown. > > Regards, > -- > Ilya Kasnacheev > > > ср, 27 февр. 2019 г. в 12:10, Vladimir Ozerov <[hidden email]>: > > > Hi Taras, > > > > As far as your original question :-) I would say that user should have > only > > one way to update data with DML - through plain attributes. That is, if > we > > have a composite value with attributes "a" and "b", then we should: > > UPDATE table SET a=?, b=? WHERE ... // Allow > > UPDATE table SET _VAL=? WHERE ... // Disallow > > > > But if the value is an attribute itself (e.g. in case of primitive), then > > DML should be allowed on it for sure: > > UPDATE table SET _VAL=? WHERE ... // Allow > > > > What do you think? > > > > On Sat, Feb 23, 2019 at 6:50 PM Denis Magda <[hidden email]> wrote: > > > > > Vladimir, > > > > > > Ok, agreed, let's not boil the ocean...at least for now ;) > > > > > > -- > > > Denis Magda > > > > > > > > > On Sat, Feb 23, 2019 at 12:50 AM Vladimir Ozerov <[hidden email] > > > > > wrote: > > > > > > > Denis, > > > > > > > > Yes, this is what my answer was about - you cannot have SQL without > > > > defining fields in advance. Because it breaks a lot of standard SQL > > > > invariants and virtually makes the whole language unusable. For > > instance, > > > > think of product behavior in the following cases: > > > > 1) User queries an empty cache with a query "SELECT a FROM table" - > > what > > > > should happen - exception or empty result? How would I know whether > > field > > > > "a" will appear in future? > > > > 2) User executed a command "ALTER TABLE ... ADD COLUMN b" - how can I > > > > understand whether it is possible or not to add a column without > strict > > > > schema? > > > > 3) "ALTER TABLE ... DROP COLUMN c" - what should happen if user will > > add > > > an > > > > object with field "c" after that? > > > > 4) User connects to Ignite from Tableau and navigates through schema > - > > > what > > > > should be shown? > > > > > > > > That is, you cannot have SQL without schema because it is at the very > > > heart > > > > of the technology. But you can have schema-less noSQL database. > > > > > > > > Let's do not invent a hybrid with tons of corner cases and separate > > > > learning curve. It should be enough just to rethink and simplify our > > > > configuration - reshape QueryEntity, deprecate all SQL annotations, > > allow > > > > only one table per cache, allow to define SQL script to be executed > on > > > > cache start or so. > > > > > > > > As far as schemaless - it is viable approach for sure, but should be > > > > considered either outside of SQL (e.g. a kind of predicate/criteria > API > > > > which can be merged with ScanQuery) or as a special datatype in SQL > > > > ecosystem (like is is done with JSON in many RDBMS databases). > > > > > > > > Vladimir. > > > > > > > > > > > > > > > > > > > > On Fri, Feb 22, 2019 at 11:01 PM Denis Magda <[hidden email]> > > wrote: > > > > > > > > > Vladimir, > > > > > > > > > > That's understood. I'm just thinking of a use case different from > the > > > DDL > > > > > approach where the schema is defined initially. Let's say that > > someone > > > > > configured caches with CacheConfiguration and now puts an Object in > > the > > > > > cache. For that person, it would be helpful to skip the Annotations > > or > > > > > QueryEntities approaches for queryable fields definitions (not even > > > > > indexes). For instance, the person might simply query some fields > > with > > > > the > > > > > primary index in the WHERE clause and this shouldn't require any > > extra > > > > > settings. Yes, it's clear that it might be extremely challenging to > > > > support > > > > > but imagine how usable the API could become if we can get rid of > > > > > Annotations and QueryEntities. > > > > > > > > > > Basically, my idea is that all of the objects and their fields > stored > > > in > > > > > the caches should be visible to SQL w/o extra settings. If someone > > > wants > > > > to > > > > > create indexes then use DDL which was designed for this. > > > > > > > > > > > > > > > - > > > > > Denis > > > > > > > > > > > > > > > On Fri, Feb 22, 2019 at 2:27 AM Vladimir Ozerov < > > [hidden email]> > > > > > wrote: > > > > > > > > > > > Denis, > > > > > > > > > > > > SQL is a language with strict schema what was one of significant > > > > factors > > > > > of > > > > > > it's worldwide success. I doubt we will ever have SQL without > > > > > > configuration/definiton, because otherwise it will be not SQL, > but > > > > > > something else (e.g. document-oriented, JSON, whatever). > > > > > > > > > > > > On Fri, Feb 22, 2019 at 1:52 AM Denis Magda <[hidden email]> > > > wrote: > > > > > > > > > > > > > Folks, > > > > > > > > > > > > > > Do we want to preserve the annotation-based configuration? > There > > > are > > > > > too > > > > > > > many ways to configure SQL indexes/fields. > > > > > > > > > > > > > > For instance, if our new SQL API could see and access all of > the > > > > fields > > > > > > > out-of-the-box (without any extra settings) and DDL will be > used > > to > > > > > > define > > > > > > > indexed fields then that would be a huge usability improvement. > > > > > > > > > > > > > > - > > > > > > > Denis > > > > > > > > > > > > > > > > > > > > > On Thu, Feb 21, 2019 at 5:27 AM Taras Ledkov < > > [hidden email] > > > > > > > > > > wrote: > > > > > > > > > > > > > > > Hi, > > > > > > > > > > > > > > > > Lets discuss SQL DML (INSERT/UPDATE) current behavior > specific: > > > > > > > > > > > > > > > > Ignite doesn't check a type of input objects when hidden > > columns > > > > > _key, > > > > > > > > _value is used in a DML statements. > > > > > > > > I describe the current behavior for example: > > > > > > > > > > > > > > > > 1. Cache configuration: 'setIndexedTypes(PersonKey.class, > > > > > > > Person.class))' > > > > > > > > 2. PersonKey type contains 'int id' field. > > > > > > > > 3. SQL statement: 'INSERT INTO test (_val, _key) VALUES (?, > ?)' > > > > > > > > > > > > > > > > Cases: > > > > > > > > 1. Invalid value object type: > > > > > > > > - Any value object may be passed as a query parameter > > > > > > > > - Query is executed without an error and returns '1' (one row > > > > > updated); > > > > > > > > - There is not inserted row at the 'SELECT * FROM test' > > results. > > > > > > > > - cache.get(key) returns inserted object; > > > > > > > > > > > > > > > > 2. Invalid key object type: > > > > > > > > 2.1 Non-primitive object is passed and binary representation > > > > doesn't > > > > > > > > contain 'id' field. > > > > > > > > - Query is executed without error and returns '1' (one row > > > > updated); > > > > > > > > - The inserted row is available by 'SELECT *' and the row > > > contains > > > > > id = > > > > > > > > null; > > > > > > > > 2.2 Non-primitive object is passed and binary representation > > > > contains > > > > > > > > 'id' field. > > > > > > > > - The inserted row is available by 'SELECT *' and the row > > > contains > > > > > > > > expected 'id' field; > > > > > > > > - The cache entry cannot be gathered by 'cache.get' operation > > > with > > > > > the > > > > > > > > corresponding 'PersonKey(id)' (keys differ). > > > > > > > > > > > > > > > > I propose to check type of the user's input object. > > > > > > > > > > > > > > > > I guess that using _key/_val columns works close to > > 'cache.put()' > > > > but > > > > > > it > > > > > > > > looks like significant usability issue. > > > > > > > > To confuse the 'PersonKey.class.getName()' and > > > > > > > > 'node.binary().builder("PersonKey")' is a typical mistake of > > > Ignite > > > > > > > > newcomers. > > > > > > > > > > > > > > > > One more argument for check: SQL INSERT sematic means the row > > is > > > > > > > > inserted into the specified TABLE, not into the cache. > > > > > > > > So, throw IgniteSQLException is expected behavior in this > > case, i > > > > > > think. > > > > > > > > > > > > > > > > [1]. https://issues.apache.org/jira/browse/IGNITE-5250 > > > > > > > > > > > > > > > > -- > > > > > > > > Taras Ledkov > > > > > > > > Mail-To: [hidden email] > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > |
Hi,
Unwrapped key works properly now, but unwrapped value doesn't. Example: execSql("CREATE TABLE test_0 (id integer primary key, name varchar) WITH \"WRAP_VALUE\""); execSql("INSERT INTO test_0 (id, _val) VALUES (?, ?)", 0, "0"); The exception is thrown: IgniteSQLException: Value conversion failed [column=_VAL, from=java.lang.String, to=java.lang.Object] So, I propose to choose one of the following: - disallow use the hidden columns _key, _val in the INSERT/UPDATE/MERGE - check the input object type. 27.02.2019 13:19, Vladimir Ozerov пишет: > I do not think this should be deferred, even though it changes default > behavior. Clean and simple semantics is much more important. In this > regards DML was created incorrectly in the first place. We will fix it, > leaving hidden fallback mode for those users who use this strange semantics. > > ср, 27 февр. 2019 г. в 12:57, Ilya Kasnacheev <[hidden email]>: > >> Hello! >> >>> UPDATE table SET _VAL=? WHERE ... // Disallow >> Breaking change and as such should be deferred to 3.0. >> >> All of our tables have types, so we can disallow doing _VAL=? where >> parameter object is not of table's type, and semantics break down here - >> you INSERT object in cache, get "1" rows updated but can't select this row >> from table. >> But we probably should not disallow _VAL=? where parameter object IS of >> table's type, since there may be users whose workflow depends on that and >> it isn't fixable easily. >> >> For example, they can have objects of which only subset of fields is >> indexed, the rest is not. Then they are inserting them via SQL as shown. >> >> Regards, >> -- >> Ilya Kasnacheev >> >> >> ср, 27 февр. 2019 г. в 12:10, Vladimir Ozerov <[hidden email]>: >> >>> Hi Taras, >>> >>> As far as your original question :-) I would say that user should have >> only >>> one way to update data with DML - through plain attributes. That is, if >> we >>> have a composite value with attributes "a" and "b", then we should: >>> UPDATE table SET a=?, b=? WHERE ... // Allow >>> UPDATE table SET _VAL=? WHERE ... // Disallow >>> >>> But if the value is an attribute itself (e.g. in case of primitive), then >>> DML should be allowed on it for sure: >>> UPDATE table SET _VAL=? WHERE ... // Allow >>> >>> What do you think? >>> >>> On Sat, Feb 23, 2019 at 6:50 PM Denis Magda <[hidden email]> wrote: >>> >>>> Vladimir, >>>> >>>> Ok, agreed, let's not boil the ocean...at least for now ;) >>>> >>>> -- >>>> Denis Magda >>>> >>>> >>>> On Sat, Feb 23, 2019 at 12:50 AM Vladimir Ozerov <[hidden email] >>>> wrote: >>>> >>>>> Denis, >>>>> >>>>> Yes, this is what my answer was about - you cannot have SQL without >>>>> defining fields in advance. Because it breaks a lot of standard SQL >>>>> invariants and virtually makes the whole language unusable. For >>> instance, >>>>> think of product behavior in the following cases: >>>>> 1) User queries an empty cache with a query "SELECT a FROM table" - >>> what >>>>> should happen - exception or empty result? How would I know whether >>> field >>>>> "a" will appear in future? >>>>> 2) User executed a command "ALTER TABLE ... ADD COLUMN b" - how can I >>>>> understand whether it is possible or not to add a column without >> strict >>>>> schema? >>>>> 3) "ALTER TABLE ... DROP COLUMN c" - what should happen if user will >>> add >>>> an >>>>> object with field "c" after that? >>>>> 4) User connects to Ignite from Tableau and navigates through schema >> - >>>> what >>>>> should be shown? >>>>> >>>>> That is, you cannot have SQL without schema because it is at the very >>>> heart >>>>> of the technology. But you can have schema-less noSQL database. >>>>> >>>>> Let's do not invent a hybrid with tons of corner cases and separate >>>>> learning curve. It should be enough just to rethink and simplify our >>>>> configuration - reshape QueryEntity, deprecate all SQL annotations, >>> allow >>>>> only one table per cache, allow to define SQL script to be executed >> on >>>>> cache start or so. >>>>> >>>>> As far as schemaless - it is viable approach for sure, but should be >>>>> considered either outside of SQL (e.g. a kind of predicate/criteria >> API >>>>> which can be merged with ScanQuery) or as a special datatype in SQL >>>>> ecosystem (like is is done with JSON in many RDBMS databases). >>>>> >>>>> Vladimir. >>>>> >>>>> >>>>> >>>>> >>>>> On Fri, Feb 22, 2019 at 11:01 PM Denis Magda <[hidden email]> >>> wrote: >>>>>> Vladimir, >>>>>> >>>>>> That's understood. I'm just thinking of a use case different from >> the >>>> DDL >>>>>> approach where the schema is defined initially. Let's say that >>> someone >>>>>> configured caches with CacheConfiguration and now puts an Object in >>> the >>>>>> cache. For that person, it would be helpful to skip the Annotations >>> or >>>>>> QueryEntities approaches for queryable fields definitions (not even >>>>>> indexes). For instance, the person might simply query some fields >>> with >>>>> the >>>>>> primary index in the WHERE clause and this shouldn't require any >>> extra >>>>>> settings. Yes, it's clear that it might be extremely challenging to >>>>> support >>>>>> but imagine how usable the API could become if we can get rid of >>>>>> Annotations and QueryEntities. >>>>>> >>>>>> Basically, my idea is that all of the objects and their fields >> stored >>>> in >>>>>> the caches should be visible to SQL w/o extra settings. If someone >>>> wants >>>>> to >>>>>> create indexes then use DDL which was designed for this. >>>>>> >>>>>> >>>>>> - >>>>>> Denis >>>>>> >>>>>> >>>>>> On Fri, Feb 22, 2019 at 2:27 AM Vladimir Ozerov < >>> [hidden email]> >>>>>> wrote: >>>>>> >>>>>>> Denis, >>>>>>> >>>>>>> SQL is a language with strict schema what was one of significant >>>>> factors >>>>>> of >>>>>>> it's worldwide success. I doubt we will ever have SQL without >>>>>>> configuration/definiton, because otherwise it will be not SQL, >> but >>>>>>> something else (e.g. document-oriented, JSON, whatever). >>>>>>> >>>>>>> On Fri, Feb 22, 2019 at 1:52 AM Denis Magda <[hidden email]> >>>> wrote: >>>>>>>> Folks, >>>>>>>> >>>>>>>> Do we want to preserve the annotation-based configuration? >> There >>>> are >>>>>> too >>>>>>>> many ways to configure SQL indexes/fields. >>>>>>>> >>>>>>>> For instance, if our new SQL API could see and access all of >> the >>>>> fields >>>>>>>> out-of-the-box (without any extra settings) and DDL will be >> used >>> to >>>>>>> define >>>>>>>> indexed fields then that would be a huge usability improvement. >>>>>>>> >>>>>>>> - >>>>>>>> Denis >>>>>>>> >>>>>>>> >>>>>>>> On Thu, Feb 21, 2019 at 5:27 AM Taras Ledkov < >>> [hidden email] >>>>>>> wrote: >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> Lets discuss SQL DML (INSERT/UPDATE) current behavior >> specific: >>>>>>>>> Ignite doesn't check a type of input objects when hidden >>> columns >>>>>> _key, >>>>>>>>> _value is used in a DML statements. >>>>>>>>> I describe the current behavior for example: >>>>>>>>> >>>>>>>>> 1. Cache configuration: 'setIndexedTypes(PersonKey.class, >>>>>>>> Person.class))' >>>>>>>>> 2. PersonKey type contains 'int id' field. >>>>>>>>> 3. SQL statement: 'INSERT INTO test (_val, _key) VALUES (?, >> ?)' >>>>>>>>> Cases: >>>>>>>>> 1. Invalid value object type: >>>>>>>>> - Any value object may be passed as a query parameter >>>>>>>>> - Query is executed without an error and returns '1' (one row >>>>>> updated); >>>>>>>>> - There is not inserted row at the 'SELECT * FROM test' >>> results. >>>>>>>>> - cache.get(key) returns inserted object; >>>>>>>>> >>>>>>>>> 2. Invalid key object type: >>>>>>>>> 2.1 Non-primitive object is passed and binary representation >>>>> doesn't >>>>>>>>> contain 'id' field. >>>>>>>>> - Query is executed without error and returns '1' (one row >>>>> updated); >>>>>>>>> - The inserted row is available by 'SELECT *' and the row >>>> contains >>>>>> id = >>>>>>>>> null; >>>>>>>>> 2.2 Non-primitive object is passed and binary representation >>>>> contains >>>>>>>>> 'id' field. >>>>>>>>> - The inserted row is available by 'SELECT *' and the row >>>> contains >>>>>>>>> expected 'id' field; >>>>>>>>> - The cache entry cannot be gathered by 'cache.get' operation >>>> with >>>>>> the >>>>>>>>> corresponding 'PersonKey(id)' (keys differ). >>>>>>>>> >>>>>>>>> I propose to check type of the user's input object. >>>>>>>>> >>>>>>>>> I guess that using _key/_val columns works close to >>> 'cache.put()' >>>>> but >>>>>>> it >>>>>>>>> looks like significant usability issue. >>>>>>>>> To confuse the 'PersonKey.class.getName()' and >>>>>>>>> 'node.binary().builder("PersonKey")' is a typical mistake of >>>> Ignite >>>>>>>>> newcomers. >>>>>>>>> >>>>>>>>> One more argument for check: SQL INSERT sematic means the row >>> is >>>>>>>>> inserted into the specified TABLE, not into the cache. >>>>>>>>> So, throw IgniteSQLException is expected behavior in this >>> case, i >>>>>>> think. >>>>>>>>> [1]. https://issues.apache.org/jira/browse/IGNITE-5250 >>>>>>>>> >>>>>>>>> -- >>>>>>>>> Taras Ledkov >>>>>>>>> Mail-To: [hidden email] >>>>>>>>> >>>>>>>>> Taras Ledkov Mail-To: [hidden email] |
In reply to this post by Vladimir Ozerov
Hi,
I was wrong about unwrapped _val. Lets introduce IgniteSystemProperties.IGNITE_ALLOW_KEY_VAL_COLUMNS_AT_DML to switch on current behavior and disallow composite _key, _val columns at the INSERT/MERGE/UPDATE statements by default. 27.02.2019 13:19, Vladimir Ozerov пишет: > I do not think this should be deferred, even though it changes default > behavior. Clean and simple semantics is much more important. In this > regards DML was created incorrectly in the first place. We will fix it, > leaving hidden fallback mode for those users who use this strange semantics. > > ср, 27 февр. 2019 г. в 12:57, Ilya Kasnacheev <[hidden email]>: > >> Hello! >> >>> UPDATE table SET _VAL=? WHERE ... // Disallow >> Breaking change and as such should be deferred to 3.0. >> >> All of our tables have types, so we can disallow doing _VAL=? where >> parameter object is not of table's type, and semantics break down here - >> you INSERT object in cache, get "1" rows updated but can't select this row >> from table. >> But we probably should not disallow _VAL=? where parameter object IS of >> table's type, since there may be users whose workflow depends on that and >> it isn't fixable easily. >> >> For example, they can have objects of which only subset of fields is >> indexed, the rest is not. Then they are inserting them via SQL as shown. >> >> Regards, >> -- >> Ilya Kasnacheev >> >> >> ср, 27 февр. 2019 г. в 12:10, Vladimir Ozerov <[hidden email]>: >> >>> Hi Taras, >>> >>> As far as your original question :-) I would say that user should have >> only >>> one way to update data with DML - through plain attributes. That is, if >> we >>> have a composite value with attributes "a" and "b", then we should: >>> UPDATE table SET a=?, b=? WHERE ... // Allow >>> UPDATE table SET _VAL=? WHERE ... // Disallow >>> >>> But if the value is an attribute itself (e.g. in case of primitive), then >>> DML should be allowed on it for sure: >>> UPDATE table SET _VAL=? WHERE ... // Allow >>> >>> What do you think? >>> >>> On Sat, Feb 23, 2019 at 6:50 PM Denis Magda <[hidden email]> wrote: >>> >>>> Vladimir, >>>> >>>> Ok, agreed, let's not boil the ocean...at least for now ;) >>>> >>>> -- >>>> Denis Magda >>>> >>>> >>>> On Sat, Feb 23, 2019 at 12:50 AM Vladimir Ozerov <[hidden email] >>>> wrote: >>>> >>>>> Denis, >>>>> >>>>> Yes, this is what my answer was about - you cannot have SQL without >>>>> defining fields in advance. Because it breaks a lot of standard SQL >>>>> invariants and virtually makes the whole language unusable. For >>> instance, >>>>> think of product behavior in the following cases: >>>>> 1) User queries an empty cache with a query "SELECT a FROM table" - >>> what >>>>> should happen - exception or empty result? How would I know whether >>> field >>>>> "a" will appear in future? >>>>> 2) User executed a command "ALTER TABLE ... ADD COLUMN b" - how can I >>>>> understand whether it is possible or not to add a column without >> strict >>>>> schema? >>>>> 3) "ALTER TABLE ... DROP COLUMN c" - what should happen if user will >>> add >>>> an >>>>> object with field "c" after that? >>>>> 4) User connects to Ignite from Tableau and navigates through schema >> - >>>> what >>>>> should be shown? >>>>> >>>>> That is, you cannot have SQL without schema because it is at the very >>>> heart >>>>> of the technology. But you can have schema-less noSQL database. >>>>> >>>>> Let's do not invent a hybrid with tons of corner cases and separate >>>>> learning curve. It should be enough just to rethink and simplify our >>>>> configuration - reshape QueryEntity, deprecate all SQL annotations, >>> allow >>>>> only one table per cache, allow to define SQL script to be executed >> on >>>>> cache start or so. >>>>> >>>>> As far as schemaless - it is viable approach for sure, but should be >>>>> considered either outside of SQL (e.g. a kind of predicate/criteria >> API >>>>> which can be merged with ScanQuery) or as a special datatype in SQL >>>>> ecosystem (like is is done with JSON in many RDBMS databases). >>>>> >>>>> Vladimir. >>>>> >>>>> >>>>> >>>>> >>>>> On Fri, Feb 22, 2019 at 11:01 PM Denis Magda <[hidden email]> >>> wrote: >>>>>> Vladimir, >>>>>> >>>>>> That's understood. I'm just thinking of a use case different from >> the >>>> DDL >>>>>> approach where the schema is defined initially. Let's say that >>> someone >>>>>> configured caches with CacheConfiguration and now puts an Object in >>> the >>>>>> cache. For that person, it would be helpful to skip the Annotations >>> or >>>>>> QueryEntities approaches for queryable fields definitions (not even >>>>>> indexes). For instance, the person might simply query some fields >>> with >>>>> the >>>>>> primary index in the WHERE clause and this shouldn't require any >>> extra >>>>>> settings. Yes, it's clear that it might be extremely challenging to >>>>> support >>>>>> but imagine how usable the API could become if we can get rid of >>>>>> Annotations and QueryEntities. >>>>>> >>>>>> Basically, my idea is that all of the objects and their fields >> stored >>>> in >>>>>> the caches should be visible to SQL w/o extra settings. If someone >>>> wants >>>>> to >>>>>> create indexes then use DDL which was designed for this. >>>>>> >>>>>> >>>>>> - >>>>>> Denis >>>>>> >>>>>> >>>>>> On Fri, Feb 22, 2019 at 2:27 AM Vladimir Ozerov < >>> [hidden email]> >>>>>> wrote: >>>>>> >>>>>>> Denis, >>>>>>> >>>>>>> SQL is a language with strict schema what was one of significant >>>>> factors >>>>>> of >>>>>>> it's worldwide success. I doubt we will ever have SQL without >>>>>>> configuration/definiton, because otherwise it will be not SQL, >> but >>>>>>> something else (e.g. document-oriented, JSON, whatever). >>>>>>> >>>>>>> On Fri, Feb 22, 2019 at 1:52 AM Denis Magda <[hidden email]> >>>> wrote: >>>>>>>> Folks, >>>>>>>> >>>>>>>> Do we want to preserve the annotation-based configuration? >> There >>>> are >>>>>> too >>>>>>>> many ways to configure SQL indexes/fields. >>>>>>>> >>>>>>>> For instance, if our new SQL API could see and access all of >> the >>>>> fields >>>>>>>> out-of-the-box (without any extra settings) and DDL will be >> used >>> to >>>>>>> define >>>>>>>> indexed fields then that would be a huge usability improvement. >>>>>>>> >>>>>>>> - >>>>>>>> Denis >>>>>>>> >>>>>>>> >>>>>>>> On Thu, Feb 21, 2019 at 5:27 AM Taras Ledkov < >>> [hidden email] >>>>>>> wrote: >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> Lets discuss SQL DML (INSERT/UPDATE) current behavior >> specific: >>>>>>>>> Ignite doesn't check a type of input objects when hidden >>> columns >>>>>> _key, >>>>>>>>> _value is used in a DML statements. >>>>>>>>> I describe the current behavior for example: >>>>>>>>> >>>>>>>>> 1. Cache configuration: 'setIndexedTypes(PersonKey.class, >>>>>>>> Person.class))' >>>>>>>>> 2. PersonKey type contains 'int id' field. >>>>>>>>> 3. SQL statement: 'INSERT INTO test (_val, _key) VALUES (?, >> ?)' >>>>>>>>> Cases: >>>>>>>>> 1. Invalid value object type: >>>>>>>>> - Any value object may be passed as a query parameter >>>>>>>>> - Query is executed without an error and returns '1' (one row >>>>>> updated); >>>>>>>>> - There is not inserted row at the 'SELECT * FROM test' >>> results. >>>>>>>>> - cache.get(key) returns inserted object; >>>>>>>>> >>>>>>>>> 2. Invalid key object type: >>>>>>>>> 2.1 Non-primitive object is passed and binary representation >>>>> doesn't >>>>>>>>> contain 'id' field. >>>>>>>>> - Query is executed without error and returns '1' (one row >>>>> updated); >>>>>>>>> - The inserted row is available by 'SELECT *' and the row >>>> contains >>>>>> id = >>>>>>>>> null; >>>>>>>>> 2.2 Non-primitive object is passed and binary representation >>>>> contains >>>>>>>>> 'id' field. >>>>>>>>> - The inserted row is available by 'SELECT *' and the row >>>> contains >>>>>>>>> expected 'id' field; >>>>>>>>> - The cache entry cannot be gathered by 'cache.get' operation >>>> with >>>>>> the >>>>>>>>> corresponding 'PersonKey(id)' (keys differ). >>>>>>>>> >>>>>>>>> I propose to check type of the user's input object. >>>>>>>>> >>>>>>>>> I guess that using _key/_val columns works close to >>> 'cache.put()' >>>>> but >>>>>>> it >>>>>>>>> looks like significant usability issue. >>>>>>>>> To confuse the 'PersonKey.class.getName()' and >>>>>>>>> 'node.binary().builder("PersonKey")' is a typical mistake of >>>> Ignite >>>>>>>>> newcomers. >>>>>>>>> >>>>>>>>> One more argument for check: SQL INSERT sematic means the row >>> is >>>>>>>>> inserted into the specified TABLE, not into the cache. >>>>>>>>> So, throw IgniteSQLException is expected behavior in this >>> case, i >>>>>>> think. >>>>>>>>> [1]. https://issues.apache.org/jira/browse/IGNITE-5250 >>>>>>>>> >>>>>>>>> -- >>>>>>>>> Taras Ledkov >>>>>>>>> Mail-To: [hidden email] >>>>>>>>> >>>>>>>>> Taras Ledkov Mail-To: [hidden email] |
Free forum by Nabble | Edit this page |