Spark dataframe API will get wrong schema if user executes add/drop column DDL

Ray
Currently, when a user executes an add/drop column DDL statement, the
QueryEntity is not updated.

This results in Spark getting the wrong schema, because Spark relies on the
QueryEntity to construct the DataFrame schema.

According to Vladimir Ozerov's reply on the dev list,
http://apache-ignite-developers.2346864.n4.nabble.com/Schema-in-CacheConfig-is-not-updated-after-DDL-commands-Add-drop-column-Create-drop-index-td38002.html,
this behavior is by design, so I have decided to fix the issue on the Spark
side.


So I propose the following solution: instead of deriving the schema from the
QueryEntity, derive it from the metadata of a SQL SELECT statement.
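To illustrate the idea, here is a minimal sketch in plain JDBC. It runs a
zero-row SELECT and reads column names and types from the ResultSetMetaData,
which always reflects the current table definition, rather than from the stale
QueryEntity. The class name, the type-mapping helper, and the connection URL
are all hypothetical placeholders, not the actual Ignite Spark module code:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.Types;
import java.util.ArrayList;
import java.util.List;

public class SchemaFromSelect {

    // Hypothetical helper: map a JDBC type code to a Spark SQL type name.
    // A real implementation would cover the full set of SQL types.
    static String jdbcTypeToCatalyst(int t) {
        switch (t) {
            case Types.INTEGER: return "IntegerType";
            case Types.BIGINT:  return "LongType";
            case Types.DOUBLE:  return "DoubleType";
            case Types.BOOLEAN: return "BooleanType";
            case Types.VARCHAR: return "StringType";
            default:            return "StringType"; // fallback for the sketch
        }
    }

    // Sketch: derive (column name, type) pairs from a zero-row SELECT.
    // Requires a running Ignite node and the JDBC thin driver on the
    // classpath; the URL and table name are placeholders.
    static List<String[]> fetchSchema(String url, String table) throws SQLException {
        try (Connection conn = DriverManager.getConnection(url); // e.g. "jdbc:ignite:thin://127.0.0.1"
             ResultSet rs = conn.createStatement()
                     .executeQuery("SELECT * FROM " + table + " WHERE 1 = 0")) { // metadata only, no rows
            ResultSetMetaData md = rs.getMetaData();
            List<String[]> cols = new ArrayList<>();
            for (int i = 1; i <= md.getColumnCount(); i++) {
                cols.add(new String[] { md.getColumnName(i), jdbcTypeToCatalyst(md.getColumnType(i)) });
            }
            return cols;
        }
    }

    public static void main(String[] args) {
        System.out.println(jdbcTypeToCatalyst(Types.INTEGER));
    }
}
```

Because the metadata comes from the SQL engine itself, a column added or
dropped via DDL is visible immediately, which is exactly what the DataFrame
schema construction needs.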

Nikolay Izhikov, what do you think about this solution?

I have already created a ticket in JIRA:
https://issues.apache.org/jira/browse/IGNITE-10314

If you think this solution is OK, then I'll start implementing it.



--
Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/