Guys,
is it possible to configure caches dynamically and persist their configuration in some shape or form? Here's the use case:
- we want to create some caches in an already running cluster for a data set
- once that is done, we'll run some SQL queries on top of them
- ideally, we'd like to be able to save the cache configurations so that next time we don't need to rediscover the structures and field types, but can instead load them on (re)start

Is this a supported use case, or should everything be defined statically before the nodes start? It looks like the latter, but perhaps we are missing something.

Thanks in advance for any info,
Cos
Cos,
How are you going to create caches in an already started cluster? I think you will create a cache configuration and then get the cache via Ignite.getOrCreateCache(ccfg). So if your server cluster was completely restarted at some point, executing getOrCreateCache(ccfg) will create the cache again (if needed) or return the existing cache.

Also, AFAIK the CacheConfiguration class is serializable: you can save it somewhere and load it later if needed. Or you may define some XML files with cache beans and load them with IgniteSpringHelper.

Thoughts?

On Tue, Dec 22, 2015 at 3:32 PM, Konstantin Boudnik <[hidden email]> wrote:
> is it possible to configure caches dynamically and persist their configuration in some shape and form?
> [...]

--
Alexey Kuznetsov
GridGain Systems
www.gridgain.com
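The "save it somewhere and load it later" idea can be sketched with plain Java serialization. This is only an illustration under stated assumptions: `CacheSpec` and `ConfigStoreSketch` are hypothetical names, and `CacheSpec` is a stand-in used so that the sketch runs with the JDK alone; it is not an Ignite class. With Ignite on the classpath, the same two helper methods should work for a real CacheConfiguration, provided everything the configuration references is itself serializable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a serializable cache configuration (not an Ignite class). */
final class CacheSpec implements Serializable {
    private static final long serialVersionUID = 1L;

    final String name;
    /** Field name -> fully qualified Java type name, i.e. the discovered schema. */
    final LinkedHashMap<String, String> fields;

    CacheSpec(String name, Map<String, String> fields) {
        this.name = name;
        this.fields = new LinkedHashMap<>(fields);
    }
}

/** Saves and restores a configuration object with standard Java serialization. */
final class ConfigStoreSketch {
    /** Serializes the configuration; the bytes can be written to a file, a database, etc. */
    static byte[] toBytes(Serializable cfg) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(cfg);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    /** Restores a previously saved configuration of the expected type. */
    static <T> T fromBytes(byte[] bytes, Class<T> type) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return type.cast(in.readObject());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("Class of the saved configuration is not on the classpath", e);
        }
    }
}
```

On (re)start, a client would read the saved bytes, call `fromBytes`, and pass the restored configuration to Ignite.getOrCreateCache(ccfg), which creates the cache if needed or returns the existing one.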
On Tue, Dec 22, 2015 at 03:45PM, Alexey Kuznetsov wrote:
> Cos,
>
> How you are going to create caches in already started cluster?
> I think you will create a cache configuration and after that will get cache via Ignite.getOrCreateCache(ccfg).

I would imagine that I can write some Java code to describe the configuration if needed. After all, Spring is all over the place. And while I am not a big fan of overusing it, it's already there, so perhaps it can do something useful ;)

> So if you server cluster at some point was completely restarted, then executing getOrCreateCache(ccfg) will create cache again (if needed) or return existing cache.

I am trying to draw an analogy with either an RDBMS or a data processing framework. I know that neither is exact, but bear with me for a second. In the RDBMS world, the UX is to be able either to query existing tables or to create, populate and query new ones. No special auxiliary configuration is needed. In data processing frameworks like Spark, Flink, etc., the schema originates from the data (aka schema on read), so no special preparation steps are needed before the data can be read from storage and processed.

Now, in the case of Ignite the data needs to be transferred into RAM; however, the same schema-on-read (parsing code) or externalized metadata (a stored config or something) could be used to structure it on the fly. Hence, my cluster would have a higher level of runtime dynamism, as I could create new caches as I go, without restarting the cluster nodes on every sneeze.

> Also AFAIK CacheConfiguration class is serializable - you can save it somewhere and later load if needed.
> Or you may define some XML files with cache beans and load them with IgniteSpringHelper.
>
> Thoughts?

Basically, I am trying to think of how to make this whole thing more user-friendly and less oriented toward middleware developers. Wouldn't it be great if a user could load some external data and immediately start playing with it, doing some OLAP or even - oh horror :) - OLTP on it?
Does it make sense?
Cos
Cos,
As for schema-on-read, you can define all your caches in the XML configuration, and they will be pre-created for you. Will this do the trick?

D.

On Tue, Dec 22, 2015 at 7:30 PM, Konstantin Boudnik <[hidden email]> wrote:
> [...]
> Now, in the case of Ignite the data needs to be transferred to a RAM, however same schema-on-read (a parsing code) or an externalized metadata (stored config or something) could be used to structure it on-the-fly.
> [...]
In reply to this post by Konstantin Boudnik-2
>> Basically, I am trying to think of how to make this whole thing more user-friendly and less middleware developers oriented. Wouldn't it be great if a user can load some external data and immediately start playing with it and doing some OLAP or even - oh horror :) - OLTP on it?

I will describe how this could be done with Visor GUI from GridGain (based on Ignite). In Visor GUI you can start any cache from the UI by specifying an XML description of the cache. The user can specify a JDBC POJO store factory in the XML cache description and then, in a special "Load from store" dialog, load any data into the cache from an RDBMS using "select ... from ... where ..." queries. After that, the user can go to the SQL tab and run any SQL queries on the loaded data.

This does require, however, that the data source beans are already described in the nodes' XML configs, so that the data source bean name can be specified in the store factory.

You may implement something like what I described in your app; all the needed functionality is available in Ignite. In your app you may load data from a source other than an RDBMS.

--
Alexey Kuznetsov
GridGain Systems
www.gridgain.com
In reply to this post by dsetrakyan
What if I don't know the configuration in advance? Doesn't it mean that I would have to restart the nodes whenever a new cache is configured and needs to be added to the cluster?

Cos

On Tue, Dec 22, 2015 at 10:37PM, Dmitriy Setrakyan wrote:
> Cos,
>
> As far as schema-on-read, you can set all your caches in XML configuration, and they will be pre-created for you. Will this do the trick?
>
> D.
Cos, I am confused. What is the behavior you would like to see?
On Wed, Dec 23, 2015 at 12:01 AM, Konstantin Boudnik <[hidden email]> wrote:
> What if I don't know the configuration in advance? Doesn't it mean that I would have to restart the nodes whenever a new cache is configured and needs to be added to the cluster?
>
> Cos
Let me try to restate what I've said earlier.
- I have a running cluster
- I want to create a new cache whose configuration is totally unknown beforehand. It would be nice if I could do it by executing some custom Java code from a client node
- I want to be able to load POJO model classes _without_ restarting the nodes, but simply from, say, a network location (URL class loading?)
- once the cache is loaded and provisioned with some data (e.g. from a stream?), I want to be able to store its configuration, so that next time I don't need to write and execute client Java code (as in the step above)

Am I making myself clearer now?

I believe the main snag preventing us from getting on the same page is this: Ignite, as it stands right now, is very much oriented toward static cluster and cache configurations. Even the auto-discovery from an external RDBMS is semi-static, as it requires some user input and pre-existing configuration files. And that's the main issue IMO: the configuration fundamentally defines not only the caches but the whole cluster. If I want to use a slightly different config to join a node to an existing cluster, it will be rejected. In other words, the nodes are very homogeneous. I am not saying it is a bad thing, but it is clearly a limiting factor for cases like the one I described above.

Cheers,
Cos

On Wed, Dec 23, 2015 at 12:47AM, Dmitriy Setrakyan wrote:
> Cos, I am confused. What is the behavior you would like to see?
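For the "URL class loading?" item, the plain JDK mechanism looks roughly like the sketch below. It is an assumption-laden illustration, not something the thread confirms Ignite does: `RemoteModelLoaderSketch` is a hypothetical name, and whether server nodes can use classes loaded this way for cache keys and values is a separate question that this sketch does not answer.

```java
import java.net.URL;
import java.net.URLClassLoader;

/** Loads a model class by name from jar files or directories given as URLs. */
final class RemoteModelLoaderSketch {
    /**
     * Resolves className through a class loader that first delegates to its
     * parent and then searches the given locations.
     */
    static Class<?> load(URL[] locations, String className) {
        // The loader is intentionally left open: classes that the model
        // references are resolved lazily through the same loader later on.
        URLClassLoader loader =
            new URLClassLoader(locations, RemoteModelLoaderSketch.class.getClassLoader());
        try {
            return Class.forName(className, true, loader);
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException(
                "Class not found at the given locations: " + className, e);
        }
    }
}
```

A caller would pass something like the URL of a jar published on an internal server together with the fully qualified class name; both values are up to the application.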
Cos,
In ignite-1.5 we introduced a new binary marshaller, so now you can load data into a cache without having the POJO classes on the server nodes. Just describe them as JdbcTypes in CacheJdbcPojoStoreFactory. The only thing that should be predefined at node start is the data source(s), but I believe a data source is the kind of thing that is known when you design the app.

Makes sense?
Yup, that makes a lot of sense. I've seen the conversation about the marshaller, but somehow I tuned it out. Let me check more on this. Appreciate the pointer!

Cos

On Thu, Dec 24, 2015 at 07:28AM, Alexey Kuznetsov wrote:
> Cos,
>
> In ignite-1.5 we introduced new binary marshaller.
> And now you could load data into cache without having POJO classes on server nodes.
> Just describe them as JdbcTypes in CacheJdbcPojoStoreFactory.
> The only thing that should be predefined on node start - is a data source(s), but I believe that data source is a kind of thing that is known when you design app.
>
> Make sense?