Igniters,

I am working on IGNITE-7644 (export all key-value data from a persisted partition). It will be a command-line tool for extracting data from an Ignite partition file without the need to start a node. The main motivation is to have a lifebuoy in case a file is damaged for some reason.

I suggest a simple API and two commands for the first implementation:

-c
--CRC [srcPath] - check the CRC of all pages (or of pages of a given type) in a partition

-e
--extract [srcPath] [outPath] - dump all surviving data from a partition to another file in raw key/value pair format (requires a graceful node stop; this will no longer be necessary once --restore is implemented)

See the attachment for the output file format. The format does not contain any index, but it is very simple and flexible for future work with raw key/value data.

Future features:

-u
--upload - reload raw key/value pairs into a node

-s
--status - check the current state of the node files: whether binary recovery is needed or not (the node crashed in the middle of a checkpoint)

-r
--restore - restore binary consistency (finish the checkpoint; requires the WAL file for recovery)

Let's start a discussion; any comments are welcome.
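To make the proposed --CRC check concrete, here is a minimal sketch in Python. It is not the IGNITE-7644 implementation and does not use Ignite's real page layout: the page size, the position of the checksum (last 4 bytes of each page) and the helper names `make_page` and `check_crc` are all assumptions made for illustration.

```python
import zlib
from typing import List

PAGE_SIZE = 4096  # assumed page size, not read from a real partition file
CRC_SIZE = 4      # hypothetical layout: CRC32 stored in the last 4 bytes of a page


def make_page(payload: bytes) -> bytes:
    """Build one page in the hypothetical layout: zero-padded payload + CRC32."""
    body_size = PAGE_SIZE - CRC_SIZE
    if len(payload) > body_size:
        raise ValueError("payload does not fit into one page")
    body = payload.ljust(body_size, b"\x00")
    return body + zlib.crc32(body).to_bytes(CRC_SIZE, "big")


def check_crc(data: bytes) -> List[int]:
    """Return the indexes of pages whose stored CRC does not match the body."""
    bad = []
    full_pages = len(data) // PAGE_SIZE
    for index in range(full_pages):
        page = data[index * PAGE_SIZE:(index + 1) * PAGE_SIZE]
        body, stored = page[:-CRC_SIZE], page[-CRC_SIZE:]
        if zlib.crc32(body) != int.from_bytes(stored, "big"):
            bad.append(index)
    # A trailing partial page cannot be verified, so report it as damaged.
    if len(data) % PAGE_SIZE:
        bad.append(full_pages)
    return bad
```

A tool built this way only reports which pages look damaged; deciding what can still be extracted from the remaining pages is a separate step.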
Hello, Dmitriy.
Should we support extraction of encrypted data? There will be two types of keys that we should load to extract data successfully:

* master key: keystore + password required.
* cache keys: master key + access to the metastore required.

The TDE task is almost done; please take a look.

ticket - https://issues.apache.org/jira/browse/IGNITE-8485
prototype - https://github.com/apache/ignite/pull/4167
spi - https://github.com/apache/ignite/pull/4167/files#diff-9a792ab0e6971f202d22d530af0ac933

On Sat, 30/06/2018 at 22:37 +0300, Dmitriy Govorukhin wrote:
> Igniters,
>
> I am working on IGNITE-7644 (export all key-value data from a persisted partition),
> it will be command line tool for extracting data from Ignite partition file without the need to start node.
> [...]
Nikolay,
I think we won't support extraction from an encrypted store in the first implementation. I guess we can support encrypted stores in the future. Or do you have a reason why we should do it in the first version?

On Sun, Jul 1, 2018 at 11:48 AM Nikolay Izhikov <[hidden email]> wrote:
> Hello, Dmitriy.
>
> Should we support extraction of encrypted data?
> [...]
Dmitriy,
A few questions regarding the use cases for the utility:

1) Would I be able to read the extracted data from the dumped file without the Ignite node's binary/marshaller metadata? In other words, will I be able to move only the dumped file to another grid, or will I need to move the metadata as well?
2) Are you planning to add a public API version of this utility as a part of Ignite? For example, if I am planning to run some statistics on checkpointed data, will I be able to get some sort of iterator to process this data?
3) How will a user choose which caches (cache groups) to process? Will the user need to provide a cache name or cache ID (or either of them)? Will the utility be able to extract the data of a single cache from a cache group?
4) I think the upload part of the utility is missing some input parameters, for example, which cluster to connect to, which caches to upload to, etc.

On Sat, Jun 30, 2018 at 22:38, Dmitriy Govorukhin <[hidden email]> wrote:
> Igniters,
>
> I am working on IGNITE-7644
> <https://issues.apache.org/jira/browse/IGNITE-7644> (export all key-value
> data from a persisted partition),
> [...]
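Question 2 asks for an iterator over the dumped data. The sketch below shows what that could look like over an index-free file of raw key/value pairs. The record layout used here (4-byte big-endian length, then the bytes, for the key and then for the value) is purely hypothetical; the real layout is in the attachment to the first message and is not reproduced in this thread. `write_pair` and `iter_pairs` are illustrative names, not part of any Ignite API.

```python
import io
import struct
from typing import BinaryIO, Iterator, Tuple

_LEN = struct.Struct(">I")  # hypothetical length prefix: 4 bytes, big-endian


def write_pair(out: BinaryIO, key: bytes, value: bytes) -> None:
    """Append one record: key length, key bytes, value length, value bytes."""
    out.write(_LEN.pack(len(key)))
    out.write(key)
    out.write(_LEN.pack(len(value)))
    out.write(value)


def _read_exact(src: BinaryIO, size: int) -> bytes:
    """Read exactly `size` bytes or fail if the file ends early."""
    data = src.read(size)
    if len(data) != size:
        raise EOFError("truncated record")
    return data


def iter_pairs(src: BinaryIO) -> Iterator[Tuple[bytes, bytes]]:
    """Yield (key, value) byte pairs sequentially; no index is required."""
    while True:
        header = src.read(_LEN.size)
        if not header:
            return  # clean end of file
        if len(header) != _LEN.size:
            raise EOFError("truncated record")
        (key_len,) = _LEN.unpack(header)
        key = _read_exact(src, key_len)
        (value_len,) = _LEN.unpack(_read_exact(src, _LEN.size))
        value = _read_exact(src, value_len)
        yield key, value
```

The iterator yields raw bytes only. Turning them into binary objects or Java class instances would still require the binary/marshaller metadata, which is exactly the concern raised in question 1.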
Alexey,
1. The utility will extract raw payload bytes. If you want to build binary objects or Java class instances, you will need the binary/marshaller metadata. If the two grids have different metadata, you should move the metadata along with the dumped data in order to construct binary objects on the other grid. Do you have any ideas on how we can improve this approach?
2. I don't think I understood your idea. Please explain in more detail how you want to use the utility for checkpoint statistics.
3. In the first implementation I prefer a simple *file path* approach: you can pass as a parameter the path to a partition file, to a cache/group directory, or to the root directory of the caches/groups.
4. I have not had time to work out how we will upload data to another grid. Any ideas are welcome.

On Mon, Jul 2, 2018 at 5:34 PM Alexey Goncharuk <[hidden email]> wrote:
> Dmitriy,
>
> A few questions regarding the user cases for the utility:
> [...]
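The *file path* approach from point 3 could be sketched roughly as follows: a single path argument is resolved to a list of partition files, whether it points at one file, a cache/group directory, or the root directory. The `part-*.bin` file name pattern and the function name `resolve_partition_files` are assumptions for illustration only, not something stated in this thread.

```python
from pathlib import Path
from typing import List

# Assumed naming pattern for partition files; adjust to the actual layout.
PARTITION_GLOB = "part-*.bin"


def resolve_partition_files(src: Path) -> List[Path]:
    """Resolve a user-supplied path to the partition files it covers.

    - a file: that file only
    - a directory (cache/group or root): every matching file beneath it
    """
    if src.is_file():
        return [src]
    if src.is_dir():
        # rglob searches recursively, so a cache directory and the root
        # directory of all caches are handled by the same branch.
        return sorted(p for p in src.rglob(PARTITION_GLOB) if p.is_file())
    raise FileNotFoundError(f"no such file or directory: {src}")
```

With this shape the user never has to supply a cache name or ID; selecting a single cache inside a shared cache group would need something beyond a path and is left open here.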