Hi, dev,
I need opinions on the question discussed in https://issues.apache.org/jira/browse/IGNITE-1679 (IGFS: Purge event is inconsistent).

In short: IGFS has a "soft" delete that moves the deleted file or folder to a special "TRASH" folder. A special async worker walks through TRASH and removes the items permanently. When an item is completely removed, an event of type org.apache.ignite.events.EventType#EVT_IGFS_FILE_PURGED is fired. However, such events are currently fired only for files, and only when the file was deleted directly rather than as part of a folder sub-tree.

This behavior is clearly inconsistent, so we should either get rid of PURGE events altogether or make them consistent. In the latter case it would be good to answer the question: what are the real use cases where we need the purge events? (They currently seem to be used in tests only.) If there are no such use cases, are there any objections to removing the purge events altogether?

Thanks in advance.
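The behavior described above can be illustrated with a toy model. This is a minimal sketch and not IGFS code: `ToyNode`, `ToyTrash`, `softDelete` and `purgeAll` are hypothetical names, the trash is a plain in-memory list, and the purge runs synchronously instead of in an async worker. It only reproduces the reported symptom: a purge event is recorded for a file that was deleted directly, but not for files inside a deleted folder.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy node: a file or a folder. Hypothetical, not an IGFS class. */
class ToyNode {
    final String name;
    final boolean isFile;
    final List<ToyNode> children = new ArrayList<>();

    ToyNode(String name, boolean isFile) {
        this.name = name;
        this.isFile = isFile;
    }
}

/** Toy trash that mimics the symptom reported in IGNITE-1679. */
class ToyTrash {
    private final List<ToyNode> trash = new ArrayList<>();

    /** Names of the files for which a purge "event" was recorded. */
    final List<String> purgeEvents = new ArrayList<>();

    /** "Soft" delete: the item is only moved into the trash. */
    void softDelete(ToyNode node) {
        trash.add(node);
    }

    /** Stand-in for the async worker: permanently removes everything in the trash. */
    void purgeAll() {
        for (ToyNode top : trash) {
            // An event is recorded only when the trashed item itself is a file.
            if (top.isFile)
                purgeEvents.add(top.name);

            // The sub-tree is dropped without recording any events.
            top.children.clear();
        }

        trash.clear();
    }

    public static void main(String[] args) {
        ToyTrash t = new ToyTrash();

        t.softDelete(new ToyNode("a.txt", true));

        ToyNode dir = new ToyNode("dir", false);
        dir.children.add(new ToyNode("b.txt", true));
        t.softDelete(dir);

        t.purgeAll();

        System.out.println(t.purgeEvents); // prints [a.txt]
    }
}
```

In this model `b.txt` is purged together with `dir`, yet no event mentions it, which is the inconsistency the ticket is about.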
Ivan,
The importance of the PURGE event has to do with notification about freeing the memory otherwise occupied by a deleted file.

How hard do you think it would be to make the PURGE behavior consistent between directory and file deletions?

D

On Fri, Nov 20, 2015 at 8:15 AM, Ivan V. <[hidden email]> wrote:
> [...]
Hi, Dmitriy,
To wait for the memory to be freed we have the method org.apache.ignite.internal.processors.igfs.IgfsEx#awaitDeletesAsync(), which returns a Future that can be awaited (with or without a timeout). Also, during the recent fix https://issues.apache.org/jira/browse/IGNITE-1510 we introduced a new method, IgfsEx#clear(IgfsPath), that deletes the specified path and waits for the garbage data cleanup.

These methods have a more or less convenient usage pattern. PURGE events, by contrast, are much more difficult to use in practice. For example, how do you know how many events to expect, and how do you track which events have arrived and which have not?

On Fri, Nov 20, 2015 at 9:10 PM, Dmitriy Setrakyan <[hidden email]> wrote:
> [...]
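The usage-pattern argument made in the message above can be sketched with a toy worker. This is a hedged illustration only: `ToyPurgeWorker`, `itemTrashed` and `itemPurged` are hypothetical, and the `awaitDeletesAsync` method here merely borrows the name mentioned in the thread; it does not reproduce the real IgfsEx signature or implementation. The point is that a caller waiting on a future does not need to know how many items are in the trash, whereas an event listener would have to count.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

/** Toy purge worker. Hypothetical stand-in, not the IgfsEx implementation. */
class ToyPurgeWorker {
    /** Number of items still waiting in the trash. */
    private int pending;

    /** Completes when the trash becomes empty. */
    private CompletableFuture<Void> drained = CompletableFuture.completedFuture(null);

    /** Registers one more item waiting in the trash. */
    synchronized void itemTrashed() {
        if (pending++ == 0)
            drained = new CompletableFuture<>();
    }

    /** Called after one item has been permanently removed. */
    synchronized void itemPurged() {
        if (pending == 0)
            throw new IllegalStateException("trash is already empty");

        if (--pending == 0)
            drained.complete(null);
    }

    /** Returns a future that completes once everything trashed so far is purged. */
    synchronized CompletableFuture<Void> awaitDeletesAsync() {
        return drained;
    }

    public static void main(String[] args) throws Exception {
        ToyPurgeWorker worker = new ToyPurgeWorker();

        worker.itemTrashed();
        worker.itemTrashed();

        CompletableFuture<Void> fut = worker.awaitDeletesAsync();

        // Background purge, standing in for the async trash worker.
        new Thread(() -> {
            worker.itemPurged();
            worker.itemPurged();
        }).start();

        // The caller waits with a timeout and never counts individual items.
        fut.get(5, TimeUnit.SECONDS);

        System.out.println("trash drained: " + fut.isDone());
    }
}
```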
On Fri, Nov 20, 2015 at 11:08 AM, Ivan V. <[hidden email]> wrote:
> [...]

Ivan, I see your point. There are two ways to resolve it: we either deprecate the event or support it properly. How difficult, in your opinion, would it be to support this event properly?
Hi, Dmitriy,
It is not difficult to support the events properly. We just need to store the last path of each file, or make sure the path is guaranteed to be computable when the file is already in TRASH. This is needed because each PURGE event must contain the path of the file that was purged.

On Fri, Nov 20, 2015 at 10:45 PM, Dmitriy Setrakyan <[hidden email]> wrote:
> [...]
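The idea of storing the last path can be sketched as follows. This is a toy model under stated assumptions, not the actual fix for IGNITE-1679: `TrashedEntry` and `ConsistentPurger` are hypothetical names, and a purge "event" is represented simply by the path string added to a list. Each trashed entry keeps the path it had before deletion, so a depth-first purge can report one event per file, including files inside a deleted folder.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy trashed entry that remembers where it lived before deletion. Hypothetical. */
class TrashedEntry {
    final String lastPath;
    final boolean isFile;
    final List<TrashedEntry> children = new ArrayList<>();

    TrashedEntry(String lastPath, boolean isFile) {
        this.lastPath = lastPath;
        this.isFile = isFile;
    }
}

/** Toy purger that reports every purged file, not only top-level ones. */
class ConsistentPurger {
    /** Purges the entry depth-first and returns the last paths of all purged files. */
    static List<String> purge(TrashedEntry root) {
        List<String> events = new ArrayList<>();

        purge0(root, events);

        return events;
    }

    private static void purge0(TrashedEntry e, List<String> events) {
        for (TrashedEntry child : e.children)
            purge0(child, events);

        e.children.clear();

        // One event per file, carrying the path stored at deletion time.
        if (e.isFile)
            events.add(e.lastPath);
    }

    public static void main(String[] args) {
        TrashedEntry dir = new TrashedEntry("/dir", false);
        TrashedEntry sub = new TrashedEntry("/dir/sub", false);

        dir.children.add(new TrashedEntry("/dir/a", true));
        dir.children.add(sub);
        sub.children.add(new TrashedEntry("/dir/sub/b", true));

        System.out.println(purge(dir)); // prints [/dir/a, /dir/sub/b]
    }
}
```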
In reply to this post by Ivan V.
Let me ask a different question: what's the point of having the concept of TRASH?

Here's an example of why I think the 'soft' delete would only complicate things. Suppose IGFS is sitting on top of HDFS and both have 'Trash' enabled. Now a file gets soft-deleted from IGFS and is moved to the TRASH folder. But in HDFS this is also just a move to a place that doesn't have any special meaning for HDFS.

It is even worse if IGFS TRASH is linked to HDFS .Trash. HDFS has its own policy on how to clean that up, which is likely to be different from the one in IGFS. Often enough, HDFS .Trash is simply disabled. This discrepancy is going to create a situation where a file should still be in TRASH, but the secondary FS has already purged it.

And what if yet another secondary file system, like S3, has yet another policy around its own trash (which, I believe, it doesn't even have)?

Where I am going with this is pretty straightforward: let's drop the soft-delete support and let the secondary FS deal with it. If there's no secondary FS configured, the content of a deleted file will have to be retrieved by other means.

Thoughts?
Cos

On Fri, Nov 20, 2015 at 07:15PM, Ivan V. wrote:
> [...]
Cos,
The main reason soft delete was added is performance. Without soft delete, the delete operation would have to wait until a file is fully deleted from a folder, which may take time.

As for letting the secondary FS handle it: IGFS does not require a secondary FS, so we should account for cases when IGFS is running stand-alone.

Thoughts?

D.

On Mon, Nov 23, 2015 at 11:00 PM, Konstantin Boudnik <[hidden email]> wrote:
> [...]
Hi, Konstantin,
The "TRASH" notion (the name comes from the org.apache.ignite.internal.processors.igfs.IgfsFileInfo#TRASH_ID Java constant) is only applicable to the primary (IGFS) file system. It is a special "folder" that does not have a file system path. When IGFS is running over a secondary FS, TRASH also exists in the primary IGFS, but does not exist in the secondary FS. In the secondary FS, deletion is performed just through the ordinary FS API. So we *do not* make any assumptions about the existence or behavior of TRASH in the secondary FS.

As Dmitriy mentioned above, TRASH in the primary FS is needed for performance reasons: with it we delete a file with only one transaction in the Meta cache, and we do not do any transactions in the Data cache. (A similar technique is frequently applied in real FS deletion, like mv foo /tmp/ && rm -r /tmp/foo/ .)

Currently we have a fix for https://issues.apache.org/jira/browse/IGNITE-1679 that enables PURGE events for all files. I still do not quite see how this functionality will be used by customers, but it is now repaired: once merged, you will be able to use it.

On Tue, Nov 24, 2015 at 2:52 AM, Dmitriy Setrakyan <[hidden email]> wrote:
> [...]
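The performance argument above can be sketched with a toy two-cache model. This is an illustration under assumptions, not IGFS internals: `ToyMetaDataFs` is hypothetical, the "Meta cache" and "Data cache" are plain maps, and transactions are not modeled at all. It shows that the user-visible delete touches only the metadata map, however many data blocks the file has, while the blocks are freed later by a separate purge step.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy file system with separate metadata and data maps. Hypothetical. */
class ToyMetaDataFs {
    /** "Meta cache": path -> ids of the data blocks that make up the file. */
    final Map<String, List<Integer>> meta = new HashMap<>();

    /** Trash: block-id lists of soft-deleted files, no longer reachable by path. */
    final List<List<Integer>> trash = new ArrayList<>();

    /** "Data cache": block id -> block content. */
    final Map<Integer, byte[]> data = new HashMap<>();

    private int nextBlockId;

    /** Creates a file consisting of the given number of blocks. */
    void write(String path, int blockCnt) {
        List<Integer> ids = new ArrayList<>();

        for (int i = 0; i < blockCnt; i++) {
            data.put(nextBlockId, new byte[8]);
            ids.add(nextBlockId++);
        }

        meta.put(path, ids);
    }

    /** Soft delete: a single metadata update, the data map is not touched. */
    boolean delete(String path) {
        List<Integer> ids = meta.remove(path);

        if (ids == null)
            return false;

        trash.add(ids);

        return true;
    }

    /** Background step: frees the blocks of trashed files, returns the number freed. */
    int purge() {
        int freed = 0;

        for (List<Integer> ids : trash)
            for (Integer id : ids)
                if (data.remove(id) != null)
                    freed++;

        trash.clear();

        return freed;
    }

    public static void main(String[] args) {
        ToyMetaDataFs fs = new ToyMetaDataFs();

        fs.write("/big", 1000);
        fs.delete("/big");

        System.out.println("blocks right after delete: " + fs.data.size());
        System.out.println("blocks freed by purge: " + fs.purge());
    }
}
```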
On Wed, Nov 25, 2015 at 04:54PM, Ivan V. wrote:
> [...]

I understand. However, I am wary about the potentially funny and inconsistent cases when an _optional_ secondary file system is plugged in.

Cos
Hi, Konstantin,
We are currently working on IGFS failover and reliability, so IGFS should behave correctly in any expected use case. It's not quite clear to me what "when an _optional_ secondary file system is plugged in" means. Can you please explain this use case in more detail?

On Thu, Nov 26, 2015 at 5:36 AM, Konstantin Boudnik <[hidden email]> wrote:
> [...]
Cos,
I agree with Ivan that there should be no architectural problems with our "trash" concept. Essentially, when a delete is performed in DUAL mode, we do two things:
1) Propagate the removal to the secondary file system;
2) Logically move the affected path to "trash" in IGFS.

Once an item is in trash, no one else will be able to operate on it or its children. Moreover, a "trashed" item is decoupled from the secondary file system (in contrast with normal entities). We had some consistency problems when parts of a "trashed" entry could be resurrected, but they are now fixed.

Vladimir.

On Thu, Nov 26, 2015 at 1:11 PM, Ivan V. <[hidden email]> wrote:
> [...]
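The two DUAL-mode steps listed above can be sketched in a toy model. Everything here is an assumption made for illustration: `ToySecondaryFs` and `ToyDualFs` are hypothetical and are not Ignite interfaces, paths are plain strings, and error handling and atomicity between the two steps are ignored. The sketch shows that the removal is propagated to the secondary file system once, and that a trashed path is no longer reachable, so later operations on it never reach the secondary file system.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Minimal secondary file system contract. Hypothetical, not an Ignite interface. */
interface ToySecondaryFs {
    boolean delete(String path);
}

/** Toy DUAL-mode file system: live paths plus a local trash. Hypothetical. */
class ToyDualFs {
    private final ToySecondaryFs secondary;
    private final Set<String> live = new HashSet<>();
    private final List<String> trash = new ArrayList<>();

    ToyDualFs(ToySecondaryFs secondary) {
        this.secondary = secondary;
    }

    void create(String path) {
        live.add(path);
    }

    boolean exists(String path) {
        return live.contains(path);
    }

    int trashSize() {
        return trash.size();
    }

    /** Delete in the order described in the thread: secondary first, then trash. */
    boolean delete(String path) {
        if (!live.contains(path))
            return false; // Trashed or unknown paths cannot be operated on.

        secondary.delete(path); // 1) Propagate the removal.

        live.remove(path);      // 2) Logically move the path to the trash.
        trash.add(path);

        return true;
    }

    public static void main(String[] args) {
        List<String> secondaryCalls = new ArrayList<>();

        ToyDualFs fs = new ToyDualFs(p -> secondaryCalls.add(p));

        fs.create("/a");
        fs.delete("/a");
        fs.delete("/a"); // No effect: the path is already in the trash.

        System.out.println(secondaryCalls); // prints [/a]
    }
}
```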