Igniters,
We revealed concurrency problem in IGFS and I would like to discuss possible solutions to it. Consider the following file system structure: root |-- A | |-- B | | |-- C | |-- D ... two concurrent operations in different threads: T1: move(/A/B, /A/D); T2: move(/A/D, /A/B/C); ... and how IGFS handles it now: T1: verify that "/A/B" and "/A/D" exist, they are not child-parent to each other, etc. -> OK. T2: do the same for "A/D" and "A/B/C" -> OK. T1: get IDs of "/A", "/A/B" and "/A/D" to lock them later inside tx. T2: get IDs of "/A", "/A/D", "/A/B" and "/A/B/C" to lock them later inside tx. T1: Start pessimistic tx, lock IDs of "/A", "/A/B", "/A/D", perform move -> OK. root |-- A | |-- D | | |-- B | | | |-- C T2: Start pessimistic tx, lock IDs of "/A", "/A/D", "/A/B" and "/A/B/C" (*directory structure already changed at this time!*), perform move -> OK. root |-- A B |-- D | |-- C | | |-- B (loop!) File system is corrupted. Folders B, C and D are not reacheable from root. To fix this now we additionaly check if directory structure is still valid *inside transaction*. It works, no more corruptions. But it requres taking locks on the whole paths *including root*. So move, delete and mkdirs opeartions *can no longer be concurrent*. Probably there is a way to relax this while still ensuring consistency, but I do not see how. One idea is to store real path inside each entry. This way we will be able to ensure that it is still at a valid location without blocking parents, so concurrnecy will be restored. But we will have to propagate strucutral changes to children. E.g. move of a folder with 100 items will lead to update of >100 cache entries. Not so good. Any other ideas? Vladimir. |
May be just check that they are not parent-child within the tx?
Sergi Igniters, We revealed concurrency problem in IGFS and I would like to discuss possible solutions to it. Consider the following file system structure: root |-- A | |-- B | | |-- C | |-- D ... two concurrent operations in different threads: T1: move(/A/B, /A/D); T2: move(/A/D, /A/B/C); ... and how IGFS handles it now: T1: verify that "/A/B" and "/A/D" exist, they are not child-parent to each other, etc. -> OK. T2: do the same for "A/D" and "A/B/C" -> OK. T1: get IDs of "/A", "/A/B" and "/A/D" to lock them later inside tx. T2: get IDs of "/A", "/A/D", "/A/B" and "/A/B/C" to lock them later inside tx. T1: Start pessimistic tx, lock IDs of "/A", "/A/B", "/A/D", perform move -> OK. root |-- A | |-- D | | |-- B | | | |-- C T2: Start pessimistic tx, lock IDs of "/A", "/A/D", "/A/B" and "/A/B/C" (*directory structure already changed at this time!*), perform move -> OK. root |-- A B |-- D | |-- C | | |-- B (loop!) File system is corrupted. Folders B, C and D are not reacheable from root. To fix this now we additionaly check if directory structure is still valid *inside transaction*. It works, no more corruptions. But it requres taking locks on the whole paths *including root*. So move, delete and mkdirs opeartions *can no longer be concurrent*. Probably there is a way to relax this while still ensuring consistency, but I do not see how. One idea is to store real path inside each entry. This way we will be able to ensure that it is still at a valid location without blocking parents, so concurrnecy will be restored. But we will have to propagate strucutral changes to children. E.g. move of a folder with 100 items will lead to update of >100 cache entries. Not so good. Any other ideas? Vladimir. |
This is how we fixed it for now. But it requires paths locking starting
from the root. Otherwise they can become parent-child in a moment after sucessfull check. On Thu, Sep 24, 2015 at 11:28 AM, Sergi Vladykin <[hidden email]> wrote: > May be just check that they are not parent-child within the tx? > > Sergi > Igniters, > > We revealed concurrency problem in IGFS and I would like to discuss > possible solutions to it. > > Consider the following file system structure: > root > |-- A > | |-- B > | | |-- C > | |-- D > > ... two concurrent operations in different threads: > T1: move(/A/B, /A/D); > T2: move(/A/D, /A/B/C); > > ... and how IGFS handles it now: > T1: verify that "/A/B" and "/A/D" exist, they are not child-parent to each > other, etc. -> OK. > T2: do the same for "A/D" and "A/B/C" -> OK. > T1: get IDs of "/A", "/A/B" and "/A/D" to lock them later inside tx. > T2: get IDs of "/A", "/A/D", "/A/B" and "/A/B/C" to lock them later inside > tx. > > T1: Start pessimistic tx, lock IDs of "/A", "/A/B", "/A/D", perform move -> > OK. > root > |-- A > | |-- D > | | |-- B > | | | |-- C > > T2: Start pessimistic tx, lock IDs of "/A", "/A/D", "/A/B" and > "/A/B/C" (*directory > structure already changed at this time!*), perform move -> OK. > root > |-- A > B > |-- D > | |-- C > | | |-- B (loop!) > > File system is corrupted. Folders B, C and D are not reacheable from root. > > To fix this now we additionaly check if directory structure is still > valid *inside > transaction*. It works, no more corruptions. But it requres taking locks on > the whole paths *including root*. So move, delete and mkdirs opeartions > *can > no longer be concurrent*. > > Probably there is a way to relax this while still ensuring consistency, but > I do not see how. One idea is to store real path inside each entry. This > way we will be able to ensure that it is still at a valid location without > blocking parents, so concurrnecy will be restored. But we will have to > propagate strucutral changes to children. E.g. move of a folder with 100 > items will lead to update of >100 cache entries. Not so good. > > Any other ideas? > > Vladimir. > |
I am not sure if we should be concerned about looking at the root or path here. This lock is only held for the duration of directory structure change. I don't think Hadoop/Spark jobs will be affected if we add an extra 10ms somewhere during the execution.
|
In reply to this post by Sergi
IIRC NN should be locking on these ops anyway, shouldn't it? The situation is
no different if multiple clients are doing these operations near-simultaneously. Unless I missed something here... On Thu, Sep 24, 2015 at 11:28AM, Sergi Vladykin wrote: > May be just check that they are not parent-child within the tx? > > Sergi > Igniters, > > We revealed concurrency problem in IGFS and I would like to discuss > possible solutions to it. > > Consider the following file system structure: > root > |-- A > | |-- B > | | |-- C > | |-- D > > ... two concurrent operations in different threads: > T1: move(/A/B, /A/D); > T2: move(/A/D, /A/B/C); > > ... and how IGFS handles it now: > T1: verify that "/A/B" and "/A/D" exist, they are not child-parent to each > other, etc. -> OK. > T2: do the same for "A/D" and "A/B/C" -> OK. > T1: get IDs of "/A", "/A/B" and "/A/D" to lock them later inside tx. > T2: get IDs of "/A", "/A/D", "/A/B" and "/A/B/C" to lock them later inside > tx. > > T1: Start pessimistic tx, lock IDs of "/A", "/A/B", "/A/D", perform move -> > OK. > root > |-- A > | |-- D > | | |-- B > | | | |-- C > > T2: Start pessimistic tx, lock IDs of "/A", "/A/D", "/A/B" and > "/A/B/C" (*directory > structure already changed at this time!*), perform move -> OK. > root > |-- A > B > |-- D > | |-- C > | | |-- B (loop!) > > File system is corrupted. Folders B, C and D are not reacheable from root. > > To fix this now we additionaly check if directory structure is still > valid *inside > transaction*. It works, no more corruptions. But it requres taking locks on > the whole paths *including root*. So move, delete and mkdirs opeartions *can > no longer be concurrent*. > > Probably there is a way to relax this while still ensuring consistency, but > I do not see how. One idea is to store real path inside each entry. This > way we will be able to ensure that it is still at a valid location without > blocking parents, so concurrnecy will be restored. But we will have to > propagate strucutral changes to children. E.g. move of a folder with 100 > items will lead to update of >100 cache entries. Not so good. > > Any other ideas? > > Vladimir. |
Cos,
Yes, no long-time locking is expected here. On Wed, Oct 7, 2015 at 6:57 AM, Konstantin Boudnik <[hidden email]> wrote: > IIRC NN should be locking on these ops anyway, shouldn't it? The situation > is > no different if multiple clients are doing these operations > near-simultaneously. Unless I missed something here... > > On Thu, Sep 24, 2015 at 11:28AM, Sergi Vladykin wrote: > > May be just check that they are not parent-child within the tx? > > > > Sergi > > Igniters, > > > > We revealed concurrency problem in IGFS and I would like to discuss > > possible solutions to it. > > > > Consider the following file system structure: > > root > > |-- A > > | |-- B > > | | |-- C > > | |-- D > > > > ... two concurrent operations in different threads: > > T1: move(/A/B, /A/D); > > T2: move(/A/D, /A/B/C); > > > > ... and how IGFS handles it now: > > T1: verify that "/A/B" and "/A/D" exist, they are not child-parent to > each > > other, etc. -> OK. > > T2: do the same for "A/D" and "A/B/C" -> OK. > > T1: get IDs of "/A", "/A/B" and "/A/D" to lock them later inside tx. > > T2: get IDs of "/A", "/A/D", "/A/B" and "/A/B/C" to lock them later > inside > > tx. > > > > T1: Start pessimistic tx, lock IDs of "/A", "/A/B", "/A/D", perform move > -> > > OK. > > root > > |-- A > > | |-- D > > | | |-- B > > | | | |-- C > > > > T2: Start pessimistic tx, lock IDs of "/A", "/A/D", "/A/B" and > > "/A/B/C" (*directory > > structure already changed at this time!*), perform move -> OK. > > root > > |-- A > > B > > |-- D > > | |-- C > > | | |-- B (loop!) > > > > File system is corrupted. Folders B, C and D are not reacheable from > root. > > > > To fix this now we additionaly check if directory structure is still > > valid *inside > > transaction*. It works, no more corruptions. But it requres taking locks > on > > the whole paths *including root*. So move, delete and mkdirs opeartions > *can > > no longer be concurrent*. > > > > Probably there is a way to relax this while still ensuring consistency, > but > > I do not see how. One idea is to store real path inside each entry. This > > way we will be able to ensure that it is still at a valid location > without > > blocking parents, so concurrnecy will be restored. But we will have to > > propagate strucutral changes to children. E.g. move of a folder with 100 > > items will lead to update of >100 cache entries. Not so good. > > > > Any other ideas? > > > > Vladimir. > |
On Wed, Oct 07, 2015 at 09:11AM, Vladimir Ozerov wrote:
> Cos, > Yes, no long-time locking is expected here. Sorry, I musta be still dense from the jet-lag: could you put in a bit more details? Thanks in advance! Cos > On Wed, Oct 7, 2015 at 6:57 AM, Konstantin Boudnik <[hidden email]> wrote: > > > IIRC NN should be locking on these ops anyway, shouldn't it? The situation > > is > > no different if multiple clients are doing these operations > > near-simultaneously. Unless I missed something here... > > > > On Thu, Sep 24, 2015 at 11:28AM, Sergi Vladykin wrote: > > > May be just check that they are not parent-child within the tx? > > > > > > Sergi > > > Igniters, > > > > > > We revealed concurrency problem in IGFS and I would like to discuss > > > possible solutions to it. > > > > > > Consider the following file system structure: > > > root > > > |-- A > > > | |-- B > > > | | |-- C > > > | |-- D > > > > > > ... two concurrent operations in different threads: > > > T1: move(/A/B, /A/D); > > > T2: move(/A/D, /A/B/C); > > > > > > ... and how IGFS handles it now: > > > T1: verify that "/A/B" and "/A/D" exist, they are not child-parent to > > each > > > other, etc. -> OK. > > > T2: do the same for "A/D" and "A/B/C" -> OK. > > > T1: get IDs of "/A", "/A/B" and "/A/D" to lock them later inside tx. > > > T2: get IDs of "/A", "/A/D", "/A/B" and "/A/B/C" to lock them later > > inside > > > tx. > > > > > > T1: Start pessimistic tx, lock IDs of "/A", "/A/B", "/A/D", perform move > > -> > > > OK. > > > root > > > |-- A > > > | |-- D > > > | | |-- B > > > | | | |-- C > > > > > > T2: Start pessimistic tx, lock IDs of "/A", "/A/D", "/A/B" and > > > "/A/B/C" (*directory > > > structure already changed at this time!*), perform move -> OK. > > > root > > > |-- A > > > B > > > |-- D > > > | |-- C > > > | | |-- B (loop!) > > > > > > File system is corrupted. Folders B, C and D are not reacheable from > > root. > > > > > > To fix this now we additionaly check if directory structure is still > > > valid *inside > > > transaction*. It works, no more corruptions. But it requres taking locks > > on > > > the whole paths *including root*. So move, delete and mkdirs opeartions > > *can > > > no longer be concurrent*. > > > > > > Probably there is a way to relax this while still ensuring consistency, > > but > > > I do not see how. One idea is to store real path inside each entry. This > > > way we will be able to ensure that it is still at a valid location > > without > > > blocking parents, so concurrnecy will be restored. But we will have to > > > propagate strucutral changes to children. E.g. move of a folder with 100 > > > items will lead to update of >100 cache entries. Not so good. > > > > > > Any other ideas? > > > > > > Vladimir. > > |
Cos,
Initially IGFS was designed to support concurrent structural updates. E.g., creation of directories /A/C/D and /E/F/G can be performed concurrently. Now we revealed that it might cause concurrency issues in case of conflicting updates. To fix this problem we now perform every update holding a kind of global file system lock. This doesn't affect file read/write operations - they still can be performed concurrently of course. The question was whether this could cause severe performance degradation or not. If we assume that in average Hadoop jobs file operations dominate over structural operations on directories, then it should not cause any significant performance issues. Vladimir. On Wed, Oct 7, 2015 at 10:38 PM, Konstantin Boudnik <[hidden email]> wrote: > On Wed, Oct 07, 2015 at 09:11AM, Vladimir Ozerov wrote: > > Cos, > > Yes, no long-time locking is expected here. > > Sorry, I musta be still dense from the jet-lag: could you put in a bit more > details? Thanks in advance! > > Cos > > > On Wed, Oct 7, 2015 at 6:57 AM, Konstantin Boudnik <[hidden email]> > wrote: > > > > > IIRC NN should be locking on these ops anyway, shouldn't it? The > situation > > > is > > > no different if multiple clients are doing these operations > > > near-simultaneously. Unless I missed something here... > > > > > > On Thu, Sep 24, 2015 at 11:28AM, Sergi Vladykin wrote: > > > > May be just check that they are not parent-child within the tx? > > > > > > > > Sergi > > > > Igniters, > > > > > > > > We revealed concurrency problem in IGFS and I would like to discuss > > > > possible solutions to it. > > > > > > > > Consider the following file system structure: > > > > root > > > > |-- A > > > > | |-- B > > > > | | |-- C > > > > | |-- D > > > > > > > > ... two concurrent operations in different threads: > > > > T1: move(/A/B, /A/D); > > > > T2: move(/A/D, /A/B/C); > > > > > > > > ... and how IGFS handles it now: > > > > T1: verify that "/A/B" and "/A/D" exist, they are not child-parent to > > > each > > > > other, etc. -> OK. > > > > T2: do the same for "A/D" and "A/B/C" -> OK. > > > > T1: get IDs of "/A", "/A/B" and "/A/D" to lock them later inside tx. > > > > T2: get IDs of "/A", "/A/D", "/A/B" and "/A/B/C" to lock them later > > > inside > > > > tx. > > > > > > > > T1: Start pessimistic tx, lock IDs of "/A", "/A/B", "/A/D", perform > move > > > -> > > > > OK. > > > > root > > > > |-- A > > > > | |-- D > > > > | | |-- B > > > > | | | |-- C > > > > > > > > T2: Start pessimistic tx, lock IDs of "/A", "/A/D", "/A/B" and > > > > "/A/B/C" (*directory > > > > structure already changed at this time!*), perform move -> OK. > > > > root > > > > |-- A > > > > B > > > > |-- D > > > > | |-- C > > > > | | |-- B (loop!) > > > > > > > > File system is corrupted. Folders B, C and D are not reacheable from > > > root. > > > > > > > > To fix this now we additionaly check if directory structure is still > > > > valid *inside > > > > transaction*. It works, no more corruptions. But it requres taking > locks > > > on > > > > the whole paths *including root*. So move, delete and mkdirs > opeartions > > > *can > > > > no longer be concurrent*. > > > > > > > > Probably there is a way to relax this while still ensuring > consistency, > > > but > > > > I do not see how. One idea is to store real path inside each entry. > This > > > > way we will be able to ensure that it is still at a valid location > > > without > > > > blocking parents, so concurrnecy will be restored. But we will have > to > > > > propagate strucutral changes to children. E.g. move of a folder with > 100 > > > > items will lead to update of >100 cache entries. Not so good. > > > > > > > > Any other ideas? > > > > > > > > Vladimir. > > > > |
Thanks Vladimir. In the case of HDFS, the global lock for conflicting
simultaneous changes ie mkdir /a/b/c together with mv /a/b /a/d will trigger the NN global lock on the file system so the latter operation won't start until the former is completed. Hence, the point of my confusion is why IGFS is adding an additional lock on top of what the secondary file system does anyway. Sorry, if I haven't expressed myself in a better way from the beginning. Cos On Thu, Oct 08, 2015 at 11:33AM, Vladimir Ozerov wrote: > Cos, > > Initially IGFS was designed to support concurrent structural updates. E.g., > creation of directories /A/C/D and /E/F/G can be performed concurrently. > Now we revealed that it might cause concurrency issues in case of > conflicting updates. > > To fix this problem we now perform every update holding a kind of global > file system lock. This doesn't affect file read/write operations - they > still can be performed concurrently of course. The question was whether > this could cause severe performance degradation or not. If we assume that > in average Hadoop jobs file operations dominate over structural operations > on directories, then it should not cause any significant performance issues. > > Vladimir. > > On Wed, Oct 7, 2015 at 10:38 PM, Konstantin Boudnik <[hidden email]> wrote: > > > On Wed, Oct 07, 2015 at 09:11AM, Vladimir Ozerov wrote: > > > Cos, > > > Yes, no long-time locking is expected here. > > > > Sorry, I musta be still dense from the jet-lag: could you put in a bit more > > details? Thanks in advance! > > > > Cos > > > > > On Wed, Oct 7, 2015 at 6:57 AM, Konstantin Boudnik <[hidden email]> > > wrote: > > > > > > > IIRC NN should be locking on these ops anyway, shouldn't it? The > > situation > > > > is > > > > no different if multiple clients are doing these operations > > > > near-simultaneously. Unless I missed something here... > > > > > > > > On Thu, Sep 24, 2015 at 11:28AM, Sergi Vladykin wrote: > > > > > May be just check that they are not parent-child within the tx? > > > > > > > > > > Sergi > > > > > Igniters, > > > > > > > > > > We revealed concurrency problem in IGFS and I would like to discuss > > > > > possible solutions to it. > > > > > > > > > > Consider the following file system structure: > > > > > root > > > > > |-- A > > > > > | |-- B > > > > > | | |-- C > > > > > | |-- D > > > > > > > > > > ... two concurrent operations in different threads: > > > > > T1: move(/A/B, /A/D); > > > > > T2: move(/A/D, /A/B/C); > > > > > > > > > > ... and how IGFS handles it now: > > > > > T1: verify that "/A/B" and "/A/D" exist, they are not child-parent to > > > > each > > > > > other, etc. -> OK. > > > > > T2: do the same for "A/D" and "A/B/C" -> OK. > > > > > T1: get IDs of "/A", "/A/B" and "/A/D" to lock them later inside tx. > > > > > T2: get IDs of "/A", "/A/D", "/A/B" and "/A/B/C" to lock them later > > > > inside > > > > > tx. > > > > > > > > > > T1: Start pessimistic tx, lock IDs of "/A", "/A/B", "/A/D", perform > > move > > > > -> > > > > > OK. > > > > > root > > > > > |-- A > > > > > | |-- D > > > > > | | |-- B > > > > > | | | |-- C > > > > > > > > > > T2: Start pessimistic tx, lock IDs of "/A", "/A/D", "/A/B" and > > > > > "/A/B/C" (*directory > > > > > structure already changed at this time!*), perform move -> OK. > > > > > root > > > > > |-- A > > > > > B > > > > > |-- D > > > > > | |-- C > > > > > | | |-- B (loop!) > > > > > > > > > > File system is corrupted. Folders B, C and D are not reacheable from > > > > root. > > > > > > > > > > To fix this now we additionaly check if directory structure is still > > > > > valid *inside > > > > > transaction*. It works, no more corruptions. But it requres taking > > locks > > > > on > > > > > the whole paths *including root*. So move, delete and mkdirs > > opeartions > > > > *can > > > > > no longer be concurrent*. > > > > > > > > > > Probably there is a way to relax this while still ensuring > > consistency, > > > > but > > > > > I do not see how. One idea is to store real path inside each entry. > > This > > > > > way we will be able to ensure that it is still at a valid location > > > > without > > > > > blocking parents, so concurrnecy will be restored. But we will have > > to > > > > > propagate strucutral changes to children. E.g. move of a folder with > > 100 > > > > > items will lead to update of >100 cache entries. Not so good. > > > > > > > > > > Any other ideas? > > > > > > > > > > Vladimir. > > > > > > |
Cos,
We are trying to keep IGFS decoupled from HDFS. Instead, we define IGFS as an in-memory file system which could optionally be a cache some secondary file system. And HDFS is one of these file systems. For this reason we cannot rely on any specific behavior of particular secondary file system. On Tue, Oct 13, 2015 at 8:14 AM, Konstantin Boudnik <[hidden email]> wrote: > Thanks Vladimir. In the case of HDFS, the global lock for conflicting > simultaneous changes ie > > mkdir /a/b/c > together with > mv /a/b /a/d > > will trigger the NN global lock on the file system so the latter operation > won't start until the former is completed. Hence, the point of my > confusion is > why IGFS is adding an additional lock on top of what the secondary file > system > does anyway. Sorry, if I haven't expressed myself in a better way from the > beginning. > > Cos > > On Thu, Oct 08, 2015 at 11:33AM, Vladimir Ozerov wrote: > > Cos, > > > > Initially IGFS was designed to support concurrent structural updates. > E.g., > > creation of directories /A/C/D and /E/F/G can be performed concurrently. > > Now we revealed that it might cause concurrency issues in case of > > conflicting updates. > > > > To fix this problem we now perform every update holding a kind of global > > file system lock. This doesn't affect file read/write operations - they > > still can be performed concurrently of course. The question was whether > > this could cause severe performance degradation or not. If we assume that > > in average Hadoop jobs file operations dominate over structural > operations > > on directories, then it should not cause any significant performance > issues. > > > > Vladimir. > > > > On Wed, Oct 7, 2015 at 10:38 PM, Konstantin Boudnik <[hidden email]> > wrote: > > > > > On Wed, Oct 07, 2015 at 09:11AM, Vladimir Ozerov wrote: > > > > Cos, > > > > Yes, no long-time locking is expected here. > > > > > > Sorry, I musta be still dense from the jet-lag: could you put in a bit > more > > > details? Thanks in advance! > > > > > > Cos > > > > > > > On Wed, Oct 7, 2015 at 6:57 AM, Konstantin Boudnik <[hidden email]> > > > wrote: > > > > > > > > > IIRC NN should be locking on these ops anyway, shouldn't it? The > > > situation > > > > > is > > > > > no different if multiple clients are doing these operations > > > > > near-simultaneously. Unless I missed something here... > > > > > > > > > > On Thu, Sep 24, 2015 at 11:28AM, Sergi Vladykin wrote: > > > > > > May be just check that they are not parent-child within the tx? > > > > > > > > > > > > Sergi > > > > > > Igniters, > > > > > > > > > > > > We revealed concurrency problem in IGFS and I would like to > discuss > > > > > > possible solutions to it. > > > > > > > > > > > > Consider the following file system structure: > > > > > > root > > > > > > |-- A > > > > > > | |-- B > > > > > > | | |-- C > > > > > > | |-- D > > > > > > > > > > > > ... two concurrent operations in different threads: > > > > > > T1: move(/A/B, /A/D); > > > > > > T2: move(/A/D, /A/B/C); > > > > > > > > > > > > ... and how IGFS handles it now: > > > > > > T1: verify that "/A/B" and "/A/D" exist, they are not > child-parent to > > > > > each > > > > > > other, etc. -> OK. > > > > > > T2: do the same for "A/D" and "A/B/C" -> OK. > > > > > > T1: get IDs of "/A", "/A/B" and "/A/D" to lock them later inside > tx. > > > > > > T2: get IDs of "/A", "/A/D", "/A/B" and "/A/B/C" to lock them > later > > > > > inside > > > > > > tx. > > > > > > > > > > > > T1: Start pessimistic tx, lock IDs of "/A", "/A/B", "/A/D", > perform > > > move > > > > > -> > > > > > > OK. > > > > > > root > > > > > > |-- A > > > > > > | |-- D > > > > > > | | |-- B > > > > > > | | | |-- C > > > > > > > > > > > > T2: Start pessimistic tx, lock IDs of "/A", "/A/D", "/A/B" and > > > > > > "/A/B/C" (*directory > > > > > > structure already changed at this time!*), perform move -> OK. > > > > > > root > > > > > > |-- A > > > > > > B > > > > > > |-- D > > > > > > | |-- C > > > > > > | | |-- B (loop!) > > > > > > > > > > > > File system is corrupted. Folders B, C and D are not reacheable > from > > > > > root. > > > > > > > > > > > > To fix this now we additionaly check if directory structure is > still > > > > > > valid *inside > > > > > > transaction*. It works, no more corruptions. But it requres > taking > > > locks > > > > > on > > > > > > the whole paths *including root*. So move, delete and mkdirs > > > opeartions > > > > > *can > > > > > > no longer be concurrent*. > > > > > > > > > > > > Probably there is a way to relax this while still ensuring > > > consistency, > > > > > but > > > > > > I do not see how. One idea is to store real path inside each > entry. > > > This > > > > > > way we will be able to ensure that it is still at a valid > location > > > > > without > > > > > > blocking parents, so concurrnecy will be restored. But we will > have > > > to > > > > > > propagate strucutral changes to children. E.g. move of a folder > with > > > 100 > > > > > > items will lead to update of >100 cache entries. Not so good. > > > > > > > > > > > > Any other ideas? > > > > > > > > > > > > Vladimir. > > > > > > > > > |
Ah, that makes sense - sorry I jumped to obviously wrong conclusion based on
existing HDFS secondary storage implementation. Cos On Tue, Oct 13, 2015 at 09:41AM, Vladimir Ozerov wrote: > Cos, > > We are trying to keep IGFS decoupled from HDFS. Instead, we define IGFS as > an in-memory file system which could optionally be a cache some secondary > file system. And HDFS is one of these file systems. > > For this reason we cannot rely on any specific behavior of particular > secondary file system. > > > On Tue, Oct 13, 2015 at 8:14 AM, Konstantin Boudnik <[hidden email]> wrote: > > > Thanks Vladimir. In the case of HDFS, the global lock for conflicting > > simultaneous changes ie > > > > mkdir /a/b/c > > together with > > mv /a/b /a/d > > > > will trigger the NN global lock on the file system so the latter operation > > won't start until the former is completed. Hence, the point of my > > confusion is > > why IGFS is adding an additional lock on top of what the secondary file > > system > > does anyway. Sorry, if I haven't expressed myself in a better way from the > > beginning. > > > > Cos > > > > On Thu, Oct 08, 2015 at 11:33AM, Vladimir Ozerov wrote: > > > Cos, > > > > > > Initially IGFS was designed to support concurrent structural updates. > > E.g., > > > creation of directories /A/C/D and /E/F/G can be performed concurrently. > > > Now we revealed that it might cause concurrency issues in case of > > > conflicting updates. > > > > > > To fix this problem we now perform every update holding a kind of global > > > file system lock. This doesn't affect file read/write operations - they > > > still can be performed concurrently of course. The question was whether > > > this could cause severe performance degradation or not. If we assume that > > > in average Hadoop jobs file operations dominate over structural > > operations > > > on directories, then it should not cause any significant performance > > issues. > > > > > > Vladimir. > > > > > > On Wed, Oct 7, 2015 at 10:38 PM, Konstantin Boudnik <[hidden email]> > > wrote: > > > > > > > On Wed, Oct 07, 2015 at 09:11AM, Vladimir Ozerov wrote: > > > > > Cos, > > > > > Yes, no long-time locking is expected here. > > > > > > > > Sorry, I musta be still dense from the jet-lag: could you put in a bit > > more > > > > details? Thanks in advance! > > > > > > > > Cos > > > > > > > > > On Wed, Oct 7, 2015 at 6:57 AM, Konstantin Boudnik <[hidden email]> > > > > wrote: > > > > > > > > > > > IIRC NN should be locking on these ops anyway, shouldn't it? The > > > > situation > > > > > > is > > > > > > no different if multiple clients are doing these operations > > > > > > near-simultaneously. Unless I missed something here... > > > > > > > > > > > > On Thu, Sep 24, 2015 at 11:28AM, Sergi Vladykin wrote: > > > > > > > May be just check that they are not parent-child within the tx? > > > > > > > > > > > > > > Sergi > > > > > > > Igniters, > > > > > > > > > > > > > > We revealed concurrency problem in IGFS and I would like to > > discuss > > > > > > > possible solutions to it. > > > > > > > > > > > > > > Consider the following file system structure: > > > > > > > root > > > > > > > |-- A > > > > > > > | |-- B > > > > > > > | | |-- C > > > > > > > | |-- D > > > > > > > > > > > > > > ... two concurrent operations in different threads: > > > > > > > T1: move(/A/B, /A/D); > > > > > > > T2: move(/A/D, /A/B/C); > > > > > > > > > > > > > > ... and how IGFS handles it now: > > > > > > > T1: verify that "/A/B" and "/A/D" exist, they are not > > child-parent to > > > > > > each > > > > > > > other, etc. -> OK. > > > > > > > T2: do the same for "A/D" and "A/B/C" -> OK. > > > > > > > T1: get IDs of "/A", "/A/B" and "/A/D" to lock them later inside > > tx. > > > > > > > T2: get IDs of "/A", "/A/D", "/A/B" and "/A/B/C" to lock them > > later > > > > > > inside > > > > > > > tx. > > > > > > > > > > > > > > T1: Start pessimistic tx, lock IDs of "/A", "/A/B", "/A/D", > > perform > > > > move > > > > > > -> > > > > > > > OK. > > > > > > > root > > > > > > > |-- A > > > > > > > | |-- D > > > > > > > | | |-- B > > > > > > > | | | |-- C > > > > > > > > > > > > > > T2: Start pessimistic tx, lock IDs of "/A", "/A/D", "/A/B" and > > > > > > > "/A/B/C" (*directory > > > > > > > structure already changed at this time!*), perform move -> OK. > > > > > > > root > > > > > > > |-- A > > > > > > > B > > > > > > > |-- D > > > > > > > | |-- C > > > > > > > | | |-- B (loop!) > > > > > > > > > > > > > > File system is corrupted. Folders B, C and D are not reacheable > > from > > > > > > root. > > > > > > > > > > > > > > To fix this now we additionaly check if directory structure is > > still > > > > > > > valid *inside > > > > > > > transaction*. It works, no more corruptions. But it requres > > taking > > > > locks > > > > > > on > > > > > > > the whole paths *including root*. So move, delete and mkdirs > > > > opeartions > > > > > > *can > > > > > > > no longer be concurrent*. > > > > > > > > > > > > > > Probably there is a way to relax this while still ensuring > > > > consistency, > > > > > > but > > > > > > > I do not see how. One idea is to store real path inside each > > entry. > > > > This > > > > > > > way we will be able to ensure that it is still at a valid > > location > > > > > > without > > > > > > > blocking parents, so concurrnecy will be restored. But we will > > have > > > > to > > > > > > > propagate strucutral changes to children. E.g. move of a folder > > with > > > > 100 > > > > > > > items will lead to update of >100 cache entries. Not so good. > > > > > > > > > > > > > > Any other ideas? > > > > > > > > > > > > > > Vladimir. > > > > > > > > > > > > |
Free forum by Nabble | Edit this page |