Hi all,

I’m looking into https://issues.apache.org/jira/browse/IGNITE-7264, and I need some guidance on the best way to approach it.

The problem is that cache names are not restricted, but if persistence is enabled, each cache needs a corresponding directory on the file system (“cache-…”), which can’t be created if the cache name contains certain characters (or is a reserved system name).

A straightforward approach would be to check whether a cache name is allowed on the local system (e.g. via `Paths.get(name)`) and fail to create the cache if it isn’t, but I’m a bit concerned about the consistency of the behavior (the same cache name may be allowed on one system and not on another).
I think a better way would be to replace special characters (say, all non-alphanumeric characters) with underscores in file names (without changing the cache configuration). Would this be OK? Are there any risks I’m not considering?

WDYT?

Thanks,
Stan
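
For illustration only, the underscore replacement proposed in this message could look roughly like the sketch below. The class and method names are hypothetical and are not taken from the Ignite code base; "non-alphanumeric" is assumed here to mean anything outside ASCII letters and digits.

```java
// Hypothetical sketch, not Ignite code: derive a directory name from a cache
// name by replacing every non-alphanumeric character with an underscore.
final class CacheDirNameSketch {
    private CacheDirNameSketch() {
    }

    /** Replaces every character that is not an ASCII letter or digit with '_'. */
    static String sanitize(String cacheName) {
        StringBuilder sb = new StringBuilder(cacheName.length());
        for (int i = 0; i < cacheName.length(); i++) {
            char c = cacheName.charAt(i);
            boolean asciiAlnum = (c >= 'a' && c <= 'z')
                || (c >= 'A' && c <= 'Z')
                || (c >= '0' && c <= '9');
            sb.append(asciiAlnum ? c : '_');
        }
        return sb.toString();
    }

    /** Directory name with the "cache-" prefix mentioned in the thread. */
    static String dirName(String cacheName) {
        return "cache-" + sanitize(cacheName);
    }
}
```

One risk of this mapping is that it is not injective: under this sketch, "a/b", "a b" and "a_b" all map to the same directory name, so two distinct caches could end up sharing a directory unless collisions are detected separately.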
My preference would be to prohibit forward and backward slashes in cache names altogether, as they may create a false impression of a directory structure that does not exist. We should prohibit spaces as well.

D.

On Mon, Dec 25, 2017 at 7:09 AM, Stanislav Lukyanov <[hidden email]> wrote:
> The problem is that cache names are not restricted, but if persistence is
> enabled the cache needs to have a corresponding directory on the file
> system (“cache-…”) which can’t be created if the cache name contains
> certain characters (or a reserved system name).
It also makes sense to limit the cache name length to a reasonable value, because some file systems have limits on path length.
See: https://en.wikipedia.org/wiki/Filename#Length_restrictions

On Tue, Dec 26, 2017 at 1:41 AM, Dmitriy Setrakyan <[hidden email]> wrote:
> My preference would be to prohibit forward and backward slashes in cache
> names altogether, as they may create a false feeling of some directory
> structure, which does not exist. We should also prohibit spaces as well.
>
> D.

--
Alexey Kuznetsov
Thanks for the feedback.

It seems that another thing to handle is case-insensitive file systems: “mycache” and “MyCache” are the same name on Windows, so it might be reasonable to disallow two caches whose names are equal ignoring case.
One more thing is control characters: forbidding at least the ASCII range 0x00-0x20 seems reasonable.

To summarize, a possible set of restrictions would be
- Whitespace characters (via Character.isWhitespace)
- Control characters (via Character.isISOControl)
- Slashes
- Characters reserved in Windows (<>:"/\|?*)
- Length (say, up to 255)
- Distinct names of caches when ignoring case

It seems reasonable to enforce these regardless of how persistence directories are named (AFAIU that’s what Dmitriy meant by forbidding things altogether), so that’s what I’m going to do.
Any concerns?
Specifically, would it be OK from a backward compatibility point of view to forbid all these characters now for all caches?

Thanks,
Stan

From: Alexey Kuznetsov
Sent: 26 December 2017, 7:51
To: [hidden email]
Subject: Re: Handling slashes in cache names

It also make sense to limit cache name length to reasonable length.
Because some File systems could have limitations on path length.
See: https://en.wikipedia.org/wiki/Filename#Length_restrictions
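
The restriction list summarized in this message could be checked roughly as in the sketch below. This is only an illustration under stated assumptions: the class name, method names and the `MAX_LEN` constant are hypothetical, and the reserved character set is the one quoted in the message. `Character.isWhitespace` and `Character.isISOControl` are the standard JDK methods.

```java
import java.util.Collection;

// Hypothetical sketch, not Ignite code: validates a cache name against the
// restrictions listed in the thread.
final class CacheNameRulesSketch {
    /** Assumed length limit ("say, up to 255" in the thread). */
    static final int MAX_LEN = 255;

    /** Characters reserved in Windows file names, as listed in the thread. */
    static final String RESERVED = "<>:\"/\\|?*";

    private CacheNameRulesSketch() {
    }

    /** Returns a description of the first violated rule, or null if the name passes. */
    static String firstViolation(String name) {
        if (name.length() > MAX_LEN)
            return "longer than " + MAX_LEN + " characters";

        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (Character.isWhitespace(c))
                return "whitespace at index " + i;
            if (Character.isISOControl(c))
                return "control character at index " + i;
            if (RESERVED.indexOf(c) >= 0)
                return "reserved character '" + c + "' at index " + i;
        }
        return null;
    }

    /** True if an existing cache name differs from the given one only by case. */
    static boolean clashesIgnoringCase(String name, Collection<String> existing) {
        for (String other : existing) {
            if (other.equalsIgnoreCase(name) && !other.equals(name))
                return true;
        }
        return false;
    }
}
```

Because the check relies only on character classes and a fixed list, it gives the same answer on every operating system, which addresses the consistency concern raised at the start of the thread.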
Looks good to me. Is this going to be an exception on startup? If yes, is it safe to release it, or should we wait till 3.0?

On Tue, Dec 26, 2017 at 2:08 AM, Stanislav Lukyanov <[hidden email]> wrote:
> Specifically, would it be OK from backward compatibility point of view to
> forbid all these characters now for all caches?
There are also some international features that you might want to address. For example, other characters may be used in place of the backslash on Windows: ¥ on the Japanese version, ₩ on the Korean version.
See [1] for more info.

Here is the citation:
Security Considerations for Character Sets in File Names
Windows code page and OEM character sets used on Japanese-language systems contain the Yen symbol (¥) instead of a backslash (\). Thus, the Yen character is a prohibited character for NTFS and FAT file systems. When mapping Unicode to a Japanese-language code page, conversion functions map both backslash (U+005C) and the normal Unicode Yen symbol (U+00A5) to this same character. For security reasons, your applications should not typically allow the character U+00A5 in a Unicode string that might be converted for use as a FAT file name.

[1] - https://msdn.microsoft.com/en-us/library/dd374047(v=vs.85).aspx

Best Regards,
Igor

On Tue, Dec 26, 2017 at 5:01 PM, Dmitriy Setrakyan <[hidden email]> wrote:
> Looks good to me. Is this going to be an exception on startup? If yes, is
> it safe to release it, or should we wait till 3.0?
In reply to this post by dsetrakyan
Well, that’s my question too :)
Do we have any compatibility guidelines or other documents on what can or cannot go into a minor/major release?

Also, it might be helpful to add an environment variable (like IGNITE_DISABLE_CACHE_NAME_RESTRICTIONS) to restore the old behavior, just in case.

Thanks,
Stan

From: Dmitriy Setrakyan
Sent: 26 December 2017, 17:02
To: [hidden email]
Subject: Re: Handling slashes in cache names

Looks good to me. Is this going to be an exception on startup? If yes, is it safe to release it, or should we wait till 3.0?
A cache name appears to me to be a purely logical entity. Can we simply store the cache ID in file system paths, without adding any restrictions to cache names?

On Wed, Dec 27, 2017 at 2:26 PM, Stanislav Lukyanov <[hidden email]> wrote:
> Also, it might be helpful to add an environment variable (like
> IGNITE_DISABLE_CACHE_NAME_RESTRICTIONS) to restore the old behavior, just
> in case.
In reply to this post by Igor Sapego-2
That’s interesting, thanks.
So, do you think the locale-specific file separators should be banned as well?
Handling all cases like this might be complicated.

I’d rather use something else, such as a hash of the cache name, if the cache name is not a valid file name.
This way all corner cases can be handled at once.
The algorithm would be
1) Check that the cache name doesn’t contain banned characters
2) Try to create a Path for “cache-<cache name>”
3) If that fails, create a Path for “cache-<cache name hash>”

Stan

From: Igor Sapego
Sent: 26 December 2017, 17:59
To: [hidden email]
Subject: Re: Handling slashes in cache names

There are also some international features that you might want to address. For example, instead of backslash some other characters may be used on Windows - ¥ on the Japanese version, ₩ on the Korean version.
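
Steps 2 and 3 of the algorithm in this message could be sketched as follows. This is a minimal illustration, not Ignite code: the class and method names are hypothetical, step 1 (the banned-character check) is left out, and `String.hashCode()` rendered as hex is assumed as the hash only because the thread does not name one.

```java
import java.nio.file.InvalidPathException;
import java.nio.file.Paths;

// Hypothetical sketch, not Ignite code: use "cache-<name>" when the local file
// system accepts it as a path, otherwise fall back to "cache-<hash of name>".
final class CacheDirFallbackSketch {
    private CacheDirFallbackSketch() {
    }

    static String dirName(String cacheName) {
        String preferred = "cache-" + cacheName;
        try {
            // Throws InvalidPathException if the string cannot be a path here.
            Paths.get(preferred);
            return preferred;
        } catch (InvalidPathException e) {
            // Assumed hash: String.hashCode() in hex; a real implementation
            // would likely want a hash with a lower collision risk.
            return "cache-" + Integer.toHexString(cacheName.hashCode());
        }
    }
}
```

Two caveats apply to this sketch. `Paths.get` accepts names containing path separators (they simply become nested paths), which is why step 1 is still needed. The outcome also depends on the local file system, so the same cache name can map to different directory names on different platforms.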
Agree with Stan and Vladimir.

We should not impose any restrictions on cache names; some users may have issues with that.
Using cache names as file names is an internal implementation detail.
We can use the cache ID or some kind of encoding (Base64, etc.) to avoid file system issues.

Thanks,
Pavel

On Wed, Dec 27, 2017 at 2:38 PM, Stanislav Lukyanov <[hidden email]> wrote:
> I’d rather use something else if the cache name is not a valid file name,
> a hash of the cache name.
> This way all corner cases can be handled at once.
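
The encoding idea in this message could be sketched with the JDK's `java.util.Base64` as shown below. The class and method names are hypothetical; the URL-safe alphabet without padding is an assumption, chosen because it produces only letters, digits, '-' and '_'.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical sketch, not Ignite code: map any cache name to a reversible,
// file-system-friendly directory name using URL-safe Base64.
final class CacheDirBase64Sketch {
    private static final String PREFIX = "cache-";

    private CacheDirBase64Sketch() {
    }

    /** Encodes the UTF-8 bytes of the cache name with the URL-safe alphabet, no padding. */
    static String dirName(String cacheName) {
        byte[] utf8 = cacheName.getBytes(StandardCharsets.UTF_8);
        return PREFIX + Base64.getUrlEncoder().withoutPadding().encodeToString(utf8);
    }

    /** Recovers the original cache name from a directory name produced by dirName. */
    static String cacheName(String dirName) {
        String encoded = dirName.substring(PREFIX.length());
        return new String(Base64.getUrlDecoder().decode(encoded), StandardCharsets.UTF_8);
    }
}
```

Unlike a hash, the mapping is reversible, so the cache name can be recovered from the directory. One caveat: Base64 is case-sensitive, so on a case-insensitive file system two different encoded names could still refer to the same directory; a hex or Base32 encoding would avoid that at the cost of longer names.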
In reply to this post by Vladimir Ozerov
We can – by mapping a cache name to some (safe) string to be used as a directory name, say via Base64 as Pavel has suggested.

However, I think that banning certain characters might be reasonable.
Some characters might be considered reserved (e.g. slashes, colons, asterisks, etc.) to be used later, in case some future feature requires cache names to have an actual meaning.
Some characters might be banned just as a precaution (e.g. control characters or whitespace) because they might cause problems with logging or elsewhere (you might have a bad time processing a cache name with \0 in it :) ).

The question is whether these considerations are worth adding code and/or changing existing behavior.

BTW, the Java folks had a similar discussion on Java module names, resulting in http://mail.openjdk.java.net/pipermail/jpms-spec-experts/2016-December/000515.html.

Thanks,
Stan

From: Vladimir Ozerov
Sent: December 27, 2017, 14:37
To: [hidden email]
Subject: Re: Handling slashes in cache names

Cache name appears to me purely logical entity. Can we simply store cache ID in file system paths without adding any restrictions to cache names?
|
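To make the Base64 idea concrete, here is a minimal sketch, not the actual Ignite implementation. The class name `CacheDirBase64` and its methods are hypothetical; the `cache-` prefix is taken from the directory naming mentioned in the original post. It uses the URL-safe Base64 alphabet on purpose, because the standard alphabet contains `/`, which is exactly the character that started this thread.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Hypothetical helper: maps a cache name to a directory-safe string and back. */
final class CacheDirBase64 {
    private static final String PREFIX = "cache-";

    private CacheDirBase64() {}

    /** Encodes the UTF-8 bytes of the name with URL-safe Base64, without '=' padding. */
    static String toDirName(String cacheName) {
        byte[] bytes = cacheName.getBytes(StandardCharsets.UTF_8);
        return PREFIX + Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }

    /** Reverses toDirName; the cache configuration itself is never changed. */
    static String toCacheName(String dirName) {
        String encoded = dirName.substring(PREFIX.length());
        return new String(Base64.getUrlDecoder().decode(encoded), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String dir = toDirName("my/cache:1");
        System.out.println(dir + " -> " + toCacheName(dir));
    }
}
```

Two caveats: the directory name is no longer human readable, and Base64 output mixes upper and lower case, so it does not by itself help on case-insensitive file systems.
|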
Banning special characters is an exclusion-based approach, and it cannot be controlled in the future if new problematic symbols arise.
Maybe a better solution would be to choose a set of permitted symbols for cache names (e.g. [a-zA-Z0-9_-])?

Also +1 for using an abstract hash string for directory names.

> On 27 Dec 2017, at 15:14, Stanislav Lukyanov <[hidden email]> wrote:
>
> We can – by mapping a cache name to some (safe) string to be used as a directory name, say via Base64 as Pavel has suggested.
>
> However, I think that banning certain characters might be reasonable.
> Some characters might be considered reserved (e.g. slashes, colon, asterisk, etc) to be used later, in case some future feature requires cache names to have an actual meaning.
> Some characters might be banned just as a precaution (e.g. control characters or whitespaces) because they might cause problems with logging or elsewhere (you might have a bad time processing a cache name with \0 in it :) ).
>
> The question is whether or not these considerations worth adding code and/or changing existing behavior.
>
> BTW Java folks had similar discussion on Java module names resulting in http://mail.openjdk.java.net/pipermail/jpms-spec-experts/2016-December/000515.html.
>
> Thanks,
> Stan
|
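The whitelist approach proposed above could be checked roughly as follows. This is only a sketch: the class `CacheNameWhitelist` and the method `isValid` are hypothetical names, and the character class is the one suggested in the message, not an agreed-upon rule.

```java
import java.util.regex.Pattern;

/** Hypothetical validator: accepts only names built from an explicit set of permitted characters. */
final class CacheNameWhitelist {
    // Character class taken from the proposal: letters, digits, underscore and hyphen.
    private static final Pattern ALLOWED = Pattern.compile("[a-zA-Z0-9_-]+");

    private CacheNameWhitelist() {}

    /** Returns true only if the whole name is non-empty and consists of permitted characters. */
    static boolean isValid(String cacheName) {
        return cacheName != null && ALLOWED.matcher(cacheName).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("my_cache-1")); // permitted characters only
        System.out.println(isValid("my/cache"));   // contains a slash
    }
}
```

Note that a whitelist that includes both upper-case and lower-case letters still allows names that differ only by case, so it would need to be combined with a separate uniqueness check on case-insensitive file systems.
|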
I personally like Pavel's suggestion: Base64 encoding seems like a good solution, while string hashes would introduce a collision issue.

Best Regards,
Igor

On Wed, Dec 27, 2017 at 3:29 PM, Petr Ivanov <[hidden email]> wrote:

> Special characters banning seems to be exclusive way and cannot be controlled in future if new symbols arise.
> Maybe better solution will be choosing the array of permitted symbols for caches names (i.e. [a-zA-Z0-9_-])?
>
> Also +1 for using abstract hash string for directories names.
|
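The collision concern is easy to reproduce. The sketch below assumes, purely for illustration, that the directory name would be derived from `String.hashCode()`; the thread does not say which hash function would be used, and `dirNameByHash` is a hypothetical helper.

```java
/** Demonstrates that two different cache names can map to the same hash-based directory name. */
final class HashCollisionDemo {
    private HashCollisionDemo() {}

    /** Hypothetical mapping: directory name built from the 32-bit String hash. */
    static String dirNameByHash(String cacheName) {
        return "cache-" + Integer.toHexString(cacheName.hashCode());
    }

    public static void main(String[] args) {
        // "Aa" and "BB" are a well-known pair with identical String.hashCode() values.
        System.out.println(dirNameByHash("Aa"));
        System.out.println(dirNameByHash("BB"));
    }
}
```

Any fixed-size hash has this property in principle, so a hash-based scheme would need either a collision check at cache creation or a reversible encoding instead.
|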
Also, considering the case-insensitivity issue, we need to choose an encoding that uses only upper-case or only lower-case letters in its output. By the way, such an encoding would also resolve cache name clashes caused by case-insensitive file systems.

Best Regards,
Igor

On Wed, Dec 27, 2017 at 4:18 PM, Igor Sapego <[hidden email]> wrote:

> I personally like a Pavel's suggestion - base64 encoding seems like
> a good solution, while string hashes will arise a collision issue.
>
> Best Regards,
> Igor
|
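One encoding with the single-case property described above is lower-case hexadecimal. The sketch below is an illustration only: hex is my choice of example (the message does not name a specific encoding), and `CacheDirHex` with its methods is hypothetical.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: reversible, lower-case-only encoding of a cache name. */
final class CacheDirHex {
    private static final String PREFIX = "cache-";

    private CacheDirHex() {}

    /** Writes every UTF-8 byte of the name as two lower-case hex digits. */
    static String toDirName(String cacheName) {
        StringBuilder sb = new StringBuilder(PREFIX);
        for (byte b : cacheName.getBytes(StandardCharsets.UTF_8))
            sb.append(String.format("%02x", b & 0xff));
        return sb.toString();
    }

    /** Reverses toDirName by parsing the hex digits pairwise. */
    static String toCacheName(String dirName) {
        String hex = dirName.substring(PREFIX.length());
        byte[] bytes = new byte[hex.length() / 2];
        for (int i = 0; i < bytes.length; i++)
            bytes[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Names that differ only by case get directory names that differ by digits.
        System.out.println(toDirName("MyCache"));
        System.out.println(toDirName("mycache"));
    }
}
```

The trade-off is that the directory name is unreadable and twice as long as the UTF-8 form of the cache name, which matters if a length limit on names or paths applies.
|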
In reply to this post by Pavel Tupitsyn
On Wed, Dec 27, 2017 at 3:42 AM, Pavel Tupitsyn <[hidden email]> wrote:

> Agree with Stan and Vladimir.
> We should not impose any restrictions on cache names, some users may have issues with that.
>
> Using cache names as file names is internal implementation detail.
> We can use cache id or some kind of encoding (base64, etc) to avoid file system issues.

Pavel, I disagree. I want to look at the file system and be able to clearly tell which folder belongs to which cache. If you use encryption or some other encoding, this would be impossible.

I doubt that introducing cache name validation for *persistent* caches would affect any existing users. It sounds like for non-persistent caches the validation is not needed, right?

D.
|
Having different policies for persistent and non-persistent caches sounds like a bad idea to me, because there could be trouble if a user tries to switch to persistent mode: it would require code changes.

Can we just escape all non-Latin symbols (e.g. using Base64), while leaving the rest as is? With this approach, in most cases the cache name will remain the same, and only multibyte characters would be affected.

On Wed, Dec 27, 2017 at 5:15 PM, Dmitriy Setrakyan <[hidden email]> wrote:

> Pavel, I disagree. I want to look at the file system and be able to clearly tell which folder belongs to which cache. If you use encryption or some other encoding, this would be impossible.
>
> I doubt that introducing cache name validation for *persistent* caches would affect any existing users. It sounds like for non-persistent caches the validation is not needed, right?
>
> D.
|
On Wed, Dec 27, 2017 at 6:25 AM, Vladimir Ozerov <[hidden email]> wrote:

> Having different policies for persistent and non-persistent caches sounds like a bad idea for me, because there could be troubles should user try to switch to persistent mode. It would require code changes.
>
> Can we just escape all non-latin symbols (e.g. using base64), while leaving the rest as is? With this approach in most cases cache name will remain the same, and only multibyte characters would be affected.

Agree, if we can keep cache names in human readable form. Would be nice to see some examples.
|
Yep, base64 is just an example.
We need some kind of urlencode, but tailored for file names, so that names remain readable.

To avoid uppercase/lowercase collisions on Windows, we can restrict the allowed characters to lowercase English letters, digits, - and _, and escape everything else in some way.

On Wed, Dec 27, 2017 at 5:36 PM, Dmitriy Setrakyan <[hidden email]> wrote:

> Agree, if we can keep cache names in human readable form. Would be nice to see some examples.
|
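Since examples were requested, here is one possible shape of such a file-name-oriented escaping. It is a sketch under assumptions: the class `CacheNameEscaper` is hypothetical, and using `%` followed by two lower-case hex digits per UTF-8 byte is my choice of escape syntax, not something the thread settled on. Upper-case letters are escaped too, so the result never depends on case.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical escaper: keeps [a-z0-9_-] as is and escapes every other byte as %xx. */
final class CacheNameEscaper {
    private CacheNameEscaper() {}

    private static boolean isSafe(int b) {
        return (b >= 'a' && b <= 'z') || (b >= '0' && b <= '9') || b == '-' || b == '_';
    }

    static String escape(String cacheName) {
        StringBuilder sb = new StringBuilder();
        for (byte raw : cacheName.getBytes(StandardCharsets.UTF_8)) {
            int b = raw & 0xff;
            if (isSafe(b))
                sb.append((char) b);
            else
                sb.append('%').append(String.format("%02x", b)); // '%' itself is escaped as %25
        }
        return sb.toString();
    }

    static String unescape(String escaped) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < escaped.length(); i++) {
            char c = escaped.charAt(i);
            if (c == '%') {
                out.write(Integer.parseInt(escaped.substring(i + 1, i + 3), 16));
                i += 2; // skip the two hex digits just consumed
            } else {
                out.write(c);
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        for (String name : new String[] {"my_cache-1", "my/cache", "MyCache"})
            System.out.println(name + " -> " + escape(name));
    }
}
```

With this scheme `my_cache-1` stays `my_cache-1`, `my/cache` becomes `my%2fcache`, and `MyCache` becomes `%4dy%43ache`. Lower-case names stay fully readable, while names with many upper-case or non-Latin characters degrade in readability and grow up to three times in length.
|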
Igniters,

Using the cache name for file and directory names on a file system is a bad idea. In that case we have to keep in mind the many limitations of various file systems. Why not map the cache name to an identifier that is tolerant of file system limitations?

On Wed, Dec 27, 2017 at 7:05 PM, Pavel Tupitsyn <[hidden email]> wrote:

> Yep, base64 is just an example.
> We need some kind of urlencode, but tailored for file names, so that names remain readable.
>
> To avoid uppercase/lowercase collisions on Windows, we can restrict allowed characters to lowercase English letters and numbers, - and _, and escape everything else in some way.

--
Sergey Kozlov
GridGain Systems
www.gridgain.com
|
In reply to this post by Pavel Tupitsyn
On Wed, Dec 27, 2017 at 8:05 AM, Pavel Tupitsyn <[hidden email]> wrote:

> Yep, base64 is just an example.
> We need some kind of urlencode, but tailored for file names, so that names remain readable.
>
> To avoid uppercase/lowercase collisions on Windows, we can restrict allowed characters to lowercase English letters and numbers, - and _, and escape everything else in some way.

I think that we should allow users to specify any case they like, but internally we should always convert to upper or lower case, whichever one we choose.
|
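Internal case conversion implies that two user-visible names can fold to the same internal name, so a clash check is needed somewhere. The sketch below shows one way this could look; `CaseFoldingRegistry` is a hypothetical class, and folding to lower case with `Locale.ROOT` is an assumption (the message leaves the choice of case open).

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical registry: users keep their spelling, internal keys are case-folded. */
final class CaseFoldingRegistry {
    // Folded (internal) name -> name exactly as the user specified it.
    private final Map<String, String> byFoldedName = new HashMap<>();

    /** Locale.ROOT keeps the folding independent of the machine's default locale. */
    static String fold(String cacheName) {
        return cacheName.toLowerCase(Locale.ROOT);
    }

    /** Registers a name; rejects it if a different name with the same folded form exists. */
    void register(String cacheName) {
        String existing = byFoldedName.putIfAbsent(fold(cacheName), cacheName);
        if (existing != null && !existing.equals(cacheName))
            throw new IllegalArgumentException("Cache name '" + cacheName
                + "' clashes with '" + existing + "' when case is ignored");
    }

    public static void main(String[] args) {
        CaseFoldingRegistry registry = new CaseFoldingRegistry();
        registry.register("MyCache");
        try {
            registry.register("mycache");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Using a fixed locale matters here: with the default locale, the same name could fold differently on differently configured nodes (the Turkish dotless i is the usual example).
|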