In Best Practices for Using Strings in the .NET Framework, StringComparison.OrdinalIgnoreCase is recommended for case-insensitive file paths. (Let's call this Statement A.)
I can agree with that, because I can create two files in the same directory:
é.txt
é.txt
Their filenames are not the same: the second one is composed of e followed by a combining accent, so it actually contains two characters where the first contains one. (You can try it yourself using copy-paste.)
If an invariant-culture comparison (rather than an ordinal one) were in effect, NTFS wouldn't allow both files, because the same article explains that under the invariant culture a + ̊ = å.
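To make the difference concrete, here is a minimal sketch that compares the two names under several StringComparison modes. The class name, constants, and helper are mine, and I'm assuming the two names are the precomposed U+00E9 and e followed by U+0301. Going by the article's a + ̊ example, I'd expect the culture-sensitive modes to report the names as equal, but that can depend on the runtime's globalization settings, so the sketch only prints the results.

```csharp
using System;

public static class FileNameComparisonDemo
{
    // Assumption: the first name uses the precomposed character U+00E9,
    // the second uses 'e' followed by the combining acute accent U+0301.
    public const string Precomposed = "\u00E9.txt";
    public const string Decomposed = "e\u0301.txt";

    // Hypothetical helper: thin wrapper so each mode can be tried uniformly.
    public static bool AreEqual(string a, string b, StringComparison mode)
        => string.Equals(a, b, mode);

    public static void Main()
    {
        // The strings differ in length: one UTF-16 code unit vs. two for the accented letter.
        Console.WriteLine($"Lengths: {Precomposed.Length} vs {Decomposed.Length}");

        var modes = new[]
        {
            StringComparison.Ordinal,
            StringComparison.OrdinalIgnoreCase,
            StringComparison.InvariantCulture,
            StringComparison.InvariantCultureIgnoreCase,
        };

        // Ordinal modes compare code units, so they see two different names.
        // Culture-sensitive modes may treat the two forms as equivalent.
        foreach (StringComparison mode in modes)
        {
            Console.WriteLine($"{mode}: {AreEqual(Precomposed, Decomposed, mode)}");
        }
    }
}
```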
But the article on String.ToUpperInvariant() gives a different recommendation (let's call it Statement B):
If you need the lowercase or uppercase version of an operating system identifier, such as a file name, named pipe, or registry key, use the ToLowerInvariant or ToUpperInvariant methods.
I need to create a collection of file paths (in fact a HashSet) to detect duplicates. So if I follow Statement B when building the set, I could end up with false positives, because the two filenames shown above would be treated as one. Am I understanding correctly that Statement B, as found on MSDN, is misleading? Or am I missing something?
I'm about to build a library, preferably without known bugs from the start, so I don't want to neglect this.
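For reference, this is roughly how I would test both approaches side by side. It is only a sketch: the class and method names are hypothetical, and the sample paths assume the precomposed and decomposed forms described earlier, plus an uppercase variant. It prints how many distinct entries each approach keeps, so the two can be compared directly.

```csharp
using System;
using System.Collections.Generic;

public static class DuplicatePathDemo
{
    // Statement A: let the set compare with OrdinalIgnoreCase.
    public static int CountWithOrdinalIgnoreCase(IEnumerable<string> paths)
    {
        var set = new HashSet<string>(paths, StringComparer.OrdinalIgnoreCase);
        return set.Count;
    }

    // Statement B: store keys converted with ToUpperInvariant, compared ordinally.
    public static int CountWithUpperInvariantKeys(IEnumerable<string> paths)
    {
        var set = new HashSet<string>(StringComparer.Ordinal);
        foreach (string path in paths)
        {
            set.Add(path.ToUpperInvariant());
        }
        return set.Count;
    }

    public static void Main()
    {
        // Assumed sample data: precomposed, decomposed, and an uppercase variant.
        string[] paths = { "\u00E9.txt", "e\u0301.txt", "\u00C9.TXT" };

        Console.WriteLine($"OrdinalIgnoreCase set: {CountWithOrdinalIgnoreCase(paths)}");
        Console.WriteLine($"ToUpperInvariant keys: {CountWithUpperInvariantKeys(paths)}");
    }
}
```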
Update:
Statement B seems to have one more issue: ToLowerInvariant() cannot actually be used. The Best Practices article says: "DO: Use ToUpperInvariant rather than ToLowerInvariant when normalizing strings for comparison." The reason given: "There is a small range of characters that do not roundtrip, and going to lowercase will make these characters unavailable." (source)