clayborg added a comment. I am trying to switch to using llvm::sys::path::remove_dots() and we will see where we end up. By switching to this it means:
"./foo.c" --> "foo.c" "./foo/bar.c" --> "foo/bar.c" This makes is easier to see if a relative path matches another FileSpec since we don't have to remove the "./" from the beginning of the path. ================ Comment at: source/Breakpoint/BreakpointResolverFileLine.cpp:131 + while (relative_path.consume_front(".")) + /* Do nothing. */; + } ---------------- I will be removing this after I switch to using llvm::sys::path::remove_dots() instead of the Normalize() I converted. ================ Comment at: source/Utility/FileSpec.cpp:107 + for (auto i = path.find('/'); i != llvm::StringRef::npos; + i = path.find('/', i + 1)) { + const auto nextChar = safeCharAtIndex(path, i+1); ---------------- labath wrote: > aprantl wrote: > > clayborg wrote: > > > clayborg wrote: > > > > aprantl wrote: > > > > > clayborg wrote: > > > > > > I am fine switching to using the llvm functions for removing, this > > > > > > is only detecting if any normalization needs to happen. If there is > > > > > > an equivalent LLVM function that will only run through the string > > > > > > one time to detect all needs for normalization (no multiple passes > > > > > > looking for "..", then for "." etc like we used to have), I would > > > > > > be happy to use it, but there doesn't seem to be. We are trying to > > > > > > avoid having to create a "SmallVectorImpl< char >" for all paths if > > > > > > they don't need fixing here. This function is just iterating > > > > > > through an llvm::StringRef just to see if normalization needs to > > > > > > happen. If it does, then we make a SmallVectorImpl< char > and we > > > > > > fix the path. We can easily use llvm functions for the fixing part. > > > > > I see, you want to avoid copying the string when it isn't necessary > > > > > to do so. IMHO the best way to do this is to add a `bool > > > > > hasDots(const Twine &)` function to llvm::sys::path and add a `bool > > > > > remove_dot_dot` parameter to `removeDots()`. Since you have already > > > > > written the unittests, adding the function to LLVM shouldn't be any > > > > > extra work (I'll help reviewing), and we'll have 100 LoC less to > > > > > maintain. > > > > > > > > > > This way we can avoid having two functions that perform path > > > > > normalization that superficially look like they might do the same > > > > > thing, but a year from now nobody can really tell whether they behave > > > > > exactly the same, and whether a bug found in one implementation > > > > > should also be fixed in the other one, etc... . > > > > I want to know if there are redundant slashes as well. I want something > > > > like "bool llvm::sys::has(const Twine &, bool dots, bool dot_dots, bool > > > > redundant_slashes)". But I don't see that getting accepted? I just want > > > > a "bool llvm:sys::path::contains_redundant_stuff(const Twine &)". Not > > > > sure on the name though. needs_normalization? can_be_shortened? > > > > Everything in LLVM right now assumes that dots, dot dots, slashes and > > > > realpath stuff happen in the same function. Not sure how to break that > > > > up without ruining the performance of the one and only loop that should > > > > be happening. I like the current approach since it doesn't require > > > > chopping up the string into an array and iterating through the array. > > > Also not sure if the LLVM stuff will try to substitute in the current > > > working directory for "."? > > I just read the source code of remove_dots() > > (http://llvm.org/doxygen/Path_8cpp_source.html#l00699) and if I'm not > > mistaken, it actually also removes double-separators as a side-effect, > > since it iterates over the path components of the string as it constructs > > the copy. It also seems to avoid the copy operation if it isn't necessary. > > > > Could you take another look? Perhaps I'm missing something, (or perhaps we > > can just turn this into a small addition to remove_dots). > > > > > Everything in LLVM right now assumes that dots, dot dots, slashes and > > > realpath stuff happen in the same function. > > I'm not sure I understand. > FWIW, this could be implemented in a much simpler way, while still making > sure we run through the string only once. I'm thinking of something like: > ``` > bool first = true; > for(it = llvm::sys::path::begin(path, style), end = > llvm::sys::path::end(path); it != end; ++it) { > if (!first && (*it == "." || *it == "..")) return true; > first = false; > } > return false; > ``` This will all go away, I am just going to call llvm::sys::path::remove_dots()... ================ Comment at: source/Utility/FileSpec.cpp:187 + continue; // Skip '.' unless it is the first thing + } + else if (component != "..") { ---------------- Yes we would need to remove this. Again, I will switch to using llvm::sys::path::remove_dots and we can deal with the fallout there,. ================ Comment at: source/Utility/FileSpec.cpp:406 + m_directory.Clear(); + } } ---------------- > they resolve to the same file in a virtual filesystem, where all referenced > path components exist and are directories Normalizing happens after resolving and if we don't resolve a path, we have no way to know what is a directory and what isn't. We will be setting breakpoints for remote targets quite a bit in LLDB and ww can't assume or stat anything in a path. So I would say FileSpec's are equivalent if the relative paths match all components. > Now the (c) part would imply that foo/.. should normalize to ./. and not ., > which is a bit odd, but it is consistent with our stated intention of > preserving directory information. If we do not want to have ./. here, then we > need to come up with a different definition of what it means to be > "normalized". I guess we could make the m_directory contain "." and m_filename contain nothing for the "./." case. It doesn 't make sense to have "." in both places. ================ Comment at: source/Utility/FileSpec.cpp:857 + !IsPathSeparator(components[i].back(), syntax) && + (result.empty() || !IsPathSeparator(result.back(), syntax))) result += GetPreferredPathSeparator(syntax); ---------------- labath wrote: > Why is this necessary? `result` can't be empty because we have just appended > something to it (and we've checked that `components[i]` is non-empty). Because if you don't check this joining {"/', "foo.txt"} will result in "//foo.txt" which is wrong,. https://reviews.llvm.org/D45977 _______________________________________________ lldb-commits mailing list lldb-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits