clayborg added a comment.

I am trying to switch to using llvm::sys::path::remove_dots() and we will see 
where we end up. By switching to this it means:

"./foo.c" --> "foo.c"
"./foo/bar.c" --> "foo/bar.c"

This makes is easier to see if a relative path matches another FileSpec since 
we don't have to remove the "./" from the beginning of the path.



================
Comment at: source/Breakpoint/BreakpointResolverFileLine.cpp:131
+    while (relative_path.consume_front("."))
+      /* Do nothing. */;
+  }
----------------
I will be removing this after I switch to using llvm::sys::path::remove_dots() 
instead of the Normalize() I converted.


================
Comment at: source/Utility/FileSpec.cpp:107
+  for (auto i = path.find('/'); i != llvm::StringRef::npos;
+       i = path.find('/', i + 1)) {
+    const auto nextChar = safeCharAtIndex(path, i+1);
----------------
labath wrote:
> aprantl wrote:
> > clayborg wrote:
> > > clayborg wrote:
> > > > aprantl wrote:
> > > > > clayborg wrote:
> > > > > > I am fine switching to using the llvm functions for removing, this 
> > > > > > is only detecting if any normalization needs to happen. If there is 
> > > > > > an equivalent LLVM function that will only run through the string 
> > > > > > one time to detect all needs for normalization (no multiple passes 
> > > > > > looking for "..", then for "." etc like we used to have), I would 
> > > > > > be happy to use it, but there doesn't seem to be. We are trying to 
> > > > > > avoid having to create a "SmallVectorImpl< char >" for all paths if 
> > > > > > they don't need fixing here. This function is just iterating 
> > > > > > through an llvm::StringRef just to see if normalization needs to 
> > > > > > happen. If it does, then we make a SmallVectorImpl< char > and we 
> > > > > > fix the path. We can easily use llvm functions for the fixing part. 
> > > > > I see, you want to avoid copying the string when it isn't necessary 
> > > > > to do so. IMHO the best way to do this is to add a `bool 
> > > > > hasDots(const Twine &)` function to llvm::sys::path and add a `bool 
> > > > > remove_dot_dot` parameter to `removeDots()`. Since you have already 
> > > > > written the unittests, adding the function to LLVM shouldn't be any 
> > > > > extra work (I'll help reviewing), and we'll have 100 LoC less to 
> > > > > maintain.
> > > > > 
> > > > > This way we can avoid having two functions that perform path 
> > > > > normalization that superficially look like they might do the same 
> > > > > thing, but a year from now nobody can really tell whether they behave 
> > > > > exactly the same, and whether a bug found in one implementation 
> > > > > should also be fixed in the other one, etc... .
> > > > I want to know if there are redundant slashes as well. I want something 
> > > > like "bool llvm::sys::has(const Twine &, bool dots, bool dot_dots, bool 
> > > > redundant_slashes)". But I don't see that getting accepted? I just want 
> > > > a "bool llvm:sys::path::contains_redundant_stuff(const Twine &)".  Not 
> > > > sure on the name though. needs_normalization? can_be_shortened? 
> > > > Everything in LLVM right now assumes that dots, dot dots, slashes and 
> > > > realpath stuff happen in the same function. Not sure how to break that 
> > > > up without ruining the performance of the one and only loop that should 
> > > > be happening. I like the current approach since it doesn't require 
> > > > chopping up the string into an array and iterating through the array.
> > > Also not sure if the LLVM stuff will try to substitute in the current 
> > > working directory for "."?
> > I just read the source code of remove_dots() 
> > (http://llvm.org/doxygen/Path_8cpp_source.html#l00699) and if I'm not 
> > mistaken, it actually also removes double-separators as a side-effect, 
> > since it iterates over the path components of the string as it constructs 
> > the copy. It also seems to avoid the copy operation if it isn't necessary.
> > 
> > Could you take another look? Perhaps I'm missing something, (or perhaps we 
> > can just turn this into a small addition to remove_dots).
> > 
> > > Everything in LLVM right now assumes that dots, dot dots, slashes and 
> > > realpath stuff happen in the same function.
> > I'm not sure I understand.
> FWIW, this could be implemented in a much simpler way, while still making 
> sure we run through the string only once. I'm thinking of something like:
> ```
> bool first = true;
> for(it = llvm::sys::path::begin(path, style), end = 
> llvm::sys::path::end(path); it != end; ++it) {
>   if (!first && (*it == "." || *it == "..")) return true;
>   first = false;
> }
> return false;
> ```
This will all go away, I am just going to call llvm::sys::path::remove_dots()...


================
Comment at: source/Utility/FileSpec.cpp:187
+        continue; // Skip '.' unless it is the first thing
+    }
+    else if (component != "..") {
----------------
Yes we would need to remove this. Again, I will switch to using 
llvm::sys::path::remove_dots and we can deal with the fallout there,.


================
Comment at: source/Utility/FileSpec.cpp:406
+      m_directory.Clear();
+  }
 }
----------------
> they resolve to the same file in a virtual filesystem, where all referenced 
> path components exist and are directories

Normalizing happens after resolving and if we don't resolve a path, we have no 
way to know what is a directory and what isn't. We will be setting breakpoints 
for remote targets quite a bit in LLDB and ww can't assume or stat anything in 
a path. So I would say FileSpec's are equivalent if the relative paths match 
all components.

> Now the (c) part would imply that foo/.. should normalize to ./. and not ., 
> which is a bit odd, but it is consistent with our stated intention of 
> preserving directory information. If we do not want to have ./. here, then we 
> need to come up with a different definition of what it means to be 
> "normalized".

I guess we could make the  m_directory contain "." and m_filename contain 
nothing for the "./." case. It doesn 't make  sense to have "." in both places.


================
Comment at: source/Utility/FileSpec.cpp:857
+        !IsPathSeparator(components[i].back(), syntax) &&
+        (result.empty() || !IsPathSeparator(result.back(), syntax)))
       result += GetPreferredPathSeparator(syntax);
----------------
labath wrote:
> Why is this necessary? `result` can't be empty because we have just appended 
> something to it (and we've checked that `components[i]` is non-empty).
Because if you don't check this joining {"/', "foo.txt"} will result in 
"//foo.txt" which is wrong,.


https://reviews.llvm.org/D45977



_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits

Reply via email to