mirror of
https://github.com/nim-lang/Nim.git
synced 2025-12-29 09:24:36 +00:00
* Add improved Windows UNC path support in std/os
Original issue: `std/os.createDir` tries to create every component of
the given path as a directory. The problem is that `createDir`
interprets every backslash/slash as a path separator. For a UNC path
this is incorrect. E.g. one UNC form is `\\Server\Volume\Path`. It's an
error to create the `\\Server` directory, as well as creating
`\\Server\Volume`.
Add `ntpath.nim` module with `splitDrive` proc. This implements UNC path
parsing as implemented in the Python `ntpath.py` module. The following
UNC forms are supported:
* `\\Server\Volume\Path`
* `\\?\Volume\Path`
* `\\?\UNC\Server\Volume\Path`
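As a rough illustration of how the three forms split, here is a minimal sketch. It assumes the new module is importable as `std/private/ntpath`; that import path is an assumption, not something stated in this message.

```nim
# Assumption: the module added by this change is importable under this path.
import std/private/ntpath

# Plain UNC path: server and volume together form the drive.
doAssert splitDrive(r"\\Server\Volume\Path") == (r"\\Server\Volume", r"\Path")
# Device path: the `?` marker and the volume form the drive.
doAssert splitDrive(r"\\?\Volume\Path") == (r"\\?\Volume", r"\Path")
# "UNC" device path: the prefix, server and volume form the drive.
doAssert splitDrive(r"\\?\UNC\Server\Volume\Path") ==
  (r"\\?\UNC\Server\Volume", r"\Path")
```

In each case the `path` part starts with a separator, so it can be handled like a rooted POSIX path.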
Improves support for UNC paths in various procs in `std/os`:
---
* pathnorm.addNormalizePath
  * Issue: This had incomplete support for UNC paths
    * The UNC prefix (first 2 characters of a UNC path) was assumed to
      be exactly `\\`, but it can be `//` and `\/`, etc. as well
    * Also, the UNC prefix must be normalized to the `dirSep` argument
      of `addNormalizePath`
  * Resolution: Changed to account for different UNC prefixes, and
    normalizing the prefixes according to `dirSep`
    * Affected procs that get tests: `relativePath`, `joinPath`
  * Issue: The server/volume part of UNC paths can be stripped when
    normalizing `..` path components
    * This error should be negligible, so ignoring this
* splitPath
  * Now make sure the UNC drive is not split; return the UNC drive as
    `head` if the UNC drive is the only component of the path
  * Consequently fixes `extractFilename`, `lastPathPart`
* parentDir / `/../`
  * Strip away drive before working on the path, prepending the drive
    after all work is done - prevents stripping UNC components
  * Return empty string if drive component is the only component; this
    is the behavior for POSIX paths as well
  * Alternative implementation: Just call something like
    `pathnorm.normalizePath(path & "/..")` for the whole proc - maybe
    too big of a change
* tailDir
  * If drive is present in path, just split that from path and return
    path
* parentDirs iterator
  * Uses `parentDir` for going backwards
  * When going forwards, first `splitDrive`, yield the drive field, and
    then iterate over path field as normal
* splitFile
  * Make sure path parsing stops at end of drive component
* createDir
  * Fixed by skipping drive part before creating directories
  * Alternative implementation: use `parentDirs` iterator instead of
    iterating over characters
    * Consequence is that it will try to create the root directory
* isRootDir
  * Changed to treat UNC drive alone as root (e.g. "//?/c:" is root)
  * This change prevents the empty string being yielded by the
    `parentDirs` iterator with `fromRoot = false`
* Internal `sameRoot`
  * The "root" refers to the drive, so `splitDrive` can be used here
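The `createDir` fix listed above (skip the drive, then create each remaining component) can be sketched roughly as follows. `dirsToCreate` is a hypothetical helper written only for illustration and is not part of std/os; it assumes the drive and path have already been split.

```nim
# Hypothetical helper for illustration only; not part of std/os.
# Given an already-split (drive, path) pair, list the directories a
# drive-aware `createDir` would attempt to create, in order.
func dirsToCreate(drive, path: string): seq[string] =
  const seps = {'\\', '/'}
  # Start at 1 so the leading separator (the root) is never created.
  for i in 1 ..< path.len:
    if path[i] in seps and path[i - 1] notin seps:
      result.add drive & path[0 ..< i]
  # Add the full path unless it is empty, the bare root, or already
  # covered because it ends in a separator.
  if path.len > 1 and path[^1] notin seps:
    result.add drive & path

doAssert dirsToCreate(r"\\Server\Volume", r"\a\b") ==
  @[r"\\Server\Volume\a", r"\\Server\Volume\a\b"]
doAssert dirsToCreate(r"\\Server\Volume", "").len == 0
```

Because the drive is only ever used as a prefix, neither `\\Server` nor `\\Server\Volume` appears on its own in the result.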
This adds UNC path support to all procs in std/os that could use it. I
don't think any more work has to be done to support UNC paths. For the
future, I believe the path handling code can be refactored to remove
duplicate code. There are currently multiple ways of manipulating paths,
such as manually searching a string for path separators and using the path
normalizer (pathnorm.nim). If all path manipulation used `pathnorm.nim`,
and path component splitting used the `parentDirs` iterator, then a lot of
code could be removed.
Tests
---
Added test file for `pathnorm.nim` and `ntpath.nim`.
`pathnorm.normalizePath` has no tests, so I'm adding a few unit tests.
`ntpath.nim` contains tests copied from Python's test suite.
Added integration tests to `tos.nim` that test UNC paths.
Removed incorrect `relativePath` runnableExamples from being tested on Windows:
---
`relativePath("/Users///me/bar//z.nim", "//Users/", '/') == "me/bar/z.nim"`
This is incorrect on Windows because the `/` and `//` are not the same
root. `/` (or `\`) is expanded to the drive in the current working
directory (e.g. `C:\`). `//` (or `\\`), however, are the first two
characters of a UNC path. The following holds true for normal Windows
installations:
* `dirExists("/Users") != dirExists("//Users")`
* `dirExists("\\Users") != dirExists("\\\\Users")`
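One way to see the difference without touching the file system is to look at how the two spellings parse. This is a minimal sketch that assumes the new module is importable as `std/private/ntpath` (the import path is an assumption):

```nim
# Assumption: the module added by this change is importable under this path.
import std/private/ntpath

# A single leading slash is relative to the current drive: no drive part.
doAssert splitDrive("/Users") == ("", "/Users")
# A double slash starts a UNC path, so the leading component is parsed
# as part of the drive rather than as a directory.
doAssert splitDrive("//Users/") == ("//Users/", "")
```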
Fixes #19103
Questions:
---
* Should the `splitDrive` proc be in `os.nim` instead with copyright
notice above the proc?
* Is it fine to put most of the new tests into the `runnableExamples`
section of the procs in std/os?
* [skipci] Apply suggestions from code review
Co-authored-by: Clay Sweetser <Varriount@users.noreply.github.com>
* [skip ci] Update lib/pure/os.nim
Co-authored-by: Clay Sweetser <Varriount@users.noreply.github.com>
* Move runnableExamples tests in os.nim to tos.nim
* tests/topt_no_cursor: Change from using splitFile to splitDrive
`splitFile` can no longer be used in the test, because it generates
different ARC code on Windows and Linux. This replaces `splitFile` with
`splitDrive`, because it generates the same ARC code on Windows and Linux
and also returns a tuple. I assume the test wants a proc that returns a
tuple.
* Drop copyright attribute to Python
Co-authored-by: Clay Sweetser <Varriount@users.noreply.github.com>
62 lines
2.3 KiB
Nim
# This module is inspired by Python's `ntpath.py` module.

import std/[
  strutils,
]

# Adapted `splitdrive` function from the following commits in Python source
# code:
# 5a607a3ee5e81bdcef3f886f9d20c1376a533df4 (2009): Initial UNC handling (by Mark Hammond)
# 2ba0fd5767577954f331ecbd53596cd8035d7186 (2022): Support for "UNC"-device paths (by Barney Gale)
#
# FAQ: Why use `strip` below? `\\?\UNC` is the start of a "UNC symbolic link",
# which is a special UNC form. Running `strip` differentiates `\\?\UNC\` (a UNC
# symbolic link) from e.g. `\\?\UNCD` (UNCD is the server in the UNC path).
func splitDrive*(p: string): tuple[drive, path: string] =
  ## Splits a Windows path into a drive and path part. The drive can be e.g.
  ## `C:`. It can also be a UNC path (`\\server\drive\path`).
  ##
  ## The equivalent `splitDrive` for POSIX systems always returns empty drive.
  ## Therefore this proc is only necessary on DOS-like file systems (together
  ## with Nim's `doslikeFileSystem` conditional variable).
  ##
  ## This proc's use case is to extract `path` such that it can be manipulated
  ## like a POSIX path.
  runnableExamples:
    doAssert splitDrive("C:") == ("C:", "")
    doAssert splitDrive(r"C:\") == (r"C:", r"\")
    doAssert splitDrive(r"\\server\drive\foo\bar") == (r"\\server\drive", r"\foo\bar")
    doAssert splitDrive(r"\\?\UNC\server\share\dir") == (r"\\?\UNC\server\share", r"\dir")

  result = ("", p)
  if p.len < 2:
    return
  const sep = '\\'
  let normp = p.replace('/', sep)
  if p.len > 2 and normp[0] == sep and normp[1] == sep and normp[2] != sep:

    # is a UNC path:
    # vvvvvvvvvvvvvvvvvvvv drive letter or UNC path
    # \\machine\mountpoint\directory\etc\...
    #           directory ^^^^^^^^^^^^^^^
    let start = block:
      const unc = "\\\\?\\UNC" # Length is 7
      let idx = min(8, normp.len)
      if unc == normp[0..<idx].strip(chars = {sep}, leading = false).toUpperAscii:
        8
      else:
        2
    let index = normp.find(sep, start)
    if index == -1:
      return
    var index2 = normp.find(sep, index + 1)

    # a UNC path can't have two slashes in a row (after the initial two)
    if index2 == index + 1:
      return
    if index2 == -1:
      index2 = p.len
    return (p[0..<index2], p[index2..^1])
  if p[1] == ':':
    return (p[0..1], p[2..^1])