PathSpec
========

*pathspec* is a utility library for pattern matching of file paths. So far this
only includes Git's `gitignore`_ pattern matching.

.. _`gitignore`: http://git-scm.com/docs/gitignore

Tutorial
--------
Say you have a "Projects" directory and you want to back up only certain
files, ignoring others depending on certain conditions::

    >>> from pathspec import PathSpec
    >>> # The gitignore-style patterns for files to select, but we're including
    >>> # instead of ignoring.
    >>> spec_text = """
    ...
    ... # This is a comment because the line begins with a hash: "#"
    ...
    ... # Include several project directories (and all descendants) relative to
    ... # the current directory. To reference only a directory you must end with a
    ... # slash: "/"
    ... /project-a/
    ... /project-b/
    ... /project-c/
    ...
    ... # Patterns can be negated by prefixing with an exclamation mark: "!"
    ...
    ... # Ignore temporary files beginning or ending with "~", and files ending
    ... # with ".swp".
    ... !~*
    ... !*~
    ... !*.swp
    ...
    ... # These are Python projects so ignore compiled Python files from
    ... # testing.
    ... !*.pyc
    ...
    ... # Ignore the build directories but only directly under the project
    ... # directories.
    ... !/*/build/
    ...
    ... """

The PathSpec class provides an abstraction around pattern implementations,
and we want to compile our patterns as "gitignore" patterns. You could call it a
wrapper for a list of compiled patterns::

    >>> spec = PathSpec.from_lines('gitignore', spec_text.splitlines())
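
To make the selection rules concrete, here is a minimal sketch that compiles a reduced version of the pattern list above and checks a few paths one at a time. The candidate paths are hypothetical and exist only for illustration; the calls are the same ``PathSpec.from_lines()`` and ``match_file()`` used in this tutorial.

```python
from pathspec import PathSpec

# A reduced version of the tutorial's pattern list.
lines = [
    "/project-a/",  # select everything under the top-level project-a directory
    "!*.pyc",       # ...except compiled Python files at any depth
    "!/*/build/",   # ...and build directories directly under a project
]
spec = PathSpec.from_lines("gitignore", lines)

# Hypothetical paths, relative to the "Projects" directory.
candidates = [
    "project-a/src/main.py",
    "project-a/src/main.pyc",
    "project-a/build/out.bin",
    "project-a/src/build/out.bin",
    "other/file.py",
]

# The last pattern that matches a path decides whether it is selected.
results = {path: spec.match_file(path) for path in candidates}
for path, matched in results.items():
    print(f"{matched!s:5}  {path}")
```

Note that ``project-a/src/build/out.bin`` stays selected: the leading slash in ``!/*/build/`` anchors the pattern, so it only excludes ``build`` directories one level below the root.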

If we wanted to compile the patterns manually, we could use the
GitIgnoreBasicPattern class directly. It is used in the background for
"gitignore", which internally converts patterns to regular expressions::

    >>> from pathspec.patterns.gitignore.basic import GitIgnoreBasicPattern
    >>> patterns = map(GitIgnoreBasicPattern, spec_text.splitlines())
    >>> spec = PathSpec(patterns)

``PathSpec.from_lines()`` is a class method which simplifies that.

If you want to load the patterns from a file, you can pass the file object
directly as well::

    >>> with open('patterns.list', 'r') as fh:
    ...     spec = PathSpec.from_lines('gitignore', fh)
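
As a self-contained sketch of loading patterns from a file, the example below first writes a small pattern file into a temporary directory and then reads it back. The file contents and the paths being matched are assumptions made up for the example.

```python
import os
import tempfile

from pathspec import PathSpec

with tempfile.TemporaryDirectory() as tmp:
    pattern_file = os.path.join(tmp, "patterns.list")

    # Write a hypothetical pattern file: one pattern per line.
    with open(pattern_file, "w", encoding="utf-8") as fh:
        fh.write("# select Python sources, skip tests\n")
        fh.write("*.py\n")
        fh.write("!test_*.py\n")

    # Pass the open file object directly; each line becomes a pattern.
    with open(pattern_file, "r", encoding="utf-8") as fh:
        spec = PathSpec.from_lines("gitignore", fh)

selected = sorted(spec.match_files(["app.py", "test_app.py", "README.md"]))
print(selected)
```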

You can perform matching on a whole directory tree with::

    >>> matches = set(spec.match_tree_files('path/to/directory'))

Or you can perform matching on a specific set of file paths with::

    >>> matches = set(spec.match_files(file_paths))

Or check to see if an individual file matches::

    >>> is_matched = spec.match_file(file_path)

There are actually two implementations of "gitignore". The basic implementation
is used by PathSpec and follows patterns as documented by `gitignore`_.
However, Git's behavior differs from the documented patterns. There are some
edge cases, and in particular, Git allows including files from excluded
directories, which appears to contradict the documentation. GitIgnoreSpec
handles these cases to more closely replicate Git's behavior::

    >>> from pathspec import GitIgnoreSpec
    >>> spec = GitIgnoreSpec.from_lines(spec_text.splitlines())
You do not specify the style of pattern for GitIgnoreSpec because it should