python-vcsgraph-0.2.0/.coveragerc0000644000000000000000000000013415167007306013717 0ustar00[run] branch = True source = breezy [report] exclude_lines = raise NotImplementedError python-vcsgraph-0.2.0/.dockerignore0000644000000000000000000000001615167007306014251 0ustar00Dockerfile *~ python-vcsgraph-0.2.0/.mailmap0000644000000000000000000000045315167007306013223 0ustar00Jelmer Vernooij Jelmer Vernooij Jelmer Vernooij INADA Naoki Martin Packman python-vcsgraph-0.2.0/.rsyncexclude0000644000000000000000000000031315167007306014306 0ustar00*.pyc *.pyo *~ # arch can bite me {arch} .arch-ids ,,* ++* /doc/*.html *.tmp bzr-test.log [#]*# .#* testrev.* /tmp # do want this after all + CHANGELOG /build test*.tmp .*.swp *.orig .*.orig .bzr-shelf* python-vcsgraph-0.2.0/.testr.conf0000644000000000000000000000030415167007306013663 0ustar00[DEFAULT] test_command=PYTHONPATH=`pwd`:$PYTHONPATH BRZ_PLUGIN_PATH=-site:-user python3 -m breezy selftest --subunit2 $IDOPTION $LISTOPT test_id_option=--load-list $IDFILE test_list_option=--list python-vcsgraph-0.2.0/CODE_OF_CONDUCT.md0000644000000000000000000000642715167007306014410 0ustar00# Contributor Covenant Code of Conduct ## Our Pledge In the interest of fostering an open and welcoming environment, we as contributors and maintainers pledge to making participation in our project and our community a harassment-free experience for everyone, regardless of age, body size, disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, religion, or sexual identity and orientation. 
## Our Standards Examples of behavior that contributes to creating a positive environment include: * Using welcoming and inclusive language * Being respectful of differing viewpoints and experiences * Gracefully accepting constructive criticism * Focusing on what is best for the community * Showing empathy towards other community members Examples of unacceptable behavior by participants include: * The use of sexualized language or imagery and unwelcome sexual attention or advances * Trolling, insulting/derogatory comments, and personal or political attacks * Public or private harassment * Publishing others' private information, such as a physical or electronic address, without explicit permission * Other conduct which could reasonably be considered inappropriate in a professional setting ## Our Responsibilities Project maintainers are responsible for clarifying the standards of acceptable behavior and are expected to take appropriate and fair corrective action in response to any instances of unacceptable behavior. Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, or to ban temporarily or permanently any contributor for other behaviors that they deem inappropriate, threatening, offensive, or harmful. ## Scope This Code of Conduct applies both within project spaces and in public spaces when an individual is representing the project or its community. Examples of representing a project or community include using an official project e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event. Representation of a project may be further defined and clarified by project maintainers. ## Enforcement Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by contacting the project team at core@breezy-vcs.org. 
All complaints will be reviewed and investigated and will result in a response that is deemed necessary and appropriate to the circumstances. The project team is obligated to maintain confidentiality with regard to the reporter of an incident. Further details of specific enforcement policies may be posted separately. Project maintainers who do not follow or enforce the Code of Conduct in good faith may face temporary or permanent repercussions as determined by other members of the project's leadership. ## Attribution This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4, available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html [homepage]: https://www.contributor-covenant.org For answers to common questions about this code of conduct, see https://www.contributor-covenant.org/faq python-vcsgraph-0.2.0/COPYING.txt0000644000000000000000000004325415167007306013461 0ustar00 GNU GENERAL PUBLIC LICENSE Version 2, June 1991 Copyright (C) 1989, 1991 Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. Preamble The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change free software--to make sure the software is free for all its users. This General Public License applies to most of the Free Software Foundation's software and to any other program whose authors commit to using it. (Some other Free Software Foundation software is covered by the GNU Lesser General Public License instead.) You can apply it to your programs, too. When we speak of free software, we are referring to freedom, not price. 
Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for this service if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs; and that you know you can do these things. To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights. These restrictions translate to certain responsibilities for you if you distribute copies of the software, or if you modify it. For example, if you distribute copies of such a program, whether gratis or for a fee, you must give the recipients all the rights that you have. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights. We protect your rights with two steps: (1) copyright the software, and (2) offer you this license which gives you legal permission to copy, distribute and/or modify the software. Also, for each author's protection and ours, we want to make certain that everyone understands that there is no warranty for this free software. If the software is modified by someone else and passed on, we want its recipients to know that what they have is not the original, so that any problems introduced by others will not reflect on the original authors' reputations. Finally, any free program is threatened constantly by software patents. We wish to avoid the danger that redistributors of a free program will individually obtain patent licenses, in effect making the program proprietary. To prevent this, we have made it clear that any patent must be licensed for everyone's free use or not licensed at all. The precise terms and conditions for copying, distribution and modification follow. GNU GENERAL PUBLIC LICENSE TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION 0. 
This License applies to any program or other work which contains a notice placed by the copyright holder saying it may be distributed under the terms of this General Public License. The "Program", below, refers to any such program or work, and a "work based on the Program" means either the Program or any derivative work under copyright law: that is to say, a work containing the Program or a portion of it, either verbatim or with modifications and/or translated into another language. (Hereinafter, translation is included without limitation in the term "modification".) Each licensee is addressed as "you". Activities other than copying, distribution and modification are not covered by this License; they are outside its scope. The act of running the Program is not restricted, and the output from the Program is covered only if its contents constitute a work based on the Program (independent of having been made by running the Program). Whether that is true depends on what the Program does. 1. You may copy and distribute verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; keep intact all the notices that refer to this License and to the absence of any warranty; and give any other recipients of the Program a copy of this License along with the Program. You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee. 2. You may modify your copy or copies of the Program or any portion of it, thus forming a work based on the Program, and copy and distribute such modifications or work under the terms of Section 1 above, provided that you also meet all of these conditions: a) You must cause the modified files to carry prominent notices stating that you changed the files and the date of any change. 
b) You must cause any work that you distribute or publish, that in whole or in part contains or is derived from the Program or any part thereof, to be licensed as a whole at no charge to all third parties under the terms of this License. c) If the modified program normally reads commands interactively when run, you must cause it, when started running for such interactive use in the most ordinary way, to print or display an announcement including an appropriate copyright notice and a notice that there is no warranty (or else, saying that you provide a warranty) and that users may redistribute the program under these conditions, and telling the user how to view a copy of this License. (Exception: if the Program itself is interactive but does not normally print such an announcement, your work based on the Program is not required to print an announcement.) These requirements apply to the modified work as a whole. If identifiable sections of that work are not derived from the Program, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works. But when you distribute the same sections as part of a whole which is a work based on the Program, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it. Thus, it is not the intent of this section to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works based on the Program. In addition, mere aggregation of another work not based on the Program with the Program (or with a work based on the Program) on a volume of a storage or distribution medium does not bring the other work under the scope of this License. 3. 
You may copy and distribute the Program (or a work based on it, under Section 2) in object code or executable form under the terms of Sections 1 and 2 above provided that you also do one of the following: a) Accompany it with the complete corresponding machine-readable source code, which must be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or, b) Accompany it with a written offer, valid for at least three years, to give any third party, for a charge no more than your cost of physically performing source distribution, a complete machine-readable copy of the corresponding source code, to be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or, c) Accompany it with the information you received as to the offer to distribute corresponding source code. (This alternative is allowed only for noncommercial distribution and only if you received the program in object code or executable form with such an offer, in accord with Subsection b above.) The source code for a work means the preferred form of the work for making modifications to it. For an executable work, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable. However, as a special exception, the source code distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs, unless that component itself accompanies the executable. 
If distribution of executable or object code is made by offering access to copy from a designated place, then offering equivalent access to copy the source code from the same place counts as distribution of the source code, even though third parties are not compelled to copy the source along with the object code. 4. You may not copy, modify, sublicense, or distribute the Program except as expressly provided under this License. Any attempt otherwise to copy, modify, sublicense or distribute the Program is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance. 5. You are not required to accept this License, since you have not signed it. However, nothing else grants you permission to modify or distribute the Program or its derivative works. These actions are prohibited by law if you do not accept this License. Therefore, by modifying or distributing the Program (or any work based on the Program), you indicate your acceptance of this License to do so, and all its terms and conditions for copying, distributing or modifying the Program or works based on it. 6. Each time you redistribute the Program (or any work based on the Program), the recipient automatically receives a license from the original licensor to copy, distribute or modify the Program subject to these terms and conditions. You may not impose any further restrictions on the recipients' exercise of the rights granted herein. You are not responsible for enforcing compliance by third parties to this License. 7. If, as a consequence of a court judgment or allegation of patent infringement or for any other reason (not limited to patent issues), conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. 
If you cannot distribute so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not distribute the Program at all. For example, if a patent license would not permit royalty-free redistribution of the Program by all those who receive copies directly or indirectly through you, then the only way you could satisfy both it and this License would be to refrain entirely from distribution of the Program. If any portion of this section is held invalid or unenforceable under any particular circumstance, the balance of the section is intended to apply and the section as a whole is intended to apply in other circumstances. It is not the purpose of this section to induce you to infringe any patents or other property right claims or to contest validity of any such claims; this section has the sole purpose of protecting the integrity of the free software distribution system, which is implemented by public license practices. Many people have made generous contributions to the wide range of software distributed through that system in reliance on consistent application of that system; it is up to the author/donor to decide if he or she is willing to distribute software through any other system and a licensee cannot impose that choice. This section is intended to make thoroughly clear what is believed to be a consequence of the rest of this License. 8. If the distribution and/or use of the Program is restricted in certain countries either by patents or by copyrighted interfaces, the original copyright holder who places the Program under this License may add an explicit geographical distribution limitation excluding those countries, so that distribution is permitted only in or among countries not thus excluded. In such case, this License incorporates the limitation as if written in the body of this License. 9. 
The Free Software Foundation may publish revised and/or new versions of the General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. Each version is given a distinguishing version number. If the Program specifies a version number of this License which applies to it and "any later version", you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of this License, you may choose any version ever published by the Free Software Foundation. 10. If you wish to incorporate parts of the Program into other free programs whose distribution conditions are different, write to the author to ask for permission. For software which is copyrighted by the Free Software Foundation, write to the Free Software Foundation; we sometimes make exceptions for this. Our decision will be guided by the two goals of preserving the free status of all derivatives of our free software and of promoting the sharing and reuse of software generally. NO WARRANTY 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 12. 
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. END OF TERMS AND CONDITIONS How to Apply These Terms to Your New Programs If you develop a new program, and you want it to be of the greatest possible use to the public, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms. To do so, attach the following notices to the program. It is safest to attach them to the start of each source file to most effectively convey the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found. <one line to give the program's name and a brief idea of what it does.> Copyright (C) <year> <name of author> This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA. Also add information on how to contact you by electronic and paper mail.
If the program is interactive, make it output a short notice like this when it starts in an interactive mode: Gnomovision version 69, Copyright (C) year name of author Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. This is free software, and you are welcome to redistribute it under certain conditions; type `show c' for details. The hypothetical commands `show w' and `show c' should show the appropriate parts of the General Public License. Of course, the commands you use may be called something other than `show w' and `show c'; they could even be mouse-clicks or menu items--whatever suits your program. You should also get your employer (if you work as a programmer) or your school, if any, to sign a "copyright disclaimer" for the program, if necessary. Here is a sample; alter the names: Yoyodyne, Inc., hereby disclaims all copyright interest in the program `Gnomovision' (which makes passes at compilers) written by James Hacker. <signature of Ty Coon>, 1 April 1989 Ty Coon, President of Vice This General Public License does not permit incorporating your program into proprietary programs. If your program is a subroutine library, you may consider it more useful to permit linking proprietary applications with the library. If this is what you want to do, use the GNU Lesser General Public License instead of this License. python-vcsgraph-0.2.0/Cargo.lock0000644000000000000000000001004615167007306013506 0ustar00# This file is automatically @generated by Cargo. # It is not intended for manual editing.
version = 4 [[package]] name = "graph-py" version = "0.1.2" dependencies = [ "pyo3", "rustc-hash", "vcs-graph", ] [[package]] name = "heck" version = "0.5.0" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "2304e00983f87ffb38b55b444b5e3b60a884b5d30c0fca7d82fe33449bbe55ea" [[package]] name = "lazy_static" version = "1.5.0" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "bbd2bcb4c963f2ddae06a2efc7e9f3591312473c50c6685e1f298068316e66fe" [[package]] name = "libc" version = "0.2.183" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "b5b646652bf6661599e1da8901b3b9522896f01e736bad5f723fe7a3a27f899d" [[package]] name = "maplit" version = "1.0.2" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "3e2e65a1a2e43cfcb47a895c4c8b10d1f4a61097f9f254f183aee60cad9c651d" [[package]] name = "once_cell" version = "1.21.4" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "9f7c3e4beb33f85d45ae3e3a1792185706c8e16d043238c593331cc7cd313b50" [[package]] name = "portable-atomic" version = "1.13.1" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "c33a9471896f1c69cecef8d20cbe2f7accd12527ce60845ff44c153bb2a21b49" [[package]] name = "proc-macro2" version = "1.0.106" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "8fd00f0bb2e90d81d1044c2b32617f68fcb9fa3bb7640c23e9c748e53fb30934" dependencies = [ "unicode-ident", ] [[package]] name = "pyo3" version = "0.28.2" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "cf85e27e86080aafd5a22eae58a162e133a589551542b3e5cee4beb27e54f8e1" dependencies = [ "libc", "once_cell", "portable-atomic", "pyo3-build-config", "pyo3-ffi", "pyo3-macros", ] [[package]] name = "pyo3-build-config" version = "0.28.2" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "8bf94ee265674bf76c09fa430b0e99c26e319c945d96ca0d5a8215f31bf81cf7" 
dependencies = [ "target-lexicon", ] [[package]] name = "pyo3-ffi" version = "0.28.2" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "491aa5fc66d8059dd44a75f4580a2962c1862a1c2945359db36f6c2818b748dc" dependencies = [ "libc", "pyo3-build-config", ] [[package]] name = "pyo3-macros" version = "0.28.2" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "f5d671734e9d7a43449f8480f8b38115df67bef8d21f76837fa75ee7aaa5e52e" dependencies = [ "proc-macro2", "pyo3-macros-backend", "quote", "syn", ] [[package]] name = "pyo3-macros-backend" version = "0.28.2" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "22faaa1ce6c430a1f71658760497291065e6450d7b5dc2bcf254d49f66ee700a" dependencies = [ "heck", "proc-macro2", "pyo3-build-config", "quote", "syn", ] [[package]] name = "quote" version = "1.0.45" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "41f2619966050689382d2b44f664f4bc593e129785a36d6ee376ddf37259b924" dependencies = [ "proc-macro2", ] [[package]] name = "rustc-hash" version = "2.1.2" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "94300abf3f1ae2e2b8ffb7b58043de3d399c73fa6f4b73826402a5c457614dbe" [[package]] name = "syn" version = "2.0.117" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "e665b8803e7b1d2a727f4023456bbbbe74da67099c585258af0ad9c5013b9b99" dependencies = [ "proc-macro2", "quote", "unicode-ident", ] [[package]] name = "target-lexicon" version = "0.13.5" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "adb6935a6f5c20170eeceb1a3835a49e12e19d792f6dd344ccc76a985ca5a6ca" [[package]] name = "unicode-ident" version = "1.0.24" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "e6e4313cd5fcd3dad5cafa179702e2b244f760991f45397d14d4ebf38247da75" [[package]] name = "vcs-graph" version = "3.5.0" dependencies = [ "lazy_static", "maplit", "pyo3", "rustc-hash", 
] python-vcsgraph-0.2.0/Cargo.toml0000644000000000000000000000026715167007306013535 0ustar00[workspace] members = ["crates/*"] resolver = "2" [workspace.package] version = "0.1.2" edition = "2021" rust-version = "1.60" [workspace.dependencies] pyo3 = { version = "=0.28" } python-vcsgraph-0.2.0/MANIFEST.in0000644000000000000000000000026015167007306013334 0ustar00include README.md setup.py COPYING.txt recursive-include vcsgraph *.py recursive-include crates Cargo.toml *.rs include Cargo.lock include Cargo.toml include vcsgraph/py.typed python-vcsgraph-0.2.0/README.md0000644000000000000000000000707115167007306013064 0ustar00# vcsgraph A Python library providing graph algorithms optimized for version control systems. ## Overview `vcsgraph` is a high-performance graph algorithms library specifically designed for working with version control system (VCS) data structures. It provides efficient implementations of common graph operations needed by VCS tools, with both pure Python and Rust-accelerated implementations for performance-critical operations. 
## Features - **Topological Sorting**: Multiple algorithms for sorting commits/revisions in topological order - **Graph Traversal**: Efficient algorithms for traversing revision graphs and finding common ancestors - **Multi-parent Support**: Handle complex merge scenarios with multiple parent revisions - **Known Graph Operations**: Optimized operations on graphs where the full structure is known in advance - **Rust Acceleration**: Performance-critical algorithms implemented in Rust with Python bindings via PyO3 ## Installation ```bash pip install vcsgraph ``` ## Key Components ### Graph Operations The `Graph` class provides fundamental graph operations: - Finding least common ancestors (LCA) - Finding unique ancestors - Computing differences between revision sets - Finding merge bases between branches ### Topological Sorting Multiple sorting implementations optimized for different use cases: - `topo_sort()`: Fast sorting when the complete result is needed - `TopoSorter`: Iterator-based sorting for processing partial results - `MergeSorter`: Specialized sorting that preserves merge history ### Multi-parent Diffs The `MultiParent` class handles complex diff scenarios with multiple parent revisions, essential for three-way merges and conflict resolution. ### Known Graph The `KnownGraph` class provides optimized operations when the complete graph structure is known, enabling faster ancestor calculations and traversals. 
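
To make the sorting and `heads` ideas above concrete, here is a minimal pure-Python sketch. It is not vcsgraph's implementation: `sketch_topo_sort` and `sketch_heads` are hypothetical names used only for illustration, the sort uses Kahn's algorithm, and the heads computation is a simple ancestor walk chosen for clarity rather than speed.

```python
from collections import deque


def sketch_topo_sort(graph):
    """Return nodes ordered so that every parent precedes its children.

    `graph` is an iterable of (node, parents) pairs. Parents that are not
    listed as nodes themselves are ignored. Illustrative only.
    """
    parent_map = {node: list(parents) for node, parents in graph}
    children = {node: [] for node in parent_map}
    pending = {}  # node -> number of parents not yet emitted
    for node, parents in parent_map.items():
        known = [p for p in parents if p in parent_map]
        pending[node] = len(known)
        for parent in known:
            children[parent].append(node)

    # Start from nodes with no known parents and release children as
    # their parents are emitted.
    ready = deque(node for node, count in pending.items() if count == 0)
    ordered = []
    while ready:
        node = ready.popleft()
        ordered.append(node)
        for child in children[node]:
            pending[child] -= 1
            if pending[child] == 0:
                ready.append(child)

    if len(ordered) != len(parent_map):
        raise ValueError("graph contains a cycle")
    return ordered


def sketch_heads(parent_map, keys):
    """Return the subset of `keys` that are not ancestors of another key."""
    keys = set(keys)
    ancestors_of_keys = set()
    for key in keys:
        # Walk every ancestor of this key with an explicit stack.
        stack = list(parent_map.get(key, ()))
        seen = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(parent_map.get(node, ()))
        ancestors_of_keys |= seen
    return keys - ancestors_of_keys
```

The sketch recomputes ancestry on every call. A structure that knows the whole graph in advance can precompute and cache that information, which is the kind of saving the Known Graph section describes.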
## Usage Examples ### Basic Topological Sort ```python from vcsgraph import topo_sort # Define a graph as a list of (node, parents) tuples graph = [ (b'rev1', []), (b'rev2', [b'rev1']), (b'rev3', [b'rev1']), (b'rev4', [b'rev2', b'rev3']), ] # Sort nodes topologically (parents before children) sorted_nodes = topo_sort(graph) ``` ### Using Graph for Ancestry Operations ```python from vcsgraph import Graph, DictParentsProvider # Create a parents provider from a dictionary ancestry = { b'rev1': (b'null:',), b'rev2a': (b'rev1',), b'rev2b': (b'rev1',), b'rev3': (b'rev2a',), b'rev4': (b'rev3', b'rev2b'), } parents_provider = DictParentsProvider(ancestry) # Create a graph and find merge bases graph = Graph(parents_provider) merge_base = graph.find_merge_base(b'rev2a', b'rev2b') ``` ### Working with Known Graphs ```python from vcsgraph import KnownGraph # Create a known graph from parent relationships parent_map = { b'rev1': (b'null:',), b'rev2': (b'rev1',), b'rev3': (b'rev2',), } kg = KnownGraph(parent_map) # Get heads (revisions with no children) heads = kg.heads([b'rev1', b'rev2', b'rev3']) ``` ## Performance The library uses Rust for performance-critical operations while maintaining a Python interface. Key optimizations include: - Memory-efficient graph representations - Optimized ancestor searching algorithms - Lazy evaluation where possible - Caching of frequently accessed data ## License This project is licensed under the GNU General Public License v2 or later. See the COPYING.txt file for details. 
## Origins This library was originally part of the Breezy version control system and has been extracted as a standalone package for use in other VCS-related projects.python-vcsgraph-0.2.0/crates/0000755000000000000000000000000015167007306013061 5ustar00python-vcsgraph-0.2.0/pyproject.toml0000644000000000000000000000615615167007306014524 0ustar00[build-system] requires = ["setuptools", "wheel", "setuptools-rust"] build-backend = "setuptools.build_meta" [project] name = "vcsgraph" description = "Graph algorithms for version control systems" readme = "README.md" requires-python = ">=3.10" dynamic = ["version"] [project.optional-dependencies] dev = [ "mypy==1.19.1", "ruff==0.15.9" ] [tool.setuptools] packages = ["vcsgraph"] [[tool.setuptools-rust.ext-modules]] target = "vcsgraph._graph_rs" path = "crates/graph-py/Cargo.toml" binding = "PyO3" [tool.setuptools.dynamic] version = {attr = "vcsgraph.__version__"} [tool.mypy] ignore_missing_imports = true [tool.ruff] extend-exclude = ["lib", "bin"] [tool.ruff.lint] select = [ "ANN", # annotations "D", # pydocstyle "E", # pycodestyle "F", # pyflakes "N", # naming "B", # bugbear "I", # isort "S", # bandit "TCH", # typecheck "INT", # gettext "SIM", # simplify "C4", # comprehensions "UP", # pyupgrade "RUF", # ruf-specific ] ignore = [ "ANN001", "ANN002", "ANN003", # missing-type-arg "ANN201", "ANN202", "ANN204", "ANN205", "ANN206", "D205", # 1 blank line required between summary line and description "D417", # Missing argument descriptions in the docstring "F821", # undefined-name "E501", # line too long "D402", # Missing blank line after last section "E402", # module level import not at top of file "E741", # ambiguous variable name "F405", # name may be undefined, or defined from star imports "N801", # Naming convention violation: invalid constant name "N802", # Naming convention violation: invalid variable name "N804", # Naming convention violation: invalid lowercase variable name "N806", # Naming convention violation: invalid 
lowercase function name "N818", # Naming convention violation: exception name should have an Error suffix "N999", # Naming convention violation: invalid module name "S602", # subprocess with shell=True "S603", # check for execution of untrusted input "S105", # "hardcoded password"; false positives on "pwd" "S106", # "hardcoded password"; false positives on "pwd" "S110", # "consider logging exception" "S317", # use defusedxml # This triggers for docstrings that use __doc__ "D104", # Missing docstring in public package "RUF012", # Mutable class attributes should be annotated with `typing.ClassVar` "RUF005", # Consider iterable unpacking instead of concatenation "RUF015", # Prefer `next()` over single-element slice access "SIM102", # Use a single `if` statement instead of nested `if` statements "SIM105", # Use `contextlib.suppress` "SIM108", # Use ternary operator "SIM114", # Combine `if` branches using logical `or` operator "SIM115", # Use a context manager for opening files ] [tool.ruff.lint.extend-per-file-ignores] # Ignore docstring requirements for test files "vcsgraph/tests/*.py" = ["D100", "D101", "D102", "D103", "D104", "D105", "D106", "D107"] [tool.ruff.lint.pydocstyle] convention = "google" [tool.cibuildwheel.linux] skip = "*-musllinux_*" archs = ["auto", "aarch64"] python-vcsgraph-0.2.0/setup.py0000755000000000000000000000045715167007306013323 0ustar00#! 
/usr/bin/env python3 """Setup script for vcsgraph package.""" from setuptools import setup from setuptools_rust import Binding, RustExtension setup( rust_extensions=[ RustExtension( "vcsgraph._graph_rs", "crates/graph-py/Cargo.toml", binding=Binding.PyO3 ), ] ) python-vcsgraph-0.2.0/vcsgraph/0000755000000000000000000000000015167007306013415 5ustar00python-vcsgraph-0.2.0/crates/graph-py/0000755000000000000000000000000015167007306014610 5ustar00python-vcsgraph-0.2.0/crates/graph/0000755000000000000000000000000015167007306014162 5ustar00python-vcsgraph-0.2.0/crates/graph-py/Cargo.toml0000644000000000000000000000037715167007306016547 0ustar00[package] name = "graph-py" version = { workspace = true } edition = "2021" [lib] crate-type = ["cdylib"] [dependencies] pyo3 = { workspace = true, features = ["extension-module"]} rustc-hash = "2" vcs-graph = { path = "../graph", features = ["pyo3"] } python-vcsgraph-0.2.0/crates/graph-py/src/0000755000000000000000000000000015167007306015377 5ustar00python-vcsgraph-0.2.0/crates/graph-py/src/lib.rs0000644000000000000000000022112415167007306016515 0ustar00#![allow(non_snake_case)] use vcs_graph::bfs::BfsState; use vcs_graph::graph::{Graph as RsGraph, GraphError}; use vcs_graph::known_graph::KnownGraph as RsKnownGraph; use vcs_graph::{ CachingParentsProvider as RsCachingParentsProvider, ChildMap, ParentMap, Parents, ParentsProvider, RevnoVec, }; use pyo3::import_exception; use pyo3::prelude::*; use pyo3::types::{PyDict, PyList, PyTuple}; use pyo3::wrap_pyfunction; use std::collections::{HashMap, HashSet}; use std::hash::Hash; import_exception!(vcsgraph.errors, GraphCycleError); struct PyNode(Py); impl std::fmt::Debug for PyNode { fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result { Python::attach(|py| { let repr = self.0.bind(py).repr(); if PyErr::occurred(py) { return Err(std::fmt::Error); } if let Ok(repr) = repr { write!(f, "{}", repr) } else { write!(f, "???") } }) } } impl std::fmt::Display for PyNode { fn fmt(&self, 
f: &mut std::fmt::Formatter) -> std::fmt::Result { Python::attach(|py| { let repr = self.0.bind(py).repr(); if PyErr::occurred(py) { return Err(std::fmt::Error); } if let Ok(repr) = repr { write!(f, "{}", repr) } else { write!(f, "???") } }) } } impl Clone for PyNode { fn clone(&self) -> PyNode { Python::attach(|py| PyNode(self.0.clone_ref(py))) } } impl From> for PyNode { fn from(obj: Py) -> PyNode { PyNode(obj) } } impl<'a> From> for PyNode { fn from(obj: Bound<'a, PyAny>) -> PyNode { PyNode(obj.unbind()) } } impl<'py> FromPyObject<'_, 'py> for PyNode { type Error = PyErr; fn extract(obj: Borrowed<'_, 'py, PyAny>) -> Result { Ok(PyNode(obj.to_owned().unbind())) } } impl<'py> IntoPyObject<'py> for PyNode { type Target = PyAny; type Output = Bound<'py, Self::Target>; type Error = PyErr; fn into_pyobject(self, py: Python<'py>) -> Result { Ok(self.0.clone_ref(py).into_bound(py)) } } impl Hash for PyNode { fn hash(&self, state: &mut H) { Python::attach(|py| match self.0.bind(py).hash() { Err(err) => err.restore(py), Ok(hash) => state.write_isize(hash), }); } } impl PartialEq for PyNode { fn eq(&self, other: &PyNode) -> bool { Python::attach(|py| match self.0.bind(py).eq(other.0.bind(py)) { Err(err) => { err.restore(py); false } Ok(b) => b, }) } } impl std::cmp::Eq for PyNode {} impl PartialOrd for PyNode { fn partial_cmp(&self, other: &PyNode) -> Option { self.cmp(other).into() } } impl Ord for PyNode { fn cmp(&self, other: &PyNode) -> std::cmp::Ordering { Python::attach(|py| match self.0.bind(py).lt(other.0.bind(py)) { Err(err) => { err.restore(py); std::cmp::Ordering::Equal } Ok(b) => { if b { std::cmp::Ordering::Less } else { match self.0.bind(py).gt(other.0.bind(py)) { Err(err) => { err.restore(py); std::cmp::Ordering::Equal } Ok(b) => { if b { std::cmp::Ordering::Greater } else { std::cmp::Ordering::Equal } } } } } }) } } /// Given a map from child => parents, create a map of parent => children #[pyfunction] fn invert_parent_map(parent_map: ParentMap) -> ChildMap 
{ vcs_graph::invert_parent_map::(&parent_map) } /// Collapse regions of the graph that are 'linear'. /// /// For example:: /// /// A:[B], B:[C] /// /// can be collapsed by removing B and getting:: /// /// A:[C] /// /// :param parent_map: A dictionary mapping children to their parents /// :return: Another dictionary with 'linear' chains collapsed #[pyfunction] fn collapse_linear_regions(parent_map: ParentMap) -> PyResult> { Ok(vcs_graph::collapse_linear_regions::(&parent_map)) } /// A parents provider for Graph objects, backed by a dict. /// /// Mirrors `vcsgraph.graph.DictParentsProvider`: takes a mapping of /// `{key: parents_list}` and serves `get_parent_map` lookups from it. #[pyclass(name = "DictParentsProvider", dict)] struct PyDictParentsProvider { inner: vcs_graph::DictParentsProvider, ancestry: Py, } #[pymethods] impl PyDictParentsProvider { #[new] fn new(py: Python, ancestry: Py) -> PyResult { // Extract the mapping into a ParentMap. We accept any dict-like // object by calling `.items()`. let items = ancestry.bind(py).call_method0("items")?; let mut pm = ParentMap::new(); for item in items.try_iter()? { let item = item?; let k: Py = item.get_item(0)?.unbind(); let v_obj = item.get_item(1)?; let mut parents: Vec = Vec::new(); for p in v_obj.try_iter()? { parents.push(PyNode::from(p?.unbind())); } pm.insert(PyNode::from(k), Parents::Known(parents)); } Ok(PyDictParentsProvider { inner: vcs_graph::DictParentsProvider::::new(pm), ancestry, }) } /// The underlying mapping, preserved as the original Python object. 
#[getter] fn ancestry(&self, py: Python) -> Py { self.ancestry.clone_ref(py) } fn __repr__(&self, py: Python) -> PyResult { let r = self.ancestry.bind(py).repr()?; Ok(format!("DictParentsProvider({})", r)) } fn get_parent_map<'py>( &self, py: Python<'py>, keys: Py, ) -> PyResult> { let nodes = extract_iter_pynodes(py, &keys)?; let hs: HashSet = nodes.into_iter().collect(); let pm = self.inner.get_parent_map(&hs); parent_map_to_pydict(py, pm) } } impl ParentsProvider for PyDictParentsProvider { fn get_parent_map(&self, keys: &HashSet) -> ParentMap { self.inner.get_parent_map(keys) } } /// A parents provider which stacks multiple child providers. Each child is /// queried in order, matching `vcsgraph.graph.StackedParentsProvider`. /// /// If a child exposes `get_cached_parent_map`, that fast path is queried /// first for all children before any full `get_parent_map` call — just /// like the Python version. #[pyclass(name = "StackedParentsProvider")] struct PyStackedParentsProvider { parent_providers: Vec>, } #[pymethods] impl PyStackedParentsProvider { #[new] fn new(py: Python, parent_providers: Py) -> PyResult { let mut providers = Vec::new(); for p in parent_providers.bind(py).try_iter()? { providers.push(p?.unbind()); } Ok(PyStackedParentsProvider { parent_providers: providers, }) } fn __repr__(&self, py: Python) -> PyResult { let list = pyo3::types::PyList::new(py, self.parent_providers.iter().map(|p| p.clone_ref(py)))?; let r = list.repr()?; Ok(format!("StackedParentsProvider({})", r)) } fn get_parent_map<'py>( &self, py: Python<'py>, keys: Py, ) -> PyResult> { let found = PyDict::new(py); let remaining = pyo3::types::PySet::empty(py)?; for k in keys.bind(py).try_iter()? { remaining.add(k?.unbind())?; } // First pass: any provider that implements get_cached_parent_map // gets queried cheaply before we hit the slow path. 
for pp in &self.parent_providers { if remaining.is_empty() { break; } let bound = pp.bind(py); let get_cached = bound.getattr("get_cached_parent_map"); let Ok(get_cached) = get_cached else { continue; }; if get_cached.is_none() { continue; } let new_found = get_cached.call1((remaining.clone(),))?; let new_found_dict = new_found.cast::()?; for (k, v) in new_found_dict.iter() { found.set_item(&k, v)?; remaining.discard(&k)?; } } if remaining.is_empty() { return Ok(found); } // Second pass: full get_parent_map calls, in order. for pp in &self.parent_providers { if remaining.is_empty() { break; } let bound = pp.bind(py); let new_found = match bound.call_method1("get_parent_map", (remaining.clone(),)) { Ok(r) => r, Err(err) => { // Match Python's behaviour of catching UnsupportedOperation // and moving on. if err.get_type(py).is(&py .import("vcsgraph.errors")? .getattr("UnsupportedOperation")?) { continue; } return Err(err); } }; let new_found_dict = new_found.cast::()?; for (k, v) in new_found_dict.iter() { found.set_item(&k, v)?; remaining.discard(&k)?; } } Ok(found) } } /// A parents provider that wraps any `get_parent_map`-like callable. /// /// Mirrors `vcsgraph.graph.CallableToParentsProviderAdapter`. #[pyclass(name = "CallableToParentsProviderAdapter")] struct PyCallableToParentsProviderAdapter { callable: Py, } #[pymethods] impl PyCallableToParentsProviderAdapter { #[new] fn new(callable: Py) -> Self { PyCallableToParentsProviderAdapter { callable } } fn __repr__(&self, py: Python) -> PyResult { let r = self.callable.bind(py).repr()?; Ok(format!("CallableToParentsProviderAdapter({})", r)) } fn get_parent_map(&self, py: Python, keys: Py) -> PyResult> { Ok(self.callable.bind(py).call1((keys,))?.unbind()) } } #[pyclass] struct TopoSorter { sorter: vcs_graph::tsort::TopoSorter, } #[pymethods] impl TopoSorter { #[new] fn new(py: Python, graph: Py) -> PyResult { let iter = if graph.bind(py).is_instance_of::() { graph .cast_bound::(py)? .call_method0("items")? 
.try_iter()? } else { graph.bind(py).try_iter()? }; let graph = iter .map(|k| k?.extract::<(Py, Vec>)>()) .map(|k| k.map(|(k, vs)| (PyNode::from(k), vs.into_iter().map(PyNode::from).collect()))) .collect::)>>>()?; let sorter = vcs_graph::tsort::TopoSorter::::new(graph.into_iter()); Ok(TopoSorter { sorter }) } fn __next__(&mut self, py: Python) -> PyResult>> { match self.sorter.next() { None => Ok(None), Some(Ok(node)) => Ok(Some(node.into_pyobject(py)?.unbind())), Some(Err(vcs_graph::Error::Cycle(e))) => Err(GraphCycleError::new_err(e)), Some(Err(e)) => panic!("Unexpected error: {:?}", e), } } fn __iter__(slf: PyRefMut) -> PyRefMut { slf } fn iter_topo_order(slf: PyRefMut) -> PyRefMut { slf } fn sorted(&mut self, py: Python) -> PyResult>> { let mut ret = Vec::new(); while let Some(node) = self.__next__(py)? { ret.push(node); } Ok(ret) } } fn revno_vec_to_py(py: Python, revno: RevnoVec) -> Py { PyTuple::new(py, revno.into_iter().map(|v| v.into_pyobject(py).unwrap())) .unwrap() .into_any() .unbind() } #[pyclass] struct MergeSorter { sorter: vcs_graph::tsort::MergeSorter, } fn branch_tip_is_null(py: Python, branch_tip: Py) -> bool { if let Ok(branch_tip) = branch_tip.extract::<&[u8]>(py) { branch_tip == b"null:" } else if let Ok((branch_tip,)) = branch_tip.extract::<(Vec,)>(py) { branch_tip.as_slice() == b"null:" } else { false } } #[pymethods] impl MergeSorter { #[new] #[pyo3(signature = (graph, branch_tip=None, mainline_revisions=None, generate_revno=false))] fn new( py: Python, graph: Py, mut branch_tip: Option>, mainline_revisions: Option>, generate_revno: Option, ) -> PyResult { let iter = if graph.bind(py).is_instance_of::() { graph .cast_bound::(py)? .call_method0("items")? .try_iter()? } else { graph.bind(py).try_iter()? 
}; let graph = iter .map(|k| k?.extract::<(Py, Vec>)>()) .map(|k| k.map(|(k, vs)| (PyNode::from(k), vs.into_iter().map(PyNode::from).collect()))) .collect::>>>()?; let mainline_revisions = if let Some(mainline_revisions) = mainline_revisions { let mainline_revisions = mainline_revisions .bind(py) .try_iter()? .map(|k| { let item = k?; Ok(item.extract::>()?) }) .collect::>>>()?; Some(mainline_revisions.into_iter().map(PyNode::from).collect()) } else { None }; // The null: revision doesn't exist in the graph, so don't attempt to remove it if let Some(ref mut tip_obj) = branch_tip { if branch_tip_is_null(py, tip_obj.clone_ref(py)) { branch_tip = None; } } let sorter = vcs_graph::tsort::MergeSorter::::new( graph, branch_tip.map(PyNode::from), mainline_revisions, generate_revno.unwrap_or(false), ); Ok(MergeSorter { sorter }) } fn __next__(&mut self, py: Python) -> PyResult>> { match self.sorter.next() { None => Ok(None), Some(Ok((sequence_number, node, merge_depth, None, end_of_merge))) => Ok(Some( ( sequence_number, node.into_pyobject(py)?.unbind(), merge_depth, end_of_merge, ) .into_pyobject(py)? .unbind() .into(), )), Some(Ok((sequence_number, node, merge_depth, Some(revno), end_of_merge))) => Ok(Some( ( sequence_number, node.into_pyobject(py)?.unbind(), merge_depth, revno_vec_to_py(py, revno), end_of_merge, ) .into_pyobject(py)? .unbind() .into(), )), Some(Err(vcs_graph::Error::Cycle(e))) => Err(GraphCycleError::new_err(e)), Some(Err(e)) => panic!("Unexpected error: {:?}", e), } } fn __iter__(slf: PyRefMut) -> PyRefMut { slf } fn iter_topo_order(slf: PyRefMut) -> PyRefMut { slf } fn sorted<'a>(&mut self, py: Python<'a>) -> PyResult> { let ret = PyList::empty(py); loop { let item = self.__next__(py)?; if let Some(item) = item { ret.append(item)?; } else { break; } } Ok(ret) } } /// Topological sort a graph which groups merges. /// /// :param graph: sequence of pairs of node->parents_list. /// :param branch_tip: the tip of the branch to graph. 
Revisions not /// reachable from branch_tip are not included in the /// output. /// :param mainline_revisions: If not None this forces a mainline to be /// used rather than synthesised from the graph. /// This must be a valid path through some part /// of the graph. If the mainline does not cover all /// the revisions, output stops at the start of the /// old revision listed in the mainline revisions /// list. /// The order for this parameter is oldest-first. /// :param generate_revno: Optional parameter controlling the generation of /// revision number sequences in the output. See the output description of /// the MergeSorter docstring for details. /// :result: See the MergeSorter docstring for details. /// /// Node identifiers can be any hashable object, and are typically strings. #[pyfunction] #[pyo3(signature = (graph, branch_tip=None, mainline_revisions=None, generate_revno=false))] fn merge_sort( py: Python, graph: Py, branch_tip: Option>, mainline_revisions: Option>, generate_revno: Option, ) -> PyResult> { let mut sorter = MergeSorter::new(py, graph, branch_tip, mainline_revisions, generate_revno)?; sorter.sorted(py) } const NULL_REVISION: &[u8] = b"null:"; fn is_null_revision(py: Python, obj: &Py) -> bool { if let Ok(b) = obj.extract::<&[u8]>(py) { b == NULL_REVISION } else { false } } #[pyclass(name = "KnownGraph")] struct PyKnownGraph { inner: RsKnownGraph, } fn extract_parent_map(py: Python, parent_map: Py) -> PyResult)>> { let iter = if parent_map.bind(py).is_instance_of::() { parent_map .cast_bound::(py)? .call_method0("items")? .try_iter()? } else { parent_map.bind(py).try_iter()? 
}; iter.map(|k| k?.extract::<(Py, Vec>)>()) .map(|k| k.map(|(k, vs)| (PyNode::from(k), vs.into_iter().map(PyNode::from).collect()))) .collect() } #[pymethods] impl PyKnownGraph { #[new] #[pyo3(signature = (parent_map, do_cache=true))] fn new(py: Python, parent_map: Py, do_cache: Option) -> PyResult { let pm = extract_parent_map(py, parent_map)?; Ok(PyKnownGraph { inner: RsKnownGraph::new(pm, do_cache.unwrap_or(true)), }) } /// Add a new node to the graph. If `key` was a ghost, it is filled in. fn add_node( &mut self, py: Python, key: Py, parent_keys: Vec>, ) -> PyResult<()> { let key = PyNode::from(key); let parents: Vec = parent_keys.into_iter().map(PyNode::from).collect(); self.inner .add_node(key.clone(), parents) .map_err(|e| match e { vcs_graph::Error::ParentMismatch { key, expected, actual, } => { let key_repr = format!("{:?}", key); let expected_py: Vec> = expected.into_iter().map(|n| n.0.clone_ref(py)).collect(); let actual_py: Vec> = actual.into_iter().map(|n| n.0.clone_ref(py)).collect(); pyo3::exceptions::PyValueError::new_err(format!( "Parent key mismatch, existing node {} has parents of {:?} not {:?}", key_repr, expected_py, actual_py )) } other => pyo3::exceptions::PyValueError::new_err(format!("{:?}", other)), }) } /// Return the parent keys for `key`. Returns `None` for ghosts; raises /// `KeyError` if `key` is not in the graph. fn get_parent_keys(&self, py: Python, key: Py) -> PyResult>>> { let node = PyNode::from(key); if !self.inner.contains(&node) { return Err(pyo3::exceptions::PyKeyError::new_err(format!("{:?}", node))); } Ok(self .inner .get_parent_keys(&node) .map(|ps| ps.iter().map(|n| n.0.clone_ref(py)).collect())) } /// Return the child keys for `key`. Raises `KeyError` if `key` is not in /// the graph. 
fn get_child_keys(&self, py: Python, key: Py) -> PyResult>> { let node = PyNode::from(key); match self.inner.get_child_keys(&node) { Some(cs) => Ok(cs.iter().map(|n| n.0.clone_ref(py)).collect()), None => Err(pyo3::exceptions::PyKeyError::new_err(format!("{:?}", node))), } } /// Return the heads from amongst `keys`. fn heads<'py>(&mut self, py: Python<'py>, keys: Py) -> PyResult> { let mut candidates: Vec> = Vec::new(); let mut had_null = false; for k in keys.bind(py).try_iter()? { let item: Py = k?.extract()?; if is_null_revision(py, &item) { had_null = true; } else { candidates.push(item); } } if candidates.is_empty() && had_null { // NULL_REVISION is only a head if it's the only entry. let fs = py.import("builtins")?.getattr("frozenset")?; let null = pyo3::types::PyBytes::new(py, NULL_REVISION); return fs.call1((vec![null.into_any()],)); } let nodes: Vec = candidates.into_iter().map(PyNode::from).collect(); let heads = self.inner.heads(nodes); let py_set: Vec> = heads.into_iter().map(|n| n.0).collect(); let fs = py.import("builtins")?.getattr("frozenset")?; fs.call1((py_set,)) } /// Return the nodes in topological order (parents first). fn topo_sort(&self, py: Python) -> PyResult>> { match self.inner.topo_sort() { Ok(v) => Ok(v.into_iter().map(|n| n.0.clone_ref(py)).collect()), Err(vcs_graph::Error::Cycle(_)) => { Err(GraphCycleError::new_err(("cycle in known graph",))) } Err(e) => Err(pyo3::exceptions::PyValueError::new_err(format!("{:?}", e))), } } /// Return a reverse topological ordering grouped by prefix. /// /// Mirrors the Python implementation: keys that are bytes use their first /// byte as the prefix bucket; single-element keys use an empty prefix. 
fn gc_sort(&self, py: Python) -> PyResult>> { let prefix_of = |k: &PyNode| -> Vec { Python::attach(|py| { let bound = k.0.bind(py); if let Ok(b) = bound.extract::<&[u8]>() { if b.len() == 1 { Vec::new() } else { vec![b[0]] } } else if let Ok(s) = bound.extract::<&str>() { if s.len() == 1 { Vec::new() } else { vec![s.as_bytes()[0]] } } else if let Ok(first) = bound.get_item(0) { if let Ok(b) = first.extract::<&[u8]>() { b.to_vec() } else if let Ok(s) = first.extract::<&str>() { s.as_bytes().to_vec() } else { Vec::new() } } else { Vec::new() } }) }; let v = self.inner.gc_sort(prefix_of); Ok(v.into_iter().map(|n| n.0.clone_ref(py)).collect()) } /// Compute the merge-sorted graph output starting at `tip_key`. /// /// If `tip_key` is `None`, `b"null:"`, or `(b"null:",)`, returns an empty /// list (matches the Python null-tip semantics). fn merge_sort( &self, py: Python, tip_key: Py, ) -> PyResult>> { if tip_key.is_none(py) || branch_tip_is_null(py, tip_key.clone_ref(py)) { return Ok(Vec::new()); } let tip = PyNode::from(tip_key); if !self.inner.contains(&tip) { return Ok(Vec::new()); } let result = self.inner.merge_sort(tip).map_err(|e| match e { vcs_graph::Error::Cycle(_) => GraphCycleError::new_err(("cycle in known graph",)), other => pyo3::exceptions::PyValueError::new_err(format!("{:?}", other)), })?; result .into_iter() .map(|n| { let revno = revno_vec_to_py(py, n.revno); Py::new( py, PyKnownGraphMergeSortNode { key: n.key.0, merge_depth: n.merge_depth, revno, end_of_merge: n.end_of_merge, }, ) }) .collect() } /// Return a mapping-like view of all nodes in the graph, keyed by node /// key. Each value is a `_KnownGraphNode` exposing live `key`, /// `parent_keys`, `child_keys`, and `gdfo` attributes. 
#[getter] fn _nodes(slf: Py) -> PyKnownGraphNodesView { PyKnownGraphNodesView { graph: slf } } } #[pyclass(name = "_KnownGraphNodesView")] struct PyKnownGraphNodesView { graph: Py, } #[pymethods] impl PyKnownGraphNodesView { fn __getitem__(&self, py: Python, key: Py) -> PyResult { let g = self.graph.borrow(py); let node = PyNode::from(key.clone_ref(py)); if !g.inner.contains(&node) { return Err(pyo3::exceptions::PyKeyError::new_err(format!("{:?}", node))); } Ok(PyKnownGraphNode { graph: self.graph.clone_ref(py), key, }) } fn __contains__(&self, py: Python, key: Py) -> bool { self.graph.borrow(py).inner.contains(&PyNode::from(key)) } fn __len__(&self, py: Python) -> usize { self.graph.borrow(py).inner.len() } fn __iter__(&self, py: Python) -> PyResult> { let keys: Vec> = self .graph .borrow(py) .inner .keys() .map(|n| n.0.clone_ref(py)) .collect(); Ok(pyo3::types::PyList::new(py, keys)? .call_method0("__iter__")? .unbind()) } fn keys(&self, py: Python) -> Vec> { self.graph .borrow(py) .inner .keys() .map(|n| n.0.clone_ref(py)) .collect() } fn values(&self, py: Python) -> Vec { self.graph .borrow(py) .inner .keys() .map(|n| PyKnownGraphNode { graph: self.graph.clone_ref(py), key: n.0.clone_ref(py), }) .collect() } } #[pyclass(name = "_KnownGraphNode")] struct PyKnownGraphNode { graph: Py, key: Py, } #[pymethods] impl PyKnownGraphNode { #[getter] fn key(&self, py: Python) -> Py { self.key.clone_ref(py) } #[getter] fn gdfo(&self, py: Python) -> Option { self.graph .borrow(py) .inner .gdfo(&PyNode::from(self.key.clone_ref(py))) } #[getter] fn parent_keys(&self, py: Python) -> Option>> { self.graph .borrow(py) .inner .get_parent_keys(&PyNode::from(self.key.clone_ref(py))) .map(|ps| ps.iter().map(|n| n.0.clone_ref(py)).collect()) } #[getter] fn child_keys(&self, py: Python) -> Vec> { self.graph .borrow(py) .inner .get_child_keys(&PyNode::from(self.key.clone_ref(py))) .map(|cs| cs.iter().map(|n| n.0.clone_ref(py)).collect()) .unwrap_or_default() } fn __repr__(&self, py: 
Python) -> String { format!( "_KnownGraphNode({:?} gdfo:{:?} par:{:?} child:{:?})", self.key .bind(py) .repr() .map(|r| r.to_string()) .unwrap_or_default(), self.gdfo(py), self.parent_keys(py), self.child_keys(py), ) } } #[pyclass(name = "_MergeSortNode")] struct PyKnownGraphMergeSortNode { #[pyo3(get, set)] key: Py, #[pyo3(get, set)] merge_depth: usize, #[pyo3(get, set)] revno: Py, #[pyo3(get, set)] end_of_merge: bool, } /// Sort a collection of PyNodes by hash for use as a deterministic cache /// key. Collisions only hurt cache hit rate, not correctness, because two /// different sets can't produce identical sorted Vecs within a process. fn sort_pynodes_by_hash(items: impl IntoIterator) -> Vec { use std::collections::hash_map::DefaultHasher; use std::hash::{Hash, Hasher}; let mut v: Vec = items.into_iter().collect(); v.sort_by_key(|n| { let mut h = DefaultHasher::new(); n.hash(&mut h); h.finish() }); v } /// Lazy iterator yielding the lefthand ancestry of a starting key. /// /// Mirrors the Python `Graph.iter_lefthand_ancestry` generator semantics: /// each `__next__` call walks one step down the left-most parent chain. /// Callers can break out of the iteration before the walk reaches a ghost /// sentinel, matching Python's early-exit behaviour. #[pyclass(name = "_LefthandAncestryIterator")] struct PyLefthandAncestryIterator { provider: Py, next_key: Option>, stop_keys: Py, graph_py: Py, } #[pymethods] impl PyLefthandAncestryIterator { fn __iter__(slf: PyRef<'_, Self>) -> PyRef<'_, Self> { slf } fn __next__(&mut self, py: Python) -> PyResult> { let key = match self.next_key.take() { Some(k) => k, None => return Err(pyo3::exceptions::PyStopIteration::new_err(())), }; // Honour stop_keys (a Python container supporting `in`). if self.stop_keys.bind(py).contains(&key)? { return Err(pyo3::exceptions::PyStopIteration::new_err(())); } // Fetch the left-most parent for the next iteration. 
Python's // generator raises RevisionNotPresent if the key is missing from // the provider; mirror that exactly. let key_list = pyo3::types::PyList::new(py, [key.clone_ref(py)])?; let parent_map = self .provider .bind(py) .call_method1("get_parent_map", (key_list,))?; let parents = parent_map.get_item(key.clone_ref(py)); match parents { Ok(ps) => { let ps_iter: Vec> = { let mut out = Vec::new(); for item in ps.try_iter()? { out.push(item?.unbind()); } out }; if ps_iter.is_empty() { self.next_key = None; } else { self.next_key = Some(ps_iter.into_iter().next().unwrap()); } Ok(key) } Err(_) => Err(RevisionNotPresent::new_err(( key, self.graph_py.clone_ref(py), ))), } } } /// Adapter that wraps a graph whose keys are 1-tuples of ids. Each method /// takes ids, wraps them in tuples, delegates to the underlying graph, /// then unwraps the tuples in the response. Mirrors /// `vcsgraph.graph.GraphThunkIdsToKeys`. #[pyclass(name = "GraphThunkIdsToKeys")] struct PyGraphThunkIdsToKeys { graph: Py, } #[pymethods] impl PyGraphThunkIdsToKeys { #[new] fn new(graph: Py) -> Self { PyGraphThunkIdsToKeys { graph } } fn topo_sort(&self, py: Python) -> PyResult>> { let result = self.graph.bind(py).call_method0("topo_sort")?; let mut out = Vec::new(); for item in result.try_iter()? { let tup = item?; out.push(tup.get_item(0)?.unbind()); } Ok(out) } fn heads<'py>( &self, py: Python<'py>, ids: Py, ) -> PyResult> { let mut as_keys: Vec> = Vec::new(); for item in ids.bind(py).try_iter()? { let item = item?; let tup = pyo3::types::PyTuple::new(py, [item.unbind()])?; as_keys.push(tup.into_any().unbind()); } let head_keys = self .graph .bind(py) .call_method1("heads", (pyo3::types::PyList::new(py, as_keys)?,))?; let mut out_items: Vec> = Vec::new(); for h in head_keys.try_iter()? 
{ let h = h?; out_items.push(h.get_item(0)?.unbind()); } pyo3::types::PySet::new(py, out_items) } fn merge_sort(&self, py: Python, tip_revision: Py) -> PyResult> { let tip_tuple = pyo3::types::PyTuple::new(py, [tip_revision])?; let nodes = self .graph .bind(py) .call_method1("merge_sort", (tip_tuple,))?; for item in nodes.try_iter()? { let item = item?; let key = item.getattr("key")?; let unwrapped = key.get_item(0)?; item.setattr("key", unwrapped)?; } Ok(nodes.unbind()) } fn add_node(&self, py: Python, revision: Py, parents: Py) -> PyResult<()> { let rev_tuple = pyo3::types::PyTuple::new(py, [revision])?; let mut parent_tuples: Vec> = Vec::new(); for p in parents.bind(py).try_iter()? { let p = p?; let tup = pyo3::types::PyTuple::new(py, [p.unbind()])?; parent_tuples.push(tup.into_any().unbind()); } self.graph.bind(py).call_method1( "add_node", (rev_tuple, pyo3::types::PyList::new(py, parent_tuples)?), )?; Ok(()) } } /// A cache of results for graph heads() calls. /// /// The cache key is the unordered set of input keys; the value is the /// `.heads()` result for that set. Every call returns a fresh mutable set /// so callers can modify the result without affecting later lookups — /// matches `vcsgraph.graph.HeadsCache`. #[pyclass(name = "HeadsCache")] struct PyHeadsCache { graph: Py, cache: std::sync::Mutex, Vec>>, } #[pymethods] impl PyHeadsCache { #[new] fn new(graph: Py) -> Self { PyHeadsCache { graph, cache: std::sync::Mutex::new(std::collections::HashMap::new()), } } /// The underlying graph. Exposed so Python callers can reach /// `cache.graph` just like the pure-Python version allowed. #[getter] fn graph(&self, py: Python) -> Py { self.graph.clone_ref(py) } /// Return the heads of `keys`. The result is a mutable Python set; /// callers may modify it without affecting future lookups. 
fn heads<'py>( &self, py: Python<'py>, keys: Py, ) -> PyResult> { let nodes = extract_iter_pynodes(py, &keys)?; let key = sort_pynodes_by_hash(nodes); { let cache = self.cache.lock().unwrap(); if let Some(cached) = cache.get(&key) { return pyo3::types::PySet::new(py, cached.iter().map(|n| n.0.clone_ref(py))); } } // Cache miss: delegate to the wrapped graph. let result: Vec = { let heads = self.graph.bind(py).call_method1( "heads", (pyo3::types::PyList::new( py, key.iter().map(|n| n.0.clone_ref(py)), )?,), )?; let mut out = Vec::new(); for item in heads.try_iter()? { out.push(PyNode::from(item?)); } out }; let mut cache = self.cache.lock().unwrap(); cache.insert(key, result.clone()); pyo3::types::PySet::new(py, result.into_iter().map(|n| n.0)) } } /// A cache of `heads()` results that returns frozen sets. /// /// Same as [`PyHeadsCache`] but the results are immutable. Matches /// `vcsgraph.graph.FrozenHeadsCache`. #[pyclass(name = "FrozenHeadsCache")] struct PyFrozenHeadsCache { graph: Py, cache: std::sync::Mutex, Vec>>, } #[pymethods] impl PyFrozenHeadsCache { #[new] fn new(graph: Py) -> Self { PyFrozenHeadsCache { graph, cache: std::sync::Mutex::new(std::collections::HashMap::new()), } } #[getter] fn graph(&self, py: Python) -> Py { self.graph.clone_ref(py) } /// Return the heads of `keys` as a frozenset. fn heads(&self, py: Python, keys: Py) -> PyResult> { let nodes = extract_iter_pynodes(py, &keys)?; let key = sort_pynodes_by_hash(nodes); { let cache = self.cache.lock().unwrap(); if let Some(cached) = cache.get(&key) { let fs = py.import("builtins")?.getattr("frozenset")?; let items: Vec> = cached.iter().map(|n| n.0.clone_ref(py)).collect(); return Ok(fs.call1((items,))?.into_any().unbind()); } } let result: Vec = { let heads = self.graph.bind(py).call_method1( "heads", (pyo3::types::PyList::new( py, key.iter().map(|n| n.0.clone_ref(py)), )?,), )?; let mut out = Vec::new(); for item in heads.try_iter()? 
{ out.push(PyNode::from(item?)); } out }; let mut cache = self.cache.lock().unwrap(); cache.insert(key, result.clone()); let fs = py.import("builtins")?.getattr("frozenset")?; let items: Vec> = result.into_iter().map(|n| n.0).collect(); Ok(fs.call1((items,))?.into_any().unbind()) } /// Store a precomputed `(keys, heads)` pair directly in the cache. fn cache(&self, py: Python, keys: Py, heads: Py) -> PyResult<()> { let key_nodes = extract_iter_pynodes(py, &keys)?; let head_nodes = extract_iter_pynodes(py, &heads)?; let key = sort_pynodes_by_hash(key_nodes); self.cache.lock().unwrap().insert(key, head_nodes); Ok(()) } } /// Python binding for [`CachingParentsProvider`]. Wraps an inner Python /// parents provider (any object with a `get_parent_map(keys)` method) and /// caches every lookup so repeated queries don't re-hit the inner provider. /// /// Mirrors `vcsgraph.graph.CachingParentsProvider`. #[pyclass(name = "CachingParentsProvider")] struct PyCachingParentsProvider { inner: RsCachingParentsProvider, real_provider_py: Py, } #[pymethods] impl PyCachingParentsProvider { /// Construct a caching wrapper around `parent_provider`, which must be /// either a Python object with a `get_parent_map` method or `None` when /// `get_parent_map` is supplied as a callable. #[new] #[pyo3(signature = (parent_provider=None, get_parent_map=None))] fn new( py: Python, parent_provider: Option>, get_parent_map: Option>, ) -> PyResult { let provider_obj = match (parent_provider, get_parent_map) { (Some(p), None) => p, (None, Some(cb)) => { // Wrap the callable into a tiny Python shim that exposes // a `get_parent_map` attribute, so the adapter can call it // uniformly. 
let builtins = py.import("builtins")?; let type_ = builtins.getattr("type")?; let ns = pyo3::types::PyDict::new(py); ns.set_item("get_parent_map", cb)?; let shim = type_.call1(("_CPPShim", (builtins.getattr("object")?,), ns))?; shim.call0()?.into_any().unbind() } (Some(_), Some(_)) => { return Err(pyo3::exceptions::PyValueError::new_err( "Pass parent_provider OR get_parent_map, not both", )) } (None, None) => { return Err(pyo3::exceptions::PyValueError::new_err( "Either parent_provider or get_parent_map must be supplied", )) } }; let adapter = PyParentsProviderAdapter { provider: provider_obj.clone_ref(py), }; Ok(PyCachingParentsProvider { inner: RsCachingParentsProvider::new(adapter), real_provider_py: provider_obj, }) } fn __repr__(&self, py: Python) -> PyResult { let r = self.real_provider_py.bind(py).repr()?; Ok(format!("CachingParentsProvider({})", r)) } #[pyo3(signature = (cache_misses=true))] fn enable_cache(&self, cache_misses: bool) -> PyResult<()> { self.inner .enable_cache(cache_misses) .map_err(pyo3::exceptions::PyAssertionError::new_err) } fn disable_cache(&self) { self.inner.disable_cache(); } /// Return a snapshot of the cache as a Python dict, or `None` if /// disabled. fn get_cached_map<'py>(&self, py: Python<'py>) -> PyResult>> { match self.inner.get_cached_map() { None => Ok(None), Some(map) => Ok(Some(cache_map_to_pydict(py, map)?)), } } /// Backward-compatible access to the raw `_cache` attribute. Returns a /// dict snapshot when the cache is enabled, `None` otherwise. The /// Python tests read this directly, so mirror Python's behaviour of /// exposing an empty dict rather than `None` for a freshly-enabled /// cache. 
#[getter] fn _cache<'py>(&self, py: Python<'py>) -> PyResult>> { self.get_cached_map(py) } fn get_cached_parent_map<'py>( &self, py: Python<'py>, keys: Py, ) -> PyResult> { let nodes = extract_iter_pynodes(py, &keys)?; let hs: HashSet = nodes.into_iter().collect(); let pm = self.inner.get_cached_parent_map(&hs); parent_map_to_pydict(py, pm) } fn get_parent_map<'py>( &self, py: Python<'py>, keys: Py, ) -> PyResult> { let nodes = extract_iter_pynodes(py, &keys)?; let hs: HashSet = nodes.into_iter().collect(); let pm = self.inner.get_parent_map(&hs); parent_map_to_pydict(py, pm) } fn note_missing_key(&self, key: Py) { self.inner.note_missing_key(PyNode::from(key)); } #[getter] fn missing_keys<'py>(&self, py: Python<'py>) -> PyResult> { pynodes_to_pyset(py, self.inner.missing_keys()) } } /// Adapter letting a Python parents provider satisfy Rust's `ParentsProvider` /// trait. Holds a raw `Py` and dispatches `get_parent_map(keys)` via /// the GIL. /// /// The Python provider's `get_parent_map` must accept an iterable of keys /// and return a dict-like `{key: parents_list}` (missing keys are treated /// as ghosts). Any Python exception during the call is caught and converted /// to an empty response — matching the Python Graph's behavior of treating /// the provider as best-effort. 
struct PyParentsProviderAdapter { provider: Py, } impl ParentsProvider for PyParentsProviderAdapter { fn get_parent_map(&self, keys: &HashSet) -> ParentMap { Python::attach(|py| { let key_list = pyo3::types::PyList::empty(py); for k in keys { if key_list.append(k.0.bind(py)).is_err() { return ParentMap::new(); } } let result = match self .provider .bind(py) .call_method1("get_parent_map", (key_list,)) { Ok(r) => r, Err(err) => { err.restore(py); return ParentMap::new(); } }; result.extract::>().unwrap_or_default() }) } } #[pyclass(name = "_RustGraph")] struct PyGraph { inner: RsGraph, provider_py: Py, } fn extract_iter_pynodes(py: Python, obj: &Py) -> PyResult> { let mut out = Vec::new(); for item in obj.bind(py).try_iter()? { out.push(PyNode::from(item?)); } Ok(out) } fn parent_map_to_pydict<'py>( py: Python<'py>, pm: ParentMap, ) -> PyResult> { let d = PyDict::new(py); for (k, v) in pm { match v { Parents::Known(ps) => { let list: Vec> = ps.into_iter().map(|n| n.0).collect(); d.set_item(k.0, list)?; } Parents::Ghost => { d.set_item(k.0, py.None())?; } } } Ok(d) } fn cache_map_to_pydict<'py>( py: Python<'py>, map: rustc_hash::FxHashMap>, ) -> PyResult> { let d = PyDict::new(py); for (k, v) in map { match v { Parents::Known(ps) => { let list: Vec> = ps.into_iter().map(|n| n.0).collect(); d.set_item(k.0, list)?; } Parents::Ghost => { d.set_item(k.0, py.None())?; } } } Ok(d) } #[pymethods] impl PyGraph { #[new] fn new(py: Python, parents_provider: Py) -> PyResult { let adapter = PyParentsProviderAdapter { provider: parents_provider.clone_ref(py), }; Ok(PyGraph { inner: RsGraph::new(adapter), provider_py: parents_provider, }) } /// Return the wrapped parents provider as given at construction time. 
#[getter] fn parents_provider(&self, py: Python) -> Py { self.provider_py.clone_ref(py) } fn get_parent_map<'py>( &self, py: Python<'py>, keys: Py, ) -> PyResult> { let nodes = extract_iter_pynodes(py, &keys)?; let pm = self.inner.get_parent_map(nodes); parent_map_to_pydict(py, pm) } fn get_child_map<'py>(&self, py: Python<'py>, keys: Py) -> PyResult> { let nodes = extract_iter_pynodes(py, &keys)?; let cm = self.inner.get_child_map(nodes); let d = PyDict::new(py); for (parent, children) in cm { let list: Vec> = children.into_iter().map(|n| n.0).collect(); d.set_item(parent.0, list)?; } Ok(d) } fn iter_topo_order(&self, py: Python, revisions: Py) -> PyResult>> { let nodes = extract_iter_pynodes(py, &revisions)?; self.inner .iter_topo_order(nodes) .map(|v| v.into_iter().map(|n| n.0).collect()) .map_err(|e| match e { vcs_graph::Error::Cycle(_) => { GraphCycleError::new_err(("cycle in graph while iter_topo_order",)) } other => pyo3::exceptions::PyValueError::new_err(format!("{:?}", other)), }) } /// Return a lazy iterator over the lefthand ancestry of `start_key`. /// /// The iterator yields revisions one at a time, walking left-most /// parents, and raises RevisionNotPresent if a key in the walk is /// missing from the provider. `stop_keys` is any container supporting /// the `in` operator; iteration stops when the current key is in it. #[pyo3(signature = (start_key, stop_keys=None))] fn iter_lefthand_ancestry( slf: PyRef<'_, Self>, py: Python, start_key: Py, stop_keys: Option>, ) -> PyResult { let stop_keys = match stop_keys { Some(s) => s, None => pyo3::types::PyTuple::empty(py).into_any().unbind(), }; Ok(PyLefthandAncestryIterator { provider: slf.provider_py.clone_ref(py), next_key: Some(start_key), stop_keys, graph_py: slf.into_pyobject(py)?.into_any().unbind(), }) } /// Iterate ancestry, yielding `(key, parents_list_or_None)` pairs. 
fn iter_ancestry<'py>( &self, py: Python<'py>, revision_ids: Py, ) -> PyResult>> { let nodes = extract_iter_pynodes(py, &revision_ids)?; let pairs = self.inner.iter_ancestry(nodes); pairs .into_iter() .map(|(k, parents)| { let parents_obj: Py = match parents { Parents::Known(ps) => { let list: Vec> = ps.into_iter().map(|n| n.0).collect(); pyo3::types::PyTuple::new(py, list)?.into_any().unbind() } Parents::Ghost => py.None(), }; pyo3::types::PyTuple::new(py, [k.0, parents_obj]) }) .collect() } fn find_distance_to_null( &self, py: Python, target_revision_id: Py, known_revision_ids: Py, ) -> PyResult { let target = PyNode::from(target_revision_id); let mut known: Vec<(PyNode, i64)> = Vec::new(); for item in known_revision_ids.bind(py).try_iter()? { let (k, d): (Py, i64) = item?.extract()?; known.push((PyNode::from(k), d)); } let null_bytes = pyo3::types::PyBytes::new(py, NULL_REVISION); let null = PyNode::from(null_bytes.into_any().unbind()); self.inner .find_distance_to_null(target, known, null) .map_err(graph_error_to_py) } fn find_lefthand_distances<'py>( &self, py: Python<'py>, keys: Py, ) -> PyResult> { let nodes = extract_iter_pynodes(py, &keys)?; let null_bytes = pyo3::types::PyBytes::new(py, NULL_REVISION); let null = PyNode::from(null_bytes.into_any().unbind()); let result = self.inner.find_lefthand_distances(nodes, null); let d = PyDict::new(py); for (k, dist) in result { d.set_item(k.0, dist)?; } Ok(d) } /// Return the heads from amongst keys. /// /// This is done by searching the ancestries of each key. Any key that is /// reachable from another key is not returned; all the others are. /// /// This operation scales with the relative depth between any two keys. If /// any two keys are completely disconnected all ancestry of both sides /// will be retrieved. /// /// :param keys: An iterable of keys. /// :return: A set of the heads. Note that as a set there is no ordering /// information. 
Callers will need to filter their input to create /// order if they need it. fn heads<'py>( &self, py: Python<'py>, keys: Py, ) -> PyResult> { let nodes = extract_iter_pynodes(py, &keys)?; let null_bytes = pyo3::types::PyBytes::new(py, NULL_REVISION); let null = PyNode::from(null_bytes.into_any().unbind()); let result = self.inner.heads_with_null(nodes, &null); pynodes_to_pyset(py, result) } /// Determine the lowest common ancestors of the provided revisions. /// /// A lowest common ancestor is a common ancestor none of whose /// descendants are common ancestors. In graphs, unlike trees, there may /// be multiple lowest common ancestors. /// /// This algorithm has two phases. Phase 1 identifies border ancestors, /// and phase 2 filters border ancestors to determine lowest common /// ancestors. /// /// In phase 1, border ancestors are identified, using a breadth-first /// search starting at the bottom of the graph. Searches are stopped /// whenever a node or one of its descendants is determined to be common. /// /// In phase 2, the border ancestors are filtered to find the lowest /// common ancestors. This is done by searching the ancestries of each /// border ancestor. /// /// Phase 2 is performed on the principle that a border ancestor that is /// not an ancestor of any other border ancestor is a lowest common /// ancestor. /// /// Searches are stopped when they find a node that is determined to be a /// common ancestor of all border ancestors, because this shows that it /// cannot be a descendant of any border ancestor. /// /// The scaling of this operation should be proportional to: /// /// 1. The number of uncommon ancestors /// 2. The number of border ancestors /// 3. The length of the shortest path between a border ancestor and an /// ancestor of all border ancestors.
fn find_lca<'py>( &self, py: Python<'py>, revisions: Py, ) -> PyResult> { let nodes = extract_iter_pynodes(py, &revisions)?; // Match Python's `_find_border_ancestors` precondition: None is // not a valid revision id. Raise InvalidRevisionId before running // the algorithm so callers see the same error they used to. for n in &nodes { if n.0.is_none(py) { return Err(InvalidRevisionId::new_err(( py.None(), self.provider_py.clone_ref(py), ))); } } let null_bytes = pyo3::types::PyBytes::new(py, NULL_REVISION); let null = PyNode::from(null_bytes.into_any().unbind()); let result = self.inner.find_lca(nodes, &null); pynodes_to_pyset(py, result) } /// Determine whether a revision is an ancestor of another. /// /// We answer this using heads() as heads() has the logic to perform the /// smallest number of parent lookups to determine the ancestral /// relationship between N revisions. fn is_ancestor( &self, py: Python, candidate_ancestor: Py, candidate_descendant: Py, ) -> bool { let null_bytes = pyo3::types::PyBytes::new(py, NULL_REVISION); let null = PyNode::from(null_bytes.into_any().unbind()); self.inner.is_ancestor( PyNode::from(candidate_ancestor), PyNode::from(candidate_descendant), &null, ) } /// Determine whether a revision is between two others. /// /// returns true if and only if: /// lower_bound_revid <= revid <= upper_bound_revid #[pyo3(signature = (revid, lower_bound_revid, upper_bound_revid))] fn is_between( &self, py: Python, revid: Py, lower_bound_revid: Option>, upper_bound_revid: Option>, ) -> bool { let null_bytes = pyo3::types::PyBytes::new(py, NULL_REVISION); let null = PyNode::from(null_bytes.into_any().unbind()); self.inner.is_between( PyNode::from(revid), lower_bound_revid.map(PyNode::from), upper_bound_revid.map(PyNode::from), &null, ) } /// Find the order that each revision was merged into tip. /// /// This basically just walks backwards with a stack, and walks left-first /// until it finds a node to stop. 
fn find_merge_order( &self, py: Python, tip_revision_id: Py, lca_revision_ids: Py, ) -> PyResult>> { let tip = PyNode::from(tip_revision_id); let lcas = extract_iter_pynodes(py, &lca_revision_ids)?; let result = self.inner.find_merge_order(tip, lcas); Ok(result.into_iter().map(|n| n.0).collect()) } /// Find the unique ancestors for a revision versus others. /// /// This returns the ancestry of unique_revision, excluding all revisions /// in the ancestry of common_revisions. If unique_revision is in that /// ancestry, then the empty set will be returned. /// /// :param unique_revision: The revision_id whose ancestry we are /// interested in. /// (XXX: Would this API be better if we allowed multiple revisions /// to be searched here?) /// :param common_revisions: Revision_ids of ancestries to exclude. /// :return: A set of revisions in the ancestry of unique_revision /// /// Algorithm description: /// /// 1) Walk backwards from the unique node and all common nodes. /// 2) When a node is seen by both sides, stop searching it in the unique /// walker and include it in the common walker. /// 3) Stop searching when there are no nodes left for the unique walker. /// At this point, you have a maximal set of unique nodes. Some of /// them may actually be common, and you haven't reached them yet. /// 4) Start new searchers for the unique nodes, seeded with the /// information you have so far. /// 5) Continue searching, stopping the common searches when the search /// tip is an ancestor of all unique nodes. /// 6) Aggregate together unique searchers when they are searching the /// same tips. When all unique searchers are searching the same node, /// stop them and move the search to a single 'all_unique_searcher'. /// 7) The 'all_unique_searcher' represents the very 'tip' of searching. /// Most of the time this produces very little important information. /// So don't step it as quickly as the other searchers. /// 8) Search is done when all common searchers have completed.
fn find_unique_ancestors<'py>( &self, py: Python<'py>, unique_revision: Py, common_revisions: Py, ) -> PyResult> { let unique = PyNode::from(unique_revision); let commons = extract_iter_pynodes(py, &common_revisions)?; let result = self.inner.find_unique_ancestors(unique, commons); pynodes_to_pyset(py, result) } /// Determine the graph difference between two revisions. fn find_difference<'py>( &self, py: Python<'py>, left_revision: Py, right_revision: Py, ) -> PyResult<( Bound<'py, pyo3::types::PySet>, Bound<'py, pyo3::types::PySet>, )> { let left = PyNode::from(left_revision); let right = PyNode::from(right_revision); let (l, r) = self.inner.find_difference(left, right); Ok((pynodes_to_pyset(py, l)?, pynodes_to_pyset(py, r)?)) } /// Find a unique LCA. /// /// Find lowest common ancestors. If there is no unique common /// ancestor, find the lowest common ancestors of those ancestors. /// /// Iteration stops when a unique lowest common ancestor is found. /// The graph origin is necessarily a unique lowest common ancestor. /// /// Note that None is not an acceptable substitute for NULL_REVISION /// in the input for this method. /// /// :param count_steps: If True, the return value will be a tuple of /// (unique_lca, steps) where steps is the number of times that /// find_lca was run. If False, only unique_lca is returned. #[pyo3(signature = (left_revision, right_revision, count_steps=false))] fn find_unique_lca( &self, py: Python, left_revision: Py, right_revision: Py, count_steps: bool, ) -> PyResult> { let left = PyNode::from(left_revision.clone_ref(py)); let right = PyNode::from(right_revision.clone_ref(py)); let null_bytes = pyo3::types::PyBytes::new(py, NULL_REVISION); let null = PyNode::from(null_bytes.into_any().unbind()); match self.inner.find_unique_lca(left, right, &null) { Some((key, steps)) => { if count_steps { Ok(pyo3::types::PyTuple::new( py, [key.0, steps.into_pyobject(py)?.into_any().unbind()], )?
.into_any() .unbind()) } else { Ok(key.0) } } None => Err(NoCommonAncestor::new_err((left_revision, right_revision))), } } /// Find the first lefthand ancestor of tip_key that merged merged_key. /// /// We do this by first finding the descendants of merged_key, then /// walking through the lefthand ancestry of tip_key until we find a key /// that doesn't descend from merged_key. Its child is the key that /// merged merged_key. /// /// :return: The first lefthand ancestor of tip_key to merge merged_key. /// merged_key if it is a lefthand ancestor of tip_key. /// None if no ancestor of tip_key merged merged_key. fn find_lefthand_merger( &self, merged_key: Py, tip_key: Py, ) -> PyResult>> { let merged = PyNode::from(merged_key); let tip = PyNode::from(tip_key); Ok(self.inner.find_lefthand_merger(merged, tip).map(|n| n.0)) } /// Find descendants of `old_key` that are ancestors of `new_key`. fn find_descendants<'py>( &self, py: Python<'py>, old_key: Py, new_key: Py, ) -> PyResult> { let old = PyNode::from(old_key); let new = PyNode::from(new_key); let result = self.inner.find_descendants(old, new); pynodes_to_pyset(py, result) } /// Remove revisions which are children of other ones in the set. /// /// This doesn't do any graph searching, it just checks the immediate /// parent_map to find if there are any children which can be removed. /// /// :param revisions: A set of revision_ids /// :param parent_map: A mapping `{key: parents}` — the value may be /// a list, tuple, or None (for ghosts). /// :return: A set of revision_ids with the children removed fn _remove_simple_descendants<'py>( &self, py: Python<'py>, revisions: Py, parent_map: Py, ) -> PyResult> { let revs: std::collections::HashSet = { let mut s = std::collections::HashSet::new(); for item in revisions.bind(py).try_iter()? { s.insert(PyNode::from(item?)); } s }; // Walk parent_map.items() Python-side so we accept any dict-like // mapping, tuples or lists as the value. 
let items = parent_map.bind(py).call_method0("items")?; let mut simple = revs.clone(); for item in items.try_iter()? { let item = item?; let key: Py = item.get_item(0)?.unbind(); let parents = item.get_item(1)?; if parents.is_none() { continue; } for parent in parents.try_iter()? { let parent = parent?; let parent_node = PyNode::from(parent.unbind()); if revs.contains(&parent_node) { simple.remove(&PyNode::from(key.clone_ref(py))); break; } } } let items: Vec> = simple.into_iter().map(|n| n.0).collect(); pyo3::types::PySet::new(py, items) } /// Find ancestors of `new_key` that may be descendants of `old_key`. fn _find_descendant_ancestors<'py>( &self, py: Python<'py>, old_key: Py, new_key: Py, ) -> PyResult> { let old = PyNode::from(old_key); let new = PyNode::from(new_key); let result = self.inner.find_descendant_ancestors(old, new); pynodes_to_pyset(py, result) } fn __repr__(&self, py: Python) -> PyResult { let r = self.provider_py.bind(py).repr()?; Ok(format!("Graph({})", r)) } } import_exception!(vcsgraph.errors, GhostRevisionsHaveNoRevno); import_exception!(vcsgraph.errors, InvalidRevisionId); import_exception!(vcsgraph.errors, NoCommonAncestor); import_exception!(vcsgraph.errors, RevisionNotPresent); fn graph_error_to_py(e: GraphError) -> PyErr { match e { GraphError::GhostRevision { target, ghost } => Python::attach(|py| { GhostRevisionsHaveNoRevno::new_err((target.0.clone_ref(py), ghost.0.clone_ref(py))) }), GraphError::RevisionNotPresent(key) => { Python::attach(|py| RevisionNotPresent::new_err((key.0.clone_ref(py),))) } GraphError::Cycle(_) => GraphCycleError::new_err(("cycle in graph",)), } } /// Helper: convert a set of PyNodes into a Python `set()`. fn pynodes_to_pyset<'py>( py: Python<'py>, set: rustc_hash::FxHashSet, ) -> PyResult> { let items: Vec> = set.into_iter().map(|n| n.0).collect(); pyo3::types::PySet::new(py, items) } /// Python binding for [`BfsState`]. 
Owns its own adapter over a Python /// parents provider and holds the Rust BFS state as sibling fields. /// /// `dict=true` gives the class a `__dict__`, which lets the existing Python /// callers in `graph.py` stash a `_label` attribute on the instance for /// debug logging. This can be removed once Phase 4 (heads / border /// ancestors) no longer relies on it. #[pyclass(name = "_BreadthFirstSearcher", dict)] struct PyBFSearcher { state: BfsState, adapter: PyParentsProviderAdapter, } #[pymethods] impl PyBFSearcher { #[new] fn new(py: Python, revisions: Py, parents_provider: Py) -> PyResult { let mut revs: Vec = Vec::new(); for item in revisions.bind(py).try_iter()? { revs.push(PyNode::from(item?)); } Ok(PyBFSearcher { state: BfsState::new(revs), adapter: PyParentsProviderAdapter { provider: parents_provider, }, }) } /// `seen` attribute — matches the Python API. Returns a set snapshot. #[getter] fn seen<'py>(&self, py: Python<'py>) -> PyResult> { pynodes_to_pyset(py, self.state.seen.clone()) } /// `_next_query` attribute — the current search frontier. /// /// Read-only snapshot. Existing callers in `graph.py` iterate and /// truth-test this attribute but do not mutate it. #[getter] fn _next_query<'py>(&self, py: Python<'py>) -> PyResult> { pynodes_to_pyset(py, self.state.next_query().clone()) } /// `_iterations` attribute — number of advance steps performed. #[getter] fn _iterations(&self) -> usize { self.state.iterations() } /// Python `step` returns the next query set, or `()` on StopIteration. 
fn step<'py>(&mut self, py: Python<'py>) -> PyResult> { match self.state.next_set(&self.adapter) { Some(set) => Ok(pynodes_to_pyset(py, set)?.into_any().unbind()), None => Ok(pyo3::types::PyTuple::empty(py).into_any().unbind()), } } fn __next__<'py>(&mut self, py: Python<'py>) -> PyResult> { match self.state.next_set(&self.adapter) { Some(set) => pynodes_to_pyset(py, set), None => Err(pyo3::exceptions::PyStopIteration::new_err(())), } } fn next<'py>(&mut self, py: Python<'py>) -> PyResult> { self.__next__(py) } fn next_with_ghosts<'py>( &mut self, py: Python<'py>, ) -> PyResult<( Bound<'py, pyo3::types::PySet>, Bound<'py, pyo3::types::PySet>, )> { match self.state.next_with_ghosts(&self.adapter) { Some((present, ghosts)) => Ok(( pynodes_to_pyset(py, present)?, pynodes_to_pyset(py, ghosts)?, )), None => Err(pyo3::exceptions::PyStopIteration::new_err(())), } } fn __iter__(slf: PyRef<'_, Self>) -> PyRef<'_, Self> { slf } fn get_state<'py>( &mut self, py: Python<'py>, ) -> PyResult<( Bound<'py, pyo3::types::PySet>, Bound<'py, pyo3::types::PySet>, Bound<'py, pyo3::types::PySet>, )> { let (started, excludes, included) = self.state.get_state(&self.adapter); Ok(( pynodes_to_pyset(py, started)?, pynodes_to_pyset(py, excludes)?, pynodes_to_pyset(py, included)?, )) } fn find_seen_ancestors<'py>( &self, py: Python<'py>, revisions: Py, ) -> PyResult> { let mut revs: Vec = Vec::new(); for item in revisions.bind(py).try_iter()? { revs.push(PyNode::from(item?)); } let result = self.state.find_seen_ancestors(revs, &self.adapter); pynodes_to_pyset(py, result) } fn stop_searching_any<'py>( &mut self, py: Python<'py>, revisions: Py, ) -> PyResult> { let mut revs: Vec = Vec::new(); for item in revisions.bind(py).try_iter()? { revs.push(PyNode::from(item?)); } let stopped = self.state.stop_searching_any(revs); pynodes_to_pyset(py, stopped) } fn start_searching(&mut self, py: Python, revisions: Py) -> PyResult> { let mut revs: Vec = Vec::new(); for item in revisions.bind(py).try_iter()? 
{ revs.push(PyNode::from(item?)); } match self.state.start_searching(revs, &self.adapter) { Some((present, ghosts)) => { let pres = pynodes_to_pyset(py, present)?; let gh = pynodes_to_pyset(py, ghosts)?; Ok( pyo3::types::PyTuple::new(py, [pres.into_any(), gh.into_any()])? .into_any() .unbind(), ) } // In Next mode Python returns None (the function has no explicit // return), which matches our None here. None => Ok(py.None()), } } fn __repr__(&self, py: Python) -> PyResult { let prefix = if self.state.iterations() > 0 { "searching" } else { "starting" }; let seen_repr = pyo3::types::PyList::new(py, self.state.seen.iter().map(|n| n.0.clone_ref(py)))? .repr()?; let next_repr = { let next_keys: Vec> = self .state .started_keys .iter() .map(|n| n.0.clone_ref(py)) .collect(); pyo3::types::PyList::new(py, next_keys)?.repr()? }; Ok(format!( "_BreadthFirstSearcher(iterations={}, {}={}, seen={})", self.state.iterations(), prefix, next_repr, seen_repr )) } } #[pymodule] fn _graph_rs(_py: Python, m: &Bound) -> PyResult<()> { m.add_wrapped(wrap_pyfunction!(invert_parent_map))?; m.add_wrapped(wrap_pyfunction!(collapse_linear_regions))?; m.add_wrapped(wrap_pyfunction!(merge_sort))?; m.add_class::()?; m.add_class::()?; m.add_class::()?; m.add_class::()?; m.add_class::()?; m.add_class::()?; m.add_class::()?; m.add_class::()?; m.add_class::()?; m.add_class::()?; m.add_class::()?; m.add_class::()?; m.add_class::()?; m.add_class::()?; m.add_class::()?; m.add_class::()?; Ok(()) } python-vcsgraph-0.2.0/crates/graph/Cargo.toml0000644000000000000000000000131315167007306016110 0ustar00[package] name = "vcs-graph" version = "3.5.0" authors = [ "Martin Packman ", "Jelmer Vernooij "] edition = "2021" description = "Graph algorithms for version control systems: topological sort, merge-aware sort, parent maps, and ancestry queries." 
license = "GPL-2.0-or-later" homepage = "https://www.breezy-vcs.org/" repository = "https://github.com/breezy-team/vcsgraph" readme = "README.md" keywords = ["vcs", "graph", "toposort", "dag", "breezy"] categories = ["algorithms", "data-structures"] [lib] [dependencies] lazy_static = "1.4.0" pyo3 = { workspace = true, optional = true } rustc-hash = "2" [dev-dependencies] maplit = "1.0.2" [features] pyo3 = ["dep:pyo3"] python-vcsgraph-0.2.0/crates/graph/README.md0000644000000000000000000000260315167007306015442 0ustar00# vcs-graph Graph algorithms for version control systems. `vcs-graph` provides building blocks used by version control tools: topological sorting (including merge-aware sorting that preserves branch structure), parent-map manipulation, least common ancestor queries, and related operations. It is the Rust core behind the [`vcsgraph`][vcsgraph-py] Python package, originally extracted from the Breezy version control system. ## Features - `TopoSorter` — iterative topological sort of a parent graph. - `MergeSorter` — merge-aware topological sort that assigns revision numbers and tracks merge depth, suitable for rendering commit history. - `ParentMap` / `ChildMap` — parent/child map types with utilities like `invert_parent_map` and `collapse_linear_regions`. - `ParentsProvider` trait with `DictParentsProvider` and `StackedParentsProvider` implementations. - Optional `pyo3` feature for exposing these types to Python. ## Example ```rust use std::collections::HashMap; use vcs_graph::tsort::TopoSorter; let graph: HashMap<&str, Vec<&str>> = HashMap::from([ ("A", vec![]), ("B", vec!["A"]), ("C", vec!["A"]), ("D", vec!["B", "C"]), ]); let sorted = TopoSorter::new(graph.into_iter()).sorted().unwrap(); // Parents always come before children. ``` ## License GPL-2.0-or-later. See `COPYING.txt` in the repository root. 
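
## Ordering sketch

The guarantee stated in the example above (parents always come before children) can be illustrated without depending on this crate. The following is a minimal, self-contained sketch using only the standard library. The names `topo_sort_sketch` and `parents_come_first` are hypothetical and are not part of `vcs-graph`. The Kahn-style algorithm is an assumption chosen for brevity and is not necessarily how `TopoSorter` is implemented.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Hypothetical helper (not part of `vcs-graph`): Kahn-style topological
/// sort over a parent map. Returns `None` if the graph contains a cycle.
fn topo_sort_sketch<'a>(graph: &HashMap<&'a str, Vec<&'a str>>) -> Option<Vec<&'a str>> {
    // For each node, count the parents that have not been emitted yet.
    // Parents that are not keys of the map (ghosts) are ignored.
    let mut pending: HashMap<&str, usize> = HashMap::new();
    let mut children: HashMap<&str, Vec<&str>> = HashMap::new();
    for (&node, parents) in graph {
        let known: Vec<&str> = parents
            .iter()
            .copied()
            .filter(|p| graph.contains_key(p))
            .collect();
        pending.insert(node, known.len());
        for p in known {
            children.entry(p).or_default().push(node);
        }
    }

    // Nodes with no pending parents can be emitted immediately.
    let mut ready: VecDeque<&str> = pending
        .iter()
        .filter(|(_, n)| **n == 0)
        .map(|(&k, _)| k)
        .collect();

    let mut out = Vec::with_capacity(graph.len());
    while let Some(node) = ready.pop_front() {
        out.push(node);
        if let Some(kids) = children.get(node) {
            for &child in kids {
                let n = pending.get_mut(child).expect("child was registered");
                *n -= 1;
                if *n == 0 {
                    ready.push_back(child);
                }
            }
        }
    }

    // Any node left unemitted is part of a cycle.
    if out.len() == graph.len() {
        Some(out)
    } else {
        None
    }
}

/// Hypothetical check: every node appears after all of its known parents.
fn parents_come_first(order: &[&str], graph: &HashMap<&str, Vec<&str>>) -> bool {
    let mut seen: HashSet<&str> = HashSet::new();
    for &node in order {
        if let Some(parents) = graph.get(node) {
            if parents
                .iter()
                .any(|p| graph.contains_key(p) && !seen.contains(p))
            {
                return false;
            }
        }
        seen.insert(node);
    }
    true
}

fn main() {
    let graph: HashMap<&str, Vec<&str>> = HashMap::from([
        ("A", vec![]),
        ("B", vec!["A"]),
        ("C", vec!["A"]),
        ("D", vec!["B", "C"]),
    ]);
    let order = topo_sort_sketch(&graph).expect("graph has no cycles");
    assert!(parents_come_first(&order, &graph));
    println!("{:?}", order);
}
```

Because `HashMap` iteration order is unspecified, the relative order of `B` and `C` may differ between runs; only the parents-first property is checked.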
[vcsgraph-py]: https://pypi.org/project/vcsgraph/ python-vcsgraph-0.2.0/crates/graph/src/0000755000000000000000000000000015167007306014751 5ustar00python-vcsgraph-0.2.0/crates/graph/src/bfs.rs0000644000000000000000000007222115167007306016075 0ustar00//! Breadth-first ancestry search. //! //! Ported from the `_BreadthFirstSearcher` class in `vcsgraph/graph.py`. //! The searcher walks the ancestry of a set of revisions, optionally with //! ghosts split out, and supports mid-walk modifications via //! [`BfsState::start_searching`] and [`BfsState::stop_searching_any`]. //! //! The state is decoupled from the parents provider: each advance method //! takes a `&impl ParentsProvider` explicitly. This lets Python bindings //! keep the provider adapter and the state as sibling fields in the same //! pyclass without running into self-reference problems. use crate::{ParentMap, Parents, ParentsProvider}; use rustc_hash::{FxHashMap, FxHashSet}; use std::collections::HashSet; use std::hash::Hash; /// Which kind of result the searcher returned on its most recent call. /// /// Callers can interleave `next` and `next_with_ghosts` calls; the searcher /// transparently advances the underlying state when the mode flips. #[derive(Clone, Copy, Debug, PartialEq, Eq)] enum ReturnMode { /// Most recent return was a plain `next()` — revisions yielded before /// their parents were queried, so ghosts are mixed in with real nodes. Next, /// Most recent return was `next_with_ghosts()` — revisions yielded after /// their parents were queried, so ghosts are split out. NextWithGhosts, } /// Outcome of a `_do_query` step. struct QueryResult { /// Nodes present in the provider's response. found: FxHashSet, /// Nodes not found (ghosts). ghosts: FxHashSet, /// Parents of the found nodes that we haven't seen before. next: FxHashSet, /// The full parent map returned by the provider for the queried keys. parents: FxHashMap>, } /// Mutable state of a breadth-first ancestry search. 
/// /// Constructed via [`BfsState::new`]; advanced with [`next`](Self::next) or /// [`next_with_ghosts`](Self::next_with_ghosts), which both take a reference /// to a parents provider. Mid-walk mutations go through /// [`start_searching`](Self::start_searching) and /// [`stop_searching_any`](Self::stop_searching_any). pub struct BfsState { /// All revisions the searcher has ever visited (seen or about to visit). pub seen: FxHashSet, /// Revisions the caller originally asked to search from, plus any added /// via `start_searching`. pub started_keys: FxHashSet, /// Revisions the caller explicitly asked to not descend through, plus /// any ghosts encountered. Ghosts are implicit stop points so the search /// can be repeated after ghosts are filled in. pub stopped_keys: FxHashSet, next_query: FxHashSet, current_present: FxHashSet, current_ghosts: FxHashSet, current_parents: FxHashMap>, returning: ReturnMode, iterations: usize, } impl BfsState { /// Start a new search from `revisions`. pub fn new>(revisions: I) -> Self { let next_query: FxHashSet = revisions.into_iter().collect(); let started_keys: FxHashSet = next_query.iter().cloned().collect(); BfsState { seen: FxHashSet::default(), started_keys, stopped_keys: FxHashSet::default(), next_query, current_present: FxHashSet::default(), current_ghosts: FxHashSet::default(), current_parents: FxHashMap::default(), returning: ReturnMode::NextWithGhosts, iterations: 0, } } /// Return the number of iterations performed so far. pub fn iterations(&self) -> usize { self.iterations } /// Borrow the current frontier (next query set). /// /// Exposed so bindings can reflect Python's `_next_query` attribute, /// which existing callers in `graph.py` read (but do not mutate). pub fn next_query(&self) -> &FxHashSet { &self.next_query } /// Snapshot of `(started_keys, excludes, included_keys)` describing what /// the searcher has reached. Matches Python's `get_state` return shape. 
/// /// This method intentionally calls the provider if the searcher is in /// `Next` mode, since we need the current query's children in order to /// list their parents as excludes. The subsequent iteration advances /// normally; the preview read is backed out of `seen`. pub fn get_state>( &mut self, provider: &P, ) -> (FxHashSet, FxHashSet, FxHashSet) { let next_query = if self.returning == ReturnMode::Next { let result = Self::do_query(&mut self.seen, &self.next_query, provider); // Undo the `seen` updates the preview made. for k in &result.next { self.seen.remove(k); } let mut nq = result.next; nq.extend(result.ghosts); nq } else { self.next_query.clone() }; let mut excludes = self.stopped_keys.clone(); excludes.extend(next_query); let included: FxHashSet = self.seen.difference(&excludes).cloned().collect(); (self.started_keys.clone(), excludes, included) } /// Advance the searcher and return the set yielded by Python's /// `__next__` / `next`. /// /// Each call yields the current query before its parents are queried, /// so ghosts are mixed in with present revisions. /// /// Returns `None` when there is nothing left to search. pub fn next_set>(&mut self, provider: &P) -> Option> { if self.returning != ReturnMode::Next { self.returning = ReturnMode::Next; self.iterations += 1; } else { self.advance(provider); } if self.next_query.is_empty() { return None; } self.seen.extend(self.next_query.iter().cloned()); Some(self.next_query.clone()) } /// Advance the searcher and return `(present, ghosts)` the way Python's /// `next_with_ghosts` does. /// /// Returns `None` when there is nothing left to search. 
pub fn next_with_ghosts>( &mut self, provider: &P, ) -> Option<(FxHashSet, FxHashSet)> { if self.returning != ReturnMode::NextWithGhosts { self.returning = ReturnMode::NextWithGhosts; self.advance(provider); } if self.next_query.is_empty() { return None; } self.advance(provider); Some((self.current_present.clone(), self.current_ghosts.clone())) } fn advance>(&mut self, provider: &P) { self.iterations += 1; // Split borrow: `do_query` only needs to read `next_query` and // write `seen`, so we pass them as separate references and avoid // cloning `next_query` on every advance. let result = Self::do_query(&mut self.seen, &self.next_query, provider); self.current_present = result.found; self.current_ghosts = result.ghosts; self.next_query = result.next; self.current_parents = result.parents; // Ghosts become implicit stop points. self.stopped_keys .extend(self.current_ghosts.iter().cloned()); } fn do_query>( seen: &mut FxHashSet, revisions: &FxHashSet, provider: &P, ) -> QueryResult { seen.extend(revisions.iter().cloned()); // ParentsProvider takes a std HashSet by reference. let mut std_set: HashSet = HashSet::with_capacity(revisions.len()); for k in revisions { std_set.insert(k.clone()); } let parent_map: ParentMap = provider.get_parent_map(&std_set); let mut found: FxHashSet = FxHashSet::default(); let mut parents_of_found: FxHashSet = FxHashSet::default(); let mut parents_owned: FxHashMap> = FxHashMap::default(); for (rev_id, parents) in parent_map.iter() { found.insert(rev_id.clone()); match parents { Parents::Known(ps) => { parents_owned.insert(rev_id.clone(), ps.clone()); for p in ps { if !seen.contains(p) { parents_of_found.insert(p.clone()); } } } // Python treats None parents as "continue" — no new parents // contributed. Parents::Ghost => {} } } let ghosts: FxHashSet = revisions.difference(&found).cloned().collect(); QueryResult { found, ghosts, next: parents_of_found, parents: parents_owned, } } /// Find already-seen ancestors of `revisions`. 
/// /// This walks backwards from `revisions` through `seen` keys only, /// querying the provider for parents. It matches the Python behavior: /// nodes not yet searched (in `next_query` when we're in `Next` mode) /// are skipped so we don't probe ahead of the search frontier. pub fn find_seen_ancestors(&self, revisions: I, provider: &P) -> FxHashSet where I: IntoIterator, P: ParentsProvider, { let mut pending: FxHashSet = revisions .into_iter() .filter(|r| self.seen.contains(r)) .collect(); let mut seen_ancestors: FxHashSet = pending.iter().cloned().collect(); // In `Next` mode `seen` contains nodes that have been *returned* but // whose parents haven't been queried yet. Skip those so we don't // probe ahead of the search frontier. let empty: FxHashSet = FxHashSet::default(); let not_searched_yet: &FxHashSet = if self.returning == ReturnMode::Next { &self.next_query } else { &empty }; pending.retain(|k| !not_searched_yet.contains(k)); while !pending.is_empty() { let mut std_set: HashSet = HashSet::with_capacity(pending.len()); for k in &pending { std_set.insert(k.clone()); } let parent_map = provider.get_parent_map(&std_set); let mut all_parents: Vec = Vec::new(); for (_, parents) in parent_map.iter() { if let Parents::Known(ps) = parents { all_parents.extend(ps.iter().cloned()); } } let mut next_pending: FxHashSet = FxHashSet::default(); for p in all_parents { if self.seen.contains(&p) && !seen_ancestors.contains(&p) { next_pending.insert(p); } } seen_ancestors.extend(next_pending.iter().cloned()); next_pending.retain(|k| !not_searched_yet.contains(k)); pending = next_pending; } seen_ancestors } /// Stop searching any of `revisions`. Returns the set of revisions /// actually removed from the current search frontier (not the ones that /// had already passed). 
pub fn stop_searching_any>(&mut self, revisions: I) -> FxHashSet { let revisions: FxHashSet = revisions.into_iter().collect(); let stopped: FxHashSet = if self.returning == ReturnMode::Next { let stopped: FxHashSet = self.next_query.intersection(&revisions).cloned().collect(); self.next_query.retain(|k| !revisions.contains(k)); stopped } else { let stopped_present: FxHashSet = self .current_present .intersection(&revisions) .cloned() .collect(); let stopped_ghosts: FxHashSet = self .current_ghosts .intersection(&revisions) .cloned() .collect(); let stopped: FxHashSet = stopped_present.union(&stopped_ghosts).cloned().collect(); self.current_present.retain(|k| !revisions.contains(k)); self.current_ghosts.retain(|k| !revisions.contains(k)); // Stopping X should stop returning parents of X — but only if no // other current node still references the same parent. Count // references to each parent from stopped_present, then decrement // for each non-stopped reference. let mut stop_rev_references: FxHashMap = FxHashMap::default(); for rev in &stopped_present { if let Some(parents) = self.current_parents.get(rev) { for parent_id in parents { *stop_rev_references.entry(parent_id.clone()).or_insert(0) += 1; } } } for parents in self.current_parents.values() { for parent_id in parents { if let Some(count) = stop_rev_references.get_mut(parent_id) { *count -= 1; } } } let stop_parents: FxHashSet = stop_rev_references .into_iter() .filter_map(|(k, refs)| if refs == 0 { Some(k) } else { None }) .collect(); self.next_query.retain(|k| !stop_parents.contains(k)); stopped }; self.stopped_keys.extend(stopped.iter().cloned()); self.stopped_keys.extend(revisions); stopped } /// Add more revisions to the search. /// /// In `NextWithGhosts` mode this performs an immediate query on the new /// revisions and returns `Some((present, ghosts))`. In `Next` mode the /// new revisions join the current query without a provider call and the /// function returns `None`. 
pub fn start_searching( &mut self, revisions: I, provider: &P, ) -> Option<(FxHashSet, FxHashSet)> where I: IntoIterator, P: ParentsProvider, { let revisions: FxHashSet = revisions.into_iter().collect(); self.started_keys.extend(revisions.iter().cloned()); let new_revisions: FxHashSet = revisions.difference(&self.seen).cloned().collect(); if self.returning == ReturnMode::Next { self.next_query.extend(new_revisions.iter().cloned()); self.seen.extend(new_revisions); None } else { let result = Self::do_query(&mut self.seen, &revisions, provider); self.stopped_keys.extend(result.ghosts.iter().cloned()); self.current_present.extend(result.found.iter().cloned()); self.current_ghosts.extend(result.ghosts.iter().cloned()); self.next_query.extend(result.next); for (k, v) in result.parents { self.current_parents.insert(k, v); } Some((result.found, result.ghosts)) } } } #[cfg(test)] mod tests { use super::*; use crate::DictParentsProvider; use std::collections::HashMap; fn provider(edges: &[(&'static str, &[&'static str])]) -> DictParentsProvider<&'static str> { let map: HashMap<&'static str, Vec<&'static str>> = edges.iter().map(|(k, ps)| (*k, ps.to_vec())).collect(); DictParentsProvider::from(map) } fn as_set(xs: [&'static str; N]) -> FxHashSet<&'static str> { xs.into_iter().collect() } #[test] fn next_walks_linear() { // a <- b <- c let p = provider(&[("a", &[]), ("b", &["a"]), ("c", &["b"])]); let mut s = BfsState::new(["c"]); assert_eq!(s.next_set(&p), Some(as_set(["c"]))); assert_eq!(s.next_set(&p), Some(as_set(["b"]))); assert_eq!(s.next_set(&p), Some(as_set(["a"]))); assert_eq!(s.next_set(&p), None); } #[test] fn next_with_ghosts_splits() { // head -> present -> (child, ghost); child has no parents; ghost missing let p = provider(&[ ("head", &["present"]), ("present", &["child", "ghost"]), ("child", &[]), ]); let mut s = BfsState::new(["head"]); assert_eq!(s.next_with_ghosts(&p), Some((as_set(["head"]), as_set([])))); assert_eq!( s.next_with_ghosts(&p), 
            Some((as_set(["present"]), as_set([])))
        );
        assert_eq!(
            s.next_with_ghosts(&p),
            Some((as_set(["child"]), as_set(["ghost"])))
        );
        assert_eq!(s.next_with_ghosts(&p), None);
    }

    #[test]
    fn next_mode_mixes_ghosts_in_with_present() {
        // Same graph as above, but via next(); ghost should appear alongside child.
        let p = provider(&[
            ("head", &["present"]),
            ("present", &["child", "ghost"]),
            ("child", &[]),
        ]);
        let mut s = BfsState::new(["head"]);
        assert_eq!(s.next_set(&p), Some(as_set(["head"])));
        assert_eq!(s.next_set(&p), Some(as_set(["present"])));
        assert_eq!(s.next_set(&p), Some(as_set(["child", "ghost"])));
        assert_eq!(s.next_set(&p), None);
    }

    #[test]
    fn stop_searching_any_next_mode() {
        // In Next mode, `next_query` holds the set just returned by `next()`
        // (since the caller is given the query, not the results). So stopping
        // the set that was just yielded removes it from the frontier.
        let p = provider(&[
            ("head", &["present"]),
            ("present", &["stopped"]),
            ("stopped", &[]),
        ]);
        let mut s = BfsState::new(["head"]);
        assert_eq!(s.next_set(&p), Some(as_set(["head"])));
        assert_eq!(s.next_set(&p), Some(as_set(["present"])));
        let stopped = s.stop_searching_any(["present"]);
        assert_eq!(stopped, as_set(["present"]));
        // With `present` stopped before its parents are queried, the search
        // is now exhausted.
        assert_eq!(s.next_set(&p), None);
    }

    #[test]
    fn start_searching_next_with_ghosts_queries_immediately() {
        let p = provider(&[("new_root", &["its_parent"]), ("its_parent", &[])]);
        let mut s: BfsState<&'static str> = BfsState::new([] as [&'static str; 0]);
        let (found, ghosts) = s.start_searching(["new_root", "ghost"], &p).unwrap();
        assert!(found.contains(&"new_root"));
        assert!(ghosts.contains(&"ghost"));
    }

    /// Translated from `test_breadth_first_search_start_ghosts` in
    /// `vcsgraph/tests/test_graph.py`: starting with only a ghost, the first
    /// step yields just the ghost and then the search is exhausted.
    #[test]
    fn start_with_only_a_ghost() {
        let p = provider(&[("a-ghost", &[])]);
        let mut s = BfsState::new(["a-ghost"]);
        assert_eq!(s.next_set(&p), Some(as_set(["a-ghost"])));
        assert_eq!(s.next_set(&p), None);
    }

    /// Translated from `test_breadth_first_change_search`: stop the current
    /// frontier, start a new search from an unrelated revision, and
    /// verify the BFS picks up the new revision's ancestors.
    #[test]
    fn change_search_via_stop_and_start() {
        let p = provider(&[
            ("head", &["present"]),
            ("present", &["stopped"]),
            ("stopped", &[]),
            ("other", &["other_2"]),
            ("other_2", &[]),
        ]);
        let mut s = BfsState::new(["head"]);
        assert_eq!(s.next_with_ghosts(&p), Some((as_set(["head"]), as_set([]))));
        assert_eq!(
            s.next_with_ghosts(&p),
            Some((as_set(["present"]), as_set([])))
        );
        assert_eq!(s.stop_searching_any(["present"]), as_set(["present"]));
        let (present, ghosts) = s.start_searching(["other", "other_ghost"], &p).unwrap();
        assert_eq!(present, as_set(["other"]));
        assert_eq!(ghosts, as_set(["other_ghost"]));
        assert_eq!(
            s.next_with_ghosts(&p),
            Some((as_set(["other_2"]), as_set([])))
        );
        assert_eq!(s.next_with_ghosts(&p), None);
    }

    const NULL: &str = "null:";

    /// Mirrors `test_breadth_first_search_change_next_to_next_with_ghosts`:
    /// interleave `next()` and `next_with_ghosts()` on the same searcher
    /// and verify both modes produce sensible values.
    #[test]
    fn change_next_to_next_with_ghosts() {
        let p = provider(&[
            ("head", &["present"]),
            ("present", &["child", "ghost"]),
            ("child", &[]),
        ]);
        let mut s = BfsState::new(["head"]);
        assert_eq!(s.next_with_ghosts(&p), Some((as_set(["head"]), as_set([]))));
        assert_eq!(s.next_set(&p), Some(as_set(["present"])));
        assert_eq!(
            s.next_with_ghosts(&p),
            Some((as_set(["child"]), as_set(["ghost"])))
        );
        assert_eq!(s.next_set(&p), None);
        // Symmetric: start with next(), switch to next_with_ghosts().
        let mut s = BfsState::new(["head"]);
        assert_eq!(s.next_set(&p), Some(as_set(["head"])));
        assert_eq!(
            s.next_with_ghosts(&p),
            Some((as_set(["present"]), as_set([])))
        );
        assert_eq!(s.next_set(&p), Some(as_set(["child", "ghost"])));
        assert_eq!(s.next_with_ghosts(&p), None);
    }

    /// Mirrors `test_breadth_first_get_result_excludes_current_pending`:
    /// at the start, nothing is seen; after each advance, `get_state()`
    /// reports the started keys, the excluded set, and the included
    /// (fully explored) set.
    #[test]
    fn get_state_excludes_current_pending() {
        let p = provider(&[("head", &["child"]), ("child", &[NULL]), (NULL, &[])]);
        let mut s = BfsState::new(["head"]);
        let (started, excludes, included) = s.get_state(&p);
        assert_eq!(started, as_set(["head"]));
        assert_eq!(excludes, as_set(["head"]));
        assert_eq!(included, as_set([]));
        assert_eq!(s.seen, as_set([]));

        // After the first step: head has been yielded and is now included;
        // child is excluded because it is the next frontier.
        s.next_set(&p);
        let (_, excludes, included) = s.get_state(&p);
        assert_eq!(excludes, as_set(["child"]));
        assert_eq!(included, as_set(["head"]));
        assert_eq!(s.seen, as_set(["head"]));

        // After child: null is the next frontier.
        s.next_set(&p);
        let (_, excludes, included) = s.get_state(&p);
        assert_eq!(excludes, as_set([NULL]));
        assert_eq!(included, as_set(["head", "child"]));

        // After null: nothing left in the frontier.
        s.next_set(&p);
        let (_, excludes, included) = s.get_state(&p);
        assert_eq!(excludes, as_set([]));
        assert_eq!(included, as_set(["head", "child", NULL]));
    }

    /// Mirrors `test_breadth_first_stop_searching_not_queried`: a client
    /// may tell the searcher to stop a key, and stopped_keys records it
    /// for later exclusion from the result's included-set.
    #[test]
    fn stop_searching_records_stops() {
        let p = provider(&[
            ("head", &["child", "ghost1"]),
            ("child", &[NULL]),
            (NULL, &[]),
        ]);
        let mut s = BfsState::new(["head"]);
        s.next_set(&p); // yields head
        s.stop_searching_any([NULL, "ghost1"]);
        // The stopped keys are in stopped_keys regardless of whether
        // they've been visited yet.
        assert!(s.stopped_keys.contains(&NULL));
        assert!(s.stopped_keys.contains(&"ghost1"));
        // get_state() should exclude the stopped keys from the
        // "included" snapshot.
        let (_, excludes, included) = s.get_state(&p);
        assert!(excludes.contains(&NULL));
        assert!(excludes.contains(&"ghost1"));
        assert!(!included.contains(&NULL));
        assert!(!included.contains(&"ghost1"));
    }

    /// Mirrors `test_breadth_first_stop_searching_late`: stopping a key
    /// from an older iteration should still exclude it from the result.
    #[test]
    fn stop_searching_late() {
        let p = provider(&[
            ("head", &["middle"]),
            ("middle", &["child"]),
            ("child", &[NULL]),
            (NULL, &[]),
        ]);
        let mut s = BfsState::new(["head"]);
        s.next_set(&p); // yields head
        s.next_set(&p); // yields middle
        s.next_set(&p); // yields child
        // Now stop both middle and child retroactively.
        s.stop_searching_any(["middle", "child"]);
        assert!(s.stopped_keys.contains(&"middle"));
        assert!(s.stopped_keys.contains(&"child"));
        // After the stop, the remaining state reflects that only the
        // original head is included.
        let (_, excludes, included) = s.get_state(&p);
        assert!(excludes.contains(&"middle"));
        assert!(excludes.contains(&"child"));
        assert_eq!(included, as_set(["head"]));
    }

    /// Mirrors `test_breadth_first_get_result_starting_a_ghost_ghost_is_excluded`:
    /// start_searching a ghost key mid-walk. The ghost is recorded in seen
    /// but gets filed under stopped_keys so it is excluded from included().
    #[test]
    fn start_searching_a_ghost_excludes_it() {
        let p = provider(&[("head", &["child"]), ("child", &[NULL]), (NULL, &[])]);
        let mut s = BfsState::new(["head"]);
        // Start-searching a ghost while in next_with_ghosts mode (the
        // default after construction). This returns (present, ghosts).
        let (present, ghosts) = s.start_searching(["ghost"], &p).unwrap();
        assert_eq!(present, as_set([]));
        assert_eq!(ghosts, as_set(["ghost"]));
        // ghost is now in stopped_keys so included() doesn't report it.
        assert!(s.stopped_keys.contains(&"ghost"));
    }

    /// Mirrors `test_breadth_first_revision_count_includes_NULL_REVISION`:
    /// walking to the sentinel should count it as part of `seen`.
    #[test]
    fn walk_includes_null_revision() {
        let p = provider(&[("head", &[NULL]), (NULL, &[])]);
        let mut s = BfsState::new(["head"]);
        s.next_set(&p); // yields head
        s.next_set(&p); // yields null
        assert_eq!(s.seen, as_set(["head", NULL]));
        assert_eq!(s.next_set(&p), None);
    }

    /// Mirrors `test_breadth_first_search_get_result_after_StopIteration`:
    /// hitting StopIteration should not invalidate the searcher; a
    /// subsequent get_state() still works.
    #[test]
    fn get_state_after_stop_iteration() {
        let p = provider(&[("head", &[NULL]), (NULL, &[])]);
        let mut s = BfsState::new(["head"]);
        while s.next_set(&p).is_some() {}
        // No more to yield.
        assert_eq!(s.next_set(&p), None);
        let (started, _excludes, included) = s.get_state(&p);
        assert_eq!(started, as_set(["head"]));
        assert!(included.contains(&"head"));
        assert!(included.contains(&NULL));
    }

    /// find_seen_ancestors should walk the parent chain and collect all
    /// ancestors already in `seen`, not new ones.
    #[test]
    fn find_seen_ancestors_walks_seen_chain() {
        let p = provider(&[
            ("head", &["middle"]),
            ("middle", &["child"]),
            ("child", &[NULL]),
            (NULL, &[]),
        ]);
        let mut s = BfsState::new(["head"]);
        // Walk the whole thing.
        while s.next_set(&p).is_some() {}
        // Ask for ancestors of "middle"; should find middle, child, null.
        let anc = s.find_seen_ancestors(["middle"], &p);
        assert!(anc.contains(&"middle"));
        assert!(anc.contains(&"child"));
        assert!(anc.contains(&NULL));
    }

    /// find_seen_ancestors should filter out keys not in seen.
    #[test]
    fn find_seen_ancestors_skips_unseen() {
        let p = provider(&[("head", &[NULL]), (NULL, &[]), ("unrelated", &[])]);
        let mut s = BfsState::new(["head"]);
        s.next_set(&p); // yields head
        let anc = s.find_seen_ancestors(["unrelated"], &p);
        // "unrelated" isn't in seen, so find_seen_ancestors returns an
        // empty set for it.
        assert!(!anc.contains(&"unrelated"));
    }

    /// stop_searching_any should return only the keys that were actually
    /// removed from the current frontier (not keys that had already been
    /// processed).
    #[test]
    fn stop_searching_any_returns_only_effective_stops() {
        let p = provider(&[("head", &["a"]), ("a", &["b"]), ("b", &[])]);
        let mut s = BfsState::new(["head"]);
        s.next_set(&p); // yields head
        s.next_set(&p); // yields a
        // `a` was just yielded and is still in `next_query`, so stopping it
        // removes it from the frontier and reports it as stopped. Only keys
        // that are in `next_query` at stop time are returned; `head` has
        // already left the frontier.
        let stopped = s.stop_searching_any(["a"]);
        assert_eq!(stopped, as_set(["a"]));
        // After stopping a, the search is exhausted.
        assert_eq!(s.next_set(&p), None);
    }

    /// Starting an already-seen key should be a no-op on `seen` and the
    /// key should not be re-queried.
    #[test]
    fn start_searching_already_seen_is_noop() {
        let p = provider(&[("head", &["child"]), ("child", &[])]);
        let mut s = BfsState::new(["head"]);
        s.next_set(&p); // yields head, which is now in `seen`
        let pre_seen = s.seen.clone();
        s.start_searching(["head"], &p);
        // seen should be unchanged (head was already there).
        assert_eq!(s.seen, pre_seen);
    }
}
python-vcsgraph-0.2.0/crates/graph/src/graph.rs0000644000000000000000000024460215167007306016430 0ustar00
//! Incremental graph queries backed by a [`ParentsProvider`].
//!
//! Ported incrementally from `vcsgraph/graph.py`.
Phase 1 covers the trivial //! methods that don't need a BFS searcher: parent/child map queries, //! topological ordering (delegated to [`crate::tsort::TopoSorter`]), and the //! left-hand ancestry walks. use crate::bfs::BfsState; use crate::parents_provider::{DictParentsProvider, ParentsProvider}; use crate::tsort::TopoSorter; use crate::{Error, ParentMap, Parents}; use rustc_hash::{FxHashMap, FxHashSet}; use std::collections::{BTreeMap, HashMap}; use std::hash::Hash; /// A revision graph backed by an arbitrary [`ParentsProvider`]. /// /// Unlike [`crate::KnownGraph`] this type does not own the full ancestry — /// queries are dispatched to the provider on demand. pub struct Graph where K: Hash + Eq + Clone, P: ParentsProvider, { provider: P, _marker: std::marker::PhantomData, } /// An error returned from one of `Graph`'s traversal methods. #[derive(Debug, Clone, PartialEq, Eq)] pub enum GraphError { /// A revision reachable via the ancestry walk turned out to be a ghost, /// so we cannot compute a revno for it. GhostRevision { target: K, ghost: K }, /// A revision was not known to the provider at all. RevisionNotPresent(K), /// A cycle was detected during a topological walk. Cycle(Vec), } impl std::fmt::Display for GraphError { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { match self { GraphError::GhostRevision { target, ghost } => write!( f, "ghost revision {ghost} reached while finding revno for {target}" ), GraphError::RevisionNotPresent(key) => { write!(f, "revision {key} not present in graph") } GraphError::Cycle(nodes) => { write!(f, "cycle detected: ")?; for (i, n) in nodes.iter().enumerate() { if i > 0 { write!(f, " -> ")?; } write!(f, "{n}")?; } Ok(()) } } } } impl std::error::Error for GraphError {} impl Graph where K: Hash + Eq + Clone, P: ParentsProvider, { /// Construct a new `Graph` backed by `provider`. 
pub fn new(provider: P) -> Self { Graph { provider, _marker: std::marker::PhantomData, } } /// Borrow the underlying parents provider. pub fn parents_provider(&self) -> &P { &self.provider } /// Return a parent map for `keys`. Missing keys are omitted; ghosts are /// reported as [`Parents::Ghost`]. pub fn get_parent_map(&self, keys: I) -> ParentMap where I: IntoIterator, { // ParentsProvider takes a std HashSet; collect directly into one // rather than going via FxHashSet and copying. let set: std::collections::HashSet = keys.into_iter().collect(); self.provider.get_parent_map(&set) } /// Return a mapping from parent → children for the requested keys. /// /// This is the inversion of [`get_parent_map`](Self::get_parent_map); /// only the supplied `keys` are considered as potential children. Ghosts /// are skipped. The children lists are sorted (by insertion order driven /// by the BTreeMap iteration) to match Python's `sorted()` behavior. pub fn get_child_map(&self, keys: I) -> BTreeMap> where K: Ord, I: IntoIterator, { let parent_map = self.get_parent_map(keys); // Walk children in sorted order so parent→children lists mirror the // Python implementation's sorted() iteration. let mut sorted: BTreeMap> = BTreeMap::new(); for (k, v) in parent_map { sorted.insert(k, v); } let mut result: BTreeMap> = BTreeMap::new(); for (child, parents) in sorted { if let Parents::Known(ps) = parents { for parent in ps { result.entry(parent).or_default().push(child.clone()); } } } result } /// Iterate over the ancestry of `revision_ids` in topological order. /// /// This delegates to [`TopoSorter`]. The topological order only ensures /// that parents come before children within the ancestry that is /// reachable from the input revisions. 
pub fn iter_topo_order(&self, revisions: I) -> Result, Error> where K: std::fmt::Debug, I: IntoIterator, { let pm = self.get_parent_map(revisions); let iter = pm.into_iter().filter_map(|(k, parents)| match parents { Parents::Known(ps) => Some((k, ps)), Parents::Ghost => None, }); TopoSorter::new(iter).sorted() } /// Walk the left-hand ancestry of `start_key`, stopping when a key in /// `stop_keys` is encountered. Yields `start_key` first, then its /// left-most parent, and so on. /// /// Errors with [`GraphError::RevisionNotPresent`] if a key in the walk is /// missing from the provider. pub fn iter_lefthand_ancestry( &self, start_key: K, stop_keys: S, ) -> Result, GraphError> where S: IntoIterator, { let stop_keys: FxHashSet = stop_keys.into_iter().collect(); let mut result = Vec::new(); let mut next_key = start_key; loop { if stop_keys.contains(&next_key) { return Ok(result); } let pm = self.get_parent_map(std::iter::once(next_key.clone())); let parents = match pm.get(&next_key) { Some(Parents::Known(ps)) => ps.clone(), Some(Parents::Ghost) => { return Err(GraphError::RevisionNotPresent(next_key)); } None => return Err(GraphError::RevisionNotPresent(next_key)), }; result.push(next_key.clone()); if parents.is_empty() { return Ok(result); } next_key = parents.into_iter().next().unwrap(); } } /// Iterate over the ancestry reachable from `revision_ids`, yielding /// `(key, parents)` pairs in a BFS order. Ghosts are yielded with /// `Parents::Ghost`. 
pub fn iter_ancestry(&self, revision_ids: I) -> Vec<(K, Parents)> where I: IntoIterator, { let mut pending: FxHashSet = revision_ids.into_iter().collect(); let mut processed: FxHashSet = FxHashSet::default(); let mut out: Vec<(K, Parents)> = Vec::new(); while !pending.is_empty() { processed.extend(pending.iter().cloned()); let next_map = self.get_parent_map(pending.iter().cloned()); let mut next_pending: FxHashSet = FxHashSet::default(); let mut seen_in_map: FxHashSet = FxHashSet::default(); for (k, parents) in next_map.iter() { seen_in_map.insert(k.clone()); if let Parents::Known(ps) = parents { for p in ps { if !processed.contains(p) { next_pending.insert(p.clone()); } } } out.push((k.clone(), parents.clone())); } // Keys in `pending` that the provider didn't return are ghosts. for ghost in pending.difference(&seen_in_map) { out.push((ghost.clone(), Parents::Ghost)); } pending = next_pending; } out } /// Find the left-hand distance from `target_revision_id` to the origin. /// /// `known_distances` is an iterable of `(revision_id, distance)` pairs /// that seed the search. The origin sentinel (any key equal to `null`, /// supplied by the caller) should be included with distance 0. /// /// This mirrors Python's `find_distance_to_null`, which hard-codes the /// sentinel `NULL_REVISION = b"null:"`. Keeping the sentinel Python-side /// lets the Rust core stay string-typed without baking in bytes. 
pub fn find_distance_to_null( &self, target_revision_id: K, known_distances: impl IntoIterator, null: K, ) -> Result> { let mut known_revnos: FxHashMap = known_distances.into_iter().collect(); let mut cur_tip = target_revision_id.clone(); let mut num_steps: i64 = 0; known_revnos.insert(null.clone(), 0); let mut searching_known_tips: Vec = known_revnos.keys().cloned().collect(); let mut unknown_searched: FxHashMap = FxHashMap::default(); while !known_revnos.contains_key(&cur_tip) { unknown_searched.insert(cur_tip.clone(), num_steps); num_steps += 1; let mut to_search: FxHashSet = searching_known_tips.iter().cloned().collect(); to_search.insert(cur_tip.clone()); let parent_map = self.get_parent_map(to_search); let parents = match parent_map.get(&cur_tip) { Some(Parents::Known(ps)) if !ps.is_empty() => ps, _ => { return Err(GraphError::GhostRevision { target: target_revision_id, ghost: cur_tip, }); } }; let next_tip = parents[0].clone(); let mut next_known_tips: Vec = Vec::new(); for revision_id in &searching_known_tips { let parents = match parent_map.get(revision_id) { Some(Parents::Known(ps)) if !ps.is_empty() => ps, _ => continue, }; let next = parents[0].clone(); let next_revno = known_revnos[revision_id] - 1; if let Some(unknown_steps) = unknown_searched.get(&next) { return Ok(next_revno + unknown_steps); } if known_revnos.contains_key(&next) { continue; } known_revnos.insert(next.clone(), next_revno); next_known_tips.push(next); } searching_known_tips = next_known_tips; cur_tip = next_tip; } Ok(known_revnos[&cur_tip] + num_steps) } /// Find left-hand distances for every key in `keys`. /// /// Ghosts are reported as distance `-1`, matching the Python contract. 
pub fn find_lefthand_distances( &self, keys: impl IntoIterator, null: K, ) -> FxHashMap { let mut result: FxHashMap = FxHashMap::default(); let mut known: Vec<(K, i64)> = Vec::new(); let mut ghosts: Vec = Vec::new(); for key in keys { match self.find_distance_to_null(key.clone(), known.iter().cloned(), null.clone()) { Ok(d) => { known.push((key.clone(), d)); result.insert(key, d); } Err(GraphError::GhostRevision { .. }) => ghosts.push(key), Err(_) => { // Other errors are unreachable from find_distance_to_null // in practice. Match Python by skipping. } } } for ghost in ghosts { result.insert(ghost, -1); } result } /// Find the first lefthand ancestor of `tip_key` that merged `merged_key`. /// /// Walks the lefthand ancestry of `tip_key` one step at a time, stopping /// as soon as a candidate is not a descendant of `merged_key`. Returns /// the last candidate that *was* a descendant — or `None` if none is. pub fn find_lefthand_merger(&self, merged_key: K, tip_key: K) -> Option where K: Ord, { let descendants = self.find_descendants(merged_key, tip_key.clone()); let mut last_candidate: Option = None; let mut next_key = tip_key; loop { if !descendants.contains(&next_key) { return last_candidate; } let pm = self.get_parent_map(std::iter::once(next_key.clone())); let parents = match pm.get(&next_key) { Some(Parents::Known(ps)) => ps.clone(), _ => { // Missing entry or ghost — treat as end of walk. return Some(next_key); } }; last_candidate = Some(next_key); if parents.is_empty() { return last_candidate; } next_key = parents.into_iter().next().unwrap(); } } /// Compute `(left_only, right_only)` — the set difference between the /// ancestries of `left` and `right`. pub fn find_difference(&self, left: K, right: K) -> (FxHashSet, FxHashSet) where K: Ord, { let (_border, common, searchers) = self.find_border_ancestors([left, right]); // find_border_ancestors always returns one searcher per input // revision, so for a 2-element input we know we get exactly two. 
let mut pair: [BfsState; 2] = match <[BfsState; 2]>::try_from(searchers) { Ok(pair) => pair, Err(_) => unreachable!("find_border_ancestors returned a non-2 pair"), }; self.search_for_extra_common(&common, &mut pair); let [left_searcher, right_searcher] = pair; let left_seen = &left_searcher.seen; let right_seen = &right_searcher.seen; ( left_seen.difference(right_seen).cloned().collect(), right_seen.difference(left_seen).cloned().collect(), ) } /// Run the "extra common" reconvergence pass on a pair of searchers /// left in the state they finished `find_border_ancestors` in. Mirrors /// Python's `_search_for_extra_common`. /// /// Takes a fixed-size `[BfsState; 2]` so the two-searcher restriction /// is enforced at compile time instead of via a runtime assertion. fn search_for_extra_common(&self, _common: &FxHashSet, searchers: &mut [BfsState; 2]) where K: Ord, { let unique: FxHashSet = searchers[0] .seen .symmetric_difference(&searchers[1].seen) .cloned() .collect(); if unique.is_empty() { return; } let parent_map = self.get_parent_map(unique.iter().cloned()); let unique = Self::remove_simple_descendants(&unique, &parent_map); // Build unique-searchers: one per unique revision. let mut unique_searchers: Vec> = Vec::new(); for revision_id in unique.iter() { let revs_to_search: FxHashSet = { let parent_idx = if searchers[0].seen.contains(revision_id) { 0 } else { 1 }; let seed = [revision_id.clone()]; let anc = searchers[parent_idx].find_seen_ancestors(seed, &self.provider); if anc.is_empty() { [revision_id.clone()].into_iter().collect() } else { anc } }; let mut s = BfsState::new(revs_to_search); s.next_set(&self.provider); unique_searchers.push(s); } // Compute initial ancestor_all_unique: intersection of all seen sets. 
let mut ancestor_all_unique: FxHashSet = FxHashSet::default(); for (i, s) in unique_searchers.iter().enumerate() { if i == 0 { ancestor_all_unique = s.seen.clone(); } else { ancestor_all_unique = ancestor_all_unique.intersection(&s.seen).cloned().collect(); } } loop { let mut newly_seen_common: FxHashSet = FxHashSet::default(); for s in searchers.iter_mut() { if let Some(set) = s.next_set(&self.provider) { newly_seen_common.extend(set); } } let mut newly_seen_unique: FxHashSet = FxHashSet::default(); for s in unique_searchers.iter_mut() { if let Some(set) = s.next_set(&self.provider) { newly_seen_unique.extend(set); } } let mut new_common_unique: FxHashSet = FxHashSet::default(); for revision in &newly_seen_unique { if unique_searchers.iter().all(|s| s.seen.contains(revision)) { new_common_unique.insert(revision.clone()); } } if !newly_seen_common.is_empty() { // Merge newly_seen_common seen-ancestors from each common searcher. let mut expanded = newly_seen_common.clone(); for s in searchers.iter() { expanded .extend(s.find_seen_ancestors(expanded.iter().cloned(), &self.provider)); } let expanded_frozen = expanded; for s in searchers.iter_mut() { s.start_searching(expanded_frozen.iter().cloned(), &self.provider); } let stop_searching_common: FxHashSet = ancestor_all_unique .intersection(&expanded_frozen) .cloned() .collect(); if !stop_searching_common.is_empty() { for s in searchers.iter_mut() { s.stop_searching_any(stop_searching_common.iter().cloned()); } } } if !new_common_unique.is_empty() { let mut expanded = new_common_unique.clone(); for s in unique_searchers.iter() { expanded .extend(s.find_seen_ancestors(expanded.iter().cloned(), &self.provider)); } for s in searchers.iter() { expanded .extend(s.find_seen_ancestors(expanded.iter().cloned(), &self.provider)); } for s in unique_searchers.iter_mut() { s.start_searching(expanded.iter().cloned(), &self.provider); } for s in searchers.iter_mut() { s.stop_searching_any(expanded.iter().cloned()); } 
                ancestor_all_unique.extend(expanded);

                // Collapse unique searchers that ended up with the same frontier.
                let mut seen_frontiers: std::collections::HashSet<Vec<K>> =
                    std::collections::HashSet::new();
                let mut next_unique: Vec<BfsState<K>> = Vec::new();
                for searcher in unique_searchers {
                    let mut key: Vec<K> = searcher.next_query().iter().cloned().collect();
                    key.sort_by_key(|k| {
                        use std::collections::hash_map::DefaultHasher;
                        use std::hash::Hasher;
                        let mut h = DefaultHasher::new();
                        k.hash(&mut h);
                        h.finish()
                    });
                    if seen_frontiers.insert(key) {
                        next_unique.push(searcher);
                    }
                }
                unique_searchers = next_unique;
            }

            let any_common_active = searchers.iter().any(|s| !s.next_query().is_empty());
            if !any_common_active {
                return;
            }
        }
    }

    /// Find a unique lowest common ancestor by iterating `find_lca`.
    ///
    /// If there are multiple LCAs, repeatedly find the LCA of that set
    /// until exactly one remains. Returns `None` if there is no common
    /// ancestor; otherwise returns the LCA together with the number of
    /// `find_lca` iterations it took to reach it.
    pub fn find_unique_lca(&self, left: K, right: K, null: &K) -> Option<(K, usize)> {
        let mut revisions: Vec<K> = vec![left, right];
        let mut steps: usize = 0;
        loop {
            steps += 1;
            let lca = self.find_lca(revisions.iter().cloned(), null);
            match lca.len() {
                1 => return lca.into_iter().next().map(|k| (k, steps)),
                0 => return None,
                _ => revisions = lca.into_iter().collect(),
            }
        }
    }

    /// Find the unique ancestors of `unique_revision` relative to
    /// `common_revisions`.
    ///
    /// Returns the set of revisions that are ancestors of `unique_revision`
    /// but not of any of `common_revisions`. If `unique_revision` is itself
    /// in `common_revisions`, returns an empty set.
    ///
    /// Algorithm description:
    ///
    /// 1. Walk backwards from the unique node and all common nodes.
    /// 2. When a node is seen by both sides, stop searching it in the unique
    ///    walker, include it in the common walker.
    /// 3. Stop searching when there are no nodes left for the unique walker.
    ///    At this point, you have a maximal set of unique nodes. Some of
    ///    them may actually be common, and you haven't reached them yet.
    /// 4. Start new searchers for the unique nodes, seeded with the
    ///    information you have so far.
    /// 5. Continue searching, stopping the common searches when the search
    ///    tip is an ancestor of all unique nodes.
    /// 6. Aggregate together unique searchers when they are searching the
    ///    same tips. When all unique searchers are searching the same node,
    ///    stop them and move it to a single 'all_unique_searcher'.
    /// 7. The 'all_unique_searcher' represents the very 'tip' of searching.
    ///    Most of the time this produces very little important information,
    ///    so don't step it as quickly as the other searchers.
    /// 8. Search is done when all common searchers have completed.
    pub fn find_unique_ancestors(
        &self,
        unique_revision: K,
        common_revisions: impl IntoIterator<Item = K>,
    ) -> FxHashSet<K>
    where
        K: Ord,
    {
        let common_revisions: Vec<K> = common_revisions.into_iter().collect();
        if common_revisions.contains(&unique_revision) {
            return FxHashSet::default();
        }

        // Phase 1: find maximal unique set.
        let (mut unique_searcher, mut common_searcher) =
            self.find_initial_unique_nodes([unique_revision], common_revisions);

        let unique_nodes: FxHashSet<K> = unique_searcher
            .seen
            .difference(&common_searcher.seen)
            .cloned()
            .collect();
        if unique_nodes.is_empty() {
            return unique_nodes;
        }

        // Phase 2: refine via unique-tip searchers.
        let (mut all_unique_searcher, mut unique_tip_searchers) =
            self.make_unique_searchers(&unique_nodes, &mut unique_searcher, &mut common_searcher);

        self.refine_unique_nodes(
            &mut unique_searcher,
            &mut all_unique_searcher,
            &mut unique_tip_searchers,
            &mut common_searcher,
        );

        unique_nodes
            .difference(&common_searcher.seen)
            .cloned()
            .collect()
    }

    /// Phase 1 of find_unique_ancestors: find the maximal unique set.
fn find_initial_unique_nodes( &self, unique_revisions: impl IntoIterator, common_revisions: impl IntoIterator, ) -> (BfsState, BfsState) { let mut unique_searcher = BfsState::new(unique_revisions); // Skip past the starting unique revisions themselves. unique_searcher.next_set(&self.provider); let mut common_searcher = BfsState::new(common_revisions); while !unique_searcher.next_query().is_empty() { let next_unique_nodes: FxHashSet = unique_searcher.next_set(&self.provider).unwrap_or_default(); let next_common_nodes: FxHashSet = common_searcher.next_set(&self.provider).unwrap_or_default(); let mut unique_are_common_nodes: FxHashSet = next_unique_nodes .intersection(&common_searcher.seen) .cloned() .collect(); unique_are_common_nodes.extend( next_common_nodes .intersection(&unique_searcher.seen) .cloned(), ); if !unique_are_common_nodes.is_empty() { let mut ancestors = unique_searcher.find_seen_ancestors(unique_are_common_nodes, &self.provider); let more = common_searcher.find_seen_ancestors(ancestors.clone(), &self.provider); ancestors.extend(more); unique_searcher.stop_searching_any(ancestors.iter().cloned()); common_searcher.start_searching(ancestors, &self.provider); } } (unique_searcher, common_searcher) } /// Phase 2 setup: create a searcher for each unique-node tip plus an /// `all_unique_searcher` covering ancestry shared by every unique tip. 
fn make_unique_searchers( &self, unique_nodes: &FxHashSet, unique_searcher: &mut BfsState, common_searcher: &mut BfsState, ) -> (BfsState, Vec>) where K: Ord, { let parent_map = self.get_parent_map(unique_nodes.iter().cloned()); let unique_tips = Self::remove_simple_descendants(unique_nodes, &parent_map); let mut unique_tip_searchers: Vec> = Vec::new(); let ancestor_all_unique: FxHashSet = if unique_tips.len() == 1 { unique_searcher.find_seen_ancestors(unique_tips, &self.provider) } else { for tip in unique_tips { let mut revs_to_search = unique_searcher.find_seen_ancestors([tip.clone()], &self.provider); let more = common_searcher.find_seen_ancestors(revs_to_search.clone(), &self.provider); revs_to_search.extend(more); let mut searcher = BfsState::new(revs_to_search); // Skip past the starting nodes — we don't care about them. searcher.next_set(&self.provider); unique_tip_searchers.push(searcher); } // Fold the intersection from the borrowed `seen` sets so we // don't have to snapshot each searcher's seen set as we go. unique_tip_searchers .iter() .map(|s| &s.seen) .fold(None::>, |acc, s| match acc { None => Some(s.clone()), Some(a) => Some(a.intersection(s).cloned().collect()), }) .unwrap_or_default() }; // Collapse all common nodes into a single searcher covering the // `ancestor_all_unique` set, then advance it once. let mut all_unique_searcher = BfsState::new(ancestor_all_unique.iter().cloned()); if !ancestor_all_unique.is_empty() { all_unique_searcher.next_set(&self.provider); // Stop common-searcher tips that are already ancestors of all uniques. 
let to_stop = common_searcher .find_seen_ancestors(ancestor_all_unique.iter().cloned(), &self.provider); common_searcher.stop_searching_any(to_stop); for searcher in unique_tip_searchers.iter_mut() { let to_stop = searcher .find_seen_ancestors(ancestor_all_unique.iter().cloned(), &self.provider); searcher.stop_searching_any(to_stop); } } (all_unique_searcher, unique_tip_searchers) } /// Remove revisions which are descendants (via the parent_map) of other /// revisions in the set. This is a cheap O(E) pass that doesn't walk /// ancestry — it just drops keys whose parents are already in `revisions`. fn remove_simple_descendants( revisions: &FxHashSet, parent_map: &ParentMap, ) -> FxHashSet { let mut simple = revisions.clone(); for (revision, parents) in parent_map.iter() { if let Parents::Known(ps) = parents { for parent_id in ps { if revisions.contains(parent_id) { simple.remove(revision); break; } } } } simple } /// One BFS step across unique tip searchers, the unique_searcher, and /// the common_searcher, propagating find_seen_ancestors cross-checks. fn step_unique_and_common_searchers( &self, common_searcher: &mut BfsState, unique_tip_searchers: &mut [BfsState], unique_searcher: &BfsState, ) -> (FxHashSet, FxHashSet) { let newly_seen_common: FxHashSet = common_searcher.next_set(&self.provider).unwrap_or_default(); let mut newly_seen_unique: FxHashSet = FxHashSet::default(); // Snapshot seen sets of all tip searchers so we can cross-reference // without re-borrowing mid-loop. let tip_count = unique_tip_searchers.len(); // Collect (index, next_step) pairs first. let mut per_tip_next: Vec<(usize, FxHashSet)> = Vec::with_capacity(tip_count); for (i, s) in unique_tip_searchers.iter_mut().enumerate() { let mut next_set = s.next_set(&self.provider).unwrap_or_default(); // Include ancestors already known to the main unique_searcher. next_set.extend(unique_searcher.find_seen_ancestors(next_set.clone(), &self.provider)); // And to the common_searcher. 
next_set.extend(common_searcher.find_seen_ancestors(next_set.clone(), &self.provider)); per_tip_next.push((i, next_set)); } // Cross-check: each tip pulls in seen ancestors from every other tip. // We need to compute additions per tip from the current (pre-start) // state of the other tips, so snapshot their seen sets first. let seen_per_tip: Vec> = unique_tip_searchers .iter() .map(|s| s.seen.clone()) .collect(); for (i, next_set) in per_tip_next.iter_mut() { for (j, seen_j) in seen_per_tip.iter().enumerate() { if *i == j { continue; } // We can't call `alt_searcher.find_seen_ancestors` here // because that would re-borrow the slice we're already // iterating. Use the free-standing equivalent that takes // the other searcher's seen set by reference. let additions = Self::find_seen_ancestors_against(next_set.clone(), seen_j, &self.provider); next_set.extend(additions); } } // Apply start_searching and accumulate newly_seen_unique. for (i, next_set) in per_tip_next { unique_tip_searchers[i].start_searching(next_set.iter().cloned(), &self.provider); newly_seen_unique.extend(next_set); } (newly_seen_common, newly_seen_unique) } /// Free-standing equivalent of `BfsState::find_seen_ancestors` that /// walks the provider restricted to a given `seen` set. Used by /// `step_unique_and_common_searchers` to cross-check tip searchers /// without mid-loop re-borrows of the slice. 
fn find_seen_ancestors_against( revisions: FxHashSet, seen: &FxHashSet, provider: &P, ) -> FxHashSet { let mut pending: FxHashSet = revisions.into_iter().filter(|r| seen.contains(r)).collect(); let mut seen_ancestors: FxHashSet = pending.iter().cloned().collect(); while !pending.is_empty() { let mut std_set: std::collections::HashSet = std::collections::HashSet::with_capacity(pending.len()); for k in &pending { std_set.insert(k.clone()); } let parent_map = provider.get_parent_map(&std_set); let mut all_parents: Vec = Vec::new(); for (_, parents) in parent_map.iter() { if let Parents::Known(ps) = parents { all_parents.extend(ps.iter().cloned()); } } let mut next_pending: FxHashSet = FxHashSet::default(); for p in all_parents { if seen.contains(&p) && !seen_ancestors.contains(&p) { next_pending.insert(p); } } seen_ancestors.extend(next_pending.iter().cloned()); pending = next_pending; } seen_ancestors } /// Find nodes common to all unique tip searchers (and optionally step /// the `all_unique_searcher`). fn find_nodes_common_to_all_unique( &self, unique_tip_searchers: &[BfsState], all_unique_searcher: &mut BfsState, newly_seen_unique: &FxHashSet, step_all_unique: bool, ) -> FxHashSet { let mut common: FxHashSet = newly_seen_unique.clone(); for searcher in unique_tip_searchers { common = common.intersection(&searcher.seen).cloned().collect(); } common = common .intersection(&all_unique_searcher.seen) .cloned() .collect(); if step_all_unique { if let Some(nodes) = all_unique_searcher.next_set(&self.provider) { common.extend(nodes); } } common } /// Combine unique tip searchers that are searching the same frontier. fn collapse_unique_searchers( &self, unique_tip_searchers: Vec>, common_to_all_unique_nodes: &FxHashSet, ) -> Vec> { // First pass: stop searching the common-to-all set on each searcher // and bucket by resulting frontier. 
let mut buckets: FxHashMap, Vec>> = FxHashMap::default(); let mut empty_bucket: Vec> = Vec::new(); for mut searcher in unique_tip_searchers { searcher.stop_searching_any(common_to_all_unique_nodes.iter().cloned()); let nq = searcher.next_query().clone(); if nq.is_empty() { empty_bucket.push(searcher); continue; } // Sort the frontier by hash for a deterministic bucket key. let mut key: Vec = nq.into_iter().collect(); key.sort_by_key(|k| { use std::collections::hash_map::DefaultHasher; use std::hash::Hasher; let mut h = DefaultHasher::new(); k.hash(&mut h); h.finish() }); buckets.entry(key).or_default().push(searcher); } let _ = empty_bucket; // drop empties — those searchers are done. let mut next_unique_searchers: Vec> = Vec::new(); for (_key, mut searchers) in buckets { if searchers.len() == 1 { next_unique_searchers.push(searchers.pop().unwrap()); } else { // Combine: intersect all their seen sets into the first. let mut first = searchers.remove(0); for s in searchers { first.seen = first.seen.intersection(&s.seen).cloned().collect(); } next_unique_searchers.push(first); } } next_unique_searchers } /// Phase 2 main loop: refine unique-vs-common by stepping searchers /// until `common_searcher` has nothing left to search. fn refine_unique_nodes( &self, unique_searcher: &mut BfsState, all_unique_searcher: &mut BfsState, unique_tip_searchers: &mut Vec>, common_searcher: &mut BfsState, ) { // Step the all_unique_searcher every N steps (Python's // STEP_UNIQUE_SEARCHER_EVERY = 5). 
const STEP_ALL_UNIQUE_EVERY: usize = 5; let mut step_all_unique_counter: usize = 0; while !common_searcher.next_query().is_empty() { let (newly_seen_common, newly_seen_unique) = self.step_unique_and_common_searchers( common_searcher, unique_tip_searchers, unique_searcher, ); let common_to_all_unique_nodes = self.find_nodes_common_to_all_unique( unique_tip_searchers, all_unique_searcher, &newly_seen_unique, step_all_unique_counter == 0, ); step_all_unique_counter = (step_all_unique_counter + 1) % STEP_ALL_UNIQUE_EVERY; if !newly_seen_common.is_empty() { let stop: FxHashSet = all_unique_searcher .seen .intersection(&newly_seen_common) .cloned() .collect(); common_searcher.stop_searching_any(stop); } if !common_to_all_unique_nodes.is_empty() { let mut expanded = common_to_all_unique_nodes.clone(); expanded.extend(common_searcher.find_seen_ancestors( common_to_all_unique_nodes.iter().cloned(), &self.provider, )); all_unique_searcher.start_searching(expanded.iter().cloned(), &self.provider); common_searcher.stop_searching_any(expanded); } let old_searchers = std::mem::take(unique_tip_searchers); *unique_tip_searchers = self.collapse_unique_searchers(old_searchers, &common_to_all_unique_nodes); } } /// Find ancestors of `new_key` that may be descendants of `old_key`. /// /// Drives two parallel searchers: `stop` walks up from `old_key` and /// `descendants` walks up from `new_key`. For each iteration, prune /// nodes already seen by `stop` from `descendants`, then advance `stop` /// and prune any nodes in the newly-visited stop set that `descendants` /// has already reached (via `find_seen_ancestors`). /// /// Returns the set of keys reached by `descendants` but not by `stop`. pub fn find_descendant_ancestors(&self, old_key: K, new_key: K) -> FxHashSet { let mut stop = BfsState::new([old_key]); let mut descendants = BfsState::new([new_key]); // Python's `for revisions in descendants:` iterates `next()` until // StopIteration. Our next_set returns None to signal the end. 
while let Some(revisions) = descendants.next_set(&self.provider) { let old_stop: FxHashSet = stop.seen.intersection(&revisions).cloned().collect(); descendants.stop_searching_any(old_stop); let step = stop.next_set(&self.provider).unwrap_or_default(); let seen_stop = descendants.find_seen_ancestors(step, &self.provider); descendants.stop_searching_any(seen_stop); } descendants.seen.difference(&stop.seen).cloned().collect() } /// Find border ancestors of a set of revisions via a concurrent BFS. /// /// Returns `(border_ancestors, common_ancestors, searchers)`. The /// searchers are left in the state they finished in so callers can /// inspect `seen` for graph-difference calculations. pub fn find_border_ancestors( &self, revisions: impl IntoIterator, ) -> (FxHashSet, FxHashSet, Vec>) { let revisions: Vec = revisions.into_iter().collect(); let mut searchers: Vec> = revisions .iter() .map(|r| BfsState::new([r.clone()])) .collect(); let mut common_ancestors: FxHashSet = FxHashSet::default(); let mut border_ancestors: FxHashSet = FxHashSet::default(); loop { let mut newly_seen: FxHashSet = FxHashSet::default(); for searcher in searchers.iter_mut() { if let Some(new_ancestors) = searcher.next_set(&self.provider) { newly_seen.extend(new_ancestors); } } let mut new_common: FxHashSet = FxHashSet::default(); for revision in &newly_seen { if common_ancestors.contains(revision) { new_common.insert(revision.clone()); continue; } if searchers.iter().all(|s| s.seen.contains(revision)) { border_ancestors.insert(revision.clone()); new_common.insert(revision.clone()); } } if !new_common.is_empty() { // Pull in ancestors that are already seen by each searcher. // We can't borrow searchers twice in one pass, so snapshot // each searcher's contribution and merge. 
let mut expanded = new_common.clone(); for searcher in searchers.iter() { let seen_anc = searcher.find_seen_ancestors(new_common.clone(), &self.provider); expanded.extend(seen_anc); } let new_common = expanded; for searcher in searchers.iter_mut() { searcher.start_searching(new_common.iter().cloned(), &self.provider); } common_ancestors.extend(new_common); } // Convergence check: if all searchers have the same next query, // we've merged into a single common line and can stop. let first_frontier: FxHashSet = searchers .first() .map(|s| s.next_query().clone()) .unwrap_or_default(); let all_same = searchers.iter().all(|s| s.next_query() == &first_frontier); if all_same { let uncommon: FxHashSet = first_frontier .difference(&common_ancestors) .cloned() .collect(); if !uncommon.is_empty() { // Shouldn't happen in well-formed graphs, but instead of // panicking we just continue — matches Python's // AssertionError shape without crashing. // (Callers of find_difference etc. will see this as an // empty difference; tests never exercise this path.) } break; } } (border_ancestors, common_ancestors, searchers) } /// Return the heads from amongst `keys`. /// /// This walks each candidate's ancestry and prunes any key reachable /// from another. The `null` parameter is the sentinel the caller uses /// for the origin (`b"null:"` in the Python layer); passing it lets the /// Rust core stay string-typed without baking bytes into the API. pub fn heads_with_null(&self, keys: impl IntoIterator, null: &K) -> FxHashSet { let mut candidate_heads: FxHashSet = keys.into_iter().collect(); if candidate_heads.contains(null) { candidate_heads.remove(null); if candidate_heads.is_empty() { let mut r = FxHashSet::default(); r.insert(null.clone()); return r; } } if candidate_heads.len() < 2 { return candidate_heads; } // One searcher per candidate, keyed by the candidate revision. 
let mut searchers: FxHashMap> = candidate_heads .iter() .map(|c| (c.clone(), BfsState::new([c.clone()]))) .collect(); let mut active: FxHashSet = candidate_heads.iter().cloned().collect(); // Skip the first yield (the candidate itself). for (_, searcher) in searchers.iter_mut() { searcher.next_set(&self.provider); } // Common walker: tracks nodes known to be common across all // searchers, so that a searcher hitting one can stop early. let mut common_walker: BfsState = BfsState::new([] as [K; 0]); while !active.is_empty() { let mut ancestors: FxHashSet = FxHashSet::default(); // Advance the common walker one step if there's anything to advance. common_walker.next_set(&self.provider); // Advance each active searcher one step. let active_list: Vec = active.iter().cloned().collect(); for candidate in active_list { let finished = { let searcher = searchers.get_mut(&candidate).unwrap(); match searcher.next_set(&self.provider) { Some(set) => { ancestors.extend(set); false } None => true, } }; if finished { active.remove(&candidate); } } // Process found ancestors. let mut new_common: FxHashSet = FxHashSet::default(); for ancestor in ancestors { if candidate_heads.contains(&ancestor) { candidate_heads.remove(&ancestor); searchers.remove(&ancestor); active.remove(&ancestor); } if common_walker.seen.contains(&ancestor) { // Known common: tell every searcher to stop on it. let stop_set: FxHashSet = [ancestor].into_iter().collect(); for searcher in searchers.values_mut() { searcher.stop_searching_any(stop_set.iter().cloned()); } } else if searchers.values().all(|s| s.seen.contains(&ancestor)) { // All searchers have reached this node — it's newly // common. Stop any of its seen ancestors in each searcher. new_common.insert(ancestor.clone()); // Collect seen ancestors per searcher, then apply stops. 
let seen_per_searcher: Vec> = searchers .values() .map(|s| s.find_seen_ancestors([ancestor.clone()], &self.provider)) .collect(); for (searcher, seen_anc) in searchers.values_mut().zip(seen_per_searcher.into_iter()) { searcher.stop_searching_any(seen_anc); } } } common_walker.start_searching(new_common, &self.provider); } candidate_heads } /// Find the lowest common ancestors of `revisions`. pub fn find_lca(&self, revisions: impl IntoIterator, null: &K) -> FxHashSet { let (border_common, _common, _searchers) = self.find_border_ancestors(revisions); self.heads_with_null(border_common, null) } /// Return whether `candidate_ancestor` is an ancestor of `candidate_descendant`. pub fn is_ancestor(&self, candidate_ancestor: K, candidate_descendant: K, null: &K) -> bool { let heads = self.heads_with_null( [candidate_ancestor.clone(), candidate_descendant.clone()], null, ); heads.len() == 1 && heads.contains(&candidate_descendant) } /// Return whether `revid` is between `lower_bound_revid` and /// `upper_bound_revid` (inclusive). `None` bounds are skipped. pub fn is_between( &self, revid: K, lower_bound_revid: Option, upper_bound_revid: Option, null: &K, ) -> bool { let upper_ok = match upper_bound_revid { None => true, Some(upper) => self.is_ancestor(revid.clone(), upper, null), }; if !upper_ok { return false; } match lower_bound_revid { None => true, Some(lower) => self.is_ancestor(lower, revid, null), } } /// Find the order in which `lca_revision_ids` were merged into `tip`. /// /// Walks backwards from `tip` with a stack, left-first, collecting the /// LCA revisions in the order they are encountered. 
pub fn find_merge_order( &self, tip: K, lca_revision_ids: impl IntoIterator, ) -> Vec { let mut looking_for: FxHashSet = lca_revision_ids.into_iter().collect(); if looking_for.len() == 1 { return looking_for.into_iter().collect(); } let mut stack: Vec = vec![tip]; let mut found: Vec = Vec::new(); let mut stop: FxHashSet = FxHashSet::default(); while !stack.is_empty() && !looking_for.is_empty() { let next_key = stack.pop().unwrap(); stop.insert(next_key.clone()); if looking_for.remove(&next_key) { found.push(next_key); if looking_for.len() == 1 { // Only one LCA left — add it and break without walking. let last = looking_for.iter().next().cloned().unwrap(); looking_for.clear(); found.push(last); break; } continue; } let pm = self.get_parent_map(std::iter::once(next_key.clone())); let parents = match pm.get(&next_key) { Some(Parents::Known(ps)) if !ps.is_empty() => ps.clone(), _ => continue, }; // Walk parents in reverse so the left-most parent is popped first. for parent_id in parents.into_iter().rev() { if !stop.contains(&parent_id) { stack.push(parent_id.clone()); } stop.insert(parent_id); } } found } /// Find descendants of `old_key` that are ancestors of `new_key`. /// /// Uses [`find_descendant_ancestors`](Self::find_descendant_ancestors) /// to narrow down candidates, then walks forwards through the child /// relationships by running a BFS over a [`DictParentsProvider`] built /// from the inverted parent map. pub fn find_descendants(&self, old_key: K, new_key: K) -> FxHashSet where K: Ord, { let candidates = self.find_descendant_ancestors(old_key.clone(), new_key); let child_map = self.get_child_map(candidates); // Walk forwards via a DictParentsProvider built from the child map. 
let dict: HashMap> = child_map.into_iter().collect(); let provider = DictParentsProvider::from(dict); let mut searcher = BfsState::new([old_key]); while searcher.next_set(&provider).is_some() {} searcher.seen } } #[cfg(test)] mod tests { use super::*; use crate::DictParentsProvider; use std::collections::HashMap; const NULL: &str = "null:"; fn make( edges: &[(&'static str, &[&'static str])], ) -> Graph<&'static str, DictParentsProvider<&'static str>> { let map: HashMap<&'static str, Vec<&'static str>> = edges.iter().map(|(k, ps)| (*k, ps.to_vec())).collect(); Graph::new(DictParentsProvider::from(map)) } #[test] fn get_parent_map_basic() { let g = make(&[("a", &[]), ("b", &["a"])]); let pm = g.get_parent_map(vec!["a", "b", "missing"]); assert_eq!(pm.get(&"a"), Some(&Parents::Known(vec![]))); assert_eq!(pm.get(&"b"), Some(&Parents::Known(vec!["a"]))); assert_eq!(pm.get(&"missing"), None); } #[test] fn get_child_map_inverts() { let g = make(&[("a", &[]), ("b", &["a"]), ("c", &["a"]), ("d", &["b", "c"])]); let cm = g.get_child_map(vec!["a", "b", "c", "d"]); assert_eq!(cm.get(&"a"), Some(&vec!["b", "c"])); assert_eq!(cm.get(&"b"), Some(&vec!["d"])); assert_eq!(cm.get(&"c"), Some(&vec!["d"])); assert_eq!(cm.get(&"d"), None); } #[test] fn iter_lefthand_ancestry_linear() { // null <- a <- b <- c let g = make(&[("a", &[NULL]), ("b", &["a"]), ("c", &["b"])]); let out = g.iter_lefthand_ancestry("c", [NULL]).unwrap(); assert_eq!(out, vec!["c", "b", "a"]); } #[test] fn find_distance_to_null_linear() { // null <- a (1) <- b (2) <- c (3) let g = make(&[("a", &[NULL]), ("b", &["a"]), ("c", &["b"])]); assert_eq!( g.find_distance_to_null("c", std::iter::empty(), NULL) .unwrap(), 3 ); assert_eq!( g.find_distance_to_null("a", std::iter::empty(), NULL) .unwrap(), 1 ); } #[test] fn find_distance_to_null_with_known_seed() { // null <- a (1) <- b (2) <- c (3) <- d (4) let g = make(&[("a", &[NULL]), ("b", &["a"]), ("c", &["b"]), ("d", &["c"])]); assert_eq!( g.find_distance_to_null("d", 
std::iter::once(("b", 2)), NULL) .unwrap(), 4 ); } #[test] fn find_lefthand_distances_all() { let g = make(&[("a", &[NULL]), ("b", &["a"]), ("c", &["b"])]); let d = g.find_lefthand_distances(vec!["a", "b", "c"], NULL); assert_eq!(d.get(&"a"), Some(&1)); assert_eq!(d.get(&"b"), Some(&2)); assert_eq!(d.get(&"c"), Some(&3)); } #[test] fn iter_topo_order_parents_first() { let g = make(&[("a", &[]), ("b", &["a"]), ("c", &["a"]), ("d", &["b", "c"])]); let order = g.iter_topo_order(vec!["a", "b", "c", "d"]).unwrap(); let pos = |x: &&str| order.iter().position(|n| n == x).unwrap(); assert!(pos(&"a") < pos(&"b")); assert!(pos(&"a") < pos(&"c")); assert!(pos(&"b") < pos(&"d")); assert!(pos(&"c") < pos(&"d")); } #[test] fn iter_ancestry_reaches_all() { let g = make(&[("a", &[]), ("b", &["a"]), ("c", &["a"]), ("d", &["b", "c"])]); let anc = g.iter_ancestry(vec!["d"]); let keys: FxHashSet<&'static str> = anc.iter().map(|(k, _)| *k).collect(); let expected: FxHashSet<&'static str> = ["a", "b", "c", "d"].into_iter().collect(); assert_eq!(keys, expected); } #[test] fn find_descendants_diamond() { // a // / \ // b c // \ / // d let g = make(&[("a", &[]), ("b", &["a"]), ("c", &["a"]), ("d", &["b", "c"])]); let descendants = g.find_descendants("a", "d"); let expected: FxHashSet<&'static str> = ["a", "b", "c", "d"].into_iter().collect(); assert_eq!(descendants, expected); } #[test] fn find_descendants_linear() { // a <- b <- c <- d let g = make(&[("a", &[]), ("b", &["a"]), ("c", &["b"]), ("d", &["c"])]); let descendants = g.find_descendants("b", "d"); let expected: FxHashSet<&'static str> = ["b", "c", "d"].into_iter().collect(); assert_eq!(descendants, expected); } #[test] fn heads_single_candidate() { let g = make(&[("a", &[]), ("b", &["a"])]); let h = g.heads_with_null(vec!["b"], &NULL); assert_eq!(h, ["b"].into_iter().collect()); } #[test] fn heads_prunes_ancestors() { // a <- b <- c let g = make(&[("a", &[]), ("b", &["a"]), ("c", &["b"])]); let h = g.heads_with_null(vec!["a", 
"c"], &NULL); assert_eq!(h, ["c"].into_iter().collect()); } #[test] fn heads_diamond_returns_both() { // a // / \ // b c // \ / // d let g = make(&[("a", &[]), ("b", &["a"]), ("c", &["a"]), ("d", &["b", "c"])]); let h = g.heads_with_null(vec!["b", "c"], &NULL); let expected: FxHashSet<_> = ["b", "c"].into_iter().collect(); assert_eq!(h, expected); } #[test] fn heads_null_alone() { let g = make(&[("a", &[])]); let h = g.heads_with_null(vec![NULL], &NULL); assert_eq!(h, [NULL].into_iter().collect()); } #[test] fn find_lca_diamond() { // a // / \ // b c // \ / // d let g = make(&[("a", &[]), ("b", &["a"]), ("c", &["a"]), ("d", &["b", "c"])]); let lca = g.find_lca(vec!["b", "c"], &NULL); assert_eq!(lca, ["a"].into_iter().collect()); } #[test] fn is_ancestor_true_and_false() { // a <- b <- c let g = make(&[("a", &[]), ("b", &["a"]), ("c", &["b"])]); assert!(g.is_ancestor("a", "c", &NULL)); assert!(g.is_ancestor("b", "c", &NULL)); assert!(!g.is_ancestor("c", "a", &NULL)); } #[test] fn find_merge_order_single() { let g = make(&[("a", &[]), ("b", &["a"])]); let order = g.find_merge_order("b", vec!["a"]); assert_eq!(order, vec!["a"]); } #[test] fn find_descendants_unrelated() { // new_key is not a descendant of old_key. let g = make(&[("a", &[]), ("b", &["a"]), ("c", &["a"])]); let descendants = g.find_descendants("b", "c"); // b is not reachable from c, so no descendants of b among c's ancestry. assert!(descendants.is_empty() || descendants == ["b"].into_iter().collect()); } /// Build a set literal from an array of strings. 
fn set(xs: [&'static str; N]) -> FxHashSet<&'static str> { xs.into_iter().collect() } fn ancestry_1() -> Graph<&'static str, DictParentsProvider<&'static str>> { make(&[ ("rev1", &[NULL]), ("rev2a", &["rev1"]), ("rev2b", &["rev1"]), ("rev3", &["rev2a"]), ("rev4", &["rev3", "rev2b"]), ]) } fn ancestry_2() -> Graph<&'static str, DictParentsProvider<&'static str>> { make(&[ ("rev1a", &[NULL]), ("rev2a", &["rev1a"]), ("rev1b", &[NULL]), ("rev3a", &["rev2a"]), ("rev4a", &["rev3a"]), ]) } fn criss_cross() -> Graph<&'static str, DictParentsProvider<&'static str>> { make(&[ ("rev1", &[NULL]), ("rev2a", &["rev1"]), ("rev2b", &["rev1"]), ("rev3a", &["rev2a", "rev2b"]), ("rev3b", &["rev2b", "rev2a"]), ]) } fn criss_cross2() -> Graph<&'static str, DictParentsProvider<&'static str>> { make(&[ ("rev1a", &[NULL]), ("rev1b", &[NULL]), ("rev2a", &["rev1a", "rev1b"]), ("rev2b", &["rev1b", "rev1a"]), ]) } fn history_shortcut() -> Graph<&'static str, DictParentsProvider<&'static str>> { make(&[ ("rev1", &[NULL]), ("rev2a", &["rev1"]), ("rev2b", &["rev1"]), ("rev2c", &["rev1"]), ("rev3a", &["rev2a", "rev2b"]), ("rev3b", &["rev2b", "rev2c"]), ]) } fn extended_history_shortcut() -> Graph<&'static str, DictParentsProvider<&'static str>> { make(&[ ("a", &[NULL]), ("b", &["a"]), ("c", &["b"]), ("d", &["c"]), ("e", &["d"]), ("f", &["a", "d"]), ]) } fn double_shortcut_fixture() -> Graph<&'static str, DictParentsProvider<&'static str>> { make(&[ ("a", &[NULL]), ("b", &["a"]), ("c", &["b"]), ("d", &["c"]), ("e", &["c"]), ("f", &["a", "d"]), ("g", &["a", "e"]), ]) } fn complex_shortcut() -> Graph<&'static str, DictParentsProvider<&'static str>> { make(&[ ("a", &[NULL]), ("b", &["a"]), ("c", &["b"]), ("d", &["c"]), ("e", &["d"]), ("f", &["d"]), ("g", &["f"]), ("h", &["f"]), ("i", &["e", "g"]), ("j", &["g"]), ("k", &["j"]), ("l", &["k"]), ("m", &["i", "l"]), ("n", &["l", "h"]), ]) } fn complex_shortcut2() -> Graph<&'static str, DictParentsProvider<&'static str>> { make(&[ ("a", &[NULL]), ("b", 
&["a"]), ("c", &["b"]), ("d", &["c"]), ("e", &["d"]), ("f", &["e"]), ("g", &["f"]), ("h", &["d"]), ("i", &["g"]), ("j", &["h"]), ("k", &["h", "i"]), ("l", &["k"]), ("m", &["l"]), ("n", &["m"]), ("o", &["n"]), ("p", &["o"]), ("q", &["p"]), ("r", &["q"]), ("s", &["r"]), ("t", &["i", "s"]), ("u", &["s", "j"]), ]) } fn multiple_interesting_unique() -> Graph<&'static str, DictParentsProvider<&'static str>> { make(&[ ("a", &[NULL]), ("b", &["a"]), ("c", &["b"]), ("d", &["c"]), ("e", &["d"]), ("f", &["d"]), ("g", &["e"]), ("h", &["e"]), ("i", &["f"]), ("j", &["g"]), ("k", &["g"]), ("l", &["h"]), ("m", &["i"]), ("n", &["k", "l"]), ("o", &["m"]), ("p", &["m", "l"]), ("q", &["n", "o"]), ("r", &["q"]), ("s", &["r"]), ("t", &["s"]), ("u", &["t"]), ("v", &["u"]), ("w", &["v"]), ("x", &["w"]), ("y", &["j", "x"]), ("z", &["x", "p"]), ]) } fn shortcut_extra_root() -> Graph<&'static str, DictParentsProvider<&'static str>> { make(&[ ("a", &[NULL]), ("b", &["a"]), ("c", &["b"]), ("d", &["c"]), ("e", &["d"]), ("f", &["a", "d", "g"]), ("g", &[NULL]), ]) } fn boundary() -> Graph<&'static str, DictParentsProvider<&'static str>> { make(&[ ("a", &["b"]), ("c", &["b", "d"]), ("b", &["e"]), ("d", &["e"]), ("e", &["f"]), ("f", &[NULL]), ]) } #[test] fn test_lca_ancestry_1() { let g = ancestry_1(); assert_eq!(g.find_lca([NULL, NULL], &NULL), set([NULL])); assert_eq!(g.find_lca([NULL, "rev1"], &NULL), set([NULL])); assert_eq!(g.find_lca(["rev1", "rev1"], &NULL), set(["rev1"])); assert_eq!(g.find_lca(["rev2a", "rev2b"], &NULL), set(["rev1"])); } #[test] fn test_lca_criss_cross() { let g = criss_cross(); assert_eq!( g.find_lca(["rev3a", "rev3b"], &NULL), set(["rev2a", "rev2b"]) ); } #[test] fn test_lca_shortcut() { let g = history_shortcut(); assert_eq!(g.find_lca(["rev3a", "rev3b"], &NULL), set(["rev2b"])); } #[test] fn test_lca_double_shortcut() { let g = double_shortcut_fixture(); assert_eq!(g.find_lca(["f", "g"], &NULL), set(["c"])); } #[test] fn test_unique_lca_ancestry_1() { let g = 
ancestry_1(); assert_eq!(g.find_unique_lca(NULL, NULL, &NULL), Some((NULL, 1))); assert_eq!(g.find_unique_lca(NULL, "rev1", &NULL), Some((NULL, 1))); assert_eq!(g.find_unique_lca("rev1", "rev1", &NULL), Some(("rev1", 1))); assert_eq!( g.find_unique_lca("rev2a", "rev2b", &NULL), Some(("rev1", 1)) ); } #[test] fn test_unique_lca_criss_cross() { let g = criss_cross(); assert_eq!( g.find_unique_lca("rev3a", "rev3b", &NULL), Some(("rev1", 2)) ); } #[test] fn test_unique_lca_null_revision_criss_cross2() { let g = criss_cross2(); assert_eq!( g.find_unique_lca("rev2a", "rev1b", &NULL).map(|(k, _)| k), Some("rev1b") ); assert_eq!( g.find_unique_lca("rev2a", "rev2b", &NULL).map(|(k, _)| k), Some(NULL) ); } #[test] fn test_unique_lca_separate_ancestry() { let g = ancestry_2(); assert_eq!( g.find_unique_lca("rev4a", "rev1b", &NULL).map(|(k, _)| k), Some(NULL) ); } #[test] fn test_heads_null() { let g = ancestry_1(); assert_eq!(g.heads_with_null([NULL], &NULL), set([NULL])); assert_eq!(g.heads_with_null([NULL, "rev1"], &NULL), set(["rev1"])); assert_eq!(g.heads_with_null(["rev1", NULL], &NULL), set(["rev1"])); } #[test] fn test_heads_one() { let g = ancestry_1(); for key in [NULL, "rev1", "rev2a", "rev2b", "rev3", "rev4"] { assert_eq!(g.heads_with_null([key], &NULL), set([key])); } } #[test] fn test_heads_single_from_pair() { let g = ancestry_1(); assert_eq!(g.heads_with_null([NULL, "rev4"], &NULL), set(["rev4"])); assert_eq!(g.heads_with_null(["rev1", "rev2a"], &NULL), set(["rev2a"])); assert_eq!(g.heads_with_null(["rev1", "rev2b"], &NULL), set(["rev2b"])); assert_eq!(g.heads_with_null(["rev1", "rev3"], &NULL), set(["rev3"])); assert_eq!(g.heads_with_null(["rev1", "rev4"], &NULL), set(["rev4"])); assert_eq!(g.heads_with_null(["rev2a", "rev4"], &NULL), set(["rev4"])); assert_eq!(g.heads_with_null(["rev2b", "rev4"], &NULL), set(["rev4"])); assert_eq!(g.heads_with_null(["rev3", "rev4"], &NULL), set(["rev4"])); } #[test] fn test_heads_two_heads() { let g = ancestry_1(); 
assert_eq!( g.heads_with_null(["rev2a", "rev2b"], &NULL), set(["rev2a", "rev2b"]) ); assert_eq!( g.heads_with_null(["rev3", "rev2b"], &NULL), set(["rev3", "rev2b"]) ); } #[test] fn test_heads_criss_cross() { let g = criss_cross(); assert_eq!(g.heads_with_null(["rev2a", "rev1"], &NULL), set(["rev2a"])); assert_eq!(g.heads_with_null(["rev2b", "rev1"], &NULL), set(["rev2b"])); assert_eq!(g.heads_with_null(["rev3a", "rev1"], &NULL), set(["rev3a"])); assert_eq!(g.heads_with_null(["rev3b", "rev1"], &NULL), set(["rev3b"])); assert_eq!( g.heads_with_null(["rev2a", "rev2b"], &NULL), set(["rev2a", "rev2b"]) ); assert_eq!(g.heads_with_null(["rev3a", "rev2a"], &NULL), set(["rev3a"])); assert_eq!(g.heads_with_null(["rev3a", "rev2b"], &NULL), set(["rev3a"])); assert_eq!( g.heads_with_null(["rev3a", "rev2a", "rev2b"], &NULL), set(["rev3a"]) ); assert_eq!(g.heads_with_null(["rev3b", "rev2a"], &NULL), set(["rev3b"])); assert_eq!(g.heads_with_null(["rev3b", "rev2b"], &NULL), set(["rev3b"])); assert_eq!( g.heads_with_null(["rev3b", "rev2a", "rev2b"], &NULL), set(["rev3b"]) ); assert_eq!( g.heads_with_null(["rev3a", "rev3b"], &NULL), set(["rev3a", "rev3b"]) ); assert_eq!( g.heads_with_null(["rev3a", "rev3b", "rev2a", "rev2b"], &NULL), set(["rev3a", "rev3b"]) ); } #[test] fn test_heads_shortcut() { let g = history_shortcut(); assert_eq!( g.heads_with_null(["rev2a", "rev2b", "rev2c"], &NULL), set(["rev2a", "rev2b", "rev2c"]) ); assert_eq!( g.heads_with_null(["rev3a", "rev3b"], &NULL), set(["rev3a", "rev3b"]) ); assert_eq!( g.heads_with_null(["rev2a", "rev3a", "rev3b"], &NULL), set(["rev3a", "rev3b"]) ); assert_eq!( g.heads_with_null(["rev2a", "rev3b"], &NULL), set(["rev2a", "rev3b"]) ); assert_eq!( g.heads_with_null(["rev2c", "rev3a"], &NULL), set(["rev2c", "rev3a"]) ); } #[test] fn test_graph_difference_ancestry_1() { let g = ancestry_1(); assert_eq!( g.find_difference("rev1", "rev1"), (FxHashSet::default(), FxHashSet::default()) ); assert_eq!( g.find_difference(NULL, "rev1"), 
(FxHashSet::default(), set(["rev1"])) ); assert_eq!( g.find_difference("rev1", NULL), (set(["rev1"]), FxHashSet::default()) ); assert_eq!( g.find_difference("rev3", "rev2b"), (set(["rev2a", "rev3"]), set(["rev2b"])) ); assert_eq!( g.find_difference("rev4", "rev2b"), (set(["rev4", "rev3", "rev2a"]), FxHashSet::default()) ); } #[test] fn test_graph_difference_separate_ancestry() { let g = ancestry_2(); assert_eq!( g.find_difference("rev1a", "rev1b"), (set(["rev1a"]), set(["rev1b"])) ); assert_eq!( g.find_difference("rev4a", "rev1b"), (set(["rev1a", "rev2a", "rev3a", "rev4a"]), set(["rev1b"])) ); } #[test] fn test_graph_difference_criss_cross() { let g = criss_cross(); assert_eq!( g.find_difference("rev3a", "rev3b"), (set(["rev3a"]), set(["rev3b"])) ); assert_eq!( g.find_difference("rev2a", "rev3b"), (FxHashSet::default(), set(["rev3b", "rev2b"])) ); } #[test] fn test_graph_difference_extended_history() { let g = extended_history_shortcut(); assert_eq!(g.find_difference("e", "f"), (set(["e"]), set(["f"]))); assert_eq!(g.find_difference("f", "e"), (set(["f"]), set(["e"]))); } #[test] fn test_graph_difference_double_shortcut() { let g = double_shortcut_fixture(); assert_eq!( g.find_difference("f", "g"), (set(["d", "f"]), set(["e", "g"])) ); } #[test] fn test_graph_difference_complex_shortcut() { let g = complex_shortcut(); assert_eq!( g.find_difference("m", "n"), (set(["m", "i", "e"]), set(["n", "h"])) ); } #[test] fn test_graph_difference_complex_shortcut2() { let g = complex_shortcut2(); assert_eq!(g.find_difference("t", "u"), (set(["t"]), set(["j", "u"]))); } #[test] fn test_graph_difference_shortcut_extra_root() { let g = shortcut_extra_root(); assert_eq!(g.find_difference("e", "f"), (set(["e"]), set(["f", "g"]))); } #[test] fn test_unique_ancestors_empty_set() { let g = ancestry_1(); assert_eq!( g.find_unique_ancestors("rev1", ["rev1"]), FxHashSet::default() ); assert_eq!( g.find_unique_ancestors("rev2b", ["rev2b"]), FxHashSet::default() ); assert_eq!( 
g.find_unique_ancestors("rev3", ["rev1", "rev3"]), FxHashSet::default() ); } #[test] fn test_unique_ancestors_single_node() { let g = ancestry_1(); assert_eq!(g.find_unique_ancestors("rev2a", ["rev1"]), set(["rev2a"])); assert_eq!(g.find_unique_ancestors("rev2b", ["rev1"]), set(["rev2b"])); assert_eq!(g.find_unique_ancestors("rev3", ["rev2a"]), set(["rev3"])); } #[test] fn test_unique_ancestors_in_ancestry() { let g = ancestry_1(); assert_eq!( g.find_unique_ancestors("rev1", ["rev3"]), FxHashSet::default() ); assert_eq!( g.find_unique_ancestors("rev2b", ["rev4"]), FxHashSet::default() ); } #[test] fn test_unique_ancestors_multiple_revisions() { let g = ancestry_1(); assert_eq!( g.find_unique_ancestors("rev4", ["rev3", "rev2b"]), set(["rev4"]) ); assert_eq!( g.find_unique_ancestors("rev4", ["rev2b"]), set(["rev2a", "rev3", "rev4"]) ); } #[test] fn test_unique_ancestors_complex_shortcut() { let g = complex_shortcut(); assert_eq!(g.find_unique_ancestors("n", ["m"]), set(["h", "n"])); assert_eq!(g.find_unique_ancestors("m", ["n"]), set(["e", "i", "m"])); } #[test] fn test_unique_ancestors_complex_shortcut2() { let g = complex_shortcut2(); assert_eq!(g.find_unique_ancestors("u", ["t"]), set(["j", "u"])); assert_eq!(g.find_unique_ancestors("t", ["u"]), set(["t"])); } #[test] fn test_unique_ancestors_multiple_interesting_unique() { let g = multiple_interesting_unique(); assert_eq!(g.find_unique_ancestors("y", ["z"]), set(["j", "y"])); assert_eq!(g.find_unique_ancestors("z", ["y"]), set(["p", "z"])); } #[test] fn test_is_ancestor_ancestry_1() { let g = ancestry_1(); assert!(g.is_ancestor(NULL, NULL, &NULL)); assert!(g.is_ancestor(NULL, "rev1", &NULL)); assert!(!g.is_ancestor("rev1", NULL, &NULL)); assert!(g.is_ancestor(NULL, "rev4", &NULL)); assert!(!g.is_ancestor("rev4", NULL, &NULL)); assert!(!g.is_ancestor("rev4", "rev2b", &NULL)); assert!(g.is_ancestor("rev2b", "rev4", &NULL)); assert!(!g.is_ancestor("rev2b", "rev3", &NULL)); assert!(!g.is_ancestor("rev3", "rev2b", 
            &NULL));
    }

    #[test]
    fn test_is_ancestor_boundary() {
        // Python's test_is_ancestor_boundary: verify a is not an ancestor
        // of c despite both sharing a common ancestor further down.
        let g = boundary();
        assert!(!g.is_ancestor("a", "c", &NULL));
    }

    #[test]
    fn test_is_between_ancestry_1() {
        let g = ancestry_1();
        assert!(g.is_between(NULL, Some(NULL), Some(NULL), &NULL));
        assert!(g.is_between("rev1", Some(NULL), Some("rev1"), &NULL));
        assert!(g.is_between("rev1", Some("rev1"), Some("rev4"), &NULL));
        assert!(g.is_between("rev4", Some("rev1"), Some("rev4"), &NULL));
        assert!(g.is_between("rev3", Some("rev1"), Some("rev4"), &NULL));
        assert!(!g.is_between("rev4", Some("rev1"), Some("rev3"), &NULL));
        assert!(!g.is_between("rev1", Some("rev2a"), Some("rev4"), &NULL));
        assert!(!g.is_between(NULL, Some("rev1"), Some("rev4"), &NULL));
    }

    #[test]
    fn test_find_merge_order_single_lca() {
        let g = ancestry_1();
        assert_eq!(g.find_merge_order("rev4", ["rev2b"]), vec!["rev2b"]);
    }

    fn with_ghost() -> Graph<&'static str, DictParentsProvider<&'static str>> {
        // NULL_REVISION itself is explicitly included as a root so it
        // survives as a key in iter_ancestry's output.
        make(&[
            ("a", &["b"]),
            ("c", &["b", "d"]),
            ("b", &["e"]),
            ("d", &["e", "g"]),
            ("e", &["f"]),
            ("f", &[NULL]),
            (NULL, &[]),
        ])
    }

    fn racing_shortcuts() -> Graph<&'static str, DictParentsProvider<&'static str>> {
        make(&[
            ("a", &[NULL]),
            ("b", &["a"]),
            ("c", &["b"]),
            ("d", &["c"]),
            ("e", &["d"]),
            ("f", &["e"]),
            ("g", &["f"]),
            ("h", &["g"]),
            ("i", &["h", "o"]),
            ("j", &["i", "y"]),
            ("k", &["d"]),
            ("l", &["k"]),
            ("m", &["l"]),
            ("n", &["m"]),
            ("o", &["n", "g"]),
            ("p", &["f"]),
            ("q", &["p", "m"]),
            ("r", &["o"]),
            ("s", &["r"]),
            ("t", &["s"]),
            ("u", &["t"]),
            ("v", &["u"]),
            ("w", &["v"]),
            ("x", &["w"]),
            ("y", &["x"]),
            ("z", &["x", "q"]),
        ])
    }

    /// Python's `alt_merge` fixture.
    ///
    /// ```text
    /// a
    /// |\
    /// b |
    /// | |
    /// c |
    ///  \|
    ///   d
    /// ```
    fn alt_merge() -> Graph<&'static str, DictParentsProvider<&'static str>> {
        make(&[("a", &[]), ("b", &["a"]), ("c", &["b"]), ("d", &["a", "c"])])
    }

    #[test]
    fn test_heads_alt_merge() {
        let g = alt_merge();
        assert_eq!(g.heads_with_null(["a", "c"], &NULL), set(["c"]));
    }

    #[test]
    fn test_heads_with_ghost_fixture() {
        let g = with_ghost();
        assert_eq!(g.heads_with_null(["e", "g"], &NULL), set(["e", "g"]));
        assert_eq!(g.heads_with_null(["a", "c"], &NULL), set(["a", "c"]));
        assert_eq!(g.heads_with_null(["a", "g"], &NULL), set(["a", "g"]));
        assert_eq!(g.heads_with_null(["f", "g"], &NULL), set(["f", "g"]));
        assert_eq!(g.heads_with_null(["c", "g"], &NULL), set(["c"]));
        assert_eq!(g.heads_with_null(["c", "b", "d", "g"], &NULL), set(["c"]));
        assert_eq!(
            g.heads_with_null(["a", "c", "e", "g"], &NULL),
            set(["a", "c"])
        );
        assert_eq!(g.heads_with_null(["a", "c", "f"], &NULL), set(["a", "c"]));
    }

    #[test]
    fn test_filter_candidate_lca() {
        // Corner case from Python:
        //      NULL
        //     /    \
        //    a      e
        //    |      |
        //    b      d
        //     \    /
        //       c
        // `a`'s descendant is `c`; `e`'s descendant is also `c`. So
        // heads([a, c, e]) should be just {c}.
        let g = make(&[
            ("c", &["b", "d"]),
            ("d", &["e"]),
            ("b", &["a"]),
            ("a", &[NULL]),
            ("e", &[NULL]),
        ]);
        assert_eq!(g.heads_with_null(["a", "c", "e"], &NULL), set(["c"]));
    }

    #[test]
    fn test_iter_topo_order_ancestry_1() {
        let g = ancestry_1();
        let order = g.iter_topo_order(["rev2a", "rev3", "rev1"]).unwrap();
        let pos = |k: &&str| order.iter().position(|n| n == k).unwrap();
        assert_eq!(
            order.iter().cloned().collect::<FxHashSet<_>>(),
            set(["rev1", "rev2a", "rev3"])
        );
        assert!(pos(&"rev2a") > pos(&"rev1"));
        assert!(pos(&"rev2a") < pos(&"rev3"));
    }

    #[test]
    fn test_iter_ancestry_boundary() {
        let g = with_ghost();
        // `a` is not in the ancestry of `c`; everything else is.
        let anc = g.iter_ancestry(["c"]);
        let keys: FxHashSet<&'static str> = anc.iter().map(|(k, _)| *k).collect();
        assert!(!keys.contains(&"a"));
        assert!(keys.contains(&"c"));
        assert!(keys.contains(&"b"));
        assert!(keys.contains(&"d"));
        assert!(keys.contains(&"e"));
        assert!(keys.contains(&"f"));
    }

    #[test]
    fn test_iter_ancestry_with_ghost_reports_none() {
        let g = with_ghost();
        // `g` is a ghost (present as parent of `d` but not as key).
        // iter_ancestry should yield it with Parents::Ghost.
        let anc = g.iter_ancestry(["a", "c"]);
        let mut ghost_seen = false;
        for (k, parents) in &anc {
            if *k == "g" {
                ghost_seen = true;
                assert!(matches!(parents, Parents::Ghost));
            }
        }
        assert!(ghost_seen, "ghost `g` should appear in iter_ancestry");
    }

    #[test]
    fn test_find_lefthand_merger_rev2b() {
        // In ancestry_1, rev4 merged rev2b (rev4 has parents [rev3, rev2b]).
        // Walking rev4's lefthand ancestry from rev2b: rev4 is the merger.
        let g = ancestry_1();
        assert_eq!(g.find_lefthand_merger("rev2b", "rev4"), Some("rev4"));
    }

    #[test]
    fn test_find_lefthand_merger_rev2a() {
        // rev2a is itself a lefthand ancestor of rev4 (via rev3), so it's
        // its own "merger".
        let g = ancestry_1();
        assert_eq!(g.find_lefthand_merger("rev2a", "rev4"), Some("rev2a"));
    }

    #[test]
    fn test_find_lefthand_merger_rev4_not_ancestor() {
        // rev4 is a descendant of rev2a, not an ancestor.
        let g = ancestry_1();
        assert_eq!(g.find_lefthand_merger("rev4", "rev2a"), None);
    }

    #[test]
    fn test_unique_lca_recursive_ancestry_1() {
        // In ancestry_1, rev1 is the unique LCA of rev2a and rev2b.
        let g = ancestry_1();
        let (key, steps) = g.find_unique_lca("rev2a", "rev2b", &NULL).unwrap();
        assert_eq!(key, "rev1");
        assert_eq!(steps, 1);
    }

    #[test]
    fn test_unique_lca_no_common_ancestor() {
        // Two disjoint ancestries share only NULL_REVISION as a common
        // ancestor. find_unique_lca returns NULL (never errors).
let g = ancestry_2(); let (key, _steps) = g.find_unique_lca("rev4a", "rev1b", &NULL).unwrap(); assert_eq!(key, NULL); } #[test] fn test_unique_ancestors_racing_shortcuts() { let g = racing_shortcuts(); assert_eq!(g.find_unique_ancestors("z", ["y"]), set(["p", "q", "z"])); assert_eq!( g.find_unique_ancestors("j", ["z"]), set(["h", "i", "j", "y"]) ); } #[test] fn test_find_distance_to_null_ancestry_1() { let g = ancestry_1(); assert_eq!( g.find_distance_to_null(NULL, std::iter::empty(), NULL) .unwrap(), 0 ); assert_eq!( g.find_distance_to_null("rev1", std::iter::empty(), NULL) .unwrap(), 1 ); assert_eq!( g.find_distance_to_null("rev2a", std::iter::empty(), NULL) .unwrap(), 2 ); assert_eq!( g.find_distance_to_null("rev2b", std::iter::empty(), NULL) .unwrap(), 2 ); assert_eq!( g.find_distance_to_null("rev3", std::iter::empty(), NULL) .unwrap(), 3 ); assert_eq!( g.find_distance_to_null("rev4", std::iter::empty(), NULL) .unwrap(), 4 ); } #[test] fn test_find_lefthand_distances_ghosts() { let g = make(&[("nonghost", &[NULL]), ("toghost", &["ghost"])]); let d = g.find_lefthand_distances(vec!["nonghost", "toghost"], NULL); assert_eq!(d.get(&"nonghost"), Some(&1)); // Ghosts are reported as distance -1. 
        assert_eq!(d.get(&"toghost"), Some(&-1));
    }

    #[test]
    fn test_find_lefthand_distances_smoke() {
        let g = make(&[
            ("rev1", &[NULL]),
            ("rev2a", &["rev1"]),
            ("rev2b", &["rev1"]),
            ("rev2c", &["rev1"]),
            ("rev3a", &["rev2a", "rev2b"]),
            ("rev3b", &["rev2b", "rev2c"]),
        ]);
        let d = g.find_lefthand_distances(vec!["rev3b", "rev2a"], NULL);
        assert_eq!(d.get(&"rev2a"), Some(&2));
        assert_eq!(d.get(&"rev3b"), Some(&3));
    }

    #[test]
    fn test_get_child_map_ancestry_1() {
        let g = ancestry_1();
        let cm = g.get_child_map(vec!["rev4", "rev3", "rev2a", "rev2b"]);
        assert_eq!(cm.get(&"rev1"), Some(&vec!["rev2a", "rev2b"]));
        assert_eq!(cm.get(&"rev2a"), Some(&vec!["rev3"]));
        assert_eq!(cm.get(&"rev2b"), Some(&vec!["rev4"]));
        assert_eq!(cm.get(&"rev3"), Some(&vec!["rev4"]));
    }
}
python-vcsgraph-0.2.0/crates/graph/src/known_graph.rs0000644000000000000000000011663515167007306017650 0ustar00
//! KnownGraph: graph algorithms that assume the full ancestry is already loaded.
//!
//! Ported from `vcsgraph/known_graph.py`.

use crate::tsort::MergeSorter;
use crate::{Error, RevnoVec};
use rustc_hash::{FxHashMap, FxHashSet};
use std::cmp::Reverse;
use std::collections::{BinaryHeap, HashMap, VecDeque};
use std::hash::Hash;

/// A key that may either be a real node or the synthetic "origin" sentinel
/// (equivalent to `NULL_REVISION` in the Python implementation).
///
/// Only used by [`KnownGraph::heads`], which has special semantics for the
/// origin: it is only considered a head when it is the sole candidate.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub enum Key<K> {
    Origin,
    Node(K),
}

#[derive(Debug, Clone)]
struct KnownGraphNode<K> {
    parent_keys: Option<Vec<K>>,
    child_keys: Vec<K>,
    gdfo: Option<u64>,
}

/// Produce a Vec of `items` ordered by hash of each element. Used as a stable
/// (within one process) cache key for unordered sets when `K: Ord` is not
/// required.
fn sort_by_hash<K: Hash, I: IntoIterator<Item = K>>(items: I) -> Vec<K> {
    use std::collections::hash_map::DefaultHasher;
    use std::hash::Hasher;
    let hash_of = |k: &K| {
        let mut h = DefaultHasher::new();
        k.hash(&mut h);
        h.finish()
    };
    let mut v: Vec<K> = items.into_iter().collect();
    v.sort_by_key(hash_of);
    v
}

impl<K> KnownGraphNode<K> {
    fn new(parent_keys: Option<Vec<K>>) -> Self {
        KnownGraphNode {
            parent_keys,
            child_keys: Vec::new(),
            gdfo: None,
        }
    }

    fn is_ghost(&self) -> bool {
        self.parent_keys.is_none()
    }
}

/// Information about a node in a merge-sorted graph.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct MergeSortNode<K> {
    pub key: K,
    pub merge_depth: usize,
    pub revno: RevnoVec,
    pub end_of_merge: bool,
}

/// A graph where the full ancestry is already known.
///
/// Supports gdfo-based queries like [`heads`](Self::heads), plus various
/// topological orderings.
#[derive(Debug, Clone)]
pub struct KnownGraph<K> {
    nodes: FxHashMap<K, KnownGraphNode<K>>,
    known_heads: FxHashMap<Vec<K>, FxHashSet<K>>,
    do_cache: bool,
}

impl<K: Clone + Eq + Hash> KnownGraph<K> {
    /// Build a new KnownGraph from a parent map.
    pub fn new<I>(parent_map: I, do_cache: bool) -> Self
    where
        I: IntoIterator<Item = (K, Vec<K>)>,
    {
        let iter = parent_map.into_iter();
        // Use the lower bound of size_hint to pre-allocate when the caller
        // passes a sized iterator (HashMap, Vec, etc). Ghosts will grow the
        // map a little beyond this.
        let cap = iter.size_hint().0;
        let mut g = KnownGraph {
            nodes: HashMap::with_capacity_and_hasher(cap, Default::default()),
            known_heads: FxHashMap::default(),
            do_cache,
        };
        g.initialize_nodes(iter);
        g.find_gdfo();
        g
    }

    fn initialize_nodes<I>(&mut self, parent_map: I)
    where
        I: IntoIterator<Item = (K, Vec<K>)>,
    {
        for (key, parent_keys) in parent_map {
            // Ensure all parent nodes exist and record the reverse edge.
            for parent_key in &parent_keys {
                self.nodes
                    .entry(parent_key.clone())
                    .or_insert_with(|| KnownGraphNode::new(None))
                    .child_keys
                    .push(key.clone());
            }
            // Insert or update the node itself.
            let node = self
                .nodes
                .entry(key)
                .or_insert_with(|| KnownGraphNode::new(None));
            node.parent_keys = Some(parent_keys);
        }
    }

    fn find_tails(&self) -> Vec<K> {
        // A "tail" has no parents — either a real root (Some(empty)) or a
        // ghost (None). Both kinds are treated as gdfo=1 starting points,
        // matching the Python `not node.parent_keys` check.
        self.nodes
            .iter()
            .filter_map(|(k, n)| match &n.parent_keys {
                Some(p) if p.is_empty() => Some(k.clone()),
                None => Some(k.clone()),
                _ => None,
            })
            .collect()
    }

    fn find_tips(&self) -> Vec<K> {
        self.nodes
            .iter()
            .filter_map(|(k, n)| {
                if n.child_keys.is_empty() {
                    Some(k.clone())
                } else {
                    None
                }
            })
            .collect()
    }

    fn find_gdfo(&mut self) {
        let mut known_parent_gdfos: FxHashMap<K, usize> = FxHashMap::default();
        let mut pending: Vec<K> = Vec::new();

        for key in self.find_tails() {
            self.nodes.get_mut(&key).unwrap().gdfo = Some(1);
            pending.push(key);
        }

        while let Some(node_key) = pending.pop() {
            let node_gdfo = self.nodes[&node_key].gdfo.unwrap();
            let child_keys = self.nodes[&node_key].child_keys.clone();
            for child_key in child_keys {
                let (known_gdfo, present) = match known_parent_gdfos.get(&child_key) {
                    Some(v) => (*v + 1, true),
                    None => (1, false),
                };
                let child = self.nodes.get_mut(&child_key).unwrap();
                let new_gdfo = node_gdfo + 1;
                if child.gdfo.is_none_or(|g| new_gdfo > g) {
                    child.gdfo = Some(new_gdfo);
                }
                let parent_len = child.parent_keys.as_ref().map(|p| p.len()).unwrap_or(0);
                if known_gdfo == parent_len {
                    pending.push(child_key.clone());
                    if present {
                        known_parent_gdfos.remove(&child_key);
                    }
                } else {
                    known_parent_gdfos.insert(child_key, known_gdfo);
                }
            }
        }
    }

    /// Return the parent keys for `key`. Returns `None` if `key` is a ghost,
    /// and an error-equivalent `None` lookup via `contains_key` otherwise.
    ///
    /// Matches the Python semantics: `None` means ghost, missing key would
    /// raise `KeyError` in Python — here the caller should check
    /// [`contains`](Self::contains) if disambiguation is needed.
    pub fn get_parent_keys(&self, key: &K) -> Option<&[K]> {
        self.nodes.get(key)?.parent_keys.as_deref()
    }

    /// Return the child keys for `key`. Returns an empty slice for tips.
    pub fn get_child_keys(&self, key: &K) -> Option<&[K]> {
        self.nodes.get(key).map(|n| n.child_keys.as_slice())
    }

    /// Return whether `key` is present in the graph at all (including ghosts).
    pub fn contains(&self, key: &K) -> bool {
        self.nodes.contains_key(key)
    }

    /// Return the number of nodes in the graph (including ghosts).
    pub fn len(&self) -> usize {
        self.nodes.len()
    }

    /// Return whether the graph is empty.
    pub fn is_empty(&self) -> bool {
        self.nodes.is_empty()
    }

    /// Iterate over all node keys in the graph (including ghosts).
    pub fn keys(&self) -> impl Iterator<Item = &K> {
        self.nodes.keys()
    }

    /// Return the gdfo (greatest distance from origin) of `key`, if known.
    pub fn gdfo(&self, key: &K) -> Option<u64> {
        self.nodes.get(key).and_then(|n| n.gdfo)
    }

    /// Add a new node to the graph, possibly filling in a ghost.
    pub fn add_node(&mut self, key: K, parent_keys: Vec<K>) -> Result<(), Error<K>> {
        // Validate against existing state, then ensure the node exists with
        // its parents recorded. We hold off on inserting parents into the
        // graph until after this match, so the borrow of `existing` ends.
        match self.nodes.get_mut(&key) {
            Some(existing) => match &existing.parent_keys {
                Some(existing_parents) if existing_parents == &parent_keys => return Ok(()),
                Some(existing_parents) => {
                    return Err(Error::ParentMismatch {
                        expected: existing_parents.clone(),
                        actual: parent_keys,
                        key,
                    });
                }
                None => {
                    // Filling in a ghost: the heads cache is no longer
                    // trustworthy.
                    existing.parent_keys = Some(parent_keys.clone());
                    self.known_heads.clear();
                }
            },
            None => {
                self.nodes
                    .insert(key.clone(), KnownGraphNode::new(Some(parent_keys.clone())));
            }
        }

        let mut parent_gdfo: u64 = 0;
        for parent_key in &parent_keys {
            let parent_node = self.nodes.entry(parent_key.clone()).or_insert_with(|| {
                let mut n = KnownGraphNode::new(None);
                // Ghosts and roots have gdfo 1.
                n.gdfo = Some(1);
                n
            });
            if let Some(g) = parent_node.gdfo {
                parent_gdfo = parent_gdfo.max(g);
            }
            parent_node.child_keys.push(key.clone());
        }
        self.nodes.get_mut(&key).unwrap().gdfo = Some(parent_gdfo + 1);

        // Propagate gdfo updates to descendants (BFS).
        let mut pending: VecDeque<K> = VecDeque::new();
        pending.push_back(key);
        while let Some(node_key) = pending.pop_front() {
            let next_gdfo = self.nodes[&node_key].gdfo.unwrap() + 1;
            let child_keys = self.nodes[&node_key].child_keys.clone();
            for child_key in child_keys {
                let child = self.nodes.get_mut(&child_key).unwrap();
                if child.gdfo.is_none_or(|g| g < next_gdfo) {
                    child.gdfo = Some(next_gdfo);
                    pending.push_back(child_key);
                }
            }
        }
        Ok(())
    }

    /// Return the heads from amongst `keys`.
    ///
    /// Any key reachable from another key is filtered out. This method is
    /// sentinel-free on the core; the caller handles origin semantics by
    /// wrapping `K` in [`Key`] and calling [`heads_with_origin`].
    ///
    /// All keys in `candidates` must be present in the graph (not ghosts).
    pub fn heads<I>(&mut self, candidates: I) -> FxHashSet<K>
    where
        I: IntoIterator<Item = K>,
    {
        let candidates: FxHashSet<K> = candidates.into_iter().collect();
        if candidates.len() < 2 {
            return candidates;
        }

        // Build a process-stable cache key by sorting candidates by their
        // hash. Hash collisions in the comparator just produce non-unique
        // orderings; the resulting Vec still uniquely identifies the input
        // set within a single process (different sets differ in length or
        // contents). We can't use BTreeSet here because K is not required
        // to be Ord.
        let heads_cache_key = sort_by_hash(candidates.iter().cloned());
        if let Some(cached) = self.known_heads.get(&heads_cache_key) {
            return cached.clone();
        }

        let mut seen: FxHashSet<K> = FxHashSet::default();
        let mut pending: Vec<K> = Vec::new();
        let mut min_gdfo: Option<u64> = None;

        for key in &candidates {
            let node = &self.nodes[key];
            if let Some(parents) = &node.parent_keys {
                pending.extend(parents.iter().cloned());
            }
            if let Some(g) = node.gdfo {
                min_gdfo = Some(min_gdfo.map_or(g, |m| m.min(g)));
            }
        }
        let min_gdfo = min_gdfo.unwrap_or(0);

        while let Some(node_key) = pending.pop() {
            if !seen.insert(node_key.clone()) {
                continue;
            }
            let node = &self.nodes[&node_key];
            if node.gdfo.is_some_and(|g| g <= min_gdfo) {
                continue;
            }
            if let Some(parents) = &node.parent_keys {
                pending.extend(parents.iter().cloned());
            }
        }

        let heads: FxHashSet<K> = candidates.difference(&seen).cloned().collect();
        if self.do_cache {
            self.known_heads.insert(heads_cache_key, heads.clone());
        }
        heads
    }

    /// Return the nodes of the graph in topological order (parents first).
    ///
    /// Errors with [`Error::Cycle`] if the graph is not fully connected via
    /// gdfo (i.e. contains a cycle).
    pub fn topo_sort(&self) -> Result<Vec<K>, Error<K>> {
        let unreachable: Vec<K> = self
            .nodes
            .iter()
            .filter(|(_, n)| n.gdfo.is_none())
            .map(|(k, _)| k.clone())
            .collect();
        if !unreachable.is_empty() {
            return Err(Error::Cycle(unreachable));
        }

        let mut pending = self.find_tails();
        let mut num_seen_parents: FxHashMap<K, usize> =
            self.nodes.keys().map(|k| (k.clone(), 0)).collect();
        let mut topo_order: Vec<K> = Vec::with_capacity(self.nodes.len());

        while let Some(node_key) = pending.pop() {
            let node = &self.nodes[&node_key];
            // Skip ghosts in the output (matches Python behavior).
            if !node.is_ghost() {
                topo_order.push(node_key.clone());
            }
            let child_keys = node.child_keys.clone();
            for child_key in child_keys {
                let child = &self.nodes[&child_key];
                let seen_parents = num_seen_parents[&child_key] + 1;
                let parent_len = child.parent_keys.as_ref().map(|p| p.len()).unwrap_or(0);
                if seen_parents == parent_len {
                    pending.push(child_key.clone());
                    num_seen_parents.remove(&child_key);
                } else {
                    num_seen_parents.insert(child_key, seen_parents);
                }
            }
        }
        Ok(topo_order)
    }

    /// Return a reverse topological ordering grouped by prefix.
    ///
    /// `prefix_of` maps each key to its prefix bucket. Within each bucket the
    /// ordering is lexicographic (by `K: Ord`), which mirrors Python's use of
    /// tuple/bytes ordering there. Ghost nodes are skipped in the output.
    pub fn gc_sort<P, PFX>(&self, mut prefix_of: P) -> Vec<K>
    where
        K: Ord,
        P: FnMut(&K) -> PFX,
        PFX: Ord + Hash,
    {
        let mut prefix_tips: FxHashMap<PFX, Vec<K>> = FxHashMap::default();
        for key in self.find_tips() {
            prefix_tips.entry(prefix_of(&key)).or_default().push(key);
        }

        let mut num_seen_children: FxHashMap<K, usize> =
            self.nodes.keys().map(|k| (k.clone(), 0)).collect();

        let mut prefix_list: Vec<(PFX, Vec<K>)> = prefix_tips.into_iter().collect();
        prefix_list.sort_by(|a, b| a.0.cmp(&b.0));

        let mut result: Vec<K> = Vec::with_capacity(self.nodes.len());
        for (_prefix, tips) in prefix_list {
            // A min-heap (via Reverse) keeps the next-smallest key at the top
            // in O(log n), instead of re-sorting the pending vector after
            // every parent insertion.
            let mut pending: BinaryHeap<Reverse<K>> = tips.into_iter().map(Reverse).collect();
            while let Some(Reverse(node_key)) = pending.pop() {
                let node = &self.nodes[&node_key];
                if node.is_ghost() {
                    continue;
                }
                let parent_keys = node.parent_keys.as_deref().unwrap_or(&[]);
                for parent_key in parent_keys {
                    let parent_node = &self.nodes[parent_key];
                    let seen_children = num_seen_children[parent_key] + 1;
                    if seen_children == parent_node.child_keys.len() {
                        pending.push(Reverse(parent_key.clone()));
                        num_seen_children.remove(parent_key);
                    } else {
                        num_seen_children.insert(parent_key.clone(), seen_children);
                    }
                }
                result.push(node_key);
            }
        }
        result
    }
}

impl<K: Clone + Eq + Hash + std::fmt::Debug> KnownGraph<K> {
    /// Merge-sort the graph starting from `tip_key`.
    ///
    /// Requires `K: Debug` because the underlying [`MergeSorter`] does.
    pub fn merge_sort(&self, tip_key: K) -> Result<Vec<MergeSortNode<K>>, Error<K>> {
        let as_parent_map: HashMap<K, Vec<K>> = self
            .nodes
            .iter()
            .filter_map(|(k, n)| n.parent_keys.as_ref().map(|p| (k.clone(), p.clone())))
            .collect();
        MergeSorter::new(as_parent_map, Some(tip_key), None, true)
            .map(|item| {
                item.map(|(_, key, merge_depth, revno, end_of_merge)| MergeSortNode {
                    key,
                    merge_depth,
                    revno: revno.unwrap_or_default(),
                    end_of_merge,
                })
            })
            .collect()
    }
}

impl<K: Clone + Eq + Hash> KnownGraph<Key<K>> {
    /// `heads()` variant that implements the Python `NULL_REVISION` filter:
    /// [`Key::Origin`] is only a head if it is the sole candidate.
    pub fn heads_with_origin<I>(&mut self, candidates: I) -> FxHashSet<Key<K>>
    where
        I: IntoIterator<Item = Key<K>>,
    {
        let mut candidates: FxHashSet<Key<K>> = candidates.into_iter().collect();
        if candidates.contains(&Key::Origin) {
            candidates.remove(&Key::Origin);
            if candidates.is_empty() {
                let mut r = FxHashSet::default();
                r.insert(Key::Origin);
                return r;
            }
        }
        self.heads(candidates)
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    fn make(edges: &[(&'static str, &[&'static str])]) -> KnownGraph<&'static str> {
        let pm = edges.iter().map(|(k, ps)| (*k, ps.to_vec()));
        KnownGraph::new(pm, true)
    }

    #[test]
    fn gdfo_linear() {
        // a -> b -> c
        let g = make(&[("a", &[]), ("b", &["a"]), ("c", &["b"])]);
        assert_eq!(g.gdfo(&"a"), Some(1));
        assert_eq!(g.gdfo(&"b"), Some(2));
        assert_eq!(g.gdfo(&"c"), Some(3));
    }

    #[test]
    fn gdfo_diamond() {
        //   a
        //  / \
        // b   c
        //  \ /
        //   d
        let g = make(&[("a", &[]), ("b", &["a"]), ("c", &["a"]), ("d", &["b", "c"])]);
        assert_eq!(g.gdfo(&"a"), Some(1));
        assert_eq!(g.gdfo(&"b"), Some(2));
        assert_eq!(g.gdfo(&"c"), Some(2));
        assert_eq!(g.gdfo(&"d"), Some(3));
    }

    #[test]
    fn heads_trivial() {
        let mut g = make(&[("a", &[]), ("b", &["a"])]);
        let h = g.heads(vec!["a", "b"]);
        let expected: FxHashSet<_> = ["b"].iter().copied().collect();
        assert_eq!(h, expected);
    }

    #[test]
    fn heads_diamond() {
        let mut g = make(&[("a", &[]), ("b", &["a"]), ("c", &["a"]), ("d", &["b", "c"])]);
        let h = g.heads(vec!["b", "c"]);
        let expected: FxHashSet<_> = ["b", "c"].iter().copied().collect();
        assert_eq!(h, expected);

        let h2 = g.heads(vec!["a", "d"]);
        let expected2: FxHashSet<_> = ["d"].iter().copied().collect();
        assert_eq!(h2, expected2);
    }

    #[test]
    fn heads_with_origin_only() {
        let mut g: KnownGraph<Key<&'static str>> =
            KnownGraph::new(vec![(Key::Node("a"), vec![Key::Origin])], true);
        let h = g.heads_with_origin(vec![Key::Origin]);
        assert_eq!(h.len(), 1);
        assert!(h.contains(&Key::Origin));
    }

    #[test]
    fn heads_with_origin_ignored() {
        let mut g: KnownGraph<Key<&'static str>> =
            KnownGraph::new(vec![(Key::Node("a"), vec![Key::Origin])], true);
        let h =
            g.heads_with_origin(vec![Key::Origin, Key::Node("a")]);
        let expected: FxHashSet<_> = [Key::Node("a")].iter().cloned().collect();
        assert_eq!(h, expected);
    }

    #[test]
    fn topo_sort_basic() {
        let g = make(&[("a", &[]), ("b", &["a"]), ("c", &["a"]), ("d", &["b", "c"])]);
        let order = g.topo_sort().unwrap();
        // a must come before b, c; b, c must come before d.
        let pos = |x: &&str| order.iter().position(|n| n == x).unwrap();
        assert!(pos(&"a") < pos(&"b"));
        assert!(pos(&"a") < pos(&"c"));
        assert!(pos(&"b") < pos(&"d"));
        assert!(pos(&"c") < pos(&"d"));
    }

    #[test]
    fn add_node_fills_ghost() {
        // Start with b having ghost parent a.
        let mut g = make(&[("b", &["a"])]);
        // a is a ghost: present with None parents.
        assert!(g.get_parent_keys(&"a").is_none());
        g.add_node("a", vec![]).unwrap();
        assert_eq!(g.get_parent_keys(&"a"), Some(&[][..]));
        assert_eq!(g.gdfo(&"a"), Some(1));
        assert_eq!(g.gdfo(&"b"), Some(2));
    }

    #[test]
    fn add_node_duplicate_ok() {
        let mut g = make(&[("a", &[]), ("b", &["a"])]);
        g.add_node("b", vec!["a"]).unwrap();
    }

    #[test]
    fn add_node_mismatch_errors() {
        let mut g = make(&[("a", &[]), ("b", &["a"])]);
        let r = g.add_node("b", vec![]);
        assert!(matches!(r, Err(Error::ParentMismatch { .. })));
    }

    #[test]
    fn merge_sort_simple() {
        // a -> b -> c, linear
        let g = make(&[("a", &[]), ("b", &["a"]), ("c", &["b"])]);
        let ms = g.merge_sort("c").unwrap();
        let keys: Vec<_> = ms.iter().map(|n| n.key).collect();
        assert_eq!(keys, vec!["c", "b", "a"]);
    }

    /// Shared fixtures mirrored from `vcsgraph/tests/test_graph.py`.
    const NULL: &str = "null:";

    fn ancestry_1() -> KnownGraph<&'static str> {
        make(&[
            ("rev1", &[NULL]),
            ("rev2a", &["rev1"]),
            ("rev2b", &["rev1"]),
            ("rev3", &["rev2a"]),
            ("rev4", &["rev3", "rev2b"]),
        ])
    }

    fn feature_branch() -> KnownGraph<&'static str> {
        make(&[
            ("rev1", &[NULL]),
            ("rev2b", &["rev1"]),
            ("rev3b", &["rev2b"]),
        ])
    }

    fn extended_history_shortcut() -> KnownGraph<&'static str> {
        make(&[
            ("a", &[NULL]),
            ("b", &["a"]),
            ("c", &["b"]),
            ("d", &["c"]),
            ("e", &["d"]),
            ("f", &["a", "d"]),
        ])
    }

    fn with_ghost() -> KnownGraph<&'static str> {
        // A graph with a ghost at `g`.
        make(&[
            ("a", &["b"]),
            ("c", &["b", "d"]),
            ("b", &["e"]),
            ("d", &["e", "g"]),
            ("e", &["f"]),
            ("f", &[NULL]),
            (NULL, &[]),
        ])
    }

    fn criss_cross() -> KnownGraph<&'static str> {
        make(&[
            ("rev1", &[NULL]),
            ("rev2a", &["rev1"]),
            ("rev2b", &["rev1"]),
            ("rev3a", &["rev2a", "rev2b"]),
            ("rev3b", &["rev2b", "rev2a"]),
        ])
    }

    fn history_shortcut() -> KnownGraph<&'static str> {
        make(&[
            ("rev1", &[NULL]),
            ("rev2a", &["rev1"]),
            ("rev2b", &["rev1"]),
            ("rev2c", &["rev1"]),
            ("rev3a", &["rev2a", "rev2b"]),
            ("rev3b", &["rev2b", "rev2c"]),
        ])
    }

    /// Equivalent of Python's `alt_merge` fixture.
    fn alt_merge() -> KnownGraph<&'static str> {
        make(&[("a", &[]), ("b", &["a"]), ("c", &["b"]), ("d", &["a", "c"])])
    }

    fn set<const N: usize>(xs: [&'static str; N]) -> FxHashSet<&'static str> {
        xs.into_iter().collect()
    }

    #[test]
    fn test_children_ancestry1() {
        let g = ancestry_1();
        assert_eq!(g.get_child_keys(&NULL), Some(&["rev1"][..]));
        let mut rev1_children: Vec<_> = g.get_child_keys(&"rev1").unwrap().to_vec();
        rev1_children.sort();
        assert_eq!(rev1_children, vec!["rev2a", "rev2b"]);
        assert_eq!(g.get_child_keys(&"rev2a"), Some(&["rev3"][..]));
        assert_eq!(g.get_child_keys(&"rev3"), Some(&["rev4"][..]));
        assert_eq!(g.get_child_keys(&"rev2b"), Some(&["rev4"][..]));
        assert_eq!(g.get_child_keys(&"not_in_graph"), None);
    }

    #[test]
    fn test_parent_ancestry1() {
        let g = ancestry_1();
        assert_eq!(g.get_parent_keys(&"rev1"), Some(&[NULL][..]));
        assert_eq!(g.get_parent_keys(&"rev2a"), Some(&["rev1"][..]));
        assert_eq!(g.get_parent_keys(&"rev2b"), Some(&["rev1"][..]));
        assert_eq!(g.get_parent_keys(&"rev3"), Some(&["rev2a"][..]));
        let mut rev4_parents: Vec<_> = g.get_parent_keys(&"rev4").unwrap().to_vec();
        rev4_parents.sort();
        assert_eq!(rev4_parents, vec!["rev2b", "rev3"]);
    }

    #[test]
    fn test_parent_with_ghost() {
        // In `with_ghost`, "g" is a ghost: present as a parent of `d` but
        // has no parent_keys of its own.
let g = with_ghost(); assert_eq!(g.get_parent_keys(&"g"), None); } #[test] fn test_gdfo_ancestry_1() { let g = ancestry_1(); assert_eq!(g.gdfo(&"rev1"), Some(2)); assert_eq!(g.gdfo(&"rev2a"), Some(3)); assert_eq!(g.gdfo(&"rev2b"), Some(3)); assert_eq!(g.gdfo(&"rev3"), Some(4)); assert_eq!(g.gdfo(&"rev4"), Some(5)); } #[test] fn test_gdfo_feature_branch() { let g = feature_branch(); assert_eq!(g.gdfo(&"rev1"), Some(2)); assert_eq!(g.gdfo(&"rev2b"), Some(3)); assert_eq!(g.gdfo(&"rev3b"), Some(4)); } #[test] fn test_gdfo_extended_history_shortcut() { let g = extended_history_shortcut(); assert_eq!(g.gdfo(&"a"), Some(2)); assert_eq!(g.gdfo(&"b"), Some(3)); assert_eq!(g.gdfo(&"c"), Some(4)); assert_eq!(g.gdfo(&"d"), Some(5)); assert_eq!(g.gdfo(&"e"), Some(6)); assert_eq!(g.gdfo(&"f"), Some(6)); } #[test] fn test_gdfo_with_ghost() { let g = with_ghost(); assert_eq!(g.gdfo(&"f"), Some(2)); assert_eq!(g.gdfo(&"e"), Some(3)); assert_eq!(g.gdfo(&"g"), Some(1)); assert_eq!(g.gdfo(&"b"), Some(4)); assert_eq!(g.gdfo(&"d"), Some(4)); assert_eq!(g.gdfo(&"a"), Some(5)); assert_eq!(g.gdfo(&"c"), Some(5)); } #[test] fn test_add_existing_node_noop() { let mut g = ancestry_1(); assert_eq!(g.gdfo(&"rev4"), Some(5)); g.add_node("rev4", vec!["rev3", "rev2b"]).unwrap(); assert_eq!(g.gdfo(&"rev4"), Some(5)); } #[test] fn test_add_existing_node_mismatched_parents() { let mut g = ancestry_1(); let r = g.add_node("rev4", vec!["rev2b", "rev3"]); assert!(matches!(r, Err(Error::ParentMismatch { .. 
}))); } #[test] fn test_add_node_with_ghost_parent() { let mut g = ancestry_1(); g.add_node("rev5", vec!["rev2b", "revGhost"]).unwrap(); assert_eq!(g.gdfo(&"rev5"), Some(4)); assert_eq!(g.gdfo(&"revGhost"), Some(1)); } #[test] fn test_add_new_root() { let mut g = ancestry_1(); g.add_node("rev5", vec![]).unwrap(); assert_eq!(g.gdfo(&"rev5"), Some(1)); } #[test] fn test_add_with_all_ghost_parents() { let mut g = ancestry_1(); g.add_node("rev5", vec!["ghost"]).unwrap(); assert_eq!(g.gdfo(&"rev5"), Some(2)); assert_eq!(g.gdfo(&"ghost"), Some(1)); } #[test] fn test_gdfo_after_add_node() { let mut g = ancestry_1(); assert_eq!(g.get_child_keys(&"rev4"), Some(&[][..])); g.add_node("rev5", vec!["rev4"]).unwrap(); assert_eq!(g.get_parent_keys(&"rev5"), Some(&["rev4"][..])); assert_eq!(g.get_child_keys(&"rev4"), Some(&["rev5"][..])); assert_eq!(g.get_child_keys(&"rev5"), Some(&[][..])); assert_eq!(g.gdfo(&"rev5"), Some(6)); g.add_node("rev6", vec!["rev2b"]).unwrap(); g.add_node("rev7", vec!["rev6"]).unwrap(); g.add_node("rev8", vec!["rev7", "rev5"]).unwrap(); assert_eq!(g.gdfo(&"rev5"), Some(6)); assert_eq!(g.gdfo(&"rev6"), Some(4)); assert_eq!(g.gdfo(&"rev7"), Some(5)); assert_eq!(g.gdfo(&"rev8"), Some(7)); } #[test] fn test_fill_in_ghost() { // Add a few new roots, then fill in the ghost `g` so the // children's gdfos get renumbered. let mut g = with_ghost(); g.add_node("x", vec![]).unwrap(); g.add_node("y", vec!["x"]).unwrap(); g.add_node("z", vec!["y"]).unwrap(); g.add_node("g", vec!["z"]).unwrap(); assert_eq!(g.gdfo(&"f"), Some(2)); assert_eq!(g.gdfo(&"e"), Some(3)); assert_eq!(g.gdfo(&"x"), Some(1)); assert_eq!(g.gdfo(&"y"), Some(2)); assert_eq!(g.gdfo(&"z"), Some(3)); assert_eq!(g.gdfo(&"g"), Some(4)); assert_eq!(g.gdfo(&"b"), Some(4)); assert_eq!(g.gdfo(&"d"), Some(5)); assert_eq!(g.gdfo(&"a"), Some(5)); assert_eq!(g.gdfo(&"c"), Some(6)); } /// Rust-side `heads()` is sentinel-free; callers are expected to filter /// NULL themselves. 
These tests use the core method directly (not /// `heads_with_origin`) and only test non-null cases — NULL-filtering /// semantics are covered by existing `heads_with_origin_*` tests. #[test] fn test_heads_one_non_null() { let mut g = ancestry_1(); for key in ["rev1", "rev2a", "rev2b", "rev3", "rev4"] { assert_eq!(g.heads(vec![key]), set([key])); } } #[test] fn test_heads_single_from_ancestry_1() { let mut g = ancestry_1(); assert_eq!(g.heads(vec!["rev1", "rev2a"]), set(["rev2a"])); assert_eq!(g.heads(vec!["rev1", "rev2b"]), set(["rev2b"])); assert_eq!(g.heads(vec!["rev1", "rev3"]), set(["rev3"])); assert_eq!(g.heads(vec!["rev3", "rev2a"]), set(["rev3"])); assert_eq!(g.heads(vec!["rev1", "rev4"]), set(["rev4"])); assert_eq!(g.heads(vec!["rev2a", "rev4"]), set(["rev4"])); assert_eq!(g.heads(vec!["rev2b", "rev4"]), set(["rev4"])); assert_eq!(g.heads(vec!["rev3", "rev4"]), set(["rev4"])); } #[test] fn test_heads_two_heads_from_ancestry_1() { let mut g = ancestry_1(); assert_eq!(g.heads(vec!["rev2a", "rev2b"]), set(["rev2a", "rev2b"])); assert_eq!(g.heads(vec!["rev3", "rev2b"]), set(["rev3", "rev2b"])); } #[test] fn test_heads_criss_cross_fixture() { let mut g = criss_cross(); assert_eq!(g.heads(vec!["rev2a", "rev1"]), set(["rev2a"])); assert_eq!(g.heads(vec!["rev2b", "rev1"]), set(["rev2b"])); assert_eq!(g.heads(vec!["rev3a", "rev1"]), set(["rev3a"])); assert_eq!(g.heads(vec!["rev3b", "rev1"]), set(["rev3b"])); assert_eq!(g.heads(vec!["rev2a", "rev2b"]), set(["rev2a", "rev2b"])); assert_eq!(g.heads(vec!["rev3a", "rev2a"]), set(["rev3a"])); assert_eq!(g.heads(vec!["rev3a", "rev2b"]), set(["rev3a"])); assert_eq!(g.heads(vec!["rev3a", "rev2a", "rev2b"]), set(["rev3a"])); assert_eq!(g.heads(vec!["rev3b", "rev2a"]), set(["rev3b"])); assert_eq!(g.heads(vec!["rev3b", "rev2b"]), set(["rev3b"])); assert_eq!(g.heads(vec!["rev3b", "rev2a", "rev2b"]), set(["rev3b"])); assert_eq!(g.heads(vec!["rev3a", "rev3b"]), set(["rev3a", "rev3b"])); assert_eq!( g.heads(vec!["rev3a", "rev3b", 
"rev2a", "rev2b"]), set(["rev3a", "rev3b"]) ); } #[test] fn test_heads_history_shortcut_fixture() { let mut g = history_shortcut(); assert_eq!( g.heads(vec!["rev2a", "rev2b", "rev2c"]), set(["rev2a", "rev2b", "rev2c"]) ); assert_eq!(g.heads(vec!["rev3a", "rev3b"]), set(["rev3a", "rev3b"])); assert_eq!( g.heads(vec!["rev2a", "rev3a", "rev3b"]), set(["rev3a", "rev3b"]) ); assert_eq!(g.heads(vec!["rev2a", "rev3b"]), set(["rev2a", "rev3b"])); assert_eq!(g.heads(vec!["rev2c", "rev3a"]), set(["rev2c", "rev3a"])); } #[test] fn test_heads_alt_merge() { let mut g = alt_merge(); assert_eq!(g.heads(vec!["a", "c"]), set(["c"])); } #[test] fn test_heads_with_ghost_fixture() { let mut g = with_ghost(); assert_eq!(g.heads(vec!["e", "g"]), set(["e", "g"])); assert_eq!(g.heads(vec!["a", "c"]), set(["a", "c"])); assert_eq!(g.heads(vec!["a", "g"]), set(["a", "g"])); assert_eq!(g.heads(vec!["f", "g"]), set(["f", "g"])); assert_eq!(g.heads(vec!["c", "g"]), set(["c"])); assert_eq!(g.heads(vec!["c", "b", "d", "g"]), set(["c"])); assert_eq!(g.heads(vec!["a", "c", "e", "g"]), set(["a", "c"])); assert_eq!(g.heads(vec!["a", "c", "f"]), set(["a", "c"])); } #[test] fn test_filling_in_ghosts_resets_head_cache() { let mut g = with_ghost(); assert_eq!(g.heads(vec!["e", "g"]), set(["e", "g"])); // Fill in the ghost so that `g` descends from `e`; the heads // cache must be invalidated, otherwise the second query would // return the stale result. g.add_node("g", vec!["e"]).unwrap(); assert_eq!(g.heads(vec!["e", "g"]), set(["g"])); } /// Helper: assert that `topo_sort` yields a valid topological order /// for the given parent map. 
    fn assert_topo_sort_order(edges: &[(&'static str, &[&'static str])]) {
        let pm: FxHashMap<&'static str, Vec<&'static str>> =
            edges.iter().map(|(k, ps)| (*k, ps.to_vec())).collect();
        let g = KnownGraph::new(pm.clone(), true);
        let result = g.topo_sort().unwrap();
        assert_eq!(result.len(), pm.len());
        let idx: FxHashMap<&str, usize> =
            result.iter().enumerate().map(|(i, k)| (*k, i)).collect();
        for (node, parents) in &pm {
            for parent in parents {
                if !pm.contains_key(parent) {
                    continue; // ghost
                }
                assert!(
                    idx[node] > idx[parent],
                    "parent {parent} must come before child {node}: {:?}",
                    result
                );
            }
        }
    }

    #[test]
    fn test_topo_sort_empty() {
        assert_topo_sort_order(&[]);
    }

    #[test]
    fn test_topo_sort_easy() {
        assert_topo_sort_order(&[("a", &[])]);
    }

    #[test]
    fn test_topo_sort_cycle_simple() {
        let pm = [("a", vec!["b"]), ("b", vec!["a"])];
        let g = KnownGraph::new(pm, true);
        assert!(matches!(g.topo_sort(), Err(Error::Cycle(_))));
    }

    #[test]
    fn test_topo_sort_cycle_long() {
        let pm = [("a", vec!["b"]), ("b", vec!["c"]), ("c", vec!["a"])];
        let g = KnownGraph::new(pm, true);
        assert!(matches!(g.topo_sort(), Err(Error::Cycle(_))));
    }

    #[test]
    fn test_topo_sort_cycle_with_tail() {
        let pm = [
            ("a", vec!["b"]),
            ("b", vec!["c"]),
            ("c", vec!["d", "e"]),
            ("d", vec!["a"]),
            ("e", vec![]),
        ];
        let g = KnownGraph::new(pm, true);
        assert!(matches!(g.topo_sort(), Err(Error::Cycle(_))));
    }

    #[test]
    fn test_topo_sort_nontrivial() {
        assert_topo_sort_order(&[
            ("a", &["d"]),
            ("b", &["e"]),
            ("c", &["b", "e"]),
            ("d", &[]),
            ("e", &["a", "d"]),
        ]);
    }

    #[test]
    fn test_topo_sort_partial() {
        assert_topo_sort_order(&[
            ("a", &[]),
            ("b", &["a"]),
            ("c", &["a"]),
            ("d", &["a"]),
            ("e", &["b", "c", "d"]),
            ("f", &["b", "c"]),
            ("g", &["b", "c"]),
            ("h", &["c", "d"]),
            ("i", &["a", "b", "e", "f", "g"]),
        ]);
    }

    #[test]
    fn test_topo_sort_ghost_parent() {
        // `c` is a ghost: `b` names it as a parent, but it has no entry
        // of its own in the map. The output must place `a` after `b`;
        // the helper skips the ghost edge, so `b` is treated as a tail
        // with no known parents.
        assert_topo_sort_order(&[("a", &["b"]), ("b", &["c"])]);
    }

    /// Merge-sort assertion helper: compares against Python-shaped
    /// (key, merge_depth, revno, end_of_merge) tuples.
    fn assert_merge_sort(
        edges: &[(&'static str, &[&'static str])],
        tip: &'static str,
        expected: &[(&'static str, usize, &[usize], bool)],
    ) {
        let pm: FxHashMap<&'static str, Vec<&'static str>> =
            edges.iter().map(|(k, ps)| (*k, ps.to_vec())).collect();
        let g = KnownGraph::new(pm, true);
        let result = g.merge_sort(tip).unwrap();
        assert_eq!(
            result.len(),
            expected.len(),
            "length mismatch: got {:?}",
            result
                .iter()
                .map(|n| (n.key, n.merge_depth, n.revno.clone(), n.end_of_merge))
                .collect::<Vec<_>>()
        );
        for (i, ((got_key, got_depth, got_eom), (exp_key, exp_depth, exp_revno, exp_eom))) in result
            .iter()
            .map(|n| (n.key, n.merge_depth, n.end_of_merge))
            .zip(expected.iter())
            .enumerate()
        {
            let got_revno: Vec<usize> = result[i].revno.clone().into_iter().collect();
            let exp_revno_v: Vec<usize> = exp_revno.to_vec();
            assert_eq!(
                (got_key, got_depth, got_revno.clone(), got_eom),
                (*exp_key, *exp_depth, exp_revno_v.clone(), *exp_eom),
                "row {i} mismatch"
            );
        }
    }

    #[test]
    fn test_merge_sort_one_revision() {
        assert_merge_sort(&[("id", &[])], "id", &[("id", 0, &[1], true)]);
    }

    #[test]
    fn test_merge_sort_sequence_no_merges() {
        assert_merge_sort(
            &[("A", &[]), ("B", &["A"]), ("C", &["B"])],
            "C",
            &[
                ("C", 0, &[3], false),
                ("B", 0, &[2], false),
                ("A", 0, &[1], true),
            ],
        );
    }

    #[test]
    fn test_merge_sort_sequence_with_merges() {
        assert_merge_sort(
            &[("A", &[]), ("B", &["A"]), ("C", &["A", "B"])],
            "C",
            &[
                ("C", 0, &[2], false),
                ("B", 1, &[1, 1, 1], true),
                ("A", 0, &[1], true),
            ],
        );
    }

    #[test]
    fn test_merge_sort_merge_depth_with_nested_merges() {
        assert_merge_sort(
            &[
                ("A", &["D", "B"]),
                ("B", &["C", "F"]),
                ("C", &["H"]),
                ("D", &["H", "E"]),
                ("E", &["G", "F"]),
                ("F", &["G"]),
                ("G", &["H"]),
                ("H", &[]),
            ],
            "A",
&[ ("A", 0, &[3], false), ("B", 1, &[1, 3, 2], false), ("C", 1, &[1, 3, 1], true), ("D", 0, &[2], false), ("E", 1, &[1, 1, 2], false), ("F", 2, &[1, 2, 1], true), ("G", 1, &[1, 1, 1], true), ("H", 0, &[1], true), ], ); } #[test] fn test_merge_sort_end_of_merge_not_last() { assert_merge_sort( &[("A", &["B"]), ("B", &[])], "A", &[("A", 0, &[2], false), ("B", 0, &[1], true)], ); } #[test] fn test_merge_sort_parallel_roots() { assert_merge_sort( &[("A", &[]), ("B", &[]), ("C", &["A", "B"])], "C", &[ ("C", 0, &[2], false), ("B", 1, &[0, 1, 1], true), ("A", 0, &[1], true), ], ); } #[test] fn test_merge_sort_cycle_errors() { // E <- D <- C <- B, B <- D creates a cycle B-C-D-B let pm = [ ("A", vec![] as Vec<&'static str>), ("B", vec!["D"]), ("C", vec!["B"]), ("D", vec!["C"]), ("E", vec!["D"]), ]; let g = KnownGraph::new(pm, true); let r = g.merge_sort("E"); assert!(matches!(r, Err(Error::Cycle(_)))); } } python-vcsgraph-0.2.0/crates/graph/src/lib.rs0000644000000000000000000004215015167007306016067 0ustar00#![allow(clippy::if_same_then_else)] /// DIAGRAM of terminology /// A /// /\ /// B C /// | |\ /// D E F /// |\/| | /// |/\|/ /// G H /// /// In this diagram, relative to G and H: /// A, B, C, D, E are common ancestors. /// C, D and E are border ancestors, because each has a non-common descendant. /// D and E are least common ancestors because none of their descendants are /// common ancestors. /// C is not a least common ancestor because its descendant, E, is a common /// ancestor. /// /// The find_unique_lca algorithm will pick A in two steps: /// 1. find_lca('G', 'H') => ['D', 'E'] /// 2. 
///    Since len(['D', 'E']) > 1, find_lca('D', 'E') => ['A']
use std::collections::{HashMap, HashSet};
use std::hash::Hash;

pub mod bfs;
pub use bfs::BfsState;

pub mod graph;
pub use graph::{Graph, GraphError};

pub mod known_graph;
pub use known_graph::{Key, KnownGraph, MergeSortNode};

mod parents_provider;
pub use parents_provider::{
    CachingParentsProvider, DictParentsProvider, ParentsProvider, StackedParentsProvider,
};

#[derive(Clone, Debug, PartialEq, Eq)]
pub enum Parents<K: PartialEq + Eq + Clone> {
    Ghost,
    Known(Vec<K>),
}

impl<K: PartialEq + Eq + Clone> Parents<K> {
    pub fn is_ghost(&self) -> bool {
        match self {
            Parents::Ghost => true,
            Parents::Known(_) => false,
        }
    }

    pub fn is_known(&self) -> bool {
        match self {
            Parents::Ghost => false,
            Parents::Known(_) => true,
        }
    }

    pub fn unwrap(&self) -> Vec<K> {
        match self {
            Parents::Ghost => panic!("unwrap called on Ghost"),
            Parents::Known(v) => v.clone(),
        }
    }

    /// Borrow the known parents as a slice without cloning.
    ///
    /// Panics if this is a `Ghost`.
    pub fn as_slice(&self) -> &[K] {
        match self {
            Parents::Ghost => panic!("as_slice called on Ghost"),
            Parents::Known(v) => v.as_slice(),
        }
    }

    pub fn as_ref(&self) -> Parents<&K> {
        match self {
            Parents::Ghost => Parents::Ghost,
            Parents::Known(v) => Parents::Known(v.iter().collect()),
        }
    }
}

#[cfg(feature = "pyo3")]
impl<'py, K: pyo3::IntoPyObject<'py> + Clone + PartialEq + Eq> pyo3::IntoPyObject<'py>
    for Parents<K>
{
    type Target = pyo3::types::PyAny;
    type Output = pyo3::Bound<'py, Self::Target>;
    type Error = pyo3::PyErr;

    fn into_pyobject(self, py: pyo3::Python<'py>) -> Result<Self::Output, Self::Error> {
        match self {
            Parents::Ghost => Ok(py.None().into_pyobject(py)?),
            Parents::Known(v) => Ok(v.into_pyobject(py)?.into_any()),
        }
    }
}

#[cfg(feature = "pyo3")]
impl<'py, K: pyo3::conversion::FromPyObjectOwned<'py> + Clone + PartialEq + Eq>
    pyo3::FromPyObject<'_, 'py> for Parents<K>
{
    type Error = pyo3::PyErr;

    fn extract(obj: pyo3::Borrowed<'_, 'py, pyo3::PyAny>) -> Result<Self, Self::Error> {
        use pyo3::prelude::*;
        if obj.is_none() {
            Ok(Parents::Ghost)
        } else {
            let v = obj.extract::<Vec<K>>()?;
Ok(Parents::Known(v)) } } } #[derive(Clone, Debug, PartialEq, Eq)] pub struct ParentMap(HashMap>); impl ParentMap { pub fn new() -> Self { ParentMap(HashMap::new()) } #[inline] pub fn insert(&mut self, k: K, v: Parents) { self.0.insert(k, v); } #[inline] pub fn get(&self, k: &K) -> Option<&Parents> { self.0.get(k) } #[inline] pub fn get_key_value(&self, k: &K) -> Option<(&K, &Parents)> { self.0.get_key_value(k) } #[inline] pub fn iter(&self) -> impl Iterator)> { self.0.iter() } #[inline] pub fn contains_key(&self, k: &K) -> bool { self.0.contains_key(k) } #[inline] pub fn keys(&self) -> impl Iterator { self.0.keys() } #[inline] pub fn values(&self) -> impl Iterator> { self.0.values() } #[inline] pub fn len(&self) -> usize { self.0.len() } #[inline] pub fn remove(&mut self, k: &K) -> Option> { self.0.remove(k) } #[inline] pub fn is_empty(&self) -> bool { self.0.is_empty() } #[inline] pub fn extend(&mut self, other: ParentMap) { self.0.extend(other.0); } } impl Default for ParentMap { fn default() -> Self { Self::new() } } impl From> for HashMap> { fn from(map: ParentMap) -> Self { map.0 .into_iter() .map(|(k, v)| (k, v.unwrap())) .collect::>>() } } impl From>> for ParentMap { fn from(map: HashMap>) -> Self { ParentMap( map.into_iter() .map(|(k, v)| (k, Parents::Known(v))) .collect::>>(), ) } } impl IntoIterator for ParentMap { type Item = (K, Parents); type IntoIter = std::collections::hash_map::IntoIter>; fn into_iter(self) -> Self::IntoIter { self.0.into_iter() } } #[cfg(feature = "pyo3")] impl<'py, K: pyo3::IntoPyObject<'py, Error = pyo3::PyErr> + Hash + Clone + PartialEq + Eq> pyo3::IntoPyObject<'py> for ParentMap { type Target = pyo3::types::PyDict; type Output = pyo3::Bound<'py, Self::Target>; type Error = pyo3::PyErr; fn into_pyobject(self, py: pyo3::Python<'py>) -> Result { use pyo3::prelude::*; let dict = pyo3::types::PyDict::new(py); for (k, v) in self.into_iter() { dict.set_item(k, v)?; } Ok(dict) } } #[cfg(feature = "pyo3")] impl<'py, K> 
pyo3::FromPyObject<'_, 'py> for ParentMap where K: for<'a> pyo3::FromPyObject<'a, 'py, Error = pyo3::PyErr> + Hash + Clone + PartialEq + Eq, { type Error = pyo3::PyErr; fn extract(obj: pyo3::Borrowed<'_, 'py, pyo3::PyAny>) -> Result { use pyo3::prelude::*; let dict = obj.cast::()?; let mut result = ParentMap::new(); for (k, v) in dict.iter() { result.insert(k.extract()?, v.extract()?); } Ok(result) } } #[derive(Clone, Debug, PartialEq, Eq)] pub struct ChildMap(HashMap>); impl Default for ChildMap { fn default() -> Self { Self::new() } } impl ChildMap { pub fn new() -> Self { ChildMap(HashMap::new()) } #[inline] pub fn insert(&mut self, k: K) { self.0.entry(k).or_default(); } #[inline] pub fn drain(&mut self) -> impl Iterator)> + '_ { self.0.drain() } #[inline] pub fn add(&mut self, k: K, v: K) { self.0.entry(k).or_default().push(v); } #[inline] pub fn iter(&self) -> impl Iterator)> { self.0.iter() } #[inline] pub fn get(&self, k: &K) -> Option<&Vec> { self.0.get(k) } #[inline] pub fn remove(&mut self, k: &K) -> Option> { self.0.remove(k) } #[inline] pub fn is_empty(&self) -> bool { self.0.is_empty() } #[inline] pub fn contains_key(&self, k: &K) -> bool { self.0.contains_key(k) } } impl std::ops::Index<&K> for ChildMap { type Output = Vec; fn index(&self, index: &K) -> &Self::Output { &self.0[index] } } impl IntoIterator for ChildMap { type Item = (K, Vec); type IntoIter = std::collections::hash_map::IntoIter>; fn into_iter(self) -> Self::IntoIter { self.0.into_iter() } } #[cfg(feature = "pyo3")] impl<'py, K: pyo3::IntoPyObject<'py> + Hash + Clone + PartialEq + Eq> pyo3::IntoPyObject<'py> for ChildMap { type Target = pyo3::types::PyDict; type Output = pyo3::Bound<'py, Self::Target>; type Error = pyo3::PyErr; fn into_pyobject(self, py: pyo3::Python<'py>) -> Result { use pyo3::prelude::*; let dict = pyo3::types::PyDict::new(py); for (k, v) in self.into_iter() { dict.set_item(k, v)?; } Ok(dict) } } impl From>> for ChildMap { fn from(map: HashMap>) -> Self { 
        ChildMap(map)
    }
}

/// Create a child map from a parent map.
pub fn invert_parent_map<K: Hash + Eq + Clone>(parent_map: &ParentMap<K>) -> ChildMap<K> {
    let mut child_map = ChildMap::new();
    for (child, parents) in parent_map.iter() {
        if parents.is_ghost() {
            continue;
        }
        for p in parents.as_slice() {
            child_map.add(p.clone(), child.clone());
        }
    }
    child_map
}

impl<K> From<ParentMap<K>> for ChildMap<K>
where
    K: Hash + Eq + Clone,
{
    fn from(parent_map: ParentMap<K>) -> Self {
        invert_parent_map(&parent_map)
    }
}

#[cfg(test)]
mod invert_parent_map_tests {
    use super::*;
    use maplit::hashmap;

    #[test]
    fn test_invert() {
        let result = super::invert_parent_map(&ParentMap::from(hashmap! {
            2 => vec![1],
            3 => vec![1, 2],
        }));

        // Check node 1's children (order doesn't matter)
        let mut node1_children = result.get(&1).unwrap().clone();
        node1_children.sort();
        assert_eq!(vec![2, 3], node1_children);

        // Check node 2's children
        assert_eq!(vec![3], *result.get(&2).unwrap());

        // Node 3 should have no children (may not be in the map)
        assert!(result.get(&3).is_none() || result.get(&3).unwrap().is_empty());
    }

    #[test]
    fn test_ghost() {
        let result = super::invert_parent_map(&ParentMap::from(hashmap! {
            2 => vec![1],
            3 => vec![1, 2],
        }));

        // Check node 1's children (order doesn't matter)
        let mut node1_children = result.get(&1).unwrap().clone();
        node1_children.sort();
        assert_eq!(vec![2, 3], node1_children);

        // Check node 2's children
        assert_eq!(vec![3], *result.get(&2).unwrap());
    }
}

/// Collapse regions of the graph that are 'linear'.
///
/// For example::
///
///   A:[B], B:[C]
///
/// can be collapsed by removing B and getting::
///
///   A:[C]
///
/// Args:
///   parent_map: A dictionary mapping children to their parents
/// Returns: Another dictionary with 'linear' chains collapsed
pub fn collapse_linear_regions<K: Hash + Eq + Clone>(parent_map: &ParentMap<K>) -> ParentMap<K> {
    // Note: this isn't a strictly minimal collapse. For example:
    //   A
    //  / \
    // B   C
    //  \ /
    //   D
    //   |
    //   E
    // Will not have 'D' removed, even though 'E' could fit. Also:
    //   A
    //   |    A
    //   B => |
    //   |    C
    //   C
    // A and C are both kept because they are edges of the graph. We *could* get
    // rid of A if we wanted.
    //   A
    //  / \
    // B   C
    // |   |
    // D   E
    //  \ /
    //   F
    // Will not have any nodes removed, even though you do have an
    // 'uninteresting' linear D->B and E->C
    let mut children: HashMap<K, Vec<K>> = HashMap::new();
    for (child, parents) in parent_map.iter() {
        children.entry(child.clone()).or_default();
        for p in parents.as_slice() {
            children.entry(p.clone()).or_default().push(child.clone());
        }
    }

    let mut removed = HashSet::new();
    let mut result: ParentMap<K> = parent_map.clone();
    for node in parent_map.keys() {
        let parents = result.get(node).unwrap().as_slice();
        if parents.len() != 1 {
            continue;
        }
        let parent_children = children.get(&parents[0]).unwrap();
        if parent_children.len() != 1 {
            // This is not the only child
            continue;
        }
        let node_children = children.get(node).unwrap();
        if node_children.len() != 1 {
            continue;
        }
        let Some(child_parents) = result.get(&node_children[0]) else {
            continue;
        };
        if child_parents.as_slice().len() != 1 {
            // This is not its only parent
            continue;
        }
        // The child of this node only points at it, and the parent only has
        // this as a child. Remove this node and splice around it.
        let parents_owned = parents.to_vec();
        let node_children_owned = node_children.clone();
        result.remove(node);
        result.insert(
            node_children_owned[0].clone(),
            Parents::Known(parents_owned.clone()),
        );
        children.insert(parents_owned[0].clone(), node_children_owned);
        children.remove(node);
        removed.insert(node);
    }

    result
}

pub mod tsort;

#[cfg(test)]
mod test;

#[derive(Clone, PartialEq, Eq)]
pub struct RevnoVec(Vec<usize>);

impl RevnoVec {
    pub fn new() -> Self {
        RevnoVec(vec![])
    }

    pub fn bump_last(&self) -> Self {
        let mut ret = self.clone();
        *ret.0.last_mut().expect("bump_last on empty RevnoVec") += 1;
        ret
    }

    pub fn new_branch(&self, branch_count: usize) -> Self {
        RevnoVec::from(vec![self[0], branch_count, 1])
    }
}

impl Default for RevnoVec {
    fn default() -> Self {
        Self::new()
    }
}

impl IntoIterator for RevnoVec {
    type Item = usize;
    type IntoIter = std::vec::IntoIter<usize>;

    fn into_iter(self) -> Self::IntoIter {
        self.0.into_iter()
    }
}

impl std::ops::Index<usize> for RevnoVec {
    type Output = usize;

    fn index(&self, index: usize) -> &Self::Output {
        &self.0[index]
    }
}

impl std::ops::IndexMut<usize> for RevnoVec {
    fn index_mut(&mut self, index: usize) -> &mut Self::Output {
        &mut self.0[index]
    }
}

impl std::fmt::Debug for RevnoVec {
    fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
        write!(f, "RevnoVec({:?})", self.0)
    }
}

impl std::fmt::Display for RevnoVec {
    fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
        let mut first = true;
        for r in self.0.iter() {
            if first {
                first = false;
            } else {
                write!(f, ".")?;
            }
            write!(f, "{}", r)?;
        }
        Ok(())
    }
}

impl From<Vec<usize>> for RevnoVec {
    fn from(v: Vec<usize>) -> Self {
        RevnoVec(v)
    }
}

impl From<usize> for RevnoVec {
    fn from(v: usize) -> Self {
        RevnoVec(vec![v])
    }
}

#[cfg(feature = "pyo3")]
impl<'py> pyo3::IntoPyObject<'py> for RevnoVec {
    type Target = pyo3::types::PyTuple;
    type Output = pyo3::Bound<'py, Self::Target>;
    type Error = pyo3::PyErr;

    fn into_pyobject(self, py: pyo3::Python<'py>) -> Result<Self::Output, Self::Error> {
        pyo3::types::PyTuple::new(py, self.0.iter())
    }
}

#[cfg(feature =
"pyo3")] impl<'py> pyo3::FromPyObject<'_, 'py> for RevnoVec { type Error = pyo3::PyErr; fn extract(obj: pyo3::Borrowed<'_, 'py, pyo3::PyAny>) -> Result { use pyo3::prelude::*; let tuple = obj.cast::()?; let mut ret = RevnoVec::new(); for r in tuple.iter() { ret.0.push(r.extract()?); } Ok(ret) } } #[derive(std::fmt::Debug)] pub enum Error { Cycle(Vec), ParentMismatch { key: K, expected: Vec, actual: Vec, }, } impl std::fmt::Display for Error { fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result { match self { Error::Cycle(cycle) => { write!(f, "Cycle: ")?; let mut first = true; for c in cycle.iter() { if first { first = false; } else { write!(f, " -> ")?; } write!(f, "{}", c)?; } Ok(()) } Error::ParentMismatch { key, expected, actual, } => { write!(f, "Parent mismatch for {}: ", key)?; let mut first = true; for e in expected.iter() { if first { first = false; } else { write!(f, ", ")?; } write!(f, "{}", e)?; } write!(f, " != ")?; let mut first = true; for a in actual.iter() { if first { first = false; } else { write!(f, ", ")?; } write!(f, "{}", a)?; } Ok(()) } } } } impl std::error::Error for Error {} python-vcsgraph-0.2.0/crates/graph/src/parents_provider.rs0000644000000000000000000003030015167007306020701 0ustar00use crate::{ParentMap, Parents}; use rustc_hash::{FxHashMap, FxHashSet}; use std::collections::{HashMap, HashSet}; use std::hash::Hash; use std::sync::Mutex; pub trait ParentsProvider { fn get_parent_map(&self, keys: &HashSet) -> ParentMap; } pub struct StackedParentsProvider { parent_providers: Vec>>, } impl StackedParentsProvider { pub fn new(parent_providers: Vec>>) -> Self { StackedParentsProvider { parent_providers } } } impl ParentsProvider for StackedParentsProvider { fn get_parent_map(&self, keys: &HashSet) -> ParentMap { let mut found = ParentMap::new(); let mut remaining = keys.clone(); for parent_provider in self.parent_providers.iter() { if remaining.is_empty() { break; } let new_found = parent_provider.get_parent_map(&remaining); 
for k in new_found.keys() { remaining.remove(k); } found.extend(new_found); } found } } pub struct DictParentsProvider(ParentMap); impl From> for DictParentsProvider { fn from(parent_map: ParentMap) -> Self { DictParentsProvider(parent_map) } } impl From>> for DictParentsProvider { fn from(parent_map: HashMap>) -> Self { DictParentsProvider::new(ParentMap( parent_map .into_iter() .map(|(k, v)| (k, Parents::Known(v))) .collect(), )) } } impl DictParentsProvider { pub fn new(parent_map: ParentMap) -> Self { DictParentsProvider(parent_map) } } impl ParentsProvider for DictParentsProvider { fn get_parent_map(&self, keys: &HashSet) -> ParentMap { ParentMap( keys.iter() .filter_map(|k| self.0.get_key_value(k)) .map(|(k, v)| (k.clone(), v.clone())) .collect(), ) } } /// A parents provider which caches its lookups. /// /// Wraps an inner `ParentsProvider` and memoizes every `(key, parents)` /// pair it returns. When cache-misses-tracking is enabled, keys that were /// requested but not present in the inner provider are also remembered so /// we don't re-request them. /// /// The cache can be disabled and re-enabled at runtime; disabling clears /// the cache entirely. pub struct CachingParentsProvider> { inner: P, // Interior mutability so `get_parent_map(&self, ...)` can populate the // cache. The whole provider is still Sync (via Mutex) which matches the // base trait's `&self` contract. state: Mutex>, } struct CacheState { /// None when the cache is disabled, Some when enabled. cache: Option>>, /// Keys known to be missing from the inner provider. Only populated /// when `cache_misses` is true. missing_keys: FxHashSet, /// Whether to remember keys that aren't in the inner provider. cache_misses: bool, } impl> CachingParentsProvider { /// Create a caching wrapper around `inner`. The cache is enabled by /// default with cache-misses tracking on. 
    pub fn new(inner: P) -> Self {
        CachingParentsProvider {
            inner,
            state: Mutex::new(CacheState {
                cache: Some(FxHashMap::default()),
                missing_keys: FxHashSet::default(),
                cache_misses: true,
            }),
        }
    }

    /// Enable the cache. Matches Python's semantics: calling this when the
    /// cache is already enabled is an error. `cache_misses` controls
    /// whether missing keys are remembered between calls.
    pub fn enable_cache(&self, cache_misses: bool) -> Result<(), &'static str> {
        let mut state = self.state.lock().unwrap();
        if state.cache.is_some() {
            return Err("Cache enabled when already enabled.");
        }
        state.cache = Some(FxHashMap::default());
        state.cache_misses = cache_misses;
        state.missing_keys = FxHashSet::default();
        Ok(())
    }

    /// Disable and clear the cache.
    pub fn disable_cache(&self) {
        let mut state = self.state.lock().unwrap();
        state.cache = None;
        state.cache_misses = false;
        state.missing_keys = FxHashSet::default();
    }

    /// Return a snapshot of the current cache, or `None` if disabled.
    pub fn get_cached_map(&self) -> Option<FxHashMap<K, Parents<K>>> {
        let state = self.state.lock().unwrap();
        state.cache.clone()
    }

    /// Return entries from the cache without consulting the inner provider.
    pub fn get_cached_parent_map(&self, keys: &HashSet<K>) -> ParentMap<K> {
        let state = self.state.lock().unwrap();
        let Some(cache) = state.cache.as_ref() else {
            return ParentMap::new();
        };
        ParentMap(
            keys.iter()
                .filter_map(|k| cache.get_key_value(k))
                .map(|(k, v)| (k.clone(), v.clone()))
                .collect(),
        )
    }

    /// Note that `key` was missing from the inner provider.
    pub fn note_missing_key(&self, key: K) {
        let mut state = self.state.lock().unwrap();
        if state.cache_misses {
            state.missing_keys.insert(key);
        }
    }

    /// Snapshot of the missing-keys set.
    pub fn missing_keys(&self) -> FxHashSet<K> {
        self.state.lock().unwrap().missing_keys.clone()
    }

    /// Borrow the inner provider.
pub fn inner(&self) -> &P { &self.inner } } impl> ParentsProvider for CachingParentsProvider { fn get_parent_map(&self, keys: &HashSet) -> ParentMap { // Fast path: cache disabled — delegate straight to inner. { let state = self.state.lock().unwrap(); if state.cache.is_none() { drop(state); // Note: Python filters the response to only the requested // keys with non-None values; we do the same by filtering // known parents below. let pm = self.inner.get_parent_map(keys); let mut result = ParentMap::new(); for k in keys { if let Some(v) = pm.get(k) { if matches!(v, Parents::Known(_)) { result.insert(k.clone(), v.clone()); } } } return result; } } // Determine which keys we still need to fetch from the inner // provider (not in cache and not known-missing). let needed: HashSet = { let state = self.state.lock().unwrap(); let cache = state.cache.as_ref().unwrap(); keys.iter() .filter(|k| !cache.contains_key(*k) && !state.missing_keys.contains(*k)) .cloned() .collect() }; if !needed.is_empty() { let fetched = self.inner.get_parent_map(&needed); let mut state = self.state.lock().unwrap(); let cache = state.cache.as_mut().unwrap(); for (k, v) in fetched.iter() { cache.insert(k.clone(), v.clone()); } if state.cache_misses { for k in &needed { if !fetched.contains_key(k) { state.missing_keys.insert(k.clone()); } } } } // Build the response from the cache, filtering out ghosts/None the // same way Python does. let state = self.state.lock().unwrap(); let cache = state.cache.as_ref().unwrap(); let mut result = ParentMap::new(); for k in keys { if let Some(v) = cache.get(k) { if matches!(v, Parents::Known(_)) { result.insert(k.clone(), v.clone()); } } } result } } #[cfg(test)] mod tests { use super::*; use std::cell::RefCell; /// A ParentsProvider wrapper that counts how many distinct keys were /// requested across all calls to `get_parent_map`. Used to verify the /// caching wrapper avoids redundant lookups. 
struct CountingProvider> { inner: P, requested: RefCell>, } impl> CountingProvider { fn new(inner: P) -> Self { CountingProvider { inner, requested: RefCell::new(Vec::new()), } } } impl> ParentsProvider for CountingProvider { fn get_parent_map(&self, keys: &HashSet) -> ParentMap { self.requested.borrow_mut().extend(keys.iter().cloned()); self.inner.get_parent_map(keys) } } fn dict(edges: &[(&'static str, &[&'static str])]) -> DictParentsProvider<&'static str> { let map: HashMap<&'static str, Vec<&'static str>> = edges.iter().map(|(k, ps)| (*k, ps.to_vec())).collect(); DictParentsProvider::from(map) } fn query( cp: &CachingParentsProvider< &'static str, CountingProvider<&'static str, DictParentsProvider<&'static str>>, >, keys: &[&'static str], ) -> ParentMap<&'static str> { let hs: HashSet<&'static str> = keys.iter().copied().collect(); cp.get_parent_map(&hs) } #[test] fn caching_returns_known_parents() { let inner = CountingProvider::new(dict(&[("a", &[]), ("b", &["a"])])); let cp = CachingParentsProvider::new(inner); let pm = query(&cp, &["a", "b"]); assert_eq!(pm.get(&"a"), Some(&Parents::Known(vec![]))); assert_eq!(pm.get(&"b"), Some(&Parents::Known(vec!["a"]))); } #[test] fn caching_avoids_refetching_known_keys() { let inner = CountingProvider::new(dict(&[("a", &[]), ("b", &["a"])])); let cp = CachingParentsProvider::new(inner); query(&cp, &["a", "b"]); query(&cp, &["a", "b"]); // Only one round trip for each key. let requested = cp.inner().requested.borrow(); let mut seen = FxHashSet::default(); for k in requested.iter() { seen.insert(*k); } assert_eq!(seen, ["a", "b"].into_iter().collect::>()); assert_eq!(requested.len(), 2); } #[test] fn caching_remembers_missing_keys() { let inner = CountingProvider::new(dict(&[("a", &[])])); let cp = CachingParentsProvider::new(inner); query(&cp, &["a", "missing"]); query(&cp, &["missing"]); // "missing" should have been requested exactly once. 
let requested = cp.inner().requested.borrow(); let count = requested.iter().filter(|k| **k == "missing").count(); assert_eq!(count, 1); } #[test] fn disable_cache_clears_state() { let inner = CountingProvider::new(dict(&[("a", &[])])); let cp = CachingParentsProvider::new(inner); query(&cp, &["a"]); cp.disable_cache(); query(&cp, &["a"]); // With the cache disabled, every call hits the inner provider. let requested = cp.inner().requested.borrow(); let count = requested.iter().filter(|k| **k == "a").count(); assert_eq!(count, 2); } #[test] fn enable_while_enabled_errors() { let cp = CachingParentsProvider::new(dict(&[("a", &[])])); assert!(cp.enable_cache(true).is_err()); } #[test] fn reenabling_after_disable_works() { let cp = CachingParentsProvider::new(dict(&[("a", &[])])); cp.disable_cache(); cp.enable_cache(true).unwrap(); let hs: HashSet<&'static str> = ["a"].into_iter().collect(); let pm = cp.get_parent_map(&hs); assert_eq!(pm.get(&"a"), Some(&Parents::Known(vec![]))); } } python-vcsgraph-0.2.0/crates/graph/src/test.rs0000644000000000000000000000525115167007306016301 0ustar00use crate::tsort::TopoSorter; use crate::Error; use std::collections::HashMap; #[test] fn test_tsort_empty() { let graph = HashMap::new(); assert_sort_and_iterate(&graph, &[]); } #[test] fn test_tsort_easy() { let graph = [(0, vec![])].iter().cloned().collect(); assert_sort_and_iterate(&graph, &[0]); } #[test] fn test_tsort_cycle() { let graph = [(0, vec![1]), (1, vec![0])].iter().cloned().collect(); assert_sort_and_iterate_cycle(&graph); } #[test] fn test_tsort_cycle_2() { let graph = [(0, vec![1]), (1, vec![2]), (2, vec![0])] .iter() .cloned() .collect(); assert_sort_and_iterate_cycle(&graph); } #[test] fn test_topo_sort_cycle_with_tail() { let graph = [ (0, vec![1]), (1, vec![2]), (2, vec![3, 4]), (3, vec![0]), (4, vec![]), ] .iter() .cloned() .collect(); assert_sort_and_iterate_cycle(&graph); } #[test] fn test_tsort_1() { let graph = [ (0, vec![3]), (1, vec![4]), (2, vec![1, 4]), (3, 
vec![]), (4, vec![0, 3]), ] .iter() .cloned() .collect(); assert_sort_and_iterate_order(&graph); } #[test] fn test_tsort_partial() { let graph = [ (0, vec![]), (1, vec![0]), (2, vec![0]), (3, vec![0]), (4, vec![1, 2, 3]), (5, vec![1, 2]), (6, vec![1, 2]), (7, vec![2, 3]), (8, vec![0, 1, 4, 5, 6]), ] .iter() .cloned() .collect(); assert_sort_and_iterate_order(&graph); } #[test] fn test_tsort_unincluded_parent() { let graph = [(0, vec![1]), (1, vec![2])].iter().cloned().collect(); assert_sort_and_iterate(&graph, &[1, 0]); } fn topo_sort(graph: &HashMap>) -> Result, Error> { TopoSorter::new(graph.clone().into_iter()).sorted() } fn assert_sort_and_iterate_order(graph: &HashMap>) { let sort_result = topo_sort(graph).unwrap(); for (node, parents) in graph { for parent in parents { if sort_result.iter().position(|&n| n == *node).unwrap() < sort_result.iter().position(|&n| n == *parent).unwrap() { panic!( "parent {} must come before child {}:\n{:?}", parent, node, sort_result ); } } } } fn assert_sort_and_iterate_cycle(graph: &HashMap>) { let sort_result = topo_sort(graph); assert!(sort_result.is_err()); } fn assert_sort_and_iterate(graph: &HashMap>, expected: &[usize]) { let sort_result = topo_sort(graph).unwrap(); assert_eq!(sort_result, expected); } python-vcsgraph-0.2.0/crates/graph/src/tsort.rs0000644000000000000000000006213615167007306016502 0ustar00#![allow(clippy::if_same_then_else)] use crate::{Error, RevnoVec}; use rustc_hash::{FxHashMap, FxHashSet}; use std::collections::HashMap; use std::hash::Hash; #[derive(Debug)] pub struct TopoSorter { graph: FxHashMap>, visitable: FxHashSet, // this is a stack storing the depth first search into the graph. pending_node_stack: Vec, // at each level of 'recursion' we have to check each parent. 
This // stack stores the parents we have not yet checked for the node at the // matching depth in pending_node_stack pending_parents_stack: Vec>, // this is a set of the completed nodes for fast checking whether a // parent in a node we are processing on the stack has already been // emitted and thus can be skipped. completed_node_names: FxHashSet, } impl TopoSorter { /// Create a new `TopoSorter` from a graph represented as a sequence of pairs /// of node_name->parent_names_list. pub fn new(graph: impl Iterator)>) -> TopoSorter { let mut g = FxHashMap::default(); for (node, parents) in graph { g.insert(node, parents); } let visitable = g.keys().cloned().collect(); TopoSorter { graph: g, visitable, pending_node_stack: vec![], pending_parents_stack: vec![], completed_node_names: FxHashSet::default(), } } /// Sort the graph and return the nodes as a vector. /// /// After calling this the sorter is empty and you must create a new one. pub fn sorted(&mut self) -> std::result::Result, Error> { self.iter_topo_order() .collect::, Error>>() } /// Yield the nodes of the graph in a topological order. /// /// After finishing iteration the sorter is empty and you cannot continue /// iteration. pub fn iter_topo_order( &mut self, ) -> impl Iterator>> + '_ { self } } impl Iterator for TopoSorter { type Item = std::result::Result>; fn next(&mut self) -> Option>> { loop { // loop until pending_node_stack is empty while !self.pending_node_stack.is_empty() { let parents_to_visit = self.pending_parents_stack.last_mut().unwrap(); // if there are no parents left, the revision is done if parents_to_visit.is_empty() { // append the revision to the topo sorted list // all the nodes parents have been added to the output, // now we can add it to the output. 
let popped_node = self.pending_node_stack.pop().unwrap(); self.pending_parents_stack.pop(); self.completed_node_names.insert(popped_node.clone()); return Some(Ok(popped_node)); } else { // recurse depth first into a single parent let next_node_name = parents_to_visit.pop().unwrap(); if self.completed_node_names.contains(&next_node_name) { // parent was already completed by a child, skip it. continue; } if !self.visitable.contains(&next_node_name) { // parent is not a node in the original graph, skip it. continue; } // transfer it along with its parents from the source graph // into the top of the current depth first search stack. if let Some(parents) = self.graph.remove(&next_node_name) { self.pending_node_stack.push(next_node_name); self.pending_parents_stack.push(parents); } else { // if the next node is not in the source graph it has // already been popped from it and placed into the // current search stack (but not completed or we would // have hit the continue 6 lines up). this indicates a // cycle. return Some(Err(Error::Cycle(self.pending_node_stack.to_vec()))); } } } if let Some(node_name) = self.graph.keys().next() { let node_name = node_name.clone(); let parents = self.graph.remove(&node_name).unwrap(); // now pick a random node in the source graph, and transfer it to the // top of the depth first search stack of pending nodes. self.pending_node_stack.push(node_name); self.pending_parents_stack.push(parents); } else { // if the source graph is empty, we are done. return None; } } } } /// Merge-aware topological sorting of a graph. /// /// :param graph: sequence of pairs of node_name->parent_names_list. /// i.e. [('C', ['B']), ('B', ['A']), ('A', [])] /// For this input the output from the sort or /// iter_topo_order routines will be: /// 'A', 'B', 'C' /// :param branch_tip: the tip of the branch to graph. Revisions not /// reachable from branch_tip are not included in the /// output. 
/// :param mainline_revisions: If not None this forces a mainline to be /// used rather than synthesised from the graph. /// This must be a valid path through some part /// of the graph. If the mainline does not cover all /// the revisions, output stops at the start of the /// old revision listed in the mainline revisions /// list. /// The order for this parameter is oldest-first. /// :param generate_revno: Optional parameter controlling the generation of /// revision number sequences in the output. See the output description /// for more details. /// /// The result is a list sorted so that all parents come before /// their children. Each element of the list is a tuple containing: /// `(sequence_number, node_name, merge_depth, end_of_merge)`. /// /// - `sequence_number`: the sequence of this row in the output. Useful for /// GUIs. /// - `node_name`: the node name; opaque text to the merge routine. /// - `merge_depth`: how many levels of merging deep this node has been found. /// - `revno_sequence`: when requested this field provides a sequence of /// revision numbers for all revisions. The format is /// `(REVNO, BRANCHNUM, BRANCHREVNO)`. `BRANCHNUM` is the number of the /// branch that the revno is on. From left to right the `REVNO` numbers /// are the sequence numbers within that branch of the revision. /// For instance, the graph `{A:[], B:['A'], C:['A', 'B']}` will get /// the following revno sequences assigned: `A:(1,), B:(1,1,1), C:(2,)`. /// This should be read as 'A is the first commit in the trunk', /// 'B is the first commit on the first branch made from A', 'C is the /// second commit in the trunk'. /// - `end_of_merge`: when true the next node is part of a different merge. /// /// Node identifiers can be any hashable object, and are typically strings. /// /// If you have a graph like `[('a', ['b']), ('a', ['c'])]` this will only /// use one of the two values for 'a'. 
/// /// The graph is sorted lazily: until you iterate or sort the input is not /// processed other than to create an internal representation. /// /// Iteration or sorting may raise a cycle error if a cycle is present in /// the graph. /// /// # Background on the design /// /// The end of any cluster or 'merge' occurs when: /// /// 1. the next revision has a lower merge depth than we do: /// /// ```text /// A 0 /// B 1 /// C 2 /// D 1 /// E 0 /// ``` /// /// C and D are the ends of clusters; E might be but we need more data. /// /// 2. or the next revision at our merge depth is not our left most ancestor. /// This is required to handle multiple-merges in one commit: /// /// ```text /// A 0 [F, B, E] /// B 1 [D, C] /// C 2 [D] /// D 1 [F] /// E 1 [F] /// F 0 /// ``` /// /// C is the end of a cluster due to rule 1. D is not the end of a /// cluster from rule 1, but is from rule 2: E is not its left most /// ancestor. E is the end of a cluster due to rule 1. F might be but we /// need more data. /// /// We show connecting lines to a parent when: /// /// - The parent is the start of a merge within this cluster. That is, the /// merge was not done to the mainline before this cluster was merged to /// the mainline. This can be detected thus: the parent has a higher /// merge depth and is the next revision in the list. The next-revision /// constraint is needed for this case: /// /// ```text /// A 0 [D, B] /// B 1 [C, F] # we do not want to show a line to F which is depth 2 /// # but not a merge /// C 1 [H] # note that this is a long line to show back to the /// # ancestor - see the end of merge rules. 
/// D 0 [G, E] /// E 1 [G, F] /// F 2 [G] /// G 1 [H] /// H 0 /// ``` /// /// - Part of this merge's branch: the parent has the same merge depth and /// is our left most parent and we are not the end of the cluster: /// /// ```text /// A 0 [C, B] lines: [B, C] /// B 1 [E, C] lines: [C] /// C 0 [D] lines: [D] /// D 0 [F, E] lines: [E, F] /// E 1 [F] lines: [F] /// F 0 /// ``` /// /// - The end of this merge/cluster: we can only have multiple parents at /// the end of a cluster if this branch was previously merged into the /// 'mainline'. /// /// - If we have one and only one parent, show it. Note that this may be /// to a greater merge depth, for instance if this branch continued /// from a deeply nested branch to add something to it. /// - If we have more than one parent, show the second oldest (older == /// further down the list) parent with an equal or lower merge depth. pub struct MergeSorter { // this is a stack storing the depth first search into the graph. node_name_stack: Vec, // at each level of recursion we need the merge depth this node is at: node_merge_depth_stack: Vec, // at each level of 'recursion' we have to check each parent. This // stack stores the parents we have not yet checked for the node at the // matching depth in node_name_stack pending_parents_stack: Vec>, // When we first look at a node we assign it a sequence number from its // leftmost parent. first_child_stack: Vec>, // This records for each node when we have processed its left most // unmerged subtree. After this subtree is scheduled, all other subtrees // have their merge depth increased by one from this node's merge depth. // it contains tuples - name, merge_depth left_subtree_pushed_stack: Vec, generate_revno: bool, // The full parent map. Read-only after construction.
This used to be // stored twice (once mutable for destructive iteration via `remove`, once // immutable for end-of-merge lookups); it is now stored once and the // "still pending" bookkeeping lives in `not_yet_scheduled`. graph: HashMap>, // Nodes in `graph` that have not yet been pushed onto the pending stack. // Plays the role of the old mutable `graph`'s key set. not_yet_scheduled: FxHashSet, stop_revision: Option, revnos: FxHashMap, bool)>, // Each mainline revision counts how many child branches have spawned from it. revno_to_branch_count: FxHashMap, // this is a set of the nodes who have been completely analysed for fast // membership checking completed_node_names: FxHashSet, // this is the scheduling of nodes list. // Nodes are scheduled // from the bottom left of the tree: in the tree // A 0 [D, B] // B 1 [C] // C 1 [D] // D 0 [F, E] // E 1 [F] // F 0 // the scheduling order is: F, E, D, C, B, A // that is - 'left subtree, right subtree, node' // which would mean that when we schedule A we can emit the entire tree. scheduled_nodes: Vec<(K, usize, RevnoVec)>, sequence_number: usize, } /// A single row emitted by [`MergeSorter`]. /// /// Fields in order: sequence number, node name, merge depth, optional revno /// sequence, and an end-of-merge flag. See the [`MergeSorter`] docs for the /// meaning of each field. pub type MergeSortRow = (usize, K, usize, Option, bool); impl MergeSorter { pub fn new( mut graph: HashMap>, branch_tip: Option, mainline_revisions: Option>, generate_revno: bool, ) -> Self { let stop_revision; // if there is an explicit mainline, alter the graph to match. This is // easier than checking at every merge whether we are on the mainline and // if so which path to take. 
if let Some(mainline_revisions) = mainline_revisions.as_ref() { stop_revision = Some(mainline_revisions[0].clone()); // skip the first revision, it's what we reach and its parents are // therefore irrelevant for (index, revision) in mainline_revisions[1..].iter().enumerate() { // NB: index 0 means self._mainline_revisions[1] // if the mainline matches the graph, nothing to do. let parent = &mainline_revisions[index]; let graph_parent_ids = graph.get_mut(revision).unwrap(); if !graph_parent_ids.is_empty() { if graph_parent_ids[0] == *parent { continue; } let current_position = graph_parent_ids.iter().position(|x| x == parent).unwrap(); graph_parent_ids.swap(0, current_position); } else { // We ran into a ghost, skip over it, this is a workaround for // bug #243536, the _graph has had ghosts stripped, but the // mainline_revisions have not continue; } } } else { stop_revision = None; } // we need to know the revision numbers of revisions to determine // the revision numbers of their descendants // this is a graph from node to [revno_tuple, first_child] // where first_child is True if no other children have seen this node // and revno_tuple is the tuple that was assigned to the node.
// we don't know revnos to start with, so we start it seeded with // [None, True] let revnos = graph .keys() .map(|revision| (revision.clone(), (None, true))) .collect::, bool)>>(); let not_yet_scheduled: FxHashSet = graph.keys().cloned().collect(); let mut sorter = MergeSorter { generate_revno, graph, not_yet_scheduled, stop_revision, revnos, revno_to_branch_count: FxHashMap::default(), node_name_stack: Vec::new(), node_merge_depth_stack: Vec::new(), pending_parents_stack: Vec::new(), first_child_stack: Vec::new(), completed_node_names: FxHashSet::default(), scheduled_nodes: Vec::new(), left_subtree_pushed_stack: Vec::new(), sequence_number: 0, }; if let Some(branch_tip) = branch_tip { let parents = sorter.take_parents(&branch_tip).unwrap(); sorter.push_node(branch_tip, 0, parents); } sorter } /// Mark `key` as scheduled and return a clone of its parent list, or /// `None` if it was already scheduled or not in the graph. fn take_parents(&mut self, key: &K) -> Option> { if self.not_yet_scheduled.remove(key) { Some(self.graph[key].clone()) } else { None } } /// Sort the graph and return as a list. /// /// After calling this the sorter is empty and you must create a new one. pub fn sorted(&mut self) -> std::result::Result>, Error> { self.iter_topo_order().collect() } /// /// After finishing iteration the sorter is empty and you cannot continue /// iteration. pub fn iter_topo_order( &mut self, ) -> impl Iterator, Error>> + '_ { self } /// Add node_name to the pending node stack. /// /// Names in this stack will get emitted into the output as they are popped /// off the stack. pub fn push_node(&mut self, node_name: K, merge_depth: usize, parents: Vec) { self.node_name_stack.push(node_name); self.node_merge_depth_stack.push(merge_depth); self.left_subtree_pushed_stack.push(false); // As we push it, figure out if this is the first child let first_child: Option; if !parents.is_empty() { // Node has parents, assign from the left most parent.
if let Some(entry) = self.revnos.get_mut(&parents[0]) { first_child = Some(entry.1); entry.1 = false; } else { // Left-hand parent is a ghost, consider it not to exist first_child = None; } } else { first_child = None; } self.pending_parents_stack.push(parents); self.first_child_stack.push(first_child); } pub fn pop_node(&mut self) -> K { // Pop the top node off the stack // // The node is appended to the sorted output. let node_name = self.node_name_stack.pop().unwrap(); let merge_depth = self.node_merge_depth_stack.pop().unwrap(); let first_child = self.first_child_stack.pop().unwrap(); // remove this node from the pending lists: self.left_subtree_pushed_stack.pop().unwrap(); self.pending_parents_stack.pop().unwrap(); let parents = self.graph.get(&node_name).unwrap(); // Left-hand parent's revno, if it exists and isn't a ghost. let parent_revno = parents .first() .and_then(|p| self.revnos.get(p)) .and_then(|entry| entry.0.clone()); let revno: RevnoVec = if let Some(parent_revno) = parent_revno { if first_child == Some(true) { // as the first child, we just increase the final revision number parent_revno.bump_last() } else { // not the first child, make a new branch let base_revno = parent_revno[0]; let branch_count = self .revno_to_branch_count .get(&base_revno) .copied() .unwrap_or(0) + 1; self.revno_to_branch_count.insert(base_revno, branch_count); parent_revno.new_branch(branch_count) } } else { // no parents, use the root sequence let root_count = if let Some(root_count) = self.revno_to_branch_count.get(&0) { root_count + 1 } else { 0 }; self.revno_to_branch_count.insert(0, root_count); if root_count > 0 { RevnoVec::from(vec![0, root_count, 1]) } else { RevnoVec::from(1) } }; // store the revno for this node for future reference self.revnos .entry(node_name.clone()) .and_modify(|e| e.0 = Some(revno.clone())); self.completed_node_names.insert(node_name.clone()); self.scheduled_nodes .push((node_name.clone(), merge_depth, revno)); node_name } fn build(&mut 
self) -> std::result::Result<(), Error> { while !self.node_name_stack.is_empty() { let parents_to_visit = self.pending_parents_stack.last().unwrap(); if parents_to_visit.is_empty() { self.pop_node(); } else { while !self.pending_parents_stack.last().unwrap().is_empty() { let is_left_subtree; let next_node_name; if !self.left_subtree_pushed_stack.last().unwrap() { next_node_name = self.pending_parents_stack.last_mut().unwrap().remove(0); is_left_subtree = true; *self.left_subtree_pushed_stack.last_mut().unwrap() = true; // recurse depth first into the primary parent } else { next_node_name = self .pending_parents_stack .last_mut() .unwrap() .pop() .unwrap(); is_left_subtree = false; // place any merges in right-to-left order for scheduling // which gives us left-to-right order after we reverse // the scheduled queue. XXX: This has the effect of // allocating common-new revisions to the right-most // subtree rather than the left most, which will // display nicely (you get smaller trees at the top // of the combined merge). } if self.completed_node_names.contains(&next_node_name) { // this parent was completed by a child on the // call stack. skip it. continue; } // otherwise transfer it from the source graph into the // top of the current depth first search stack. let parents = match self.take_parents(&next_node_name) { Some(parents) => parents, None => { // if the next node is not marked as pending it has // already been popped from the source graph and // placed into the current search stack (but not // completed or we would have hit the continue 4 // lines up). this indicates a cycle. 
if self.graph.contains_key(&next_node_name) { return Err(Error::Cycle(self.node_name_stack.clone())); } else { // This is just a ghost parent, ignore it continue; } } }; let next_merge_depth = usize::from(!is_left_subtree) + self.node_merge_depth_stack.last().unwrap(); self.push_node(next_node_name, next_merge_depth, parents); // and do not continue processing parents until this 'call' // has recursed. break; } } } Ok(()) } } impl Iterator for MergeSorter { type Item = std::result::Result, Error>; fn next(&mut self) -> Option { if let Err(err) = self.build() { return Some(Err(err)); } let (node_name, merge_depth, revno) = self.scheduled_nodes.pop()?; if let Some(stop) = self.stop_revision.as_ref() { if &node_name == stop { return None; } } let end_of_merge = match self.scheduled_nodes.last() { // last revision is the end of a merge None => true, // the next node is to our left Some((_, next_depth, _)) if *next_depth < merge_depth => true, // the next node was part of a multiple-merge Some((next_name, next_depth, _)) if *next_depth == merge_depth => { !self.graph.get(&node_name).unwrap().contains(next_name) } _ => false, }; let revno_out = self.generate_revno.then_some(revno); let result = ( self.sequence_number, node_name, merge_depth, revno_out, end_of_merge, ); self.sequence_number += 1; Some(Ok(result)) } } pub fn merge_sort( graph: HashMap>, branch_tip: Option, mainline_revisions: Option>, generate_revno: bool, ) -> std::result::Result>, Error> { MergeSorter::new(graph, branch_tip, mainline_revisions, generate_revno).sorted() } python-vcsgraph-0.2.0/vcsgraph/.gitignore0000644000000000000000000000001115167007306015375 0ustar00/target/ python-vcsgraph-0.2.0/vcsgraph/__init__.py0000644000000000000000000000267615167007306015541 0ustar00# Copyright (C) 2005-2010 Canonical Ltd # Copyright (C) 2018-2025 Breezy Developers # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the 
Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA """Graph algorithms for version control systems. This package contains graph-related algorithms that are useful for version control systems, including: - Topological sorting (tsort) - Graph operations (graph) """ __all__ = [ "BaseVersionedFile", "DictParentsProvider", "FrozenHeadsCache", "Graph", "KnownGraph", "MultiMemoryVersionedFile", "MultiVersionedFile", "invert_parent_map", "topo_sort", ] __version__ = (0, 2, 0) # Re-export commonly used functions and classes from .graph import ( DictParentsProvider, FrozenHeadsCache, Graph, invert_parent_map, ) from .known_graph import KnownGraph from .tsort import topo_sort python-vcsgraph-0.2.0/vcsgraph/errors.py0000644000000000000000000001270115167007306015304 0ustar00# Copyright (C) 2005-2010 Canonical Ltd # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. 
# # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA """Exceptions for vcsgraph operations.""" class Error(Exception): """Base class for vcsgraph exceptions.""" pass class UnsupportedOperation(Error): """Requested operation is not supported.""" pass class GhostRevisionsHaveNoRevno(Error): """When searching for revnos, we encounter a ghost.""" def __init__(self, revision_id, ghost_revision_id): """Initialize GhostRevisionsHaveNoRevno exception. Args: revision_id: The revision ID being searched for. ghost_revision_id: The ghost revision ID encountered. """ self.revision_id = revision_id self.ghost_revision_id = ghost_revision_id super().__init__( f"Ghost revision {ghost_revision_id!r} has no revno, " f"cannot determine revno for {revision_id!r}" ) def __eq__(self, other): """Check equality with another GhostRevisionsHaveNoRevno instance. Args: other: Object to compare with. Returns: True if both instances have the same revision_id and ghost_revision_id. """ if not isinstance(other, GhostRevisionsHaveNoRevno): return False return ( self.revision_id == other.revision_id and self.ghost_revision_id == other.ghost_revision_id ) class InvalidRevisionId(Error): """Invalid revision ID.""" def __init__(self, revision_id, client): """Initialize InvalidRevisionId exception. Args: revision_id: The invalid revision ID. client: The client that encountered the invalid revision. """ self.revision_id = revision_id self.client = client super().__init__(f"Invalid revision ID {revision_id!r}") def __eq__(self, other): """Check equality with another InvalidRevisionId instance. Args: other: Object to compare with. Returns: True if both instances have the same revision_id and client. 
""" if not isinstance(other, InvalidRevisionId): return False return self.revision_id == other.revision_id and self.client == other.client class NoCommonAncestor(Error): """No common ancestor found between revisions.""" def __init__(self, revision_a, revision_b): """Initialize NoCommonAncestor exception. Args: revision_a: First revision ID. revision_b: Second revision ID. """ self.revision_a = revision_a self.revision_b = revision_b super().__init__( f"No common ancestor found between {revision_a!r} and {revision_b!r}" ) def __eq__(self, other): """Check equality with another NoCommonAncestor instance. Args: other: Object to compare with. Returns: True if both instances have the same revision_a and revision_b. """ if not isinstance(other, NoCommonAncestor): return False return ( self.revision_a == other.revision_a and self.revision_b == other.revision_b ) class RevisionNotPresent(Error): """Revision not present in the graph.""" def __init__(self, revision_id, graph): """Initialize RevisionNotPresent exception. Args: revision_id: The revision ID not present in the graph. graph: The graph object where the revision was not found. """ self.revision_id = revision_id self.graph = graph super().__init__(f"Revision {revision_id!r} not present in graph") def __eq__(self, other): """Check equality with another RevisionNotPresent instance. Args: other: Object to compare with. Returns: True if both instances have the same revision_id and graph. """ if not isinstance(other, RevisionNotPresent): return False return self.revision_id == other.revision_id and self.graph == other.graph class GraphCycleError(Error): """Cycle detected in graph. Raised when a cycle is detected in a directed graph that should be acyclic. """ def __init__(self, graph): """Initialize with the graph containing the cycle. Args: graph: The graph object that contains a cycle. 
""" self.graph = graph super().__init__(f"Cycle in graph {graph!r}") def __eq__(self, other): """Check equality with another GraphCycleError instance. Args: other: Object to compare with. Returns: True if both instances have the same graph. """ if not isinstance(other, GraphCycleError): return False return self.graph == other.graph python-vcsgraph-0.2.0/vcsgraph/graph.py0000644000000000000000000003247115167007306015077 0ustar00# Copyright (C) 2007-2011 Canonical Ltd # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA """Graph algorithms for version control systems.""" __all__ = [ "CachingParentsProvider", "CallableToParentsProviderAdapter", "DictParentsProvider", "FrozenHeadsCache", "GraphThunkIdsToKeys", "HeadsCache", "StackedParentsProvider", "collapse_linear_regions", "invert_parent_map", ] from ._graph_rs import ( CachingParentsProvider, CallableToParentsProviderAdapter, DictParentsProvider, FrozenHeadsCache, GraphThunkIdsToKeys, HeadsCache, StackedParentsProvider, _RustGraph, collapse_linear_regions, invert_parent_map, ) from ._graph_rs import ( _BreadthFirstSearcher as _RustBreadthFirstSearcher, ) # NULL_REVISION constant NULL_REVISION = b"null:" # DIAGRAM of terminology # A # /\ # B C # | |\ # D E F # |\/| | # |/\|/ # G H # # In this diagram, relative to G and H: # A, B, C, D, E are common ancestors. 
# C, D and E are border ancestors, because each has a non-common descendant. # D and E are least common ancestors because none of their descendants are # common ancestors. # C is not a least common ancestor because its descendant, E, is a common # ancestor. # # The find_unique_lca algorithm will pick A in two steps: # 1. find_lca('G', 'H') => ['D', 'E'] # 2. Since len(['D', 'E']) > 1, find_lca('D', 'E') => ['A'] class Graph: """Provide incremental access to revision graphs. This is the generic implementation; it is intended to be subclassed to specialize it for other repository types. """ def __init__(self, parents_provider): """Construct a Graph that uses several graphs as its input. This should not normally be invoked directly, because there may be specialized implementations for particular repository types. See Repository.get_graph(). :param parents_provider: An object providing a get_parent_map call conforming to the behavior of StackedParentsProvider.get_parent_map. """ if getattr(parents_provider, "get_parents", None) is not None: self.get_parents = parents_provider.get_parents if getattr(parents_provider, "get_parent_map", None) is not None: self.get_parent_map = parents_provider.get_parent_map self._parents_provider = parents_provider # Rust-backed helper for methods that have been ported. Uses the same # parents provider; the Rust side calls back into Python via a GIL # adapter when it needs a parent lookup. self._rs = _RustGraph(parents_provider) def __repr__(self): return f"Graph({self._parents_provider!r})" def find_lca(self, *revisions): """Determine the lowest common ancestors of the provided revisions. A lowest common ancestor is a common ancestor none of whose descendants are common ancestors. In graphs, unlike trees, there may be multiple lowest common ancestors. This algorithm has two phases. Phase 1 identifies border ancestors, and phase 2 filters border ancestors to determine lowest common ancestors. 
In phase 1, border ancestors are identified, using a breadth-first search starting at the bottom of the graph. Searches are stopped whenever a node or one of its descendants is determined to be common. In phase 2, the border ancestors are filtered to find the least common ancestors. This is done by searching the ancestries of each border ancestor. Phase 2 is performed on the principle that a border ancestor that is not an ancestor of any other border ancestor is a least common ancestor. Searches are stopped when they find a node that is determined to be a common ancestor of all border ancestors, because this shows that it cannot be a descendant of any border ancestor. The scaling of this operation should be proportional to: 1. The number of uncommon ancestors 2. The number of border ancestors 3. The length of the shortest path between a border ancestor and an ancestor of all border ancestors. """ return self._rs.find_lca(revisions) def find_difference(self, left_revision, right_revision): """Determine the graph difference between two revisions.""" return self._rs.find_difference(left_revision, right_revision) def find_descendants(self, old_key, new_key): """Find descendants of old_key that are ancestors of new_key.""" return self._rs.find_descendants(old_key, new_key) def _find_descendant_ancestors(self, old_key, new_key): """Find ancestors of new_key that may be descendants of old_key.""" return self._rs._find_descendant_ancestors(old_key, new_key) def _remove_simple_descendants(self, revisions, parent_map): """Remove revisions which are children of other ones in the set. This doesn't do any graph searching, it just checks the immediate parent_map to find if there are any children which can be removed. :param revisions: A set of revision_ids :return: A set of revision_ids with the children removed """ return self._rs._remove_simple_descendants(revisions, parent_map) def get_child_map(self, keys): """Get a mapping from parents to children of the specified keys.
This is simply the inversion of get_parent_map. Only supplied keys will be discovered as children. :return: a dict of key:child_list for keys. """ parent_map = self._parents_provider.get_parent_map(keys) parent_child = {} for child, parents in sorted(parent_map.items()): for parent in parents: parent_child.setdefault(parent, []).append(child) return parent_child def find_distance_to_null(self, target_revision_id, known_revision_ids): """Find the left-hand distance to the NULL_REVISION. (This can also be considered the revno of a branch at target_revision_id.) :param target_revision_id: A revision_id which we would like to know the revno for. :param known_revision_ids: [(revision_id, revno)] A list of known (revision_id, revno) tuples. We'll use this to seed the search. """ return self._rs.find_distance_to_null(target_revision_id, known_revision_ids) def find_lefthand_distances(self, keys): """Find the distance to null for all the keys in keys. :param keys: keys to look up. :return: A dict key->distance for all of keys. """ return self._rs.find_lefthand_distances(keys) def find_unique_ancestors(self, unique_revision, common_revisions): """Find the unique ancestors for a revision versus others. This returns the ancestry of unique_revision, excluding all revisions in the ancestry of common_revisions. If unique_revision is itself in the ancestry of common_revisions, then the empty set will be returned. :param unique_revision: The revision_id whose ancestry we are interested in. (XXX: Would this API be better if we allowed multiple revisions to be searched here?) :param common_revisions: Revision_ids of ancestries to exclude. :return: A set of revisions in the ancestry of unique_revision """ return self._rs.find_unique_ancestors(unique_revision, common_revisions) def get_parent_map(self, revisions): # type: ignore """Get a map of key:parent_list for revisions. This implementation delegates to get_parents, for old parent_providers that do not supply get_parent_map.
""" result = {} for rev, parents in self.get_parents(revisions): if parents is not None: result[rev] = parents return result def _make_breadth_first_searcher(self, revisions): return _RustBreadthFirstSearcher(revisions, self) def heads(self, keys): """Return the heads from amongst keys. This is done by searching the ancestries of each key. Any key that is reachable from another key is not returned; all the others are. This operation scales with the relative depth between any two keys. If any two keys are completely disconnected all ancestry of both sides will be retrieved. :param keys: An iterable of keys. :return: A set of the heads. Note that as a set there is no ordering information. Callers will need to filter their input to create order if they need it. """ return self._rs.heads(keys) def find_merge_order(self, tip_revision_id, lca_revision_ids): """Find the order that each revision was merged into tip. This basically just walks backwards with a stack, and walks left-first until it finds a node to stop. """ return self._rs.find_merge_order(tip_revision_id, lca_revision_ids) def find_lefthand_merger(self, merged_key, tip_key): """Find the first lefthand ancestor of tip_key that merged merged_key. We do this by first finding the descendants of merged_key, then walking through the lefthand ancestry of tip_key until we find a key that doesn't descend from merged_key. Its child is the key that merged merged_key. :return: The first lefthand ancestor of tip_key to merge merged_key. merged_key if it is a lefthand ancestor of tip_key. None if no ancestor of tip_key merged merged_key. """ return self._rs.find_lefthand_merger(merged_key, tip_key) def find_unique_lca(self, left_revision, right_revision, count_steps=False): """Find a unique LCA. Find lowest common ancestors. If there is no unique common ancestor, find the lowest common ancestors of those ancestors. Iteration stops when a unique lowest common ancestor is found. 
The graph origin is necessarily a unique lowest common ancestor. Note that None is not an acceptable substitute for NULL_REVISION in the input for this method. :param count_steps: If True, the return value will be a tuple of (unique_lca, steps) where steps is the number of times that find_lca was run. If False, only unique_lca is returned. """ return self._rs.find_unique_lca(left_revision, right_revision, count_steps) def iter_ancestry(self, revision_ids): """Iterate the ancestry of the given revisions. :param revision_ids: Nodes to start the search :return: Yield tuples mapping a revision_id to its parents for the ancestry of revision_ids. Ghosts will be returned with None as their parents, and nodes with no parents will have NULL_REVISION as their only parent. (As defined by get_parent_map.) There will also be a node for (NULL_REVISION, ()) """ yield from self._rs.iter_ancestry(revision_ids) def iter_lefthand_ancestry(self, start_key, stop_keys=None): """Iterate the lefthand ancestry of start_key. Yields revisions in lefthand order. Callers may break early; the walk is lazy and does not query further than the last yielded key. Raises RevisionNotPresent if the walk reaches a key that is not in the parents provider. """ return self._rs.iter_lefthand_ancestry(start_key, stop_keys) def iter_topo_order(self, revisions): """Iterate through the input revisions in topological order. This sorting only ensures that parents come before their children. An ancestor may sort after a descendant if the relationship is not visible in the supplied list of revisions. """ return iter(self._rs.iter_topo_order(revisions)) def is_ancestor(self, candidate_ancestor, candidate_descendant): """Determine whether a revision is an ancestor of another. We answer this using heads() as heads() has the logic to perform the smallest number of parent lookups to determine the ancestral relationship between N revisions.
""" return self._rs.is_ancestor(candidate_ancestor, candidate_descendant) def is_between(self, revid, lower_bound_revid, upper_bound_revid): """Determine whether a revision is between two others. returns true if and only if: lower_bound_revid <= revid <= upper_bound_revid """ return self._rs.is_between(revid, lower_bound_revid, upper_bound_revid) _counters = [0, 0, 0, 0, 0, 0, 0] # Import KnownGraph to make it available through this module for compatibility python-vcsgraph-0.2.0/vcsgraph/known_graph.py0000644000000000000000000000201515167007306016302 0ustar00# Copyright (C) 2009, 2010 Canonical Ltd # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA """KnownGraph: graph algorithms for the case where the full ancestry is known. The implementation is provided by the Rust extension module ``_graph_rs``. 
""" from ._graph_rs import KnownGraph, _KnownGraphNode, _MergeSortNode __all__ = ["KnownGraph", "_KnownGraphNode", "_MergeSortNode"] python-vcsgraph-0.2.0/vcsgraph/py.typed0000644000000000000000000000000015167007306015102 0ustar00python-vcsgraph-0.2.0/vcsgraph/tests/0000755000000000000000000000000015167007306014557 5ustar00python-vcsgraph-0.2.0/vcsgraph/tsort.py0000644000000000000000000000331215167007306015141 0ustar00# Copyright (C) 2005, 2006, 2008 Canonical Ltd # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA """Topological sorting routines.""" from .known_graph import KnownGraph __all__ = ["MergeSorter", "TopoSorter", "merge_sort", "topo_sort"] # The Rust implementations are optional from ._graph_rs import MergeSorter, TopoSorter, merge_sort def topo_sort(graph): """Topological sort a graph. graph -- sequence of pairs of node->parents_list. The result is a list of node names, such that all parents come before their children. node identifiers can be any hashable object, and are typically strings. This function has the same purpose as the TopoSorter class, but uses a different algorithm to sort the graph. That means that while both return a list with parents before their child nodes, the exact ordering can be different. 
topo_sort is faster when the whole list is needed, while when iterating over a part of the list, TopoSorter.iter_topo_order should be used. """ kg = KnownGraph(dict(graph)) return kg.topo_sort() python-vcsgraph-0.2.0/vcsgraph/tests/__init__.py0000644000000000000000000000360315167007306016672 0ustar00# Copyright (C) 2025 Breezy Developers # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA import os import shutil import tempfile import unittest from unittest import TestCase __all__ = ["TestCaseInTempDir"] class TestCaseInTempDir(TestCase): """Minimal TestCase that runs in a temporary directory. Only implements the functionality actually needed by vcsgraph tests. 
""" def setUp(self): super().setUp() self.test_dir = tempfile.mkdtemp(prefix="vcsgraph_test_") self.original_dir = os.getcwd() os.chdir(self.test_dir) def tearDown(self): os.chdir(self.original_dir) shutil.rmtree(self.test_dir) super().tearDown() def assertPathExists(self, path): """Fail unless path exists.""" self.assertTrue(os.path.lexists(path), f"{path} does not exist") def assertPathDoesNotExist(self, path): """Fail if path exists.""" self.assertFalse(os.path.lexists(path), f"{path} exists") def test_suite() -> unittest.TestSuite: names = [ "known_graph", "graph", "tsort", ] module_names = ["vcsgraph.tests.test_" + name for name in names] loader = unittest.TestLoader() return loader.loadTestsFromNames(module_names) python-vcsgraph-0.2.0/vcsgraph/tests/test_graph.py0000644000000000000000000020373215167007306017300 0ustar00# Copyright (C) 2007-2011 Canonical Ltd # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA from unittest import TestCase from .. import errors from .. import ( graph as _mod_graph, ) from .. 
import ( known_graph as _mod_known_graph, ) from ..graph import NULL_REVISION # Ancestry 1: # # NULL_REVISION # | # rev1 # /\ # rev2a rev2b # | | # rev3 / # | / # rev4 ancestry_1 = { b"rev1": (NULL_REVISION,), b"rev2a": (b"rev1",), b"rev2b": (b"rev1",), b"rev3": (b"rev2a",), b"rev4": (b"rev3", b"rev2b"), } # Ancestry 2: # # NULL_REVISION # / \ # rev1a rev1b # | # rev2a # | # rev3a # | # rev4a ancestry_2 = { b"rev1a": (NULL_REVISION,), b"rev2a": (b"rev1a",), b"rev1b": (NULL_REVISION,), b"rev3a": (b"rev2a",), b"rev4a": (b"rev3a",), } # Criss cross ancestry # # NULL_REVISION # | # rev1 # / \ # rev2a rev2b # |\ /| # | X | # |/ \| # rev3a rev3b criss_cross = { b"rev1": (NULL_REVISION,), b"rev2a": (b"rev1",), b"rev2b": (b"rev1",), b"rev3a": (b"rev2a", b"rev2b"), b"rev3b": (b"rev2b", b"rev2a"), } # Criss-cross 2 # # NULL_REVISION # / \ # rev1a rev1b # |\ /| # | \ / | # | X | # | / \ | # |/ \| # rev2a rev2b criss_cross2 = { b"rev1a": [NULL_REVISION], b"rev1b": [NULL_REVISION], b"rev2a": [b"rev1a", b"rev1b"], b"rev2b": [b"rev1b", b"rev1a"], } # Mainline: # # NULL_REVISION # | # rev1 # / \ # | rev2b # | / # rev2a mainline = { b"rev1": [NULL_REVISION], b"rev2a": [b"rev1", b"rev2b"], b"rev2b": [b"rev1"], } # feature branch: # # NULL_REVISION # | # rev1 # | # rev2b # | # rev3b feature_branch = {b"rev1": [NULL_REVISION], b"rev2b": [b"rev1"], b"rev3b": [b"rev2b"]} # History shortcut # NULL_REVISION # | # rev1------ # / \ \ # rev2a rev2b rev2c # | / \ / # rev3a rev3b history_shortcut = { b"rev1": [NULL_REVISION], b"rev2a": [b"rev1"], b"rev2b": [b"rev1"], b"rev2c": [b"rev1"], b"rev3a": [b"rev2a", b"rev2b"], b"rev3b": [b"rev2b", b"rev2c"], } # Extended history shortcut # NULL_REVISION # | # a # |\ # b | # | | # c | # | | # d | # |\| # e f extended_history_shortcut = { b"a": [NULL_REVISION], b"b": [b"a"], b"c": [b"b"], b"d": [b"c"], b"e": [b"d"], b"f": [b"a", b"d"], } # Double shortcut # Both sides will see b'A' first, even though it is actually a descendant of a # different common
revision. # # NULL_REVISION # | # a # /|\ # / b \ # / | \ # | c | # | / \ | # | d e | # |/ \| # f g double_shortcut = { b"a": [NULL_REVISION], b"b": [b"a"], b"c": [b"b"], b"d": [b"c"], b"e": [b"c"], b"f": [b"a", b"d"], b"g": [b"a", b"e"], } # Complex shortcut # This has a failure mode in that a shortcut will find some nodes in common, # but the common searcher won't have time to find that one branch is actually # in common. The extra nodes at the beginning are because we want to avoid # walking off the graph. Specifically, node G should be considered common, but # is likely to be seen by M long before the common searcher finds it. # # NULL_REVISION # | # a # | # b # | # c # | # d # |\ # e f # | |\ # | g h # |/| | # i j | # | | | # | k | # | | | # | l | # |/|/ # m n complex_shortcut = { b"a": [NULL_REVISION], b"b": [b"a"], b"c": [b"b"], b"d": [b"c"], b"e": [b"d"], b"f": [b"d"], b"g": [b"f"], b"h": [b"f"], b"i": [b"e", b"g"], b"j": [b"g"], b"k": [b"j"], b"l": [b"k"], b"m": [b"i", b"l"], b"n": [b"l", b"h"], } # NULL_REVISION # | # a # | # b # | # c # | # d # |\ # e | # | | # f | # | | # g h # | |\ # i | j # |\| | # | k | # | | | # | l | # | | | # | m | # | | | # | n | # | | | # | o | # | | | # | p | # | | | # | q | # | | | # | r | # | | | # | s | # | | | # |/|/ # t u complex_shortcut2 = { b"a": [NULL_REVISION], b"b": [b"a"], b"c": [b"b"], b"d": [b"c"], b"e": [b"d"], b"f": [b"e"], b"g": [b"f"], b"h": [b"d"], b"i": [b"g"], b"j": [b"h"], b"k": [b"h", b"i"], b"l": [b"k"], b"m": [b"l"], b"n": [b"m"], b"o": [b"n"], b"p": [b"o"], b"q": [b"p"], b"r": [b"q"], b"s": [b"r"], b"t": [b"i", b"s"], b"u": [b"s", b"j"], } # Graph where different walkers will race to find the common and uncommon # nodes. 
# # NULL_REVISION # | # a # | # b # | # c # | # d # |\ # e k # | | # f-+-p # | | | # | l | # | | | # | m | # | |\| # g n q # |\| | # h o | # |/| | # i r | # | | | # | s | # | | | # | t | # | | | # | u | # | | | # | v | # | | | # | w | # | | | # | x | # | |\| # | y z # |/ # j # # x is found to be common right away, but is the start of a long series of # common commits. # o is actually common, but the i-j shortcut makes it look like it is actually # unique to j at first, you have to traverse all of x->o to find it. # q,m gives the walker from j a common point to stop searching, as does p,f. # k-n exists so that the second pass still has nodes that are worth searching, # rather than instantly cancelling the extra walker. racing_shortcuts = { b"a": [NULL_REVISION], b"b": [b"a"], b"c": [b"b"], b"d": [b"c"], b"e": [b"d"], b"f": [b"e"], b"g": [b"f"], b"h": [b"g"], b"i": [b"h", b"o"], b"j": [b"i", b"y"], b"k": [b"d"], b"l": [b"k"], b"m": [b"l"], b"n": [b"m"], b"o": [b"n", b"g"], b"p": [b"f"], b"q": [b"p", b"m"], b"r": [b"o"], b"s": [b"r"], b"t": [b"s"], b"u": [b"t"], b"v": [b"u"], b"w": [b"v"], b"x": [b"w"], b"y": [b"x"], b"z": [b"x", b"q"], } # A graph with multiple nodes unique to one side. 
# # NULL_REVISION # | # a # | # b # | # c # | # d # |\ # e f # |\ \ # g h i # |\ \ \ # j k l m # | |/ x| # | n o p # | |/ | # | q | # | | | # | r | # | | | # | s | # | | | # | t | # | | | # | u | # | | | # | v | # | | | # | w | # | | | # | x | # |/ \ / # y z # multiple_interesting_unique = { b"a": (NULL_REVISION,), b"b": [b"a"], b"c": [b"b"], b"d": [b"c"], b"e": [b"d"], b"f": [b"d"], b"g": [b"e"], b"h": [b"e"], b"i": [b"f"], b"j": [b"g"], b"k": [b"g"], b"l": [b"h"], b"m": [b"i"], b"n": [b"k", b"l"], b"o": [b"m"], b"p": [b"m", b"l"], b"q": [b"n", b"o"], b"r": [b"q"], b"s": [b"r"], b"t": [b"s"], b"u": [b"t"], b"v": [b"u"], b"w": [b"v"], b"x": [b"w"], b"y": [b"j", b"x"], b"z": [b"x", b"p"], } # Shortcut with extra root # We have a long history shortcut, and an extra root, which is why we can't # stop searchers based on seeing NULL_REVISION # NULL_REVISION # | | # a | # |\ | # b | | # | | | # c | | # | | | # d | g # |\|/ # e f shortcut_extra_root = { b"a": (NULL_REVISION,), b"b": (b"a",), b"c": (b"b",), b"d": (b"c",), b"e": (b"d",), b"f": (b"a", b"d", b"g"), b"g": (NULL_REVISION,), } # NULL_REVISION # | # f # | # e # / \ # b d # | \ | # a c boundary = { b"a": (b"b",), b"c": (b"b", b"d"), b"b": (b"e",), b"d": (b"e",), b"e": (b"f",), b"f": (NULL_REVISION,), } # A graph that contains a ghost # NULL_REVISION # | # f # | # e g # / \ / # b d # | \ | # a c with_ghost = { b"a": (b"b",), b"c": (b"b", b"d"), b"b": (b"e",), b"d": (b"e", b"g"), b"e": (b"f",), b"f": (NULL_REVISION,), NULL_REVISION: (), } # A graph that shows we can shortcut finding revnos when reaching them from the # side. 
# NULL_REVISION # | # a # | # b # | # c # | # d # | # e # / \ # f g # | # h # | # i with_tail = { b"a": [NULL_REVISION], b"b": [b"a"], b"c": [b"b"], b"d": [b"c"], b"e": [b"d"], b"f": [b"e"], b"g": [b"e"], b"h": [b"f"], b"i": [b"h"], } class InstrumentedParentsProvider: def __init__(self, parents_provider): self.calls = [] self._real_parents_provider = parents_provider get_cached = getattr(parents_provider, "get_cached_parent_map", None) if get_cached is not None: # Only expose the underlying 'get_cached_parent_map' function if # the wrapped provider has it. self.get_cached_parent_map = self._get_cached_parent_map def get_parent_map(self, nodes): self.calls.extend(nodes) return self._real_parents_provider.get_parent_map(nodes) def _get_cached_parent_map(self, nodes): self.calls.append(("cached", sorted(nodes))) return self._real_parents_provider.get_cached_parent_map(nodes) class SharedInstrumentedParentsProvider: def __init__(self, parents_provider, calls, info): self.calls = calls self.info = info self._real_parents_provider = parents_provider get_cached = getattr(parents_provider, "get_cached_parent_map", None) if get_cached is not None: # Only expose the underlying 'get_cached_parent_map' function if # the wrapped provider has it. 
self.get_cached_parent_map = self._get_cached_parent_map def get_parent_map(self, nodes): self.calls.append((self.info, sorted(nodes))) return self._real_parents_provider.get_parent_map(nodes) def _get_cached_parent_map(self, nodes): self.calls.append((self.info, "cached", sorted(nodes))) return self._real_parents_provider.get_cached_parent_map(nodes) class TestGraphBase(TestCase): def make_graph(self, ancestors): return _mod_graph.Graph(_mod_graph.DictParentsProvider(ancestors)) def make_breaking_graph(self, ancestors, break_on): """Make a Graph that raises an exception if we hit a node.""" g = self.make_graph(ancestors) orig_parent_map = g.get_parent_map def get_parent_map(keys): bad_keys = set(keys).intersection(break_on) if bad_keys: self.fail(f"key(s) {sorted(bad_keys)} was accessed") return orig_parent_map(keys) g.get_parent_map = get_parent_map return g class TestGraph(TestCase): def make_graph(self, ancestors): return _mod_graph.Graph(_mod_graph.DictParentsProvider(ancestors)) def build_ancestry(self, tree, ancestors): """Create an ancestry as specified by a graph dict. :param tree: A tree to use :param ancestors: a dict of {node: [node_parent, ...]} """ pending = [NULL_REVISION] descendants = {} for descendant, parents in ancestors.items(): for parent in parents: descendants.setdefault(parent, []).append(descendant) while len(pending) > 0: cur_node = pending.pop() for descendant in descendants.get(cur_node, []): if tree.branch.repository.has_revision(descendant): continue parents = [p for p in ancestors[descendant] if p is not NULL_REVISION] if ( len( [ p for p in parents if not tree.branch.repository.has_revision(p) ] ) > 0 ): continue tree.set_parent_ids(parents) left_parent = parents[0] if len(parents) > 0 else NULL_REVISION tree.branch.set_last_revision_info( len(tree.branch._lefthand_history(left_parent)), left_parent ) tree.commit(descendant, rev_id=descendant) pending.append(descendant) def test_lca(self): """Test finding least common ancestor. 
ancestry_1 should always have a single common ancestor. """ graph = self.make_graph(ancestry_1) self.assertRaises(errors.InvalidRevisionId, graph.find_lca, None) self.assertEqual({NULL_REVISION}, graph.find_lca(NULL_REVISION, NULL_REVISION)) self.assertEqual({NULL_REVISION}, graph.find_lca(NULL_REVISION, b"rev1")) self.assertEqual({b"rev1"}, graph.find_lca(b"rev1", b"rev1")) self.assertEqual({b"rev1"}, graph.find_lca(b"rev2a", b"rev2b")) def test_no_unique_lca(self): """Test error when one revision is not in the graph.""" graph = self.make_graph(ancestry_1) self.assertRaises( errors.NoCommonAncestor, graph.find_unique_lca, b"rev1", b"1rev" ) def test_lca_criss_cross(self): """Test least-common-ancestor after a criss-cross merge.""" graph = self.make_graph(criss_cross) self.assertEqual({b"rev2a", b"rev2b"}, graph.find_lca(b"rev3a", b"rev3b")) self.assertEqual({b"rev2b"}, graph.find_lca(b"rev3a", b"rev3b", b"rev2b")) def test_lca_shortcut(self): """Test least-common ancestor on this history shortcut.""" graph = self.make_graph(history_shortcut) self.assertEqual({b"rev2b"}, graph.find_lca(b"rev3a", b"rev3b")) def test_lefthand_distance_smoke(self): """A simple does-it-work test for graph.find_lefthand_distances(keys).""" graph = self.make_graph(history_shortcut) distance_graph = graph.find_lefthand_distances([b"rev3b", b"rev2a"]) self.assertEqual({b"rev2a": 2, b"rev3b": 3}, distance_graph) def test_lefthand_distance_ghosts(self): """A key whose lefthand ancestry ends in a ghost gets distance -1.""" nodes = {b"nonghost": [NULL_REVISION], b"toghost": [b"ghost"]} graph = self.make_graph(nodes) distance_graph = graph.find_lefthand_distances([b"nonghost", b"toghost"]) self.assertEqual({b"nonghost": 1, b"toghost": -1}, distance_graph) def test_recursive_unique_lca(self): """Test finding a unique least common ancestor.
ancestry_1 should always have a single common ancestor """ graph = self.make_graph(ancestry_1) self.assertEqual( NULL_REVISION, graph.find_unique_lca(NULL_REVISION, NULL_REVISION) ) self.assertEqual(NULL_REVISION, graph.find_unique_lca(NULL_REVISION, b"rev1")) self.assertEqual(b"rev1", graph.find_unique_lca(b"rev1", b"rev1")) self.assertEqual(b"rev1", graph.find_unique_lca(b"rev2a", b"rev2b")) self.assertEqual( ( b"rev1", 1, ), graph.find_unique_lca(b"rev2a", b"rev2b", count_steps=True), ) def assertRemoveDescendants(self, expected, graph, revisions): parents = graph.get_parent_map(revisions) self.assertEqual(expected, graph._remove_simple_descendants(revisions, parents)) def test__remove_simple_descendants(self): graph = self.make_graph(ancestry_1) self.assertRemoveDescendants( {b"rev1"}, graph, {b"rev1", b"rev2a", b"rev2b", b"rev3", b"rev4"} ) def test__remove_simple_descendants_disjoint(self): graph = self.make_graph(ancestry_1) self.assertRemoveDescendants({b"rev1", b"rev3"}, graph, {b"rev1", b"rev3"}) def test__remove_simple_descendants_chain(self): graph = self.make_graph(ancestry_1) self.assertRemoveDescendants({b"rev1"}, graph, {b"rev1", b"rev2a", b"rev3"}) def test__remove_simple_descendants_siblings(self): graph = self.make_graph(ancestry_1) self.assertRemoveDescendants( {b"rev2a", b"rev2b"}, graph, {b"rev2a", b"rev2b", b"rev3"} ) def test_unique_lca_criss_cross(self): """Ensure we don't pick non-unique lcas in a criss-cross.""" graph = self.make_graph(criss_cross) self.assertEqual(b"rev1", graph.find_unique_lca(b"rev3a", b"rev3b")) lca, steps = graph.find_unique_lca(b"rev3a", b"rev3b", count_steps=True) self.assertEqual(b"rev1", lca) self.assertEqual(2, steps) def test_unique_lca_null_revision(self): """Ensure we pick NULL_REVISION when necessary.""" graph = self.make_graph(criss_cross2) self.assertEqual(b"rev1b", graph.find_unique_lca(b"rev2a", b"rev1b")) self.assertEqual(NULL_REVISION, graph.find_unique_lca(b"rev2a", b"rev2b")) def 
test_unique_lca_null_revision2(self): """Ensure we pick NULL_REVISION when necessary.""" graph = self.make_graph(ancestry_2) self.assertEqual(NULL_REVISION, graph.find_unique_lca(b"rev4a", b"rev1b")) def test_lca_double_shortcut(self): graph = self.make_graph(double_shortcut) self.assertEqual(b"c", graph.find_unique_lca(b"f", b"g")) def test_graph_difference(self): graph = self.make_graph(ancestry_1) self.assertEqual((set(), set()), graph.find_difference(b"rev1", b"rev1")) self.assertEqual( (set(), {b"rev1"}), graph.find_difference(NULL_REVISION, b"rev1") ) self.assertEqual( ({b"rev1"}, set()), graph.find_difference(b"rev1", NULL_REVISION) ) self.assertEqual( ({b"rev2a", b"rev3"}, {b"rev2b"}), graph.find_difference(b"rev3", b"rev2b") ) self.assertEqual( ({b"rev4", b"rev3", b"rev2a"}, set()), graph.find_difference(b"rev4", b"rev2b"), ) def test_graph_difference_separate_ancestry(self): graph = self.make_graph(ancestry_2) self.assertEqual( ({b"rev1a"}, {b"rev1b"}), graph.find_difference(b"rev1a", b"rev1b") ) self.assertEqual( ({b"rev1a", b"rev2a", b"rev3a", b"rev4a"}, {b"rev1b"}), graph.find_difference(b"rev4a", b"rev1b"), ) def test_graph_difference_criss_cross(self): graph = self.make_graph(criss_cross) self.assertEqual( ({b"rev3a"}, {b"rev3b"}), graph.find_difference(b"rev3a", b"rev3b") ) self.assertEqual( (set(), {b"rev3b", b"rev2b"}), graph.find_difference(b"rev2a", b"rev3b") ) def test_graph_difference_extended_history(self): graph = self.make_graph(extended_history_shortcut) self.assertEqual(({b"e"}, {b"f"}), graph.find_difference(b"e", b"f")) self.assertEqual(({b"f"}, {b"e"}), graph.find_difference(b"f", b"e")) def test_graph_difference_double_shortcut(self): graph = self.make_graph(double_shortcut) self.assertEqual( ({b"d", b"f"}, {b"e", b"g"}), graph.find_difference(b"f", b"g") ) def test_graph_difference_complex_shortcut(self): graph = self.make_graph(complex_shortcut) self.assertEqual( ({b"m", b"i", b"e"}, {b"n", b"h"}), graph.find_difference(b"m", b"n") 
) def test_graph_difference_complex_shortcut2(self): graph = self.make_graph(complex_shortcut2) self.assertEqual(({b"t"}, {b"j", b"u"}), graph.find_difference(b"t", b"u")) def test_graph_difference_shortcut_extra_root(self): graph = self.make_graph(shortcut_extra_root) self.assertEqual(({b"e"}, {b"f", b"g"}), graph.find_difference(b"e", b"f")) def test_iter_topo_order(self): graph = self.make_graph(ancestry_1) args = [b"rev2a", b"rev3", b"rev1"] topo_args = list(graph.iter_topo_order(args)) self.assertEqual(set(args), set(topo_args)) self.assertGreater(topo_args.index(b"rev2a"), topo_args.index(b"rev1")) self.assertLess(topo_args.index(b"rev2a"), topo_args.index(b"rev3")) def test_is_ancestor(self): graph = self.make_graph(ancestry_1) self.assertEqual(True, graph.is_ancestor(b"null:", b"null:")) self.assertEqual(True, graph.is_ancestor(b"null:", b"rev1")) self.assertEqual(False, graph.is_ancestor(b"rev1", b"null:")) self.assertEqual(True, graph.is_ancestor(b"null:", b"rev4")) self.assertEqual(False, graph.is_ancestor(b"rev4", b"null:")) self.assertEqual(False, graph.is_ancestor(b"rev4", b"rev2b")) self.assertEqual(True, graph.is_ancestor(b"rev2b", b"rev4")) self.assertEqual(False, graph.is_ancestor(b"rev2b", b"rev3")) self.assertEqual(False, graph.is_ancestor(b"rev3", b"rev2b")) instrumented_provider = InstrumentedParentsProvider(graph) instrumented_graph = _mod_graph.Graph(instrumented_provider) instrumented_graph.is_ancestor(b"rev2a", b"rev2b") self.assertNotIn(b"null:", instrumented_provider.calls) def test_is_between(self): graph = self.make_graph(ancestry_1) self.assertEqual(True, graph.is_between(b"null:", b"null:", b"null:")) self.assertEqual(True, graph.is_between(b"rev1", b"null:", b"rev1")) self.assertEqual(True, graph.is_between(b"rev1", b"rev1", b"rev4")) self.assertEqual(True, graph.is_between(b"rev4", b"rev1", b"rev4")) self.assertEqual(True, graph.is_between(b"rev3", b"rev1", b"rev4")) self.assertEqual(False, graph.is_between(b"rev4", b"rev1", 
b"rev3")) self.assertEqual(False, graph.is_between(b"rev1", b"rev2a", b"rev4")) self.assertEqual(False, graph.is_between(b"null:", b"rev1", b"rev4")) def test_is_ancestor_boundary(self): """Ensure that we avoid searching the whole graph. This requires searching through b as a common ancestor, so we can identify that e is common. """ graph = self.make_graph(boundary) instrumented_provider = InstrumentedParentsProvider(graph) graph = _mod_graph.Graph(instrumented_provider) self.assertFalse(graph.is_ancestor(b"a", b"c")) self.assertNotIn(b"null:", instrumented_provider.calls) def test_iter_ancestry(self): nodes = boundary.copy() nodes[NULL_REVISION] = () graph = self.make_graph(nodes) expected = nodes.copy() expected.pop(b"a") # b'a' is not in the ancestry of b'c', all the # other nodes are self.assertEqual(expected, dict(graph.iter_ancestry([b"c"]))) self.assertEqual(nodes, dict(graph.iter_ancestry([b"a", b"c"]))) def test_iter_ancestry_with_ghost(self): graph = self.make_graph(with_ghost) expected = with_ghost.copy() # b'a' is not in the ancestry of b'c', and b'g' is a ghost expected[b"g"] = None self.assertEqual(expected, dict(graph.iter_ancestry([b"a", b"c"]))) expected.pop(b"a") self.assertEqual(expected, dict(graph.iter_ancestry([b"c"]))) def test_filter_candidate_lca(self): """Test filter_candidate_lca for a corner case. This tests the case where we encounter the end of iteration for b'e' in the same pass as we discover that b'd' is an ancestor of b'e', and therefore b'e' can't be an lca. To compensate for different dict orderings on other Python implementations, we mirror b'd' and b'e' with b'b' and b'a'. """ # This test is sensitive to the iteration order of dicts. 
It will # pass incorrectly if b'e' and b'a' sort before b'c' # # NULL_REVISION # / \ # a e # | | # b d # \ / # c graph = self.make_graph( { b"c": [b"b", b"d"], b"d": [b"e"], b"b": [b"a"], b"a": [NULL_REVISION], b"e": [NULL_REVISION], } ) self.assertEqual({b"c"}, graph.heads([b"a", b"c", b"e"])) def test_heads_null(self): graph = self.make_graph(ancestry_1) self.assertEqual({b"null:"}, graph.heads([b"null:"])) self.assertEqual({b"rev1"}, graph.heads([b"null:", b"rev1"])) self.assertEqual({b"rev1"}, graph.heads([b"rev1", b"null:"])) self.assertEqual({b"rev1"}, graph.heads({b"rev1", b"null:"})) self.assertEqual({b"rev1"}, graph.heads((b"rev1", b"null:"))) def test_heads_one(self): # A single node will always be a head graph = self.make_graph(ancestry_1) self.assertEqual({b"null:"}, graph.heads([b"null:"])) self.assertEqual({b"rev1"}, graph.heads([b"rev1"])) self.assertEqual({b"rev2a"}, graph.heads([b"rev2a"])) self.assertEqual({b"rev2b"}, graph.heads([b"rev2b"])) self.assertEqual({b"rev3"}, graph.heads([b"rev3"])) self.assertEqual({b"rev4"}, graph.heads([b"rev4"])) def test_heads_single(self): graph = self.make_graph(ancestry_1) self.assertEqual({b"rev4"}, graph.heads([b"null:", b"rev4"])) self.assertEqual({b"rev2a"}, graph.heads([b"rev1", b"rev2a"])) self.assertEqual({b"rev2b"}, graph.heads([b"rev1", b"rev2b"])) self.assertEqual({b"rev3"}, graph.heads([b"rev1", b"rev3"])) self.assertEqual({b"rev4"}, graph.heads([b"rev1", b"rev4"])) self.assertEqual({b"rev4"}, graph.heads([b"rev2a", b"rev4"])) self.assertEqual({b"rev4"}, graph.heads([b"rev2b", b"rev4"])) self.assertEqual({b"rev4"}, graph.heads([b"rev3", b"rev4"])) def test_heads_two_heads(self): graph = self.make_graph(ancestry_1) self.assertEqual({b"rev2a", b"rev2b"}, graph.heads([b"rev2a", b"rev2b"])) self.assertEqual({b"rev3", b"rev2b"}, graph.heads([b"rev3", b"rev2b"])) def test_heads_criss_cross(self): graph = self.make_graph(criss_cross) self.assertEqual({b"rev2a"}, graph.heads([b"rev2a", b"rev1"])) 
self.assertEqual({b"rev2b"}, graph.heads([b"rev2b", b"rev1"])) self.assertEqual({b"rev3a"}, graph.heads([b"rev3a", b"rev1"])) self.assertEqual({b"rev3b"}, graph.heads([b"rev3b", b"rev1"])) self.assertEqual({b"rev2a", b"rev2b"}, graph.heads([b"rev2a", b"rev2b"])) self.assertEqual({b"rev3a"}, graph.heads([b"rev3a", b"rev2a"])) self.assertEqual({b"rev3a"}, graph.heads([b"rev3a", b"rev2b"])) self.assertEqual({b"rev3a"}, graph.heads([b"rev3a", b"rev2a", b"rev2b"])) self.assertEqual({b"rev3b"}, graph.heads([b"rev3b", b"rev2a"])) self.assertEqual({b"rev3b"}, graph.heads([b"rev3b", b"rev2b"])) self.assertEqual({b"rev3b"}, graph.heads([b"rev3b", b"rev2a", b"rev2b"])) self.assertEqual({b"rev3a", b"rev3b"}, graph.heads([b"rev3a", b"rev3b"])) self.assertEqual( {b"rev3a", b"rev3b"}, graph.heads([b"rev3a", b"rev3b", b"rev2a", b"rev2b"]) ) def test_heads_shortcut(self): graph = self.make_graph(history_shortcut) self.assertEqual( {b"rev2a", b"rev2b", b"rev2c"}, graph.heads([b"rev2a", b"rev2b", b"rev2c"]) ) self.assertEqual({b"rev3a", b"rev3b"}, graph.heads([b"rev3a", b"rev3b"])) self.assertEqual( {b"rev3a", b"rev3b"}, graph.heads([b"rev2a", b"rev3a", b"rev3b"]) ) self.assertEqual({b"rev2a", b"rev3b"}, graph.heads([b"rev2a", b"rev3b"])) self.assertEqual({b"rev2c", b"rev3a"}, graph.heads([b"rev2c", b"rev3a"])) def _run_heads_break_deeper(self, graph_dict, search): """Run heads on a graph-as-a-dict. If the search asks for the parents of b'deeper' the test will fail. 
""" class stub: pass def get_parent_map(keys): result = {} for key in keys: if key == b"deeper": self.fail("key deeper was accessed") result[key] = graph_dict[key] return result an_obj = stub() an_obj.get_parent_map = get_parent_map graph = _mod_graph.Graph(an_obj) return graph.heads(search) def test_heads_limits_search(self): # test that a heads query does not search all of history graph_dict = { b"left": [b"common"], b"right": [b"common"], b"common": [b"deeper"], } self.assertEqual( {b"left", b"right"}, self._run_heads_break_deeper(graph_dict, [b"left", b"right"]), ) def test_heads_limits_search_assymetric(self): # test that a heads query does not search all of history graph_dict = { b"left": [b"midleft"], b"midleft": [b"common"], b"right": [b"common"], b"common": [b"aftercommon"], b"aftercommon": [b"deeper"], } self.assertEqual( {b"left", b"right"}, self._run_heads_break_deeper(graph_dict, [b"left", b"right"]), ) def test_heads_limits_search_common_search_must_continue(self): # test that common nodes are still queried, preventing # all-the-way-to-origin behaviour in the following graph: graph_dict = { b"h1": [b"shortcut", b"common1"], b"h2": [b"common1"], b"shortcut": [b"common2"], b"common1": [b"common2"], b"common2": [b"deeper"], } self.assertEqual( {b"h1", b"h2"}, self._run_heads_break_deeper(graph_dict, [b"h1", b"h2"]) ) def test_breadth_first_search_start_ghosts(self): graph = self.make_graph({}) # with_ghosts reports the ghosts search = graph._make_breadth_first_searcher([b"a-ghost"]) self.assertEqual((set(), {b"a-ghost"}), search.next_with_ghosts()) self.assertRaises(StopIteration, search.next_with_ghosts) # next includes them search = graph._make_breadth_first_searcher([b"a-ghost"]) self.assertEqual({b"a-ghost"}, next(search)) self.assertRaises(StopIteration, next, search) def test_breadth_first_search_deep_ghosts(self): graph = self.make_graph( { b"head": [b"present"], b"present": [b"child", b"ghost"], b"child": [], } ) # with_ghosts reports the ghosts 
search = graph._make_breadth_first_searcher([b"head"]) self.assertEqual(({b"head"}, set()), search.next_with_ghosts()) self.assertEqual(({b"present"}, set()), search.next_with_ghosts()) self.assertEqual(({b"child"}, {b"ghost"}), search.next_with_ghosts()) self.assertRaises(StopIteration, search.next_with_ghosts) # next includes them search = graph._make_breadth_first_searcher([b"head"]) self.assertEqual({b"head"}, next(search)) self.assertEqual({b"present"}, next(search)) self.assertEqual({b"child", b"ghost"}, next(search)) self.assertRaises(StopIteration, next, search) def test_breadth_first_search_change_next_to_next_with_ghosts(self): # To make the API robust, we allow calling both next() and # next_with_ghosts() on the same searcher. graph = self.make_graph( { b"head": [b"present"], b"present": [b"child", b"ghost"], b"child": [], } ) # start with next_with_ghosts search = graph._make_breadth_first_searcher([b"head"]) self.assertEqual(({b"head"}, set()), search.next_with_ghosts()) self.assertEqual({b"present"}, next(search)) self.assertEqual(({b"child"}, {b"ghost"}), search.next_with_ghosts()) self.assertRaises(StopIteration, next, search) # start with next search = graph._make_breadth_first_searcher([b"head"]) self.assertEqual({b"head"}, next(search)) self.assertEqual(({b"present"}, set()), search.next_with_ghosts()) self.assertEqual({b"child", b"ghost"}, next(search)) self.assertRaises(StopIteration, search.next_with_ghosts) def test_breadth_first_change_search(self): # Changing the search should work with both next and next_with_ghosts. 
graph = self.make_graph( { b"head": [b"present"], b"present": [b"stopped"], b"other": [b"other_2"], b"other_2": [], } ) search = graph._make_breadth_first_searcher([b"head"]) self.assertEqual(({b"head"}, set()), search.next_with_ghosts()) self.assertEqual(({b"present"}, set()), search.next_with_ghosts()) self.assertEqual({b"present"}, search.stop_searching_any([b"present"])) self.assertEqual( ({b"other"}, {b"other_ghost"}), search.start_searching([b"other", b"other_ghost"]), ) self.assertEqual(({b"other_2"}, set()), search.next_with_ghosts()) self.assertRaises(StopIteration, search.next_with_ghosts) # next includes them search = graph._make_breadth_first_searcher([b"head"]) self.assertEqual({b"head"}, next(search)) self.assertEqual({b"present"}, next(search)) self.assertEqual({b"present"}, search.stop_searching_any([b"present"])) search.start_searching([b"other", b"other_ghost"]) self.assertEqual({b"other_2"}, next(search)) self.assertRaises(StopIteration, next, search) def assertSeenAndResult(self, instructions, search, next): """Check the results of .seen and get_result() for a search. :param instructions: A list of tuples: (seen, recipe, included_keys, starts, stops). seen, recipe and included_keys are results to check on the search and the search's get_result(). starts and stops are parameters to pass to start_searching and stop_searching_any during each iteration, if they are not None. :param search: The search to use. :param next: A callable to advance the search. """ for seen, recipe, included_keys, starts, stops in instructions: # Adjust for recipe contract changes that don't vary for all the # current tests. 
recipe = ("search",) + recipe next() if starts is not None: search.start_searching(starts) if stops is not None: search.stop_searching_any(stops) state = search.get_state() self.assertEqual(set(included_keys), state[2]) self.assertEqual(seen, search.seen) def test_breadth_first_get_result_excludes_current_pending(self): graph = self.make_graph( { b"head": [b"child"], b"child": [NULL_REVISION], NULL_REVISION: [], } ) search = graph._make_breadth_first_searcher([b"head"]) # At the start, nothing has been seen, so it's all excluded: state = search.get_state() self.assertEqual(({b"head"}, {b"head"}, set()), state) self.assertEqual(set(), search.seen) # using next: expected = [ ({b"head"}, ({b"head"}, {b"child"}, 1), [b"head"], None, None), ( {b"head", b"child"}, ({b"head"}, {NULL_REVISION}, 2), [b"head", b"child"], None, None, ), ( {b"head", b"child", NULL_REVISION}, ({b"head"}, set(), 3), [b"head", b"child", NULL_REVISION], None, None, ), ] self.assertSeenAndResult(expected, search, search.__next__) # using next_with_ghosts: search = graph._make_breadth_first_searcher([b"head"]) self.assertSeenAndResult(expected, search, search.next_with_ghosts) def test_breadth_first_get_result_starts_stops(self): graph = self.make_graph( { b"head": [b"child"], b"child": [NULL_REVISION], b"otherhead": [b"otherchild"], b"otherchild": [b"excluded"], b"excluded": [NULL_REVISION], NULL_REVISION: [], } ) search = graph._make_breadth_first_searcher([]) # Starting with nothing and adding a search works: search.start_searching([b"head"]) # head has been seen: state = search.get_state() self.assertEqual(({b"head"}, {b"child"}, {b"head"}), state) self.assertEqual({b"head"}, search.seen) # using next: expected = [ # stop at child, and start a new search at otherhead: # - otherhead counts as seen immediately when start_searching is # called. 
( {b"head", b"child", b"otherhead"}, ({b"head", b"otherhead"}, {b"child", b"otherchild"}, 2), [b"head", b"otherhead"], [b"otherhead"], [b"child"], ), ( {b"head", b"child", b"otherhead", b"otherchild"}, ({b"head", b"otherhead"}, {b"child", b"excluded"}, 3), [b"head", b"otherhead", b"otherchild"], None, None, ), # stop searching excluded now ( {b"head", b"child", b"otherhead", b"otherchild", b"excluded"}, ({b"head", b"otherhead"}, {b"child", b"excluded"}, 3), [b"head", b"otherhead", b"otherchild"], None, [b"excluded"], ), ] self.assertSeenAndResult(expected, search, search.__next__) # using next_with_ghosts: search = graph._make_breadth_first_searcher([]) search.start_searching([b"head"]) self.assertSeenAndResult(expected, search, search.next_with_ghosts) def test_breadth_first_stop_searching_not_queried(self): # A client should be able to say b'stop node X' even if X has not been # returned to the client. graph = self.make_graph( { b"head": [b"child", b"ghost1"], b"child": [NULL_REVISION], NULL_REVISION: [], } ) search = graph._make_breadth_first_searcher([b"head"]) expected = [ # NULL_REVISION and ghost1 have not been returned ( {b"head"}, ({b"head"}, {b"child", NULL_REVISION, b"ghost1"}, 1), [b"head"], None, [NULL_REVISION, b"ghost1"], ), # ghost1 has been returned, NULL_REVISION is to be returned in the # next iteration. ( {b"head", b"child", b"ghost1"}, ({b"head"}, {b"ghost1", NULL_REVISION}, 2), [b"head", b"child"], None, [NULL_REVISION, b"ghost1"], ), ] self.assertSeenAndResult(expected, search, search.__next__) # using next_with_ghosts: search = graph._make_breadth_first_searcher([b"head"]) self.assertSeenAndResult(expected, search, search.next_with_ghosts) def test_breadth_first_stop_searching_late(self): # A client should be able to say b'stop node X' and have it excluded # from the result even if X was seen in an older iteration of the # search. 
graph = self.make_graph( { b"head": [b"middle"], b"middle": [b"child"], b"child": [NULL_REVISION], NULL_REVISION: [], } ) search = graph._make_breadth_first_searcher([b"head"]) expected = [ ({b"head"}, ({b"head"}, {b"middle"}, 1), [b"head"], None, None), ( {b"head", b"middle"}, ({b"head"}, {b"child"}, 2), [b"head", b"middle"], None, None, ), # b'middle' came from the previous iteration, but we don't stop # searching it until *after* advancing the searcher. ( {b"head", b"middle", b"child"}, ({b"head"}, {b"middle", b"child"}, 1), [b"head"], None, [b"middle", b"child"], ), ] self.assertSeenAndResult(expected, search, search.__next__) # using next_with_ghosts: search = graph._make_breadth_first_searcher([b"head"]) self.assertSeenAndResult(expected, search, search.next_with_ghosts) def test_breadth_first_get_result_ghosts_are_excluded(self): graph = self.make_graph( { b"head": [b"child", b"ghost"], b"child": [NULL_REVISION], NULL_REVISION: [], } ) search = graph._make_breadth_first_searcher([b"head"]) # using next: expected = [ ({b"head"}, ({b"head"}, {b"ghost", b"child"}, 1), [b"head"], None, None), ( {b"head", b"child", b"ghost"}, ({b"head"}, {NULL_REVISION, b"ghost"}, 2), [b"head", b"child"], None, None, ), ] self.assertSeenAndResult(expected, search, search.__next__) # using next_with_ghosts: search = graph._make_breadth_first_searcher([b"head"]) self.assertSeenAndResult(expected, search, search.next_with_ghosts) def test_breadth_first_get_result_starting_a_ghost_ghost_is_excluded(self): graph = self.make_graph( { b"head": [b"child"], b"child": [NULL_REVISION], NULL_REVISION: [], } ) search = graph._make_breadth_first_searcher([b"head"]) # using next: expected = [ ( {b"head", b"ghost"}, ({b"head", b"ghost"}, {b"child", b"ghost"}, 1), [b"head"], [b"ghost"], None, ), ( {b"head", b"child", b"ghost"}, ({b"head", b"ghost"}, {NULL_REVISION, b"ghost"}, 2), [b"head", b"child"], None, None, ), ] self.assertSeenAndResult(expected, search, search.__next__) # using 
next_with_ghosts: search = graph._make_breadth_first_searcher([b"head"]) self.assertSeenAndResult(expected, search, search.next_with_ghosts) def test_breadth_first_revision_count_includes_NULL_REVISION(self): graph = self.make_graph( { b"head": [NULL_REVISION], NULL_REVISION: [], } ) search = graph._make_breadth_first_searcher([b"head"]) # using next: expected = [ ({b"head"}, ({b"head"}, {NULL_REVISION}, 1), [b"head"], None, None), ( {b"head", NULL_REVISION}, ({b"head"}, set(), 2), [b"head", NULL_REVISION], None, None, ), ] self.assertSeenAndResult(expected, search, search.__next__) # using next_with_ghosts: search = graph._make_breadth_first_searcher([b"head"]) self.assertSeenAndResult(expected, search, search.next_with_ghosts) def test_breadth_first_search_get_result_after_StopIteration(self): # StopIteration should not invalidate anything. graph = self.make_graph( { b"head": [NULL_REVISION], NULL_REVISION: [], } ) search = graph._make_breadth_first_searcher([b"head"]) # using next: expected = [ ({b"head"}, ({b"head"}, {NULL_REVISION}, 1), [b"head"], None, None), ( {b"head", b"ghost", NULL_REVISION}, ({b"head", b"ghost"}, {b"ghost"}, 2), [b"head", NULL_REVISION], [b"ghost"], None, ), ] self.assertSeenAndResult(expected, search, search.__next__) self.assertRaises(StopIteration, next, search) self.assertEqual({b"head", b"ghost", NULL_REVISION}, search.seen) state = search.get_state() self.assertEqual( ({b"ghost", b"head"}, {b"ghost"}, {b"head", NULL_REVISION}), state ) # using next_with_ghosts: search = graph._make_breadth_first_searcher([b"head"]) self.assertSeenAndResult(expected, search, search.next_with_ghosts) self.assertRaises(StopIteration, next, search) self.assertEqual({b"head", b"ghost", NULL_REVISION}, search.seen) state = search.get_state() self.assertEqual( ({b"ghost", b"head"}, {b"ghost"}, {b"head", NULL_REVISION}), state ) class TestFindUniqueAncestors(TestGraphBase): def assertFindUniqueAncestors(self, graph, expected, node, common): actual = 
graph.find_unique_ancestors(node, common) self.assertEqual(expected, sorted(actual)) def test_empty_set(self): graph = self.make_graph(ancestry_1) self.assertFindUniqueAncestors(graph, [], b"rev1", [b"rev1"]) self.assertFindUniqueAncestors(graph, [], b"rev2b", [b"rev2b"]) self.assertFindUniqueAncestors(graph, [], b"rev3", [b"rev1", b"rev3"]) def test_single_node(self): graph = self.make_graph(ancestry_1) self.assertFindUniqueAncestors(graph, [b"rev2a"], b"rev2a", [b"rev1"]) self.assertFindUniqueAncestors(graph, [b"rev2b"], b"rev2b", [b"rev1"]) self.assertFindUniqueAncestors(graph, [b"rev3"], b"rev3", [b"rev2a"]) def test_minimal_ancestry(self): graph = self.make_breaking_graph( extended_history_shortcut, [NULL_REVISION, b"a", b"b"] ) self.assertFindUniqueAncestors(graph, [b"e"], b"e", [b"d"]) graph = self.make_breaking_graph(extended_history_shortcut, [b"b"]) self.assertFindUniqueAncestors(graph, [b"f"], b"f", [b"a", b"d"]) graph = self.make_breaking_graph(complex_shortcut, [b"a", b"b"]) self.assertFindUniqueAncestors(graph, [b"h"], b"h", [b"i"]) self.assertFindUniqueAncestors(graph, [b"e", b"g", b"i"], b"i", [b"h"]) self.assertFindUniqueAncestors(graph, [b"h"], b"h", [b"g"]) self.assertFindUniqueAncestors(graph, [b"h"], b"h", [b"j"]) def test_in_ancestry(self): graph = self.make_graph(ancestry_1) self.assertFindUniqueAncestors(graph, [], b"rev1", [b"rev3"]) self.assertFindUniqueAncestors(graph, [], b"rev2b", [b"rev4"]) def test_multiple_revisions(self): graph = self.make_graph(ancestry_1) self.assertFindUniqueAncestors(graph, [b"rev4"], b"rev4", [b"rev3", b"rev2b"]) self.assertFindUniqueAncestors( graph, [b"rev2a", b"rev3", b"rev4"], b"rev4", [b"rev2b"] ) def test_complex_shortcut(self): graph = self.make_graph(complex_shortcut) self.assertFindUniqueAncestors(graph, [b"h", b"n"], b"n", [b"m"]) self.assertFindUniqueAncestors(graph, [b"e", b"i", b"m"], b"m", [b"n"]) def test_complex_shortcut2(self): graph = self.make_graph(complex_shortcut2) 
self.assertFindUniqueAncestors(graph, [b"j", b"u"], b"u", [b"t"]) self.assertFindUniqueAncestors(graph, [b"t"], b"t", [b"u"]) def test_multiple_interesting_unique(self): graph = self.make_graph(multiple_interesting_unique) self.assertFindUniqueAncestors(graph, [b"j", b"y"], b"y", [b"z"]) self.assertFindUniqueAncestors(graph, [b"p", b"z"], b"z", [b"y"]) def test_racing_shortcuts(self): graph = self.make_graph(racing_shortcuts) self.assertFindUniqueAncestors(graph, [b"p", b"q", b"z"], b"z", [b"y"]) self.assertFindUniqueAncestors(graph, [b"h", b"i", b"j", b"y"], b"j", [b"z"]) class TestGraphFindDistanceToNull(TestGraphBase): """Test an api that should be able to compute a revno.""" def assertFindDistance(self, revno, graph, target_id, known_ids): """Assert the output of Graph.find_distance_to_null().""" actual = graph.find_distance_to_null(target_id, known_ids) self.assertEqual(revno, actual) def test_nothing_known(self): graph = self.make_graph(ancestry_1) self.assertFindDistance(0, graph, NULL_REVISION, []) self.assertFindDistance(1, graph, b"rev1", []) self.assertFindDistance(2, graph, b"rev2a", []) self.assertFindDistance(2, graph, b"rev2b", []) self.assertFindDistance(3, graph, b"rev3", []) self.assertFindDistance(4, graph, b"rev4", []) def test_rev_is_ghost(self): graph = self.make_graph(ancestry_1) try: graph.find_distance_to_null( b"rev_missing", [], ) except errors.GhostRevisionsHaveNoRevno as e: self.assertEqual(b"rev_missing", e.revision_id) self.assertEqual(b"rev_missing", e.ghost_revision_id) else: self.fail("Expected GhostRevisionsHaveNoRevno") def test_ancestor_is_ghost(self): graph = self.make_graph({b"rev": [b"parent"]}) try: graph.find_distance_to_null(b"rev", []) except errors.GhostRevisionsHaveNoRevno as e: self.assertEqual(b"rev", e.revision_id) self.assertEqual(b"parent", e.ghost_revision_id) else: self.fail("Expected GhostRevisionsHaveNoRevno") def test_known_in_ancestry(self): graph = self.make_graph(ancestry_1) self.assertFindDistance(2, 
graph, b"rev2a", [(b"rev1", 1)]) self.assertFindDistance(3, graph, b"rev3", [(b"rev2a", 2)]) def test_known_in_ancestry_limits(self): graph = self.make_breaking_graph(ancestry_1, [b"rev1"]) self.assertFindDistance(4, graph, b"rev4", [(b"rev3", 3)]) def test_target_is_ancestor(self): graph = self.make_graph(ancestry_1) self.assertFindDistance(2, graph, b"rev2a", [(b"rev3", 3)]) def test_target_is_ancestor_limits(self): """We shouldn't search all history if we run into ourselves.""" graph = self.make_breaking_graph(ancestry_1, [b"rev1"]) self.assertFindDistance(3, graph, b"rev3", [(b"rev4", 4)]) def test_target_parallel_to_known_limits(self): # Even though the known revision isn't part of the other ancestry, they # eventually converge graph = self.make_breaking_graph(with_tail, [b"a"]) self.assertFindDistance(6, graph, b"f", [(b"g", 6)]) self.assertFindDistance(7, graph, b"h", [(b"g", 6)]) self.assertFindDistance(8, graph, b"i", [(b"g", 6)]) self.assertFindDistance(6, graph, b"g", [(b"i", 8)]) class TestFindMergeOrder(TestGraphBase): def assertMergeOrder(self, expected, graph, tip, base_revisions): self.assertEqual(expected, graph.find_merge_order(tip, base_revisions)) def test_parents(self): graph = self.make_graph(ancestry_1) self.assertMergeOrder([b"rev3", b"rev2b"], graph, b"rev4", [b"rev3", b"rev2b"]) self.assertMergeOrder([b"rev3", b"rev2b"], graph, b"rev4", [b"rev2b", b"rev3"]) def test_ancestors(self): graph = self.make_graph(ancestry_1) self.assertMergeOrder([b"rev1", b"rev2b"], graph, b"rev4", [b"rev1", b"rev2b"]) self.assertMergeOrder([b"rev1", b"rev2b"], graph, b"rev4", [b"rev2b", b"rev1"]) def test_shortcut_one_ancestor(self): # When we have enough info, we can stop searching graph = self.make_breaking_graph(ancestry_1, [b"rev3", b"rev2b", b"rev4"]) # Single ancestors shortcut right away self.assertMergeOrder([b"rev3"], graph, b"rev4", [b"rev3"]) def test_shortcut_after_one_ancestor(self): graph = self.make_breaking_graph(ancestry_1, [b"rev2a", 
b"rev2b"]) self.assertMergeOrder([b"rev3", b"rev1"], graph, b"rev4", [b"rev1", b"rev3"]) class TestFindDescendants(TestGraphBase): def test_find_descendants_rev1_rev3(self): graph = self.make_graph(ancestry_1) descendants = graph.find_descendants(b"rev1", b"rev3") self.assertEqual({b"rev1", b"rev2a", b"rev3"}, descendants) def test_find_descendants_rev1_rev4(self): graph = self.make_graph(ancestry_1) descendants = graph.find_descendants(b"rev1", b"rev4") self.assertEqual({b"rev1", b"rev2a", b"rev2b", b"rev3", b"rev4"}, descendants) def test_find_descendants_rev2a_rev4(self): graph = self.make_graph(ancestry_1) descendants = graph.find_descendants(b"rev2a", b"rev4") self.assertEqual({b"rev2a", b"rev3", b"rev4"}, descendants) class TestFindLefthandMerger(TestGraphBase): def check_merger(self, result, ancestry, merged, tip): graph = self.make_graph(ancestry) self.assertEqual(result, graph.find_lefthand_merger(merged, tip)) def test_find_lefthand_merger_rev2b(self): self.check_merger(b"rev4", ancestry_1, b"rev2b", b"rev4") def test_find_lefthand_merger_rev2a(self): self.check_merger(b"rev2a", ancestry_1, b"rev2a", b"rev4") def test_find_lefthand_merger_rev4(self): self.check_merger(None, ancestry_1, b"rev4", b"rev2a") def test_find_lefthand_merger_f(self): self.check_merger(b"i", complex_shortcut, b"f", b"m") def test_find_lefthand_merger_g(self): self.check_merger(b"i", complex_shortcut, b"g", b"m") def test_find_lefthand_merger_h(self): self.check_merger(b"n", complex_shortcut, b"h", b"n") class TestGetChildMap(TestGraphBase): def test_get_child_map(self): graph = self.make_graph(ancestry_1) child_map = graph.get_child_map([b"rev4", b"rev3", b"rev2a", b"rev2b"]) self.assertEqual( { b"rev1": [b"rev2a", b"rev2b"], b"rev2a": [b"rev3"], b"rev2b": [b"rev4"], b"rev3": [b"rev4"], }, child_map, ) class TestCachingParentsProvider(TestCase): """Test CachingParentsProvider. These tests run with: self.inst_pp, a recording parents provider with a graph of a->b, and b is a ghost. 
self.caching_pp, a CachingParentsProvider layered on inst_pp. """ def setUp(self): super().setUp() dict_pp = _mod_graph.DictParentsProvider({b"a": [b"b"]}) self.inst_pp = InstrumentedParentsProvider(dict_pp) self.caching_pp = _mod_graph.CachingParentsProvider(self.inst_pp) def test_get_parent_map(self): """Requesting the same revision should be returned from cache.""" self.assertEqual({}, self.caching_pp._cache) self.assertEqual({b"a": [b"b"]}, self.caching_pp.get_parent_map([b"a"])) self.assertEqual([b"a"], self.inst_pp.calls) self.assertEqual({b"a": [b"b"]}, self.caching_pp.get_parent_map([b"a"])) # No new call, as it should have been returned from the cache self.assertEqual([b"a"], self.inst_pp.calls) self.assertEqual({b"a": [b"b"]}, self.caching_pp._cache) def test_get_parent_map_not_present(self): """The cache should also track when a revision doesn't exist.""" self.assertEqual({}, self.caching_pp.get_parent_map([b"b"])) self.assertEqual([b"b"], self.inst_pp.calls) self.assertEqual({}, self.caching_pp.get_parent_map([b"b"])) # No new calls self.assertEqual([b"b"], self.inst_pp.calls) def test_get_parent_map_mixed(self): """Anything that can be returned from cache, should be.""" self.assertEqual({}, self.caching_pp.get_parent_map([b"b"])) self.assertEqual([b"b"], self.inst_pp.calls) self.assertEqual({b"a": [b"b"]}, self.caching_pp.get_parent_map([b"a", b"b"])) self.assertEqual([b"b", b"a"], self.inst_pp.calls) def test_get_parent_map_repeated(self): """Asking for the same parent 2x will only forward 1 request.""" self.assertEqual( {b"a": [b"b"]}, self.caching_pp.get_parent_map([b"b", b"a", b"b"]) ) # Use sorted because we don't care about the order, just that each is # only present 1 time. 
self.assertEqual([b"a", b"b"], sorted(self.inst_pp.calls)) def test_note_missing_key(self): """After noting that a key is missing it is cached.""" self.caching_pp.note_missing_key(b"b") self.assertEqual({}, self.caching_pp.get_parent_map([b"b"])) self.assertEqual([], self.inst_pp.calls) self.assertEqual({b"b"}, self.caching_pp.missing_keys) def test_get_cached_parent_map(self): self.assertEqual({}, self.caching_pp.get_cached_parent_map([b"a"])) self.assertEqual([], self.inst_pp.calls) self.assertEqual({b"a": [b"b"]}, self.caching_pp.get_parent_map([b"a"])) self.assertEqual([b"a"], self.inst_pp.calls) self.assertEqual({b"a": [b"b"]}, self.caching_pp.get_cached_parent_map([b"a"])) class TestCachingParentsProviderExtras(TestCase): """Test the behaviour when parents are provided that were not requested.""" def setUp(self): super().setUp() class ExtraParentsProvider: def get_parent_map(self, keys): return { b"rev1": [], b"rev2": [ b"rev1", ], } self.inst_pp = InstrumentedParentsProvider(ExtraParentsProvider()) self.caching_pp = _mod_graph.CachingParentsProvider( get_parent_map=self.inst_pp.get_parent_map ) def test_uncached(self): self.caching_pp.disable_cache() self.assertEqual({b"rev1": []}, self.caching_pp.get_parent_map([b"rev1"])) self.assertEqual([b"rev1"], self.inst_pp.calls) self.assertIs(None, self.caching_pp._cache) def test_cache_initially_empty(self): self.assertEqual({}, self.caching_pp._cache) def test_cached(self): self.assertEqual({b"rev1": []}, self.caching_pp.get_parent_map([b"rev1"])) self.assertEqual([b"rev1"], self.inst_pp.calls) self.assertEqual({b"rev1": [], b"rev2": [b"rev1"]}, self.caching_pp._cache) self.assertEqual({b"rev1": []}, self.caching_pp.get_parent_map([b"rev1"])) self.assertEqual([b"rev1"], self.inst_pp.calls) def test_disable_cache_clears_cache(self): # Put something in the cache self.caching_pp.get_parent_map([b"rev1"]) self.assertEqual(2, len(self.caching_pp._cache)) self.caching_pp.disable_cache() self.assertIs(None, 
self.caching_pp._cache) def test_enable_cache_raises(self): try: self.caching_pp.enable_cache() except AssertionError as e: self.assertEqual("Cache enabled when already enabled.", str(e)) else: self.fail("Expected AssertionError") def test_cache_misses(self): self.caching_pp.get_parent_map([b"rev3"]) self.caching_pp.get_parent_map([b"rev3"]) self.assertEqual([b"rev3"], self.inst_pp.calls) def test_no_cache_misses(self): self.caching_pp.disable_cache() self.caching_pp.enable_cache(cache_misses=False) self.caching_pp.get_parent_map([b"rev3"]) self.caching_pp.get_parent_map([b"rev3"]) self.assertEqual([b"rev3", b"rev3"], self.inst_pp.calls) def test_cache_extras(self): self.assertEqual({}, self.caching_pp.get_parent_map([b"rev3"])) self.assertEqual( {b"rev2": [b"rev1"]}, self.caching_pp.get_parent_map([b"rev2"]) ) self.assertEqual([b"rev3"], self.inst_pp.calls) def test_extras_using_cached(self): self.assertEqual({}, self.caching_pp.get_cached_parent_map([b"rev3"])) self.assertEqual({}, self.caching_pp.get_parent_map([b"rev3"])) self.assertEqual( {b"rev2": [b"rev1"]}, self.caching_pp.get_cached_parent_map([b"rev2"]) ) self.assertEqual([b"rev3"], self.inst_pp.calls) class TestCollapseLinearRegions(TestCase): def assertCollapsed(self, collapsed, original): self.assertEqual(collapsed, _mod_graph.collapse_linear_regions(original)) def test_collapse_nothing(self): d = {1: [2, 3], 2: [], 3: []} self.assertCollapsed(d, d) d = {1: [2], 2: [3, 4], 3: [5], 4: [5], 5: []} self.assertCollapsed(d, d) def test_collapse_chain(self): # Any time we have a linear chain, we should be able to collapse d = {1: [2], 2: [3], 3: [4], 4: [5], 5: []} self.assertCollapsed({1: [5], 5: []}, d) d = {5: [4], 4: [3], 3: [2], 2: [1], 1: []} self.assertCollapsed({5: [1], 1: []}, d) d = {5: [3], 3: [4], 4: [1], 1: [2], 2: []} self.assertCollapsed({5: [2], 2: []}, d) def test_collapse_with_multiple_children(self): # 7 # | # 6 # / \ # 4 5 # | | # 2 3 # \ / # 1 # # 4 and 5 cannot be removed because 6 has 
2 children # 2 and 3 cannot be removed because 1 has 2 parents d = {1: [2, 3], 2: [4], 4: [6], 3: [5], 5: [6], 6: [7], 7: []} self.assertCollapsed(d, d) class TestGraphThunkIdsToKeys(TestCase): def test_heads(self): # A # |\ # B C # |/ # D d = { (b"D",): [(b"B",), (b"C",)], (b"C",): [(b"A",)], (b"B",): [(b"A",)], (b"A",): [], } g = _mod_graph.Graph(_mod_graph.DictParentsProvider(d)) graph_thunk = _mod_graph.GraphThunkIdsToKeys(g) self.assertEqual([b"D"], sorted(graph_thunk.heads([b"D", b"A"]))) self.assertEqual([b"D"], sorted(graph_thunk.heads([b"D", b"B"]))) self.assertEqual([b"D"], sorted(graph_thunk.heads([b"D", b"C"]))) self.assertEqual([b"B", b"C"], sorted(graph_thunk.heads([b"B", b"C"]))) def test_add_node(self): d = {(b"C",): [(b"A",)], (b"B",): [(b"A",)], (b"A",): []} g = _mod_known_graph.KnownGraph(d) graph_thunk = _mod_graph.GraphThunkIdsToKeys(g) graph_thunk.add_node(b"D", [b"A", b"C"]) self.assertEqual([b"B", b"D"], sorted(graph_thunk.heads([b"D", b"B", b"A"]))) def test_merge_sort(self): d = {(b"C",): [(b"A",)], (b"B",): [(b"A",)], (b"A",): []} g = _mod_known_graph.KnownGraph(d) graph_thunk = _mod_graph.GraphThunkIdsToKeys(g) graph_thunk.add_node(b"D", [b"A", b"C"]) self.assertEqual( [(b"C", 0, (2,), False), (b"A", 0, (1,), True)], [ (n.key, n.merge_depth, n.revno, n.end_of_merge) for n in graph_thunk.merge_sort(b"C") ], ) class TestStackedParentsProvider(TestCase): def setUp(self): super().setUp() self.calls = [] def get_shared_provider(self, info, ancestry, has_cached): pp = _mod_graph.DictParentsProvider(ancestry) if has_cached: pp.get_cached_parent_map = pp.get_parent_map return SharedInstrumentedParentsProvider(pp, self.calls, info) def test_stacked_parents_provider(self): parents1 = _mod_graph.DictParentsProvider({b"rev2": [b"rev3"]}) parents2 = _mod_graph.DictParentsProvider({b"rev1": [b"rev4"]}) stacked = _mod_graph.StackedParentsProvider([parents1, parents2]) self.assertEqual( {b"rev1": [b"rev4"], b"rev2": [b"rev3"]}, stacked.get_parent_map( ( 
b"rev1", b"rev2", ) ), ) self.assertEqual( {b"rev2": [b"rev3"], b"rev1": [b"rev4"]}, stacked.get_parent_map((b"rev2", b"rev1")), ) self.assertEqual( {b"rev2": [b"rev3"]}, stacked.get_parent_map((b"rev2", b"rev2")) ) self.assertEqual( {b"rev1": [b"rev4"]}, stacked.get_parent_map((b"rev1", b"rev1")) ) def test_stacked_parents_provider_overlapping(self): # rev2 is available in both providers. # 1 # | # 2 parents1 = _mod_graph.DictParentsProvider({b"rev2": [b"rev1"]}) parents2 = _mod_graph.DictParentsProvider({b"rev2": [b"rev1"]}) stacked = _mod_graph.StackedParentsProvider([parents1, parents2]) self.assertEqual({b"rev2": [b"rev1"]}, stacked.get_parent_map([b"rev2"])) def test_handles_no_get_cached_parent_map(self): # this shows that we handle a provider that doesn't implement # get_cached_parent_map pp1 = self.get_shared_provider(b"pp1", {b"rev2": [b"rev1"]}, has_cached=False) pp2 = self.get_shared_provider(b"pp2", {b"rev2": [b"rev1"]}, has_cached=True) stacked = _mod_graph.StackedParentsProvider([pp1, pp2]) self.assertEqual({b"rev2": [b"rev1"]}, stacked.get_parent_map([b"rev2"])) # No call on b'pp1' because it doesn't provide get_cached_parent_map self.assertEqual([(b"pp2", "cached", [b"rev2"])], self.calls) def test_query_order(self): # We should call get_cached_parent_map on all providers before we call # get_parent_map. Further, we should track what entries we have found, # and not re-try them. 
pp1 = self.get_shared_provider(b"pp1", {b"a": []}, has_cached=True) pp2 = self.get_shared_provider(b"pp2", {b"c": [b"b"]}, has_cached=False) pp3 = self.get_shared_provider(b"pp3", {b"b": [b"a"]}, has_cached=True) stacked = _mod_graph.StackedParentsProvider([pp1, pp2, pp3]) self.assertEqual( {b"a": [], b"b": [b"a"], b"c": [b"b"]}, stacked.get_parent_map([b"a", b"b", b"c", b"d"]), ) self.assertEqual( [ (b"pp1", "cached", [b"a", b"b", b"c", b"d"]), # No call to pp2, because it doesn't have cached (b"pp3", "cached", [b"b", b"c", b"d"]), (b"pp1", [b"c", b"d"]), (b"pp2", [b"c", b"d"]), (b"pp3", [b"d"]), ], self.calls, ) python-vcsgraph-0.2.0/vcsgraph/tests/test_known_graph.py0000644000000000000000000007550615167007306020522 0ustar00# Copyright (C) 2009, 2010, 2011 Canonical Ltd # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA """Tests for the python and pyrex extensions of KnownGraph.""" import pprint from unittest import TestCase from .. import errors from ..graph import NULL_REVISION from ..known_graph import KnownGraph from . 
import test_graph # a # |\ # b | # | | # c | # \| # d alt_merge = {b"a": [], b"b": [b"a"], b"c": [b"b"], b"d": [b"a", b"c"]} class TestCaseWithKnownGraph(TestCase): def make_known_graph(self, ancestry): return KnownGraph(ancestry) class TestKnownGraph(TestCaseWithKnownGraph): def assertGDFO(self, graph, rev, gdfo): node = graph._nodes[rev] self.assertEqual(gdfo, node.gdfo) def test_children_ancestry1(self): graph = self.make_known_graph(test_graph.ancestry_1) self.assertEqual([b"rev1"], graph.get_child_keys(NULL_REVISION)) self.assertEqual([b"rev2a", b"rev2b"], sorted(graph.get_child_keys(b"rev1"))) self.assertEqual([b"rev3"], graph.get_child_keys(b"rev2a")) self.assertEqual([b"rev4"], graph.get_child_keys(b"rev3")) self.assertEqual([b"rev4"], graph.get_child_keys(b"rev2b")) self.assertRaises(KeyError, graph.get_child_keys, b"not_in_graph") def test_parent_ancestry1(self): graph = self.make_known_graph(test_graph.ancestry_1) self.assertEqual((NULL_REVISION,), tuple(graph.get_parent_keys(b"rev1"))) self.assertEqual((b"rev1",), tuple(graph.get_parent_keys(b"rev2a"))) self.assertEqual((b"rev1",), tuple(graph.get_parent_keys(b"rev2b"))) self.assertEqual((b"rev2a",), tuple(graph.get_parent_keys(b"rev3"))) self.assertEqual([b"rev2b", b"rev3"], sorted(graph.get_parent_keys(b"rev4"))) self.assertRaises(KeyError, graph.get_parent_keys, b"not_in_graph") def test_parent_with_ghost(self): graph = self.make_known_graph(test_graph.with_ghost) self.assertEqual(None, graph.get_parent_keys(b"g")) def test_gdfo_ancestry_1(self): graph = self.make_known_graph(test_graph.ancestry_1) self.assertGDFO(graph, b"rev1", 2) self.assertGDFO(graph, b"rev2b", 3) self.assertGDFO(graph, b"rev2a", 3) self.assertGDFO(graph, b"rev3", 4) self.assertGDFO(graph, b"rev4", 5) def test_gdfo_feature_branch(self): graph = self.make_known_graph(test_graph.feature_branch) self.assertGDFO(graph, b"rev1", 2) self.assertGDFO(graph, b"rev2b", 3) self.assertGDFO(graph, b"rev3b", 4) def 
test_gdfo_extended_history_shortcut(self): graph = self.make_known_graph(test_graph.extended_history_shortcut) self.assertGDFO(graph, b"a", 2) self.assertGDFO(graph, b"b", 3) self.assertGDFO(graph, b"c", 4) self.assertGDFO(graph, b"d", 5) self.assertGDFO(graph, b"e", 6) self.assertGDFO(graph, b"f", 6) def test_gdfo_with_ghost(self): graph = self.make_known_graph(test_graph.with_ghost) self.assertGDFO(graph, b"f", 2) self.assertGDFO(graph, b"e", 3) self.assertGDFO(graph, b"g", 1) self.assertGDFO(graph, b"b", 4) self.assertGDFO(graph, b"d", 4) self.assertGDFO(graph, b"a", 5) self.assertGDFO(graph, b"c", 5) def test_add_existing_node(self): graph = self.make_known_graph(test_graph.ancestry_1) # Add a node that already exists with identical content # This is a 'no-op' self.assertGDFO(graph, b"rev4", 5) graph.add_node(b"rev4", [b"rev3", b"rev2b"]) self.assertGDFO(graph, b"rev4", 5) # This also works if we use a tuple rather than a list graph.add_node(b"rev4", (b"rev3", b"rev2b")) def test_add_existing_node_mismatched_parents(self): graph = self.make_known_graph(test_graph.ancestry_1) self.assertRaises(ValueError, graph.add_node, b"rev4", [b"rev2b", b"rev3"]) def test_add_node_with_ghost_parent(self): graph = self.make_known_graph(test_graph.ancestry_1) graph.add_node(b"rev5", [b"rev2b", b"revGhost"]) self.assertGDFO(graph, b"rev5", 4) self.assertGDFO(graph, b"revGhost", 1) def test_add_new_root(self): graph = self.make_known_graph(test_graph.ancestry_1) graph.add_node(b"rev5", []) self.assertGDFO(graph, b"rev5", 1) def test_add_with_all_ghost_parents(self): graph = self.make_known_graph(test_graph.ancestry_1) graph.add_node(b"rev5", [b"ghost"]) self.assertGDFO(graph, b"rev5", 2) self.assertGDFO(graph, b"ghost", 1) def test_gdfo_after_add_node(self): graph = self.make_known_graph(test_graph.ancestry_1) self.assertEqual([], graph.get_child_keys(b"rev4")) graph.add_node(b"rev5", [b"rev4"]) self.assertEqual([b"rev4"], graph.get_parent_keys(b"rev5")) 
self.assertEqual([b"rev5"], graph.get_child_keys(b"rev4")) self.assertEqual([], graph.get_child_keys(b"rev5")) self.assertGDFO(graph, b"rev5", 6) graph.add_node(b"rev6", [b"rev2b"]) graph.add_node(b"rev7", [b"rev6"]) graph.add_node(b"rev8", [b"rev7", b"rev5"]) self.assertGDFO(graph, b"rev5", 6) self.assertGDFO(graph, b"rev6", 4) self.assertGDFO(graph, b"rev7", 5) self.assertGDFO(graph, b"rev8", 7) def test_fill_in_ghost(self): graph = self.make_known_graph(test_graph.with_ghost) # Add in a couple nodes and then fill in the 'ghost' so that it should # cause renumbering of children nodes graph.add_node(b"x", []) graph.add_node(b"y", [b"x"]) graph.add_node(b"z", [b"y"]) graph.add_node(b"g", [b"z"]) self.assertGDFO(graph, b"f", 2) self.assertGDFO(graph, b"e", 3) self.assertGDFO(graph, b"x", 1) self.assertGDFO(graph, b"y", 2) self.assertGDFO(graph, b"z", 3) self.assertGDFO(graph, b"g", 4) self.assertGDFO(graph, b"b", 4) self.assertGDFO(graph, b"d", 5) self.assertGDFO(graph, b"a", 5) self.assertGDFO(graph, b"c", 6) class TestKnownGraphHeads(TestCaseWithKnownGraph): def test_heads_null(self): graph = self.make_known_graph(test_graph.ancestry_1) self.assertEqual({b"null:"}, graph.heads([b"null:"])) self.assertEqual({b"rev1"}, graph.heads([b"null:", b"rev1"])) self.assertEqual({b"rev1"}, graph.heads([b"rev1", b"null:"])) self.assertEqual({b"rev1"}, graph.heads({b"rev1", b"null:"})) self.assertEqual({b"rev1"}, graph.heads((b"rev1", b"null:"))) def test_heads_one(self): # A single node will always be a head graph = self.make_known_graph(test_graph.ancestry_1) self.assertEqual({b"null:"}, graph.heads([b"null:"])) self.assertEqual({b"rev1"}, graph.heads([b"rev1"])) self.assertEqual({b"rev2a"}, graph.heads([b"rev2a"])) self.assertEqual({b"rev2b"}, graph.heads([b"rev2b"])) self.assertEqual({b"rev3"}, graph.heads([b"rev3"])) self.assertEqual({b"rev4"}, graph.heads([b"rev4"])) def test_heads_single(self): graph = self.make_known_graph(test_graph.ancestry_1) 
self.assertEqual({b"rev4"}, graph.heads([b"null:", b"rev4"])) self.assertEqual({b"rev2a"}, graph.heads([b"rev1", b"rev2a"])) self.assertEqual({b"rev2b"}, graph.heads([b"rev1", b"rev2b"])) self.assertEqual({b"rev3"}, graph.heads([b"rev1", b"rev3"])) self.assertEqual({b"rev3"}, graph.heads([b"rev3", b"rev2a"])) self.assertEqual({b"rev4"}, graph.heads([b"rev1", b"rev4"])) self.assertEqual({b"rev4"}, graph.heads([b"rev2a", b"rev4"])) self.assertEqual({b"rev4"}, graph.heads([b"rev2b", b"rev4"])) self.assertEqual({b"rev4"}, graph.heads([b"rev3", b"rev4"])) def test_heads_two_heads(self): graph = self.make_known_graph(test_graph.ancestry_1) self.assertEqual({b"rev2a", b"rev2b"}, graph.heads([b"rev2a", b"rev2b"])) self.assertEqual({b"rev3", b"rev2b"}, graph.heads([b"rev3", b"rev2b"])) def test_heads_criss_cross(self): graph = self.make_known_graph(test_graph.criss_cross) self.assertEqual({b"rev2a"}, graph.heads([b"rev2a", b"rev1"])) self.assertEqual({b"rev2b"}, graph.heads([b"rev2b", b"rev1"])) self.assertEqual({b"rev3a"}, graph.heads([b"rev3a", b"rev1"])) self.assertEqual({b"rev3b"}, graph.heads([b"rev3b", b"rev1"])) self.assertEqual({b"rev2a", b"rev2b"}, graph.heads([b"rev2a", b"rev2b"])) self.assertEqual({b"rev3a"}, graph.heads([b"rev3a", b"rev2a"])) self.assertEqual({b"rev3a"}, graph.heads([b"rev3a", b"rev2b"])) self.assertEqual({b"rev3a"}, graph.heads([b"rev3a", b"rev2a", b"rev2b"])) self.assertEqual({b"rev3b"}, graph.heads([b"rev3b", b"rev2a"])) self.assertEqual({b"rev3b"}, graph.heads([b"rev3b", b"rev2b"])) self.assertEqual({b"rev3b"}, graph.heads([b"rev3b", b"rev2a", b"rev2b"])) self.assertEqual({b"rev3a", b"rev3b"}, graph.heads([b"rev3a", b"rev3b"])) self.assertEqual( {b"rev3a", b"rev3b"}, graph.heads([b"rev3a", b"rev3b", b"rev2a", b"rev2b"]) ) def test_heads_shortcut(self): graph = self.make_known_graph(test_graph.history_shortcut) self.assertEqual( {b"rev2a", b"rev2b", b"rev2c"}, graph.heads([b"rev2a", b"rev2b", b"rev2c"]) ) self.assertEqual({b"rev3a", 
b"rev3b"}, graph.heads([b"rev3a", b"rev3b"])) self.assertEqual( {b"rev3a", b"rev3b"}, graph.heads([b"rev2a", b"rev3a", b"rev3b"]) ) self.assertEqual({b"rev2a", b"rev3b"}, graph.heads([b"rev2a", b"rev3b"])) self.assertEqual({b"rev2c", b"rev3a"}, graph.heads([b"rev2c", b"rev3a"])) def test_heads_linear(self): graph = self.make_known_graph(test_graph.racing_shortcuts) self.assertEqual({b"w"}, graph.heads([b"w", b"s"])) self.assertEqual({b"z"}, graph.heads([b"w", b"s", b"z"])) self.assertEqual({b"w", b"q"}, graph.heads([b"w", b"s", b"q"])) self.assertEqual({b"z"}, graph.heads([b"s", b"z"])) def test_heads_alt_merge(self): graph = self.make_known_graph(alt_merge) self.assertEqual({b"c"}, graph.heads([b"a", b"c"])) def test_heads_with_ghost(self): graph = self.make_known_graph(test_graph.with_ghost) self.assertEqual({b"e", b"g"}, graph.heads([b"e", b"g"])) self.assertEqual({b"a", b"c"}, graph.heads([b"a", b"c"])) self.assertEqual({b"a", b"g"}, graph.heads([b"a", b"g"])) self.assertEqual({b"f", b"g"}, graph.heads([b"f", b"g"])) self.assertEqual({b"c"}, graph.heads([b"c", b"g"])) self.assertEqual({b"c"}, graph.heads([b"c", b"b", b"d", b"g"])) self.assertEqual({b"a", b"c"}, graph.heads([b"a", b"c", b"e", b"g"])) self.assertEqual({b"a", b"c"}, graph.heads([b"a", b"c", b"f"])) def test_filling_in_ghosts_resets_head_cache(self): graph = self.make_known_graph(test_graph.with_ghost) self.assertEqual({b"e", b"g"}, graph.heads([b"e", b"g"])) # 'g' is filled in, and descends from 'e', so the heads result is now # different graph.add_node(b"g", [b"e"]) self.assertEqual({b"g"}, graph.heads([b"e", b"g"])) class TestKnownGraphTopoSort(TestCaseWithKnownGraph): def assertTopoSortOrder(self, ancestry): """Check topo_sort and iter_topo_order is genuinely topological order. For every child in the graph, check if it comes after all of its parents. 
""" graph = self.make_known_graph(ancestry) sort_result = graph.topo_sort() # We should have an entry in sort_result for every entry present in the # graph. self.assertEqual(len(ancestry), len(sort_result)) node_idx = {node: idx for idx, node in enumerate(sort_result)} for node in sort_result: parents = ancestry[node] for parent in parents: if parent not in ancestry: # ghost continue if node_idx[node] <= node_idx[parent]: self.fail( f"parent {parent} must come before child {node}:\n{sort_result}" ) def test_topo_sort_empty(self): """TopoSort empty list.""" self.assertTopoSortOrder({}) def test_topo_sort_easy(self): """TopoSort list with one node.""" self.assertTopoSortOrder({0: []}) def test_topo_sort_cycle(self): """TopoSort traps graph with cycles.""" g = self.make_known_graph({0: [1], 1: [0]}) self.assertRaises(errors.GraphCycleError, g.topo_sort) def test_topo_sort_cycle_2(self): """TopoSort traps graph with longer cycle.""" g = self.make_known_graph({0: [1], 1: [2], 2: [0]}) self.assertRaises(errors.GraphCycleError, g.topo_sort) def test_topo_sort_cycle_with_tail(self): """TopoSort traps graph with a cycle and a tail.""" g = self.make_known_graph({0: [1], 1: [2], 2: [3, 4], 3: [0], 4: []}) self.assertRaises(errors.GraphCycleError, g.topo_sort) def test_topo_sort_1(self): """TopoSort simple nontrivial graph.""" self.assertTopoSortOrder({0: [3], 1: [4], 2: [1, 4], 3: [], 4: [0, 3]}) def test_topo_sort_partial(self): """Topological sort with partial ordering. Multiple correct orderings are possible, so test for correctness, not for exact match on the resulting list. 
""" self.assertTopoSortOrder( { 0: [], 1: [0], 2: [0], 3: [0], 4: [1, 2, 3], 5: [1, 2], 6: [1, 2], 7: [2, 3], 8: [0, 1, 4, 5, 6], } ) def test_topo_sort_ghost_parent(self): """Sort nodes, but don't include some parents in the output.""" self.assertTopoSortOrder({0: [1], 1: [2]}) class TestKnownGraphMergeSort(TestCaseWithKnownGraph): def assertSortAndIterate(self, ancestry, branch_tip, result_list): """Check that merge based sorting and iter_topo_order on graph works.""" graph = self.make_known_graph(ancestry) value = graph.merge_sort(branch_tip) value = [(n.key, n.merge_depth, n.revno, n.end_of_merge) for n in value] if result_list != value: self.assertMultiLineEqual(pprint.pformat(result_list), pprint.pformat(value)) def test_merge_sort_empty(self): # sorting of an empty graph does not error self.assertSortAndIterate({}, None, []) self.assertSortAndIterate({}, NULL_REVISION, []) self.assertSortAndIterate({}, (NULL_REVISION,), []) def test_merge_sort_not_empty_no_tip(self): # merge sorting of a branch starting with None should result # in an empty list: no revisions are dragged in. 
self.assertSortAndIterate({0: []}, None, []) self.assertSortAndIterate({0: []}, NULL_REVISION, []) self.assertSortAndIterate({0: []}, (NULL_REVISION,), []) def test_merge_sort_one_revision(self): # sorting with one revision as the tip returns the correct fields: # sequence - 0, revision id, merge depth - 0, end_of_merge self.assertSortAndIterate({"id": []}, "id", [("id", 0, (1,), True)]) def test_sequence_numbers_increase_no_merges(self): # emit a few revisions with no merges to check the sequence # numbering works in trivial cases self.assertSortAndIterate( {"A": [], "B": ["A"], "C": ["B"]}, "C", [ ("C", 0, (3,), False), ("B", 0, (2,), False), ("A", 0, (1,), True), ], ) def test_sequence_numbers_increase_with_merges(self): # test that sequence numbers increase across merges self.assertSortAndIterate( {"A": [], "B": ["A"], "C": ["A", "B"]}, "C", [ ("C", 0, (2,), False), ("B", 1, (1, 1, 1), True), ("A", 0, (1,), True), ], ) def test_merge_sort_race(self): # A # | # B-. # |\ \ # | | C # | |/ # | D # |/ # F graph = { "A": [], "B": ["A"], "C": ["B"], "D": ["B", "C"], "F": ["B", "D"], } self.assertSortAndIterate( graph, "F", [ ("F", 0, (3,), False), ("D", 1, (2, 2, 1), False), ("C", 2, (2, 1, 1), True), ("B", 0, (2,), False), ("A", 0, (1,), True), ], ) # A # | # B-. 
# |\ \ # | X C # | |/ # | D # |/ # F graph = { "A": [], "B": ["A"], "C": ["B"], "X": ["B"], "D": ["X", "C"], "F": ["B", "D"], } self.assertSortAndIterate( graph, "F", [ ("F", 0, (3,), False), ("D", 1, (2, 1, 2), False), ("C", 2, (2, 2, 1), True), ("X", 1, (2, 1, 1), True), ("B", 0, (2,), False), ("A", 0, (1,), True), ], ) def test_merge_depth_with_nested_merges(self): # the merge depth marker should reflect the depth of the revision # in terms of merges out from the mainline # revid, depth, parents: # A 0 [D, B] # B 1 [C, F] # C 1 [H] # D 0 [H, E] # E 1 [G, F] # F 2 [G] # G 1 [H] # H 0 self.assertSortAndIterate( { "A": ["D", "B"], "B": ["C", "F"], "C": ["H"], "D": ["H", "E"], "E": ["G", "F"], "F": ["G"], "G": ["H"], "H": [], }, "A", [ ("A", 0, (3,), False), ("B", 1, (1, 3, 2), False), ("C", 1, (1, 3, 1), True), ("D", 0, (2,), False), ("E", 1, (1, 1, 2), False), ("F", 2, (1, 2, 1), True), ("G", 1, (1, 1, 1), True), ("H", 0, (1,), True), ], ) def test_dotted_revnos_with_simple_merges(self): # A 1 # |\ # B C 2, 1.1.1 # | |\ # D E F 3, 1.1.2, 1.2.1 # |/ /| # G H I 4, 1.2.2, 1.3.1 # |/ / # J K 5, 1.3.2 # |/ # L 6 self.assertSortAndIterate( { "A": [], "B": ["A"], "C": ["A"], "D": ["B"], "E": ["C"], "F": ["C"], "G": ["D", "E"], "H": ["F"], "I": ["F"], "J": ["G", "H"], "K": ["I"], "L": ["J", "K"], }, "L", [ ("L", 0, (6,), False), ("K", 1, (1, 3, 2), False), ("I", 1, (1, 3, 1), True), ("J", 0, (5,), False), ("H", 1, (1, 2, 2), False), ("F", 1, (1, 2, 1), True), ("G", 0, (4,), False), ("E", 1, (1, 1, 2), False), ("C", 1, (1, 1, 1), True), ("D", 0, (3,), False), ("B", 0, (2,), False), ("A", 0, (1,), True), ], ) # Adding a shortcut from the first revision should not change any of # the existing numbers self.assertSortAndIterate( { "A": [], "B": ["A"], "C": ["A"], "D": ["B"], "E": ["C"], "F": ["C"], "G": ["D", "E"], "H": ["F"], "I": ["F"], "J": ["G", "H"], "K": ["I"], "L": ["J", "K"], "M": ["A"], "N": ["L", "M"], }, "N", [ ("N", 0, (7,), False), ("M", 1, (1, 4, 1), True), ("L", 
0, (6,), False), ("K", 1, (1, 3, 2), False), ("I", 1, (1, 3, 1), True), ("J", 0, (5,), False), ("H", 1, (1, 2, 2), False), ("F", 1, (1, 2, 1), True), ("G", 0, (4,), False), ("E", 1, (1, 1, 2), False), ("C", 1, (1, 1, 1), True), ("D", 0, (3,), False), ("B", 0, (2,), False), ("A", 0, (1,), True), ], ) def test_end_of_merge_not_last_revision_in_branch(self): # within a branch only the last revision gets an # end of merge marker. self.assertSortAndIterate( { "A": ["B"], "B": [], }, "A", [("A", 0, (2,), False), ("B", 0, (1,), True)], ) def test_end_of_merge_multiple_revisions_merged_at_once(self): # when multiple branches are merged at once, both of their # branch-endpoints should be listed as end-of-merge. # Also, the order of the multiple merges should be # left-right shown top to bottom. # * means end of merge # A 0 [H, B, E] # B 1 [D, C] # C 2 [D] * # D 1 [H] * # E 1 [G, F] # F 2 [G] * # G 1 [H] * # H 0 [] * self.assertSortAndIterate( { "A": ["H", "B", "E"], "B": ["D", "C"], "C": ["D"], "D": ["H"], "E": ["G", "F"], "F": ["G"], "G": ["H"], "H": [], }, "A", [ ("A", 0, (2,), False), ("B", 1, (1, 3, 2), False), ("C", 2, (1, 4, 1), True), ("D", 1, (1, 3, 1), True), ("E", 1, (1, 1, 2), False), ("F", 2, (1, 2, 1), True), ("G", 1, (1, 1, 1), True), ("H", 0, (1,), True), ], ) def test_parallel_root_sequence_numbers_increase_with_merges(self): """When there are parallel roots, check their revnos.""" self.assertSortAndIterate( {"A": [], "B": [], "C": ["A", "B"]}, "C", [ ("C", 0, (2,), False), ("B", 1, (0, 1, 1), True), ("A", 0, (1,), True), ], ) def test_revnos_are_globally_assigned(self): """Revnos are assigned according to the revision they derive from.""" # in this test we set up a number of branches that all derive from # the first revision, and then merge them one at a time, which # should give the revisions, as they merge, numbers still deriving from # the revision they were based on. 
# merge 3: J: ['G', 'I'] # branch 3: # I: ['H'] # H: ['A'] # merge 2: G: ['D', 'F'] # branch 2: # F: ['E'] # E: ['A'] # merge 1: D: ['A', 'C'] # branch 1: # C: ['B'] # B: ['A'] # root: A: [] self.assertSortAndIterate( { "J": ["G", "I"], "I": [ "H", ], "H": ["A"], "G": ["D", "F"], "F": ["E"], "E": ["A"], "D": ["A", "C"], "C": ["B"], "B": ["A"], "A": [], }, "J", [ ("J", 0, (4,), False), ("I", 1, (1, 3, 2), False), ("H", 1, (1, 3, 1), True), ("G", 0, (3,), False), ("F", 1, (1, 2, 2), False), ("E", 1, (1, 2, 1), True), ("D", 0, (2,), False), ("C", 1, (1, 1, 2), False), ("B", 1, (1, 1, 1), True), ("A", 0, (1,), True), ], ) def test_roots_and_sub_branches_versus_ghosts(self): """Extra roots and their mini branches use the same numbering. All of them use the 0-node numbering. """ # A D K # | |\ |\ # B E F L M # | |/ |/ # C G N # |/ |\ # H I O P # |/ |/ # J Q # |.---' # R self.assertSortAndIterate( { "A": [], "B": ["A"], "C": ["B"], "D": [], "E": ["D"], "F": ["D"], "G": ["E", "F"], "H": ["C", "G"], "I": [], "J": ["H", "I"], "K": [], "L": ["K"], "M": ["K"], "N": ["L", "M"], "O": ["N"], "P": ["N"], "Q": ["O", "P"], "R": ["J", "Q"], }, "R", [ ("R", 0, (6,), False), ("Q", 1, (0, 4, 5), False), ("P", 2, (0, 6, 1), True), ("O", 1, (0, 4, 4), False), ("N", 1, (0, 4, 3), False), ("M", 2, (0, 5, 1), True), ("L", 1, (0, 4, 2), False), ("K", 1, (0, 4, 1), True), ("J", 0, (5,), False), ("I", 1, (0, 3, 1), True), ("H", 0, (4,), False), ("G", 1, (0, 1, 3), False), ("F", 2, (0, 2, 1), True), ("E", 1, (0, 1, 2), False), ("D", 1, (0, 1, 1), True), ("C", 0, (3,), False), ("B", 0, (2,), False), ("A", 0, (1,), True), ], ) def test_ghost(self): # merge_sort should be able to ignore ghosts # A # | # B ghost # |/ # C self.assertSortAndIterate( { "A": [], "B": ["A"], "C": ["B", "ghost"], }, "C", [ ("C", 0, (3,), False), ("B", 0, (2,), False), ("A", 0, (1,), True), ], ) def test_lefthand_ghost(self): # ghost # | # A # | # B self.assertSortAndIterate( { "A": ["ghost"], "B": ["A"], }, "B", [ ("B", 
0, (2,), False), ("A", 0, (1,), True), ], ) def test_graph_cycle(self): # merge_sort should fail with a simple error when a graph cycle is # encountered. # # A # |,-. # B | # | | # C ^ # | | # D | # |'-' # E self.assertRaises( errors.GraphCycleError, self.assertSortAndIterate, { "A": [], "B": ["D"], "C": ["B"], "D": ["C"], "E": ["D"], }, "E", [], ) class TestKnownGraphStableReverseTopoSort(TestCaseWithKnownGraph): """Test the sort order returned by gc_sort.""" def assertSorted(self, expected, parent_map): graph = self.make_known_graph(parent_map) value = graph.gc_sort() if expected != value: self.assertMultiLineEqual(pprint.pformat(expected), pprint.pformat(value)) def test_empty(self): self.assertSorted([], {}) def test_single(self): self.assertSorted(["a"], {"a": ()}) self.assertSorted([("a",)], {("a",): ()}) self.assertSorted([("F", "a")], {("F", "a"): ()}) def test_linear(self): self.assertSorted(["c", "b", "a"], {"a": (), "b": ("a",), "c": ("b",)}) self.assertSorted( [("c",), ("b",), ("a",)], {("a",): (), ("b",): (("a",),), ("c",): (("b",),)} ) self.assertSorted( [("F", "c"), ("F", "b"), ("F", "a")], {("F", "a"): (), ("F", "b"): (("F", "a"),), ("F", "c"): (("F", "b"),)}, ) def test_mixed_ancestries(self): # Each prefix should be sorted separately self.assertSorted( [ ("F", "c"), ("F", "b"), ("F", "a"), ("G", "c"), ("G", "b"), ("G", "a"), ("Q", "c"), ("Q", "b"), ("Q", "a"), ], { ("F", "a"): (), ("F", "b"): (("F", "a"),), ("F", "c"): (("F", "b"),), ("G", "a"): (), ("G", "b"): (("G", "a"),), ("G", "c"): (("G", "b"),), ("Q", "a"): (), ("Q", "b"): (("Q", "a"),), ("Q", "c"): (("Q", "b"),), }, ) def test_stable_sorting(self): # the sort order should be stable even when extra nodes are added self.assertSorted(["b", "c", "a"], {"a": (), "b": ("a",), "c": ("a",)}) self.assertSorted( ["b", "c", "d", "a"], {"a": (), "b": ("a",), "c": ("a",), "d": ("a",)} ) self.assertSorted( ["b", "c", "d", "a"], {"a": (), "b": ("a",), "c": ("a",), "d": ("a",)} ) self.assertSorted( ["Z", "b", 
"c", "d", "a"], {"a": (), "b": ("a",), "c": ("a",), "d": ("a",), "Z": ("a",)}, ) self.assertSorted( ["e", "b", "c", "f", "Z", "d", "a"], { "a": (), "b": ("a",), "c": ("a",), "d": ("a",), "Z": ("a",), "e": ("b", "c", "d"), "f": ("d", "Z"), }, ) def test_skip_ghost(self): self.assertSorted(["b", "c", "a"], {"a": (), "b": ("a", "ghost"), "c": ("a",)}) def test_skip_mainline_ghost(self): self.assertSorted(["b", "c", "a"], {"a": (), "b": ("ghost", "a"), "c": ("a",)}) python-vcsgraph-0.2.0/vcsgraph/tests/test_tsort.py0000644000000000000000000005422215167007306017350 0ustar00# Copyright (C) 2005-2009, 2016 Canonical Ltd # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. 
# # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA """Tests for topological sort.""" import pprint from unittest import TestCase from ..errors import GraphCycleError from ..graph import NULL_REVISION from ..tsort import MergeSorter, TopoSorter, merge_sort, topo_sort class TopoSortTests(TestCase): def assertSortAndIterate(self, graph, result_list): """Check that sorting and iter_topo_order on graph works.""" self.assertEqual(result_list, topo_sort(graph)) self.assertEqual(result_list, list(TopoSorter(graph).iter_topo_order())) def assertSortAndIterateRaise(self, exception_type, graph): """Try iterating and topo_sorting graph and expect an exception.""" self.assertRaises(exception_type, topo_sort, graph) self.assertRaises(exception_type, list, TopoSorter(graph).iter_topo_order()) def assertSortAndIterateOrder(self, graph): """Check topo_sort and iter_topo_order is genuinely topological order. For every child in the graph, check if it comes after all of its parents. 
""" sort_result = topo_sort(graph) iter_result = list(TopoSorter(graph).iter_topo_order()) for node, parents in graph: for parent in parents: if sort_result.index(node) < sort_result.index(parent): self.fail( f"parent {parent} must come before child {node}:\n{sort_result}" ) if iter_result.index(node) < iter_result.index(parent): self.fail( f"parent {parent} must come before child {node}:\n{iter_result}" ) def test_tsort_empty(self): """TopoSort empty list.""" self.assertSortAndIterate([], []) def test_tsort_easy(self): """TopoSort list with one node.""" self.assertSortAndIterate({0: []}.items(), [0]) def test_tsort_cycle(self): """TopoSort traps graph with cycles.""" self.assertSortAndIterateRaise(GraphCycleError, {0: [1], 1: [0]}.items()) def test_tsort_cycle_2(self): """TopoSort traps graph with longer cycle.""" self.assertSortAndIterateRaise( GraphCycleError, {0: [1], 1: [2], 2: [0]}.items() ) def test_topo_sort_cycle_with_tail(self): """TopoSort traps graph with a cycle and a tail.""" self.assertSortAndIterateRaise( GraphCycleError, {0: [1], 1: [2], 2: [3, 4], 3: [0], 4: []}.items() ) def test_tsort_1(self): """TopoSort simple nontrivial graph.""" self.assertSortAndIterate( {0: [3], 1: [4], 2: [1, 4], 3: [], 4: [0, 3]}.items(), [3, 0, 4, 1, 2] ) def test_tsort_partial(self): """Topological sort with partial ordering. Multiple correct orderings are possible, so test for correctness, not for exact match on the resulting list. 
""" self.assertSortAndIterateOrder( [ (0, []), (1, [0]), (2, [0]), (3, [0]), (4, [1, 2, 3]), (5, [1, 2]), (6, [1, 2]), (7, [2, 3]), (8, [0, 1, 4, 5, 6]), ] ) def test_tsort_unincluded_parent(self): """Sort nodes, but don't include some parents in the output.""" self.assertSortAndIterate([(0, [1]), (1, [2])], [1, 0]) class MergeSortTests(TestCase): def assertSortAndIterate( self, graph, branch_tip, result_list, generate_revno, mainline_revisions=None ): """Check that merge based sort and iter_topo_order on graph works.""" value = merge_sort( graph, branch_tip, mainline_revisions=mainline_revisions, generate_revno=generate_revno, ) if result_list != value: self.assertMultiLineEqual(pprint.pformat(result_list), pprint.pformat(value)) self.assertEqual( result_list, list( MergeSorter( graph, branch_tip, mainline_revisions=mainline_revisions, generate_revno=generate_revno, ).iter_topo_order() ), ) def test_merge_sort_empty(self): # sorting of an empty graph does not error self.assertSortAndIterate({}, None, [], False) self.assertSortAndIterate({}, None, [], True) self.assertSortAndIterate({}, NULL_REVISION, [], False) self.assertSortAndIterate({}, NULL_REVISION, [], True) def test_merge_sort_not_empty_no_tip(self): # merge sorting of a branch starting with None should result # in an empty list: no revisions are dragged in. 
        self.assertSortAndIterate({0: []}.items(), None, [], False)
        self.assertSortAndIterate({0: []}.items(), None, [], True)

    def test_merge_sort_one_revision(self):
        # sorting with one revision as the tip returns the correct fields:
        # sequence - 0, revision id, merge depth - 0, end_of_merge
        self.assertSortAndIterate({"id": []}.items(), "id", [(0, "id", 0, True)], False)
        self.assertSortAndIterate(
            {"id": []}.items(), "id", [(0, "id", 0, (1,), True)], True
        )

    def test_sequence_numbers_increase_no_merges(self):
        # emit a few revisions with no merges to check the sequence
        # numbering works in trivial cases
        self.assertSortAndIterate(
            {"A": [], "B": ["A"], "C": ["B"]}.items(),
            "C",
            [
                (0, "C", 0, False),
                (1, "B", 0, False),
                (2, "A", 0, True),
            ],
            False,
        )
        self.assertSortAndIterate(
            {"A": [], "B": ["A"], "C": ["B"]}.items(),
            "C",
            [
                (0, "C", 0, (3,), False),
                (1, "B", 0, (2,), False),
                (2, "A", 0, (1,), True),
            ],
            True,
        )

    def test_sequence_numbers_increase_with_merges(self):
        # test that sequence numbers increase across merges
        self.assertSortAndIterate(
            {"A": [], "B": ["A"], "C": ["A", "B"]}.items(),
            "C",
            [
                (0, "C", 0, False),
                (1, "B", 1, True),
                (2, "A", 0, True),
            ],
            False,
        )
        self.assertSortAndIterate(
            {"A": [], "B": ["A"], "C": ["A", "B"]}.items(),
            "C",
            [
                (0, "C", 0, (2,), False),
                (1, "B", 1, (1, 1, 1), True),
                (2, "A", 0, (1,), True),
            ],
            True,
        )

    def test_merge_sort_race(self):
        # A
        # |
        # B-.
        # |\ \
        # | | C
        # | |/
        # | D
        # |/
        # F
        graph = {
            "A": [],
            "B": ["A"],
            "C": ["B"],
            "D": ["B", "C"],
            "F": ["B", "D"],
        }
        self.assertSortAndIterate(
            graph,
            "F",
            [
                (0, "F", 0, (3,), False),
                (1, "D", 1, (2, 2, 1), False),
                (2, "C", 2, (2, 1, 1), True),
                (3, "B", 0, (2,), False),
                (4, "A", 0, (1,), True),
            ],
            True,
        )
        # A
        # |
        # B-.
        # |\ \
        # | X C
        # | |/
        # | D
        # |/
        # F
        graph = {
            "A": [],
            "B": ["A"],
            "C": ["B"],
            "X": ["B"],
            "D": ["X", "C"],
            "F": ["B", "D"],
        }
        self.assertSortAndIterate(
            graph,
            "F",
            [
                (0, "F", 0, (3,), False),
                (1, "D", 1, (2, 1, 2), False),
                (2, "C", 2, (2, 2, 1), True),
                (3, "X", 1, (2, 1, 1), True),
                (4, "B", 0, (2,), False),
                (5, "A", 0, (1,), True),
            ],
            True,
        )

    def test_merge_depth_with_nested_merges(self):
        # the merge depth marker should reflect the depth of the revision
        # in terms of merges out from the mainline
        # revid, depth, parents:
        #  A 0 [D, B]
        #  B 1 [C, F]
        #  C 1 [H]
        #  D 0 [H, E]
        #  E 1 [G, F]
        #  F 2 [G]
        #  G 1 [H]
        #  H 0
        self.assertSortAndIterate(
            {
                "A": ["D", "B"],
                "B": ["C", "F"],
                "C": ["H"],
                "D": ["H", "E"],
                "E": ["G", "F"],
                "F": ["G"],
                "G": ["H"],
                "H": [],
            }.items(),
            "A",
            [
                (0, "A", 0, False),
                (1, "B", 1, False),
                (2, "C", 1, True),
                (3, "D", 0, False),
                (4, "E", 1, False),
                (5, "F", 2, True),
                (6, "G", 1, True),
                (7, "H", 0, True),
            ],
            False,
        )
        self.assertSortAndIterate(
            {
                "A": ["D", "B"],
                "B": ["C", "F"],
                "C": ["H"],
                "D": ["H", "E"],
                "E": ["G", "F"],
                "F": ["G"],
                "G": ["H"],
                "H": [],
            }.items(),
            "A",
            [
                (0, "A", 0, (3,), False),
                (1, "B", 1, (1, 3, 2), False),
                (2, "C", 1, (1, 3, 1), True),
                (3, "D", 0, (2,), False),
                (4, "E", 1, (1, 1, 2), False),
                (5, "F", 2, (1, 2, 1), True),
                (6, "G", 1, (1, 1, 1), True),
                (7, "H", 0, (1,), True),
            ],
            True,
        )

    def test_dotted_revnos_with_simple_merges(self):
        # A         1
        # |\
        # B C       2, 1.1.1
        # | |\
        # D E F     3, 1.1.2, 1.2.1
        # |/ /|
        # G H I     4, 1.2.2, 1.3.1
        # |/ /
        # J K       5, 1.3.2
        # |/
        # L         6
        self.assertSortAndIterate(
            {
                "A": [],
                "B": ["A"],
                "C": ["A"],
                "D": ["B"],
                "E": ["C"],
                "F": ["C"],
                "G": ["D", "E"],
                "H": ["F"],
                "I": ["F"],
                "J": ["G", "H"],
                "K": ["I"],
                "L": ["J", "K"],
            }.items(),
            "L",
            [
                (0, "L", 0, (6,), False),
                (1, "K", 1, (1, 3, 2), False),
                (2, "I", 1, (1, 3, 1), True),
                (3, "J", 0, (5,), False),
                (4, "H", 1, (1, 2, 2), False),
                (5, "F", 1, (1, 2, 1), True),
                (6, "G", 0, (4,), False),
                (7, "E", 1, (1, 1, 2), False),
                (8, "C", 1, (1, 1, 1), True),
                (9, "D", 0, (3,), False),
                (10, "B", 0, (2,), False),
                (11, "A", 0, (1,), True),
            ],
            True,
        )
        # Adding a shortcut from the first revision should not change any of
        # the existing numbers
        self.assertSortAndIterate(
            {
                "A": [],
                "B": ["A"],
                "C": ["A"],
                "D": ["B"],
                "E": ["C"],
                "F": ["C"],
                "G": ["D", "E"],
                "H": ["F"],
                "I": ["F"],
                "J": ["G", "H"],
                "K": ["I"],
                "L": ["J", "K"],
                "M": ["A"],
                "N": ["L", "M"],
            }.items(),
            "N",
            [
                (0, "N", 0, (7,), False),
                (1, "M", 1, (1, 4, 1), True),
                (2, "L", 0, (6,), False),
                (3, "K", 1, (1, 3, 2), False),
                (4, "I", 1, (1, 3, 1), True),
                (5, "J", 0, (5,), False),
                (6, "H", 1, (1, 2, 2), False),
                (7, "F", 1, (1, 2, 1), True),
                (8, "G", 0, (4,), False),
                (9, "E", 1, (1, 1, 2), False),
                (10, "C", 1, (1, 1, 1), True),
                (11, "D", 0, (3,), False),
                (12, "B", 0, (2,), False),
                (13, "A", 0, (1,), True),
            ],
            True,
        )

    def test_end_of_merge_not_last_revision_in_branch(self):
        # within a branch only the last revision gets an
        # end of merge marker.
        self.assertSortAndIterate(
            {
                "A": ["B"],
                "B": [],
            },
            "A",
            [(0, "A", 0, False), (1, "B", 0, True)],
            False,
        )
        self.assertSortAndIterate(
            {
                "A": ["B"],
                "B": [],
            },
            "A",
            [(0, "A", 0, (2,), False), (1, "B", 0, (1,), True)],
            True,
        )

    def test_end_of_merge_multiple_revisions_merged_at_once(self):
        # when multiple branches are merged at once, both of their
        # branch-endpoints should be listed as end-of-merge.
        # Also, the order of the multiple merges should be
        # left-right shown top to bottom.
        # * means end of merge
        #  A 0 [H, B, E]
        #  B 1 [D, C]
        #  C 2 [D] *
        #  D 1 [H] *
        #  E 1 [G, F]
        #  F 2 [G] *
        #  G 1 [H] *
        #  H 0 [] *
        self.assertSortAndIterate(
            {
                "A": ["H", "B", "E"],
                "B": ["D", "C"],
                "C": ["D"],
                "D": ["H"],
                "E": ["G", "F"],
                "F": ["G"],
                "G": ["H"],
                "H": [],
            },
            "A",
            [
                (0, "A", 0, False),
                (1, "B", 1, False),
                (2, "C", 2, True),
                (3, "D", 1, True),
                (4, "E", 1, False),
                (5, "F", 2, True),
                (6, "G", 1, True),
                (7, "H", 0, True),
            ],
            False,
        )
        self.assertSortAndIterate(
            {
                "A": ["H", "B", "E"],
                "B": ["D", "C"],
                "C": ["D"],
                "D": ["H"],
                "E": ["G", "F"],
                "F": ["G"],
                "G": ["H"],
                "H": [],
            },
            "A",
            [
                (0, "A", 0, (2,), False),
                (1, "B", 1, (1, 3, 2), False),
                (2, "C", 2, (1, 4, 1), True),
                (3, "D", 1, (1, 3, 1), True),
                (4, "E", 1, (1, 1, 2), False),
                (5, "F", 2, (1, 2, 1), True),
                (6, "G", 1, (1, 1, 1), True),
                (7, "H", 0, (1,), True),
            ],
            True,
        )

    def test_mainline_revs_partial(self):
        # when a mainline_revisions list is passed this must
        # override the graph's idea of mainline, and must also
        # truncate the output to the specified range, if needed.
        # so we test both at once: a mainline_revisions list that
        # disagrees with the graph about which revs are 'mainline'
        # and also truncates the output.
        # graph:
        #  A 0 [E, B]
        #  B 1 [D, C]
        #  C 2 [D]
        #  D 1 [E]
        #  E 0
        # with a mainline of NONE,E,A (the inferred one) this will show the
        # merge depths above.
        # with an overridden mainline of NONE,E,D,B,A it should show:
        #  A 0
        #  B 0
        #  C 1
        #  D 0
        #  E 0
        # and thus when truncated to D,B,A it should show
        #  A 0
        #  B 0
        #  C 1
        # because C is brought in by B in this view and D
        # is the terminating revision id
        # this should also preserve revision numbers: C should still be 2.1.1
        self.assertSortAndIterate(
            {"A": ["E", "B"], "B": ["D", "C"], "C": ["D"], "D": ["E"], "E": []},
            "A",
            [
                (0, "A", 0, False),
                (1, "B", 0, False),
                (2, "C", 1, True),
            ],
            False,
            mainline_revisions=["D", "B", "A"],
        )
        self.assertSortAndIterate(
            {"A": ["E", "B"], "B": ["D", "C"], "C": ["D"], "D": ["E"], "E": []},
            "A",
            [
                (0, "A", 0, (4,), False),
                (1, "B", 0, (3,), False),
                (2, "C", 1, (2, 1, 1), True),
            ],
            True,
            mainline_revisions=["D", "B", "A"],
        )

    def test_mainline_revs_with_none(self):
        # a simple test to ensure that a mainline_revs
        # list which goes all the way to None works
        self.assertSortAndIterate(
            {
                "A": [],
            },
            "A",
            [
                (0, "A", 0, True),
            ],
            False,
            mainline_revisions=[None, "A"],
        )
        self.assertSortAndIterate(
            {
                "A": [],
            },
            "A",
            [
                (0, "A", 0, (1,), True),
            ],
            True,
            mainline_revisions=[None, "A"],
        )

    def test_mainline_revs_with_ghost(self):
        # We have a mainline, but the end of it is actually a ghost
        # The graph that is passed to tsort has had ghosts filtered out, but
        # the mainline history has not.
        self.assertSortAndIterate(
            {"B": [], "C": ["B"]}.items(),
            "C",
            [
                (0, "C", 0, (2,), False),
                (1, "B", 0, (1,), True),
            ],
            True,
            mainline_revisions=["A", "B", "C"],
        )

    def test_parallel_root_sequence_numbers_increase_with_merges(self):
        """When there are parallel roots, check their revnos."""
        self.assertSortAndIterate(
            {"A": [], "B": [], "C": ["A", "B"]}.items(),
            "C",
            [
                (0, "C", 0, (2,), False),
                (1, "B", 1, (0, 1, 1), True),
                (2, "A", 0, (1,), True),
            ],
            True,
        )

    def test_revnos_are_globally_assigned(self):
        """Revnos are assigned according to the revision they derive from."""
        # in this test we set up a number of branches that all derive from
        # the first revision, and then merge them one at a time, which
        # should give the revisions, as they merge, numbers still deriving
        # from the revision they were based on.
        # merge 3: J: ['G', 'I']
        # branch 3:
        #  I: ['H']
        #  H: ['A']
        # merge 2: G: ['D', 'F']
        # branch 2:
        #  F: ['E']
        #  E: ['A']
        # merge 1: D: ['A', 'C']
        # branch 1:
        #  C: ['B']
        #  B: ['A']
        # root: A: []
        self.assertSortAndIterate(
            {
                "J": ["G", "I"],
                "I": [
                    "H",
                ],
                "H": ["A"],
                "G": ["D", "F"],
                "F": ["E"],
                "E": ["A"],
                "D": ["A", "C"],
                "C": ["B"],
                "B": ["A"],
                "A": [],
            }.items(),
            "J",
            [
                (0, "J", 0, (4,), False),
                (1, "I", 1, (1, 3, 2), False),
                (2, "H", 1, (1, 3, 1), True),
                (3, "G", 0, (3,), False),
                (4, "F", 1, (1, 2, 2), False),
                (5, "E", 1, (1, 2, 1), True),
                (6, "D", 0, (2,), False),
                (7, "C", 1, (1, 1, 2), False),
                (8, "B", 1, (1, 1, 1), True),
                (9, "A", 0, (1,), True),
            ],
            True,
        )

    def test_roots_and_sub_branches_versus_ghosts(self):
        """Extra roots and their mini branches use the same numbering.

        All of them use the 0-node numbering.
        """
        #  A D   K
        #  | |\  |\
        #  B E F L M
        #  | |/  |/
        #  C G   N
        #  |/    |\
        #  H I   O P
        #  |/    |/
        #  J     Q
        #  |.---'
        #  R
        self.assertSortAndIterate(
            {
                "A": [],
                "B": ["A"],
                "C": ["B"],
                "D": [],
                "E": ["D"],
                "F": ["D"],
                "G": ["E", "F"],
                "H": ["C", "G"],
                "I": [],
                "J": ["H", "I"],
                "K": [],
                "L": ["K"],
                "M": ["K"],
                "N": ["L", "M"],
                "O": ["N"],
                "P": ["N"],
                "Q": ["O", "P"],
                "R": ["J", "Q"],
            }.items(),
            "R",
            [
                (0, "R", 0, (6,), False),
                (1, "Q", 1, (0, 4, 5), False),
                (2, "P", 2, (0, 6, 1), True),
                (3, "O", 1, (0, 4, 4), False),
                (4, "N", 1, (0, 4, 3), False),
                (5, "M", 2, (0, 5, 1), True),
                (6, "L", 1, (0, 4, 2), False),
                (7, "K", 1, (0, 4, 1), True),
                (8, "J", 0, (5,), False),
                (9, "I", 1, (0, 3, 1), True),
                (10, "H", 0, (4,), False),
                (11, "G", 1, (0, 1, 3), False),
                (12, "F", 2, (0, 2, 1), True),
                (13, "E", 1, (0, 1, 2), False),
                (14, "D", 1, (0, 1, 1), True),
                (15, "C", 0, (3,), False),
                (16, "B", 0, (2,), False),
                (17, "A", 0, (1,), True),
            ],
            True,
        )