././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1709152428.2145355 hypothesis-jsonschema-0.23.1/0000755000175100001770000000000000000000000016732 5ustar00runnerdocker00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1709152418.0 hypothesis-jsonschema-0.23.1/LICENSE0000644000175100001770000004052400000000000017744 0ustar00runnerdocker00000000000000Mozilla Public License Version 2.0 ================================== 1. Definitions -------------- 1.1. "Contributor" means each individual or legal entity that creates, contributes to the creation of, or owns Covered Software. 1.2. "Contributor Version" means the combination of the Contributions of others (if any) used by a Contributor and that particular Contributor's Contribution. 1.3. "Contribution" means Covered Software of a particular Contributor. 1.4. "Covered Software" means Source Code Form to which the initial Contributor has attached the notice in Exhibit A, the Executable Form of such Source Code Form, and Modifications of such Source Code Form, in each case including portions thereof. 1.5. "Incompatible With Secondary Licenses" means (a) that the initial Contributor has attached the notice described in Exhibit B to the Covered Software; or (b) that the Covered Software was made available under the terms of version 1.1 or earlier of the License, but not also under the terms of a Secondary License. 1.6. "Executable Form" means any form of the work other than Source Code Form. 1.7. "Larger Work" means a work that combines Covered Software with other material, in a separate file or files, that is not Covered Software. 1.8. "License" means this document. 1.9. "Licensable" means having the right to grant, to the maximum extent possible, whether at the time of the initial grant or subsequently, any and all of the rights conveyed by this License. 1.10. 
"Modifications" means any of the following: (a) any file in Source Code Form that results from an addition to, deletion from, or modification of the contents of Covered Software; or (b) any new file in Source Code Form that contains any Covered Software. 1.11. "Patent Claims" of a Contributor means any patent claim(s), including without limitation, method, process, and apparatus claims, in any patent Licensable by such Contributor that would be infringed, but for the grant of the License, by the making, using, selling, offering for sale, having made, import, or transfer of either its Contributions or its Contributor Version. 1.12. "Secondary License" means either the GNU General Public License, Version 2.0, the GNU Lesser General Public License, Version 2.1, the GNU Affero General Public License, Version 3.0, or any later versions of those licenses. 1.13. "Source Code Form" means the form of the work preferred for making modifications. 1.14. "You" (or "Your") means an individual or a legal entity exercising rights under this License. For legal entities, "You" includes any entity that controls, is controlled by, or is under common control with You. For purposes of this definition, "control" means (a) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (b) ownership of more than fifty percent (50%) of the outstanding shares or beneficial ownership of such entity. 2. License Grants and Conditions -------------------------------- 2.1. 
Grants Each Contributor hereby grants You a world-wide, royalty-free, non-exclusive license: (a) under intellectual property rights (other than patent or trademark) Licensable by such Contributor to use, reproduce, make available, modify, display, perform, distribute, and otherwise exploit its Contributions, either on an unmodified basis, with Modifications, or as part of a Larger Work; and (b) under Patent Claims of such Contributor to make, use, sell, offer for sale, have made, import, and otherwise transfer either its Contributions or its Contributor Version. 2.2. Effective Date The licenses granted in Section 2.1 with respect to any Contribution become effective for each Contribution on the date the Contributor first distributes such Contribution. 2.3. Limitations on Grant Scope The licenses granted in this Section 2 are the only rights granted under this License. No additional rights or licenses will be implied from the distribution or licensing of Covered Software under this License. Notwithstanding Section 2.1(b) above, no patent license is granted by a Contributor: (a) for any code that a Contributor has removed from Covered Software; or (b) for infringements caused by: (i) Your and any other third party's modifications of Covered Software, or (ii) the combination of its Contributions with other software (except as part of its Contributor Version); or (c) under Patent Claims infringed by Covered Software in the absence of its Contributions. This License does not grant any rights in the trademarks, service marks, or logos of any Contributor (except as may be necessary to comply with the notice requirements in Section 3.4). 2.4. Subsequent Licenses No Contributor makes additional grants as a result of Your choice to distribute the Covered Software under a subsequent version of this License (see Section 10.2) or under the terms of a Secondary License (if permitted under the terms of Section 3.3). 2.5. 
Representation Each Contributor represents that the Contributor believes its Contributions are its original creation(s) or it has sufficient rights to grant the rights to its Contributions conveyed by this License. 2.6. Fair Use This License is not intended to limit any rights You have under applicable copyright doctrines of fair use, fair dealing, or other equivalents. 2.7. Conditions Sections 3.1, 3.2, 3.3, and 3.4 are conditions of the licenses granted in Section 2.1. 3. Responsibilities ------------------- 3.1. Distribution of Source Form All distribution of Covered Software in Source Code Form, including any Modifications that You create or to which You contribute, must be under the terms of this License. You must inform recipients that the Source Code Form of the Covered Software is governed by the terms of this License, and how they can obtain a copy of this License. You may not attempt to alter or restrict the recipients' rights in the Source Code Form. 3.2. Distribution of Executable Form If You distribute Covered Software in Executable Form then: (a) such Covered Software must also be made available in Source Code Form, as described in Section 3.1, and You must inform recipients of the Executable Form how they can obtain a copy of such Source Code Form by reasonable means in a timely manner, at a charge no more than the cost of distribution to the recipient; and (b) You may distribute such Executable Form under the terms of this License, or sublicense it under different terms, provided that the license for the Executable Form does not attempt to limit or alter the recipients' rights in the Source Code Form under this License. 3.3. Distribution of a Larger Work You may create and distribute a Larger Work under terms of Your choice, provided that You also comply with the requirements of this License for the Covered Software. 
If the Larger Work is a combination of Covered Software with a work governed by one or more Secondary Licenses, and the Covered Software is not Incompatible With Secondary Licenses, this License permits You to additionally distribute such Covered Software under the terms of such Secondary License(s), so that the recipient of the Larger Work may, at their option, further distribute the Covered Software under the terms of either this License or such Secondary License(s). 3.4. Notices You may not remove or alter the substance of any license notices (including copyright notices, patent notices, disclaimers of warranty, or limitations of liability) contained within the Source Code Form of the Covered Software, except that You may alter any license notices to the extent required to remedy known factual inaccuracies. 3.5. Application of Additional Terms You may choose to offer, and to charge a fee for, warranty, support, indemnity or liability obligations to one or more recipients of Covered Software. However, You may do so only on Your own behalf, and not on behalf of any Contributor. You must make it absolutely clear that any such warranty, support, indemnity, or liability obligation is offered by You alone, and You hereby agree to indemnify every Contributor for any liability incurred by such Contributor as a result of warranty, support, indemnity or liability terms You offer. You may include additional disclaimers of warranty and limitations of liability specific to any jurisdiction. 4. Inability to Comply Due to Statute or Regulation --------------------------------------------------- If it is impossible for You to comply with any of the terms of this License with respect to some or all of the Covered Software due to statute, judicial order, or regulation then You must: (a) comply with the terms of this License to the maximum extent possible; and (b) describe the limitations and the code they affect. 
Such description must be placed in a text file included with all distributions of the Covered Software under this License. Except to the extent prohibited by statute or regulation, such description must be sufficiently detailed for a recipient of ordinary skill to be able to understand it. 5. Termination -------------- 5.1. The rights granted under this License will terminate automatically if You fail to comply with any of its terms. However, if You become compliant, then the rights granted under this License from a particular Contributor are reinstated (a) provisionally, unless and until such Contributor explicitly and finally terminates Your grants, and (b) on an ongoing basis, if such Contributor fails to notify You of the non-compliance by some reasonable means prior to 60 days after You have come back into compliance. Moreover, Your grants from a particular Contributor are reinstated on an ongoing basis if such Contributor notifies You of the non-compliance by some reasonable means, this is the first time You have received notice of non-compliance with this License from such Contributor, and You become compliant prior to 30 days after Your receipt of the notice. 5.2. If You initiate litigation against any entity by asserting a patent infringement claim (excluding declaratory judgment actions, counter-claims, and cross-claims) alleging that a Contributor Version directly or indirectly infringes any patent, then the rights granted to You by any and all Contributors for the Covered Software under Section 2.1 of this License shall terminate. 5.3. In the event of termination under Sections 5.1 or 5.2 above, all end user license agreements (excluding distributors and resellers) which have been validly granted by You or Your distributors under this License prior to termination shall survive termination. ************************************************************************ * * * 6. 
Disclaimer of Warranty * * ------------------------- * * * * Covered Software is provided under this License on an "as is" * * basis, without warranty of any kind, either expressed, implied, or * * statutory, including, without limitation, warranties that the * * Covered Software is free of defects, merchantable, fit for a * * particular purpose or non-infringing. The entire risk as to the * * quality and performance of the Covered Software is with You. * * Should any Covered Software prove defective in any respect, You * * (not any Contributor) assume the cost of any necessary servicing, * * repair, or correction. This disclaimer of warranty constitutes an * * essential part of this License. No use of any Covered Software is * * authorized under this License except under this disclaimer. * * * ************************************************************************ ************************************************************************ * * * 7. Limitation of Liability * * -------------------------- * * * * Under no circumstances and under no legal theory, whether tort * * (including negligence), contract, or otherwise, shall any * * Contributor, or anyone who distributes Covered Software as * * permitted above, be liable to You for any direct, indirect, * * special, incidental, or consequential damages of any character * * including, without limitation, damages for lost profits, loss of * * goodwill, work stoppage, computer failure or malfunction, or any * * and all other commercial damages or losses, even if such party * * shall have been informed of the possibility of such damages. This * * limitation of liability shall not apply to liability for death or * * personal injury resulting from such party's negligence to the * * extent applicable law prohibits such limitation. Some * * jurisdictions do not allow the exclusion or limitation of * * incidental or consequential damages, so this exclusion and * * limitation may not apply to You. 
* * * ************************************************************************ 8. Litigation ------------- Any litigation relating to this License may be brought only in the courts of a jurisdiction where the defendant maintains its principal place of business and such litigation shall be governed by laws of that jurisdiction, without reference to its conflict-of-law provisions. Nothing in this Section shall prevent a party's ability to bring cross-claims or counter-claims. 9. Miscellaneous ---------------- This License represents the complete agreement concerning the subject matter hereof. If any provision of this License is held to be unenforceable, such provision shall be reformed only to the extent necessary to make it enforceable. Any law or regulation which provides that the language of a contract shall be construed against the drafter shall not be used to construe this License against a Contributor. 10. Versions of the License --------------------------- 10.1. New Versions Mozilla Foundation is the license steward. Except as provided in Section 10.3, no one other than the license steward has the right to modify or publish new versions of this License. Each version will be given a distinguishing version number. 10.2. Effect of New Versions You may distribute the Covered Software under the terms of the version of the License under which You originally received the Covered Software, or under the terms of any subsequent version published by the license steward. 10.3. Modified Versions If you create software not governed by this License, and you want to create a new license for such software, you may create and use a modified version of this License if you rename the license and remove any references to the name of the license steward (except to note that such modified license differs from this License). 10.4. 
Distributing Source Code Form that is Incompatible With Secondary Licenses If You choose to distribute Source Code Form that is Incompatible With Secondary Licenses under the terms of this version of the License, the notice described in Exhibit B of this License must be attached. Exhibit A - Source Code Form License Notice ------------------------------------------- This Source Code Form is subject to the terms of the Mozilla Public License, v. 2.0. If a copy of the MPL was not distributed with this file, You can obtain one at http://mozilla.org/MPL/2.0/. If it is not possible or desirable to put the notice in a particular file, then You may include the notice in a location (such as a LICENSE file in a relevant directory) where a recipient would be likely to look for such a notice. You may add additional accurate notices of copyright ownership. Exhibit B - "Incompatible With Secondary Licenses" Notice --------------------------------------------------------- This Source Code Form is "Incompatible With Secondary Licenses", as defined by the Mozilla Public License, v. 
2.0.././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1709152418.0 hypothesis-jsonschema-0.23.1/MANIFEST.in0000644000175100001770000000014300000000000020466 0ustar00runnerdocker00000000000000include LICENSE include *.json include *.py include tox.ini recursive-include deps *.in *.md *.txt ././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1709152428.2145355 hypothesis-jsonschema-0.23.1/PKG-INFO0000644000175100001770000001041600000000000020031 0ustar00runnerdocker00000000000000Metadata-Version: 2.1 Name: hypothesis-jsonschema Version: 0.23.1 Summary: Generate test data from JSON schemata with Hypothesis Home-page: https://github.com/Zac-HD/hypothesis-jsonschema Author: Zac Hatfield-Dodds Author-email: zac@zhd.dev License: MPL 2.0 Project-URL: Funding, https://github.com/sponsors/Zac-HD Keywords: python testing fuzzing property-based-testing json-schema Classifier: Development Status :: 4 - Beta Classifier: Framework :: Hypothesis Classifier: Intended Audience :: Developers Classifier: License :: OSI Approved :: Mozilla Public License 2.0 (MPL 2.0) Classifier: Programming Language :: Python Classifier: Programming Language :: Python :: 3 Classifier: Programming Language :: Python :: 3.8 Classifier: Programming Language :: Python :: 3.9 Classifier: Programming Language :: Python :: 3.10 Classifier: Programming Language :: Python :: 3.11 Classifier: Topic :: Education :: Testing Classifier: Topic :: Software Development :: Testing Classifier: Typing :: Typed Requires-Python: >=3.8 Description-Content-Type: text/markdown License-File: LICENSE Requires-Dist: hypothesis>=6.84.3 Requires-Dist: jsonschema>=4.18.0 # hypothesis-jsonschema A [Hypothesis](https://hypothesis.readthedocs.io) strategy for generating data that matches some [JSON schema](https://json-schema.org/). 
[Here's the PyPI page.](https://pypi.org/project/hypothesis-jsonschema/)

## API

The public API consists of just one function: `hypothesis_jsonschema.from_schema`,
which takes a JSON schema and returns a strategy for allowed JSON objects.

```python
import json
import re

from hypothesis import given, strategies as st
from hypothesis_jsonschema import from_schema

# Illustrative values only; substitute your own examples.
EXAMPLE_CARD_NUMBERS = ["1234 5678 9012 3456", "0000 1111 2222 3333"]


@given(from_schema({"type": "integer", "minimum": 1, "exclusiveMaximum": 10}))
def test_integers(value):
    assert isinstance(value, int)
    assert 1 <= value < 10


@given(
    from_schema(
        {"type": "string", "format": "card"},
        # Standard formats work out of the box.  Custom formats are ignored
        # by default, but you can pass custom strategies for them - e.g.
        custom_formats={"card": st.sampled_from(EXAMPLE_CARD_NUMBERS)},
    )
)
def test_card_numbers(value):
    assert isinstance(value, str)
    assert re.match(r"^\d{4} \d{4} \d{4} \d{4}$", value)


@given(from_schema({}, allow_x00=False, codec="utf-8").map(json.dumps))
def test_json_payloads(payload):
    assert isinstance(payload, str)
    assert "\0" not in payload  # use allow_x00=False to exclude null characters
    # If you want to restrict generated strings to characters which are valid in
    # a specific character encoding, you can do that with the `codec=` argument.
    payload.encode("utf-8")
```

For more details on property-based testing and how to use or customise
strategies, [see the Hypothesis docs](https://hypothesis.readthedocs.io/).

JSONSchema drafts 04, 05, and 07 are fully tested and working.
As of version 0.11, this includes resolving non-recursive references!

## Supported versions

`hypothesis-jsonschema` requires Python 3.8 or later.
In general, 0.x versions will require very recent versions of all dependencies
because I don't want to deal with compatibility workarounds.

`hypothesis-jsonschema` may make backwards-incompatible changes at any time
before version 1.x - that's what semver means! - but I've kept the API surface
small enough that this should be avoidable.
The main source of breaks will be if or when schema that never really worked turn into explicit errors instead of generating values that don't quite match.

You can [sponsor me](https://github.com/sponsors/Zac-HD) to get priority
support, roadmap input, and prioritized feature development.

## Contributing to `hypothesis-jsonschema`

We love external contributions - and try to make them both easy and fun.
You can [read more details in our contributing guide](https://github.com/Zac-HD/hypothesis-jsonschema/blob/master/CONTRIBUTING.md),
and [see everyone who has contributed on GitHub](https://github.com/Zac-HD/hypothesis-jsonschema/graphs/contributors).
Thanks, everyone!

### Changelog

Patch notes [can be found in `CHANGELOG.md`](https://github.com/Zac-HD/hypothesis-jsonschema/blob/master/CHANGELOG.md).

### Security contact information

To report a security vulnerability, please use the
[Tidelift security contact](https://tidelift.com/security).
Tidelift will coordinate the fix and disclosure.

././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1709152418.0 hypothesis-jsonschema-0.23.1/README.md0000644000175100001770000000621600000000000020216 0ustar00runnerdocker00000000000000
# hypothesis-jsonschema

A [Hypothesis](https://hypothesis.readthedocs.io) strategy for generating data
that matches some [JSON schema](https://json-schema.org/).

[Here's the PyPI page.](https://pypi.org/project/hypothesis-jsonschema/)

## API

The public API consists of just one function: `hypothesis_jsonschema.from_schema`,
which takes a JSON schema and returns a strategy for allowed JSON objects.

```python
import json
import re

from hypothesis import given, strategies as st
from hypothesis_jsonschema import from_schema

# Illustrative values only; substitute your own examples.
EXAMPLE_CARD_NUMBERS = ["1234 5678 9012 3456", "0000 1111 2222 3333"]


@given(from_schema({"type": "integer", "minimum": 1, "exclusiveMaximum": 10}))
def test_integers(value):
    assert isinstance(value, int)
    assert 1 <= value < 10


@given(
    from_schema(
        {"type": "string", "format": "card"},
        # Standard formats work out of the box.  Custom formats are ignored
        # by default, but you can pass custom strategies for them - e.g.
        custom_formats={"card": st.sampled_from(EXAMPLE_CARD_NUMBERS)},
    )
)
def test_card_numbers(value):
    assert isinstance(value, str)
    assert re.match(r"^\d{4} \d{4} \d{4} \d{4}$", value)


@given(from_schema({}, allow_x00=False, codec="utf-8").map(json.dumps))
def test_json_payloads(payload):
    assert isinstance(payload, str)
    assert "\0" not in payload  # use allow_x00=False to exclude null characters
    # If you want to restrict generated strings to characters which are valid in
    # a specific character encoding, you can do that with the `codec=` argument.
    payload.encode("utf-8")
```

For more details on property-based testing and how to use or customise
strategies, [see the Hypothesis docs](https://hypothesis.readthedocs.io/).

JSONSchema drafts 04, 05, and 07 are fully tested and working.
As of version 0.11, this includes resolving non-recursive references!

## Supported versions

`hypothesis-jsonschema` requires Python 3.8 or later.
In general, 0.x versions will require very recent versions of all dependencies
because I don't want to deal with compatibility workarounds.

`hypothesis-jsonschema` may make backwards-incompatible changes at any time
before version 1.x - that's what semver means! - but I've kept the API surface
small enough that this should be avoidable. The main source of breaks will be
if or when schema that never really worked turn into explicit errors instead
of generating values that don't quite match.

You can [sponsor me](https://github.com/sponsors/Zac-HD) to get priority
support, roadmap input, and prioritized feature development.

## Contributing to `hypothesis-jsonschema`

We love external contributions - and try to make them both easy and fun.
You can [read more details in our contributing guide](https://github.com/Zac-HD/hypothesis-jsonschema/blob/master/CONTRIBUTING.md), and [see everyone who has contributed on GitHub](https://github.com/Zac-HD/hypothesis-jsonschema/graphs/contributors). Thanks, everyone! ### Changelog Patch notes [can be found in `CHANGELOG.md`](https://github.com/Zac-HD/hypothesis-jsonschema/blob/master/CHANGELOG.md). ### Security contact information To report a security vulnerability, please use the [Tidelift security contact](https://tidelift.com/security). Tidelift will coordinate the fix and disclosure. ././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1709152428.2105355 hypothesis-jsonschema-0.23.1/deps/0000755000175100001770000000000000000000000017665 5ustar00runnerdocker00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1709152418.0 hypothesis-jsonschema-0.23.1/deps/README.md0000644000175100001770000000137000000000000021145 0ustar00runnerdocker00000000000000# Pinning dependencies `hypothesis-jsonschema` pins *all* our dependencies for testing, and disables installation of any unlisted dependencies to make sure the set of pins is complete. How does this work? 1. `setup.py` lists all our top-level dependencies for the library, and *also* lists the development and test-time dependencies. 2. `pip-compile` calculates all the transitive dependencies we need, with exact version pins. We use `tox -e deps` to make this more convenient, and don't bother pinning `pip-tools` as it's always run manually (never in CI). 3. `tox` then installs from the files full of pinned versions here! 
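
As a rough illustration of what "exact version pins" means in step 2, here is a minimal Python sketch that scans `pip-compile` style output for any requirement lacking an `==` pin. The helper name `unpinned_requirements` and the regular expression are assumptions made for this example; they are not part of `hypothesis-jsonschema` or `pip-tools`, and the pattern only covers simple `name==version` lines (no environment markers or hashes).

```python
import re
from typing import List

# Matches a fully pinned requirement such as "pytest==7.4.2" or
# "jsonschema[format]==4.19.0"; anything else is treated as unpinned.
# (Hypothetical pattern, written for this sketch only.)
PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[A-Za-z0-9,._-]+\])?==[^\s=]+$")


def unpinned_requirements(text: str) -> List[str]:
    """Return requirement lines that lack an exact `==` version pin."""
    problems = []
    for raw in text.splitlines():
        # Drop comments such as "# via shed" before checking the line.
        line = raw.split("#", 1)[0].strip()
        if line and not PINNED.match(line):
            problems.append(line)
    return problems


sample = """\
# This file is autogenerated by pip-compile
pytest==7.4.2
    # via -r deps/test.in
jsonschema[format]==4.19.0
tomli>=2.0
"""
print(unpinned_requirements(sample))
```

Run against a real file such as `deps/test.txt`, an empty result would suggest that every listed package carries an exact pin.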
That's it - a simple implementation but it stabilises the whole dependency chain and really improves visibility :-) ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1709152418.0 hypothesis-jsonschema-0.23.1/deps/check.in0000644000175100001770000000010200000000000021263 0ustar00runnerdocker00000000000000# Top-level dependencies for `tox -e check` flake8 ruff mypy shed ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1709152418.0 hypothesis-jsonschema-0.23.1/deps/check.txt0000644000175100001770000000213600000000000021505 0ustar00runnerdocker00000000000000# # This file is autogenerated by pip-compile with Python 3.10 # by the following command: # # pip-compile --output-file=deps/check.txt deps/check.in # autoflake==2.2.1 # via shed black==23.9.0 # via shed click==8.1.7 # via black com2ann==0.3.0 # via shed flake8==6.1.0 # via -r deps/check.in isort==5.12.0 # via shed libcst==1.0.1 # via shed mccabe==0.7.0 # via flake8 mypy==1.5.1 # via -r deps/check.in mypy-extensions==1.0.0 # via # black # mypy # typing-inspect packaging==23.1 # via black pathspec==0.11.2 # via black platformdirs==3.10.0 # via black pycodestyle==2.11.0 # via flake8 pyflakes==3.1.0 # via # autoflake # flake8 pyupgrade==3.10.1 # via shed pyyaml==6.0.1 # via libcst ruff==0.0.287 # via -r deps/check.in shed==2023.6.1 # via -r deps/check.in tokenize-rt==5.2.0 # via pyupgrade tomli==2.0.1 # via # autoflake # black # mypy typing-extensions==4.7.1 # via # black # libcst # mypy # typing-inspect typing-inspect==0.9.0 # via libcst ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1709152418.0 hypothesis-jsonschema-0.23.1/deps/deps.in0000644000175100001770000000006500000000000021151 0ustar00runnerdocker00000000000000# Top-level dependencies for `tox -e deps` pip-tools ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1709152418.0 
hypothesis-jsonschema-0.23.1/deps/deps.txt0000644000175100001770000000034300000000000021361 0ustar00runnerdocker00000000000000# # This file is autogenerated by pip-compile # To update, run: # # pip-compile --output-file=deps/deps.txt deps/deps.in # click==7.0 # via pip-tools pip-tools==4.1.0 six==1.12.0 # via pip-tools ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1709152418.0 hypothesis-jsonschema-0.23.1/deps/test.in0000644000175100001770000000013500000000000021173 0ustar00runnerdocker00000000000000# Top-level dependencies for `tox -e test` pytest pytest-cov pytest-xdist jsonschema[format] ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1709152418.0 hypothesis-jsonschema-0.23.1/deps/test.txt0000644000175100001770000000306200000000000021406 0ustar00runnerdocker00000000000000# # This file is autogenerated by pip-compile with Python 3.10 # by the following command: # # pip-compile --output-file=deps/test.txt deps/test.in setup.py # arrow==1.2.3 # via isoduration attrs==23.1.0 # via # hypothesis # jsonschema # referencing coverage[toml]==7.3.1 # via pytest-cov exceptiongroup==1.1.3 # via # hypothesis # pytest execnet==2.0.2 # via pytest-xdist fqdn==1.5.1 # via jsonschema hypothesis==6.84.3 # via hypothesis-jsonschema (setup.py) idna==3.4 # via jsonschema iniconfig==2.0.0 # via pytest isoduration==20.11.0 # via jsonschema jsonpointer==2.4 # via jsonschema jsonschema[format]==4.19.0 # via # -r deps/test.in # hypothesis-jsonschema (setup.py) jsonschema-specifications==2023.7.1 # via jsonschema packaging==23.1 # via pytest pluggy==1.3.0 # via pytest pytest==7.4.2 # via # -r deps/test.in # pytest-cov # pytest-xdist pytest-cov==4.1.0 # via -r deps/test.in pytest-xdist==3.3.1 # via -r deps/test.in python-dateutil==2.8.2 # via arrow referencing==0.30.2 # via # jsonschema # jsonschema-specifications rfc3339-validator==0.1.4 # via jsonschema rfc3987==1.3.8 # via jsonschema 
rpds-py==0.10.2
    # via
    #   jsonschema
    #   referencing
six==1.16.0
    # via
    #   python-dateutil
    #   rfc3339-validator
sortedcontainers==2.4.0
    # via hypothesis
tomli==2.0.1
    # via
    #   coverage
    #   pytest
uri-template==1.3.0
    # via jsonschema
webcolors==1.13
    # via jsonschema
././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1709152418.0 hypothesis-jsonschema-0.23.1/pyproject.toml0000644000175100001770000000200400000000000021642 0ustar00runnerdocker00000000000000
[tool.ruff]
select = [
    "ASYNC",  # flake8-async
    "B",      # flake8-bugbear
    "C4",     # flake8-comprehensions
    "COM",    # flake8-commas
    "E",      # pycodestyle
    "F",      # Pyflakes
    "FBT",    # flake8-boolean-trap
    "FLY",    # flynt
    "G",      # flake8-logging-format
    "INT",    # flake8-gettext
    "ISC",    # flake8-implicit-str-concat
    "PIE",    # flake8-pie
    "PLE",    # Pylint errors
    "PT",     # flake8-pytest-style
    "RET504", # flake8-return
    "RSE",    # flake8-raise
    "SIM",    # flake8-simplify
    "T10",    # flake8-debugger
    "TID",    # flake8-tidy-imports
    "UP",     # pyupgrade
    "W",      # pycodestyle
    "YTT",    # flake8-2020
    # "PTH",  # flake8-use-pathlib
    "RUF",    # Ruff-specific rules
]
ignore = [
    "B008",
    "B018",
    "B017",
    "C408",
    "COM812",
    "DJ008",
    "E501",
    "E721",
    "E731",
    "E741",
    "FBT003",
    "PT001",
    "PT003",
    "PT006",
    "PT007",
    "PT009",
    "PT011",
    "PT012",
    "PT013",
    "PT017",
    "PT019",
    "PT023",
    "PT027",
    "UP031",
]
target-version = "py38"
././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1709152428.2145355 hypothesis-jsonschema-0.23.1/setup.cfg0000644000175100001770000000004600000000000020553 0ustar00runnerdocker00000000000000
[egg_info]
tag_build =
tag_date = 0

././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1709152418.0 hypothesis-jsonschema-0.23.1/setup.py0000644000175100001770000000350100000000000020443 0ustar00runnerdocker00000000000000
import os
import pathlib

import setuptools


def local_file(name: str) -> str:
    """Interpret filename as relative to this file."""
    return os.path.relpath(os.path.join(os.path.dirname(__file__), name))


SOURCE = local_file("src")
README = local_file("README.md")

with open(local_file("src/hypothesis_jsonschema/__init__.py")) as o:
    for line in o:
        if line.startswith("__version__"):
            _, __version__, _ = line.split('"')


setuptools.setup(
    name="hypothesis-jsonschema",
    version=__version__,
    author="Zac Hatfield-Dodds",
    author_email="zac@zhd.dev",
    packages=setuptools.find_packages(SOURCE),
    package_dir={"": SOURCE},
    package_data={"": ["py.typed"]},
    url="https://github.com/Zac-HD/hypothesis-jsonschema",
    project_urls={"Funding": "https://github.com/sponsors/Zac-HD"},
    license="MPL 2.0",
    description="Generate test data from JSON schemata with Hypothesis",
    zip_safe=False,
    install_requires=["hypothesis>=6.84.3", "jsonschema>=4.18.0"],
    python_requires=">=3.8",
    classifiers=[
        "Development Status :: 4 - Beta",
        "Framework :: Hypothesis",
        "Intended Audience :: Developers",
        "License :: OSI Approved :: Mozilla Public License 2.0 (MPL 2.0)",
        "Programming Language :: Python",
        "Programming Language :: Python :: 3",
        "Programming Language :: Python :: 3.8",
        "Programming Language :: Python :: 3.9",
        "Programming Language :: Python :: 3.10",
        "Programming Language :: Python :: 3.11",
        "Topic :: Education :: Testing",
        "Topic :: Software Development :: Testing",
        "Typing :: Typed",
    ],
    long_description=pathlib.Path(README).read_text(),
    long_description_content_type="text/markdown",
    keywords="python testing fuzzing property-based-testing json-schema",
)
././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1709152428.2065356 hypothesis-jsonschema-0.23.1/src/0000755000175100001770000000000000000000000017521 5ustar00runnerdocker00000000000000././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1709152428.2105355 hypothesis-jsonschema-0.23.1/src/hypothesis_jsonschema/0000755000175100001770000000000000000000000024132 
5ustar00runnerdocker00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1709152418.0 hypothesis-jsonschema-0.23.1/src/hypothesis_jsonschema/__init__.py0000644000175100001770000000032200000000000026240 0ustar00runnerdocker00000000000000"""A Hypothesis extension for JSON schemata. The only public API is `from_schema`; check the docstring for details. """ __version__ = "0.23.1" __all__ = ["from_schema"] from ._from_schema import from_schema ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1709152418.0 hypothesis-jsonschema-0.23.1/src/hypothesis_jsonschema/_canonicalise.py0000644000175100001770000010477000000000000027304 0ustar00runnerdocker00000000000000""" Canonicalisation logic for JSON schemas. The canonical format that we transform to is not intended for human consumption. Instead, it prioritises locality of reasoning - for example, we convert oneOf arrays into an anyOf of allOf (each sub-schema being the original plus not anyOf the rest). Resolving references and merging subschemas is also really helpful. All this effort is justified by the huge performance improvements that we get when converting to Hypothesis strategies. To the extent possible there is only one way to generate any given value... but much more importantly, we can do most things by construction instead of by filtering. That's the difference between "I'd like it to be faster" and "doesn't finish at all". 
""" import contextlib import itertools import json import math import re from fractions import Fraction from functools import lru_cache from typing import Any, Dict, List, Optional, Tuple, Union import jsonschema from hypothesis.errors import InvalidArgument from hypothesis.internal.floats import next_down as ieee_next_down, next_up from ._encode import JSONType, encode_canonical_json, sort_key Schema = Dict[str, JSONType] JSONSchemaValidator = Union[ jsonschema.validators.Draft4Validator, jsonschema.validators.Draft6Validator, jsonschema.validators.Draft7Validator, ] # Canonical type strings, in order. TYPE_STRINGS = ("null", "boolean", "integer", "number", "string", "array", "object") TYPE_SPECIFIC_KEYS = ( ("number", "multipleOf maximum exclusiveMaximum minimum exclusiveMinimum"), ("integer", "multipleOf maximum exclusiveMaximum minimum exclusiveMinimum"), ("string", "maxLength minLength pattern format contentEncoding contentMediaType"), ("array", "items additionalItems maxItems minItems uniqueItems contains"), ( "object", "maxProperties minProperties required properties patternProperties " "additionalProperties dependencies propertyNames", ), ) # Names of keywords where the associated values may be schemas or lists of schemas. SCHEMA_KEYS = tuple( "items additionalItems contains additionalProperties propertyNames " "if then else allOf anyOf oneOf not".split() ) # Names of keywords where the value is an object whose values are schemas. # Note that in some cases ("dependencies"), the value may be a list of strings. 
SCHEMA_OBJECT_KEYS = ("properties", "patternProperties", "dependencies") ALL_KEYWORDS = ( *SCHEMA_KEYS, *SCHEMA_OBJECT_KEYS, *sum((s.split() for _, s in TYPE_SPECIFIC_KEYS), []), ) def next_down(val: float) -> float: """Compensate for JSONschema's lack of negative zero with an extra step.""" out = ieee_next_down(val) if out == 0 and math.copysign(1, out) == -1: out = ieee_next_down(out) assert isinstance(out, float) return out class CacheableSchema: """Cache schema by its JSON representation. Canonicalisation is not required as schemas with the same JSON representation will have the same validator. """ __slots__ = ("schema", "encoded") def __init__(self, schema: Schema) -> None: self.schema = schema self.encoded = hash(json.dumps(schema, sort_keys=True)) def __eq__(self, other: "CacheableSchema") -> bool: # type: ignore return self.encoded == other.encoded def __hash__(self) -> int: return self.encoded def _get_validator_class(schema: Schema) -> JSONSchemaValidator: return __get_validator_class(CacheableSchema(schema)) @lru_cache(maxsize=128) def __get_validator_class(wrapper: CacheableSchema) -> JSONSchemaValidator: schema = wrapper.schema with contextlib.suppress(jsonschema.exceptions.SchemaError): validator = jsonschema.validators.validator_for(schema) validator.check_schema(schema) return validator with contextlib.suppress(jsonschema.exceptions.SchemaError): jsonschema.Draft7Validator.check_schema(schema) return jsonschema.Draft7Validator jsonschema.Draft4Validator.check_schema(schema) return jsonschema.Draft4Validator def make_validator(schema: Schema) -> JSONSchemaValidator: validator = _get_validator_class(schema) return validator(schema) class HypothesisRefResolutionError(jsonschema.exceptions._RefResolutionError): pass def get_type(schema: Schema) -> List[str]: """Return a canonical value for the "type" key. 
Note that this will return [], the empty list, if the value is a list without any allowed type names; *even though* this is explicitly an invalid value. """ type_ = schema.get("type", list(TYPE_STRINGS)) # Canonicalise the "type" key to a sorted list of type strings. if isinstance(type_, str): assert type_ in TYPE_STRINGS return [type_] assert isinstance(type_, list), type_ assert set(type_).issubset(TYPE_STRINGS), type_ type_ = [t for t in TYPE_STRINGS if t in type_] if "number" in type_ and "integer" in type_: type_.remove("integer") # all integers are numbers, so this is redundant return type_ def upper_bound_instances(schema: Schema) -> float: """Return an upper bound on the number of instances that match this schema.""" if schema == FALSEY: return 0 if "const" in schema: return 1 if "enum" in schema: assert isinstance(schema["enum"], list) return len(schema["enum"]) if get_type(schema) == ["integer"]: lower, upper = get_integer_bounds(schema) if lower is not None and upper is not None: mul = schema.get("multipleOf") if isinstance(mul, int): return 1 + (upper - lower) // mul return 1 + (upper - lower) # Non-integer mul can only reduce upper bound if ( get_type(schema) == ["array"] and isinstance(schema.get("items"), dict) and schema.get("maxItems", math.inf) < 100 # type: ignore ): # For simplicity, we use the upper bound with replacement; while we could # tighten this by considering uniqueItems it's not worth the extra code.
items_bound = upper_bound_instances(schema["items"]) # type: ignore if items_bound < 100: lo, hi = schema.get("minItems", 0), schema["maxItems"] assert isinstance(lo, int) assert isinstance(hi, int) return sum(items_bound**n for n in range(lo, hi + 1)) return math.inf def _get_numeric_bounds( schema: Schema, ) -> Tuple[Optional[float], Optional[float], bool, bool]: """Get the min and max allowed numbers, and whether they are exclusive.""" lower = schema.get("minimum") upper = schema.get("maximum") exmin = schema.get("exclusiveMinimum", False) exmax = schema.get("exclusiveMaximum", False) assert lower is None or isinstance(lower, (int, float)) assert upper is None or isinstance(upper, (int, float)) assert isinstance(exmin, (bool, int, float)) assert isinstance(exmax, (bool, int, float)) # Canonicalise to number-and-boolean representation if exmin is not True and exmin is not False: if lower is None or exmin >= lower: lower, exmin = exmin, True else: exmin = False if exmax is not True and exmax is not False: if upper is None or exmax <= upper: upper, exmax = exmax, True else: exmax = False assert isinstance(exmin, bool) assert isinstance(exmax, bool) return lower, upper, exmin, exmax def get_number_bounds( schema: Schema, ) -> Tuple[Optional[float], Optional[float], bool, bool]: """Get the min and max allowed floats, and whether they are exclusive.""" lower, upper, exmin, exmax = _get_numeric_bounds(schema) if lower is not None: lo = float(lower) if lo < lower: lo = next_up(lo) exmin = False lower = lo if upper is not None: hi = float(upper) if hi > upper: hi = next_down(hi) exmax = False upper = hi return lower, upper, exmin, exmax def get_integer_bounds(schema: Schema) -> Tuple[Optional[int], Optional[int]]: """Get the min and max allowed integers.""" lower, upper, exmin, exmax = _get_numeric_bounds(schema) # Adjust bounds and cast to int if lower is not None: lo = math.ceil(lower) if exmin and lo == lower: lo += 1 lower = lo if upper is not None: hi = 
math.floor(upper) if exmax and hi == upper: hi -= 1 upper = hi return lower, upper def canonicalish(schema: JSONType) -> Dict[str, Any]: """Convert a schema into a more-canonical form. This is obviously incomplete, but improves best-effort recognition of equivalent schemas and makes conversion logic simpler. """ if schema is True: return {} elif schema is False: return {"not": {}} # Make a copy, so we don't mutate the existing schema in place. # Using the canonical encoding makes all integer-valued floats into ints. schema = json.loads(encode_canonical_json(schema)) # Otherwise, we're dealing with "objects", i.e. dicts. if not isinstance(schema, dict): raise InvalidArgument( f"Got schema={schema!r} of type {type(schema).__name__}, " "but expected a dict." ) if "const" in schema: if not make_validator(schema).is_valid(schema["const"]): return FALSEY return {"const": schema["const"]} if "enum" in schema: validator = make_validator(schema) enum_ = sorted( (v for v in schema["enum"] if validator.is_valid(v)), key=sort_key ) if not enum_: return FALSEY elif len(enum_) == 1: return {"const": enum_[0]} return {"enum": enum_} # if/then/else schemas are ignored unless if and another are present if_ = schema.pop("if", None) then = schema.pop("then", schema) else_ = schema.pop("else", schema) if ( if_ is not None and (then is not schema or else_ is not schema) and (then not in (if_, TRUTHY) or else_ != TRUTHY) ): alternatives = [ {"allOf": [if_, then, schema]}, {"allOf": [{"not": if_}, else_, schema]}, ] schema = canonicalish({"anyOf": alternatives}) assert isinstance(schema, dict) # Recurse into the value of each keyword with a schema (or list of them) as a value for key in SCHEMA_KEYS: if isinstance(schema.get(key), list): schema[key] = [canonicalish(v) for v in schema[key]] elif isinstance(schema.get(key), (bool, dict)): schema[key] = canonicalish(schema[key]) else: assert key not in schema, (key, schema[key]) for key in SCHEMA_OBJECT_KEYS: if key in schema: schema[key] = 
{ k: v if isinstance(v, list) else canonicalish(v) for k, v in schema[key].items() } # multipleOf is semantically unaffected by the sign, so ensure it's positive if "multipleOf" in schema: schema["multipleOf"] = abs(schema["multipleOf"]) type_ = get_type(schema) if "number" in type_: if schema.get("exclusiveMinimum") is False: del schema["exclusiveMinimum"] if schema.get("exclusiveMaximum") is False: del schema["exclusiveMaximum"] lo, hi, exmin, exmax = get_number_bounds(schema) mul = schema.get("multipleOf") if isinstance(mul, int): # Numbers which are a multiple of an integer? That's the integer type. type_.remove("number") type_ = [t for t in TYPE_STRINGS if t in type_ or t == "integer"] elif lo is not None and hi is not None: lobound = next_up(lo) if exmin else lo hibound = next_down(hi) if exmax else hi if ( mul and not has_divisibles(lo, hi, mul, exmin, exmax) ) or lobound > hibound: type_.remove("number") elif type_ == ["number"] and lobound == hibound: return {"const": lobound} if "integer" in type_: lo, hi = get_integer_bounds(schema) mul = schema.get("multipleOf") if mul is not None and "number" not in type_ and Fraction(mul).numerator == 1: # Every integer is a multiple of 1/n for all natural numbers n. 
schema.pop("multipleOf") mul = None if lo is not None and isinstance(mul, int) and mul > 1 and (lo % mul): # type: ignore[unreachable] lo += mul - (lo % mul) # type: ignore[unreachable] if hi is not None and isinstance(mul, int) and mul > 1 and (hi % mul): # type: ignore[unreachable] hi -= hi % mul # type: ignore[unreachable] if lo is not None: schema["minimum"] = lo # type: ignore[unreachable] schema.pop("exclusiveMinimum", None) if hi is not None: schema["maximum"] = hi # type: ignore[unreachable] schema.pop("exclusiveMaximum", None) if lo is not None and hi is not None and lo > hi: # type: ignore[unreachable] type_.remove("integer") # type: ignore[unreachable] elif type_ == ["integer"] and lo == hi and make_validator(schema).is_valid(lo): return {"const": lo} if "array" in type_ and "contains" in schema: if isinstance(schema.get("items"), dict): contains_items = merged([schema["contains"], schema["items"]]) if contains_items is not None: schema["contains"] = contains_items if schema["contains"] == FALSEY: type_.remove("array") else: schema["minItems"] = max(schema.get("minItems", 0), 1) if schema["contains"] == TRUTHY: schema.pop("contains") schema["minItems"] = max(schema.get("minItems", 1), 1) if ( "array" in type_ and "uniqueItems" in schema and isinstance(schema.get("items", []), dict) ): item_count = upper_bound_instances(schema["items"]) if math.isfinite(item_count): schema["maxItems"] = min(item_count, schema.get("maxItems", math.inf)) if "array" in type_ and schema.get("minItems", 0) > schema.get( "maxItems", math.inf ): type_.remove("array") if ( "array" in type_ and "minItems" in schema and isinstance(schema.get("items", []), dict) ): count = upper_bound_instances(schema["items"]) if (count == 0 and schema["minItems"] > 0) or ( schema.get("uniqueItems", False) and count < schema["minItems"] ): type_.remove("array") if "array" in type_ and isinstance(schema.get("items"), list): schema["items"] = schema["items"][: schema.get("maxItems")] for idx, s in 
enumerate(schema["items"]): if s == FALSEY: schema["items"] = schema["items"][:idx] schema["maxItems"] = idx schema.pop("additionalItems", None) break if schema.get("minItems", 0) > min( len(schema["items"]) + upper_bound_instances(schema.get("additionalItems", TRUTHY)), schema.get("maxItems", math.inf), ): type_.remove("array") if ( "array" in type_ and isinstance(schema.get("items"), list) and schema.get("additionalItems") == FALSEY ): schema.pop("maxItems", None) if "array" in type_ and ( schema.get("items") == FALSEY or schema.get("maxItems", 1) == 0 ): schema["maxItems"] = 0 schema.pop("items", None) schema.pop("uniqueItems", None) schema.pop("additionalItems", None) if "array" in type_ and schema.get("items", TRUTHY) == TRUTHY: schema.pop("items", None) if ( "properties" in schema and not schema.get("patternProperties") and schema.get("additionalProperties") == FALSEY ): max_props = schema.get("maxProperties", math.inf) assert isinstance(max_props, (int, float)) for k, v in list(schema["properties"].items()): if v == FALSEY: schema["properties"].pop(k) schema["maxProperties"] = min(max_props, len(schema["properties"])) if schema.get("maxProperties", math.inf) == 0: for k in ("properties", "patternProperties", "additionalProperties"): schema.pop(k, None) if "object" in type_ and schema.get("minProperties", 0) > schema.get( "maxProperties", math.inf ): type_.remove("object") # Discard dependencies values that don't restrict anything for k, v in schema.get("dependencies", {}).copy().items(): if v in ([], TRUTHY): schema["dependencies"].pop(k) # Remove no-op keywords for kw, identity in { "minItems": 0, "items": {}, "additionalItems": {}, "dependencies": {}, "minProperties": 0, "properties": {}, "propertyNames": {}, "patternProperties": {}, "additionalProperties": {}, "required": [], }.items(): if kw in schema and schema[kw] == identity: schema.pop(kw) # Canonicalise "required" schemas to remove redundancy if "object" in type_ and "required" in schema: assert 
isinstance(schema["required"], list) reqs = set(schema["required"]) if schema.get("dependencies"): # When the presence of a required property requires other properties via # dependencies, those properties can be moved to the base required keys. dep_names = { k: sorted(set(v)) for k, v in schema["dependencies"].items() if isinstance(v, list) } schema["dependencies"].update(dep_names) while reqs.intersection(dep_names): for r in reqs.intersection(dep_names): reqs.update(dep_names.pop(r)) schema["dependencies"].pop(r) # TODO: else merge schema-dependencies of required properties # into the base schema after adding required back in and being # careful to avoid an infinite loop... if not schema["dependencies"]: schema.pop("dependencies") schema["required"] = sorted(reqs) max_ = schema.get("maxProperties", float("inf")) assert isinstance(max_, (int, float)) properties = schema.get("properties", {}) propnames_validator = make_validator(schema.get("propertyNames", {})).is_valid if ( len(schema["required"]) > max_ or any(properties.get(name, {}) == FALSEY for name in schema["required"]) or not all(propnames_validator(name) for name in schema["required"]) ): type_.remove("object") for t, kw in TYPE_SPECIFIC_KEYS: numeric = {"number", "integer"} if t in type_ or (t in numeric and numeric.intersection(type_)): continue for k in kw.split(): schema.pop(k, None) # Canonicalise "not" subschemas if "not" in schema: not_ = schema.pop("not") negated = [] to_negate = not_["anyOf"] if set(not_) == {"anyOf"} else [not_] for not_ in to_negate: type_keys = {k: set(v.split()) for k, v in TYPE_SPECIFIC_KEYS} type_constraints = {"type"} for v in type_keys.values(): type_constraints |= v if set(not_).issubset(type_constraints): not_["type"] = get_type(not_) for t in set(type_).intersection(not_["type"]): if not type_keys.get(t, set()).intersection(not_): type_.remove(t) if t not in ("integer", "number"): not_["type"].remove(t) not_ = canonicalish(not_) m = merged([not_, {**schema, "type": 
type_}]) if m is not None: not_ = m if not_ != FALSEY: negated.append(not_) if len(negated) > 1: schema["not"] = {"anyOf": negated} elif negated: schema["not"] = negated[0] assert isinstance(type_, list), type_ if not type_: assert type_ == [] return FALSEY if type_ == ["null"]: return {"const": None} if type_ == ["boolean"]: return {"enum": [False, True]} if type_ == ["null", "boolean"]: return {"enum": [None, False, True]} if len(type_) == 1: schema["type"] = type_[0] elif type_ == get_type({}): schema.pop("type", None) else: schema["type"] = type_ # Canonicalise "xxxOf" lists; in each case canonicalising and sorting the # sub-schemas then handling any key-specific logic. if TRUTHY in schema.get("anyOf", ()): schema.pop("anyOf", None) if "anyOf" in schema: i = 0 while i < len(schema["anyOf"]): s = schema["anyOf"][i] if set(s) == {"anyOf"}: schema["anyOf"][i : i + 1] = s["anyOf"] continue i += 1 schema["anyOf"] = [ json.loads(s) for s in sorted( {encode_canonical_json(a) for a in schema["anyOf"] if a != FALSEY} ) ] if not schema["anyOf"]: return FALSEY if len(schema) == len(schema["anyOf"]) == 1: return schema["anyOf"][0] # type: ignore types = [] # Turn # {"anyOf": [{"type": "string"}, {"type": "null"}]} # into # {"type": ["string", "null"]} for subschema in schema["anyOf"]: if "type" in subschema and len(subschema) == 1: types.extend(get_type(subschema)) else: break else: # All subschemas have only the "type" keyword, then we merge all types # into the parent schema del schema["anyOf"] new_types = canonicalish({"type": types}) schema = merged([schema, new_types]) assert isinstance(schema, dict) # merging was certainly valid if "allOf" in schema: schema["allOf"] = [ json.loads(enc) for enc in sorted(set(map(encode_canonical_json, schema["allOf"]))) ] if any(s == FALSEY for s in schema["allOf"]): return FALSEY if all(s == TRUTHY for s in schema["allOf"]): schema.pop("allOf") elif len(schema) == len(schema["allOf"]) == 1: return schema["allOf"][0] # type: ignore 
else: tmp = schema.copy() ao = tmp.pop("allOf") out = merged([tmp, *ao]) if out is not None: schema = out if "oneOf" in schema: one_of = schema.pop("oneOf") assert isinstance(one_of, list) one_of = sorted(one_of, key=encode_canonical_json) one_of = [s for s in one_of if s != FALSEY] if len(one_of) == 1: m = merged([schema, one_of[0]]) if m is not None: # pragma: no branch return m if (not one_of) or one_of.count(TRUTHY) > 1: return FALSEY schema["oneOf"] = one_of if schema.get("uniqueItems") is False: del schema["uniqueItems"] return schema TRUTHY = canonicalish(True) FALSEY = canonicalish(False) def merged(schemas: List[Any]) -> Optional[Schema]: """Merge *n* schemas into a single schema, or None if they cannot be merged. Takes the logical intersection, so any object that validates against the returned schema must also validate against all of the input schemas. None is returned for keys that cannot be merged short of pushing parts of the schema into an allOf construct, such as the "contains" key for arrays - there is no other way to merge two schemas that could otherwise be applied to different array elements. None is currently also returned for keys that could be merged but aren't yet.
""" assert schemas, "internal error: must pass at least one schema to merge" schemas = sorted((canonicalish(s) for s in schemas), key=upper_bound_instances) if any(s == FALSEY for s in schemas): return FALSEY out = schemas[0] for s in schemas[1:]: if s == TRUTHY: continue # If we have a const or enum, this is fairly easy by filtering: if "const" in out: if make_validator(s).is_valid(out["const"]): continue return FALSEY if "enum" in out: validator = make_validator(s) enum_ = [v for v in out["enum"] if validator.is_valid(v)] if not enum_: return FALSEY elif len(enum_) == 1: out = {"const": enum_[0]} else: out = {"enum": enum_} continue if "type" in out and "type" in s: tt = s.pop("type") ot = get_type(out) if "number" in ot: ot.append("integer") out["type"] = [ t for t in ot if t in tt or t == "integer" and "number" in tt ] out_type = get_type(out) if not out_type: return FALSEY for t, kw in TYPE_SPECIFIC_KEYS: numeric = ["number", "integer"] if t in out_type or t in numeric and t in out_type + numeric: continue for k in kw.split(): s.pop(k, None) out.pop(k, None) # OK, this is a tricky bit, because we have three overlapping parts. # First we'll deal with the `properties` keyword, containing schemas for # the values associated with an exact key - we merge this with the exact # match from the other schema *or* all of the matching patternProperties # *or* the additionalProperties schema if there are no matches, in that # order. out_add = out.get("additionalProperties", {}) s_add = s.pop("additionalProperties", {}) out_pat = out.get("patternProperties", {}) s_pat = s.pop("patternProperties", {}) if "properties" in out or "properties" in s: # The get/pop/setdefault dance and if-statements ensure that we end up with # none of these keys present in `s`, and avoid adding them to `out` which # can cause an infinite loop of recursive merging. 
out_props = out.setdefault("properties", {}) s_props = s.pop("properties", {}) for prop_name in set(out_props) | set(s_props): if prop_name in out_props: out_combined = out_props[prop_name] else: out_combined = merged( [s for p, s in out_pat.items() if re.search(p, prop_name)] or [out_add] ) if prop_name in s_props: s_combined = s_props[prop_name] else: s_combined = merged( [s for p, s in s_pat.items() if re.search(p, prop_name)] or [s_add] ) if out_combined is None or s_combined is None: # pragma: no cover # Note that this can only be the case if we were actually going to # use the schema which we attempted to merge, i.e. prop_name was # not in the schema and there were unmergeable pattern schemas. return None m = merged([out_combined, s_combined]) if m is None: # pragma: no cover return None out_props[prop_name] = m # With all the property names done, it's time to handle the patterns. This is # simpler as we merge with either an identical pattern, or additionalProperties. if out_pat or s_pat: for pattern in set(out_pat) | set(s_pat): m = merged([out_pat.get(pattern, out_add), s_pat.get(pattern, s_add)]) if m is None: # pragma: no cover return None out_pat[pattern] = m out["patternProperties"] = out_pat # Finally, we merge together the additionalProperties schemas.
if out_add or s_add: m = merged([out_add, s_add]) if m is None: # pragma: no cover return None out["additionalProperties"] = m if "allOf" in out and "allOf" in s: # All our allOf schemas will be de-duplicated by canonicalise out["allOf"] += s.pop("allOf") if "required" in out and "required" in s: out["required"] = sorted(set(out["required"] + s.pop("required"))) for key in ( {"maximum", "exclusiveMaximum", "maxLength", "maxItems", "maxProperties"} & set(s) & set(out) ): out[key] = min([out[key], s.pop(key)]) for key in ( {"minimum", "exclusiveMinimum", "minLength", "minItems", "minProperties"} & set(s) & set(out) ): out[key] = max([out[key], s.pop(key)]) if "multipleOf" in out and "multipleOf" in s: x, y = s.pop("multipleOf"), out["multipleOf"] if isinstance(x, int) and isinstance(y, int): out["multipleOf"] = x * y // math.gcd(x, y) elif x != y: ratio = Fraction(max(x, y)) / Fraction(min(x, y)) if ratio.denominator == 1: # e.g. .75, 1.5 out["multipleOf"] = max(x, y) else: return None if "contains" in out and "contains" in s and out["contains"] != s["contains"]: # If one `contains` schema is a subset of the other, we can discard it. m = merged([out["contains"], s["contains"]]) if m in (out["contains"], s["contains"]): out["contains"] = m s.pop("contains") if "not" in out and "not" in s and out["not"] != s["not"]: out["not"] = {"anyOf": [out["not"], s.pop("not")]} if ( "dependencies" in out and "dependencies" in s and out["dependencies"] != s["dependencies"] ): # Note: draft 2019-09 added separate keywords for name-dependencies # and schema-dependencies, but when we add support for that it will # be by canonicalising to the existing backwards-compatible keyword. # # In each dependencies dict, the keys are property names and the values # are either a list of required names, or a schema that the whole # instance must match. To merge a list and a schema, convert the # former into a `required` key! 
odeps = out["dependencies"] for k, v in odeps.copy().items(): if k in s["dependencies"]: sval = s["dependencies"].pop(k) if isinstance(v, list) and isinstance(sval, list): odeps[k] = v + sval continue if isinstance(v, list): v = {"required": v} elif isinstance(sval, list): sval = {"required": sval} m = merged([v, sval]) if m is None: return None odeps[k] = m odeps.update(s.pop("dependencies")) if "items" in out or "items" in s: oitems = out.pop("items", TRUTHY) sitems = s.pop("items", TRUTHY) if isinstance(oitems, list) and isinstance(sitems, list): out["items"] = [] out["additionalItems"] = merged( [ out.get("additionalItems", TRUTHY), s.get("additionalItems", TRUTHY), ] ) for a, b in itertools.zip_longest(oitems, sitems): if a is None: a = out.get("additionalItems", TRUTHY) elif b is None: b = s.get("additionalItems", TRUTHY) out["items"].append(merged([a, b])) elif isinstance(oitems, list): out["items"] = [merged([x, sitems]) for x in oitems] out["additionalItems"] = merged( [out.get("additionalItems", TRUTHY), sitems] ) elif isinstance(sitems, list): out["items"] = [merged([x, oitems]) for x in sitems] out["additionalItems"] = merged( [s.get("additionalItems", TRUTHY), oitems] ) else: out["items"] = merged([oitems, sitems]) if out["items"] is None: return None if isinstance(out["items"], list) and None in out["items"]: return None if out.get("additionalItems", TRUTHY) is None: return None s.pop("additionalItems", None) # This loop handles the remaining cases. 
Notably, we do not attempt to # merge distinct values for: # - `pattern`; computing regex intersection is out of scope # - `contains`; requires allOf and thus enters an infinite loop # - `$ref`; if not already resolved we can't do that here # - `anyOf`; due to product-like explosion in worst case # - `oneOf`; which we plan to handle as an anyOf-not composition # - `if`/`then`/`else`; which is removed by canonicalisation for k, v in s.items(): if k not in out: out[k] = v elif out[k] != v and k in ALL_KEYWORDS: # If non-validation keys like `title` or `description` don't match, # that doesn't really matter and we'll just go with the first one we saw. # # TODO: note that this is NOT TRUE in the case of recursive references, # where we might change the value at some location that a reference # still points to! return None out = canonicalish(out) if out == FALSEY: return FALSEY assert isinstance(out, dict) _get_validator_class(out) return out def has_divisibles( start: float, end: float, divisor: float, exmin: bool, exmax: bool # noqa ) -> bool: """Return whether the range from `start` to `end` has any numbers divisible by `divisor`.""" divisible_num = end // divisor - start // divisor if not exmin and not start % divisor: divisible_num += 1 if exmax and not end % divisor: divisible_num -= 1 return divisible_num >= 1 ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1709152418.0 hypothesis-jsonschema-0.23.1/src/hypothesis_jsonschema/_encode.py0000644000175100001770000000465400000000000026111 0ustar00runnerdocker00000000000000"""Canonical encoding for the JSONSchema semantics, where 1 == 1.0.""" import json import math import platform from typing import Any, Dict, Tuple, Union # Mypy does not (yet!) support recursive type definitions.
# (and writing a few steps by hand is a DoS attack on the AST walker in Pytest) PYTHON_IMPLEMENTATION = platform.python_implementation() JSONType = Union[None, bool, float, str, list, Dict[str, Any]] if PYTHON_IMPLEMENTATION != "PyPy": from json.encoder import _make_iterencode, encode_basestring_ascii # type: ignore else: # pragma: no cover _make_iterencode = None encode_basestring_ascii = None def _floatstr(o: float) -> str: # This is the bit we're overriding - integer-valued floats are # encoded as integers, to support JSONschemas's uniqueness. assert math.isfinite(o) if o == int(o): return repr(int(o)) return repr(o) class CanonicalisingJsonEncoder(json.JSONEncoder): if PYTHON_IMPLEMENTATION == "PyPy": # pragma: no cover def _JSONEncoder__floatstr(self, o: float) -> str: return _floatstr(o) else: def iterencode(self, o: Any, _one_shot: bool = False) -> Any: # noqa """Replace a stdlib method, so we encode integer-valued floats as ints.""" return _make_iterencode( {}, self.default, encode_basestring_ascii, self.indent, _floatstr, self.key_separator, self.item_separator, self.sort_keys, self.skipkeys, _one_shot, )(o, 0) def encode_canonical_json(value: JSONType) -> str: """Canonical form serialiser, for uniqueness testing.""" return json.dumps(value, sort_keys=True, cls=CanonicalisingJsonEncoder) def sort_key(value: JSONType) -> Tuple[int, float, Union[float, str]]: """Return a sort key (type, guess, tiebreak) that can compare any JSON value. Sorts scalar types before collections, and within each type tries for a sensible ordering similar to Hypothesis' idea of simplicity. 
""" if value is None: return (0, 0, 0) if isinstance(value, bool): return (1, int(value), 0) if isinstance(value, (int, float)): return (2 if int(value) == value else 3, abs(value), value >= 0) type_key = {str: 4, list: 5, dict: 6}[type(value)] return (type_key, len(value), encode_canonical_json(value)) ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1709152418.0 hypothesis-jsonschema-0.23.1/src/hypothesis_jsonschema/_from_schema.py0000644000175100001770000006576100000000000027145 0ustar00runnerdocker00000000000000"""A Hypothesis extension for JSON schemata.""" import itertools import math import operator import re import warnings from fractions import Fraction from functools import partial from typing import Any, Callable, Dict, List, NoReturn, Optional, Set, Union import jsonschema import jsonschema.exceptions from hypothesis import assume, provisional as prov, strategies as st from hypothesis.errors import HypothesisWarning, InvalidArgument from hypothesis.internal.conjecture import utils as cu from hypothesis.strategies._internal.regex import regex_strategy from hypothesis.strategies._internal.strings import OneCharStringStrategy from ._canonicalise import ( FALSEY, TRUTHY, TYPE_STRINGS, HypothesisRefResolutionError, Schema, canonicalish, get_integer_bounds, get_number_bounds, get_type, make_validator, merged, ) from ._encode import JSONType, encode_canonical_json from ._resolve import resolve_all_refs JSON_STRATEGY: st.SearchStrategy[JSONType] = st.recursive( st.none() | st.booleans() | st.integers() | st.floats(allow_nan=False, allow_infinity=False).map(lambda x: x or 0.0) | st.text(), lambda strategy: st.lists(strategy, max_size=3) | st.dictionaries(st.text(), strategy, max_size=3), ) _FORMATS_TOKEN = object() class CharStrategy(OneCharStringStrategy): allow_x00: bool codec: Optional[str] @classmethod def from_args(cls, *, allow_x00: bool, codec: Optional[str]) -> "CharStrategy": self: CharStrategy = 
cls.from_characters_args( min_codepoint=0 if allow_x00 else 1, codec=codec ) self.allow_x00 = allow_x00 self.codec = codec return self def check_name_allowed(self, name: str) -> None: if "\x00" in name and not self.allow_x00: raise InvalidArgument(f"allow_x00=False makes name {name!a} invalid") if self.codec is not None: try: name.encode(self.codec) except Exception: msg = f"{name!r} cannot be encoded as {self.codec!r}" raise InvalidArgument(msg) from None def from_js_regex(pattern: str, alphabet: CharStrategy) -> st.SearchStrategy[str]: return regex_strategy( pattern, fullmatch=False, alphabet=alphabet, _temp_jsonschema_hack_no_end_newline=True, ) def merged_as_strategies( schemas: List[Schema], *, alphabet: CharStrategy, custom_formats: Optional[Dict[str, st.SearchStrategy[str]]], ) -> st.SearchStrategy[JSONType]: assert schemas, "internal error: must pass at least one schema to merge" if len(schemas) == 1: return __from_schema( schemas[0], alphabet=alphabet, custom_formats=custom_formats ) # Try to merge combinations of strategies. strats = [] combined: Set[str] = set() inputs = {encode_canonical_json(s): s for s in schemas} for group in itertools.chain.from_iterable( itertools.combinations(inputs, n) for n in range(len(inputs), 0, -1) ): if combined.issuperset(group): continue s = merged([inputs[g] for g in group]) if s is not None and s != FALSEY: strats.append( __from_schema( s, alphabet=alphabet, custom_formats=custom_formats ).filter( lambda obj, validators=tuple( make_validator(s).is_valid for s in schemas ): all(v(obj) for v in validators) ) ) combined.update(group) return st.one_of(strats) def from_schema( schema: Union[bool, Schema], *, custom_formats: Optional[Dict[str, st.SearchStrategy[str]]] = None, allow_x00: bool = True, codec: Optional[str] = "utf-8", ) -> st.SearchStrategy[JSONType]: """Take a JSON schema and return a strategy for allowed JSON objects. 
To generate specific string formats, pass a ``custom_formats`` dict mapping the format name to a strategy for allowed strings. You can constrain strings _other than those from custom format strategies_ by passing ``allow_x00=False`` to exclude the null character ``chr(0)``, and/or a ``codec=`` name such as ``"utf-8"``, ``"ascii"``, or any other text encoding supported by Python. Supports JSONSchema drafts 04, 06, and 07, with the exception of recursive references. """ try: return __from_schema( schema, custom_formats=custom_formats, alphabet=CharStrategy.from_args(allow_x00=allow_x00, codec=codec), ) except Exception as err: error = err def error_raiser() -> NoReturn: raise error return st.builds(error_raiser) def _get_format_filter( format_name: str, checker: jsonschema.FormatChecker, strategy: st.SearchStrategy[str], ) -> st.SearchStrategy[str]: def check_valid(string: str) -> str: try: if not isinstance(string, str): raise jsonschema.exceptions.FormatError(f"{string!r} is not a string") checker.check(string, format=format_name) except jsonschema.exceptions.FormatError as err: raise InvalidArgument( f"Got string={string!r} from strategy {strategy!r}, but this " f"is not a valid value for the {format_name!r} checker." ) from err return string return strategy.map(check_valid) def __from_schema( schema: Union[bool, Schema], *, alphabet: CharStrategy, custom_formats: Optional[Dict[str, st.SearchStrategy[str]]], ) -> st.SearchStrategy[JSONType]: try: schema = resolve_all_refs(schema) except RecursionError: raise HypothesisRefResolutionError( f"Could not resolve recursive references in schema={schema!r}" ) from None # We check for _FORMATS_TOKEN to avoid re-validating known good data. 
if custom_formats is not None and _FORMATS_TOKEN not in custom_formats: assert isinstance(custom_formats, dict) for name, strat in custom_formats.items(): if not isinstance(name, str): raise InvalidArgument(f"format name {name!r} must be a string") if not isinstance(strat, st.SearchStrategy): raise InvalidArgument( f"custom_formats[{name!r}]={strat!r} must be a Hypothesis " "strategy which generates strings matching this format." ) if name in STRING_FORMATS: warnings.warn( f"Overriding standard format {name!r} - was " f"{STRING_FORMATS[name]!r}, now {strat!r}", HypothesisWarning, stacklevel=2, ) format_checker = jsonschema.FormatChecker() custom_formats = { name: _get_format_filter(name, format_checker, strategy) for name, strategy in custom_formats.items() } custom_formats[_FORMATS_TOKEN] = None # type: ignore schema = canonicalish(schema) # Boolean objects are special schemata; False rejects all and True accepts all. if schema == FALSEY: return st.nothing() if schema == TRUTHY: return JSON_STRATEGY assert isinstance(schema, dict) if schema.get("$schema") == "http://json-schema.org/draft-03/schema#": raise InvalidArgument("Draft-03 schemas are not supported") # Only check if declared, lest we error on inner non-latest-draft schemata. if "$schema" in schema: jsonschema.validators.validator_for(schema).check_schema(schema) # Now we handle as many validation keywords as we can... 
# Applying subschemata with boolean logic if "not" in schema: not_ = schema.pop("not") assert isinstance(not_, dict) validator = make_validator(not_).is_valid return __from_schema( schema, alphabet=alphabet, custom_formats=custom_formats ).filter(lambda v: not validator(v)) if "anyOf" in schema: tmp = schema.copy() ao = tmp.pop("anyOf") assert isinstance(ao, list) return st.one_of( [ merged_as_strategies( [tmp, s], alphabet=alphabet, custom_formats=custom_formats ) for s in ao ] ) if "allOf" in schema: tmp = schema.copy() ao = tmp.pop("allOf") assert isinstance(ao, list) return merged_as_strategies( [tmp, *ao], alphabet=alphabet, custom_formats=custom_formats ) if "oneOf" in schema: tmp = schema.copy() oo = tmp.pop("oneOf") assert isinstance(oo, list) schemas = [merged([tmp, s]) for s in oo] return st.one_of( [ __from_schema(s, alphabet=alphabet, custom_formats=custom_formats) for s in schemas if s is not None ] ).filter(make_validator(schema).is_valid) # Simple special cases if "enum" in schema: assert schema["enum"], "Canonicalises to non-empty list or FALSEY" return st.sampled_from(schema["enum"]) if "const" in schema: return st.just(schema["const"]) # Finally, resolve schema by type - defaulting to "object" map_: Dict[str, Callable[[Schema], st.SearchStrategy[JSONType]]] = { "null": lambda _: st.none(), "boolean": lambda _: st.booleans(), "number": number_schema, "integer": integer_schema, "string": partial(string_schema, custom_formats, alphabet), "array": partial(array_schema, custom_formats, alphabet), "object": partial(object_schema, custom_formats, alphabet), } assert set(map_) == set(TYPE_STRINGS) return st.one_of([map_[t](schema) for t in get_type(schema)]) def _numeric_with_multiplier( min_value: Optional[float], max_value: Optional[float], schema: Schema ) -> st.SearchStrategy[float]: """Handle numeric schemata containing the multipleOf key.""" multiple_of = schema["multipleOf"] assert isinstance(multiple_of, (int, float)) if min_value is not None: 
        min_value = math.ceil(Fraction(min_value) / Fraction(multiple_of))
    if max_value is not None:
        max_value = math.floor(Fraction(max_value) / Fraction(multiple_of))
    if min_value is not None and max_value is not None and min_value > max_value:  # type: ignore[unreachable]
        # You would think that this is impossible, but it can happen if multipleOf
        # is very small and the bounds are very close together.  It would be nicer
        # to deal with this when canonicalising, but suffice to say we can't without
        # diverging from the floating-point behaviour of the upstream validator.
        return st.nothing()  # type: ignore[unreachable]
    return (
        st.integers(min_value, max_value)
        .map(lambda x: x * multiple_of)
        .filter(make_validator(schema).is_valid)
    )


def integer_schema(schema: dict) -> st.SearchStrategy[float]:
    """Handle integer schemata."""
    min_value, max_value = get_integer_bounds(schema)
    if "multipleOf" in schema:
        return _numeric_with_multiplier(min_value, max_value, schema)
    return st.integers(min_value, max_value)


def number_schema(schema: dict) -> st.SearchStrategy[float]:
    """Handle numeric schemata."""
    min_value, max_value, exclude_min, exclude_max = get_number_bounds(schema)
    if "multipleOf" in schema:
        return _numeric_with_multiplier(min_value, max_value, schema)
    return st.floats(
        min_value=min_value,
        max_value=max_value,
        allow_nan=False,
        allow_infinity=False,
        exclude_min=exclude_min,
        exclude_max=exclude_max,
        # Filter out negative-zero as it does not exist in JSON
    ).map(lambda n: n if n != 0 else abs(n))


def rfc3339(name: str) -> st.SearchStrategy[str]:
    """Get a strategy for date or time strings in the given RFC3339 format.

    See https://tools.ietf.org/html/rfc3339#section-5.6
    """
    # Hmm, https://github.com/HypothesisWorks/hypothesis/issues/170
    # would make this a lot easier...
assert name in RFC3339_FORMATS def zfill(width: int) -> Callable[[int], str]: return lambda v: str(v).zfill(width) simple = { "date-fullyear": st.integers(0, 9999).map(zfill(4)), "date-month": st.integers(1, 12).map(zfill(2)), "date-mday": st.integers(1, 28).map(zfill(2)), # incomplete but valid "time-hour": st.integers(0, 23).map(zfill(2)), "time-minute": st.integers(0, 59).map(zfill(2)), "time-second": st.integers(0, 59).map(zfill(2)), # ignore negative leap seconds "time-secfrac": st.from_regex(r"\.[0-9]+"), } if name in simple: return simple[name] if name == "time-numoffset": return st.tuples( st.sampled_from(["+", "-"]), rfc3339("time-hour"), rfc3339("time-minute") ).map("%s%s:%s".__mod__) if name == "time-offset": return st.one_of(st.just("Z"), rfc3339("time-numoffset")) if name == "partial-time": return st.times().map(str) if name in ("date", "full-date"): return st.dates().map(str) if name in ("time", "full-time"): return st.tuples(rfc3339("partial-time"), rfc3339("time-offset")).map("".join) assert name == "date-time" return st.tuples(rfc3339("full-date"), rfc3339("full-time")).map("T".join) @st.composite # type: ignore def regex_patterns(draw: Any) -> str: """Return a recursive strategy for simple regular expression patterns.""" fragments = st.one_of( st.just("."), st.from_regex(r"\[\^?[A-Za-z0-9]+\]"), REGEX_PATTERNS.map("{}+".format), REGEX_PATTERNS.map("{}?".format), REGEX_PATTERNS.map("{}*".format), ) result = draw(st.lists(fragments, min_size=1, max_size=3).map("".join)) assert isinstance(result, str) try: re.compile(result) except (re.error, FutureWarning): assume(False) return result REGEX_PATTERNS = regex_patterns() def json_pointers(alphabet: CharStrategy) -> st.SearchStrategy[str]: """Return a strategy for strings in json-pointer format.""" return st.lists( st.text(alphabet).map(lambda p: "/" + p.replace("~", "~0").replace("/", "~1")) ).map("".join) def relative_json_pointers(alphabet: CharStrategy) -> st.SearchStrategy[str]: """Return a 
strategy for strings in relative-json-pointer format."""
    return st.builds(
        operator.add,
        st.from_regex(r"0|[1-9][0-9]*", fullmatch=True, alphabet=alphabet),
        st.just("#") | json_pointers(alphabet),
    )


# Via the `webcolors` package, to match the logic `jsonschema`
# uses to check its (non-standard?) "color" format.
_WEBCOLOR_REGEX = "^#([a-fA-F0-9]{3}|[a-fA-F0-9]{6})$"
_CSS21_COLOR_NAMES = (
    "aqua",
    "black",
    "blue",
    "fuchsia",
    "green",
    "gray",
    "lime",
    "maroon",
    "navy",
    "olive",
    "orange",
    "purple",
    "red",
    "silver",
    "teal",
    "white",
    "yellow",
)
RFC3339_FORMATS = (
    "date-fullyear",
    "date-month",
    "date-mday",
    "time-hour",
    "time-minute",
    "time-second",
    "time-secfrac",
    "time-numoffset",
    "time-offset",
    "partial-time",
    "full-date",
    "full-time",
    "date-time",
)
STRING_FORMATS = {
    **{name: rfc3339(name) for name in RFC3339_FORMATS},
    "color": st.from_regex(_WEBCOLOR_REGEX) | st.sampled_from(_CSS21_COLOR_NAMES),
    "date": rfc3339("full-date"),
    "time": rfc3339("full-time"),
    "email": st.emails(),
    "idn-email": st.emails(),
    "hostname": prov.domains(),
    "idn-hostname": prov.domains(),
    "ipv4": st.ip_addresses(v=4).map(str),
    "ipv6": st.ip_addresses(v=6).map(str),
    **{
        name: prov.domains().map("https://{}".format)
        for name in ["uri", "uri-reference", "iri", "iri-reference", "uri-template"]
    },
    "json-pointer": json_pointers,
    "relative-json-pointer": relative_json_pointers,
    "regex": REGEX_PATTERNS,
}


def _warn_invalid_regex(pattern: str, err: re.error, kw: str = "pattern") -> None:
    warnings.warn(
        f"Got {kw}={pattern!r}, but this is not valid syntax for a Python regular "
        f"expression ({err}) so it will not be handled by the strategy. 
See https://" "json-schema.org/understanding-json-schema/reference/regular_expressions.html", stacklevel=2, ) def string_schema( custom_formats: Dict[str, st.SearchStrategy[str]], alphabet: CharStrategy, schema: dict, ) -> st.SearchStrategy[str]: """Handle schemata for strings.""" # also https://json-schema.org/latest/json-schema-validation.html#rfc.section.7 min_size = schema.get("minLength", 0) max_size = schema.get("maxLength") strategy = st.text(alphabet, min_size=min_size, max_size=max_size) known_formats = {**STRING_FORMATS, **(custom_formats or {})} if schema.get("format") in known_formats: # Unknown "format" specifiers should be ignored for validation. # See https://json-schema.org/latest/json-schema-validation.html#format strategy = known_formats[schema["format"]] if not isinstance(strategy, st.SearchStrategy): strategy = strategy(alphabet) if "pattern" in schema: try: # This isn't really supported, but we'll do our best with a filter. strategy = strategy.filter(re.compile(schema["pattern"]).search) except re.error as err: _warn_invalid_regex(schema["pattern"], err) return st.nothing() elif "pattern" in schema: try: re.compile(schema["pattern"]) strategy = from_js_regex(schema["pattern"], alphabet=alphabet) except re.error as err: # Patterns that are invalid in Python, or just malformed _warn_invalid_regex(schema["pattern"], err) return st.nothing() # If we have size bounds but we're generating strings from a regex or pattern, # apply a filter to ensure our size bounds are respected. 
if ("format" in schema or "pattern" in schema) and ( min_size != 0 or max_size is not None ): max_size = math.inf if max_size is None else max_size strategy = strategy.filter(lambda s: min_size <= len(s) <= max_size) return strategy def array_schema( custom_formats: Dict[str, st.SearchStrategy[str]], alphabet: CharStrategy, schema: dict, ) -> st.SearchStrategy[List[JSONType]]: """Handle schemata for arrays.""" _from_schema_ = partial( __from_schema, custom_formats=custom_formats, alphabet=alphabet ) items = schema.get("items", {}) additional_items = schema.get("additionalItems", {}) min_size = schema.get("minItems", 0) max_size = schema.get("maxItems") unique = schema.get("uniqueItems") if isinstance(items, list): min_size = max(0, min_size - len(items)) if max_size is not None: max_size -= len(items) items_strats = [_from_schema_(s) for s in items] additional_items_strat = _from_schema_(additional_items) # If we have a contains schema to satisfy, we try generating from it when # allowed to do so. We'll skip the None (unmergable / no contains) cases # below, and let Hypothesis ignore the FALSEY cases for us. if "contains" in schema: for i, mrgd in enumerate(merged([schema["contains"], s]) for s in items): if mrgd is not None: items_strats[i] |= _from_schema_(mrgd) contains_additional = merged([schema["contains"], additional_items]) if contains_additional is not None: additional_items_strat |= _from_schema_(contains_additional) # Hypothesis raises InvalidArgument for empty elements and non-None # max_size, because the user has asked for a possibility which will # never happen... but we can work around that here. 
if additional_items_strat.is_empty: if min_size >= 1: return st.nothing() max_size = 0 if unique: @st.composite # type: ignore def compose_lists_with_filter(draw: Any) -> List[JSONType]: elems = [] seen: Set[str] = set() def not_seen(elem: JSONType) -> bool: return encode_canonical_json(elem) not in seen for strat in items_strats: elems.append(draw(strat.filter(not_seen))) seen.add(encode_canonical_json(elems[-1])) if max_size == 0: return elems extra_items = st.lists( additional_items_strat.filter(not_seen), min_size=min_size, max_size=max_size, unique_by=encode_canonical_json, ) more_elems: List[JSONType] = draw(extra_items) return elems + more_elems strat = compose_lists_with_filter() elif max_size == 0: strat = st.tuples(*items_strats).map(list) else: strat = st.builds( operator.add, st.tuples(*items_strats).map(list), st.lists(additional_items_strat, min_size=min_size, max_size=max_size), ) else: items_strat = _from_schema_(items) if "contains" in schema: contains_strat = _from_schema_(schema["contains"]) if merged([items, schema["contains"]]) != schema["contains"]: # We only need this filter if we couldn't merge items in when # canonicalising. Note that for list-items, above, we just skip # the mixed generation in this case (because they tend to be # heterogeneous) and hope it works out anyway. contains_strat = contains_strat.filter(make_validator(items).is_valid) items_strat |= contains_strat elif items_strat.is_empty and min_size == 0 and max_size is not None: # As above, work around a Hypothesis check for unsatisfiable max_size. 
return st.builds(list) strat = st.lists( items_strat, min_size=min_size, max_size=max_size, unique_by=encode_canonical_json if unique else None, ) if "contains" not in schema: return strat contains = make_validator(schema["contains"]).is_valid return strat.filter(lambda val: any(contains(x) for x in val)) def object_schema( custom_formats: Dict[str, st.SearchStrategy[str]], alphabet: CharStrategy, schema: dict, ) -> st.SearchStrategy[Dict[str, JSONType]]: """Handle a manageable subset of possible schemata for objects.""" required = schema.get("required", []) # required keys min_size = max(len(required), schema.get("minProperties", 0)) max_size = schema.get("maxProperties", math.inf) assert min_size <= max_size, (min_size, max_size) names = schema.get("propertyNames", {}) # schema for optional keys if names == FALSEY: assert min_size == 0, schema return st.builds(dict) names["type"] = "string" properties = schema.get("properties", {}) # exact name: value schema patterns = schema.get("patternProperties", {}) # regex for names: value schema # schema for other values; handled specially if nothing matches additional = schema.get("additionalProperties", {}) additional_allowed = additional != FALSEY for key in list(patterns): try: re.compile(key) except re.error as err: _warn_invalid_regex(key, err, "patternProperties entry") if min_size == 0 and not required: return st.builds(dict) return st.nothing() dependencies = schema.get("dependencies", {}) dep_names = {k: v for k, v in dependencies.items() if isinstance(v, list)} dep_schemas = {k: v for k, v in dependencies.items() if k not in dep_names} del dependencies valid_name = make_validator(names).is_valid known: set = set(filter(valid_name, set(dep_names).union(dep_schemas, properties))) for name in sorted(known.union(required)): alphabet.check_name_allowed(name) known_optional_names: List[str] = sorted(known - set(required)) name_strats = ( __from_schema(names, custom_formats=custom_formats, alphabet=alphabet) if 
additional_allowed else st.nothing(), st.sampled_from(known_optional_names) if known_optional_names else st.nothing(), st.one_of( [ from_js_regex(p, alphabet=alphabet).filter(valid_name) for p in sorted(patterns) ] ), ) all_names_strategy = st.one_of([s for s in name_strats if not s.is_empty]) @st.composite # type: ignore def from_object_schema(draw: Any) -> Any: """Do some black magic with private Hypothesis internals for objects. It's unfortunate, but also the only way that I know of to satisfy all the interacting constraints without making shrinking totally hopeless. If any Hypothesis maintainers are reading this... I'm so, so sorry. """ # Hypothesis internals are not type-annotated... I do mean *black* magic! elements = cu.many( draw(st.data()).conjecture_data, min_size=min_size, max_size=max_size, average_size=min(min_size + 5, (min_size + max_size) / 2), ) out: dict = {} while elements.more(): for key in required: if key not in out: break else: for k in set(dep_names).intersection(out): # pragma: no cover # nocover because some of these conditionals are rare enough # that not all test runs hit them, but are still essential. 
key = next((n for n in dep_names[k] if n not in out), None) if key is not None: break else: key = draw(all_names_strategy.filter(lambda s: s not in out)) pattern_schemas = [ patterns[rgx] for rgx in sorted(patterns) if re.search(rgx, string=key) is not None ] if key in properties: pattern_schemas.insert(0, properties[key]) if pattern_schemas: out[key] = draw( merged_as_strategies( pattern_schemas, alphabet=alphabet, custom_formats=custom_formats, ) ) else: out[key] = draw( __from_schema( additional, custom_formats=custom_formats, alphabet=alphabet ) ) for k, v in dep_schemas.items(): if k in out and not make_validator(v).is_valid(out): out.pop(key) elements.reject() for k in set(dep_names).intersection(out): assume(set(out).issuperset(dep_names[k])) return out return from_object_schema() ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1709152418.0 hypothesis-jsonschema-0.23.1/src/hypothesis_jsonschema/_resolve.py0000644000175100001770000000646300000000000026333 0ustar00runnerdocker00000000000000""" Canonicalisation logic for JSON schemas. The canonical format that we transform to is not intended for human consumption. Instead, it prioritises locality of reasoning - for example, we convert oneOf arrays into an anyOf of allOf (each sub-schema being the original plus not anyOf the rest). Resolving references and merging subschemas is also really helpful. All this effort is justified by the huge performance improvements that we get when converting to Hypothesis strategies. To the extent possible there is only one way to generate any given value... but much more importantly, we can do most things by construction instead of by filtering. That's the difference between "I'd like it to be faster" and "doesn't finish at all". 
""" from copy import deepcopy from typing import NoReturn, Optional, Union from hypothesis.errors import InvalidArgument from jsonschema.validators import _RefResolver from ._canonicalise import ( SCHEMA_KEYS, SCHEMA_OBJECT_KEYS, HypothesisRefResolutionError, Schema, canonicalish, merged, ) class LocalResolver(_RefResolver): def resolve_remote(self, uri: str) -> NoReturn: raise HypothesisRefResolutionError( f"hypothesis-jsonschema does not fetch remote references (uri={uri!r})" ) def resolve_all_refs( schema: Union[bool, Schema], *, resolver: Optional[LocalResolver] = None ) -> Schema: """ Resolve all references in the given schema. This handles nested definitions, but not recursive definitions. The latter require special handling to convert to strategies and are much less common, so we just ignore them (and error out) for now. """ if isinstance(schema, bool): return canonicalish(schema) assert isinstance(schema, dict), schema if resolver is None: resolver = LocalResolver.from_schema(deepcopy(schema)) if not isinstance(resolver, _RefResolver): raise InvalidArgument( f"resolver={resolver} (type {type(resolver).__name__}) is not a RefResolver" ) if "$ref" in schema: s = dict(schema) ref = s.pop("$ref") with resolver.resolving(ref) as got: m = merged([s, resolve_all_refs(got, resolver=resolver)]) if m is None: # pragma: no cover msg = f"$ref:{ref!r} had incompatible base schema {s!r}" raise HypothesisRefResolutionError(msg) assert "$ref" not in m return m assert "$ref" not in schema for key in SCHEMA_KEYS: val = schema.get(key, False) if isinstance(val, list): schema[key] = [ resolve_all_refs(v, resolver=resolver) if isinstance(v, dict) else v for v in val ] elif isinstance(val, dict): schema[key] = resolve_all_refs(val, resolver=resolver) else: assert isinstance(val, bool) for key in SCHEMA_OBJECT_KEYS: # values are keys-to-schema-dicts, not schemas if key in schema: subschema = schema[key] assert isinstance(subschema, dict) schema[key] = { k: resolve_all_refs(v, 
resolver=resolver) if isinstance(v, dict) else v for k, v in subschema.items() } assert isinstance(schema, dict) assert "$ref" not in schema return schema ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1709152418.0 hypothesis-jsonschema-0.23.1/src/hypothesis_jsonschema/py.typed0000644000175100001770000000000000000000000025617 0ustar00runnerdocker00000000000000././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1709152428.2105355 hypothesis-jsonschema-0.23.1/src/hypothesis_jsonschema.egg-info/0000755000175100001770000000000000000000000025624 5ustar00runnerdocker00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1709152428.0 hypothesis-jsonschema-0.23.1/src/hypothesis_jsonschema.egg-info/PKG-INFO0000644000175100001770000001041600000000000026723 0ustar00runnerdocker00000000000000Metadata-Version: 2.1 Name: hypothesis-jsonschema Version: 0.23.1 Summary: Generate test data from JSON schemata with Hypothesis Home-page: https://github.com/Zac-HD/hypothesis-jsonschema Author: Zac Hatfield-Dodds Author-email: zac@zhd.dev License: MPL 2.0 Project-URL: Funding, https://github.com/sponsors/Zac-HD Keywords: python testing fuzzing property-based-testing json-schema Classifier: Development Status :: 4 - Beta Classifier: Framework :: Hypothesis Classifier: Intended Audience :: Developers Classifier: License :: OSI Approved :: Mozilla Public License 2.0 (MPL 2.0) Classifier: Programming Language :: Python Classifier: Programming Language :: Python :: 3 Classifier: Programming Language :: Python :: 3.8 Classifier: Programming Language :: Python :: 3.9 Classifier: Programming Language :: Python :: 3.10 Classifier: Programming Language :: Python :: 3.11 Classifier: Topic :: Education :: Testing Classifier: Topic :: Software Development :: Testing Classifier: Typing :: Typed Requires-Python: >=3.8 Description-Content-Type: text/markdown 
License-File: LICENSE
Requires-Dist: hypothesis>=6.84.3
Requires-Dist: jsonschema>=4.18.0

# hypothesis-jsonschema

A [Hypothesis](https://hypothesis.readthedocs.io) strategy for generating data
that matches some [JSON schema](https://json-schema.org/).
[Here's the PyPI page.](https://pypi.org/project/hypothesis-jsonschema/)

## API

The public API consists of just one function: `hypothesis_jsonschema.from_schema`,
which takes a JSON schema and returns a strategy for allowed JSON objects.

```python
import json
import re

from hypothesis import given, strategies as st

from hypothesis_jsonschema import from_schema

# Illustrative sample data only; substitute your own values.
EXAMPLE_CARD_NUMBERS = ["4242 4242 4242 4242", "1234 5678 9012 3456"]


@given(from_schema({"type": "integer", "minimum": 1, "exclusiveMaximum": 10}))
def test_integers(value):
    assert isinstance(value, int)
    assert 1 <= value < 10


@given(
    from_schema(
        {"type": "string", "format": "card"},
        # Standard formats work out of the box.  Custom formats are ignored
        # by default, but you can pass custom strategies for them - e.g.
        custom_formats={"card": st.sampled_from(EXAMPLE_CARD_NUMBERS)},
    )
)
def test_card_numbers(value):
    assert isinstance(value, str)
    assert re.match(r"^\d{4} \d{4} \d{4} \d{4}$", value)


@given(from_schema({}, allow_x00=False, codec="utf-8").map(json.dumps))
def test_json_payloads(payload):
    assert isinstance(payload, str)
    assert "\0" not in payload  # use allow_x00=False to exclude null characters
    # If you want to restrict generated strings to characters which are valid in
    # a specific character encoding, you can do that with the `codec=` argument.
    payload.encode("utf-8")
```

For more details on property-based testing and how to use or customise
strategies, [see the Hypothesis docs](https://hypothesis.readthedocs.io/).

JSONSchema drafts 04, 06, and 07 are fully tested and working.
As of version 0.11, this includes resolving non-recursive references!

## Supported versions

`hypothesis-jsonschema` requires Python 3.8 or later.
In general, 0.x versions will require very recent versions of all dependencies
because I don't want to deal with compatibility workarounds.

`hypothesis-jsonschema` may make backwards-incompatible changes at any time
before version 1.x - that's what semver means! - but I've kept the API surface
small enough that this should be avoidable.  The main source of breaks will be
if or when schemas that never really worked turn into explicit errors instead
of generating values that don't quite match.

You can [sponsor me](https://github.com/sponsors/Zac-HD) to get priority
support, roadmap input, and prioritized feature development.

## Contributing to `hypothesis-jsonschema`

We love external contributions - and try to make them both easy and fun.
You can [read more details in our contributing guide](https://github.com/Zac-HD/hypothesis-jsonschema/blob/master/CONTRIBUTING.md),
and [see everyone who has contributed on GitHub](https://github.com/Zac-HD/hypothesis-jsonschema/graphs/contributors).
Thanks, everyone!

### Changelog

Patch notes [can be found in `CHANGELOG.md`](https://github.com/Zac-HD/hypothesis-jsonschema/blob/master/CHANGELOG.md).

### Security contact information

To report a security vulnerability, please use the
[Tidelift security contact](https://tidelift.com/security).
Tidelift will coordinate the fix and disclosure.
././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1709152428.0 hypothesis-jsonschema-0.23.1/src/hypothesis_jsonschema.egg-info/SOURCES.txt0000644000175100001770000000141400000000000027510 0ustar00runnerdocker00000000000000LICENSE MANIFEST.in README.md pyproject.toml setup.py tox.ini deps/README.md deps/check.in deps/check.txt deps/deps.in deps/deps.txt deps/test.in deps/test.txt src/hypothesis_jsonschema/__init__.py src/hypothesis_jsonschema/_canonicalise.py src/hypothesis_jsonschema/_encode.py src/hypothesis_jsonschema/_from_schema.py src/hypothesis_jsonschema/_resolve.py src/hypothesis_jsonschema/py.typed src/hypothesis_jsonschema.egg-info/PKG-INFO src/hypothesis_jsonschema.egg-info/SOURCES.txt src/hypothesis_jsonschema.egg-info/dependency_links.txt src/hypothesis_jsonschema.egg-info/not-zip-safe src/hypothesis_jsonschema.egg-info/requires.txt src/hypothesis_jsonschema.egg-info/top_level.txt tests/test_canonicalise.py tests/test_encode.py tests/test_from_schema.py tests/test_version.py././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1709152428.0 hypothesis-jsonschema-0.23.1/src/hypothesis_jsonschema.egg-info/dependency_links.txt0000644000175100001770000000000100000000000031672 0ustar00runnerdocker00000000000000 ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1709152428.0 hypothesis-jsonschema-0.23.1/src/hypothesis_jsonschema.egg-info/not-zip-safe0000644000175100001770000000000100000000000030052 0ustar00runnerdocker00000000000000 ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1709152428.0 hypothesis-jsonschema-0.23.1/src/hypothesis_jsonschema.egg-info/requires.txt0000644000175100001770000000004600000000000030224 0ustar00runnerdocker00000000000000hypothesis>=6.84.3 jsonschema>=4.18.0 ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 
mtime=1709152428.0 hypothesis-jsonschema-0.23.1/src/hypothesis_jsonschema.egg-info/top_level.txt0000644000175100001770000000002600000000000030354 0ustar00runnerdocker00000000000000hypothesis_jsonschema ././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1709152428.2105355 hypothesis-jsonschema-0.23.1/tests/0000755000175100001770000000000000000000000020074 5ustar00runnerdocker00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1709152418.0 hypothesis-jsonschema-0.23.1/tests/test_canonicalise.py0000644000175100001770000005412700000000000024146 0ustar00runnerdocker00000000000000"""Tests for the hypothesis-jsonschema library.""" import jsonschema import pytest from gen_schemas import gen_number, json_schemata, schema_strategy_params from hypothesis import HealthCheck, assume, given, note, settings, strategies as st from hypothesis.errors import InvalidArgument from hypothesis_jsonschema import from_schema from hypothesis_jsonschema._canonicalise import ( FALSEY, canonicalish, get_type, make_validator, merged, next_up, ) from hypothesis_jsonschema._from_schema import JSON_STRATEGY from hypothesis_jsonschema._resolve import resolve_all_refs def is_valid(instance, schema): return make_validator(schema).is_valid(instance) @settings(suppress_health_check=[HealthCheck.too_slow], deadline=None) @given(data=st.data()) @schema_strategy_params def test_canonicalises_to_equivalent_fixpoint(schema_strategy, data): """Check that an object drawn from an arbitrary schema is valid.""" schema = data.draw(schema_strategy, label="schema") cc = canonicalish(schema) assert cc == canonicalish(cc) try: strat = from_schema(cc) except InvalidArgument: # e.g. 
array of unique {type: integers}, with too few allowed integers assume(False) instance = data.draw(JSON_STRATEGY | strat, label="instance") assert is_valid(instance, schema) == is_valid(instance, cc) make_validator(schema) @pytest.mark.parametrize( "schema, examples", [({"type": "integer", "multipleOf": 0.75}, [1.5e308])], ) def test_canonicalises_to_equivalent_fixpoint_examples(schema, examples): """Check that an object drawn from an arbitrary schema is valid. This is used to record past regressions from the test above. """ cc = canonicalish(schema) assert cc == canonicalish(cc) validator = jsonschema.validators.validator_for(schema) validator.check_schema(schema) validator.check_schema(cc) for instance in examples: assert is_valid(instance, schema) == is_valid(instance, cc) def test_dependencies_canonicalises_to_fixpoint(): """Check that an object drawn from an arbitrary schema is valid.""" cc = canonicalish( {"required": [""], "properties": {"": {}}, "dependencies": {"": [""]}} ) assert cc == canonicalish(cc) @pytest.mark.parametrize( "schema", [ {"type": "object", "maxProperties": 1, "required": ["0", "1"]}, {"type": "object", "required": [""], "propertyNames": {"minLength": 1}}, {"type": "array", "contains": False}, {"type": "null", "enum": [False, True]}, {"type": "boolean", "const": None}, {"type": "array", "items": False, "minItems": 1}, { "type": "array", "items": {"type": "null"}, "uniqueItems": True, "minItems": 2, }, { "type": "array", "items": {"type": "boolean"}, "uniqueItems": True, "minItems": 3, }, { "type": "array", "items": {"type": ["null", "boolean"]}, "uniqueItems": True, "minItems": 4, }, {"type": "array", "items": [True, False], "minItems": 2}, {"type": "array", "items": [True], "minItems": 2, "additionalItems": False}, { "type": "array", "items": [True, False, True], "minItems": 3, "additionalItems": {"type": "null"}, }, {"type": "integer", "minimum": 2, "maximum": 1}, {"type": "integer", "minimum": 1, "maximum": 2, "multipleOf": 3}, 
{"type": "number", "exclusiveMinimum": 0, "maximum": 0}, {"type": "number", "exclusiveMinimum": -0.0, "maximum": -0.0}, {"type": "number", "minimum": 0, "exclusiveMaximum": 0}, {"type": "number", "minimum": -0.0, "exclusiveMaximum": -0.0}, { "type": "number", "exclusiveMinimum": 0, "exclusiveMaximum": 3, "multipleOf": 3, }, {"not": {"type": ["integer", "number"]}, "type": "number"}, {"not": {"anyOf": [{"type": "integer"}, {"type": "number"}]}, "type": "number"}, {"not": {"enum": [1, 2, 3]}, "const": 2}, {"oneOf": []}, {"oneOf": [{}, {}]}, {"oneOf": [True, False, {}]}, {"anyOf": [False, {"not": {}}]}, {"type": "object", "maxProperties": 2, "minProperties": 3}, { "type": "object", "maxProperties": 1, "required": ["", "0"], "propertyNames": {"minLength": 2}, }, { "type": "array", "items": {"type": "integer", "minimum": 0, "maximum": 0}, "uniqueItems": True, "minItems": 2, }, {"type": "array", "items": {"type": "integer"}, "contains": {"type": "string"}}, { # only seven allowed elements: [], [1], [2], [1, 1], [1, 2], [2, 1], [2, 2] "type": "array", "items": {"type": "array", "items": {"enum": [1, 2]}, "maxItems": 2}, "minItems": 8, "uniqueItems": True, }, {"type": "object", "required": ["a"], "properties": {"a": False}}, ], ) def test_canonicalises_to_empty(schema): assert canonicalish(schema) == {"not": {}}, (schema, canonicalish(schema)) @pytest.mark.parametrize( "schema,expected", [ ({"type": get_type({})}, {}), ({"required": []}, {}), ({"type": "integer", "not": {"type": "string"}}, {"type": "integer"}), ({"type": "integer", "multipleOf": 1 / 32}, {"type": "integer"}), ({"type": "number", "multipleOf": 1.0}, {"type": "integer"}), ({"type": "number", "multipleOf": -3.0}, {"type": "integer", "multipleOf": 3}), ( {"type": "number", "multipleOf": 0.75, "not": {"multipleOf": 1.25}}, {"type": "number", "multipleOf": 0.75, "not": {"multipleOf": 1.25}}, ), ( {"type": "array", "items": [True, False, True]}, {"type": "array", "items": [{}], "maxItems": 1}, ), ( {"type": 
"integer", "minimum": 1, "exclusiveMinimum": 0}, {"type": "integer", "minimum": 1}, ), ( {"type": "integer", "maximum": 0, "exclusiveMaximum": 1}, {"type": "integer", "maximum": 0}, ), ( {"type": "integer", "minimum": 1, "multipleOf": 2}, {"type": "integer", "minimum": 2, "multipleOf": 2}, ), ( {"type": "integer", "maximum": 1, "multipleOf": 2}, {"type": "integer", "maximum": 0, "multipleOf": 2}, ), ( {"required": ["a"], "dependencies": {"a": ["b"], "b": ["c"], "x": ["y"]}}, {"required": ["a", "b", "c"], "dependencies": {"x": ["y"]}}, ), ( {"type": "number", "minimum": 0, "exclusiveMaximum": 6, "multipleOf": 3}, {"type": "integer", "minimum": 0, "maximum": 3, "multipleOf": 3}, ), ( {"type": "number", "minimum": 0, "exclusiveMaximum": next_up(0.0)}, {"const": 0}, ), ( {"type": "number", "exclusiveMinimum": 1.5, "maximum": next_up(1.5)}, {"const": next_up(1.5)}, ), ( { "type": "number", "minimum": 1.5, "exclusiveMaximum": 2.5, "multipleOf": 0.5, }, { "type": "number", "minimum": 1.5, "exclusiveMaximum": 2.5, "multipleOf": 0.5, }, ), ({"enum": ["aa", 2, "z", None, 1]}, {"enum": [None, 1, 2, "z", "aa"]}), ( {"contains": {}, "items": {}, "type": "array"}, {"minItems": 1, "type": "array"}, ), ({"anyOf": [{}, {"type": "null"}]}, {}), ({"anyOf": [{"anyOf": [{"anyOf": [{"type": "null"}]}]}]}, {"const": None}), ( { "anyOf": [ {"type": "string"}, {"anyOf": [{"type": "number"}, {"type": "array"}]}, ] }, {"type": ["number", "string", "array"]}, ), ( {"anyOf": [{"type": "integer"}, {"type": "number"}]}, {"type": "number"}, ), ( { "anyOf": [{"type": "string"}, {"type": "number"}], "type": ["string", "object"], }, {"type": "string"}, ), ({"uniqueItems": False}, {}), ( { "type": "array", "items": [True, True], "minItems": 3, "additionalItems": {"type": "null"}, }, { "type": "array", "items": [{}, {}], "minItems": 3, "additionalItems": {"const": None}, }, ), ( { "type": "array", "items": {"type": "number", "multipleOf": 0.5}, "contains": {"type": "number", "multipleOf": 0.75}, }, { 
"type": "array", "minItems": 1, "items": {"type": "number", "multipleOf": 0.5}, "contains": {"type": "number", "multipleOf": 0.75}, }, ), ( {"type": "array", "items": {"const": 1}, "uniqueItems": True}, { "type": "array", "items": {"const": 1}, "uniqueItems": True, "maxItems": 1, }, ), ( { "anyOf": [ {"const": "a"}, {"anyOf": [{"anyOf": [{"const": "c"}]}, {"const": "b"}]}, ] }, # TODO: could be {"enum": ["a", "b", "c"]}, {"anyOf": [{"const": "a"}, {"const": "b"}, {"const": "c"}]}, ), ( {"if": {"type": "null"}, "then": {"type": "null"}}, {}, ), ( {"if": {"type": "null"}, "then": {"type": "null"}, "else": {}}, {}, ), ( {"if": {"type": "null"}, "then": {}, "else": {}}, {}, ), ( {"if": {"type": "integer"}, "then": {}, "else": {}, "type": "number"}, {"type": "number"}, ), ( {"allOf": [{"multipleOf": 1.5}], "multipleOf": 1.5}, {"multipleOf": 1.5}, ), ( {"type": "integer", "allOf": [{"multipleOf": 0.5}, {"multipleOf": 1e308}]}, {"type": "integer", "multipleOf": 1e308}, ), ( { "additionalProperties": {"not": {}}, "properties": {"a": {"not": {}}}, "type": "object", }, {"maxProperties": 0, "type": "object"}, ), ( { "additionalProperties": {"not": {}}, "properties": {"a": {"not": {}}, "b": {}}, "type": "object", }, { "additionalProperties": {"not": {}}, "properties": {"b": {}}, "maxProperties": 1, "type": "object", }, ), ], ) def test_canonicalises_to_expected(schema, expected): assert canonicalish(schema) == expected, (schema, canonicalish(schema), expected) @pytest.mark.parametrize( "group,result", [ ([{"type": []}, {}], {"not": {}}), ([{"type": "null"}, {"const": 0}], {"not": {}}), ([{"type": "null"}, {"enum": [0]}], {"not": {}}), ([{"type": "integer"}, {"type": "string"}], {"not": {}}), ([{"type": "null"}, {"type": "boolean"}], {"not": {}}), ([{"type": "null"}, {"enum": [None, True]}], {"const": None}), ([{"type": "string"}, {"enum": ["abc", True]}], {"const": "abc"}), ([{"type": "null"}, {"type": ["null", "boolean"]}], {"const": None}), ([{"type": "integer"}, {"maximum": 
20}], {"type": "integer", "maximum": 20}), ([{"type": "integer"}, {"type": "number"}], {"type": "integer"}), ([{"multipleOf": 0.25}, {"multipleOf": 0.5}], {"multipleOf": 0.5}), ([{"multipleOf": 0.5}, {"multipleOf": 1.5}], {"multipleOf": 1.5}), ( [ {"type": "string", "format": "color"}, {"type": "string", "format": "date-fullyear"}, ], None, ), ( [ {"type": "integer", "multipleOf": 4}, {"type": "integer", "multipleOf": 6}, ], {"type": "integer", "multipleOf": 12}, ), ( [ {"properties": {"foo": {"maximum": 20}}}, {"properties": {"foo": {"minimum": 10}}}, ], {"properties": {"foo": {"maximum": 20, "minimum": 10}}}, ), ( [ {"contains": {}, "items": {}, "type": "array"}, {"items": False, "type": "array"}, ], {"not": {}}, ), ( [ {"allOf": [{"multipleOf": 0.5}, {"multipleOf": 0.75}]}, {"allOf": [{"multipleOf": 0.5}, {"multipleOf": 1.25}]}, ], { "allOf": [ {"multipleOf": 0.5}, {"multipleOf": 0.75}, {"multipleOf": 1.25}, ] }, ), ( [ {"additionalProperties": {"type": "null"}}, {"additionalProperties": {"type": "boolean"}}, ], {"additionalProperties": {"not": {}}}, ), ( [ {"additionalProperties": {"type": "null"}, "properties": {"foo": {}}}, {"additionalProperties": {"type": "boolean"}}, ], { "properties": {"foo": {"enum": [False, True]}}, "additionalProperties": {"not": {}}, "maxProperties": 1, }, ), ( [ { "properties": {"": {"type": "string"}}, "required": [""], "type": "object", }, {"additionalProperties": {"type": "null"}, "type": "object"}, ], {"not": {}}, ), ( [ {"additionalProperties": {"patternProperties": {".": {}}}}, {"additionalProperties": {"patternProperties": {"a": {}}}}, ], {"additionalProperties": {"patternProperties": {".": {}, "a": {}}}}, ), ( [ {"patternProperties": {".": {"enum": [None, True]}}}, {"properties": {"ab": {"type": "boolean"}}}, ], { "patternProperties": {".": {"enum": [None, True]}}, "properties": {"ab": {"const": True}}, }, ), ( [ {"type": "array", "contains": {"type": "integer"}}, {"type": "array", "contains": {"type": "number"}}, ], {"type": 
"array", "contains": {"type": "integer"}, "minItems": 1}, ), ( [{"not": {"enum": [1, 2, 3]}}, {"not": {"enum": ["a", "b", "c"]}}], {"not": {"anyOf": [{"enum": ["a", "b", "c"]}, {"enum": [1, 2, 3]}]}}, ), ( [{"dependencies": {"a": ["b"]}}, {"dependencies": {"a": ["c"]}}], {"dependencies": {"a": ["b", "c"]}}, ), ( [{"dependencies": {"a": ["b"]}}, {"dependencies": {"b": ["c"]}}], {"dependencies": {"a": ["b"], "b": ["c"]}}, ), ( [ {"dependencies": {"a": ["b"]}}, {"dependencies": {"a": {"properties": {"b": {"type": "string"}}}}}, ], { "dependencies": { "a": {"required": ["b"], "properties": {"b": {"type": "string"}}} }, }, ), ( [ {"dependencies": {"a": {"properties": {"b": {"type": "string"}}}}}, {"dependencies": {"a": ["b"]}}, ], { "dependencies": { "a": {"required": ["b"], "properties": {"b": {"type": "string"}}} }, }, ), ( [ {"dependencies": {"a": {"pattern": "a"}}}, {"dependencies": {"a": {"pattern": "b"}}}, ], None, ), ([{"items": {"pattern": "a"}}, {"items": {"pattern": "b"}}], None), ([{"items": [{"pattern": "a"}]}, {"items": [{"pattern": "b"}]}], None), ( [ {"items": [{}], "additionalItems": {"pattern": "a"}}, {"items": [{}], "additionalItems": {"pattern": "b"}}, ], None, ), ( [ {"items": [{}, {"type": "string"}], "additionalItems": False}, {"items": [{"type": "string"}]}, ], { "items": [{"type": "string"}, {"type": "string"}], "additionalItems": FALSEY, }, ), ( [ {"items": [{}, {"type": "string"}], "additionalItems": False}, {"items": {"type": "string"}}, ], { "items": [{"type": "string"}, {"type": "string"}], "additionalItems": FALSEY, }, ), ] + [ ([{lo: 0, hi: 9}, {lo: 1, hi: 10}], {lo: 1, hi: 9}) for lo, hi in [ ("minimum", "maximum"), ("exclusiveMinimum", "exclusiveMaximum"), ("minLength", "maxLength"), ("minItems", "maxItems"), ("minProperties", "maxProperties"), ] ], ) def test_merged(group, result): assert merged(group) == result @settings(suppress_health_check=[HealthCheck.too_slow], deadline=None) @given(json_schemata()) def 
test_self_merge_eq_canonicalish(schema): m = merged([schema, schema]) assert m == canonicalish(schema) def _merge_semantics_helper(data, s1, s2, combined): note(f"combined={combined!r}") ic = data.draw(from_schema(combined), label="combined") i1 = data.draw(from_schema(s1), label="s1") i2 = data.draw(from_schema(s2), label="s2") assert is_valid(ic, s1) assert is_valid(ic, s2) assert is_valid(i1, s2) == is_valid(i1, combined) assert is_valid(i2, s1) == is_valid(i2, combined) @pytest.mark.xfail( strict=False, reason="https://github.com/python-jsonschema/jsonschema/issues/1159" ) @settings(suppress_health_check=list(HealthCheck), deadline=None) @given(st.data(), json_schemata(), json_schemata()) def test_merge_semantics(data, s1, s2): assume(canonicalish(s1) != FALSEY and canonicalish(s2) != FALSEY) combined = merged([s1, s2]) assume(combined is not None) assert combined == merged([s2, s1]) # union is commutative assume(combined != FALSEY) _merge_semantics_helper(data, s1, s2, combined) @pytest.mark.xfail( strict=False, reason="https://github.com/python-jsonschema/jsonschema/issues/1159" ) @settings(suppress_health_check=list(HealthCheck), deadline=None) @given( st.data(), gen_number(kind="integer") | gen_number(kind="number"), gen_number(kind="integer") | gen_number(kind="number"), ) def test_can_almost_always_merge_numeric_schemas(data, s1, s2): assume(canonicalish(s1) != FALSEY and canonicalish(s2) != FALSEY) combined = merged([s1, s2]) if combined is None: # The ONLY case in which we can't merge numeric schemas is when # they both contain multipleOf keys with distinct non-integer values. mul1, mul2 = s1["multipleOf"], s2["multipleOf"] assert isinstance(mul1, float) or isinstance(mul2, float) assert mul1 != mul2 # TODO: work out why this started failing with # s1={'type': 'integer', 'multipleOf': 2}, # s2={'type': 'integer', 'multipleOf': 0.3333333333333333} # ratio = max(mul1, mul2) / min(mul1, mul2) # assert ratio != int(ratio) # i.e. 
x=0.5, y=2 (ratio=4.0) should work elif combined != FALSEY: _merge_semantics_helper(data, s1, s2, combined) def test_resolution_checks_resolver_is_valid(): with pytest.raises(InvalidArgument): resolve_all_refs({}, resolver="not a resolver") @settings(suppress_health_check=[HealthCheck.too_slow], deadline=None) @given(data=st.data()) def _canonicalises_to_equivalent_fixpoint(data): # This function isn't executed by pytest, only by FuzzBuzz - we want to parametrize # over schemas for different types there, but have to supply *all* args here. schema = data.draw(json_schemata(), label="schema") cc = canonicalish(schema) assert cc == canonicalish(cc) try: strat = from_schema(cc) except InvalidArgument: # e.g. array of unique {type: integers}, with too few allowed integers assume(False) instance = data.draw(JSON_STRATEGY | strat, label="instance") assert is_valid(instance, schema) == is_valid(instance, cc) jsonschema.validators.validator_for(schema).check_schema(schema) def test_canonicalise_is_only_valid_for_schemas(): with pytest.raises(InvalidArgument): canonicalish("not a schema") def test_validators_use_proper_draft(): # See GH-66 schema = { "$schema": "http://json-schema.org/draft-04/schema#", "not": { "allOf": [ {"exclusiveMinimum": True, "minimum": 0}, {"exclusiveMaximum": True, "maximum": 10}, ] }, } cc = canonicalish(schema) jsonschema.validators.validator_for(cc).check_schema(cc) def test_reference_resolver_issue_65_regression(): schema = { "allOf": [{"$ref": "#/definitions/ref"}, {"required": ["foo"]}], "properties": {"foo": {}}, "definitions": {"ref": {"maxProperties": 1}}, "type": "object", } res = resolve_all_refs(schema) can = canonicalish(res) assert "$ref" not in res assert "$ref" not in can for s in (schema, res, can): with pytest.raises(jsonschema.ValidationError): jsonschema.validate({}, s) ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1709152418.0 
hypothesis-jsonschema-0.23.1/tests/test_encode.py0000644000175100001770000000073500000000000022747 0ustar00runnerdocker00000000000000"""Tests for the hypothesis-jsonschema library.""" import json from hypothesis import given from hypothesis_jsonschema._encode import encode_canonical_json from hypothesis_jsonschema._from_schema import JSON_STRATEGY @given(JSON_STRATEGY) def test_canonical_json_encoding(v): """Test our hand-rolled canonicaljson implementation.""" encoded = encode_canonical_json(v) v2 = json.loads(encoded) assert v == v2 assert encode_canonical_json(v2) == encoded ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1709152418.0 hypothesis-jsonschema-0.23.1/tests/test_from_schema.py0000644000175100001770000005415000000000000023775 0ustar00runnerdocker00000000000000"""Tests for the hypothesis-jsonschema library.""" import json import re import warnings from pathlib import Path import jsonschema import pytest from gen_schemas import schema_strategy_params from hypothesis import ( HealthCheck, Phase, assume, given, note, reject, settings, strategies as st, ) from hypothesis.errors import FailedHealthCheck, HypothesisWarning, InvalidArgument from hypothesis.internal.compat import PYPY from hypothesis.internal.reflection import proxies from hypothesis_jsonschema._canonicalise import ( HypothesisRefResolutionError, canonicalish, make_validator, ) from hypothesis_jsonschema._from_schema import from_schema from hypothesis_jsonschema._resolve import resolve_all_refs # We use this as a placeholder for all schemas which resolve to nothing() # but do not canonicalise to FALSEY INVALID_REGEX_SCHEMA = {"type": "string", "pattern": "["} @settings( suppress_health_check=[HealthCheck.too_slow, HealthCheck.filter_too_much], deadline=None, ) @given(data=st.data()) @schema_strategy_params def test_generated_data_matches_schema(schema_strategy, data): """Check that an object drawn from an arbitrary schema is valid.""" schema = 
data.draw(schema_strategy) note(f"{schema=}") try: value = data.draw(from_schema(schema), "value from schema") except InvalidArgument: reject() assert make_validator(schema).is_valid(value) # This checks that our canonicalisation is semantically equivalent. assert make_validator(canonicalish(schema)).is_valid(value) @given(from_schema(True)) def test_boolean_true_is_valid_schema_and_resolvable(_): """...even though it's currently broken in jsonschema.""" @pytest.mark.parametrize( "schema", [ None, False, {"type": "an unknown type"}, {"allOf": [{"type": "boolean"}, {"const": None}]}, {"allOf": [{"type": "boolean"}, {"enum": [None]}]}, { "$schema": "http://json-schema.org/draft-07/schema#", "maximum": 10, "exclusiveMaximum": True, }, ], ) def test_invalid_schemas_raise(schema): """Trigger all the validation exceptions for full coverage.""" with pytest.raises(Exception): from_schema(schema).example() @pytest.mark.parametrize( "schema", [ INVALID_REGEX_SCHEMA, {"type": "string", "pattern": "[", "format": "color"}, {"type": "object", "patternProperties": {"[": False}}, {"type": "object", "patternProperties": {"[": False}, "required": ["a"]}, ], ) def test_invalid_regex_emit_warning(schema): with pytest.warns(UserWarning): from_schema(schema).validate() @given( from_schema( { "$schema": "http://json-schema.org/draft-04/schema#", "maximum": 10, "exclusiveMaximum": True, } ) ) def test_can_generate_with_explicit_schema_version(_): pass INVALID_SCHEMAS = { # Empty list for requires, which is invalid "Release Drafter configuration file", # Many, many schemas have invalid $schema keys, which emit a warning (-Werror) "A JSON schema for CRYENGINE projects (.cryproj files)", "JSDoc configuration file", "Static Analysis Results Format (SARIF) External Property File Format, Version 2.1.0-rtm.2", "Static Analysis Results Format (SARIF) External Property File Format, Version 2.1.0-rtm.3", "Static Analysis Results Format (SARIF) External Property File Format, Version 2.1.0-rtm.4", 
"Static Analysis Results Format (SARIF) External Property File Format, Version 2.1.0-rtm.5", "Static Analysis Results Format (SARIF), Version 2.1.0-rtm.2", } NON_EXISTENT_REF_SCHEMAS = { "Cirrus CI configuration files", "The Bamboo Specs allows you to define Bamboo configuration as code, and have corresponding plans/deployments created or updated automatically in Bamboo", # Special case - reference is valid, but target is list-format `items` rather than a subschema "TypeScript Lint configuration file", } UNSUPPORTED_SCHEMAS = { # Technically valid, but using regex patterns not supported by Python "draft4/unicode digits are more than 0 through 9", "draft4/unicode semantics should be used for all patternProperties matching", "draft7/unicode digits are more than 0 through 9", "draft7/unicode semantics should be used for all pattern matching", "draft7/unicode semantics should be used for all patternProperties matching", "draft7/ECMA 262 regex escapes control codes with \\c and lower letter", "draft7/ECMA 262 regex escapes control codes with \\c and upper letter", "JSON schema for nodemon.json configuration files.", "JSON Schema for mime type collections", } SKIP_ON_PYPY_SCHEMAS = { # Cause crashes or recursion errors, but only under PyPy "Swagger API 2.0 schema", "Language grammar description files in Textmate and compatible editors", } FLAKY_SCHEMAS = { # The following schemas refer to an `$id` rather than a JSON pointer. # This is valid, but not supported by the Python library - see e.g. # https://json-schema.org/understanding-json-schema/structuring.html#using-id-with-ref "draft4/Location-independent identifier", "draft7/Location-independent identifier", # Yep, lists of lists of lists of lists of lists of integers are HealthCheck-slow # TODO: write a separate test with healthchecks disabled? "draft4/nested items", "draft7/nested items", "draft4/oneOf with missing optional property", "draft7/oneOf with missing optional property", # Sometimes unsatisfiable. 
TODO: improve canonicalisation to remove filters "JSCS configuration file", # https://github.com/Zac-HD/hypothesis-jsonschema/pull/78#issuecomment-803519293 "Drone CI configuration file", "PHP Composer configuration file", "Pyrseas database schema versioning for Postgres databases, v0.8", # Apparently we're not handling this one correctly? "draft4/additionalProperties should not look in applicators", "draft7/additionalProperties should not look in applicators", # $id (sometimes) rejected as invalid/unknown URL type "A JSON schema for a Dolittle bounded context's artifacts", "A JSON schema for a Dolittle bounded context's resource configurations", # This one fails because of a hard-to-find (and on the surface impossible) # counterexample involving oneOf, which doesn't fail if validated directly! # {'requirements': {'': [{'location': None, 'rule': 'dir'}]}} "CLI config for enforcing environment settings", # These ones fail under jsonschema >= 4.0.0 # TODO: work out why and fix it; this is pure "ignore so we can ship it" "draft7/Recursive references between schemas", "draft7/$id inside an unknown keyword is not a real identifier", "draft7/refs with relative uris and defs", "draft7/relative refs with absolute uris and defs", } SLOW_SCHEMAS = { "snapcraft project (https://snapcraft.io)", "batect configuration file", "UI5 Tooling Configuration File (ui5.yaml)", "Renovate config file (https://github.com/renovatebot/renovate)", "Renovate config file (https://renovatebot.com/)", "Jenkins X Pipeline YAML configuration files", "TypeScript compiler configuration file", "JSON Schema for GraphQL Mesh config file", "Configuration file for stylelint", "Travis CI configuration file", "JSON schema for ESLint configuration files", "Ansible task files-2.0", "Ansible task files-2.1", "Ansible task files-2.2", "Ansible task files-2.3", "Ansible task files-2.4", "Ansible task files-2.5", "Ansible task files-2.6", "Ansible task files-2.7", "Ansible task files-2.9", "JSON Schema for 
GraphQL Code Generator config file", "Schema for CircleCI 2.0 config files", "Schema for Camel K YAML DSL", "The AWS Serverless Application Model (AWS SAM, previously known as Project Flourish) extends AWS CloudFormation to provide a simplified way of defining the Amazon API Gateway APIs, AWS Lambda functions, and Amazon DynamoDB tables needed by your serverless application.", "AWS CloudFormation provides a common language for you to describe and provision all the infrastructure resources in your cloud environment.", "JSON API document", "Prometheus configuration file", "JSON schema for electron-build configuration file.", "Pyrseas database schema versioning for Postgres databases, v0.8", # oneOf on property names means only objects are valid, but it's a very # filter-heavy way to express that. TODO: canonicalise oneOf to anyOf. "draft7/oneOf complex types", } with open(Path(__file__).parent / "corpus-schemastore-catalog.json") as f: catalog = json.load(f) with open(Path(__file__).parent / "corpus-suite-schemas.json") as f: suite, invalid_suite = json.load(f) with open(Path(__file__).parent / "corpus-reported.json") as f: reported = json.load(f) assert set(reported).isdisjoint(suite) suite.update(reported) def to_name_params(corpus): for n in sorted(corpus): if n in INVALID_SCHEMAS | NON_EXISTENT_REF_SCHEMAS: continue if n in UNSUPPORTED_SCHEMAS: continue if n in SKIP_ON_PYPY_SCHEMAS: yield pytest.param(n, marks=pytest.mark.skipif(PYPY, reason="broken")) elif n in SLOW_SCHEMAS | FLAKY_SCHEMAS: yield pytest.param(n, marks=pytest.mark.skip) else: if isinstance(corpus[n], dict) and "$schema" in corpus[n]: jsonschema.validators.validator_for(corpus[n]).check_schema(corpus[n]) yield n @pytest.mark.parametrize("name", sorted(INVALID_SCHEMAS)) def test_invalid_schemas_are_invalid(name): with pytest.raises(Exception): jsonschema.validators.validator_for(catalog[name]).check_schema(catalog[name]) @pytest.mark.parametrize("name", sorted(NON_EXISTENT_REF_SCHEMAS)) def 
test_invalid_ref_schemas_are_invalid(name): with pytest.raises(Exception): resolve_all_refs(catalog[name]) RECURSIVE_REFS = { # From upstream validation test suite "draft4/valid definition", "draft7/valid definition", "draft4/validate definition against metaschema", "draft7/validate definition against metaschema", "draft4/remote ref, containing refs itself", "draft7/remote ref, containing refs itself", "draft7/root pointer ref", # Schema also requires draft 03, which hypothesis-jsonschema doesn't support "A JSON Schema for ninjs by the IPTC. News and publishing information. See https://iptc.org/standards/ninjs/-1.0", # From schemastore "A JSON schema for Open API documentation files", "Avro Schema Avsc file", "AWS CloudFormation provides a common language for you to describe and provision all the infrastructure resources in your cloud environment.", "JSON schema .NET template files", "AppVeyor CI configuration file", "JSON Document Transofrm", "JSON Linked Data files", "Meta-validation schema for JSON Schema Draft 4", "JSON schema for vim plugin addon-info.json metadata files", "Meta-validation schema for JSON Schema Draft 7", "Neotys as-code load test specification, more at: https://github.com/Neotys-Labs/neoload-cli", "Metadata spec v1.26.4 for KSP-CKAN", "Digital Signature Service Core Protocols, Elements, and Bindings Version 2.0", "Opctl schema for describing an op", "Metadata spec v1.27 for KSP-CKAN", "PocketMine plugin manifest file", "BuckleScript configuration file", "Schema for CircleCI 2.0 config files", "Source Map files version 3", "Schema for Minecraft Bukkit plugin description files", "Swagger API 2.0 schema", "Static Analysis Results Interchange Format (SARIF) version 1", "Static Analysis Results Format (SARIF), Version 2.1.0-rtm.4", "Static Analysis Results Interchange Format (SARIF) version 2", "Static Analysis Results Format (SARIF), Version 2.1.0-rtm.3", "Static Analysis Results Format (SARIF), Version 2.1.0-rtm.5", "Web component file", "Vega 
visualization specification file", "The AWS Serverless Application Model (AWS SAM, previously known as Project Flourish) extends AWS CloudFormation to provide a simplified way of defining the Amazon API Gateway APIs, AWS Lambda functions, and Amazon DynamoDB tables needed by your serverless application.", "Windows App localization file", "YAML schema for GitHub Workflow", "JSON-stat 2.0 Schema", "Vega-Lite visualization specification file", "Language grammar description files in Textmate and compatible editors", "JSON Schema for GraphQL Mesh Config gile-0.0.16", "Azure Pipelines YAML pipelines definition", "Action and rule configuration descriptor for Yippee-Ki-JSON transformations.-1.1.2", "Action and rule configuration descriptor for Yippee-Ki-JSON transformations.-latest", "Schema for Camel K YAML DSL", } def xfail_on_reference_resolve_error(f): @proxies(f) def inner(*args, **kwargs): _, name = args assert isinstance(name, str) try: f(*args, **kwargs) assert name not in RECURSIVE_REFS except jsonschema.exceptions._RefResolutionError as err: if ( isinstance(err, HypothesisRefResolutionError) or isinstance(err._cause, HypothesisRefResolutionError) ) and ( "does not fetch remote references" in str(err) or name in RECURSIVE_REFS and "Could not resolve recursive references" in str(err) ): pytest.xfail() raise return inner @pytest.mark.parametrize("name", to_name_params(catalog)) @settings(deadline=None, max_examples=5, suppress_health_check=list(HealthCheck)) @given(data=st.data()) @xfail_on_reference_resolve_error def test_can_generate_for_real_large_schema(data, name): note(name) value = data.draw(from_schema(catalog[name])) jsonschema.validate(value, catalog[name]) @pytest.mark.parametrize("name", to_name_params(suite)) @settings( suppress_health_check=[HealthCheck.too_slow, HealthCheck.data_too_large], deadline=None, max_examples=20, ) @given(data=st.data()) @xfail_on_reference_resolve_error def test_can_generate_for_test_suite_schema(data, name): 
note(f"{suite[name]=}") value = data.draw(from_schema(suite[name])) try: jsonschema.validate(value, suite[name]) except jsonschema.exceptions.SchemaError: jsonschema.Draft4Validator(suite[name]).validate(value) @pytest.mark.parametrize("name", to_name_params(invalid_suite)) def test_cannot_generate_for_empty_test_suite_schema(name): strat = from_schema(invalid_suite[name]) with pytest.raises(Exception): strat.example() # This schema has overlapping patternProperties - this is OK, so long as they're # merged or otherwise handled correctly, with the exception of the key "ab" which # would have to be both an integer and a string (and is thus disallowed). OVERLAPPING_PATTERNS_SCHEMA = { "type": "object", "patternProperties": { r"\A[ab]{1,2}\Z": {}, r"\Aa[ab]\Z": {"type": "integer"}, r"\A[ab]b\Z": {"type": "string"}, }, "additionalProperties": False, "minProperties": 1, } @given(from_schema(OVERLAPPING_PATTERNS_SCHEMA)) def test_handles_overlapping_patternproperties(value): jsonschema.validate(value, OVERLAPPING_PATTERNS_SCHEMA) assert isinstance(value, dict) assert len(value) >= 1 assert "ab" not in value # A dictionary with zero or one keys, which was always empty due to a bug. SCHEMA = { "type": "object", "properties": {"key": {"type": "string"}}, "additionalProperties": False, } @given(from_schema(SCHEMA)) def test_single_property_can_generate_nonempty(query): # See https://github.com/Zac-HD/hypothesis-jsonschema/issues/25 assume(query) UNIQUE_NUMERIC_ARRAY_SCHEMA = { "type": "array", "uniqueItems": True, "items": {"enum": [0, 0.0]}, "minItems": 1, } @given(from_schema(UNIQUE_NUMERIC_ARRAY_SCHEMA)) def test_numeric_uniqueness(value): # NOTE: this kind of test should usually be embedded in corpus-reported.json, # but in this case the type of the enum elements matter and we don't want to # allow a flexible JSON loader to mess things up. 
jsonschema.validate(value, UNIQUE_NUMERIC_ARRAY_SCHEMA) def test_draft03_not_supported(): # Also checks that errors are deferred from import time to runtime @given(from_schema({"$schema": "http://json-schema.org/draft-03/schema#"})) def f(_): raise AssertionError with pytest.raises(InvalidArgument): f() @pytest.mark.parametrize("type_", ["integer", "number"]) def test_impossible_multiplier(type_): # Covering test for a failsafe branch, which explicitly returns nothing() # if scaling the bounds and taking their ceil/floor also inverts them. schema = {"maximum": -1, "minimum": -1, "multipleOf": 0.0009765625000000002} schema["type"] = type_ strategy = from_schema(schema) strategy.validate() assert strategy.is_empty def test_unsatisfiable_array_returns_nothing(): schema = { "type": "array", "items": [], "additionalItems": INVALID_REGEX_SCHEMA, "minItems": 1, } with pytest.warns(UserWarning): strategy = from_schema(schema) strategy.validate() assert strategy.is_empty ALLOF_CONTAINS = { "type": "array", "items": {"type": "string"}, "allOf": [{"contains": {"const": "A"}}, {"contains": {"const": "B"}}], } @pytest.mark.xfail(raises=FailedHealthCheck) @given(from_schema(ALLOF_CONTAINS)) def test_multiple_contains_behind_allof(value): # By placing *multiple* contains elements behind "allOf" we've disabled the # mixed-generation logic, and so we can't generate any valid instances at all. jsonschema.validate(value, ALLOF_CONTAINS) @jsonschema.FormatChecker._cls_checks("card-test") def validate_card_format(string): # For the real thing, you'd want to use the Luhn algorithm; this is enough for tests. 
    return bool(re.match(r"^\d{4} \d{4} \d{4} \d{4}$", string))


@pytest.mark.parametrize(
    "kw",
    [
        {"foo": "not a strategy"},
        {5: st.just("name is not a string")},
        {"card-test": st.just("not a valid card")},
        {"card-test": st.none()},  # Not a string
    ],
)
@given(data=st.data())
def test_custom_formats_validation(data, kw):
    s = from_schema({"type": "string", "format": "card-test"}, custom_formats=kw)
    with pytest.raises(InvalidArgument):
        data.draw(s)


@pytest.mark.parametrize(
    "schema",
    [
        {"required": ["\x00"]},
        {"properties": {"\x00": {"type": "integer"}}},
        {"dependencies": {"\x00": ["a"]}},
        {"dependencies": {"\x00": {"type": "integer"}}},
        {"required": ["\xff"]},
        {"properties": {"\xff": {"type": "integer"}}},
        {"dependencies": {"\xff": ["a"]}},
        {"dependencies": {"\xff": {"type": "integer"}}},
    ],
)
@settings(deadline=None)
@given(data=st.data())
def test_alphabet_name_validation(data, schema):
    with pytest.raises(InvalidArgument):
        data.draw(from_schema(schema, allow_x00=False, codec="ascii"))


@given(
    num=from_schema(
        {"type": "string", "format": "card-test"},
        custom_formats={"card-test": st.just("4111 1111 1111 1111")},
    )
)
def test_allowed_custom_format(num):
    assert num == "4111 1111 1111 1111"


@given(
    string=from_schema(
        {"type": "string", "format": "not registered"},
        custom_formats={"not registered": st.just("hello world")},
    )
)
def test_allowed_unknown_custom_format(string):
    assert string == "hello world"
    assert "not registered" not in jsonschema.FormatChecker().checkers


@given(data=st.data())
def test_overriding_standard_format(data):
    expected = "2000-01-01"
    schema = {"type": "string", "format": "full-date"}
    custom_formats = {"full-date": st.just(expected)}
    with pytest.warns(
        HypothesisWarning, match="Overriding standard format 'full-date'"
    ):
        value = data.draw(from_schema(schema, custom_formats=custom_formats))
    assert value == expected


with warnings.catch_warnings():
    warnings.simplefilter("ignore", UserWarning)

    @given(
        from_schema({"type": "array", "items":
            INVALID_REGEX_SCHEMA, "maxItems": 10})
    )
    def test_can_generate_empty_list_with_max_size_and_no_allowed_items(val):
        assert val == []

    @given(
        from_schema(
            {
                "type": "array",
                "items": [{"const": 1}, {"const": 2}, {"const": 3}],
                "additionalItems": INVALID_REGEX_SCHEMA,
                "maxItems": 10,
            }
        )
    )
    def test_can_generate_list_with_max_size_and_no_allowed_additional_items(val):
        assert val == [1, 2, 3]


@given(string=from_schema({"type": "string", "pattern": "^[a-z]+$"}))
def test_does_not_generate_trailing_newline_from_dollar_pattern(string):
    assert not string.endswith("\n")


@pytest.mark.xfail(strict=True, raises=UnicodeEncodeError)
@settings(phases=set(Phase) - {Phase.shrink})
@given(from_schema({"type": "string", "minLength": 100}, codec=None))
def test_can_find_non_utf8_string(value):
    value.encode()


@given(st.data())
def test_errors_on_unencodable_property_name(data):
    non_ascii_schema = {"type": "object", "properties": {"é": {"type": "integer"}}}
    data.draw(from_schema(non_ascii_schema, codec=None))
    with pytest.raises(InvalidArgument, match=r"'é' cannot be encoded as 'ascii'"):
        data.draw(from_schema(non_ascii_schema, codec="ascii"))


@settings(deadline=None)
@given(data=st.data())
def test_no_null_bytes(data):
    schema = {
        "type": "object",
        "properties": {
            "p1": {"type": "string"},
            "p2": {
                "type": "object",
                "properties": {"pp1": {"type": "string"}},
                "required": ["pp1"],
                "additionalProperties": False,
            },
            "p3": {"type": "array", "items": {"type": "string"}},
        },
        "required": ["p1", "p2", "p3"],
        "additionalProperties": False,
    }
    example = data.draw(from_schema(schema, allow_x00=False))
    assert "\x00" not in example["p1"]
    assert "\x00" not in example["p2"]["pp1"]
    assert all("\x00" not in item for item in example["p3"])


# hypothesis-jsonschema-0.23.1/tests/test_version.py
"""Tests for the
hypothesis-jsonschema library."""

import re
from datetime import datetime, timezone
from functools import lru_cache
from pathlib import Path
from typing import NamedTuple

import hypothesis_jsonschema


class Version(NamedTuple):
    major: int
    minor: int
    patch: int

    @classmethod
    def from_string(cls, string):
        return cls(*map(int, string.split(".")))


@lru_cache
def get_releases():
    pattern = re.compile(r"^#### (\d+\.\d+\.\d+) - (\d\d\d\d-\d\d-\d\d)$")
    with open(Path(__file__).parent.parent / "CHANGELOG.md") as f:
        return tuple(
            (Version.from_string(match.group(1)), match.group(2))
            for match in map(pattern.match, f)
            if match is not None
        )


def test_last_release_against_changelog():
    last_version, last_date = get_releases()[0]
    assert last_version == Version.from_string(hypothesis_jsonschema.__version__)
    assert last_date <= datetime.now(timezone.utc).date().isoformat()


def test_changelog_is_ordered():
    versions, dates = zip(*get_releases())
    assert versions == tuple(sorted(versions, reverse=True))
    assert dates == tuple(sorted(dates, reverse=True))


def test_version_increments_are_correct():
    # We either increment the patch version by one, increment the minor version
    # and reset the patch, or increment major and reset both minor and patch.
    versions, _ = zip(*get_releases())
    for prev, current in zip(versions[1:], versions):
        assert prev < current  # remember that `versions` is newest-first
        assert current in (
            prev._replace(patch=prev.patch + 1),
            prev._replace(minor=prev.minor + 1, patch=0),
            prev._replace(major=prev.major + 1, minor=0, patch=0),
        ), f"{current} does not follow {prev}"


# hypothesis-jsonschema-0.23.1/tox.ini
# The test environment and commands
[tox]
envlist = check, test
skipsdist = True

[testenv:check]
description = Runs all formatting tools then static analysis (quick)
deps =
    --requirement deps/check.txt
commands =
    shed
    python tests/format_json.py
    ruff --fix .
    mypy --config-file=tox.ini src/

[testenv:test]
description = Runs pytest with posargs - `tox -e test -- -v` == `pytest -v`
deps =
    --requirement deps/test.txt
commands =
    pip install --no-deps --editable .
    pytest {posargs:-n auto --durations=5}

[testenv:deps]
description = Updates test corpora and the pinned dependencies in `deps/*.txt`
deps =
    pip-tools
commands =
    pip-compile --quiet --upgrade --rebuild --output-file=deps/check.txt deps/check.in
    pip-compile --quiet --upgrade --rebuild --output-file=deps/test.txt deps/test.in setup.py
    # python tests/fetch.py

[testenv:mutmut]
description = Run the mutation testing tool `mutmut` (allow several hours)
deps =
    mutmut
    tox
commands =
    # pip install -e ../mutmut  # if e.g.
    # string mutations patched out for speed
    mutmut run \
        --no-backup \
        --paths-to-mutate=src/hypothesis_jsonschema \
        --runner="tox -- -x --no-cov -n auto -k 'not real_large_schema'"
    mutmut results


# Settings for other tools
[pytest]
xfail_strict = True
addopts =
    -Werror
    --tb=short
    --cov=hypothesis_jsonschema
    --cov-branch
    --cov-report=term-missing:skip-covered
    --cov-fail-under=100

[flake8]
ignore = D1,E203,E501,W503,S101,S310
exclude = .*/,__pycache__

[mypy]
python_version = 3.6
platform = linux
disallow_untyped_calls = True
disallow_untyped_defs = True
disallow_untyped_decorators = True
follow_imports = silent
ignore_missing_imports = True
implicit_reexport = False
strict_equality = True
warn_no_return = True
warn_return_any = True
warn_unreachable = True
warn_unused_ignores = True
warn_unused_configs = True
warn_redundant_casts = True