././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1780787404.6923745 galaxy_util-26.0.1/0000755000175100017510000000000015211124315013563 5ustar00runnerrunner././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787384.0 galaxy_util-26.0.1/HISTORY.rst0000644000175100017510000006312615211124270015466 0ustar00runnerrunner
History
-------

.. to_doc

-------------------
26.0.1 (2026-06-04)
-------------------

=========
Bug fixes
=========

* Fixes looks_like_flattened_repeat_key helper by `@guerler `_ in `#22578 `_

============
Enhancements
============

* Replace per-term joins in workflow search with EXISTS subqueries by `@mvdbeek `_ in `#22548 `_

-------------------
26.0.0 (2026-04-08)
-------------------

=========
Bug fixes
=========

* Plumbing for tracking potential fixes for transient failures (and a fix demonstrating it) by `@jmchilton `_ in `#21243 `_
* Remove unused handle_tool_shed_url_protocol by `@mvdbeek `_ in `#21925 `_
* Raise MessageException instead of generic Exception in rules_dsl by `@mvdbeek `_ in `#22285 `_
* Improve timeout and error handling in ``/api/proxy`` endpoint by `@mvdbeek `_ in `#22297 `_
* Skip WorkflowHub tests when workflowhub.eu is down by `@mvdbeek `_ in `#22302 `_
* Discard rest of line in chunks in iter_start_of_line by `@mvdbeek `_ in `#22332 `_
* Fix Content-Disposition header with trailing whitespace by `@mvdbeek `_ in `#22379 `_

============
Enhancements
============

* Update Python dependencies by `@galaxybot `_ in `#21043 `_
* Add Playwright Backend Support to Galaxy Browser Automation Framework by `@jmchilton `_ in `#21102 `_
* Add Custom Validation for User-Configured Templates by `@davelopez `_ in `#21155 `_
* Add type annotations to job handling code by `@nsoranzo `_ in `#21171 `_
* Richer tracking of transient failures. by `@jmchilton `_ in `#21227 `_
* Update fastapi to 0.123.4 and ``get_openapi()`` fork by `@nsoranzo `_ in `#21384 `_
* Add AI Agent Framework and ChatGXY 2.0 by `@dannon `_ in `#21434 `_
* Fix use of function, method and argument names deprecated in pyparsing 3.0.0 by `@nsoranzo `_ in `#21517 `_
* Apply 2026 black style by `@galaxybot `_ in `#21618 `_
* Add tests for oidc usernames by `@nuwang `_ in `#21655 `_
* Various fixes to file source template's validation system by `@davelopez `_ in `#21704 `_

-------------------
25.1.2 (2026-03-09)
-------------------

No recorded changes since last release

-------------------
25.1.1 (2026-02-03)
-------------------

No recorded changes since last release

-------------------
25.1.0 (2025-12-12)
-------------------

=========
Bug fixes
=========

* Extract: do not use common prefix dir by `@bernt-matthias `_ in `#20929 `_
* Test and fix CORS on exceptions by `@mvdbeek `_ in `#21105 `_

============
Enhancements
============

* Implement Sample Sheets by `@jmchilton `_ in `#19305 `_
* Empower Users to More Pragmatically Import Datasets & Collections From Tables by `@jmchilton `_ in `#20288 `_
* Type annotation fixes for mypy 1.16.0 by `@nsoranzo `_ in `#20424 `_
* Remove deprecated tool document cache by `@nsoranzo `_ in `#20510 `_
* Refactor Files Sources Framework for stronger typing using pydantic models by `@davelopez `_ in `#20728 `_
* Support remote file source hashes by `@davelopez `_ in `#20853 `_

-------------------
25.0.4 (2025-11-18)
-------------------

No recorded changes since last release

-------------------
25.0.3 (2025-09-23)
-------------------

No recorded changes since last release

-------------------
25.0.2 (2025-08-13)
-------------------

=========
Bug fixes
=========

* Prevent importing workflows with invalid step UUID by `@davelopez `_ in `#20596 `_
* Remove base_dir from zip in make_fast_zipfile by `@davelopez `_ in `#20739 `_

-------------------
25.0.1 (2025-06-20)
-------------------

No recorded changes since last release

-------------------
25.0.0 (2025-06-18)
-------------------

=========
Bug fixes
=========

* Use ``resource_path()`` to access datatypes_conf.xml.sample as a package resource by `@nsoranzo `_ in `#19331 `_
* Use fissix also when python3-lib2to3 is not installed by `@nsoranzo `_ in `#19749 `_
* Fix ``test_in_directory`` on osx by `@mvdbeek `_ in `#19943 `_

============
Enhancements
============

* Calculate hash for new non-deferred datasets when finishing a job by `@nsoranzo `_ in `#19181 `_
* Fix UP031 errors - Part 3 by `@nsoranzo `_ in `#19218 `_
* Fix UP031 errors - Part 4 by `@nsoranzo `_ in `#19235 `_
* Fix UP031 errors - Part 5 by `@nsoranzo `_ in `#19282 `_
* Type annotation fixes for mypy 1.14.0 by `@nsoranzo `_ in `#19372 `_
* Empower Users to Build More Kinds of Collections, More Intelligently by `@jmchilton `_ in `#19377 `_
* Set safe default extraction filter for tar archives by `@nsoranzo `_ in `#19406 `_
* Format code with black 25.1.0 by `@nsoranzo `_ in `#19625 `_
* Improve type annotations of ``ModelPersistenceContext`` and derived classes by `@nsoranzo `_ in `#19852 `_
* Allow PathLike parameters in ``make_fast_zipfile()`` by `@nsoranzo `_ in `#19955 `_
* Implement dataset collection support in workflow landing requests by `@mvdbeek `_ in `#20004 `_
* Add DOI to workflow metadata by `@jdavcs `_ in `#20033 `_
* Improve type annotation of `galaxy.util` submodules by `@nsoranzo `_ in `#20104 `_
* Additional type hints for ``toolbox.get_tool`` / ``toolbox.has_tool`` by `@mvdbeek `_ in `#20150 `_

-------------------
24.2.4 (2025-06-17)
-------------------

=========
Bug fixes
=========

* Use ``make_fast_zipfile`` directly by `@mvdbeek `_ in `#19947 `_

-------------------
24.2.3 (2025-03-16)
-------------------

No recorded changes since last release

-------------------
24.2.2 (2025-03-08)
-------------------

============
Enhancements
============

* Add bwa_mem2_index directory datatype, framework enhancements for testing directories by `@mvdbeek `_ in `#19694 `_

-------------------
24.2.1 (2025-02-28)
-------------------

No recorded changes since last release

-------------------
24.2.0 (2025-02-11)
-------------------

=========
Bug fixes
=========

* Fixes for errors reported by mypy 1.11.0 by `@nsoranzo `_ in `#18608 `_
* Fix numerous issues with tool input format "21.01" by `@jmchilton `_ in `#19030 `_
* Partial backport of #19331 by `@nsoranzo `_ in `#19342 `_
* Fix config template validation for file sources and object store templates by `@davelopez `_ in `#19414 `_
* Serialize message exceptions on execution error by `@mvdbeek `_ in `#19483 `_

============
Enhancements
============

* Allow OAuth 2.0 user defined file sources (w/Dropbox integration) by `@jmchilton `_ in `#18272 `_
* Add Python 3.13 support by `@nsoranzo `_ in `#18449 `_
* Add Tool-Centric APIs to the Tool Shed 2.0 by `@jmchilton `_ in `#18524 `_
* Rip repository_registry out of tool shed 2.0 by `@jmchilton `_ in `#18647 `_
* Workflow Landing Requests by `@jmchilton `_ in `#18807 `_
* Update Mypy to 1.11.2 and fix new signature override errors by `@nsoranzo `_ in `#18811 `_
* Raise exception if CompressedFile used on incompatible file by `@mvdbeek `_ in `#18888 `_
* Type annotations and fixes by `@nsoranzo `_ in `#18911 `_
* Workflow landing improvements by `@mvdbeek `_ in `#18979 `_
* Run installed Galaxy with no config and a simplified entry point by `@natefoo `_ in `#19050 `_
* Enhance UTF-8 support for filename handling in downloads by `@arash77 `_ in `#19161 `_

-------------------
24.1.4 (2024-12-11)
-------------------

=========
Bug fixes
=========

* Fix Archive header encoding by `@arash77 `_ in `#18583 `_
* File source and object store instance api fixes by `@mvdbeek `_ in `#18685 `_

============
Enhancements
============

* Use smtplib send_message to support utf-8 chars in to and from by `@mvdbeek `_ in `#18805 `_

-------------------
24.1.3 (2024-10-25)
-------------------

=========
Bug fixes
=========

* Fix Archive header encoding by `@arash77 `_ in `#18583 `_
* File source and object store instance api fixes by `@mvdbeek `_ in `#18685 `_

============
Enhancements
============

* Use smtplib send_message to support utf-8 chars in to and from by `@mvdbeek `_ in `#18805 `_

-------------------
24.1.2 (2024-09-25)
-------------------

=========
Bug fixes
=========

* Fix Archive header encoding by `@arash77 `_ in `#18583 `_
* File source and object store instance api fixes by `@mvdbeek `_ in `#18685 `_

============
Enhancements
============

* Use smtplib send_message to support utf-8 chars in to and from by `@mvdbeek `_ in `#18805 `_

-------------------
24.1.1 (2024-07-02)
-------------------

=========
Bug fixes
=========

* Fix bug in image_util.py by `@kostrykin `_ in `#17749 `_
* Revert some requests import changes by `@nsoranzo `_ in `#18199 `_

============
Enhancements
============

* Better display of estimated line numbers and add number of columns for tabular by `@bernt-matthias `_ in `#17492 `_
* Update Python dependencies by `@galaxybot `_ in `#17653 `_
* Code cleanups from ruff and pyupgrade by `@nsoranzo `_ in `#17654 `_
* SQLAlchemy 2.0 by `@jdavcs `_ in `#17778 `_
* Error reporting unit tests by `@jmchilton `_ in `#17968 `_
* Enable ``warn_unused_ignores`` mypy option by `@nsoranzo `_ in `#17991 `_
* Add galaxy to user agent by `@mvdbeek `_ in `#18003 `_
* Update Python dependencies by `@galaxybot `_ in `#18063 `_
* Enable flake8-implicit-str-concat ruff rules by `@nsoranzo `_ in `#18067 `_
* Overhaul Azure storage infrastructure. by `@jmchilton `_ in `#18087 `_
* Empower users to bring their own storage and file sources by `@jmchilton `_ in `#18127 `_
* Harden User Object Store and File Source Creation by `@jmchilton `_ in `#18172 `_

-------------------
24.0.3 (2024-06-28)
-------------------

=========
Bug fixes
=========

* Use config_section to distinguish between galaxy and ts or other apps by `@jdavcs `_ in `#18215 `_

-------------------
24.0.2 (2024-05-07)
-------------------

=========
Bug fixes
=========

* Adds logging of messageExceptions in the fastapi exception handler. by `@dannon `_ in `#18041 `_

-------------------
24.0.1 (2024-05-02)
-------------------

=========
Bug fixes
=========

* Fix conditional Image imports by `@mvdbeek `_ in `#17899 `_

-------------------
24.0.0 (2024-04-02)
-------------------

=========
Bug fixes
=========

* Optional Reply-to SMTP header in tool error reports by `@neoformit `_ in `#17243 `_
* Follow-up on #17274 and #17262 by `@nsoranzo `_ in `#17302 `_
* Fixes for flake8-bugbear 24.1.17 by `@nsoranzo `_ in `#17340 `_

============
Enhancements
============

* Add support for Python 3.12 by `@tuncK `_ in `#16796 `_
* Python 3.8 as minimum by `@mr-c `_ in `#16954 `_
* Remove web framework dependency from tools by `@davelopez `_ in `#17058 `_
* Add support for (fast5.tar).xz binary compressed files by `@tuncK `_ in `#17106 `_
* Reuse test instance during non-integration tests by `@mvdbeek `_ in `#17234 `_
* Add OIDC backend configuration schema and validation by `@uwwint `_ in `#17274 `_
* Enable ``warn_unreachable`` mypy option by `@mvdbeek `_ in `#17365 `_
* Fix type annotation of code using XML etree by `@nsoranzo `_ in `#17367 `_
* Update to black 2024 stable style by `@nsoranzo `_ in `#17391 `_
* Add `image_diff` comparison method for test output verification using images by `@kostrykin `_ in `#17556 `_

-------------------
23.2.1 (2024-02-21)
-------------------

=========
Bug fixes
=========

* Ruff and flake8 fixes by `@nsoranzo `_ in `#16884 `_

============
Enhancements
============

* Tool Shed 2.0 by `@jmchilton `_ in `#15639 `_
* Move database access code out of ``galaxy.util`` by `@jdavcs `_ in `#16526 `_
* Tweak tool memory use and optimize shared memory when using preload by `@mvdbeek `_ in `#16536 `_
* Updated path-based interactive tools with entry point path injection, support for ITs with relative links, shortened URLs, doc and config updates including Podman job_conf by `@sveinugu `_ in `#16795 `_
* Allow partial matches in workflow name tag search and search all tags for unquoted query by `@ahmedhamidawan `_ in `#16860 `_
* Use python-isal for fast zip deflate compression in rocrate export by `@mvdbeek `_ in `#17342 `_

=============
Other changes
=============

* Merge 23.1 into dev by `@mvdbeek `_ in `#16534 `_

-------------------
23.1.4 (2024-01-04)
-------------------

No recorded changes since last release

-------------------
23.1.3 (2023-12-01)
-------------------

No recorded changes since last release

-------------------
23.1.2 (2023-11-29)
-------------------

============
Enhancements
============

* Improve invocation error reporting by `@mvdbeek `_ in `#16917 `_

-------------------
23.1.1 (2023-10-23)
-------------------

=========
Bug fixes
=========

* Fix bad auto-merge of dev. by `@jmchilton `_ in `#15386 `_
* Fix some drs handling issues by `@nuwang `_ in `#15777 `_
* Enable ``strict_equality`` mypy option by `@nsoranzo `_ in `#15808 `_
* Ensure session is request-scoped for legacy endpoints by `@jdavcs `_ in `#16207 `_
* Fix form builder value handling by `@guerler `_ in `#16304 `_
* Backport tool mem fixes by `@mvdbeek `_ in `#16601 `_
* Workaround for XML nodes of job resource parameters losing their children by `@kysrpex `_ in `#16728 `_
* Fix allowlist deserialization in file sources by `@mvdbeek `_ in `#16729 `_
* Exclude on_opened and on_closed from watcher events by `@mvdbeek `_ in `#16850 `_

============
Enhancements
============

* Various Tool Shed Cleanup by `@jmchilton `_ in `#15247 `_
* Protection against problematic boolean parameters. by `@jmchilton `_ in `#15493 `_
* Unify url handling with filesources by `@nuwang `_ in `#15497 `_
* Explore tool remote test data by `@davelopez `_ in `#15510 `_
* Drop database views by `@jdavcs `_ in `#15876 `_
* Update Python dependencies by `@galaxybot `_ in `#15890 `_
* Record input datasets and collections at full parameter path by `@mvdbeek `_ in `#15978 `_
* Code cleanups from ruff and pyupgrade by `@nsoranzo `_ in `#16035 `_
* Vendorise ``packaging.versions.LegacyVersion`` by `@nsoranzo `_ in `#16058 `_
* Improve histories and datasets immutability checks by `@davelopez `_ in `#16143 `_
* Merge ``Target`` class with ``CondaTarget`` by `@nsoranzo `_ in `#16181 `_

-------------------
23.0.6 (2023-10-23)
-------------------

No recorded changes since last release

-------------------
23.0.5 (2023-07-29)
-------------------

No recorded changes since last release

-------------------
23.0.4 (2023-06-30)
-------------------

No recorded changes since last release

-------------------
23.0.3 (2023-06-26)
-------------------

No recorded changes since last release

-------------------
23.0.2 (2023-06-13)
-------------------

No recorded changes since last release

-------------------
23.0.1 (2023-06-08)
-------------------

=========
Bug fixes
=========

* Replace httpbin service with pytest-httpserver by `@mvdbeek `_ in `#16042 `_

-------------------
22.1.2 (2022-12-08)
-------------------

* Pin packaging dependency to < 22, fixes ``LegacyVersion`` import errors
* Add missing pyparsing dependency

-------------------
22.1.1 (2022-08-22)
-------------------

* First release from the 22.01 branch of Galaxy

-------------------
21.1.0 (2021-03-19)
-------------------

* First release from the 21.01 branch of Galaxy.

-------------------
20.9.1 (2020-10-28)
-------------------

-------------------
20.9.0 (2020-10-15)
-------------------

* First release from the 20.09 branch of Galaxy.

-------------------
20.5.0 (2020-07-03)
-------------------

* First release from 20.05 branch of Galaxy.

-------------------
19.9.0 (2019-11-21)
-------------------

* Initial import from dev branch of Galaxy during 19.09 development cycle.

././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/LICENSE0000644000175100017510000000306615211124267014603 0ustar00runnerrunner
Copyright (c) 2005-2026 Galaxy Contributors (see CONTRIBUTORS.md)

Galaxy is provided from 2026-02-25 onwards entirely under the MIT License.

Some icons found in Galaxy are from the Silk Icons set, available under
the Creative Commons Attribution 2.5 License, from:

http://www.famfamfam.com/lab/icons/silk/

Other images and documentation are licensed under the Creative Commons
Attribution 3.0 (CC BY 3.0) License.
See:

http://creativecommons.org/licenses/by/3.0/

--------------------------------------------------------------------------------

MIT License

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787384.0 galaxy_util-26.0.1/MANIFEST.in0000644000175100017510000000024315211124270015320 0ustar00runnerrunner
include *.rst *.txt LICENSE */py.typed
include galaxy/util/docutils_template.txt
include galaxy/util/rules_dsl_spec.yml
include galaxy/exceptions/error_codes.json
././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1780787404.6923156 galaxy_util-26.0.1/PKG-INFO0000644000175100017510000006630215211124315014667 0ustar00runnerrunner
Metadata-Version: 2.4
Name: galaxy-util
Version: 26.0.1
Summary: Galaxy generic utilities
Home-page: https://github.com/galaxyproject/galaxy
Author: Galaxy Project and Community
Author-email: galaxy-committers@lists.galaxyproject.org
License: MIT
Requires-Python: >=3.8
Description-Content-Type: text/x-rst
License-File: LICENSE
Requires-Dist: bleach
Requires-Dist: boltons
Requires-Dist: docutils!=0.17,!=0.17.1
Requires-Dist: importlib-resources>=5.10.0; python_version < "3.12"
Requires-Dist: packaging
Requires-Dist: pyparsing>=3.0.0
Requires-Dist: PyYAML
Requires-Dist: requests
Requires-Dist: typing-extensions
Requires-Dist: zipstream-new
Provides-Extra: image-util
Requires-Dist: pillow; extra == "image-util"
Provides-Extra: jstree
Requires-Dist: dictobj; extra == "jstree"
Provides-Extra: template
Requires-Dist: CT3>=3.3.3; extra == "template"
Requires-Dist: fissix; python_version >= "3.13" and extra == "template"
Requires-Dist: future>=1.0.0; extra == "template"
Provides-Extra: config-template
Requires-Dist: galaxy-tool-util-models; extra == "config-template"
Requires-Dist: Jinja2; extra == "config-template"
Requires-Dist: pydantic>=2.7.4; extra == "config-template"
Provides-Extra: test
Requires-Dist: pytest; extra == "test"
Requires-Dist: pytest-httpserver; extra == "test"
Requires-Dist: responses; extra == "test"
Requires-Dist: Werkzeug; extra == "test"
Dynamic: license-file

.. image:: https://badge.fury.io/py/galaxy-util.svg
   :target: https://pypi.org/project/galaxy-util/

Overview
--------

The Galaxy_ utilities module.

* Code: https://github.com/galaxyproject/galaxy

.. _Galaxy: http://galaxyproject.org/

././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787384.0 galaxy_util-26.0.1/README.rst0000644000175100017510000000036215211124270015253 0ustar00runnerrunner

.. image:: https://badge.fury.io/py/galaxy-util.svg
   :target: https://pypi.org/project/galaxy-util/

Overview
--------

The Galaxy_ utilities module.

* Code: https://github.com/galaxyproject/galaxy

.. _Galaxy: http://galaxyproject.org/
././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787384.0 galaxy_util-26.0.1/dev-requirements.txt0000644000175100017510000000006115211124270017620 0ustar00runnerrunner
# For dev
mypy

# For release
build
twine==6.2.0
././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1780787404.6762238 galaxy_util-26.0.1/galaxy/0000755000175100017510000000000015211124315015050 5ustar00runnerrunner././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787384.0 galaxy_util-26.0.1/galaxy/__init__.py0000644000175100017510000000013215211124270017155 0ustar00runnerrunner
from pkgutil import extend_path

__path__ = extend_path(__path__, __name__)  # noqa: F821
././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1780787404.6769211 galaxy_util-26.0.1/galaxy/exceptions/0000755000175100017510000000000015211124315017231 5ustar00runnerrunner././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/exceptions/__init__.py0000644000175100017510000002220115211124267021345 0ustar00runnerrunner
"""This module defines Galaxy's custom exceptions.

A Galaxy exception is an exception that extends :class:`MessageException` which
defines an HTTP status code (represented by the `status_code` attribute) and a
default error message.

New exceptions should be defined by adding an entry to `error_codes.json` in this
directory to define a default error message and a Galaxy "error code". A concrete
Python class should be added in this file defining an HTTP status code (as
`status_code`) and error code (`error_code`) object loaded dynamically from
`error_codes.json`.

Reflecting Galaxy's origins as a web application, these exceptions tend to be a
bit web-oriented.
However, this module is a dependency of modules and tools that have nothing to do with the web - keep this in mind when defining exception names and messages. """ from typing import Optional from .error_codes import ( error_codes_by_name, ErrorCode, ) class MessageException(Exception): """Most generic Galaxy exception - indicates merely that some exceptional condition happened.""" # status code to be set when used with API. status_code: int = 400 # Error code information embedded into API json responses. err_code: ErrorCode = error_codes_by_name["UNKNOWN"] def __init__(self, err_msg: Optional[str] = None, type="info", **extra_error_info): self.err_msg = err_msg or self.err_code.default_error_message self.type = type self.extra_error_info = extra_error_info @staticmethod def from_code(status_code, message): exception_class = MessageException if status_code == 404: exception_class = ObjectNotFound elif status_code // 100 == 5: exception_class = InternalServerError return exception_class(message) def __str__(self): return self.err_msg class ItemDeletionException(MessageException): pass class ObjectInvalid(Exception): """Accessed object store ID is invalid""" # Please keep the exceptions ordered by status code class AcceptedRetryLater(MessageException): status_code = 202 err_code = error_codes_by_name["ACCEPTED_RETRY_LATER"] retry_after: int def __init__(self, msg: Optional[str] = None, retry_after=60): super().__init__(msg) self.retry_after = retry_after class NoContentException(MessageException): status_code = 204 err_code = error_codes_by_name["NO_CONTENT_GENERIC"] class ActionInputError(MessageException): status_code = 400 err_code = error_codes_by_name["USER_REQUEST_INVALID_PARAMETER"] def __init__(self, err_msg, type="error"): super().__init__(err_msg, type) class DuplicatedSlugException(MessageException): status_code = 400 err_code = error_codes_by_name["USER_SLUG_DUPLICATE"] class DuplicatedIdentifierException(MessageException): status_code = 400 err_code =
error_codes_by_name["USER_IDENTIFIER_DUPLICATE"] class ObjectAttributeInvalidException(MessageException): status_code = 400 err_code = error_codes_by_name["USER_OBJECT_ATTRIBUTE_INVALID"] class ObjectAttributeMissingException(MessageException): status_code = 400 err_code = error_codes_by_name["USER_OBJECT_ATTRIBUTE_MISSING"] class MalformedId(MessageException): status_code = 400 err_code = error_codes_by_name["MALFORMED_ID"] class UserInvalidRunAsException(MessageException): status_code = 400 err_code = error_codes_by_name["USER_INVALID_RUN_AS"] class MalformedContents(MessageException): status_code = 400 err_code = error_codes_by_name["MALFORMED_CONTENTS"] class UnknownContentsType(MessageException): status_code = 400 err_code = error_codes_by_name["UNKNOWN_CONTENTS_TYPE"] class RequestParameterMissingException(MessageException): status_code = 400 err_code = error_codes_by_name["USER_REQUEST_MISSING_PARAMETER"] class ToolMetaParameterException(MessageException): status_code = 400 err_code = error_codes_by_name["USER_TOOL_META_PARAMETER_PROBLEM"] class ToolMissingException(MessageException): status_code = 400 err_code = error_codes_by_name["USER_TOOL_MISSING_PROBLEM"] def __init__(self, err_msg: Optional[str] = None, type="info", tool_id=None, **extra_error_info): super().__init__(err_msg, type, **extra_error_info) self.tool_id = tool_id class RequestParameterInvalidException(MessageException): status_code = 400 err_code = error_codes_by_name["USER_REQUEST_INVALID_PARAMETER"] class ToolInputsNotReadyException(MessageException): status_code = 400 err_code = error_codes_by_name["TOOL_INPUTS_NOT_READY"] class ToolInputsNotOKException(MessageException): def __init__( self, err_msg: Optional[str] = None, type="info", *, src: str, id: str, input_name: str, **extra_error_info ): super().__init__(err_msg, type, src=src, id=id, input_name=input_name, **extra_error_info) self.src = src self.id = id self.input_name = input_name status_code = 400 err_code =
error_codes_by_name["TOOL_INPUTS_NOT_OK"] class RealUserRequiredException(MessageException): status_code = 400 err_code = error_codes_by_name["REAL_USER_REQUIRED"] class AuthenticationFailed(MessageException): status_code = 401 err_code = error_codes_by_name["USER_AUTHENTICATION_FAILED"] class AuthenticationRequired(MessageException): status_code = 403 # TODO: as 401 and send WWW-Authenticate: ??? err_code = error_codes_by_name["USER_NO_API_KEY"] class ItemAccessibilityException(MessageException): status_code = 403 err_code = error_codes_by_name["USER_CANNOT_ACCESS_ITEM"] class ItemOwnershipException(MessageException): status_code = 403 err_code = error_codes_by_name["USER_DOES_NOT_OWN_ITEM"] class ItemImmutableException(MessageException): status_code = 403 err_code = error_codes_by_name["ITEM_IS_IMMUTABLE"] class ConfigDoesNotAllowException(MessageException): status_code = 403 err_code = error_codes_by_name["CONFIG_DOES_NOT_ALLOW"] class InsufficientPermissionsException(MessageException): status_code = 403 err_code = error_codes_by_name["INSUFFICIENT_PERMISSIONS"] class UserCannotRunAsException(MessageException): status_code = 403 err_code = error_codes_by_name["USER_CANNOT_RUN_AS"] class UserRequiredException(MessageException): status_code = 403 err_code = error_codes_by_name["USER_REQUIRED"] class AdminRequiredException(MessageException): status_code = 403 err_code = error_codes_by_name["ADMIN_REQUIRED"] class UserActivationRequiredException(MessageException): status_code = 403 err_code = error_codes_by_name["USER_ACTIVATION_REQUIRED"] class ItemAlreadyClaimedException(MessageException): status_code = 403 err_code = error_codes_by_name["ITEM_IS_CLAIMED"] class ObjectNotFound(MessageException): """Accessed object was not found""" status_code = 404 err_code = error_codes_by_name["USER_OBJECT_NOT_FOUND"] class Conflict(MessageException): status_code = 409 err_code = error_codes_by_name["CONFLICT"] class ItemMustBeClaimed(Conflict): err_code =
error_codes_by_name["MUST_CLAIM"] class DeprecatedMethod(MessageException): """ Method (or a particular form/arg signature) has been removed and won't be available later """ status_code = 410 err_code = error_codes_by_name["DEPRECATED_API_CALL"] class ConfigurationError(Exception): status_code = 500 err_code = error_codes_by_name["CONFIG_ERROR"] class InconsistentDatabase(MessageException): status_code = 500 err_code = error_codes_by_name["INCONSISTENT_DATABASE"] class InconsistentApplicationState(MessageException): status_code = 500 err_code = error_codes_by_name["INCONSISTENT_APPLICATION_STATE"] class InternalServerError(MessageException): status_code = 500 err_code = error_codes_by_name["INTERNAL_SERVER_ERROR"] class ToolExecutionError(MessageException): status_code = 500 err_code = error_codes_by_name["TOOL_EXECUTION_ERROR"] def __init__(self, err_msg, type="error", job=None): super().__init__(err_msg, type) self.job = job class NotImplemented(MessageException): status_code = 501 err_code = error_codes_by_name["NOT_IMPLEMENTED"] class InvalidFileFormatError(MessageException): status_code = 500 err_code = error_codes_by_name["INVALID_FILE_FORMAT"] class ReferenceDataError(MessageException): status_code = 500 err_code = error_codes_by_name["REFERENCE_DATA_ERROR"] class ServerNotConfiguredForRequest(MessageException): # A bit like ConfigDoesNotAllowException but it has nothing to do with the user of the # request being "forbidden". It just isn't configured. 
status_code = 501 err_code = error_codes_by_name["SERVER_NOT_CONFIGURED_FOR_REQUEST"] class UpstreamProxyError(MessageException): status_code = 502 err_code = error_codes_by_name["UPSTREAM_PROXY_ERROR"] class GatewayTimeoutException(MessageException): status_code = 504 err_code = error_codes_by_name["UPSTREAM_PROXY_TIMEOUT"] class HandlerAssignmentError(Exception): def __init__(self, msg=None, obj=None, **kwargs): super().__init__(msg) self.obj = obj ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/exceptions/error_codes.json0000644000175100017510000001550715211124267022450 0ustar00runnerrunner[ { "name": "UNKNOWN", "code": 0, "message": "Unknown error occurred while processing request." }, { "name": "ACCEPTED_RETRY_LATER", "code": 202001, "message": "Galaxy has accepted this request and is processing. A retry-after header indicates to the client when to retry." }, { "name": "NO_CONTENT_GENERIC", "code": 204001, "message": "Galaxy has no content to associate with this request." }, { "name": "USER_CANNOT_RUN_AS", "code": 400001, "message": "User does not have permissions to run jobs as another user." }, { "name": "USER_INVALID_RUN_AS", "code": 400002, "message": "Invalid run_as request - run_as user does not exist." }, { "name": "USER_INVALID_JSON", "code": 400003, "message": "Your request did not appear to be valid JSON, please consult the API documentation." }, { "name": "USER_OBJECT_ATTRIBUTE_INVALID", "code": 400004, "message": "Attempted to create or update object with invalid attribute value." }, { "name": "USER_OBJECT_ATTRIBUTE_MISSING", "code": 400005, "message": "Attempted to create or update object without required attribute." }, { "name": "USER_SLUG_DUPLICATE", "code": 400006, "message": "Slug must be unique per user." }, { "name": "USER_REQUEST_MISSING_PARAMETER", "code": 400007, "message": "Request is missing parameter required to complete desired action." 
}, { "name": "USER_REQUEST_INVALID_PARAMETER", "code": 400008, "message": "Request contained invalid parameter, action could not be completed." }, { "name": "MALFORMED_ID", "code": 400009, "message": "The id of the resource is malformed." }, { "name": "UNKNOWN_CONTENTS_TYPE", "code": 400010, "message": "The request contains unknown type of contents." }, { "name": "USER_IDENTIFIER_DUPLICATE", "code": 400011, "message": "Request contained a duplicated identifier that must be unique." }, { "name": "USER_TOOL_META_PARAMETER_PROBLEM", "code": 400012, "message": "Supplied incorrect or incompatible tool meta parameters." }, { "name": "MALFORMED_CONTENTS", "code": 400013, "message": "The contents of the request are malformed." }, { "name": "USER_TOOL_MISSING_PROBLEM", "code": 400014, "message": "Tool could not be found." }, { "name": "TOOL_INPUTS_NOT_READY", "code": 400015, "message": "Tool inputs not yet ready, try again later." }, { "name": "REAL_USER_REQUIRED", "code": 400016, "message": "Only real users can make this request." }, { "name": "TOOL_INPUTS_NOT_OK", "code": 400017, "message": "Tool inputs not in required OK state." }, { "name": "USER_AUTHENTICATION_FAILED", "code": 401001, "message": "Authentication failed, invalid credentials supplied." }, { "name": "USER_NO_API_KEY", "code": 403001, "message": "Authentication required for this request" }, { "name": "USER_CANNOT_ACCESS_ITEM", "code": 403002, "message": "User cannot access specified item." }, { "name": "USER_DOES_NOT_OWN_ITEM", "code": 403003, "message": "User does not own specified item." }, { "name": "ITEM_IS_IMMUTABLE", "code": 403003, "message": "The specified item is immutable." 
}, { "name": "CONFIG_DOES_NOT_ALLOW", "code": 403004, "message": "The configuration of this Galaxy instance does not allow that operation" }, { "name": "INSUFFICIENT_PERMISSIONS", "code": 403005, "message": "You don't have proper permissions to perform the requested operation" }, { "name": "ADMIN_REQUIRED", "code": 403006, "message": "Action requires admin account." }, { "name": "USER_ACTIVATION_REQUIRED", "code": 403007, "message": "Action requires account activation." }, { "name": "ITEM_IS_CLAIMED", "code": 403008, "message": "This item has already been claimed and cannot be re-claimed." }, { "name": "USER_REQUIRED", "code": 403008, "message": "Action requires user authentication." }, { "name": "USER_OBJECT_NOT_FOUND", "code": 404001, "message": "No such object found." }, { "name": "CONFLICT", "code": 409001, "message": "Database conflict prevented fulfilling the request." }, { "name": "MUST_CLAIM", "code": 409010, "message": "Private request must be claimed before use" }, { "name": "DEPRECATED_API_CALL", "code": 410001, "message": "This API method or call signature has been deprecated and is no longer available" }, { "name": "INTERNAL_SERVER_ERROR", "code": 500001, "message": "Internal server error." }, { "name": "INCONSISTENT_DATABASE", "code": 500002, "message": "Inconsistent database prevented fulfilling the request." }, { "name": "CONFIG_ERROR", "code": 500003, "message": "Error in a configuration file." }, { "name": "TOOL_EXECUTION_ERROR", "code": 500004, "message": "Tool execution failed due to an internal server error." }, { "name": "INVALID_FILE_FORMAT", "code": 500005, "message": "File format not supported for this operation." }, { "name": "REFERENCE_DATA_ERROR", "code": 500006, "message": "Reference data required for program execution failed to load." }, { "name": "INCONSISTENT_APPLICATION_STATE", "code": 500007, "message": "Inconsistent application state (likely not dbms related) prevented fulfilling the request." 
}, { "name": "NOT_IMPLEMENTED", "code": 501001, "message": "Method is not implemented." }, { "name": "SERVER_NOT_CONFIGURED_FOR_REQUEST", "code": 501002, "message": "Server not configured for the request. The Galaxy admin may be able to resolve the problem by installing additional dependencies or setting up new infrastructure." }, { "name": "UPSTREAM_PROXY_ERROR", "code": 502001, "message": "An error occurred while proxying a request to an upstream server." }, { "name": "UPSTREAM_PROXY_TIMEOUT", "code": 504001, "message": "The upstream server did not respond in time." } ] ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/exceptions/error_codes.py0000644000175100017510000000355615211124267022130 0ustar00runnerrunner"""Defines the :class:`ErrorCode` class and instantiates concrete objects from JSON. See the file error_codes.json for actual error code descriptions. """ from json import loads from typing import Dict from galaxy.util.resources import resource_string # Error codes are provided as a convenience to Galaxy API clients, but at this # time they do not represent part of the more stable interface. They can change # without warning between releases. UNKNOWN_ERROR_MESSAGE = "Unknown error occurred while processing request."
class ErrorCode: """Small class allowing object representation for error descriptions loaded from JSON.""" def __init__(self, code: int, default_error_message: str): """Construct a :class:`ErrorCode` from supplied integer and error message.""" self.code = code self.default_error_message = default_error_message or UNKNOWN_ERROR_MESSAGE def __str__(self): """Return the error code message.""" return str(self.default_error_message) def __repr__(self): """Return object representation of this error code.""" return f"ErrorCode[code={self.code},message={str(self.default_error_message)}]" def __int__(self): """Return the error code integer.""" return self.code def _from_dict(entry): """Build a :class:`ErrorCode` object from a JSON entry.""" name = entry.get("name") code = entry.get("code") message = entry.get("message") return (name, ErrorCode(code, message)) error_codes_json = resource_string(__name__, "error_codes.json") error_codes_by_name: Dict[str, ErrorCode] = {} error_codes_by_int_code: Dict[int, ErrorCode] = {} for entry in loads(error_codes_json): name, error_code_obj = _from_dict(entry) globals()[name] = error_code_obj error_codes_by_name[name] = error_code_obj error_codes_by_int_code[error_code_obj.code] = error_code_obj ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/exceptions/utils.py0000644000175100017510000000573415211124267020762 0ustar00runnerrunnerfrom typing import ( TYPE_CHECKING, Union, ) from galaxy.exceptions import ( error_codes, MessageException, RequestParameterInvalidException, RequestParameterMissingException, ) from galaxy.util.json import safe_dumps if TYPE_CHECKING: from fastapi.exceptions import RequestValidationError from pydantic import ValidationError def validation_error_to_message_exception(e: Union["ValidationError", "RequestValidationError"]) -> MessageException: invalid_found = False missing_found = False messages = [] clean_validation_errors = [] for error in 
e.errors(): messages.append(f"{error['msg']} in {error['loc']}") if error["type"] == "message_exception" and "ctx" in error: return error["ctx"]["exception"] if error["type"] == "missing" or error["type"] == "type_error.none.not_allowed": missing_found = True elif error["type"].startswith("type_error"): invalid_found = True # ctx contains data that can't be serialized, like exception instances error.pop("ctx", None) try: clean_validation_errors.append(safe_dumps(error)) except TypeError: pass if missing_found and not invalid_found: return RequestParameterMissingException("\n".join(messages), validation_errors=clean_validation_errors) else: return RequestParameterInvalidException("\n".join(messages), validation_errors=clean_validation_errors) def api_error_to_dict(**kwds): UNKNOWN_ERROR_CODE = error_codes.error_codes_by_name["UNKNOWN"] exception = kwds.get("exception", None) if exception: # If we are passed a MessageException use err_msg. default_error_code = getattr(exception, "err_code", UNKNOWN_ERROR_CODE) default_error_message = getattr(exception, "err_msg", default_error_code.default_error_message) extra_error_info = getattr(exception, "extra_error_info", {}) if not isinstance(extra_error_info, dict): extra_error_info = {} else: default_error_message = "Error processing API request." default_error_code = UNKNOWN_ERROR_CODE extra_error_info = {} traceback_string = kwds.get("traceback", "No traceback available.") err_msg = kwds.get("err_msg", default_error_message) error_code_object = kwds.get("err_code", default_error_code) try: error_code = error_code_object.code except AttributeError: # Some sort of bad error code sent in, logic failure on part of # Galaxy developer. error_code = UNKNOWN_ERROR_CODE.code # Would prefer the terminology of error_code and error_message, but # err_msg used a good number of places already. Might as well not change # it? 
error_response = dict(err_msg=err_msg, err_code=error_code, **extra_error_info) if kwds.get("debug"): # TODO: Should admins get to see traceback as well? error_response["traceback"] = traceback_string return error_response ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787384.0 galaxy_util-26.0.1/galaxy/py.typed0000644000175100017510000000000015211124270016535 0ustar00runnerrunner././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1780787404.6878254 galaxy_util-26.0.1/galaxy/util/0000755000175100017510000000000015211124315016025 5ustar00runnerrunner././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/__init__.py0000644000175100017510000020167215211124267020154 0ustar00runnerrunner""" Utility functions used systemwide. """ import binascii import codecs import collections import errno import importlib import itertools import json import os import random import re import shlex import shutil import smtplib import stat import string import sys import tempfile import textwrap import threading import time import unicodedata import uuid import xml.dom.minidom from datetime import ( datetime, timezone, ) from decimal import Decimal from email.message import EmailMessage from hashlib import md5 from os.path import relpath from pathlib import Path from typing import ( Any, cast, Dict, Iterable, Iterator, List, Mapping, Optional, overload, Tuple, TYPE_CHECKING, TypeVar, Union, ) from urllib.parse import ( quote, urlencode, urlparse, urlsplit, urlunsplit, ) from boltons.iterutils import ( default_enter, remap, ) from requests.adapters import HTTPAdapter from requests.packages.urllib3.util.retry import Retry # type: ignore[import-untyped, unused-ignore] from typing_extensions import ( Literal, Self, ) try: import grp except ImportError: # For Pulsar on Windows (which does not use the function that uses grp) grp = None # type: 
ignore[assignment] LXML_AVAILABLE = True try: from lxml import etree from lxml.etree import DocumentInvalid # lxml.etree.Element is a function that returns a new instance of the # lxml.etree._Element class. This class doesn't have a proper __init__() # method, so we can add a __new__() constructor that mimics the # xml.etree.ElementTree.Element initialization. class Element(etree._Element): def __new__(cls, tag, attrib={}, **extra) -> Self: # noqa: B006 return cast(Self, etree.Element(tag, attrib, **extra)) def __iter__(self) -> Iterator[Self]: # type: ignore[override] return cast(Iterator[Self], super().__iter__()) def find(self, path: str, namespaces: Optional[Mapping[str, str]] = None) -> Union[Self, None]: ret = super().find(path, namespaces) if ret is not None: return cast(Self, ret) else: return None def findall(self, path: str, namespaces: Optional[Mapping[str, str]] = None) -> List[Self]: # type: ignore[override] return cast(List[Self], super().findall(path, namespaces)) def iterfind(self, path: str, namespaces: Optional[Mapping[str, str]] = None) -> Iterator[Self]: return cast(Iterator[Self], super().iterfind(path, namespaces)) def SubElement(parent: Element, tag: str, attrib: Optional[Dict[str, str]] = None, **extra) -> Element: return cast(Element, etree.SubElement(parent, tag, attrib, **extra)) # lxml.etree.ElementTree is a function that returns a new instance of the # lxml.etree._ElementTree class. This class doesn't have a proper __init__() # method, so we can add a __new__() constructor that mimics the # xml.etree.ElementTree.ElementTree initialization.
class ElementTree(etree._ElementTree): def __new__(cls, element=None, file=None) -> Self: return cast(Self, etree.ElementTree(element, file=file)) def getroot(self) -> Element: return cast(Element, super().getroot()) def XML(text: Union[str, bytes]) -> Element: return cast(Element, etree.XML(text)) except ImportError: LXML_AVAILABLE = False import xml.etree.ElementTree as etree # type: ignore[no-redef] from xml.etree.ElementTree import ( # type: ignore[assignment] # noqa: F401 Element, ElementTree, XML, ) class DocumentInvalid(Exception): # type: ignore[no-redef] pass from . import requests from .custom_logging import get_logger from .inflection import Inflector from .path import ( # noqa: F401 safe_contains, safe_makedirs, safe_relpath, StrPath, ) from .rst_to_html import rst_to_html # noqa: F401 try: shlex_join = shlex.join # type: ignore[attr-defined, unused-ignore] except AttributeError: # Python < 3.8 def shlex_join(split_command): return " ".join(map(shlex.quote, split_command)) if TYPE_CHECKING: from galaxy.util.resources import Traversable inflector = Inflector() log = get_logger(__name__) _lock = threading.RLock() namedtuple = collections.namedtuple CHUNK_SIZE = 65536 # 64k DATABASE_MAX_STRING_SIZE = 32768 DATABASE_MAX_STRING_SIZE_PRETTY = "32K" DEFAULT_SOCKET_TIMEOUT = 600 gzip_magic = b"\x1f\x8b" bz2_magic = b"BZh" xz_magic = b"\xfd7zXZ\x00" DEFAULT_ENCODING = os.environ.get("GALAXY_DEFAULT_ENCODING", "utf-8") NULL_CHAR = b"\x00" BINARY_CHARS = [NULL_CHAR] FILENAME_VALID_CHARS = ".,^_-()[]0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ" RW_R__R__ = stat.S_IRUSR | stat.S_IWUSR | stat.S_IRGRP | stat.S_IROTH RWXR_XR_X = stat.S_IRWXU | stat.S_IRGRP | stat.S_IXGRP | stat.S_IROTH | stat.S_IXOTH RWXRWXRWX = stat.S_IRWXU | stat.S_IRWXG | stat.S_IRWXO defaultdict = collections.defaultdict UNKNOWN = "unknown" DOI_MAX_LENGTH = 200 # This is a reasonable limit. The DOI spec does not set a limit. 
def str_removeprefix(s: str, prefix: str): """ str.removeprefix() equivalent for Python < 3.9 """ if sys.version_info >= (3, 9): return s.removeprefix(prefix) elif s.startswith(prefix): return s[len(prefix) :] else: return s @overload def remove_protocol_from_url(url: None) -> None: ... @overload def remove_protocol_from_url(url: str) -> str: ... def remove_protocol_from_url(url): """Return the supplied URL with its protocol prefix (http://, https://, etc.) and any trailing slashes stripped; None is returned unchanged. """ if url is None: return url # We have a URL if url.find("://") > 0: new_url = url.split("://")[1] else: new_url = url return new_url.rstrip("/") def is_binary(value): """ File is binary if it contains a null-byte by default (e.g. behavior of grep, etc.). This may fail for utf-16 files, but so would ASCII encoding. >>> is_binary( string.printable ) False >>> is_binary( b'\\xce\\x94' ) False >>> is_binary( b'\\x00' ) True """ value = smart_str(value) for binary_char in BINARY_CHARS: if binary_char in value: return True return False def is_uuid(value): """ This method returns True if value is a UUID, otherwise False.
>>> is_uuid( "123e4567-e89b-12d3-a456-426655440000" ) True >>> is_uuid( "0x3242340298902834" ) False """ uuid_re = re.compile("[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}") if re.match(uuid_re, str(value)): return True else: return False def is_valid_uuid_v4(uuid_str: str) -> bool: """Check if a string is a valid UUID v4.""" try: u = uuid.UUID(uuid_str) return u.version == 4 except ValueError: return False def directory_hash_id(id): """ >>> directory_hash_id( 100 ) ['000'] >>> directory_hash_id( "90000" ) ['090'] >>> directory_hash_id("777777777") ['000', '777', '777'] >>> directory_hash_id("135ee48a-4f51-470c-ae2f-ce8bd78799e6") ['1', '3', '5'] """ s = str(id) len_s = len(s) # Shortcut -- ids 0-999 go under ../000/ if len_s < 4: return ["000"] if not is_uuid(s): # Pad with zeros until a multiple of three padded = ((3 - len_s % 3) * "0") + s # Drop the last three digits -- 1000 files per directory padded = padded[:-3] # Break into chunks of three return [padded[i * 3 : (i + 1) * 3] for i in range(len(padded) // 3)] else: # assume it is a UUID return list(iter(s[0:3])) def get_charset_from_http_headers(headers, default=None): rval = headers.get("content-type", None) if rval and "charset=" in rval: rval = rval.split("charset=")[-1].split(";")[0].strip() if rval: return rval return default def synchronized(func): """This wrapper will serialize access to 'func' to a single thread. 
Use it as a decorator.""" def caller(*params, **kparams): _lock.acquire(True) # Wait try: return func(*params, **kparams) finally: _lock.release() return caller def iter_start_of_line(fh, chunk_size=None): """Iterate over fh and call readline(chunk_size).""" while True: data = fh.readline(chunk_size) if not data: break if not data.endswith("\n"): # Discard the rest of the line without reading it all into memory while True: line_rest = fh.readline(CHUNK_SIZE) if not line_rest or line_rest.endswith("\n"): break yield data def file_reader(fp, chunk_size=CHUNK_SIZE): """This generator yields the open file object in chunks (default 64k).""" while True: data = fp.read(chunk_size) if not data: break yield data ItemType = TypeVar("ItemType") def chunk_iterable(it: Iterable[ItemType], size: int = 1000) -> Iterator[Tuple[ItemType, ...]]: """ Break an iterable into chunks of ``size`` elements. >>> list(chunk_iterable([1, 2, 3, 4, 5, 6, 7], 3)) [(1, 2, 3), (4, 5, 6), (7,)] """ it = iter(it) while True: p = tuple(itertools.islice(it, size)) if not p: break yield p def unique_id(KEY_SIZE=128): """ Generates a unique id >>> ids = [ unique_id() for i in range(1000) ] >>> len(set(ids)) 1000 """ random_bits = str(random.getrandbits(KEY_SIZE)).encode("UTF-8") return md5(random_bits).hexdigest() def parse_xml( fname: Union[StrPath, "Traversable"], strip_whitespace: bool = True, remove_comments: bool = True, schemafname: Union[StrPath, None] = None, ) -> ElementTree: """Returns a parsed xml tree""" parser = None schema = None if remove_comments and LXML_AVAILABLE: # If using stdlib etree comments are always removed, # but lxml doesn't do this by default parser = etree.XMLParser(remove_comments=remove_comments) if LXML_AVAILABLE and schemafname: with open(str(schemafname), "rb") as schema_file: schema_root = etree.XML(schema_file.read()) schema = etree.XMLSchema(schema_root) source = Path(fname) if isinstance(fname, (str, os.PathLike)) else fname try: with source.open("rb") as f: tree
= cast(ElementTree, etree.parse(f, parser=parser)) root = tree.getroot() if strip_whitespace: for elem in root.iter("*"): if elem.text is not None: elem.text = elem.text.strip() if elem.tail is not None: elem.tail = elem.tail.strip() if schema: schema.assertValid(tree) except etree.ParseError: log.exception("Error parsing file %s", fname) raise except DocumentInvalid: log.exception("Validation of file %s failed", fname) raise return tree def parse_xml_string(xml_string: str, strip_whitespace: bool = True) -> Element: try: elem = XML(xml_string) except ValueError as e: if "strings with encoding declaration are not supported" in unicodify(e): # This happens with lxml for a string that starts with an XML declaration, e.g. `<?xml version="1.0" encoding="utf-8"?>` elem = XML(xml_string.encode("utf-8")) else: raise e if strip_whitespace: for sub_elem in elem.iter("*"): if sub_elem.text is not None: sub_elem.text = sub_elem.text.strip() if sub_elem.tail is not None: sub_elem.tail = sub_elem.tail.strip() return elem def parse_xml_string_to_etree(xml_string: str, strip_whitespace: bool = True) -> ElementTree: return ElementTree(parse_xml_string(xml_string=xml_string, strip_whitespace=strip_whitespace)) def xml_to_string(elem: Optional[Element], pretty: bool = False) -> str: """ Returns a string from an xml tree.
""" if elem is None: return "" try: xml_str = etree.tostring(elem, encoding="unicode") except TypeError as e: # we assume this is a comment if hasattr(elem, "text"): return f"<!-- {elem.text} -->\n" else: raise e if xml_str and pretty: pretty_string = xml.dom.minidom.parseString(xml_str).toprettyxml(indent=" ") return "\n".join(line for line in pretty_string.split("\n") if not re.match(r"^[\s\\nb\']*$", line)) return xml_str def xml_element_compare(elem1, elem2): if not isinstance(elem1, dict): elem1 = xml_element_to_dict(elem1) if not isinstance(elem2, dict): elem2 = xml_element_to_dict(elem2) return elem1 == elem2 def xml_element_list_compare(elem_list1, elem_list2): return [xml_element_to_dict(elem) for elem in elem_list1] == [xml_element_to_dict(elem) for elem in elem_list2] def xml_element_to_dict(elem): rval = {} if elem.attrib: rval[elem.tag] = {} else: rval[elem.tag] = None sub_elems = list(elem) if sub_elems: sub_elem_dict = {} for sub_sub_elem_dict in map(xml_element_to_dict, sub_elems): for key, value in sub_sub_elem_dict.items(): if key not in sub_elem_dict: sub_elem_dict[key] = [] sub_elem_dict[key].append(value) for key, value in sub_elem_dict.items(): if len(value) == 1: rval[elem.tag][key] = value[0] else: rval[elem.tag][key] = value if elem.attrib: for key, value in elem.attrib.items(): rval[elem.tag][f"@{key}"] = value if elem.text: text = elem.text.strip() if text and sub_elems or elem.attrib: rval[elem.tag]["#text"] = text else: rval[elem.tag] = text return rval def pretty_print_xml(elem, level=0): pad = " " i = "\n" + level * pad if len(elem): if not elem.text or not elem.text.strip(): elem.text = i + pad + pad if not elem.tail or not elem.tail.strip(): elem.tail = i for e in elem: pretty_print_xml(e, level + 1) if not elem.tail or not elem.tail.strip(): elem.tail = i else: if level and (not elem.tail or not elem.tail.strip()): elem.tail = i + pad return elem def get_file_size(value, default=None): try: # try built-in return os.path.getsize(value) except
Exception: try: # try built-in one name attribute return os.path.getsize(value.name) except Exception: try: # try tell() of end of object offset = value.tell() value.seek(0, 2) rval = value.tell() value.seek(offset) return rval except Exception: # return default value return default def shrink_stream_by_size( value, size, join_by=b"..", left_larger=True, beginning_on_size_error=False, end_on_size_error=False ): """ Shrinks bytes read from `value` to `size`. `value` needs to implement tell/seek, so files need to be opened in binary mode. Returns unicode text with invalid characters replaced. """ rval = b"" join_by = smart_str(join_by) if get_file_size(value) > size: start = value.tell() len_join_by = len(join_by) min_size = len_join_by + 2 if size < min_size: if beginning_on_size_error: rval = value.read(size) value.seek(start) return rval elif end_on_size_error: value.seek(-size, 2) rval = value.read(size) value.seek(start) return rval raise ValueError(f"With the provided join_by value ({join_by}), the minimum size value is {min_size}.") left_index = right_index = int((size - len_join_by) / 2) if left_index + right_index + len_join_by < size: if left_larger: left_index += 1 else: right_index += 1 rval = value.read(left_index) + join_by value.seek(-right_index, 2) rval += value.read(right_index) else: while True: data = value.read(CHUNK_SIZE) if not data: break rval += data return unicodify(rval) def shrink_and_unicodify(stream): stream = unicodify(stream, strip_null=True) or "" if len(stream) > DATABASE_MAX_STRING_SIZE: stream = shrink_string_by_size( stream, DATABASE_MAX_STRING_SIZE, join_by="\n..\n", left_larger=True, beginning_on_size_error=True ) return stream def shrink_string_by_size( value, size, join_by="..", left_larger=True, beginning_on_size_error=False, end_on_size_error=False ): if len(value) > size: len_join_by = len(join_by) min_size = len_join_by + 2 if size < min_size: if beginning_on_size_error: return value[:size] elif end_on_size_error: return 
value[-size:] raise ValueError(f"With the provided join_by value ({join_by}), the minimum size value is {min_size}.") left_index = right_index = int((size - len_join_by) / 2) if left_index + right_index + len_join_by < size: if left_larger: left_index += 1 else: right_index += 1 value = f"{value[:left_index]}{join_by}{value[-right_index:]}" return value def pretty_print_time_interval(time=False, precise=False, utc=False): """ Get a datetime object or an int() Epoch timestamp and return a pretty string like 'an hour ago', 'Yesterday', '3 months ago', 'just now', etc credit: http://stackoverflow.com/questions/1551382/user-friendly-time-format-in-python """ if utc: now = datetime.utcnow() else: now = datetime.now() if isinstance(time, (int, float)): diff = now - datetime.fromtimestamp(time) elif isinstance(time, datetime): diff = now - time elif isinstance(time, str): try: time = datetime.strptime(time, "%Y-%m-%dT%H:%M:%S.%f") except ValueError: # MySQL may not support microseconds precision time = datetime.strptime(time, "%Y-%m-%dT%H:%M:%S") diff = now - time else: diff = now - now second_diff = diff.seconds day_diff = diff.days if day_diff < 0: return "" if precise: if day_diff == 0: if second_diff < 10: return "just now" if second_diff < 60: return str(second_diff) + " seconds ago" if second_diff < 120: return "a minute ago" if second_diff < 3600: return str(second_diff // 60) + " minutes ago" if second_diff < 7200: return "an hour ago" if second_diff < 86400: return str(second_diff // 3600) + " hours ago" if day_diff == 1: return "yesterday" if day_diff < 7: return str(day_diff) + " days ago" if day_diff < 31: return str(day_diff // 7) + " weeks ago" if day_diff < 365: return str(day_diff // 30) + " months ago" return str(day_diff // 365) + " years ago" else: if day_diff == 0: return "today" if day_diff == 1: return "yesterday" if day_diff < 7: return "less than a week" if day_diff < 31: return "less than a month" if day_diff < 365: return "less than a year" return "a 
few years ago" # characters that are valid valid_chars = set(string.ascii_letters + string.digits + " -=_.()/+*^,:?!") # characters that are allowed but need to be escaped mapped_chars = { ">": "__gt__", "<": "__lt__", "'": "__sq__", '"': "__dq__", "[": "__ob__", "]": "__cb__", "{": "__oc__", "}": "__cc__", "@": "__at__", "\n": "__cn__", "\r": "__cr__", "\t": "__tc__", "#": "__pd__", } def restore_text(text, character_map=mapped_chars): """Restores sanitized text""" if not text: return text for key, value in character_map.items(): text = text.replace(value, key) return text def sanitize_text(text, valid_characters=valid_chars, character_map=mapped_chars, invalid_character="X"): """ Restricts the characters that are allowed in text; accepts both strings and lists of strings; non-string entities will be cast to strings. """ if isinstance(text, list): return [ sanitize_text( x, valid_characters=valid_characters, character_map=character_map, invalid_character=invalid_character ) for x in text ] if not isinstance(text, str): text = smart_str(text) return _sanitize_text_helper( text, valid_characters=valid_characters, character_map=character_map, invalid_character=invalid_character ) def _sanitize_text_helper(text, valid_characters=valid_chars, character_map=mapped_chars, invalid_character="X"): """Restricts the characters that are allowed in a string""" out = [] for c in text: if c in valid_characters: out.append(c) elif c in character_map: out.append(character_map[c]) else: out.append(invalid_character) # makes debugging easier return "".join(out) def sanitize_lists_to_string(values, valid_characters=valid_chars, character_map=mapped_chars, invalid_character="X"): if isinstance(values, list): rval = [] for value in values: rval.append( sanitize_lists_to_string( value, valid_characters=valid_characters, character_map=character_map, invalid_character=invalid_character, ) ) values = ",".join(rval) else: values = sanitize_text( values, valid_characters=valid_characters, character_map=character_map, 
invalid_character=invalid_character ) return values def sanitize_param(value, valid_characters=valid_chars, character_map=mapped_chars, invalid_character="X"): """Clean incoming parameters (strings or lists)""" if isinstance(value, str): return sanitize_text( value, valid_characters=valid_characters, character_map=character_map, invalid_character=invalid_character ) elif isinstance(value, list): return [ sanitize_text( x, valid_characters=valid_characters, character_map=character_map, invalid_character=invalid_character ) for x in value ] else: raise Exception(f"Unknown parameter type ({type(value)})") valid_filename_chars = set(string.ascii_letters + string.digits + "_.") invalid_filenames = ["", ".", ".."] def sanitize_for_filename(text, default=None): """ Restricts the characters that are allowed in a filename portion; Returns default value or a unique id string if result is not a valid name. Method is overly aggressive to minimize possible complications, but a maximum length is not considered. """ out = [] for c in text: if c in valid_filename_chars: out.append(c) else: out.append("_") out = "".join(out) if out in invalid_filenames: if default is None: return sanitize_for_filename(str(unique_id())) return default return out def find_instance_nested(item, instances): """ Recursively find instances from lists, dicts, tuples. `instances` should be a tuple of valid instances. Returns a dictionary, where keys are the deepest key at which an instance has been found, and the value is the matched instance. 
""" matches = {} def visit(path, key, value): if isinstance(value, instances): if key not in matches: matches[key] = value return key, value def enter(path, key, value): if isinstance(value, instances): return None, False return default_enter(path, key, value) remap(item, visit, reraise_visit=False, enter=enter) return matches def mask_password_from_url(url): """ Masks out passwords from connection urls like the database connection in galaxy.ini >>> mask_password_from_url( 'sqlite+postgresql://user:password@localhost/' ) 'sqlite+postgresql://user:********@localhost/' >>> mask_password_from_url( 'amqp://user:amqp@localhost' ) 'amqp://user:********@localhost' >>> mask_password_from_url( 'amqp://localhost') 'amqp://localhost' """ split = urlsplit(url) if split.password: if url.count(split.password) == 1: url = url.replace(split.password, "********") else: # This can manipulate the input other than just masking password, # so the previous string replace method is preferred when the # password doesn't appear twice in the url split = split._replace( netloc=split.netloc.replace(f"{split.username}:{split.password}", f"{split.username}:********") ) url = urlunsplit(split) return url def ready_name_for_url(raw_name): """General method to convert a string (i.e. object name) to a URL-ready slug. >>> ready_name_for_url( "My Cool Object" ) 'My-Cool-Object' >>> ready_name_for_url( "!My Cool Object!" ) 'My-Cool-Object' >>> ready_name_for_url( "Hello₩◎ґʟⅾ" ) 'Hello' """ # Replace whitespace with '-' slug_base = re.sub(r"\s+", "-", raw_name) # Remove all non-alphanumeric characters. slug_base = re.sub(r"[^a-zA-Z0-9\-]", "", slug_base) # Remove trailing '-'. 
if slug_base.endswith("-"): slug_base = slug_base[:-1] return slug_base def which(file: str) -> Optional[str]: # http://stackoverflow.com/questions/5226958/which-equivalent-function-in-python for path in os.environ["PATH"].split(":"): if os.path.exists(path + "/" + file): return path + "/" + file return None def in_directory(file, directory, local_path_module=os.path): """ Return true, if the common prefix of both is equal to directory e.g. /a/b/c/d.rst and directory is /a/b, the common prefix is /a/b. This function isn't used exclusively for security checks, but if it is used for such checks it is assumed that ``directory`` is a "trusted" path - supplied by Galaxy or by the admin and ``file`` is something generated by a tool, configuration, external web server, or user supplied input. local_path_module is used by Pulsar to check Windows paths while running on a POSIX-like system. """ if local_path_module != os.path: _safe_contains = importlib.import_module(f"galaxy.util.path.{local_path_module.__name__}").safe_contains else: directory = os.path.realpath(directory) _safe_contains = safe_contains return _safe_contains(directory, file) def merge_sorted_iterables(operator, *iterables): """ >>> operator = lambda x: x >>> list( merge_sorted_iterables( operator, [1,2,3], [4,5] ) ) [1, 2, 3, 4, 5] >>> list( merge_sorted_iterables( operator, [4, 5], [1,2,3] ) ) [1, 2, 3, 4, 5] >>> list( merge_sorted_iterables( operator, [1, 4, 5], [2], [3] ) ) [1, 2, 3, 4, 5] """ first_iterable = iterables[0] if len(iterables) == 1: yield from first_iterable else: yield from __merge_two_sorted_iterables( operator, iter(first_iterable), merge_sorted_iterables(operator, *iterables[1:]) ) def __merge_two_sorted_iterables(operator, iterable1, iterable2): unset = object() continue_merge = True next_1 = unset next_2 = unset while continue_merge: try: if next_1 is unset: next_1 = next(iterable1) if next_2 is unset: next_2 = next(iterable2) if operator(next_2) < operator(next_1): yield next_2 
next_2 = unset else: yield next_1 next_1 = unset except StopIteration: continue_merge = False if next_1 is not unset: yield next_1 if next_2 is not unset: yield next_2 yield from iterable1 yield from iterable2 class Params: """ Stores and 'sanitizes' parameters. Alphanumeric characters and the non-alphanumeric ones that are deemed safe are allowed to pass through (see L{valid_chars}). Some non-safe characters are escaped to safe forms, for example C{>} becomes C{__gt__} (see L{mapped_chars}). All other characters are replaced with C{X}. Operates on string or list values only (HTTP parameters). >>> values = { 'status':'on', 'symbols':[ 'alpha', '<>', '$rm&#!' ] } >>> par = Params(values) >>> par.status 'on' >>> par.value == None # missing attributes return None True >>> par.get('price', 0) 0 >>> par.symbols # replaces unknown symbols with X ['alpha', '__lt____gt__', 'XrmX__pd__!'] >>> sorted(par.flatten()) # flattening to a list [('status', 'on'), ('symbols', 'XrmX__pd__!'), ('symbols', '__lt____gt__'), ('symbols', 'alpha')] """ # is NEVER_SANITIZE required now that sanitizing for tool parameters can be controlled on a per parameter basis and occurs via InputValueWrappers? NEVER_SANITIZE = ["file_data", "url_paste", "URL", "filesystem_paths"] def __init__(self, params, sanitize=True): if sanitize: for key, value in params.items(): # sanitize check both ungrouped and grouped parameters by # name. Anything relying on NEVER_SANITIZE should be # changed to not require this and NEVER_SANITIZE should be # removed. 
if ( value is not None and key not in self.NEVER_SANITIZE and True not in [key.endswith(f"|{nonsanitize_parameter}") for nonsanitize_parameter in self.NEVER_SANITIZE] ): self.__dict__[key] = sanitize_param(value) else: self.__dict__[key] = value else: self.__dict__.update(params) def flatten(self): """ Creates a list of (key, value) tuples from the stored parameters, with one tuple per element for every value that is a list """ flat = [] for key, value in self.__dict__.items(): if isinstance(value, list): for v in value: flat.append((key, v)) else: flat.append((key, value)) return flat def __getattr__(self, name): """This is here to ensure that we get None for non existing parameters""" return None def get(self, key, default): return self.__dict__.get(key, default) def __str__(self): return f"{self.__dict__}" def __len__(self): return len(self.__dict__) def __iter__(self): return iter(self.__dict__) def update(self, values): self.__dict__.update(values) def xml_text(root, name=None, default=""): """Returns the text inside an element""" if name is not None: # Try attribute first val = root.get(name) if val: return val # Then try as element elem = root.find(name) else: elem = root if elem is not None and elem.text: text = "".join(elem.text.splitlines()) return text.strip() # No luck, return the default return default def parse_resource_parameters(resource_param_file): """Code shared between jobs and workflows for reading resource parameter configuration files. TODO: Allow YAML in addition to XML. 
""" resource_parameters = {} if os.path.exists(resource_param_file): resource_definitions = parse_xml(resource_param_file) resource_definitions_root = resource_definitions.getroot() for parameter_elem in resource_definitions_root.findall("param"): name = parameter_elem.get("name") resource_parameters[name] = etree.tostring(parameter_elem, encoding="unicode") return resource_parameters # asbool implementation pulled from PasteDeploy truthy = frozenset({"true", "yes", "on", "y", "t", "1"}) falsy = frozenset({"false", "no", "off", "n", "f", "0"}) def asbool(obj): if isinstance(obj, str): obj = obj.strip().lower() if obj in truthy: return True elif obj in falsy: return False else: raise ValueError(f"String is not true/false: {obj!r}") return bool(obj) def string_as_bool(string: Any) -> bool: if str(string).lower() in ("true", "yes", "on", "1"): return True else: return False def string_as_bool_or_none(string): """ Returns True, None or False based on the argument: True if passed True, 'True', 'Yes', or 'On' None if passed None, 'None', or 'Null' False otherwise Note: string comparison is case-insensitive so lowercase versions of those behave equivalently. """ string = str(string).lower() if string in ("true", "yes", "on"): return True elif string in ["none", "null"]: return None else: return False @overload def listify(item: Union[None, Literal[False]], do_strip: bool = False) -> List: ... @overload def listify(item: str, do_strip: bool = False) -> List[str]: ... @overload def listify(item: Union[List[ItemType], Tuple[ItemType, ...]], do_strip: bool = False) -> List[ItemType]: ... # Unfortunately we cannot use ItemType .. -> List[ItemType] in the next overload # because then that would also match Union types. @overload def listify(item: Any, do_strip: bool = False) -> List: ... def listify(item: Any, do_strip: bool = False) -> List: """ Make a single item a single item list. If *item* is a string, it is split on comma (``,``) characters to produce the list. 
Optionally, if *do_strip* is true, any extra whitespace around the split items is stripped. If *item* is a list it is returned unchanged. If *item* is a tuple, it is converted to a list and returned. If *item* evaluates to False, an empty list is returned. :type item: object :param item: object to make a list from :type do_strip: bool :param do_strip: strip whitespaces from around split items, if set to ``True`` :rtype: list :returns: The input as a list """ if not item: return [] elif isinstance(item, list): return item elif isinstance(item, tuple): return list(item) elif isinstance(item, str) and item.count(","): if do_strip: return [token.strip() for token in item.split(",")] else: return item.split(",") else: return [item] def commaify(amount): orig = amount new = re.sub(r"^(-?\d+)(\d{3})", r"\g<1>,\g<2>", amount) if orig == new: return new else: return commaify(new) @overload def unicodify( value: Literal[None], encoding: str = DEFAULT_ENCODING, error: str = "replace", strip_null: bool = False, log_exception: bool = True, ) -> None: ... @overload def unicodify( value: Any, encoding: str = DEFAULT_ENCODING, error: str = "replace", strip_null: bool = False, log_exception: bool = True, ) -> str: ... def unicodify( value: Any, encoding: str = DEFAULT_ENCODING, error: str = "replace", strip_null: bool = False, log_exception: bool = True, ) -> Optional[str]: """ Returns a Unicode string or None. 
>>> assert unicodify(None) is None >>> assert unicodify('simple string') == 'simple string' >>> assert unicodify(3) == '3' >>> assert unicodify(bytearray([115, 116, 114, 196, 169, 195, 177, 103])) == 'strĩñg' >>> assert unicodify(Exception('strĩñg')) == 'strĩñg' >>> assert unicodify('cómplǐcḁtëd strĩñg') == 'cómplǐcḁtëd strĩñg' >>> s = 'cómplǐcḁtëd strĩñg'; assert unicodify(s) == s >>> s = 'lâtín strìñg'; assert unicodify(s.encode('latin-1'), 'latin-1') == s >>> s = 'lâtín strìñg'; assert unicodify(s.encode('latin-1')) == 'l\ufffdt\ufffdn str\ufffd\ufffdg' >>> s = 'lâtín strìñg'; assert unicodify(s.encode('latin-1'), error='ignore') == 'ltn strg' """ if value is None: return value try: if isinstance(value, bytearray): value = bytes(value) elif not isinstance(value, (str, bytes)): value = str(value) # Now value is an instance of bytes or str if not isinstance(value, str): value = str(value, encoding, error) except Exception as e: if log_exception: msg = f"Value '{repr(value)}' could not be coerced to Unicode: {type(e).__name__}('{e}')" log.exception(msg) raise if strip_null: return value.replace("\0", "") return value def filesystem_safe_string( s, max_len=255, truncation_chars="..", strip_leading_dot=True, invalid_chars=("/",), replacement_char="_" ): """ Strip unicode null chars, truncate at 255 characters. Optionally replace additional ``invalid_chars`` with `replacement_char` . Defaults are probably only safe on linux / osx. 
Needs further escaping if used in shell commands """ sanitized_string = unicodify(s, strip_null=True) if strip_leading_dot: sanitized_string = sanitized_string.lstrip(".") for invalid_char in invalid_chars: sanitized_string = sanitized_string.replace(invalid_char, replacement_char) if len(sanitized_string) > max_len: sanitized_string = sanitized_string[: max_len - len(truncation_chars)] sanitized_string = f"{sanitized_string}{truncation_chars}" return sanitized_string def smart_str(s, encoding=DEFAULT_ENCODING, strings_only=False, errors="strict"): """ Returns a bytestring version of 's', encoded as specified in 'encoding'. If strings_only is True, don't convert (some) non-string-like objects. Adapted from an older, simpler version of django.utils.encoding.smart_str. >>> assert smart_str(None) == b'None' >>> assert smart_str(None, strings_only=True) is None >>> assert smart_str(3) == b'3' >>> assert smart_str(3, strings_only=True) == 3 >>> s = b'a bytes string'; assert smart_str(s) == s >>> s = bytearray(b'a bytes string'); assert smart_str(s) == s >>> assert smart_str('a simple unicode string') == b'a simple unicode string' >>> assert smart_str('à strange ünicode ڃtring') == b'\\xc3\\xa0 strange \\xc3\\xbcnicode \\xda\\x83tring' >>> assert smart_str(b'\\xc3\\xa0n \\xc3\\xabncoded utf-8 string', encoding='latin-1') == b'\\xe0n \\xebncoded utf-8 string' >>> assert smart_str(bytearray(b'\\xc3\\xa0n \\xc3\\xabncoded utf-8 string'), encoding='latin-1') == b'\\xe0n \\xebncoded utf-8 string' """ if strings_only and isinstance(s, (type(None), int)): return s if not isinstance(s, (str, bytes, bytearray)): s = str(s) # Now s is an instance of str, bytes or bytearray if not isinstance(s, (bytes, bytearray)): return s.encode(encoding, errors) elif s and encoding != DEFAULT_ENCODING: return s.decode(DEFAULT_ENCODING, errors).encode(encoding, errors) else: return s def strip_control_characters(s): """Strip unicode control characters from a string.""" return "".join(c for c in 
unicodify(s) if unicodedata.category(c) != "Cc") def object_to_string(obj): return binascii.hexlify(obj) def string_to_object(s): return binascii.unhexlify(s) def clean_multiline_string(multiline_string, sep="\n"): """ Dedent, split, remove first and last empty lines, rejoin. """ multiline_string = textwrap.dedent(multiline_string) string_list = multiline_string.split(sep) if not string_list[0]: string_list = string_list[1:] if not string_list[-1]: string_list = string_list[:-1] return "\n".join(string_list) + "\n" class ParamsWithSpecs(collections.defaultdict): """ """ def __init__(self, specs=None, params=None): self.specs = specs or {} self.params = params or {} for name, value in self.params.items(): if name not in self.specs: self._param_unknown_error(name) if "map" in self.specs[name]: try: self.params[name] = self.specs[name]["map"](value) except Exception: self._param_map_error(name, value) if "valid" in self.specs[name]: if not self.specs[name]["valid"](value): self._param_vaildation_error(name, value) self.update(self.params) def __missing__(self, name): return self.specs[name]["default"] def __getattr__(self, name): return self[name] def _param_unknown_error(self, name): raise NotImplementedError() def _param_map_error(self, name, value): raise NotImplementedError() def _param_vaildation_error(self, name, value): raise NotImplementedError() def compare_urls(url1, url2, compare_scheme=True, compare_hostname=True, compare_path=True): url1 = urlparse(url1) url2 = urlparse(url2) if compare_scheme and url1.scheme and url2.scheme and url1.scheme != url2.scheme: return False if compare_hostname and url1.hostname and url2.hostname and url1.hostname != url2.hostname: return False if compare_path and url1.path and url2.path and url1.path != url2.path: return False return True def read_build_sites(filename, check_builds=True): """read db names to ucsc mappings from file, this file should probably be merged with the one above""" build_sites = [] try: for line in 
open(filename): try: if line[0:1] == "#": continue fields = line.replace("\r", "").replace("\n", "").split("\t") site_name = fields[0] site = fields[1] if check_builds: site_builds = fields[2].split(",") site_dict = {"name": site_name, "url": site, "builds": site_builds} else: site_dict = {"name": site_name, "url": site} build_sites.append(site_dict) except Exception: continue except Exception: log.error("ERROR: Unable to read builds for site file %s", filename) return build_sites def relativize_symlinks(path, start=None, followlinks=False): for root, _, files in os.walk(path, followlinks=followlinks): rel_start = None for file_name in files: symlink_file_name = os.path.join(root, file_name) if os.path.islink(symlink_file_name): symlink_target = os.readlink(symlink_file_name) if rel_start is None: if start is None: rel_start = root else: rel_start = start rel_path = relpath(symlink_target, rel_start) os.remove(symlink_file_name) os.symlink(rel_path, symlink_file_name) def stringify_dictionary_keys(in_dict): # returns a new dictionary # changes unicode keys into strings, only works on top level (does not recurse) # unicode keys are not valid for expansion into keyword arguments on method calls out_dict = {} for key, value in in_dict.items(): out_dict[str(key)] = value return out_dict def mkstemp_ln(src, prefix="mkstemp_ln_"): """ From tempfile._mkstemp_inner, generate a hard link in the same dir with a random name. Created so we can persist the underlying file of a NamedTemporaryFile upon its closure. 
""" dir = os.path.dirname(src) names = tempfile._get_candidate_names() for _ in range(tempfile.TMP_MAX): name = next(names) file = os.path.join(dir, prefix + name) try: os.link(src, file) return os.path.abspath(file) except OSError as e: if e.errno == errno.EEXIST: continue # try again raise raise OSError(errno.EEXIST, "No usable temporary file name found") def umask_fix_perms(path, umask, unmasked_perms, gid=None): """ umask-friendly permissions fixing """ perms = unmasked_perms & ~umask try: st = os.stat(path) except OSError: log.exception("Unable to set permissions or group on %s", path) return # fix modes if stat.S_IMODE(st.st_mode) != perms: try: os.chmod(path, perms) except Exception as e: log.warning( "Unable to honor umask (%s) for %s, tried to set: %s but mode remains %s, error was: %s", oct(umask), path, oct(perms), oct(stat.S_IMODE(st.st_mode)), e, ) # fix group if gid is not None and st.st_gid != gid: try: os.chown(path, -1, gid) except Exception as e: try: desired_group = grp.getgrgid(gid) current_group = grp.getgrgid(st.st_gid) except Exception: desired_group = gid current_group = st.st_gid log.warning( "Unable to honor primary group (%s) for %s, group remains %s, error was: %s", desired_group, path, current_group, e, ) def docstring_trim(docstring): """Trimming python doc strings. 
Taken from: http://www.python.org/dev/peps/pep-0257/""" if not docstring: return "" # Convert tabs to spaces (following the normal Python rules) # and split into a list of lines: lines = docstring.expandtabs().splitlines() # Determine minimum indentation (first line doesn't count): indent = sys.maxsize for line in lines[1:]: stripped = line.lstrip() if stripped: indent = min(indent, len(line) - len(stripped)) # Remove indentation (first line is special): trimmed = [lines[0].strip()] if indent < sys.maxsize: for line in lines[1:]: trimmed.append(line[indent:].rstrip()) # Strip off trailing and leading blank lines: while trimmed and not trimmed[-1]: trimmed.pop() while trimmed and not trimmed[0]: trimmed.pop(0) # Return a single string: return "\n".join(trimmed) def metric_prefix(number: Union[int, float], base: int) -> Tuple[float, str]: """ >>> metric_prefix(100, 1000) (100.0, '') >>> metric_prefix(999, 1000) (999.0, '') >>> metric_prefix(1000, 1000) (1.0, 'K') >>> metric_prefix(1001, 1000) (1.001, 'K') >>> metric_prefix(1000000, 1000) (1.0, 'M') >>> metric_prefix(1000**10, 1000) (1.0, 'Q') >>> metric_prefix(1000**11, 1000) (1000.0, 'Q') """ prefixes = ["", "K", "M", "G", "T", "P", "E", "Z", "Y", "R", "Q"] if number < 0: number = abs(number) sign = -1 else: sign = 1 for prefix in prefixes: if number < base: return sign * float(number), prefix number /= base else: return sign * float(number) * base, prefix def shorten_with_metric_prefix(amount: int) -> str: """ >>> shorten_with_metric_prefix(23000) '23K' >>> shorten_with_metric_prefix(2300000) '2.3M' >>> shorten_with_metric_prefix(23000000) '23M' >>> shorten_with_metric_prefix(1) '1' >>> shorten_with_metric_prefix(0) '0' >>> shorten_with_metric_prefix(100) '100' >>> shorten_with_metric_prefix(-100) '-100' """ m, prefix = metric_prefix(amount, 1000) m_str = str(int(m)) if m.is_integer() else f"{m:.1f}" exp = f"{m_str}{prefix}" if len(exp) <= len(str(amount)): return exp else: return str(amount) def nice_size(size: 
Union[float, int, str, Decimal]) -> str: """ Returns a readably formatted string with the size >>> nice_size(100) '100 bytes' >>> nice_size(10000) '9.8 KB' >>> nice_size(1000000) '976.6 KB' >>> nice_size(100000000) '95.4 MB' """ try: size = float(size) except ValueError: return "??? bytes" size, prefix = metric_prefix(size, 1024) if prefix == "": return f"{int(size)} bytes" else: return f"{size:.1f} {prefix}B" def size_to_bytes(size): """ Returns a number of bytes (as integer) if given a reasonably formatted string with the size >>> size_to_bytes('1024') 1024 >>> size_to_bytes('1.0') 1 >>> size_to_bytes('10 bytes') 10 >>> size_to_bytes('4k') 4096 >>> size_to_bytes('2.2 TB') 2418925581107 >>> size_to_bytes('.01 TB') 10995116277 >>> size_to_bytes('1.b') 1 >>> size_to_bytes('1.2E2k') 122880 """ # The following number regexp is based on https://stackoverflow.com/questions/385558/extract-float-double-value/385597#385597 size_re = re.compile(r"(?P<number>(\d+(\.\d*)?|\.\d+)(e[+-]?\d+)?)\s*(?P<multiple>[eptgmk]?(b|bytes?)?)?$") size_match = size_re.match(size.lower()) if size_match is None: raise ValueError(f"Could not parse string '{size}'") number = float(size_match.group("number")) multiple = size_match.group("multiple") if multiple == "" or multiple.startswith("b"): return int(number) elif multiple.startswith("k"): return int(number * 1024) elif multiple.startswith("m"): return int(number * 1024**2) elif multiple.startswith("g"): return int(number * 1024**3) elif multiple.startswith("t"): return int(number * 1024**4) elif multiple.startswith("p"): return int(number * 1024**5) elif multiple.startswith("e"): return int(number * 1024**6) else: raise ValueError(f"Unknown multiplier '{multiple}' in '{size}'") def send_mail(frm, to, subject, body, config, html=None, reply_to=None): """ Sends an email. 
:type frm: str :param frm: from address :type to: str :param to: to address :type subject: str :param subject: Subject line :type body: str :param body: Body text (should be plain text) :type config: object :param config: Galaxy configuration object :type html: str :param html: Alternative HTML representation of the body content. If provided will convert the message to a MIMEMultipart. (Default None) :type reply_to: str :param reply_to: Reply-to address (Default None) """ smtp_server = config.smtp_server if smtp_server and isinstance(smtp_server, str) and smtp_server.startswith("mock_emails_to_path://"): path = config.smtp_server[len("mock_emails_to_path://") :] email_dict = { "from": frm, "to": to, "subject": subject, "body": body, "html": html, "reply_to": reply_to, } email_json = json.to_json_string(email_dict) with open(path, "w") as f: f.write(email_json) return to = listify(to) msg = EmailMessage() msg.set_content(body) msg["To"] = ", ".join(to) msg["From"] = frm msg["Subject"] = subject if reply_to: msg["Reply-To"] = reply_to if config.smtp_server is None: log.error("Mail is not configured for this Galaxy instance.") log.info(msg) return if html: msg.add_alternative(html, subtype="html") smtp_ssl = asbool(getattr(config, "smtp_ssl", False)) if smtp_ssl: s = smtplib.SMTP_SSL(config.smtp_server) else: s = smtplib.SMTP(config.smtp_server) try: if not smtp_ssl: try: s.starttls() log.debug("Initiated SSL/TLS connection to SMTP server: %s", config.smtp_server) except RuntimeError as e: log.warning("SSL/TLS support is not available to your Python interpreter: %s", e) except smtplib.SMTPHeloError as e: log.error("The server didn't reply properly to the HELO greeting: %s", e) raise except smtplib.SMTPException as e: log.warning("The server does not support the STARTTLS extension: %s", e) if config.smtp_username and config.smtp_password: try: s.login(config.smtp_username, config.smtp_password) except smtplib.SMTPHeloError as e: log.error("The server didn't reply 
properly to the HELO greeting: %s", e) raise except smtplib.SMTPAuthenticationError as e: log.error("The server didn't accept the username/password combination: %s", e) raise except smtplib.SMTPException as e: log.error("No suitable authentication method was found: %s", e) raise s.send_message(msg) finally: s.quit() def force_symlink(source, link_name): try: os.symlink(source, link_name) except OSError as e: if e.errno == errno.EEXIST: os.remove(link_name) os.symlink(source, link_name) else: raise e def unlink(path_or_fd, ignore_errors=False): """Calls os.unlink on `path_or_fd`, and ignores FileNotFoundError if ignore_errors is True.""" try: os.unlink(path_or_fd) except FileNotFoundError: if ignore_errors: pass else: raise def move_merge(source, target): # when using shutil and moving a directory, if the target exists, # then the directory is placed inside of it # if the target doesn't exist, then the target is made into the directory # this makes it so that the target is always the target, and if it exists, # the source contents are moved into the target if os.path.isdir(source) and os.path.exists(target) and os.path.isdir(target): for name in os.listdir(source): move_merge(os.path.join(source, name), os.path.join(target, name)) else: return shutil.move(source, target) def safe_str_cmp(a, b): """safely compare two strings in a timing-attack-resistant manner""" if len(a) != len(b): return False rv = 0 for x, y in zip(a, b): rv |= ord(x) ^ ord(y) return rv == 0 # never load packages this way (won't work for installed packages), # but while we're working on packaging everything this can be a way to point # an installed Galaxy at a Galaxy root for things like tools. Ultimately # this all needs to be packaged, but we have some very old PRs working on this # that are pretty tricky and shouldn't slow current development. GALAXY_INCLUDES_ROOT = os.environ.get("GALAXY_INCLUDES_ROOT") # Don't use this directly, prefer method version that "works" with packaged Galaxy.
galaxy_root_path = Path(GALAXY_INCLUDES_ROOT) if GALAXY_INCLUDES_ROOT else Path(__file__).parent.parent.parent.parent def galaxy_directory() -> str: if in_packages() and not GALAXY_INCLUDES_ROOT: # This will work only when running pytest from /packages// cwd = Path.cwd() path = cwd.parent.parent else: path = galaxy_root_path return os.path.abspath(path) def in_packages() -> bool: galaxy_lib_path = Path(__file__).parent.parent.parent return galaxy_lib_path.name != "lib" def config_directories_from_setting(directories_setting, galaxy_root=galaxy_root_path): """ Parse the ``directories_setting`` into a list of relative or absolute filesystem paths that will be searched to discover plugins. :type galaxy_root: string :param galaxy_root: the root path of this galaxy installation :type directories_setting: string (default: None) :param directories_setting: the filesystem path (or paths) to search for plugins. Can be CSV string of paths. Will be treated as absolute if a path starts with '/', relative otherwise. :rtype: list of strings :returns: list of filesystem paths """ directories = [] if not directories_setting: return directories for directory in listify(directories_setting): directory = directory.strip() if not directory.startswith("/"): directory = os.path.join(galaxy_root, directory) if not os.path.exists(directory): log.warning("directory not found: %s", directory) continue directories.append(directory) return directories def parse_int(value, min_val=None, max_val=None, default=None, allow_none=False): try: value = int(value) if min_val is not None and value < min_val: return min_val if max_val is not None and value > max_val: return max_val return value except ValueError: if allow_none: if default is None or value == "None": return None if default: return default else: raise def parse_non_hex_float(s): r""" Parse string `s` into a float but throw a `ValueError` if the string is in the otherwise acceptable format `\d+e\d+` (e.g. 40000000000000e5.) 
This can be passed into `json.loads` to prevent a hex string in the above format from being incorrectly parsed as a float in scientific notation. >>> parse_non_hex_float( '123.4' ) 123.4 >>> parse_non_hex_float( '2.45e+3' ) 2450.0 >>> parse_non_hex_float( '2.45e-3' ) 0.00245 >>> parse_non_hex_float( '40000000000000e5' ) Traceback (most recent call last): ... ValueError: could not convert string to float: 40000000000000e5 """ f = float(s) # successfully parsed as float if here - check for format in original string if "e" in s and not ("+" in s or "-" in s): raise ValueError("could not convert string to float: " + s) return f def build_url(base_url, port=80, scheme="http", pathspec=None, params=None, doseq=False): if params is None: params = {} if pathspec is None: pathspec = [] parsed_url = urlparse(base_url) if scheme != "http": # the parse result is an immutable named tuple, so build a modified copy instead of assigning to .scheme parsed_url = parsed_url._replace(scheme=scheme) assert parsed_url.scheme in ("http", "https", "ftp"), f"Invalid URL scheme: {parsed_url.scheme}" if port != 80: url = "{}://{}:{}/{}".format(parsed_url.scheme, parsed_url.netloc.rstrip("/"), int(port), parsed_url.path) else: url = f"{parsed_url.scheme}://{parsed_url.netloc.rstrip('/')}/{parsed_url.path.lstrip('/')}" if len(pathspec) > 0: url = f"{url.rstrip('/')}/{'/'.join(pathspec)}" if parsed_url.query: for query_parameter in parsed_url.query.split("&"): key, value = query_parameter.split("=") params[key] = value if params: url += f"?{urlencode(params, doseq=doseq)}" return url def url_get(base_url, auth=None, pathspec=None, params=None, max_retries=5, backoff_factor=1): """Make contact with the uri provided and return any contents.""" full_url = build_url(base_url, pathspec=pathspec, params=params) s = requests.Session() retries = Retry(total=max_retries, backoff_factor=backoff_factor, status_forcelist=[429]) s.mount(base_url, HTTPAdapter(max_retries=retries)) response = s.get(full_url, auth=auth) response.raise_for_status() return response.text def is_url(uri, allow_list=None): """ Check if uri is (most
likely) an URL, more precisely the function checks if uri starts with a scheme from the allow list (defaults to "http://", "https://", "ftp://") >>> is_url('https://zenodo.org/record/4104428/files/UCSC-hg38-chr22-Coding-Exons.bed') True >>> is_url('file:///some/path') False >>> is_url('/some/path') False """ if allow_list is None: allow_list = ("http://", "https://", "ftp://") return any(uri.startswith(scheme) for scheme in allow_list) def check_github_api_response_rate_limit(response): if response.status_code == 403 and "API rate limit exceeded" in response.json()["message"]: # It can take tens of minutes before the rate limit window resets message = "GitHub API rate limit exceeded." rate_limit_reset_UTC_timestamp = response.headers.get("X-RateLimit-Reset") if rate_limit_reset_UTC_timestamp: rate_limit_reset_datetime = datetime.fromtimestamp(int(rate_limit_reset_UTC_timestamp), tz=timezone.utc) message += f" The rate limit window will reset at {rate_limit_reset_datetime.isoformat()}." raise Exception(message) def download_to_file(url, dest_file_path, timeout=30, chunk_size=2**20): """Download a URL to a file in chunks.""" with requests.get(url, timeout=timeout, stream=True) as r, open(dest_file_path, "wb") as f: for chunk in r.iter_content(chunk_size): if chunk: f.write(chunk) def stream_to_open_named_file( stream, fd, filename, source_encoding=None, source_error="strict", target_encoding=None, target_error="strict" ): """Writes a stream to the provided file descriptor, returns the file name. 
Closes file descriptor""" # signature and behavior are somewhat odd, due to backwards compatibility, but this can/should be done better CHUNK_SIZE = 1048576 try: codecs.lookup(target_encoding) except Exception: target_encoding = DEFAULT_ENCODING # utf-8 use_source_encoding = source_encoding is not None try: while True: chunk = stream.read(CHUNK_SIZE) if not chunk: break if use_source_encoding: # If a source encoding is given we use it to convert to the target encoding try: if not isinstance(chunk, str): chunk = chunk.decode(source_encoding, source_error) os.write(fd, chunk.encode(target_encoding, target_error)) except UnicodeDecodeError: use_source_encoding = False os.write(fd, chunk) else: # Compressed files must be encoded after they are uncompressed in the upload utility, # while binary files should not be encoded at all. if isinstance(chunk, str): chunk = chunk.encode(target_encoding, target_error) os.write(fd, chunk) finally: os.close(fd) return filename class classproperty: def __init__(self, f): self.f = f def __get__(self, obj, owner): return self.f(owner) class ExecutionTimer: def __init__(self) -> None: self.begin = time.time() def __str__(self) -> str: return f"({self.elapsed * 1000:0.3f} ms)" @property def elapsed(self) -> float: return time.time() - self.begin class StructuredExecutionTimer: def __init__(self, timer_id, template, **tags): self.begin = time.time() self.timer_id = timer_id self.template = template self.tags = tags def __str__(self): return self.to_str() def to_str(self, **kwd): if kwd: message = string.Template(self.template).safe_substitute(kwd) else: message = self.template log_message = message + f" ({self.elapsed * 1000:0.3f} ms)" return log_message @property def elapsed(self): return time.time() - self.begin def enum_values(enum_class): """ Return a list of member values of enumeration enum_class. Values are in member definition order.
""" return [value.value for value in enum_class.__members__.values()] def hex_to_lowercase_alphanum(hex_string: str) -> str: """ Convert a hexadecimal string encoding into a lowercase 36-base alphanumeric string using the characters a-z and 0-9 """ import numpy as np return np.base_repr(int(hex_string, 16), 36).lower() def lowercase_alphanum_to_hex(lowercase_alphanum: str) -> str: """ Convert a lowercase 36-base alphanumeric string encoding using the characters a-z and 0-9 to a hexadecimal string """ import numpy as np return np.base_repr(int(lowercase_alphanum, 36), 16).lower() def to_content_disposition(target: str) -> str: target = target.strip() filename, ext = os.path.splitext(target) character_limit = 255 - len(ext) sanitized_filename = "".join(c in FILENAME_VALID_CHARS and c or "_" for c in filename)[0:character_limit] + ext utf8_encoded_filename = quote(re.sub(r'[\/\\\?%*:|"<>]', "_", filename), safe="")[0:character_limit] + ext return f"attachment; filename=\"{sanitized_filename}\"; filename*=UTF-8''{utf8_encoded_filename}" def validate_doi(doi: str) -> bool: if len(doi) > DOI_MAX_LENGTH: return False prefix = "https://doi.org/|doi.org/|doi:" doi_prefix = r"10\.\d+" doi_suffix = r"\S+" doi_re = re.compile(f"^{prefix}{doi_prefix}/{doi_suffix}$") return bool(doi_re.match(doi)) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/aliaspickler.py0000644000175100017510000000155015211124267021051 0ustar00runnerrunnerimport pickle from io import StringIO class AliasUnpickler(pickle.Unpickler): def __init__(self, aliases, *args, **kw): pickle.Unpickler.__init__(self, *args, **kw) self.aliases = aliases def find_class(self, module, name): module, name = self.aliases.get((module, name), (module, name)) return pickle.Unpickler.find_class(self, module, name) class AliasPickleModule: def __init__(self, aliases): self.aliases = aliases def dump(self, obj, fileobj, protocol=0): return 
pickle.dump(obj, fileobj, protocol) def dumps(self, obj, protocol=0): return pickle.dumps(obj, protocol) def load(self, fileobj): return AliasUnpickler(self.aliases, fileobj).load() def loads(self, string): fileobj = StringIO(string) return AliasUnpickler(self.aliases, fileobj).load() ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/bool_expressions.py0000644000175100017510000001322215211124267022002 0ustar00runnerrunner"""Simple boolean expression parser and evaluator using pyparsing. Based on the example: https://github.com/pyparsing/pyparsing/blob/master/examples/simpleBool.py """ import logging from typing import ( Callable, Iterable, Optional, Set, ) from pyparsing import ( alphanums, CaselessKeyword, infix_notation, Keyword, opAssoc, ParseException, ParserElement, QuotedString, Word, ) log = logging.getLogger(__name__) ParserElement.enable_packrat() # Defines the allowed characters that form a valid token. # Tokens that don't match this format will raise an exception when found. DEFAULT_TOKEN_FORMAT = f"{alphanums}_-@." TRUE = Keyword("True") FALSE = Keyword("False") NOT_OP = CaselessKeyword("not") AND_OP = CaselessKeyword("and") OR_OP = CaselessKeyword("or") QUOTED_STRING = QuotedString("'") class TokenEvaluator: """Interface to evaluate a token and determine its boolean value.""" def evaluate(self, token: str) -> bool: """Returns the boolean representation of the given token according to some custom logic.""" raise NotImplementedError class BoolOperand: """Represents a boolean operand that has a label and a value. 
The value is determined by a custom TokenEvaluator.""" evaluator: TokenEvaluator def __init__(self, token): self.label = token[0] self.value = self.evaluator.evaluate(token[0]) def __bool__(self): return self.value def __str__(self): return self.label __repr__ = __str__ class BoolBinaryOperation: """Base representation of a boolean binary operation.""" reprsymbol: str evalop: Callable[[Iterable[object]], bool] def __init__(self, token): self.args = token[0][0::2] def __str__(self): sep = f" {self.reprsymbol} " return f"({sep.join(map(str, self.args))})" def __bool__(self): return self.evalop(bool(a) for a in self.args) __nonzero__ = __bool__ class BoolAnd(BoolBinaryOperation): """Represents the `AND` boolean operation.""" reprsymbol = "&" evalop = all # type: ignore[assignment] class BoolOr(BoolBinaryOperation): """Represents the `OR` boolean operation.""" reprsymbol = "|" evalop = any # type: ignore[assignment] class BoolNot: """Represents the `NOT` boolean operation.""" def __init__(self, token): self.arg = token[0][1] def __bool__(self): v = bool(self.arg) return not v def __str__(self): return f"~{self.arg}" __repr__ = __str__ class BooleanExpressionEvaluator: """Boolean logic parser that can evaluate an expression using a particular TokenEvaluator. Supports AND, OR and NOT operator including parentheses to override operator precedences. You can pass in different TokenEvaluator implementations to customize how the tokens (or variables) are converted to a boolean value when evaluating the expression.""" def __init__(self, evaluator: TokenEvaluator, token_format: Optional[str] = None) -> None: """Initializes the expression evaluator. :param evaluator: The custom TokenEvaluator used to transform any token into a boolean. :type evaluator: TokenEvaluator :param token_format: A string of all allowed characters used to form a valid token, defaults to None. 
The default value (None) will use DEFAULT_TOKEN_FORMAT which means the allowed characters are ``[A-Za-z0-9_-@.]``. :type token_format: Optional[str] """ action = BoolOperand action.evaluator = evaluator boolOperand = TRUE | FALSE | QUOTED_STRING | Word(token_format or DEFAULT_TOKEN_FORMAT) boolOperand.set_parse_action(action) self.boolExpr: ParserElement = infix_notation( boolOperand, [ (NOT_OP, 1, opAssoc.RIGHT, BoolNot), (AND_OP, 2, opAssoc.LEFT, BoolAnd), (OR_OP, 2, opAssoc.LEFT, BoolOr), ], ) def evaluate_expression(self, expr: str) -> bool: """Given an expression it gets evaluated to True or False using boolean logic.""" try: res = self.boolExpr.parse_string(expr, parse_all=True)[0] return bool(res) except ParseException as e: log.error(f"BooleanExpressionEvaluator unable to evaluate expression => {expr}", exc_info=e) raise e @classmethod def is_valid_expression(cls, expr: str) -> bool: """Tries to evaluate the given boolean expression and returns True if it is valid or False if it has syntax or grammatical errors.""" try: evaluator = BooleanExpressionEvaluator(ValidationOnlyTokenEvaluator()) evaluator.evaluate_expression(expr) return True except ParseException: return False class TokenContainedEvaluator(TokenEvaluator): """Implements the TokenEvaluator interface to determine if a token is contained in a particular set of tokens.""" def __init__(self, tokens: Set[str]) -> None: """Initializes the token evaluator with the set of tokens that will evaluate to `True`. :param tokens: The set of tokens that should be evaluated to True. :type tokens: Set[str] """ self.tokens = tokens or set() def evaluate(self, token: str) -> bool: return token in self.tokens class ValidationOnlyTokenEvaluator(TokenEvaluator): """Simple TokenEvaluator that always evaluates to True for valid tokens.
This is only useful for validation purposes, do NOT use it for real expression evaluations.""" def evaluate(self, token: str) -> bool: return True ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/bunch.py0000644000175100017510000000220415211124267017502 0ustar00runnerrunnerfrom .dynamic import HasDynamicProperties class Bunch(HasDynamicProperties): """ http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/52308 Often we want to just collect a bunch of stuff together, naming each item of the bunch; a dictionary's OK for that, but a small do-nothing class is even handier, and prettier to use. For new code, use dataclasses from the standard library instead. """ def __init__(self, **kwds): self.__dict__.update(kwds) def dict(self): return self.__dict__ def get(self, key, default=None): return self.__dict__.get(key, default) def __iter__(self): return iter(self.__dict__) def items(self): return self.__dict__.items() def keys(self): return self.__dict__.keys() def values(self): return self.__dict__.values() def __str__(self): return f"{self.__dict__}" def __bool__(self): return bool(self.__dict__) __nonzero__ = __bool__ def __setitem__(self, k, v): self.__dict__.__setitem__(k, v) def __contains__(self, item): return item in self.__dict__ ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/bytesize.py0000644000175100017510000000353115211124267020245 0ustar00runnerrunnerSUFFIX_TO_BYTES = { "KI": 1024, "MI": 1024**2, "GI": 1024**3, "TI": 1024**4, "PI": 1024**5, "EI": 1024**6, "K": 1000, "M": 1000**2, "G": 1000**3, "T": 1000**4, "P": 1000**5, "E": 1000**6, } class ByteSize: """Convert multiples of bytes to various units.""" def __init__(self, value): """ Represents a quantity of bytes. `value` may be an integer, in which case it is assumed to be bytes. If value is a string, it is parsed as bytes if no suffix (Mi, M, Gi, G ...) 
is found. >>> values = [128974848, '129e6', '129M', '123Mi' ] >>> [ByteSize(v).to_unit('M') for v in values] ['128M', '129M', '129M', '128M'] """ self.value = parse_bytesize(value) def to_unit(self, unit=None, as_string=True): """unit must be `None` or one of Ki,Mi,Gi,Ti,Pi,Ei,K,M,G,T,P.""" if unit is None: if as_string: return str(self.value) return self.value unit = unit.upper() new_value = int(self.value / SUFFIX_TO_BYTES[unit]) if not as_string: return new_value return f"{new_value}{unit}" def parse_bytesize(value): if isinstance(value, int) or isinstance(value, float): # Assume bytes return value value = value.upper() found_suffix = None for suffix in SUFFIX_TO_BYTES: if value.endswith(suffix): found_suffix = suffix break if found_suffix: value = value[: -len(found_suffix)] try: value = int(value) except ValueError: try: value = float(value) except ValueError: raise ValueError(f"{value} is not a valid integer or float value") if found_suffix: value = value * SUFFIX_TO_BYTES[found_suffix] return value ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/checkers.py0000644000175100017510000001503615211124267020201 0ustar00runnerrunnerimport bz2 import gzip import lzma import os import re import tarfile import zipfile from io import ( BytesIO, StringIO, ) from typing import ( Dict, IO, Tuple, ) from typing_extensions import Protocol from galaxy import util from galaxy.util.image_util import image_type HTML_CHECK_LINES = 100 CHUNK_SIZE = 2**15 # 32Kb HTML_REGEXPS = ( re.compile(r"]*HREF[^>]+>", re.I), re.compile(r"]*>", re.I), re.compile(r"]*>", re.I), re.compile(r"]*>", re.I), re.compile(r"]*>", re.I), ) class CompressionChecker(Protocol): def __call__(self, file_path: str, check_content: bool = True) -> Tuple[bool, bool]: ... def check_html(name, file_path: bool = True) -> bool: """ Returns True if the file/string contains HTML code. 
""" # Handles files if file_path is True or text if file_path is False temp: IO[str] if file_path: temp = open(name, encoding="utf-8") else: temp = StringIO(util.unicodify(name)) try: for _ in range(HTML_CHECK_LINES): line = temp.readline(CHUNK_SIZE) if not line: break if any(regexp.search(line) for regexp in HTML_REGEXPS): return True except UnicodeDecodeError: return False finally: temp.close() return False def check_binary(name, file_path: bool = True) -> bool: # Handles files if file_path is True or text if file_path is False temp: IO[bytes] if file_path: temp = open(name, "rb") size = os.stat(name).st_size else: temp = BytesIO(name) size = len(name) read_start = int(size / 2) read_length = 1024 try: if util.is_binary(temp.read(read_length)): return True # Some binary files have text only within the first 1024 # Read 1024 from the middle of the file if this is not # a gzip or zip compressed file (bzip are indexed), # to avoid issues with long txt headers on binary files. if file_path and not is_gzip(name) and not is_zip(name) and not is_bz2(name): # file_path=False doesn't seem to be used in the codebase temp.seek(read_start) return util.is_binary(temp.read(read_length)) return False finally: temp.close() def check_gzip(file_path: str, check_content: bool = True) -> Tuple[bool, bool]: # This method returns a tuple of booleans representing ( is_gzipped, is_valid ) # Make sure we have a gzipped file try: with open(file_path, "rb") as temp: magic_check = temp.read(2) if magic_check != util.gzip_magic: return (False, False) except Exception: return (False, False) # We support some binary data types, so check if the compressed binary file is valid # If the file is Bam, it should already have been detected as such, so we'll just check # for sff format. 
try: with gzip.open(file_path, "rb") as fh: header = fh.read(4) if header == b".sff": return (True, True) except Exception: return (False, False) if not check_content: return (True, True) with gzip.open(file_path, mode="rb") as gzipped_file: chunk = gzipped_file.read(CHUNK_SIZE) # See if we have a compressed HTML file if check_html(chunk, file_path=False): return (True, False) return (True, True) def check_xz(file_path: str, check_content: bool = True) -> Tuple[bool, bool]: try: with open(file_path, "rb") as temp: magic_check = temp.read(6) if magic_check != util.xz_magic: return (False, False) except Exception: return (False, False) if not check_content: return (True, True) with lzma.LZMAFile(file_path, mode="rb") as xzipped_file: chunk = xzipped_file.read(CHUNK_SIZE) # See if we have a compressed HTML file if check_html(chunk, file_path=False): return (True, False) return (True, True) def check_bz2(file_path: str, check_content: bool = True) -> Tuple[bool, bool]: try: with open(file_path, "rb") as temp: magic_check = temp.read(3) if magic_check != util.bz2_magic: return (False, False) except Exception: return (False, False) if not check_content: return (True, True) with bz2.BZ2File(file_path, mode="rb") as bzipped_file: chunk = bzipped_file.read(CHUNK_SIZE) # See if we have a compressed HTML file if check_html(chunk, file_path=False): return (True, False) return (True, True) def check_zip(file_path: str, check_content: bool = True, files=1) -> Tuple[bool, bool]: if not zipfile.is_zipfile(file_path): return (False, False) if not check_content: return (True, True) chunk = None for filect, member in enumerate(iter_zip(file_path)): handle, name = member chunk = handle.read(CHUNK_SIZE) if chunk and check_html(chunk, file_path=False): return (True, False) if filect >= files: break return (True, True) def is_bz2(file_path: str) -> bool: is_bz2, is_valid = check_bz2(file_path, check_content=False) return is_bz2 def is_gzip(file_path: str) -> bool: is_gzipped, is_valid = 
check_gzip(file_path, check_content=False) return is_gzipped def is_xz(file_path: str) -> bool: is_xzipped, is_valid = check_xz(file_path, check_content=False) return is_xzipped def is_zip(file_path: str) -> bool: is_zipped, is_valid = check_zip(file_path, check_content=False) return is_zipped def is_single_file_zip(file_path: str) -> bool: for i, _ in enumerate(iter_zip(file_path)): if i > 1: return False return True def is_tar(file_path: str) -> bool: return tarfile.is_tarfile(file_path) def iter_zip(file_path: str): with zipfile.ZipFile(file_path) as z: for f in filter(lambda x: not x.endswith("/"), z.namelist()): yield (z.open(f), f) def check_image(file_path: str) -> bool: """Simple wrapper around image_type to yield a True/False verdict""" return bool(image_type(file_path)) COMPRESSION_CHECK_FUNCTIONS: Dict[str, CompressionChecker] = { "gzip": check_gzip, "bz2": check_bz2, "xz": check_xz, "zip": check_zip, } __all__ = ( "check_binary", "check_bz2", "check_gzip", "check_html", "check_image", "check_zip", "COMPRESSION_CHECK_FUNCTIONS", "is_gzip", "is_bz2", "is_xz", "is_zip", ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/commands.py0000644000175100017510000001432015211124267020206 0ustar00runnerrunner"""Generic I/O and shell processing code used by Galaxy tool dependencies.""" import logging import os import shlex import subprocess import sys as _sys import tempfile from typing import ( Any, Dict, List, Optional, Union, ) from galaxy.util import ( unicodify, which, ) log = logging.getLogger(__name__) STDOUT_INDICATOR = "-" def redirecting_io(sys=_sys): """Predicate to determine if we are redirecting stdout in process.""" assert sys is not None try: # Need to explicitly call fileno() because sys.stdout could be a # io.StringIO object, which has a fileno() method but only raises an # io.UnsupportedOperation exception sys.stdout.fileno() except Exception: return True else: return False
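# The redirecting_io predicate above treats any failure of sys.stdout.fileno() as
# a sign that stdout has been replaced in-process. The sketch below is a
# standalone illustration of that check, not part of this module:
# redirecting_io_sketch, _FakeTerminal, in_memory and real_like are hypothetical
# names, and a stub namespace stands in for the real sys module.

```python
import io
from types import SimpleNamespace


def redirecting_io_sketch(sys_like):
    """Standalone copy of the check: True when stdout has no usable file descriptor."""
    try:
        # io.StringIO defines fileno() but raises io.UnsupportedOperation when called
        sys_like.stdout.fileno()
    except Exception:
        return True
    return False


class _FakeTerminal:
    """Hypothetical stand-in for a stream backed by a real file descriptor."""

    def fileno(self):
        return 1


# stdout replaced by an in-memory buffer, as a test harness might do
in_memory = SimpleNamespace(stdout=io.StringIO())
# stdout that still exposes a file descriptor
real_like = SimpleNamespace(stdout=_FakeTerminal())

print(redirecting_io_sketch(in_memory))  # True
print(redirecting_io_sketch(real_like))  # False
```

# Passing the module-like object as an argument, as the function above does with
# its sys parameter, keeps the check testable without touching the real sys.stdout.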
def redirect_aware_commmunicate(p, sys=_sys): """Variant of process.communicate that works with in process I/O redirection.""" assert sys is not None out, err = p.communicate() if redirecting_io(sys=sys): if out: # We don't unicodify in Python2 because sys.stdout may be a # cStringIO.StringIO object, which does not accept Unicode strings out = unicodify(out) sys.stdout.write(out) out = None if err: err = unicodify(err) sys.stderr.write(err) err = None return out, err def shell(cmds: Union[List[str], str], env: Optional[Dict[str, str]] = None, **kwds: Any) -> int: """Run shell commands with `shell_process` and wait.""" sys = kwds.get("sys", _sys) assert sys is not None p = shell_process(cmds, env, **kwds) if redirecting_io(sys=sys): redirect_aware_commmunicate(p, sys=sys) exit = p.returncode return exit else: return p.wait() def shell_process(cmds: Union[List[str], str], env: Optional[Dict[str, str]] = None, **kwds: Any) -> subprocess.Popen: """A high-level method wrapping subprocess.Popen. Handles details such as environment extension and in process I/O redirection. """ sys = kwds.get("sys", _sys) popen_kwds: Dict[str, Any] = {} if isinstance(cmds, str): log.warning("Passing program arguments as a string may be a security hazard if combined with untrusted input") popen_kwds["shell"] = True if kwds.get("stdout", None) is None and redirecting_io(sys=sys): popen_kwds["stdout"] = subprocess.PIPE if kwds.get("stderr", None) is None and redirecting_io(sys=sys): popen_kwds["stderr"] = subprocess.PIPE popen_kwds.update(**kwds) if env: new_env = os.environ.copy() new_env.update(env) popen_kwds["env"] = new_env p = subprocess.Popen(cmds, **popen_kwds) return p def execute(cmds, input=None): """Execute commands and throw an exception on a non-zero exit. if input is not None then the string is sent to the process' stdin. 
Return the standard output if the commands are successful """ return _wait(cmds, input=input, shell=False, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE) def argv_to_str(command_argv, quote=True): """Convert an argv command list to a string for shell subprocess. If None appears in the command list it is simply excluded. Arguments are quoted with shlex.quote(). That said, this method is not meant to be used in security critical paths of code and should not be used to sanitize code. """ map_func = shlex.quote if quote else lambda x: x return " ".join(map_func(c) for c in command_argv if c is not None) def _wait(cmds, input=None, **popen_kwds): p = subprocess.Popen(cmds, **popen_kwds) stdout, stderr = p.communicate(input) stdout, stderr = unicodify(stdout), unicodify(stderr) if p.returncode != 0: raise CommandLineException(argv_to_str(cmds), stdout, stderr, p.returncode) return stdout def download_command(url, to=STDOUT_INDICATOR): """Build a command line to download a URL. By default the URL will be downloaded to standard output but a specific file can be specified with the `to` argument. 
""" if which("wget"): download_cmd = ["wget", "-q"] if to == STDOUT_INDICATOR: download_cmd.extend(["-O", STDOUT_INDICATOR, url]) else: download_cmd.extend(["--recursive", "-O", to, url]) else: download_cmd = ["curl", "-L", url] if to != STDOUT_INDICATOR: download_cmd.extend(["-o", to]) return download_cmd class CommandLineException(Exception): """An exception indicating a non-zero command-line exit.""" def __init__(self, command, stdout, stderr, returncode): """Construct a CommandLineException from command and standard I/O.""" self.command = command self.stdout = stdout self.stderr = stderr self.returncode = returncode self.message = ( f"Failed to execute command-line {command}, stderr was:\n" "-------->>begin stderr<<--------\n" f"{stderr}\n" "-------->>end stderr<<--------\n" "-------->>begin stdout<<--------\n" f"{stdout}\n" "-------->>end stdout<<--------\n" ) def __str__(self): """Return a verbose error message indicating the command problem.""" return self.message def new_clean_env(): """ Returns a minimal environment to use when invoking a subprocess """ env = {} for k in ("HOME", "LC_CTYPE", "PATH", "TMPDIR"): if k in os.environ: env[k] = os.environ[k] if "TMPDIR" not in env: env["TMPDIR"] = os.path.abspath(tempfile.gettempdir()) # Set LC_CTYPE environment variable to enforce UTF-8 file encoding. # This is needed e.g. for Python < 3.7 where # `locale.getpreferredencoding()` (also used by open() to determine the # default file encoding) would return `ANSI_X3.4-1968` without this. 
if not env.get("LC_CTYPE", "").endswith("UTF-8"): env["LC_CTYPE"] = "C.UTF-8" return env __all__ = ( "argv_to_str", "CommandLineException", "download_command", "execute", "new_clean_env", "redirect_aware_commmunicate", "redirecting_io", "shell", "shell_process", "which", ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/compression_utils.py0000644000175100017510000004240515211124267022173 0ustar00runnerrunnerimport bz2 import gzip import io import logging import lzma import os import tarfile import tempfile import zipfile from types import TracebackType from typing import ( Any, cast, IO, Iterable, Iterator, List, Optional, overload, Tuple, Type, Union, ) from typing_extensions import ( Literal, Self, ) from galaxy.util.path import ( safe_relpath, StrPath, ) from .checkers import ( is_bz2, is_gzip, is_xz, ) try: from isal import isal_zlib except ImportError: isal_zlib = None # type: ignore[assignment,unused-ignore] log = logging.getLogger(__name__) FileObjTypeStr = Union[IO[str], io.TextIOWrapper] FileObjTypeBytes = Union[gzip.GzipFile, bz2.BZ2File, lzma.LZMAFile, IO[bytes]] FileObjType = Union[FileObjTypeStr, FileObjTypeBytes] @overload def get_fileobj( filename: str, mode: Literal["r"], compressed_formats: Optional[List[str]] = None ) -> FileObjTypeStr: ... @overload def get_fileobj( filename: str, mode: Literal["rb"], compressed_formats: Optional[List[str]] = None ) -> FileObjTypeBytes: ... @overload def get_fileobj(filename: str) -> FileObjTypeStr: ... @overload def get_fileobj(filename: str, mode: str = "r", compressed_formats: Optional[List[str]] = None) -> FileObjType: ... def get_fileobj(filename: str, mode: str = "r", compressed_formats: Optional[List[str]] = None) -> FileObjType: """ Returns a fileobj. If the file is compressed, return an appropriate file reader. In text mode, always use 'utf-8' encoding. 
    :param filename: path to file that should be opened
    :param mode: mode to pass to opener
    :param compressed_formats: list of allowed compressed file formats among
      'bz2', 'gzip', 'xz' and 'zip'. If left to None, all 4 formats are allowed
    """
    return get_fileobj_raw(filename, mode, compressed_formats)[1]


@overload
def get_fileobj_raw(
    filename: str, mode: Literal["r"], compressed_formats: Optional[List[str]] = None
) -> Tuple[Optional[str], FileObjTypeStr]: ...


@overload
def get_fileobj_raw(
    filename: str, mode: Literal["rb"], compressed_formats: Optional[List[str]] = None
) -> Tuple[Optional[str], FileObjTypeBytes]: ...


@overload
def get_fileobj_raw(filename: str) -> Tuple[Optional[str], FileObjTypeStr]: ...


@overload
def get_fileobj_raw(
    filename: str, mode: str = "r", compressed_formats: Optional[List[str]] = None
) -> Tuple[Optional[str], FileObjType]: ...


def get_fileobj_raw(
    filename: str, mode: str = "r", compressed_formats: Optional[List[str]] = None
) -> Tuple[Optional[str], FileObjType]:
    if compressed_formats is None:
        compressed_formats = ["bz2", "gzip", "xz", "zip"]
    # Remove 't' from mode, which may cause an error for compressed files
    mode = mode.replace("t", "")
    # 'U' mode is deprecated, we open in 'r'.
    if mode == "U":
        mode = "r"
    compressed_format = None
    if "gzip" in compressed_formats and is_gzip(filename):
        fh: Union[gzip.GzipFile, bz2.BZ2File, lzma.LZMAFile, IO[bytes]] = gzip.GzipFile(filename, mode)
        compressed_format = "gzip"
    elif "bz2" in compressed_formats and is_bz2(filename):
        mode = cast(Literal["a", "ab", "r", "rb", "w", "wb", "x", "xb"], mode)
        fh = bz2.BZ2File(filename, mode)
        compressed_format = "bz2"
    elif "xz" in compressed_formats and is_xz(filename):
        mode = cast(Literal["a", "ab", "r", "rb", "w", "wb", "x", "xb"], mode)
        fh = lzma.LZMAFile(filename, mode)
        compressed_format = "xz"
    elif "zip" in compressed_formats and zipfile.is_zipfile(filename):
        # Return fileobj for the first file in a zip file.
        # 'b' is not allowed in the ZipFile mode argument
        # since it always opens files in binary mode.
        # For emulating text mode, we will be returning the binary fh in a
        # TextIOWrapper.
        zf_mode = cast(Literal["r", "w"], mode.replace("b", ""))
        with zipfile.ZipFile(filename, zf_mode) as zh:
            fh = zh.open(zh.namelist()[0], zf_mode)
        compressed_format = "zip"
    elif "b" in mode:
        return compressed_format, open(filename, mode)
    else:
        return compressed_format, open(filename, mode, encoding="utf-8")
    if "b" not in mode:
        return compressed_format, io.TextIOWrapper(cast(IO[bytes], fh), encoding="utf-8")
    else:
        return compressed_format, fh


def file_iter(fname: str, sep: Optional[Any] = None) -> Iterator[List[str]]:
    """
    This generator iterates over a file and yields its lines split via the C{sep}
    parameter. Skips empty lines and lines starting with the C{#} character.

    >>> lines = [ line for line in file_iter(__file__) ]
    >>> len(lines) != 0
    True
    """
    with get_fileobj(fname) as fh:
        for line in fh:
            if line and line[0] != "#":
                yield line.split(sep)


ArchiveMemberType = Union[tarfile.TarInfo, zipfile.ZipInfo]


def decompress_bytes_to_directory(content: bytes) -> str:
    temp_directory = tempfile.mkdtemp()
    with tempfile.NamedTemporaryFile(delete=False) as fp:
        fp.write(content)
        fp.close()
        with CompressedFile(fp.name) as cf:
            outdir = cf.extract(temp_directory)
    return outdir


def decompress_path_to_directory(path: str) -> str:
    temp_directory = tempfile.mkdtemp()
    with CompressedFile(path) as cf:
        outdir = cf.extract(temp_directory)
    return outdir


class CompressedFile:
    archive: Union[tarfile.TarFile, zipfile.ZipFile]

    @staticmethod
    def can_decompress(file_path: StrPath) -> bool:
        return tarfile.is_tarfile(file_path) or zipfile.is_zipfile(file_path)

    def __init__(self, file_path: StrPath, mode: Literal["a", "r", "w", "x"] = "r") -> None:
        file_path_str = str(file_path)
        if zipfile.is_zipfile(file_path) and not file_path_str.endswith(".jar"):
            self.file_type = "zip"
        elif tarfile.is_tarfile(file_path):
            self.file_type = "tar"
        else:
            raise Exception("File must be valid zip or tar file.")
        self.file_name = os.path.splitext(os.path.basename(file_path))[0]
        if self.file_name.endswith(".tar"):
            self.file_name = os.path.splitext(self.file_name)[0]
        self.type = self.file_type
        method = f"open_{self.file_type}"
        if hasattr(self, method):
            self.archive = getattr(self, method)(file_path, mode)
        else:
            raise NameError(f"File type {self.file_type} specified, no open method found.")

    @property
    def common_prefix_dir(self) -> str:
        """
        Get the common prefix directory for all the files in the archive, if any.

        Returns '' if the archive contains multiple files and/or directories at
        the root of the archive.
        """
        contents = self.getmembers()
        common_prefix = ""
        if len(contents) > 1:
            common_prefix = os.path.commonprefix([self.getname(item) for item in contents])
            # If the common_prefix does not end with a slash, check that it is a
            # directory and all other files are contained in it
            common_prefix_member = self.getmember(common_prefix)
            if (
                len(common_prefix) >= 1
                and not common_prefix.endswith(os.sep)
                and common_prefix_member
                and self.isdir(common_prefix_member)
                and all(self.getname(item).startswith(common_prefix + os.sep) for item in contents if self.isfile(item))
            ):
                common_prefix += os.sep
            if not common_prefix.endswith(os.sep):
                common_prefix = ""
        return common_prefix

    def extract(self, path: StrPath) -> str:
        """Determine the path to which the archive should be extracted."""
        contents = self.getmembers()
        extraction_path = path
        common_prefix_dir = self.common_prefix_dir
        if len(contents) == 1:
            # The archive contains a single file, return the extraction path.
if self.isfile(contents[0]): extraction_path = os.path.join(path, self.file_name) if not os.path.exists(extraction_path): os.makedirs(extraction_path) if isinstance(self.archive, tarfile.TarFile): members_t = cast(Iterable[tarfile.TarInfo], self.safemembers()) self.archive.extractall(extraction_path, members=members_t) else: members_z = cast(Iterable[str], self.safemembers()) self.archive.extractall(extraction_path, members=members_z) else: if isinstance(self.archive, tarfile.TarFile): members_t = cast(Iterable[tarfile.TarInfo], self.safemembers()) self.archive.extractall(extraction_path, members=members_t) else: members_z = cast(Iterable[str], self.safemembers()) self.archive.extractall(extraction_path, members=members_z) # Since .zip files store unix permissions separately, we need to iterate through the zip file # and set permissions on extracted members. if self.file_type == "zip": assert isinstance(self.archive, zipfile.ZipFile) for zipped_file in contents: filename = self.getname(zipped_file) absolute_filepath = os.path.join(extraction_path, filename) external_attributes = self.archive.getinfo(filename).external_attr # The 2 least significant bytes are irrelevant, the next two contain unix permissions. 
unix_permissions = external_attributes >> 16 if unix_permissions != 0: if os.path.exists(absolute_filepath): os.chmod(absolute_filepath, unix_permissions) else: log.warning( f"Unable to change permission on extracted file '{absolute_filepath}' as it does not exist" ) return os.path.abspath(os.path.join(extraction_path, common_prefix_dir)) def safemembers(self) -> Union[Iterable[tarfile.TarInfo], Iterable[str]]: members = self.archive common_prefix_dir = self.common_prefix_dir if self.file_type == "tar": assert isinstance(members, tarfile.TarFile) for finfo in members: if not safe_relpath(finfo.name): raise Exception(f"Path '{finfo.name}' is blocked (illegal path).") if finfo.issym() or finfo.islnk(): link_target = os.path.join(os.path.dirname(finfo.name), finfo.linkname) if not safe_relpath(link_target) or not os.path.normpath(link_target).startswith(common_prefix_dir): raise Exception(f"Link '{finfo.name}' to '{finfo.linkname}' is blocked.") yield finfo elif self.file_type == "zip": assert isinstance(members, zipfile.ZipFile) for name in members.namelist(): if not safe_relpath(name): raise Exception(f"{name} is blocked (illegal path).") yield name def getmembers_tar(self) -> List[tarfile.TarInfo]: assert isinstance(self.archive, tarfile.TarFile) return self.archive.getmembers() def getmembers_zip(self) -> List[zipfile.ZipInfo]: assert isinstance(self.archive, zipfile.ZipFile) return self.archive.infolist() def getname_tar(self, item: tarfile.TarInfo) -> str: return item.name def getname_zip(self, item: zipfile.ZipInfo) -> str: return item.filename def getmember(self, name: str) -> Optional[ArchiveMemberType]: for member in self.getmembers(): if self.getname(member) == name: return member return None def getmembers(self) -> List[ArchiveMemberType]: return cast(List[ArchiveMemberType], getattr(self, f"getmembers_{self.type}")()) def getname(self, member: ArchiveMemberType) -> str: return cast(str, getattr(self, f"getname_{self.type}")(member)) def isdir(self, 
member: ArchiveMemberType) -> bool: return cast(bool, getattr(self, f"isdir_{self.type}")(member)) def isdir_tar(self, member: tarfile.TarInfo) -> bool: return member.isdir() def isdir_zip(self, member: zipfile.ZipInfo) -> bool: if member.filename.endswith(os.sep): return True return False def isfile(self, member: ArchiveMemberType) -> bool: if not self.isdir(member): return True return False @staticmethod def open_tar(file: Union[StrPath, IO[bytes]], mode: Literal["a", "r", "w", "x"] = "r") -> tarfile.TarFile: if isinstance(file, (str, os.PathLike)): tf = tarfile.open(file, mode=mode, errorlevel=0) else: tf = tarfile.open(mode=mode, fileobj=file, errorlevel=0) # Set a safe default ("data_filter") for the extraction filter if # available, reverting to Python 3.11 behavior otherwise, see # https://docs.python.org/3/library/tarfile.html#supporting-older-python-versions tf.extraction_filter = getattr(tarfile, "data_filter", (lambda member, path: member)) return tf @staticmethod def open_zip(file: Union[StrPath, IO[bytes]], mode: Literal["a", "r", "w", "x"] = "r") -> zipfile.ZipFile: return zipfile.ZipFile(file, mode) @staticmethod def zipfile_ok(path_to_archive: StrPath) -> bool: """ This function is a bit pedantic and not functionally necessary. It checks whether there is no file pointing outside of the extraction, because ZipFile.extractall() has some potential security holes. See python zipfile documentation for more details. 
""" basename = os.path.realpath(os.path.dirname(path_to_archive)) zip_archive = zipfile.ZipFile(path_to_archive) for member in zip_archive.namelist(): member_path = os.path.realpath(os.path.join(basename, member)) if not member_path.startswith(basename): return False return True def __enter__(self) -> Self: return self def __exit__( self, exc_type: Optional[Type[BaseException]], exc_value: Optional[BaseException], traceback: Optional[TracebackType], ) -> bool: try: self.archive.close() return exc_type is None except Exception: return False class FastZipFile(zipfile.ZipFile): """ Simple wrapper around ZipFile that uses the default compression strategy of ISA-L to write zip files. Ignores compresslevel and compresstype arguments, and is 3 to 4 times faster than the zlib implementation at the default compression level. """ def _open_to_write(self, *args, **kwargs): # type: ignore[no-untyped-def] zwf = super()._open_to_write(*args, **kwargs) # type: ignore[misc] if isal_zlib and self.compression == zipfile.ZIP_DEFLATED: zwf._compressor = isal_zlib.compressobj(isal_zlib.ISAL_DEFAULT_COMPRESSION, isal_zlib.DEFLATED, -15, 9) return zwf # modified from shutil._make_zipfile def make_fast_zipfile( base_name: str, base_dir: StrPath, verbose: int = 0, dry_run: int = 0, logger: Optional[logging.Logger] = None, owner: Optional[str] = None, group: Optional[str] = None, root_dir: Optional[StrPath] = None, ) -> str: """Create a zip file from all the files under 'base_dir'. The output zip file will be named 'base_name' + ".zip". Returns the name of the output zip file. 
""" zip_filename = base_name + ".zip" archive_dir = os.path.dirname(base_name) if archive_dir and not os.path.exists(archive_dir): if logger is not None: logger.info("creating %s", archive_dir) if not dry_run: os.makedirs(archive_dir) if logger is not None: logger.info("creating '%s' and adding '%s' to it", zip_filename, base_dir) if not dry_run: with FastZipFile(zip_filename, mode="w", compression=zipfile.ZIP_DEFLATED) as zf: arcname = os.path.normpath(base_dir) if root_dir is not None: base_dir = os.path.join(root_dir, base_dir) base_dir = os.path.normpath(base_dir) for dirpath, dirnames, filenames in os.walk(base_dir): arcdirpath = dirpath if root_dir is not None: arcdirpath = os.path.relpath(arcdirpath, root_dir) arcdirpath = os.path.normpath(arcdirpath) for name in sorted(dirnames): path = os.path.join(dirpath, name) arcname = os.path.join(arcdirpath, name) zf.write(path, arcname) if logger is not None: logger.info("adding '%s'", path) for name in filenames: path = os.path.join(dirpath, name) path = os.path.normpath(path) if os.path.isfile(path): arcname = os.path.join(arcdirpath, name) zf.write(path, arcname) if logger is not None: logger.info("adding '%s'", path) if root_dir is not None: zip_filename = os.path.abspath(zip_filename) return zip_filename ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/config_parsers.py0000644000175100017510000000131015211124267021404 0ustar00runnerrunnerimport ipaddress from typing import ( List, Union, ) from galaxy.util import unicodify IpAddressT = Union[ipaddress.IPv4Address, ipaddress.IPv6Address] IpNetworkT = Union[ipaddress.IPv4Network, ipaddress.IPv6Network] IpAllowedListEntryT = Union[IpAddressT, IpNetworkT] def parse_allowlist_ips(fetch_url_allowlist: List[str]) -> List[IpAllowedListEntryT]: return [ ( ipaddress.ip_network(unicodify(ip.strip())) # If it has a slash, assume 127.0.0.1/24 notation if "/" in ip else 
ipaddress.ip_address(unicodify(ip.strip())) ) # Otherwise interpret it as an ip address. for ip in fetch_url_allowlist if len(ip.strip()) > 0 ] ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/config_templates.py0000644000175100017510000006050315211124267021734 0ustar00runnerrunner"""Utilities for defining user configuration bits from admin templates. This is capturing code shared by file source templates and object store templates. """ import logging import os from collections.abc import Iterable from typing import ( Any, Callable, cast, Dict, List, Optional, Sequence, Tuple, Type, TypeVar, Union, ) from urllib.parse import urlencode import requests import yaml from boltons.iterutils import remap from pydantic import ( BaseModel, ConfigDict, create_model, RootModel, ValidationError, ) from pydantic.fields import FieldInfo from typing_extensions import ( Annotated, Literal, NotRequired, Protocol, TypedDict, ) try: from jinja2 import ( StrictUndefined, UndefinedError, ) from jinja2.nativetypes import NativeEnvironment except ImportError: NativeEnvironment = None # type: ignore[assignment, misc, unused-ignore] StrictUndefined = None # type: ignore[assignment, misc, unused-ignore] UndefinedError = None # type: ignore[assignment, misc, unused-ignore] from galaxy.exceptions import ( ObjectNotFound, RequestParameterInvalidException, RequestParameterMissingException, ) from galaxy.tool_util_models.parameter_validators import AnySafeValidatorModel from galaxy.util import asbool log = logging.getLogger(__name__) TemplateVariableType = Literal["string", "path_component", "boolean", "integer"] TemplateVariableValueType = Union[str, bool, int] TemplateExpansion = str MarkdownContent = str RawTemplateConfig = Dict[str, Any] UserDetailsDict = Dict[str, Any] VariablesDict = Dict[str, TemplateVariableValueType] SecretsDict = Dict[str, str] EnvironmentDict = Dict[str, str] class StrictModel(BaseModel): 
model_config = ConfigDict(extra="forbid", coerce_numbers_to_str=True) class BaseTemplateVariable(StrictModel): name: str label: Optional[str] = None help: Optional[MarkdownContent] = None optional: Optional[bool] = None validators: Optional[Sequence[AnySafeValidatorModel]] = None class TemplateVariableString(BaseTemplateVariable): type: Literal["string"] default: Optional[str] = None class TemplateVariableInteger(BaseTemplateVariable): type: Literal["integer"] default: Optional[int] = None # add min/max class TemplateVariablePathComponent(BaseTemplateVariable): type: Literal["path_component"] default: Optional[str] = None class TemplateVariableBoolean(BaseTemplateVariable): type: Literal["boolean"] default: Optional[bool] = None TemplateVariable = Union[ TemplateVariableString, TemplateVariableInteger, TemplateVariablePathComponent, TemplateVariableBoolean ] class TemplateSecret(StrictModel): name: str label: Optional[str] = None help: Optional[MarkdownContent] = None optional: Optional[bool] = None class TemplateEnvironmentSecret(StrictModel): type: Literal["secret"] name: str vault_key: str default: Optional[str] = None class TemplateEnvironmentVariable(StrictModel): type: Literal["variable"] name: str variable: str default: Optional[str] = None TemplateEnvironmentEntry = Union[TemplateEnvironmentVariable, TemplateEnvironmentSecret] TemplateEnvironment = RootModel[List[TemplateEnvironmentEntry]] def _ensure_path_component(input: Any): input_as_string = str(input) if not acts_as_simple_path_component(input_as_string): raise Exception("Path manipulation detected, failing evaluation") return input # NativeEnvironment preserves Python types def _environment(template_start: str, template_end: str) -> NativeEnvironment: env = NativeEnvironment( variable_start_string=template_start, variable_end_string=template_end, undefined=StrictUndefined, ) env.filters["ensure_path_component"] = _ensure_path_component env.filters["asbool"] = asbool return env class 
TemplateConfiguration(Protocol): def model_dump(self) -> Dict[str, Any]: """Implements a pydantic model dump to build simple JSON dictionary.""" @property def template_start(self) -> Optional[str]: """Set a custom variable start for Jinja variable substitution. https://stackoverflow.com/questions/12083319/add-custom-tokens-in-jinja2-e-g-somevar """ @property def template_end(self) -> Optional[str]: """Set a custom variable end for Jinja variable substitution. https://stackoverflow.com/questions/12083319/add-custom-tokens-in-jinja2-e-g-somevar """ def populate_default_variables(variables: Optional[List[TemplateVariable]], variable_values: VariablesDict): if variables: for variable in variables: name = variable.name # Apply defaults only for explicitly optional variables if variable.optional and name not in variable_values and variable.default is not None: variable_values[name] = variable.default def expand_raw_config( template_configuration: TemplateConfiguration, variables: VariablesDict, secrets: SecretsDict, user_details: UserDetailsDict, environment: EnvironmentDict, ) -> RawTemplateConfig: template_variables = { "variables": variables, "secrets": secrets, "user": user_details, "environment": environment, } return _expand_raw_config(template_configuration, template_variables) def _expand_raw_config( template_configuration: TemplateConfiguration, template_variables: Dict[str, Any] ) -> RawTemplateConfig: template_start = template_configuration.template_start or "{{" template_end = template_configuration.template_end or "}}" def expand_template(_, key, value): if isinstance(value, str) and template_start in value and template_end in value: template = _environment(template_start, template_end).from_string(value) return key, template.render(**template_variables) return key, value template_model_as_json = template_configuration.model_dump() raw_config = remap(template_model_as_json, visit=expand_template) _clean_template_meta_parameters(raw_config) return raw_config 
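# The expansion performed by _expand_raw_config above can be sketched without
# Jinja or boltons. This is a simplified illustration under stated assumptions,
# not the module's implementation: `render_string`, `expand` and the regex-based
# placeholder lookup are hypothetical stand-ins, and unlike NativeEnvironment
# they always render a placeholder to a string rather than preserving types.

```python
import re
from typing import Any, Dict

# Hypothetical stand-in for Jinja rendering: matches "{{ dotted.name }}".
_PLACEHOLDER = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def render_string(value: str, context: Dict[str, Any]) -> str:
    """Replace each placeholder with the value found by walking the dotted name."""

    def lookup(match: "re.Match[str]") -> str:
        current: Any = context
        for part in match.group(1).split("."):
            # A missing name raises KeyError, loosely mirroring StrictUndefined.
            current = current[part]
        return str(current)

    return _PLACEHOLDER.sub(lookup, value)


def expand(node: Any, context: Dict[str, Any]) -> Any:
    """Recursively expand placeholders in every string of a nested config."""
    if isinstance(node, dict):
        return {key: expand(value, context) for key, value in node.items()}
    if isinstance(node, list):
        return [expand(item, context) for item in node]
    if isinstance(node, str):
        return render_string(node, context)
    # Non-string leaves (booleans, integers, None) pass through untouched.
    return node


# Illustrative data only; the keys and values are made up for this sketch.
config = {
    "type": "example",
    "bucket": "{{ variables.bucket }}",
    "auth": {"secret": "{{ secrets.access_key }}"},
    "writable": True,
}
context = {"variables": {"bucket": "my-data"}, "secrets": {"access_key": "abc123"}}
expanded = expand(config, context)
```

# As in the real function, only string values are candidates for expansion and
# the nesting of the configuration is preserved.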
def merge_implicit_parameters(raw_config: RawTemplateConfig, implicit: Optional["ImplicitConfigurationParameters"]): if implicit: raw_config.update(implicit) raw_config.pop("oauth2_client_id", None) raw_config.pop("oauth2_client_secret", None) raw_config.pop("oauth2_scope", None) def verify_vault_configured_if_uses_secrets(catalog, vault_configured: bool, exception_message: str) -> None: if _catalog_uses_secrets(catalog) and not vault_configured: raise Exception(exception_message) def _catalog_uses_secrets(catalog) -> bool: templates = catalog.root for template in templates: if template.secrets and len(template.secrets) > 0: return True return False def _clean_template_meta_parameters(config: RawTemplateConfig) -> RawTemplateConfig: # slight templating differences between what is allowed in the template definition # and what is allowed in the actual configuration objects we send to respective modules # to instantiate plugins. In particular, descriptions of how templating is done should be # eliminated after templates have been expanded.
meta_parameters = ["template_start", "template_end"] for meta_parameter in meta_parameters: if meta_parameter in config: del config[meta_parameter] return config # cwl-like - convert simple dictionary to list of dictionaries for quickly # configuring variables and secrets def apply_syntactic_sugar(raw_templates: List[RawTemplateConfig]) -> List[RawTemplateConfig]: templates = [] expanded_raw_templates = _expand_includes(raw_templates) for template in expanded_raw_templates: _force_key_to_list(template, "variables") _force_key_to_list(template, "secrets") _force_key_to_list(template, "environment") templates.append(template) return templates def _expand_includes(raw_templates: List[RawTemplateConfig]) -> List[RawTemplateConfig]: expanded_raw_templates = [] for raw_template in raw_templates: expanded_raw_templates.extend(_expand_include(raw_template)) return expanded_raw_templates def _expand_include(raw_template: RawTemplateConfig) -> List[RawTemplateConfig]: has_one_key = len(raw_template.keys()) == 1 has_include = "include" in raw_template if has_one_key and has_include: include = raw_template["include"] with open(include) as f: included = yaml.safe_load(f) raw_templates: List[RawTemplateConfig] if isinstance(included, list): raw_templates = included else: raw_templates = [included] return _expand_includes(raw_templates) else: return [raw_template] def _force_key_to_list(template: RawTemplateConfig, key: str) -> None: value = template.get(key, None) if isinstance(value, dict): value_as_list = [] for key_name, key_value in value.items(): key_value["name"] = key_name value_as_list.append(key_value) template[key] = value_as_list class TemplateReference(Protocol): template_id: str template_version: int class InstanceDefinition(TemplateReference, Protocol): variables: Dict[str, Any] secrets: SecretsDict class Template(Protocol): @property def id(self) -> str: ... @property def version(self) -> int: ... @property def type(self) -> str: ... 
@property def variables(self) -> Optional[List[TemplateVariable]]: ... @property def secrets(self) -> Optional[List[TemplateSecret]]: ... @property def environment(self) -> Optional[List[TemplateEnvironmentEntry]]: ... T = TypeVar("T", bound=Template, covariant=True) def find_template(templates: List[T], instance_reference: TemplateReference, what: str) -> T: template_id = instance_reference.template_id template_version = instance_reference.template_version return find_template_by(templates, template_id, template_version, what) def find_template_by(templates: List[T], template_id: str, template_version: int, what: str) -> T: for template in templates: if template.id == template_id and template.version == template_version: return template raise ObjectNotFound(f"Could not find a {what} template with id {template_id} and version {template_version}") def _run_variable_validator(validator: AnySafeValidatorModel, value: Any, variable_name: str) -> None: """Run a single validator on a variable value. Raises RequestParameterInvalidException if validation fails. 
""" try: validator.statically_validate(value) except ValueError as e: raise RequestParameterInvalidException(f"Variable '{variable_name}' failed validation: {str(e)}") def validate_variable_types(instance: InstanceDefinition, template: Template) -> None: pass def validate_defines_all_required_secrets(instance: InstanceDefinition, template: Template): secrets = instance.secrets for template_secret in template.secrets or []: name = template_secret.name is_optional = bool(template_secret.optional) if name not in secrets and not is_optional: raise RequestParameterMissingException(f"Must define secret '{name}'") def validate_defines_all_required_variables(instance: InstanceDefinition, template: Template): variables = instance.variables for template_variable in template.variables or []: name = template_variable.name is_optional = bool(template_variable.optional) if name not in variables and not is_optional: raise RequestParameterMissingException(f"Must define variable '{name}'") def validate_specified_datatypes(instance: InstanceDefinition, template: Template): secrets = instance.secrets for name, value in secrets.items(): if not isinstance(value, str): raise RequestParameterInvalidException(f"Secret value for secret '{name}' must be of type string") variables = instance.variables validate_specified_datatypes_variables(variables, template) def validate_specified_datatypes_variables(variables: Dict[str, Any], template: Template): for template_variable in template.variables or []: name = template_variable.name # Only fall back to default for optional variables if name in variables: variable_value = variables[name] elif template_variable.optional: variable_value = template_variable.default else: variable_value = None # Skip validation only if variable is optional, not provided, and has no default applied if name not in variables and template_variable.optional and template_variable.default is None: continue template_type = template_variable.type if template_type in 
["string", "path_component"]: if not isinstance(variable_value, str): raise RequestParameterInvalidException(f"Variable value for variable '{name}' must be of type str") if template_type == "path_component": if ".." in variable_value or "/" in variable_value: raise RequestParameterInvalidException( f"Variable value for variable '{name}' must be simple path component, invalid characters found" ) if not acts_as_simple_path_component(variable_value): raise RequestParameterInvalidException( f"Variable value for variable '{name}' must be simple path component, invalid characters found" ) if template_type == "integer": if not _is_of_exact_type(variable_value, int): raise RequestParameterInvalidException(f"Variable value for variable '{name}' must be of type int") if template_type == "boolean": if not _is_of_exact_type(variable_value, bool): raise RequestParameterInvalidException(f"Variable value for variable '{name}' must be of type bool") # Run custom validators if present. if template_variable.validators: for validator in template_variable.validators: _run_variable_validator(validator, variable_value, name) def validate_no_extra_secrets_defined(secrets: Dict[str, str], template: Template) -> None: template_secrets = secrets_as_dict(template.secrets) for secret in secrets.keys(): if secret not in template_secrets: raise RequestParameterInvalidException(f"No secret named {secret} for this template") def validate_no_extra_variables_defined(variables: Dict[str, Any], template: Template): template_variables = _variables_as_dict(template.variables) for variable in variables.keys(): if variable not in template_variables: raise RequestParameterInvalidException(f"No variable named {variable} for this template") def validate_secrets_and_variables(instance: InstanceDefinition, template: Template) -> None: validate_defines_all_required_secrets(instance, template) validate_defines_all_required_variables(instance, template) validate_specified_datatypes(instance, template) 
validate_no_extra_secrets_defined(instance.secrets, template) validate_no_extra_variables_defined(instance.variables, template) def secrets_as_dict(secrets: Optional[List[TemplateSecret]]) -> Dict[str, TemplateSecret]: as_dict = {} for secret in secrets or []: as_dict[secret.name] = secret return as_dict def _variables_as_dict(variables: Optional[List[TemplateVariable]]) -> Dict[str, TemplateVariable]: as_dict = {} for variable in variables or []: as_dict[variable.name] = variable return as_dict def _is_of_exact_type(object: Any, target_type: Type): # isinstance(False, int) and False == 0 are both True in Python... # We are creating a DSL here that is intentionally more strict than Python # so we are using type() instead of isinstance and we have the test coverage # to ensure this is the desired behavior and remains. Think JSON typing, not # pythonic typing. Galaxy's internals as a Python project should not be # exposed here. return type(object) == target_type # noqa: E721 def acts_as_simple_path_component(value: str): cwd = os.getcwd() abs_path = os.path.abspath(f"{cwd}/{value}") unaffected_by_normpath = os.path.normpath(abs_path) == abs_path if not unaffected_by_normpath: return False should_be_cwd, should_be_value = os.path.split(abs_path) if should_be_cwd != cwd: return False if should_be_value != value: return False return True class PluginAspectStatus(StrictModel): state: Literal["ok", "not_ok", "unknown"] message: str @property def is_not_ok(self): return self.state == "not_ok" class PluginStatus(StrictModel): template_definition: PluginAspectStatus template_settings: Optional[PluginAspectStatus] = None connection: Optional[PluginAspectStatus] = None # I would love to disambiguate connection vs auth errors but would # attempting to do that cause confusion. Maybe not if the user interface # skipped presenting the one that couldn't be disambiguated for that # particular plugin? # TODO: Fill in writable checks. 
# writable: Optional[PluginAspectStatus] = None oauth2_access_token_generation: Optional[PluginAspectStatus] = None def status_template_definition(template: Optional[Template]) -> PluginAspectStatus: # if we found a template in the catalog, it was validated at load time. Reflect # this as a PluginAspectStatus if template: return PluginAspectStatus(state="ok", message="Template definition found and validates against schema") else: return PluginAspectStatus(state="not_ok", message="Template not found or not loaded") def settings_exception_to_status(exception: Optional[Exception]) -> PluginAspectStatus: if exception is None: status = PluginAspectStatus(state="ok", message="Valid configuration resulted from supplied settings") elif isinstance(exception, UndefinedError): message = f"Problem with template definition causing invalid settings resolution, please contact admin to correct template: {exception}" status = PluginAspectStatus(state="not_ok", message=message) elif isinstance(exception, ValidationError): message = f"Problem with template definition causing invalid configuration, template expanded without error but resulting configuration is invalid. 
please contact admin to correct template: {exception}" status = PluginAspectStatus(state="not_ok", message=message) else: message = f"Unknown problem with resolving configuration from supplied settings: {exception}" status = PluginAspectStatus(state="not_ok", message=message) return status def connection_exception_to_status(what: str, exception: Optional[Exception]) -> PluginAspectStatus: if exception is None: connection_status = PluginAspectStatus(state="ok", message="Valid connection resulted from supplied settings") else: message = f"Failed to connect to a {what} with supplied settings: {exception}" connection_status = PluginAspectStatus(state="not_ok", message=message) return connection_status class OAuth2Info(StrictModel): authorize_url: str class OAuth2Configuration(StrictModel): authorize_url: str token_url: str authorize_params: Optional[Dict[str, str]] scope: Optional[str] = None ConfiguredOAuth2Sources = Dict[str, OAuth2Configuration] class OAuth2ClientPair(StrictModel): client_id: str client_secret: str def get_authorize_url( client_id_or_pair: Union[str, OAuth2ClientPair], config: OAuth2Configuration, redirect_uri: Optional[str], state: Optional[str] = None, scope: Optional[str] = None, ) -> str: client_id = client_id_or_pair if isinstance(client_id_or_pair, str) else client_id_or_pair.client_id query_data = dict( client_id=client_id, response_type="code", ) if redirect_uri is not None: query_data["redirect_uri"] = redirect_uri if state is not None: query_data["state"] = state if scope is not None: query_data["scope"] = scope elif config.scope is not None: query_data["scope"] = config.scope query_data.update(config.authorize_params or {}) query = urlencode(query_data) return f"{config.authorize_url}?{query}" def get_token_from_code_raw( code: str, client_pair: OAuth2ClientPair, config: OAuth2Configuration, redirect_uri: Optional[str] ) -> requests.Response: data = { "code": code, "grant_type": "authorization_code", "client_id": client_pair.client_id, 
"client_secret": client_pair.client_secret, } if redirect_uri is not None: data["redirect_uri"] = redirect_uri return requests.post(config.token_url, data=data) def get_token_from_refresh_raw( refresh_token: str, client_pair: OAuth2ClientPair, config: OAuth2Configuration ) -> requests.Response: data = { "refresh_token": refresh_token, "grant_type": "refresh_token", "client_id": client_pair.client_id, "client_secret": client_pair.client_secret, } return requests.post(config.token_url, data=data) def get_oauth2_config_from(template, sources: ConfiguredOAuth2Sources) -> OAuth2Configuration: template_type = template.configuration.type if template_type not in sources: raise ObjectNotFound(f"oauth information not available for template type {template_type}") return sources[template_type] def read_oauth2_info_from_configuration( template_configuration: TemplateConfiguration, user_details: UserDetailsDict, environment: EnvironmentDict, ) -> Tuple[OAuth2ClientPair, Optional[str]]: template_variables = { "user": user_details, "environment": environment, } expanded_config = _expand_raw_config(template_configuration, template_variables) oauth2_client_id = expanded_config["oauth2_client_id"] oauth2_client_secret = expanded_config["oauth2_client_secret"] oauth2_scope = cast(Optional[str], expanded_config.get("oauth2_scope")) client_pair = OAuth2ClientPair(client_id=oauth2_client_id, client_secret=oauth2_client_secret) return client_pair, oauth2_scope # Things added to configuration dictionary not managed by the template # but injected dynamically. Currently only `oauth2_access_token`. 
class ImplicitConfigurationParameters(TypedDict): oauth2_access_token: NotRequired[str] M = TypeVar("M", bound="BaseModel") # Implementation copied from https://github.com/pydantic/pydantic/issues/12329#issuecomment-3382159312 def _make_field_optional(field_info: FieldInfo): """Returns the field's definition to be used in a `create_model()` call to make the field optional.""" annotation = field_info.annotation assert annotation is not None if field_info.is_required(): return Annotated[Union[annotation, None], field_info], None else: return Annotated[annotation, field_info] def make_model_with_all_fields_optional(model: Type[M], fields=None) -> Type[M]: """Returns a new Pydantic model based on `model`, but with all fields optional.""" if fields is None: fields = model.model_fields.items() return create_model( model.__name__, __doc__=model.__doc__, __base__=model, **{field_name: _make_field_optional(field_info) for field_name, field_info in fields}, ) # TODO: This is a workaround to make all fields optional. # It should be removed when Python/pydantic supports this feature natively. 
# https://github.com/pydantic/pydantic/issues/1673 def partial_model( include: Optional[List[str]] = None, exclude: Optional[List[str]] = None ) -> Callable[[Type[M]], Type[M]]: """Decorator to make all model fields optional""" if exclude is None: exclude = [] def decorator(model: Type[M]) -> Type[M]: if include is None: fields: Iterable[tuple[str, FieldInfo]] = model.model_fields.items() else: fields = ((k, v) for k, v in model.model_fields.items() if k in include) if exclude is not None: fields = ((k, v) for k, v in fields if k not in exclude) return make_model_with_all_fields_optional(model, fields) return decorator ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1780787404.6881788 galaxy_util-26.0.1/galaxy/util/custom_logging/0000755000175100017510000000000015211124315021045 5ustar00runnerrunner././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/custom_logging/__init__.py0000644000175100017510000000110015211124267023154 0ustar00runnerrunnerimport logging from typing import ( Any, cast, Optional, ) class GalaxyLogger(logging.Logger): def trace(self, message: object, *args: Any, **kwargs: Any) -> None: if self.isEnabledFor(LOGLV_TRACE): self._log(LOGLV_TRACE, message, args, **kwargs) # Add custom "TRACE" log level for ludicrous verbosity. 
LOGLV_TRACE = 5 logging.addLevelName(LOGLV_TRACE, "TRACE") logging.setLoggerClass(GalaxyLogger) def get_logger(name: Optional[str] = None) -> GalaxyLogger: logger = logging.getLogger(name) return cast(GalaxyLogger, logger) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/custom_logging/fluent_log.py0000644000175100017510000000253615211124267023571 0ustar00runnerrunner""" Provides a `TraceLogger` implementation that logs to a fluentd collector """ import json import threading import time try: from fluent.sender import FluentSender except ImportError: FluentSender = None FLUENT_IMPORT_MESSAGE = "The Python fluent package is required to use this feature, please install it" class FluentTraceLogger: def __init__(self, name, host="localhost", port=24224): assert FluentSender is not None, FLUENT_IMPORT_MESSAGE self.lock = threading.Lock() self.thread_local = threading.local() self.name = name self.sender = FluentSender(self.name, host=host, port=port) def context_set(self, key, value): self.lock.acquire() if not hasattr(self.thread_local, "context"): self.thread_local.context = {} self.thread_local.context[key] = value self.lock.release() def context_remove(self, key): self.lock.acquire() del self.thread_local.context[key] self.lock.release() def log(self, label, event_time=None, **kwargs): self.lock.acquire() if hasattr(self.thread_local, "context"): kwargs.update(self.thread_local.context) self.lock.release() event_time = event_time or time.time() self.sender.emit_with_time(label, int(event_time), json.dumps(kwargs, default=str)) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/dictifiable.py0000644000175100017510000000641015211124267020645 0ustar00runnerrunnerimport datetime import uuid from typing import ( Any, Callable, Dict, Optional, ) ValueMapperT = Dict[str, Callable] def dict_for(obj, **kwds): # Create dict to 
represent item. return dict(model_class=obj.__class__.__name__, **kwds) class UsesDictVisibleKeys: """Mixin used to implement to_dict methods that consume dict_{view}_visible_keys to produce dicts. For typical to_dict methods that just consume a view and value mapper use the Dictifiable mixin instead of this more low-level mixin, but if you want to consume other things in your to_dict method that are incompatible (such as required arguments) - inherit this lower-level mixin and implement a custom to_dict with whatever signature makes sense for the class. """ def _dictify_view_keys( self, view: str = "collection", value_mapper: Optional[ValueMapperT] = None ) -> Dict[str, Any]: """ Return item dictionary. """ if not value_mapper: value_mapper = {} def get_value(key, item): """ Recursive helper function to get item values. """ # FIXME: why use exception here? Why not look for key in value_mapper # first and then default to to_dict? try: return item.to_dict(view=view, value_mapper=value_mapper) except Exception: assert value_mapper is not None if key in value_mapper: return value_mapper[key](item) if isinstance(item, datetime.datetime): return item.isoformat() elif isinstance(item, uuid.UUID): return str(item) # Leaving this for future reference, though we may want a more # generic way to handle special type mappings going forward. # If the item is of a class that needs to be 'stringified' before being put into a JSON data structure # elif type(item) in []: # return str(item) return item # Create dict to represent item. rval = dict_for(self) # Fill item dict with visible keys.
try: visible_keys = self.__getattribute__(f"dict_{view}_visible_keys") except AttributeError: raise Exception(f"Unknown Dictifiable view: {view}") for key in visible_keys: try: item = self.__getattribute__(key) if isinstance(item, list): rval[key] = [] for i in item: rval[key].append(get_value(key, i)) else: rval[key] = get_value(key, item) except AttributeError: rval[key] = None return rval class Dictifiable(UsesDictVisibleKeys): """Mixin that enables objects to be converted to dictionaries. This is useful for sharing objects across boundaries, such as the API, tool scripts, and JavaScript code.""" def to_dict(self, view: str = "collection", value_mapper: Optional[ValueMapperT] = None) -> Dict[str, Any]: """ Return item dictionary. """ return self._dictify_view_keys(view=view, value_mapper=value_mapper) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/docutils_template.txt0000644000175100017510000000001115211124267022305 0ustar00runnerrunner%(body)s ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/dynamic.py0000644000175100017510000000034115211124267020027 0ustar00runnerrunnerfrom typing import ( Any, TYPE_CHECKING, ) if TYPE_CHECKING: class HasDynamicProperties: def __getattr__(self, property: str) -> Any: return object() else: HasDynamicProperties = object ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/expressions.py0000644000175100017510000000300615211124267020766 0ustar00runnerrunner""" Expression evaluation support. For the moment this depends on python's eval. In the future it should be replaced with a "safe" parser.
""" from collections.abc import MutableMapping from itertools import chain class ExpressionContext(MutableMapping): def __init__(self, dict, parent=None): """ Create a new expression context that looks for values in the container object 'dict', and falls back to 'parent' """ self.dict = dict self.parent = parent def __delitem__(self, key): if key in self.dict: del self.dict[key] elif self.parent is not None and key in self.parent: del self.parent[key] def __iter__(self): return chain(iter(self.dict), iter(self.parent or [])) def __len__(self): return len(self.dict) + len(self.parent or []) def __getitem__(self, key): if key in self.dict: return self.dict[key] if self.parent is not None and key in self.parent: return self.parent[key] raise KeyError(key) def __setitem__(self, key, value): self.dict[key] = value def __contains__(self, key): if key in self.dict: return True if self.parent is not None and key in self.parent: return True return False def __str__(self): return str(self.dict) def __bool__(self): if not self.dict and not self.parent: return False return True __nonzero__ = __bool__ ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/facts.py0000644000175100017510000000361715211124267017514 0ustar00runnerrunner"""Return various facts for string formatting.""" import socket from collections.abc import MutableMapping class Facts(MutableMapping): """A dict-like object that evaluates values at access time.""" def __init__(self, config=None, **kwargs): config = config or {} self.__dict__ = {} self.__set_defaults(config) self.__set_config(config) self.__dict__.update(dict(**kwargs)) def __set_defaults(self, config): # config here may be a Galaxy config object, or it may just be a dict defaults = { "server_name": lambda: config.get("base_server_name", "main"), "server_id": None, "instance_id": None, "pool_name": None, "fqdn": lambda: socket.getfqdn(), "hostname": lambda: 
socket.gethostname().split(".", 1)[0], } self.__dict__.update(defaults) def __set_config(self, config): if config is not None: for name in dir(config): if not name.startswith("_") and isinstance(getattr(config, name), str): self.__dict__[f"config_{name}"] = lambda name=name: getattr(config, name) def __getitem__(self, key): item = self.__dict__.__getitem__(key) if callable(item): return item() else: return item # Other methods pass through to the corresponding dict methods def __setitem__(self, key, value): return self.__dict__.__setitem__(key, value) def __delitem__(self, key): return self.__dict__.__delitem__(key) def __iter__(self): return self.__dict__.__iter__() def __len__(self): return self.__dict__.__len__() def __str__(self): return self.__dict__.__str__() def __repr__(self): return self.__dict__.__repr__() def get_facts(config=None, **kwargs): return Facts(config=config, **kwargs) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/filelock.py0000644000175100017510000000505715211124267020204 0ustar00runnerrunner"""Code obtained from https://github.com/dmfrey/FileLock. See full license at: https://github.com/dmfrey/FileLock/blob/master/LICENSE.txt """ import errno import os import time class FileLockException(Exception): pass class FileLock: """A file locking mechanism that has context-manager support so you can use it in a with statement. This should be relatively cross compatible as it doesn't rely on msvcrt or fcntl for the locking. """ def __init__(self, file_name, timeout=10, delay=0.05): """Prepare the file locker. Specify the file to lock and optionally the maximum timeout and the delay between each attempt to lock. """ self.is_locked = False full_path = os.path.abspath(file_name) self.lockfile = f"{full_path}.lock" self.file_name = full_path self.timeout = timeout self.delay = delay def acquire(self): """Acquire the lock, if possible. 
If the lock is in use, it checks again every `delay` seconds. It does this until it either gets the lock or exceeds `timeout` number of seconds, in which case it raises a FileLockException. """ start_time = time.time() while True: try: self.fd = os.open(self.lockfile, os.O_CREAT | os.O_EXCL | os.O_RDWR) break except OSError as e: if e.errno != errno.EEXIST: raise if (time.time() - start_time) >= self.timeout: raise FileLockException("Timeout occurred.") time.sleep(self.delay) self.is_locked = True def release(self): """Get rid of the lock by deleting the lockfile. When working in a `with` statement, this gets automatically called at the end. """ if self.is_locked: os.close(self.fd) os.unlink(self.lockfile) self.is_locked = False def __enter__(self): """Activated when used in the with statement. Should automatically acquire a lock to be used in the with block. """ if not self.is_locked: self.acquire() return self def __exit__(self, type, value, traceback): """Activated at the end of the with statement. It automatically releases the lock if it is locked. """ if self.is_locked: self.release() def __del__(self): """Make sure that the FileLock instance doesn't leave a lockfile lying around.
""" self.release() ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/form_builder.py0000644000175100017510000001253115211124267021060 0ustar00runnerrunner""" Classes for generating HTML forms """ import logging from galaxy.util import asbool log = logging.getLogger(__name__) class BaseField: def __init__(self, name, value=None, label=None, **kwds): self.name = name self.label = label self.value = value self.disabled = kwds.get("disabled", False) if "optional" in kwds: self.optional = asbool(kwds.get("optional")) else: self.optional = kwds.get("required", "optional") == "optional" self.help = kwds.get("helptext") def to_dict(self): return { "name": self.name, "label": self.label, "disabled": self.disabled, "optional": self.optional, "value": self.value, "help": self.help, } class TextField(BaseField): """ A standard text input box. """ def to_dict(self): d = super().to_dict() d["type"] = "text" return d class PasswordField(BaseField): """ A password input box. text appears as "******" """ def to_dict(self): d = super().to_dict() d["type"] = "password" return d class TextArea(BaseField): """ A standard text area box. """ def to_dict(self): d = super().to_dict() d["type"] = "text" d["area"] = True return d class CheckboxField(BaseField): """ A checkbox (boolean input) """ @staticmethod def is_checked(value): if value in [True, "True", "true"]: return True return False def to_dict(self): d = super().to_dict() d["type"] = "boolean" return d class SelectField(BaseField): """ A select field. 
""" def __init__( self, name, multiple=None, display=None, field_id=None, value=None, selectlist=None, refresh_on_change=False, **kwds, ): super().__init__(name, value, **kwds) self.field_id = field_id self.multiple = multiple or False self.refresh_on_change = refresh_on_change self.selectlist = selectlist or [] self.options = [] if display == "checkboxes": assert multiple, "Checkbox display only supported for multiple select" elif display == "radio": assert not (multiple), "Radio display only supported for single select" elif display is not None: raise Exception(f"Unknown display type: {display}") self.display = display def add_option(self, label, value, selected=False): self.options.append((str(label), value, selected)) def to_dict(self): d = super().to_dict() d["type"] = "select" d["display"] = self.display d["multiple"] = self.multiple d["data"] = [] for value in self.selectlist: d["data"].append({"label": value, "value": value}) d["options"] = [{"label": t[0], "value": t[1]} for t in self.options] return d class AddressField(BaseField): @staticmethod def fields(): return [ ("desc", "Short address description", "Required"), ("name", "Name", ""), ("institution", "Institution", ""), ("address", "Address", ""), ("city", "City", ""), ("state", "State/Province/Region", ""), ("postal_code", "Postal Code", ""), ("country", "Country", ""), ("phone", "Phone", ""), ] def __init__(self, name, user=None, value=None, security=None, **kwds): super().__init__(name, value, **kwds) self.user = user self.security = security def to_dict(self): d = super().to_dict() d["type"] = "select" d["data"] = [] if self.user and self.security: for a in self.user.addresses: if not a.deleted: d["data"].append({"label": a.desc, "value": self.security.encode_id(a.id)}) return d class WorkflowField(BaseField): def __init__(self, name, user=None, value=None, security=None, **kwds): super().__init__(name, value, **kwds) self.user = user self.value = value self.security = security def to_dict(self): 
d = super().to_dict() d["type"] = "select" d["data"] = [] if self.user and self.security: for a in self.user.stored_workflows: if not a.deleted: d["data"].append({"label": a.name, "value": self.security.encode_id(a.id)}) return d class WorkflowMappingField(BaseField): def __init__(self, name, user=None, value=None, **kwds): super().__init__(name, value, **kwds) self.user = user class HistoryField(BaseField): def __init__(self, name, user=None, value=None, security=None, **kwds): super().__init__(name, value, **kwds) self.user = user self.value = value self.security = security def to_dict(self): d = super().to_dict() d["type"] = "select" d["data"] = [{"label": "New History", "value": "new"}] if self.user and self.security: for a in self.user.histories: if not a.deleted: d["data"].append({"label": a.name, "value": self.security.encode_id(a.id)}) return d ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/hash_util.py0000644000175100017510000001147415211124267020374 0ustar00runnerrunner""" Utility functions for bi-directional Python version compatibility. Python 2.5 introduced hashlib which replaced sha in Python 2.4 and previous versions. """ import hashlib import hmac import logging from enum import Enum from typing import ( Any, Callable, Dict, List, Literal, Optional, Tuple, Union, ) from . import smart_str from .path import StrPath log = logging.getLogger(__name__) BLOCK_SIZE = 1024 * 1024 HashFunctionT = Callable[[], "hashlib._Hash"] sha1 = hashlib.sha1 sha256 = hashlib.sha256 sha512 = hashlib.sha512 sha = sha1 md5 = hashlib.md5 class HashFunctionNameEnum(str, Enum): """Hash function names that can be used to generate checksums for files.""" md5 = "MD5" sha1 = "SHA-1" sha256 = "SHA-256" sha512 = "SHA-512" # IMPORTANT: Keep this literal type in sync with HashFunctionNameEnum values above # as well as with HASH_NAME_ALIAS and HASH_NAME_MAP below. 
HashFunctionNames = Literal["MD5", "SHA-1", "SHA-256", "SHA-512"] HASH_NAME_ALIAS: Dict[str, str] = { "SHA1": "SHA-1", "SHA256": "SHA-256", "SHA512": "SHA-512", } HASH_NAME_MAP: Dict[HashFunctionNameEnum, HashFunctionT] = { HashFunctionNameEnum.md5: md5, HashFunctionNameEnum.sha1: sha1, HashFunctionNameEnum.sha256: sha256, HashFunctionNameEnum.sha512: sha512, } HASH_NAMES: List[HashFunctionNameEnum] = list(HASH_NAME_MAP.keys()) def memory_bound_hexdigest( hash_func: Optional[HashFunctionT] = None, hash_func_name: Optional[HashFunctionNameEnum] = None, path: Optional[str] = None, file=None, ): if hash_func is None: assert hash_func_name is not None hash_func = HASH_NAME_MAP[hash_func_name] hasher = hash_func() if file is None: assert path is not None file = open(path, "rb") else: assert path is None, "Cannot specify both file and path keyword arguments." try: for block in iter(lambda: file.read(BLOCK_SIZE), b""): hasher.update(block) return hasher.hexdigest() finally: file.close() def md5_hash_file(path: StrPath) -> Optional[str]: """ Return an md5 hex digest for a file or None if path could not be read. """ hasher = hashlib.md5() try: with open(path, "rb") as afile: buf = afile.read() hasher.update(buf) return hasher.hexdigest() except OSError: # This may happen if path has been deleted return None def md5_hash_str(s): """ Return hex encoded md5 hash of string s """ m = hashlib.md5() m.update(smart_str(s)) return m.hexdigest() def new_secure_hash_v2(text_type: Union[bytes, str]) -> str: """More modern version of new_secure_hash. Certain passwords are set via new_insecure_hash (previously new_secure_hash), so that needs to remain for legacy purposes. """ assert text_type is not None return sha512(smart_str(text_type)).hexdigest() def new_insecure_hash(text_type: Union[bytes, str]) -> str: """Returns the hexdigest of the sha1 hash of the argument `text_type`.
Previously called new_secure_hash, but this should not be considered secure - SHA1 is no longer considered a secure hash and has been broken since the early 2000s. use_pbkdf2 should be set by default and galaxy.security.passwords should be the default used for passwords in Galaxy. """ assert text_type is not None return sha1(smart_str(text_type)).hexdigest() def hmac_new(key: Union[bytes, str], value: Union[bytes, str]) -> str: return hmac.new(smart_str(key), smart_str(value), sha).hexdigest() def is_hashable(value: Any) -> bool: try: hash(value) except Exception: return False return True def parse_checksum_hash(checksum: str) -> Tuple[HashFunctionNameEnum, str]: """Parses checksum strings in the form of `hash_type$hash_value` considering possible aliases.""" hash_name, hash_value = checksum.split("$", 1) hash_name = hash_name.upper() if hash_name in HASH_NAME_ALIAS: hash_name = HASH_NAME_ALIAS[hash_name] if hash_name not in HASH_NAMES: raise ValueError(f"Unsupported hash function '{hash_name}'. 
Supported functions: [{','.join(HASH_NAMES)}]") return HashFunctionNameEnum(hash_name), hash_value def verify_hash(path: str, hash_func_name: HashFunctionNameEnum, hash_value: str, what: str = "path"): calculated_hash_value = memory_bound_hexdigest(hash_func_name=hash_func_name, path=path) if calculated_hash_value != hash_value: raise Exception( f"Failed to validate {what} with [{hash_func_name}] - expected [{hash_value}] got [{calculated_hash_value}]" ) __all__ = ( "md5", "hashlib", "sha1", "sha", "new_insecure_hash", "new_secure_hash_v2", "hmac_new", "is_hashable", "parse_checksum_hash", ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/heartbeat.py0000644000175100017510000001740415211124267020352 0ustar00runnerrunnerimport os import sys import threading import time import traceback from typing import Dict def get_current_thread_object_dict(): """ Get a dictionary of all 'Thread' objects created via the threading module keyed by thread_id. Note that not all interpreter threads have a thread object, only the main thread and any created via the 'threading' module. Threads created via the low level 'thread' module will not be in the returned dictionary. HACK: This mucks with the internals of the threading module since that module does not expose any way to match 'Thread' objects with interpreter thread identifiers (though it should). """ rval = {} # Acquire the lock and then union the contents of 'active' and 'limbo' # threads into the return value.
threading._active_limbo_lock.acquire() rval.update(threading._active) rval.update(threading._limbo) threading._active_limbo_lock.release() return rval class Heartbeat(threading.Thread): """ Thread that periodically dumps the state of all threads to a file """ def __init__(self, config, name="Heartbeat Thread", period=20, fname="heartbeat.log"): threading.Thread.__init__(self, name=name) self.config = config self.should_stop = False self.period = period self.fname = fname self.file = None self.fname_nonsleeping = None self.file_nonsleeping = None self.pid = None self.nonsleeping_heartbeats: Dict[int, int] = {} # Event to wait on when sleeping, allows us to interrupt for shutdown self.wait_event = threading.Event() def run(self): self.pid = os.getpid() self.fname = self.fname.format(server_name=self.config.server_name, pid=self.pid) fname, ext = os.path.splitext(self.fname) self.fname_nonsleeping = f"{fname}.nonsleeping{ext}" wait = self.period if self.period <= 0: wait = 60 while not self.should_stop: if self.period > 0: self.dump() self.wait_event.wait(wait) def open_logs(self): if self.file is None or self.file.closed: self.file = open(self.fname, "a") self.file_nonsleeping = open(self.fname_nonsleeping, "a") self.file.write(f"Heartbeat for pid {self.pid} thread started at {time.asctime()}\n\n") self.file_nonsleeping.write( f"Non-Sleeping-threads for pid {self.pid} thread started at {time.asctime()}\n\n" ) def close_logs(self): if self.file is not None and not self.file.closed: self.file.write(f"Heartbeat for pid {self.pid} thread stopped at {time.asctime()}\n\n") self.file_nonsleeping.write( f"Non-Sleeping-threads for pid {self.pid} thread stopped at {time.asctime()}\n\n" ) self.file.close() self.file_nonsleeping.close() def dump(self): self.open_logs() try: # Print separator with timestamp self.file.write(f"Traceback dump for all threads at {time.asctime()}:\n\n") # Print the thread states threads = get_current_thread_object_dict() for thread_id, frame in 
sys._current_frames().items(): if thread_id in threads: object = repr(threads[thread_id]) else: object = "" self.file.write(f"Thread {thread_id}, {object}:\n\n") traceback.print_stack(frame, file=self.file) self.file.write("\n") self.file.write("End dump\n\n") self.file.flush() self.print_nonsleeping(threads) except Exception: self.file.write("Caught exception attempting to dump thread states:") traceback.print_exc(None, self.file) self.file.write("\n") def shutdown(self): self.should_stop = True self.wait_event.set() self.close_logs() self.join() def thread_is_sleeping(self, last_stack_frame): """ Returns True if the given stack-frame represents a known sleeper function (at least in python 2.5) """ _filename = last_stack_frame[0] # _line = last_stack_frame[1] _funcname = last_stack_frame[2] _text = last_stack_frame[3] # Ugly hack to tell if a thread is supposedly sleeping or not # These are the most common sleeping functions I've found. # Is there a better way? (python interpreter internals?) # Tested only with python 2.5 if _funcname == "wait" and _text == "waiter.acquire()": return True if _funcname == "wait" and _text == "_sleep(delay)": return True if _funcname == "accept" and _text[-14:] == "_sock.accept()": return True if ( _funcname in ("monitor", "__monitor", "app_loop", "check") and _text.startswith("time.sleep(") and _text.endswith(")") ): return True if _funcname == "drain_events" and _text == "sleep(polling_interval)": return True # Ugly hack: always skip the heartbeat thread # TODO: get the current thread-id in python # skip heartbeat thread by thread-id, not by filename if _filename.find("/lib/galaxy/util/heartbeat.py") != -1: return True # By default, assume the thread is not sleeping return False def get_interesting_stack_frame(self, stack_frames): """ Scans the given backtrace stack frames and returns a single quadruple of [filename, line, function-name, text] of the single, deepest, most interesting frame.
Interesting being:: inside the galaxy source code ("/lib/galaxy"), preferably not an egg. """ for _filename, _line, _funcname, _text in reversed(stack_frames): idx = _filename.find("/lib/galaxy/") if idx != -1: relative_filename = _filename[idx:] return (relative_filename, _line, _funcname, _text) # no "/lib/galaxy" code found, return the innermost frame return stack_frames[-1] def print_nonsleeping(self, threads_object_dict): self.file_nonsleeping.write(f"Non-Sleeping threads at {time.asctime()}:\n\n") all_threads_are_sleeping = True threads = get_current_thread_object_dict() for thread_id, frame in sys._current_frames().items(): if thread_id in threads: object = repr(threads[thread_id]) else: object = "" tb = traceback.extract_stack(frame) if self.thread_is_sleeping(tb[-1]): if thread_id in self.nonsleeping_heartbeats: del self.nonsleeping_heartbeats[thread_id] continue # Count non-sleeping thread heartbeats if thread_id in self.nonsleeping_heartbeats: self.nonsleeping_heartbeats[thread_id] += 1 else: self.nonsleeping_heartbeats[thread_id] = 1 good_frame = self.get_interesting_stack_frame(tb) self.file_nonsleeping.write( f'Thread {thread_id}\t{object}\tnon-sleeping for {self.nonsleeping_heartbeats[thread_id]} heartbeat(s)\n File {good_frame[0]}:{good_frame[1]}\n Function "{good_frame[2]}"\n {good_frame[3]}\n' ) all_threads_are_sleeping = False if all_threads_are_sleeping: self.file_nonsleeping.write("All threads are sleeping.\n") self.file_nonsleeping.write("\n") self.file_nonsleeping.flush() def dump_signal_handler(self, signum, frame): self.dump() ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/image_util.py0000644000175100017510000000131015211124267020519 0ustar00runnerrunner"""Provides utilities for working with image files.""" import logging from typing import ( List, Optional, ) try: from PIL import Image except ImportError: Image = None # type: ignore[assignment, unused-ignore]
log = logging.getLogger(__name__) def image_type(filename: str) -> Optional[str]: fmt = None if Image is not None: try: with Image.open(filename) as im: fmt = im.format except Exception: pass if fmt: return fmt.upper() else: return None def check_image_type(filename: str, types: List[str]) -> bool: fmt = image_type(filename) if fmt in types: return True return False ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/inflection.py0000644000175100017510000001154115211124267020541 0ustar00runnerrunner#!/usr/bin/env python # Copyright (c) 2006 Bermi Ferrer Martinez # # bermi a-t bermilabs - com # See the end of this file for the free software, open source license (BSD-style). # # Modified by the Galaxy team. import re class Inflector: """ Inflector for pluralizing and singularizing English nouns. """ # This is a small subset of words that either have the same singular and plural form, or have no singular form NONCHANGING_WORDS = { "equipment", "information", "rice", "money", "species", "series", "sheep", "sms", } IRREGULAR_WORDS = { "person": "people", "man": "men", "child": "children", "sex": "sexes", "move": "moves", "octopus": "octopi", } PLURALIZE_RULES = ( ("(?i)(quiz)$", "\\1zes"), ("(?i)^(ox)$", "\\1en"), ("(?i)([m|l])ouse$", "\\1ice"), ("(?i)(matr|vert|ind)ix|ex$", "\\1ices"), ("(?i)(x|ch|ss|sh)$", "\\1es"), ("(?i)([^aeiouy]|qu)ies$", "\\1y"), ("(?i)([^aeiouy]|qu)y$", "\\1ies"), ("(?i)(hive)$", "\\1s"), ("(?i)(?:([^f])fe|([lr])f)$", "\\1\\2ves"), ("(?i)sis$", "ses"), ("(?i)([ti])um$", "\\1a"), ("(?i)(buffal|tomat)o$", "\\1oes"), ("(?i)(bu)s$", "\\1ses"), ("(?i)(alias|status|virus)", "\\1es"), ("(?i)(ax|test)is$", "\\1es"), ("(?i)s$", "s"), ("(?i)$", "s"), ) SINGULARIZE_RULES = ( ("(?i)(quiz)zes$", "\\1"), ("(?i)(matr)ices$", "\\1ix"), ("(?i)(vert|ind)ices$", "\\1ex"), ("(?i)^(ox)en", "\\1"), ("(?i)(alias|status|virus)es$", "\\1"), ("(?i)(cris|ax|test)es$", "\\1is"), ("(?i)(shoe)s$", 
"\\1"), ("(?i)(o)es$", "\\1"), ("(?i)(bus)es$", "\\1"), ("(?i)([m|l])ice$", "\\1ouse"), ("(?i)(x|ch|ss|sh)es$", "\\1"), ("(?i)(m)ovies$", "\\1ovie"), ("(?i)(s)eries$", "\\1eries"), ("(?i)([^aeiouy]|qu)ies$", "\\1y"), ("(?i)([lr])ves$", "\\1f"), ("(?i)(tive)s$", "\\1"), ("(?i)(hive)s$", "\\1"), ("(?i)([^f])ves$", "\\1fe"), ("(?i)(^analy)ses$", "\\1sis"), ("(?i)((a)naly|(b)a|(d)iagno|(p)arenthe|(p)rogno|(s)ynop|(t)he)ses$", "\\1\\2sis"), ("(?i)([ti])a$", "\\1um"), ("(?i)(n)ews$", "\\1ews"), ("(?i)s$", ""), ) def pluralize(self, word): """Pluralizes nouns.""" return self._transform(self.PLURALIZE_RULES, word) def singularize(self, word): """Singularizes nouns.""" return self._transform(self.SINGULARIZE_RULES, word, pluralize=False) def cond_plural(self, number_of_records, word): """Returns the plural form of a word if first parameter is greater than 1""" if number_of_records != 1: return self.pluralize(word) return word def _transform(self, rules, word, pluralize=True): return ( self._handle_nonchanging(word) or self._handle_irregular(word, pluralize=pluralize) or self._apply_rules(rules, word) or word ) def _handle_nonchanging(self, word): lower_cased_word = word.lower() # Check if word is an item or the suffix of any item in NONCHANGING_WORDS for nonchanging_word in self.NONCHANGING_WORDS: if lower_cased_word.endswith(nonchanging_word): return word def _handle_irregular(self, word, pluralize=True): for form_a, form_b in self.IRREGULAR_WORDS.items(): if not pluralize: form_a, form_b = form_b, form_a match = re.search(f"({form_a})$", word, re.IGNORECASE) if match: return re.sub(f"(?i){form_a}$", match.expand("\\1")[0] + form_b[1:], word) def _apply_rules(self, rules, word): for pattern, replacement in rules: if re.search(pattern, word): return re.sub(pattern, replacement, word) # Copyright (c) 2006 Bermi Ferrer Martinez # Permission is hereby granted, free of charge, to any person obtaining a copy # of this software to deal in this software without restriction, 
including # without limitation the rights to use, copy, modify, merge, publish, # distribute, sublicense, and/or sell copies of this software, and to permit # persons to whom this software is furnished to do so, subject to the following # condition: # # THIS SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, # OUT OF OR IN CONNECTION WITH THIS SOFTWARE OR THE USE OR OTHER DEALINGS IN # THIS SOFTWARE. ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/json.py0000644000175100017510000001577715211124267017377 0ustar00runnerrunnerimport copy import json import logging import math import random import string from collections.abc import ( Iterable, Mapping, Sequence, ) from decimal import Decimal from . import unicodify __all__ = ("safe_dumps", "validate_jsonrpc_request", "validate_jsonrpc_response", "jsonrpc_request", "jsonrpc_response") log = logging.getLogger(__name__) to_json_string = json.dumps from_json_string = json.loads def swap_inf_nan(val): """ This takes an arbitrary object and preps it for jsonifying safely, templating Inf/NaN and casting Decimal instances as strings. """ if isinstance(val, str): # basestring first, because it's a sequence and would otherwise get caught below. 
return val elif isinstance(val, Sequence): return [swap_inf_nan(v) for v in val] elif isinstance(val, Mapping): return {swap_inf_nan(k): swap_inf_nan(v) for (k, v) in val.items()} elif isinstance(val, float): if math.isnan(val): return "__NaN__" elif val == math.inf: return "__Infinity__" elif val == -math.inf: return "__-Infinity__" else: return val elif isinstance(val, Decimal): return str(val) else: return val def safe_loads(arg): """ This is a wrapper around loads that returns the parsed value instead of raising a value error. It also avoids autoconversion of non-iterables i.e numeric and boolean values. """ try: loaded = json.loads(arg) if loaded is not None and not isinstance(loaded, Iterable): loaded = arg except (TypeError, ValueError): loaded = copy.deepcopy(arg) return loaded def safe_dumps(obj, **kwargs): """ This is a wrapper around dumps that encodes Infinity and NaN values. It's a fairly rare case (which will be low in request volume). Basically, we tell json.dumps to blow up if it encounters Infinity/NaN, or Decimal values and we 'fix' it before re-encoding. 
""" try: dumped = json.dumps(obj, allow_nan=False, **kwargs) except (ValueError, TypeError): obj = swap_inf_nan(obj) dumped = json.dumps(obj, allow_nan=False, **kwargs) if kwargs.get("escape_closing_tags", True): return dumped.replace(">> node = Node('a', None) >>> assert node._items == {'text': 'a', 'children': dictobj.MutableDictionaryObject({})} >>> assert node.jsonData() == {'text': 'a'} >>> node = Node('a', 1) >>> assert node._items == {'text': 'a', 'children': dictobj.MutableDictionaryObject({}), 'li_attr': dictobj.DictionaryObject({'id': 1}), 'id': 1} >>> assert node.jsonData() == {'text': 'a', 'id': 1, 'li_attr': {'id': 1}} >>> node = Node('a', 5, icon="folder", state = {'opened': True}) >>> assert node._items == {'text': 'a', 'id': 5, 'state': dictobj.DictionaryObject({'opened': True}), 'children': dictobj.MutableDictionaryObject({}), 'li_attr': dictobj.DictionaryObject({'id': 5}), 'icon': 'folder'} >>> assert node.jsonData() == {'text': 'a', 'state': {'opened': True}, 'id': 5, 'li_attr': {'id': 5}, 'icon': 'folder'} """ super().__init__() children = kwargs.get("children", {}) if len([key for key in children if not isinstance(children[key], Node)]): raise TypeError(f"One or more children were not instances of '{Node.__name__}'") if "children" in kwargs: del kwargs["children"] self._items["children"] = dictobj.MutableDictionaryObject(children) if oid is not None: li_attr = kwargs.get("li_attr", {}) li_attr["id"] = oid kwargs["li_attr"] = li_attr self._items["id"] = oid self._items.update(dictobj.DictionaryObject(**kwargs)) self._items["text"] = path def jsonData(self): children = [self.children[k].jsonData() for k in sorted(self.children)] output = {} for k in self._items: if "children" == k: continue if isinstance(self._items[k], dictobj.DictionaryObject): output[k] = self._items[k].asdict() else: output[k] = self._items[k] if len(children): output["children"] = children return output class JSTree(dictobj.DictionaryObject): """ An immutable dictionary-like 
object that converts a list of "paths" into a tree structure suitable for jQuery's jsTree. """ def __init__(self, paths, **kwargs): """ Take a list of paths and put them into a tree. Paths with the same prefix should be at the same level in the tree. kwargs may be standard jsTree options used at all levels in the tree. These will be output in the JSON. """ if len([p for p in paths if not isinstance(p, Path)]): raise TypeError(f"All paths must be instances of '{Path.__name__}'") super().__init__() root = Node("", None, **kwargs) for path in sorted(paths): curr = root subpaths = path.path.split(os.path.sep) for i, subpath in enumerate(subpaths): if subpath not in curr.children: opt = copy.deepcopy(kwargs) if len(subpaths) - 1 == i: oid = path.id opt.update(path.options) if path.options is not None else None else: oid = None curr.children[subpath] = Node(subpath, oid, **opt) # oid = path.id if len(subpaths) - 1 == i else None # curr.children[subpath] = Node(subpath, oid, **kwargs) curr = curr.children[subpath] self._items["_root"] = root def pretty(self, root=None, depth=0, spacing=2): """ Create a "pretty print" representation of the tree with customized indentation at each level of the tree. """ if root is None: root = self._root fmt = "%s%s/" if root.children else "%s%s" s = fmt % (" " * depth * spacing, root.text) for child in root.children: child = root.children[child] s += f"\n{self.pretty(child, depth + 1, spacing)}" return s def jsonData(self): """ Returns a copy of the internal tree in a JSON-friendly format, ready for consumption by jsTree. The data is represented as a list of dictionaries, each of which is one of our internal nodes.
""" return [self._root.children[k].jsonData() for k in sorted(self._root.children)] ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/lazy_process.py0000644000175100017510000000313415211124267021123 0ustar00runnerrunnerimport subprocess import threading import time class LazyProcess: """Abstraction describing a command line launching a service - probably as needed as functionality is accessed in Galaxy. """ def __init__(self, command_and_args): self.command_and_args = command_and_args self.thread_lock = threading.Lock() self.allow_process_request = True self.process = None def start_process(self): with self.thread_lock: if self.allow_process_request: self.allow_process_request = False t = threading.Thread(target=self.__start) t.daemon = True t.start() def __start(self): with self.thread_lock: self.process = subprocess.Popen(self.command_and_args, close_fds=True) def shutdown(self): with self.thread_lock: self.allow_process_request = False if self.running: assert self.process # tell type checker it can not be None if self.running self.process.terminate() time.sleep(0.01) if self.running: self.process.kill() @property def running(self): return self.process and not self.process.poll() class NoOpLazyProcess: """LazyProcess abstraction meant to describe potentially optional services, in those cases where one is not configured or valid, this class can be used in place of LazyProcess. 
""" def start_process(self): return def shutdown(self): return @property def running(self): return False ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/markdown.py0000644000175100017510000000061515211124267020231 0ustar00runnerrunner"""Common markdown formatting helpers for per-datatype rendering.""" def literal_via_fence(content): return "\n{}\n".format("\n".join(f" {line}" for line in content.splitlines())) def indicate_data_truncated(): return "\n**Warning:** The above data has been truncated to be embedded in this document.\n\n" def pre_formatted_contents(markdown): return f"
<pre>{markdown}</pre>
" ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/monitors.py0000644000175100017510000000276515211124267020271 0ustar00runnerrunnerimport logging import threading from .sleeper import Sleeper log = logging.getLogger(__name__) DEFAULT_MONITOR_THREAD_JOIN_TIMEOUT = 5 class Monitors: def _init_monitor_thread(self, name, target_name=None, target=None, start=False, config=None): self.monitor_join_sleep = getattr(config, "monitor_thread_join_timeout", DEFAULT_MONITOR_THREAD_JOIN_TIMEOUT) self.monitor_join = self.monitor_join_sleep > 0 self.monitor_running = True if target is not None: assert target_name is None monitor_func = target else: target_name = target_name or "monitor" monitor_func = getattr(self, target_name) self.sleeper = Sleeper() self.monitor_thread = threading.Thread(name=name, target=monitor_func) self.monitor_thread.daemon = True self._start = start self.start_monitoring() def _init_noop_monitor(self): self.sleeper = None self.monitor_join = False def start_monitoring(self): if self._start: self.monitor_thread.start() def stop_monitoring(self): self.monitor_running = False def _monitor_sleep(self, sleep_amount): self.sleeper.sleep(sleep_amount) def shutdown_monitor(self): self.stop_monitoring() if self.sleeper is not None: self.sleeper.wake() if self.monitor_join: log.debug("Joining monitor thread") self.monitor_thread.join(self.monitor_join_sleep) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/odict.py0000644000175100017510000000604115211124267017510 0ustar00runnerrunner""" Ordered dictionary implementation with `insert` functionality. This is only used in one specific place in the codebase: galaxy.tool_util.toolbox.panel Whenever possible the stdlib `collections.OrderedDict` should be used instead of this custom implementation. 
""" import sys from collections import UserDict from typing import ( Dict, Generic, List, Optional, Tuple, TypeVar, Union, ) dict_alias = dict KeyT = TypeVar("KeyT") ValueT = TypeVar("ValueT") if sys.version_info >= (3, 9): # A simple type alias doesn't work with mypy class TypedUserDict(UserDict[KeyT, ValueT]): ... else: # UserDict is not generic in Python < 3.9 # TypeError: 'ABCMeta' object is not subscriptable class TypedUserDict(UserDict, Generic[KeyT, ValueT]): ... class odict(TypedUserDict[KeyT, ValueT]): """ http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/107747 This dictionary class extends UserDict to record the order in which items are added. Calling keys(), values(), items(), etc. will return results in this order. """ def __init__(self, dict: Optional[Union[Dict[KeyT, ValueT], List[Tuple[KeyT, ValueT]]]] = None) -> None: item = dict self._keys: List[KeyT] = [] if isinstance(item, dict_alias): super().__init__(item) else: super().__init__(None) if isinstance(item, list): for key, value in item: self[key] = value def __delitem__(self, key: KeyT) -> None: super().__delitem__(key) self._keys.remove(key) def __setitem__(self, key: KeyT, item: ValueT) -> None: super().__setitem__(key, item) if key not in self._keys: self._keys.append(key) def clear(self) -> None: super().clear() self._keys = [] def copy(self) -> "odict[KeyT, ValueT]": new: odict[KeyT, ValueT] = odict() new.update(self) return new def items(self): return zip(self._keys, self.values()) def keys(self): return self._keys[:] def popitem(self) -> Tuple[KeyT, ValueT]: try: key = self._keys[-1] except IndexError: raise KeyError("dictionary is empty") val = self[key] del self[key] return (key, val) def setdefault(self, key, failobj=None): if key not in self._keys: self._keys.append(key) return super().setdefault(key, failobj) def values(self): return map(self.get, self._keys) def iterkeys(self): return iter(self._keys) def itervalues(self): for key in self._keys: yield self.get(key) def 
iteritems(self): for key in self._keys: yield key, self.get(key) def __iter__(self): yield from self._keys def reverse(self): self._keys.reverse() def insert(self, index, key: KeyT, item: ValueT) -> None: if key not in self._keys: self._keys.insert(index, key) super().__setitem__(key, item) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/oset.py0000644000175100017510000000325515211124267017364 0ustar00runnerrunner""" Ordered set implementation from https://code.activestate.com/recipes/576694/ """ from collections.abc import MutableSet class OrderedSet(MutableSet): def __init__(self, iterable=None): self.end = end = [] end += [None, end, end] # sentinel node for doubly linked list self.map = {} # key --> [key, prev, next] if iterable is not None: self |= iterable def __len__(self): return len(self.map) def __contains__(self, key): return key in self.map def add(self, key): if key not in self.map: end = self.end curr = end[1] curr[2] = end[1] = self.map[key] = [key, curr, end] def discard(self, key): if key in self.map: key, prev, next = self.map.pop(key) prev[2] = next next[1] = prev def __iter__(self): end = self.end curr = end[2] while curr is not end: yield curr[0] curr = curr[2] def __reversed__(self): end = self.end curr = end[1] while curr is not end: yield curr[0] curr = curr[1] def pop(self, last=True): if not self: raise KeyError("set is empty") key = self.end[1][0] if last else self.end[2][0] self.discard(key) return key def __repr__(self): if not self: return f"{self.__class__.__name__}()" return f"{self.__class__.__name__}({list(self)!r})" def __eq__(self, other): if isinstance(other, OrderedSet): return len(self) == len(other) and list(self) == list(other) return set(self) == set(other) ././@PaxHeader0000000000000000000000000000003300000000000010211 xustar0027 mtime=1780787404.688734 galaxy_util-26.0.1/galaxy/util/path/0000755000175100017510000000000015211124315016761 
5ustar00runnerrunner././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/path/__init__.py0000644000175100017510000004077115211124267021111 0ustar00runnerrunner"""Path manipulation functions.""" import errno import importlib import logging import shlex import types from functools import partial from itertools import starmap from operator import getitem from os import ( extsep, makedirs, PathLike, stat, walk, ) from os.path import ( abspath, basename, dirname, exists, isabs, islink, join, normpath, pardir, realpath, relpath, sep as separator, ) from pathlib import Path from typing import ( AnyStr, Iterator, List, Optional, Tuple, TYPE_CHECKING, Union, ) try: from grp import getgrgid except ImportError: getgrgid = None # type: ignore[assignment] try: from pwd import getpwuid except ImportError: getpwuid = None # type: ignore[assignment] import galaxy.util # Stable in Python 3.10 path types if TYPE_CHECKING: StrPath = Union[str, PathLike[str]] BytesPath = Union[bytes, PathLike[bytes]] GenericPath = Union[AnyStr, PathLike[AnyStr]] StrOrBytesPath = Union[str, bytes, PathLike[str], PathLike[bytes]] else: StrPath = Union[str, PathLike] BytesPath = Union[bytes, PathLike] GenericPath = Union[AnyStr, PathLike] StrOrBytesPath = Union[str, bytes, PathLike, PathLike] AllowListT = Optional[List[GenericPath]] WALK_MAX_DIRS = 10000 log = logging.getLogger(__name__) def safe_path(path: GenericPath, allowlist: AllowListT = None): """Ensure that the absolute location of the path (after following symlinks) is either itself or on the allowlist of acceptable locations. This function does not perform an existence check, thus, if the path does not exist, ``True`` is returned.
:type path: string :param path: a path to check :type allowlist: list of strings :param allowlist: list of acceptable locations :return: ``True`` if ``path`` resolves to itself or an allowlisted location """ return any(__contains(dirname(path), path, allowlist=allowlist)) def safe_contains(prefix: GenericPath, path: GenericPath, allowlist: AllowListT = None, real=None): """Ensure a path is contained within another path. Given any two filesystem paths, ensure that ``path`` is contained in ``prefix``. If ``path`` exists (either as an absolute path or relative to ``prefix``), it is canonicalized with :func:`os.path.realpath` to ensure it is not a symbolic link that points outside of ``prefix``. If it is a symbolic link and ``allowlist`` is set, the symbolic link may also point inside an ``allowlist`` path. The ``path`` is checked against ``allowlist`` using either its absolute pathname (if passed in as absolute) or relative to ``prefix`` and canonicalized (if applicable). It is *not* ``os.path.join()``ed with each ``allowlist`` directory. :type prefix: string :param prefix: a directory under which ``path`` is to be checked :type path: string :param path: a filename to check :type allowlist: list of strings :param allowlist: list of additional paths under which ``path`` may be located :rtype: bool :returns: ``True`` if ``path`` is contained within ``prefix`` or ``allowlist``, ``False`` otherwise.
""" return any(__contains(prefix, path, allowlist=allowlist, real=real)) class _SafeContainsDirectoryChecker: def __init__(self, dirpath, prefix, allowlist=None): self.allowlist = allowlist self.dirpath = dirpath self.prefix = prefix self.real_dirpath = realpath(join(prefix, dirpath)) def check(self, filename: GenericPath) -> bool: dirpath_path = join(self.real_dirpath, filename) if islink(dirpath_path): return safe_contains(self.prefix, filename, allowlist=self.allowlist) else: return safe_contains(self.prefix, filename, allowlist=self.allowlist, real=dirpath_path) def safe_makedirs(path: GenericPath) -> None: """Safely make a directory, do not fail if it already exists or is created during execution. :type path: string :param path: a directory to create """ # prechecking for existence is faster than try/except if not exists(path): try: makedirs(path) except OSError as e: # reviewing the source for Python 2.7, this would only ever happen for the last path element anyway so no # need to recurse - this exception means the last part of the path was already in existence. if e.errno != errno.EEXIST: raise def safe_relpath(path: GenericPath) -> bool: """Determine whether a relative path references a path outside its root. This is a path computation: the filesystem is not accessed to confirm the existence or nature of ``path``. :type path: string :param path: a path to check :rtype: bool :returns: ``True`` if path is relative and does not reference a path in a parent directory, ``False`` otherwise. """ return not (isabs(path) or normpath(path).startswith(pardir)) def safe_walk(path, allowlist=None): """Walk a path and return only the contents that are not symlinks outside the path. Symbolic links are followed if a allowlist is provided. The path itself cannot be a symbolic link unless the pointed to location is in the allowlist. 
:type path: string :param path: a directory to check for unsafe contents :type allowlist: list of strings :param allowlist: list of additional paths under which contents may be located :rtype: iterator :returns: Iterator of "safe" ``os.walk()`` tuples found under ``path`` """ for i, elems in enumerate(walk(path, followlinks=bool(allowlist)), start=1): dirpath, dirnames, filenames = elems _check = _SafeContainsDirectoryChecker(dirpath, path, allowlist=allowlist).check if allowlist and i % WALK_MAX_DIRS == 0: raise RuntimeError( f"Breaking out of walk of {path!r} after {WALK_MAX_DIRS} iterations (most likely infinite symlink recursion) at: {dirpath!r}" ) _prefix = partial(join, dirpath) prune = False for dname in dirnames: if not _check(join(dirpath, dname)): prune = True break if prune: dirnames = map(basename, filter(_check, map(_prefix, dirnames))) prune = False for filename in filenames: if not _check(join(dirpath, filename)): prune = True break if prune: filenames = map(basename, filter(_check, map(_prefix, filenames))) yield (dirpath, dirnames, filenames) def unsafe_walk(path: GenericPath, allowlist: AllowListT = None, username: Optional[str] = None): """Walk a path and ensure that none of its contents are symlinks outside the path. It is assumed that ``path`` itself has already been validated e.g. with :func:`safe_relpath` or :func:`safe_contains`. This function is most useful for the case where you want to test whether a directory contains safe paths, but do not want to actually walk the safe contents. 
:type path: string :param path: a directory to check for unsafe contents :type allowlist: list of strings :param allowlist: list of additional paths under which contents may be located :rtype: list of strings :returns: A list of "bad" files found under ``path`` """ unsafe_paths = [] for walked_path in __walk(abspath(path)): is_safe = safe_contains(path, walked_path, allowlist=allowlist) if username and is_safe: is_safe = full_path_permission_for_user(path, walked_path, username=username, skip_prefix=True) if not is_safe: unsafe_paths.append(walked_path) return unsafe_paths def __path_permission_for_user(path: GenericPath, username: str) -> bool: """ :type path: string :param path: a directory or file to check :type username: string :param username: a username matching the systems username """ if getpwuid is None or getgrgid is None: raise NotImplementedError("This functionality is not implemented for Windows.") group_id_of_file = stat(path).st_gid file_owner = getpwuid(stat(path).st_uid) group_members = getgrgid(group_id_of_file).gr_mem oct_mode = oct(stat(path).st_mode) owner_permissions = int(oct_mode[-3]) group_permissions = int(oct_mode[-2]) other_permissions = int(oct_mode[-1]) if ( other_permissions >= 4 or (file_owner.pw_name == username and owner_permissions >= 4) or (username in group_members and group_permissions >= 4) ): return True return False def full_path_permission_for_user(prefix, path, username: str, skip_prefix=False): """ Assuming username is identical to the os username, this checks that the given user can read the specified path by checking the file permission and each parent directory permission. 
:type prefix: string :param prefix: a directory under which ``path`` is to be checked :type path: string :param path: a filename to check :type username: string :param username: a username matching the system's username :type skip_prefix: bool :param skip_prefix: skip the given prefix from being checked for permissions """ full_path = realpath(join(prefix, path)) top_path = realpath(prefix) if skip_prefix else None can_read = __path_permission_for_user(full_path, username) if can_read: depth = 0 max_depth = full_path.count(separator) parent_path = dirname(full_path) while can_read and depth != max_depth: if parent_path in [separator, top_path]: break if not __path_permission_for_user(parent_path, username): can_read = False depth += 1 parent_path = dirname(parent_path) return can_read def joinext(root: str, ext: str) -> str: """ Roughly the reverse of os.path.splitext. :type root: string :param root: part of the filename before the extension :type ext: string :param ext: the extension :rtype: string :returns: ``root`` joined with ``ext`` separated by a single ``os.extsep`` """ return extsep.join((root.rstrip(extsep), ext.lstrip(extsep))) def has_ext(path: AnyStr, ext: str, aliases=False, ignore=None): """ Determine whether ``path`` has extension ``ext`` :type path: string :param path: Path to check :type ext: string :param ext: Extension to check :type aliases: bool :param aliases: Check any known aliases for the given extension :type ignore: string :param ignore: Ignore this extension at the end of the path (e.g. ``sample``) :rtype: bool :returns: ``True`` if ``path`` has extension ``ext`` (or, with ``aliases``, one of its known aliases), ``False`` otherwise.
""" ext = __ext_strip_sep(ext) root, _ext = __splitext_ignore(path, ignore=ignore) if aliases: return _ext in extensions[ext] else: return _ext == ext def get_ext(path: AnyStr, ignore=None, canonicalize=True) -> str: """ Return the extension of ``path`` :type path: string :param path: Path to check :type ignore: string :param ignore: Ignore this extension at the end of the path (e.g. ``sample``) :type canonicalize: bool :param canonicalize: If the extension is known to this module, return the canonicalized extension instead of the file's actual extension :rtype: string """ root, ext = __splitext_ignore(path, ignore=ignore) if canonicalize: try: ext = extensions.canonicalize(ext) except KeyError: pass # should do something else here? return ext class Extensions(dict): """Mappings for extension aliases. A dict-like object that returns values for keys that are not mapped if the key can be found in any of the dict's values (which should be sequence types). The first item in the sequence should match the key and is the "canonicalization". 
""" def __missing__(self, key): for v in self.values(): if key in v: self[key] = v return v raise KeyError(key) def canonicalize(self, ext: str) -> str: # shouldn't raise an IndexError because it should raise a KeyError first return self[ext][0] extensions = Extensions( { "ini": ["ini"], "json": ["json"], "yaml": ["yaml", "yml"], } ) def external_chown(path, pwent, external_chown_script, description="file"): """ call the external chown script to change the user and group of the given path, and additional description of the file/path for the log message can be given return True in case of success """ try: if not external_chown_script: raise ValueError("external_chown_script is not defined") if Path(path).owner() == pwent[0]: return True cmd = shlex.split(external_chown_script) cmd.extend([path, pwent[0], str(pwent[3])]) log.debug(f"Changing ownership of {path} with: '{galaxy.util.shlex_join(cmd)}'") galaxy.util.commands.execute(cmd) return True except galaxy.util.commands.CommandLineException as e: log.warning(f"Changing ownership of {description} {path} failed: {galaxy.util.unicodify(e)}") return False def __listify(item) -> Union[list, tuple]: """A non-splitting version of :func:`galaxy.util.listify`.""" if not item: return [] elif isinstance(item, list) or isinstance(item, tuple): return item else: return [item] # helpers def __walk(path: GenericPath) -> Iterator[GenericPath]: for dirpath, dirnames, filenames in walk(path): for name in dirnames: yield join(dirpath, name) for name in filenames: yield join(dirpath, name) def __contains( prefix: GenericPath, path: GenericPath, allowlist: AllowListT = None, real: Optional[GenericPath] = None ): real = real or realpath(join(prefix, path)) yield not relpath(real, prefix).startswith(pardir) for aldir in allowlist or []: # a path is under the allowlist if the relative path between it and the allowlist does not have to go up (..) 
yield not relpath(real, aldir).startswith(pardir) def __ext_strip_sep(ext: str) -> str: return ext.lstrip(extsep) def __splitext_no_sep(path: AnyStr) -> List[str]: path_as_str = galaxy.util.unicodify(path) return (path_as_str.rsplit(extsep, 1) + [""])[0:2] def __splitext_ignore(path: AnyStr, ignore: Optional[Union[List[str], Tuple[str]]] = None) -> Tuple[str, str]: # note: unlike os.path.splitext this strips extsep from ext ignore_map = map(__ext_strip_sep, __listify(ignore)) root, ext = __splitext_no_sep(path) if ext in ignore_map: new_path = path[0 : (-len(ext) - 1)] root, ext = __splitext_no_sep(new_path) return (root, ext) # cross-platform support def _build_self(target: types.ModuleType, path_module: types.ModuleType) -> None: """Populate a module with the same exported functions as this module, but using the given os.path module. :type target: module :param target: module on which to set ``galaxy.util.path`` functions :type path_module: ``ntpath`` or ``posixpath`` module :param path_module: module implementing ``os.path`` API to use for path functions """ self_copy = importlib.import_module(__name__) self_copy.__set_fxns_on(target, path_module) def __set_fxns_on(target, path_module): """Overrides imported os.path functions with the ones from path_module and populates target with the global functions from this module. 
""" for name in __pathfxns__: globals()[name] = getattr(path_module, name) __get = partial(getitem, globals()) __set = partial(setattr, target) # this is actually izip(..., imap(...)) __fxns = zip(__all__, map(__get, __all__)) # list() to execute list(starmap(__set, __fxns)) __pathfxns__ = ( "abspath", "basename", "exists", "isabs", "join", "normpath", "pardir", "realpath", "relpath", ) __all__ = ( "extensions", "get_ext", "has_ext", "join", "joinext", "safe_contains", "safe_makedirs", "safe_relpath", "safe_walk", "unsafe_walk", ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/path/ntpath.py0000644000175100017510000000031315211124267020634 0ustar00runnerrunner"""Galaxy "safe" path functions forced to work with Windows-style paths regardless of current platform""" import ntpath import sys from . import _build_self _build_self(sys.modules[__name__], ntpath) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/path/posixpath.py0000644000175100017510000000031715211124267021361 0ustar00runnerrunner"""Galaxy "safe" path functions forced to work with POSIX-style paths regardless of current platform""" import posixpath import sys from . import _build_self _build_self(sys.modules[__name__], posixpath) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/permutations.py0000644000175100017510000001316615211124267021146 0ustar00runnerrunner"""There is some shared logic between matching/multiplying inputs in workflows and tools. This module is meant to capture some general permutation logic that can be applicable for both cases but will only be used in the newer tools case first. Maybe this doesn't make sense and maybe much of this stuff could be replaced with itertools product and permutations. These are open questions. 
""" import copy from typing import Tuple from galaxy.exceptions import MessageException from galaxy.util.bunch import Bunch input_classification = Bunch( SINGLE="single", MATCHED="matched", MULTIPLIED="multiplied", ) class InputMatchedException(MessageException): """Indicates problem matching inputs while building up inputs permutations.""" def build_combos(single_inputs, matched_multi_inputs, multiplied_multi_inputs, nested): # Build up every combination of inputs to be run together. input_combos = __extend_with_matched_combos(single_inputs, matched_multi_inputs, nested) input_combos = __extend_with_multiplied_combos(input_combos, multiplied_multi_inputs, nested) return input_combos def __extend_with_matched_combos(single_inputs, multi_inputs, nested): """ {a => 1, b => 2} and {c => {3, 4}, d => {5, 6}} Becomes [ {a => 1, b => 2, c => 3, d => 5}, {a => 1, b => 2, c => 4, d => 6}, ] """ if len(multi_inputs) == 0: return [single_inputs] matched_multi_inputs = [] first_multi_input_key = next(iter(multi_inputs.keys())) first_multi_value = multi_inputs.get(first_multi_input_key) for value in first_multi_value: new_inputs = __copy_and_extend_inputs(single_inputs, first_multi_input_key, value, nested=nested) matched_multi_inputs.append(new_inputs) for multi_input_key, multi_input_values in multi_inputs.items(): if multi_input_key == first_multi_input_key: continue if len(multi_input_values) != len(first_multi_value): raise InputMatchedException( f"Received {len(multi_input_values)} inputs for '{multi_input_key}' and {len(first_multi_value)} inputs for '{first_multi_input_key}', these should be of equal length" ) for index, value in enumerate(multi_input_values): state_set_value(matched_multi_inputs[index], multi_input_key, value, nested) return matched_multi_inputs def __extend_with_multiplied_combos(input_combos, multi_inputs, nested): combos = input_combos for multi_input_key, multi_input_value in multi_inputs.items(): iter_combos = [] for combo in combos: for 
input_value in multi_input_value: iter_combo = __copy_and_extend_inputs(combo, multi_input_key, input_value, nested) iter_combos.append(iter_combo) combos = iter_combos return combos def __copy_and_extend_inputs(inputs, key, value, nested): # copy the state first (see state_copy), then set the new value on the copy new_inputs = state_copy(inputs, nested) state_set_value(new_inputs, key, value, nested) return new_inputs def state_copy(inputs, nested): # can't deepcopy dicts with our models for reasons I don't understand, # test_map_over_two_collections_unlinked breaks if I try to combine these two branches of the if if nested: state_dict_copy = copy.deepcopy(inputs) else: state_dict_copy = dict(inputs) return state_dict_copy def state_set_value(state_dict, key, value, nested): if "|" not in key or not nested: state_dict[key] = value else: first, rest = key.split("|", 1) if first not in state_dict and looks_like_flattened_repeat_key(first): repeat_name, index = split_flattened_repeat_key(first) if repeat_name not in state_dict: state_dict[repeat_name] = [] repeat_state = state_dict[repeat_name] while len(repeat_state) <= index: repeat_state.append({}) state_set_value(repeat_state[index], rest, value, nested) else: state_set_value(state_dict[first], rest, value, nested) def state_remove_value(state_dict, key, nested): if "|" not in key or not nested: del state_dict[key] else: first, rest = key.split("|", 1) child_dict = state_dict[first] # repeats?
if "|" in rest: state_remove_value(child_dict, rest, nested) else: del child_dict[rest] if len(child_dict) == 0: del state_dict[first] def state_get_value(state_dict, key, nested): if "|" not in key or not nested: return state_dict[key] else: first, rest = key.split("|", 1) if first not in state_dict and looks_like_flattened_repeat_key(first): repeat_name, index = split_flattened_repeat_key(first) return state_get_value(state_dict[repeat_name][index], rest, nested) else: return state_get_value(state_dict[first], rest, nested) def is_in_state(state_dict, key, nested): if not state_dict: return False if "|" not in key or not nested: return key in state_dict else: first, rest = key.split("|", 1) # repeats? return is_in_state(state_dict.get(first), rest, nested) def looks_like_flattened_repeat_key(key: str) -> bool: parts = key.rsplit("_", 1) return len(parts) == 2 and parts[1].isdigit() def split_flattened_repeat_key(key: str) -> Tuple[str, int]: input_name, _index = key.rsplit("_", 1) index = int(_index) return input_name, index ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/plugin_config.py0000644000175100017510000001304515211124267021233 0ustar00runnerrunnerfrom types import ModuleType from typing import ( Any, cast, Dict, Generator, Iterable, List, NamedTuple, Optional, Type, TypeVar, Union, ) import yaml from galaxy.util import parse_xml from galaxy.util.path import StrPath from galaxy.util.submodules import import_submodules PluginDictConfigT = Dict[str, Any] PluginConfigsT = Union[PluginDictConfigT, List[PluginDictConfigT]] class PluginConfigSource(NamedTuple): type: str source: Any def plugins_dict(module: ModuleType, plugin_type_identifier: str) -> Dict[str, Type]: """Walk through all classes in submodules of module and find ones labelled with specified plugin_type_identifier and throw in a dictionary to allow constructions from plugins by these types later on.
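
A minimal sketch of the lookup described above, assuming the submodules are already imported. ``collect_plugins``, ``CsvReader`` and ``demo_plugins`` are hypothetical names used only for illustration; the real function discovers submodules itself.

```python
import types


def collect_plugins(modules, plugin_type_identifier):
    # Hypothetical stand-in: map the value of the identifier attribute to the
    # class, for every class a module exports through __all__.
    found = {}
    for module in modules:
        for name in getattr(module, "__all__", []):
            clazz = getattr(module, name)
            plugin_type = getattr(clazz, plugin_type_identifier, None)
            if plugin_type:
                found[plugin_type] = clazz
    return found


class CsvReader:
    plugin_type = "csv"


demo = types.ModuleType("demo_plugins")
demo.CsvReader = CsvReader
demo.__all__ = ["CsvReader"]

registry = collect_plugins([demo], "plugin_type")
reader = registry["csv"]()  # construct a plugin from its type string
```

Classes without the identifier attribute, or with a falsy value for it, are skipped.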
""" plugin_dict = {} for plugin_module in import_submodules(module, ordered=True): for clazz in __plugin_classes_in_module(plugin_module): plugin_type = getattr(clazz, plugin_type_identifier, None) if plugin_type: plugin_dict[plugin_type] = clazz return plugin_dict T = TypeVar("T") def load_plugins( plugins_dict: Dict[str, Type[T]], plugin_source: PluginConfigSource, extra_kwds: Optional[Dict[str, Any]] = None, plugin_type_keys: Iterable[str] = ("type",), dict_to_list_key: Optional[str] = None, ) -> List[T]: if extra_kwds is None: extra_kwds = {} if plugin_source.type == "xml": return __load_plugins_from_element(plugins_dict, plugin_source.source, extra_kwds) else: return __load_plugins_from_dicts( plugins_dict, plugin_source.source, extra_kwds, plugin_type_keys=plugin_type_keys, dict_to_list_key=dict_to_list_key, ) def __plugin_classes_in_module(plugin_module: ModuleType) -> Generator[Type, None, None]: for clazz in getattr(plugin_module, "__all__", []): try: clazz = getattr(plugin_module, clazz) except TypeError: clazz = clazz yield clazz def __load_plugins_from_element( plugins_dict: Dict[str, Type[T]], plugins_element, extra_kwds: Dict[str, Any] ) -> List[T]: plugins = [] for plugin_element in plugins_element: plugin_type = plugin_element.tag plugin_kwds = dict(plugin_element.items()) plugin_kwds.update(extra_kwds) try: plugin_klazz = plugins_dict[plugin_type] except KeyError: template = "Failed to find plugin of type [%s] in available plugin types %s" message = template % (plugin_type, str(plugins_dict.keys())) raise Exception(message) plugin = __create_plugin_instance(plugin_klazz, plugin_kwds) plugins.append(plugin) return plugins def __as_configurable_plugin_instance(obj: Any) -> Optional[Type]: """Check if the class implements the configurable plugin pattern.""" try: if isinstance(obj, type) and hasattr(obj, "build_template_config"): return obj except TypeError: pass return None def __create_plugin_instance(plugin_class: Type[T], plugin_kwds: Dict[str, 
Any]) -> T: """Create an instance of the plugin class with the provided keyword arguments.""" configurable_instance = __as_configurable_plugin_instance(plugin_class) if configurable_instance: plugin_template_config = configurable_instance.build_template_config(**plugin_kwds) return configurable_instance(template_config=plugin_template_config) else: return plugin_class(**plugin_kwds) def __load_plugins_from_dicts( plugins_dict: Dict[str, Type[T]], configs: PluginConfigsT, extra_kwds: Dict[str, Any], plugin_type_keys: Iterable[str], dict_to_list_key: Optional[str], ) -> List[T]: plugins = [] configs_as_list: List[PluginDictConfigT] if isinstance(configs, dict) and dict_to_list_key is not None: configs_as_list = [] for key, value in configs.items(): config = value.copy() config[dict_to_list_key] = key configs_as_list.append(config) else: configs_as_list = cast(List[PluginDictConfigT], configs) for config in configs_as_list: plugin_type = None for plugin_type_key in plugin_type_keys: if plugin_type_key in config: plugin_type = config[plugin_type_key] break assert plugin_type is not None, f"Could not determine plugin type for [{config}]" plugin_kwds = config if extra_kwds: plugin_kwds = plugin_kwds.copy() plugin_kwds.update(extra_kwds) plugin_class = plugins_dict[plugin_type] plugin = __create_plugin_instance(plugin_class, plugin_kwds) plugins.append(plugin) return plugins def plugin_source_from_path(path: StrPath) -> PluginConfigSource: filename = str(path) if ( filename.endswith(".yaml") or filename.endswith(".yml") or filename.endswith(".yaml.sample") or filename.endswith(".yml.sample") ): return PluginConfigSource("dict", __read_yaml(path)) else: return PluginConfigSource("xml", parse_xml(path, remove_comments=True).getroot()) def plugin_source_from_dict(as_dict: PluginConfigsT) -> PluginConfigSource: return PluginConfigSource("dict", as_dict) def __read_yaml(path: StrPath): if yaml is None: raise ImportError("Attempting to read YAML configuration file - but PyYAML 
dependency unavailable.") with open(path, "rb") as f: return yaml.safe_load(f) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/properties.py0000644000175100017510000001755015211124267020611 0ustar00runnerrunner"""Module used to blend ini, environment, and explicit dictionary properties to determine application configuration. Some hard coded defaults for Galaxy but this should be reusable by tool shed and pulsar as well. """ import os import os.path import sys from configparser import ( BasicInterpolation, ConfigParser, InterpolationError, ) from functools import partial from itertools import ( product, starmap, ) from typing import ( cast, Iterable, Optional, ) import yaml from galaxy.exceptions import InvalidFileFormatError from galaxy.util.path import ( extensions, has_ext, joinext, ) def get_from_env(key: str, prefixes: Iterable[str], default: Optional[str] = None): """ Return first available value for prefix+key set in the environment, or default. An empty prefix is ignored. Useful when we need to check against multiple prefixes sequentially, returning the first available value. """ for prefix in prefixes: if prefix: value = os.getenv(f"{prefix}{key}") if value: return value return default def find_config_file(names, exts=None, dirs=None, include_samples=False): """Locate a config file in multiple directories, with multiple extensions. >>> from shutil import rmtree >>> from tempfile import mkdtemp >>> def touch(d, f): ... open(os.path.join(d, f), 'w').close() >>> def _find_config_file(*args, **kwargs): ... 
return find_config_file(*args, **kwargs).replace(d, '') >>> d = mkdtemp() >>> d1 = os.path.join(d, 'd1') >>> d2 = os.path.join(d, 'd2') >>> os.makedirs(d1) >>> os.makedirs(d2) >>> touch(d1, 'foo.ini') >>> touch(d1, 'foo.bar') >>> touch(d1, 'baz.ini.sample') >>> touch(d2, 'foo.yaml') >>> touch(d2, 'baz.yml') >>> _find_config_file('foo', dirs=(d1, d2)) '/d1/foo.ini' >>> _find_config_file('baz', dirs=(d1, d2)) '/d2/baz.yml' >>> _find_config_file('baz', dirs=(d1, d2), include_samples=True) '/d2/baz.yml' >>> _find_config_file('baz', dirs=(d1,), include_samples=True) '/d1/baz.ini.sample' >>> _find_config_file('foo', dirs=(d2, d1)) '/d2/foo.yaml' >>> find_config_file('quux', dirs=(d,)) >>> _find_config_file('foo', exts=('bar', 'ini'), dirs=(d1,)) '/d1/foo.bar' >>> rmtree(d) """ found = __find_config_files( names, exts=exts or extensions["yaml"] + extensions["ini"], dirs=dirs or [os.getcwd(), os.path.join(os.getcwd(), "config")], include_samples=include_samples, ) if not found: return None # doesn't really make sense to log here but we should probably generate a warning of some kind if more than one # config is found. 
return found[0] def load_app_properties( kwds=None, ini_file=None, ini_section=None, config_file=None, config_section=None, config_prefix="GALAXY_CONFIG_" ): if config_file is None: config_file = ini_file config_section = config_section or ini_section # read from file or init w/no file if config_file and os.path.exists(config_file): properties = read_properties_from_file(config_file, config_section) else: properties = {"__file__": None} # update from kwds if kwds: properties.update(kwds) # update from env override_prefix = f"{config_prefix}OVERRIDE_" for key in os.environ: if key.startswith(override_prefix): config_key = key[len(override_prefix) :].lower() properties[config_key] = os.environ[key] elif key.startswith(config_prefix): config_key = key[len(config_prefix) :].lower() if config_key not in properties: properties[config_key] = os.environ[key] return properties def read_properties_from_file(config_file, config_section=None): properties = {} if has_ext(config_file, "yaml", aliases=True, ignore="sample"): if config_section is None: config_section = "galaxy" properties.update(__default_properties(config_file)) raw_properties = _read_from_yaml_file(config_file) if raw_properties: properties.update(raw_properties.get(config_section) or {}) elif has_ext(config_file, "ini", aliases=True, ignore="sample"): if config_section is None: config_section = "app:main" parser = nice_config_parser(config_file) # default properties loaded w/parser if parser.has_section(config_section): properties.update(dict(parser.items(config_section))) else: properties.update(parser.defaults()) else: raise InvalidFileFormatError(f"File '{config_file}' doesn't have a supported extension") return properties def _read_from_yaml_file(path): with open(path) as f: return yaml.safe_load(f) def nice_config_parser(path): parser = NicerConfigParser(path, defaults=__default_properties(path)) with open(path) as f: parser.read_file(f) return parser class _InterpolateWrapper(BasicInterpolation): def 
before_get(self, parser, section, option, value, defaults): try: return super().before_get(parser, section, option, value, defaults) except InterpolationError: e = cast(InterpolationError, sys.exc_info()[1]) args = list(e.args) args[0] = f"Error in file {parser.filename}: {e}" e.args = tuple(args) e.message = args[0] raise class NicerConfigParser(ConfigParser): def __init__(self, filename, *args, **kw): kw["interpolation"] = _InterpolateWrapper() ConfigParser.__init__(self, *args, **kw) self.filename = filename def optionxform(self, optionstr: str) -> str: # Don't lower-case keys return str(super().optionxform(optionstr)) def defaults(self): """Return the defaults, with their values interpolated (with the defaults dict itself) Mainly to support defaults using values such as %(here)s """ defaults = dict(ConfigParser.defaults(self)) for key, val in defaults.items(): defaults[key] = self.get("DEFAULT", key) or val return defaults def _running_from_source(): paths = ["run.sh", "lib/galaxy/__init__.py", "scripts/common_startup.sh"] return all(map(os.path.exists, paths)) running_from_source = _running_from_source() def get_data_dir(properties): data_dir = properties.get("data_dir", None) if data_dir is None: if running_from_source: data_dir = "./database" elif properties["__file__"] is None: data_dir = "./data" else: config_dir = properties.get("config_dir", os.path.dirname(properties["__file__"])) data_dir = os.path.join(config_dir, "data") return data_dir def __get_all_configs(dirs, names): return list(filter(os.path.exists, starmap(os.path.join, product(dirs, names)))) def __find_config_files(names, exts=None, dirs=None, include_samples=False): sample_names: Iterable[str] = [] if isinstance(names, str): names = [names] if not dirs: dirs = [os.getcwd()] if exts: # add exts to names, converts back into a list because it's going to be small and we might consume names twice names = list(starmap(joinext, product(names, exts))) if include_samples: sample_names = 
map(partial(joinext, ext="sample"), names) # check for all names in each dir before moving to the next dir. could do it the other way around but that makes # less sense to me. return __get_all_configs(dirs, names) or __get_all_configs(dirs, sample_names) def __default_properties(path): return {"here": os.path.dirname(os.path.abspath(path)), "__file__": os.path.abspath(path)} __all__ = ("find_config_file", "get_data_dir", "load_app_properties", "NicerConfigParser", "running_from_source") ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/renamed_temporary_file.py0000644000175100017510000000310215211124267023115 0ustar00runnerrunner"""Safely write file to temporary file and then move file into place.""" # Copied from https://stackoverflow.com/a/12007885. import os import tempfile from galaxy.util.path import StrOrBytesPath class RenamedTemporaryFile: """ A temporary file object which will be renamed to the specified path on exit. """ final_path: StrOrBytesPath def __init__(self, final_path: StrOrBytesPath, **kwargs): """ >>> dir = tempfile.mkdtemp() >>> with RenamedTemporaryFile(os.path.join(dir, 'test.txt'), mode="w") as out: ... _ = out.write('bla') """ tmpfile_dir = kwargs.pop("dir", None) # Put temporary file in the same directory as the location for the # final file so that an atomic move into place can occur. if tmpfile_dir is None: tmpfile_dir = os.path.dirname(final_path) self.tmpfile = tempfile.NamedTemporaryFile(dir=tmpfile_dir, delete=False, **kwargs) self.final_path = final_path def __getattr__(self, attr: str): """ Delegate attribute access to the underlying temporary file object. 
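
A small sketch of the delegation pattern, using a hypothetical ``Wrapper`` class around an in-memory stream rather than a real temporary file. ``__getattr__`` is only consulted when normal attribute lookup fails, so attributes defined on the wrapper itself still take precedence.

```python
import io


class Wrapper:
    def __init__(self, wrapped):
        self._wrapped = wrapped

    def __getattr__(self, attr):
        # Reached only for attributes not found on the Wrapper instance or class.
        return getattr(self._wrapped, attr)


stream = Wrapper(io.StringIO())
stream.write("bla")          # resolved on the wrapped StringIO
value = stream.getvalue()    # likewise delegated
```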
""" return getattr(self.tmpfile, attr) def __enter__(self): self.tmpfile.__enter__() return self def __exit__(self, exc_type, exc_val, exc_tb): if exc_type is None: self.tmpfile.flush() self.tmpfile.__exit__(exc_type, exc_val, exc_tb) os.rename(self.tmpfile.name, self.final_path) else: self.tmpfile.__exit__(exc_type, exc_val, exc_tb) return False ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/requests.py0000644000175100017510000000254615211124267020267 0ustar00runnerrunnerfrom typing import ( Callable, cast, TypeVar, ) import requests from requests import ( # noqa: F401 codes as codes, exceptions as exceptions, Response as Response, ) from typing_extensions import ParamSpec from .user_agent import get_default_headers Param = ParamSpec("Param") RetType = TypeVar("RetType") def default_user_agent_decorator(f: Callable[Param, RetType]) -> Callable[Param, RetType]: def wrapper(*args: Param.args, **kwargs: Param.kwargs) -> RetType: headers = cast(dict, kwargs.pop("headers", None) or {}) headers.update(get_default_headers()) is_session = f in (requests.session, requests.Session) if not is_session: kwargs["headers"] = headers rval = f(*args, **kwargs) if is_session: rval.headers = headers # type: ignore[attr-defined] return rval return wrapper delete = default_user_agent_decorator(requests.delete) get = default_user_agent_decorator(requests.get) head = default_user_agent_decorator(requests.head) patch = default_user_agent_decorator(requests.patch) post = default_user_agent_decorator(requests.post) options = default_user_agent_decorator(requests.options) put = default_user_agent_decorator(requests.put) session = default_user_agent_decorator(requests.session) Session = default_user_agent_decorator(requests.Session) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 
galaxy_util-26.0.1/galaxy/util/resources.py0000644000175100017510000000242015211124267020415 0ustar00runnerrunner"""Provide a consistent interface into and utilities for importlib file resources.""" import sys if sys.version_info >= (3, 12): from importlib.resources import ( as_file, files, ) from importlib.resources.abc import Traversable if sys.version_info >= (3, 13): from importlib.resources import Anchor else: from importlib.resources import Package as Anchor else: from importlib_resources import ( as_file, files, Package as Anchor, ) from importlib_resources.abc import Traversable def resource_path(anchor: Anchor, resource_name: str) -> Traversable: """ Return specified resource as a Traversable. anchor is either a module object or a module name as a string. """ return files(anchor).joinpath(resource_name) def resource_string(anchor: Anchor, resource_name: str) -> str: """ Return specified resource as a string. Replacement function for pkg_resources.resource_string, but returns unicode string instead of bytestring. anchor is either a module object or a module name as a string. 
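
A short sketch of the two anchor forms, calling ``importlib.resources`` directly instead of this module. The standard-library ``json`` package is used only because it is always importable; Python 3.9 or newer is assumed.

```python
import json
from importlib.resources import files

# Anchor given as a module name ...
by_name = files("json").joinpath("__init__.py").read_text(encoding="utf-8")

# ... or as an already-imported module object.
by_module = files(json).joinpath("__init__.py").read_text(encoding="utf-8")
```

Both calls resolve to the same resource, so the two strings are identical.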
""" return resource_path(anchor, resource_name).read_text() __all__ = ( "as_file", "files", "resource_string", "resource_path", "Traversable", ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/rst_to_html.py0000644000175100017510000000410215211124267020740 0ustar00runnerrunnerimport functools import os try: import docutils.core import docutils.io import docutils.utils import docutils.writers.html4css1 except ImportError: docutils = None # type: ignore[assignment] from .custom_logging import get_logger class FakeStream: def __init__(self, error): self.__error = error log_ = get_logger("docutils") def write(self, str): if len(str) > 0 and not str.isspace(): if self.__error: raise Exception(str) self.log_.warning(str) @functools.lru_cache(maxsize=None) def get_publisher(error=False): docutils_writer = docutils.writers.html4css1.Writer() docutils_template_path = os.path.join(os.path.dirname(__file__), "docutils_template.txt") no_report_level = docutils.utils.Reporter.SEVERE_LEVEL + 1 settings_overrides = { "embed_stylesheet": False, "template": docutils_template_path, "warning_stream": FakeStream(error), "doctitle_xform": False, # without option, very different rendering depending on # number of sections in help content. 
"halt_level": no_report_level, "output_encoding": "unicode", } if not error: # in normal operation we don't want noisy warnings, that's tool author business settings_overrides["report_level"] = no_report_level Publisher = docutils.core.Publisher pub = Publisher( parser=None, writer=docutils_writer, settings=None, source_class=docutils.io.StringInput, destination_class=docutils.io.StringOutput, ) pub.set_components("standalone", "restructuredtext", "pseudoxml") pub.process_programmatic_settings(None, settings_overrides, None) return pub @functools.lru_cache(maxsize=None) def rst_to_html(s, error=False): if docutils is None: raise Exception("Attempted to use rst_to_html but docutils unavailable.") publisher = get_publisher(error=error) publisher.set_source(s, None) publisher.set_destination(None, None) return publisher.publish(enable_exit_status=False) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/rules_dsl.py0000644000175100017510000004756415211124267020421 0ustar00runnerrunnerimport abc import itertools import re from typing import ( List, Type, ) import yaml from galaxy.util.resources import resource_string class RulesDSLError(Exception): pass def get_rules_specification(): return yaml.safe_load(resource_string(__name__, "rules_dsl_spec.yml")) def _ensure_rule_contains_keys(rule, keys): for key, instance_class in keys.items(): if key not in rule: raise ValueError(f"Rule of type [{rule['type']}] does not contain key [{key}].") value = rule[key] if not isinstance(value, instance_class): raise ValueError(f"Rule of type [{rule['type']}] does not contain correct value type for key [{key}].") def _ensure_key_value_in(rule, key, values): value = rule[key] if value not in values: raise ValueError(f"Invalid value [{value}] for [{key}] encountered.") def _ensure_valid_pattern(expression): re.compile(expression) def apply_regex(regex, target, data, replacement=None, group_count=None, 
allow_unmatched: bool = False): pattern = re.compile(regex) def new_row(row): source = row[target] if replacement is None: match = pattern.search(source) if not match: if allow_unmatched: new_columns = [""] if group_count: new_columns = ["" for _ in range(group_count)] result = row + new_columns else: raise RulesDSLError(f"Problem applying regular expression [{regex}] to [{source}].") else: if group_count: if len(match.groups()) != group_count: raise RulesDSLError("Problem applying regular expression, wrong number of groups found.") result = row + list(match.groups()) else: result = row + [match.group(0)] else: match = pattern.search(source) if match: result = row + [match.expand(replacement)] else: if allow_unmatched: result = row + [""] else: raise RulesDSLError(f"Problem applying regular expression [{regex}] to [{source}].") return result new_data = list(map(new_row, data)) return new_data class BaseRuleDefinition(metaclass=abc.ABCMeta): @property @abc.abstractmethod def rule_type(self): """Short string describing type of rule (plugin class) to use.""" @abc.abstractmethod def validate_rule(self, rule): """Validate dictified rule definition of this type.""" @abc.abstractmethod def apply(self, rule, data, sources): """Apply validated, dictified rule definition to supplied data.""" class AddColumnMetadataRuleDefinition(BaseRuleDefinition): rule_type = "add_column_metadata" def validate_rule(self, rule): _ensure_rule_contains_keys(rule, {"value": str}) def apply(self, rule, data, sources): rule_value = rule["value"] if rule_value.startswith("identifier"): identifier_index = int(rule_value[len("identifier") :]) new_rows = [] for index, row in enumerate(data): new_rows.append(row + [sources[index]["identifiers"][identifier_index]]) elif rule_value.startswith("index"): element_index = int(rule_value[len("index") :]) new_rows = [] for index, row in enumerate(data): new_rows.append(row + [str(sources[index]["indices"][element_index])]) elif rule_value == "tags": def 
sorted_tags(index): tags = sorted(sources[index]["tags"]) return [",".join(tags)] new_rows = [] for index, row in enumerate(data): new_rows.append(row + sorted_tags(index)) return new_rows, sources class AddColumnGroupTagValueRuleDefinition(BaseRuleDefinition): rule_type = "add_column_group_tag_value" def validate_rule(self, rule): _ensure_rule_contains_keys(rule, {"value": str}) def apply(self, rule, data, sources): rule_value = rule["value"] tag_prefix = f"group:{rule_value}:" new_rows = [] for index, row in enumerate(data): group_tag_value = None source = sources[index] tags = source["tags"] for tag in sorted(tags): if tag.startswith(tag_prefix): group_tag_value = tag[len(tag_prefix) :] break if group_tag_value is None: group_tag_value = rule.get("default_value", "") new_rows.append(row + [group_tag_value]) return new_rows, sources class AddColumnConcatenateRuleDefinition(BaseRuleDefinition): rule_type = "add_column_concatenate" def validate_rule(self, rule): _ensure_rule_contains_keys(rule, {"target_column_0": int, "target_column_1": int}) def apply(self, rule, data, sources): column_0 = rule["target_column_0"] column_1 = rule["target_column_1"] new_rows = [] for row in data: new_rows.append(row + [row[column_0] + row[column_1]]) return new_rows, sources class AddColumnBasenameRuleDefinition(BaseRuleDefinition): rule_type = "add_column_basename" def validate_rule(self, rule): _ensure_rule_contains_keys(rule, {"target_column": int}) def apply(self, rule, data, sources): column = rule["target_column"] re = r"[^/]*$" return apply_regex(re, column, data), sources class AddColumnRegexRuleDefinition(BaseRuleDefinition): rule_type = "add_column_regex" def validate_rule(self, rule): _ensure_rule_contains_keys(rule, {"target_column": int, "expression": str}) _ensure_valid_pattern(rule["expression"]) def apply(self, rule, data, sources): target = rule["target_column"] expression = rule["expression"] replacement = rule.get("replacement") group_count = 
rule.get("group_count") allow_unmatched = False if "allow_unmatched" in rule: _ensure_rule_contains_keys(rule, {"allow_unmatched": bool}) allow_unmatched = rule["allow_unmatched"] return apply_regex(expression, target, data, replacement, group_count, allow_unmatched=allow_unmatched), sources class AddColumnRownumRuleDefinition(BaseRuleDefinition): rule_type = "add_column_rownum" def validate_rule(self, rule): _ensure_rule_contains_keys(rule, {"start": int}) def apply(self, rule, data, sources): start = rule["start"] new_rows = [] for index, row in enumerate(data): new_rows.append(row + [f"{index + start}"]) return new_rows, sources class AddColumnValueRuleDefinition(BaseRuleDefinition): rule_type = "add_column_value" def validate_rule(self, rule): _ensure_rule_contains_keys(rule, {"value": str}) def apply(self, rule, data, sources): value = rule["value"] new_rows = [] for row in data: new_rows.append(row + [str(value)]) return new_rows, sources class AddColumnSubstrRuleDefinition(BaseRuleDefinition): rule_type = "add_column_substr" def validate_rule(self, rule): _ensure_rule_contains_keys( rule, { "target_column": int, "length": int, "substr_type": str, }, ) _ensure_key_value_in(rule, "substr_type", ["keep_prefix", "drop_prefix", "keep_suffix", "drop_suffix"]) def apply(self, rule, data, sources): target = rule["target_column"] length = rule["length"] substr_type = rule["substr_type"] def new_row(row): original_value = row[target] start = 0 end = len(original_value) if substr_type == "keep_prefix": end = length elif substr_type == "drop_prefix": start = length elif substr_type == "keep_suffix": start = end - length if start < 0: start = 0 else: end = end - length if end < 0: end = 0 return row + [original_value[start:end]] return list(map(new_row, data)), sources class AddColumnFromSampleSheetByIndex(BaseRuleDefinition): rule_type = "add_column_from_sample_sheet_index" def validate_rule(self, rule): _ensure_rule_contains_keys( rule, { "value": int, }, ) def 
apply(self, rule, data, sources): sample_sheet_column_index = rule["value"] new_rows = [] for index, row in enumerate(data): source = sources[index] columns = source["columns"] new_rows.append(row + [columns[sample_sheet_column_index]]) return new_rows, sources class RemoveColumnsRuleDefinition(BaseRuleDefinition): rule_type = "remove_columns" def validate_rule(self, rule): _ensure_rule_contains_keys( rule, { "target_columns": list, }, ) def apply(self, rule, data, sources): target_columns = rule["target_columns"] def new_row(row): new = [] for index, val in enumerate(row): if index not in target_columns: new.append(val) return new return list(map(new_row, data)), sources def _filter_index(func, iterable): result = [] for index, x in enumerate(iterable): if func(index): result.append(x) return result class AddFilterRegexRuleDefinition(BaseRuleDefinition): rule_type = "add_filter_regex" def validate_rule(self, rule): _ensure_rule_contains_keys( rule, { "target_column": int, "invert": bool, "expression": str, }, ) _ensure_valid_pattern(rule["expression"]) def apply(self, rule, data, sources): target_column = rule["target_column"] invert = rule["invert"] regex = rule["expression"] def _filter(index): row = data[index] val = row[target_column] pattern = re.compile(regex) return not invert if pattern.search(val) else invert return _filter_index(_filter, data), _filter_index(_filter, sources) class AddFilterCountRuleDefinition(BaseRuleDefinition): rule_type = "add_filter_count" def validate_rule(self, rule): _ensure_rule_contains_keys( rule, { "count": int, "invert": bool, "which": str, }, ) _ensure_key_value_in(rule, "which", ["first", "last"]) def apply(self, rule, data, sources): num_rows = len(data) invert = rule["invert"] n = rule["count"] which = rule["which"] def _filter(index): if which == "first": matches = index >= n else: matches = index < (num_rows - n) return not invert if matches else invert return _filter_index(_filter, data), _filter_index(_filter, 
sources) class AddFilterEmptyRuleDefinition(BaseRuleDefinition): rule_type = "add_filter_empty" def validate_rule(self, rule): _ensure_rule_contains_keys(rule, {"target_column": int, "invert": bool}) def apply(self, rule, data, sources): invert = rule["invert"] target_column = rule["target_column"] def _filter(index): non_empty = len(data[index][target_column]) != 0 return not invert if non_empty else invert return _filter_index(_filter, data), _filter_index(_filter, sources) class AddFilterMatchesRuleDefinition(BaseRuleDefinition): rule_type = "add_filter_matches" def validate_rule(self, rule): _ensure_rule_contains_keys( rule, { "target_column": int, "invert": bool, "value": str, }, ) def apply(self, rule, data, sources): invert = rule["invert"] target_column = rule["target_column"] value = rule["value"] def _filter(index): row = data[index] val = row[target_column] return not invert if val == value else invert return _filter_index(_filter, data), _filter_index(_filter, sources) class AddFilterCompareRuleDefinition(BaseRuleDefinition): rule_type = "add_filter_compare" def validate_rule(self, rule): _ensure_rule_contains_keys( rule, { "target_column": int, "value": int, "compare_type": str, }, ) _ensure_key_value_in( rule, "compare_type", ["less_than", "less_than_equal", "greater_than", "greater_than_equal"] ) def apply(self, rule, data, sources): target_column = rule["target_column"] value = rule["value"] compare_type = rule["compare_type"] def _filter(index): row = data[index] target_value = float(row[target_column]) if compare_type == "less_than": matches = target_value < value elif compare_type == "less_than_equal": matches = target_value <= value elif compare_type == "greater_than": matches = target_value > value elif compare_type == "greater_than_equal": matches = target_value >= value return matches return _filter_index(_filter, data), _filter_index(_filter, sources) class SortRuleDefinition(BaseRuleDefinition): rule_type = "sort" def validate_rule(self, 
rule): _ensure_rule_contains_keys( rule, { "target_column": int, "numeric": bool, }, ) def apply(self, rule, data, sources): target = rule["target_column"] numeric = rule["numeric"] sortable = zip(data, sources) def sort_func(item): a_val = item[0][target] if numeric: a_val = float(a_val) return a_val sorted_data = sorted(sortable, key=sort_func) new_data = [] new_sources = [] for row, source in sorted_data: new_data.append(row) new_sources.append(source) return new_data, new_sources class SwapColumnsRuleDefinition(BaseRuleDefinition): rule_type = "swap_columns" def validate_rule(self, rule): _ensure_rule_contains_keys( rule, { "target_column_0": int, "target_column_1": int, }, ) def apply(self, rule, data, sources): target_column_0 = rule["target_column_0"] target_column_1 = rule["target_column_1"] def new_row(row): row_copy = row[:] row_copy[target_column_0] = row[target_column_1] row_copy[target_column_1] = row[target_column_0] return row_copy return list(map(new_row, data)), sources class SplitColumnsRuleDefinition(BaseRuleDefinition): rule_type = "split_columns" def validate_rule(self, rule): _ensure_rule_contains_keys( rule, { "target_columns_0": list, "target_columns_1": list, }, ) def apply(self, rule, data, sources): target_columns_0 = rule["target_columns_0"] target_columns_1 = rule["target_columns_1"] def split_row(row): new_row_0 = [] new_row_1 = [] for index, el in enumerate(row): if index in target_columns_0: new_row_0.append(el) elif index in target_columns_1: new_row_1.append(el) else: new_row_0.append(el) new_row_1.append(el) return [new_row_0, new_row_1] data = flat_map(split_row, data) sources = flat_map(lambda x: [x, x], sources) return data, sources def flat_map(f, items): return list(itertools.chain.from_iterable(map(f, items))) class RuleSet: def __init__(self, rule_set_as_dict): self.raw_rules = rule_set_as_dict["rules"] self.raw_mapping = rule_set_as_dict.get("mapping", []) @property def rules(self): return self.raw_rules def 
_rules_with_definitions(self):
        for rule in self.raw_rules:
            yield (rule, RULES_DEFINITIONS[rule["type"]])

    def apply(self, data, sources):
        for rule, rule_definition in self._rules_with_definitions():
            rule_definition.validate_rule(rule)
            data, sources = rule_definition.apply(rule, data, sources)
        return data, sources

    @property
    def has_errors(self):
        errored = False
        try:
            for rule, rule_definition in self._rules_with_definitions():
                rule_definition.validate_rule(rule)
        except Exception:
            errored = True
        return errored

    @property
    def mapping_as_dict(self):
        as_dict = {}
        for mapping in self.raw_mapping:
            as_dict[mapping["type"]] = mapping
        return as_dict

    # Everything above is generic; the properties below are specific to Galaxy
    # collections. Consider a subclass of RuleSet for collection creation.

    @property
    def identifier_columns(self):
        mapping_as_dict = self.mapping_as_dict
        identifier_columns = []
        if "list_identifiers" in mapping_as_dict:
            identifier_columns.extend(mapping_as_dict["list_identifiers"]["columns"])
        if "paired_identifier" in mapping_as_dict:
            identifier_columns.append(mapping_as_dict["paired_identifier"]["columns"][0])
        if "paired_or_unpaired_identifier" in mapping_as_dict:
            identifier_columns.append(mapping_as_dict["paired_or_unpaired_identifier"]["columns"][0])
        return identifier_columns

    @property
    def collection_type(self):
        mapping_as_dict = self.mapping_as_dict
        list_columns = mapping_as_dict.get("list_identifiers", {"columns": []})["columns"]
        collection_type = ":".join("list" for c in list_columns)
        if "paired_identifier" in mapping_as_dict:
            if collection_type:
                collection_type += ":paired"
            else:
                collection_type = "paired"
        if "paired_or_unpaired_identifier" in mapping_as_dict:
            if collection_type:
                collection_type += ":paired_or_unpaired"
            else:
                collection_type = "paired_or_unpaired"
        return collection_type

    @property
    def display(self):
        message = "Rules:\n"
        message += "".join(f"- {r}\n" for r in self.raw_rules)
        message += "Column Definitions:\n"
        message += "".join(f"- {m}\n" for m in
self.raw_mapping) return message RULES_DEFINITION_CLASSES: List[Type[BaseRuleDefinition]] = [ AddColumnMetadataRuleDefinition, AddColumnGroupTagValueRuleDefinition, AddColumnConcatenateRuleDefinition, AddColumnBasenameRuleDefinition, AddColumnRegexRuleDefinition, AddColumnRownumRuleDefinition, AddColumnValueRuleDefinition, AddColumnSubstrRuleDefinition, AddColumnFromSampleSheetByIndex, RemoveColumnsRuleDefinition, AddFilterRegexRuleDefinition, AddFilterCountRuleDefinition, AddFilterEmptyRuleDefinition, AddFilterMatchesRuleDefinition, AddFilterCompareRuleDefinition, SortRuleDefinition, SwapColumnsRuleDefinition, SplitColumnsRuleDefinition, ] RULES_DEFINITIONS = {} for rule_class in RULES_DEFINITION_CLASSES: RULES_DEFINITIONS[rule_class.rule_type] = rule_class() ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/rules_dsl_spec.yml0000644000175100017510000003500515211124267021567 0ustar00runnerrunner- doc: add_column_basename functions properly on absolute and relative paths rules: - type: add_column_basename target_column: 0 initial: data: [['/path/to/moo.txt'], ['moo.txt']] sources: [1, 2] final: data: - [/path/to/moo.txt, moo.txt] - [moo.txt, moo.txt] - doc: add_column_regex works with simple captures by default rules: - type: add_column_regex target_column: 0 expression: '(o)+' initial: data: [[foo], [cow]] sources: [1,2] final: data: [[foo, oo], [cow, o]] - doc: add_column_regex works with replacements if supplied rules: - type: add_column_regex target_column: 0 expression: '(o+)' replacement: 'the os \1' initial: data: [[foo], [cow]] sources: [1,2] final: data: [["foo", "the os oo"], ["cow", "the os o"]] - doc: add_column_regex works with multiple groups when group_count specified rules: - type: add_column_regex target_column: 0 expression: '.*(o)(o)' group_count: 2 initial: data: [[foo], [boo]] sources: [1,2] final: data: [["foo", "o", "o"], ["boo", "o", "o"]] - doc: add_column_regex 
fails if unmatched by default rules: - type: add_column_regex target_column: 0 expression: '(o)+' initial: data: [[foo], [cow], [cat]] sources: [1,2] error: true - doc: add_column_regex allowed to unmatched with allow_unmatched property rules: - type: add_column_regex target_column: 0 expression: '(o)+' allow_unmatched: true initial: data: [[foo], [cow], [cat]] sources: [1,2] final: data: [["foo", "oo"], ["cow", "o"], ["cat", ""]] - doc: add_column_regex allowed to unmatched with group counts rules: - type: add_column_regex target_column: 0 expression: '.*(o)(o)' group_count: 2 allow_unmatched: true initial: data: [[foo], [boo], [cat]] sources: [1,2] final: data: [["foo", "o", "o"], ["boo", "o", "o"], ["cat", "", ""]] - doc: add_column_substr works for keeping a fixed length prefix rules: - type: add_column_substr target_column: 0 substr_type: keep_prefix length: 2 initial: data: [[foo], [cow], [ba], [d]] sources: [1, 2] final: data: [[foo, fo], [cow, co], [ba, ba], [d, d]] - doc: add_column_substr works for keeping a fixed length suffix rules: - type: add_column_substr target_column: 0 substr_type: keep_suffix length: 2 initial: data: [[foo], [cow], [ba], [d]] sources: [1, 2] final: data: [[foo, oo], [cow, ow], [ba, ba], [d, d]] - doc: add_column_substr works for dropping a fixed length prefix rules: - type: add_column_substr target_column: 0 substr_type: drop_prefix length: 2 initial: data: [[foo], [cow], [ba], [d]] sources: [1, 2] final: data: [[foo, o], [cow, w], [ba, ""], [d, ""]] - doc: add_column_substr works for dropping a fixed length suffix rules: - type: add_column_substr target_column: 0 substr_type: drop_suffix length: 2 initial: data: [[foo], [cow], [ba], [d]] sources: [1, 2] final: data: [[foo, f], [cow, c], [ba, ""], [d, ""]] - rules: - type: add_column_rownum start: 1 initial: data: [[foo], [cow], [ba], [d]] sources: [1, 2, 3, 4] final: data: [[foo, "1"], [cow, "2"], [ba, "3"], [d, "4"]] - rules: - type: add_column_rownum start: 0 initial: data: 
[[foo], [cow], [ba], [d]] sources: [1, 2, 3, 4] final: data: [[foo, "0"], [cow, "1"], [ba, "2"], [d, "3"]] - rules: - type: add_column_value value: "moo" initial: data: [[foo], [cow]] sources: [1, 2] final: data: [[foo, moo], [cow, moo]] - rules: - type: remove_columns target_columns: [0, 1] initial: data: [[a, b, c], [e, f, g]] sources: [1, 2] final: data: [[c], [g]] - rules: - type: remove_columns target_columns: [2] initial: data: [[a, b, c], [e, f, g]] sources: [1, 2] final: data: [[a, b], [e, f]] - rules: - type: add_filter_regex target_column: 0 expression: '(a+)' invert: false initial: data: [[a, b, c], [e, f, g]] sources: [1, 2] final: data: [[a, b, c]] - rules: - type: add_filter_regex target_column: 2 expression: '(c+)' invert: true initial: data: [[a, b, c], [e, f, g]] sources: [1, 2] final: data: [[e, f, g]] sources: [2] - rules: - type: add_filter_count count: 1 which: first invert: false initial: data: [[a, b, c], [e, f, g], [h, i, j]] sources: [1, 2, 3] final: data: [[e, f, g], [h, i, j]] sources: [2, 3] - rules: - type: add_filter_count count: 0 which: first invert: false initial: data: [[a, b, c], [e, f, g], [h, i, j]] sources: [1, 2, 3] final: data: [[a, b, c], [e, f, g], [h, i, j]] sources: [1, 2, 3] - rules: - type: add_filter_count count: 1 which: last invert: false initial: data: [[a, b, c], [e, f, g], [h, i, j]] sources: [1, 2, 3] final: data: [[a, b, c], [e, f, g]] sources: [1, 2] - rules: - type: add_filter_count count: 1 which: last invert: true initial: data: [[a, b, c], [e, f, g], [h, i, j]] sources: [1, 2, 3] final: data: [[h, i, j]] sources: [3] - rules: - type: add_filter_empty target_column: 0 invert: false initial: data: [["", "b", "c"], ["a", "b", "c"]] sources: [1, 2] final: data: [["a", "b", "c"]] sources: [2] - rules: - type: add_filter_empty target_column: 0 invert: true initial: data: [["", "b", "c"], ["a", "b", "c"]] sources: [moo, cow] final: data: [["", "b", "c"]] sources: [moo] - rules: - type: add_filter_matches value: a 
target_column: 0 invert: false initial: data: [[a, b, c], [e, f, g], [h, i, j]] sources: [1, 2, 3] final: data: [[a, b, c]] sources: [1] - rules: - type: add_filter_matches value: a target_column: 0 invert: true initial: data: [[a, b, c], [e, f, g], [h, i, j]] sources: [1, 2, 3] final: data: [[e, f, g], [h, i, j]] sources: [2, 3] - rules: - type: add_filter_matches value: p target_column: 1 invert: false initial: data: [[a, b, c], [e, f, g], [h, i, j]] sources: [1, 2, 3] final: data: [] sources: [] - rules: - type: add_filter_matches value: a target_column: 1 invert: false initial: data: [['a ', b, c], [e, f, g], [h, i, j]] sources: [1, 2, 3] final: data: [] sources: [] - rules: - type: add_filter_matches value: 'a ' target_column: 1 invert: false initial: data: [[a, b, c], [e, f, g], [h, i, j]] sources: [1, 2, 3] final: data: [] sources: [] - rules: - type: add_filter_matches value: p target_column: 1 invert: true initial: data: [[a, b, c], [e, f, g], [h, i, j]] sources: [1, 2, 3] final: data: [[a, b, c], [e, f, g], [h, i, j]] sources: [1, 2, 3] - rules: - type: add_filter_compare target_column: 0 value: 13 compare_type: less_than initial: data: [["1", "moo"], ["10", "cow"], ["13", "rat"], ["20", "dog"], ["30", "cat"]] sources: [1, 2, 3, 4, 5] final: data: [["1", "moo"], ["10", "cow"]] sources: [1, 2] - rules: - type: add_filter_compare target_column: 0 value: 13 compare_type: less_than_equal initial: data: [["1", "moo"], ["10", "cow"], ["13", "rat"], ["20", "dog"], ["30", "cat"]] sources: [1, 2, 3, 4, 5] final: data: [["1", "moo"], ["10", "cow"], ["13", "rat"]] sources: [1, 2, 3] - rules: - type: add_filter_compare target_column: 0 value: 13 compare_type: greater_than initial: data: [["1", "moo"], ["10", "cow"], ["13", "rat"], ["20", "dog"], ["30", "cat"]] sources: [1, 2, 3, 4, 5] final: data: [["20", "dog"], ["30", "cat"]] sources: [4, 5] - rules: - type: add_filter_compare target_column: 0 value: 13 compare_type: greater_than_equal initial: data: [["1", "moo"], 
      ["10", "cow"], ["13", "rat"], ["20", "dog"], ["30", "cat"]]
    sources: [1, 2, 3, 4, 5]
  final:
    data: [["13", "rat"], ["20", "dog"], ["30", "cat"]]
    sources: [3, 4, 5]
- rules:
  - type: sort
    numeric: false
    target_column: 0
  initial:
    data: [["moo", "cow"], ["meow", "cat"], ["bark", "dog"]]
    sources: [1, 2, 3]
  final:
    data: [["bark", "dog"], ["meow", "cat"], ["moo", "cow"]]
    sources: [3, 2, 1]
- rules:
  - type: sort
    numeric: false
    target_column: 1
  initial:
    data: [["moo", "cow"], ["meow", "cat"], ["bark", "dog"]]
    sources: [1, 2, 3]
  final:
    data: [["meow", "cat"], ["moo", "cow"], ["bark", "dog"]]
    sources: [2, 1, 3]
- rules:
  - type: sort
    numeric: false
    target_column: 1
  initial:
    data: [["moo", "cow"], ["meow", "cat"], ["bark", "Dog"]]
    sources: [1, 2, 3]
  final:
    data: [["bark", "Dog"], ["meow", "cat"], ["moo", "cow"]]
    sources: [3, 2, 1]
- rules:
  - type: swap_columns
    target_column_0: 0
    target_column_1: 1
  initial:
    data: [["moo", "cow"], ["meow", "cat"], ["bark", "Dog"]]
    sources: [1, 2, 3]
  final:
    data: [["cow", "moo"], ["cat", "meow"], ["Dog", "bark"]]
    sources: [1, 2, 3]
- rules:
  - type: split_columns
    target_columns_0: [0]
    target_columns_1: [1]
  initial:
    data: [["moo", "cow", "A"], ["meow", "cat", "B"], ["bark", "Dog", "C"]]
    sources: [1, 2, 3]
  final:
    data: [["moo", "A"], ["cow", "A"], ["meow", "B"], ["cat", "B"], ["bark", "C"], ["Dog", "C"]]
    sources: [1, 1, 2, 2, 3, 3]
- rules:
  - type: add_column_metadata
    value: identifier0
  initial:
    data: [["moo"], ["meow"], ["bark"]]
    sources: [{"identifiers": ["cow"]}, {"identifiers": ["cat"]}, {"identifiers": ["dog"]}]
  final:
    data: [["moo", "cow"], ["meow", "cat"], ["bark", "dog"]]
- doc: Add columns based on flat list indices
  rules:
  - type: add_column_metadata
    value: index0
  initial:
    data: [["moo"], ["meow"], ["bark"]]
    sources: [{"indices": [0]}, {"indices": [1]}, {"indices": [2]}]
  final:
    data: [["moo", "0"], ["meow", "1"], ["bark", "2"]]
- doc: Add columns based on paired list indices
  rules:
  - type: add_column_metadata
    value: index0
  - type:
add_column_metadata value: index1 initial: data: [["samp1for"], ["samp1rev"], ["samp2for"], ["samp2rev"]] sources: [{"indices": [0, 0]}, {"indices": [0, 1]}, {"indices": [1, 0]}, {"indices": [1, 1]}] final: data: [["samp1for", "0", "0"], ["samp1rev", "0", "1"], ["samp2for", "1", "0"], ["samp2rev", "1", "1"]] - rules: - type: add_column_metadata value: tags initial: data: [["moo"], ["meow"], ["bark"]] sources: [{"identifiers": ["cow"], "tags": ["farm"]}, {"identifiers": ["cat"], "tags": ["house"]}, {"identifiers": ["dog"], "tags": ["house", "firestation"]}] final: data: [["moo", "farm"], ["meow", "house"], ["bark", "firestation,house"]] - rules: - type: add_column_group_tag_value value: where default_value: '' initial: data: [["moo"], ["meow"], ["bark"]] sources: [{"identifiers": ["cow"], "tags": ["group:where:farm"]}, {"identifiers": ["cat"], "tags": ["group:where:house"]}, {"identifiers": ["dog"], "tags": ["group:where:house"]}] final: data: [["moo", "farm"], ["meow", "house"], ["bark", "house"]] - doc: add_column_group_tag_value uses default value rules: - type: add_column_group_tag_value value: where default_value: 'barn' initial: data: [["moo"], ["meow"], ["bark"]] sources: [{"identifiers": ["cow"], "tags": []}, {"identifiers": ["cat"], "tags": ["group:where:house"]}, {"identifiers": ["dog"], "tags": ["group:where:firestation"]}] final: data: [["moo", "barn"], ["meow", "house"], ["bark", "firestation"]] - doc: add_column_group_tag_value sorts and grabs first tag value if multiple present rules: - type: add_column_group_tag_value value: where default_value: 'barn' initial: data: [["moo"], ["meow"], ["bark"]] sources: [{"identifiers": ["cow"], "tags": []}, {"identifiers": ["cat"], "tags": ["group:where:house", "group:where:kittenpile"]}, {"identifiers": ["dog"], "tags": ["group:where:house", "group:where:firestation"]}] final: data: [["moo", "barn"], ["meow", "house"], ["bark", "firestation"]] - doc: add column from a sample sheet by index rules: - type: 
add_column_from_sample_sheet_index value: 0 initial: data: [["moo"], ["cow"]] sources: [{"columns": [0, 1]}, {"columns": [2, 3]}] final: data: [["moo", 0], ["cow", 2]] - doc: add multiple columns from a sample sheet by index rules: - type: add_column_from_sample_sheet_index value: 0 - type: add_column_from_sample_sheet_index value: 1 initial: data: [["moo"], ["cow"]] sources: [{"columns": [0, 1]}, {"columns": [2, 3]}] final: data: [["moo", 0, 1], ["cow", 2, 3]] - rules: - type: invalid_rule_type error: true - rules: - type: add_filter_compare target_column: 0 value: 13 compare_type: invalid_compare_type error: true - rules: - type: add_column_concatenate target_column: 0 error: true - rules: - type: add_column_basename target_column_0: 0 error: true - rules: - type: add_column_regex target_column: 0 regex: '(o)+' error: true - rules: - type: add_column_regex target_column: 0 expression: '(o+' error: true # convert list:record to list:paired - rules: - type: add_column_metadata value: identifier0 - type: add_column_metadata value: identifier1 - type: add_column_regex target_column: 2 expression: 'mother' replacement: 'forward' allow_unmatched: true - type: add_column_regex target_column: 2 expression: 'child' replacement: 'reverse' allow_unmatched: true - type: add_column_concatenate target_column_0: 3 target_column_1: 4 - type: add_filter_empty target_column: 5 invert: false - type: remove_columns target_columns: [2, 3, 4] initial: data: [["el1"], ["el2"], ["el3"]] sources: [{"identifiers": ["samp1", "mother"]}, {"identifiers": ["samp1", "father"]}, {"identifiers": ["samp1", "child"]}] final: data: [["el1", "samp1", "forward"], ["el3", "samp1", "reverse"]] ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/sanitize_html.py0000644000175100017510000000720515211124267021263 0ustar00runnerrunner""" HTML Sanitizer (lists of acceptable_* ripped from feedparser) """ import bleach from galaxy.util 
import unicodify _acceptable_elements = [ "a", "abbr", "acronym", "address", "area", "article", "aside", "audio", "b", "big", "blockquote", "br", "button", "canvas", "caption", "center", "cite", "code", "col", "colgroup", "command", "datagrid", "datalist", "dd", "del", "details", "dfn", "dialog", "dir", "div", "dl", "dt", "em", "event-source", "fieldset", "figure", "footer", "font", "form", "header", "h1", "h2", "h3", "h4", "h5", "h6", "hr", "i", "img", "input", "ins", "keygen", "kbd", "label", "legend", "li", "m", "map", "menu", "meter", "multicol", "nav", "nextid", "ol", "output", "optgroup", "option", "p", "pre", "progress", "q", "s", "samp", "section", "select", "small", "sound", "source", "spacer", "span", "strike", "strong", "sub", "sup", "table", "tbody", "td", "textarea", "time", "tfoot", "th", "thead", "tr", "tt", "u", "ul", "var", "video", "noscript", ] _acceptable_attributes = [ "abbr", "accept", "accept-charset", "accesskey", "action", "align", "alt", "autocomplete", "autofocus", "axis", "background", "balance", "bgcolor", "bgproperties", "border", "bordercolor", "bordercolordark", "bordercolorlight", "bottompadding", "cellpadding", "cellspacing", "ch", "challenge", "char", "charoff", "choff", "charset", "checked", "cite", "class", "clear", "color", "cols", "colspan", "compact", "contenteditable", "controls", "coords", "data", "datafld", "datapagesize", "datasrc", "datetime", "default", "delay", "dir", "disabled", "draggable", "dynsrc", "enctype", "end", "face", "for", "form", "frame", "galleryimg", "gutter", "headers", "height", "hidefocus", "hidden", "high", "href", "hreflang", "hspace", "icon", "id", "inputmode", "ismap", "keytype", "label", "leftspacing", "lang", "list", "longdesc", "loop", "loopcount", "loopend", "loopstart", "low", "lowsrc", "max", "maxlength", "media", "method", "min", "multiple", "name", "nohref", "noshade", "nowrap", "open", "optimum", "pattern", "ping", "point-size", "prompt", "pqg", "radiogroup", "readonly", "rel", 
"repeat-max", "repeat-min", "replace", "required", "rev", "rightspacing", "rows", "rowspan", "rules", "scope", "selected", "shape", "size", "span", "src", "start", "step", "summary", "suppress", "tabindex", "target", "template", "title", "toppadding", "type", "unselectable", "usemap", "urn", "valign", "value", "variable", "volume", "vspace", "vrml", "width", "wrap", "xml:lang", ] def sanitize_html(htmlSource, allow_data_urls=False): kwd = dict(tags=_acceptable_elements, attributes=_acceptable_attributes, strip=True) if allow_data_urls: kwd["protocols"] = list(bleach.ALLOWED_PROTOCOLS) + ["data"] return bleach.clean(unicodify(htmlSource), **kwd) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/script.py0000644000175100017510000000710615211124267017715 0ustar00runnerrunner"""Utilities for Galaxy scripts""" import argparse import logging import os import sys from galaxy.util.properties import ( find_config_file, load_app_properties, ) DESCRIPTION = None ACTIONS = None ARGUMENTS = None DEFAULT_ACTION = None ARG_HELP_CONFIG_FILE = """ Galaxy config file (defaults to $GALAXY_ROOT/config/galaxy.yml if that file exists or else to ./config/galaxy.ini if that exists). If this isn't set on the command line it can be set with the environment variable GALAXY_CONFIG_FILE. """ # ARG_HELP_CONFIG_SECTION = """ # Section containing application configuration in the target config file specified with # -c/--config-file. This defaults to 'galaxy' for YAML/JSON configuration files and 'main' # with 'app:' prepended for INI. If this isn't set on the command line it can be set with # the environment variable GALAXY_CONFIG_SECTION. 
# """


def main_factory(description=None, actions=None, arguments=None, default_action=None):
    global DESCRIPTION, ACTIONS, ARGUMENTS, DEFAULT_ACTION
    DESCRIPTION = description
    ACTIONS = actions or {}
    ARGUMENTS = arguments or []
    DEFAULT_ACTION = default_action
    return main


def main(argv=None):
    """Entry point returned by ``main_factory``: parse arguments and dispatch to the selected action."""
    if argv is None:
        argv = sys.argv[1:]
    args = _arg_parser().parse_args(argv)
    kwargs = app_properties_from_args(args)
    action = args.action
    action_func = ACTIONS[action]
    action_func(args, kwargs)


def app_properties_from_args(args, legacy_config_override=None, app=None):
    config_file = config_file_from_args(args, legacy_config_override=legacy_config_override, app=app)
    config_section = getattr(args, "config_section", None)
    app_properties = load_app_properties(config_file=config_file, config_section=config_section)
    return app_properties


def config_file_from_args(args, legacy_config_override=None, app=None):
    app = app or getattr(args, "app", "galaxy")
    config_file = legacy_config_override or args.config_file or find_config_file(app)
    return config_file


def populate_config_args(parser):
    # Both config and config-file are respected because scripts have used different
    # arguments at different times.
    # Options (e.g. option_name) not found in this file can have their defaults overridden
    # by setting GALAXY_CONFIG_OPTION_NAME, where OPTION_NAME is option_name converted to upper case.
    # Options specified in that file can be overridden for this program by setting
    # GALAXY_CONFIG_OVERRIDE_OPTION_NAME to a new value.
    parser.add_argument(
        "-c", "--config-file", "--config", default=os.environ.get("GALAXY_CONFIG_FILE", None), help=ARG_HELP_CONFIG_FILE
    )
    parser.add_argument(
        "--config-section", default=os.environ.get("GALAXY_CONFIG_SECTION", None), help=argparse.SUPPRESS
    )  # See ARG_HELP_CONFIG_SECTION comment above for unsuppressed details.
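

# Illustrative sketch (an addition, not part of the original module): the
# optional positional ACTION pattern that _arg_parser() uses. The helper name
# _example_action_parser is hypothetical and exists only for this example.
# When a default action is supplied, the positional is declared with
# nargs="?" so it may be omitted on the command line; without a default,
# argparse requires the action to be given explicitly.
def _example_action_parser(actions, default_action=None):
    parser = argparse.ArgumentParser(description="example")
    parser.add_argument(
        "action",
        metavar="ACTION",
        type=str,
        choices=list(actions.keys()),
        default=default_action,
        nargs="?" if default_action is not None else None,
        help="action to perform",
    )
    return parser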
def _arg_parser(): parser = argparse.ArgumentParser(description=DESCRIPTION) parser.add_argument( "action", metavar="ACTION", type=str, choices=list(ACTIONS.keys()), default=DEFAULT_ACTION, nargs="?" if DEFAULT_ACTION is not None else None, help="action to perform", ) populate_config_args(parser) parser.add_argument("--app", default=os.environ.get("GALAXY_APP", "galaxy")) for argument in ARGUMENTS: parser.add_argument(*argument[0], **argument[1]) return parser def set_log_handler(filename=None, stream=None): if filename: handler = logging.FileHandler(filename) else: handler = logging.StreamHandler(stream=stream) return handler ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/search.py0000644000175100017510000001216615211124267017660 0ustar00runnerrunnerimport re from typing import ( Dict, List, NamedTuple, Optional, Tuple, Union, ) KeyedQueryT = Tuple[str, str] ParseFilterResultT = Tuple[Optional[List["FilteredTerm"]], Optional[str]] QUOTE_PATTERN = re.compile(r"\'(.*?)\'") # Defaults for `filter_terms` used by index-search callers. A whitespace-rich # query turns into one WHERE clause (and, pre-trigram-index, one seq scan per # matching table) per raw term, so both floors are there to bound query cost. DEFAULT_MIN_RAW_TERM_LENGTH = 4 DEFAULT_MAX_RAW_TERMS = 7 def parse_filters(search_term: str, filters: Optional[Dict[str, str]] = None) -> ParseFilterResultT: """Support github-like filters for narrowing the results. Order of chunks does not matter, only recognized filter names are allowed. 
:param search_term: the original search str from user input :returns allow_query: whoosh Query object used for filtering results of searching in index :returns search_term_without_filters: str that represents user's search phrase without the filters """ return parse_filters_structured(search_term, filters, preserve_quotes=False).simple_result def parse_filters_structured( search_term: str, filters: Optional[Dict[str, str]] = None, preserve_quotes: bool = True, ) -> "ParsedSearch": search_space = search_term.replace('"', "'") filters = filters or {} filter_keys = "|".join(list(filters.keys())) pattern = rf"({filter_keys}):(?:\s+)?([\w-]+|'.*?')(:\w+)?" reserved = re.compile(pattern) parsed_search = ParsedSearch() while True: match = reserved.search(search_space) if match is None: match = QUOTE_PATTERN.search(search_space) if match is None: parsed_search.add_unfiltered_text_terms(search_space) break group = match.groups()[0].strip() parsed_search.add_unfiltered_text_terms(search_space[0 : match.start()]) parsed_search.add_unfiltered_text(group, True) else: first_group = match.groups()[0] if first_group in filters: if match.groups()[0] == "tag" and match.groups()[1] == "name" and match.groups()[2] is not None: group = match.groups()[1] + match.groups()[2].strip() else: group = match.groups()[1].strip() filter_as = filters[first_group] quoted = preserve_quotes and group.startswith("'") parsed_search.add_keyed_term(filter_as, group.replace("'", ""), quoted) parsed_search.add_unfiltered_text_terms(search_space[0 : match.start()]) search_space = search_space[match.end() :] return parsed_search class RawTextTerm(NamedTuple): text: str quoted: bool class FilteredTerm(NamedTuple): filter: str text: str quoted: bool TermT = Union[RawTextTerm, FilteredTerm] class ParsedSearch: terms: List[TermT] text_terms: List[RawTextTerm] filter_terms: List[FilteredTerm] def __init__(self): self.terms = [] self.text_terms = [] self.filter_terms = [] def add_unfiltered_text_terms(self, text: 
str): for part in text.split(): self.add_unfiltered_text(part, False) def add_unfiltered_text(self, text: str, quoted: bool = False): text = text.strip() if not text: return term = RawTextTerm(text.strip(), quoted) self.terms.append(term) self.text_terms.append(term) def add_keyed_term(self, key: str, text: str, quoted: bool): term = FilteredTerm(key, text, quoted) self.terms.append(term) self.filter_terms.append(term) @property def simple_result(self) -> ParseFilterResultT: return None if len(self.filter_terms) == 0 else self.filter_terms, " ".join([t.text for t in self.text_terms]) def filter_terms( parsed: "ParsedSearch", min_raw_term_length: int = DEFAULT_MIN_RAW_TERM_LENGTH, max_raw_terms: Optional[int] = DEFAULT_MAX_RAW_TERMS, ) -> "ParsedSearch": """Return a new ParsedSearch with short / excess raw text terms dropped. Raw (unquoted, non-keyed) terms shorter than ``min_raw_term_length`` are dropped, and the surviving raw terms are capped at ``max_raw_terms``. Filtered terms (``key:value``) and quoted raw terms ('foo bar') are always kept, since those are explicit user intent. """ out = ParsedSearch() raw_kept = 0 for term in parsed.terms: if isinstance(term, RawTextTerm) and not term.quoted: if len(term.text) < min_raw_term_length: continue if max_raw_terms is not None and raw_kept >= max_raw_terms: continue raw_kept += 1 out.add_unfiltered_text(term.text, term.quoted) elif isinstance(term, RawTextTerm): out.add_unfiltered_text(term.text, term.quoted) else: out.add_keyed_term(term.filter, term.text, term.quoted) return out __all__ = ( "DEFAULT_MAX_RAW_TERMS", "DEFAULT_MIN_RAW_TERM_LENGTH", "filter_terms", "parse_filters", "parse_filters_structured", ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/simplegraph.py0000644000175100017510000001101115211124267020712 0ustar00runnerrunner""" Fencepost-simple graph structure implementation.
""" # Currently (2013.7.12) only used in easing the parsing of graph datatype data. class SimpleGraphNode: """ Node representation. """ def __init__(self, index, **data): """ :param index: index of this node in some parent list :type index: int :param data: any extra data that needs to be saved :type data: (variadic dictionary) """ # a bit application specific (could be 'id') self.index = index self.data = data class SimpleGraphEdge: """ Edge representation. """ def __init__(self, source_index, target_index, **data): """ :param source_index: index of the edge's source node in some parent list :type source_index: int :param target_index: index of the edge's target node in some parent list :type target_index: int :param data: any extra data that needs to be saved :type data: (variadic dictionary) """ self.source_index = source_index self.target_index = target_index self.data = data class SimpleGraph: """ Each node is unique (by id) and stores its own index in the node list/odict. Each edge is represented as two indices into the node list/odict. Both nodes and edges allow storing extra information if needed. Allows: multiple edges between two nodes self referential edges (an edge from a node to itself) These graphs are not specifically directed but since source and targets on the edges are listed - it could easily be used that way. """ def __init__(self, nodes=None, edges=None): # use an odict so that edge indices actually match the final node list indices self.nodes = nodes or {} self.edges = edges or [] def add_node(self, node_id, **data): """ Adds a new node only if it doesn't already exist.
:param node_id: some unique identifier :type node_id: (hashable) :param data: any extra data that needs to be saved :type data: (variadic dictionary) :returns: the new node, or the existing node if `node_id` is already present """ if node_id in self.nodes: return self.nodes[node_id] node_index = len(self.nodes) new_node = SimpleGraphNode(node_index, **data) self.nodes[node_id] = new_node return new_node def add_edge(self, source_id, target_id, **data): """ Adds a new edge from the source node to the target node, creating either node if it doesn't already exist. :param source_id: the id of the source node :type source_id: (hashable) :param target_id: the id of the target node :type target_id: (hashable) :param data: any extra data that needs to be saved for the edge :type data: (variadic dictionary) :returns: the new edge .. note:: although this will create new nodes if necessary, there's no way to pass `data` to them - so if you need to associate more data with the nodes, use `add_node` first. """ # adds target_id to source_id's edge list # adding source_id and/or target_id to nodes if not there already if source_id not in self.nodes: self.add_node(source_id) if target_id not in self.nodes: self.add_node(target_id) new_edge = SimpleGraphEdge(self.nodes[source_id].index, self.nodes[target_id].index, **data) self.edges.append(new_edge) return new_edge def gen_node_dicts(self): """ Returns a generator that yields node dictionaries in the form:: { 'id': , 'data': } """ for node_id, node in self.nodes.items(): yield {"id": node_id, "data": node.data} def gen_edge_dicts(self): """ Returns a generator that yields edge dictionaries in the form:: { 'source': , 'target': , 'data' : } """ for edge in self.edges: yield {"source": edge.source_index, "target": edge.target_index, "data": edge.data} def as_dict(self): """ Returns a dictionary of the form:: { 'nodes': , 'edges': } """ return {"nodes": list(self.gen_node_dicts()), "edges": list(self.gen_edge_dicts())} ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0
galaxy_util-26.0.1/galaxy/util/sleeper.py0000644000175100017510000000115315211124267020044 0ustar00runnerrunnerimport threading class Sleeper: """ Provides a 'sleep' method that sleeps for a number of seconds *unless* the 'wake' method is called (from a different thread). """ def __init__(self): self.condition = threading.Condition() def sleep(self, seconds): # The with-block releases the lock even if wait() is interrupted. with self.condition: self.condition.wait(seconds) def wake(self): # The with-block releases the lock even if notify() raises. with self.condition: self.condition.notify() ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/sockets.py0000644000175100017510000000321615211124267020062 0ustar00runnerrunnerimport random import socket import sys from typing import Optional from galaxy.util import commands def get_ip() -> Optional[str]: if sys.platform == "darwin": # If we're on OSX it is likely that the docker host is localhost. return socket.gethostbyname(socket.gethostname()) # This method assumes that the ip with default route is the ip we want to return s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM) try: # doesn't even have to be reachable s.connect(("10.255.255.255", 1)) ip = s.getsockname()[0] except Exception: # No default route could be determined; callers must handle None. ip = None finally: s.close() return ip def unused_port(range=None): if range: return __unused_port_on_range(range) else: return __unused_port_rangeless() def __unused_port_rangeless(): # TODO: Allow ranges (though then need to guess and check)...
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) s.bind(("localhost", 0)) addr, port = s.getsockname() s.close() return port def __unused_port_on_range(range): assert range[0] and range[1] # Find all ports that are already occupied cmd_netstat = ["netstat", "tuln"] stdout = commands.execute(cmd_netstat) occupied_ports = set() for line in stdout.split("\n"): if line.startswith("tcp") or line.startswith("tcp6"): col = line.split() local_address = col[3] # Split on the last colon so IPv6 addresses such as ":::80" parse correctly. local_port = local_address.rsplit(":", 1)[1] occupied_ports.add(int(local_port)) # Generate random free port number. while True: port = random.randrange(range[0], range[1]) if port not in occupied_ports: break return port ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/specs.py0000644000175100017510000000105515211124267017523 0ustar00runnerrunnerimport functools import operator from galaxy import util # Utility methods for specifying maps. def to_str_or_none(value): if value is None: return None else: return str(value) def to_bool_or_none(value): return util.string_as_bool_or_none(value) def to_bool(value): return util.asbool(value) def to_float_or_none(value): if value is None: return None else: return float(value) # Utility methods for specifying valid...
def is_in(*args): return functools.partial(operator.contains, args) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/sqlite.py0000644000175100017510000000135115211124267017706 0ustar00runnerrunnerimport re import sqlite3 try: import sqlparse def is_read_only_query(query): statements = sqlparse.parse(query) for statement in statements: if statement.get_type() != "SELECT": return False return True except ImportError: # Without sqlparse we use a very weak regex check def is_read_only_query(query): if re.match("select ", query, re.IGNORECASE): if re.search('^([^"]|"[^"]*")*?;', query) or re.search("^([^']|'[^']*')*?;", query): return False else: return True return False def connect(path): connection = sqlite3.connect(path) connection.row_factory = sqlite3.Row return connection ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/submodules.py0000644000175100017510000000362415211124267020574 0ustar00runnerrunnerimport importlib import logging import pkgutil from types import ModuleType from typing import ( List, Union, ) log = logging.getLogger(__name__) def import_submodules( module: Union[ModuleType, str], ordered: bool = True, recursive: bool = False ) -> List[ModuleType]: """Import all submodules of a module :param module: module (package name or actual module) :type module: str | module :param ordered: Whether to order the returned modules. The default is True, and modules are returned in reverse order to allow hierarchical overrides i.e. 
000_galaxy_rules.py, 100_site_rules.py, 200_instance_rules.py :type ordered: bool :param recursive: Recursively returns all subpackages :type recursive: bool :rtype: [module] """ sub_modules = __import_submodules_impl(module, recursive) if ordered: return sorted(sub_modules, reverse=True, key=lambda m: m.__name__) else: return sub_modules def __import_submodules_impl(module: Union[ModuleType, str], recursive: bool = False) -> List[ModuleType]: """Implementation of import only, without sorting. :param module: module (package name or actual module) :type module: str | module :rtype: [module] """ if isinstance(module, str): module = importlib.import_module(module) submodules: List[ModuleType] = [] for _, name, is_pkg in pkgutil.walk_packages(module.__path__): full_name = f"{module.__name__}.{name}" try: submodule = importlib.import_module(full_name) submodules.append(submodule) if recursive and is_pkg: submodules.extend(__import_submodules_impl(submodule, recursive=True)) except Exception: message = f"{full_name} dynamic module could not be loaded (traceback follows):" log.exception(message) continue return submodules ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/task.py0000644000175100017510000000252215211124267017350 0ustar00runnerrunnerimport logging from threading import ( Event, Thread, ) from galaxy.util import ExecutionTimer log = logging.getLogger(__name__) class IntervalTask: def __init__(self, func, name="Periodic task", interval=3600, immediate_start=False, time_execution=False): """ Run an arbitrary function `func` every `interval` seconds. Set `immediate_start` to True to run `func` when task is started. 
""" self.func = func self.name = name self.interval = interval self.time_execution = time_execution self.immediate_start = immediate_start self.event = Event() self.thread = Thread(target=self.run, name=self.name, daemon=True) self.running = False def start(self): self.running = True self.thread.start() def _exec(self): if self.time_execution: timer = ExecutionTimer() self.func() if self.time_execution: log.debug(f"Executed periodic task {self.name} {timer}") def run(self): if self.immediate_start: self._exec() while not self.event.is_set(): self.event.wait(self.interval) if self.running: self._exec() def shutdown(self): self.running = False self.event.set() self.thread.join(5) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/template.py0000644000175100017510000001532215211124267020223 0ustar00runnerrunner"""Entry point for the usage of Cheetah templating within Galaxy.""" import sys import traceback from typing import ( Optional, Union, ) from Cheetah.Compiler import Compiler from Cheetah.NameMapper import NotFound from Cheetah.Parser import ParseError from Cheetah.Template import Template from packaging.version import Version from galaxy.util.tree_dict import TreeDict from . import unicodify try: from lib2to3.refactor import RefactoringTool except ImportError: # Either Python 3.13 or Debian(<=12)/Ubuntu(<=24.10) without the # python3-lib2to3 package import fissix from fissix import ( fixes as fissix_fixes, pgen2 as fissix_pgen2, refactor as fissix_refactor, ) sys.modules["lib2to3"] = fissix sys.modules["lib2to3.fixes"] = fissix_fixes sys.modules["lib2to3.pgen2"] = fissix_pgen2 sys.modules["lib2to3.refactor"] = fissix_refactor from lib2to3.refactor import RefactoringTool from past.translation import myfixes # Skip libpasteurize fixers, which make sure code is py2 and py3 compatible. # This is not needed, we only translate code on py3. 
myfixes = [f for f in myfixes if not f.startswith("libpasteurize")] refactoring_tool = RefactoringTool(myfixes, {"print_function": True}) class InputNotFoundSyntaxError(SyntaxError): pass class FixedModuleCodeCompiler(Compiler): module_code = None def getModuleCode(self): self._moduleDef = self.module_code return self._moduleDef def create_compiler_class(module_code): class CustomCompilerClass(FixedModuleCodeCompiler): pass CustomCompilerClass.module_code = module_code return CustomCompilerClass def fill_template( template_text, context=None, retry=10, compiler_class=Compiler, first_exception=None, futurized=False, python_template_version: Optional[Union[str, Version]] = "3", **kwargs, ): """Fill a cheetah template out for specified context. If template_text is None, an exception will be thrown, if context is None (the default) - keyword arguments to this function will be used as the context. """ if template_text is None: raise TypeError("Template text specified as None to fill_template.") if not context: context = kwargs if python_template_version is None: python_template_version = Version("3") elif isinstance(python_template_version, str): python_template_version = Version(python_template_version) try: klass = Template.compile(source=template_text, compilerClass=compiler_class) except ParseError as e: # Might happen on invalid syntax within a cheetah statement, like `#if $smxsize <> 128.0` if first_exception is None: first_exception = e if python_template_version.release[0] < 3 and retry > 0: module_code = Template.compile( source=template_text, compilerClass=compiler_class, returnAClass=False ).decode("utf-8") module_code = futurize_preprocessor(module_code) compiler_class = create_compiler_class(module_code) return fill_template( template_text=template_text, context=context, retry=retry - 1, compiler_class=compiler_class, first_exception=first_exception, python_template_version=python_template_version, ) raise first_exception or e t = 
klass(searchList=[context]) try: return unicodify(t, log_exception=False) except (NotFound, InputNotFoundSyntaxError) as e: if first_exception is None: first_exception = e if not isinstance(context, TreeDict): masked_input = None if "input" in context and callable(context["input"]): masked_input = context.pop("input", None) context = TreeDict(context) if "input" not in context and masked_input: context["input"] = masked_input tb = e.__traceback__ if retry > 0: if python_template_version.release[0] < 3: last_stack = traceback.extract_tb(tb)[-1] if last_stack.name == "" and last_stack.lineno: # On python 3 list, dict and set comprehensions as well as generator expressions # have their own local scope, which prevents accessing frame variables in cheetah. # We can work around this by replacing `$var` with `var`, but we only do this for # list comprehensions, as this has never worked for dict or set comprehensions or # generator expressions in Cheetah. var_not_found = e.args[0].split("'")[1] replace_str = f'VFFSL(SL,"{var_not_found}",True)' lineno = last_stack.lineno - 1 module_code = t._CHEETAH_generatedModuleCode.splitlines() module_code[lineno] = module_code[lineno].replace(replace_str, var_not_found) module_code = "\n".join(module_code) compiler_class = create_compiler_class(module_code) return fill_template( template_text=template_text, context=context, retry=retry - 1, compiler_class=compiler_class, first_exception=first_exception, python_template_version=python_template_version, ) raise first_exception or e except Exception as e: if first_exception is None: first_exception = e if python_template_version.release[0] < 3 and not futurized: # Possibly an error caused by attempting to run python 2 # template code on python 3. Run the generated module code # through futurize and hope for the best. 
module_code = t._CHEETAH_generatedModuleCode module_code = futurize_preprocessor(module_code) compiler_class = create_compiler_class(module_code) return fill_template( template_text=template_text, context=context, retry=retry, compiler_class=compiler_class, first_exception=first_exception, futurized=True, python_template_version=python_template_version, ) raise first_exception or e def futurize_preprocessor(source): source = str(refactoring_tool.refactor_string(source, name="auto_translate_cheetah")) # libfuturize.fixes.fix_unicode_keep_u' breaks from Cheetah.compat import unicode source = source.replace("from Cheetah.compat import str", "from Cheetah.compat import unicode") return source ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/themes.py0000644000175100017510000000114615211124267017674 0ustar00runnerrunnerfrom typing import ( Dict, Union, ) Theme = Dict[str, Union["Theme", str]] def flatten_theme(theme: Theme, prefix: str = "-") -> Dict[str, str]: """Transforms a nested theme dictionary into a flat dictionary, containing keys compatible with css variables. e.g. 
'--masthead-background-color'""" flat_attributes: Dict[str, str] = {} for key, val in theme.items(): if isinstance(val, str): flat_attributes[f"{prefix}-{key}"] = val elif isinstance(val, Dict): flat_attributes.update(flatten_theme(val, f"{prefix}-{key}")) return flat_attributes ././@PaxHeader0000000000000000000000000000003200000000000010210 xustar0026 mtime=1780787404.68961 galaxy_util-26.0.1/galaxy/util/tool_shed/0000755000175100017510000000000015211124315020005 5ustar00runnerrunner././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/tool_shed/__init__.py0000644000175100017510000000000015211124267022112 0ustar00runnerrunner././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/tool_shed/common_util.py0000644000175100017510000002545315211124267022723 0ustar00runnerrunnerimport json import logging import os from typing import ( Optional, TYPE_CHECKING, ) from urllib.parse import urljoin from typing_extensions import Protocol from galaxy import util from galaxy.util.tool_shed import encoding_util if TYPE_CHECKING: from .tool_shed_registry import Registry as ToolShedRegistry log = logging.getLogger(__name__) REPOSITORY_OWNER = "devteam" class HasToolShedRegistry(Protocol): tool_shed_registry: "ToolShedRegistry" name: str def accumulate_tool_dependencies(tool_shed_accessible, tool_dependencies, all_tool_dependencies): if tool_shed_accessible: if tool_dependencies: for tool_dependency in tool_dependencies: if tool_dependency not in all_tool_dependencies: all_tool_dependencies.append(tool_dependency) return all_tool_dependencies def check_tool_tag_set(elem, migrated_tool_configs_dict, missing_tool_configs_dict): file_path = elem.get("file", None) if file_path: name = os.path.basename(file_path) for migrated_tool_config in migrated_tool_configs_dict.keys(): if migrated_tool_config in [file_path, name]: 
missing_tool_configs_dict[name] = migrated_tool_configs_dict[migrated_tool_config] return missing_tool_configs_dict def generate_clone_url_for_installed_repository(app: HasToolShedRegistry, repository) -> str: """Generate the URL for cloning a repository that has been installed into a Galaxy instance.""" tool_shed_url = get_tool_shed_url_from_tool_shed_registry(app, str(repository.tool_shed)) return util.build_url(tool_shed_url, pathspec=["repos", str(repository.owner), str(repository.name)]) def generate_clone_url_from_repo_info_tup(app: HasToolShedRegistry, repo_info_tup) -> str: """Generate the URL for cloning a repository given a tuple of toolshed, name, owner, changeset_revision.""" # Example tuple: ['http://localhost:9009', 'blast_datatypes', 'test', '461a4216e8ab', False] ( toolshed, name, owner, changeset_revision, prior_installation_required, only_if_compiling_contained_td, ) = parse_repository_dependency_tuple(repo_info_tup) tool_shed_url = get_tool_shed_url_from_tool_shed_registry(app, toolshed) # Don't include the changeset_revision in clone urls. 
return util.build_url(tool_shed_url, pathspec=["repos", owner, name]) def get_repository_dependencies(app, tool_shed_url, repository_name, repository_owner, changeset_revision): repository_dependencies_dict = {} tool_shed_accessible = True params = dict(name=repository_name, owner=repository_owner, changeset_revision=changeset_revision) pathspec = ["repository", "get_repository_dependencies"] try: raw_text = util.url_get( tool_shed_url, auth=app.tool_shed_registry.url_auth(tool_shed_url), pathspec=pathspec, params=params ) tool_shed_accessible = True except Exception as e: tool_shed_accessible = False log.warning( "The URL\n%s\nraised the exception:\n%s\n", util.build_url(tool_shed_url, pathspec=pathspec, params=params), e, ) if tool_shed_accessible: if len(raw_text) > 2: encoded_text = json.loads(util.unicodify(raw_text)) repository_dependencies_dict = encoding_util.tool_shed_decode(encoded_text) return tool_shed_accessible, repository_dependencies_dict def get_protocol_from_tool_shed_url(tool_shed_url: str) -> str: """Return the protocol from the received tool_shed_url if it exists.""" try: if tool_shed_url.find("://") > 0: return tool_shed_url.split("://")[0].lower() except Exception: # We receive a lot of calls here where the tool_shed_url is None. The container_util uses # that value when creating a header row. If the tool_shed_url is not None, we have a problem. if tool_shed_url is not None: log.exception("Handled exception getting the protocol from Tool Shed URL %s", str(tool_shed_url)) # Default to HTTP protocol. 
return "http" def get_tool_shed_repository_ids(as_string=False, **kwd): tsrid = kwd.get("tool_shed_repository_id", None) tsridslist = util.listify(kwd.get("tool_shed_repository_ids", None)) if not tsridslist: tsridslist = util.listify(kwd.get("id", None)) if tsridslist is not None: if tsrid is not None and tsrid not in tsridslist: tsridslist.append(tsrid) if as_string: return ",".join(tsridslist) return tsridslist else: tsridslist = util.listify(kwd.get("ordered_tsr_ids", None)) if tsridslist is not None: if as_string: return ",".join(tsridslist) return tsridslist if as_string: return "" return [] def get_tool_shed_url_from_tool_shed_registry(app: HasToolShedRegistry, tool_shed: str) -> Optional[str]: """ The value of tool_shed is something like: toolshed.g2.bx.psu.edu. We need the URL to this tool shed, which is something like: http://toolshed.g2.bx.psu.edu/ """ cleaned_tool_shed = remove_protocol_from_tool_shed_url(tool_shed) for shed_url in app.tool_shed_registry.tool_sheds.values(): if shed_url.find(cleaned_tool_shed) >= 0: if shed_url.endswith("/"): shed_url = shed_url.rstrip("/") return shed_url # The tool shed from which the repository was originally installed must no longer be configured in tool_sheds_conf.xml. return None def get_tool_shed_repository_url(app: HasToolShedRegistry, tool_shed: str, owner: str, name: str): tool_shed_url = get_tool_shed_url_from_tool_shed_registry(app, tool_shed) if tool_shed_url: # Append a slash to the tool shed URL, because urlparse.urljoin will eliminate # the last part of a URL if it does not end with a forward slash. 
tool_shed_url = f"{tool_shed_url}/" return urljoin(tool_shed_url, f"view/{owner}/{name}") return tool_shed_url def handle_galaxy_url(trans, **kwd): galaxy_url = kwd.get("galaxy_url", None) if galaxy_url: trans.set_cookie(galaxy_url, name="toolshedgalaxyurl") else: galaxy_url = trans.get_cookie(name="toolshedgalaxyurl") return galaxy_url def parse_repository_dependency_tuple(repository_dependency_tuple, contains_error=False): # Default both prior_installation_required and only_if_compiling_contained_td to False in cases where metadata should be reset on the # repository containing the repository_dependency definition. prior_installation_required = "False" only_if_compiling_contained_td = "False" if contains_error: if len(repository_dependency_tuple) == 5: tool_shed, name, owner, changeset_revision, error = repository_dependency_tuple elif len(repository_dependency_tuple) == 6: tool_shed, name, owner, changeset_revision, prior_installation_required, error = repository_dependency_tuple elif len(repository_dependency_tuple) == 7: ( tool_shed, name, owner, changeset_revision, prior_installation_required, only_if_compiling_contained_td, error, ) = repository_dependency_tuple return ( tool_shed, name, owner, changeset_revision, prior_installation_required, only_if_compiling_contained_td, error, ) else: if len(repository_dependency_tuple) == 4: tool_shed, name, owner, changeset_revision = repository_dependency_tuple elif len(repository_dependency_tuple) == 5: tool_shed, name, owner, changeset_revision, prior_installation_required = repository_dependency_tuple elif len(repository_dependency_tuple) == 6: ( tool_shed, name, owner, changeset_revision, prior_installation_required, only_if_compiling_contained_td, ) = repository_dependency_tuple return tool_shed, name, owner, changeset_revision, prior_installation_required, only_if_compiling_contained_td def remove_port_from_tool_shed_url(tool_shed_url: str) -> str: """Return a partial Tool Shed URL, eliminating the port if it 
exists.""" try: if tool_shed_url.find(":") > 0: # Eliminate the port, if any, since it will result in an invalid directory name. new_tool_shed_url = tool_shed_url.split(":")[0] else: new_tool_shed_url = tool_shed_url return new_tool_shed_url.rstrip("/") except Exception: # We receive a lot of calls here where the tool_shed_url is None. The container_util uses # that value when creating a header row. If the tool_shed_url is not None, we have a problem. if tool_shed_url is not None: log.exception("Handled exception removing the port from Tool Shed URL %s", str(tool_shed_url)) return tool_shed_url def remove_protocol_and_port_from_tool_shed_url(tool_shed_url: str) -> str: """Return a partial Tool Shed URL, eliminating the protocol and/or port if either exists.""" tool_shed = remove_protocol_from_tool_shed_url(tool_shed_url) tool_shed = remove_port_from_tool_shed_url(tool_shed) return tool_shed def remove_protocol_and_user_from_clone_url(repository_clone_url: str) -> str: """Return a URL that can be used to clone a repository, eliminating the protocol and user if either exists.""" if repository_clone_url.find("@") > 0: # We have an url that includes an authenticated user, something like: # http://test@bx.psu.edu:9009/repos/some_username/column items = repository_clone_url.split("@") tmp_url = items[1] elif repository_clone_url.find("//") > 0: # We have an url that includes only a protocol, something like: # http://bx.psu.edu:9009/repos/some_username/column items = repository_clone_url.split("//") tmp_url = items[1] else: tmp_url = repository_clone_url return tmp_url.rstrip("/") remove_protocol_from_tool_shed_url = util.remove_protocol_from_url __all__ = ( "accumulate_tool_dependencies", "check_tool_tag_set", "generate_clone_url_for_installed_repository", "generate_clone_url_from_repo_info_tup", "get_repository_dependencies", "get_protocol_from_tool_shed_url", "get_tool_shed_repository_ids", "get_tool_shed_url_from_tool_shed_registry", "get_tool_shed_repository_url", 
"handle_galaxy_url", "parse_repository_dependency_tuple", "remove_port_from_tool_shed_url", "remove_protocol_and_port_from_tool_shed_url", "remove_protocol_and_user_from_clone_url", "remove_protocol_from_tool_shed_url", ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/tool_shed/encoding_util.py0000644000175100017510000000164215211124267023213 0ustar00runnerrunnerimport binascii import json from galaxy.util import ( smart_str, unicodify, ) from galaxy.util.hash_util import hmac_new encoding_sep = "__esep__" encoding_sep2 = "__esepii__" def tool_shed_decode(value): # Extract and verify hash value = unicodify(value) a, b = value.split(":") value = binascii.unhexlify(b) test = hmac_new(b"ToolShedAndGalaxyMustHaveThisSameKey", value) assert a == test # Restore from string values = None value = unicodify(value) try: values = json.loads(value) except Exception: pass if values is None: values = value return values def tool_shed_encode(val): if isinstance(val, dict) or isinstance(val, list): value = json.dumps(val) else: value = val a = hmac_new(b"ToolShedAndGalaxyMustHaveThisSameKey", value) b = unicodify(binascii.hexlify(smart_str(value))) return f"{a}:{b}" ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/tool_shed/tool_shed_registry.py0000644000175100017510000001103715211124267024277 0ustar00runnerrunnerimport logging from typing import ( Dict, NamedTuple, Optional, ) from typing_extensions import Literal from galaxy.util import parse_xml_string from galaxy.util.path import StrPath from galaxy.util.tool_shed import common_util from galaxy.util.tool_shed.xml_util import parse_xml log = logging.getLogger(__name__) DEFAULT_TOOL_SHED_URL = "https://toolshed.g2.bx.psu.edu/" DEFAULT_TOOL_SHED_NAME = "Galaxy Main Tool Shed" DEFAULT_TOOL_SHEDS_CONF_XML = f""" """ API_VERSION = Literal["v1", "v2"] class AUTH_TUPLE(NamedTuple): 
username: str password: str class Registry: tool_sheds: Dict[str, str] tool_shed_api_versions: Dict[str, API_VERSION] tool_sheds_auth: Dict[str, Optional[AUTH_TUPLE]] def __init__(self, config: Optional[StrPath] = None): self.tool_sheds = {} self.tool_sheds_auth = {} self.tool_shed_api_versions = {} if config: # Parse tool_sheds_conf.xml tree, error_message = parse_xml(config) if tree is None: log.warning(f"Unable to load references to tool sheds defined in file {str(config)}") return root = tree.getroot() else: root = parse_xml_string(DEFAULT_TOOL_SHEDS_CONF_XML) config = "internal default config" log.debug(f"Loading references to tool sheds from {config}") for elem in root.findall("tool_shed"): try: name = elem.get("name", None) url = elem.get("url", None) version_raw = elem.get("version", "1") version: API_VERSION if version_raw == "1": version = "v1" else: version = "v2" username = elem.get("user", None) password = elem.get("pass", None) if name and url: self.tool_sheds[name] = url self.tool_shed_api_versions[name] = version self.tool_sheds_auth[name] = None log.debug(f"Loaded reference to tool shed: {name}") if name and url and username and password: self.tool_sheds_auth[name] = AUTH_TUPLE(username, password) except Exception as e: log.warning(f'Error loading reference to tool shed "{name}", problem: {str(e)}') def url_auth(self, url: str) -> Optional[AUTH_TUPLE]: """ If the tool shed is using external auth, the client to the tool shed must authenticate to that as well. This provides access to the six.moves.urllib.request.HTTPPasswordMgrWithdefaultRealm() object for the url passed in. Following more what galaxy.demo_sequencer.controllers.common does might be more appropriate at some stage... 
""" shed_name = self._shed_name_for_url(url) if shed_name is not None: return self.tool_sheds_auth[shed_name] else: log.debug(f"Invalid url '{str(url)}' received by tool shed registry's url_auth method.") return None def is_legacy(self, url: str) -> bool: shed_name = self._shed_name_for_url(url) if shed_name is None: return True else: return self.tool_shed_api_versions[shed_name] == "v1" def _shed_name_for_url(self, url: str) -> Optional[str]: url_sans_protocol = common_util.remove_protocol_from_tool_shed_url(url) for shed_name, shed_url in self.tool_sheds.items(): shed_url_sans_protocol = common_util.remove_protocol_from_tool_shed_url(shed_url) if url_sans_protocol.startswith(shed_url_sans_protocol): return shed_name return None def get_tool_shed_url(self, tool_shed: str) -> Optional[str]: """ The value of tool_shed is something like: toolshed.g2.bx.psu.edu. We need the URL to this tool shed, which is something like: http://toolshed.g2.bx.psu.edu/ """ cleaned_tool_shed = common_util.remove_protocol_from_tool_shed_url(tool_shed) for shed_url in self.tool_sheds.values(): if shed_url.find(cleaned_tool_shed) >= 0: if shed_url.endswith("/"): shed_url = shed_url.rstrip("/") return shed_url # The tool shed from which the repository was originally installed must no longer be configured in tool_sheds_conf.xml. 
return None ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/tool_shed/xml_util.py0000644000175100017510000000242415211124267022224 0ustar00runnerrunnerimport logging import os import tempfile from typing import ( Optional, Tuple, ) from galaxy.util import ( Element, ElementTree, parse_xml as galaxy_parse_xml, xml_to_string, ) from galaxy.util.path import StrPath log = logging.getLogger(__name__) def create_and_write_tmp_file(elem: Element) -> str: tmp_str = xml_to_string(elem, pretty=True) with tempfile.NamedTemporaryFile(prefix="tmp-toolshed-cawrf", delete=False) as fh: tmp_filename = fh.name with open(tmp_filename, mode="w", encoding="utf-8") as fh: fh.write(tmp_str) return tmp_filename def parse_xml(file_name: StrPath, check_exists=True) -> Tuple[Optional[ElementTree], str]: """Returns a parsed xml tree with comments intact.""" error_message = "" if check_exists and not os.path.exists(file_name): return None, f"File does not exist {str(file_name)}" try: tree = galaxy_parse_xml(file_name, remove_comments=False, strip_whitespace=False) except OSError: raise except Exception as e: error_message = f"Exception attempting to parse {file_name}: {e}" log.exception(error_message) return None, error_message return tree, error_message __all__ = ( "create_and_write_tmp_file", "parse_xml", ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/tool_version.py0000644000175100017510000000041415211124267021126 0ustar00runnerrunnerfrom typing import Union def remove_version_from_guid(guid: str) -> Union[str, None]: """ Removes version from toolshed-derived tool_id(=guid).
""" if "/" not in guid: return None last_slash = guid.rfind("/") return guid[:last_slash] ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/topsort.py0000644000175100017510000001513415211124267020123 0ustar00runnerrunner""" Topological sort. From Tim Peters, see: http://mail.python.org/pipermail/python-list/1999-July/006660.html topsort takes a list of pairs, where each pair (x, y) is taken to mean that x <= y wrt some abstract partial ordering. The return value is a list, representing a total ordering that respects all the input constraints. E.g., topsort( [(1,2), (3,3)] ) Valid topological sorts would be any of (but nothing other than) [3, 1, 2] [1, 3, 2] [1, 2, 3] because those are the permutations of the input elements that respect the "1 precedes 2" and "3 precedes 3" input constraints. However, this variant ensures that 'key' order (first element of tuple) is preserved, so the result returned will be: [1, 3, 2] Note that a constraint of the form (x, x) is really just a trick to make sure x appears *somewhere* in the output list. If there's a cycle in the constraints, say topsort( [(1,2), (2,1)] ) then CycleError is raised, and the exception object supports many methods to help analyze and break the cycles. This requires a good deal more code than topsort itself!
""" from random import choice class CycleError(Exception): def __init__(self, sofar, numpreds, succs): # Exception instances are not subscriptable on Python 3, so the accessors below read from self.args. Exception.__init__(self, "cycle in constraints", sofar, numpreds, succs) self.preds = None # return as much of the total ordering as topsort was able to # find before it hit a cycle def get_partial(self): return self.args[1] # return remaining elt -> count of predecessors map def get_pred_counts(self): return self.args[2] # return remaining elt -> list of successors map def get_succs(self): return self.args[3] # return remaining elements (== those that don't appear in # get_partial()) def get_elements(self): return self.get_pred_counts().keys() # Return a list of pairs representing the full state of what's # remaining (if you pass this list back to topsort, it will raise # CycleError again, and if you invoke get_pairlist on *that* # exception object, the result will be isomorphic to *this* # invocation of get_pairlist). # The idea is that you can use pick_a_cycle to find a cycle, # through some means or another pick an (x,y) pair in the cycle # you no longer want to respect, then remove that pair from the # output of get_pairlist and try topsort again. def get_pairlist(self): succs = self.get_succs() answer = [] for x in self.get_elements(): if x in succs: for y in succs[x]: answer.append((x, y)) else: # make sure x appears in topsort's output!
answer.append((x, x)) return answer # return remaining elt -> list of predecessors map def get_preds(self): if self.preds is not None: return self.preds self.preds = preds = {} remaining_elts = self.get_elements() for x in remaining_elts: preds[x] = [] succs = self.get_succs() for x in remaining_elts: if x in succs: for y in succs[x]: preds[y].append(x) if __debug__: for x in remaining_elts: assert len(preds[x]) > 0 return preds # return a cycle [x, ..., x] at random def pick_a_cycle(self): remaining_elts = self.get_elements() # We know that everything in remaining_elts has a predecessor, # but don't know that everything in it has a successor. So # crawling forward over succs may hit a dead end. Instead we # crawl backward over the preds until we hit a duplicate, then # reverse the path. preds = self.get_preds() # random.choice needs a sequence, so materialize the keys view x = choice(list(remaining_elts)) answer = [] index = {} while x not in index: index[x] = len(answer) # index of x in answer answer.append(x) x = choice(preds[x]) answer.append(x) answer = answer[index[x] :] answer.reverse() return answer def _numpreds_and_successors_from_pairlist(pairlist): numpreds = {} # elt -> # of predecessors successors = {} # elt -> list of successors for first, second in pairlist: # make sure every elt is a key in numpreds if first not in numpreds: numpreds[first] = 0 if second not in numpreds: numpreds[second] = 0 # if they're the same, there's no real dependence if first == second: continue # since first < second, second gains a pred ... numpreds[second] = numpreds[second] + 1 # ... 
and first gains a succ if first in successors: successors[first].append(second) else: successors[first] = [second] return numpreds, successors def topsort(pairlist): numpreds, successors = _numpreds_and_successors_from_pairlist(pairlist) # suck up everything without a predecessor answer = [x for x in numpreds.keys() if numpreds[x] == 0] # for everything in answer, knock down the pred count on # its successors; note that answer grows *in* the loop for x in answer: assert numpreds[x] == 0 del numpreds[x] if x in successors: for y in successors[x]: numpreds[y] = numpreds[y] - 1 if numpreds[y] == 0: answer.append(y) # following "del" isn't needed; just makes # CycleError details easier to grasp del successors[x] if numpreds: # everything in numpreds has at least one predecessor -> # there's a cycle if __debug__: for x in numpreds.keys(): assert numpreds[x] > 0 raise CycleError(answer, numpreds, successors) return answer def topsort_levels(pairlist): numpreds, successors = _numpreds_and_successors_from_pairlist(pairlist) answer = [] while True: # Suck up everything without a predecessor. levparents = [x for x in numpreds.keys() if numpreds[x] == 0] if not levparents: break answer.append(levparents) for levparent in levparents: del numpreds[levparent] if levparent in successors: for levparentsucc in successors[levparent]: numpreds[levparentsucc] -= 1 del successors[levparent] if numpreds: # Everything left in numpreds has at least one predecessor -> # there's a cycle. 
raise CycleError(answer, numpreds, successors) return answer ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/tree_dict.py0000644000175100017510000000407615211124267020356 0ustar00runnerrunnerfrom collections import UserDict from collections.abc import ( ItemsView, MutableMapping, ) from typing import ( Any, Optional, ) from boltons.iterutils import remap def enter(path, key, value): if isinstance(value, MutableMapping): return value.__class__(), ItemsView(value) else: return value, False class TreeDict(UserDict): """ Dictionary that inserts its own keys in a parent dictionary. """ def __init__(self, dict=None, **kwargs): self._parent_data: Optional[TreeDict] = None self._injected_data = {} super().__init__(dict, **kwargs) def clean_copy(self): """ Copy without injected data. """ def strip_tree_dict(path, key, value): if isinstance(value, TreeDict): value = value.data return key, value return remap(self.data, strip_tree_dict, enter=enter) def __getitem__(self, key: Any) -> Any: if key in self.data: return super().__getitem__(key) else: return self._injected_data[key] def __contains__(self, key: object) -> bool: if super().__contains__(key): return True return key in self._injected_data def __setitem__(self, key: Any, item: Any) -> None: if isinstance(item, MutableMapping): # We're not doing item = TreeDict(item) because we want to record the keys in _parent_data _item = TreeDict() _item._parent_data = self _item.update(item) item = _item current_parent_data = self._parent_data while current_parent_data is not None and key != "__current_case__": if ( key not in current_parent_data or key == "input" and key in current_parent_data and callable(current_parent_data[key]) ): current_parent_data._injected_data[key] = item current_parent_data = current_parent_data._parent_data return super().__setitem__(key, item) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 
mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/ucsc.py0000644000175100017510000000175115211124267017346 0ustar00runnerrunner""" Utilities for dealing with UCSC data. """ class UCSCLimitException(Exception): pass class UCSCOutWrapper: """File-like object that throws an exception if it encounters the UCSC limit error lines""" def __init__(self, other): self.other = iter(other) # Need one line of lookahead to be sure we are hitting the limit message self.lookahead = None def __iter__(self): return self def __next__(self): if self.lookahead is None: line = next(self.other) else: line = self.lookahead self.lookahead = None if line.startswith("----------"): next_line = next(self.other) if next_line.startswith("Reached output limit"): raise UCSCLimitException(next_line.strip()) else: self.lookahead = next_line return line def next(self): return self.__next__() def readline(self): return next(self) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/unittest.py0000644000175100017510000000177515211124267020276 0ustar00runnerrunnerimport pytest class TestCase: """Partial re-implementation of standard library unittest.TestCase using pytest methods See https://docs.pytest.org/en/latest/how-to/xunit_setup.html for a description of the pytest setup/teardown methods. 
Most assert*() methods of unittest.TestCase are not reimplemented here on purpose, normal assert statements should be used instead.""" @classmethod def setUpClass(cls): pass @classmethod def setup_class(cls): cls.setUpClass() @classmethod def tearDownClass(cls): pass @classmethod def teardown_class(cls): cls.tearDownClass() def setUp(self): pass def setup_method(self): self.setUp() def tearDown(self): pass def teardown_method(self): self.tearDown() def assertRaises(self, exception): return pytest.raises(exception) def assertRaisesRegex(self, exception, regex): return pytest.raises(exception, match=regex) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1780787404.6898096 galaxy_util-26.0.1/galaxy/util/unittest_utils/0000755000175100017510000000000015211124315021124 5ustar00runnerrunner././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/unittest_utils/__init__.py0000644000175100017510000001111615211124267023243 0ustar00runnerrunnerimport os from datetime import datetime from functools import wraps from typing import ( Callable, TypeVar, Union, ) from unittest import SkipTest import pytest from typing_extensions import ParamSpec from galaxy.util import requests from galaxy.util.commands import which def is_site_up(url: str) -> bool: try: response = requests.get(url, timeout=10) return response.status_code == 200 except Exception: return False P = ParamSpec("P") T = TypeVar("T") def skip_if_site_down(url: str) -> Callable[[Callable[P, T]], Callable[P, T]]: def method_wrapper(method: Callable[P, T]) -> Callable[P, T]: @wraps(method) def wrapped_method(*args: P.args, **kwargs: P.kwargs) -> T: if not is_site_up(url): raise SkipTest(f"Test depends on [{url}] being up and it appears to be down.") return method(*args, **kwargs) return wrapped_method return method_wrapper skip_if_github_down = skip_if_site_down("https://github.com/") skip_if_workflowhub_down = 
skip_if_site_down("https://workflowhub.eu/") def _identity(func: Callable[P, T]) -> Callable[P, T]: return func def skip_unless_executable(executable: str) -> Union[Callable[[Callable[P, T]], Callable[P, T]], pytest.MarkDecorator]: if which(executable): return _identity return pytest.mark.skip(f"PATH doesn't contain executable {executable}") def skip_unless_environ(env_var: str) -> Union[Callable[[Callable[P, T]], Callable[P, T]], pytest.MarkDecorator]: if os.environ.get(env_var): return _identity return pytest.mark.skip(f"{env_var} must be set for this test") # Pytest mark for tests that require a live LLM connection # Set GALAXY_TEST_ENABLE_LIVE_LLM=1 to run these tests pytestmark_live_llm = pytest.mark.skipif( not os.environ.get("GALAXY_TEST_ENABLE_LIVE_LLM"), reason="Live LLM tests disabled. Set GALAXY_TEST_ENABLE_LIVE_LLM=1 to enable.", ) def transient_failure(issue: int, potentially_fixed: bool = False) -> Callable[[Callable[P, T]], Callable[P, T]]: """Mark test as known transient failure with GitHub issue tracking. This decorator catches exceptions from tests and rewraps them with a marker indicating this is a known transient failure. This allows automated tooling to categorize failures and helps reviewers quickly identify flaky tests. Please create an issue on Github to track each transient failure. If a potential fix is implemented, set potentially_fixed=True to indicate that the failure may have been resolved. This will update the displayed error message and help us know if the issue can be potentially closed after a month of not being reported. Args: issue: GitHub issue number tracking this transient failure potentially_fixed: If True, indicates that the underlying issue may have been fixed, and the test failure comment will require the PR reviewer to report potential failures on the tracking issue and remove potentially_fixed. Example: @transient_failure(issue=12345) def test_flaky_selenium(self): # Test that sometimes fails due to race condition ... 
""" def decorator(func: Callable[P, T]) -> Callable[P, T]: @wraps(func) def wrapper(*args: P.args, **kwargs: P.kwargs) -> T: try: return func(*args, **kwargs) except Exception as e: if potentially_fixed: current_datetime = datetime.now().isoformat() report_this = ( "We have previously implemented a potential fix for this issue, " "if you are seeing this failure in CI on a recently branched commit, please report it on the tracking issue " f"https://github.com/galaxyproject/galaxy/issues/{issue} including the comment " f"'This issue is not fixed and was last seen at {current_datetime}' so we can mark the previous fix as insufficient." ) else: report_this = "This is known issue and doesn't need to be reported." msg = f"KNOWN TRANSIENT FAILURE [Issue #{issue}] [{report_this}]: {str(e)}" # Try to preserve exception type, fallback to plain Exception try: raise type(e)(msg) from e except (TypeError, AttributeError): # type(e) doesn't accept single string arg raise Exception(msg) from e return wrapper return decorator ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/user_agent.py0000644000175100017510000000015715211124267020544 0ustar00runnerrunnerfrom galaxy.version import VERSION def get_default_headers(): return {"user-agent": f"galaxy/{VERSION}"} ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/validation.py0000644000175100017510000000213215211124267020535 0ustar00runnerrunner"""Module for validation of incoming inputs. TODO: Refactor BaseController references to similar methods to use this module. 
""" from galaxy import exceptions from galaxy.util.sanitize_html import sanitize_html def validate_and_sanitize_basestring(key, val): if not isinstance(val, str): raise exceptions.RequestParameterInvalidException(f"{key} must be a string or unicode: {type(val)}") return sanitize_html(val) def validate_and_sanitize_basestring_list(key, val): try: assert isinstance(val, list) return [sanitize_html(t) for t in val] except (AssertionError, TypeError): raise exceptions.RequestParameterInvalidException(f"{key} must be a list of strings: {type(val)}") def validate_boolean(key, val): if not isinstance(val, bool): raise exceptions.RequestParameterInvalidException(f"{key} must be a boolean: {type(val)}") return val # TODO: # def validate_integer(self, key, val, min, max): # def validate_float(self, key, val, min, max): # def validate_number(self, key, val, min, max): # def validate_genome_build(self, key, val): ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/wait.py0000644000175100017510000000262615211124267017357 0ustar00runnerrunner"""Abstraction for waiting on API conditions to become true.""" import time from typing import ( Callable, Optional, Union, ) DEFAULT_POLLING_BACKOFF = 0 DEFAULT_POLLING_DELTA = 0.25 TIMEOUT_MESSAGE_TEMPLATE = "Timed out after {} seconds waiting on {}." timeout_type = Union[int, float] def wait_on( function: Callable, desc: str, timeout: timeout_type, delta: timeout_type = DEFAULT_POLLING_DELTA, polling_backoff: timeout_type = DEFAULT_POLLING_BACKOFF, sleep_: Optional[Callable] = None, ): """Wait for function to return non-None value. Grow the polling interval (initially ``delta`` defaulting to 0.25 seconds) incrementally by the supplied ``polling_backoff`` (defaulting to 0). Throw a TimeoutAssertionError if the supplied timeout is reached without supplied function ever returning a non-None value. 
""" sleep = sleep_ or time.sleep total_wait = 0.0 while True: if total_wait > timeout: raise TimeoutAssertionError(TIMEOUT_MESSAGE_TEMPLATE.format(total_wait, desc)) value = function() if value is not None: return value total_wait += delta sleep(delta) delta += polling_backoff class TimeoutAssertionError(AssertionError): """Derivative of AssertionError indicating wait_on exceeded max time.""" def __init__(self, message): super().__init__(message) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/watcher.py0000644000175100017510000001561315211124267020050 0ustar00runnerrunner# TODO: this is largely copied from galaxy.tool_util.toolbox.galaxy and generalized, the tool-oriented watchers in that # module should probably be updated to use this where possible import logging import os.path import time try: from watchdog.events import FileSystemEventHandler from watchdog.observers import Observer from watchdog.observers.polling import PollingObserver can_watch = True except ImportError: Observer = None # type: ignore[assignment, unused-ignore] FileSystemEventHandler = object # type: ignore[assignment,misc, unused-ignore] PollingObserver = None # type: ignore[assignment, misc, unused-ignore] can_watch = False from galaxy.util.hash_util import md5_hash_file log = logging.getLogger(__name__) def get_observer_class(config_name, config_value, default, monitor_what_str): """ """ config_value = config_value or default config_value = str(config_value).lower() if config_value in ("true", "yes", "on", "auto"): expect_observer = True observer_class = Observer elif config_value == "polling": expect_observer = True observer_class = PollingObserver elif config_value in ("false", "no", "off"): expect_observer = False observer_class = None else: message = f"Unrecognized value for {config_name} config option: {config_value}" raise Exception(message) if expect_observer and observer_class is None: message = f"Watchdog 
library unavailable, cannot monitor {monitor_what_str}." if config_value == "auto": log.info(message) else: raise Exception(message) return observer_class def get_watcher( config, config_name, default="False", monitor_what_str=None, watcher_class=None, event_handler_class=None, **kwargs ): config_value = getattr(config, config_name, None) observer_class = get_observer_class(config_name, config_value, default=default, monitor_what_str=monitor_what_str) if observer_class is not None: watcher_class = watcher_class or Watcher event_handler_class = event_handler_class or EventHandler return watcher_class(observer_class, event_handler_class, **kwargs) else: return NullWatcher() class BaseWatcher: def __init__(self, observer_class, event_handler_class, **kwargs): self.observer = None self.observer_class = observer_class self.event_handler = event_handler_class(self) self.monitored_dirs = {} def start(self): if self.observer is None: self.observer = self.observer_class() self.observer.start() self.resume_watching() def monitor(self, dir_path, recursive=False): self.monitored_dirs[dir_path] = recursive if self.observer is not None: self.observer.schedule(self.event_handler, dir_path, recursive=recursive) def resume_watching(self): for dir_path, recursive in self.monitored_dirs.items(): self.monitor(dir_path, recursive) def shutdown(self): if self.observer is not None: self.observer.stop() self.observer.join() self.observer = None class Watcher(BaseWatcher): def __init__(self, observer_class, event_handler_class, **kwargs): super().__init__(observer_class, event_handler_class, **kwargs) self.path_hash = {} self.file_callbacks = {} self.dir_callbacks = {} self.ignore_extensions = {} self.require_extensions = {} self.event_handler = event_handler_class(self) def watch_file(self, file_path, callback=None): file_path = os.path.abspath(file_path) dir_path = os.path.dirname(file_path) if dir_path not in self.monitored_dirs: if callback is not None: self.file_callbacks[file_path] = 
callback self.monitor(dir_path) log.debug("Watching for changes to file: %s", file_path) def watch_directory( self, dir_path, callback=None, recursive=False, ignore_extensions=None, require_extensions=None ): dir_path = os.path.abspath(dir_path) if dir_path not in self.monitored_dirs: if callback is not None: self.dir_callbacks[dir_path] = callback if ignore_extensions: self.ignore_extensions[dir_path] = ignore_extensions if require_extensions: self.require_extensions[dir_path] = require_extensions self.monitor(dir_path, recursive=recursive) log.debug("Watching for changes in directory%s: %s", " (recursively)" if recursive else "", dir_path) class EventHandler(FileSystemEventHandler): def __init__(self, watcher): self.watcher = watcher # this effectively excludes on_opened and on_closed def on_moved(self, event): self._handle(event) def on_created(self, event): self._handle(event) def on_deleted(self, event): self._handle(event) def on_modified(self, event): self._handle(event) def _extension_check(self, key, path): required_extensions = self.watcher.require_extensions.get(key) if required_extensions: return any(filter(path.endswith, required_extensions)) return not any(filter(path.endswith, self.watcher.ignore_extensions.get(key, []))) def _handle(self, event): # modified events will only have src path, move events will # have dest_path and src_path but we only care about dest. So # look at dest if it exists else use src. 
path = getattr(event, "dest_path", None) or event.src_path path = os.path.abspath(path) callback = self.watcher.file_callbacks.get(path) if os.path.basename(path).startswith("."): return if callback: ext_ok = self._extension_check(path, path) else: # reversed sort for getting the most specific dir first for key in sorted(self.watcher.dir_callbacks.keys(), reverse=True): if os.path.commonprefix([path, key]) == key: callback = self.watcher.dir_callbacks[key] ext_ok = self._extension_check(key, path) break if not callback or not ext_ok: return cur_hash = md5_hash_file(path) if cur_hash: if self.watcher.path_hash.get(path) == cur_hash: return else: time.sleep(0.5) if cur_hash != md5_hash_file(path): # We're still modifying the file, it'll be picked up later return self.watcher.path_hash[path] = cur_hash callback(path=path) class NullWatcher: def start(self): pass def shutdown(self): pass def watch_file(self, *args, **kwargs): pass def watch_directory(self, *args, **kwargs): pass ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/xml_macros.py0000644000175100017510000003075415211124267020562 0ustar00runnerrunnerimport os from copy import deepcopy from typing import ( Callable, Dict, Iterable, List, Optional, Tuple, TYPE_CHECKING, TypeVar, Union, ) from galaxy.util import ( parse_xml, unicodify, ) if TYPE_CHECKING: from galaxy.util import ( Element, ElementTree, ) from galaxy.util.path import StrPath MacrosDictT = Dict[str, List["Element"]] def load_with_references(path: "StrPath") -> Tuple["ElementTree", Optional[List[str]]]: """Load XML documentation from file system and preprocesses XML macros. Return the XML representation of the expanded tree and paths to referenced files that were imported (macros). 
""" tree = raw_xml_tree(path) root = tree.getroot() macros_el = _macros_el(root) if macros_el is None: return tree, [] macros: MacrosDictT = {} macro_paths = _import_macros(macros_el, path, macros) macros_el.clear() # Collect tokens tokens: Dict[str, str] = {} for m in macros.get("token", []): token_name = m.get("name") assert token_name tokens[token_name] = m.text or "" tokens = expand_nested_tokens(tokens) # Expand xml macros macro_dict: Dict[str, XmlMacroDef] = {} for m in macros.get("xml", []): macro_name = m.get("name") assert macro_name macro_dict[macro_name] = XmlMacroDef(m) _expand_macros([root], macro_dict, tokens) # reinsert template macro which are used during tool execution for m in macros.get("template", []): macros_el.append(m) _expand_tokens_for_el(root, tokens) return tree, macro_paths def load(path: "StrPath") -> "ElementTree": tree, _ = load_with_references(path) return tree def template_macro_params(root: "Element") -> Dict[str, Union[str, None]]: """ Look for template macros and populate param_dict (for cheetah) with these. """ macros_el = _macros_el(root) if macros_el is not None: return _macros_of_type(macros_el, "template", lambda el: el.text) return {} def raw_xml_tree(path: "StrPath") -> "ElementTree": """Load raw (no macro expansion) tree representation of XML represented at the specified path. 
""" tree = parse_xml(path, strip_whitespace=False, remove_comments=True) return tree def imported_macro_paths(root: "Element") -> List[str]: macros_el = _macros_el(root) if macros_el is None: return [] return _imported_macro_paths_from_el(macros_el) def _import_macros(macros_el: "Element", path: "StrPath", macros: MacrosDictT) -> Optional[List[str]]: """ root the parsed XML tree path the path to the main xml document """ xml_base_dir = os.path.dirname(path) macro_paths = _load_macros(macros_el, xml_base_dir, macros) # _xml_set_children(macros_el, macro_els) return macro_paths def _macros_el(root: "Element") -> Union["Element", None]: return root.find("macros") T = TypeVar("T") def _macros_of_type(macros_el: "Element", type: str, el_func: Callable[["Element"], T]) -> Dict[str, T]: macro_els = macros_el.findall("macro") ret: Dict[str, T] = {} for macro_el in macro_els: if macro_el.get("type") == type: macro_name = macro_el.get("name") assert macro_name ret[macro_name] = el_func(macro_el) return ret def expand_nested_tokens(tokens: Dict[str, str]) -> Dict[str, str]: for token_name in tokens.keys(): for current_token_name, current_token_value in tokens.items(): if token_name in current_token_value: if token_name == current_token_name: raise Exception(f"Token '{token_name}' cannot contain itself") tokens[current_token_name] = current_token_value.replace(token_name, tokens[token_name]) return tokens def _expand_tokens(elements: Iterable["Element"], tokens: Dict[str, str]) -> None: if not tokens: return for element in elements: _expand_tokens_for_el(element, tokens) def _expand_tokens_for_el(element: "Element", tokens: Dict[str, str]) -> None: """ expand tokens in element and (recursively) in its children replacements of text attributes and attribute values are possible """ element_text = element.text if element_text: new_value = _expand_tokens_str(element_text, tokens) if new_value is not element_text: element.text = new_value for key, value in element.attrib.items(): 
new_value = _expand_tokens_str(unicodify(value), tokens) if new_value is not value: element.attrib[key] = new_value new_key = _expand_tokens_str(unicodify(key), tokens) if new_key is not key: element.attrib[new_key] = element.attrib[key] del element.attrib[key] # recursively expand in children _expand_tokens(element.__iter__(), tokens) def _expand_tokens_str(s: str, tokens: Dict[str, str]) -> str: for key, value in tokens.items(): if key in s: s = s.replace(key, value) return s def _expand_macros( elements: Iterable["Element"], macros: Dict[str, "XmlMacroDef"], tokens: Dict[str, str], visited: Optional[List[str]] = None, ) -> None: if not macros and not tokens: return if visited is None: visited = [] for element in elements: while True: expand_el = element.find(".//expand") if expand_el is None: break _expand_macro(expand_el, macros, tokens, visited) def _expand_macro( expand_el: "Element", macros: Dict[str, "XmlMacroDef"], tokens: Dict[str, str], visited: List[str] ) -> None: macro_name = expand_el.get("macro") assert macro_name is not None, "Attempted to expand macro with no 'macro' attribute defined." # check for cycles in the nested macro expansion assert ( macro_name not in visited ), f"Cycle in nested macros: already expanded {visited} can't expand '{macro_name}' again" visited.append(macro_name) assert macro_name in macros, f"No macro named {macro_name} found, known macros are {', '.join(macros.keys())}." macro_def = macros[macro_name] macro_el = deepcopy(macro_def.element) _expand_yield_statements(macro_el, expand_el) macro_tokens = macro_def.macro_tokens(expand_el) if macro_tokens: _expand_tokens(macro_el.__iter__(), macro_tokens) # Recursively expand contained macros. _expand_macros(macro_el.__iter__(), macros, tokens, visited) _xml_replace(expand_el, macro_el.__iter__()) del visited[-1] def _expand_yield_statements(macro_el: "Element", expand_el: "Element") -> None: """ Modifies the macro_el element by replacing 1. 
all named yield tags by the content of the corresponding token tags - token tags need to be direct children of the expand - processed in order of definition of the token tags 2. all unnamed yield tags by the non-token children of the expand tag """ # replace named yields for token_el in expand_el.findall("./token"): name = token_el.attrib.get("name") assert name is not None, "Found unnamed token" + str(token_el.attrib) yield_els = list(macro_el.findall(f".//yield[@name='{name}']")) assert len(yield_els) > 0, f"No named yield found for named token {name}" for yield_el in yield_els: _xml_replace(yield_el, token_el.__iter__()) # replace unnamed yields yield_els = list(macro_el.findall(".//yield")) expand_el_children = [c for c in expand_el if c.tag != "token"] for yield_el in yield_els: _xml_replace(yield_el, expand_el_children) def _load_macros(macros_el: "Element", xml_base_dir: str, macros: MacrosDictT) -> List[str]: # Import macros from external files. macro_paths = _load_imported_macros(macros_el, xml_base_dir, macros) # Load all directly defined macros. _load_embedded_macros(macros_el, macros) return macro_paths def _load_embedded_macros(macros_el: "Element", macros: MacrosDictT) -> None: # attribute typed macro for macro in macros_el.iterfind("macro"): if "type" not in macro.attrib: macro.attrib["type"] = "xml" macro_type = unicodify(macro.attrib["type"]) try: macros[macro_type].append(macro) except KeyError: macros[macro_type] = [macro] # type shortcuts (e.g. <xml> is a shortcut for <macro type="xml">). 
for tag in ["template", "xml", "token"]: for macro_el in macros_el.iterfind(tag): macro_el.attrib["type"] = tag macro_el.tag = "macro" try: macros[tag].append(macro_el) except KeyError: macros[tag] = [macro_el] def _load_imported_macros(macros_el: "Element", xml_base_dir: str, macros: MacrosDictT) -> List[str]: macro_paths = [] for tool_relative_import_path in _imported_macro_paths_from_el(macros_el): import_path = os.path.join(xml_base_dir, tool_relative_import_path) macro_paths.append(import_path) current_macro_paths = _load_macro_file(import_path, xml_base_dir, macros) macro_paths.extend(current_macro_paths) return macro_paths def _imported_macro_paths_from_el(macros_el: "Element") -> List[str]: imported_macro_paths = [] for macro_import_el in macros_el.findall("import"): raw_import_path = macro_import_el.text assert raw_import_path imported_macro_paths.append(raw_import_path) return imported_macro_paths def _load_macro_file(path: "StrPath", xml_base_dir: str, macros: MacrosDictT) -> List[str]: tree = parse_xml(path, strip_whitespace=False) root = tree.getroot() return _load_macros(root, xml_base_dir, macros) def _xml_replace(query: "Element", targets: Iterable["Element"]) -> None: parent_el = query.find("..") assert parent_el is not None matching_index = -1 # for index, el in enumerate(parent_el.iter('.')): ## Something like this for newer implementation for index, el in enumerate(parent_el): if el == query: matching_index = index break assert matching_index >= 0 current_index = matching_index for target in targets: current_index += 1 parent_el.insert(current_index, deepcopy(target)) parent_el.remove(query) class XmlMacroDef: """ representation of a (Galaxy) XML macro stores the root element of the macro and the parameters. each parameter is represented as pair containing - the quote character, default '@' - parameter name Parameter names can be given as comma separated list using the `tokens` attribute or as attributes `token_XXX` (where `XXX` is the name). 
    The former option should be used to specify required attributes of the
    macro and the latter for optional attributes of the macro (the value of
    `token_XXX` is used as default value).

    TODO: `token_quote` forbids `"quote"` as character name of optional
    parameters
    """

    def __init__(self, el: "Element") -> None:
        self.element = el
        tokens: Dict[str, Union[str, None]] = {}
        self.token_quote = "@"
        for key, value in el.attrib.items():
            key = unicodify(key)
            value = unicodify(value)
            if key == "token_quote":
                self.token_quote = value
            if key == "tokens":
                for token in value.split(","):
                    tokens[token] = None  # here None means that the token is a required parameter
            elif key.startswith("token_"):
                token = key[len("token_") :]
                tokens[token] = value
        self.tokens = tokens

    def macro_tokens(self, expand_el: "Element") -> Dict[str, str]:
        """
        get a dictionary mapping token names to values. The names are the
        parameter names surrounded by the quote character. Values are taken
        from the expand_el; if absent, default values of optional parameters
        are used.
        """
        tokens: Dict[str, str] = {}
        for key, default_val in self.tokens.items():
            token_value = expand_el.attrib.get(key, default_val)
            if token_value is None:
                raise ValueError(f"Failed to expand macro - missing required parameter [{key}].")
            token_name = f"{self.token_quote}{key.upper()}{self.token_quote}"
            tokens[token_name] = token_value
        return tokens


__all__ = (
    "imported_macro_paths",
    "load",
    "load_with_references",
    "raw_xml_tree",
    "template_macro_params",
)
././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/yaml_util.py0000644000175100017510000000541615211124267020412 0ustar00runnerrunner
import logging
import os
from collections import OrderedDict

import yaml
from yaml.constructor import ConstructorError

try:
    from yaml import CSafeLoader as SafeLoader
except ImportError:
    from yaml import SafeLoader  # type: ignore[assignment]

log = logging.getLogger(__name__)


class OrderedLoader(SafeLoader):
    # This class was pulled out of ordered_load() for the sake of
    # mocking __init__ in a unit test.
    def __init__(self, stream):
        self._root = os.path.split(stream.name)[0]
        super().__init__(stream)

    def include(self, node):
        filename = os.path.join(self._root, self.construct_scalar(node))
        with open(filename) as f:
            return yaml.load(f, OrderedLoader)


def ordered_load(stream, merge_duplicate_keys=False):
    """
    Parse the first YAML document in a stream and produce the corresponding
    Python object.

    If merge_duplicate_keys is True, merge the values of duplicate mapping keys
    into a list, as the uWSGI "dumb" YAML parser would do.
    Otherwise, following YAML 1.2 specification which says that "each key is
    unique in the association", raise a ConstructorError exception.
    """

    def construct_mapping(loader, node, deep=False):
        loader.flatten_mapping(node)
        mapping = {}
        merged_duplicate = {}
        for key_node, value_node in node.value:
            key = loader.construct_object(key_node, deep=deep)
            value = loader.construct_object(value_node, deep=deep)
            if key in mapping:
                if not merge_duplicate_keys:
                    raise ConstructorError(
                        "while constructing a mapping",
                        node.start_mark,
                        f"found duplicated key ({key})",
                        key_node.start_mark,
                    )
                log.debug("Merging values for duplicate key '%s' into a list", key)
                if merged_duplicate.get(key):
                    mapping[key].append(value)
                else:
                    mapping[key] = [mapping[key], value]
                    merged_duplicate[key] = True
            else:
                mapping[key] = value
        return mapping

    OrderedLoader.add_constructor(yaml.resolver.BaseResolver.DEFAULT_MAPPING_TAG, construct_mapping)
    OrderedLoader.add_constructor("!include", OrderedLoader.include)

    return yaml.load(stream, OrderedLoader)


def ordered_dump(data, stream=None, Dumper=yaml.Dumper, **kwds):
    class OrderedDumper(Dumper):
        pass

    def _dict_representer(dumper, data):
        return dumper.represent_mapping(yaml.resolver.BaseResolver.DEFAULT_MAPPING_TAG, list(data.items()))

    OrderedDumper.add_representer(OrderedDict, _dict_representer)

    return yaml.dump(data, stream, OrderedDumper, **kwds)
././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/util/zipstream.py0000644000175100017510000000635515211124267020434 0ustar00runnerrunner
import os
import zlib
from typing import (
    Dict,
    Iterator,
    List,
    Optional,
    Set,
)
from urllib.parse import quote

import zipstream

from galaxy.util import to_content_disposition
from .path import safe_walk

CRC32_MIN = 1444
CRC32_MAX = 1459


class ZipstreamWrapper:
    def __init__(
        self, archive_name: Optional[str] = None, upstream_mod_zip: bool = False, upstream_gzip: bool = False
    ) -> None:
        self.upstream_mod_zip = upstream_mod_zip
        self.archive_name = archive_name
        if not self.upstream_mod_zip:
            self.archive = zipstream.ZipFile(
                allowZip64=True,
                compression=zipstream.ZIP_STORED if upstream_gzip else zipstream.ZIP_DEFLATED
            )
        self.files: List[str] = []
        self.directories: Set[str] = set()
        self.size = 0

    def response(self) -> Iterator[bytes]:
        if self.upstream_mod_zip:
            dir_lines = [f"0 0 @directory {directory}" for directory in self.directories]
            yield "\n".join(dir_lines + self.files).encode()
        else:
            yield from iter(self.archive)

    def get_headers(self) -> Dict[str, str]:
        headers = {}
        if self.archive_name:
            headers["Content-Disposition"] = to_content_disposition(f"{self.archive_name}.zip")
        if self.upstream_mod_zip:
            headers["X-Archive-Files"] = "zip"
        else:
            headers["Content-Type"] = "application/x-zip-compressed"
        return headers

    def add_path(self, path: str, archive_name: str) -> None:
        size = int(os.stat(path).st_size)
        if self.upstream_mod_zip:
            # calculating crc32 would defeat the point of using mod-zip, but if we ever calculate hashsums we should consider this
            crc32 = "-"
            # We do have to calculate the crc32 for files that are between 1444 and 1459 bytes in size, xref: https://github.com/evanmiller/mod_zip/issues/44#issuecomment-656660686
            # Oddly that seems to be only true for usegalaxy.org (nginx version 1.12.2), and works fine locally (nginx 1.19.10).
            # May have been fixed in nginx 1.17.0
            if CRC32_MIN <= os.path.getsize(path) <= CRC32_MAX:
                with open(path, "rb") as contents:
                    crc32 = hex(zlib.crc32(contents.read()))[2:]
            line = f"{crc32} {size} {quote(path)} {archive_name}"
            head, tail = os.path.split(archive_name)
            if head:
                self.directories.add(head)
            self.files.append(line)
        else:
            self.size += size
            self.archive.write(path, archive_name)

    def write(self, path: str, archive_name: Optional[str] = None) -> None:
        if os.path.isdir(path):
            pardir = os.path.join(path, os.pardir)
            for root, directories, files in safe_walk(path):
                for directory in directories:
                    dir_path = os.path.join(root, directory)
                    self.add_path(dir_path, os.path.relpath(dir_path, pardir))
                for file in files:
                    file_path = os.path.join(root, file)
                    self.add_path(file_path, os.path.relpath(file_path, pardir))
        else:
            self.add_path(path, archive_name or os.path.basename(path))
././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787383.0 galaxy_util-26.0.1/galaxy/version.py0000644000175100017510000000016415211124267017116 0ustar00runnerrunner
VERSION_MAJOR = "26.0"
VERSION_MINOR = "1"
VERSION = VERSION_MAJOR + (f".{VERSION_MINOR}" if VERSION_MINOR else "")
././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1780787404.6908786 galaxy_util-26.0.1/galaxy_util.egg-info/0000755000175100017510000000000015211124315017577 5ustar00runnerrunner
././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787404.0 galaxy_util-26.0.1/galaxy_util.egg-info/PKG-INFO0000644000175100017510000006630215211124314020702 0ustar00runnerrunner
Metadata-Version: 2.4
Name: galaxy-util
Version: 26.0.1
Summary: Galaxy generic utilities
Home-page: https://github.com/galaxyproject/galaxy
Author: Galaxy Project and Community
Author-email: galaxy-committers@lists.galaxyproject.org
License: MIT
Requires-Python: >=3.8
Description-Content-Type: text/x-rst
License-File: LICENSE
Requires-Dist: bleach
Requires-Dist: boltons
Requires-Dist: docutils!=0.17,!=0.17.1
Requires-Dist: importlib-resources>=5.10.0; python_version < "3.12"
Requires-Dist: packaging
Requires-Dist: pyparsing>=3.0.0
Requires-Dist: PyYAML
Requires-Dist: requests
Requires-Dist: typing-extensions
Requires-Dist: zipstream-new
Provides-Extra: image-util
Requires-Dist: pillow; extra == "image-util"
Provides-Extra: jstree
Requires-Dist: dictobj; extra == "jstree"
Provides-Extra: template
Requires-Dist: CT3>=3.3.3; extra == "template"
Requires-Dist: fissix; python_version >= "3.13" and extra == "template"
Requires-Dist: future>=1.0.0; extra == "template"
Provides-Extra: config-template
Requires-Dist: galaxy-tool-util-models; extra == "config-template"
Requires-Dist: Jinja2; extra == "config-template"
Requires-Dist: pydantic>=2.7.4; extra == "config-template"
Provides-Extra: test
Requires-Dist: pytest; extra == "test"
Requires-Dist: pytest-httpserver; extra == "test"
Requires-Dist: responses; extra == "test"
Requires-Dist: Werkzeug; extra == "test"
Dynamic: license-file

.. image:: https://badge.fury.io/py/galaxy-util.svg
        :target: https://pypi.org/project/galaxy-util/

Overview
--------

The Galaxy_ utilities module.

* Code: https://github.com/galaxyproject/galaxy

.. _Galaxy: http://galaxyproject.org/

History
-------

..
to_doc ------------------- 26.0.1 (2026-06-04) ------------------- ========= Bug fixes ========= * Fixes looks_like_flattened_repeat_key helper by `@guerler `_ in `#22578 `_ ============ Enhancements ============ * Replace per-term joins in workflow search with EXISTS subqueries by `@mvdbeek `_ in `#22548 `_ ------------------- 26.0.0 (2026-04-08) ------------------- ========= Bug fixes ========= * Plumbing for tracking potential fixes for transient failures (and a fix demonstrating it) by `@jmchilton `_ in `#21243 `_ * Remove unused handle_tool_shed_url_protocol by `@mvdbeek `_ in `#21925 `_ * Raise MessageException instead of generic Exception in rules_dsl by `@mvdbeek `_ in `#22285 `_ * Improve timeout and error handling in ``/api/proxy`` endpoint by `@mvdbeek `_ in `#22297 `_ * Skip WorkflowHub tests when workflowhub.eu is down by `@mvdbeek `_ in `#22302 `_ * Discard rest of line in chunks in iter_start_of_line by `@mvdbeek `_ in `#22332 `_ * Fix Content-Disposition header with trailing whitespace by `@mvdbeek `_ in `#22379 `_ ============ Enhancements ============ * Update Python dependencies by `@galaxybot `_ in `#21043 `_ * Add Playwright Backend Support to Galaxy Browser Automation Framework by `@jmchilton `_ in `#21102 `_ * Add Custom Validation for User-Configured Templates by `@davelopez `_ in `#21155 `_ * Add type annotations to job handling code by `@nsoranzo `_ in `#21171 `_ * Richer tracking of transient failures. 
by `@jmchilton `_ in `#21227 `_ * Update fastapi to 0.123.4 and ``get_openapi()`` fork by `@nsoranzo `_ in `#21384 `_ * Add AI Agent Framework and ChatGXY 2.0 by `@dannon `_ in `#21434 `_ * Fix use of function, method and argument names deprecated in pyparsing 3.0.0 by `@nsoranzo `_ in `#21517 `_ * Apply 2026 black style by `@galaxybot `_ in `#21618 `_ * Add tests for oidc usernames by `@nuwang `_ in `#21655 `_ * Various fixes to file source template's validation system by `@davelopez `_ in `#21704 `_ ------------------- 25.1.2 (2026-03-09) ------------------- No recorded changes since last release ------------------- 25.1.1 (2026-02-03) ------------------- No recorded changes since last release ------------------- 25.1.0 (2025-12-12) ------------------- ========= Bug fixes ========= * Extract: do not use common prefix dir by `@bernt-matthias `_ in `#20929 `_ * Test and fix CORS on exceptions by `@mvdbeek `_ in `#21105 `_ ============ Enhancements ============ * Implement Sample Sheets by `@jmchilton `_ in `#19305 `_ * Empower Users to More Pragmatically Import Datasets & Collections From Tables by `@jmchilton `_ in `#20288 `_ * Type annotation fixes for mypy 1.16.0 by `@nsoranzo `_ in `#20424 `_ * Remove deprecated tool document cache by `@nsoranzo `_ in `#20510 `_ * Refactor Files Sources Framework for stronger typing using pydantic models by `@davelopez `_ in `#20728 `_ * Support remote file source hashes by `@davelopez `_ in `#20853 `_ ------------------- 25.0.4 (2025-11-18) ------------------- No recorded changes since last release ------------------- 25.0.3 (2025-09-23) ------------------- No recorded changes since last release ------------------- 25.0.2 (2025-08-13) ------------------- ========= Bug fixes ========= * Prevent importing workflows with invalid step UUID by `@davelopez `_ in `#20596 `_ * Remove base_dir from zip in make_fast_zipfile by `@davelopez `_ in `#20739 `_ ------------------- 25.0.1 (2025-06-20) ------------------- No recorded changes 
since last release ------------------- 25.0.0 (2025-06-18) ------------------- ========= Bug fixes ========= * Use ``resource_path()`` to access datatypes_conf.xml.sample as a package resource by `@nsoranzo `_ in `#19331 `_ * Use fissix also when python3-lib2to3 is not installed by `@nsoranzo `_ in `#19749 `_ * Fix ``test_in_directory`` on osx by `@mvdbeek `_ in `#19943 `_ ============ Enhancements ============ * Calculate hash for new non-deferred datasets when finishing a job by `@nsoranzo `_ in `#19181 `_ * Fix UP031 errors - Part 3 by `@nsoranzo `_ in `#19218 `_ * Fix UP031 errors - Part 4 by `@nsoranzo `_ in `#19235 `_ * Fix UP031 errors - Part 5 by `@nsoranzo `_ in `#19282 `_ * Type annotation fixes for mypy 1.14.0 by `@nsoranzo `_ in `#19372 `_ * Empower Users to Build More Kinds of Collections, More Intelligently by `@jmchilton `_ in `#19377 `_ * Set safe default extraction filter for tar archives by `@nsoranzo `_ in `#19406 `_ * Format code with black 25.1.0 by `@nsoranzo `_ in `#19625 `_ * Improve type annotations of ``ModelPersistenceContext`` and derived classes by `@nsoranzo `_ in `#19852 `_ * Allow PathLike parameters in ``make_fast_zipfile()`` by `@nsoranzo `_ in `#19955 `_ * Implement dataset collection support in workflow landing requests by `@mvdbeek `_ in `#20004 `_ * Add DOI to workflow metadata by `@jdavcs `_ in `#20033 `_ * Improve type annotation of `galaxy.util` submodules by `@nsoranzo `_ in `#20104 `_ * Additional type hints for ``toolbox.get_tool`` / ``toolbox.has_tool`` by `@mvdbeek `_ in `#20150 `_ ------------------- 24.2.4 (2025-06-17) ------------------- ========= Bug fixes ========= * Use ``make_fast_zipfile`` directly by `@mvdbeek `_ in `#19947 `_ ------------------- 24.2.3 (2025-03-16) ------------------- No recorded changes since last release ------------------- 24.2.2 (2025-03-08) ------------------- ============ Enhancements ============ * Add bwa_mem2_index directory datatype, framework enhancements for testing directories by 
`@mvdbeek `_ in `#19694 `_ ------------------- 24.2.1 (2025-02-28) ------------------- No recorded changes since last release ------------------- 24.2.0 (2025-02-11) ------------------- ========= Bug fixes ========= * Fixes for errors reported by mypy 1.11.0 by `@nsoranzo `_ in `#18608 `_ * Fix numerous issues with tool input format "21.01" by `@jmchilton `_ in `#19030 `_ * Partial backport of #19331 by `@nsoranzo `_ in `#19342 `_ * Fix config template validation for file sources and object store templates by `@davelopez `_ in `#19414 `_ * Serialize message exceptions on execution error by `@mvdbeek `_ in `#19483 `_ ============ Enhancements ============ * Allow OAuth 2.0 user defined file sources (w/Dropbox integration) by `@jmchilton `_ in `#18272 `_ * Add Python 3.13 support by `@nsoranzo `_ in `#18449 `_ * Add Tool-Centric APIs to the Tool Shed 2.0 by `@jmchilton `_ in `#18524 `_ * Rip repository_registry out of tool shed 2.0 by `@jmchilton `_ in `#18647 `_ * Workflow Landing Requests by `@jmchilton `_ in `#18807 `_ * Update Mypy to 1.11.2 and fix new signature override errors by `@nsoranzo `_ in `#18811 `_ * Raise exception if CompressedFile used on incompatible file by `@mvdbeek `_ in `#18888 `_ * Type annotations and fixes by `@nsoranzo `_ in `#18911 `_ * Workflow landing improvements by `@mvdbeek `_ in `#18979 `_ * Run installed Galaxy with no config and a simplified entry point by `@natefoo `_ in `#19050 `_ * Enhance UTF-8 support for filename handling in downloads by `@arash77 `_ in `#19161 `_ ------------------- 24.1.4 (2024-12-11) ------------------- ========= Bug fixes ========= * Fix Archive header encoding by `@arash77 `_ in `#18583 `_ * File source and object store instance api fixes by `@mvdbeek `_ in `#18685 `_ ============ Enhancements ============ * Use smtplib send_message to support utf-8 chars in to and from by `@mvdbeek `_ in `#18805 `_ ------------------- 24.1.3 (2024-10-25) ------------------- ========= Bug fixes ========= * Fix Archive 
header encoding by `@arash77 `_ in `#18583 `_ * File source and object store instance api fixes by `@mvdbeek `_ in `#18685 `_ ============ Enhancements ============ * Use smtplib send_message to support utf-8 chars in to and from by `@mvdbeek `_ in `#18805 `_ ------------------- 24.1.2 (2024-09-25) ------------------- ========= Bug fixes ========= * Fix Archive header encoding by `@arash77 `_ in `#18583 `_ * File source and object store instance api fixes by `@mvdbeek `_ in `#18685 `_ ============ Enhancements ============ * Use smtplib send_message to support utf-8 chars in to and from by `@mvdbeek `_ in `#18805 `_ ------------------- 24.1.1 (2024-07-02) ------------------- ========= Bug fixes ========= * Fix bug in image_util.py by `@kostrykin `_ in `#17749 `_ * Revert some requests import changes by `@nsoranzo `_ in `#18199 `_ ============ Enhancements ============ * Better display of estimated line numbers and add number of columns for tabular by `@bernt-matthias `_ in `#17492 `_ * Update Python dependencies by `@galaxybot `_ in `#17653 `_ * Code cleanups from ruff and pyupgrade by `@nsoranzo `_ in `#17654 `_ * SQLAlchemy 2.0 by `@jdavcs `_ in `#17778 `_ * Error reporting unit tests by `@jmchilton `_ in `#17968 `_ * Enable ``warn_unused_ignores`` mypy option by `@nsoranzo `_ in `#17991 `_ * Add galaxy to user agent by `@mvdbeek `_ in `#18003 `_ * Update Python dependencies by `@galaxybot `_ in `#18063 `_ * Enable flake8-implicit-str-concat ruff rules by `@nsoranzo `_ in `#18067 `_ * Overhaul Azure storage infrastructure. 
by `@jmchilton `_ in `#18087 `_ * Empower users to bring their own storage and file sources by `@jmchilton `_ in `#18127 `_ * Harden User Object Store and File Source Creation by `@jmchilton `_ in `#18172 `_ ------------------- 24.0.3 (2024-06-28) ------------------- ========= Bug fixes ========= * Use config_section to distinguish between galaxy and ts or other apps by `@jdavcs `_ in `#18215 `_ ------------------- 24.0.2 (2024-05-07) ------------------- ========= Bug fixes ========= * Adds logging of messageExceptions in the fastapi exception handler. by `@dannon `_ in `#18041 `_ ------------------- 24.0.1 (2024-05-02) ------------------- ========= Bug fixes ========= * Fix conditional Image imports by `@mvdbeek `_ in `#17899 `_ ------------------- 24.0.0 (2024-04-02) ------------------- ========= Bug fixes ========= * Optional Reply-to SMTP header in tool error reports by `@neoformit `_ in `#17243 `_ * Follow-up on #17274 and #17262 by `@nsoranzo `_ in `#17302 `_ * Fixes for flake8-bugbear 24.1.17 by `@nsoranzo `_ in `#17340 `_ ============ Enhancements ============ * Add support for Python 3.12 by `@tuncK `_ in `#16796 `_ * Python 3.8 as minimum by `@mr-c `_ in `#16954 `_ * Remove web framework dependency from tools by `@davelopez `_ in `#17058 `_ * Add support for (fast5.tar).xz binary compressed files by `@tuncK `_ in `#17106 `_ * Reuse test instance during non-integration tests by `@mvdbeek `_ in `#17234 `_ * Add OIDC backend configuration schema and validation by `@uwwint `_ in `#17274 `_ * Enable ``warn_unreachable`` mypy option by `@mvdbeek `_ in `#17365 `_ * Fix type annotation of code using XML etree by `@nsoranzo `_ in `#17367 `_ * Update to black 2024 stable style by `@nsoranzo `_ in `#17391 `_ * Add `image_diff` comparison method for test output verification using images by `@kostrykin `_ in `#17556 `_ ------------------- 23.2.1 (2024-02-21) ------------------- ========= Bug fixes ========= * Ruff and flake8 fixes by `@nsoranzo `_ in `#16884 `_ 
============ Enhancements ============ * Tool Shed 2.0 by `@jmchilton `_ in `#15639 `_ * Move database access code out of ``galaxy.util`` by `@jdavcs `_ in `#16526 `_ * Tweak tool memory use and optimize shared memory when using preload by `@mvdbeek `_ in `#16536 `_ * Updated path-based interactive tools with entry point path injection, support for ITs with relative links, shortened URLs, doc and config updates including Podman job_conf by `@sveinugu `_ in `#16795 `_ * Allow partial matches in workflow name tag search and search all tags for unquoted query by `@ahmedhamidawan `_ in `#16860 `_ * Use python-isal for fast zip deflate compression in rocrate export by `@mvdbeek `_ in `#17342 `_ ============= Other changes ============= * Merge 23.1 into dev by `@mvdbeek `_ in `#16534 `_ ------------------- 23.1.4 (2024-01-04) ------------------- No recorded changes since last release ------------------- 23.1.3 (2023-12-01) ------------------- No recorded changes since last release ------------------- 23.1.2 (2023-11-29) ------------------- ============ Enhancements ============ * Improve invocation error reporting by `@mvdbeek `_ in `#16917 `_ ------------------- 23.1.1 (2023-10-23) ------------------- ========= Bug fixes ========= * Fix bad auto-merge of dev. 
by `@jmchilton `_ in `#15386 `_ * Fix some drs handling issues by `@nuwang `_ in `#15777 `_ * Enable ``strict_equality`` mypy option by `@nsoranzo `_ in `#15808 `_ * Ensure session is request-scoped for legacy endpoints by `@jdavcs `_ in `#16207 `_ * Fix form builder value handling by `@guerler `_ in `#16304 `_ * Backport tool mem fixes by `@mvdbeek `_ in `#16601 `_ * Workaround for XML nodes of job resource parameters losing their children by `@kysrpex `_ in `#16728 `_ * Fix allowlist deserialization in file sources by `@mvdbeek `_ in `#16729 `_ * Exclude on_opened and on_closed from watcher events by `@mvdbeek `_ in `#16850 `_ ============ Enhancements ============ * Various Tool Shed Cleanup by `@jmchilton `_ in `#15247 `_ * Protection against problematic boolean parameters. by `@jmchilton `_ in `#15493 `_ * Unify url handling with filesources by `@nuwang `_ in `#15497 `_ * Explore tool remote test data by `@davelopez `_ in `#15510 `_ * Drop database views by `@jdavcs `_ in `#15876 `_ * Update Python dependencies by `@galaxybot `_ in `#15890 `_ * Record input datasets and collections at full parameter path by `@mvdbeek `_ in `#15978 `_ * Code cleanups from ruff and pyupgrade by `@nsoranzo `_ in `#16035 `_ * Vendorise ``packaging.versions.LegacyVersion`` by `@nsoranzo `_ in `#16058 `_ * Improve histories and datasets immutability checks by `@davelopez `_ in `#16143 `_ * Merge ``Target`` class with ``CondaTarget`` by `@nsoranzo `_ in `#16181 `_ ------------------- 23.0.6 (2023-10-23) ------------------- No recorded changes since last release ------------------- 23.0.5 (2023-07-29) ------------------- No recorded changes since last release ------------------- 23.0.4 (2023-06-30) ------------------- No recorded changes since last release ------------------- 23.0.3 (2023-06-26) ------------------- No recorded changes since last release ------------------- 23.0.2 (2023-06-13) ------------------- No recorded changes since last release ------------------- 23.0.1 
(2023-06-08) ------------------- ========= Bug fixes ========= * Replace httpbin service with pytest-httpserver by `@mvdbeek `_ in `#16042 `_ ------------------- 22.1.2 (2022-12-08) ------------------- * Pin packaging dependency to < 22, fixes ``LegacyVersion`` import errors * Add missing pyparsing dependency ------------------- 22.1.1 (2022-08-22) ------------------- * First release from the 22.01 branch of Galaxy ------------------- 21.1.0 (2021-03-19) ------------------- * First release from the 21.01 branch of Galaxy. ------------------- 20.9.1 (2020-10-28) ------------------- ------------------- 20.9.0 (2020-10-15) ------------------- * First release from the 20.09 branch of Galaxy. ------------------- 20.5.0 (2020-07-03) ------------------- * First release from 20.05 branch of Galaxy. ------------------- 19.9.0 (2019-11-21) ------------------- * Initial import from dev branch of Galaxy during 19.09 development cycle. ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787404.0 galaxy_util-26.0.1/galaxy_util.egg-info/SOURCES.txt0000644000175100017510000000450715211124314021470 0ustar00runnerrunnerHISTORY.rst LICENSE MANIFEST.in README.rst dev-requirements.txt pyproject.toml setup.cfg galaxy/__init__.py galaxy/py.typed galaxy/version.py galaxy/exceptions/__init__.py galaxy/exceptions/error_codes.json galaxy/exceptions/error_codes.py galaxy/exceptions/utils.py galaxy/util/__init__.py galaxy/util/aliaspickler.py galaxy/util/bool_expressions.py galaxy/util/bunch.py galaxy/util/bytesize.py galaxy/util/checkers.py galaxy/util/commands.py galaxy/util/compression_utils.py galaxy/util/config_parsers.py galaxy/util/config_templates.py galaxy/util/dictifiable.py galaxy/util/docutils_template.txt galaxy/util/dynamic.py galaxy/util/expressions.py galaxy/util/facts.py galaxy/util/filelock.py galaxy/util/form_builder.py galaxy/util/hash_util.py galaxy/util/heartbeat.py galaxy/util/image_util.py galaxy/util/inflection.py galaxy/util/json.py 
galaxy/util/jstree.py
galaxy/util/lazy_process.py
galaxy/util/markdown.py
galaxy/util/monitors.py
galaxy/util/odict.py
galaxy/util/oset.py
galaxy/util/permutations.py
galaxy/util/plugin_config.py
galaxy/util/properties.py
galaxy/util/renamed_temporary_file.py
galaxy/util/requests.py
galaxy/util/resources.py
galaxy/util/rst_to_html.py
galaxy/util/rules_dsl.py
galaxy/util/rules_dsl_spec.yml
galaxy/util/sanitize_html.py
galaxy/util/script.py
galaxy/util/search.py
galaxy/util/simplegraph.py
galaxy/util/sleeper.py
galaxy/util/sockets.py
galaxy/util/specs.py
galaxy/util/sqlite.py
galaxy/util/submodules.py
galaxy/util/task.py
galaxy/util/template.py
galaxy/util/themes.py
galaxy/util/tool_version.py
galaxy/util/topsort.py
galaxy/util/tree_dict.py
galaxy/util/ucsc.py
galaxy/util/unittest.py
galaxy/util/user_agent.py
galaxy/util/validation.py
galaxy/util/wait.py
galaxy/util/watcher.py
galaxy/util/xml_macros.py
galaxy/util/yaml_util.py
galaxy/util/zipstream.py
galaxy/util/custom_logging/__init__.py
galaxy/util/custom_logging/fluent_log.py
galaxy/util/path/__init__.py
galaxy/util/path/ntpath.py
galaxy/util/path/posixpath.py
galaxy/util/tool_shed/__init__.py
galaxy/util/tool_shed/common_util.py
galaxy/util/tool_shed/encoding_util.py
galaxy/util/tool_shed/tool_shed_registry.py
galaxy/util/tool_shed/xml_util.py
galaxy/util/unittest_utils/__init__.py
galaxy_util.egg-info/PKG-INFO
galaxy_util.egg-info/SOURCES.txt
galaxy_util.egg-info/dependency_links.txt
galaxy_util.egg-info/requires.txt
galaxy_util.egg-info/top_level.txt
././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787404.0 galaxy_util-26.0.1/galaxy_util.egg-info/dependency_links.txt0000644000175100017510000000000115211124314023644 0ustar00runnerrunner

././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787404.0 galaxy_util-26.0.1/galaxy_util.egg-info/requires.txt0000644000175100017510000000063015211124314022175 0ustar00runnerrunner
bleach
boltons
docutils!=0.17,!=0.17.1
packaging
pyparsing>=3.0.0
PyYAML
requests
typing-extensions
zipstream-new

[:python_version < "3.12"]
importlib-resources>=5.10.0

[config-template]
galaxy-tool-util-models
Jinja2
pydantic>=2.7.4

[image-util]
pillow

[jstree]
dictobj

[template]
CT3>=3.3.3
future>=1.0.0

[template:python_version >= "3.13"]
fissix

[test]
pytest
pytest-httpserver
responses
Werkzeug
././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787404.0 galaxy_util-26.0.1/galaxy_util.egg-info/top_level.txt0000644000175100017510000000000715211124314022325 0ustar00runnerrunner
galaxy
././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1780787384.0 galaxy_util-26.0.1/pyproject.toml0000644000175100017510000000042515211124270016500 0ustar00runnerrunner
[build-system]
build-backend = "setuptools.build_meta"
requires = ["setuptools"]

[project]
dynamic = [
    "authors",
    "dependencies",
    "description",
    "license",
    "optional-dependencies",
    "readme",
    "requires-python",
    "version",
]
name = "galaxy-util"
././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1780787404.6929255 galaxy_util-26.0.1/setup.cfg0000644000175100017510000000311115211124315015400 0ustar00runnerrunner
[metadata]
author = Galaxy Project and Community
author_email = galaxy-committers@lists.galaxyproject.org
classifiers =
    Development Status :: 5 - Production/Stable
    Environment :: Console
    Intended Audience :: Developers
    Natural Language :: English
    Operating System :: POSIX
    Programming Language :: Python :: 3
    Programming Language :: Python :: 3.8
    Programming Language :: Python :: 3.9
    Programming Language :: Python :: 3.10
    Programming Language :: Python :: 3.11
    Programming Language :: Python :: 3.12
    Programming Language :: Python :: 3.13
    Programming Language :: Python :: 3.14
    Topic :: Software Development
    Topic :: Software Development :: Code Generators
    Topic :: Software Development :: Testing
description = Galaxy generic utilities
keywords =
    Galaxy
license = MIT
license_files =
    LICENSE
long_description = file: README.rst, HISTORY.rst
long_description_content_type = text/x-rst
name = galaxy-util
url = https://github.com/galaxyproject/galaxy
version = 26.0.1

[options]
include_package_data = True
install_requires =
    bleach
    boltons
    docutils!=0.17,!=0.17.1
    importlib-resources>=5.10.0;python_version<'3.12'
    packaging
    pyparsing>=3.0.0
    PyYAML
    requests
    typing-extensions
    zipstream-new
packages = find:
python_requires = >=3.8

[options.extras_require]
image-util =
    pillow
jstree =
    dictobj
template =
    CT3>=3.3.3
    fissix;python_version>='3.13'
    future>=1.0.0
config-template =
    galaxy-tool-util-models
    Jinja2
    pydantic>=2.7.4
test =
    pytest
    pytest-httpserver
    responses
    Werkzeug

[options.packages.find]
exclude =
    tests*

[egg_info]
tag_build =
tag_date = 0