pax_global_header00006660000000000000000000000064130375007720014516gustar00rootroot0000000000000052 comment=915395c619cec93ec663ce4cced397e57663a0b3 trapperkeeper-status-0.7.1/000077500000000000000000000000001303750077200157155ustar00rootroot00000000000000trapperkeeper-status-0.7.1/.gitignore000066400000000000000000000001531303750077200177040ustar00rootroot00000000000000/target /classes /checkouts pom.xml pom.xml.asc *.jar *.class /.lein-* /.nrepl-port /resources/locales.clj trapperkeeper-status-0.7.1/.travis.yml000066400000000000000000000007731303750077200200350ustar00rootroot00000000000000language: clojure lein: 2.7.1 jdk: - oraclejdk7 - openjdk7 script: ./ext/travisci/test.sh notifications: email: false hipchat: rooms: secure: CfGS3yYsocLruSP0lRr9HFwzdZ1HFw4rFEVdJkP/i0i8aIPY3XB3vTQNxw/lIBVRDwWHaUfA+xnyK3itCxt0M0UtPDp1BkU6abLr8Apjet/nJJ2icBvQfnpx1hb6xBpoE69vpRiIVewoU69UPFjXdZd2D1BWX58tNbdV9CA8Ezw= template: - ! '%{repository}#%{build_number} (%{branch} - %{commit} : %{author}): %{message}' - ! 'Change view: %{compare_url}' - ! 'Build details: %{build_url}' trapperkeeper-status-0.7.1/CHANGELOG.md000066400000000000000000000061511303750077200175310ustar00rootroot00000000000000## 0.7.1 This is a bugfix release. * Fix for a schema error that can cause gc/cpu usage updates to fail when computed process gc and/or cpu times are not whole numbers. * Fix for an erroneous schema error which could be written to the log in the event that a trapperkeeper app is shutdown before the trapperkeeper-status service is started. ## 0.7.0 This is a feature release. * [PE-13539](https://tickets.puppetlabs.com/browse/PE-13539) At startup, log version numbers for all services that register themselves with the status service. * [TK-414](https://tickets.puppetlabs.com/browse/TK-414) Add metrics about CPU usage and GC CPU usage to the default JVM metrics available from the HTTP endpoint at `debug` level. 
* [TK-401](https://tickets.puppetlabs.com/browse/TK-401) Include service name in log message when a service's callback fails due to error or timeout ## 0.6.0 This is a feature release. * Add ability for TK-status to periodically log status data to a file in JSON format * Add GC counts and file descriptor usage to the `jvm-metrics` section of the status output ## 0.5.0 This is a feature release. * Add the optional `timeout` query parameter to the HTTP endpoints. The value must be an integer that specifies the timeout in seconds. If a timeout is not provided, then the default is used. * Add the optional `timeout` argument to the `get-status` protocol method. The value must be an integer that specifies the timeout in seconds. If a timeout is not provided, then the default is used. * Increase the default critical level timeout from 5 seconds to 30 seconds. ## 0.4.0 This is a feature release. * Add the capability for services to add an :alerts object to their status function's output which will be returned by the HTTP endpoints under the "active_alerts" key. ## 0.3.5 This is a maintenance / bugfix release. * Follow standard exception conventions internally and return standard errors through the API. * Many improvements to documentation, largely around proxying and the `simple` endpoint. * Update slf4j-api dependency. ## 0.3.4 _Never released due to an automation issue_ ## 0.3.3 This is a maintenance / bugfix release. * Allow trapperkeeper's `stop` and `start` functions to be called on the status service without error by cleaning up context state manually. ## 0.3.2 This is a maintenance / bugfix release. * Cease enforcing semver when pulling versions from artifacts. This ended up being too restrictive and tk-status did not actually depend on or use the semver versions. ## 0.3.1 This is a maintenance / bugfix release. * Exclude the obsolete servlet-api dependency from ring-defaults, to avoid classpath issues with multiple copies of the servlet API in downstream projects. 
## 0.3.0 This is a feature release. * Adds a status callback to the status service itself, to give us a place to expose status information that is common to all TK apps. * [TK-321](https://tickets.puppetlabs.com/browse/TK-321) - add memory/heap usage metrics to status callback * [TK-322](https://tickets.puppetlabs.com/browse/TK-322) - add process uptime to status callback trapperkeeper-status-0.7.1/CONTRIBUTING.md000066400000000000000000000064241303750077200201540ustar00rootroot00000000000000# How to contribute ## Getting Started * Make sure you have a [GitHub account](https://github.com/signup/free) * Submit an issue for your issue, assuming one does not already exist. * Clearly describe the issue including steps to reproduce when it is a bug. * Fork the repository on GitHub ## Making Changes * Create a topic branch from where you want to base your work. * This is usually the master branch. * Only target release branches if you are certain your fix must be on that branch. * To quickly create a topic branch based on master; `git checkout -b fix/master/my_contribution master`. Please avoid working directly on the `master` branch. * Make commits of logical units. * Check for unnecessary whitespace with `git diff --check` before committing. * Make sure your commit messages are in the proper format. ```` (PUP-1234) Make the example in CONTRIBUTING imperative and concrete Without this patch applied the example commit message in the CONTRIBUTING document is not a concrete example. This is a problem because the contributor is left to imagine what the commit message should look like based on a description rather than an example. This patch fixes the problem by making the example concrete and imperative. The first line is a real life imperative statement with a ticket number from our issue tracker. The body describes the behavior without the patch, why this is a problem, and how the patch fixes the problem when applied. 
````

* Make sure you have added the necessary tests for your changes.
* Run _all_ the tests to ensure nothing else was accidentally broken.

## Making Trivial Changes

### Documentation

For changes of a trivial nature to comments and documentation, it is not
always necessary to create a new issue in GitHub. In this case, it is
appropriate to start the first line of a commit with '(doc)' instead of
a ticket number.

````
    (doc) Add documentation commit example to CONTRIBUTING

    There is no example for contributing a documentation commit
    to the Puppet repository. This is a problem because the contributor
    is left to assume how a commit of this nature may appear.

    The first line is a real life imperative statement with '(doc)' in
    place of what would have been the ticket number in a
    non-documentation related commit. The body describes the nature of
    the new documentation or comments added.
````

## Submitting Changes

* Sign the [Contributor License Agreement](http://links.puppetlabs.com/cla).
* Push your changes to a topic branch in your fork of the repository.
* Submit a pull request to the repository in the puppetlabs organization, referencing the GitHub issue you opened.

# Additional Resources * [Puppet Labs community guidelines](http://docs.puppetlabs.com/community/community_guidelines.html) * [Contributor License Agreement](http://links.puppetlabs.com/cla) * [General GitHub documentation](http://help.github.com/) * [GitHub pull request documentation](http://help.github.com/send-pull-requests/) * #puppet-dev IRC channel on freenode.org ([Archive](https://botbot.me/freenode/puppet-dev/)) * [puppet-dev mailing list](https://groups.google.com/forum/#!forum/puppet-dev) * [Community PR Triage notes](https://github.com/puppet-community/community-triage/tree/master/core/notes) trapperkeeper-status-0.7.1/LICENSE000066400000000000000000000260751303750077200167340ustar00rootroot00000000000000Apache License Version 2.0, January 2004 http://www.apache.org/licenses/ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 1. Definitions. "License" shall mean the terms and conditions for use, reproduction, and distribution as defined by Sections 1 through 9 of this document. "Licensor" shall mean the copyright owner or entity authorized by the copyright owner that is granting the License. "Legal Entity" shall mean the union of the acting entity and all other entities that control, are controlled by, or are under common control with that entity. For the purposes of this definition, "control" means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity. "You" (or "Your") shall mean an individual or Legal Entity exercising permissions granted by this License. "Source" form shall mean the preferred form for making modifications, including but not limited to software source code, documentation source, and configuration files. 
"Object" form shall mean any form resulting from mechanical transformation or translation of a Source form, including but not limited to compiled object code, generated documentation, and conversions to other media types. "Work" shall mean the work of authorship, whether in Source or Object form, made available under the License, as indicated by a copyright notice that is included in or attached to the work (an example is provided in the Appendix below). "Derivative Works" shall mean any work, whether in Source or Object form, that is based on (or derived from) the Work and for which the editorial revisions, annotations, elaborations, or other modifications represent, as a whole, an original work of authorship. For the purposes of this License, Derivative Works shall not include works that remain separable from, or merely link (or bind by name) to the interfaces of, the Work and Derivative Works thereof. "Contribution" shall mean any work of authorship, including the original version of the Work and any modifications or additions to that Work or Derivative Works thereof, that is intentionally submitted to Licensor for inclusion in the Work by the copyright owner or by an individual or Legal Entity authorized to submit on behalf of the copyright owner. For the purposes of this definition, "submitted" means any form of electronic, verbal, or written communication sent to the Licensor or its representatives, including but not limited to communication on electronic mailing lists, source code control systems, and issue tracking systems that are managed by, or on behalf of, the Licensor for the purpose of discussing and improving the Work, but excluding communication that is conspicuously marked or otherwise designated in writing by the copyright owner as "Not a Contribution." "Contributor" shall mean Licensor and any individual or Legal Entity on behalf of whom a Contribution has been received by Licensor and subsequently incorporated within the Work. 2. 
Grant of Copyright License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare Derivative Works of, publicly display, publicly perform, sublicense, and distribute the Work and such Derivative Works in Source or Object form. 3. Grant of Patent License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Work, where such license applies only to those patent claims licensable by such Contributor that are necessarily infringed by their Contribution(s) alone or by combination of their Contribution(s) with the Work to which such Contribution(s) was submitted. If You institute patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Work or a Contribution incorporated within the Work constitutes direct or contributory patent infringement, then any patent licenses granted to You under this License for that Work shall terminate as of the date such litigation is filed. 4. Redistribution. 
You may reproduce and distribute copies of the Work or Derivative Works thereof in any medium, with or without modifications, and in Source or Object form, provided that You meet the following conditions: (a) You must give any other recipients of the Work or Derivative Works a copy of this License; and (b) You must cause any modified files to carry prominent notices stating that You changed the files; and (c) You must retain, in the Source form of any Derivative Works that You distribute, all copyright, patent, trademark, and attribution notices from the Source form of the Work, excluding those notices that do not pertain to any part of the Derivative Works; and (d) If the Work includes a "NOTICE" text file as part of its distribution, then any Derivative Works that You distribute must include a readable copy of the attribution notices contained within such NOTICE file, excluding those notices that do not pertain to any part of the Derivative Works, in at least one of the following places: within a NOTICE text file distributed as part of the Derivative Works; within the Source form or documentation, if provided along with the Derivative Works; or, within a display generated by the Derivative Works, if and wherever such third-party notices normally appear. The contents of the NOTICE file are for informational purposes only and do not modify the License. You may add Your own attribution notices within Derivative Works that You distribute, alongside or as an addendum to the NOTICE text from the Work, provided that such additional attribution notices cannot be construed as modifying the License. You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Works as a whole, provided Your use, reproduction, and distribution of the Work otherwise complies with the conditions stated in this License. 5. 
Submission of Contributions. Unless You explicitly state otherwise, any Contribution intentionally submitted for inclusion in the Work by You to the Licensor shall be under the terms and conditions of this License, without any additional terms or conditions. Notwithstanding the above, nothing herein shall supersede or modify the terms of any separate license agreement you may have executed with Licensor regarding such Contributions. 6. Trademarks. This License does not grant permission to use the trade names, trademarks, service marks, or product names of the Licensor, except as required for reasonable and customary use in describing the origin of the Work and reproducing the content of the NOTICE file. 7. Disclaimer of Warranty. Unless required by applicable law or agreed to in writing, Licensor provides the Work (and each Contributor provides its Contributions) on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Work and assume any risks associated with Your exercise of permissions under this License. 8. Limitation of Liability. 
In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall any Contributor be liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this License or out of the use or inability to use the Work (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if such Contributor has been advised of the possibility of such damages. 9. Accepting Warranty or Additional Liability. While redistributing the Work or Derivative Works thereof, You may choose to offer, and charge a fee for, acceptance of support, warranty, indemnity, or other liability obligations and/or rights consistent with this License. However, in accepting such obligations, You may act only on Your own behalf and on Your sole responsibility, not on behalf of any other Contributor, and only if You agree to indemnify, defend, and hold each Contributor harmless for any liability incurred by, or claims asserted against, such Contributor by reason of your accepting any such warranty or additional liability. END OF TERMS AND CONDITIONS APPENDIX: How to apply the Apache License to your work. To apply the Apache License to your work, attach the following boilerplate notice, with the fields enclosed by brackets "{}" replaced with your own identifying information. (Don't include the brackets!) The text should be enclosed in the appropriate comment syntax for the file format. We also recommend that a file or class name and description of purpose be included on the same "printed page" as the copyright notice for easier identification within third-party archives. 
Copyright {yyyy} {name of copyright owner} Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. trapperkeeper-status-0.7.1/MAINTAINERS000066400000000000000000000015271303750077200174170ustar00rootroot00000000000000{ "version": 1, "file_format": "This MAINTAINERS file format is described at http://pup.pt/maintainers", "issues": "https://tickets.puppetlabs.com/browse/TK", "internal_list": "https://groups.google.com/a/puppet.com/forum/?hl=en#!forum/discuss-trapperkeeper-maintainers", "people": [ { "github": "aperiodic", "email": "dan.lidral-porter@puppet.com", "name": "Dan Lidral-Porter" }, { "github": "lindboe", "email": "lizzi@puppet.com", "name": "Lizzi Lindboe" }, { "github": "cprice404", "email": "chris@puppet.com", "name": "Chris Price" }, { "github": "jpinsonault", "email": "joe.pinsonault@puppet.com", "name": "Joe Pinsonault" }, { "github": "rlinehan", "email": "ruth@puppet.com", "name": "Ruth Linehan" } ] } trapperkeeper-status-0.7.1/Makefile000066400000000000000000000000441303750077200173530ustar00rootroot00000000000000include dev-resources/Makefile.i18n trapperkeeper-status-0.7.1/README.md000066400000000000000000000021311303750077200171710ustar00rootroot00000000000000# Trapperkeeper Status Service [![Build Status](https://travis-ci.org/puppetlabs/trapperkeeper-status.svg)](https://travis-ci.org/puppetlabs/trapperkeeper-status) [![Clojars Project](http://clojars.org/puppetlabs/trapperkeeper-status/latest-version.svg)](http://clojars.org/puppetlabs/trapperkeeper-status) A Trapperkeeper service that 
provides a web endpoint for getting status information about a running Trapperkeeper application. Other Trapperkeeper services may register a status callback function with the Status Service, returning any kind of status information that is relevant to the consuming service. The Status Service will make this information available via HTTP, in a consistent, consolidated format. This makes it possible for users to automate monitoring and other tasks around the system. For more information, please see the [documentation](./documentation). ## Support To file a bug, please open a Github issue against this project. Bugs and PRs are addressed on a best-effort basis. Puppet Labs does not guarantee support for this project. ## License Copyright © 2015 Puppet Labs trapperkeeper-status-0.7.1/dev-resources/000077500000000000000000000000001303750077200205035ustar00rootroot00000000000000trapperkeeper-status-0.7.1/dev-resources/Makefile.i18n000066400000000000000000000131451303750077200227250ustar00rootroot00000000000000# -*- Makefile -*- # This file was generated by the i18n leiningen plugin # Do not edit this file; it will be overwritten the next time you run # lein i18n init # # The locale in which our messages are written, and for which we therefore # have messages without any further effort MESSAGE_LOCALE=en # The name of the package into which the translations bundle will be placed BUNDLE=puppetlabs.trapperkeeper_status # The list of names of packages covered by the translation bundle; # by default it contains a single package - the same where the translations # bundle itself is placed - but this can be overridden - preferably in # the top level Makefile PACKAGES?=$(BUNDLE) LOCALES=$(basename $(notdir $(wildcard locales/*.po))) BUNDLE_DIR=$(subst .,/,$(BUNDLE)) BUNDLE_FILES=$(patsubst %,resources/$(BUNDLE_DIR)/Messages_%.class,$(MESSAGE_LOCALE) $(LOCALES)) FIND_SOURCES=find src -name \*.clj # xgettext before 0.19 does not understand --add-location=file. 
Even CentOS # 7 ships with an older gettext. We will therefore generate full location # info on those systems, and only file names where xgettext supports it LOC_OPT=$(shell xgettext --add-location=file -f - /dev/null 2>&1 && echo --add-location=file || echo --add-location) LOCALES_CLJ=resources/locales.clj define LOCALES_CLJ_CONTENTS { :locales #{$(patsubst %,"%",$(MESSAGE_LOCALE) $(LOCALES))} :packages [$(patsubst %,"%",$(PACKAGES))] :bundle $(patsubst %,"%",$(BUNDLE).Messages) } endef export LOCALES_CLJ_CONTENTS i18n: update-pot msgfmt # Update locales/messages.pot update-pot: locales/messages.pot locales/messages.pot: $(shell $(FIND_SOURCES)) | locales @tmp=$$(mktemp $@.tmp.XXXX); \ $(FIND_SOURCES) \ | xgettext --from-code=UTF-8 --language=lisp \ --copyright-holder='Puppet ' \ --package-name="$(BUNDLE)" \ --package-version="$(BUNDLE_VERSION)" \ --msgid-bugs-address="docs@puppet.com" \ -k \ -kmark:1 -ki18n/mark:1 \ -ktrs:1 -ki18n/trs:1 \ -ktru:1 -ki18n/tru:1 \ -ktrun:1,2 -ki18n/trun:1,2 \ -ktrsn:1,2 -ki18n/trsn:1,2 \ $(LOC_OPT) \ --add-comments --sort-by-file \ -o $$tmp -f -; \ sed -i.bak -e 's/charset=CHARSET/charset=UTF-8/' $$tmp; \ sed -i.bak -e 's/POT-Creation-Date: [^\\]*/POT-Creation-Date: /' $$tmp; \ rm -f $$tmp.bak; \ if ! 
diff -q -I POT-Creation-Date $$tmp $@ >/dev/null 2>&1; then \ mv $$tmp $@; \ else \ rm $$tmp; touch $@; \ fi # Run msgfmt over all .po files to generate Java resource bundles # and create the locales.clj file msgfmt: $(BUNDLE_FILES) $(LOCALES_CLJ) # force rebuild of locales.clj if its contents is not the # the desired one ifneq ($(shell cat $(LOCALES_CLJ) 2> /dev/null),$(shell echo '$(subst ','\'',$(LOCALES_CLJ_CONTENTS))')) .PHONY: $(LOCALES_CLJ) endif $(LOCALES_CLJ): | resources @echo "Writing $@" @echo "$$LOCALES_CLJ_CONTENTS" > $@ resources/$(BUNDLE_DIR)/Messages_%.class: locales/%.po | resources msgfmt --java2 -d resources -r $(BUNDLE).Messages -l $(*F) $< resources/$(BUNDLE_DIR)/Messages_$(MESSAGE_LOCALE).class: locales/messages.pot | resources msgfmt --java2 -d resources -r $(BUNDLE).Messages -l $(MESSAGE_LOCALE) $< # Translators use this when they update translations; this copies any # changes in the pot file into their language-specific po file locales/%.po: locales/messages.pot @if [ -f $@ ]; then \ msgmerge -U $@ $< && touch $@; \ else \ touch $@ && msginit --no-translator -l $(*F) -o $@ -i $<; \ fi resources locales: @mkdir $@ help: $(info $(HELP)) @echo .PHONY: help define HELP This Makefile assists in handling i18n related tasks during development. Files that need to be checked into source control are put into the locales/ directory. They are locales/messages.pot - the POT file generated by 'make update-pot' locales/$$LANG.po - the translations for $$LANG Only the $$LANG.po files should be edited manually; this is usually done by translators. 
You can use the following targets: i18n: refresh all the files in locales/ and recompile resources update-pot: extract strings and update locales/messages.pot locales/LANG.po: refresh or create translations for LANG msgfmt: compile the translations into Java classes; this step is needed to make translations available to the Clojure code and produces Java class files in resources/ endef # @todo lutter 2015-04-20: for projects that use libraries with their own # translation, we need to combine all their translations into one big po # file and then run msgfmt over that so that we only have to deal with one # resource bundle trapperkeeper-status-0.7.1/dev-resources/logback-test.xml000066400000000000000000000005461303750077200236110ustar00rootroot00000000000000 %d %-5p [%c{2}] %m%n trapperkeeper-status-0.7.1/dev-resources/puppetlabs/000077500000000000000000000000001303750077200226625ustar00rootroot00000000000000trapperkeeper-status-0.7.1/dev-resources/puppetlabs/trapperkeeper-status/000077500000000000000000000000001303750077200270545ustar00rootroot00000000000000trapperkeeper-status-0.7.1/dev-resources/puppetlabs/trapperkeeper-status/status-proxy-service-test/000077500000000000000000000000001303750077200341715ustar00rootroot00000000000000000077500000000000000000000000001303750077200347135ustar00rootroot00000000000000trapperkeeper-status-0.7.1/dev-resources/puppetlabs/trapperkeeper-status/status-proxy-service-test/ssl000077500000000000000000000000001303750077200360335ustar00rootroot00000000000000trapperkeeper-status-0.7.1/dev-resources/puppetlabs/trapperkeeper-status/status-proxy-service-test/ssl/certsca.pem000066400000000000000000000035061303750077200371250ustar00rootroot00000000000000trapperkeeper-status-0.7.1/dev-resources/puppetlabs/trapperkeeper-status/status-proxy-service-test/ssl/certs-----BEGIN CERTIFICATE----- MIIFMjCCAxqgAwIBAgIBATANBgkqhkiG9w0BAQsFADAfMR0wGwYDVQQDDBRQdXBw ZXQgQ0E6IGxvY2FsaG9zdDAeFw0xNDAyMTQxODA5MDdaFw0xOTAyMTQxODA5MDda 
MB8xHTAbBgNVBAMMFFB1cHBldCBDQTogbG9jYWxob3N0MIICIjANBgkqhkiG9w0B AQEFAAOCAg8AMIICCgKCAgEA5vYnoJ85k6qcUFzWFOr9MN2ZWFlgA6nB0Adfm8Yg ovg963NrauwTcoPqbfGhi53A8RWc5GT5x1OSjxQW/PAGGfGg3aHZuQMClYWsauod CYG6YgG49WGZODCHZ+TbHMN1pNV+6S+tnZbUtfVKsN347XyHCyrymmb/OQNAxPyO /76dDR4dB7kWKP4KMMOgDAnA+WDpD1d9/UfBPVZtw/sTyjeJGEOZqSXQW93PKumJ 9/DS4azsUMR1JwJA67yWffoHb6sL7QSAQEfp17gDs9asIzITZPPyHNslmY55zaSW EslzFGqPvB7ugUbqH0rjp2TjQ/9Nw7ZKBxukFB9cBUr7v21D2Of2SHQorzRe9lXO wO3BYEt/viypktDnH42qAeoLHDK1ay9ugByTm4ARj14CjslpkEKflk9t9XwUR3ku Uaj3w+Y5ItZXFBxns1OQpDt3bMhJFemj7MxkZXt8tvWlUMKVmCUqbnG1eDdTF4Vd OovdacMDpvKg438I8MCjaHQf90qIp8MaOjdfSoKBCQi6AP7cpjB+x9xhEuetPYcb /bNjV1OFo4h77XJ+lIH58HBpEG0zkqhtS7JqICIzp3GK7ZG3bhfKKJz0Qr4ISxI0 YsYWdmsG38SyMwcomMy4WoLiuQF2QusBSHwBS+p5qQyNkJmXEy8KSQmrrIB/bkiS mgkCAwEAAaN5MHcwNQYJYIZIAYb4QgENBChQdXBwZXQgUnVieS9PcGVuU1NMIElu dGVybmFsIENlcnRpZmljYXRlMA4GA1UdDwEB/wQEAwIBBjAPBgNVHRMBAf8EBTAD AQH/MB0GA1UdDgQWBBRZy3D0vQGiBu3mnggq8jiEZERoXzANBgkqhkiG9w0BAQsF AAOCAgEAZHcyah146g2vmuQZsGJtljCkWjM1WWjcPoLewdd4FG1YKkAgbGGPVnSS yIS04tj4/Z3eePUlBNe3ZHs7yTQkq3SwWIhRSMJRA8xAjtL0rL7mvJ6PMV+Sujb7 +ENTML76+oq1d7XTqAh8mnSJOIc6eMUQqBBYl/5BYa4IzkvkrS4ai4vg0ihKr6pf C8XVINTjxChqK08dEr690sPD/3DwPPVGqG+qIyIv6u5buVLniLWiaq4HOG54i/yH k21W6A5UizRxb5GwVxWfjsHcLr2pvPq/ipRP/sCNBgKdziaGQm+IqeJiI6bQegx3 ZF5UX2CaIkZsEM0v2yQ7/t9rSCfpd/eEYMdcWcnSfEf/OOA/ti7jnpapENXturLo 6/TNJarHYQlkpEsaRfiiSmomAn3TNOOuqBhWtj0fdy21/fETxVVzPp/CL7EvkUnU 26SguZdBkGf22yZViKRq22eDnM/frsoSMTkIbkayDQ8Hx9JVpfPmdJWQyL86etsd uCR5OXJtd7vxRZT15m2cFvdpRW3VZbqFyIwe0NgJUs3FRrjFZdqQiBsvWwQjK8jN A3+IAAz4pk0A4xpQvNwvUzo4wyQB8/PTcmZoPBzeDaJScqSto2r3kvOSqvm1PSc6 gneGkbPvQpQRH9HVxzaEEcCrcZYFZXpCkF0ACuB+smgfMVY2Sno= -----END CERTIFICATE----- localhost.pem000066400000000000000000000037311303750077200405320ustar00rootroot00000000000000trapperkeeper-status-0.7.1/dev-resources/puppetlabs/trapperkeeper-status/status-proxy-service-test/ssl/certs-----BEGIN CERTIFICATE----- MIIFnjCCA4agAwIBAgIBAjANBgkqhkiG9w0BAQsFADAfMR0wGwYDVQQDDBRQdXBw 
ZXQgQ0E6IGxvY2FsaG9zdDAeFw0xNDAyMTQxODA5MDdaFw0xOTAyMTQxODA5MDda MBQxEjAQBgNVBAMMCWxvY2FsaG9zdDCCAiIwDQYJKoZIhvcNAQEBBQADggIPADCC AgoCggIBALzn4xcbGhSX65fgSIo+NB4EOYX+zJxbHamv+hbQkhAxqHxUSszDD86+ s9ZsN8cvpDIRex1/SfkNRKhgVUHGiUYbol+9RLpgpdHYLJCY0S0kOyBUgWqEcQrA tVDYpwaNC5qMneiZkcNvQ74pnuFOa0/bzbpLyOjYKEQq3BXK32yGLLHseb/PIX1o bqcuYiR6yI7S3wK9hJzSo6BRPSz0cNZNpnXbW/yLRtHbthBtzTifM2MW2jtWuFvg OW1ly9vxwTHCrJv5KxMubVDrqAx+RmmOsn327APO3r6NUr2CzV1vG8CMqLApse6m dKpCZ5UNLm4dwCUe2zWAQ0Q2Eba88O480bH8k/t8NUGXlWt25B8BUE8rklY0jwSw v8Dyjda0moeoxgMMZb+ogPjTY/Ds/h2hgFzy/DUuGrv6kyVxzH0/pOg29HeDoUaK IaqjAsN8h/oYTJ6CiOKAqWfbi8JZRmiEVTekskS5eySsPqPCegfAkfpIlz4EU4FN 0A2vORI7epW4mNiJcCLb09v5jogK+DhfeSscDsrYgIF8eAPLw6r9h1am+vXoD6vt tIwZBu15pnCWXdDcKWBNjx+zYU648iZ9V/qFG31uTJaw0z18eFPyTcJN8StTnoGZ zjKI2AyjG5F6ov/S7mnU3I/wv0B5Vq6fPjSvMrsBlbBdI3Pbr3w/AgMBAAGjge8w gewwNQYJYIZIAYb4QgENBChQdXBwZXQgUnVieS9PcGVuU1NMIEludGVybmFsIENl cnRpZmljYXRlMFQGA1UdEQRNMEuCG2Rqcm9vbWJhLnZwbi5wdXBwZXRsYWJzLm5l dIIJbG9jYWxob3N0ggZwdXBwZXSCGXB1cHBldC52cG4ucHVwcGV0bGFicy5uZXQw DgYDVR0PAQH/BAQDAgWgMCAGA1UdJQEB/wQWMBQGCCsGAQUFBwMBBggrBgEFBQcD AjAMBgNVHRMBAf8EAjAAMB0GA1UdDgQWBBRZy3D0vQGiBu3mnggq8jiEZERoXzAN BgkqhkiG9w0BAQsFAAOCAgEAJwDAK7UKZvLtSJkdprO/Z0qALdUSliO0+I6lsSr/ A8SyilfMQuxOoq6uWA6j7uMU4SAHwT14QD9c6BJJiBhWLo6HoynZfYK8Smn1q2Xy vNMnUEUEjcWIgIqEDP98RTUZldgR4aWQkQVHB9XY8g6F0qWzBq64qLWjfrsIUijF Z3Ex1OMU7ZMjHhoKk2J3oBcMau47mqU49C9MoWkBH+0fLLr+lxoa40DPcFr+KzhI BHdTKuKAEJEkiE5QNVWl3+M7psFZzdTd+YFz9Vn/9L3aQr8KCb4oMietp84KM0yR 6GwodvIjh/3owsnvNvl28HeZGWQMCNIlG0aWx3JIfCOHSYlXfQ55FCtLRqMflJty M4MqRkPHKykZMZmNoiTtXRSz3vMW8JnIyZPsNtfIGltnpjHd4Y4EEVxdZhlL2wBv YycI6PN1oyhv/9ZcIbSaVY4jEpxx6mrsG8iuaT1YHlO2HOwuqvXJ48WkT1Q6go9k A775N00y4jLJXqWchiuP9Hp7AGYWzoKXFDmQbUyl0jpZsrfoTskb90xp6R7+IO7K 6opzTahhO9QeURqc7lkdvwjiYMw7uIiTT6IAKtYeK4f52siiJ/LBWucte5FLCjvq KkdWF7aQOKZq+L7Cs2nJ/qQQzlDYTQLpAbBWJbVAscoYYGPVPdKrwI4UEbKGIsWs wMA= -----END CERTIFICATE----- 
000077500000000000000000000000001303750077200374205ustar00rootroot00000000000000trapperkeeper-status-0.7.1/dev-resources/puppetlabs/trapperkeeper-status/status-proxy-service-test/ssl/private_keyslocalhost.pem000066400000000000000000000062531303750077200421210ustar00rootroot00000000000000trapperkeeper-status-0.7.1/dev-resources/puppetlabs/trapperkeeper-status/status-proxy-service-test/ssl/private_keys-----BEGIN RSA PRIVATE KEY----- MIIJKAIBAAKCAgEAvOfjFxsaFJfrl+BIij40HgQ5hf7MnFsdqa/6FtCSEDGofFRK zMMPzr6z1mw3xy+kMhF7HX9J+Q1EqGBVQcaJRhuiX71EumCl0dgskJjRLSQ7IFSB aoRxCsC1UNinBo0Lmoyd6JmRw29Dvime4U5rT9vNukvI6NgoRCrcFcrfbIYssex5 v88hfWhupy5iJHrIjtLfAr2EnNKjoFE9LPRw1k2mddtb/ItG0du2EG3NOJ8zYxba O1a4W+A5bWXL2/HBMcKsm/krEy5tUOuoDH5GaY6yffbsA87evo1SvYLNXW8bwIyo sCmx7qZ0qkJnlQ0ubh3AJR7bNYBDRDYRtrzw7jzRsfyT+3w1QZeVa3bkHwFQTyuS VjSPBLC/wPKN1rSah6jGAwxlv6iA+NNj8Oz+HaGAXPL8NS4au/qTJXHMfT+k6Db0 d4OhRoohqqMCw3yH+hhMnoKI4oCpZ9uLwllGaIRVN6SyRLl7JKw+o8J6B8CR+kiX PgRTgU3QDa85Ejt6lbiY2IlwItvT2/mOiAr4OF95KxwOytiAgXx4A8vDqv2HVqb6 9egPq+20jBkG7XmmcJZd0NwpYE2PH7NhTrjyJn1X+oUbfW5MlrDTPXx4U/JNwk3x K1OegZnOMojYDKMbkXqi/9LuadTcj/C/QHlWrp8+NK8yuwGVsF0jc9uvfD8CAwEA AQKCAgEAr5bHkfGiI2Q3G9vg8YbyQLhik7eMjwVupAyr5MsICb9uwepEAOKLbfv7 A6NhkWcqM1PmYTuxEauQlwW8GcCmVqFXI7C1EpzFZTGP8vPo8xHLV7jU9qKWxIzt vHE1h7RRBd4Q5WThhYyFplvfj8OpofhI2RKadDx/6SUBn8wMMz7gip2paW3pzjzl JcbKeOgcRg2iN1Tb0D1G1LzOpVutCrXwtXopnawELwsPx2OYrznjtQZH4YIxKU1Z c+N8QzwK/OrcMLrBnDm6aM4zTTGO141JQibjqIKArxSDxR2xMFkXrbnRDrYi6xaU OLIyv+wZrUdAFAEDd056uAueGYK0WtLq41ipdBFsqkOXLrdRsyp/t7lG58bmtrzA ZniyYAMFjfpzHKlx69nq4KJeMAUKoCscc9CmWmX+Ej5VvFa1x2xw9qWBkBHdWjEa QaF2NvdE7c9TspwfFu7IZ0jnvxN6yvc2RRSMjZnIJ3jW97wp2+PG6G0uhoYXWSxL cGqoKAnpROAaBBB8n3HQ4kPhNZqiV+xqSIuDBFSDohiSdHAqBawJfeA3ssh0Y7nZ WvGxr+iBB9JF2dmtEV0ySTR+bsdMb5IuPyXmXnZS33DvnvSHL6Ap0UwS2wAyzzew VEyE9+9wUqgnPQ0mgo95ARPBfuPestwNhHujQdHLgZe3t/eq6GECggEBAPQFQK/j e994Zp4gPXJMsLHCDtstwZivS/6CEdR96jMPwGgahjHjrSP0ynCyC4uY7PAAu4A1 0vjBZ0nLvJY0dVvW1f3GIsQM37jZf8eHUEHXsqGL7y3Nx0bHB3KbNHLw4ZhhsZ+7 
eCcTT/ExHmshPr8NtsRQdPBYG4agkliJSzNWeQgOLZM8CyDISoCWQcvAbhE5pbOJ NGmbQecFurgBBGAyb3IBvZWdkT/85OtucDh1CVFZgpG6FmpNbVSOnrha4N7KtSjN OXjXvHN0b3GJe62Mjn+yLBOcbvCSkWWnnawV5NuvmH7Oe5FYsTbTBmgH4445T1q/ wCFTydMqf3vhVRECggEBAMYt+dN7yMKriMLUDpPPFO607OMhiv+r35AKefTHMHML pnx2OKiBSQUWt2v9z7uTsHcmNzXSccXUv0AF/DxHZebWZmm+VWgGOGMVaml6plzM 3A+hjsRcjjFaF1lmHy9+skuH0nmiO5hAkQeG5tZVaQ/Cc1k23RiL4WpTZ8v/4JLt 9dhMFZrcTugmCgDyN3aivoO+i1JtX7PcpLbxFv5+e9eGvpQ3FpwEYLT/+DA2DH52 /X9DlexfMoM8j0fs6NqiMdPUxRWuepIgTfrbqfsfcPABpieMGmXGtjFUsV4vpUvr M+ZN8KvNVCsznjC+jDyCthfYtHWE7CeWmEnhOHTSfE8CggEAa6PJhgzVvpzQv12/ XSUBKFhOz1YeuOhSoGDl1pL4dS+0kvdoTKd+34aCqjWPrDN4COJ50zNq7bn6gu3x MVzQjAN3f6sf+NUo9tRSbkR9HZ41ONeOWOkVx13SJjbaav1gtiQaAzjh5nK5Z85f +ae/ku1Muso22zIyai94fr+JQYsadngymGj7C6nuW0xsl6E5rDV+p3SVfyQybOL1 G2evc3Or/2FPLKlFwjEfFc8wh2bxBkZyty+b5aZj3NHQp8fGu+A1C1uDx496nH83 DaE0wjhnP2Lr2Ha/5TTyGCJZBejefB24Ke+RSGsUOPfbMpaQRVN4crJ04P6h35k2 hQG/0QKCAQADdmAsAriiNg8AoGXUzURnWz/cRATCrMUOJjC1RxmgmO6CtCoPP5r/ /MKdn2SWuWDW5BMI3LFiLHJe8vvSLcko/EvzwwCI/brUeFZQm3T2oBmkKEVvRtKx KArKZA9dbBA/Y5MYzu3Nnisqf3/e9MUOIm6Te3Lnb+IzUlu447KPvpqR+dpSx1CV m7yHAbRYXUWI1bZnbUPDx7IVBCdLsPgG7vK7ci7x8N2jq+kxJnCXcQrCw3KGG6+t PUyfjBMRZs4KDmiXFWJM1UWngVj56zW068J0ZG09o/gg6oLiy2BO8EAK4Qe4aLD0 xEUaQun+UKZPylh0ySq7ElV8zPOIjvjfAoIBAC3D6xkfricyFdLNtey6DPvFt00V hJBr1r3700hGGyROaLxsCKUJ+qvsSnkplR63MgCNuX312qXwlnQjrZdP3YprhhmF 14HRm61xbI4wviFPjzO04OsIkxLGq/Ir2QPsg7RYIubtUfbblxndz5oz38lGGQQ6 PNroPKq7exouCYXTfslFxf7MHs6pF3AjUN3H8WwvkEtOSNEaln6q59riY7QBBQxT GMx3RyIIPnw/XRwl4nKFuIpnQbph+gqA5HXysbnht60YbsxUQSk9pBqSU88uxWA1 A7OasL2hCO2dRkwXncUXkOPQwaNn6tNCCYR8Sp0EJC6WN7pc1fqC8cPCk+g= -----END RSA PRIVATE KEY----- trapperkeeper-status-0.7.1/documentation/000077500000000000000000000000001303750077200205665ustar00rootroot00000000000000trapperkeeper-status-0.7.1/documentation/README.md000066400000000000000000000005101303750077200220410ustar00rootroot00000000000000## Trapperkeeper Status Service Documentation * [Querying the Status Service](./query-api.md) * 
[Wire Formats](./wire-formats.md) * [Metrics: JVM, Debugging](./metrics.md) * [Periodically Logging Status Data](./status-logging.md) * [Status Proxy Service](./status-proxy-service.md) * [Developer Documentation](./developers.md) trapperkeeper-status-0.7.1/documentation/developers.md000066400000000000000000000072131303750077200232630ustar00rootroot00000000000000## TL;DR Quick Start 1. Add `puppetlabs/trapperkeeper-status` to your lein deps. 2. Add `puppetlabs.trapperkeeper.services.status.status-service/status-service` to your `bootstrap.cfg`. 3. Add `[:StatusService register-status]` to your TK service's deps. 4. Call `status-core/get-artifact-version` to dynamically get the version info for your service. 5. Call `register-status` to register a callback function that returns status info for your service. 6. Define your status callback function. Code sample: ```clj (ns foo (:require [puppetlabs.trapperkeeper.services.status.status-core :as status-core] [schema.core :as schema])) (schema/defn ^:always-validate v1-status-callback :- status-core/StatusCallbackResponse [level :- status-core/ServiceStatusDetailLevel] {:state :running :status (get-basic-status-for-my-service-at-level level) :alerts (get-alerts-at-level level)}) (defservice foo-service [[:StatusService register-status]] (init [this context] (register-status "foo-service" (status-core/get-artifact-version "puppetlabs" "foo") 1 v1-status-callback) context)) ``` ## Implementing Your Status Function Your status callback function should return a map that matches the status-core/StatusCallbackResponse schema. This means it should return a :state key, a :status key, and :alerts. :status and :alerts can change depending on the level specified to the function. Generally, :alerts should only be provided at the info level, unless you've decided you have some critical condition that is best notified about in prose. Keep in mind that most of our tools that will surface :alerts should query at info level. 
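
As a rough illustration of the shape described above, here is a minimal sketch of a callback that only surfaces `:alerts` above the `:critical` level. The `queue-depth` helper, the threshold of 1000, and the `:warning` severity value are assumptions made for the example; check the schemas in `status-core` for the values your version actually accepts.

```clj
;; Illustrative sketch only. `queue-depth` and the threshold are hypothetical
;; stand-ins for whatever your service actually measures.
(defn queue-depth
  "Hypothetical helper; a real service would read this from its own state."
  []
  1500)

(defn alerts-for-level
  "Returns a vector of alert maps at :info and :debug, and an empty vector
  at :critical, so that cheap health checks stay cheap."
  [level]
  (let [depth (queue-depth)]
    (if (and (not= level :critical) (> depth 1000))
      ;; :severity / :message mirror the keys shown under "active_alerts";
      ;; the :warning value is an assumption.
      [{:severity :warning
        :message (str "Queue depth is " depth ", above the threshold of 1000")}]
      [])))

(defn example-status
  "A status callback body: always returns :state, :status and :alerts."
  [level]
  {:state :running
   :status (cond-> {}
             (not= level :critical) (assoc :queue-depth (queue-depth)))
   :alerts (alerts-for-level level)})
```

Calling `(example-status :critical)` yields an empty `:alerts` vector, while `(example-status :info)` includes the single prose alert, which matches the guidance that tools surfacing alerts should query at info level.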
The `puppetlabs.trapperkeeper.services.status.status-core` namespace contains some utilities to aid in the implementation of your status functions. In particular, the `level->int` function defines an ordering for status levels as `:critical < :info < :debug`, and the `compare-levels` function can be used to compare status levels. This is especially useful in conjunction with the `cond->` macro from `clojure.core`.

Here's an example of how a status function might be implemented to utilize the `compare-levels` function:

```clj
(require '[puppetlabs.trapperkeeper.services.status.status-core :as status-core])

(defn my-status
  [level]
  (let [level>= (partial status-core/compare-levels >= level)]
    {:state :running
     :status (cond-> {:this-is-critical "foo"}
               (level>= :info) (assoc :bar "bar" :baz "baz")
               (level>= :debug) (assoc :x "x" :y "y" :z "z"))}))
```

## Exposing the /status endpoint

If you want your registered status functions to be accessible via HTTP(S), you need to route the status service accordingly:

```
webserver: {
  default: {
    ssl-port: 9001
    ssl-cert: /etc/ssl/certs/myhostname.pem
    ssl-key: /etc/ssl/private_keys/myhostname.pem
    ssl-ca-cert: /etc/ssl/certs/ca.pem
    default-server: true
  }
}

web-router-service: {
  "puppetlabs.trapperkeeper.services.status.status-service/status-service": {
    route: /status
    server: default
  }
}
```

For information on proxying plaintext `/status` requests to an otherwise HTTPS protected server, see the [Status Proxy documentation](./status-proxy-service.md).

## Details

See [Query API](./query-api.md) and [Wire Format](./wire-formats.md) for details on user-facing functionality that the Status Service provides.

You'll want to document the format of your service's status data (at the various levels of detail `:critical`, `:info`, and `:debug`) in your own documentation.
trapperkeeper-status-0.7.1/documentation/metrics.md000066400000000000000000000105221303750077200225560ustar00rootroot00000000000000## Metrics It is common for services registering callbacks with the status service to include some basic metrics about the health of the service. These metrics are typically only available at the `:debug` status level, and should be limited to a fairly small set of data that can be useful for debugging the service's performance or behavior. For now, we also put them underneath a key in the map called `"experimental"`, so that we can gather some feedback on UX before committing to a long-term API / wire format for the data structure. Read on to see how some debug metrics of this type are included with `trapperkeeper-status` itself; for an example of how a downstream service might register its own metrics, you can take a look at the [JRuby metrics code in the pe-puppet-server-extensions repo.](https://github.com/puppetlabs/pe-puppet-server-extensions/blob/3531fa00ce20c99b662595569edc9ef3d1b4daaa/src/clj/puppetlabs/enterprise/services/jruby/pe_jruby_metrics_service.clj#L54-L58) ### JVM Metrics In addition to metrics that downstream services may choose to make available via the status service, the status service itself ships with some basic JVM metrics that can be useful for monitoring the process as a whole. 
Here is an example of what the status service's own status callback returns at `:debug` level: ```json "status-service": { "active_alerts": [], "detail_level": "debug", "service_status_version": 1, "service_version": "0.5.0", "state": "running", "status": { "experimental": { "jvm-metrics": { "file-descriptors": { "max": 65536, "used": 198 }, "gc-stats": { "PS MarkSweep": { "count": 5, "total-time-ms": 657 }, "PS Scavenge": { "count": 17, "total-time-ms": 306 } }, "heap-memory": { "committed": 1111490560, "init": 262144000, "max": 1908932608, "used": 612812056 }, "non-heap-memory": { "committed": 265027584, "init": 2555904, "max": -1, "used": 178038080 }, "start-time-ms": 1475685724906, "up-time-ms": 25466 } } } } ``` The most interesting part of the payload above is the data available in the "status" -> "experimental" -> "jvm-metrics" map. Here are some details about the fields available there: * `heap-memory`: information about the JVM's heap memory usage; this mostly accounts for memory consumed by application code ** `committed`: the amount of memory that the operating system has allocated to the process ** `init`: the initial amount of memory that was allocated to the process at startup ** `max`: the maximum amount of memory that the process will request from the operating system before throwing an OOM error ** `used`: the amount of memory that is currently being used by the application * `non-heap-memory`: same fields as for `heap-memory`, but this refers to native memory used by the JVM itself, along with any memory allocated by native libraries * `file-descriptors`: ** `max`: the maximum number of file descriptors that this process is allowed to open ** `used`: the current number of file descriptors the process has open * `gc-stats`: a map containing key garbage collection statistics for each of the different GC algorithms that are in use by this JVM. 
The keys in this map are the GC algorithm names ** `count`: the number of executions of this GC algorithm ** `total-time-ms`: the cumulative number of milliseconds of CPU time that have been spent executing the garbage collections for this algorithm ### Logging metrics data As of tk-status 0.6.0, there is a new configuration setting available that can be used to cause the status service to periodically log debugging metrics data to a file. For more information, see the [docs on status logging](./status-logging.md).trapperkeeper-status-0.7.1/documentation/query-api.md000066400000000000000000000233261303750077200230320ustar00rootroot00000000000000## Status Query API, v1 You can query for status information about services running in your application by making an HTTP request to the `/status` endpoint. ## JSON Endpoints ### Status Detail Level When querying for service status, you may optionally request a specific level of detail to be returned. The valid levels are: * `"critical"` : returns only the bare minimum amount of status information for each service. Intended to return very quickly and to be suitable for use cases like health checks for a load balancer. * `"info"` : typically returns a bit more info than the `"critical"` level would, for each service. The specific data that is returned will depend on the implementation details of the services in your application, but should generally be data that is useful for a human to get a quick impression of the health / status of each service. * `"debug"` : this level can be used to request very detailed status information about a service, typically used by a human for debugging. Requesting this level of status information may be significantly more expensive than the lower levels, depending on the service. A common use case would be for a service to provide some detailed aggregate metrics about the performance or resource usage of its subsystems. 
The information returned for any service at each increasing level of detail should be additive; in other words, `"info"` should return the same data structure as `"critical"`, but may add additional data in the `status` field. Likewise, `"debug"` should return the same data structure as `"info"`, but may add additional information in the `status` field. ### `GET /status/v1/services` This will return status information for all registered services in an application. #### URL Parameters * `level`: Optional. A JSON String from among the legal [Status Detail Levels](#status-detail-level) listed above. Status information for all registered services will be provided at the requested level of detail. If not provided, the default level is `"info"`. * `timeout`: Optional. An integer specifying the timeout for the check in seconds. If not provided, the default timeout will depend on the level. * `"critical"`: 30 seconds. * `"info"`: 60 seconds. * `"debug"`: 60 seconds. It is highly encouraged to use the timeout parameter to set a timeout that makes sense for your environment. #### Response Format The response format will be a JSON _Object_, which will look something like this: {: { "service_version": , "service_status_version": , "detail_level": , "state": , "status": , "active_alerts": [ { "severity": , "message": } ] }, : { ... }, ... } For detailed information, please see the [Wire Format Specification](./wire-formats.md). NOTE: If any services in your application have registered more than one supported status format version this endpoint will *always* return the latest format. Therefore, if you need to ensure backward compatibility across upgrades of your application, you should consider using the [`/services/`](#get-statusv1servicesservice-name) endpoint (which returns status info for a single service and can take a query parameter specifying status version), rather than the `/services` endpoint (which aggregates status for all registered services). 
#### Examples Using `curl` from localhost: Get the service status of all registered services in an application: curl -k https://localhost:8000/status/v1/services { "puppet-server": { "detail_level": "info", "state": "running", "service_status_version": 1, "service_version": "1.0.9-SNAPSHOT", "status": { "bar": "bar", "foo": "foo" } }, "other-service": { "detail_level": "info", "state": "running", "service_status_version": 2, "service_version": "0.0.1-SNAPSHOT", "status": { "baz": [1, 2, 3], "bang": {"key": "value"} } } } Get the service status of all registered services in an application, at a specified level of detail: curl -k "https://localhost:8140/status/v1/services?level=critical" { "puppet-server": { "detail_level": "critical", "state": "running", "service_status_version": 1, "service_version": "1.0.9-SNAPSHOT", "status": null }, "other-service": { "detail_level": "critical", "state": "running", "service_status_version": 2, "service_version": "0.0.1-SNAPSHOT", "status": null } } ### `GET /status/v1/services/` This will return status information for a single, specified service from the running application. #### URL Parameters * `level`: Optional. A JSON String from among the legal [Status Detail Levels](#status-detail-level) listed above. Status information for the requested service will be provided at the requested level of detail. If not provided, the default level is `"info"`. * `service_status_version`: Optional. A JSON integer specifying the desired status format version for the requested service. If not provided, defaults to the latest available status format version for the service. * `timeout`: Optional. An integer specifying the timeout for the check in seconds. If not provided, the default timeout will depend on the level. * `"critical"`: 30 seconds. * `"info"`: 60 seconds. * `"debug"`: 60 seconds. It is highly encouraged to use the timeout parameter to set a timeout that makes sense for your environment. 
#### Response Format The response format will be a JSON _Object_ with a single entry, which will look something like this: {: { "service_version": , "service_status_version": , "detail_level": , "state": , "status": } } For detailed information, please see the [Wire Format Specification](./wire-formats.md). #### Examples Using `curl` from localhost: Get the service status for a specified service in the application: curl -k https://localhost:8000/status/v1/services/other-service { "other-service": { "detail_level": "info", "state": "running", "service_status_version": 2, "service_version": "0.0.1-SNAPSHOT", "status": { "baz": [1, 2, 3], "bang": {"key": "value"} } } } Get the service status for a specified service, using a specific status format version: curl -k "https://localhost:8140/status/v1/services/other-service?service_status_version=1" { "other-service": { "detail_level": "info", "state": "running", "service_status_version": 1, "service_version": "0.0.1-SNAPSHOT", "status": { "oldbaz": 123, "bang": {"key": "value"} } } } Get the service status for a specified service, using a specific status format version and a specific detail level: curl -k "https://localhost:8140/status/v1/services/other-service?service_status_version=2&level=debug" { "other-service": { "detail_level": "debug", "state": "running", "service_status_version": 2, "service_version": "0.0.1-SNAPSHOT", "status": { "baz": [1, 2, 3], "bang": {"key": "value"}, "extra_debugging_info": [4, 5, 6] } } } ## Simple Endpoints These endpoints are designed for load balancers that don't support any kind of JSON parsing or query parameter use. They return simple string bodies (either the state of the service in question or a simple error message) and a status code relevant to the status result. If your load balancer *also* needs HTTP instead of HTTPS, you may wish to use the [status service proxy](./status-proxy-service.md). The content type for these endpoints is `text/plain; charset=utf-8`. 
### GET /status/v1/simple Returns a status that reflects all services the status service knows about. It decides on what status to report using the following logic: * _running_ if and only if all services are _running_. * _error_ if any service reports _error_. * _starting_ if any service reports _starting_ and no service reports _error_ or _stopping_. * _stopping_ if any service reports _stopping_ and no service reports _error_. * _unknown_ if any service reports _unknown_ and no services report _error_. #### Query parameters No parameters are supported. Defaults to using the _critical_ status level. #### Response codes * 200 if and only if all services report a status of _running_. * 503 if any service's status is _unknown_ or _error_. #### Possible responses * "running" * "error" * "starting" * "stopping" * "unknown" ### GET /status/v1/simple/\ Returns the status of the specified service, such as “rbac-service” or “classifier-service”. #### Query Parameters No parameters are supported. Defaults to using the _critical_ status level. #### Response codes * 200 if service is _running_. * 503 if service is _unknown_, _error_, _starting_, or _stopping_. * 404 if requested service is not found. #### Possible responses * "running" * "error" * "unknown" * "starting" * "stopping" * "not found: \" trapperkeeper-status-0.7.1/documentation/status-logging.md000066400000000000000000000042021303750077200240550ustar00rootroot00000000000000# Status logging TK-status can periodically log status data to a file in JSON format. To take advantage of this you must do two things: 1. Enable logging through your TK app's configuration. For example, to log every 30 minutes, your configuration might look like this: ``` status: { debug-logging: { interval-minutes: 30, } } ``` 2. Create a logback `logger` in your logback configuration for the `puppetlabs.trapperkeeper.services.status.status-logging` namespace. 
For example, this `logger` and `appender` will send the status log messages to a file, manage rotating them, and keep them from taking up too much room ``` /tmp/status.log true /status-%d{yyyy-MM-dd}.%i.log.zip 10MB 90 1GB %m%n ``` A simpler config with no rotation would look like this: ``` /tmp/status.log %m%n ``` trapperkeeper-status-0.7.1/documentation/status-proxy-service.md000066400000000000000000000077441303750077200252640ustar00rootroot00000000000000# status-proxy-service ## Use Case The `status-proxy-service` acts as a bridge between HTTP clients and an HTTPS `status-service`. It is configured with SSL information about a `status-service`, and then accepts HTTP requests and proxies them over HTTPS to the real `status-service`. This could be useful for instance, if you have a load balancer that needs to access the `status-service`, but which cannot be configured to use custom SSL certs from pem files. ## Preferred Alternatives Given that the `status-proxy-service` opens a small hole in your security, consider the following alternatives first: * Configure your load balancer to use HTTPS with pem files if possible * If you are comfortable setting up reverse proxying with a tool such as nginx, consider using that * Consider running the regular status service on a plaintext port, so that only the status service is accessible via plaintext, and you are not reverse proxying any requests to the HTTPS portions of the apps ## Security Considerations There is some security risk associated with providing a plaintext window into a portion of your HTTPS endpoints. 
* If incorrectly configured, the `status-proxy-service` could unintentionally open up endpoints other than the `status-service` to plaintext access * The proxy should be run on a network interface that is not accessible from outside your internal LAN if possible * The proxy should be run on a port that is firewalled to only allow access from your load balancer ## Examples ### Configuring the status-service for plaintext access If you would like to configure your status service to be on a plaintext port, your trapperkeeper configuration might look something like this: ``` webserver: { default: { ssl-port: 9001 ssl-cert: /etc/ssl/certs/myhostname.pem ssl-key: /etc/ssl/private_keys/myhostname.pem ssl-ca-cert: /etc/ssl/certs/ca.pem default-server: true } status: { port: 8080 } } web-router-service: { "puppetlabs.trapperkeeper.services.status.status-service/status-service": { route: /status server: status } } ``` This config contains a new jetty server called `status` running on port 8080 that is separate from the other, SSL enabled, server. 
In the `web-router-service` section, the status service is configured to run on the `status` server, and be mounted at `/status` ### Configuring the status-proxy-service #### bootstrap.cfg You'll need to add the `status-proxy-service` to your `bootstrap.cfg`: ``` puppetlabs.trapperkeeper.services.status.status-proxy-service/status-proxy-service ``` #### Trapperkeeper config Your trapperkeeper configuration might look like this: ``` webserver: { default: { ssl-port: 9001 ssl-cert: /etc/ssl/certs/myhostname.pem ssl-key: /etc/ssl/private_keys/myhostname.pem ssl-ca-cert: /etc/ssl/certs/ca.pem default-server: true } status-proxy: { port: 8080 } } web-router-service: { "puppetlabs.trapperkeeper.services.status.status-service/status-service": /status "puppetlabs.trapperkeeper.services.status.status-proxy-service/status-proxy-service": { route: /status-proxy server: status-proxy } } status-proxy: { proxy-target-url: "https://myhostname:9001/status" ssl-opts: { ssl-cert: /etc/ssl/certs/myhostname.pem ssl-key: /etc/ssl/private_keys/myhostname.pem ssl-ca-cert: /etc/ssl/certs/ca.pem } } ``` The important things to note are: * The status proxy service and the status service are running on separate webservers and ports * There is a new section in the config, `status-proxy`, with: * A url pointing the proxy to the status service. Note that the hostname of the url must be the CN or a SubjectAltName in the server certificate * SSL information that matches the SSL information for the webserver that the status service is running on With this configuration, requests that would normally be made to `https://myhostname:9001/status` with an SSL cert can also be made to `http://myhostname:8080/status-proxy` over plaintext trapperkeeper-status-0.7.1/documentation/wire-formats.md000066400000000000000000000074501303750077200235350ustar00rootroot00000000000000## Status Wire Format - Version 1 ### JSON Endpoints Service status information is represented as JSON. 
Unless otherwise noted, `null` is not allowed anywhere in the status data. {: { "service_version": , "service_status_version": , "detail_level": , "state": , "status": }, : { ... }, ... } The response is a JSON _Object_. All of the keys are service names, and all of the values are maps containing status information about that service. `` is a String. `` is a String that complies with the [Semantic Versioning Specification, v2.0.0](http://semver.org/spec/v2.0.0.html). It indicates the version number of each service that is providing status information. `` is an Integer, specifying the format version of the status information for the particular service. Any individual service may version its status formats independently from other services; this means that if a new version of a service is released, and it has new status information available, it may provide both a version `1` and a version `2` of its status format. It is possible to query for the status information of an individual service and include a specific format version as part of the query, which provides a way for individual services to make backward compatibility guarantees about the format of their status data as it evolves over time. See the docs on the [`/services/` is a String from the following enumeration: (`"critical"`, `"info"`, `"debug"`). `"critical"` indicates that the data returned should be only system-critical status information (suitable for use in load balancing / monitoring situations). Services may provide slightly more detail at the `"info"` level. Services may provide very detailed information, suitable for debugging, at the `"debug"` level. `` is a String from the following enumeration: (`"running"`, `"error"`, `"starting"`, `"stopping"`, `"unknown"`). `"unknown"` will be used when there is a problem that is preventing the Status Service from getting accurate status information from the specified service. `` may be any valid JSON object (including `null`). 
The data supplied here is specific to the individual service that is reporting status. #### Errors Error responses are formatted as a JSON _Object_: { "type": , "message": } `` is a String that serves as a unique identifier for the type of error that occurred. For example, "service-status-version-not-found". `` is a String with a descriptive message about the error that occurred. For example: "No status function with version 2 found for service puppet-server". #### Encoding The entire status payload is expected to be valid JSON, which mandates UTF-8 encoding. ### Simple Endpoints The two simple endpoints (see the [query api documentation](./query-api.md)) return strings corresponding to the computed status of all known services. The possible responses are: * "running" * "error" * "unknown" * "starting" * "stopping" #### Errors If using the /simple/\ endpoint and the provided service is not found, * 404 "not found: \" will be returned. It is important to note that when using the simple endpoints, a 503 response indicates a status other than _running_ and not a problem with the status service itself. A 5xx response other than 503, however, does indicate a problem with the status service itself. #### Encoding The content type for these endpoints is `text/plain; charset=utf-8`. trapperkeeper-status-0.7.1/ext/000077500000000000000000000000001303750077200165155ustar00rootroot00000000000000trapperkeeper-status-0.7.1/ext/travisci/000077500000000000000000000000001303750077200203415ustar00rootroot00000000000000trapperkeeper-status-0.7.1/ext/travisci/test.sh000077500000000000000000000000301303750077200216500ustar00rootroot00000000000000#!/bin/bash lein2 test trapperkeeper-status-0.7.1/locales/000077500000000000000000000000001303750077200173375ustar00rootroot00000000000000trapperkeeper-status-0.7.1/locales/messages.pot000066400000000000000000000067701303750077200217040ustar00rootroot00000000000000# SOME DESCRIPTIVE TITLE. 
# Copyright (C) YEAR Puppet # This file is distributed under the same license as the puppetlabs.trapperkeeper_status package. # FIRST AUTHOR , YEAR. # #, fuzzy msgid "" msgstr "" "Project-Id-Version: puppetlabs.trapperkeeper_status \n" "Report-Msgid-Bugs-To: docs@puppet.com\n" "POT-Creation-Date: \n" "PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n" "Last-Translator: FULL NAME \n" "Language-Team: LANGUAGE \n" "Language: \n" "MIME-Version: 1.0\n" "Content-Type: text/plain; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #: src/puppetlabs/trapperkeeper/services/status/status_core.clj msgid "{0} timed out, shutting down background task" msgstr "" #: src/puppetlabs/trapperkeeper/services/status/status_core.clj msgid "" "Cannot register multiple callbacks for a single service with different " "service versions." msgstr "" #: src/puppetlabs/trapperkeeper/services/status/status_core.clj msgid "Service function already exists for service {0} with status version {1}" msgstr "" #: src/puppetlabs/trapperkeeper/services/status/status_core.clj msgid "" "The proxy-target-url ''{0}'' has an unsupported protocol ''{1}''. Must be " "either http or https" msgstr "" #: src/puppetlabs/trapperkeeper/services/status/status_core.clj msgid "Unable to find version number for ''{0}/{1}''" msgstr "" #: src/puppetlabs/trapperkeeper/services/status/status_core.clj msgid "Status check timed out after {0} seconds" msgstr "" #: src/puppetlabs/trapperkeeper/services/status/status_core.clj msgid "Status callback for {0}" msgstr "" #: src/puppetlabs/trapperkeeper/services/status/status_core.clj msgid "Status check response for {0} malformed: {1}" msgstr "" #. if we get here it's almost certainly because the timeout was reached, #. so the macro already has a return value and we don't need to bother #. 
returning one #: src/puppetlabs/trapperkeeper/services/status/status_core.clj msgid "Status callback for {0} interrupted" msgstr "" #: src/puppetlabs/trapperkeeper/services/status/status_core.clj msgid "Status check for {0} threw an exception" msgstr "" #: src/puppetlabs/trapperkeeper/services/status/status_core.clj msgid "No status function with version {0} found for service {1}" msgstr "" #: src/puppetlabs/trapperkeeper/services/status/status_core.clj msgid "No service info found for service {0}" msgstr "" #: src/puppetlabs/trapperkeeper/services/status/status_core.clj msgid "Invalid level: {0}" msgstr "" #: src/puppetlabs/trapperkeeper/services/status/status_core.clj msgid "Invalid service_status_version. Should be an integer but was {0}" msgstr "" #: src/puppetlabs/trapperkeeper/services/status/status_core.clj msgid "Invalid timeout. Should be an integer but was {0}" msgstr "" #: src/puppetlabs/trapperkeeper/services/status/status_core.clj msgid "Invalid timeout. Timeout must be greater than zero but was {0}" msgstr "" #: src/puppetlabs/trapperkeeper/services/status/status_core.clj msgid "not found: {0}" msgstr "" #: src/puppetlabs/trapperkeeper/services/status/status_core.clj msgid "No status information found for service {0}" msgstr "" #: src/puppetlabs/trapperkeeper/services/status/status_proxy_service.clj msgid "Initializing status service proxy" msgstr "" #: src/puppetlabs/trapperkeeper/services/status/status_service.clj msgid "Registering status service HTTP API at /status" msgstr "" #: src/puppetlabs/trapperkeeper/services/status/status_service.clj msgid "Registering status callback function for service ''{0}'', version {1}" msgstr "" trapperkeeper-status-0.7.1/project.clj000066400000000000000000000044221303750077200200570ustar00rootroot00000000000000(defproject puppetlabs/trapperkeeper-status "0.7.1" :description "A trapperkeeper service for getting the status of other trapperkeeper services." 
:url "https://github.com/puppetlabs/trapperkeeper-status" :license {:name "Apache License, Version 2.0" :url "http://www.apache.org/licenses/LICENSE-2.0"} :min-lein-version "2.7.1" :parent-project {:coords [puppetlabs/clj-parent "0.2.5"] :inherit [:managed-dependencies]} :pedantic? :abort :exclusions [org.clojure/clojure] :dependencies [[org.clojure/clojure] [cheshire] [slingshot] [prismatic/schema] [trptcolin/versioneer] ;; ring-defaults brings in a bad, old version of the servlet-api, which ;; now has a new artifact name (javax.servlet/javax.servlet-api). If we ;; don't exclude the old one here, they'll both be brought in, and consumers ;; will be subject to the whims of which one shows up on the classpath first. ;; thus, we need to use exclusions here, even though we'd normally resolve ;; this type of thing by just specifying a fixed dependency version. [ring/ring-defaults :exclusions [javax.servlet/servlet-api]] [org.clojure/java.jmx] [org.clojure/tools.logging] [puppetlabs/kitchensink] [puppetlabs/trapperkeeper] [puppetlabs/trapperkeeper-scheduler] [puppetlabs/ring-middleware] [puppetlabs/comidi] [puppetlabs/i18n]] :deploy-repositories [["releases" {:url "https://clojars.org/repo" :username :env/clojars_jenkins_username :password :env/clojars_jenkins_password :sign-releases false}]] :profiles {:dev {:dependencies [[puppetlabs/http-client] [puppetlabs/trapperkeeper :classifier "test"] [puppetlabs/trapperkeeper-webserver-jetty9] [puppetlabs/kitchensink :classifier "test"]]}} :plugins [[lein-parent "0.3.1"] [puppetlabs/i18n "0.4.3"]]) 
trapperkeeper-status-0.7.1/src/000077500000000000000000000000001303750077200165045ustar00rootroot00000000000000trapperkeeper-status-0.7.1/src/puppetlabs/000077500000000000000000000000001303750077200206635ustar00rootroot00000000000000trapperkeeper-status-0.7.1/src/puppetlabs/trapperkeeper/000077500000000000000000000000001303750077200235345ustar00rootroot00000000000000trapperkeeper-status-0.7.1/src/puppetlabs/trapperkeeper/services/000077500000000000000000000000001303750077200253575ustar00rootroot00000000000000trapperkeeper-status-0.7.1/src/puppetlabs/trapperkeeper/services/status/000077500000000000000000000000001303750077200267025ustar00rootroot00000000000000trapperkeeper-status-0.7.1/src/puppetlabs/trapperkeeper/services/status/check.clj000066400000000000000000000022201303750077200304450ustar00rootroot00000000000000(ns puppetlabs.trapperkeeper.services.status.check "Shared status check functions." (:import [java.io File IOException])) (defn disk-writable? "Given a directory, check whether the FS that the directory is in seems healthy by writing 4k random letters to a file in that directory, and then read them back in. If what was read matches what was written, then return true; if there's a mismatch or anything goes wrong in the process, return false. Since this check involves writing to disk, it may not be appropriate to put at a high status check level like :critical or :info." [directory] (let [rand-letter #(char (rand-nth (range 97 123))) fname (str "tk-disk-check-" (apply str (repeatedly 10 rand-letter)) ".txt")] (if-let [file (File. 
directory fname)]
    (let [payload (apply str (repeatedly 4095 rand-letter))]
      (try
        (when (.exists file) (.delete file))
        (spit file payload)
        (= (slurp file) payload)
        (catch Exception _
          false)
        (finally
          (try (.delete file)
               (catch IOException _ false)))))
    false)))
trapperkeeper-status-0.7.1/src/puppetlabs/trapperkeeper/services/status/cpu_monitor.clj000066400000000000000000000072701303750077200317400ustar00rootroot00000000000000
(ns puppetlabs.trapperkeeper.services.status.cpu-monitor
  (:require [clojure.java.jmx :as jmx]
            [puppetlabs.kitchensink.core :as ks]
            [clojure.tools.logging :as log]
            [schema.core :as schema])
  (:import (java.lang.management ManagementFactory)
           (javax.management AttributeNotFoundException)))

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; Schemas

(def CpuUsageSnapshot
  {:snapshot {:uptime schema/Int
              :process-cpu-time schema/Num
              :process-gc-time schema/Num}
   :cpu-usage schema/Num
   :gc-cpu-usage schema/Num})

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; Private

;; NOTE: code in this namespace was ported from the source code for JVisualVM.

(defn- cpu-multiplier*
  []
  ;; `contains?` on a vector tests indices rather than elements, so the
  ;; attribute names are put into a set before checking for membership.
  (if (contains? (set (jmx/attribute-names "java.lang:type=OperatingSystem"))
                 :ProcessingCapacity)
    (jmx/read "java.lang:type=OperatingSystem" :ProcessingCapacity)
    1))

(def cpu-multiplier (memoize cpu-multiplier*))

(defn- gc-bean-names*
  []
  (jmx/mbean-names "java.lang:type=GarbageCollector,*"))

(def gc-bean-names (memoize gc-bean-names*))

(defn get-process-cpu-time
  "Get the total CPU time spent by the process since startup."
  []
  (let [bean-cpu-time (jmx/read "java.lang:type=OperatingSystem" :ProcessCpuTime)]
    (* bean-cpu-time (cpu-multiplier))))

(defn get-collection-time
  "Compute the total time spent on Garbage Collection since the process was
  started, by summing GC collection times from the JMX GC beans."
[] (try (apply + (map #(jmx/read % :CollectionTime) (gc-bean-names))) (catch AttributeNotFoundException e ;; Hopefully we will never hit this code path, but if we do, we should just ;; log a warning and bail rather than letting the exception bubble up. (log/warn "Found GC Bean that does not contain `:CollectionTime` attribute: " (gc-bean-names)) 0))) (defn calculate-usage "Given 'before' and 'after' values for processing time, and a delta of time that expired between the two values, compute the percentage of CPU used." [process-time prev-process-time uptime-diff] (if (or (= -1 prev-process-time) (<= uptime-diff 0)) 0 (let [process-time-diff (- process-time prev-process-time)] (min (* 100 (/ process-time-diff uptime-diff)) 100)))) ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;;; Public (schema/defn get-cpu-values :- CpuUsageSnapshot "Given a recent snapshot of CPU Usage data, compute the CPU usage percentage since the snapshot, and return an updated snapshot." [last-snapshot :- CpuUsageSnapshot] (let [{prev-uptime :uptime prev-process-cpu-time :process-cpu-time prev-process-gc-time :process-gc-time} (:snapshot last-snapshot)] (let [runtime-bean (ManagementFactory/getRuntimeMXBean) ;; could cache / memoize num-cpus num-cpus (ks/num-cpus) uptime (* (.getUptime runtime-bean) 1000000) process-cpu-time (/ (get-process-cpu-time) num-cpus) process-gc-time (/ (* (get-collection-time) 1000000) num-cpus) uptime-diff (if (= -1 prev-uptime) uptime (- uptime prev-uptime)) cpu-usage (calculate-usage process-cpu-time prev-process-cpu-time uptime-diff) gc-usage (calculate-usage process-gc-time prev-process-gc-time uptime-diff)] (let [result {:snapshot {:uptime uptime :process-cpu-time process-cpu-time :process-gc-time process-gc-time} :cpu-usage (float (max cpu-usage 0)) :gc-cpu-usage (float (max gc-usage 0))}] (log/trace "Latest cpu usage metrics: " (ks/pprint-to-string result)) result)))) 
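
;; Usage sketch (not part of the original source): how `calculate-usage`
;; turns two process-time samples into a percentage. The sample values are
;; made up for illustration; evaluate the forms at a REPL in this namespace.
(comment
  ;; 500 units of CPU time over a 2000-unit uptime window is 25 percent
  (= 25 (calculate-usage 1500 1000 2000)) ; => true

  ;; a previous value of -1 marks "no earlier sample", so usage is 0
  (calculate-usage 1500 -1 2000)          ; => 0

  ;; results are capped at 100
  (calculate-usage 5000 1000 2000)        ; => 100
  )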
trapperkeeper-status-0.7.1/src/puppetlabs/trapperkeeper/services/status/status_core.clj000066400000000000000000000562001303750077200317320ustar00rootroot00000000000000(ns puppetlabs.trapperkeeper.services.status.status-core (:require [clojure.tools.logging :as log] [clojure.set :as setutils] [schema.core :as schema] [schema.utils :refer [validation-error-explain]] [ring.middleware.defaults :as ring-defaults] [slingshot.slingshot :refer [throw+]] [puppetlabs.comidi :as comidi] [puppetlabs.kitchensink.core :as ks] [puppetlabs.ring-middleware.utils :as ringutils] [puppetlabs.ring-middleware.core :as middleware] [puppetlabs.trapperkeeper.services.status.cpu-monitor :as cpu] [trptcolin.versioneer.core :as versioneer] [clojure.java.jmx :as jmx] [puppetlabs.i18n.core :as i18n]) (:import (java.net URL) (java.util.concurrent CancellationException) (java.lang.management ManagementFactory) (clojure.lang IFn))) ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;;; Schemas (def WholeSeconds schema/Int) (def WholeMilliseconds schema/Int) (def ServiceStatusDetailLevel (schema/enum :critical :info :debug)) (def State (schema/enum :running :error :starting :stopping :unknown)) (def Alert {:severity (schema/enum :error :warning :info) :message schema/Str}) (def StatusCallbackResponse {:state State :status schema/Any (schema/optional-key :alerts) [Alert]}) (def StatusFn (schema/make-fn-schema StatusCallbackResponse ServiceStatusDetailLevel)) (def ServiceInfo {:service-version schema/Str :service-status-version schema/Int ;; Note that while this specifies the input and output for the status ;; function for each service, it does not actually validate these :status-fn StatusFn}) (def ServicesInfo {schema/Str [ServiceInfo]}) ;; this is what gets returned in the HTTP response as json, and thus uses ;; underscores rather than hyphens (def ServiceStatus {:service_version schema/Str :service_status_version schema/Int :state State :detail_level ServiceStatusDetailLevel :status schema/Any 
:active_alerts [Alert]}) (def ServicesStatus {schema/Str ServiceStatus}) (def Version schema/Str) (def StatusProxyConfig {:proxy-target-url schema/Str :ssl-opts {:ssl-cert schema/Str :ssl-key schema/Str :ssl-ca-cert schema/Str}}) (def MemoryUsageV1 {:committed schema/Int :init schema/Int :max schema/Int :used schema/Int}) (def FileDescriptorUsageV1 {:max schema/Int :used schema/Int}) (def GcStatsV1 {schema/Str {:count schema/Int :total-time-ms schema/Int}}) (def JvmMetricsV1 {:heap-memory MemoryUsageV1 :non-heap-memory MemoryUsageV1 :file-descriptors FileDescriptorUsageV1 :gc-stats GcStatsV1 :up-time-ms WholeMilliseconds :start-time-ms WholeMilliseconds :cpu-usage schema/Num :gc-cpu-usage schema/Num}) (def DebugLoggingConfig (schema/maybe {:interval-minutes schema/Num})) (def StatusServiceConfig {(schema/optional-key :debug-logging) DebugLoggingConfig (schema/optional-key :cpu-metrics-interval-seconds) schema/Num}) ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;;; Private (defmacro with-timeout [description timeout-s default & body] `(let [f# (future (do ~@body)) result# (deref f# (* 1000 ~timeout-s) ~default)] (future-cancel f#) (when (future-cancelled? f#) (log/error (i18n/trs "{0} timed out, shutting down background task" ~description)) (try @f# (catch CancellationException e# (log/error e#)))) result#)) (defn- maybe-explain "Given the result of a call to schema.core/check, potentially unwrap it with validation-error-explain if it is a ValidationError object. Otherwise, pass the argument through." [schema-failure] (if (instance? schema.utils.ValidationError schema-failure) (validation-error-explain schema-failure) schema-failure)) (schema/defn check-timeout :- WholeSeconds "Given a status level keyword, returns an integral number of seconds to use as a timeout when calling a status function." 
  [level :- ServiceStatusDetailLevel]
  (case level
    :critical 30
    :info 60
    :debug 60))

(defn validate-callback-registration
  "Throws an IllegalStateException if a callback for svc-name is already
  registered with a different service version, or with the same status
  version."
  [status-fns svc-name svc-version status-version]
  (let [differing-svc-version? (and (not (nil? (first status-fns)))
                                    (not= (:service-version (first status-fns))
                                          svc-version))
        duplicate-status-version? (not (empty?
                                        (filter #(= (:service-status-version %)
                                                    status-version)
                                                status-fns)))
        error-message (if differing-svc-version?
                        (i18n/tru "Cannot register multiple callbacks for a single service with different service versions.")
                        (i18n/tru "Service function already exists for service {0} with status version {1}"
                                  svc-name
                                  status-version))]
    (when (or differing-svc-version? duplicate-status-version?)
      (throw (IllegalStateException. error-message)))))

(defn validate-protocol!
  "Throws if the protocol is not http or https"
  [url]
  (let [protocol (.getProtocol url)
        url-string (str url)]
    (when-not (contains? #{"http" "https"} protocol)
      (throw (IllegalArgumentException.
              (i18n/tru "The proxy-target-url ''{0}'' has an unsupported protocol ''{1}''. 
Must be either http or https" url-string protocol)))))) (schema/defn ^:always-validate get-jvm-metrics :- JvmMetricsV1 [cpu-snapshot :- cpu/CpuUsageSnapshot] (let [runtime-bean (ManagementFactory/getRuntimeMXBean) gc-beans (jmx/mbean-names "java.lang:name=*,type=GarbageCollector")] {:heap-memory (jmx/read "java.lang:type=Memory" :HeapMemoryUsage) :non-heap-memory (jmx/read "java.lang:type=Memory" :NonHeapMemoryUsage) :file-descriptors (setutils/rename-keys (jmx/read "java.lang:type=OperatingSystem" [:OpenFileDescriptorCount :MaxFileDescriptorCount]) {:OpenFileDescriptorCount :used :MaxFileDescriptorCount :max}) :gc-stats (into {} (for [gc gc-beans] (let [gc-name (.getKeyProperty gc "name") gc-info (setutils/rename-keys (jmx/read gc [:CollectionCount :CollectionTime]) {:CollectionCount :count :CollectionTime :total-time-ms})] {gc-name gc-info}))) :cpu-usage (:cpu-usage cpu-snapshot) :gc-cpu-usage (:gc-cpu-usage cpu-snapshot) :up-time-ms (.getUptime runtime-bean) :start-time-ms (.getStartTime runtime-bean)})) (schema/defn update-cpu-usage-metrics [last-cpu-snapshot :- (schema/atom cpu/CpuUsageSnapshot)] (swap! last-cpu-snapshot cpu/get-cpu-values)) ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;;; Public (def status-service-name "status-service") (schema/defn ^:always-validate validate-config :- StatusServiceConfig [config] (let [config (or config {})] (schema/validate DebugLoggingConfig (:debug-logging config)) config)) (schema/defn ^:always-validate schedule-bg-tasks [interspaced :- IFn log-status :- IFn config :- StatusServiceConfig last-cpu-snapshot :- (schema/atom cpu/CpuUsageSnapshot)] (let [interval-minutes (get-in config [:debug-logging :interval-minutes])] (when interval-minutes (let [interval-milliseconds (* 60000 interval-minutes)] (log/info "Starting background logging of status data") (interspaced interval-milliseconds log-status)))) (let [cpu-metrics-interval-seconds (get-in config [:cpu-metrics-interval-seconds] 5)] (when (pos? 
cpu-metrics-interval-seconds) (log/info "Starting background monitoring of cpu usage metrics") (interspaced (* cpu-metrics-interval-seconds 1000) (partial update-cpu-usage-metrics last-cpu-snapshot))))) (schema/defn ^:always-validate nominal? :- schema/Bool [status :- ServiceStatus] (= (:state status) :running)) (schema/defn ^:always-validate all-nominal? :- schema/Bool [statuses :- ServicesStatus] (every? nominal? (vals statuses))) (schema/defn ^:always-validate get-artifact-version :- schema/Str "Utility function that services can use to get a value to pass in as their `service-version` when registering a status callback. `group-id` and `artifact-id` should match the maven/leiningen identifiers for the project that the service is defined in." [group-id artifact-id] (let [version (versioneer/get-version group-id artifact-id)] (when (empty? version) (throw (IllegalStateException. (i18n/tru "Unable to find version number for ''{0}/{1}''" group-id artifact-id)))) version)) (def status-service-version (get-artifact-version "puppetlabs" "trapperkeeper-status")) (defn level->int "Returns an integer which represents the given status level. The ordering of levels is :critical < :info < :debug." [level] (case level :critical 0 :info 1 :debug 2)) (defn compare-levels "Converts the two status levels to integers using level->int and then invokes f, passing the two integers as arguments. Especially useful for comparing two status levels." [f level1 level2] (f (level->int level1) (level->int level2))) (schema/defn service-status-map :- ServiceInfo [svc-version status-version status-fn] {:service-version svc-version :service-status-version status-version :status-fn status-fn}) (defn update-status-context "Update the :status-fns atom in the service context." 
  [status-fns-atom svc-name svc-version status-version status-fn]
  (validate-callback-registration (get (deref status-fns-atom) svc-name)
                                  svc-name
                                  svc-version
                                  status-version)
  (let [status-map (service-status-map svc-version status-version status-fn)]
    (swap! status-fns-atom update-in [svc-name] conj status-map)))

(schema/defn ^:always-validate reset-status-context! :- nil
  "Reset the :status-fns atom in the service context to an empty map,
  removing every registered status callback."
  [status-fns-atom :- clojure.lang.Atom]
  (reset! status-fns-atom {})
  nil)

(schema/defn ^:always-validate guarded-status-fn-call :- StatusCallbackResponse
  "Given a status check function, a status detail level, and a timeout in
  (integral) seconds, this function calls the status function and handles three
  types of errors:
  * Status check timed out
  * Status check threw an Exception
  * Status check returned a form that doesn't match the StatusCallbackResponse schema
  In each error case, :state is set to :unknown and :status is set to a string
  describing the error."
  [service-name :- schema/Str
   status-fn :- StatusFn
   level :- ServiceStatusDetailLevel
   timeout :- WholeSeconds]
  (let [unknown-response (fn [status] {:state :unknown
                                       :status status})
        timeout-response (unknown-response
                          (i18n/tru "Status check timed out after {0} seconds" timeout))]
    (with-timeout (i18n/trs "Status callback for {0}" service-name) timeout timeout-response
      (try
        (let [status (status-fn level)]
          (if-let [schema-failure (schema/check StatusCallbackResponse status)]
            (unknown-response (i18n/tru "Status check response for {0} malformed: {1}"
                                        service-name
                                        (maybe-explain schema-failure)))
            status))
        (catch InterruptedException e
          ;; if we get here it's almost certainly because the timeout was reached,
          ;; so the macro already has a return value and we don't need to bother
          ;; returning one
          (log/error e (i18n/trs "Status callback for {0} interrupted" service-name)))
        (catch Exception e
          (let [error-msg (i18n/trs "Status check for {0} threw an exception" service-name)]
            (log/error e error-msg)
            (unknown-response (format "%s: %s" error-msg e))))))))

(schema/defn ^:always-validate matching-service-info :- ServiceInfo
  "Find a service info entry matching the service-status-version. If
  service-status-version is nil the most recent service info is returned."
  [service-name :- schema/Str
   service :- [ServiceInfo]
   service-status-version :- (schema/maybe schema/Int)]
  (let [status (if (nil? service-status-version)
                 (last (sort-by :service-status-version service))
                 (first (filter #(= (:service-status-version %)
                                    service-status-version)
                                service)))]
    (if (nil? status)
      (throw+ {:kind :service-status-version-not-found
               :msg (i18n/tru "No status function with version {0} found for service {1}"
                              service-status-version
                              service-name)})
      status)))

(schema/defn ^:always-validate get-status-fn :- StatusFn
  "Retrieve the status-fn for a service by name and optionally by
  service-status version. If service-status-version is nil the status fn for
  the most recent status version is used."
  [services-info-atom
   service-name :- schema/Str
   service-status-version :- (schema/maybe schema/Int)]
  (let [service-info (-> services-info-atom deref (get service-name))]
    (if (nil? service-info)
      (throw+ {:kind :service-info-not-found
               :msg (i18n/tru "No service info found for service {0}" service-name)})
      (:status-fn (matching-service-info service-name
                                         service-info
                                         service-status-version)))))

(schema/defn ^:always-validate call-status-fn-for-service :- ServiceStatus
  "Construct a map with the service's version, the version of the service's
  status, the detail level, and the results of calling the status function
  corresponding to the status version specified (or the most recent version if
  not). If the response from the callback function does not include a valid
  :state value, return :unknown for :state."
([service-name :- schema/Str service :- [ServiceInfo] level :- ServiceStatusDetailLevel timeout :- WholeSeconds] (call-status-fn-for-service service-name service level timeout nil)) ([service-name :- schema/Str service :- [ServiceInfo] level :- ServiceStatusDetailLevel timeout :- WholeSeconds service-status-version :- (schema/maybe schema/Int)] (let [status (matching-service-info service-name service service-status-version) callback-resp (guarded-status-fn-call service-name (:status-fn status) level timeout) data (:status callback-resp) state (if-not (schema/check State (:state callback-resp)) (:state callback-resp) :unknown) alerts (get callback-resp :alerts [])] {:service_version (:service-version status) :service_status_version (:service-status-version status) :detail_level level :state state :status data :active_alerts alerts}))) (schema/defn ^:always-validate call-status-fns :- ServicesStatus "Call the latest status function for each service in the service context, and return a map of service to service status." [status-fns :- ServicesInfo level :- ServiceStatusDetailLevel timeout :- WholeSeconds] (try (into {} (pmap (fn [[k v]] {k (call-status-fn-for-service k v level timeout)}) status-fns)) ;; pmap returns all exceptions that occur while it is executing tasks in a ;; java.util.concurrent.ExecutionException. This unwraps and rethrows ;; these exceptions so that our other middleware can handle them ;; appropriately. (catch java.util.concurrent.ExecutionException e (throw (.getCause e))))) (defn get-status-detail-level "Given a params map from a request, get out the status level and check whether it is valid. If not, throw an error. If no status level was in the params, then default to 'info'." [params] (if-let [level (keyword (params :level))] (if-not (schema/check ServiceStatusDetailLevel level) level (ringutils/throw-data-invalid! 
(i18n/tru "Invalid level: {0}" level))) :info)) (defn get-service-status-version "Given a params map from a request, get out the service status version and check whether it is valid. If not, throw an error." [params] (when-let [version (params :service_status_version)] (if-let [numeric-version (ks/parse-int version)] numeric-version (ringutils/throw-data-invalid! (i18n/tru "Invalid service_status_version. Should be an integer but was {0}" version))))) (defn get-timeout "Given a params map from a request, attempt to find the timeout parameter and parse it as an integer, returning the numeric value. If no timeout parameter is found, return nil. If the parameter isn't parseable as an integer, throw an exception." [params] (let [timeout-param (:timeout params) numeric-timeout (and timeout-param (ks/parse-int timeout-param))] (cond (nil? timeout-param) nil (nil? numeric-timeout) (ringutils/throw-data-invalid! (i18n/tru "Invalid timeout. Should be an integer but was {0}" timeout-param)) (<= numeric-timeout 0) (ringutils/throw-data-invalid! (i18n/tru "Invalid timeout. Timeout must be greater than zero but was {0}" numeric-timeout)) :else numeric-timeout))) (schema/defn ^:always-validate summarize-states :- State "Given a map of service statuses, return the 'most severe' state present as ranked by :error, :unknown, :stopping, :starting, :running" [statuses :- ServicesStatus] (let [state-set (->> statuses vals (map :state) set)] (cond (state-set :error) :error (state-set :unknown) :unknown (state-set :stopping) :stopping (state-set :starting) :starting (state-set :running) :running :else :unknown))) ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;;; Comidi App (schema/defn ^:always-validate status->code :- schema/Int "Given a service status, returns an appropriate HTTP status code" [status :- ServiceStatus] (if (nominal? 
status) 200 503)) (schema/defn ^:always-validate statuses->code :- schema/Int "Given a map of service statuses, returns an appropriate HTTP status code." [statuses :- ServicesStatus] (if (all-nominal? statuses) 200 503)) (defn build-plaintext-routes [path status-fns] (comidi/context path (comidi/GET "" [:as {params :params}] (let [timeout (or (get-timeout params) (check-timeout :critical)) statuses (call-status-fns status-fns :critical timeout)] (ringutils/plain-response (statuses->code statuses) (-> statuses summarize-states name)))) (comidi/GET ["/" :service-name] [service-name :as {params :params}] (if-let [service-info (get status-fns service-name)] (let [timeout (or (get-timeout params) (check-timeout :critical)) status (call-status-fn-for-service service-name service-info :critical timeout)] (ringutils/plain-response (status->code status) (name (:state status)))) (ringutils/plain-response 404 (i18n/tru "not found: {0}" service-name)))))) (defn build-json-routes [path status-fns] (comidi/context path (comidi/GET "" [:as {params :params}] (let [level (get-status-detail-level params) timeout (or (get-timeout params) (check-timeout level)) statuses (call-status-fns status-fns level timeout)] (ringutils/json-response (statuses->code statuses) statuses))) (comidi/GET ["/" :service-name] [service-name :as {params :params}] (if-let [service-info (get status-fns service-name)] (let [level (get-status-detail-level params) service-status-version (get-service-status-version params) status (call-status-fn-for-service service-name service-info level (or (get-timeout params) (check-timeout level)) service-status-version)] (ringutils/json-response (status->code status) (assoc status :service_name service-name))) ;; else (no service with that name) (ringutils/json-response 404 {:kind :service-not-found :msg (i18n/tru "No status information found for service {0}" service-name)}))))) (schema/defn ^:always-validate errors-by-type-middleware [t :- ringutils/ResponseType] (fn 
[handler] (-> handler (middleware/wrap-data-errors t) (middleware/wrap-schema-errors t) (middleware/wrap-uncaught-errors t)))) (defn build-handler [path status-fns] (comidi/routes->handler (comidi/wrap-routes (comidi/context path (comidi/context "/v1" (-> (build-json-routes "/services" status-fns) (comidi/wrap-routes (errors-by-type-middleware :json))) (-> (build-plaintext-routes "/simple" status-fns) (comidi/wrap-routes (errors-by-type-middleware :plain))))) #(i18n/locale-negotiator (ring-defaults/wrap-defaults % ring-defaults/api-defaults))))) ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;;; Status Service Status (schema/defn ^:always-validate v1-status :- StatusCallbackResponse [last-cpu-snapshot :- (schema/atom cpu/CpuUsageSnapshot) level :- ServiceStatusDetailLevel] (let [level>= (partial compare-levels >= level)] {:state :running :status (cond-> ;; no status info at ':critical' level {} ;; no extra status at ':info' level yet (level>= :info) identity (level>= :debug) (assoc-in [:experimental :jvm-metrics] (get-jvm-metrics @last-cpu-snapshot)))})) (schema/defn status-latest-version :- StatusCallbackResponse "This function will return the status data from the latest version of the API" [last-cpu-snapshot :- (schema/atom cpu/CpuUsageSnapshot) level :- ServiceStatusDetailLevel] (v1-status last-cpu-snapshot level)) ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;;; Status Proxy (schema/defn ^:always-validate get-proxy-route-info "Validates the status-proxy-config and returns a map with parameters to be used with add-proxy-route: proxy-target: target host, port, and path proxy-options: SSL options for the proxy target" [status-proxy-config :- StatusProxyConfig] (let [target-url (URL. (status-proxy-config :proxy-target-url)) host (.getHost target-url) port (.getPort target-url) path (.getPath target-url) protocol (.getProtocol target-url) ssl-opts (status-proxy-config :ssl-opts)] (validate-protocol! 
target-url) {:proxy-target {:host host :port port :path path} :proxy-options {:ssl-config ssl-opts :scheme (keyword protocol)}})) trapperkeeper-status-0.7.1/src/puppetlabs/trapperkeeper/services/status/status_debug_logging.clj000066400000000000000000000016111303750077200335720ustar00rootroot00000000000000(ns puppetlabs.trapperkeeper.services.status.status-debug-logging (:require [clojure.tools.logging :as log] [schema.utils :refer [validation-error-explain]] [cheshire.core :as json] [puppetlabs.trapperkeeper.services.status.status-core :as status-core] [schema.core :as schema] [puppetlabs.trapperkeeper.services.status.cpu-monitor :as cpu])) (schema/defn log-status "Log status information at the debug level as json Note: This function is in its own namespace so that logback can use the namespace as a way to route these log messages separately from other logging the application might be doing, and so it shouldn't be moved from this namespace" [last-cpu-snapshot :- (schema/atom cpu/CpuUsageSnapshot)] (let [status (status-core/status-latest-version last-cpu-snapshot :debug)] (log/debug (json/generate-string status)))) trapperkeeper-status-0.7.1/src/puppetlabs/trapperkeeper/services/status/status_proxy_service.clj000066400000000000000000000013071303750077200337010ustar00rootroot00000000000000(ns puppetlabs.trapperkeeper.services.status.status-proxy-service (:require [clojure.tools.logging :as log] [puppetlabs.trapperkeeper.services.status.status-core :refer [get-proxy-route-info]] [puppetlabs.trapperkeeper.core :refer [defservice]] [puppetlabs.i18n.core :as i18n])) (defservice status-proxy-service [[:WebroutingService add-proxy-route] [:ConfigService get-in-config]] (init [this context] (log/info (i18n/trs "Initializing status service proxy")) (let [status-proxy-config (get-in-config [:status-proxy]) {:keys [proxy-target proxy-options]} (get-proxy-route-info status-proxy-config)] (add-proxy-route this proxy-target proxy-options)) context)) 
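
;; Usage sketch (not part of the original source): the shape of the map that
;; `get-proxy-route-info` returns for a `:status-proxy` config. The host name
;; and certificate paths are hypothetical placeholders.
(comment
  (get-proxy-route-info
   {:proxy-target-url "https://status.example.com:8140/status"
    :ssl-opts {:ssl-cert "/etc/ssl/certs/proxy.pem"
               :ssl-key "/etc/ssl/private/proxy.pem"
               :ssl-ca-cert "/etc/ssl/certs/ca.pem"}})
  ;; returns a map with
  ;;   :proxy-target  {:host "status.example.com" :port 8140 :path "/status"}
  ;;   :proxy-options {:ssl-config <the :ssl-opts map> :scheme :https}

  ;; java.net.URL reports a port of -1 when the URL omits one, so :port will
  ;; be -1 unless :proxy-target-url spells the port out.
  )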
trapperkeeper-status-0.7.1/src/puppetlabs/trapperkeeper/services/status/status_service.clj000066400000000000000000000070301303750077200324370ustar00rootroot00000000000000(ns puppetlabs.trapperkeeper.services.status.status-service (:require [clojure.tools.logging :as log] [puppetlabs.trapperkeeper.core :refer [defservice]] [puppetlabs.trapperkeeper.services :refer [service-context]] [puppetlabs.trapperkeeper.services.status.status-core :as status-core] [puppetlabs.trapperkeeper.services.status.status-debug-logging :as status-logging] [schema.core :as schema] [puppetlabs.i18n.core :as i18n])) (defprotocol StatusService (register-status [this service-name service-version status-version status-fn] "Register a status callback function for a service by adding it to the status service context. status-fn must be a function of arity 1 which takes the status level as a keyword and returns the status information for the given level. The return value of the callback function must satisfy the puppetlabs.trapperkeeper.services.status.status-core/StatusCallbackResponse schema.") (get-status [this service-name level status-version] [this service-name level status-version timeout] "Call the status function for a registered service, optionally providing a timeout to override the default timeout value for the level.")) (defservice status-service StatusService [[:WebroutingService add-ring-handler get-route] [:ConfigService get-in-config] [:SchedulerService interspaced]] (init [this context] (assoc context :status-fns (atom {}) :last-cpu-snapshot (atom {:snapshot {:uptime -1 :process-cpu-time -1 :process-gc-time -1} :cpu-usage -1 :gc-cpu-usage -1}))) (start [this context] (let [config (status-core/validate-config (get-in-config [:status])) cpu-snapshot (:last-cpu-snapshot context)] (status-core/schedule-bg-tasks interspaced (partial status-logging/log-status cpu-snapshot) config cpu-snapshot) (register-status this status-core/status-service-name status-core/status-service-version 1 
(partial status-core/v1-status cpu-snapshot))) (log/info (i18n/trs "Registering status service HTTP API at /status")) (let [path (get-route this) handler (status-core/build-handler path (deref (:status-fns context)))] (add-ring-handler this handler)) context) (stop [this context] (when-let [status-fns (:status-fns context)] (status-core/reset-status-context! status-fns)) context) (register-status [this service-name service-version status-version status-fn] (log/infof (i18n/trs "Registering status callback function for service ''{0}'', version {1}" service-name service-version)) (status-core/update-status-context (:status-fns (service-context this)) service-name service-version status-version status-fn)) (get-status [this service-name level status-version] (get-status this service-name level status-version (status-core/check-timeout level))) (get-status [this service-name level status-version timeout] (let [status-fn (status-core/get-status-fn (:status-fns (service-context this)) service-name status-version)] (status-core/guarded-status-fn-call service-name status-fn level timeout)))) trapperkeeper-status-0.7.1/test/000077500000000000000000000000001303750077200166745ustar00rootroot00000000000000trapperkeeper-status-0.7.1/test/puppetlabs/000077500000000000000000000000001303750077200210535ustar00rootroot00000000000000trapperkeeper-status-0.7.1/test/puppetlabs/trapperkeeper/000077500000000000000000000000001303750077200237245ustar00rootroot00000000000000trapperkeeper-status-0.7.1/test/puppetlabs/trapperkeeper/services/000077500000000000000000000000001303750077200255475ustar00rootroot00000000000000trapperkeeper-status-0.7.1/test/puppetlabs/trapperkeeper/services/status/000077500000000000000000000000001303750077200270725ustar00rootroot00000000000000trapperkeeper-status-0.7.1/test/puppetlabs/trapperkeeper/services/status/check_test.clj000066400000000000000000000027101303750077200317000ustar00rootroot00000000000000(ns puppetlabs.trapperkeeper.services.status.check-test 
(:require [clojure.test :refer :all] [puppetlabs.trapperkeeper.services.status.check :refer :all])) (deftest disk-writable?-test (let [temp-dir (System/getProperty "java.io.tmpdir")] (testing "disk-writable?" (testing "writes to a temp file, checks what was written, and deletes the file." (let [!file (atom nil) !content (atom nil)] (with-redefs [spit (fn [file content] (reset! !file file) (reset! !content content) nil) slurp (fn [file] (is (= @!file file)) @!content)] (is (disk-writable? temp-dir)) (is (not (.exists @!file)))))) (testing "returns false if what was read isn't what was written" (with-redefs [spit (constantly nil) slurp (constantly "foo")] (is (not (disk-writable? temp-dir))))) (testing "returns false if an exception was thrown by spit" (let [!file (atom nil)] (with-redefs [spit (fn [file _] (reset! !file file) (throw (Exception. "this isn't real")))] (is (false? (disk-writable? temp-dir)))) (testing "but still deletes the file" (is (not (.exists @!file))))))))) trapperkeeper-status-0.7.1/test/puppetlabs/trapperkeeper/services/status/status_core_test.clj000066400000000000000000000156401303750077200331640ustar00rootroot00000000000000(ns puppetlabs.trapperkeeper.services.status.status-core-test (:require [clojure.test :refer :all] [schema.core :as schema] [schema.test :as schema-test] [puppetlabs.trapperkeeper.testutils.logging :refer [with-test-logging]] [puppetlabs.trapperkeeper.services.status.status-core :refer :all] [slingshot.test] [puppetlabs.kitchensink.core :as ks])) (use-fixtures :once schema-test/validate-schemas) (deftest get-service-version-test (testing "get-service-version returns a version string that satisfies the schema" ;; This test coverage isn't very thorough, but anything beyond this would ;; really just be testing the underlying libraries that we use to ;; implement it. (is (thrown-with-msg? 
IllegalStateException #"Unable to find version number for" (get-artifact-version "fake-group" "artifact-that-does-not-exist"))))) (deftest update-status-context-test (let [status-fns (atom {})] (testing "registering service status callback functions" (update-status-context status-fns "foo" "1.1.0" 1 (fn [] "foo v1")) (update-status-context status-fns "foo" "1.1.0" 2 (fn [] "foo v2")) (update-status-context status-fns "bar" "1.1.0" 1 (fn [] "bar v1")) (is (nil? (schema/check ServicesInfo @status-fns))) (is (= 2 (count (get @status-fns "foo")))) (is (= 1 (count (get @status-fns "bar"))))) (testing (str "registering a service status callback function with a " "version that already exists causes an error") (is (thrown-with-msg? IllegalStateException #"Service function already exists.*" (update-status-context status-fns "foo" "1.1.0" 2 (fn [] "foo repeat"))))) (testing (str "registering a service status callback function with a " "different service version causes an error") (is (thrown-with-msg? IllegalStateException #"Cannot register multiple callbacks.*different service version" (update-status-context status-fns "foo" "1.2.0" 3 (fn [] "foo repeat"))))))) (deftest get-status-fn-test (testing "getting the status function with an unspecified status version" (let [status-fns (atom {})] (update-status-context status-fns "foo" "1.1.0" 1 (fn [] "foo v1")) (update-status-context status-fns "foo" "1.1.0" 2 (fn [] "foo v2")) (update-status-context status-fns "bar" "8.0.1" 1 (fn [] "bar v1")) (let [status-fn (get-status-fn status-fns "foo" nil)] (is (= "foo v2" (status-fn))) (is (thrown+? [:kind :service-info-not-found :msg "No service info found for service baz"] (get-status-fn status-fns "baz" nil))) (is (thrown+? 
               [:kind :service-status-version-not-found
                :msg "No status function with version 2 found for service bar"]
               (get-status-fn status-fns "bar" 2)))))))

(deftest error-handling-test
  (testing "when there is an error checking status"
    (let [status-fns (atom {})]
      (testing "and it is a bad callback result schema"
        (update-status-context status-fns "foo" "1.1.0" 1 (fn [_] {:totally :nonconforming}))
        (let [result (call-status-fn-for-service "foo" (get @status-fns "foo") :debug 1)]
          (testing "status is set to explain schema error"
            (is (re-find #"missing-required-key" (pr-str result))))
          (testing "state is set properly"
            (is (= :unknown (:state result))))))

      (testing "and it is from a timeout"
        ; Add a status function that will block indefinitely to ensure we will always
        ; timeout
        (update-status-context status-fns "quux" "1.1.0" 1 (fn [_]
                                                             (deref (promise))
                                                             {:state :running
                                                              :status "aw yis"}))
        (with-test-logging
          (let [result (call-status-fn-for-service "quux" (get @status-fns "quux") :debug 0)]
            (is (logged? #"Status callback for quux timed out" :error))
            (is (logged? #"CancellationException"))
            (testing "state is set properly"
              (is (= :unknown (:state result))))
            (testing "status is set to explain timeout"
              (is (= "Status check timed out after 0 seconds" (:status result)))))))

      (testing "and it is from the status reporting function"
        (update-status-context status-fns "bar" "1.1.0" 1 (fn [_] (throw (Exception. "don't"))))
        (with-test-logging
          (let [result (call-status-fn-for-service "bar" (get @status-fns "bar") :debug 1)]
            (is (logged? #"Status check for bar threw an exception" :error))
            (testing "status contains exception"
              (is (re-find #"don't" (pr-str result))))
            (testing "state is set properly"
              ;; Wrapped in `is` so the comparison is actually asserted.
              (is (= :unknown (:state result))))))))))

(deftest v1-status-test
  (let [last-cpu-snapshot (atom {:snapshot {:uptime -1
                                            :process-cpu-time -1
                                            :process-gc-time -1}
                                 :cpu-usage -1
                                 :gc-cpu-usage -1})]
    (testing "no data at critical level"
      (is (= {:state :running
              :status {}}
             (v1-status last-cpu-snapshot :critical))))
    (testing "no data at info level"
      (is (= {:state :running
              :status {}}
             (v1-status last-cpu-snapshot :info))))
    (testing "jvm metrics at debug level"
      (let [status (v1-status last-cpu-snapshot :debug)]
        (is (= {:state :running} (dissoc status :status)))
        (is (= #{:experimental} (ks/keyset (:status status))))
        (is (= #{:jvm-metrics} (ks/keyset (get-in status [:status :experimental]))))
        (let [jvm-metrics (get-in status [:status :experimental :jvm-metrics])]
          (is (= #{:heap-memory :non-heap-memory
                   :file-descriptors :gc-stats
                   :up-time-ms :start-time-ms
                   :cpu-usage :gc-cpu-usage}
                 (ks/keyset jvm-metrics)))
          (is (= #{:committed :init :max :used} (ks/keyset (:heap-memory jvm-metrics))))
          (is (= #{:committed :init :max :used} (ks/keyset (:non-heap-memory jvm-metrics))))
          (is (every? #(< 0 %) (vals (:heap-memory jvm-metrics))))
          (is (every? #(or (< 0 %) (= -1 %)) (vals (:non-heap-memory jvm-metrics))))
          (is (= #{:max :used} (ks/keyset (:file-descriptors jvm-metrics))))
          (is (every? #(< 0 %) (vals (:file-descriptors jvm-metrics))))
          (is (every? #(= #{:count :total-time-ms} (ks/keyset %))
                      (vals (:gc-stats jvm-metrics))))
          (is (every? #(<= 0 %) ;; Possible that no major collections occurred.
                      (mapcat #(vals %) (vals (:gc-stats jvm-metrics)))))
          (is (< 0 (:up-time-ms jvm-metrics)))
          (is (< 0 (:start-time-ms jvm-metrics))))))))
status_proxy_service_test.clj000066400000000000000000000247171303750077200350630ustar00rootroot00000000000000trapperkeeper-status-0.7.1/test/puppetlabs/trapperkeeper/services/status(ns puppetlabs.trapperkeeper.services.status.status-proxy-service-test
  (:require [clojure.test :refer :all]
            [cheshire.core :as json]
            [schema.test :as schema-test]
            [puppetlabs.http.client.sync :as http-client]
            [puppetlabs.trapperkeeper.core :refer [defservice service]]
            [puppetlabs.trapperkeeper.testutils.bootstrap :refer :all]
            [puppetlabs.trapperkeeper.testutils.logging :refer [with-test-logging]]
            [puppetlabs.trapperkeeper.services.status.status-service :refer [status-service]]
            [puppetlabs.trapperkeeper.services.status.status-proxy-service :refer [status-proxy-service]]
            [puppetlabs.trapperkeeper.services.webrouting.webrouting-service :as webrouting-service]
            [puppetlabs.trapperkeeper.services.scheduler.scheduler-service :as scheduler-service]
            [puppetlabs.trapperkeeper.services.webserver.jetty9-service :as jetty9-service]))

(use-fixtures :once schema-test/validate-schemas)

(def dev-resources
  "./dev-resources/puppetlabs/trapperkeeper-status/status-proxy-service-test")

(def common-ssl-config
  {:ssl-cert (str dev-resources "/ssl/certs/localhost.pem")
   :ssl-key (str dev-resources "/ssl/private_keys/localhost.pem")
   :ssl-ca-cert (str dev-resources "/ssl/certs/ca.pem")})

(def ssl-status-service-config
  {:webserver (merge common-ssl-config
                     {:ssl-host "0.0.0.0"
                      :ssl-port 9001})
   :web-router-service {:puppetlabs.trapperkeeper.services.status.status-service/status-service "/ssl-status"}})

(def status-proxy-service-config
  {:webserver {:port 8181
               :host "0.0.0.0"}
   :status-proxy {:proxy-target-url "https://localhost:9001/ssl-status"
                  :ssl-opts common-ssl-config}
   :web-router-service {:puppetlabs.trapperkeeper.services.status.status-proxy-service/status-proxy-service "/status-proxy"}})

(defservice foo-service
  [[:StatusService register-status]]
  (init [this context]
    (register-status "foo" "1.1.0" 1 (fn [level] {:status (str "foo status 1 " level)
                                                  :state :running}))
    (register-status "foo" "1.1.0" 2 (fn [level] {:status (str "foo status 2 " level)
                                                  :state :running}))
    context))

(defservice bar-service
  [[:StatusService register-status]]
  (init [this context]
    (register-status "bar" "0.1.0" 1 (fn [level] {:status (str "bar status 1 " level)
                                                  :state :running}))
    context))

(deftest proxy-ssl-status-endpoint-test
  (testing "status-proxy-service can connect to https status-service service correctly"
    ; Start status service
    (with-app-with-config
      status-app
      [jetty9-service/jetty9-service
       webrouting-service/webrouting-service
       status-service
       foo-service
       bar-service
       scheduler-service/scheduler-service]
      ssl-status-service-config
      ; Start the proxy service
      (with-app-with-config
        proxy-app
        [jetty9-service/jetty9-service
         webrouting-service/webrouting-service
         status-proxy-service]
        status-proxy-service-config
        ; Make HTTP request to the proxy, which will forward it as an HTTPS request
        (testing "proxying plain url"
          (let [resp (http-client/get "http://localhost:8181/status-proxy/v1/services")
                body (json/parse-string (slurp (:body resp)))]
            (is (= 200 (:status resp)))
            (is (= {"bar" {"service_version" "0.1.0"
                           "service_status_version" 1
                           "state" "running"
                           "detail_level" "info"
                           "active_alerts" []
                           "status" "bar status 1 :info"}
                    "foo" {"service_version" "1.1.0"
                           "service_status_version" 2
                           "state" "running"
                           "detail_level" "info"
                           "active_alerts" []
                           "status" "foo status 2 :info"}}
                   (dissoc body "status-service")))))
        (testing "proxying url with query param"
          (let [resp (http-client/get "http://localhost:8181/status-proxy/v1/services?level=debug")
                body (json/parse-string (slurp (:body resp)))]
            (is (= 200 (:status resp)))
            (is (= {"bar" {"service_version" "0.1.0"
                           "service_status_version" 1
                           "state" "running"
                           "detail_level" "debug"
                           "active_alerts" []
                           "status" "bar status 1 :debug"}
                    "foo" {"service_version" "1.1.0"
                           "service_status_version" 2
                           "state" "running"
                           "detail_level" "debug"
                           "active_alerts" []
                           "status" "foo status 2 :debug"}}
                   (dissoc body "status-service")))))
        (testing "proxying specific service"
          (let [resp (http-client/get "http://localhost:8181/status-proxy/v1/services/foo")
                body (json/parse-string (slurp (:body resp)))]
            (is (= 200 (:status resp)))
            (is (= {"service_version" "1.1.0"
                    "service_status_version" 2
                    "state" "running"
                    "detail_level" "info"
                    "status" "foo status 2 :info"
                    "active_alerts" []
                    "service_name" "foo"}
                   body))))))))

(defn count-ring-handler
  "Increments counter"
  [counter req]
  {:status 200
   :body (str (swap! counter inc))})

(deftest proxy-only-proxies-what-it-should-test
  (testing "status-proxy-service doesn't proxy things it shouldn't"
    ; non-status-endpoint-counter is used to make sure that no requests to the proxy
    ; endpoint end up hitting non status-service endpoints
    (let [non-status-endpoint-counter (atom 0)
          status-ring-handler (partial count-ring-handler non-status-endpoint-counter)
          status-count-service (service
                                 [[:WebserverService add-ring-handler]]
                                 (init [this context]
                                   (add-ring-handler status-ring-handler "/stats")
                                   (add-ring-handler status-ring-handler "/nope")
                                   (add-ring-handler status-ring-handler "/ssl-statuses")
                                   (add-ring-handler status-ring-handler "/statuses")
                                   context))
          ; Should not succeed in hitting the endpoints running on the status service server
          bad-proxy-requests ["http://localhost:8181/status-proxy/stats"
                              "http://localhost:8181/status-proxy/nope"
                              "http://localhost:8181/status-proxy/statuses"]
          ; Should succeed, used to make sure the counter works right
          good-non-status-endpoint-requests ["https://localhost:9001/stats"
                                             "https://localhost:9001/nope"
                                             "https://localhost:9001/ssl-statuses"
                                             "https://localhost:9001/statuses"]]
      (with-app-with-config
        ; Start status service
        status-app
        [jetty9-service/jetty9-service
         webrouting-service/webrouting-service
         status-service
         scheduler-service/scheduler-service
         status-count-service]
        ssl-status-service-config
        ; Start the proxy service
        (with-app-with-config
          proxy-app
          [jetty9-service/jetty9-service
           webrouting-service/webrouting-service
           status-proxy-service]
          status-proxy-service-config
          (testing "non-proxied endpoints on status-service server don't see any traffic"
            (doall (map http-client/get bad-proxy-requests))
            (is (= 0 (deref non-status-endpoint-counter))))
          (testing "that the counter is correctly catching requests"
            (doall (map #(http-client/get % common-ssl-config) good-non-status-endpoint-requests))
            (is (= (count good-non-status-endpoint-requests)
                   (deref non-status-endpoint-counter)))))))))

(deftest invalid-proxy-config-throws
  (testing "Missing proxy-target-url throws schema error"
    (let [bad-config (update-in status-proxy-service-config [:status-proxy] dissoc :proxy-target-url)]
      (is (thrown-with-msg?
            clojure.lang.ExceptionInfo
            #"does not match schema.*:proxy-target-url"
            (with-test-logging
              (with-app-with-config
                proxy-app
                [jetty9-service/jetty9-service
                 webrouting-service/webrouting-service
                 status-proxy-service]
                bad-config))))))
  (testing "Invalid ssl-cert type throws error"
    (let [bad-config (assoc-in status-proxy-service-config [:status-proxy :ssl-opts :ssl-cert] 3.1415)]
      (is (thrown-with-msg?
            clojure.lang.ExceptionInfo
            #"does not match schema.*:ssl-cert"
            (with-test-logging
              (with-app-with-config
                proxy-app
                [jetty9-service/jetty9-service
                 webrouting-service/webrouting-service
                 status-proxy-service]
                bad-config))))))
  (testing "Invalid proxy-target-url protocol throws error"
    (let [bad-config (assoc-in status-proxy-service-config
                               [:status-proxy :proxy-target-url]
                               "file:///C:/Windows/System32/Config")]
      (is (thrown-with-msg?
java.lang.IllegalArgumentException #"The proxy-target-url.*has an unsupported protocol 'file'" (with-test-logging (with-app-with-config proxy-app [jetty9-service/jetty9-service webrouting-service/webrouting-service status-proxy-service] bad-config))))))) trapperkeeper-status-0.7.1/test/puppetlabs/trapperkeeper/services/status/status_service_test.clj000066400000000000000000000624651303750077200337030ustar00rootroot00000000000000(ns puppetlabs.trapperkeeper.services.status.status-service-test (:require [clojure.test :refer :all] [cheshire.core :as json] [schema.test :as schema-test] [puppetlabs.http.client.sync :as http-client] [puppetlabs.trapperkeeper.services :refer [defservice service service-context]] [puppetlabs.trapperkeeper.app :refer [get-service] :as tka] [puppetlabs.trapperkeeper.testutils.bootstrap :refer :all] [puppetlabs.trapperkeeper.testutils.logging :refer [with-test-logging with-logger-event-maps]] [puppetlabs.trapperkeeper.services.status.status-service :refer [status-service get-status]] [puppetlabs.trapperkeeper.services.status.status-core :as status-core] [puppetlabs.trapperkeeper.services.webrouting.webrouting-service :as webrouting-service] [puppetlabs.trapperkeeper.services.webserver.jetty9-service :as jetty9-service] [puppetlabs.trapperkeeper.services.scheduler.scheduler-service :as scheduler-service] [puppetlabs.kitchensink.core :as ks])) (use-fixtures :once schema-test/validate-schemas) (def status-service-config {:webserver {:port 8180 :host "0.0.0.0"} :web-router-service {:puppetlabs.trapperkeeper.services.status.status-service/status-service "/status"}}) (defn parse-response ([resp] (parse-response resp false)) ([resp keywordize?] 
(json/parse-string (slurp (:body resp)) keywordize?))) (defn response->status [resp] (:status (parse-response resp true))) (defmacro with-status-service-with-config "Macro to start the status service and its dependencies (jetty9 and webrouting service), along with any other services desired, with the given config" [app services config & body] `(with-app-with-config ~app (concat [jetty9-service/jetty9-service webrouting-service/webrouting-service scheduler-service/scheduler-service status-service] ~services) ~config (do ~@body))) (defmacro with-status-service "Macro to start the status service and its dependencies (jetty9 and webrouting service), along with any other services desired. Provides a default tk config" [app services & body] `(with-status-service-with-config ~app ~services status-service-config ~@body)) (def alerts [{:severity :error :message "Alert! Alert"}]) (def decoded-alerts (json/decode (json/encode alerts))) (defservice foo-service [[:StatusService register-status]] (init [this context] (register-status "foo" "1.1.0" 1 (fn [level] {:status (str "foo status 1 " level) :state :running})) (register-status "foo" "1.1.0" 2 (fn [level] {:status (str "foo status 2 " level) :state :running :alerts alerts})) context)) (defservice bar-service [[:StatusService register-status]] (init [this context] (register-status "bar" "0.1.0" 1 (fn [level] {:status (str "bar status 1 " level) :state :running})) context)) (defservice baz-service [[:StatusService register-status]] (init [this context] (register-status "baz" "0.2.0" 1 (fn [level] "baz")) context)) (defservice fail-service [[:StatusService register-status]] (init [this context] (register-status "fail" "4.2.0" 1 (fn [level] {:status "wheee", :state :error})) context)) (defservice starting-service [[:StatusService register-status]] (init [this context] (register-status "starting" "6.6.6" 1 (fn [level] {:status "foo" :state :starting})))) (defservice stopping-service [[:StatusService register-status]] (init [this 
context] (register-status "stopping" "6.6.6" 1 (fn [level] {:status "bar" :state :stopping})))) (defservice slow-service [[:StatusService register-status]] (init [this context] (register-status "slow" "0.1.0" 1 (fn [level] (Thread/sleep 2000))))) (defservice broken-service [[:StatusService register-status]] (init [this context] (register-status "broken" "0.1.0" 1 (fn [level] (throw (Exception. "don't")))))) (deftest get-status-test (with-status-service app [foo-service jetty9-service/jetty9-service webrouting-service/webrouting-service status-service] (let [svc (get-service app :StatusService)] (is (= (get-status svc "foo" :critical nil) {:state :running :alerts alerts :status "foo status 2 :critical"}) "can get the status from the latest status fn") (is (= (get-status svc "foo" :critical 1) {:state :running :status "foo status 1 :critical"}) "can get the status from a specific status version") (is (= (get-status svc "foo" :info nil) {:state :running :alerts alerts :status "foo status 2 :info"}) "can select the status fn level")))) (deftest rollup-status-endpoint-test (with-status-service app [foo-service bar-service] (testing "returns latest status for all services" (let [resp (http-client/get "http://localhost:8180/status/v1/services") body (parse-response resp)] (is (= 200 (:status resp))) (is (= {"bar" {"service_version" "0.1.0" "service_status_version" 1 "state" "running" "detail_level" "info" "active_alerts" [] "status" "bar status 1 :info"} "foo" {"service_version" "1.1.0" "service_status_version" 2 "state" "running" "detail_level" "info" "active_alerts" decoded-alerts "status" "foo status 2 :info"}} (dissoc body "status-service"))))) (testing "uses status level from query param" (let [resp (http-client/get "http://localhost:8180/status/v1/services?level=debug") body (parse-response resp)] (is (= 200 (:status resp))) (is (= {"bar" {"service_version" "0.1.0" "service_status_version" 1 "state" "running" "detail_level" "debug" "active_alerts" [] "status" "bar 
status 1 :debug"} "foo" {"service_version" "1.1.0" "service_status_version" 2 "state" "running" "detail_level" "debug" "active_alerts" decoded-alerts "status" "foo status 2 :debug"}} (dissoc body "status-service")))))) (testing "uses timeout from query param" (with-test-logging (with-status-service app [foo-service slow-service] (testing "uses timeout from query param" (let [resp (http-client/get "http://localhost:8180/status/v1/services?timeout=1") body (parse-response resp)] (is (= 503 (:status resp))) (is (re-find #"timed out" (get-in body ["slow" "status"]))))))))) (deftest alternate-mount-point-test (testing "can mount status endpoint at alternate location" (with-app-with-config app [jetty9-service/jetty9-service webrouting-service/webrouting-service scheduler-service/scheduler-service status-service] (merge status-service-config {:web-router-service {:puppetlabs.trapperkeeper.services.status.status-service/status-service "/alternate-status"}}) (let [resp (http-client/get "http://localhost:8180/alternate-status/v1/services")] (is (= 200 (:status resp))))))) (deftest single-service-status-endpoint-test (with-status-service app [foo-service baz-service] (testing "returns service information for service that has registered a callback" (let [resp (http-client/get "http://localhost:8180/status/v1/services/foo")] (is (= 200 (:status resp))) (is (= {"service_version" "1.1.0" "service_status_version" 2 "state" "running" "detail_level" "info" "status" "foo status 2 :info" "active_alerts" decoded-alerts "service_name" "foo"} (parse-response resp))))) (testing "uses status level query param" (let [resp (http-client/get "http://localhost:8180/status/v1/services/foo?level=critical")] (is (= 200 (:status resp))) (is (= {"service_version" "1.1.0" "service_status_version" 2 "state" "running" "detail_level" "critical" "status" "foo status 2 :critical" "active_alerts" decoded-alerts "service_name" "foo"} (parse-response resp))))) (testing "uses service_status_version query 
param" (let [resp (http-client/get "http://localhost:8180/status/v1/services/foo?service_status_version=1")] (is (= 200 (:status resp))) (is (= {"service_version" "1.1.0" "service_status_version" 1 "state" "running" "detail_level" "info" "status" "foo status 1 :info" "active_alerts" [] "service_name" "foo"} (parse-response resp))))) (testing "returns unknown for state if not provided in callback fn" (let [resp (http-client/get "http://localhost:8180/status/v1/services/baz")] (is (= 503 (:status resp))) (is (= {"service_version" "0.2.0" "service_status_version" 1 "state" "unknown" "detail_level" "info" "status" "Status check response for baz malformed: (not (map? \"baz\"))" "active_alerts" [] "service_name" "baz"} (parse-response resp))))) (testing "returns a 404 for service not registered with the status service" (let [resp (http-client/get "http://localhost:8180/status/v1/services/notfound")] (is (= 404 (:status resp))) (is (= {"kind" "service-not-found" "msg" "No status information found for service notfound"} (parse-response resp))))))) (deftest status-code-test (with-status-service app [bar-service fail-service] (testing "returns 503 response code when a service is not running" (let [{:keys [status]} (http-client/get "http://localhost:8180/status/v1/services/fail")] (is (= 503 status))) (let [{:keys [status]} (http-client/get "http://localhost:8180/status/v1/services")] (is (= 503 status)))) (testing "returns a 200 response code for the service that is running" (let [{:keys [status]} (http-client/get "http://localhost:8180/status/v1/services/bar")] (is (= 200 status))))) (testing "returns 200 response code when all services are running" (with-status-service app [bar-service foo-service] (let [{:keys [status]} (http-client/get "http://localhost:8180/status/v1/services/bar")] (is (= 200 status))) (let [{:keys [status]} (http-client/get "http://localhost:8180/status/v1/services")] (is (= 200 status)))))) (deftest status-check-error-handling-test (with-test-logging 
(with-status-service app [slow-service broken-service baz-service] (testing "handles case when a status check times out" (let [resp (http-client/get (str "http://localhost:8180/status/v1/services/slow" "?level=critical&timeout=1")) body (parse-response resp)] (is (= 503 (:status resp))) (is (= "unknown" (get body "state"))) (is (re-find #"timed out" (get body "status"))))) (testing "handles case when a status check throws an exception" (let [resp (http-client/get "http://localhost:8180/status/v1/services/broken?level=critical") body (parse-response resp)] (is (= 503 (:status resp))) (is (= "unknown" (get body "state"))) (is (re-find #"exception.*don't" (get body "status"))))) (testing "handles case when a status check returns a non-conforming result" (let [resp (http-client/get "http://localhost:8180/status/v1/services/baz?level=critical") body (parse-response resp)] (is (= 503 (:status resp))) (is (= "unknown" (get body "state"))) (is (re-find #"malformed" (get body "status")))))))) (deftest error-handling-test (with-status-service app [foo-service] (with-test-logging (testing "returns a 400 when an invalid level is queried for" (let [resp (http-client/get "http://localhost:8180/status/v1/services?level=bar")] (is (= 400 (:status resp))) (is (= {"kind" "data-invalid" "msg" "Invalid level: :bar"} (parse-response resp))))) (testing "returns a 400 when a non-integer status-version is queried for" (let [resp (http-client/get (str "http://localhost:8180/status/v1/" "services/foo?service_status_version=abc"))] (is (= 400 (:status resp))) (is (= {"kind" "data-invalid" "msg" (str "Invalid service_status_version. 
" "Should be an integer but was abc")} (parse-response resp))))) (testing "returns a 400 when a non-existent status-version is queried for" (let [resp (http-client/get (str "http://localhost:8180/status/v1/" "services/foo?service_status_version=99"))] (is (= 400 (:status resp))) (is (= {"kind" "service-status-version-not-found" "msg" (str "No status function with version 99 " "found for service foo")} (parse-response resp))))) (testing "returns a 400 when a non-integer timeout is provided" (let [resp (http-client/get "http://localhost:8180/status/v1/services?timeout=three")] (is (= 400 (:status resp))) (is {"kind" "data-invalid" "msg" "Invalid timeout. Should be an integer but was three"}))) (testing "returns a 400 when zero is provided as the timeout" (let [resp (http-client/get "http://localhost:8180/status/v1/services?timeout=0")] (is (= 400 (:status resp))) (is {"kind" "data-invalid" "msg" "Invalid timeout. Timeout must be greater than zero but was 0"}))) (testing "returns a 400 when a negative timeout is provided" (let [resp (http-client/get "http://localhost:8180/status/v1/services?timeout=-3")] (is (= 400 (:status resp))) (is {"kind" "data-invalid" "msg" "Invalid timeout. 
Timeout must be greater than zero but was -3"})))))) (deftest simple-routes-params-ignoring-test (with-status-service app [foo-service] (testing "ignores bad level" (let [resp (http-client/get "http://localhost:8180/status/v1/simple?level=bar")] (is (= 200 (:status resp))) (is (= "running" (slurp (:body resp)))))) (testing "ignores alphabetic service_status_version" (let [resp (http-client/get (str "http://localhost:8180/status/v1/" "simple/foo?service_status_version=abc"))] (is (= 200 (:status resp))) (is (= "running" (slurp (:body resp)))))) (testing "ignores non-existent service_status_version" (let [resp (http-client/get (str "http://localhost:8180/status/v1/" "simple/foo?service_status_version=3"))] (is (= 200 (:status resp))) (is (= "running" (slurp (:body resp)))))))) (deftest simple-routes-test (testing "when calling the simple routes" (testing "for all services" (testing "and all services are :running" (with-status-service app [foo-service bar-service] (let [resp (http-client/get "http://localhost:8180/status/v1/simple")] (is (= 200 (:status resp))) (is (= "running" (slurp (:body resp))))))) (testing "and a service is :error" (with-status-service app [foo-service baz-service fail-service] (let [resp (http-client/get "http://localhost:8180/status/v1/simple")] (is (= 503 (:status resp))) (is (= "error" (slurp (:body resp))))))) (testing "and one service is :starting while another is :stopping" (with-status-service app [foo-service bar-service starting-service stopping-service] (let [resp (http-client/get "http://localhost:8180/status/v1/simple")] (is (= 503 (:status resp))) (is (= "stopping" (slurp (:body resp))))))) (testing "and a service is :unknown" (with-status-service app [foo-service baz-service starting-service stopping-service] (let [resp (http-client/get "http://localhost:8180/status/v1/simple")] (is (= 503 (:status resp))) (is (= "unknown" (slurp (:body resp)))))))) (testing "for a single service" (with-status-service app [foo-service baz-service 
fail-service starting-service] (testing "and it is :running" (let [resp (http-client/get "http://localhost:8180/status/v1/simple/foo")] (is (= 200 (:status resp))) (is (= "running" (slurp (:body resp)))))) (testing "and it is :unknown" (let [resp (http-client/get "http://localhost:8180/status/v1/simple/baz")] (is (= 503 (:status resp))) (is (= "unknown" (slurp (:body resp)))))) (testing "and it is :error" (let [resp (http-client/get "http://localhost:8180/status/v1/simple/fail")] (is (= 503 (:status resp))) (is (= "error" (slurp (:body resp)))))) (testing "and it is :starting" (let [resp (http-client/get "http://localhost:8180/status/v1/simple/starting")] (is (= 503 (:status resp))) (is (= "starting" (slurp (:body resp)))))) (testing "and it does not exist" (let [resp (http-client/get "http://localhost:8180/status/v1/simple/kafka")] (is (= 404 (:status resp))) (is (= "not found: kafka" (slurp (:body resp)))))))))) (deftest compare-levels-test (testing "use of compare-levels to implement a status function" (let [my-status (fn [level] (let [level>= (partial status-core/compare-levels >= level)] {:state :running :status (cond-> {:this-is-critical "foo"} (level>= :info) (assoc :bar "bar" :baz "baz") (level>= :debug) (assoc :x "x" :y "y" :z "y"))})) my-service (service [[:StatusService register-status]] (init [this context] (register-status "my-service" "1.0.0" 1 my-status)))] (with-status-service app [my-service] (testing "critical" (let [resp (http-client/get "http://localhost:8180/status/v1/services/my-service?level=critical")] (is (= 200 (:status resp))) (is (= {:this-is-critical "foo"} (response->status resp))))) (testing "info" (let [resp (http-client/get "http://localhost:8180/status/v1/services/my-service?level=info")] (is (= 200 (:status resp))) (is (= {:this-is-critical "foo" :bar "bar" :baz "baz"} (response->status resp))))) (testing "debug" (let [resp (http-client/get "http://localhost:8180/status/v1/services/my-service?level=debug")] (is (= 200 (:status 
resp))) (is (= {:this-is-critical "foo" :bar "bar" :baz "baz" :x "x" :y "y" :z "y"} (response->status resp))))))))) (deftest content-type-test (testing "responses have the 'application/json' content type set" (with-status-service app [foo-service] (let [{:keys [headers]} (http-client/get "http://localhost:8180/status/v1/services/foo")] (is (re-find #"^application/json" (get headers "content-type"))))))) (deftest status-status-test (testing "trapperkeeper-status registers its own status callback" (with-status-service app [] (let [resp (http-client/get "http://localhost:8180/status/v1/services")] (is (= 200 (:status resp))) (is (= #{:status-service} (ks/keyset (parse-response resp true))))) (let [resp (http-client/get "http://localhost:8180/status/v1/services/status-service?level=debug")] (is (= 200 (:status resp))) (let [body (parse-response resp true)] (is (= {:detail_level "debug" :service_name "status-service" :service_status_version 1 :service_version status-core/status-service-version :active_alerts [] :state "running"} (dissoc body :status))) (is (map? 
                 (get-in body [:status :experimental :jvm-metrics]))))))))

(deftest status-debug-logging-test
  (testing "status service logs debug data when setting is enabled"
    (with-test-logging
      (with-logger-event-maps "puppetlabs.trapperkeeper.services.status.status-debug-logging" event-maps
        (with-status-service-with-config app
          []
          (merge status-service-config
                 ; 30 milliseconds
                 {:status {:debug-logging {:interval-minutes 0.0005}}})
          (Thread/sleep 100)
          ; The only thing that's logged from that namespace should be the status data
          ; so any of the events will work
          (let [log-event (first @event-maps)
                status-json-string (:message log-event)
                status-data (json/parse-string status-json-string)]
            (is (= "running" (get status-data "state"))))))))

  (testing "status service does not log debug data when setting is not present"
    (with-test-logging
      (with-logger-event-maps "puppetlabs.trapperkeeper.services.status.status-debug-logging" event-maps
        (with-status-service app
          []
          ; Can't prove that with a longer sleep something wouldn't have been logged,
          ; so this sleep time is a bit arbitrary
          (Thread/sleep 100)
          (testing "no events have been logged"
            (is (empty?
                  @event-maps))))))))

(deftest cpu-metrics-test
  (testing "cpu usage metrics are tracked when setting is enabled"
    (with-status-service-with-config app
      []
      (merge status-service-config
             {:status {:cpu-metrics-interval-seconds 0.02}})
      (Thread/sleep 100)
      (let [s (tka/get-service app :StatusService)
            sc (service-context s)
            first-cpu-snapshot @(:last-cpu-snapshot sc)
            first-timers (:snapshot first-cpu-snapshot)]
        (is (< 0 (:uptime first-timers)))
        (is (< 0 (:process-cpu-time first-timers)))
        (is (< 0 (:process-gc-time first-timers)))
        (is (<= 0 (:cpu-usage first-cpu-snapshot)))
        (is (<= 0 (:gc-cpu-usage first-cpu-snapshot)))
        (Thread/sleep 100)
        (let [second-cpu-snapshot @(:last-cpu-snapshot sc)
              second-timers (:snapshot second-cpu-snapshot)]
          (is (< (:uptime first-timers) (:uptime second-timers)))
          (is (<= (:process-cpu-time first-timers) (:process-cpu-time second-timers)))
          (is (<= (:process-gc-time first-timers) (:process-gc-time second-timers)))
          ;; Check the second snapshot here; the first one was already
          ;; verified above.
          (is (<= 0 (:cpu-usage second-cpu-snapshot)))
          (is (<= 0 (:gc-cpu-usage second-cpu-snapshot)))
          (testing "CPU metrics are accessible via http"
            (let [resp (http-client/get "http://localhost:8180/status/v1/services/status-service?level=debug")]
              (is (= 200 (:status resp)))
              (let [body (parse-response resp true)
                    jvm-metrics (get-in body [:status :experimental :jvm-metrics])]
                (is (<= 0 (:cpu-usage jvm-metrics)))
                (is (<= 0 (:gc-cpu-usage jvm-metrics))))))))))
  (testing "cpu usage metrics are not updated when setting is disabled"
    (with-status-service-with-config app
      []
      (merge status-service-config
             {:status {:cpu-metrics-interval-seconds 0}})
      ;; TODO: this test doesn't really cover anything without a sleep that is
      ;; longer than the default interval, and I don't really want to sleep that
      ;; long in the test, so it's not really useful. Should try to think of
      ;; a better way to test the "disabled" case.
(Thread/sleep 100) (let [s (tka/get-service app :StatusService) sc (service-context s) last-cpu-snapshot @(:last-cpu-snapshot sc) timers (:snapshot last-cpu-snapshot)] (is (= -1 (:uptime timers))) (is (= -1 (:process-cpu-time timers))) (is (= -1 (:process-gc-time timers))) (is (= -1 (:cpu-usage last-cpu-snapshot))) (is (= -1 (:gc-cpu-usage last-cpu-snapshot)))))))