Nous sommes un restaurant situé à Paris. Ceci est notre menu.
Nous sommes un restaurant situé à Paris. Ceci est notre menu.
Cliquez ici pour aller à l'entrée du site.
```

#### Canonical URL Handling

For proper canonical URL handling on multilingual sites, we recommend using Polyglot's `{% I18n_Headers %}` tag for canonical URLs instead of jekyll-seo-tag's default canonical output. This provides intelligent canonical URL generation that:

- Points to the translated URL for pages with actual translations
- Points to the default language URL for fallback pages (pages without translations)
- Properly handles the `page_id` and permalink matching for translation detection

**Setup with jekyll-seo-tag:**

If you're using [jekyll-seo-tag](https://github.com/jekyll/jekyll-seo-tag), you can disable its canonical output and let Polyglot handle it:

```liquid
{% seo canonical=false %}
{% I18n_Headers %}
```

The `canonical=false` option is available in jekyll-seo-tag v2.9.0+.

**Fallback Canonical Behavior:**

To have fallback pages (pages without translations) point their canonical URL to the default language version, add to your `_config.yml`:

```yaml
fallback_canonical_to_default_lang: true
```

With this option enabled:

- Pages with actual translations: canonical points to the translated URL (e.g., `/es/sobre-nosotros/`)
- Fallback pages (no translation): canonical points to the default language URL (e.g., `/about/` instead of `/es/about/`)

This improves SEO by:

- Preventing search engines from indexing duplicate fallback content under multiple language URLs
- Consolidating SEO authority to the original content
- Signaling to search engines which version is the authoritative source

Note: `hreflang` URLs pointing to the default language or `x-default` are intentionally NOT relativized, as they should always point to the canonical language-specific URLs.
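
The canonical rules above can be sketched in plain Ruby. This is a minimal illustration, not Polyglot's own code: `canonical_path` and its keyword arguments are hypothetical names, and `translations` is assumed to be a hash mapping each language code to the permalink of an existing translation.

```ruby
# Hypothetical helper (not part of Polyglot's API): picks the canonical path
# for a page, given the language being built and the translations that exist.
def canonical_path(active_lang:, default_lang:, translations:, permalink:, fallback_to_default: true)
  translated = translations[active_lang]

  if active_lang == default_lang
    # Default-language pages are canonical at their own, unprefixed URL.
    translations[default_lang] || permalink
  elsif translated
    # A real translation is canonical at its language-prefixed URL.
    translated.start_with?("/#{active_lang}/") ? translated : "/#{active_lang}#{translated}"
  elsif fallback_to_default
    # Fallback page: point at the default-language original.
    translations[default_lang] || permalink
  else
    # Option disabled: the fallback page stays canonical under its own prefix.
    "/#{active_lang}#{permalink}"
  end
end

translated_page = { 'en' => '/about/', 'es' => '/sobre-nosotros/' }
fallback_page   = { 'en' => '/about/' }

puts canonical_path(active_lang: 'es', default_lang: 'en',
                    translations: translated_page, permalink: '/about/')
# prints /es/sobre-nosotros/
puts canonical_path(active_lang: 'es', default_lang: 'en',
                    translations: fallback_page, permalink: '/about/')
# prints /about/
puts canonical_path(active_lang: 'es', default_lang: 'en',
                    translations: fallback_page, permalink: '/about/',
                    fallback_to_default: false)
# prints /es/about/
```

The three calls correspond to a translated page, a fallback page with `fallback_canonical_to_default_lang: true`, and the same fallback page with the option left off.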
### Localizing Netlify _redirects

_New in 1.13.0_

When using Polyglot with [Netlify](https://www.netlify.com/), redirect rules defined in a [Netlify `_redirects` file](https://docs.netlify.com/manage/routing/redirects/overview/#syntax-for-the-_redirects-file) will get relativized (e.g., `/github` becomes `/fr/github` on French pages). However, the Netlify `_redirects` file only contains the redirect base paths, which causes 404 errors for localized URLs.

Polyglot can automatically generate language-prefixed versions of your redirects. Enable this feature in your `_config.yml`:

```yaml
localize_redirects: true
exclude_from_redirect_localization:
  - /signin
  - /app
```

With this configuration, a redirect like:

```
/github https://github.com/org/repo 302
```

will automatically generate localized versions for all your configured languages:

```
/github https://github.com/org/repo 302
/fr/github https://github.com/org/repo 302
/de/github https://github.com/org/repo 302
/sv/github https://github.com/org/repo 302
```

Paths listed in `exclude_from_redirect_localization` will not be localized, which is useful for authentication endpoints or app URLs that should only exist at the root level.

### Disabling Url Relativizing

_New in 1.4.0_

If you don't want an `href` attribute to be relativized (such as for making [a language switcher](https://github.com/untra/polyglot/blob/main/site/_includes/sidebar.html#L40)), you can use the block tag:

```html
{% static_href %}href="..."{% endstatic_href %}
```

```html
<a {% static_href %}href="..."{% endstatic_href %}>click this static link</a>
```

that will generate `<a href="...">click this static link</a>`, which is what you would normally use to create a URL unmangled by invisible language relativization.

Combine with an [HTML minifier](https://github.com/digitalsparky/jekyll-minifier) for a polished and production-ready website.

### Exclusive site language generation

_New in 1.4.0_

If you want to control which languages a document can be generated for, you can specify `lang-exclusive: [ ]` frontmatter.
If you include this frontmatter in your post, it will only generate for the specified site languages.

For example, the following frontmatter will only generate in the `en` and `fr` site language builds:

```
---
lang-exclusive: ['en', 'fr']
---
```

### Localized site.data

There are cases where you may want to have a list of `key: value` pairs of translated content. For example, instead of creating a complete separate file for each language containing the layout structure and localized content, you can create a single file with the layout that will be shared among pages, and then create a language-specific file with the localized content that will be used.

To do this, you can create a file like `_data/:lang/strings.yml`, one for each language, and Polyglot will bring those keys under `site.data[:lang].strings`. For example, suppose you have the following files:

`_data/en/strings.yml`

```yaml
hello: "Hello"
greetings:
  morning: "Good morning"
  evening: "Good evening"
```

`_data/pt-br/strings.yml`

```yaml
hello: "Olá"
greetings:
  morning: "Bom dia"
  evening: "Boa noite"
```

You can use the `site.data` to access the localized content in your layouts and pages:

```liquid
{{ site.data[site.active_lang].strings.hello }}, {{ site.data[site.active_lang].strings.greetings.morning }}
```

For more information on this matter, check out this [post](https://polyglot.untra.io/2024/02/29/localized-variables/).

### Localized collections

To localize collections, you first have to properly define the collection in your `_config.yml` file. For example, if you have a collection of `projects`, you can define it like this:

```yaml
collections:
  projects:
    output: true
    permalink: /:collection/:title/
```

Note that the [permalink](https://jekyllrb.com/docs/permalinks/#collections) definition here is important. Then, you can create a file for each language in the `_projects` directory, and Polyglot will bring those files under `site.projects`. For more information, check the related discussion #188.

## How It Works

This plugin makes modifications to existing Jekyll classes and modules, namely `Jekyll::StaticFile` and `Jekyll::Site`. These changes are as lightweight and slim as possible. The biggest change is in `Jekyll::Site.process`. Polyglot overwrites this method to instead spawn a separate process for each language you intend to process the site for. Each of those processes calls the original `Jekyll::Site.process` method with its language in mind, ensuring your website scales to support any number of languages, while building all of your site languages simultaneously.

`Jekyll::Site.process` is the entry point for the Jekyll build process. Take care that any other plugins you use do not also overwrite this method, or the builds may conflict.

### (:polyglot, :post_write) hook

_New in 1.8.0_

Polyglot issues a `:polyglot, :post_write` hook event once all languages have been built for the site. This hook runs exactly once, after all site languages have been processed:

```rb
Jekyll::Hooks.register :polyglot, :post_write do |site|
  # do something custom and cool here!
end
```

### Machine-aware site building

_New in 1.5.0_

Polyglot will only start builds after it confirms there is a CPU core ready to accept the build thread.
This ensures that Jekyll will build large sites efficiently, streamlining build processes instead of overloading machines with process thrash.

### Writing Tests and Debugging

_:wave: I need assistance with modern ruby best practices for test maintenance with rake and rspec. If you got the advice I have the ears._

Tests are run with `test.sh`. Tests are in the `/spec` directory, and test failure output detail can be examined in the `rspec.json` file. Code coverage details are in the `coverage` directory.

## Features

This plugin stands out from other I18n Jekyll plugins.

- automatically corrects your relative links, keeping your *french* visitors on your *french* website, even when content has to fall back to the `default_lang`.
- builds all versions of your website *simultaneously*, allowing big websites to scale efficiently.
- provides the liquid tag `{{ site.languages }}` to get an array of your I18n strings.
- provides the liquid tag `{{ site.default_lang }}` to get the default_lang I18n string.
- provides the liquid tag `{{ site.active_lang }}` to get the I18n language string the website was built for. Alternative names for `active_lang` can be configured via `config.lang_vars`.
- provides the liquid tag `{{ page.rendered_lang }}` to get the language the page content is actually rendered in (useful for detecting fallback pages).
- provides the liquid tag `{{ page.available_languages }}` to get the array of language codes a page has been translated into.
- provides the liquid tag `{{ page.missing_languages }}` to get the array of configured languages a page has not been translated into (empty when the page has no real translations and falls back identically everywhere).
- provides the liquid tag `{% I18n_Headers %}` to append SEO bonuses to your website.
- provides the liquid block tag `{% static_href %}href="/hello"{% endstatic_href %}` to make URLs that do not get influenced by URL correction regexes.
- provides `site.data` localization for efficient rich text replacement.
- a creator that will answer all of your questions and issues.

### Detecting Fallback Pages with `page.rendered_lang`

The `page.rendered_lang` variable indicates the actual language of a page's content. This is different from `site.active_lang`, which indicates the language version of the site currently being built.

- `site.active_lang`: The language the site is being built for (e.g., `es` for the Spanish site)
- `page.rendered_lang`: The language of the page's actual content (e.g., `en` if no Spanish translation exists)

When `page.rendered_lang != site.active_lang`, the page is a **fallback page** - it's being served in the default language because no translation exists.

**Example: Showing a "not translated" notice:**

```liquid
{% if page.rendered_lang != site.active_lang %}Content available in {{ page.rendered_lang }} only.
{% else %}Welcome to our {{ site.active_lang }} content!
{% endif %}
```

This is useful for:

- Displaying notices when content hasn't been translated
- Tracking translation coverage
- Applying different styling to fallback pages
- Building translation status dashboards

## SEO Recipes

Jekyll-polyglot has a few spectacular [Search Engine Optimization techniques](https://untra.github.io/polyglot/seo) to ensure your Jekyll blog gets the most out of its multilingual audience. Check them out!

### Sitemap generation

See the example [sitemap.xml](/site/sitemap.xml) and [robots.txt](/site/robots.txt) for how to automatically generate a multi-language sitemap for your page and turn it in for the SEO i18n credit.

The [official Sitemap protocol documentation](https://www.sitemaps.org/protocol.html#location) states:

> "The location of a Sitemap file determines the set of URLs that can be included in that Sitemap. A Sitemap file located at http://example.com/catalog/sitemap.xml can include any URLs starting with http://example.com/catalog/ but can not include URLs starting with http://example.com/images/."

> "It is strongly recommended that you place your Sitemap at the root directory of your web server."

To comply with this, `sitemap.xml` should be added to the `exclude_from_localization` list to ensure that only one `sitemap.xml` file exists in the root directory, rather than creating separate ones for each language.

## Compatibility

Currently supports Jekyll 3.0 and Jekyll 4.0.

* Windows users will need to disable parallel localization on their machines by setting `parallel_localization: false` in the `_config.yml`
* In Jekyll 4.0, SCSS source maps will generate improperly due to how Polyglot operates. The workaround is to disable the CSS sourcemaps. Adding the following to your `_config.yml` will disable sourcemap generation:

```yaml
sass:
  sourcemap: never
```

## Contributions

Please! I need all the support I can get! But for real I would appreciate any code contributions and support.
This started as an open-source side-project and has gotten bigger than I'd ever imagined! If you have something you'd like to contribute to jekyll-polyglot, please open a PR!

### Contributors

These are talented and considerate software developers across the world that have lent their support to this project.

**Thank You! ¡Gracias! Merci! Danke! 감사합니다! תודה רבה! Спасибо! Dankjewel! 谢谢！Obrigado!**

* [@yunseo-kim](https://github.com/yunseo-kim) [improved i18n sitemaps](https://polyglot.untra.io/2025/01/18/polyglot-1.9.0/)
* [@blackpill](https://github.com/blackpill) [1.8.1](https://polyglot.untra.io/2024/08/18/polyglot-1.8.1/)
* [@hacketiwack](https://github.com/hacketiwack) [1.8.1](https://polyglot.untra.io/2024/08/18/polyglot-1.8.1/)
* [@jerturowetz](https://github.com/jerturowetz) [sitemap generation](https://polyglot.untra.io/2024/03/17/polyglot-1.8.0/)
* [@antoniovazquezblanco](https://github.com/antoniovazquezblanco) [1.7.0](https://polyglot.untra.io/2023/10/29/polyglot-1.7.0/)
* [@salinatedcoffee](https://github.com/SalinatedCoffee) [ko support](https://polyglot.untra.io/ko/2023/02/27/korean-support/)
* [@aturret](https://github.com/aturret) [zh-CN support](https://polyglot.untra.io/zh-CN/2023/06/08/polyglot-1.6.0-chinese-support/)
* [@dougieh](https://github.com/dougieh) [1.5.1](https://polyglot.untra.io/2022/10/01/polyglot-1.5.1/)
* [@pandermusubi](https://github.com/PanderMusubi) [nl support](https://polyglot.untra.io/nl/2022/01/15/dutch-site-support/)
* [@obfusk](https://github.com/obfusk) [1.5.0](https://polyglot.untra.io/2021/07/17/polyglot-1.5.0/)
* [@eighthave](https://github.com/eighthave) [1.5.0](https://polyglot.untra.io/2021/07/17/polyglot-1.5.0/)
* [@george-gca](https://github.com/george-gca) [pt-BR support](https://polyglot.untra.io/pt-BR/2024/02/29/localized-variables.md)
* [@PanderMusubi](https://github.com/PanderMusubi) - 1.12 / jekyll-minimal-mistakes-polyglot demo
* [@GruberMarkus](https://github.com/GruberMarkus) - redirect anchor support
* [@rathboma](https://github.com/rathboma) - page.rendered_lang / sublanguage redirects
* [@manabu-nakamura](https://github.com/manabu-nakamura) - Japanese strings

### Other Websites Built with Polyglot

Feel free to open a PR to list your multilingual blog here:

* [**Polyglot project website**](https://polyglot.untra.io)
* [LogRhythm Corporate Website](https://logrhythm.com)
* [All Over Earth](https://allover.earth/)
* [Hanare Cafe in Toshijima, Japan](https://hanarecafe.com)
* [F-Droid](https://f-droid.org)
* [Ubuntu MATE](https://ubuntu-mate.org)
* [Leo3418 blog](https://leo3418.github.io/)
* [Gaphor](https://gaphor.org)
* [Yi Yunseok's personal blog website](https://Yi-Yunseok.GitHub.io)
* [Tarlogic Cybersecurity](https://www.tarlogic.com/)
* [A beautiful, simple, clean, and responsive Jekyll theme for academics](https://github.com/george-gca/multi-language-al-folio)
* [AnotherTurret just another study note blog](https://aturret.space/)
* [Diciotech is a collaborative online tech dictionary](https://diciotech.netlify.app/)
* [Yunseo Kim's Study Notes](https://www.yunseo.kim/)
* [Beekeeper Studio](https://www.beekeeperstudio.io/)

## 2.0 Roadmap

* [x] - **site language**: portuguese Brazil `pt-BR`
* [x] - **site language**: arabic `ar`
* [x] - **site language**: japanese `ja`
* [x] - **site language**: russian `ru`
* [x] - **site language**: dutch `nl`
* [x] - **site language**: korean `ko`
* [x] - **site language**: hebrew `he`
* [x] - **site language**: chinese China `zh-CN`
* [x] - **site language**: italian `it`
* [x] - **site language**: turkish `tr`
* [x] - **site language**: ukrainian `uk`
* [x] - **site language**: hindi `hi`
* [ ] - **site language**: chinese Taiwan `zh-TW`
* [ ] - **site language**: portuguese Portugal `pt-PT`
* [ ] - get whitelisted as an official github-pages jekyll plugin
* [x] - update CI provider

## Copyright

Copyright (c) Samuel Volin 2025.
License: MIT

untra-polyglot-e688c05/Rakefile

require "rspec/core/rake_task"

RSpec::Core::RakeTask.new(:spec)

task :default => [:spec]

desc 'Build the example site at ./site'
task :site_build do
  Dir.chdir('site') do
    sh 'bundle exec jekyll build'
  end
end

desc 'Run html-proofer on the built example site'
task :htmlproofer => [:site_build] do
  require 'html-proofer'
  HTMLProofer.check_directory(
    './site/_site',
    disable_external: true,
    allow_hash_href: true,
    root_dir: './site/_site',
    swap_urls: { %r{^https://polyglot\.untra\.io} => '' }
  ).run
end

untra-polyglot-e688c05/ai_docs/

untra-polyglot-e688c05/ai_docs/ARCHITECTURE.md

# Architecture and Design Patterns

## Plugin Integration Overview

jekyll-polyglot integrates with Jekyll through three main mechanisms: hooks, patches, and Liquid filters. These work together to add i18n capabilities without breaking existing Jekyll functionality.
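
The shape of those three mechanisms can be sketched with toy stand-ins. Everything below (`ToyHooks`, `ToySite`, `ToyHeadersTag`, `TOY_TAGS`) is hypothetical and exists only for illustration; none of it is Jekyll's or Polyglot's real API, and it runs without Jekyll installed.

```ruby
# Toy stand-ins, NOT Jekyll's real classes: they only mimic the shape of the
# three integration points (patch, hook, tag registration).
module ToyHooks
  @registry = Hash.new { |hash, key| hash[key] = [] }

  # Store a block to run when the given owner/event pair is triggered.
  def self.register(owner, event, &block)
    @registry[[owner, event]] << block
  end

  # Run every block registered for the owner/event pair, in order.
  def self.trigger(owner, event, *args)
    @registry[[owner, event]].each { |block| block.call(*args) }
  end
end

class ToySite
  attr_reader :config, :log

  def initialize(config)
    @config = config
    @log = []
  end
end

# 1. Patch: reopen the existing class and only ADD a method.
class ToySite
  def active_lang
    config.fetch('active_lang', config['default_lang'])
  end
end

# 2. Hook: run custom code at a named point in the build.
ToyHooks.register(:site, :post_read) do |site|
  site.log << "coordinating documents for #{site.active_lang}"
end

# 3. Tag registry: map a template tag name to the class that renders it.
class ToyHeadersTag
  def render(site)
    "<!-- headers for #{site.active_lang} -->"
  end
end
TOY_TAGS = { 'i18n_headers' => ToyHeadersTag }.freeze

site = ToySite.new('default_lang' => 'en', 'active_lang' => 'fr')
ToyHooks.trigger(:site, :post_read, site)
puts site.log.first
puts TOY_TAGS['i18n_headers'].new.render(site)
```

The patch only adds `active_lang` and leaves the original class body untouched, which is the constraint the Patches System section below places on real patches.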
## Hooks System

**Location**: `lib/jekyll/polyglot/hooks.rb` and `lib/jekyll/polyglot/hooks/`

The plugin uses Jekyll's hook system to intercept the build process:

- **`post_init` hook**: Runs after Jekyll initializes, sets up configuration and language mappings
- **Other hooks**: Handle document coordinate setup, output relocation, and build orchestration
- **Execution mode**: Hooks run serially or in parallel based on the `parallel_localization` config setting

## Patches System

**Location**: `lib/jekyll/polyglot/patches.rb`

Patches extend Jekyll core classes with i18n capabilities:

- **Target classes**: `Jekyll::Site`, `Jekyll::StaticFile`, and related classes
- **Methods added**: Language detection, document grouping, URL relativization
- **Design principle**: Only add methods; avoid overriding existing behavior without thorough testing
- **Application**: Patches are applied at require time

## Liquid Filters and Tags

**Location**: `lib/jekyll/polyglot/liquid.rb`

Custom Liquid filters and tags for rendering language-aware content in templates:

- **`static_href` tag**: Marks an `href` attribute so that it is not relativized on multi-language sites
- **`i18n_headers` tag**: Generates SEO hreflang links for alternate language versions

## Document Coordination Model

The core of polyglot's i18n support:

- Documents are grouped by **permalink** (or `page_id` in v1.7.0+)
- Each document declares its language via the `lang` frontmatter property
- The plugin "coordinates" documents with identical permalinks across languages
- **Alternative URLs per language**: Use `page_id` to identify translations when different URLs are needed per language (e.g., `/about` in English, `/acerca-de/` in Spanish)

## Configuration

Users configure polyglot in `_config.yml`:

```yaml
languages: ["en", "de", "fr"]        # Supported languages
default_lang: "en"                   # Fallback language
exclude_from_localization: [...]     # Paths to exclude
parallel_localization: true          # Enable parallel processing
url: https://example.com             # Site URL (required for URL relativization)
lang_from_path: true                 # (Optional) Extract lang from file path
permalink_lang: { page_id: "..." }   # (v1.8.0+) Permalink info for languages
```

## Key Design Decisions

1. **Minimal coupling**: The plugin extends Jekyll rather than replacing its core behavior
2. **Parallel-friendly**: Document coordination works in both serial and parallel modes
3. **Fallback support**: Missing content in a language falls back to the default language
4. **URL consistency**: Permalinks must be identical across language variants (unless using `page_id`)

untra-polyglot-e688c05/ai_docs/SECURITY.md

# Security and Best Practices

## Configuration and Secrets

- **Never hard-code secrets or API keys** in the plugin code
- Users configure the plugin exclusively via Jekyll's `_config.yml`
- Configuration values are passed through Jekyll's standard config system
- Do not commit sensitive data (tokens, API keys) to the repository

## Input Validation and Safety

### Document Language Codes

- Language codes come from user frontmatter
- Validate format follows [I18n language code conventions](https://developer.chrome.com/docs/extensions/reference/api/i18n#locales)
- Reject or sanitize invalid language identifiers

### URL Relativization

- The plugin handles URL transformation for multi-language sites
- Test thoroughly with special characters and unicode in permalinks
- Ensure relative URLs work correctly across all language variants

### Patch Safety

- Patches to Jekyll core classes should **only add methods**
- Avoid overriding existing behavior without thorough testing
- Document any behavior changes that affect Jekyll's standard operations

## Compatibility and Maintenance

### Jekyll Compatibility

- Keep compatibility with **Jekyll >= 4.0**
- Test against multiple
  Jekyll versions in CI if possible
- Document any version-specific behavior or workarounds

### Ruby Compatibility

- Support **Ruby 3.1.0+**
- Use syntax compatible with this version range
- Test code on the minimum supported version before release

### Dependency Management

- Keep dependencies minimal
- Prefer Jekyll's built-in functionality over external gems
- Review and update dependencies regularly for security patches

## Performance Considerations

### Parallel Localization

- The plugin can process languages in parallel when `parallel_localization: true`
- Use parallel processing for large sites with many languages
- Set to `false` for:
  - Windows hosts (known compatibility issues)
  - Sites that conflict with other Jekyll plugins
  - Development environments where serial processing is easier to debug
- Tests in `subprocess_spec.rb` verify correct parallel behavior

### N+1 Prevention

- When iterating over documents, group by language first to avoid redundant processing
- Use efficient document lookups (avoid searching the entire site for each language variant)

### Caching

- Leverage Jekyll's build cache where possible
- Avoid recalculating document coordinates on unchanged content
- Cache language mappings and configuration lookups

## Error Handling

- Log errors through Jekyll's logger (`Jekyll.logger`)
- Provide helpful, actionable error messages that guide users to fix their setup:
  - Include the problematic file or configuration value
  - Suggest solutions (e.g., "Missing `url` in `_config.yml`; URL relativization requires this")
- Handle missing translations gracefully by falling back to the default language
- Don't expose internal stack traces or implementation details to users

## SEO and URLs

- Polyglot provides SEO tools via Liquid filters
- Ensure `hreflang` alternate links are correctly generated for all language variants
- URLs must be **consistent and valid** across language versions for proper SEO
- Test URL generation with various permalink formats and
  language combinations

untra-polyglot-e688c05/ai_docs/TESTING.md

# Testing Guidelines

## Testing Philosophy

- **Mandatory**: All new features and bug fixes must include tests
- **Coverage target**: Aim for > 90% code coverage on new modules
- **Framework**: Use **RSpec** for unit and integration tests
- **Scope**: Tests should verify plugin hooks, patches, Liquid filters, and Jekyll integration

## Test Organization

Test structure mirrors the source code structure under `lib/jekyll/polyglot/`:

```
spec/
  jekyll/
    polyglot/
      hooks/
        coordinate_spec.rb    # Document coordination tests
        subprocess_spec.rb    # Parallel processing tests
      [other feature specs]
  fixture/
    [sample Jekyll sites for integration testing]
  spec_helper.rb              # RSpec configuration and setup
```

## Writing Tests

### Unit Tests

Test individual methods and filters in isolation:

- **Location**: Mirror the source file structure (e.g., tests for `lib/jekyll/polyglot/hooks.rb` go in `spec/jekyll/polyglot/hooks_spec.rb`)
- **Structure**: Use RSpec's `describe`, `context`, and `it` blocks
- **Example**: `spec/jekyll/polyglot/hooks/coordinate_spec.rb`

### Integration Tests

Test plugin interaction with Jekyll as a whole:

- **Setup**: Use fixture sites in `spec/fixture/` for full Jekyll builds
- **Verification**: Check output HTML structure, language routing, and URL generation
- **Edge cases**: Add new fixtures for complex scenarios (e.g., custom permalinks, different language combinations)

## Running Tests

```bash
# Full test suite with coverage
COVERAGE=true bundle exec rspec

# Run with verbose output
bundle exec rspec --format documentation

# Run a specific test file
bundle exec rspec spec/jekyll/polyglot/hooks/coordinate_spec.rb

# Run only tests matching a pattern
bundle exec rspec --pattern "*coordinate*"
```

## Code Coverage

- Coverage reports are generated in the `coverage/` directory by **SimpleCov**
- Coverage results are
  uploaded to **Codecov** in CI environments
- The `test.sh` script handles coverage report generation and automated CI uploads
- Check coverage reports before committing to ensure no regression

## Testing Parallel Localization

The `subprocess_spec.rb` tests verify parallel behavior. When testing parallel processing:

- Ensure the `parallel_localization: true` config is set
- Test on non-Windows systems (parallel mode has known issues on Windows)
- Verify that language variants process independently without race conditions

untra-polyglot-e688c05/codecov.yml

comment: false

coverage:
  status:
    project:
      default:
        target: 80%
        threshold: 1%
    patch:
      default:
        target: 80%
        threshold: 1%

ignore:
  - "spec/**/*"
  - "test/**/*"
  - "vendor/**/*"
  - "bin/**/*"
  - "lib/jekyll/polyglot/version.rb"
  - "lib/jekyll-polyglot.rb"

fixes:
  - "lib/::lib/"
  - "spec/::spec/"

untra-polyglot-e688c05/jekyll-polyglot.gemspec

Gem::Specification.new do |s|
  s.name = 'jekyll-polyglot'
  s.version = '1.13.0'
  s.summary = 'I18n plugin for Jekyll Blogs'
  s.description = 'Fast open source i18n plugin for Jekyll blogs.'
  s.authors = ['Samuel Volin']
  s.email = 'untra.sam@gmail.com'
  s.files = ['README.md', 'LICENSE'] + Dir['lib/**/*']
  s.homepage = 'https://polyglot.untra.io/'
  s.license = 'MIT'
  s.add_dependency('jekyll', '>= 3.0', '>= 4.0')
  s.required_ruby_version = '>= 3.1.0'
  s.required_rubygems_version = '>= 3.1.0'
  s.metadata['rubygems_mfa_required'] = 'true'
end

untra-polyglot-e688c05/lib/

untra-polyglot-e688c05/lib/jekyll-polyglot.rb

require "jekyll/polyglot"

untra-polyglot-e688c05/lib/jekyll/

untra-polyglot-e688c05/lib/jekyll/polyglot.rb

require 'jekyll'
require_relative 'polyglot/liquid'
require_relative 'polyglot/patches'
require_relative 'polyglot/hooks'
require_relative 'polyglot/version'

untra-polyglot-e688c05/lib/jekyll/polyglot/

untra-polyglot-e688c05/lib/jekyll/polyglot/hooks.rb

require_relative 'hooks/coordinate'
require_relative 'hooks/process'
require_relative 'hooks/redirects'

untra-polyglot-e688c05/lib/jekyll/polyglot/hooks/

untra-polyglot-e688c05/lib/jekyll/polyglot/hooks/assets-toggle.rb

# Jekyll::Hooks.register :site, :after_init do |site|
#   if site.config['assets']
#     if site.active_lang != site.default_lang
#       then jekyll.sprockets.asset_config['autowrite'] = false
#     end
#   end
# end

untra-polyglot-e688c05/lib/jekyll/polyglot/hooks/coordinate.rb

# hook to coordinate blog posts and pages into distinct urls,
# and remove duplicate multilanguage posts and pages
Jekyll::Hooks.register :site, :post_read do |site|
  hook_coordinate(site)
end

def hook_coordinate(site)
  # Copy the language specific data, by recursively merging it with the default data.
  # Favour active_lang first, then default_lang, then any non-language-specific data.
  # See: https://www.ruby-forum.com/topic/142809
  merger = proc { |_key, v1, v2| v1.is_a?(Hash) && v2.is_a?(Hash) ? v1.merge(v2, &merger) : v2 }
  if site.data.include?(site.default_lang)
    site.data = site.data.merge(site.data[site.default_lang], &merger)
  end
  if site.data.include?(site.active_lang)
    site.data = site.data.merge(site.data[site.active_lang], &merger)
  end

  site.collections.each_value do |collection|
    collection.docs = site.coordinate_documents(collection.docs)
  end
  site.pages = site.coordinate_documents(site.pages)
end

untra-polyglot-e688c05/lib/jekyll/polyglot/hooks/process.rb

# hook to make a call to process rendered documents,
Jekyll::Hooks.register :site, :post_render do |site|
  hook_process(site)
end

def hook_process(site)
  site.collections.each_value do |collection|
    site.process_documents(collection.docs)
  end
  site.process_documents(site.pages)
end

untra-polyglot-e688c05/lib/jekyll/polyglot/hooks/redirects.rb

# frozen_string_literal: true

# Hook to localize Netlify _redirects file for multilingual sites.
# When enabled, generates language-prefixed versions of each redirect.
# # Configuration: # localize_redirects: true # Enable the feature # exclude_from_redirect_localization: # Optional: paths to skip # - /signin # - /app # # Example: # Input: /github https://github.com/org/repo 302 # Output: /github https://github.com/org/repo 302 # /es/github https://github.com/org/repo 302 # /de/github https://github.com/org/repo 302 # ... Jekyll::Hooks.register :polyglot, :post_write do |site| hook_redirects(site) end def hook_redirects(site) return unless site.config.fetch('localize_redirects', false) redirects_path = File.join(site.source, '_redirects') return unless File.exist?(redirects_path) exclusions = site.config.fetch('exclude_from_redirect_localization', []) lines = File.readlines(redirects_path) localized_lines = [] lines.each do |line| # Always include the original line localized_lines << line # Skip comments and empty lines stripped = line.strip next if stripped.empty? || stripped.start_with?('#') # Parse the redirect line: /source /target [status_code] parts = stripped.split(/\s+/) next if parts.length < 2 source = parts[0] # Skip if source is in exclusion list next if exclusions.include?(source) # Only process paths that start with / next unless source.start_with?('/') # Skip if source already has a language prefix next if site.languages.any? { |lang| source.start_with?("/#{lang}/") || source == "/#{lang}" } # Add localized versions for non-default languages site.languages.each do |lang| next if lang == site.default_lang localized_source = "/#{lang}#{source}" destination = parts[1] # Localize destination if it's an internal path (starts with /) # but not if it's an external URL (contains ://) localized_destination = if destination.start_with?('/') && !destination.include?('://') "/#{lang}#{destination}" else destination end rest = parts.length > 2 ? 
" #{parts[2..].join(' ')}" : '' localized_lines << "#{localized_source} #{localized_destination}#{rest}\n" end end # Write to destination dest_path = File.join(site.dest, '_redirects') File.write(dest_path, localized_lines.join) end untra-polyglot-e688c05/lib/jekyll/polyglot/liquid.rb 0000664 0000000 0000000 00000000247 15201735340 0022606 0 ustar 00root root 0000000 0000000 module Jekyll module Polyglot module Liquid require_relative 'liquid/tags/i18n_headers' require_relative 'liquid/tags/static_href' end end end untra-polyglot-e688c05/lib/jekyll/polyglot/liquid/ 0000775 0000000 0000000 00000000000 15201735340 0022256 5 ustar 00root root 0000000 0000000 untra-polyglot-e688c05/lib/jekyll/polyglot/liquid/tags/ 0000775 0000000 0000000 00000000000 15201735340 0023214 5 ustar 00root root 0000000 0000000 untra-polyglot-e688c05/lib/jekyll/polyglot/liquid/tags/i18n_headers.rb 0000664 0000000 0000000 00000007763 15201735340 0026030 0 ustar 00root root 0000000 0000000 module Jekyll module Polyglot module Liquid class I18nHeadersTag < ::Liquid::Tag def initialize(tag_name, text, tokens) super @url = text @url.strip! @url.chomp! '/' end def render(context) site = context.registers[:site] page = context.registers[:page] permalink = normalize_permalink(page['permalink'] || page['url'] || '') normalized_permalink = strip_lang_prefix(permalink, site.active_lang) permalink_lang = page['permalink_lang'] site_url = resolve_site_url(site) lang_to_permalink = build_lang_to_permalink(site, page['page_id'], normalized_permalink) canonical_tag(site, site_url, lang_to_permalink, permalink_lang, normalized_permalink) + hreflang_tags(site, site_url, lang_to_permalink, permalink_lang, normalized_permalink) end private def normalize_permalink(permalink) permalink.start_with?('/') ? permalink : "/#{permalink}" end def strip_lang_prefix(permalink, active_lang) stripped = permalink.delete_prefix("/#{active_lang}/") stripped.start_with?('/') ? 
            stripped : "/#{stripped}"
        end

        def resolve_site_url(site)
          return @url unless @url.empty?

          baseurl = site.config['baseurl'] || ''
          site.config['url'] + baseurl
        end

        def build_lang_to_permalink(site, page_id, normalized_permalink)
          site.find_translations(page_id, normalized_permalink)
        end

        def lookup_permalink(lang_to_permalink, permalink_lang, lang)
          lang_to_permalink[lang] || (permalink_lang && permalink_lang[lang])
        end

        def with_lang_prefix(permalink, lang)
          permalink.start_with?("/#{lang}/") ? permalink : "/#{lang}#{permalink}"
        end

        def canonical_tag(site, site_url, lang_to_permalink, permalink_lang, normalized_permalink)
          current_lang = site.active_lang
          has_translation = lookup_permalink(lang_to_permalink, permalink_lang, current_lang)
          use_default = site.fallback_canonical_to_default_lang && !has_translation && current_lang != site.default_lang

          canonical = if use_default
                        normalize_permalink(lookup_permalink(lang_to_permalink, permalink_lang, site.default_lang) || normalized_permalink)
                      elsif current_lang == site.default_lang
                        normalize_permalink(lookup_permalink(lang_to_permalink, permalink_lang, current_lang) || normalized_permalink)
                      else
                        current = normalize_permalink(lookup_permalink(lang_to_permalink, permalink_lang, current_lang) || normalized_permalink)
                        with_lang_prefix(current, current_lang)
                      end
          "\n"
        end

        def hreflang_tags(site, site_url, lang_to_permalink, permalink_lang, normalized_permalink)
          default_permalink = normalize_permalink(lookup_permalink(lang_to_permalink, permalink_lang, site.default_lang) || normalized_permalink)

          site.languages.map do |lang|
            has_translation = lookup_permalink(lang_to_permalink, permalink_lang, lang)
            next nil if !has_translation && lang != site.default_lang

            alt = normalize_permalink(lookup_permalink(lang_to_permalink, permalink_lang, lang) || normalized_permalink)
            if lang == site.default_lang
              "\n" \
                "\n"
            else
              "\n"
            end
          end.compact.join
        end
      end
    end
  end
end

Liquid::Template.register_tag('I18n_Headers',
                              Jekyll::Polyglot::Liquid::I18nHeadersTag)
Liquid::Template.register_tag('i18n_headers', Jekyll::Polyglot::Liquid::I18nHeadersTag)

untra-polyglot-e688c05/lib/jekyll/polyglot/liquid/tags/static_href.rb

module Jekyll
  module Polyglot
    module Liquid
      class StaticHrefTag < ::Liquid::Block
        def render(context)
          text = super
          href_attrs = text.strip.split('=', 2)
          valid = (href_attrs.length == 2 && href_attrs[0] == 'href') && href_attrs[1].start_with?('"') && href_attrs[1].end_with?('"')
          unless valid
            # Use the top-level ::Liquid namespace: a bare `Liquid` here resolves
            # to Jekyll::Polyglot::Liquid, which does not define SyntaxError
            raise ::Liquid::SyntaxError, "static_href parameters must include match href=\"...\" attribute param, eg. href=\"http://example.com\", href=\"/about\", href=\"/\" , instead got:\n#{text}"
          end

          href_value = href_attrs[1]
          # href writes out as ferh="..." explicitly wrong, to be caught by separate processor for nonrelativized links
          "ferh=#{href_value}"
        end
      end
    end
  end
end

Liquid::Template.register_tag('Static_Href', Jekyll::Polyglot::Liquid::StaticHrefTag)
Liquid::Template.register_tag('static_href', Jekyll::Polyglot::Liquid::StaticHrefTag)

untra-polyglot-e688c05/lib/jekyll/polyglot/patches.rb

require_relative 'patches/jekyll/site'
require_relative 'patches/jekyll/static_file'

untra-polyglot-e688c05/lib/jekyll/polyglot/patches/jekyll/site.rb

require 'etc'
include Process

module Jekyll
  class Site
    attr_reader :default_lang, :languages, :exclude_from_localization, :lang_vars, :lang_from_path, :fallback_canonical_to_default_lang
    attr_accessor :file_langs, :active_lang

    def prepare
      @file_langs = {}
      fetch_languages
      @parallel_localization = config.fetch('parallel_localization', true)
      @lang_from_path = config.fetch('lang_from_path', false)
      @fallback_canonical_to_default_lang = config.fetch('fallback_canonical_to_default_lang', false)
      @exclude_from_localization = config.fetch('exclude_from_localization', []).map do |e|
        if File.directory?(e) && e[-1] != '/'
          "#{e}/"
        else
          e
        end
      end
    end

    def fetch_languages
      @default_lang = config.fetch('default_lang', 'en')
      @languages = config.fetch('languages', ['en']).uniq
      @keep_files += (@languages - [@default_lang])
      @active_lang = @default_lang
      @lang_vars = config.fetch('lang_vars', [])
    end

    alias process_orig process
    def process
      prepare
      all_langs = ([@default_lang] + @languages).uniq
      if @parallel_localization
        nproc = Etc.nprocessors
        pids = {}
        begin
          all_langs.each do |lang|
            pids[lang] = fork do
              process_language lang
            end
            while pids.length >= (lang == all_langs[-1] ? 1 : nproc)
              sleep 0.1
              pids.map do |pid_lang, pid|
                next unless waitpid pid, Process::WNOHANG

                pids.delete pid_lang
                raise "Polyglot subprocess #{pid} (#{pid_lang}) failed (#{$?.exitstatus})" unless $?.success?
              end
            end
          end
        rescue Interrupt
          all_langs.each do |lang|
            next unless pids.key? \
              lang

            puts "Killing #{pids[lang]} : #{lang}"
            kill('INT', pids[lang])
          end
        end
      else
        all_langs.each do |lang|
          process_language lang
        end
      end
      Jekyll::Hooks.trigger :polyglot, :post_write, self
    end

    alias site_payload_orig site_payload
    def site_payload
      payload = site_payload_orig
      payload['site']['default_lang'] = default_lang
      payload['site']['languages'] = languages
      payload['site']['active_lang'] = active_lang
      lang_vars.each do |v|
        payload['site'][v] = active_lang
      end
      payload
    end

    def process_language(lang)
      @active_lang = lang
      config['active_lang'] = @active_lang
      lang_vars.each do |v|
        config[v] = @active_lang
      end
      if @active_lang == @default_lang then process_default_language else process_active_language end
    end

    def process_default_language
      old_include = @include
      process_orig
      @include = old_include
    end

    def process_active_language
      old_dest = @dest
      old_exclude = @exclude
      @file_langs = {}
      @dest = "#{@dest}/#{@active_lang}"
      @exclude += @exclude_from_localization
      process_orig
      @dest = old_dest
      @exclude = old_exclude
    end

    def split_on_multiple_delimiters(string)
      delimiters = ['.', '/']
      regex = Regexp.union(delimiters)
      string.split(regex)
    end

    # Convert glob pattern to regex pattern
    # * becomes .* (any run of characters; as written this also matches /)
    # ? becomes .  (any single character; as written this also matches /)
    def glob_to_regex(pattern)
      # Escape special regex characters first
      escaped = Regexp.escape(pattern)
      # Convert glob patterns to regex patterns
      escaped.gsub("\\*", '.*').gsub("\\?", '.')
    end

    def derive_lang_from_path(doc)
      unless @lang_from_path
        return nil
      end

      segments = split_on_multiple_delimiters(doc.path)
      segments.each do |segment|
        match = @languages.find { |lang| lang.downcase == segment.downcase }
        return match if match
      end

      nil
    end

    # assigns natural permalinks to documents and prioritizes documents with
    # active_lang languages over others. If lang is not set in front matter,
    # then this tries to derive from the path, if the lang_from_path is set.
    # otherwise it will assign the document to the default_lang
    def coordinate_documents(docs)
      regex = document_url_regex
      approved = {}

      # Build set of valid languages (default + configured)
      valid_languages = ([@default_lang] + @languages).uniq

      docs.each do |doc|
        # Get the explicitly declared language (frontmatter or path-derived)
        explicit_lang = doc.data['lang'] || derive_lang_from_path(doc)
        lang = explicit_lang || @default_lang

        # FILTER: Skip documents whose explicit lang is not in configured languages.
        # Check the explicit value (not the fallback) so that documents with an
        # unconfigured lang like 'de' are excluded even if normalization would
        # map them to default_lang. Compare case-insensitively so case-mismatched
        # frontmatter (e.g. 'pt-br' vs configured 'pt-BR') is normalized below
        # rather than rejected here.
        if explicit_lang && valid_languages.none? { |l| l.downcase == explicit_lang.downcase }
          Jekyll.logger.warn "Polyglot:", "Skipping #{doc.relative_path} - lang '#{explicit_lang}' not in configured languages #{valid_languages.inspect}"
          next
        end

        # If the doc lang matches a config language case-insensitively, use the config case
        config_lang = @languages.find { |l| l.downcase == lang.downcase }
        lang = config_lang if config_lang
        doc.data['lang'] = lang if doc.data['lang'] && config_lang

        lang_exclusive = doc.data['lang-exclusive'] || []
        url = doc.url.gsub(regex, '/')
        page_id = doc.data['page_id'] || url
        doc.data['permalink'] = url if doc.data['permalink'].to_s.empty? && !doc.data['lang'].to_s.empty?

        # Set rendered_lang to indicate what language this page is actually rendered in
        # This allows templates to detect fallback pages (rendered_lang != active_lang)
        doc.data['rendered_lang'] = lang

        # skip entirely if nothing to check
        next if @file_langs.nil?
        # skip this document if it has already been processed
        next if @file_langs[page_id] == @active_lang
        # skip this document if it has a fallback and it isn't assigned to the active language
        next if @file_langs[page_id] == @default_lang && lang != @active_lang
        # skip this document if it has lang-exclusive defined and the active_lang is not included
        next if !lang_exclusive.empty? && !lang_exclusive.include?(@active_lang)

        approved[page_id] = doc
        @file_langs[page_id] = lang
      end
      approved.each_value do |doc|
        assignPageRedirects(doc, docs)
        assignPageLanguagePermalinks(doc, docs)
      end
      approved.values
    end

    def assignPageRedirects(doc, docs)
      # Preserve and normalize user-defined redirect_from
      user_redirects = doc.data['redirect_from'] || []
      user_redirects = [user_redirects] unless user_redirects.is_a?(Array)

      # Determine document language
      doc_lang = doc.data['lang'] || derive_lang_from_path(doc) || @default_lang

      # Scope user-defined redirects to document's language if non-default
      if doc_lang != @default_lang && !user_redirects.empty?
        user_redirects = user_redirects.map do |redirect_path|
          # Normalize path to start with /
          redirect_path = "/#{redirect_path}" unless redirect_path.start_with?('/')
          # Only prefix if not already prefixed with this language
          if redirect_path.start_with?("/#{doc_lang}/")
            redirect_path
          else
            "/#{doc_lang}#{redirect_path}"
          end
        end
      end

      # Compute page_id based redirects (cross-language)
      computed_redirects = []
      pageId = doc.data['page_id']
      if !pageId.nil? && !pageId.empty?
        docs_with_same_id = docs.select { |dd| dd.data['page_id'] == pageId }
        docs_with_same_id.each do |dd|
          if dd.data['permalink'] != doc.data['permalink']
            computed_redirects << dd.data['permalink']
          end
        end
      end

      # Merge user-defined and computed redirects, removing duplicates
      all_redirects = (user_redirects + computed_redirects).uniq
      doc.data['redirect_from'] = all_redirects unless all_redirects.empty?
    end

    def assignPageLanguagePermalinks(doc, docs)
      page_id = doc.data['page_id']
      normalized_permalink = normalized_permalink_for_doc(doc)
      translations = find_translations(page_id, normalized_permalink, docs)
      doc.data['permalink_lang'] = translations

      configured = ([@default_lang] + @languages).uniq
      doc.data['available_languages'] = translations.keys
      # missing_languages signals "a visitor in this lang would see different
      # content than another lang's visitor". A single-source page falls back
      # identically everywhere, so nothing is missing in that case.
      doc.data['missing_languages'] = translations.size > 1 ? (configured - translations.keys) : []
    end

    # Returns a hash of { lang => permalink } for all docs that are translations
    # of the given page. Matches by page_id when present, otherwise by normalized
    # permalink. Filters out languages not in the configured languages list.
    # candidate_docs defaults to site.collections + site.pages so the helper can
    # be called from Liquid render contexts where the caller doesn't already
    # hold a docs array.
    def find_translations(page_id, normalized_permalink, candidate_docs = nil)
      candidate_docs ||= collections.values.flat_map(&:docs) + pages
      valid_languages = ([@default_lang] + @languages).uniq

      matching = if !page_id.to_s.empty?
                   candidate_docs.select { |d| d.data['page_id'] == page_id }
                 elsif !normalized_permalink.to_s.empty?
                   candidate_docs.select { |d| normalized_permalink_for_doc(d) == normalized_permalink }
                 else
                   []
                 end

      matching.each_with_object({}) do |d, h|
        explicit_lang = d.data['lang'] || derive_lang_from_path(d)
        doclang = explicit_lang || @default_lang
        next if explicit_lang && !valid_languages.include?(explicit_lang)

        h[doclang] = d.data['permalink']
      end
    end

    # Returns the doc's permalink with its own language prefix stripped, so it
    # can be matched against sibling docs that share the same un-prefixed
    # permalink. Returns nil when no usable permalink is present.
    def normalized_permalink_for_doc(doc)
      permalink = doc.data['permalink'] || (doc.respond_to?(:url) ? doc.url : nil)
      return nil if permalink.to_s.empty?

      permalink = "/#{permalink}" unless permalink.start_with?('/')
      lang = doc.data['lang']
      return permalink if lang.to_s.empty?

      stripped = permalink.delete_prefix("/#{lang}/")
      stripped.start_with?('/') ? stripped : "/#{stripped}"
    end

    # performs any necessary operations on the documents before rendering them
    def process_documents(docs)
      # return if @active_lang == @default_lang

      url = config.fetch('url', false)
      rel_regex = relative_url_regex(false)
      abs_regex = absolute_url_regex(url, false)
      non_rel_regex = relative_url_regex(true)
      non_abs_regex = absolute_url_regex(url, true)
      docs.each do |doc|
        unless @active_lang == @default_lang then relativize_urls(doc, rel_regex) end
        correct_nonrelativized_urls(doc, non_rel_regex)
        if url
          unless @active_lang == @default_lang then relativize_absolute_urls(doc, abs_regex, url) end
          correct_nonrelativized_absolute_urls(doc, non_abs_regex, url)
        end
      end
    end

    # a regex that matches urls or permalinks with i18n prefixes or suffixes
    # matches /en/foo , .en/foo , foo.en/ and other similar default urls
    # made by jekyll when parsing documents without explicitly set permalinks
    def document_url_regex
      regex = ''
      (@languages || []).each do |lang|
        regex += "([/.]#{lang}[/.])|"
      end
      regex.chomp! '|'
      /#{regex}/
    end

    # a regex that matches relative urls in a html document
    # matches href="baseurl/foo/bar-baz" href="/foo/bar-baz" and others like it
    # avoids matching excluded files. prepare makes sure
    # that all @exclude dirs have a trailing slash.
    def relative_url_regex(disabled = false)
      regex = ''
      unless disabled
        @exclude.each do |x|
          escaped_x = glob_to_regex(x)
          regex += "(?!#{escaped_x})"
        end
        @languages.each do |x|
          escaped_x = Regexp.escape(x)
          regex += "(?!#{escaped_x}/)"
        end
      end
      start = disabled ?
              'ferh' : 'href'
      %r{#{start}="?#{@baseurl}/((?:#{regex}[^,'"\s/?.]+\.?)*(?:/[^\]\[)("'\s]*)?)"}
    end

    # a regex that matches absolute urls in a html document
    # matches href="http://baseurl/foo/bar-baz" and others like it
    # avoids matching excluded files. prepare makes sure
    # that all @exclude dirs have a trailing slash.
    def absolute_url_regex(url, disabled = false)
      regex = ''
      unless disabled
        @exclude.each do |x|
          escaped_x = glob_to_regex(x)
          regex += "(?!#{escaped_x})"
        end
        @languages.each do |x|
          escaped_x = Regexp.escape(x)
          regex += "(?!#{escaped_x}/)"
        end
      end
      start = disabled ? 'ferh' : 'href'
      # Build negative lookbehind to exclude hreflang URLs from relativization
      # hreflang tags for default language and x-default should not be relativized
      neglookbehind = disabled ? "" : "(?