` elements that are also parents:

```py3
custom = {
    ":--parent": ":has(> *|*)",
    ":--parent-paragraph": "p:--parent"
}

print(sv.select(':--parent-paragraph', soup, custom=custom))
```

The above code will yield the only paragraph that is a parent:

```
[

` and then `#!html `. Soup Sieve does not, and frankly cannot, honor Beautiful Soup's old ordering convention due to the way it is designed. Soup Sieve returns the nodes in the order they are defined in the document, as that is how the elements are searched. This is much more efficient and provides better performance. So, given the earlier selector pattern of `article, body`, Soup Sieve would return the element `#!html ` and then `#!html

` as that is how it is ordered in the HTML document.

# Frequently Asked Questions

## Why do selectors not work the same in Beautiful Soup 4.7+?

Soup Sieve is the official CSS selector library in Beautiful Soup 4.7+, and with this change, Soup Sieve introduces a number of changes that break some of the expected behaviors that existed in versions prior to 4.7.

In short, Soup Sieve follows the CSS specifications fairly closely, and this broke a number of non-standard behaviors. These non-standard behaviors were not allowed according to the CSS specifications. Soup Sieve has no intention of bringing back these behaviors.

For more details on specific changes, and the reasoning why a specific change is considered a good change, or simply a feature that Soup Sieve cannot/will not support, see [Beautiful Soup Differences](./differences.md).

## How does `iframe` handling work?

In web browsers, CSS selectors do not usually select content inside an `iframe` element if the selector is called on an element outside of the `iframe`. Each HTML document is usually encapsulated and CSS selector leakage across this `iframe` boundary is usually prevented.

In its current iteration, Soup Sieve is not aware of the origin of the documents in the `iframe`, and Soup Sieve will not prevent selectors from crossing these boundaries. Soup Sieve is not used to style documents, but to scrape documents. For this reason, it seems to be more helpful to allow selector combinators to cross these boundaries.

Soup Sieve isn't entirely unaware of `iframe` elements though. In Soup Sieve 1.9.1, it was noticed that some pseudo-classes behaved in unexpected ways without awareness of `iframes`; this was fixed in 1.9.1.
Pseudo-classes such as [`:default`](./selectors/pseudo-classes.md#:default), [`:indeterminate`](./selectors/pseudo-classes.md#:indeterminate), [`:dir()`](./selectors/pseudo-classes.md#:dir), [`:lang()`](./selectors/pseudo-classes.md#:lang), [`:root`](./selectors/pseudo-classes.md#:root), and [`:contains()`](./selectors/pseudo-classes.md#:contains) were given awareness of `iframes` to ensure they behaved properly and returned the expected elements. This doesn't mean that `select` won't return elements in `iframes`, but it won't allow something like `:default` to select a `button` in an `iframe` whose parent `form` is outside the `iframe`. Or better put, a default `button` will be evaluated in the context of the document it is in.

With all of this said, if your selectors have issues with `iframes`, it is most likely because `iframes` are handled differently by different parsers. `html.parser` will usually parse `iframe` elements as it sees them. The `lxml` parser will often remove the `html` and `body` tags of an `iframe` HTML document. `lxml-xml` will simply ignore the content in an XHTML document. And `html5lib` will HTML escape the content of an `iframe`, making traversal impossible.

In short, Soup Sieve will return elements from all documents, even `iframes`. But certain pseudo-classes may take into consideration the context of the document they are in. But even with all of this, a parser's handling of `iframes` may make handling its content difficult if it doesn't parse it as HTML elements, or augments its structure.

# Quick Start

## Overview

Soup Sieve is a CSS selector library designed to be used with [Beautiful Soup 4][bs4]. It aims to provide selecting, matching, and filtering using modern CSS selectors. Soup Sieve currently provides selectors from the CSS level 1 specifications up through the latest CSS level 4 drafts and beyond (though some are not yet implemented).
Soup Sieve was written with the intent to replace Beautiful Soup's builtin select feature, and as of Beautiful Soup version 4.7.0, it now is :confetti_ball:. Soup Sieve can also be imported in order to use its API directly for more controlled, specialized parsing.

Soup Sieve has implemented most of the CSS selectors up through the latest CSS draft specifications, though there are a number that don't make sense in a non-browser environment. Selectors that cannot provide meaningful functionality simply do not match anything. Some of the supported selectors are:

- `#!css .classes`
- `#!css #ids`
- `#!css [attributes=value]`
- `#!css parent child`
- `#!css parent > child`
- `#!css sibling ~ sibling`
- `#!css sibling + sibling`
- `#!css :not(element.class, element2.class)`
- `#!css :is(element.class, element2.class)`
- `#!css parent:has(> child)`
- and [many more](./selectors/index.md)

## Installation

You must have Beautiful Soup already installed:

```
pip install beautifulsoup4
```

In most cases, assuming you've installed version 4.7.0 or later, that should be all you need to do, but if you've installed via some alternative method, and Soup Sieve is not automatically installed, you can install it directly:

```
pip install soupsieve
```

If you want to manually install it from source, first ensure that [`build`][build] is installed:

```
pip install build
```

Then navigate to the root of the project and build the wheel and install (replacing `<ver>` with the current version):

```
python -m build -w
pip install dist/soupsieve-<ver>-py3-none-any.whl
```

## Usage

To use Soup Sieve, you must create a `BeautifulSoup` object:

```pycon3
>>> import bs4
>>> text = """
...

... ...

Cat

...

Dog

...

Mouse

...

... """
>>> soup = bs4.BeautifulSoup(text, 'html5lib')
```

For most people, using the Beautiful Soup 4.7.0+ API may be more than sufficient. Beautiful Soup offers two methods that employ Soup Sieve: `select` and `select_one`. Beautiful Soup's select API is identical to Soup Sieve's, except that you don't have to hand it the tag object; the calling object passes itself to Soup Sieve:

```pycon3
>>> soup = bs4.BeautifulSoup(text, 'html5lib')
>>> soup.select_one('p:is(.a, .b, .c)')

Cat

``` ```pycon3 >>> soup = bs4.BeautifulSoup(text, 'html5lib') >>> soup.select('p:is(.a, .b, .c)') [

Cat

Dog

Mouse

] ``` You can also use the Soup Sieve API directly to get access to the full range of possibilities that Soup Sieve offers. You can select a single tag: ```pycon3 >>> import soupsieve as sv >>> sv.select_one('p:is(.a, .b, .c)', soup)

Cat

``` You can select all tags: ```pycon3 >>> import soupsieve as sv >>> sv.select('p:is(.a, .b, .c)', soup) [

Cat

Dog

Mouse

] ``` You can select the closest ancestor: ```pycon3 >>> import soupsieve as sv >>> el = sv.select_one('.c', soup) >>> sv.closest('div', el)

Cat

Dog

Mouse

```

You can filter a tag's children (or an iterable of tags):

```pycon3
>>> sv.filter('p:not(.b)', soup.div)
[

Cat

Mouse

] ``` You can match a single tag: ```pycon3 >>> els = sv.select('p:is(.a, .b, .c)', soup) >>> sv.match('p:not(.b)', els[0]) True >>> sv.match('p:not(.b)', els[1]) False ``` Or even just extract comments: ```pycon3 >>> sv.comments(soup) [' These are animals '] ``` Selectors do not have to be constrained to one line either. You can span selectors over multiple lines just like you would in a CSS file. ```pycon3 >>> selector = """ ... .a, ... .b, ... .c ... """ >>> sv.select(selector, soup) [

Cat

Dog

Mouse

] ``` You can even use comments to annotate a particularly complex selector. ```pycon3 >>> selector = """ ... /* This isn't complicated, but we're going to annotate it anyways. ... This is the a class */ ... .a, ... /* This is the b class */ ... .b, ... /* This is the c class */ ... .c ... """ >>> sv.select(selector, soup) [

Cat

Dog

Mouse

]
```

If you've ever used Python's `re` library for regular expressions, you may know that it is often useful to pre-compile a regular expression pattern, especially if you plan to use it more than once. The same is true for Soup Sieve's matchers, though it is not required. If you have a pattern that you want to use more than once, it may be wise to pre-compile it early on:

```pycon3
>>> selector = sv.compile('p:is(.a, .b, .c)')
>>> selector.filter(soup.div)
[

Cat

Dog

Mouse

Here is some text.

...

Here is some more text.

... ... ... """ >>> soup = bs(html, 'html5lib') >>> print(soup.select('div')) [

Here is some text.

Here is some more text.

] ``` /// /// tip | Additional Reading https://developer.mozilla.org/en-US/docs/Web/CSS/Type_selectors /// ## Universal Selectors The Universal selector (`*`) matches elements of any type. /// tab | Syntax ```css * ``` /// /// tab | Usage ```pycon3 >>> from bs4 import BeautifulSoup as bs >>> html = """ ... ... ... ...

Here is some text.

...

Here is some more text.

... ... ... """ >>> soup = bs(html, 'html5lib') >>> print(soup.select('*')) [

Here is some text.

Here is some more text.

, ,

Here is some text.

Here is some more text.

Here is some text.

Here is some more text.

] ``` /// /// tip | Additional Reading https://developer.mozilla.org/en-US/docs/Web/CSS/Universal_selectors /// ## ID Selectors The ID selector matches an element based on its `id` attribute. The ID must match exactly. /// tab | Syntax ```css #id ``` /// /// tab | Usage ```pycon3 >>> from bs4 import BeautifulSoup as bs >>> html = """ ... ... ... ...

Here is some text.

...

Here is some more text.

... ... ... """ >>> soup = bs(html, 'html5lib') >>> print(soup.select('#some-id')) [

Here is some text.

] ``` /// /// tip | Additional Reading https://developer.mozilla.org/en-US/docs/Web/CSS/ID_selectors /// /// note | XML Support While the use of the `id` attribute (in the context of CSS) is a very HTML centric idea, it is supported for XML as well because Beautiful Soup supported it before Soup Sieve's existence. /// ## Class Selectors The class selector matches an element based on the values contained in the `class` attribute. The `class` attribute is treated as a whitespace separated list, where each item is a **class**. /// tab | Syntax ```css .class ``` /// /// tab | Usage ```pycon3 >>> from bs4 import BeautifulSoup as bs >>> html = """ ... ... ... ...

Here is some text.

...

Here is some more text.

... ... ... """ >>> soup = bs(html, 'html5lib') >>> print(soup.select('.some-class')) [

Here is some text.

] ``` /// /// tip | Additional Reading https://developer.mozilla.org/en-US/docs/Web/CSS/Class_selectors /// /// note | XML Support While the use of the `class` attribute (in the context of CSS) is a very HTML centric idea, it is supported for XML as well because Beautiful Soup supported it before Soup Sieve's existence. /// ## Attribute Selectors The attribute selector matches an element based on its attributes. When specifying a value of an attribute, if it contains whitespace or special characters, you should quote them with either single or double quotes. /// tip | Additional Reading https://developer.mozilla.org/en-US/docs/Web/CSS/Attribute_selectors /// /// define | `[attribute]` - Represents elements with an attribute named **attribute**. //// tab | Syntax ```css [attr] ``` //// //// tab | Usage ```pycon3 >>> from bs4 import BeautifulSoup as bs >>> html = """ ... ... ... ...

...

... ... ... """ >>> soup = bs(html, 'html5lib') >>> print(soup.select('[href]')) [Internal link, Example link, Insensitive internal link, Example org link] ``` //// /// /// define `[attribute=value]` - Represents elements with an attribute named **attribute** that also has a value of **value**. //// tab | Syntax ```css [attr=value] [attr="value"] ``` //// //// tab | Usage ```pycon3 >>> from bs4 import BeautifulSoup as bs >>> html = """ ... ... ... ...

... ... ... """ >>> soup = bs(html, 'html5lib') >>> print(soup.select('[href="#internal"]')) [Internal link] ``` //// /// /// define `[attribute~=value]` - Represents elements with an attribute named **attribute** whose value is a space separated list which contains **value**. //// tab | Syntax ```css [attr~=value] [attr~="value"] ``` //// //// tab | Usage ```pycon3 >>> from bs4 import BeautifulSoup as bs >>> html = """ ... ... ... ...

... ... ... """ >>> soup = bs(html, 'html5lib') >>> print(soup.select('[class~=class2]')) [Internal link] ``` //// /// /// define `[attribute|=value]` - Represents elements with an attribute named **attribute** whose value is a dash separated list that starts with **value**. //// tab | Syntax ```css [attr|=value] [attr|="value"] ``` //// //// tab | Usage ```pycon3 >>> from bs4 import BeautifulSoup as bs >>> html = """ ... ... ... ...

Some text

...

Some more text

... ... ... """ >>> soup = bs(html, 'html5lib') >>> print(soup.select('div[lang|="en"]')) [

Some text

Some more text

] ``` //// /// /// define `[attribute^=value]` - Represents elements with an attribute named **attribute** whose value starts with **value**. //// tab | Syntax ```css [attr^=value] [attr^="value"] ``` //// //// tab | Usage ```pycon3 >>> from bs4 import BeautifulSoup as bs >>> html = """ ... ... ... ...

... ... ... """ >>> soup = bs(html, 'html5lib') >>> print(soup.select('[href^=http]')) [Example link, Example org link] ``` //// /// /// define `[attribute$=value]` - Represents elements with an attribute named **attribute** whose value ends with **value**. //// tab | Syntax ```css [attr$=value] [attr$="value"] ``` //// //// tab | Usage ```pycon3 >>> from bs4 import BeautifulSoup as bs >>> html = """ ... ... ... ...

... ... ... """ >>> soup = bs(html, 'html5lib') >>> print(soup.select('[href$=org]')) [Example org link] ``` //// /// /// define `[attribute*=value]` - Represents elements with an attribute named **attribute** whose value contains the substring **value**. //// tab | Syntax ```css [attr*=value] [attr*="value"] ``` //// //// tab | Usage ```pycon3 >>> from bs4 import BeautifulSoup as bs >>> html = """ ... ... ... ...

... ... ... """ >>> soup = bs(html, 'html5lib') >>> print(soup.select('[href*="example"]')) [Example link, Example org link] ``` //// /// /// define `[attribute!=value]`:material-star:{: title="Custom" data-md-color-primary="green" .icon} - Equivalent to `#!css :not([attribute=value])`. //// tab | Syntax ```css [attr!=value] [attr!="value"] ``` //// //// tab | Usage ```pycon3 >>> from bs4 import BeautifulSoup as bs >>> html = """ ... ... ... ...

... ... ... """ >>> soup = bs(html, 'html5lib') >>> print(soup.select('a[href!="#internal"]')) [Example link, Insensitive internal link, Example org link] ``` //// /// /// define `[attribute operator value i]`:material-flask:{: title="Experimental" data-md-color-primary="purple" .icon} - Represents elements with an attribute named **attribute** and whose value, when the **operator** is applied, matches **value** *without* case sensitivity. In general, attribute comparison is insensitive in normal HTML, but not XML. `i` is most useful in XML documents. //// tab | Syntax ```css [attr=value i] [attr="value" i] ``` //// //// tab | Usage ```pycon3 >>> from bs4 import BeautifulSoup as bs >>> html = """ ... ... ... ...

... ... ... """ >>> soup = bs(html, 'html5lib') >>> print(soup.select('[href="#INTERNAL" i]')) [Internal link] ``` //// /// /// define `[attribute operator value s]` :material-flask:{: title="Experimental" data-md-color-primary="purple" .icon} - Represents elements with an attribute named **attribute** and whose value, when the **operator** is applied, matches **value** *with* case sensitivity. //// tab | Syntax ```css [attr=value s] [attr="value" s] ``` //// //// tab | Usage ```pycon3 >>> from bs4 import BeautifulSoup as bs >>> html = """ ... ... ... ...

... ... ... """
>>> soup = bs(html, 'html5lib')
>>> print(soup.select('[href="#INTERNAL" s]'))
[]
>>> print(soup.select('[href="#internal" s]'))
[Internal link]
```
////
///

## Namespace Selectors

Namespace selectors are used in conjunction with type and universal selectors as well as attribute names in attribute selectors. They are specified by declaring the namespace and the selector separated with `|`: `namespace|selector`. `namespace`, in this context, is the prefix defined via the [namespace dictionary](../api.md#namespaces). The prefix defined for the CSS selector does not need to match the prefix name in the document, as it is the namespace associated with the prefix that is compared, not the prefix itself.

The universal selector (`*`) can be used to represent any namespace just as it can with types.

By default, type selectors without a namespace selector will match any element whose type matches, regardless of namespace. But if a CSS default namespace is declared (one with an empty key: `{"": "http://www.w3.org/1999/xhtml"}`), all type selectors will assume the default namespace unless an explicit namespace selector is specified. For example, if the default namespace was defined to be `http://www.w3.org/1999/xhtml`, the selector `a` would only match `a` tags that are within the `http://www.w3.org/1999/xhtml` namespace. The one exception is within pseudo classes (`:not()`, `:has()`, etc.), as namespaces are not considered within pseudo classes unless one is explicitly specified.

If the namespace is omitted (`|element`), any element without a namespace will be matched. In HTML documents that support namespaces (XHTML and HTML5), HTML elements are counted as part of the `http://www.w3.org/1999/xhtml` namespace, but attributes usually do not have a namespace unless one is explicitly defined in the markup.

Namespaces can be used with attribute selectors as well, except that when `[|attribute]` is used, it is equivalent to `[attribute]`.
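The namespace rules described above can be sketched with a small document. This is only an illustration, not part of the original documentation: the markup, the `id` values, the `ids` helper, and the `vector` prefix are made up for the example, and it assumes `html5lib` is installed so that the SVG content is parsed with its namespace.

```python
from bs4 import BeautifulSoup
import soupsieve as sv

HTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"

# Hypothetical markup: one HTML link followed by one link inside inline SVG.
html = """
<!DOCTYPE html>
<html>
<body>
<a id="html-link" href="https://example.com/docs">HTML link</a>
<svg viewBox="0 0 160 40" xmlns="http://www.w3.org/2000/svg">
  <a id="svg-link" href="https://example.com/svg"><text x="10" y="25">SVG link</text></a>
</svg>
</body>
</html>
"""

# html5lib is used because it reports namespaces for HTML5 documents.
soup = BeautifulSoup(html, "html5lib")


def ids(selector, namespaces):
    """Return the `id` of every element matched by `selector` (illustrative helper)."""
    return [el["id"] for el in sv.select(selector, soup, namespaces=namespaces)]


# An explicit prefix restricts the type selector to that namespace.
svg_only = ids("svg|a", {"svg": SVG_NS})

# Without a default namespace, a bare type selector matches in any namespace.
any_namespace = ids("a", {"svg": SVG_NS})

# With a default namespace (empty key), a bare type selector is limited to it.
default_html = ids("a", {"": HTML_NS, "svg": SVG_NS})

# The prefix name is arbitrary; only the namespace it maps to is compared.
renamed_prefix = ids("vector|a", {"vector": SVG_NS})

print(svg_only, any_namespace, default_html, renamed_prefix)
```

Passing the dictionary on every call keeps the example explicit; a selector compiled with `sv.compile` can take the same `namespaces` argument once instead.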
/// tab | Syntax ```css ns|element ns|* *|* *|element |element [ns|attr] [*|attr] [|attr] ``` /// /// tab | Usage ```pycon3 >>> from bs4 import BeautifulSoup as bs >>> html = """ ... ... ... ...

SVG Example

...

Soup Sieve Docs

... ... ... ... ... """ >>> soup = bs(html, 'html5lib') >>> print(soup.select('svg|a', namespaces={'svg': 'http://www.w3.org/2000/svg'})) [MDN Web Docs] >>> print(soup.select('a', namespaces={'svg': 'http://www.w3.org/2000/svg'})) [Soup Sieve Docs, MDN Web Docs] >>> print(soup.select('a', namespaces={'': 'http://www.w3.org/1999/xhtml', 'svg': 'http://www.w3.org/2000/svg'})) [Soup Sieve Docs] >>> print(soup.select('[xlink|href]', namespaces={'xlink': 'http://www.w3.org/1999/xlink'})) [MDN Web Docs] >>> print(soup.select('[|href]', namespaces={'xlink': 'http://www.w3.org/1999/xlink'})) [Soup Sieve Docs] ``` /// --8<-- selector_styles.md --8<-- soupsieve-2.7/docs/src/markdown/selectors/combinators.md0000644000000000000000000000671213615410400020540 0ustar00# Combinators and Selector Lists CSS employs a number of tokens in order to represent lists or to provide relational context between two selectors. ## Selector Lists Selector lists use the comma (`,`) to join multiple selectors in a list. When presented with a selector list, any selector in the list that matches an element will return that element. /// tab | Syntax ```css element1, element2 ``` /// /// tab | Usage ```pycon3 >>> from bs4 import BeautifulSoup as bs >>> html = """ ... ... ... ...

Title

...

Paragraph

... ... ... """ >>> soup = bs(html, 'html5lib') >>> print(soup.select('h1, p')) [

Title

Paragraph

] ``` /// ## Descendant Combinator Descendant combinators combine two selectors with whitespace ( ) in order to signify that the second element is matched if it has an ancestor that matches the first element. /// tab | Syntax ```css parent descendant ``` /// /// tab | Usage ```pycon3 >>> from bs4 import BeautifulSoup as bs >>> html = """ ... ... ... ...

Paragraph 1

...

Paragraph 2

... ... ... """ >>> soup = bs(html, 'html5lib') >>> print(soup.select('body p')) [

Paragraph 1

Paragraph 2

] ``` /// /// tip | Additional Reading https://developer.mozilla.org/en-US/docs/Web/CSS/Descendant_combinator /// ## Child combinator Child combinators combine two selectors with `>` in order to signify that the second element is matched if it has a parent that matches the first element. /// tab | Syntax ```css parent > child ``` /// /// tab | Usage ```pycon3 >>> from bs4 import BeautifulSoup as bs >>> html = """ ... ... ... ...

Paragraph 1

...

Paragraph 2

... ... ... """ >>> soup = bs(html, 'html5lib') >>> print(soup.select('div > p')) [

Paragraph 1

] ``` /// /// tip | Additional Reading https://developer.mozilla.org/en-US/docs/Web/CSS/Child_combinator /// ## General sibling combinator General sibling combinators combine two selectors with `~` in order to signify that the second element is matched if it has a sibling that precedes it that matches the first element. /// tab | Syntax ```css prevsibling ~ sibling ``` /// /// tab | Usage ```pycon3 >>> from bs4 import BeautifulSoup as bs >>> html = """ ... ... ... ...

Title

...

Paragraph 1

...

Paragraph 2

... ... ... """ >>> soup = bs(html, 'html5lib') >>> print(soup.select('h1 ~ p')) [

Paragraph 1

Paragraph 2

] ``` /// /// tip | Additional Reading https://developer.mozilla.org/en-US/docs/Web/CSS/General_sibling_combinator /// ## Adjacent sibling combinator Adjacent sibling combinators combine two selectors with `+` in order to signify that the second element is matched if it has an adjacent sibling that precedes it that matches the first element. /// tab | Syntax ```css prevsibling + nextsibling ``` /// /// tab | Usage ```pycon3 >>> from bs4 import BeautifulSoup as bs >>> html = """ ... ... ... ...

Title

...

Paragraph 1

...

Paragraph 2

... ... ... """ >>> soup = bs(html, 'html5lib') >>> print(soup.select('h1 + p')) [

Paragraph 1

] ``` /// /// tip | Additional Reading https://developer.mozilla.org/en-US/docs/Web/CSS/Adjacent_sibling_combinator /// --8<-- selector_styles.md --8<-- soupsieve-2.7/docs/src/markdown/selectors/index.md0000644000000000000000000001404313615410400017323 0ustar00# General Details ## Implementation Specifics The CSS selectors are based off of the CSS specification and includes not only stable selectors, but may also include selectors currently under development from the draft specifications. Primarily support has been added for selectors that were feasible to implement and most likely to get practical use. In addition to the selectors in the specification, Soup Sieve also supports a couple non-standard selectors. Soup Sieve aims to allow users to target XML/HTML elements with CSS selectors. It implements many pseudo classes, but it does not currently implement any pseudo elements and has no plans to do so. Soup Sieve also will not match anything for pseudo classes that are only relevant in a live, browser environment, but it will gracefully handle them if they've been implemented; such pseudo classes are non-applicable in the Beautiful Soup environment and are noted in [Non-Applicable Pseudo Classes](./unsupported.md#non-applicable-pseudo-classes). When speaking about namespaces, they only apply to XML, XHTML, or when dealing with recognized foreign tags in HTML5. Currently, Beautiful Soup's `html5lib` parser is the only parser that will return the appropriate namespaces for a HTML5 document. If you are using XHTML, you have to use the Beautiful Soup's `lxml-xml` parser (or `xml` for short) to get the appropriate namespaces in an XHTML document. In addition to using the correct parser, you must provide a dictionary of namespaces to Soup Sieve in order to use namespace selectors. See the documentation on [namespaces](../api.md#namespaces) to learn more. 
While an effort is made to mimic CSS selector behavior, there may be some differences or quirks. Please report issues if any are found.

## Selector Context Key

| Symbol | Name | Description |
| ------ | ---- | ----------- |
| :material-language-html5:{: data-md-color-primary="orange" .big-icon} | HTML | Some selectors are very specific to HTML and either have no meaningful representation in XML, or such functionality has not been implemented. Selectors that are HTML only will be noted with :material-language-html5:{: data-md-color-primary="orange"}, and will match nothing if used in XML. |
| :material-star:{: data-md-color-primary="green" .big-icon} | Custom | Soup Sieve has implemented a couple of non-standard selectors. These can contain useful selectors that were rejected from the official CSS specifications, selectors implemented by other systems such as jQuery, or even selectors specifically created for Soup Sieve. If a selector is considered non-standard, it will be marked with :material-star:{: title="Custom" data-md-color-primary="green"}. |
| :material-flask:{: title="Experimental" data-md-color-primary="purple" .big-icon} | Experimental | All selectors that are from the current working draft of CSS4 are considered experimental and are marked with :material-flask:{: title="Experimental" data-md-color-primary="purple"}. Additionally, if there are other immature selectors, they may be marked as experimental as well. Experimental may mean we are not entirely sure if our implementation is correct, that things may still be in flux as they are part of a working draft, or even both. If at any time a working draft drops a selector from the current draft, it will most likely also be removed here, most likely with a deprecation path, except where there may be a conflict that requires a less graceful transition. One exception is in the rare case that the selector is found to be far too useful despite being rejected. In these cases, we may adopt them as "custom" selectors. |

/// tip | Additional Reading
If usage of a selector is not clear in this documentation, you can find more information by reading these specification documents:

[CSS Level 3 Specification](https://www.w3.org/TR/selectors-3/)
: Contains the latest official document outlining official behaviors of CSS selectors.

[CSS Level 4 Working Draft](https://www.w3.org/TR/selectors-4/)
: Contains the latest published working draft of the CSS level 4 selectors which outlines the experimental new selectors and experimental behavioral changes.

[HTML5](https://www.w3.org/TR/html50/)
: The HTML 5.0 specification document. Defines the semantics regarding HTML.

[HTML Living Standard](https://html.spec.whatwg.org/)
: The HTML Living Standard document. Defines semantics regarding HTML.
///

## Selector Terminology

Certain terminology is used throughout this document when describing selectors. In order to fully understand the syntax a selector may implement, it is important to understand a couple of key terms.

### Selector

Selector is used to describe any selector whether it is a [simple](#simple-selector), [compound](#compound-selector), or [complex](#complex-selector) selector.

### Simple Selector

A simple selector represents a single condition on an element. It can be a [type selector](#type-selectors), [universal selector](#universal-selectors), [ID selector](#id-selectors), [class selector](#class-selectors), [attribute selector](#attribute-selectors), or [pseudo class selector](#pseudo-classes).

### Compound Selector

A [compound](#compound-selector) selector is a sequence of [simple](#simple-selector) selectors. They do not contain any [combinators](#combinators-and-selector-lists). If a universal or type selector is used, it must come first, and only one instance of either a universal or type selector can be used; both cannot be used at the same time.
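As a rough illustration of the difference between simple and compound selectors, the sketch below runs two simple selectors and one compound selector against the same fragment. The markup, the `id` values, and the `ids` helper are invented for this example; `html.parser` is used only because no namespace handling is needed here.

```python
from bs4 import BeautifulSoup
import soupsieve as sv

# Hypothetical fragment: three elements sharing a class, two sharing a title.
html = """
<div>
<p id="one" class="note">First</p>
<p id="two" class="note" title="hint">Second</p>
<span id="three" class="note" title="hint">Third</span>
</div>
"""

soup = BeautifulSoup(html, "html.parser")


def ids(selector):
    """Return the `id` of every element matched by `selector` (illustrative helper)."""
    return [el["id"] for el in sv.select(selector, soup)]


# Simple selectors: each one states a single condition.
by_type = ids("p")
by_class = ids(".note")

# Compound selector: the type selector comes first, followed by a class
# selector and an attribute selector, with no combinators in between.
# An element must satisfy every condition to match.
compound = ids("p.note[title]")

print(by_type, by_class, compound)
```

Each simple selector added to a compound narrows the match, which is why the compound selector returns fewer elements than either simple selector on its own.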
### Complex Selector A complex selector consists of multiple [simple](#simple-selector) or [compound](#compound-selector) selectors joined with [combinators](#combinators-and-selector-lists). ### Selector List A selector list is a list of selectors joined with a comma (`,`). A selector list is used to specify that a match is valid if any of the selectors in a list matches. --8<-- selector_styles.md --8<-- soupsieve-2.7/docs/src/markdown/selectors/pseudo-classes.md0000644000000000000000000014237113615410400021154 0ustar00# Pseudo-Classes ## Overview These are pseudo classes that are either fully or partially supported. Partial support is usually due to limitations of not being in a live, browser environment. Pseudo classes that cannot be implemented are found under [Non-Applicable Pseudo Classes](./unsupported.md/#non-applicable-pseudo-classes). Any selectors that are not found here or under the non-applicable either are under consideration, have not yet been evaluated, or are too new and viewed as a risk to implement as they might not stick around. ## `:any-link`:material-language-html5:{: title="HTML" data-md-color-primary="orange" .icon} {:#:any-link} Selects every `#!html `, or `#!html ` element that has an `href` attribute, independent of whether it has been visited. /// tab | Syntax ```css :any-link ``` /// /// tab | Usage ```pycon3 >>> from bs4 import BeautifulSoup as bs >>> html = """ ... ... ... ...

A link to click

... ... ... """ >>> soup = bs(html, 'html5lib') >>> print(soup.select(':any-link')) [click] ``` /// /// tip | Additional Reading https://developer.mozilla.org/en-US/docs/Web/CSS/:any-link /// /// new | New in 2.2 The CSS specification recently updated to not include `#!html ` in the definition; therefore, Soup Sieve has removed it as well. /// ## `:checked`:material-language-html5:{: title="HTML" data-md-color-primary="orange" .icon} {:#:checked} Selects any `#!html `, `#!html `, or `#!html ] ``` /// /// tip | Additional Reading https://developer.mozilla.org/en-US/docs/Web/CSS/:checked /// ## `:default`:material-language-html5:{: title="HTML" data-md-color-primary="orange" .icon} {:#:default} Selects any form element that is the default among a group of related elements, including: `#!html

""" self.assert_selector( markup, ":default", ['summer', 'd1', 'd3', 'hamster', 'enable'], flags=util.HTML ) def test_iframe(self): """Test with `iframe`.""" markup = """

<html> <body> <button id="d2" type="submit">default2</button> </body> </html>
<html> <body> <form> <button id="d4" type="submit">default4</button> </form> </body> </html> """ self.assert_selector( markup, ":default", ['d1', 'd3', 'd4'], flags=util.PYHTML ) def test_nested_form(self): """ Test nested form. This is technically invalid use of forms, but browsers will generally evaluate first in the nested forms. """ markup = """

""" self.assert_selector( markup, ":default", ['d1'], flags=util.HTML ) def test_default_cached(self): """ Test that we use the cached "default". For the sake of coverage, we will do this impractical select to ensure we reuse the cached default. """ markup = """

""" self.assert_selector( markup, ":default:default", ['d1'], flags=util.HTML ) def test_nested_form_fail(self): """ Test that the search for elements will bail after the first nested form. You shouldn't nest forms, but if you do, when a parent form encounters a nested form, we will bail evaluation like browsers do. We should see button 1 getting found for nested form, but button 2 will not be found for parent form. """ markup = """
what
""" self.assert_selector( markup, ":default", [], flags=util.HTML ) soupsieve-2.7/tests/test_level4/test_defined.py0000644000000000000000000000501313615410400016726 0ustar00"""Test defined selectors.""" from .. import util class TestDefined(util.TestCase): """Test defined selectors.""" def test_defined_html(self): """Test defined HTML.""" markup = """
""" self.assert_selector( markup, 'body :defined', ['0', '2', '3'], flags=util.HTML ) @util.skip_no_lxml def test_defined_xhtml(self): """Test defined XHTML.""" markup = """
""" from lxml import etree self.assert_selector( markup, 'body :defined', # We should get 3, but for LXML versions less than 4.4.0 we don't for reasons stated above. ['0', '2'] if etree.LXML_VERSION < (4, 4, 0, 0) else ['0', '1', '2'], flags=util.XHTML ) def test_defined_xml(self): """Test defined HTML.""" markup = """
""" # Defined is a browser thing. # XML doesn't care about defined and this will match nothing in XML. self.assert_selector( markup, 'body :defined', [], flags=util.XML ) soupsieve-2.7/tests/test_level4/test_dir.py0000644000000000000000000001240313615410400016107 0ustar00"""Test direction selectors.""" from .. import util import soupsieve as sv class TestDir(util.TestCase): """Test direction selectors.""" MARKUP = """
test1
test2
עִבְרִית()

עִבְרִית
test3
עִבְרִית
""" def test_dir_rtl(self): """Test general direction right to left.""" self.assert_selector( self.MARKUP, "div:dir(rtl)", ["1", "4", "6"], flags=util.HTML ) def test_dir_ltr(self): """Test general direction left to right.""" self.assert_selector( self.MARKUP, "div:dir(ltr)", ["3"], flags=util.HTML ) def test_dir_conflict(self): """Test conflicting direction.""" self.assert_selector( self.MARKUP, "div:dir(ltr):dir(rtl)", [], flags=util.HTML ) def test_dir_xml(self): """Test direction with XML (not supported).""" self.assert_selector( self.MARKUP, "div:dir(ltr)", [], flags=util.XML ) def test_dir_bidi_detect(self): """Test bidirectional detection.""" self.assert_selector( self.MARKUP, "span:dir(rtl)", ['2', '5', '7'], flags=util.HTML ) self.assert_selector( self.MARKUP, "span:dir(ltr)", ['8'], flags=util.HTML ) def test_dir_on_input(self): """Test input direction rules.""" self.assert_selector( self.MARKUP, ":is(input, textarea):dir(ltr)", ['9', '10', '11', '12', '13'], flags=util.HTML5 ) def test_dir_on_root(self): """Test that the root is assumed left to right if not explicitly defined.""" self.assert_selector( self.MARKUP, "html:dir(ltr)", ['0'], flags=util.HTML ) def test_dir_auto_root(self): """Test that the root is assumed left to right if auto used.""" markup = """ """ self.assert_selector( markup, "html:dir(ltr)", ['0'], flags=util.HTML ) def test_dir_on_input_root(self): """Test input direction when input is the root.""" markup = """""" # Input is root for parser in util.available_parsers('html.parser', 'lxml', 'html5lib'): soup = self.soup(markup, parser) fragment = soup.input.extract() self.assertTrue(sv.match(":root:dir(ltr)", fragment, flags=sv.DEBUG)) def test_iframe(self): """Test direction in `iframe`.""" markup = """
<html> <body> <div id="2" dir="auto"> עִבְרִית <span id="5" dir="auto">()</span></div> </div> </body> </html> """ self.assert_selector( markup, "div:dir(ltr)", ['1'], flags=util.PYHTML ) self.assert_selector( markup, "div:dir(rtl)", ['2'], flags=util.PYHTML ) def test_xml_in_html(self): """Test cases for when we have XML in HTML.""" markup = """
$עִבְרִית$ other text
""" self.assert_selector( markup, "div:dir(ltr)", ['1'], flags=util.HTML5 ) self.assert_selector( markup, "div:dir(rtl)", [], flags=util.HTML5 ) self.assert_selector( markup, "math:dir(rtl)", [], flags=util.HTML5 ) soupsieve-2.7/tests/test_level4/test_focus_visible.py0000644000000000000000000000124313615410400020165 0ustar00"""Test focus visible selectors.""" from .. import util class TestFocusVisible(util.TestCase): """Test focus visible selectors.""" MARKUP = """

""" def test_focus_visible(self): """Test focus visible.""" self.assert_selector( self.MARKUP, "form:focus-visible", [], flags=util.HTML ) def test_not_focus_visible(self): """Test inverse of focus visible.""" self.assert_selector( self.MARKUP, "form:not(:focus-visible)", ["form"], flags=util.HTML ) soupsieve-2.7/tests/test_level4/test_focus_within.py0000644000000000000000000000123213615410400020030 0ustar00"""Test focus within selectors.""" from .. import util class TestFocusWithin(util.TestCase): """Test focus within selectors.""" MARKUP = """

""" def test_focus_within(self): """Test focus within.""" self.assert_selector( self.MARKUP, "form:focus-within", [], flags=util.HTML ) def test_not_focus_within(self): """Test inverse of focus within.""" self.assert_selector( self.MARKUP, "form:not(:focus-within)", ["form"], flags=util.HTML ) soupsieve-2.7/tests/test_level4/test_future.py0000644000000000000000000000144313615410400016645 0ustar00"""Test future selectors.""" from .. import util class TestFuture(util.TestCase): """Test future selectors.""" MARKUP = """

Some text in a paragraph. Link Placeholder text.

""" def test_future(self): """Test future (should match nothing).""" self.assert_selector( self.MARKUP, "p:future", [], flags=util.HTML ) def test_not_future(self): """Test not future.""" self.assert_selector( self.MARKUP, "p:not(:future)", ["0"], flags=util.HTML ) soupsieve-2.7/tests/test_level4/test_has.py0000644000000000000000000001012713615410400016105 0ustar00"""Test has selectors.""" from .. import util from soupsieve import SelectorSyntaxError class TestHas(util.TestCase): """Test has selectors.""" MARKUP = """

""" MARKUP2 = """

    """

    def test_has_descendant(self):
        """Test has descendant."""

        self.assert_selector(
            self.MARKUP,
            'div:not(.aaaa):has(.kkkk > p.llll)',
            ['4', '5', '6'],
            flags=util.HTML
        )

    def test_has_next_sibling(self):
        """Test has next sibling."""

        self.assert_selector(
            self.MARKUP,
            'p:has(+ .dddd:has(+ div .jjjj))',
            ['2'],
            flags=util.HTML
        )

    def test_has_subsequent_sibling(self):
        """Test has subsequent sibling."""

        self.assert_selector(
            self.MARKUP,
            'p:has(~ .jjjj)',
            ['7', '8'],
            flags=util.HTML
        )

    def test_has_child(self):
        """Test has child."""

        self.assert_selector(
            self.MARKUP2,
            'div:has(> .bbbb)',
            ['0'],
            flags=util.HTML
        )

    def test_has_case(self):
        """Test has case insensitive."""

        self.assert_selector(
            self.MARKUP,
            'div:NOT(.aaaa):HAS(.kkkk > p.llll)',
            ['4', '5', '6'],
            flags=util.HTML
        )

    def test_has_mixed(self):
        """Test has mixed."""

        self.assert_selector(
            self.MARKUP2,
            'div:has(> .bbbb, .ffff, .jjjj)',
            ['0', '4', '8'],
            flags=util.HTML
        )

        self.assert_selector(
            self.MARKUP2,
            'div:has(.ffff, > .bbbb, .jjjj)',
            ['0', '4', '8'],
            flags=util.HTML
        )

    def test_has_nested_pseudo(self):
        """Test has with nested pseudo."""

        self.assert_selector(
            self.MARKUP2,
            'div:has(> :not(.bbbb, .ffff, .jjjj))',
            ['2', '6', '8'],
            flags=util.HTML
        )

        self.assert_selector(
            self.MARKUP2,
            'div:not(:has(> .bbbb, .ffff, .jjjj))',
            ['2', '6'],
            flags=util.HTML
        )

    def test_has_no_match(self):
        """Test has with a non-matching selector."""

        self.assert_selector(
            self.MARKUP2,
            'div:has(:paused)',
            [],
            flags=util.HTML
        )

    def test_has_empty(self):
        """Test `:has()` fails with an empty selector list."""

        self.assert_raises('div:has()', SelectorSyntaxError)

    def test_invalid_incomplete_has(self):
        """Test `:has()` fails with just a combinator."""

        self.assert_raises(':has(>)', SelectorSyntaxError)

    def test_invalid_has_double_combinator(self):
        """Test `:has()` fails with consecutive combinators."""

        self.assert_raises(':has(>> has a)', SelectorSyntaxError)
        self.assert_raises(':has(> has, >> a)', SelectorSyntaxError)
        self.assert_raises(':has(> has >> a)', SelectorSyntaxError)

    def test_invalid_has_trailing_combinator(self):
        """Test `:has()` fails with trailing combinator."""

        self.assert_raises(':has(> has >)', SelectorSyntaxError)
soupsieve-2.7/tests/test_level4/test_host.py0000644000000000000000000000111413615410400016303 0ustar00
"""Test host selectors."""
from .. import util


class TestHost(util.TestCase):
    """Test host selectors."""

    MARKUP = """
header
some text
""" def test_host(self): """Test host (not supported).""" self.assert_selector( self.MARKUP, ":host", [], flags=util.HTML ) def test_host_func(self): """Test host function (not supported).""" self.assert_selector( self.MARKUP, ":host(h1)", [], flags=util.HTML ) soupsieve-2.7/tests/test_level4/test_host_context.py0000644000000000000000000000065213615410400020055 0ustar00"""Test host context selectors.""" from .. import util class TestHostContext(util.TestCase): """Test host context selectors.""" def test_host_context(self): """Test host context (not supported).""" markup = """
header
some text
""" self.assert_selector( markup, ":host-context(h1, h2)", [], flags=util.HTML ) soupsieve-2.7/tests/test_level4/test_in_range.py0000644000000000000000000002532313615410400017120 0ustar00"""Test in range selectors.""" from .. import util class TestInRange(util.TestCase): """Test in range selectors.""" def test_in_range_number(self): """Test in range number.""" markup = """ """ self.assert_selector( markup, ":in-range", ['0', '1', '2', '3', '4', '5', '6', '7', '8'], flags=util.HTML ) def test_in_range_range(self): """Test in range range.""" markup = """ """ self.assert_selector( markup, ":in-range", ['0', '1', '2', '3', '4', '5', '6', '7', '8'], flags=util.HTML ) def test_in_range_month(self): """Test in range month.""" markup = """ """ self.assert_selector( markup, ":in-range", ['0', '1', '2', '3', '4', '5', '6'], flags=util.HTML ) def test_in_range_week(self): """Test in range week.""" markup = """ """ self.assert_selector( markup, ":in-range", ['0', '1', '2', '3', '4', '5', '6', '7'], flags=util.HTML ) def test_in_range_date(self): """Test in range date.""" markup = """ """ self.assert_selector( markup, ":in-range", ['0', '1', '2', '3', '4', '5', '6'], flags=util.HTML ) def test_in_range_date_time(self): """Test in range date_time.""" markup = """ """ self.assert_selector( markup, ":in-range", ['0', '1', '2', '3', '4', '5', '6'], flags=util.HTML ) def test_in_range_time(self): """Test in range time.""" markup = """ """ self.assert_selector( markup, ":in-range", ['0', '1', '2', '3', '4', '5', '6', '7'], flags=util.HTML ) soupsieve-2.7/tests/test_level4/test_indeterminate.py0000644000000000000000000000454713615410400020173 0ustar00"""Test indeterminate selectors.""" from .. import util class TestIndeterminate(util.TestCase): """Test indeterminate selectors.""" def test_indeterminate(self): """Test indeterminate.""" markup = """ No name 1 no name 2
This label starts out lime.

This label starts out lime.
This label starts out lime. This label starts out lime. This label starts out lime. This label starts out lime.
This label starts out lime.
""" self.assert_selector( markup, ":indeterminate", ['checkbox', 'radio1', 'radio6', 'radio4', 'radio5', 'radio-no-name1'], flags=util.HTML ) def test_iframe(self): """Test indeterminate when `iframe` is involved.""" markup = """
This label starts out lime. <html> <body> <input type="radio" name="test" id="radio2" checked> <label for="radio2">This label starts out lime.</label> <input type="radio" name="other" id="radio3"> <label for="radio3">This label starts out lime.</label> </body> </html>
""" self.assert_selector( markup, ":indeterminate", ['radio1', 'radio3'], flags=util.PYHTML ) soupsieve-2.7/tests/test_level4/test_is.py0000644000000000000000000000647513615410400015760 0ustar00"""Test is selectors.""" from .. import util from soupsieve import SelectorSyntaxError class TestIs(util.TestCase): """Test is selectors.""" MARKUP = """

Some text in a paragraph. Link

    """

    def test_is(self):
        """Test multiple selectors with "is"."""

        self.assert_selector(
            self.MARKUP,
            ":is(span, a)",
            ["1", "2"],
            flags=util.HTML
        )

    def test_is_multi_comma(self):
        """Test multiple selectors but with an empty slot due to multiple commas."""

        self.assert_selector(
            self.MARKUP,
            ":is(span, , a)",
            ["1", "2"],
            flags=util.HTML
        )

    def test_is_leading_comma(self):
        """Test multiple selectors but with an empty slot due to leading commas."""

        self.assert_selector(
            self.MARKUP,
            ":is(, span, a)",
            ["1", "2"],
            flags=util.HTML
        )

    def test_is_trailing_comma(self):
        """Test multiple selectors but with an empty slot due to trailing commas."""

        self.assert_selector(
            self.MARKUP,
            ":is(span, a, )",
            ["1", "2"],
            flags=util.HTML
        )

    def test_is_empty(self):
        """Test empty `:is()` selector list."""

        self.assert_selector(
            self.MARKUP,
            ":is()",
            [],
            flags=util.HTML
        )

    def test_nested_is(self):
        """Test multiple nested selectors."""

        self.assert_selector(
            self.MARKUP,
            ":is(span, a:is(#\\32))",
            ["1", "2"],
            flags=util.HTML
        )

        self.assert_selector(
            self.MARKUP,
            ":is(span, a:is(#\\32))",
            ["1", "2"],
            flags=util.HTML
        )

    def test_is_with_other_pseudo(self):
        """Test `:is()` behavior when paired with `:not()`."""

        # Each pseudo class is evaluated separately
        # So this will not match
        self.assert_selector(
            self.MARKUP,
            ":is(span):not(span)",
            [],
            flags=util.HTML
        )

    def test_multiple_is(self):
        """Test `:is()` behavior when paired with another `:is()`."""

        # Each pseudo class is evaluated separately
        # So this will not match
        self.assert_selector(
            self.MARKUP,
            ":is(span):is(div)",
            [],
            flags=util.HTML
        )

        # Each pseudo class is evaluated separately
        # So this will match
        self.assert_selector(
            self.MARKUP,
            ":is(a):is(#\\32)",
            ['2'],
            flags=util.HTML
        )

    def test_invalid_pseudo_class_start_combinator(self):
        """Test invalid start combinator in pseudo-classes other than `:has()`."""

        self.assert_raises(':is(> div)', SelectorSyntaxError)
        self.assert_raises(':is(div, > div)', SelectorSyntaxError)

    def test_invalid_pseudo_orphan_close(self):
        """Test invalid, orphaned pseudo close."""

        self.assert_raises('div)', SelectorSyntaxError)

    def test_invalid_pseudo_open(self):
        """Test invalid, unclosed pseudo open."""

        self.assert_raises(':is(div', SelectorSyntaxError)
soupsieve-2.7/tests/test_level4/test_lang.py0000644000000000000000000002361613615410400016262 0ustar00
"""Test language selectors."""
from .. import util


class TestLang(util.TestCase):
    """Test language selectors."""

    MARKUP = """

""" def test_lang(self): """Test language and that it uses implicit wildcard.""" # Implicit wild self.assert_selector( self.MARKUP, "p:lang(de-DE)", ['1', '2', '3', '4', '5', '6'], flags=util.HTML ) def test_lang_missing_range(self): """Test language range with a missing range.""" # Implicit wild self.assert_selector( self.MARKUP, "p:lang(de--DE)", [], flags=util.HTML ) def test_explicit_wildcard(self): """Test language with explicit wildcard (same as implicit).""" # Explicit wild self.assert_selector( self.MARKUP, "p:lang(de-\\*-DE)", ['1', '2', '3', '4', '5', '6'], flags=util.HTML ) def test_only_wildcard(self): """Test language with only a wildcard.""" self.assert_selector( self.MARKUP, "p:lang('*')", ['1', '2', '3', '4', '5', '6', '7', '8', '9'], flags=util.HTML ) def test_wildcard_start_no_match(self): """Test language with a wildcard at start, but it matches nothing.""" self.assert_selector( self.MARKUP, "p:lang('*-de-DE')", [], flags=util.HTML ) def test_wildcard_start_collapse(self): """Test that language with multiple wildcard patterns at start collapse.""" self.assert_selector( self.MARKUP, "p:lang('*-*-*-DE')", ['1', '2', '3', '4', '5', '6', '7'], flags=util.HTML ) def test_wildcard_at_start_escaped(self): """ Test language with wildcard at start (escaped). Wildcard in the middle is same as implicit, but at the start, it has specific meaning. 
""" self.assert_selector( self.MARKUP, "p:lang(\\*-DE)", ['1', '2', '3', '4', '5', '6', '7'], flags=util.HTML ) def test_language_quoted(self): """Test language (quoted).""" # Normal quoted self.assert_selector( self.MARKUP, "p:lang('de-DE')", ['1', '2', '3', '4', '5', '6'], flags=util.HTML ) def test_language_quoted_with_escaped_newline(self): """Test language (quoted) with escaped new line.""" # Normal quoted self.assert_selector( self.MARKUP, "p:lang('de-\\\nDE')", ['1', '2', '3', '4', '5', '6'], flags=util.HTML ) def test_wildcard_at_start_quoted(self): """Test language with wildcard at start (quoted).""" # First wild quoted self.assert_selector( self.MARKUP, "p:lang('*-DE')", ['1', '2', '3', '4', '5', '6', '7'], flags=util.HTML ) def test_avoid_implicit_language(self): """Test that we can narrow language selection to elements that match and explicitly state language.""" # Target element with language and language attribute self.assert_selector( self.MARKUP, "p[lang]:lang(de-DE)", ['6'], flags=util.HTML ) def test_language_und(self): """Test that undefined language can be matched by `*`.""" markup = """

""" self.assert_selector( markup, "div:lang('*')", ['2'], flags=util.HTML ) def test_language_empty_string(self): """Test that an empty string language will only match untagged languages `lang=""`.""" markup = """

""" self.assert_selector( markup, "div:lang('')", ['1', '3', '4'], flags=util.HTML ) def test_language_list(self): """Test language list.""" # Multiple languages markup = """

""" self.assert_selector( markup, "p:lang(de-DE, '*-US')", ['1', '3', '4', '5', '6'], flags=util.HTML ) def test_undetermined_language(self): """Test undetermined language.""" markup = """

""" self.assert_selector( markup, "p:lang(en)", [], flags=util.HTML ) def test_language_in_header(self): """Test that we can find language in header.""" markup = """

""" self.assert_selector( markup, "p:lang('*-US')", ['1', '2'], flags=util.HTML ) def test_xml_style_language_in_html5(self): """Test XML style language when out of HTML5 namespace.""" markup = """

""" self.assert_selector( markup, "mtext:lang(en)", ['1'], flags=util.HTML5 ) def test_xml_style_language(self): """Test XML style language.""" # XML style language markup = """

""" self.assert_selector( markup, "p:lang(de-DE)", ['1', '2', '3', '4', '5', '6'], flags=util.XML ) def test_language_in_xhtml(self): """Test language in XHTML.""" markup = """

""" self.assert_selector( markup, "p:lang(de-DE)", ['1', '2', '3', '4', '5', '6'], flags=util.XML ) def test_language_in_xhtml_without_html_style_lang(self): """ Test language in XHTML. HTML namespace elements must use HTML style language. """ # XHTML language: `lang` markup = """

""" self.assert_selector( markup, "p:lang(de-DE)", [], flags=util.XHTML ) soupsieve-2.7/tests/test_level4/test_local_link.py0000644000000000000000000000133113615410400017436 0ustar00"""Test local link selectors.""" from .. import util class TestLocalLink(util.TestCase): """Test local link selectors.""" MARKUP = """ Link Another link """ def test_local_link(self): """Test local link (matches nothing).""" self.assert_selector( self.MARKUP, "a:local-link", [], flags=util.HTML ) def test_not_local_link(self): """Test not local link.""" self.assert_selector( self.MARKUP, "a:not(:local-link)", ["1", "2"], flags=util.HTML ) soupsieve-2.7/tests/test_level4/test_matches.py0000644000000000000000000000140713615410400016757 0ustar00"""Test matches selectors.""" from .. import util class TestMatches(util.TestCase): """Test matches selectors.""" MARKUP = """

Some text in a paragraph. Link

    """

    def test_matches(self):
        """Test multiple selectors with "matches"."""

        self.assert_selector(
            self.MARKUP,
            ":matches(span, a)",
            ["1", "2"],
            flags=util.HTML
        )

    def test_nested_matches(self):
        """Test multiple nested selectors with "matches"."""

        self.assert_selector(
            self.MARKUP,
            ":matches(span, a:matches(#\\32))",
            ["1", "2"],
            flags=util.HTML
        )
soupsieve-2.7/tests/test_level4/test_muted.py0000644000000000000000000000203713615410400016451 0ustar00
"""Test muted selectors."""
from .. import util


class TestMuted(util.TestCase):
    """Test muted selectors."""

    MARKUP = """ """

    def test_muted(self):
        """Test muted."""

        self.assert_selector(
            self.MARKUP,
            "video:muted",
            ['vid1'],
            flags=util.HTML
        )

    def test_not_muted(self):
        """Test not muted."""

        self.assert_selector(
            self.MARKUP,
            "video:not(:muted)",
            ["vid2"],
            flags=util.HTML
        )
soupsieve-2.7/tests/test_level4/test_not.py0000644000000000000000000000132313615410400016130 0ustar00
"""Test not selectors."""
from .. import util


class TestNot(util.TestCase):
    """Test not selectors."""

    def test_multi_nested_not(self):
        """Test nested not and multiple selectors."""

        markup = """

Some text in a paragraph.
Link Direct child
Child 1 Child 2 Child 3

""" self.assert_selector( markup, 'div :not(p, :not([id=\\35]))', ['5'], flags=util.HTML ) soupsieve-2.7/tests/test_level4/test_nth_child.py0000644000000000000000000000226213615410400017267 0ustar00"""Test `nth` child selectors.""" from .. import util class TestNthChild(util.TestCase): """Test `nth` child selectors.""" MARKUP = """

""" def test_nth_child_of_s_simple(self): """Test `nth` child with selector (simple).""" self.assert_selector( self.MARKUP, ":nth-child(-n+3 of p)", ['0', '1', '7'], flags=util.HTML ) def test_nth_child_of_s_complex(self): """Test `nth` child with selector (complex).""" self.assert_selector( self.MARKUP, ":nth-child(2n + 1 of :is(p, span).test)", ['2', '6', '10'], flags=util.HTML ) self.assert_selector( self.MARKUP, ":nth-child(2n + 1 OF :is(p, span).test)", ['2', '6', '10'], flags=util.HTML ) soupsieve-2.7/tests/test_level4/test_nth_last_child.py0000644000000000000000000000145613615410400020316 0ustar00"""Test `nth` last child selectors.""" from .. import util class TestNthLastChild(util.TestCase): """Test `nth` last child selectors.""" def test_nth_child_of_s_complex(self): """Test `nth` child with selector (complex).""" markup = """

""" self.assert_selector( markup, ":nth-last-child(2n + 1 of p[id], span[id])", ['1', '3', '5', '7', '9', '11'], flags=util.HTML ) soupsieve-2.7/tests/test_level4/test_open.py0000644000000000000000000000271413615410400016276 0ustar00"""Test open selectors.""" from .. import util class TestOpen(util.TestCase): """Test open selectors.""" MARKUP = """

This is closed.A closed details element.

This is open.An open details element.

Greetings, one and all!

Goodbye, one and all!

    """

    def test_open(self):
        """Test open."""

        self.assert_selector(
            self.MARKUP,
            ":open",
            ['2', '3'],
            flags=util.HTML
        )

    def test_targeted_open(self):
        """Test targeted open."""

        self.assert_selector(
            self.MARKUP,
            "details:open",
            ['2'],
            flags=util.HTML
        )

        self.assert_selector(
            self.MARKUP,
            "dialog:open",
            ['3'],
            flags=util.HTML
        )

    def test_not_open(self):
        """Test not open."""

        self.assert_selector(
            self.MARKUP,
            ":is(dialog, details):not(:open)",
            ["1", "4"],
            flags=util.HTML
        )
soupsieve-2.7/tests/test_level4/test_optional.py0000644000000000000000000000152113615410400017155 0ustar00
"""Test optional selectors."""
from .. import util


class TestOptional(util.TestCase):
    """Test optional selectors."""

    MARKUP = """

    """

    def test_optional(self):
        """Test optional."""

        self.assert_selector(
            self.MARKUP,
            ":optional",
            ['3', '4', '5'],
            flags=util.HTML
        )

    def test_specific_optional(self):
        """Test specific optional."""

        self.assert_selector(
            self.MARKUP,
            "input:optional",
            ['3'],
            flags=util.HTML
        )
soupsieve-2.7/tests/test_level4/test_out_of_range.py0000644000000000000000000002526013615410400020005 0ustar00
"""Test out of range selectors."""
from .. import util


class TestOutOfRange(util.TestCase):
    """Test out of range selectors."""

    def test_out_of_range_number(self):
        """Test out of range number."""

        markup = """ """

        self.assert_selector(
            markup,
            ":out-of-range",
            ['9', '10', '11'],
            flags=util.HTML
        )

    def test_out_of_range_range(self):
        """Test out of range range."""

        markup = """ """

        self.assert_selector(
            markup,
            ":out-of-range",
            ['9', '10'],
            flags=util.HTML
        )

    def test_out_of_range_month(self):
        """Test out of range month."""

        markup = """ """

        self.assert_selector(
            markup,
            ":out-of-range",
            ['7', '8', '9', '10'],
            flags=util.HTML
        )

    def test_out_of_range_week(self):
        """Test out of range week."""

        markup = """ """

        self.assert_selector(
            markup,
            ":out-of-range",
            ['8', '9', '10', '11'],
            flags=util.HTML
        )

    def test_out_of_range_date(self):
        """Test out of range date."""

        markup = """ """

        self.assert_selector(
            markup,
            ":out-of-range",
            ['7', '8', '9', '10', '11', '12'],
            flags=util.HTML
        )

    def test_out_of_range_date_time(self):
        """Test out of range date time."""

        markup = """ """

        self.assert_selector(
            markup,
            ":out-of-range",
            ['7', '8', '9', '10', '11', '12', '13', '14', '15', '16'],
            flags=util.HTML
        )

    def test_out_of_range_time(self):
        """Test out of range time."""

        markup = """ """

        self.assert_selector(
            markup,
            ":out-of-range",
            ['8', '9', '10', '11', '12', '13', '14'],
            flags=util.HTML
        )
soupsieve-2.7/tests/test_level4/test_past.py0000644000000000000000000000142113615410400016276 0ustar00
"""Test past selectors."""
from .. import util


class TestPast(util.TestCase):
    """Test past selectors."""

    MARKUP = """

Some text in a paragraph. Link Placeholder text.

""" def test_past(self): """Test past (should match nothing).""" self.assert_selector( self.MARKUP, "p:past", [], flags=util.HTML ) def test_not_past(self): """Test not past.""" self.assert_selector( self.MARKUP, "p:not(:past)", ["0"], flags=util.HTML ) soupsieve-2.7/tests/test_level4/test_paused.py0000644000000000000000000000164313615410400016616 0ustar00"""Test paused selectors.""" from .. import util class TestPaused(util.TestCase): """Test paused selectors.""" MARKUP = """ """ def test_paused(self): """Test paused (matches nothing).""" # Not actually sure how this is used, but it won't match anything anyways self.assert_selector( self.MARKUP, "video:paused", [], flags=util.HTML ) def test_not_paused(self): """Test not paused.""" self.assert_selector( self.MARKUP, "video:not(:paused)", ["vid"], flags=util.HTML ) soupsieve-2.7/tests/test_level4/test_placeholder_shown.py0000644000000000000000000000623313615410400021035 0ustar00"""Test placeholder shown selectors.""" from .. import util class TestPlaceholderShown(util.TestCase): """Test placeholder shown selectors.""" def test_placeholder_shown(self): """Test placeholder shown.""" markup = """ Value """ self.assert_selector( markup, ":placeholder-shown", ['0', '1', '4', '5', '6', '7', '8', '9', '10', '11', '12', '28', '32'], flags=util.HTML ) soupsieve-2.7/tests/test_level4/test_playing.py0000644000000000000000000000165413615410400017002 0ustar00"""Test playing selectors.""" from .. 
import util class TestPlaying(util.TestCase): """Test playing selectors.""" MARKUP = """ """ def test_playing(self): """Test playing (matches nothing).""" # Not actually sure how this is used, but it won't match anything anyways self.assert_selector( self.MARKUP, "video:playing", [], flags=util.HTML ) def test_not_playing(self): """Test not playing.""" self.assert_selector( self.MARKUP, "video:not(:playing)", ["vid"], flags=util.HTML ) soupsieve-2.7/tests/test_level4/test_read_only.py0000644000000000000000000000337613615410400017316 0ustar00"""Test read only selectors.""" from .. import util class TestReadOnly(util.TestCase): """Test read only selectors.""" def test_read_only(self): """Test read only.""" markup = """
Text

Text

Text

Text

Text
""" self.assert_selector( markup, "body :read-only", [ '3', '13', '14', '15', '18', '19', '20', '22', '23', '24', '25', '31', '32', '33' ], flags=util.HTML ) soupsieve-2.7/tests/test_level4/test_read_write.py0000644000000000000000000000337413615410400017465 0ustar00"""Test read write selectors.""" from .. import util class TestReadWrite(util.TestCase): """Test read write selectors.""" def test_read_write(self): """Test read write.""" markup = """
Text

Text

Text

Text

Text
""" self.assert_selector( markup, ":read-write", [ '0', '1', '2', '4', '5', '6', '7', '8', '9', '10', '11', '12', '16', '17', '21', '26', '27', '28', '29', '30' ], flags=util.HTML ) soupsieve-2.7/tests/test_level4/test_required.py0000644000000000000000000000155513615410400017157 0ustar00"""Test required selectors.""" from .. import util class TestRequired(util.TestCase): """Test required selectors.""" MARKUP = """

""" def test_required(self): """Test required.""" self.assert_selector( self.MARKUP, ":required", ['1', '2', '4', '5'], flags=util.HTML ) def test_specific_required(self): """Test specific required.""" self.assert_selector( self.MARKUP, "input:required", ['1', '2'], flags=util.HTML ) soupsieve-2.7/tests/test_level4/test_scope.py0000644000000000000000000000530613615410400016446 0ustar00"""Test scope selectors.""" from .. import util import soupsieve as sv class TestScope(util.TestCase): """Test scope selectors.""" MARKUP = """

Some text in a paragraph.
Link Direct child
Child 1 Child 2 Child 3

    """

    def test_scope_is_root(self):
        """Test scope is the root when a specific element is not the target of the select call."""

        # Scope is root when applied to a document node
        self.assert_selector(
            self.MARKUP,
            ":scope",
            ["root"],
            flags=util.HTML
        )

        self.assert_selector(
            self.MARKUP,
            ":scope > body > div",
            ["div"],
            flags=util.HTML
        )

    def test_scope_cannot_select_target(self):
        """Test that scope, the element which scope is called on, cannot be selected."""

        for parser in util.available_parsers(
                'html.parser', 'lxml', 'html5lib', 'xml'):
            soup = self.soup(self.MARKUP, parser)
            el = soup.html

            # Scope is the element we are applying the select to, and that element is never returned
            self.assertTrue(len(sv.select(':scope', el, flags=sv.DEBUG)) == 0)

    def test_scope_is_select_target(self):
        """Test that scope is the element which scope is called on."""

        for parser in util.available_parsers(
                'html.parser', 'lxml', 'html5lib', 'xml'):
            soup = self.soup(self.MARKUP, parser)
            el = soup.html

            # Scope here means the current element under select
            ids = [el.attrs['id'] for el in sv.select(':scope div', el, flags=sv.DEBUG)]
            self.assertEqual(sorted(ids), sorted(['div']))

            el = soup.body
            ids = [el.attrs['id'] for el in sv.select(':scope div', el, flags=sv.DEBUG)]
            self.assertEqual(sorted(ids), sorted(['div']))

            # `div` is the current element under select, and it has no `div` elements.
            el = soup.div
            ids = [el.attrs['id'] for el in sv.select(':scope div', el, flags=sv.DEBUG)]
            self.assertEqual(sorted(ids), sorted([]))

            # `div` does have an element with the class `.wordshere`
            ids = [el.attrs['id'] for el in sv.select(':scope .wordshere', el, flags=sv.DEBUG)]
            self.assertEqual(sorted(ids), sorted(['pre']))
soupsieve-2.7/tests/test_level4/test_target_within.py0000644000000000000000000000147213615410400020205 0ustar00
"""Test target within selectors."""
from .. import util


class TestTargetWithin(util.TestCase):
    """Test target within selectors."""

    MARKUP = """ Jump

Header 1

content

Header 2

content

    """

    def test_target_within(self):
        """Test target within."""

        self.assert_selector(
            self.MARKUP,
            "article:target-within",
            [],
            flags=util.HTML
        )

    def test_not_target_within(self):
        """Test inverse of target within."""

        self.assert_selector(
            self.MARKUP,
            "article:not(:target-within)",
            ["article"],
            flags=util.HTML
        )
soupsieve-2.7/tests/test_level4/test_user_invalid.py0000644000000000000000000000113413615410400020014 0ustar00
"""Test user invalid selectors."""
from .. import util


class TestInvalid(util.TestCase):
    """Test user invalid selectors."""

    def test_user_invalid(self):
        """Test user invalid (matches nothing)."""

        markup = """

""" self.assert_selector( markup, "input:user-invalid", [], flags=util.HTML ) self.assert_selector( markup, "input:not(:user-invalid)", ["1"], flags=util.HTML ) soupsieve-2.7/tests/test_level4/test_where.py0000644000000000000000000000136313615410400016446 0ustar00"""Test where selectors.""" from .. import util class TestWhere(util.TestCase): """Test where selectors.""" MARKUP = """

Some text in a paragraph. Link

    """

    def test_where(self):
        """Test multiple selectors with "where"."""

        self.assert_selector(
            self.MARKUP,
            ":where(span, a)",
            ["1", "2"],
            flags=util.HTML
        )

    def test_nested_where(self):
        """Test multiple nested selectors with "where"."""

        self.assert_selector(
            self.MARKUP,
            ":where(span, a:where(#\\32))",
            ["1", "2"],
            flags=util.HTML
        )
soupsieve-2.7/tests/test_nesting_1/__init__.py0000644000000000000000000000005613615410400016526 0ustar00
"""Test CSS introduced by Nesting level 1."""
soupsieve-2.7/tests/test_nesting_1/test_amp.py0000644000000000000000000000525313615410400016607 0ustar00
"""Test ampersand selectors."""
from .. import util
import soupsieve as sv


class TestAmp(util.TestCase):
    """Test ampersand selectors."""

    MARKUP = """

Some text in a paragraph.
Link Direct child
Child 1 Child 2 Child 3

    """

    def test_amp_is_root(self):
        """Test ampersand is the root when a specific element is not the target of the select call."""

        # Scope is root when applied to a document node
        self.assert_selector(
            self.MARKUP,
            "&",
            ["root"],
            flags=util.HTML
        )

        self.assert_selector(
            self.MARKUP,
            "& > body > div",
            ["div"],
            flags=util.HTML
        )

    def test_amp_cannot_select_target(self):
        """Test that ampersand, the element which scope is called on, cannot be selected."""

        for parser in util.available_parsers(
                'html.parser', 'lxml', 'html5lib', 'xml'):
            soup = self.soup(self.MARKUP, parser)
            el = soup.html

            # Scope is the element we are applying the select to, and that element is never returned
            self.assertTrue(len(sv.select('&', el, flags=sv.DEBUG)) == 0)

    def test_amp_is_select_target(self):
        """Test that ampersand is the element which scope is called on."""

        for parser in util.available_parsers(
                'html.parser', 'lxml', 'html5lib', 'xml'):
            soup = self.soup(self.MARKUP, parser)
            el = soup.html

            # Scope here means the current element under select
            ids = [el.attrs['id'] for el in sv.select('& div', el, flags=sv.DEBUG)]
            self.assertEqual(sorted(ids), sorted(['div']))

            el = soup.body
            ids = [el.attrs['id'] for el in sv.select('& div', el, flags=sv.DEBUG)]
            self.assertEqual(sorted(ids), sorted(['div']))

            # `div` is the current element under select, and it has no `div` elements.
            el = soup.div
            ids = [el.attrs['id'] for el in sv.select('& div', el, flags=sv.DEBUG)]
            self.assertEqual(sorted(ids), sorted([]))

            # `div` does have an element with the class `.wordshere`
            ids = [el.attrs['id'] for el in sv.select('& .wordshere', el, flags=sv.DEBUG)]
            self.assertEqual(sorted(ids), sorted(['pre']))
soupsieve-2.7/.gitignore
.DS_Store

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
.python-version

# celery beat schedule file
celerybeat-schedule

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# Patches
*.patch
soupsieve-2.7/LICENSE.md
MIT License

Copyright (c) 2018 - 2025 Isaac Muse

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the
Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
soupsieve-2.7/README.md
[![Donate via PayPal][donate-image]][donate-link]
[![Build][github-ci-image]][github-ci-link]
[![Coverage Status][codecov-image]][codecov-link]
[![PyPI Version][pypi-image]][pypi-link]
[![PyPI Downloads][pypi-down]][pypi-link]
[![PyPI - Python Version][python-image]][pypi-link]
[![License][license-image-mit]][license-link]

# Soup Sieve

## Overview

Soup Sieve is a CSS selector library designed to be used with [Beautiful Soup 4][bs4]. It aims to provide selecting, matching, and filtering using modern CSS selectors. Soup Sieve currently provides selectors from the CSS level 1 specifications up through the latest CSS level 4 drafts and beyond (though some are not yet implemented).

Soup Sieve was written with the intent to replace Beautiful Soup's builtin select feature, and as of Beautiful Soup version 4.7.0, it now is :confetti_ball:. Soup Sieve can also be imported in order to use its API directly for more controlled, specialized parsing.
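
As a rough illustration of using the API directly, the sketch below runs the three operations named above (selecting, matching, and filtering) on a tiny document. The markup and variable names are made up for this example, and it assumes `beautifulsoup4` and `soupsieve` are installed; treat it as a starting point rather than a complete tour of the API.

```python
from bs4 import BeautifulSoup
import soupsieve as sv

# Hypothetical markup, used only for illustration.
html = """
<div id="main">
  <p class="intro">First</p>
  <p>Second</p>
  <a href="https://example.com">Link</a>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Selecting: collect every element under `soup` that matches the selector.
paragraphs = sv.select("div > p", soup)

# Matching: test one element against a selector.
is_intro = sv.match("p.intro", paragraphs[0])

# Filtering: keep only the elements of an iterable that match.
intros = sv.filter(".intro", paragraphs)

print([p.get_text() for p in paragraphs])
print(is_intro)
print([p.get_text() for p in intros])
```

When Beautiful Soup 4.7.0 or later is installed, `soup.select("div > p")` goes through Soup Sieve as well; importing `soupsieve` directly mainly adds helpers such as `match` and `filter`.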

Soup Sieve has implemented most of the CSS selectors up through the latest CSS draft specifications, though there are a number that don't make sense in a non-browser environment. Selectors that cannot provide meaningful functionality simply do not match anything. Some of the supported selectors are:

- `.classes`
- `#ids`
- `[attributes=value]`
- `parent child`
- `parent > child`
- `sibling ~ sibling`
- `sibling + sibling`
- `:not(element.class, element2.class)`
- `:is(element.class, element2.class)`
- `parent:has(> child)`
- and [many more](https://facelessuser.github.io/soupsieve/selectors/)

## Installation

You must have Beautiful Soup already installed:

```
pip install beautifulsoup4
```

In most cases, assuming you've installed version 4.7.0 or later, that should be all you need to do. If you've installed via some alternative method and Soup Sieve is not automatically installed, you can install it directly:

```
pip install soupsieve
```

If you want to manually install it from source, first ensure that [`build`](https://pypi.org/project/build/) is installed:

```
pip install build
```

Then navigate to the root of the project, build the wheel, and install it (replacing `<ver>` with the current version):

```
python -m build -w
pip install dist/soupsieve-<ver>-py3-none-any.whl
```

## Documentation

Documentation is found here: https://facelessuser.github.io/soupsieve/.
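
As a small taste of what the documentation covers, the sketch below tries a few of the selectors listed above (`:has()`, `:not()`, `:is()`, and the `+` sibling combinator) on a made-up list. The markup and the `texts` helper are hypothetical and exist only for this example.

```python
from bs4 import BeautifulSoup
import soupsieve as sv

# Hypothetical markup, used only for illustration.
html = """
<ul id="menu">
  <li class="item"><a href="/a">A</a></li>
  <li class="item disabled">B</li>
  <li class="item"><span>C</span></li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")


def texts(selector):
    """Return the stripped text of every element matching `selector`."""
    return [el.get_text(strip=True) for el in sv.select(selector, soup)]


with_link = texts("li:has(> a)")             # items with a direct `a` child
enabled = texts("li:not(.disabled)")         # items without the `disabled` class
inline = texts("li > :is(a, span)")          # direct `a` or `span` children of items
after_disabled = texts("li.disabled + li")   # item right after a disabled one

print(with_link, enabled, inline, after_disabled)
```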

## License

MIT

[bs4]: https://beautiful-soup-4.readthedocs.io/en/latest/#

[github-ci-image]: https://github.com/facelessuser/soupsieve/workflows/build/badge.svg
[github-ci-link]: https://github.com/facelessuser/soupsieve/actions?query=workflow%3Abuild+branch%3Amain
[codecov-image]: https://img.shields.io/codecov/c/github/facelessuser/soupsieve/master.svg?logo=codecov&logoColor=aaaaaa&labelColor=333333
[codecov-link]: https://codecov.io/github/facelessuser/soupsieve
[pypi-image]: https://img.shields.io/pypi/v/soupsieve.svg?logo=pypi&logoColor=aaaaaa&labelColor=333333
[pypi-down]: https://img.shields.io/pypi/dm/soupsieve.svg?logo=pypi&logoColor=aaaaaa&labelColor=333333
[pypi-link]: https://pypi.python.org/pypi/soupsieve
[python-image]: https://img.shields.io/pypi/pyversions/soupsieve?logo=python&logoColor=aaaaaa&labelColor=333333
[license-image-mit]: https://img.shields.io/badge/license-MIT-blue.svg?labelColor=333333
[license-link]: https://github.com/facelessuser/soupsieve/blob/main/LICENSE.md
[donate-image]: https://img.shields.io/badge/Donate-PayPal-3fabd1?logo=paypal
[donate-link]: https://www.paypal.me/facelessuser
soupsieve-2.7/hatch_build.py
"""Dynamically define some metadata."""
import os
from hatchling.metadata.plugin.interface import MetadataHookInterface


def get_version_dev_status(root):
    """Get version_info without importing the entire module."""

    import importlib.util

    path = os.path.join(root, "soupsieve", "__meta__.py")
    spec = importlib.util.spec_from_file_location("__meta__", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module.__version_info__._get_dev_status()


class CustomMetadataHook(MetadataHookInterface):
    """Our metadata hook."""

    def update(self, metadata):
        """See https://ofek.dev/hatch/latest/plugins/metadata-hook/ for more information."""

        metadata["classifiers"] = [
            f"Development Status :: {get_version_dev_status(self.root)}",
            'Environment :: Console',
            'Intended Audience :: Developers',
            'License :: OSI Approved :: MIT License',
            'Operating System :: OS Independent',
            'Programming Language :: Python :: 3',
            'Programming Language :: Python :: 3.8',
            'Programming Language :: Python :: 3.9',
            'Programming Language :: Python :: 3.10',
            'Programming Language :: Python :: 3.11',
            'Programming Language :: Python :: 3.12',
            'Programming Language :: Python :: 3.13',
            'Topic :: Internet :: WWW/HTTP :: Dynamic Content',
            'Topic :: Software Development :: Libraries :: Python Modules',
            'Typing :: Typed'
        ]
soupsieve-2.7/pyproject.toml
[build-system]
requires = [
    "hatchling>=0.21.1",
]
build-backend = "hatchling.build"

[project]
name = "soupsieve"
description = "A modern CSS selector implementation for Beautiful Soup."
readme = "README.md"
license = "MIT"
requires-python = ">=3.8"
authors = [
    { name = "Isaac Muse", email = "Isaac.Muse@gmail.com" },
]
keywords = [
    "CSS",
    "HTML",
    "XML",
    "selector",
    "filter",
    "query",
    "soup"
]
dynamic = [
    "classifiers",
    "version",
]

[project.urls]
Homepage = "https://github.com/facelessuser/soupsieve"

[tool.hatch.version]
source = "code"
path = "soupsieve/__meta__.py"

[tool.hatch.build.targets.wheel]
include = [
    "/soupsieve",
]

[tool.hatch.build.targets.sdist]
include = [
    "/docs/src/markdown/**/*.md",
    "/docs/src/markdown/**/*.gif",
    "/docs/src/markdown/**/*.png",
    "/docs/src/markdown/dictionary/*.txt",
    "/docs/theme/**/*.css",
    "/docs/theme/**/*.js",
    "/docs/theme/**/*.html",
    "/requirements/*.txt",
    "/soupsieve/**/*.py",
    "/soupsieve/py.typed",
    "/tests/**/*.py",
    "/.pyspelling.yml",
    "/.coveragerc",
    "/mkdocs.yml"
]

[tool.mypy]
files = [
    "soupsieve"
]
strict = true
show_error_codes = true

[tool.hatch.metadata.hooks.custom]

[tool.ruff]
line-length = 120

lint.select = [
    "A",    # flake8-builtins
    "B",    # flake8-bugbear
    "D",    # pydocstyle
    "C4",   # flake8-comprehensions
    "N",    # pep8-naming
    "E",    # pycodestyle
    "F",    # pyflakes
    "PGH",  # pygrep-hooks
    "RUF",  # ruff
    # "UP",   # pyupgrade
    "W",    # pycodestyle
    "YTT",  # flake8-2020,
    "PERF"  # Perflint
]

lint.ignore = [
    "E741",
    "D202",
    "D401",
    "D212",
    "D203",
    "N802",
    "N801",
    "N803",
    "N806",
    "N818",
    "RUF012",
    "RUF005",
    "PGH004",
    "RUF100",
    "RUF022",
    "RUF023"
]

[tool.tox]
legacy_tox_ini = """
[tox]
isolated_build = true
envlist =
    py{38,39,310,311,312},
    lint, nolxml, nohtml5lib

[testenv]
passenv = *
deps =
    -rrequirements/tests.txt
commands =
    mypy
    pytest --cov soupsieve --cov-append {toxinidir}
    coverage html -d {envtmpdir}/coverage
    coverage xml
    coverage report --show-missing

[testenv:documents]
passenv = *
deps =
    -rrequirements/docs.txt
commands =
    mkdocs build --clean --verbose --strict
    pyspelling -j 8

[testenv:lint]
passenv = *
deps =
    -rrequirements/lint.txt
commands =
    "{envbindir}"/ruff check .

[testenv:nolxml]
passenv = *
deps =
    -rrequirements/tests-nolxml.txt
commands =
    pytest {toxinidir}

[testenv:nohtml5lib]
passenv = *
deps =
    -rrequirements/tests-nohtml5lib.txt
commands =
    pytest {toxinidir}

[pytest]
filterwarnings =
    ignore:\nCSS selector pattern:UserWarning
"""

[tool.pytest.ini_options]
filterwarnings = [
    "ignore:The 'strip_cdata':DeprecationWarning"
]
soupsieve-2.7/PKG-INFO
Metadata-Version: 2.4
Name: soupsieve
Version: 2.7
Summary: A modern CSS selector implementation for Beautiful Soup.
Project-URL: Homepage, https://github.com/facelessuser/soupsieve
Author-email: Isaac Muse
License-Expression: MIT
License-File: LICENSE.md
Keywords: CSS,HTML,XML,filter,query,selector,soup
Classifier: Development Status :: 5 - Production/Stable
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Internet :: WWW/HTTP :: Dynamic Content
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Typing :: Typed
Requires-Python: >=3.8
Description-Content-Type: text/markdown
