Floki: HTML parser with CSS selectors
Elixir HTML parser with CSS selectors and multiple parser backends.
Floki is an HTML parsing library for Elixir that converts HTML documents into structured node trees and supports CSS selector queries. It uses mochiweb_html as its default parser backend, but can be configured to use alternative parsers such as fast_html (C-based, built on lexbor) or html5ever (Rust-based). The library represents each HTML element as a tuple containing its tag name, attributes, and child nodes, and provides functions for parsing, searching, and manipulating HTML content. Common use cases include web scraping, HTML processing, and document transformation in Elixir applications.
Multiple Parser Backends
Supports three HTML parsers: mochiweb_html (the default), fast_html (C-based), and html5ever (Rust-based), which offer different performance and correctness trade-offs.
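As a rough sketch of how a backend might be selected, the fragment below sets the parser application-wide. It assumes the fast_html package has already been added to the project's dependencies; the module name and config key are the ones used in Floki's README, and the file location is the conventional one for a Mix project.

```elixir
# config/config.exs
import Config

# Use fast_html instead of the default mochiweb_html backend.
# Assumes :fast_html is listed in deps in mix.exs.
config :floki, :html_parser, Floki.HTMLParser.FastHtml
```

A parser can also be chosen for a single call by passing the html_parser option, for example Floki.parse_document(html, html_parser: Floki.HTMLParser.Html5ever), which leaves the global default untouched.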
CSS Selector Support
Implements CSS selector syntax for searching nodes, including attribute selectors, combinators, and pseudo-selectors, for flexible HTML querying.
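The three selector families mentioned above can be illustrated with a short script. This is a sketch only: it assumes it is run as a standalone script with network access so that Mix.install can fetch Floki, the version requirement is an assumption, and the sample markup and example.com URL are made up for illustration.

```elixir
# Standalone script; the version requirement is an assumption.
Mix.install([{:floki, "~> 0.37"}])

# Sample markup, written without whitespace between tags to keep the tree simple.
html =
  ~s(<ul id="links">) <>
    ~s(<li class="item"><a href="https://example.com/docs">Docs</a></li>) <>
    ~s(<li class="item"><a href="https://github.com/philss/floki">Source</a></li>) <>
    ~s(</ul>)

{:ok, document} = Floki.parse_document(html)

# Attribute selector: anchors whose href starts with the given prefix
github_links = Floki.find(document, "a[href^='https://github.com']")

# Child combinator: anchors that are direct children of li.item
hrefs = Floki.attribute(document, "li.item > a", "href")

# Pseudo-selector: the first li among its siblings, reduced to its text
first_text = document |> Floki.find("li:first-child") |> Floki.text()

IO.inspect(github_links, label: "github_links")
IO.inspect(hrefs, label: "hrefs")
IO.inspect(first_text, label: "first_text")
```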
Tuple-Based Representation
Uses a simple tuple structure {tag_name, attributes, children_nodes} to represent HTML nodes, making it easy to pattern match and manipulate in Elixir.
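Because nodes are plain tuples, a tree can be walked with ordinary pattern matching and no Floki calls at all. The sketch below uses a hypothetical TreeWalk module and a hand-written tree in the {tag_name, attributes, children_nodes} shape to collect every href value.

```elixir
defmodule TreeWalk do
  # Hypothetical helper, not part of Floki: collect every href in a node tree.

  # Element node: take its own href attributes, then recurse into its children.
  def hrefs({tag, attrs, children}) when is_binary(tag) and is_list(attrs) do
    own = for {"href", value} <- attrs, do: value
    own ++ hrefs(children)
  end

  # A list of nodes: gather results from each node in order.
  def hrefs(nodes) when is_list(nodes), do: Enum.flat_map(nodes, &hrefs/1)

  # Text nodes are binaries; these and any other node kinds contribute nothing.
  def hrefs(_other), do: []
end

tree = [
  {"section", [{"id", "content"}],
   [
     {"p", [{"class", "headline"}], ["Floki"]},
     {"a", [{"href", "https://github.com/philss/floki"}], ["Github page"]}
   ]}
]

IO.inspect(TreeWalk.hrefs(tree))
# => ["https://github.com/philss/floki"]
```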
# Parse HTML document and find elements using CSS selectors
html = """
<html>
<body>
<section id="content">
<p class="headline">Floki</p>
<span class="headline">Enables search using CSS selectors</span>
<a href="https://github.com/philss/floki">Github page</a>
</section>
</body>
</html>
"""
{:ok, document} = Floki.parse_document(html)
# Find elements by CSS selector
Floki.find(document, "p.headline")
# => [{"p", [{"class", "headline"}], ["Floki"]}]
Adds initial support for the :has pseudo-selector for finding elements containing specific child elements.
- This version adds initial support for the :has pseudo-selector
- Support for div:has(h1), div:has(h1, p, span), div:has(p.foo), and div:has(img[src='url']) selectors
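A minimal sketch of the :has pseudo-selector in use, under a few assumptions: the script runs standalone with network access for Mix.install, the version requirement is a guess at a release that includes :has, and the markup and ids are invented for the example.

```elixir
# Standalone script; the version requirement is an assumption about
# which releases include :has support.
Mix.install([{:floki, "~> 0.38"}])

html = ~s(<div id="with-heading"><h1>Title</h1></div><div id="plain"><p>Text</p></div>)

{:ok, document} = Floki.parse_document(html)

# Keep only the divs that contain an h1, then read their id attributes
ids = Floki.attribute(document, "div:has(h1)", "id")

IO.inspect(ids)
```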
Move the regex declaration from a module attribute to inside the function. This fix keeps Floki compatible with the upcoming OTP 28.
- Add Elixir 1.18 to the CI workflow
- Bump ex_doc from 0.35.1 to 0.37.1
- Fix versions we describe in README.md
- Bump credo from 1.7.10 to 1.7.11
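The regex change in this release can be pictured with a small sketch. The module below is hypothetical and is not Floki's source; it only shows the general pattern of building a regex inside the function body rather than storing a compiled one in a module attribute.

```elixir
defmodule RegexPlacement do
  # Hypothetical module for illustration, not Floki's actual code.

  # The pattern being moved away from: a regex compiled into a module attribute.
  #   @whitespace ~r/\s+/

  # The regex is created inside the function body instead.
  def collapse_whitespace(text) do
    whitespace = ~r/\s+/
    Regex.replace(whitespace, text, " ")
  end
end

IO.inspect(RegexPlacement.collapse_whitespace("Floki   parses\n HTML"))
# => "Floki parses HTML"
```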
Add CSS escape function, fix raw_html encoding bug, and drop support for older Elixir versions.
- Add Floki.css_escape/1 function
- Fix bug propagating identity encoder in raw_html/2
- Remove support for Elixir 1.13 and OTP 22
- Bump credo from 1.7.8 to 1.7.9
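One plausible use of Floki.css_escape/1 is building a selector from a value that contains characters with special meaning in CSS. The sketch below assumes a standalone script with network access for Mix.install; the version requirement, markup, and id are illustrative assumptions, and the exact escaped string is printed rather than stated here.

```elixir
# Standalone script; the version requirement is an assumption.
Mix.install([{:floki, "~> 0.37"}])

html = ~s(<div id="user.name">Ada</div>)
{:ok, document} = Floki.parse_document(html)

# An id containing ".", which unescaped would read as "#user" plus class "name"
raw_id = "user.name"
selector = "#" <> Floki.css_escape(raw_id)

IO.inspect(selector, label: "selector")
IO.inspect(Floki.find(document, selector), label: "matches")
```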