A Rust CLI that calculates the file size of a web page when loaded with all external resources.

Logo of a white weighing machine in a green nature colored circle

webweigh is a command-line tool that scans HTML files in a directory, calculates their total page size (including referenced assets like CSS, JavaScript, and images), and optionally injects this size information into specified HTML elements.

Key capabilities:

  • Scan mode: Calculate total website size without modifying files
  • Injection mode: Update HTML files with calculated page sizes
  • Analysis mode: Detailed breakdown of page size calculations for debugging

This is a Rust rewrite of the Python script originally used by solar.lowtech-website, offering correct measurements, significantly improved performance and new features.

Installation

Binary releases

Currently, I only build for Linux x86. You can grab the executable from the releases page.

Cargo

cargo install --git https://codeberg.org/Pontoporeia/webweigh

From Source

Prerequisites

  • Rust 1.70+ (recommended)
  • Cargo package manager

git clone https://codeberg.org/Pontoporeia/webweigh.git
cd webweigh
cargo install --path .

See BUILDING.md for detailed build instructions.

Usage

Basic Usage

Scan mode (calculate sizes without modifying files):

webweigh --directory /path/to/website

Injection mode (update HTML files with calculated sizes):

webweigh --directory /path/to/website --selector ".page-size"

Common Examples

Scan a website (read-only analysis):

webweigh --directory ./dist

Update page sizes in elements with class "page-size":

webweigh --directory ./dist --selector ".page-size"

Process with exclusions and exceptions:

webweigh --directory ./dist --selector ".size" \
  --exclude "portfolio" "demo" \
  --except "portfolio/index.html"

Remove base URL prefix from asset paths:

webweigh --directory ./dist --selector ".size" \
  --base-url "https://example.com"
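
As a rough illustration of what removing a base URL prefix means, the sketch below strips the prefix from an absolute asset URL so that the remainder can be looked up under the scanned directory. The function name `strip_base_url` is hypothetical, and this is not necessarily how webweigh implements it.

```rust
/// Hypothetical helper: drop `base_url` from the front of an asset reference.
/// References that do not start with the prefix are returned unchanged.
fn strip_base_url<'a>(asset: &'a str, base_url: &str) -> &'a str {
    // Ignore a trailing slash on the base so "https://example.com/" and
    // "https://example.com" behave the same.
    let base = base_url.trim_end_matches('/');
    match asset.strip_prefix(base) {
        // Only accept the match at a path boundary, so "https://example.com"
        // does not strip part of "https://example.community/...".
        Some(rest) if rest.is_empty() || rest.starts_with('/') => rest,
        _ => asset,
    }
}

fn main() {
    let base = "https://example.com";
    // Absolute URL under the base: the prefix is removed.
    println!("{}", strip_base_url("https://example.com/css/site.css", base));
    // Already site-relative: left as-is.
    println!("{}", strip_base_url("/img/logo.svg", base));
}
```

The path-boundary check is a design choice of this sketch: a plain prefix comparison would also match unrelated hosts that merely begin with the same characters.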

Dry run (calculate and validate selectors but write nothing):

webweigh --directory ./dist --selector ".page-size" --dry-run

Verbose processing with detailed logs:

webweigh --directory ./dist --selector "#size-info" -vv

Silent operation (no output):

webweigh --directory ./dist --selector "body" --silent

JSON output (machine-readable, one JSON object per line):

webweigh --directory ./dist --output-format json
webweigh --directory ./dist --selector ".page-size" --output-format json | tee baseline.json

Strict mode (fail on broken links / missing assets):

webweigh --directory ./dist --strict

Diff against a baseline (detect page weight regressions):

webweigh --directory ./dist --output-format json | tee baseline.json
# ... later, after changes ...
webweigh --directory ./dist --output-format json --diff baseline.json
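
The comparison behind a baseline diff can be sketched as follows. This is a minimal illustration that assumes both runs have already been parsed into maps from page path to size in bytes. The JSON field names are not documented here, so parsing is left out, and `Change` and `diff_sizes` are hypothetical names rather than webweigh's API.

```rust
use std::collections::BTreeMap;

/// Hypothetical result of comparing one page between two runs.
#[derive(Debug, PartialEq)]
enum Change {
    /// Page only exists in the current run.
    Added(u64),
    /// Page only exists in the baseline.
    Removed(u64),
    /// Page exists in both runs with a different size.
    Resized { from: u64, to: u64, delta: i64 },
}

/// Compare two page -> bytes maps and report only the pages that differ.
fn diff_sizes(
    baseline: &BTreeMap<String, u64>,
    current: &BTreeMap<String, u64>,
) -> BTreeMap<String, Change> {
    let mut out = BTreeMap::new();
    for (page, &old) in baseline {
        match current.get(page) {
            Some(&new) if new != old => {
                let delta = new as i64 - old as i64;
                out.insert(page.clone(), Change::Resized { from: old, to: new, delta });
            }
            Some(_) => {} // unchanged pages are not reported
            None => {
                out.insert(page.clone(), Change::Removed(old));
            }
        }
    }
    for (page, &new) in current {
        if !baseline.contains_key(page) {
            out.insert(page.clone(), Change::Added(new));
        }
    }
    out
}

fn main() {
    let baseline = BTreeMap::from([("index.html".to_string(), 1000u64)]);
    let current = BTreeMap::from([("index.html".to_string(), 1500u64)]);
    for (page, change) in diff_sizes(&baseline, &current) {
        println!("{page}: {change:?}");
    }
}
```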

Analyze a specific page (detailed breakdown for debugging):

webweigh --directory ./dist --analyze-page "/"
webweigh --directory ./dist --analyze-page "/portfolio/" --base-url "https://example.com"

This will show:

  • Base HTML size
  • Assets grouped by type (Stylesheets, Scripts, Images, Fonts, etc.)
  • Size and percentage contribution of each asset type
  • Detailed list of all assets sorted by size

This is useful for debugging discrepancies between webweigh's calculations and browser dev tools.
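
To make the breakdown concrete, here is a small sketch of how sizes could be grouped by asset type and turned into percentages of the page total. The function `breakdown` and the type labels passed to it are illustrative assumptions, not webweigh's internal data model.

```rust
use std::collections::BTreeMap;

/// Group asset sizes by type and return (type, bytes, percent of page total),
/// largest first. The base HTML document is counted under "HTML".
fn breakdown(html_bytes: u64, assets: &[(&str, u64)]) -> Vec<(String, u64, f64)> {
    let mut by_type: BTreeMap<String, u64> = BTreeMap::new();
    by_type.insert("HTML".to_string(), html_bytes);
    for &(kind, bytes) in assets {
        *by_type.entry(kind.to_string()).or_insert(0) += bytes;
    }

    let total: u64 = by_type.values().sum();
    let mut rows: Vec<(String, u64, f64)> = by_type
        .into_iter()
        .map(|(kind, bytes)| {
            // Guard against an empty page to avoid dividing by zero.
            let pct = if total == 0 { 0.0 } else { bytes as f64 * 100.0 / total as f64 };
            (kind, bytes, pct)
        })
        .collect();

    // Sort by size, descending; ties keep alphabetical order.
    rows.sort_by(|a, b| b.1.cmp(&a.1));
    rows
}

fn main() {
    let assets = [("Images", 5000), ("Stylesheets", 1000), ("Images", 2000)];
    for (kind, bytes, pct) in breakdown(2000, &assets) {
        println!("{kind:<12} {bytes:>8} B  {pct:>5.1}%");
    }
}
```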

Excludes & Exceptions

Exclude Examples:

# excludes an entire directory
-e static/portfolio

# excludes several directories at once
-e static/portfolio templates

# regex: excludes .tmp files in static/
-e "static/.*\.tmp$" 

# regex: excludes all .html files in content/microblog/
-e "content/microblog/.*\.html$"

Exceptions Examples:

# excludes all HTML in content/microblog/ except index.html in that directory
-e "content/microblog/.*\.html$" --except index.html

# excludes content/temp/ directory except files ending with important.txt
-e content/temp --except "important\.txt$"
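
The way exclusions and exceptions interact can be sketched as a single decision: a path is skipped when an exclude pattern matches it and no exception pattern does. To stay dependency-free, the sketch below treats patterns as plain substrings, whereas webweigh also accepts regular expressions. `is_skipped` is a hypothetical name.

```rust
/// Hypothetical decision function: a path is skipped when it matches an
/// exclude pattern, unless an exception pattern also matches.
/// Patterns are plain substrings here (an assumption of this sketch).
fn is_skipped(path: &str, excludes: &[&str], excepts: &[&str]) -> bool {
    let excluded = excludes.iter().any(|p| path.contains(*p));
    let excepted = excepts.iter().any(|p| path.contains(*p));
    excluded && !excepted
}

fn main() {
    let excludes = ["content/temp"];
    let excepts = ["important.txt"];
    for path in [
        "content/temp/notes.txt",
        "content/temp/very-important.txt",
        "content/blog/post.html",
    ] {
        println!("{path}: skipped = {}", is_skipped(path, &excludes, &excepts));
    }
}
```

Note that an exception only matters inside the exclusion scope: a path that no exclude pattern matches is processed regardless of the exception list.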

Command Line Options

Usage: webweigh [OPTIONS] --directory <DIR>

Options:
  -d, --directory <DIR>          Directory to traverse (required)
      --base-url <URL>           Base URL prefix to remove from asset paths
      --selector <SELECTOR>      CSS selector for size injection (optional — scans without modification if omitted)
      --analyze-page <PATH>      Analyze a specific page with detailed breakdown. Paths are relative
                                 to --directory. Use a directory path like "/" or "/portfolio/"
                                 to find index.html, or a file path like "/about.html" for a specific file.
      --dry-run                  Calculate sizes and validate selectors but do not write any files
  -e, --exclude <PATTERN>...     Exclude paths/patterns (supports literal paths and regex)
      --except <PATTERN>...      Exception patterns within exclusion scope
      --output-format <FORMAT>   Output format: text (human-readable), json (JSON Lines), or csv.
                                 Default: text.
      --strict                   Treat unresolved asset references as errors instead of warnings.
                                 An asset is unresolved when its URL cannot be mapped to a file on disk
                                 (e.g. external CDN links, missing local files).
      --diff <FILE>              Compare results against a previous JSON baseline file. Shows which
                                 pages grew or shrank, and by how much. The baseline is a JSON file
                                 previously produced with --output-format json.
  -v, --verbose...               Logging verbosity: -v (errors), -vv (info), -vvv (debug), -vvvv (trace)
  -s, --silent                   Suppress all output
  -h, --help                     Show help information
  -V, --version                  Show version
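
The path rules described for --analyze-page can be sketched as below. This is an assumption-laden illustration: it decides "directory or file" purely from a trailing slash, while the real tool may inspect the file system. `resolve_page` is a hypothetical name.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical resolver for an --analyze-page argument.
fn resolve_page(directory: &Path, page: &str) -> PathBuf {
    // Paths are relative to --directory, so drop any leading slash.
    let relative = page.trim_start_matches('/');
    let mut full = directory.join(relative);
    // "/" and "/portfolio/" name a directory: look for its index.html.
    if relative.is_empty() || page.ends_with('/') {
        full.push("index.html");
    }
    full
}

fn main() {
    let dir = Path::new("./dist");
    for page in ["/", "/portfolio/", "/about.html"] {
        println!("{page} -> {}", resolve_page(dir, page).display());
    }
}
```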

Verbosity Levels

  • Default: Shows only the final statistics report (no log output)
  • -v: Shows error-level logs only
  • -vv: Shows info-level logs (processing start, configuration)
  • -vvv: Shows debug-level logs (detailed processing information)
  • -vvvv: Shows trace-level logs (maximum detail)
  • --silent: No output at all (suppresses even the statistics report)
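
The mapping above from flag count to log level can be sketched as a small function. The `LogLevel` enum stands in for a real logging crate's level type, and two details are assumptions of the sketch: --silent takes precedence over -v, and more than four -v flags are treated as trace.

```rust
/// Stand-in for a logging crate's level filter.
#[derive(Debug, PartialEq, Clone, Copy)]
enum LogLevel {
    Off,
    Error,
    Info,
    Debug,
    Trace,
}

/// Map the number of -v flags (and --silent) to a log level.
fn log_level(verbose_count: u8, silent: bool) -> LogLevel {
    if silent {
        // Assumption: --silent wins over any -v flags.
        return LogLevel::Off;
    }
    match verbose_count {
        0 => LogLevel::Off,   // default: statistics report only, no logs
        1 => LogLevel::Error, // -v
        2 => LogLevel::Info,  // -vv
        3 => LogLevel::Debug, // -vvv
        _ => LogLevel::Trace, // -vvvv (and, by assumption, anything above)
    }
}

fn main() {
    for count in 0..=4u8 {
        println!("{count} x -v => {:?}", log_level(count, false));
    }
}
```

The sketch only models log output. It does not model the statistics report, which --silent also suppresses.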

Supported Assets

Direct References:

  • HTML files (.html, .htm)
  • CSS stylesheets (<link rel="stylesheet">)
  • JavaScript files (<script src>)
  • Images (<img src>, CSS url())
  • Fonts (CSS @font-face, url())
  • Icons (<link rel="icon">)

Dynamic Assets (detected in JS/CSS):

  • Dynamic script loading (script.src = "...")
  • Dynamic imports (import(), require())
  • Fetch calls (fetch("/api/data.json"))
  • CSS imports (@import url(...))
  • Asset assignments (link.href = "...")
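
To give a feel for this kind of detection, the sketch below pulls the URL out of `fetch("...")` calls in a JavaScript source string. webweigh uses the regex crate for asset discovery, but this example deliberately uses plain string search so that it runs without dependencies, and it handles only double-quoted literals. `find_fetch_urls` is a hypothetical name.

```rust
/// Collect the string literal passed to each `fetch("...")` call in `source`.
/// Only double-quoted literals are recognised in this simplified sketch.
fn find_fetch_urls(source: &str) -> Vec<String> {
    let needle = "fetch(\"";
    let mut urls = Vec::new();
    let mut rest = source;
    while let Some(start) = rest.find(needle) {
        // Text immediately after the opening quote.
        let after = &rest[start + needle.len()..];
        match after.find('"') {
            Some(end) => {
                urls.push(after[..end].to_string());
                // Continue scanning after the closing quote.
                rest = &after[end + 1..];
            }
            None => break, // unterminated literal: stop scanning
        }
    }
    urls
}

fn main() {
    let js = r#"fetch("/api/data.json").then(r => r.json()); fetch("/api/more.json");"#;
    for url in find_fetch_urls(js) {
        println!("{url}");
    }
}
```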

Dependencies

Core runtime dependencies:

  • clap: Command-line argument parsing (builder API, no proc-macro derive)
  • scraper: HTML parsing and CSS selector evaluation
  • regex: Pattern matching for asset discovery in CSS and JS
  • rayon: Work-stealing parallel processing
  • log: Logging facade (stderr logger is inlined; no env_logger required)
  • anyhow: Error handling in the binary entry point

Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines on how to contribute to webweigh.

For build instructions and development setup, see BUILDING.md.

AI Disclosure

This tool was developed with the assistance of large language models (LLMs) throughout its history. The following models were used:

  • Claude Sonnet 3.5 through Claude Sonnet 4.6 (Anthropic)
  • DeepSeek V4 Pro (DeepSeek)

AI assistance was employed for code generation, refactoring, test authoring, documentation, debugging, and project scaffolding. All AI-generated output was reviewed, tested, and validated by a human before inclusion.

Credits

Icon: Phosphor Icons

License

This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0).