Quick start
Run your first kumo command.
Once kumo is on your PATH, crawl a host into the data tree:
kumo scrape example.com --max-pages 20
Each page is written as pages/<host>/<path>.md under your data directory: a
JSON front-matter block with every structured field, followed by the page
content as Markdown. Read back what you crawled, offline:
kumo pages example.com -o table
Before a full crawl, look at the frontier without fetching anything:
kumo scrape example.com --dry-run -o url # the URLs a crawl would start from
kumo sitemap example.com # the same, from robots.txt and sitemaps
Work with a single page when you do not want a whole crawl:
kumo page https://example.com/ -o json | jq .title
kumo links https://example.com/ # its outbound links, as URIs
Add -o jsonl to stream records into the rest of your tools, and --limit to
cap any command. See the CLI reference for the full surface.