Single-page mode
Not every URL is a list. An article, a company profile, a one-off dashboard page — these are one record, not a repeating item. datahelm:scrap:generate normally requires a repeating pattern to detect (it errors with "Could not detect a repeating item list" on a page that has none); --single-page skips that requirement entirely:
php artisan datahelm:scrap:generate "https://example.com/about-us" \
--single-page --robot-name=AboutUs
This treats the whole page as a single item: field detectors (title, price, image, description, …) run directly against <body> instead of a detected list-item sample, and the resulting blueprint is exactly what you'd expect:
{
  "item_selector": "body",
  "pagination": { "strategy": "none", "css": "" }
}
No engine changes are involved: CrawlEngine already treats an item_selector matching exactly one node as a one-item crawl, so --single-page is purely a generation-time shortcut. --search-filters still works alongside it — each filter URL becomes its own single-page item, useful for scraping the same kind of one-off page (e.g. a profile) across several known URLs.
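The "one matching node means one item" idea can be sketched outside the package. The snippet below is a minimal illustration in Python using only the standard library. It is not CrawlEngine's code: the crawl() helper, the sample pages, and the ElementTree paths standing in for CSS selectors are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample pages (well-formed so ElementTree can parse them).
PAGE = """<html><body>
<h1>About us</h1>
<p class="description">We build crawlers.</p>
</body></html>"""

LISTING = """<html><body>
<ul>
<li><h2>Alpha</h2></li>
<li><h2>Beta</h2></li>
</ul>
</body></html>"""


def crawl(html, item_path, title_path):
    """Return one record per node matched by item_path (hypothetical helper)."""
    root = ET.fromstring(html)
    records = []
    for node in root.findall(item_path):
        # Fields are read relative to the item node, whatever that node is.
        title = node.find(title_path)
        records.append({"title": title.text if title is not None else None})
    return records


# "body" matches exactly one node, so the same loop yields a single record.
single = crawl(PAGE, "body", ".//h1")
# A list-item path matches several nodes and yields one record per node.
many = crawl(LISTING, ".//li", ".//h2")
print(single)
print(many)
```

The extraction loop is identical in both calls; only the number of nodes the item selector matches differs, which is why a generation-time flag can be enough.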
--main-content — skip the site chrome
By default, single-page detection sees the whole <body> — including the nav bar, footer and sidebars, whose links and text can leak into the detected fields. --main-content (the equivalent of Firecrawl's onlyMainContent) scopes detection to the page's primary content region:
php artisan datahelm:scrap:generate "https://en.wikipedia.org/wiki/Web_scraping" \
--single-page --main-content --robot-name=WikiArticle
The detector looks for <main> / [role=main] / #content-style containers and bakes the region in as the item selector — on Wikipedia this produces item_selector: "main#content". Two safety rules:
- When no region is confidently found, it falls back to <body> and says so in the generation notes.
- When the region's CSS selector isn't unique on the page (it would split one page into several items), it also falls back to <body>.
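The two fallback rules above can be sketched as follows. This is a simplified Python illustration, not the package's detector: the candidate list, the pick_item_selector() name, and the note wording are assumptions, and the real detector evidently builds more specific selectors (such as main#content) than this sketch does.

```python
import xml.etree.ElementTree as ET

# Hypothetical candidates: (ElementTree path to test, CSS selector to report).
CANDIDATES = [
    (".//main", "main"),
    (".//*[@role='main']", "[role=main]"),
    (".//*[@id='content']", "#content"),
]

ARTICLE = "<html><body><nav>Home</nav><main><h1>Title</h1></main></body></html>"
PLAIN = "<html><body><p>No landmarks here</p></body></html>"
SPLIT = "<html><body><main>A</main><main>B</main></body></html>"


def pick_item_selector(html):
    """Return (item_selector, note), falling back to body when unsure."""
    body = ET.fromstring(html).find("body")
    for path, css in CANDIDATES:
        matches = body.findall(path)
        if len(matches) == 1:
            return css, f"main content region found: {css}"
        if len(matches) > 1:
            # Non-unique selector: it would split one page into several items.
            return "body", f"{css} matches {len(matches)} nodes; using body"
    # No candidate matched: no region was found at all.
    return "body", "no main content region found; using body"


for sample in (ARTICLE, PLAIN, SPLIT):
    print(pick_item_selector(sample))
```

Returning a note alongside the selector mirrors the behaviour described above, where a fallback to <body> is reported in the generation notes rather than applied silently.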
Pairs well with Markdown output
Single-page mode is the natural companion of the markdown field type and output format: point --single-page --main-content at an article, set a markdown field for the content region, and export an LLM-ready document from one URL — the package's equivalent of Firecrawl's /scrape endpoint.
How other tools handle this
Scrapy has no dedicated single-page concept: you write a parse() that reads fields off response directly and yields one item instead of looping over a selector list (the same idea as item_selector: "body"). Firecrawl draws the line explicitly with two endpoints, /scrape (one URL → one result) versus /crawl (many results); --single-page is this package's /scrape.

