Lesson 7 of 9
DOM Scraping, The Safe Way
Sometimes the value is in none of the good places — not the dataLayer, not the URL, not a cookie. It is only printed on the page. Then you scrape the DOM. This is a legitimate tool, but it is the most fragile one you have: it breaks the moment a developer changes the markup, an A/B test swaps the layout, or the page renders in another language. Treat it as a last resort and make it as robust as you can.
Choose stable selectors
The order of preference, most stable first: data-* attributes, then ids, then classes, then positional selectors.
A data-* attribute put there for tracking is gold — it exists to be read and rarely changes. A positional selector like "the third div" is the opposite: it works today and silently breaks on the next redesign.
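One way to encode that preference order is to try a short list of selectors from most to least stable and stop at the first hit. The sketch below is illustrative only: the selector strings and the helper name firstMatch are hypothetical, and the root argument is assumed to be document or any element that exposes querySelector.

```javascript
// Hypothetical selectors for illustration; substitute the ones your page actually exposes.
var PRICE_SELECTORS = [
  '[data-plan-price]',   // 1. tracking attribute: most stable
  '#plan-price',         // 2. id
  '.plan-card .price'    // 3. class: least stable option worth listing
  // Positional selectors are deliberately left out.
];

// Returns the first element that matches, plus the selector that found it,
// or undefined when nothing matches. It never throws.
function firstMatch(root, selectors) {
  for (var i = 0; i < selectors.length; i++) {
    try {
      var el = root.querySelector(selectors[i]);
      if (el) {
        return { element: el, selector: selectors[i] };
      }
    } catch (e) {
      // querySelector throws on an invalid selector string; skip it and try the next one.
    }
  }
  return undefined;
}
```

In a browser this would be called as firstMatch(document, PRICE_SELECTORS). Returning the winning selector alongside the element makes it possible to notice when the scrape has quietly fallen back from the data-* attribute to a weaker selector.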
Read defensively, then normalise
Assume the element might be missing and the text might be messy. Never let a scrape throw — a broken variable can take other tags down with it. Return undefined and move on.
function () {
  var el = document.querySelector('[data-plan-price]');
  if (!el) return undefined;                          // guard: element may be gone
  var raw = el.textContent || '';                     // " $1,499.00 "
  var num = parseFloat(raw.replace(/[^0-9.]/g, ''));  // 1499
  return isNaN(num) ? undefined : num;                // never return NaN
}
Then test it in every state that matters: logged in and out, on sale and not, empty cart and full, and in any other locale the site serves. A scrape that only works for the happy path is a slow-motion data quality bug.
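Locale is where a simple normaliser tends to fail first: stripping everything except digits and dots turns a German-style "1.499,00 €" into 1.499. The sketch below is one possible locale-tolerant normaliser, paired with a small table of sample strings to check it against. The function name parsePrice, the sample strings, and the heuristic itself (a final separator followed by one or two digits is treated as the decimal mark) are assumptions for illustration, not a general-purpose currency parser.

```javascript
// Returns a number, or undefined when no price can be read. It never throws.
function parsePrice(raw) {
  if (typeof raw !== 'string') return undefined;

  // Keep only digits and the two common separators.
  var cleaned = raw.replace(/[^0-9.,]/g, '');
  if (!/[0-9]/.test(cleaned)) return undefined;

  var lastSep = Math.max(cleaned.lastIndexOf(','), cleaned.lastIndexOf('.'));
  var intPart = cleaned;
  var fracPart = '';

  if (lastSep !== -1) {
    var tail = cleaned.slice(lastSep + 1);
    // Assumption: 1 or 2 trailing digits mean a decimal mark;
    // anything else (for example 3 digits) is a thousands separator.
    if (tail.length === 1 || tail.length === 2) {
      intPart = cleaned.slice(0, lastSep);
      fracPart = tail;
    }
  }

  var digits = intPart.replace(/[.,]/g, '');
  var num = parseFloat(digits + (fracPart ? '.' + fracPart : ''));
  return isNaN(num) ? undefined : num;
}

// Hypothetical sample strings, one per state or locale worth checking.
var cases = [
  { raw: ' $1,499.00 ', expected: 1499 },      // US format
  { raw: '1.499,00 €',  expected: 1499 },      // dot thousands, comma decimal
  { raw: '1 499,50 kr', expected: 1499.5 },    // space thousands, comma decimal
  { raw: '€1.499',      expected: 1499 },      // no decimals shown
  { raw: 'Free',        expected: undefined }, // no number on the page
  { raw: '',            expected: undefined }  // element present but empty
];

// Collect every sample whose parsed value differs from the expected one.
var failures = [];
for (var i = 0; i < cases.length; i++) {
  var got = parsePrice(cases[i].raw);
  if (got !== cases[i].expected) {
    failures.push({ raw: cases[i].raw, got: got, expected: cases[i].expected });
  }
}
```

An empty failures array means every sample parsed as expected. The heuristic is ambiguous by design for strings such as "1,499" with no decimals, which it reads as 1499, so a site that prints three decimal places would need a different rule.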
Key takeaway
Scrape only when the value lives nowhere better. Prefer data-* attributes over ids over classes over positions, guard against missing elements, normalise the text to a clean value, never throw, and test across states and locales.