Scraping Public Records · 8 min

Is It Legal to Scrape Public Records? A Builder's Guide (Not a Lawyer's)

By Forrest Webber · June 17, 2026

A Texas appraisal district will email you an entire county's property data for about four dollars. It's completely legal — a public record, handed over by the people whose job is to hand it over. So why does scraping and reselling public records feel like it lives in a gray area?

Because the honest answer is: mostly it doesn't — but the parts that do matter, matter a lot.

I'm not a lawyer, and I want to be clear about that up front: this is an operator's read, not legal advice, and if you're about to do something big you should pay a real attorney for a real opinion. But I've pulled a lot of county data, built a property-data business on it, and spent real hours understanding where the lines are. So here's the builder's-eye view.

Start with the word "public"

Public records are public by law. That's not a loophole or a gray area — it's the entire design.

In Texas, the Public Information Act (Government Code Chapter 552) starts from the presumption that government information belongs to the people who paid for it. The federal FOIA carries the same spirit at the national level. Appraisal rolls, deed records, court dockets, business filings — these were made open on purpose, because a functioning society needs to know who owns what, who owes what, and what the government is doing.

When a Texas appraisal district emails you its property roll for about four dollars (every owner, every situs address, every assessed value), that's not a leak. That's the system working exactly as designed. The Tax Code even addresses the online side of it: §25.027 limits which appraisal-record details a district may post on the internet, which makes it the statute that restricts a few sensitive fields (more on that line below).

So the baseline is friendlier than most people assume. The default state of a public record is "you're allowed to have this." The real questions are about how you get it and what you do with it.

Public record ≠ PII (this is the line that matters)

Here's the distinction that separates clean operators from people who get into trouble.

A public record is a fact the government maintains in the open: this parcel is owned by this entity, sold on this date, for this amount. A piece of personally identifiable information — a Social Security number, a date of birth, a driver's license number — is a different animal, and a lot of it is specifically redacted out of public records precisely because it's sensitive.

The mistake is treating "I found it" as "I can use it." Owner name and mailing address on a deed? Public, intended to be public, fine. A phone number you skip-traced and want to robo-dial? Now you've walked into the TCPA, and that statute doesn't care that the underlying record was public. It cares that you're about to call someone who didn't ask to hear from you. Statutory damages run $500 per call or text, and up to $1,500 each when the violation is willful.

Public-by-law gives you the data. It does not give you blanket permission for every downstream use of that data. Keep those two ideas separate and you avoid most of the trouble.
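One way to keep those two ideas separate in practice is an allowlist: ship only the fields you've decided are public facts, and drop everything else by default. Here's a minimal Python sketch. The field names (owner_name, situs_address, and so on) are made up for illustration; real county exports name their columns differently, and deciding which fields are safe to redistribute is a judgment call this code doesn't make for you.

```python
# Hypothetical field names; real appraisal exports vary by county.
PUBLIC_FIELDS = {
    "owner_name",
    "situs_address",
    "mailing_address",
    "assessed_value",
}


def keep_public_fields(record: dict) -> dict:
    """Return a copy of the record with only allowlisted fields."""
    return {key: value for key, value in record.items() if key in PUBLIC_FIELDS}


def dropped_fields(record: dict) -> list:
    """List the fields the allowlist would remove, for review."""
    return sorted(set(record) - PUBLIC_FIELDS)
```

An allowlist fails safe: if a county adds a new column next year, it stays out of your product until you've looked at it and decided it belongs there.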

"Scraping" is a method, not a crime

People use "scraping" like it's inherently shady. It isn't. Scraping is just automating the act of looking at a page you're already allowed to look at. If a human can open the page in a browser without logging in, a script reading that same page is doing the same thing, faster.

The closest thing we have to a marquee ruling here is hiQ Labs v. LinkedIn. hiQ scraped public LinkedIn profiles; LinkedIn invoked the Computer Fraud and Abuse Act, the federal "unauthorized access" law, to try to stop them. The Ninth Circuit largely sided with hiQ on the core point: scraping data that's publicly available, without authentication, generally isn't "unauthorized access" under the CFAA. The CFAA is about breaking into systems (bypassing logins, defeating access controls), not about reading what's already open to the world.

That said, the case got messy on contract grounds, which is the real lesson: the district court later found that hiQ had breached LinkedIn's user agreement. The CFAA wasn't hiQ's problem. The terms of service were.

Where the actual lines are

Strip away the disclaimers and here's what genuinely matters when you're pulling public data at scale:

  • Don't hammer the server. Rate-limiting isn't a courtesy, it's the difference between "automated access" and "denial-of-service." If your scraper degrades a county's site for everyone else, you've crossed from reading into abusing, and that's where access-fraud arguments get teeth. Throttle. Cache. Be a polite guest.
  • Terms of service are a contract, not a law — but contracts bind you. A site's ToS can't make a public record un-public, but it can govern your relationship with that specific website. Violating ToS usually isn't a crime; it can still get you sued or banned. Read them. Weigh them.
  • Copyright lives in the compilation, not the facts. You can't copyright the fact that a parcel sold for $300,000 — facts are free. But someone's curated, formatted, value-added database of those facts can carry a thin copyright in its arrangement. Pull the underlying facts; don't lift a competitor's finished product wholesale.
  • Respect the PII line on redistribution. Reselling owner names and public sale prices is one thing. Repackaging sensitive personal data the government tried to redact is another. Know which you're holding.
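The throttle-and-cache advice in the first bullet can be sketched in a few lines of Python. This is a rough illustration, not a production scraper: the PoliteFetcher class, its two-second default interval, and the in-memory cache are all my assumptions, and the right delay depends on the site you're reading.

```python
import time
from urllib.request import urlopen


def http_get(url: str) -> bytes:
    """Default fetcher: a plain standard-library GET."""
    with urlopen(url, timeout=30) as resp:
        return resp.read()


class PoliteFetcher:
    """Fetch pages with a minimum delay between requests and a cache.

    Hypothetical helper for illustration. The cache lives in memory only,
    so it is lost when the process exits.
    """

    def __init__(self, fetch=http_get, min_interval=2.0,
                 clock=time.monotonic, sleep=time.sleep):
        self._fetch = fetch
        self._min_interval = min_interval  # seconds between real requests
        self._clock = clock
        self._sleep = sleep
        self._cache = {}
        self._last_request = None

    def get(self, url: str) -> bytes:
        # Cache: never ask the server twice for the same page.
        if url in self._cache:
            return self._cache[url]
        # Throttle: wait out whatever remains of the minimum interval.
        if self._last_request is not None:
            wait = self._min_interval - (self._clock() - self._last_request)
            if wait > 0:
                self._sleep(wait)
        body = self._fetch(url)
        self._last_request = self._clock()
        self._cache[url] = body
        return body


# Usage (the URL is a placeholder, not a real endpoint):
# fetcher = PoliteFetcher(min_interval=5.0)
# page = fetcher.get("https://example.com/parcel/123")
```

The fetch function, clock, and sleep are passed in so the pacing logic can be tested without touching anyone's server. For a long-running job you'd likely want the cache on disk instead of in memory.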

The cleanest path skips scraping entirely

Here's the part that makes all of this almost moot for the highest-value data: you often don't have to scrape at all.

My friend Andy showed me AlcoholSalesTracker years ago and summed up the whole model in one sentence: it's just getting data from publicly-listed places, improving the UI/UX, and customers buy it. That stuck with me. Because the government will frequently hand you the bulk file directly if you simply ask.

That four-dollar appraisal roll? I didn't scrape it row by row. I sent a one-line public information request and they emailed me a complete, current, authorized export — every field, clean, with payment on record. Harris, Tarrant, and Collin counties publish their bulk rolls as free downloads. No script, no rate-limiting worries, no ToS to weigh. Just a sanctioned copy of public data, handed over by the people whose job is to hand it over.

When the bulk-purchase door is open, walk through it. It's faster, it's cleaner, and it sidesteps every scraping question in one move.

The honest summary

Public records are public by law. Scraping data you're allowed to view is generally fine, hiQ backs that up, and the CFAA is about breaking in — not about reading what's open. The real constraints are practical and downstream: don't pound the servers, respect the terms you agreed to, don't lift a competitor's compiled database, keep PII separate from public facts, and remember that getting a phone number legally doesn't make calling it legal.

The government is slow and the access is ugly. That ugliness is the opportunity — clean it up, package it well, and people pay.

The data's just sitting there, and most of the time they'll email it to you for four dollars. Most people will never bother to ask. You could.
