Why Linking to the Right Wikidata Nodes Matters for Local Business AI Visibility (2026)
When someone asks ChatGPT for a lawyer in California or Perplexity for a clinic in Austin, the answer doesn’t come from the whole internet. It comes from a filtered slice of the knowledge graph: entities that are instance of something (law firm, clinic), located in somewhere (California, Austin), and country United States. Miss the right nodes, and you’re not in the slice. You’re not even in the room. This post uses live Wikidata data to show which nodes actually dominate that slice—and why linking to them is the difference between being recommended and being invisible.
What are hub nodes? (And why should you care?)
Hub nodes in Wikidata are the items that (a) huge numbers of US entities with a website already link to, and (b) every “find me a [type] in [place]” query effectively filters on. Think of them as the busiest intersections in the graph: country (P17), location (P131), type (P31), industry (P452). When an AI assistant answers “lawyer in Florida,” it’s looking for entities that sit at the intersection of law firm, Florida, and United States. The nodes that define those intersections—Q30 (United States), Q99 (California), Q613142 (law firm), and so on—are the hub nodes. Linking your business to them doesn’t just “add you to Wikidata.” It puts you in the discovery set that ChatGPT and Perplexity actually return. For how to get your business into the graph with the right structure, see Wikidata publishing for business.
The data: which nodes do US entities with a website use?
We ran SPARQL against the public Wikidata Query Service for all US entities that have an official website (P17=Q30, P856). Below are the instance-of (P31), located-in (P131), and industry (P452) values that show up most. These are the hub nodes in practice—the ones that already anchor local-business discovery.
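A count query of this kind can be sketched as follows. This is a minimal illustration, not the contents of wikidata-hub-nodes-report.ts: the helper names `buildHubNodeQuery` and `runQuery`, the query shape, and the User-Agent string are all assumptions. Broad aggregate queries like this can hit the public endpoint's time limit, so a real run may need to be narrowed or split.

```typescript
// Sketch only: helper names and query shape are assumptions,
// not the contents of wikidata-hub-nodes-report.ts.

const WDQS_ENDPOINT = "https://query.wikidata.org/sparql";

// Build a query that counts, for one property, how many US entities
// (P17 = Q30) with an official website (P856) use each value.
function buildHubNodeQuery(property: string, limit: number = 20): string {
  if (!/^P\d+$/.test(property)) {
    throw new Error(`Expected a property ID such as "P31", got "${property}"`);
  }
  if (!Number.isInteger(limit) || limit < 1) {
    throw new Error(`Expected a positive integer limit, got ${limit}`);
  }
  return [
    "SELECT ?value (COUNT(DISTINCT ?item) AS ?entities) WHERE {",
    "  ?item wdt:P17 wd:Q30 ;",
    "        wdt:P856 ?website ;",
    `        wdt:${property} ?value .`,
    "}",
    "GROUP BY ?value",
    "ORDER BY DESC(?entities)",
    `LIMIT ${limit}`,
  ].join("\n");
}

// Send the query to the public endpoint and return the parsed JSON.
// Requires a runtime with a global fetch (for example Node 18+).
async function runQuery(query: string): Promise<unknown> {
  const url = `${WDQS_ENDPOINT}?query=${encodeURIComponent(query)}`;
  const response = await fetch(url, {
    headers: {
      Accept: "application/sparql-results+json",
      // Placeholder: identify your own tool and contact address here.
      "User-Agent": "hub-node-sketch/0.1 (contact@example.com)",
    },
  });
  if (!response.ok) {
    throw new Error(`Query service returned HTTP ${response.status}`);
  }
  return response.json();
}
```

Calling `buildHubNodeQuery("P31")`, `buildHubNodeQuery("P131")`, and `buildHubNodeQuery("P452")` would produce one query per table below; the returned values are QIDs, so labels would need a separate lookup.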
Top instance-of (P31) nodes — US entities with website
| Type | Entities |
|---|---|
| business | 20,002 |
| nonprofit organization | 18,883 |
| radio station | 12,454 |
| high school | 10,635 |
| movie theater | 6,742 |
| organization | 6,528 |
| city in the United States | 5,967 |
| online exhibition | 4,129 |
| school district | 3,397 |
| school | 2,949 |
| public company | 2,706 |
| medical organization | 2,524 |
| hospital | 2,490 |
So “business” and “nonprofit organization” lead, at roughly 20,000 and 18,900 entities, with radio stations, high schools, and movie theaters next. The top of the graph is crowded with broad types. The takeaway for local businesses (law firms, clinics, real estate) is that you need to link to the specific type nodes that queries use (e.g. law firm Q613142, health care facility Q1774898), not just “organization.”
Top located-in (P131) nodes — US entities with website
| Place | Entities |
|---|---|
| California | 3,592 |
| New York | 2,951 |
| Texas | 2,866 |
| Pennsylvania | 2,533 |
| Washington, D.C. | 2,283 |
| Michigan | 2,065 |
| Ohio | 1,917 |
| New York City | 1,822 |
| Florida | 1,788 |
| Manhattan | 1,742 |
California and New York dominate; states and major cities (NYC, Manhattan, D.C.) are the location hub nodes. If your entity isn’t linked to the right state or city node, “in California” or “in Manhattan” won’t surface you.
Top industry (P452) nodes — US entities with website
| Industry | Entities |
|---|---|
| higher education | 1,742 |
| video game industry | 809 |
| retail | 782 |
| financial services | 614 |
| software industry | 577 |
| business and professional associations | 322 |
| environment | 300 |
| health care | 237 |
Industry tagging (P452) is where many local businesses can stand out—health care, legal, real estate—by linking to the canonical industry nodes that AI systems and SPARQL use.
Data note: March 2026. Counts from the public Wikidata Query Service. US = P17 Q30, has website = P856. Run pnpm tsx scripts/wikidata-hub-nodes-report.ts to refresh; output is reports/wikidata-hub-nodes-report.json.
Why this matters for AI visibility and GEO
- Queries are built from these nodes. “Law firm” + “California” + “United States” maps to P31, P131, and P17. Your entity either links to those same nodes or it doesn’t show up. No node, no recommendation.
- Discovery set, not just “in the graph.” Getting into Wikidata is step one. Linking to the hub nodes that queries filter on is what gets you into the result set when people ask by type and location.
- For agencies: “We don’t just add your client to Wikidata—we connect them to the same hub nodes that ChatGPT and Perplexity filter on, so they show up when people ask by city, state, and type.” That’s the pitch.
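The "no node, no recommendation" point can be made concrete with a small in-memory check. This is a hedged sketch: the `Claims` shape, the `matchesFilter` helper, and both sample entities are hypothetical, and it says nothing about how any particular assistant actually queries the graph. The QIDs for law firm, California, and the United States are the ones quoted in this post; Q43229 is the generic "organization" item.

```typescript
// Property ID -> list of QIDs the entity links to (hypothetical shape).
type Claims = Record<string, string[]>;

// True only if the entity links to every node the filter asks for.
function matchesFilter(claims: Claims, filter: Record<string, string>): boolean {
  return Object.entries(filter).every(([property, qid]) =>
    (claims[property] ?? []).includes(qid),
  );
}

// "Law firm in California": P31 = Q613142, P131 = Q99, P17 = Q30.
const lawFirmInCalifornia = { P31: "Q613142", P131: "Q99", P17: "Q30" };

// Hypothetical entities for illustration.
const linkedToHubs: Claims = { P31: ["Q613142"], P131: ["Q99"], P17: ["Q30"] };
const genericOnly: Claims = { P31: ["Q43229"], P17: ["Q30"] }; // Q43229 = organization

console.log(matchesFilter(linkedToHubs, lawFirmInCalifornia)); // true
console.log(matchesFilter(genericOnly, lawFirmInCalifornia)); // false
```

This sketch checks direct links only. A live query may also follow P131 transitively, since a business located in Los Angeles is in California only by way of the city item.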
Bottom line: what are Wikidata hub nodes?
Hub nodes are the high-value Wikidata items (country, location, type, industry) that both (1) many US entities with a website already link to and (2) AI assistants and SPARQL use to answer “find me a [type] in [place].” Linking your business to these nodes—via P17, P31, P131, P452, and related properties—is what puts you in the discovery set for ChatGPT, Claude, and Perplexity. Fix the source; the recommendations follow.
How GEMflush uses hub nodes
GEMflush publishes businesses to Wikidata with the properties that matter: P17 (country), P31 (instance of), P131 (located in), P159 (headquarters), P452 (industry), P1995 (medical specialty). We map to canonical hub nodes—Q30 for the US, Q613142 for law firms, Q1660104 for real estate, Q1774898 for health care facilities—and to the same state and city QIDs that show up in this report. So clients aren’t just “in the graph”; they’re connected to the nodes that queries and AI actually use. That’s knowledge graph engineering for GEO.
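As an illustration of what "connected to the nodes" means in practice, the sketch below audits a draft entity for hub properties that have no value yet. It is an assumption-laden example, not GEMflush's implementation: `HUB_PROPERTIES`, `missingHubProperties`, and the draft entity are hypothetical names introduced here.

```typescript
// The four hub properties named in this post, with their labels.
const HUB_PROPERTIES: Record<string, string> = {
  P17: "country",
  P31: "instance of",
  P131: "located in",
  P452: "industry",
};

// Return the hub property IDs for which the entity has no value.
function missingHubProperties(claims: Record<string, string[]>): string[] {
  return Object.keys(HUB_PROPERTIES).filter(
    (property) => (claims[property] ?? []).length === 0,
  );
}

// Hypothetical draft: has a country and a type, but no location or industry.
const draft = { P17: ["Q30"], P31: ["Q613142"] };

console.log(missingHubProperties(draft)); // [ 'P131', 'P452' ]
```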
Get started with GEMflush or see AI visibility for agencies.
Methodology
- Scope: US entities with official website (P17=Q30, P856 present).
- Rankings: For each property (P31, P131, P17, P452), we count how many of those entities use each value and report the top N.
- Script: wikidata-hub-nodes-report.ts writes reports/wikidata-hub-nodes-report.json. Run it to reproduce or refresh.
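The ranking step described above can be sketched as a small tally over (entity, value) pairs. This is an assumed shape for illustration, not the script's actual code: the `Row` and `ValueCount` types and the `topValues` helper are hypothetical, and the sample rows are made up.

```typescript
interface Row {
  item: string; // entity identifier
  value: string; // QID used as the property value
}

interface ValueCount {
  value: string;
  entities: number;
}

// Count distinct entities per value, then keep the n most used values.
function topValues(rows: Row[], n: number): ValueCount[] {
  const itemsByValue = new Map<string, Set<string>>();
  for (const { item, value } of rows) {
    const items = itemsByValue.get(value) ?? new Set<string>();
    items.add(item); // a Set ignores repeated (item, value) pairs
    itemsByValue.set(value, items);
  }
  return Array.from(itemsByValue, ([value, items]) => ({
    value,
    entities: items.size,
  }))
    // Highest count first; ties broken by value for a stable order.
    .sort((a, b) => b.entities - a.entities || a.value.localeCompare(b.value))
    .slice(0, n);
}

// Made-up P31 rows: two law firms (one listed twice) and one health care facility.
const sampleRows: Row[] = [
  { item: "entity-1", value: "Q613142" },
  { item: "entity-2", value: "Q613142" },
  { item: "entity-1", value: "Q613142" },
  { item: "entity-3", value: "Q1774898" },
];

console.log(topValues(sampleRows, 2));
// [ { value: 'Q613142', entities: 2 }, { value: 'Q1774898', entities: 1 } ]
```

Counting distinct entities rather than raw rows keeps an entity with duplicate statements from inflating a value's total.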
Internal links
- Which US Industries Have the Biggest Knowledge Graph Gap?
- Wikidata: Why This Free Knowledge Base Matters for Local Business AI Visibility
- US Law Firms in Wikidata by State, US Medical Clinics in Wikidata by State, US Real Estate in Wikidata by State
- Wikidata Local Business Coverage by City
- For Agencies
- Law Firm Visibility in ChatGPT | Medical Clinic Visibility in ChatGPT | Real Estate Agent Visibility in ChatGPT