Why Linking to the Right Wikidata Nodes Matters for Local Business AI Visibility (2026)

by GEMflush Research Team · 6 min read

When someone asks ChatGPT for a lawyer in California or Perplexity for a clinic in Austin, the answer doesn’t come from the whole internet. It comes from a filtered slice of the knowledge graph: entities whose instance of (P31) is the right type (law firm, clinic), whose located in (P131) is the right place (California, Austin), and whose country (P17) is the United States. Miss the right nodes, and you’re not in the slice. You’re not even in the room. This post uses live Wikidata data to show which nodes actually dominate that slice, and why linking to them is the difference between being recommended and being invisible.

What are hub nodes? (And why should you care?)

Hub nodes in Wikidata are the items that (a) huge numbers of US entities with a website already link to, and (b) every “find me a [type] in [place]” query effectively filters on. Think of them as the busiest intersections in the graph, reached through four properties: country (P17), located in (P131), instance of (P31), and industry (P452). When an AI assistant answers “lawyer in California,” it’s looking for entities that sit at the intersection of law firm, California, and United States. The nodes that define those intersections, such as Q30 (United States), Q99 (California), and Q613142 (law firm), are the hub nodes. Linking your business to them doesn’t just “add you to Wikidata.” It puts you in the discovery set that ChatGPT and Perplexity actually return. For how to get your business into the graph with the right structure, see Wikidata publishing for business.
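The type-plus-place filter described above can be written down as a SPARQL query. The sketch below is illustrative only: `HubFilter`, `buildDiscoveryQuery`, and `buildQueryUrl` are hypothetical names, not part of any GEMflush script, and nothing here claims that an AI assistant runs this exact query. It assumes the public Wikidata Query Service endpoint and its predefined `wd:` / `wdt:` prefixes, and it uses the property path `wdt:P131*` on the assumption that many items point at a city or county rather than directly at the state.

```typescript
// Hypothetical helper, not taken from the article's script.
interface HubFilter {
  typeQid: string;    // P31 (instance of), e.g. "Q613142" for law firm
  placeQid: string;   // P131 (located in), e.g. "Q99" for California
  countryQid: string; // P17 (country), e.g. "Q30" for the United States
}

const QID_PATTERN = /^Q[1-9][0-9]*$/;

function buildDiscoveryQuery(filter: HubFilter, limit: number = 50): string {
  // Reject anything that is not a plain QID so the query text stays well-formed.
  const qids = [filter.typeQid, filter.placeQid, filter.countryQid];
  for (const qid of qids) {
    if (!QID_PATTERN.test(qid)) {
      throw new Error("Not a Wikidata QID: " + qid);
    }
  }
  if (limit < 1 || Math.floor(limit) !== limit) {
    throw new Error("limit must be a positive integer");
  }
  return [
    "SELECT ?item ?itemLabel ?website WHERE {",
    "  ?item wdt:P31 wd:" + filter.typeQid + " ;",
    // P131* walks the located-in chain upward (city -> county -> state).
    "        wdt:P131* wd:" + filter.placeQid + " ;",
    "        wdt:P17 wd:" + filter.countryQid + " ;",
    "        wdt:P856 ?website .",
    '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }',
    "}",
    "LIMIT " + limit,
  ].join("\n");
}

// Builds a GET URL for the public Wikidata Query Service.
function buildQueryUrl(query: string): string {
  return "https://query.wikidata.org/sparql?format=json&query=" + encodeURIComponent(query);
}

// Example: law firms located in California, in the United States.
const lawFirmsInCalifornia = buildDiscoveryQuery({
  typeQid: "Q613142",
  placeQid: "Q99",
  countryQid: "Q30",
});
```

An entity missing any one of the three links drops out of the result, which is the "intersection" behavior the paragraph describes. Passing the output of `buildQueryUrl` to `fetch` would run the query; that step is left out so the sketch stays offline.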

The data: which nodes do US entities with a website use?

We ran SPARQL against the public Wikidata Query Service for all US entities that have an official website (P17=Q30, P856). Below are the instance-of (P31), located-in (P131), and industry (P452) values that show up most. These are the hub nodes in practice—the ones that already anchor local-business discovery.

Top instance-of (P31) nodes — US entities with website

| Type | Entities |
| --- | --- |
| business | 20,002 |
| nonprofit organization | 18,883 |
| radio station | 12,454 |
| high school | 10,635 |
| movie theater | 6,742 |
| organization | 6,528 |
| city in the United States | 5,967 |
| online exhibition | 4,129 |
| school district | 3,397 |
| school | 2,949 |
| public company | 2,706 |
| medical organization | 2,524 |
| hospital | 2,490 |

So “business” and “nonprofit organization” lead clearly, with radio stations and high schools next and movie theaters well behind. The graph is crowded at the top; the takeaway for local businesses (law firms, clinics, real estate agencies) is that you need to link to the specific type nodes that queries use (e.g. law firm Q613142, health care facility Q1774898), not just “organization.”

Top located-in (P131) nodes — US entities with website

| Place | Entities |
| --- | --- |
| California | 3,592 |
| New York | 2,951 |
| Texas | 2,866 |
| Pennsylvania | 2,533 |
| Washington, D.C. | 2,283 |
| Michigan | 2,065 |
| Ohio | 1,917 |
| New York City | 1,822 |
| Florida | 1,788 |
| Manhattan | 1,742 |

California leads, followed by New York and Texas; states, major cities (New York City, Washington, D.C.), and boroughs (Manhattan) are the location hub nodes. If your entity isn’t linked to the right state or city node, a query for “in California” or “in Manhattan” won’t surface you.

Top industry (P452) nodes — US entities with website

| Industry | Entities |
| --- | --- |
| higher education | 1,742 |
| video game industry | 809 |
| retail | 782 |
| financial services | 614 |
| software industry | 577 |
| business and professional associations | 322 |
| environment | 300 |
| health care | 237 |

Industry tagging (P452) is where many local businesses (health care, legal, real estate) can stand out by linking to the canonical industry nodes that AI systems and SPARQL use. Counts here are far lower than for type or place: health care appears on only 237 entities.

Data note: March 2026. Counts from the public Wikidata Query Service. US = P17 Q30, has website = P856. Run pnpm tsx scripts/wikidata-hub-nodes-report.ts to refresh; output is reports/wikidata-hub-nodes-report.json.

Why this matters for AI visibility and GEO

  • Queries are built from these nodes. “Law firm” + “California” + “United States” maps to P31, P131, and P17. Your entity either links to those same nodes or it doesn’t show up. No node, no recommendation.
  • Discovery set, not just “in the graph.” Getting into Wikidata is step one. Linking to the hub nodes that queries filter on is what gets you into the result set when people ask by type and location.
  • For agencies: “We don’t just add your client to Wikidata—we connect them to the same hub nodes that ChatGPT and Perplexity filter on, so they show up when people ask by city, state, and type.” That’s the pitch.
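The "links to those same nodes or it doesn’t" test from the list above can be sketched as a small coverage check. This is a simplified illustration under stated assumptions: `Claims` is a hypothetical flattened view of an item's statements (property ID to value QIDs), not the shape of the Wikidata API response, and `missingHubLinks` is an invented name. It only looks at direct values, so an entity whose P131 is a city would need its chain resolved to the state before this check. Q43229 is the generic "organization" item.

```typescript
// Hypothetical, simplified view of an item's statements: property ID -> value QIDs.
type Claims = Record<string, string[]>;

interface HubRequirement {
  property: string; // e.g. "P31"
  label: string;    // human-readable name, used only for reporting
  anyOf: string[];  // hub-node QIDs that satisfy the requirement
}

// Returns the requirements the entity does not satisfy with any direct value.
function missingHubLinks(claims: Claims, required: HubRequirement[]): HubRequirement[] {
  return required.filter((req) => {
    const values = claims[req.property] || [];
    const hasHub = req.anyOf.some((qid) => values.indexOf(qid) !== -1);
    return !hasHub;
  });
}

// What a "law firm in California" query would need, using QIDs from the article.
const lawFirmInCalifornia: HubRequirement[] = [
  { property: "P31", label: "instance of: law firm", anyOf: ["Q613142"] },
  { property: "P131", label: "located in: California", anyOf: ["Q99"] },
  { property: "P17", label: "country: United States", anyOf: ["Q30"] },
];

// An entity typed only as the generic "organization" (Q43229).
const genericOrg: Claims = {
  P31: ["Q43229"],
  P131: ["Q99"],
  P17: ["Q30"],
};

const gaps = missingHubLinks(genericOrg, lawFirmInCalifornia).map((req) => req.property);
// gaps is ["P31"]: the place and country match, but the type node does not.
```

The example entity is "in the graph" with the right state and country, yet it still falls outside the law-firm slice because its only type is the generic one.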

Bottom line: what are Wikidata hub nodes?

Hub nodes are the high-value Wikidata items (country, location, type, industry) that both (1) many US entities with a website already link to and (2) AI assistants and SPARQL use to answer “find me a [type] in [place].” Linking your business to these nodes—via P17, P31, P131, P452, and related properties—is what puts you in the discovery set for ChatGPT, Claude, and Perplexity. Fix the source; the recommendations follow.

How GEMflush uses hub nodes

GEMflush publishes businesses to Wikidata with the properties that matter: P17 (country), P31 (instance of), P131 (located in), P159 (headquarters location), P452 (industry), P1995 (health specialty). We map to canonical hub nodes (Q30 for the US, Q613142 for law firms, Q1660104 for real estate, Q1774898 for health care facilities) and to the same state and city QIDs that show up in this report. So clients aren’t just “in the graph”; they’re connected to the nodes that queries and AI actually use. That’s knowledge graph engineering for GEO.
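To make the property-to-node mapping concrete, here is one way such statements could be written out as QuickStatements V1 commands (tab-separated lines, `CREATE` followed by `LAST` rows). This is an assumption-laden sketch, not GEMflush's publishing pipeline: `BusinessProfile` and `toQuickStatements` are hypothetical names, the business and its website are made up, and a real submission would also need references and a check for an existing item.

```typescript
// Hypothetical input shape; field names are illustrative only.
interface BusinessProfile {
  label: string;        // English label for the new item
  typeQid: string;      // P31 (instance of)
  countryQid: string;   // P17 (country)
  locatedInQid: string; // P131 (located in)
  industryQid?: string; // P452 (industry), optional
  website?: string;     // P856 (official website), optional
}

// Emits QuickStatements V1 commands: one tab-separated command per line.
function toQuickStatements(profile: BusinessProfile): string {
  if (/["\t\n]/.test(profile.label)) {
    throw new Error("Label must not contain quotes, tabs, or newlines");
  }
  const lines: string[] = [
    "CREATE",
    'LAST\tLen\t"' + profile.label + '"',
    "LAST\tP31\t" + profile.typeQid,
    "LAST\tP17\t" + profile.countryQid,
    "LAST\tP131\t" + profile.locatedInQid,
  ];
  if (profile.industryQid) {
    lines.push("LAST\tP452\t" + profile.industryQid);
  }
  if (profile.website) {
    // URL values are passed as quoted strings.
    lines.push('LAST\tP856\t"' + profile.website + '"');
  }
  return lines.join("\n");
}

// Made-up example business using hub QIDs named in the article.
const commands = toQuickStatements({
  label: "Example Law Group",
  typeQid: "Q613142",   // law firm
  countryQid: "Q30",    // United States
  locatedInQid: "Q99",  // California
  website: "https://example.com",
});
```

Each `LAST` row attaches one statement to the item created by `CREATE`, so the three hub links (type, country, place) arrive together with the website.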

Get started with GEMflush or see AI visibility for agencies.

Methodology

  • Scope: US entities with official website (P17=Q30, P856 present).
  • Rankings: For each property (P31, P131, P17, P452), we count how many of those entities use each value and report the top N. Because the scope fixes P17 to Q30, only the P31, P131, and P452 rankings are tabulated above.
  • Script: wikidata-hub-nodes-report.ts writes reports/wikidata-hub-nodes-report.json. Run it to reproduce or refresh.
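The article does not show the contents of `wikidata-hub-nodes-report.ts`, so the following is only a guess at the shape of a query that would produce rankings like the ones above. `buildTopValuesQuery` is a hypothetical name. The sketch assumes the scope stated in the methodology (P17 = Q30, P856 present) and counts distinct items per value in a subquery before fetching labels, which tends to be cheaper on the public Wikidata Query Service.

```typescript
const PROPERTY_PATTERN = /^P[1-9][0-9]*$/;

// Builds a query ranking the values of one property among US entities with a website.
function buildTopValuesQuery(property: string, topN: number = 10): string {
  if (!PROPERTY_PATTERN.test(property)) {
    throw new Error("Not a Wikidata property ID: " + property);
  }
  if (topN < 1 || Math.floor(topN) !== topN) {
    throw new Error("topN must be a positive integer");
  }
  return [
    "SELECT ?value ?valueLabel ?entities WHERE {",
    "  {",
    // Inner query: count distinct items per value, keep only the top N.
    "    SELECT ?value (COUNT(DISTINCT ?item) AS ?entities) WHERE {",
    "      ?item wdt:P17 wd:Q30 ;",
    "            wdt:P856 ?website ;",
    "            wdt:" + property + " ?value .",
    "    }",
    "    GROUP BY ?value",
    "    ORDER BY DESC(?entities)",
    "    LIMIT " + topN,
    "  }",
    // Outer query: attach English labels to the ranked values.
    '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }',
    "}",
    "ORDER BY DESC(?entities)",
  ].join("\n");
}

// One query per ranked property in the report.
const rankingQueries = ["P31", "P131", "P452"].map((property) => buildTopValuesQuery(property));
```

`COUNT(DISTINCT ?item)` matters here because an entity with two official-website statements would otherwise be counted twice.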
