Bundles

roles

GA4 Analytics Specialist

Role knowledge for using GA4 and BigQuery export data to answer analytics questions, diagnose conversion changes, and produce decision-ready analysis.

Trusted draft

Problems solved

  • diagnose conversion drops
  • plan GA4 BigQuery analysis
  • reconcile GA4 UI and BigQuery differences
  • review ecommerce funnel performance
  • produce GA4 analysis briefs

Measured performance

Each task was scored on eight criteria, 0-2 points per criterion. Baseline used the same model with no bundle context. Bundle-assisted used the same task with the GA4 Analytics Specialist bundle loaded as context.

42/48 OKB-assisted vs 17/48 baseline
Lift +25 pts / +147%

Measured against no-bundle baseline.

Tasks 3/3

Benchmark tasks passed.

Run gpt-4o-mini

Temperature 0.2, evaluated Jun 23, 2026.

How to replicate

  1. Use model openai/gpt-4o-mini with temperature 0.2.
  2. Run each task prompt once with the baseline system prompt and no bundle context.
  3. Run the same task again with the bundle-assisted system prompt and the GA4 Analytics Specialist bundle files loaded before the user task.
  4. Score each output against the listed rubric criteria, assigning 0, 1, or 2 points per criterion.
  5. Compare baseline score, OKB-assisted score, and percentage lift for each task.

Baseline condition

Run the task with no bundle context.

System:
You are an analytics assistant. Answer the user's question directly and carefully. Do not invent access to tools or data you do not have.

Bundle-assisted condition

Run the same task with the bundle context loaded from bundles/roles/ga4-analytics-specialist.

System:
You are an analytics assistant using the OpenKnowledgeBank GA4 Analytics Specialist bundle. Follow the bundle instructions, safety boundaries, workflows, deliverable formats, and evaluation criteria. Do not invent access to tools or data you do not have.
User content:
Bundle-assisted runs use the same user task after a Bundle context block containing the GA4 Analytics Specialist bundle files from bundles/roles/ga4-analytics-specialist.

Model coverage

Class Model Baseline OKB-assisted Lift
Lower-cost LLM gpt-4o-mini 17/48 42/48 +25 pts / +147%
General-purpose LLM gpt-4.1 25/48 46/48 +21 pts / +84%
Reasoning LLM o3 31/48 47/48 +16 pts / +52%

Task evidence

Conversion drop diagnosis 5/16 → 14/16
Baseline 5/16 OKB 14/16 Lift +9 pts / +180%

Baseline result

Produced a generic diagnosis memo with plausible causes and actions, but did not say the cause cannot be known from the drop percentage alone. It omitted source discipline, missing-evidence framing, and measurement-health structure.

Bundle improvement

Stated that the cause cannot be determined from the percentage alone, then separated available evidence, missing evidence, decomposition, measurement checks, hypotheses, next actions, and source note.

Prompt used

System:
You are an analytics assistant. Answer the user's question directly and carefully. Do not invent access to tools or data you do not have.
User:
An ecommerce store reports that GA4 purchases dropped 28% last week compared with the previous week. The user asks: "Why did purchases drop, and what should we do?"

Create a short diagnosis memo. Assume the agent has no direct tool access unless the user provides data.
Criterion Baseline OKB
Direct answer 1 2
Metric definitions 0 1
Evidence discipline 0 2
Tool honesty 2 2
Source discipline 0 2
Actionability 1 2
Safety 0 1
Output fit 1 2
GA4 UI vs BigQuery reconciliation 6/16 → 15/16
Baseline 6/16 OKB 15/16 Lift +9 pts / +150%

Baseline result

Gave a high-level reconciliation answer but did not clearly start with neither number being automatically right. It missed important alignment checks such as table suffixes, event_date versus event_timestamp, timezone conversion, reporting identity, and source note.

Bundle improvement

Began with neither number being automatically right, separated active users from distinct user_pseudo_id, requested exact GA4 settings and SQL, and included table/date/timezone/identity checks with a source note.

Prompt used

System:
You are an analytics assistant. Answer the user's question directly and carefully. Do not invent access to tools or data you do not have.
User:
A user says: "GA4 says we had 12,430 active users last week, but my BigQuery query returns 13,180 distinct user_pseudo_id values for the same date range. Which number is right, and how should I reconcile this?"

Create a concise reconciliation note. Assume you cannot run tools unless the user provides data. Do not invent the user's exact GA4 settings or BigQuery SQL.
Criterion Baseline OKB
Direct answer 0 2
Defines both numbers 1 2
Requests exact settings and query logic 1 2
Alignment checks 1 2
Expected vs unresolved differences 0 1
Source discipline 0 2
Tool honesty 2 2
Actionable next step 1 2
GA4 BigQuery query plan 6/16 → 13/16
Baseline 6/16 OKB 13/16 Lift +7 pts / +117%

Baseline result

Produced a plausible high-level plan and avoided unsupported SQL, but missed GA4 export specifics, detailed date/table handling, session/user logic, ecommerce caveats, and visible source discipline.

Bundle improvement

Separated revenue, traffic volume, and conversion efficiency; included GA4 export table/date behavior, validation against GA4 UI and backend orders, and a source note. Remaining misses included gclid/ad-platform joins and deeper session/deduplication detail.

Prompt used

System:
You are an analytics assistant. Answer the user's question directly and carefully. Do not invent access to tools or data you do not have.
User:
A user asks: "I need a BigQuery query plan to analyze whether paid search revenue fell last week because conversion efficiency dropped or because traffic volume changed. We use GA4 ecommerce export. What should I query and what caveats should I watch for?"

Create a query plan, not SQL. Assume you cannot inspect the user's dataset directly.
Criterion Baseline OKB
Plan not unsupported SQL 2 2
Metric decomposition 1 2
GA4 export specifics 0 2
User/session logic 0 1
Paid search attribution caveats 1 1
Revenue/ecommerce caveats 0 1
Validation/source discipline 1 2
Actionability 1 2

Agent affordances

Tools 5

GA4, BigQuery, Google Tag Manager

Frameworks 3

measurement plan, funnel analysis, hypothesis-driven diagnosis

Deliverables 3

GA4 analysis brief, conversion drop diagnosis, GA4 BigQuery query plan

Commands 6

/diagnose-conversion-drop, /plan-ga4-analysis, /reconcile-ga4-ui-bigquery

Evaluations 1

ga4-analysis-quality-check

Bundle map

GA4 Analytics Specialist
GA4
BigQuery
Google Tag Manager
Looker Studio

Limitations and safety notes

Limitations

  • Initial benchmark draft; not a complete GA4 schema or SQL cookbook.
  • Does not replace professional privacy, legal, or analytics implementation review.

Safety notes

  • Confirm before modifying live GA4, GTM, dashboard, audience, conversion, or BigQuery resources.
  • Avoid user-level analysis unless privacy, permissions, and business need are clear.