TL;DR
| Platform | 2020 | 2026 |
|---|---|---|
| Excel | Manual PivotTable | Copilot Agent Mode (natural language) |
| Google Sheets | Basic pivot | Connected Sheets + BigQuery (200K rows) |
| Python | pandas pivot_table() | Polars (5-10x faster), DuckDB (SQL PIVOT) |
| R | spread/gather | pivot_wider/pivot_longer (tidyr) |
| JavaScript | PivotTable.js | AG Grid, Flexmonster, Arquero |
| AI | None | Semantic summarization, token optimization |
The first version of this article was published in 2020 and covered basic pivot table creation in Excel and Google Sheets. Since then, the pivot table has evolved from a “row and column transposition” tool into a data engine at the center of the data analytics ecosystem. Especially with the proliferation of AI agents, the pivot table is no longer the final deliverable but an intermediate layer that artificial intelligence uses to make sense of data.
Pivot Table Fundamentals
A Pivot Table (Summary Table) is a tool used to analyze, summarize, and report the records in a data table. Its fundamental operation is to designate one or more columns as grouping keys, spread the distinct values of another column across new columns, and apply aggregation functions (sum, average, count) to the remaining values.
Through this process, you can:
- Summarize large data tables with grouping, filtering, and conditional formatting
- Calculate totals and subtotals of numerical data at category and subcategory levels
- Adjust the level of detail with expand and collapse operations
- Make comparisons using different calculation functions for numerical, text, and date data
Key rules when creating pivot tables: columns must have headers, each column should contain a single data type, and data entry consistency must be maintained (e.g., “Email” should not appear as “E-mail” in another row).
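To make the mechanics concrete, here is a minimal sketch in pandas (used purely as a neutral illustration; any of the tools covered below can express the same operation, and the column names are invented for the example):
import pandas as pd

records = pd.DataFrame({
    "Region":  ["North", "North", "South", "South"],
    "Quarter": ["Q1", "Q2", "Q1", "Q2"],
    "Sales":   [100, 150, 80, 120],
})

# Group by Region, spread the distinct Quarter values into columns, aggregate Sales with sum
summary = records.pivot_table(index="Region", columns="Quarter",
                              values="Sales", aggfunc="sum")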
Modern Excel: Power Pivot, DAX and Copilot
Power Pivot and Data Model
In 2020, Excel’s row limit (1,048,576) was one of the biggest constraints when working with pivot tables. Power Pivot’s Data Model eliminates this limit: the xVelocity (VertiPaq) in-memory columnar engine supports approximately 2 billion rows per table [1]. Data is stored in the backend Data Model rather than on the Excel worksheet and is reported through PivotTables.
DAX (Data Analysis Expressions)
DAX, which comes with Power Pivot, takes pivot table calculations beyond standard Excel formulas [2]:
| DAX Function | Description | Use Case |
|---|---|---|
| CALCULATE | Dynamic aggregation by changing filter context | Conditional sales totals |
| SUMX | Row-by-row iteration and summation | SUMX(Sales, Sales[Qty] * Sales[Margin]) |
| RELATED | Fetch data from related tables | Multi-table pivot |
| SAMEPERIODLASTYEAR | Same period last year comparison | YoY analysis |
| ALL / ALLEXCEPT | Remove filters for grand totals | Percentage calculations |
Copilot Agent Mode
Agent Mode, generally available since December 2025, brings natural-language pivot table creation to Excel [3]:
- PivotTables can be automatically created with commands like “Create a quarterly sales pivot by region”
- User control is provided at planning, creation, and validation stages
- OpenAI GPT 5.2 or Anthropic Claude Opus 4.5 models can be used in the backend
- The =COPILOT() formula enables cell-level forecasting, summarization, and aggregation
Excel’s former “Analyze Data” (App Skills) feature is being removed by late February 2026; Agent Mode and Analyst are taking its place.
Google Sheets: Connected Sheets and Gemini
Connected Sheets + BigQuery
Google Connected Sheets brings BigQuery’s scale to the Sheets interface. With the October 2025 update [4]:
- Pivot table row capacity increased from 100,000 to 200,000
- Drill-down feature added by double-clicking pivot cells
- CSV files from Drive can be imported directly to BigQuery and analyzed via Connected Sheets
This enables billions of rows of e-commerce data to be analyzed as pivot tables in the Sheets interface.
Gemini Integration
With the May and October 2025 updates, Gemini in Google Sheets enables [5]:
- Creating pivot tables with natural language (“Create a pivot table showing sum of sales by region”)
- Multi-step execution from a single prompt (data cleaning + formatting + pivot creation)
- Cross-pivot analysis across multiple tables
- Focused analysis on selected ranges
Google Sheets Limits
| Limit | Value |
|---|---|
| Cells | 10 million |
| Columns | 18,278 |
| Tabs | 200 |
| Characters per cell | 50,000 |
Python: pandas, Polars and DuckDB
pandas pivot_table()
pandas 3.0.0 (January 2026) introduces a significant behavioral change to pivot_table(): the observed parameter now defaults to True, so only observed category combinations appear in the result. Copy-on-Write has also become the default and only mode [6].
import pandas as pd
df = pd.DataFrame({
'Date': ['2026-01', '2026-01', '2026-02', '2026-02'],
'Category': ['Electronics', 'Fashion', 'Electronics', 'Fashion'],
'Sales': [1200, 450, 1500, 300],
'Returns': [1, 2, 0, 5]
})
pivot = df.pivot_table(
index='Category',
columns='Date',
values='Sales',
aggfunc='sum'
)
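The observed change only matters when the grouping keys are categorical. A minimal sketch of the new default, continuing the example above (the extra “Home” category is an assumption added for illustration):
# Make Category categorical, including an unused 'Home' category
df['Category'] = df['Category'].astype(pd.CategoricalDtype(['Electronics', 'Fashion', 'Home']))

# pandas 3.0 default (observed=True): 'Home' is omitted because it has no rows
observed_only = df.pivot_table(index='Category', columns='Date',
                               values='Sales', aggfunc='sum')

# Old behavior on request: observed=False keeps all-NaN rows for unused categories
all_categories = df.pivot_table(index='Category', columns='Date',
                                values='Sales', aggfunc='sum', observed=False)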
Polars: The Performance Revolution
Polars 1.38.1 (February 2026) is a DataFrame library written in Rust, positioned as an alternative to pandas. It runs 1.6 to 8.6 times faster for pivot operations depending on data size [7]:
| Data Size | pandas | Polars | Difference |
|---|---|---|---|
| 10K rows | Baseline | ~1.6x faster | Small difference |
| 100M rows | Baseline | ~8.6x faster | Significant difference |
| 1 GB CSV loading | Baseline | ~5x faster, 87% less memory | Critical difference |
import polars as pl
df = pl.DataFrame({
'Category': ['Electronics', 'Fashion', 'Electronics', 'Fashion'],
'Month': ['January', 'January', 'February', 'February'],
'Sales': [1200, 450, 1500, 300]
})
pivot = df.pivot(
on='Month',
index='Category',
values='Sales',
aggregate_function='sum'
)
Polars’ streaming engine, introduced in 2025, provides an additional 3-7x improvement over the in-memory engine. For e-commerce workloads that reach millions of rows, Polars is the clearest best-practice choice of 2026.
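A minimal sketch of the streaming path, assuming a larger-than-memory orders.csv with Category, Month, and Sales columns (the file and column names are assumptions; pivot itself is an eager operation in Polars, so the streaming-friendly step is the group_by/agg):
lazy_summary = (
    pl.scan_csv('orders.csv')                      # lazy scan, nothing is read yet
      .group_by(['Category', 'Month'])
      .agg(pl.col('Sales').sum().alias('Sales'))
)
summary = lazy_summary.collect(engine='streaming')  # older releases: collect(streaming=True)

# The small aggregated result can then be pivoted eagerly, as shown above
wide = summary.pivot(on='Month', index='Category', values='Sales')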
DuckDB: Pivot with SQL
DuckDB 1.4.4 (January 2026) offers first-class PIVOT/UNPIVOT SQL syntax [8]:
-- Simple pivot
PIVOT sales ON month USING sum(amount);
-- SQL standard syntax
SELECT * FROM sales
PIVOT (sum(amount) AS total FOR month IN ('January', 'February', 'March'));
DuckDB can run SQL directly on pandas DataFrames and provides advantages in memory management for large datasets.
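A minimal sketch of that interoperability, assuming a pandas DataFrame named sales_df in the local scope (DuckDB’s replacement scans resolve it by variable name; the data is invented for the example):
import duckdb
import pandas as pd

sales_df = pd.DataFrame({
    'month':  ['January', 'January', 'February'],
    'amount': [1200, 450, 1500],
})

# DuckDB finds 'sales_df' in the surrounding Python scope and pivots it directly
wide = duckdb.sql("PIVOT sales_df ON month USING sum(amount)").df()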
R: tidyverse and data.table
pivot_wider() and pivot_longer()
Introduced in tidyr 1.0.0 (September 2019), pivot_wider() and pivot_longer() replaced the older spread() and gather() functions [9]. The naming is more intuitive: pivot_longer() reshapes to long format, pivot_wider() to wide format.
library(tidyr)
# Convert long format to wide format (pivot)
df %>%
pivot_wider(
names_from = Month,
values_from = Sales,
values_fn = sum
)
data.table Alternative
data.table 1.18.0 (December 2025) offers reshaping operations with dcast() (wide) and melt() (long). Thanks to internal fast radix sorting and binary search, it runs approximately 5 times faster than tidyr [10].
library(data.table)
dcast(dt, Category ~ Month, value.var = "Sales", fun.aggregate = sum)
JavaScript: Browser-Based Pivot
Current State
| Tool | Status | Capacity | License |
|---|---|---|---|
| PivotTable.js | Unmaintained (~7 years since last update) | Small data | MIT |
| WebDataRocks | Active | ~1 MB | Free |
| Flexmonster | Active | ~1 GB, millions of rows | Commercial |
| AG Grid (Enterprise) | Active | Server-side, unlimited | Commercial |
| Arquero | Active | 1M+ rows (in browser) | BSD |
Arquero, developed by the UW Interactive Data Lab, is a columnar JavaScript data transformation library. It can process over 1 million rows in the browser and offers pivot/reshape operations [11].
AG Grid Enterprise offers the most comprehensive pivot features with server-side row models. React, Angular, and Vue integrations are available.
AI Agents and Pivot Tables
With the proliferation of AI agents, the role of pivot tables is changing: no longer just a report for humans to read, but a method for artificial intelligence to break data into digestible chunks.
Creating Pivots with Natural Language
Excel Copilot Agent Mode and Google Gemini can create pivot tables from natural language commands. Claude’s Code Interpreter feature (September 2025) can take uploaded CSV/TSV files and produce pivot analysis, visualizations, and Excel output using Python [12].
AI-Assisted Pivot Recommendation
Microsoft Research has developed a system that uses BERT embeddings to understand attribute significance in pivot tables and recommend meaningful pivot configurations to users [13]. In this approach, the agent evaluates not just the numbers but also the semantic meaning of column names. For example, given the command “Pivot return rates during discount periods,” it infers which column represents discounts and which represents returns.
Token Optimization: Pre-Pivot Strategy
Sending pre-pivoted summaries instead of raw data to AI agents reduces token costs by 60-80% [14]:
import pandas as pd
import json
# Raw data: 10,000 rows of e-commerce data
df = pd.read_csv('sales.csv')
# Pre-pivot summary: For AI agent
pivot = df.pivot_table(
index='Category',
values=['Sales', 'Returns'],
aggfunc={'Sales': 'sum', 'Returns': 'mean'}
).reset_index()
# Add flags: Points for agent to focus on
pivot['Status'] = pivot.apply(
lambda x: 'Critical' if x['Returns'] > 3 else 'Stable', axis=1
)
# Send to agent in JSON format
context = pivot.to_json(orient='records', force_ascii=False)
This approach provides two critical advantages:
- Token savings: 10-20 row pivot summary instead of 10,000 row raw log
- Improved accuracy: Flag columns (Status, Trend) answer the agent’s “where should I look?” question and reduce hallucination risk
Browser-Based Agents
Browser-based AI agents navigating an e-commerce panel (Shopify, WooCommerce) can instantly transform complex on-screen tables into pivot structures and perform anomaly detection: generating alerts like “Sales are increasing while profit is declining.”
E-Commerce Pivot Analysis Practices
Common Pivot Patterns
| Pattern | Row (Index) | Column | Value | Purpose |
|---|---|---|---|---|
| Category performance | Product category | Month/Quarter | Sales total | Seasonal trends |
| Cohort analysis | Acquisition month | Subsequent months | Retention rate | Customer loyalty |
| Channel x Region | Marketing channel | Geographic region | Revenue | Channel effectiveness |
| Product matrix | SKU | Metrics (revenue, returns, margin) | Values | Product health |
| Customer LTV | Customer segment | Period | Lifetime value | Segment comparison |
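To make the cohort-analysis pattern from the table above concrete, here is a minimal pandas sketch (the file name orders.csv and the customer_id/order_date/order_id column names are assumptions):
import pandas as pd

orders = pd.read_csv('orders.csv', parse_dates=['order_date'])
orders['order_month'] = orders['order_date'].dt.to_period('M')
orders['cohort'] = orders.groupby('customer_id')['order_month'].transform('min')
orders['months_since'] = (orders['order_month'] - orders['cohort']).apply(lambda d: d.n)

# Rows: acquisition month, columns: months since acquisition, values: active customers
cohort_counts = orders.pivot_table(index='cohort', columns='months_since',
                                   values='customer_id', aggfunc='nunique')
retention = cohort_counts.div(cohort_counts[0], axis=0).round(2)  # retention rates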
Pivot in the Modern Data Stack
In the modern data stack architecture (dbt, Looker, BigQuery), pivot operations typically occur in the transformation layer [15]:
- Data ingestion: Raw e-commerce data brought to warehouse via Fivetran/Stitch
- Transformation (dbt): dbt_utils.pivot() macro for SQL pivot operations, dynamic column creation without hardcoding
- BI layer: Looker/Tableau consumes dbt-transformed data
- Reverse ETL: Census/Hightouch pushes pivoted results back to CRM and marketing tools
ShopifyQL and Pivot
ShopifyQL, introduced by Shopify in June 2022, significantly simplifies querying e-commerce data compared to standard SQL [16]. Period comparison with COMPARE TO previous_year (an operation requiring self-joins in SQL), row/column totals with WITH TOTALS and GROUP_TOTALS, and percentage-change calculation with WITH PERCENT_CHANGE can all be done in a single line:
FROM sales
SHOW net_sales
GROUP BY product_type
SINCE -12m
COMPARE TO previous_year
WITH TOTALS, PERCENT_CHANGE
VISUALIZE
However, ShopifyQL does not yet support true pivot/crosstab operations (transforming rows into columns). Subquery and CTE support is also absent. The Sidekick AI integration introduced in December 2025 enables generating ShopifyQL queries through natural language, and Sidekick Pulse provides proactive anomaly detection [17].
For complex pivot scenarios, the recommended approach is: extract a GROUP BY summary via ShopifyQL, transfer to BigQuery via Fivetran or Airbyte, and apply true pivot operations there using DuckDB PIVOT or dbt_utils.pivot(). Analyzing BigQuery data directly as a Sheets pivot table via Google Connected Sheets also provides an effective alternative.
Web Scraping and Competitive Intelligence
In e-commerce, analyzing competitor pricing and market data through web scraping combined with pivot tables is a common practice.
Tool Landscape (2025-2026)
| Tool | Type | Use Case | Status |
|---|---|---|---|
| Scrapy 2.12+ | Framework | Large-scale structured crawling | Active |
| Playwright 1.49+ | Browser automation | JavaScript-heavy sites, SPAs | Active |
| ScrapeGraphAI | AI-powered | Semantic extraction, low maintenance | New |
| Crawl4AI | Open source | LLM-ready output, free | New |
Anti-Bot Challenges
As of 2025-2026, anti-bot protections have hardened significantly. Cloudflare Turnstile uses TLS/HTTP2 fingerprinting and JavaScript challenges; DataDome performs ML-based real-time bot detection. Puppeteer-stealth was discontinued in February 2025; active alternatives are Nodriver, SeleniumBase UC Mode, and Camoufox [18].
ScrapeGraphAI’s AI approach requires 70% less maintenance than CSS selector-based scrapers. LLMs can understand page semantics regardless of HTML structure changes.
Scraping to Pivot Pipeline
from playwright.async_api import async_playwright  # scraping step omitted in this excerpt
import duckdb

# Store scraped data in DuckDB (assumes the scraper has populated price_history)
con = duckdb.connect("competitor_prices.duckdb")

# Competitor price comparison: PIVOT creates one column set per competitor
comparison = con.execute("""
PIVOT price_history
ON competitor
USING AVG(price) AS avg_price, MIN(price) AS min_price
GROUP BY product_sku
ORDER BY product_sku
""").df()
From a legal perspective: respecting robots.txt, maintaining minimum 1-2 second delays between requests, and checking ToS are recommended. Pricing data is generally not considered personal data, though KVKK applies when scraping involves personal data.
WooCommerce Data Analytics
WooCommerce REST API v3 enables programmatic extraction of order, product, customer, and coupon data. Unlike Shopify’s limit of 40 API requests per minute, WooCommerce imposes no platform-level cap and also allows direct database access [19]. For order filtering operations, see WooCommerce REST API - Filtering Orders.
API to Pivot
from woocommerce import API
import pandas as pd
wcapi = API(
url="https://yourstore.com",
consumer_key="ck_xxx",
consumer_secret="cs_xxx",
version="wc/v3"
)
# Fetch all orders with pagination
all_orders = []
page = 1
while True:
response = wcapi.get("orders", params={"per_page": 100, "page": page})
orders = response.json()
if not orders:
break
all_orders.extend(orders)
page += 1
df = pd.json_normalize(all_orders, sep='_')
RFM Analysis
# Parse order dates (the 'order_date' column name is an assumption about the export)
df["order_date"] = pd.to_datetime(df["order_date"])
snapshot = df["order_date"].max() + pd.Timedelta(days=1)
rfm = df.groupby("customer_id").agg(
recency=("order_date", lambda x: (snapshot - x.max()).days),
frequency=("order_id", "nunique"),
monetary=("line_total", "sum")
).reset_index()
# Score 1-5
for col in ["recency", "frequency", "monetary"]:
labels = range(5, 0, -1) if col == "recency" else range(1, 6)
rfm[f"{col}_score"] = pd.qcut(
rfm[col], q=5, labels=labels, duplicates="drop"
).astype(int)
# Derive a simple segment label from the combined score (illustrative thresholds)
rfm["rfm_total"] = rfm[["recency_score", "frequency_score", "monetary_score"]].sum(axis=1)
rfm["segment"] = pd.cut(rfm["rfm_total"], bins=[2, 7, 11, 15],
                        labels=["At Risk", "Potential", "Champions"])
# Segment pivot
rfm_pivot = rfm.pivot_table(
values="monetary",
index="segment",
aggfunc=["count", "mean", "sum"]
).round(2)
WooCommerce MCP Integration
WooCommerce 10.3 (2025) introduced the first official MCP (Model Context Protocol) integration. This enables AI agents to directly search, add, and update products.
BigQuery Integration Paths
| Tool | Method | Cost |
|---|---|---|
| Airbyte | Open-source ETL | Free (self-hosted) |
| n8n | Workflow automation | Free (self-hosted) |
| Coupler.io | No-code scheduler | From $49/mo |
| Direct DB | Server-side tracking -> BigQuery | Infrastructure cost only |
BigQuery E-commerce Pivot Operations
BigQuery’s native PIVOT operator enables powerful analysis on GA4 e-commerce data [20]. If you are familiar with the data warehouse concept, BigQuery can be thought of as its cloud-native implementation.
GA4 Data PIVOT
SELECT *
FROM (
SELECT
items.item_category AS category,
event_name,
items.price * items.quantity AS revenue
FROM `project.analytics_XXXXXX.events_*`,
UNNEST(items) AS items
WHERE _TABLE_SUFFIX BETWEEN '20250101' AND '20250131'
AND event_name IN ('purchase', 'add_to_cart', 'view_item')
)
PIVOT (
SUM(revenue) AS total
FOR event_name IN ('purchase', 'add_to_cart', 'view_item')
);
BigQuery ML
BigQuery ML enables model creation directly via SQL for e-commerce predictions:
- ARIMA_PLUS: Daily revenue forecasts, Turkish holiday calendar support (holiday_region = 'TR'), up to 100 million time series simultaneously (see the sketch after this list)
- K-means: Customer segmentation from RFM scores, automatic scaling with standardize_features = TRUE
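A minimal sketch of the ARIMA_PLUS option above, submitted through the BigQuery Python client (the dataset, table, and column names are assumptions):
from google.cloud import bigquery

client = bigquery.Client()
client.query("""
    CREATE OR REPLACE MODEL `shop_analytics.daily_revenue_forecast`
    OPTIONS (
      model_type = 'ARIMA_PLUS',
      time_series_timestamp_col = 'order_date',
      time_series_data_col = 'revenue',
      holiday_region = 'TR'
    ) AS
    SELECT order_date, revenue
    FROM `shop_analytics.daily_revenue`
""").result()  # blocks until the model training job finishes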
Cost Optimization
| Technique | Savings | Implementation |
|---|---|---|
| Partitioning | Up to 90% scan reduction | PARTITION BY DATE(event_timestamp) |
| Clustering | 30-60% additional reduction | CLUSTER BY event_name, user_id |
| Materialized Views | Auto-refreshed pre-aggregations | CREATE MATERIALIZED VIEW |
| Column selection | Variable | Never use SELECT * |
BigQuery on-demand pricing: first 1 TB/month free, then $6.25/TB. A typical e-commerce store (500K orders/year) with monthly query volume of 500 GB stays within the free tier.
BI Engine automatically caches frequently accessed data, providing sub-second query response and 4-10x performance improvement.
Privacy-Focused Analytics with Local LLMs
Sending customer data (names, emails, addresses, purchase history) to cloud LLM APIs constitutes cross-border data transfer under KVKK. Local LLMs solve this by design [21].
Ollama Setup
ollama pull qwen2.5-coder:14b # Code generation + data analysis
ollama pull mistral-nemo:12b # Multilingual, general analysis
ollama pull phi4:14b # Strong reasoning
VRAM Requirements (Q4_K_M Quantization)
| Model | Parameters | VRAM | Best For |
|---|---|---|---|
| Qwen2.5-Coder 7B | 7B | 6-7 GB | Code generation, SQL writing |
| Qwen2.5-Coder 14B | 14B | 10-12 GB | Complex analytics code |
| Mistral Nemo 12B | 12B | 8-10 GB | Multilingual, general analysis |
| Phi-4 14B | 14B | 10-12 GB | Reasoning, math |
| DeepSeek-R1 14B | 14B | 10-12 GB | Chain-of-thought reasoning |
Hardware tiers: 8 GB VRAM (RTX 4060) for 7B models, 12-16 GB (RTX 4070 Ti, M2 Pro/Max) for 12-14B models, 24 GB (RTX 4090, M3 Max) for 32B models.
Pivot Summary with Local LLM
import ollama
import json
pivot_summary = """
Monthly Revenue (Category):
| Category | Jan | Feb | Mar |
|-------------|--------|--------|--------|
| Electronics | 45,200 | 38,100 | 52,300 |
| Clothing | 22,100 | 19,800 | 28,500 |
RFM Segments:
| Segment | Count | Avg Spend | Avg Recency |
|-------------|-------|-----------|-------------|
| Champions | 342 | $1,250 | 8 days |
| At Risk | 891 | $480 | 67 days |
"""
response = ollama.chat(
model="qwen2.5-coder:14b",
messages=[{
"role": "user",
"content": f"Analyze this e-commerce pivot data. "
f"Return JSON only.\n{pivot_summary}"
}],
format="json"
)
KVKK Compliance
KVKK (Turkey’s Personal Data Protection Law), with its 2025 update, raised administrative fines to a range of 68,083 TL to 13.6 million TL (a 43.93% increase). The “Generative AI and Personal Data Protection” guideline published in December 2025 mandates “privacy by design” principles for LLM developers [22].
When using local LLMs: embedding generation, vector search, and LLM inference all occur in the local environment. No data leaves the premises.
Hybrid Architecture
Layer 1: Data Extraction (Local)
WooCommerce API -> Python -> Polars -> DuckDB
Cost: $0 (compute only)
Layer 2: Pivot + Summarization (Local LLM)
Raw data -> pivot tables -> Ollama (Qwen2.5-Coder 14B)
Cost: ~$0 (electricity, ~$0.02/hr GPU)
KVKK compliant: all PII stays on-premises
Layer 3: Strategic Insights (Cloud LLM, optional)
Pre-pivoted, anonymized summaries -> Claude / GPT-4o
Cost: ~$0.02-0.04 per query
This approach delivers 85-95% cost reduction while maintaining KVKK compliance.
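One way to enforce the layer-3 boundary is a simple allow-list check before anything is sent to a cloud model. A minimal sketch (the allow-list and column names are assumptions, not part of the architecture above):
import pandas as pd

ALLOWED_COLUMNS = {"Category", "Month", "Sales", "Returns", "Status"}

def to_cloud_payload(summary: pd.DataFrame) -> str:
    """Serialize an aggregated pivot for a cloud LLM, refusing unexpected columns."""
    unexpected = set(summary.columns) - ALLOWED_COLUMNS
    if unexpected:
        raise ValueError(f"Refusing to send non-allow-listed columns: {unexpected}")
    return summary.to_json(orient="records", force_ascii=False)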
LLM Limitations with Tabular Data
As AI agents increasingly leverage pivot tables, understanding these models’ limitations with tabular data becomes critical.
Structural Mismatch
Tabular data has a two-dimensional, relational structure, while LLMs are trained with a one-dimensional, autoregressive objective. When a table is serialized to text, row and column boundaries blur, the model becomes sensitive to row/column shuffles that should not matter, and context window limits are hit with large datasets [23].
Numerical Precision Failures
Numbers are processed as character sequences during tokenization, losing magnitude and numerical-order information. Research across six major LLMs found factual hallucination rates between 59% and 82%, with numerical recall reliability near zero [24].
Table Understanding Benchmarks
| Benchmark | Year | Scope | Finding |
|---|---|---|---|
| SUC (Microsoft) | WSDM 2024 | 7 basic table tasks | Best accuracy: 65.43% |
| TableBench | AAAI 2025 | 886 instances, 18 subcategories | Even GPT-4 significantly behind humans |
| RealHiTBench | ACL 2025 | 708 hierarchical tables, 3,752 QA | Hierarchical headers challenged all models |
| MMQA | ICLR 2025 | Multi-table, multi-hop reasoning | All models behind human performance |
| MMTU | 2025 | 30,000+ questions, 25 tasks | Reasoning models +10pp advantage |
| TReB | 2025 | 26 sub-tasks, 3 inference modes | ICoT (interleaved chain-of-thought) best |
Table transposition accuracy is approximately 50% (near random chance), while header identification is 94-97% accurate. This means models can understand table content but fail at structural transformation [25].
Serialization Format Comparison
Based on tests across 11 different formats [26]:
| Format | Accuracy | Token Usage |
|---|---|---|
| Markdown KV | 60.7% | 2.7x (baseline) |
| XML | 56.0% | 2.3x |
| YAML | 54.7% | 1.9x |
| Natural language | 49.6% | 3.0x |
| CSV | 44.3% | 1.0x (lowest) |
Markdown KV provides the highest accuracy but consumes 2.7x more tokens than CSV. When token budget is constrained, CSV is recommended; when accuracy is the priority, Markdown KV is preferred.
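If you want to check the trade-off for your own tables, a rough sketch using the tiktoken tokenizer (tiktoken and the to_markdown conversion, which needs the tabulate package, are assumptions beyond the cited tests):
import pandas as pd
import tiktoken

enc = tiktoken.encoding_for_model("gpt-4o")
df = pd.DataFrame({"Category": ["Electronics", "Clothing"],
                   "Jan": [45200, 22100], "Feb": [38100, 19800]})

csv_tokens = len(enc.encode(df.to_csv(index=False)))
markdown_tokens = len(enc.encode(df.to_markdown(index=False)))
print(csv_tokens, markdown_tokens)  # CSV is usually the cheapest serialization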
Solution Strategies
- Code generation approach: Have the LLM write pandas/SQL code rather than analyze data directly. Delegates computation to deterministic tools
- Schema + sample rows: Send column names, data types, and the first 5 rows instead of the entire table (sketched after this list)
- ICoT (Interleaved Chain-of-Thought): Alternate between text reasoning and code execution. Achieved the best results in the TReB benchmark
- Code Interpreter: Claude and GPT’s code execution tools completely eliminate numerical hallucination
- SpreadsheetLLM (Microsoft, EMNLP 2024): A compression approach achieving 96% reduction in token usage and 25.6% improvement in table detection [27]
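A minimal sketch of the schema-plus-sample-rows strategy (the file name sales.csv and the prompt assembly are assumptions):
import io
import pandas as pd

df = pd.read_csv("sales.csv")

schema_buf = io.StringIO()
df.info(buf=schema_buf)                 # column names, dtypes, non-null counts

context = (
    f"Table schema:\n{schema_buf.getvalue()}\n"
    f"Sample rows:\n{df.head(5).to_csv(index=False)}"
)
# 'context' plus the business question goes to the LLM, which writes pandas/SQL code;
# that code then runs against the full table, not just the sample.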
Cost and Token Optimization
Raw Data vs. Pivot Summary Comparison
| Approach | Rows | Tokens (Approx.) | Cost (GPT-4o) |
|---|---|---|---|
| Raw order data (10K) | 10,000 | ~250,000 | ~$0.625 |
| Category+Month pre-grouping | ~200 | ~10,000 | ~$0.025 |
| Summary statistics only | ~20 | ~2,000 | ~$0.005 |
Pre-pivot enables 95-99% token savings. CSV format consumes 40-50% fewer tokens than JSON.
API Pricing (February 2026)
| Model | Input/1M Tokens | Output/1M Tokens | Batch |
|---|---|---|---|
| GPT-4o | $2.50 | $10.00 | 50% off |
| GPT-4o mini | $0.15 | $0.60 | 50% off |
| Claude Sonnet 4.5 | $3.00 | $15.00 | 50% off |
| Claude Haiku 4.5 | $1.00 | $5.00 | 50% off |
| Gemini 2.0 Flash | $0.10 | $0.40 | - |
| Local (Ollama) | ~$0 | ~$0 | - |
Anthropic offers a prompt caching mechanism that reduces input token costs by 90% for repeated queries. OpenAI Batch API delivers results within 24 hours at a 50% discount.
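A minimal sketch of prompt caching with the Anthropic Python SDK, reusing the pivot_summary string from the local-LLM example above (the model id and the exact question are assumptions):
import anthropic

client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-haiku-4-5",             # model id is an assumption
    max_tokens=512,
    system=[{
        "type": "text",
        "text": pivot_summary,             # large, repeated context marked cacheable
        "cache_control": {"type": "ephemeral"},
    }],
    messages=[{"role": "user", "content": "Which segment needs attention this week?"}],
)
print(response.content[0].text)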
Decision Matrix
| Scenario | Recommended Approach |
|---|---|
| Data contains PII | Local LLM only (KVKK requirement) |
| Data < 100K rows, simple pivot | pandas/Polars, no LLM needed |
| Data > 1M rows, complex pivot | BigQuery ($6.25/TB) or DuckDB (free) |
| Natural language insights from pivots | Pre-pivot locally, send summary to cloud LLM |
| Batch processing (daily reports) | Batch API (50% off), overnight processing |
Practical Guide: Which Tool for Which Scenario?
| Scenario | Recommended Tool | Why |
|---|---|---|
| Quick exploration, small data | Excel / Google Sheets | Drag-and-drop, visual |
| Natural language pivot | Excel Copilot / Gemini | “Sales pivot by region” |
| Analysis on BigQuery | Google Connected Sheets | Billions of rows, Sheets interface |
| Programmatic analysis (medium data) | pandas pivot_table() | Widespread, well-documented |
| Big data (million+ rows) | Polars | 5-10x faster, low memory |
| SQL-based pivot | DuckDB PIVOT | Native SQL syntax |
| Statistical analysis | R tidyverse / data.table | pivot_wider, ggplot2 |
| Browser pivot | AG Grid / Flexmonster | Large data, enterprise |
| AI agent preparation | pandas/Polars + JSON | Pre-pivot, flag addition |
| Data engineering | dbt_utils.pivot() | Repeatable, version-controlled |
| Competitor price tracking | Scrapy/Playwright + DuckDB | Scraping -> pivot pipeline |
| WooCommerce analytics | wc-api-python + Polars | REST API, unlimited access |
| BigQuery pivot | PIVOT operator + BI Engine | GA4, billions of rows |
| Privacy-focused analytics | Ollama + Qwen2.5-Coder | KVKK compliant, free |
Conclusion
The pivot table maintains its position as one of the fundamental tools of data analytics. However, as of 2026, its role has evolved significantly: it is no longer just a report table for humans to read, but an intermediate layer where AI agents break data into digestible chunks. Pivot tables can now be created through natural language via Excel Copilot and Google Gemini, millions of rows can be pivoted within seconds with Polars, and pre-pivot summaries can reduce AI agent token costs by 60-80%.
In e-commerce analytics, the pivot table is the most direct way to answer critical business questions around cohort behavior, category performance, and customer segmentation. In the modern data stack, pivot operations are defined once in the dbt transformation layer and consumed by all downstream consumers.
However, LLMs have structural limitations when working with tabular data: transposition accuracy around 50% and numerical hallucination rates of 59-82%. Therefore, rather than having LLMs perform calculations, the most effective strategy remains having them generate code, using deterministic tools (pandas, DuckDB, BigQuery) for computation, and positioning LLMs solely for content generation and natural-language insights. Local LLMs, with their KVKK compliance and near-zero marginal cost, offer an increasingly strong alternative for e-commerce data analysis.
Pivot table usage is also covered in detail across different platforms: Looker Studio Pivot Table, Google Analytics Pivot Table, and Facebook Analytics Pivot Table.
Footnotes
- Data Model specification and limits - Microsoft Support ↩
- DAX function reference - Microsoft Learn ↩
- Agent Mode in Excel - Microsoft Tech Community ↩
- Powerful pivot tables in Connected Sheets - Google Workspace Updates ↩
- Gemini in Google Sheets - Google Workspace Updates ↩
- pandas 3.0.0 What’s New ↩
- Polars PDS-H Benchmark ↩
- DuckDB PIVOT Statement ↩
- tidyr Pivoting Vignette ↩
- data.table CRAN ↩
- Arquero - Columnar JavaScript Data ↩
- Claude Code Interpreter ↩
- Pivot Table Recommendation with Semantic Embeddings ↩
- A Guide to Token-Efficient Data Prep for LLM Workloads - The New Stack ↩
- dbt_utils pivot macro ↩
- ShopifyQL - Commerce Data Querying Language ↩
- Shopify Editions Winter ‘26 ↩
- Web Scraping Trends for 2025 and 2026 ↩
- WooCommerce REST API Documentation ↩
- How to Create Pivot Table with GA4 Data in BigQuery ↩
- Run LLMs Locally with Ollama: Privacy-First AI 2025 ↩
- KVKK 2025 Updates: A Compliance Guide for Companies ↩
- Why LLMs Struggle with Your Spreadsheet Data - Tryolabs ↩
- Guide to Hallucinations in Large Language Models - Lakera ↩
- Table Meets LLM: Can Large Language Models Understand Structured Table Data? - WSDM 2024 ↩
- Which Table Format Do LLMs Understand Best? ↩
- SpreadsheetLLM - Microsoft Research, EMNLP 2024 ↩