Stream Shopify orders into BigQuery with Make.com, stage NDJSON in GCS, run partitioned loads, and publish a daily Slack digest of sales and exceptions.
Introduction
If your business still exports Shopify orders and runs a nightly import script, you know the pain: edits, refunds, and fulfilment changes arrive after the export, so dashboards go stale. Waiting for a batch means attribution lag and slow decision making.
This guide shows a production-ready pattern using Make.com to receive Shopify webhooks, normalise payloads, stage newline-delimited JSON (NDJSON) files to Google Cloud Storage (GCS), run BigQuery load jobs into partitioned tables, and publish a daily Slack digest that highlights totals, refunds, exceptions, and top SKUs.
By the end you will have a repeatable pipeline with idempotency, replayability, and a compact operational runbook.
What You’ll Need
- Shopify admin with webhook permissions
- Google Cloud project with BigQuery and GCS enabled, and a service account that can write to GCS and run BigQuery jobs
- Slack workspace and channel for the daily digest
- Make.com account with Shopify, HTTP (for signed uploads), and Slack capabilities
- BigQuery dataset and tables. Partition the tables on ingestion timestamp (daily partitions) to control query costs
- Optional: a small audit table or Google Sheet for immediate troubleshooting
Note: BigQuery load jobs and GCS have quotas and costs. This approach avoids per-row streaming insert charges by batching records into files and ingesting them with load jobs.
How It Works (The Logic)
Shopify webhooks (orders/create, orders/updated, orders/paid, fulfillment events) → Make.com receives and validates the webhooks → normalise order and line item records → batch into NDJSON files and upload to GCS → call BigQuery load jobs to append to partitioned tables → write audit rows → a daily scheduled Make.com scenario queries BigQuery and posts a Slack digest.
Trigger → Normalise → Stage to GCS → Load to BigQuery → Audit → Daily Slack digest
Step-by-Step Setup
- Design your BigQuery schema
Create three partitioned tables (partitioned by DATE(ingestion_timestamp)):
- orders_raw: store the full Shopify payload plus metadata: order_id, shopify_payload (JSON), event_type, ingestion_timestamp, webhook_id
- orders_flat: denormalised order-level row with fields such as order_id, created_at, customer_email, total_price, currency, financial_status, fulfillment_status, total_items, ingestion_timestamp
- line_items_flat: one row per line item: order_id, line_item_id, sku, product_id, quantity, price, line_total, ingestion_timestamp
Keep raw for replay and flat for BI.
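If you prefer to create the tables from code rather than the console, a minimal sketch with the google-cloud-bigquery client is below. The project and dataset names (my_project.my_dataset) and the trimmed field list are assumptions; extend the schema to cover all the columns listed above.

```python
# Sketch: create a daily-partitioned orders_flat table (assumed names: my_project.my_dataset).
from google.cloud import bigquery

client = bigquery.Client()

schema = [
    bigquery.SchemaField("order_id", "STRING", mode="REQUIRED"),
    bigquery.SchemaField("created_at", "TIMESTAMP"),
    bigquery.SchemaField("customer_email", "STRING"),
    bigquery.SchemaField("total_price", "NUMERIC"),
    bigquery.SchemaField("currency", "STRING"),
    bigquery.SchemaField("financial_status", "STRING"),
    bigquery.SchemaField("fulfillment_status", "STRING"),
    bigquery.SchemaField("total_items", "INTEGER"),
    bigquery.SchemaField("ingestion_timestamp", "TIMESTAMP", mode="REQUIRED"),
]

table = bigquery.Table("my_project.my_dataset.orders_flat", schema=schema)
# Partition on ingestion_timestamp so daily digest and MERGE queries scan a single partition.
table.time_partitioning = bigquery.TimePartitioning(
    type_=bigquery.TimePartitioningType.DAY,
    field="ingestion_timestamp",
)
client.create_table(table, exists_ok=True)
```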
- Create Shopify webhooks
Subscribe to orders/create, orders/updated, orders/paid, and fulfilment events (for example fulfillments/create and fulfillments/update). Point them at a Make.com custom webhook. Save the Shopify webhook secret and validate the HMAC in Make.com.
Gotcha: include update events to catch refunds and edits.
- Receive the webhook in Make.com and validate
- Module: Webhooks — Custom webhook. Accept Shopify payloads.
- Immediately validate the X-Shopify-Hmac-Sha256 header in a Tools/HTTP module or script: compute an HMAC-SHA256 of the raw request body with your webhook secret, base64-encode it, and compare it to the header value before processing the payload.
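Wherever the check runs (a Make.com code step or a small HTTPS endpoint in front of it), the logic is the same. A minimal Python sketch; the SHOPIFY_WEBHOOK_SECRET environment variable is an assumption.

```python
# Sketch: verify a Shopify webhook signature against the X-Shopify-Hmac-Sha256 header.
import base64
import hashlib
import hmac
import os

def verify_shopify_hmac(raw_body: bytes, header_value: str) -> bool:
    secret = os.environ["SHOPIFY_WEBHOOK_SECRET"]  # assumed env var holding the webhook secret
    digest = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).digest()
    expected = base64.b64encode(digest).decode("utf-8")
    # Constant-time comparison; reject the request if this returns False.
    return hmac.compare_digest(expected, header_value)
```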
- Normalise payloads into flat objects
- Use a Tools/JS or Mapper module to create two objects: order_record and an array of line_item_records. Convert numeric strings to numbers and normalise timestamps to ISO 8601. Add ingestion_timestamp in Make.com.
- Mapping examples:
- order_record.order_id = payload.id
- order_record.total_price = Number(payload.total_price)
- line_item_record.sku = item.sku || item.variant_sku
Common gotcha: optional fields (for example customer email) may be null. Preserve nulls rather than coercing them to empty strings.
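If you prefer a code step for the flattening, the transform is straightforward. A sketch assuming the field names from the schema above; Shopify sends amounts as strings, so cast them explicitly and keep missing optional fields as None.

```python
# Sketch: flatten a Shopify order payload into order_record plus line_item_records.
from datetime import datetime, timezone

def normalise_order(payload: dict) -> tuple[dict, list[dict]]:
    ingestion_ts = datetime.now(timezone.utc).isoformat()

    order_record = {
        "order_id": str(payload["id"]),
        "created_at": payload.get("created_at"),
        # Preserve nulls for optional fields rather than coercing them to "".
        "customer_email": (payload.get("customer") or {}).get("email"),
        "total_price": float(payload["total_price"]),  # Shopify sends amounts as strings
        "currency": payload.get("currency"),
        "financial_status": payload.get("financial_status"),
        "fulfillment_status": payload.get("fulfillment_status"),
        "total_items": sum(int(i["quantity"]) for i in payload.get("line_items", [])),
        "ingestion_timestamp": ingestion_ts,
    }

    line_item_records = [
        {
            "order_id": str(payload["id"]),
            "line_item_id": str(item["id"]),
            "sku": item.get("sku") or None,
            "product_id": str(item["product_id"]) if item.get("product_id") else None,
            "quantity": int(item["quantity"]),
            "price": float(item["price"]),
            "line_total": round(int(item["quantity"]) * float(item["price"]), 2),
            "ingestion_timestamp": ingestion_ts,
        }
        for item in payload.get("line_items", [])
    ]
    return order_record, line_item_records
```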
- Batch into NDJSON and stage to GCS
Rather than loading each record individually, batch records into NDJSON files. Strategies:
- Flush when record count >= 500 or after a time window (for example 5 minutes)
- Use an Aggregator module to collect records, then build NDJSON with one JSON object per line
Upload pattern:
- Module: Upload NDJSON to GCS under a path such as gs://bucket/shopify/ingest_date=YYYYMMDD/hour=HH/file_<run_id>.ndjson
If Make.com lacks a direct GCS upload, generate a signed URL with a small cloud function and PUT the NDJSON from Make.
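The signed-URL fallback is only a few lines. A Cloud Function generates a short-lived V4 signed URL for the target object, and Make.com PUTs the NDJSON body to it with a plain HTTP module. A sketch; the bucket name and object path are placeholders, and the function's service account must be able to sign URLs (a key file or IAM signBlob permission).

```python
# Sketch: generate a signed PUT URL for the staged NDJSON object, and build the NDJSON body.
import json
from datetime import timedelta

from google.cloud import storage

def signed_upload_url(bucket_name: str, object_path: str) -> str:
    blob = storage.Client().bucket(bucket_name).blob(object_path)
    return blob.generate_signed_url(
        version="v4",
        expiration=timedelta(minutes=15),
        method="PUT",
        content_type="application/x-ndjson",  # the PUT must send this same Content-Type header
    )

def records_to_ndjson(records: list[dict]) -> str:
    # One JSON object per line; this is the file format BigQuery load jobs expect.
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records) + "\n"
```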
- Trigger BigQuery load jobs from GCS
Use BigQuery load jobs for cost-effective ingestion:
- Module: HTTP — call BigQuery REST API jobs.insert with a load config referencing the GCS file, target table (orders_raw or line_items_flat), sourceFormat=NEWLINE_DELIMITED_JSON, writeDisposition=WRITE_APPEND.
- Poll job status until completion, or set up a small Cloud Function callback if you prefer asynchronous handling.
On success, write an audit row to a monitoring table: file_path, row_count, bq_job_id, status, ingestion_timestamp.
Failure handling: move failing files to a failed/ prefix in the bucket and notify ops.
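If you would rather trigger the load from a script (or want to see what the jobs.insert call amounts to), the client-library equivalent is short. A sketch; the dataset, table, and audit-table names follow the earlier steps and are assumptions.

```python
# Sketch: load a staged NDJSON file from GCS into a partitioned table and record an audit row.
from datetime import datetime, timezone

from google.cloud import bigquery

client = bigquery.Client()

def load_ndjson(gcs_uri: str, table_id: str) -> None:
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
        write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
    )
    job = client.load_table_from_uri(gcs_uri, table_id, job_config=job_config)
    job.result()  # blocks until the load completes; raises on failure

    # Audit row; assumes a monitoring table my_project.my_dataset.ingest_audit exists.
    client.insert_rows_json("my_project.my_dataset.ingest_audit", [{
        "file_path": gcs_uri,
        "row_count": job.output_rows,
        "bq_job_id": job.job_id,
        "status": "DONE",
        "ingestion_timestamp": datetime.now(timezone.utc).isoformat(),
    }])

# Example call; the path is a placeholder matching the staging pattern above.
load_ndjson("gs://my-bucket/shopify/ingest_date=20240101/hour=10/file_abc.ndjson",
            "my_project.my_dataset.line_items_flat")
```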
- Maintain idempotency and canonical tables
Because webhooks can re-deliver, keep raw webhook metadata in orders_raw. To create canonical order and line item tables for analytics, run periodic MERGE jobs in BigQuery that deduplicate by order_id and keep the latest ingestion_timestamp per order_id.
Example nightly MERGE pattern: MERGE dataset.orders_flat T USING (SELECT * FROM dataset.orders_raw WHERE DATE(ingestion_timestamp) = CURRENT_DATE()) S ON T.order_id = S.order_id WHEN MATCHED THEN UPDATE SET ... WHEN NOT MATCHED THEN INSERT ... Make sure the source subquery keeps only the latest row per order_id (for example with ROW_NUMBER and QUALIFY); otherwise the MERGE can fail when one order receives several webhooks in a day.
This yields a stable orders_flat table your BI team can rely on.
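The elided parts of that MERGE might look roughly like the sketch below, assuming shopify_payload is stored as a JSON column and using the field names from the schema step; the QUALIFY clause keeps only the latest webhook per order so the MERGE never sees duplicate source keys. Extend the column lists to cover the rest of orders_flat.

```python
# Sketch: nightly dedupe-and-merge from orders_raw into the canonical orders_flat table.
from google.cloud import bigquery

MERGE_SQL = """
MERGE `my_project.my_dataset.orders_flat` T
USING (
  SELECT
    JSON_VALUE(shopify_payload, '$.id')                            AS order_id,
    TIMESTAMP(JSON_VALUE(shopify_payload, '$.created_at'))         AS created_at,
    JSON_VALUE(shopify_payload, '$.customer.email')                AS customer_email,
    CAST(JSON_VALUE(shopify_payload, '$.total_price') AS NUMERIC)  AS total_price,
    JSON_VALUE(shopify_payload, '$.financial_status')              AS financial_status,
    ingestion_timestamp
  FROM `my_project.my_dataset.orders_raw`
  WHERE DATE(ingestion_timestamp) = CURRENT_DATE()
  -- keep only the most recent webhook per order_id
  QUALIFY ROW_NUMBER() OVER (
    PARTITION BY JSON_VALUE(shopify_payload, '$.id')
    ORDER BY ingestion_timestamp DESC
  ) = 1
) S
ON T.order_id = S.order_id
WHEN MATCHED THEN UPDATE SET
  created_at = S.created_at,
  customer_email = S.customer_email,
  total_price = S.total_price,
  financial_status = S.financial_status,
  ingestion_timestamp = S.ingestion_timestamp
WHEN NOT MATCHED THEN INSERT
  (order_id, created_at, customer_email, total_price, financial_status, ingestion_timestamp)
  VALUES (S.order_id, S.created_at, S.customer_email, S.total_price, S.financial_status, S.ingestion_timestamp)
"""

bigquery.Client().query(MERGE_SQL).result()
```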
- Build the daily Slack digest
Schedule a Make.com scenario to run once daily after your ETL window.
- Module: BigQuery — run a query returning metrics for the last day: total orders, GMV, refunds, exceptions (missing email, payment_pending), top N SKUs by quantity.
Sample query snippet: SELECT COUNT(DISTINCT order_id) AS total_orders, SUM(total_price) AS total_gmv FROM dataset.orders_flat WHERE DATE(ingestion_timestamp) = DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY);
- Module: Slack — Post a concise message including totals, refunds, exceptions, and top 5 SKUs, and link to a BI dashboard or the BigQuery query.
Message example:
Daily orders (UTC): 1,234 orders • $123,456
Refunds: $1,234 • Exceptions: 3 (missing email)
Top SKUs: SKU123 (320), SKU987 (210), SKU555 (198)
Link: <BigQuery query link|Open report>
Keep the digest focused so recipients can act quickly.
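To test the digest outside Make.com (or run it from a scheduler of your own), the same two calls can be scripted. A sketch using a Slack incoming webhook; the SLACK_WEBHOOK_URL environment variable and the table names are assumptions.

```python
# Sketch: query yesterday's totals and post a one-message digest to a Slack incoming webhook.
import os

import requests
from google.cloud import bigquery

DIGEST_SQL = """
SELECT
  COUNT(DISTINCT order_id) AS total_orders,
  SUM(total_price)         AS total_gmv
FROM `my_project.my_dataset.orders_flat`
WHERE DATE(ingestion_timestamp) = DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)
"""

row = list(bigquery.Client().query(DIGEST_SQL).result())[0]
text = (
    f"Daily orders (UTC): {row.total_orders:,} orders • ${(row.total_gmv or 0):,.0f}\n"
    "Link: <https://console.cloud.google.com/bigquery|Open report>"
)
requests.post(os.environ["SLACK_WEBHOOK_URL"], json={"text": text}, timeout=10)
```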
- Monitoring, retries and runbooks
- On GCS or BigQuery job errors, move the file to failed/ and notify ops with file path and job id.
- Maintain a small runbook: how to replay a failed NDJSON file, which MERGE to run, and where to find raw payloads.
- Track daily ingestion counts and set alerts if volume drops below expected levels.
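The two runbook actions you will use most, quarantining a bad file and replaying it later, each come down to one or two storage calls plus a re-run of the load job. A sketch of the quarantine step with the google-cloud-storage client; bucket and prefix names are placeholders.

```python
# Sketch: move a failed NDJSON file under a failed/ prefix (replay = move it back and reload).
from google.cloud import storage

def quarantine(bucket_name: str, object_path: str) -> str:
    bucket = storage.Client().bucket(bucket_name)
    blob = bucket.blob(object_path)
    failed_path = "failed/" + object_path
    # GCS has no server-side move; copy then delete the original.
    bucket.copy_blob(blob, bucket, failed_path)
    blob.delete()
    return f"gs://{bucket_name}/{failed_path}"  # include this path in the ops notification
```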
Real-World Business Scenario
A retail subscription service moved from nightly CSV exports to this pipeline. They staged NDJSON to GCS, ran BigQuery loads hourly, and ran a nightly MERGE to build the canonical orders_flat table. The daily Slack digest replaced a manual morning check, and they could reprocess failed files within minutes, saving hours of reconciliation.
Common Variations
- Low-volume option: use BigQuery streaming inserts for immediate visibility when volume is low, accepting streaming costs (a minimal sketch follows this list).
- Enrichment in-flight: call an enrichment API in Make.com and append firmographic or CLTV fields before staging to GCS.
- Anomaly alerts: run a secondary scheduled job that flags day-over-day GMV or order-count anomalies and posts urgent alerts to Slack.
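For the low-volume variation above, streaming inserts replace the whole GCS-and-load-job stage with a single API call per batch of normalised rows. A sketch; the table name is a placeholder, the example row stands in for the output of the normalisation step, and streaming inserts are billed per MB ingested.

```python
# Sketch: stream a normalised row straight into BigQuery (low volume only; billed per MB).
from google.cloud import bigquery

client = bigquery.Client()

order_record = {  # placeholder row; in practice use the output of the normalisation step
    "order_id": "1234567890",
    "total_price": 42.00,
    "currency": "USD",
    "ingestion_timestamp": "2024-01-01T00:00:00Z",
}

errors = client.insert_rows_json("my_project.my_dataset.orders_flat", [order_record])
if errors:
    raise RuntimeError(f"Streaming insert failed: {errors}")
```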
Where this fits your stack
This pattern moves you from ad-hoc CSV exports to an auditable, replayable ingestion pipeline: Shopify webhooks → Make.com orchestration → GCS staging → BigQuery loads → daily Slack digest. For lower-volume teams consider our Shopify to Google Sheets and Slack guide as an easier start. For operational fulfilment workflows, combine this with ShipStation and Monday.com patterns like ShipStation → Monday.com syncs.
If you want this built and maintained for your store and Google Cloud project, Olmec Dynamics implements these pipelines for businesses. See what we do at https://olmecdynamics.com
Tools Domains
- Shopify: shopify.com
- BigQuery: cloud.google.com
- Slack: slack.com
- Make.com: make.com
- Google Cloud Storage: cloud.google.com