Analyzing Queries
FlowScope supports multiple ways to input SQL for analysis and handles a wide range of SQL statement types. This guide covers input methods, supported queries, and how to interpret validation results.
Input Methods
Web App
The web app at flowscope.pondpilot.io offers several input methods:
- SQL Editor - Type or paste SQL directly with syntax highlighting
- File Upload - Drag and drop .sql files
- Multi-File Projects - Upload related SQL files for cross-file lineage
The command-line tool accepts SQL from files or stdin:
```bash
# Single file
flowscope query.sql

# Multiple files
flowscope etl/*.sql

# From stdin
echo "SELECT * FROM orders" | flowscope

# With dialect
flowscope -d snowflake warehouse_queries.sql
```
The NPM package accepts SQL as a string:
```ts
import { analyzeSql } from '@pondpilot/flowscope-core';

const result = await analyzeSql({
  sql: 'SELECT id, name FROM users',
  dialect: 'postgresql'
});
```
Supported Statement Types
FlowScope extracts lineage from these SQL statement types:
Data Manipulation (DML)
| Statement | Lineage Extracted |
|---|---|
| SELECT | Source tables, columns, transformations |
| INSERT INTO ... SELECT | Source-to-target column mappings |
| UPDATE ... SET | Modified columns and their sources |
| DELETE | Affected table and filter conditions |
| MERGE | Source, target, and matched/unmatched actions |
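For example, an INSERT INTO ... SELECT like the sketch below (table and column names are invented for illustration) is the kind of statement from which FlowScope extracts source-to-target column mappings:

```sql
-- Illustrative only: lineage maps daily_totals.order_day and daily_totals.total
-- back to orders.created_at and orders.amount
INSERT INTO daily_totals (order_day, total)
SELECT CAST(created_at AS DATE), SUM(amount)
FROM orders
GROUP BY CAST(created_at AS DATE);
```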
Data Definition (DDL)
| Statement | Lineage Extracted |
|---|---|
| CREATE TABLE AS SELECT | Full lineage from SELECT to new table |
| CREATE VIEW | View definition lineage |
| CREATE TABLE | Schema metadata (used for column resolution) |
| DROP TABLE/VIEW | Tracks schema changes |
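For instance, a CREATE VIEW such as this sketch (table and column names are invented for illustration) yields view definition lineage from the underlying tables to the view:

```sql
-- Illustrative only: customers and orders feed the customer_totals view
CREATE VIEW customer_totals AS
SELECT c.id, c.name, SUM(o.amount) AS lifetime_value
FROM customers c
JOIN orders o ON o.customer_id = c.id
GROUP BY c.id, c.name;
```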
Query Components
| Component | Support Level |
|---|---|
| Common Table Expressions (CTEs) | Full lineage tracking |
| Subqueries | Correlated and uncorrelated |
| Set Operations | UNION, INTERSECT, EXCEPT |
| Window Functions | OVER clause column tracking |
| JOINs | All join types with column correlation |
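The illustrative query below (invented table and column names) combines several of the components listed above - a CTE, a JOIN, and a window function - in a single statement:

```sql
-- Illustrative only: CTE, JOIN, and OVER clause in one query
WITH recent_orders AS (
  SELECT id, customer_id, amount, created_at
  FROM orders
  WHERE created_at >= DATE '2024-01-01'
)
SELECT
  c.name,
  r.amount,
  SUM(r.amount) OVER (PARTITION BY r.customer_id ORDER BY r.created_at) AS running_total
FROM recent_orders r
JOIN customers c ON c.id = r.customer_id;
```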
Query Validation
FlowScope validates your SQL and reports issues at different severity levels:
Issue Severities
| Severity | Meaning |
|---|---|
| Error | Query cannot be parsed; lineage unavailable |
| Warning | Query parsed but some constructs unrecognized |
| Info | Hints for better analysis (e.g., missing schema) |
Common Validation Messages
“Unknown table: orders” The table is referenced but not defined in the schema context. FlowScope still generates lineage with the table marked as external.
“Cannot resolve column: SELECT *” Without schema metadata, FlowScope cannot expand wildcards. Provide CREATE TABLE statements for accurate column-level lineage.
“Unsupported statement type” Some specialized statements (e.g., EXPLAIN, GRANT) don’t produce lineage. FlowScope skips them and continues with other statements.
“Dialect-specific syntax not recognized” Try selecting a more specific dialect. Some functions are dialect-specific.
Handling Parse Errors
When FlowScope encounters syntax errors:
- Partial Lineage - FlowScope attempts to extract lineage from valid portions
- Error Spans - The exact location of the error is highlighted
- Suggestions - Common fixes are suggested when possible
Example error output:
```
Error at line 5, column 12: Unexpected token 'FRMO' - did you mean 'FROM'?
SELECT id, name FRMO users
                ^^^^
```
Multi-Statement Analysis
FlowScope handles files with multiple SQL statements:
```sql
-- First statement
CREATE TABLE staging AS
SELECT id, amount FROM raw_data;

-- Second statement
CREATE TABLE summary AS
SELECT id, SUM(amount) as total
FROM staging
GROUP BY id;
```
The analysis shows:
- Statement 1: raw_data → staging
- Statement 2: staging → summary
- Cross-statement lineage: raw_data → staging → summary
Schema Context
Providing schema metadata improves analysis accuracy:
Without Schema
```sql
SELECT * FROM orders
```
Result: Single node for orders with unknown columns.
With Schema
Section titled “With Schema”-- Schema contextCREATE TABLE orders (id INT, customer_id INT, amount DECIMAL);
-- QuerySELECT * FROM ordersResult: orders node with columns id, customer_id, amount explicitly shown.
Schema Sources
- Manual DDL - Paste CREATE TABLE statements
- File Upload - Upload schema definition files
- Database Connection (CLI) - Connect directly to PostgreSQL, MySQL, or SQLite
```bash
# CLI with live schema introspection
flowscope --metadata-url postgres://user:pass@localhost/db query.sql
```
Statement Splitting
FlowScope intelligently splits SQL files into individual statements, handling:
- Standard semicolon delimiters
- Dollar-quoted strings (PostgreSQL)
- Nested BEGIN/END blocks
- Comment preservation
```sql
-- Correctly split despite complex quoting
CREATE FUNCTION example() RETURNS void AS $$
BEGIN
  -- This semicolon doesn't end the statement
  SELECT 1;
END;
$$ LANGUAGE plpgsql; -- Statement ends here

SELECT * FROM orders; -- Second statement
```