Power BI Data Modeling: Building Effective Data Models
'Order'[OrderID], 'Order'[OrderDate], 'Order'[CustomerID], 'Order'[ProductID], "Quantity", SUM('OrderDetails'[Quantity]), "UnitPrice", AVERAGE('OrderDetails'[UnitPrice]), "TotalAmount", SUM('OrderDetails'[Quantity] * 'OrderDetails'[UnitPrice])``` )
-- Fact table characteristics: -- ✓ Measurable metrics -- ✓ Foreign keys only -- ✓ No descriptive text -- ✓ Additive measures -- ✗ Avoid unnecessary columns
### Creating Dimension Tables
```DAX
-- Date dimension (calendar table)
DateDimension =
ADDCOLUMNS(
```text
CALENDAR(DATE(2020, 1, 1), DATE(2030, 12, 31)),
"Year", YEAR([Date]),
"Quarter", "Q" & FORMAT([Date], "Q"),
"Month", FORMAT([Date], "MMM"),
"MonthNumber", MONTH([Date]),
"Day", DAY([Date]),
"DayOfWeek", FORMAT([Date], "DDD"),
"DayOfWeekNumber", WEEKDAY([Date]),
"YearMonth", FORMAT([Date], "YYYY-MM"),
"IsWeekend", WEEKDAY([Date]) IN {1, 7}```
)
-- Product dimension
Product =
SELECTCOLUMNS(
```text
'ProductTable',
"ProductID", [ProductID],
"ProductName", [Name],
"Category", [Category],
"Subcategory", [Subcategory],
"Brand", [Brand],
"Color", [Color],
"ListPrice", [ListPrice]```
)
-- Customer dimension
Customer =
SELECTCOLUMNS(
```text
'CustomerTable',
"CustomerID", [CustomerID],
"CustomerName", [FirstName] & " " & [LastName],
"Email", [Email],
"City", [City],
"State", [State],
"Country", [Country],
"Segment", [CustomerSegment]```
)
Table Relationships
Creating Relationships
Architecture Overview: Relationship types:
Relationship Properties
Key properties:
1. Cardinality:
- One-to-many (1:*)
- Many-to-one (*:1)
- One-to-one (1:1)
- Many-to-many (*:*)
2. Cross filter direction:
- Single: Dimension filters fact (standard)
- Both: Bidirectional filtering (use carefully)
3. Make this relationship active:
- Active: Used in calculations
- Inactive: Used with USERELATIONSHIP()
4. Assume referential integrity:
- Improves performance
- Use when all fact keys exist in dimension
Role-Playing Dimensions
Architecture Overview: Role playing dimension: Same dimension used multiple times
[Total Sales], USERELATIONSHIP(Sales[ShipDate], 'Date'[Date])``` )
## Calculated Columns vs Measures
### Calculated Columns
```DAX
-- Calculated column: Computed row-by-row at refresh
-- Full Name
Customer[Full Name] =
Customer[FirstName] & " " & Customer[LastName]
-- Age Group
Customer[Age Group] =
SWITCH(
```text
TRUE(),
Customer[Age] < 18, "Under 18",
Customer[Age] < 35, "18-34",
Customer[Age] < 55, "35-54",
"55+"```
)
-- Profit
Sales[Profit] = Sales[Revenue] - Sales[Cost]
-- Year-Month
Sales[YearMonth] = FORMAT(Sales[OrderDate], "YYYY-MM")
When to use:
✓ Needed for filtering/slicing
✓ Rarely changes
✓ Row context calculations
✗ Avoid for aggregations (use measures)
Measures
-- Measure: Computed dynamically based on filter context
-- Total Sales
Total Sales = SUM(Sales[Amount])
-- Total Quantity
Total Quantity = SUM(Sales[Quantity])
-- Average Sale
Average Sale = DIVIDE([Total Sales], [Total Quantity], 0)
-- Sales YoY Growth %
Sales YoY Growth % =
VAR CurrentYear = [Total Sales]
VAR PreviousYear =
```text
CALCULATE(
[Total Sales],
DATEADD('Date'[Date], -1, YEAR)
)```
RETURN
```text
DIVIDE(CurrentYear - PreviousYear, PreviousYear, 0)
-- Running Total Running Total = CALCULATE(
[Total Sales],
FILTER(
ALLSELECTED('Date'),
'Date'[Date] <= MAX('Date'[Date])
)```
)
When to use:
✓ Aggregations (SUM, AVG, COUNT)
✓ Dynamic calculations
✓ Context-dependent results
✓ Better performance for large datasets
DAX Fundamentals
Filter Context
Architecture Overview: Filter context: Filters applied to the model
[Total Sales], ALL(Product[Category]) -- Remove category filter``` )
Sales for Electronics = CALCULATE(
[Total Sales],
Product[Category] = "Electronics" -- Apply specific filter```
)
Row Context
-- Row context: Current row in table iteration
-- Calculated column has row context:
Sales[Line Total] = Sales[Quantity] * Sales[UnitPrice]
-- Iterate functions create row context:
Total Freight =
SUMX(
```text
Sales, -- Iterate over Sales table
Sales[Quantity] * Sales[FreightPerUnit] -- Row context```
)
-- Convert row context to filter context with CALCULATE:
Product Total Sales =
CALCULATE(
```text
[Total Sales] -- Uses filter context```
)
-- In calculated column, CALCULATE converts current row to filter
Time Intelligence
-- Requires proper Date table with Date type column
-- Year-to-Date
YTD Sales = TOTALYTD([Total Sales], 'Date'[Date])
-- Month-to-Date
MTD Sales = TOTALMTD([Total Sales], 'Date'[Date])
-- Quarter-to-Date
QTD Sales = TOTALQTD([Total Sales], 'Date'[Date])
-- Previous Year
PY Sales = CALCULATE([Total Sales], SAMEPERIODLASTYEAR('Date'[Date]))
-- Year-over-Year Growth
YoY Growth = [Total Sales] - [PY Sales]
YoY Growth % = DIVIDE([YoY Growth], [PY Sales], 0)
-- Moving Average (3 months)
Sales 3M MA =
CALCULATE(
```text
[Total Sales],
DATESINPERIOD(
'Date'[Date],
LASTDATE('Date'[Date]),
-3,
MONTH
)```
) / 3
-- Custom fiscal year (starts July)
FY Sales =
CALCULATE(
```text
[Total Sales],
DATESYTD('Date'[Date], "06-30")```
)
Data Model Optimization
Reducing Model Size
Architecture Overview: 1. Remove unnecessary columns:
Column Statistics
Architecture Overview: View column statistics:
Relationships Best Practices
1. Use integer keys:
- Better performance than text keys
- Surrogate keys recommended
2. Single direction filtering:
- Avoid bidirectional unless necessary
- Can cause ambiguity
3. Active relationships:
- One active path between tables
- Inactive for role-playing dimensions
4. Avoid many-to-many:
- Use bridge tables
- Better performance
Advanced Patterns
Slowly Changing Dimensions (SCD)
-- Type 1 SCD: Overwrite old values
-- Simply update dimension table
-- History is lost
-- Type 2 SCD: Preserve history
-- Add versioning columns to dimension:
Customer_SCD2 =
ADDCOLUMNS(
```text
Customer,
"ValidFrom", [EffectiveDate],
"ValidTo", [ExpirationDate],
"IsCurrent", [ExpirationDate] = DATE(9999, 12, 31)```
)
-- Filter for current records only
Current Customers =
FILTER(
```text
Customer_SCD2,
Customer_SCD2[IsCurrent] = TRUE```
)
-- Point-in-time query
Customers At Date =
FILTER(
```text
Customer_SCD2,
AND(
Customer_SCD2[ValidFrom] <= [Selected Date],
Customer_SCD2[ValidTo] > [Selected Date]
)```
)
Parent-Child Hierarchies
-- Employee table with Manager relationship
-- Calculate hierarchy level
Level =
PATHLENGTH(Employee[Path])
-- Get manager name
Manager Name =
LOOKUPVALUE(
```text
Employee[Name],
Employee[EmployeeID],
Employee[ManagerID]```
)
-- Aggregate up hierarchy
Total Sales Including Subordinates =
CALCULATE(
```text
[Total Sales],
TREATAS(
PATHCONTAINS(
Employee[Path],
Employee[EmployeeID]
),
Employee[EmployeeID]
)```
)
Many-to-Many with Bridge Table
Scenario: Products can belong to multiple categories
Tables:
- Product (ProductID, ProductName)
- Category (CategoryID, CategoryName)
- ProductCategory bridge (ProductID, CategoryID)
Relationships:
Product → ProductCategory (many-to-one)
Category → ProductCategory (many-to-one)
Enable many-to-many:
Set relationship cross filter direction to Both
Composite Models and Dual Storage Mode
Understanding Storage Modes
Power BI supports three storage modes per table:
1. Import Mode (Default)
- Data loaded into Power BI model
- Compressed using VertiPaq engine
- Fast query performance
- Requires refresh to update
- Limited by memory capacity
Use Cases:
✓ Historical analysis
✓ Complex calculations
✓ Moderate data volumes (<10GB compressed)
✓ Fast visual interactions required
2. DirectQuery Mode
- Data queried from source in real-time
- No data import, minimal memory usage
- Always up-to-date
- Query performance depends on source
- Limited DAX functionality
Use Cases:
✓ Real-time data requirements
✓ Very large datasets (>100GB)
✓ Data sovereignty/compliance needs
✓ Source has good query performance
3. Dual Mode (Hybrid)
- Table acts as Import or DirectQuery contextually
- Power BI chooses optimal mode per query
- Best of both worlds
Use Cases:
✓ Dimension tables in large DirectQuery models
✓ Improve performance while maintaining real-time fact data
Creating Composite Models
Scenario: Sales fact table (DirectQuery) + Dimension tables (Import)
Steps:
1. Connect to SQL with DirectQuery
2. Load Sales fact table (DirectQuery mode)
3. Add dimension tables:
- Home → Get Data → Other sources
- Select dimension tables
- Load as Import mode
4. Create relationships between Import and DirectQuery tables
5. Set dimension tables to Dual mode for optimal performance
Benefits:
- Real-time sales data (DirectQuery)
- Fast dimension filtering (Import/Dual)
- Reduced source query load
- Flexible refresh schedule for dimensions
Aggregation Tables
Aggregations dramatically improve large DirectQuery model performance by pre-calculating summaries.
Creating Aggregations:
// Original detail table (DirectQuery):
// Sales: 1B rows (OrderID, ProductID, OrderDate, CustomerID, Amount)
// Create aggregation table (Import):
Sales_Agg_Daily =
SUMMARIZECOLUMNS(
> **Architecture Overview:** Sales[OrderDate],
**Aggregation Best Practices:**
| Grain | Use Case | Performance Gain |
|-------|----------|------------------|
| Daily × Product | Sales by date/product | 100-1000× faster |
| Monthly × Category | Trend analysis | 1000-10000× faster |
| Yearly × Region | Executive dashboards | 10000+× faster |
```text
Guidelines:
✓ Create multiple aggregation levels (day, month, year)
✓ Include most-queried dimension combinations
✓ Keep aggregation tables in Import mode
✓ Detail table stays in DirectQuery
✗ Don't aggregate to single row (defeats purpose)
✗ Don't include low-cardinality dimensions only
DirectQuery Optimization
Query Reduction Settings
Architecture Overview: File → Options → Data Load
Native SQL Query Monitoring
Architecture Overview: Performance Analyzer → Start Recording → Interact with visual
DirectQuery Best Practices
// ❌ BAD: Complex calculated columns in DirectQuery
// These get recalculated per query
Sales[Profit] = Sales[Amount] - Sales[Cost] // Calculated column (avoid in DQ)
// ✅ GOOD: Pre-calculate in source
// Create view or computed column in SQL Server:
CREATE VIEW Sales_With_Profit AS
SELECT
```text
OrderID,
Amount,
Cost,
(Amount - Cost) AS Profit```
FROM Sales;
// ❌ BAD: Row-level security with complex logic in DirectQuery
[RLS Rule] =
CALCULATE(
```text
DISTINCTCOUNT(Customer[CustomerID]),
FILTER(Customer, Customer[Region] = USERPRINCIPALNAME())```
) > 0
// ✅ GOOD: Implement RLS in source database
-- SQL Server row-level security:
CREATE SECURITY POLICY CustomerFilter
ADD FILTER PREDICATE dbo.fn_securitypredicate(UserName)
ON dbo.Customer
WITH (STATE = ON);
Hybrid Table Patterns
Scenario: Recent data (hot) + Historical data (cold)
Implementation:
1. Partition table in source:
- Sales_Hot: Last 3 months (DirectQuery)
- Sales_Historical: Older data (Import)
2. Create union view in Power BI:
Sales =
UNION(
```text
Sales_Hot, // DirectQuery (real-time)
Sales_Historical // Import (fast performance)```
)
3. Benefits:
- Recent data always current
- Historical analysis performant
- Reduced source load
- Flexible refresh schedule (historical: weekly; hot: real-time)
Advanced Relationship Patterns
Bidirectional Filtering
Use with Caution:
Standard: Single direction (Dimension → Fact)
Product → Sales: Product filters Sales (one-to-many)
Bidirectional: Both directions
Product ↔ Sales: Product filters Sales AND Sales filters Product
When to use bidirectional:
✓ Many-to-many relationships
✓ Bridging tables
✓ Specific advanced scenarios
Risks:
✗ Ambiguous filter propagation
✗ Performance degradation
✗ Unexpected results in complex models
Example Scenario:
Tables: Sales ↔ Budget (many-to-many)
Goal: Show products with sales but no budget
Solution:
1. Create bridge table: ProductBridge (ProductID)
2. Relationships:
- Product → ProductBridge (one-to-many, single)
- Sales → ProductBridge (many-to-one, single)
- Budget → ProductBridge (many-to-one, single)
3. Enable bidirectional on ProductBridge → Product
Result: Sales and Budget cross-filter through bridge
Inactive Relationships with USERELATIONSHIP
// Date dimension with multiple roles
// Active relationship: Sales[OrderDate] → Date[Date]
Sales by Order Date = SUM(Sales[Amount])
// Inactive: Sales[ShipDate] → Date[Date]
Sales by Ship Date =
CALCULATE(
> **Architecture Overview:** SUM(Sales[Amount]),
SUM(Sales[Amount]),
USERELATIONSHIP(Sales[DueDate], 'Date'[Date])```
)
// Dynamic selection with parameter
Sales by Selected Date =
VAR SelectedRole = SELECTEDVALUE('DateRole'[Role], "OrderDate")
RETURN
```text
SWITCH(
SelectedRole,
"OrderDate", [Sales by Order Date],
"ShipDate", [Sales by Ship Date],
"DueDate", [Sales by Due Date],
BLANK()
)
### Circular Dependency Resolution
```text
Problem: A → B → C → A (circular relationship)
Example:
- Employee → Manager (Employee table self-reference)
- Sales → Product → Category → SubcategoryBridge → Sales
Solutions:
1. Flatten hierarchy (best for small hierarchies):
- Denormalize Category into Product table
- Single relationship: Sales → Product
2. Use calculated table to break cycle:
- Keep relationships: Sales → Product → Category
- Create: CategoryList = VALUES(Category)
- Use CategoryList for slicing (no relationship)
3. DAX-based filtering (for complex scenarios):
- Remove problematic relationship
- Use TREATAS or CROSSFILTER in measures
Data Model Optimization Techniques
Column Optimization
Reduce Cardinality:
1. Remove high-cardinality text columns:
❌ Keep: Customer[Email] (5M unique values)
✅ Remove: Use CustomerID for relationships only
2. Convert to low-cardinality:
❌ Sales[Comments] (free text, millions of unique values)
✅ Sales[Has Comments] (Boolean: TRUE/FALSE)
3. Optimize data types:
❌ Text: "2025-01-27" (larger storage)
✅ Date: 2025-01-27 (optimized storage)
❌ Decimal: 123.4567890 (8 bytes)
✅ Whole Number: 1234 (4 bytes) if decimals unnecessary
4. Use integer keys instead of text:
❌ Product[SKU] = "PROD-12345-XYZ" (text)
✅ Product[ProductID] = 12345 (integer, relationship key)
Model Compression Statistics
View model size:
External Tools → DAX Studio → View → DMV Browser → $SYSTEM.DISCOVER_STORAGE_TABLE_COLUMNS
Key metrics:
- Table Size: MB per table
- Column Cardinality: Unique values
- Dictionary Size: Compressed size
- Data Type: Optimization opportunities
Target reductions:
- Remove unused columns: 10-50% size reduction
- Optimize data types: 20-40% size reduction
- Reduce cardinality: 30-60% size reduction
- Remove calculated columns: 10-30% size reduction (replace with measures)
Incremental Refresh Configuration
Scenario: 10-year historical Sales table, daily updates
Configuration:
1. Define parameters (Power Query):
- RangeStart (DateTime)
- RangeEnd (DateTime)
2. Filter table using parameters:
Sales = Table.SelectRows(Source, each
[OrderDate] >= RangeStart and [OrderDate] < RangeEnd)
3. Configure incremental refresh policy (Model view):
- Right-click Sales table → Incremental refresh
- Archive data starting: 9 years before refresh date
- Incrementally refresh data starting: 7 days before refresh date
- Detect data changes: [OrderDate]
4. Benefits:
- Full refresh only needed for last 7 days (fast)
- Historical data (9+ years) never refreshed (stable)
- Automatic partition management
- 100× faster refresh times
Data Model Versioning and Governance
Model Documentation
// Table-level documentation
// Right-click table → Properties → Description
Sales_Description =
"Fact table: Sales transactions
Source: SQL Server - AdventureWorks.dbo.Sales
Refresh: Daily at 2 AM UTC
Grain: One row per order line item
Last Modified: 2025-01-27 by Vladimiro Luis"
// Measure-level documentation
Total Sales = SUM(Sales[Amount])
// Description: "Sum of all sales amounts. Includes discounts. Excludes returns (use [Net Sales] for returns-adjusted)."
// Relationship documentation (in model diagram)
// Add text boxes with notes:
"⚠️ OrderDate is active relationship
ShipDate and DueDate are inactive - use USERELATIONSHIP() in measures"
Change Tracking
Version control for Power BI models:
1. Save as PBIX + PBIT (template):
- PBIX: Full model with data
- PBIT: Model structure only (no data)
- Commit PBIT to Git for version control
2. Use external tools:
- Tabular Editor 2/3: Extract model as JSON
- Commit JSON to Git
- Track changes with diff tools
3. Document changes:
- Changelog table in model
- Version number in report title
- Release notes page
Changelog = #table(
```text
{"Version", "Date", "Changes", "Author"},
{
{"2.1", #date(2025,1,27), "Added sales aggregations", "Vladimiro"},
{"2.0", #date(2025,1,15), "Implemented composite model", "Vladimiro"},
{"1.5", #date(2024,12,10), "Added DirectQuery for real-time data", "Team"}
}```
)
Model Health Checks
// Create "Model Health" page with these measures:
Tables Count = COUNTROWS(INFO.TABLES())
Relationships Count = COUNTROWS(INFO.RELATIONSHIPS())
Measures Count = COUNTROWS(INFO.MEASURES())
Calculated Columns Count = COUNTROWS(INFO.COLUMNS("CALCULATED_COLUMN"))
Unused Columns =
VAR AllColumns = INFO.COLUMNS()
VAR UsedInVisuals = INFO.COLUMNS("USED_IN_VISUALS")
RETURN COUNTROWS(EXCEPT(AllColumns, UsedInVisuals))
Model Size MB =
SUMX(
```text
INFO.STORAGE_TABLE_COLUMNS(),
[DICTIONARY_SIZE] + [DATA_SIZE]```
) / 1024 / 1024
Refresh Duration =
"Last refresh: " & FORMAT([Last Refresh Time], "YYYY-MM-DD HH:mm")
// Alert: Tables without relationships
Orphaned Tables =
VAR AllTables = VALUES(INFO.TABLES[TABLE_NAME])
VAR TablesInRelationships =
```text
UNION(
VALUES(INFO.RELATIONSHIPS[FROM_TABLE]),
VALUES(INFO.RELATIONSHIPS[TO_TABLE])
)```
RETURN COUNTROWS(EXCEPT(AllTables, TablesInRelationships))
Performance Optimization
Figure: Power Apps form control – edit form with validation rules and error handling.
-- Bad: Multiple CALCULATE calls
Slow Measure =
CALCULATE([Total Sales], Product[Category] = "A") +
CALCULATE([Total Sales], Product[Category] = "B") +
CALCULATE([Total Sales], Product[Category] = "C")
-- Good: Single CALCULATE with filter
Fast Measure =
CALCULATE(
```text
[Total Sales],
Product[Category] IN {"A", "B", "C"}```
)
-- Bad: Nested CALCULATE
Slow =
CALCULATE(
```text
CALCULATE(
[Total Sales],
Product[Category] = "A"
),
'Date'[Year] = 2025```
)
-- Good: Combined filters
Fast =
CALCULATE(
```text
[Total Sales],
Product[Category] = "A",
'Date'[Year] = 2025```
)
-- Use variables for repeated calculations
Optimized Measure =
VAR TotalSales = [Total Sales]
VAR TotalCost = [Total Cost]
VAR Profit = TotalSales - TotalCost
VAR Margin = DIVIDE(Profit, TotalSales, 0)
RETURN
```text
IF(Margin > 0.2, "High", "Low")
## Data Model Design Patterns and Anti-Patterns
### ✅ Design Patterns (Best Practices)
**1. Conformed Dimensions**
```text
Pattern: Share dimension tables across multiple fact tables
Example:
- Sales fact → Date dimension
- Budget fact → Date dimension
- Inventory fact → Date dimension
(All use same Date dimension)
Benefits:
✓ Consistent filtering across facts
✓ Simplified model maintenance
✓ Reduced model size
✓ Easier user experience
Implementation:
- Create single Date dimension
- Establish relationships from all facts
- Use in slicers for cross-fact filtering
2. Junk Dimensions
Pattern: Combine low-cardinality flags into single dimension
Instead of:
- Sales[IsOnline] (Boolean)
- Sales[IsPriority] (Boolean)
- Sales[IsInternational] (Boolean)
- Sales[PaymentMethod] (3 values)
Create:
OrderAttributes dimension:
- AttributeID (1-24, all combinations)
- IsOnline (Y/N)
- IsPriority (Y/N)
- IsInternational (Y/N)
- PaymentMethod (Cash/Card/Online)
Benefits:
✓ Reduces fact table width
✓ Improves compression
✓ Easier to add new flags
✓ Better performance
3. Surrogate Keys
Pattern: Integer keys instead of natural keys
❌ Natural Key: Product[SKU] = "PROD-ELECT-12345-XYZ"
✅ Surrogate Key: Product[ProductID] = 12345
Benefits:
✓ Faster joins (integer vs text)
✓ Smaller storage
✓ Handles source key changes
✓ Better compression
Implementation:
-- Power Query: Add Index Column
= Table.AddIndexColumn(Source, "ProductID", 1, 1)
4. Snowflake to Star Conversion
Snowflake (normalized):
Sales → Product → Category → SubCategory (3 hops)
Star (denormalized):
Sales → Product (with Category, SubCategory columns)
Conversion in Power Query:
Product_Flattened =
```text
Table.ExpandTableColumn(
Table.NestedJoin(Product, "CategoryID", Category, "CategoryID", "Category"),
"Category", {"CategoryName", "SubCategoryName"}
)
Benefits: ✓ Faster queries (fewer joins) ✓ Simplified relationships ✓ Better for business users
**5. Fact Table Consolidation**
```text
Multiple fact tables with same grain → Single fact table with type dimension
Instead of:
- Sales_Online (OrderID, Amount, Quantity)
- Sales_Store (OrderID, Amount, Quantity)
- Sales_Phone (OrderID, Amount, Quantity)
Create:
- Sales (OrderID, Amount, Quantity, ChannelID)
- Channel (ChannelID, ChannelName: Online/Store/Phone)
Benefits:
✓ Simpler model
✓ Unified analysis
✓ Easier maintenance
✓ Single source of truth
Implementation:
Combined_Sales = Table.Combine({
```text
Table.AddColumn(Sales_Online, "Channel", each "Online"),
Table.AddColumn(Sales_Store, "Channel", each "Store"),
Table.AddColumn(Sales_Phone, "Channel", each "Phone")```
})
❌ Anti-Patterns (Common Mistakes)
1. Multiple Fact Tables at Different Grains
Architecture Overview: ❌ WRONG:
2. Bidirectional Relationships Everywhere
Architecture Overview: ❌ WRONG:
3. Many Calculated Columns
Architecture Overview: ❌ WRONG:
4. Import Mode for Real-Time Data
Architecture Overview: ❌ WRONG:
5. No Date Table
Architecture Overview: ❌ WRONG:
6. Circular Relationships
Architecture Overview: ❌ WRONG:
Decision Framework
When to Import vs DirectQuery:
| Factor | Import | DirectQuery |
|---|---|---|
| Data size | < 10GB compressed | > 10GB or very large |
| Refresh frequency | Daily or less frequent | Real-time required |
| Query performance need | Critical (sub-second) | Acceptable (1-5 seconds) |
| Data sovereignty | Flexible | Must stay in source |
| DAX complexity | Full DAX support | Limited DAX support |
| Network reliability | Not critical | Must be reliable |
When to Use Calculated Column vs Measure:
| Use Calculated Column When: | Use Measure When: |
|---|---|
| Need to filter/slice by result | Aggregating data (SUM, AVG, COUNT) |
| Value is static (e.g., Full Name) | Result changes with filter context |
| Required in relationships | Performance is critical |
| Low cardinality result | Dynamic business logic |
When to Use Bidirectional Relationships:
✅ Use bidirectional when:
- Many-to-many with bridge table
- Specific measure requires it
- Model is small and performance acceptable
❌ Avoid bidirectional when:
- Standard star schema (use single direction)
- Performance is critical
- Model has many tables (complexity)
- Can achieve same result with DAX
Architecture Decision and Tradeoffs
When designing business intelligence solutions with Power BI, consider these key architectural trade-offs:
| Approach | Best For | Tradeoff |
|---|---|---|
| Managed / platform service | Rapid delivery, reduced ops burden | Less customisation, potential vendor lock-in |
| Custom / self-hosted | Full control, advanced tuning | Higher operational overhead and cost |
Recommendation: Start with the managed approach for most workloads and move to custom only when specific requirements demand it.
Validation and Versioning
- Last validated: April 2026
- Validate examples against your tenant, region, and SKU constraints before production rollout.
- Keep module, CLI, and SDK versions pinned in automation pipelines and review quarterly.
Security and Governance Considerations
- Apply least-privilege access using RBAC roles and just-in-time elevation for admin tasks.
- Store secrets in managed secret stores and avoid embedding credentials in scripts or source files.
- Enable audit logging, data protection policies, and periodic access reviews for regulated workloads.
Cost and Performance Notes
- Define budgets and alerts, then monitor usage and cost trends continuously after go-live.
- Baseline performance with synthetic and real-user checks before and after major changes.
- Scale resources with measured thresholds and revisit sizing after usage pattern changes.
Official Microsoft References
- https://learn.microsoft.com/power-bi/
- https://learn.microsoft.com/power-bi/guidance/
- https://learn.microsoft.com/fabric/
Public Examples from Official Sources
- These examples are sourced from official public Microsoft documentation and sample repositories.
- Documentation examples: https://learn.microsoft.com/power-bi/
- Sample repositories: https://github.com/microsoft/PowerBI-Developer-Samples
- Prefer adapting these examples to your tenant, subscriptions, and governance requirements before production use.
Key Takeaways
-
Star schema is the foundation: Fact tables (measures) + dimension tables (attributes) = optimal analytics model
-
Relationships drive filtering: One-to-many from dimensions to facts; understand cardinality and filter direction
-
Storage mode matters: Import for performance, DirectQuery for real-time, Composite for best of both worlds
-
Calculated columns vs measures: Columns computed at refresh (static), measures at query time (dynamic) - prefer measures
-
Date dimension is mandatory: Contiguous date table marked as date table enables time intelligence functions
-
Aggregations accelerate DirectQuery: Pre-calculate summaries for 100-1000× performance gains on large datasets
-
Multiple relationships via USERELATIONSHIP: Role-playing dimensions (OrderDate, ShipDate) with inactive relationships
-
Model optimization reduces size: Remove unused columns, optimize data types, use integer keys, reduce cardinality
-
Incremental refresh for large tables: Partition by date, refresh recent data only, archive historical data
-
Documentation prevents confusion: Comment tables/measures, track versions, create model health dashboards
-
Design patterns prevent rework: Conformed dimensions, junk dimensions, surrogate keys, fact consolidation
-
Anti-patterns cause problems: Avoid many calculated columns, bidirectional relationships everywhere, circular dependencies
-
Composite models enable hybrid scenarios: Real-time facts (DirectQuery) + performant dimensions (Import/Dual)
-
Performance profiling is essential: Use Performance Analyzer and DAX Studio to identify and fix bottlenecks
-
Filter context understanding is critical: Visuals create filter context; relationships propagate it; CALCULATE modifies it
Next Steps
- Design star schema for your data
- Create date dimension table
- Establish table relationships
- Build core measures library
- Implement time intelligence
- Optimize model size
- Test query performance
- Document data model
Additional Resources
Model. Relate. Calculate. Visualize.
Discussion