Power BI Data Modeling: Building Effective Data Models

'Order'[OrderID], 'Order'[OrderDate], 'Order'[CustomerID], 'Order'[ProductID], "Quantity", SUM('OrderDetails'[Quantity]), "UnitPrice", AVERAGE('OrderDetails'[UnitPrice]), "TotalAmount", SUM('OrderDetails'[Quantity] * 'OrderDetails'[UnitPrice])``` )

-- Fact table characteristics: -- ✓ Measurable metrics -- ✓ Foreign keys only -- ✓ No descriptive text -- ✓ Additive measures -- ✗ Avoid unnecessary columns


### Creating Dimension Tables

```DAX
-- Date dimension (calendar table)
DateDimension = 
ADDCOLUMNS(
```text
CALENDAR(DATE(2020, 1, 1), DATE(2030, 12, 31)),
"Year", YEAR([Date]),
"Quarter", "Q" & FORMAT([Date], "Q"),
"Month", FORMAT([Date], "MMM"),
"MonthNumber", MONTH([Date]),
"Day", DAY([Date]),
"DayOfWeek", FORMAT([Date], "DDD"),
"DayOfWeekNumber", WEEKDAY([Date]),
"YearMonth", FORMAT([Date], "YYYY-MM"),
"IsWeekend", WEEKDAY([Date]) IN {1, 7}```
)

-- Product dimension
Product = 
SELECTCOLUMNS(
```text
'ProductTable',
"ProductID", [ProductID],
"ProductName", [Name],
"Category", [Category],
"Subcategory", [Subcategory],
"Brand", [Brand],
"Color", [Color],
"ListPrice", [ListPrice]```
)

-- Customer dimension
Customer = 
SELECTCOLUMNS(
```text
'CustomerTable',
"CustomerID", [CustomerID],
"CustomerName", [FirstName] & " " & [LastName],
"Email", [Email],
"City", [City],
"State", [State],
"Country", [Country],
"Segment", [CustomerSegment]```
)

Table Relationships

Creating Relationships

Architecture Overview: Relationship types:

Relationship Properties

Key properties:

1. Cardinality:
   - One-to-many (1:*)
   - Many-to-one (*:1)
   - One-to-one (1:1)
   - Many-to-many (*:*)

2. Cross filter direction:
   - Single: Dimension filters fact (standard)
   - Both: Bidirectional filtering (use carefully)
   
3. Make this relationship active:
   - Active: Used in calculations
   - Inactive: Used with USERELATIONSHIP()

4. Assume referential integrity:
   - Improves performance
   - Use when all fact keys exist in dimension

Role-Playing Dimensions

Architecture Overview: Role playing dimension: Same dimension used multiple times

[Total Sales], USERELATIONSHIP(Sales[ShipDate], 'Date'[Date])``` )


## Calculated Columns vs Measures

### Calculated Columns





```DAX
-- Calculated column: Computed row-by-row at refresh

-- Full Name
Customer[Full Name] = 
Customer[FirstName] & " " & Customer[LastName]

-- Age Group
Customer[Age Group] = 
SWITCH(
```text
TRUE(),
Customer[Age] < 18, "Under 18",
Customer[Age] < 35, "18-34",
Customer[Age] < 55, "35-54",
"55+"```
)

-- Profit
Sales[Profit] = Sales[Revenue] - Sales[Cost]

-- Year-Month
Sales[YearMonth] = FORMAT(Sales[OrderDate], "YYYY-MM")

When to use:
✓ Needed for filtering/slicing
✓ Rarely changes
✓ Row context calculations
✗ Avoid for aggregations (use measures)

Measures

-- Measure: Computed dynamically based on filter context

-- Total Sales
Total Sales = SUM(Sales[Amount])

-- Total Quantity
Total Quantity = SUM(Sales[Quantity])

-- Average Sale
Average Sale = DIVIDE([Total Sales], [Total Quantity], 0)

-- Sales YoY Growth %
Sales YoY Growth % = 
VAR CurrentYear = [Total Sales]
VAR PreviousYear = 
```text
CALCULATE(
    [Total Sales],
    DATEADD('Date'[Date], -1, YEAR)
)```
RETURN
```text
DIVIDE(CurrentYear - PreviousYear, PreviousYear, 0)

-- Running Total Running Total = CALCULATE(

[Total Sales],
FILTER(
    ALLSELECTED('Date'),
    'Date'[Date] <= MAX('Date'[Date])
)```
)

When to use:
✓ Aggregations (SUM, AVG, COUNT)
✓ Dynamic calculations
✓ Context-dependent results
✓ Better performance for large datasets

DAX Fundamentals

Filter Context

Architecture Overview: Filter context: Filters applied to the model

[Total Sales], ALL(Product[Category]) -- Remove category filter``` )

Sales for Electronics = CALCULATE(

[Total Sales],
Product[Category] = "Electronics"  -- Apply specific filter```
)

Row Context

-- Row context: Current row in table iteration

-- Calculated column has row context:
Sales[Line Total] = Sales[Quantity] * Sales[UnitPrice]

-- Iterate functions create row context:
Total Freight = 
SUMX(
```text
Sales,  -- Iterate over Sales table
Sales[Quantity] * Sales[FreightPerUnit]  -- Row context```
)

-- Convert row context to filter context with CALCULATE:
Product Total Sales = 
CALCULATE(
```text
[Total Sales]  -- Uses filter context```
)
-- In calculated column, CALCULATE converts current row to filter

Time Intelligence

-- Requires proper Date table with Date type column

-- Year-to-Date
YTD Sales = TOTALYTD([Total Sales], 'Date'[Date])

-- Month-to-Date
MTD Sales = TOTALMTD([Total Sales], 'Date'[Date])

-- Quarter-to-Date
QTD Sales = TOTALQTD([Total Sales], 'Date'[Date])

-- Previous Year
PY Sales = CALCULATE([Total Sales], SAMEPERIODLASTYEAR('Date'[Date]))

-- Year-over-Year Growth
YoY Growth = [Total Sales] - [PY Sales]

YoY Growth % = DIVIDE([YoY Growth], [PY Sales], 0)

-- Moving Average (3 months)
Sales 3M MA = 
CALCULATE(
```text
[Total Sales],
DATESINPERIOD(
    'Date'[Date],
    LASTDATE('Date'[Date]),
    -3,
    MONTH
)```
) / 3

-- Custom fiscal year (starts July)
FY Sales = 
CALCULATE(
```text
[Total Sales],
DATESYTD('Date'[Date], "06-30")```
)

Data Model Optimization

Reducing Model Size

Architecture Overview: 1. Remove unnecessary columns:

Column Statistics

Architecture Overview: View column statistics:

Relationships Best Practices

1. Use integer keys:
   - Better performance than text keys
   - Surrogate keys recommended

2. Single direction filtering:
   - Avoid bidirectional unless necessary
   - Can cause ambiguity

3. Active relationships:
   - One active path between tables
   - Inactive for role-playing dimensions

4. Avoid many-to-many:
   - Use bridge tables
   - Better performance

Advanced Patterns

Slowly Changing Dimensions (SCD)

-- Type 1 SCD: Overwrite old values
-- Simply update dimension table
-- History is lost

-- Type 2 SCD: Preserve history
-- Add versioning columns to dimension:

Customer_SCD2 = 
ADDCOLUMNS(
```text
Customer,
"ValidFrom", [EffectiveDate],
"ValidTo", [ExpirationDate],
"IsCurrent", [ExpirationDate] = DATE(9999, 12, 31)```
)

-- Filter for current records only
Current Customers = 
FILTER(
```text
Customer_SCD2,
Customer_SCD2[IsCurrent] = TRUE```
)

-- Point-in-time query
Customers At Date = 
FILTER(
```text
Customer_SCD2,
AND(
    Customer_SCD2[ValidFrom] <= [Selected Date],
    Customer_SCD2[ValidTo] > [Selected Date]
)```
)

Parent-Child Hierarchies

-- Employee table with Manager relationship

-- Calculate hierarchy level
Level = 
PATHLENGTH(Employee[Path])

-- Get manager name
Manager Name = 
LOOKUPVALUE(
```text
Employee[Name],
Employee[EmployeeID],
Employee[ManagerID]```
)

-- Aggregate up hierarchy
Total Sales Including Subordinates = 
CALCULATE(
```text
[Total Sales],
TREATAS(
    PATHCONTAINS(
        Employee[Path],
        Employee[EmployeeID]
    ),
    Employee[EmployeeID]
)```
)

Many-to-Many with Bridge Table

Scenario: Products can belong to multiple categories

Tables:
- Product (ProductID, ProductName)
- Category (CategoryID, CategoryName)
- ProductCategory bridge (ProductID, CategoryID)

Relationships:
Product → ProductCategory (many-to-one)
Category → ProductCategory (many-to-one)

Enable many-to-many:
Set relationship cross filter direction to Both

Composite Models and Dual Storage Mode

Understanding Storage Modes

Power BI supports three storage modes per table:

1. Import Mode (Default)

Data loaded into Power BI model
Compressed using VertiPaq engine
Fast query performance
Requires refresh to update
Limited by memory capacity

Use Cases:
✓ Historical analysis
✓ Complex calculations
✓ Moderate data volumes (<10GB compressed)
✓ Fast visual interactions required

2. DirectQuery Mode

Data queried from source in real-time
No data import, minimal memory usage
Always up-to-date
Query performance depends on source
Limited DAX functionality

Use Cases:
✓ Real-time data requirements
✓ Very large datasets (>100GB)


✓ Data sovereignty/compliance needs
✓ Source has good query performance

3. Dual Mode (Hybrid)

Table acts as Import or DirectQuery contextually
Power BI chooses optimal mode per query
Best of both worlds

Use Cases:
✓ Dimension tables in large DirectQuery models
✓ Improve performance while maintaining real-time fact data

Creating Composite Models

Scenario: Sales fact table (DirectQuery) + Dimension tables (Import)

Steps:
1. Connect to SQL with DirectQuery
2. Load Sales fact table (DirectQuery mode)
3. Add dimension tables:
   - Home → Get Data → Other sources
   - Select dimension tables
   - Load as Import mode
4. Create relationships between Import and DirectQuery tables
5. Set dimension tables to Dual mode for optimal performance

Benefits:
- Real-time sales data (DirectQuery)
- Fast dimension filtering (Import/Dual)
- Reduced source query load
- Flexible refresh schedule for dimensions

Aggregation Tables

Aggregations dramatically improve large DirectQuery model performance by pre-calculating summaries.

Creating Aggregations:

// Original detail table (DirectQuery):
// Sales: 1B rows (OrderID, ProductID, OrderDate, CustomerID, Amount)

// Create aggregation table (Import):
Sales_Agg_Daily = 
SUMMARIZECOLUMNS(

> **Architecture Overview:** Sales[OrderDate],


**Aggregation Best Practices:**

| Grain | Use Case | Performance Gain |
|-------|----------|------------------|
| Daily × Product | Sales by date/product | 100-1000× faster |
| Monthly × Category | Trend analysis | 1000-10000× faster |
| Yearly × Region | Executive dashboards | 10000+× faster |

```text
Guidelines:
✓ Create multiple aggregation levels (day, month, year)
✓ Include most-queried dimension combinations
✓ Keep aggregation tables in Import mode
✓ Detail table stays in DirectQuery
✗ Don't aggregate to single row (defeats purpose)
✗ Don't include low-cardinality dimensions only

DirectQuery Optimization

Query Reduction Settings

Architecture Overview: File → Options → Data Load

Native SQL Query Monitoring

Architecture Overview: Performance Analyzer → Start Recording → Interact with visual

DirectQuery Best Practices

// ❌ BAD: Complex calculated columns in DirectQuery
// These get recalculated per query
Sales[Profit] = Sales[Amount] - Sales[Cost]  // Calculated column (avoid in DQ)

// ✅ GOOD: Pre-calculate in source
// Create view or computed column in SQL Server:
CREATE VIEW Sales_With_Profit AS
SELECT
```text
OrderID,
Amount,
Cost,
(Amount - Cost) AS Profit```
FROM Sales;

// ❌ BAD: Row-level security with complex logic in DirectQuery
[RLS Rule] = 
CALCULATE(
```text
DISTINCTCOUNT(Customer[CustomerID]),
FILTER(Customer, Customer[Region] = USERPRINCIPALNAME())```
) > 0

// ✅ GOOD: Implement RLS in source database
-- SQL Server row-level security:
CREATE SECURITY POLICY CustomerFilter
ADD FILTER PREDICATE dbo.fn_securitypredicate(UserName)
ON dbo.Customer
WITH (STATE = ON);

Hybrid Table Patterns

Scenario: Recent data (hot) + Historical data (cold)

Implementation:
1. Partition table in source:
   - Sales_Hot: Last 3 months (DirectQuery)
   - Sales_Historical: Older data (Import)

2. Create union view in Power BI:
Sales = 
UNION(
```text
Sales_Hot,        // DirectQuery (real-time)
Sales_Historical  // Import (fast performance)```
)

3. Benefits:
   - Recent data always current
   - Historical analysis performant
   - Reduced source load
   - Flexible refresh schedule (historical: weekly; hot: real-time)

Advanced Relationship Patterns

Bidirectional Filtering

Use with Caution:

Standard: Single direction (Dimension → Fact)
Product → Sales: Product filters Sales (one-to-many)

Bidirectional: Both directions
Product ↔ Sales: Product filters Sales AND Sales filters Product

When to use bidirectional:
✓ Many-to-many relationships
✓ Bridging tables
✓ Specific advanced scenarios

Risks:
✗ Ambiguous filter propagation
✗ Performance degradation
✗ Unexpected results in complex models

Example Scenario:

Tables: Sales ↔ Budget (many-to-many)

Goal: Show products with sales but no budget

Solution:
1. Create bridge table: ProductBridge (ProductID)
2. Relationships:
   - Product → ProductBridge (one-to-many, single)
   - Sales → ProductBridge (many-to-one, single)
   - Budget → ProductBridge (many-to-one, single)
3. Enable bidirectional on ProductBridge → Product

Result: Sales and Budget cross-filter through bridge

Inactive Relationships with USERELATIONSHIP

// Date dimension with multiple roles

// Active relationship: Sales[OrderDate] → Date[Date]
Sales by Order Date = SUM(Sales[Amount])

// Inactive: Sales[ShipDate] → Date[Date]
Sales by Ship Date = 
CALCULATE(

> **Architecture Overview:** SUM(Sales[Amount]),

SUM(Sales[Amount]),
USERELATIONSHIP(Sales[DueDate], 'Date'[Date])```
)

// Dynamic selection with parameter
Sales by Selected Date = 
VAR SelectedRole = SELECTEDVALUE('DateRole'[Role], "OrderDate")
RETURN
```text
SWITCH(
    SelectedRole,
    "OrderDate", [Sales by Order Date],
    "ShipDate", [Sales by Ship Date],
    "DueDate", [Sales by Due Date],
    BLANK()
)


### Circular Dependency Resolution

```text
Problem: A → B → C → A (circular relationship)

Example:
- Employee → Manager (Employee table self-reference)
- Sales → Product → Category → SubcategoryBridge → Sales

Solutions:

1. Flatten hierarchy (best for small hierarchies):
   - Denormalize Category into Product table
   - Single relationship: Sales → Product

2. Use calculated table to break cycle:
   - Keep relationships: Sales → Product → Category
   - Create: CategoryList = VALUES(Category)
   - Use CategoryList for slicing (no relationship)

3. DAX-based filtering (for complex scenarios):
   - Remove problematic relationship
   - Use TREATAS or CROSSFILTER in measures

Data Model Optimization Techniques

Column Optimization

Reduce Cardinality:

1. Remove high-cardinality text columns:
   ❌ Keep: Customer[Email] (5M unique values)
   ✅ Remove: Use CustomerID for relationships only

2. Convert to low-cardinality:
   ❌ Sales[Comments] (free text, millions of unique values)
   ✅ Sales[Has Comments] (Boolean: TRUE/FALSE)

3. Optimize data types:
   ❌ Text: "2025-01-27" (larger storage)
   ✅ Date: 2025-01-27 (optimized storage)
   
   ❌ Decimal: 123.4567890 (8 bytes)
   ✅ Whole Number: 1234 (4 bytes) if decimals unnecessary

4. Use integer keys instead of text:
   ❌ Product[SKU] = "PROD-12345-XYZ" (text)
   ✅ Product[ProductID] = 12345 (integer, relationship key)

Model Compression Statistics

View model size:
External Tools → DAX Studio → View → DMV Browser → $SYSTEM.DISCOVER_STORAGE_TABLE_COLUMNS

Key metrics:
- Table Size: MB per table
- Column Cardinality: Unique values
- Dictionary Size: Compressed size
- Data Type: Optimization opportunities

Target reductions:
- Remove unused columns: 10-50% size reduction
- Optimize data types: 20-40% size reduction
- Reduce cardinality: 30-60% size reduction
- Remove calculated columns: 10-30% size reduction (replace with measures)

Incremental Refresh Configuration

Scenario: 10-year historical Sales table, daily updates

Configuration:
1. Define parameters (Power Query):
   - RangeStart (DateTime)
   - RangeEnd (DateTime)

2. Filter table using parameters:
   Sales = Table.SelectRows(Source, each 
       [OrderDate] >= RangeStart and [OrderDate] < RangeEnd)

3. Configure incremental refresh policy (Model view):
   - Right-click Sales table → Incremental refresh
   - Archive data starting: 9 years before refresh date
   - Incrementally refresh data starting: 7 days before refresh date
   - Detect data changes: [OrderDate]

4. Benefits:
   - Full refresh only needed for last 7 days (fast)
   - Historical data (9+ years) never refreshed (stable)
   - Automatic partition management
   - 100× faster refresh times

Data Model Versioning and Governance

Model Documentation

// Table-level documentation
// Right-click table → Properties → Description

Sales_Description = 
"Fact table: Sales transactions
Source: SQL Server - AdventureWorks.dbo.Sales
Refresh: Daily at 2 AM UTC
Grain: One row per order line item
Last Modified: 2025-01-27 by Vladimiro Luis"

// Measure-level documentation
Total Sales = SUM(Sales[Amount])
// Description: "Sum of all sales amounts. Includes discounts. Excludes returns (use [Net Sales] for returns-adjusted)."

// Relationship documentation (in model diagram)
// Add text boxes with notes:
"⚠️ OrderDate is active relationship
ShipDate and DueDate are inactive - use USERELATIONSHIP() in measures"

Change Tracking

Version control for Power BI models:

1. Save as PBIX + PBIT (template):
   - PBIX: Full model with data
   - PBIT: Model structure only (no data)
   - Commit PBIT to Git for version control

2. Use external tools:
   - Tabular Editor 2/3: Extract model as JSON
   - Commit JSON to Git
   - Track changes with diff tools

3. Document changes:
   - Changelog table in model
   - Version number in report title
   - Release notes page

Changelog = #table(
```text
{"Version", "Date", "Changes", "Author"},
{
    {"2.1", #date(2025,1,27), "Added sales aggregations", "Vladimiro"},
    {"2.0", #date(2025,1,15), "Implemented composite model", "Vladimiro"},
    {"1.5", #date(2024,12,10), "Added DirectQuery for real-time data", "Team"}
}```
)

Model Health Checks

// Create "Model Health" page with these measures:

Tables Count = COUNTROWS(INFO.TABLES())

Relationships Count = COUNTROWS(INFO.RELATIONSHIPS())

Measures Count = COUNTROWS(INFO.MEASURES())

Calculated Columns Count = COUNTROWS(INFO.COLUMNS("CALCULATED_COLUMN"))

Unused Columns = 
VAR AllColumns = INFO.COLUMNS()
VAR UsedInVisuals = INFO.COLUMNS("USED_IN_VISUALS")
RETURN COUNTROWS(EXCEPT(AllColumns, UsedInVisuals))

Model Size MB = 
SUMX(
```text
INFO.STORAGE_TABLE_COLUMNS(),
[DICTIONARY_SIZE] + [DATA_SIZE]```
) / 1024 / 1024

Refresh Duration = 
"Last refresh: " & FORMAT([Last Refresh Time], "YYYY-MM-DD HH:mm")

// Alert: Tables without relationships
Orphaned Tables = 
VAR AllTables = VALUES(INFO.TABLES[TABLE_NAME])
VAR TablesInRelationships = 
```text
UNION(
    VALUES(INFO.RELATIONSHIPS[FROM_TABLE]),
    VALUES(INFO.RELATIONSHIPS[TO_TABLE])
)```
RETURN COUNTROWS(EXCEPT(AllTables, TablesInRelationships))

Performance Optimization

Figure: Power Apps form control – edit form with validation rules and error handling.

-- Bad: Multiple CALCULATE calls
Slow Measure = 
CALCULATE([Total Sales], Product[Category] = "A") +
CALCULATE([Total Sales], Product[Category] = "B") +
CALCULATE([Total Sales], Product[Category] = "C")





-- Good: Single CALCULATE with filter
Fast Measure = 
CALCULATE(
```text
[Total Sales],
Product[Category] IN {"A", "B", "C"}```
)

-- Bad: Nested CALCULATE
Slow = 
CALCULATE(
```text
CALCULATE(
    [Total Sales],
    Product[Category] = "A"
),
'Date'[Year] = 2025```
)

-- Good: Combined filters
Fast = 
CALCULATE(
```text
[Total Sales],
Product[Category] = "A",
'Date'[Year] = 2025```
)

-- Use variables for repeated calculations
Optimized Measure = 
VAR TotalSales = [Total Sales]
VAR TotalCost = [Total Cost]
VAR Profit = TotalSales - TotalCost
VAR Margin = DIVIDE(Profit, TotalSales, 0)
RETURN
```text
IF(Margin > 0.2, "High", "Low")


## Data Model Design Patterns and Anti-Patterns

### ✅ Design Patterns (Best Practices)





**1. Conformed Dimensions**

```text
Pattern: Share dimension tables across multiple fact tables

Example:
- Sales fact → Date dimension
- Budget fact → Date dimension  
- Inventory fact → Date dimension
(All use same Date dimension)

Benefits:
✓ Consistent filtering across facts
✓ Simplified model maintenance
✓ Reduced model size
✓ Easier user experience

Implementation:
- Create single Date dimension
- Establish relationships from all facts
- Use in slicers for cross-fact filtering

2. Junk Dimensions

Pattern: Combine low-cardinality flags into single dimension

Instead of:
- Sales[IsOnline] (Boolean)
- Sales[IsPriority] (Boolean)  
- Sales[IsInternational] (Boolean)
- Sales[PaymentMethod] (3 values)

Create:
OrderAttributes dimension:
- AttributeID (1-24, all combinations)
- IsOnline (Y/N)
- IsPriority (Y/N)
- IsInternational (Y/N)
- PaymentMethod (Cash/Card/Online)

Benefits:
✓ Reduces fact table width
✓ Improves compression
✓ Easier to add new flags
✓ Better performance

3. Surrogate Keys

Pattern: Integer keys instead of natural keys

❌ Natural Key: Product[SKU] = "PROD-ELECT-12345-XYZ"
✅ Surrogate Key: Product[ProductID] = 12345

Benefits:
✓ Faster joins (integer vs text)
✓ Smaller storage
✓ Handles source key changes
✓ Better compression

Implementation:
-- Power Query: Add Index Column
= Table.AddIndexColumn(Source, "ProductID", 1, 1)

4. Snowflake to Star Conversion

Snowflake (normalized):
Sales → Product → Category → SubCategory (3 hops)

Star (denormalized):
Sales → Product (with Category, SubCategory columns)

Conversion in Power Query:
Product_Flattened = 
```text
Table.ExpandTableColumn(
    Table.NestedJoin(Product, "CategoryID", Category, "CategoryID", "Category"),
    "Category", {"CategoryName", "SubCategoryName"}
)

Benefits: ✓ Faster queries (fewer joins) ✓ Simplified relationships ✓ Better for business users


**5. Fact Table Consolidation**

```text
Multiple fact tables with same grain → Single fact table with type dimension

Instead of:
- Sales_Online (OrderID, Amount, Quantity)
- Sales_Store (OrderID, Amount, Quantity)
- Sales_Phone (OrderID, Amount, Quantity)

Create:
- Sales (OrderID, Amount, Quantity, ChannelID)
- Channel (ChannelID, ChannelName: Online/Store/Phone)

Benefits:
✓ Simpler model
✓ Unified analysis
✓ Easier maintenance
✓ Single source of truth

Implementation:
Combined_Sales = Table.Combine({
```text
Table.AddColumn(Sales_Online, "Channel", each "Online"),
Table.AddColumn(Sales_Store, "Channel", each "Store"),
Table.AddColumn(Sales_Phone, "Channel", each "Phone")```
})

❌ Anti-Patterns (Common Mistakes)

1. Multiple Fact Tables at Different Grains

Architecture Overview: ❌ WRONG:

2. Bidirectional Relationships Everywhere

Architecture Overview: ❌ WRONG:

3. Many Calculated Columns

Architecture Overview: ❌ WRONG:

4. Import Mode for Real-Time Data

Architecture Overview: ❌ WRONG:

5. No Date Table

Architecture Overview: ❌ WRONG:

6. Circular Relationships

Architecture Overview: ❌ WRONG:

Decision Framework

When to Import vs DirectQuery:

Factor	Import	DirectQuery
Data size	< 10GB compressed	> 10GB or very large
Refresh frequency	Daily or less frequent	Real-time required
Query performance need	Critical (sub-second)	Acceptable (1-5 seconds)
Data sovereignty	Flexible	Must stay in source
DAX complexity	Full DAX support	Limited DAX support
Network reliability	Not critical	Must be reliable

When to Use Calculated Column vs Measure:

Use Calculated Column When:	Use Measure When:
Need to filter/slice by result	Aggregating data (SUM, AVG, COUNT)
Value is static (e.g., Full Name)	Result changes with filter context
Required in relationships	Performance is critical
Low cardinality result	Dynamic business logic

When to Use Bidirectional Relationships:

✅ Use bidirectional when:

Many-to-many with bridge table
Specific measure requires it
Model is small and performance acceptable

❌ Avoid bidirectional when:

Standard star schema (use single direction)
Performance is critical
Model has many tables (complexity)
Can achieve same result with DAX

Architecture Decision and Tradeoffs

When designing business intelligence solutions with Power BI, consider these key architectural trade-offs:

Approach	Best For	Tradeoff
Managed / platform service	Rapid delivery, reduced ops burden	Less customisation, potential vendor lock-in
Custom / self-hosted	Full control, advanced tuning	Higher operational overhead and cost

Recommendation: Start with the managed approach for most workloads and move to custom only when specific requirements demand it.

Validation and Versioning

Last validated: April 2026
Validate examples against your tenant, region, and SKU constraints before production rollout.
Keep module, CLI, and SDK versions pinned in automation pipelines and review quarterly.

Security and Governance Considerations

Apply least-privilege access using RBAC roles and just-in-time elevation for admin tasks.
Store secrets in managed secret stores and avoid embedding credentials in scripts or source files.
Enable audit logging, data protection policies, and periodic access reviews for regulated workloads.

Cost and Performance Notes

Define budgets and alerts, then monitor usage and cost trends continuously after go-live.
Baseline performance with synthetic and real-user checks before and after major changes.
Scale resources with measured thresholds and revisit sizing after usage pattern changes.

Official Microsoft References

https://learn.microsoft.com/power-bi/
https://learn.microsoft.com/power-bi/guidance/
https://learn.microsoft.com/fabric/

Public Examples from Official Sources

These examples are sourced from official public Microsoft documentation and sample repositories.
Documentation examples: https://learn.microsoft.com/power-bi/
Sample repositories: https://github.com/microsoft/PowerBI-Developer-Samples
Prefer adapting these examples to your tenant, subscriptions, and governance requirements before production use.

Key Takeaways

Star schema is the foundation: Fact tables (measures) + dimension tables (attributes) = optimal analytics model
Relationships drive filtering: One-to-many from dimensions to facts; understand cardinality and filter direction
Storage mode matters: Import for performance, DirectQuery for real-time, Composite for best of both worlds
Calculated columns vs measures: Columns computed at refresh (static), measures at query time (dynamic) - prefer measures
Date dimension is mandatory: Contiguous date table marked as date table enables time intelligence functions
Aggregations accelerate DirectQuery: Pre-calculate summaries for 100-1000× performance gains on large datasets
Multiple relationships via USERELATIONSHIP: Role-playing dimensions (OrderDate, ShipDate) with inactive relationships
Model optimization reduces size: Remove unused columns, optimize data types, use integer keys, reduce cardinality
Incremental refresh for large tables: Partition by date, refresh recent data only, archive historical data
Documentation prevents confusion: Comment tables/measures, track versions, create model health dashboards
Design patterns prevent rework: Conformed dimensions, junk dimensions, surrogate keys, fact consolidation
Anti-patterns cause problems: Avoid many calculated columns, bidirectional relationships everywhere, circular dependencies
Composite models enable hybrid scenarios: Real-time facts (DirectQuery) + performant dimensions (Import/Dual)
Performance profiling is essential: Use Performance Analyzer and DAX Studio to identify and fix bottlenecks
Filter context understanding is critical: Visuals create filter context; relationships propagate it; CALCULATE modifies it

Next Steps

Design star schema for your data
Create date dimension table
Establish table relationships
Build core measures library
Implement time intelligence
Optimize model size
Test query performance
Document data model

Additional Resources

Model. Relate. Calculate. Visualize.

Power BI Data Modeling: Building Effective Data Models

Power BI Data Modeling: Building Effective Data Models

Table Relationships

Creating Relationships

Relationship Properties

Role-Playing Dimensions

Measures

DAX Fundamentals

Filter Context

Row Context

Time Intelligence

Data Model Optimization

Reducing Model Size

Column Statistics

Relationships Best Practices

Advanced Patterns

Slowly Changing Dimensions (SCD)

Parent-Child Hierarchies

Many-to-Many with Bridge Table

Composite Models and Dual Storage Mode

Understanding Storage Modes

Creating Composite Models

Aggregation Tables

DirectQuery Optimization

Query Reduction Settings

Native SQL Query Monitoring

DirectQuery Best Practices

Hybrid Table Patterns

Advanced Relationship Patterns

Bidirectional Filtering

Inactive Relationships with USERELATIONSHIP

Data Model Optimization Techniques

Column Optimization

Model Compression Statistics

Incremental Refresh Configuration

Data Model Versioning and Governance

Model Documentation

Change Tracking

Model Health Checks

Performance Optimization

❌ Anti-Patterns (Common Mistakes)

Decision Framework

Architecture Decision and Tradeoffs

Validation and Versioning

Security and Governance Considerations

Cost and Performance Notes

Official Microsoft References

Public Examples from Official Sources

Key Takeaways

Next Steps

Additional Resources

Discussion