From Flat Table to Fact & Dimension Views in SQL for Power BI (Tamil)
SQL for Power BI (Tamil)
Power BI project-ல performance, accuracy, scalability எல்லாமே depend ஆகுறது
data model எப்படி prepare பண்ணிருக்கோம் என்பதுல தான்.
Real-time projects-ல நிறைய பேருக்கு கிடைக்குற data
👉 single flat table format-ல தான் இருக்கும்.
ஆனா அந்த flat table-ஐ direct-ஆ Power BI-க்கு import பண்ணினா
report slow ஆகும், numbers wrong ஆகும், business decisions fail ஆகும்.
இந்த blog-ல,
ஒரு messy flat SQL table-ஐ
👉 clean பண்ணி
👉 business requirement-க்கு match ஆக
👉 Fact table + Dimension tables views
எப்படி design பண்ணலாம் என்பதைக் practical-ஆ explain பண்ணலாம்.
Step 1: Flat Table Review – Problems Identify பண்ணுவது
First step எப்போதுமே data review தான்.
Table-ஐ top rows select பண்ணி பாத்தா உடனே சில issues தெரிய ஆரம்பிக்கும்.
😕 Same invoice number multiple rows-ல repeat ஆகுது
😕 Quantity & Price இருக்கு, but Sales Amount column இல்லை
😕 Cancelled orders data mixed-ஆ இருக்கு
😕 Description column-ல null values & invalid characters இருக்கு
😕 Price column-ல zero values இருக்கு
இது எல்லாமே flat table-ல normal-ஆ இருக்கக்கூடிய common issues.
Why Flat Table Is a Problem for Power BI
Flat table Power BI-க்கு worst case scenario.
🚫 Duplicate invoice numbers → wrong order count
🚫 No sales amount → calculations Power BI-ல force பண்ணணும்
🚫 Cancelled orders → total sales wrong ஆகும்
🚫 Dirty product names → dimension table messy ஆகும்
இதையெல்லாம் direct Power BI-க்கு load பண்ணினா
👉 wrong sales numbers
👉 wrong trends
👉 wrong business decisions
So, data clean பண்ணாம Power BI-க்கு போகக்கூடாது.
Column-by-Column Data Quality Check
Data clean பண்ணறதுக்கு முன்னாடி
ஒவ்வொரு column-யும் understand பண்ணணும்.
📄 Invoice column
Cancelled orders identify ஆக C-ன்னு start ஆகுற invoices
👉 business requirement-க்கு தேவையில்லாததால் ignore பண்ணலாம்
📝 Description column
Product name இல்லாத rows
Invalid text / garbage values
👉 dimension product table-க்கு big problem
💰 Price & Quantity
Zero values exist
👉 sales calculation-க்கு wrong impact
👤 Customer ID
Mostly valid
👉 but still validation important (especially raw data case-ல)
Why Cleaning in SQL Is Better Than Power BI
Power BI-ல cleaning பண்ணலாம், but best practice அது இல்லை.
💡 SQL Server already data source
💡 Heavy transformations SQL-ல faster
💡 Power BI model lightweight ஆகும்
💡 Same logic multiple reports-க்கு reuse பண்ணலாம்
So approach is simple:
👉 Clean & transform data in SQL
👉 Expose final views to Power BI
Business Requirement – What Exactly Do We Need?
Dashboard design start பண்ண முன்னாடி
business team என்ன expect பண்ணுறாங்கன்னு clear-ஆ define பண்ணணும்.
📊 Total Sales
📦 Total Orders
📈 Monthly Trend
🔥 Top Products
🌍 Country-wise Sales
இந்த requirements base பண்ணிதான்
SQL views design பண்ண போறோம்.
Creating Order Items View (Granular Level)
First view we create is Order Items level data.
What we do here:
🔄 Invoice number → Order ID
🔄 Stock code → Product ID
🔢 Quantity & Unit Price select
➗ Quantity × Unit Price → Sales Amount calculate
Filters applied:
❌ Cancelled orders remove
❌ Quantity <= 0 remove
❌ Price <= 0 remove
👉 Business team-க்கு “actual sales” data மட்டும் தான் தேவ.
Result:
Order-wise, product-wise granular data ready.
Why We Don’t Aggregate Too Early
Order ID repeat ஆகுது – இது expected.
❗ We don’t aggregate here intentionally.
Reason:
📌 Future visuals-ல order-item level analysis தேவைப்படலாம்
📌 If we aggregate now, flexibility lost
So first view = detailed level.
Creating Orders View (Order Level)
Next step: Orders table.
Here we need:
🆔 One row per order
📅 Order date (date-time → date cast)
👤 Customer ID
🌍 Country
💰 Total order sales
Here aggregation is mandatory because:
👉 One order = one row
👉 This table acts as order-level dimension/fact hybrid
Sales Fact Table – Core of Star Schema
Power BI-க்கு one fact table தான் best practice.
So we merge:
🔗 Orders Items view
🔗 Orders view
Final Sales Fact table contains:
📅 Date
🆔 Order ID
👤 Customer ID
🌍 Country
📦 Product ID
🔢 Quantity
💵 Unit Price
💰 Sales Amount
👉 This is the FactSales table
👉 This will sit at center of Star Schema
Creating Customer Dimension Table
Customer dimension should contain only required attributes.
From Orders data:
👤 Customer ID
🌍 Country
Only customers who placed orders are included.
📌 No address / city available in raw data
📌 If available in future → can be added easily
Result: clean DimCustomer table.
Creating Date Dimension Table
Date dimension is created based on order dates.
📅 All dates derived from orders
🔄 Auto-updates when new orders arrive
Optional enhancements:
➕ Fiscal year
➕ Week number
➕ Day name
All can be added later in SQL view itself.
Creating Product Dimension Table (Most Important Cleanup)
Product dimension required most cleaning.
Issues identified:
❌ Invalid product descriptions
❌ Garbage text values
❌ Manually updated wrong values
Correct approach:
🧠 Identify patterns in invalid descriptions
🧹 Clean based on pattern (not random deletes)
🚫 Ignore cancelled orders already removed
🚫 Ignore rows without valid product name
Result:
✅ Unique products only
✅ Clean product names
✅ Total ~3892 valid products
This becomes DimProduct.
Final Outcome – What We Achieved
By the end of this process:
🎯 Messy raw data identified
🎯 Business requirements clarified
🎯 SQL views created
🎯 Fact & Dimension tables separated
🎯 Data Power BI–ready
All views are now optimized for:
⭐ Star schema
⭐ High performance
⭐ Accurate DAX calculations
What You Should Do Next
🧪 Pause the video
🖥️ Open SSMS
📊 Try creating similar views with sample data
🧠 Understand the SQL logic clearly

கருத்துரையிடுக
0 கருத்துகள்