From Flat Table to Fact & Dimension Views in SQL for Power BI (Tamil)

 

SQL for Power BI (Tamil)

Power BI project-ல performance, accuracy, scalability எல்லாமே depend ஆகுறது
data model எப்படி prepare பண்ணிருக்கோம் என்பதுல தான்.

Real-time projects-ல நிறைய பேருக்கு கிடைக்குற data
👉 single flat table format-ல தான் இருக்கும்.
ஆனா அந்த flat table-ஐ direct-ஆ Power BI-க்கு import பண்ணினா
report slow ஆகும், numbers wrong ஆகும், business decisions fail ஆகும்.

இந்த blog-ல,
ஒரு messy flat SQL table-ஐ
👉 clean பண்ணி
👉 business requirement-க்கு match ஆக
👉 Fact table + Dimension tables views
எப்படி design பண்ணலாம் என்பதைக் practical-ஆ explain பண்ணலாம்.


Step 1: Flat Table Review – Problems Identify பண்ணுவது

First step எப்போதுமே data review தான்.

Table-ஐ top rows select பண்ணி பாத்தா உடனே சில issues தெரிய ஆரம்பிக்கும்.

😕 Same invoice number multiple rows-ல repeat ஆகுது
😕 Quantity & Price இருக்கு, but Sales Amount column இல்லை
😕 Cancelled orders data mixed-ஆ இருக்கு
😕 Description column-ல null values & invalid characters இருக்கு
😕 Price column-ல zero values இருக்கு

இது எல்லாமே flat table-ல normal-ஆ இருக்கக்கூடிய common issues.


Why Flat Table Is a Problem for Power BI

Flat table Power BI-க்கு worst case scenario.

🚫 Duplicate invoice numbers → wrong order count
🚫 No sales amount → calculations Power BI-ல force பண்ணணும்
🚫 Cancelled orders → total sales wrong ஆகும்
🚫 Dirty product names → dimension table messy ஆகும்

இதையெல்லாம் direct Power BI-க்கு load பண்ணினா
👉 wrong sales numbers
👉 wrong trends
👉 wrong business decisions

So, data clean பண்ணாம Power BI-க்கு போகக்கூடாது.


Column-by-Column Data Quality Check

Data clean பண்ணறதுக்கு முன்னாடி
ஒவ்வொரு column-யும் understand பண்ணணும்.

📄 Invoice column
Cancelled orders identify ஆக C-ன்னு start ஆகுற invoices
👉 business requirement-க்கு தேவையில்லாததால் ignore பண்ணலாம்

📝 Description column
Product name இல்லாத rows
Invalid text / garbage values
👉 dimension product table-க்கு big problem

💰 Price & Quantity
Zero values exist
👉 sales calculation-க்கு wrong impact

👤 Customer ID
Mostly valid
👉 but still validation important (especially raw data case-ல)


Why Cleaning in SQL Is Better Than Power BI

Power BI-ல cleaning பண்ணலாம், but best practice அது இல்லை.

💡 SQL Server already data source
💡 Heavy transformations SQL-ல faster
💡 Power BI model lightweight ஆகும்
💡 Same logic multiple reports-க்கு reuse பண்ணலாம்

So approach is simple:

👉 Clean & transform data in SQL
👉 Expose final views to Power BI


Business Requirement – What Exactly Do We Need?

Dashboard design start பண்ண முன்னாடி
business team என்ன expect பண்ணுறாங்கன்னு clear-ஆ define பண்ணணும்.

📊 Total Sales
📦 Total Orders
📈 Monthly Trend
🔥 Top Products
🌍 Country-wise Sales

இந்த requirements base பண்ணிதான்
SQL views design பண்ண போறோம்.


Creating Order Items View (Granular Level)

First view we create is Order Items level data.

What we do here:

🔄 Invoice number → Order ID
🔄 Stock code → Product ID
🔢 Quantity & Unit Price select
➗ Quantity × Unit Price → Sales Amount calculate

Filters applied:

❌ Cancelled orders remove
❌ Quantity <= 0 remove
❌ Price <= 0 remove

👉 Business team-க்கு “actual sales” data மட்டும் தான் தேவ.

Result:
Order-wise, product-wise granular data ready.


Why We Don’t Aggregate Too Early

Order ID repeat ஆகுது – இது expected.

❗ We don’t aggregate here intentionally.

Reason:

📌 Future visuals-ல order-item level analysis தேவைப்படலாம்
📌 If we aggregate now, flexibility lost

So first view = detailed level.


Creating Orders View (Order Level)

Next step: Orders table.

Here we need:

🆔 One row per order
📅 Order date (date-time → date cast)
👤 Customer ID
🌍 Country
💰 Total order sales

Here aggregation is mandatory because:

👉 One order = one row
👉 This table acts as order-level dimension/fact hybrid


Sales Fact Table – Core of Star Schema

Power BI-க்கு one fact table தான் best practice.

So we merge:

🔗 Orders Items view
🔗 Orders view

Final Sales Fact table contains:

📅 Date
🆔 Order ID
👤 Customer ID
🌍 Country
📦 Product ID
🔢 Quantity
💵 Unit Price
💰 Sales Amount

👉 This is the FactSales table
👉 This will sit at center of Star Schema


Creating Customer Dimension Table

Customer dimension should contain only required attributes.

From Orders data:

👤 Customer ID
🌍 Country

Only customers who placed orders are included.

📌 No address / city available in raw data
📌 If available in future → can be added easily

Result: clean DimCustomer table.


Creating Date Dimension Table

Date dimension is created based on order dates.

📅 All dates derived from orders
🔄 Auto-updates when new orders arrive

Optional enhancements:

➕ Fiscal year
➕ Week number
➕ Day name

All can be added later in SQL view itself.


Creating Product Dimension Table (Most Important Cleanup)

Product dimension required most cleaning.

Issues identified:

❌ Invalid product descriptions
❌ Garbage text values
❌ Manually updated wrong values

Correct approach:

🧠 Identify patterns in invalid descriptions
🧹 Clean based on pattern (not random deletes)
🚫 Ignore cancelled orders already removed
🚫 Ignore rows without valid product name

Result:

✅ Unique products only
✅ Clean product names
✅ Total ~3892 valid products

This becomes DimProduct.


Final Outcome – What We Achieved

By the end of this process:

🎯 Messy raw data identified
🎯 Business requirements clarified
🎯 SQL views created
🎯 Fact & Dimension tables separated
🎯 Data Power BI–ready

All views are now optimized for:

⭐ Star schema
⭐ High performance
⭐ Accurate DAX calculations


What You Should Do Next

🧪 Pause the video
🖥️ Open SSMS
📊 Try creating similar views with sample data
🧠 Understand the SQL logic clearly



கருத்துரையிடுக

0 கருத்துகள்