SQL Server Data Warehouse Concepts, Schema Design, and Im... - NTS News


Learn about the concepts of how a data warehouse centralizes data for better analysis and reporting across data sources. The post SQL Server Data Warehouse Concepts, Schema Design, and Implementation appeared first on MSSQLTips.com.

Companies generate a huge amount of data on a daily basis from activities such as sales transactions, inventory changes, and customer interactions. This data originates from different data sources and is stored in operational databases called OLTP (Online Transaction Processing) systems, which primarily focus on transactions such as data inserts, updates, and deletes. Maintaining the transactional data in an OLTP database is important.

However, if you need to analyze or report on the data, storing it in a centralized repository with a well-structured schema is necessary. A well-established solution for analyzing and reporting on vast amounts of data is a Data Warehouse. A Data Warehouse is a centralized repository that is subject-oriented, time-variant, and non-volatile. It is designed for data analytics and reporting. Data warehousing provides insights into business operations that can inform decisions and help the business grow.

A Data Warehouse is considered an OLAP (Online Analytical Processing) platform. It is optimized for data aggregation and read-heavy queries that run on large volumes of data. OLTP systems are transactional databases that focus on running inserts and updates as efficiently as possible. An example is a banking system that processes data in real time. OLTP platforms are designed for high-speed data access and processing.

OLTP systems are designed to handle a large number of concurrent users while ensuring data reliability and accuracy through their ACID properties. OLAP systems, such as data warehouses, are designed for running analytical queries and reporting. An OLAP platform acts as a centralized repository holding data from multiple OLTP systems in a well-structured manner. OLAP systems typically use a denormalized schema design to improve performance for analytical queries and historical data analysis.

There are two types of tables in a Data Warehouse: Facts and Dimensions. Let’s review each.

Fact tables are large and grow over time. Sample Fact tables are sales, orders, web site clicks, etc.

Additive Fact tables have measures that can be aggregated across all the dimensions, for example, sales amount, total quantity sold, and revenue. This is the most common type of fact used in a data warehouse, and we can aggregate the data as required by day, week, month, quarter, and/or year.

Data can be aggregated down to the minute or second level depending on your date and time dimension attributes.

Semi-Additive Fact tables have measures that can be aggregated across some dimensions, but not across all of them, because in some scenarios the aggregate makes no business sense. For example, product sales can be summed across month or date, but summing them across region may not make sense because of how the product is used. A more typical case is an inventory level or account balance, which can be summed across products or accounts but not across time.
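The difference between additive and semi-additive measures can be sketched with a small example. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the tables fact_sales and fact_inventory, their columns, and the sample rows are all hypothetical. The semi-additive measure shown is an inventory level, which is a commonly cited case rather than one taken from this article.

```python
import sqlite3


def build_demo():
    """Create an in-memory database with one additive and one semi-additive fact."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Additive measure: sales_amount can be summed across every dimension.
        CREATE TABLE fact_sales (sale_date TEXT, product TEXT, sales_amount REAL);
        INSERT INTO fact_sales VALUES
            ('2026-01-01', 'Widget', 100.0),
            ('2026-01-02', 'Widget', 150.0),
            ('2026-01-01', 'Gadget', 200.0);

        -- Semi-additive measure: units_on_hand is a daily snapshot (hypothetical).
        CREATE TABLE fact_inventory (snapshot_date TEXT, product TEXT, units_on_hand INTEGER);
        INSERT INTO fact_inventory VALUES
            ('2026-01-01', 'Widget', 40),
            ('2026-01-02', 'Widget', 35),
            ('2026-01-01', 'Gadget', 10),
            ('2026-01-02', 'Gadget', 12);
        """
    )
    return conn


def total_sales(conn):
    # Summing over all dates and products is meaningful for an additive measure.
    return conn.execute("SELECT SUM(sales_amount) FROM fact_sales").fetchone()[0]


def stock_by_date(conn):
    # Summing across products within one snapshot date is meaningful.
    return conn.execute(
        "SELECT snapshot_date, SUM(units_on_hand) FROM fact_inventory "
        "GROUP BY snapshot_date ORDER BY snapshot_date"
    ).fetchall()


def closing_stock(conn):
    # Across time, take the latest snapshot instead of summing every day.
    return conn.execute(
        "SELECT SUM(units_on_hand) FROM fact_inventory "
        "WHERE snapshot_date = (SELECT MAX(snapshot_date) FROM fact_inventory)"
    ).fetchone()[0]
```

Summing units_on_hand over both snapshot dates would count the same stock twice, which is why the time dimension is handled by picking the latest snapshot in this sketch.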

Non-Additive Fact tables have measures that cannot be summed across any of the dimensions. These facts include percentages, ratios, averages, standard deviations, etc., for which a sum has no business meaning.

Factless Fact tables do not include any measures. This type of fact table is used to track events, for example, recording the attendance of employees or students.
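As a rough illustration of both ideas, the sketch below stores a margin percentage (a non-additive measure) next to its additive components, and keeps an attendance table that has keys but no measures. It uses Python's sqlite3 module in place of SQL Server, and every table name, column name, and value is a hypothetical chosen for the example.

```python
import sqlite3


def build_demo():
    """Create hypothetical tables for a non-additive measure and a factless fact."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- margin_pct is a ratio, so it is non-additive.
        CREATE TABLE fact_sales (
            region TEXT, sales_amount REAL, cost_amount REAL, margin_pct REAL
        );
        INSERT INTO fact_sales VALUES
            ('North', 1000.0, 800.0, 20.0),
            ('South',  100.0,  50.0, 50.0);

        -- Factless fact table: only dimension keys, no measures.
        CREATE TABLE fact_attendance (date_key INTEGER, student_key INTEGER);
        INSERT INTO fact_attendance VALUES
            (20260105, 1), (20260105, 2), (20260106, 1);
        """
    )
    return conn


def naive_margin(conn):
    # Averaging the stored percentages ignores how large each region is.
    return conn.execute("SELECT AVG(margin_pct) FROM fact_sales").fetchone()[0]


def overall_margin(conn):
    # Recompute the ratio from its additive components instead.
    value = conn.execute(
        "SELECT (SUM(sales_amount) - SUM(cost_amount)) * 100.0 / SUM(sales_amount) "
        "FROM fact_sales"
    ).fetchone()[0]
    return round(value, 2)


def attendance_per_day(conn):
    # With no measure to sum, the analysis is a count of rows per key.
    return conn.execute(
        "SELECT date_key, COUNT(*) FROM fact_attendance "
        "GROUP BY date_key ORDER BY date_key"
    ).fetchall()
```

In this sample data the averaged percentage and the recomputed margin differ noticeably, which is the practical reason for storing the additive components alongside any ratio.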

A factless fact table helps you count occurrences.

Derived Facts are not stored directly in the Fact table, but can be calculated from other facts, for example, calculating profit by subtracting the production cost amount from the sales amount.

Dimensions contain the descriptive attributes that provide context to the fact table. These tables hold the details for the Fact table.

Examples include Date, Time, Region, Sales Person, Product Name, Category, etc. Dimensions are usually smaller than a Fact table, and the data in these tables does not change frequently.

Slowly changing dimensions (SCD) are a data warehousing technique for managing changes in dimension tables. Depending on the type used, an SCD can maintain history by preserving the historic values, which enables accurate time-based analysis.

For example, if you want to find the value of a field for the previous year or on a specific date, you can get this information by implementing SCDs.

SCD Type 1 is used when maintaining history is not important and you only keep the latest or current value. It overwrites the old value with the new value, so no history is maintained.
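A Type 1 change amounts to a plain overwrite. The following is a minimal sketch under stated assumptions: it uses Python's sqlite3 module rather than SQL Server, and the dim_customer table, its columns, and the helper apply_scd_type1 are hypothetical names introduced only for this example.

```python
import sqlite3


def create_dim(conn):
    """Create a hypothetical customer dimension with one row."""
    conn.execute(
        "CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
    )
    conn.execute("INSERT INTO dim_customer VALUES (1, 'Alice', 'Lahore')")
    conn.commit()


def apply_scd_type1(conn, customer_id, new_city):
    # Type 1: overwrite the attribute in place; the old value is lost.
    conn.execute(
        "UPDATE dim_customer SET city = ? WHERE customer_id = ?",
        (new_city, customer_id),
    )
    conn.commit()


def all_rows(conn):
    return conn.execute(
        "SELECT customer_id, name, city FROM dim_customer ORDER BY customer_id"
    ).fetchall()
```

After calling apply_scd_type1(conn, 1, "Karachi"), the table still contains a single row for the customer, and nothing records that the city was ever different.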

A Type 2 Slowly Changing Dimension (SCD) maintains history for the dimension table and retains the full historical data set with start and end dates. We use a start date column (Valid_From), an end date column (Valid_To), and an Is_Current column to implement Type 2 and track changes. The Valid_From and Valid_To columns represent the date and time range during which the value was active in the data warehouse.
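One way a Type 2 change could be applied is sketched below, using Python's sqlite3 module as a stand-in for SQL Server. The column names Valid_From, Valid_To, and Is_Current come from the text; everything else is an assumption made for the example, including the dim_customer table, the surrogate key customer_sk, the helper names, and the choice of '9999-12-31' as the open-ended Valid_To value (a NULL end date is another common convention).

```python
import sqlite3

OPEN_END = "9999-12-31"  # assumed marker for "still valid"


def create_dim(conn):
    conn.execute(
        """
        CREATE TABLE dim_customer (
            customer_sk INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
            customer_id INTEGER,                            -- business key
            city        TEXT,
            Valid_From  TEXT,
            Valid_To    TEXT,
            Is_Current  INTEGER
        )
        """
    )


def apply_scd_type2(conn, customer_id, new_city, change_date):
    """Close the current row (if any) and insert a new current row."""
    current = conn.execute(
        "SELECT customer_sk, city FROM dim_customer "
        "WHERE customer_id = ? AND Is_Current = 1",
        (customer_id,),
    ).fetchone()
    if current is not None and current[1] == new_city:
        return  # attribute unchanged, so no new version is needed
    if current is not None:
        # Expire the old version instead of overwriting it.
        conn.execute(
            "UPDATE dim_customer SET Valid_To = ?, Is_Current = 0 WHERE customer_sk = ?",
            (change_date, current[0]),
        )
    conn.execute(
        "INSERT INTO dim_customer (customer_id, city, Valid_From, Valid_To, Is_Current) "
        "VALUES (?, ?, ?, ?, 1)",
        (customer_id, new_city, change_date, OPEN_END),
    )
    conn.commit()


def history(conn, customer_id):
    return conn.execute(
        "SELECT city, Valid_From, Valid_To, Is_Current FROM dim_customer "
        "WHERE customer_id = ? ORDER BY Valid_From",
        (customer_id,),
    ).fetchall()


def city_as_of(conn, customer_id, as_of_date):
    # ISO-8601 date strings compare correctly as text.
    row = conn.execute(
        "SELECT city FROM dim_customer "
        "WHERE customer_id = ? AND Valid_From <= ? AND ? < Valid_To",
        (customer_id, as_of_date, as_of_date),
    ).fetchone()
    return row[0] if row else None
```

The sketch treats Valid_To as exclusive, so a row that ends on a given date and the row that starts on the same date never both match an as-of lookup.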

A row with Is_Current = 1 holds the latest value for that particular record. The Is_Current column can be removed, but it is recommended to keep it because it simplifies analytical queries. For example, to filter only the latest records, you just need to add Is_Current = 1 to the WHERE clause of your SELECT statement. This returns the latest records without having to compare date ranges.

Data Warehouses have a few different types of database schemas.

Each of these database designs has pros and cons. Let’s review each.

The Star Schema is the most commonly used design in data warehousing. With this design, the Fact table is connected directly to each dimension table. The dimension tables in the star schema are denormalized, as all the attributes for a dimension are in a single table. Because the structure is simple, fewer join operations are required to retrieve the data, which results in faster query execution.

The Snowflake Schema is similar to the star schema, but the dimension tables are further divided into sub-dimensions. The dimension tables are normalized, which reduces data redundancy. Data integrity is improved in this schema, but queries become more complex because more join operations are needed to retrieve data from dimensions and sub-dimensions. The Snowflake schema is primarily used in large data warehouses when storage optimization and data integrity are the focus.
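The extra join that a snowflake design introduces can be seen by answering the same question against both layouts. The sketch below is illustrative only: it uses Python's sqlite3 module instead of SQL Server, and all table names, column names, and rows are hypothetical.

```python
import sqlite3


def build_demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Star: the category name lives directly on the product dimension.
        CREATE TABLE star_dim_product (
            product_key INTEGER PRIMARY KEY, product_name TEXT, category_name TEXT
        );
        INSERT INTO star_dim_product VALUES
            (1, 'Road Bike', 'Bikes'),
            (2, 'Mountain Bike', 'Bikes'),
            (3, 'Sport Helmet', 'Helmets');

        -- Snowflake: the category is normalized into its own sub-dimension.
        CREATE TABLE snow_dim_category (
            category_key INTEGER PRIMARY KEY, category_name TEXT
        );
        CREATE TABLE snow_dim_product (
            product_key INTEGER PRIMARY KEY, product_name TEXT,
            category_key INTEGER REFERENCES snow_dim_category (category_key)
        );
        INSERT INTO snow_dim_category VALUES (1, 'Bikes'), (2, 'Helmets');
        INSERT INTO snow_dim_product VALUES
            (1, 'Road Bike', 1), (2, 'Mountain Bike', 1), (3, 'Sport Helmet', 2);

        -- The same fact table serves both layouts in this sketch.
        CREATE TABLE fact_sales (product_key INTEGER, sales_amount REAL);
        INSERT INTO fact_sales VALUES (1, 500.0), (2, 700.0), (3, 50.0), (3, 60.0);
        """
    )
    return conn


# Star schema: one join from the fact table to the dimension.
STAR_QUERY = """
    SELECT p.category_name, SUM(f.sales_amount)
    FROM fact_sales AS f
    JOIN star_dim_product AS p ON p.product_key = f.product_key
    GROUP BY p.category_name
    ORDER BY p.category_name
"""

# Snowflake schema: a second join is needed to reach the sub-dimension.
SNOWFLAKE_QUERY = """
    SELECT c.category_name, SUM(f.sales_amount)
    FROM fact_sales AS f
    JOIN snow_dim_product AS p ON p.product_key = f.product_key
    JOIN snow_dim_category AS c ON c.category_key = p.category_key
    GROUP BY c.category_name
    ORDER BY c.category_name
"""


def sales_by_category(conn, query):
    return conn.execute(query).fetchall()
```

Both queries return the same totals. The trade-off is that the star layout repeats the category name on every product row, while the snowflake layout stores it once at the cost of the additional join.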

The Galaxy Schema contains multiple fact tables that share Dimensions. This is also known as the constellation schema. The Galaxy Schema is a complex design, but it is well suited to an enterprise data warehouse covering numerous subject areas and domains.

We are going to implement a data warehouse for retail sales. With the sample Data Warehouse schema in place, we are going to run some queries that help answer common business questions.
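The article's own schema and queries are not reproduced in this copy. Purely as an illustration of what a small retail sales star schema and two typical business questions might look like, here is a sketch that uses Python's sqlite3 module in place of SQL Server. The tables dim_date, dim_product, and fact_sales, their columns, the sample rows, and both queries are assumptions made for this example and should not be read as the author's design.

```python
import sqlite3


def build_retail_dw():
    """Create a hypothetical retail sales star schema with a few sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dim_date (
            date_key INTEGER PRIMARY KEY, full_date TEXT, year INTEGER, month INTEGER
        );
        CREATE TABLE dim_product (
            product_key INTEGER PRIMARY KEY, product_name TEXT, category TEXT
        );
        CREATE TABLE fact_sales (
            date_key INTEGER REFERENCES dim_date (date_key),
            product_key INTEGER REFERENCES dim_product (product_key),
            quantity INTEGER,
            sales_amount REAL,
            production_cost REAL
        );

        INSERT INTO dim_date VALUES
            (20260105, '2026-01-05', 2026, 1),
            (20260120, '2026-01-20', 2026, 1),
            (20260210, '2026-02-10', 2026, 2);
        INSERT INTO dim_product VALUES
            (1, 'Laptop', 'Electronics'),
            (2, 'Desk', 'Furniture');
        INSERT INTO fact_sales VALUES
            (20260105, 1, 2, 2000.0, 1500.0),
            (20260120, 2, 1,  300.0,  180.0),
            (20260210, 1, 1, 1000.0,  750.0),
            (20260210, 2, 3,  900.0,  540.0);
        """
    )
    return conn


def monthly_revenue(conn):
    # Business question: how much revenue did each month bring in?
    return conn.execute(
        """
        SELECT d.year, d.month, SUM(f.sales_amount)
        FROM fact_sales AS f
        JOIN dim_date AS d ON d.date_key = f.date_key
        GROUP BY d.year, d.month
        ORDER BY d.year, d.month
        """
    ).fetchall()


def profit_by_product(conn):
    # Business question: which products are most profitable?
    # Profit is a derived fact: sales amount minus production cost.
    return conn.execute(
        """
        SELECT p.product_name, SUM(f.sales_amount - f.production_cost) AS profit
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_key = f.product_key
        GROUP BY p.product_name
        ORDER BY profit DESC
        """
    ).fetchall()
```

Each query joins the fact table to a single dimension and aggregates an additive measure, which is the access pattern a star schema is meant to keep simple.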

Now we will connect Power BI to our Data Warehouse. In our Power BI report, let’s plot the results of the two analytical queries as charts that help us visualize the data. To learn how to build these reports, review the related tips on MSSQLTips.com.

Muhammad Hassan Arshad currently works as a Principal Data Engineer at Strategic Systems International. He is a data engineering professional with over 7 years of experience in data engineering, data warehousing, and database development. He has a strong track record of building scalable data pipelines, optimizing data workflows, and developing robust database solutions to support analytics and business intelligence. Hassan holds Microsoft certifications and has worked extensively with SQL Server and modern cloud data platforms, bringing deep technical expertise and a results-driven approach to every project.



Original Source: Mssqltips.com | Author: Muhammad Hassan Arshad | Published: March 6, 2026, 4:00 am
