Author Archives: Hariharan Rajendran

ByHariharan Rajendran

Prepare data for Power BI  – Local File

As we discussed the levels in Power BI, the first level is creating dataset.

To start prepare the dataset, first sign in to the power BI here,

This article will talk about, how to create a dataset from local file.

Steps to import the local file,

  1. Sign in to the Power BIPowerBI
  2. Check for Get Data and click as like below.
  3. Click on “Get” under FilesPowerBI1
  4. It will display the options to import the files from different places.PowerBI2
  5. Click on “Local File”
  6. Select the file from your local folder and press open. Make sure that file should have both dimension and measure values. Power BI only supports Excel and CSV files and Power BI Desktop files.PowerBI3PowerBI4
  7. Check for uploaded files under Datasets.PowerBI5
  8. Dataset is ready to create reports and dashboards.

 

ByHariharan Rajendran

PolyBase Services in SQL Server 2016 CTP

Microsoft SQL Server 2016 has one of the exciting feature called PolyBase. It is acts as a bridge between relational database and Hadoop.

We can access all structured, unstructured and semi-structured data using known language of SQL.

Polybase allows user to use T-SQL Statements in Microsoft SQL Server Management studio to access data stored in Hadoop or Microsoft Azure blob storage.

Below services should up and running to use this feature.

  1. SQL Server PolyBase Data Movement
  2. SQL Server PolyBase Engine

To check the installed services in system,

  1. Press Ctrl + R (Run command)
  2. Type services.msc and enter

polybase

ByHariharan Rajendran

Secrets behind Query processing Speed in Columnstore Index

Go through my previous article here, about Data storage format in columnstore Index.

There are lots of improvements in Query processing speed when we use columnstore  Index.

  1. The data are organized in a column. As a result, a much higher level compression is possible, compare than row store.
  2. It uses Vertipaq technology for data compression. The Compression rate can be achieved up to 15 to 1. Click here, to know about Vertipaq technology.
  3. Compressed data require only less IO because the data transferred between disk to memory are reduced.
  4. Reducing IO leads to improvement in query processing and response times.
  5. Amount of data transfer to memory is going to be less which leads less space is required in memory to hold the data which processed by the query.
  6. SQL Server fetches only the data which are required for the query. For example, data is maintained in a column from C1 to C10, if the required data residing only on C6 & C7 then it will fetch only those columns.
  7. It uses the advanced technology for processing the queries. The data in the columns are processed in batches using high-efficient vector technology. This processing mode is called “Batch-Mode Processing”.
ByHariharan Rajendran

Temporal Table in SQL Server 2016

Temporal is a database feature that was introduced in ANSI SQL 2011 and this is now supported in SQL Server 2016.

It is a new type of user defined table and designed to keep track of all historical changes on the table and make easy to do point in time analysis.

Main reasons to use temporal table,

  1. In real time, data is not going to be static. To track the changes (Auditing) of the data over a point of time.
  2. Recovering from accidental data changes.

Prerequisites to create temporal table,

  1. The table should have a primary key column
  2. Two columns should have datatime2 data type to track the data validity (from & to).

Syntax to create Temporal Table

CREATE TABLE <Table Name>

(

<column 1> int NOT NULL PRIMARY KEY CLUSTERED

,<column 2> <datatype>

,.

,.

, <column n>

, [ValidFrom] datetime2 (2) GENERATED ALWAYS AS ROW START

, [ValidTo] datetime2 (2) GENERATED ALWAYS AS ROW END

, PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)

)

WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = <schema>.<tablename> –Optional));

We can give any name instead of ValidFrom and ValidTo, both should be bind in PERIOD.

ByHariharan Rajendran

Vertipaq

Vertipaq is an engine which was found in Power Pivot for Analysis server in SQL Server 2008 R2. Basically, it is used for data compression. Data retrieval and calculations are happening at much faster rate as it holds the compressed data in memory.  It is also called xVelocity and it has been added to SQL Server 2012. As a result, it is delivering very huge performance improvements for data warehouse and business intelligence solutions.

Columnstore index using this compression algorithm technology.

This algorithm using following compression techniques.

  1. Value Encoding
  2. Dictionary Encoding
  3. Run Length Encoding (RLE)

Check this link, for detailed information about VertiPaq Compression techniques.

ByHariharan Rajendran

Data Storage Format in Columnstore Index

Go through my previous article purpose of columnstore index.

Normally SQL Server stores the data in pages in a row based manner for tables (heaps) and B-tree indexes. This is the traditional approach and technically this is called “Row store”.

Column store is something like turning traditional storage model into 90 degrees. In this approach, all the values in single column will be in compressed form. Column store index stores each column in a separate set of disk pages which is different from the traditional storage format.

Let’s see an example,

ID Name Department
1 xyz Production
2 Robert Information Technology
3 Mary Sales
4 John Account

 

Row store
1 xyz Production
2 Robert Information Technology
3 Mary Sales
4 John Account

 

Column Store
1 2 3 4
Xyz Robert Mary John
Production Information Technology Sales Account

Column store index is more compressed than the row store. This can understand by seeing above examples. In row store format we have different data types involved in a single row (ID is Integer, Name is string, Department is string). Row store method trying to compress the different data type fields, but in column store, all the values in a same data type so it is easy to compress the data.

Data Compression percentage of column store is greater than the row store.

 

ByHariharan Rajendran

Basic Concepts of Power BI

Check my previous post about introduction to Power BI before going through this article.

Creating a report and dashboards are very easy in Power BI by simply pointing to the correct data source.

Levels in Power BI,

1. Create a dataset
2. Create a report on top of above dataset
3. Create a dashboard – Just pin a charts to display in dashboard from report.

Power BI is an application which we can use in online as well as in desktop. The steps provided in my previous article (Link) is for online version of Power BI.

To download the desktop version, click here. It has more options compare than online version.

Power BI has gateways which helps to keep the data fresh by connecting the on-premises data sources. It provides the flexibility to individuals as well as organization.

ByHariharan Rajendran

Introduction to Power BI

Power BI is one of the best visualization tool and it can be used by anyone. Initially this tool was available for office 365 SharePoint site. It was something like an add-on but now they have upgraded and it is available externally.

Power BI team from Microsoft has put their ideas and innovations to make this tool available to everyone and it is not  required much technical knowledge to use.

There are lots of improvement has been taking place from day one of power BI by Microsoft power BI team.

Steps to Use Power BI

  1. Visit https://powerbi.microsoft.com/
  2. Scroll down and look for option “Use it free”
  3. Enter work email (xxxxx@xxxx.com), it won’t support personal email
  4. Click “Use it free” and email will be received from power BI and follow the instructions.