Author Archives: Hariharan Rajendran

ByHariharan Rajendran

Power BI Report using Shared Dataset [Detailed Step by Step]

In my previous post, I have explained the possibility of using SSRS shared dataset as a source for Power BI report. Refer my previous post here.

In this post, I am giving you the detailed step by step procedure to work with the shared dataset in Power BI.

Follow the below steps,

Step 1: Open Power BI Report Server portal URL. For example, the URL is like below.

http://localhost/Reports

Step 2: Add the following extension with the above URL. /api/v2.0/datasets. It will list out all the datasets with ID as like below.

http://localhost/Reports/api/v2.0/datasets

Step 3: Filter the dataset with ID. Here I used the second dataset id.

http://localhost/Reports/api/v2.0/datasets(datasetid)/data

My URL, http://lab/ReportsPBI/api/v2.0/datasets(ca502b81-c88d-4646-af32-ca08d9b4a1f4)/data

Step 4: Open Power BI Desktop and choose OData feed data connector

Step 5: Pass the above URL as like below.

Step 6: Choose advance and select “Include open type columns” checkbox to get all the columns of the dataset.

Step 7: Click Ok and select Edit Option on next screen.

Step 8: Select the icon near the “More Columns” as like below. It will list out all the columns.

Step 9: Uncheck the “Use original column name as a prefix” and click ok. Finally click “Close & Apply”.

Step 10: Choose the fields and create a report.

Thanks. Let me know if you face any issues.

ByHariharan Rajendran

How to handle date in R

Date data type is very important for all programming languages. To handle the date in a proper way, we need to apply some formatting logic.

This is the case for R programming language as well. This post explains you, how we can handle the Date data in R. There are different functions available in R to handle date.

To get the today’s date, use Sys.Date()

Sys.Date()

> Sys.Date()

[1] “2018-05-22”

 

To get date and time, use date()

> date()[1] “Tue May 22 15:29:18 2018”

 

Check the below list of symbols to play with date and time format.

Symbol Meaning Example
%d day as a number (0-31) 01-31
%a
%A
abbreviated weekday
unabbreviated weekday
Mon
Monday
%m month (00-12) 00-12
%b
%B
abbreviated month
unabbreviated month
Jan
January
%y
%Y
2-digit year
4-digit year
07
2007

 

> today <- Sys.Date()> format(today, format=”%B %d %Y”)[1] “May 22 2018”

 

Date Conversion

In a real-time scenario, we will not get date as date data type. Always, we need to convert to proper date data type. To convert to date data type, we need to use as.Date() function.

Syntax

as.Date(x, “format”)

x – Field or column

format – use the above symbols to frame the proper format.

 

DateVec<- c(“01/05/2018”, “08/16/2018”)

DateVec

[1] “01/05/2018” “08/16/2018”

summary(DateVec)

Length     Class        Mode

2            character   character

dates <- as.Date(DateVec, “%m/%d/%Y”)

dates

[1] “2018-01-05” “2018-08-16”

summary(dates)

Min.      1st Qu.       Median         Mean      3rd Qu.         Max.

“2018-01-05” “2018-03-01” “2018-04-26” “2018-04-26” “2018-06-21” “2018-08-16”

ByHariharan Rajendran

R Package dplyr Function vs T-SQL – Part 2

Following my previous post, we have another few functions on dplyr package which will be covered in this post.

Arrange Function

As part of data modelling, we need to sort the data to analyze further.

In T-SQL we can easily perform the sort on the data as like below. It uses “Order by” clause to sort the data.

In R, we need to use arrange function.

#Arrange

DF_Arrange<- DF_select %>%

group_by(State) %>%

summarise(Total = sum(Price)) %>%

arrange(desc(Total))

DF_Arrange

Mutate Function

The next, very common task is to build the calculated column to satisfy the business logic. This can be easily done with the help of “mutate” function.

#mutate

DF_mutate <- DF_select %>%

group_by(State) %>%

summarise(Total = sum(Price)) %>%

mutate(“10%of Total” = Total/10) %>%

arrange(desc(Total))

DF_mutate

ByHariharan Rajendran

R Package dplyr Function vs T-SQL

R is one of the very famous tools to handle the data science projects because it has all the capabilities right from the extracting the data from different sources, data modelling and transformation, data visualization and finally building machine learning models using the data.

This post explains how the data modelling can be done with R using “dplyr” package.

To make it easy, let me compare this dplyr function with T-SQL.

First, let us load the data into R studio.

fulldata<-read.csv(“D:\\Projects 2018\\DSKA\\DataWranglingDemo.csv”)

fulldata

Select Function

When we have a large dataset with more than 1000 columns, if we need only certain columns then we can use “SELECT” option to choose the specific columns. Check the below example,

 

The same option is available in R with dplyr package.

#install.packages(“dplyr”)

require(“dplyr”)

#select

DF_select <- fulldata %>% select (State,Price)

DF_select

Filter Function

In T-SQL, we have where condition to filter any data with different conditions.

The same filter option is available with dplyr functions.

#filter

DF_select %>% filter(State==”Alabama”)

Same as T-SQL, we have group by, summarise functions are available in R

Group by Function

#Group by

DF_Group <- DF_select %>% group_by(State)

DF_Group

Summarise Function

#Summarise

DF_Sum<- DF_select %>% group_by(State) %>%

summarise(Total = sum(Price))

DF_Sum

ByHariharan Rajendran

Power BI Report using SSRS Shared Datasets

Creating a Power BI report is very easy, and it supports loads of data connectors. When we start creating a report against any relational database, we must choose any of the below query modes,
1. Import
2. Direct Query or Connect Live

For example, when you try to create a report against SQL Server database, then you need to choose any of the above methods based on the requirement.

If you take SSRS report server, we have below elements available
1. Paginated Reports
2. Mobile Reports
3. KPIs
4. Data Sources
5. Datasets

If you are SSRS administrator, then you can easily manage all these stuff in a centralized place.

Is it possible to use the SSRS dataset to create a Power BI Report?
Yes, it is possible. You must have the latest version of Power BI Report Server.
We have REST APIs and is supported in Power BI Report Server, which is an extended version of SSRS 2017. Check REST API on SSRS 2017.

We need to use the OData Feed data connector to connect with SSRS datasets.
Below is the information required to create a report.
1. REST API Url
2. REST API Url with specific dataset id

The base URL of the REST API will be like below,
http:///Reports/api/v2.0

ByHariharan Rajendran

Python scatter plot in SQL Server 2017

As you know, Microsoft released the latest version of SQL Server which is SQL Server 2017. If you worked with SQL Server 2016 then you could realize that Microsoft SQL Server is not just a relational database anymore because it started to support big data and so many options to handle the non-relational data.
R Language is very popular to work with data science-related projects. It was integrated with SQL Server 2016 and we can run the R Scripts in SQL Server Management Studio itself.
It helps us to avoid the data movement between the relational database to R server and process. The same way, Microsoft now introduced Python integration with SQL Server. These 2 languages are coming from machine learning services in SQL Server.
To run Python scripts in SQL Server management studio, you need to enable the external script stored procedure.
Run the below command in your SSMS and see whether you are getting the “hello world” as an output.
In case, you are getting an error message then the configuration part was not properly done.
execute sp_execute_external_script
@language = N’Python’,
@script = N’
print(“hello world”)’
The below script is to generate the scatter plot.
DECLARE @Query nvarchar(max) = N’SELECT Year, Sales from [dbo].[SalesByYear]’

execute sp_execute_external_script

@language = N’Python’,

@script = N’

import matplotlib.pyplot as myplot

X = myplot.figure()

myplot.scatter(InputDataSet.Year,InputDataSet.Sales)

myplot.xlabel (“Year”)

myplot.ylabel (“Sales”)

myplot.title (“Sales by Year”)

myplot.savefig (“D:\Win – 8\myfig.png”)

‘,

@input_data_1 = @Query
Check the below step by step procedure to create a scatter plot using python.

ByHariharan Rajendran

Graph database in SQL Server 2017

As you know Graph Database is one of the latest features from SQL Server 2017.
Let us first understand the purpose of Graph database. We have a relational database which handles most of the scenarios but as we are started to handle big data and complex scenario our database also should be capable enough to handle those scenarios.
Yes. Graph database handles those complex scenarios easily which I have explained. As part of graph database, Microsoft team introduced two different tables.
1. Node
2. Edge
Check out the below explanation of those tables and how to work with graph database.
Use the below scripts

Use SQL2017
—Create Main table
CREATE TABLE People (
ID INT PRIMARY KEY,
Name NVARCHAR(25)
) AS NODE;

–Create Edge Table for relationships
CREATE TABLE RELATIONSHIP (
TYPE NVARCHAR(25)
) AS EDGE;

—Insert values to People
INSERT INTO People VALUES (1, ‘David’)
INSERT INTO People VALUES (2, ‘John’)

SELECT * FROM People;

SELECT * FROM RELATIONSHIP;

–Create relationships
INSERT INTO RELATIONSHIP VALUES (
(SELECT $node_id from People WHERE Name = ‘David’),
(SELECT $NODE_ID FROM People where Name = ‘John’), ‘Father’);

INSERT INTO RELATIONSHIP VALUES (
(SELECT $node_id from People WHERE Name = ‘John’),
(SELECT $NODE_ID FROM People where Name = ‘David’), ‘Son’);

—Cartesian Product result
–No need to use joins since nodes and edges are interconnected in structure
SELECT FromName.Name, RELATIONSHIP.TYPE, ToName.Name
FROM People AS FromName, People As ToName , RELATIONSHIP

–Proper Result
SELECT FromName.Name, RELATIONSHIP.TYPE, ToName.Name
FROM People AS FromName, People As ToName , RELATIONSHIP
WHERE MATCH (FromName-(Relationship)->ToName)

–more Records
INSERT INTO People VALUES (3, ‘Nancy’)

INSERT INTO RELATIONSHIP VALUES (
(SELECT $node_id from People WHERE Name = ‘Nancy’),
(SELECT $NODE_ID FROM People where Name = ‘David’), ‘Daughter’);

INSERT INTO RELATIONSHIP VALUES (
(SELECT $node_id from People WHERE Name = ‘David’),
(SELECT $NODE_ID FROM People where Name = ‘Nancy’), ‘Father’);

INSERT INTO RELATIONSHIP VALUES (
(SELECT $node_id from People WHERE Name = ‘John’),
(SELECT $NODE_ID FROM People where Name = ‘Nancy’), ‘Brother’);

INSERT INTO RELATIONSHIP VALUES (
(SELECT $node_id from People WHERE Name = ‘Nancy’),
(SELECT $NODE_ID FROM People where Name = ‘John’), ‘Sister’);

SELECT FromName.Name, RELATIONSHIP.TYPE, ToName.Name
FROM People AS FromName, People As ToName , RELATIONSHIP
WHERE MATCH (FromName-(Relationship)->ToName)

SELECT FromName.Name, RELATIONSHIP.TYPE, ToName.Name
FROM People AS FromName, People As ToName , RELATIONSHIP
WHERE MATCH (FromName-(Relationship)->ToName)
AND FromName.Name = ‘Nancy’

Share your comments below. Thank you

ByHariharan Rajendran

Power BI Bookmark and Selection

Microsoft Power BI recently released new features as part of October release. In that, the below two features are very important to consider for our reporting solutions.

  1. Bookmarking
  2. Selection Pane.

First, let me explain the selection pane feature. It is useful to show and hide any report elements in the report.

If we see selection pane alone, then we can’t identify the importance of that but, this selection pane will combine with the bookmark and do magic.

Please check the below report and play with different chart types.

https://app.powerbi.com/view?r=eyJrIjoiN2Q3Yjg3OTUtMDM4NC00ZTU5LWE5MjAtNGVjODMxOGJiOWQzIiwidCI6IjNkMWQwNTA0LTRjNDItNGUwMi1hZTI4LTVjOWE5YzUwZjM2ZiIsImMiOjh9

Take your time and think that how this report works. Actually, it is a simple when you see the report but in the backend it using the selection and bookmark features.

Before explaining how the report is built, let me explain what bookmark feature is.

Bookmark

It is a preview feature which means it is not generally available. Microsoft Power BI team announced this feature during Data Insight Summit. They have created a hype to the feature.

We can create a bookmark for the interesting stats. For example, if you want to create a data story or you want to present the report to the business users to show the sales and revenue information.

At that time, you may need to show metrics for last year sales and this year sales for comparison and same for revenue or you may need to highlight a specific visual (it can be done with spotlight, refer here).

With the single report you need to filter the data during the presentation but instead, you can create a bookmark for each one of the results and can show them easily.

Learn more about Bookmarking feature in Power BI site.

Let me explain how the report is built with the help of bookmark and selection pane. Follow the below steps to reproduce the report.

  1. Create a report with pie charts
    1. Create a bookmark as pie
  2. Add the donut charts on the same page where pie charts are placed
    1. Create a bookmark as Donut
  3. Add column charts on the same page where pie and donut charts are placed
    1. Create a bookmark as Column
  4. Open the selection pane and hide the donut and column charts and update the pie bookmark
  5. Choose Donut bookmark and hide pie and column charts and update the bookmark
  6. Same for column bookmark
  7. Add the slicer on top and added three images (pie, donut & column chart icons)
  8. On each image, on Link option choose bookmark type and select the relevant bookmarks

That’s it. Please let me know your comments below also share if you have any other report logics with these features.