Following Satya’s quote “Empowering Every Person in the Planet”, in 2016 Microsoft Inspire conference, have certainly left my mind with unstoppable beats. I certainly understood what Satya meant, but I still thought about doing something like this, with my own upgraded version of Data Awareness Programme (which I started in 2014). But at that time, all I had was Power View, Power Map, Power Pivot as Excel add-ons and tried my best, to do the awareness programmes in remote villages, by mingling with the villagers and collecting their own data and showing some visuals back to them. I did this to improve their life style, find more time for personal and have better earnings. Though the main target was students, I hoped this message would spread to their friends, family and others in the remote places.
Following the release of full version of Power BI, I now have a fully working site, with living “Awareness Enabled Reports”, from sleeping open data sources (taken from various Gov/Non-Gov sites).
I managed to get this far, with a simple equation of A + B + C + D = E (EmpoweringEveryPerson.com (EEP)). Let me explain this in detail.
With our all time favourite reporting tool, “Microsoft Power BI“, published some “Awareness Enabled Reports” to www.EmpoweringEveryPerson.com site and categorized them with regional, national and global challenges, for easy manoeuvring within the EEP site.
This EEP site currently has 3 simple goals.
Below screenshot shows categorization of reports by UK.
PROMOTE & PARTICIPATE
As per above equation, ‘D’ is the support that we need from you, to promote in any of the following ways.
2. COMMUNITY LEADERS
A request to all Community group leaders, to spend at least a minute by starting or ending your user / local / online group sessions by introducing / re-introducing, this website and showcasing the opportunity to all attendees, to build and submit their own data story with “Awareness Enabled Reports“.
3. END USERS / VOLUNTEERS
Every time you see a new “Awareness Enabled Report“, do tweet / share / post in social media and support to spread the awareness.
Thanks in advance for your support and thanks for your time reading through this far, to create awareness with “Awareness Enabled Reports“.
Date data type is very important for all programming languages. To handle the date in a proper way, we need to apply some formatting logic.
This is the case for R programming language as well. This post explains you, how we can handle the Date data in R. There are different functions available in R to handle date.
To get the today’s date, use Sys.Date()
To get date and time, use date()
> date() “Tue May 22 15:29:18 2018”
Check the below list of symbols to play with date and time format.
|%d||day as a number (0-31)||01-31|
> today <- Sys.Date()> format(today, format=”%B %d %Y”) “May 22 2018”
In a real-time scenario, we will not get date as date data type. Always, we need to convert to proper date data type. To convert to date data type, we need to use as.Date() function.
x – Field or column
format – use the above symbols to frame the proper format.
DateVec<- c(“01/05/2018”, “08/16/2018”)
 “01/05/2018” “08/16/2018”
Length Class Mode
2 character character
dates <- as.Date(DateVec, “%m/%d/%Y”)
 “2018-01-05” “2018-08-16”
Min. 1st Qu. Median Mean 3rd Qu. Max.
“2018-01-05” “2018-03-01” “2018-04-26” “2018-04-26” “2018-06-21” “2018-08-16”
Following my previous post, we have another few functions on dplyr package which will be covered in this post.
As part of data modelling, we need to sort the data to analyze further.
In T-SQL we can easily perform the sort on the data as like below. It uses “Order by” clause to sort the data.
In R, we need to use arrange function.
DF_Arrange<- DF_select %>%
summarise(Total = sum(Price)) %>%
The next, very common task is to build the calculated column to satisfy the business logic. This can be easily done with the help of “mutate” function.
DF_mutate <- DF_select %>%
summarise(Total = sum(Price)) %>%
mutate(“10%of Total” = Total/10) %>%
R is one of the very famous tools to handle the data science projects because it has all the capabilities right from the extracting the data from different sources, data modelling and transformation, data visualization and finally building machine learning models using the data.
This post explains how the data modelling can be done with R using “dplyr” package.
To make it easy, let me compare this dplyr function with T-SQL.
First, let us load the data into R studio.
When we have a large dataset with more than 1000 columns, if we need only certain columns then we can use “SELECT” option to choose the specific columns. Check the below example,
The same option is available in R with dplyr package.
DF_select <- fulldata %>% select (State,Price)
In T-SQL, we have where condition to filter any data with different conditions.
The same filter option is available with dplyr functions.
DF_select %>% filter(State==”Alabama”)
Same as T-SQL, we have group by, summarise functions are available in R
DF_Group <- DF_select %>% group_by(State)
DF_Sum<- DF_select %>% group_by(State) %>%
summarise(Total = sum(Price))
Creating a Power BI report is very easy, and it supports loads of data connectors. When we start creating a report against any relational database, we must choose any of the below query modes,
2. Direct Query or Connect Live
For example, when you try to create a report against SQL Server database, then you need to choose any of the above methods based on the requirement.
If you take SSRS report server, we have below elements available
1. Paginated Reports
2. Mobile Reports
4. Data Sources
If you are SSRS administrator, then you can easily manage all these stuff in a centralized place.
Is it possible to use the SSRS dataset to create a Power BI Report?
Yes, it is possible. You must have the latest version of Power BI Report Server.
We have REST APIs and is supported in Power BI Report Server, which is an extended version of SSRS 2017. Check REST API on SSRS 2017.
We need to use the OData Feed data connector to connect with SSRS datasets.
Below is the information required to create a report.
1. REST API Url
2. REST API Url with specific dataset id
The base URL of the REST API will be like below,
As you know, Microsoft released the latest version of SQL Server which is SQL Server 2017. If you worked with SQL Server 2016 then you could realize that Microsoft SQL Server is not just a relational database anymore because it started to support big data and so many options to handle the non-relational data.
R Language is very popular to work with data science-related projects. It was integrated with SQL Server 2016 and we can run the R Scripts in SQL Server Management Studio itself.
It helps us to avoid the data movement between the relational database to R server and process. The same way, Microsoft now introduced Python integration with SQL Server. These 2 languages are coming from machine learning services in SQL Server.
To run Python scripts in SQL Server management studio, you need to enable the external script stored procedure.
Run the below command in your SSMS and see whether you are getting the “hello world” as an output.
In case, you are getting an error message then the configuration part was not properly done.
@language = N’Python’,
@script = N’
The below script is to generate the scatter plot.
DECLARE @Query nvarchar(max) = N’SELECT Year, Sales from [dbo].[SalesByYear]’
@language = N’Python’,
@script = N’
import matplotlib.pyplot as myplot
X = myplot.figure()
myplot.title (“Sales by Year”)
myplot.savefig (“D:\Win – 8\myfig.png”)
@input_data_1 = @Query
Check the below step by step procedure to create a scatter plot using python.