R Package dplyr Function vs T-SQL

By Hariharan Rajendran

R is one of the most popular tools for data science projects because it covers the whole workflow: extracting data from different sources, data modelling and transformation, data visualization, and finally building machine learning models from the data.

This post explains how data modelling can be done in R using the dplyr package.

To make it easy, let me compare the dplyr functions with their T-SQL counterparts.

First, let us load the data into RStudio.

fulldata <- read.csv("D:\\Projects 2018\\DSKA\\DataWranglingDemo.csv")

fulldata
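
If the CSV file is not at hand, the loading step can still be tried with a stand-in file. The sketch below writes a small made-up data frame to a temporary CSV and reads it back with read.csv(). The rows, the Units column and the sample_path name are assumptions for illustration; only the State and Price column names come from this post.

```r
# Hypothetical sample rows; only the column names State and Price appear in the post.
sample_data <- data.frame(
  State = c("Alabama", "Alabama", "Texas", "Texas", "Ohio"),
  Price = c(100, 150, 200, 50, 75),
  Units = c(1, 2, 3, 4, 5)
)

# Write to a temporary file so the example does not depend on a local path.
sample_path <- tempfile(fileext = ".csv")
write.csv(sample_data, sample_path, row.names = FALSE)

# Read it back the same way the post loads DataWranglingDemo.csv.
fulldata <- read.csv(sample_path)

str(fulldata)   # column names and types
head(fulldata)  # first few rows
```

str() and head() are usually a quicker check than printing the whole data frame when the file is large.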

Select Function

When we have a wide dataset, say with more than 1000 columns, and need only a few of them, T-SQL lets us list just those columns in the SELECT statement, for example SELECT State, Price to return only those two columns.

The same option is available in R through the select() function of the dplyr package.

# install.packages("dplyr")

library(dplyr)

# select

DF_select <- fulldata %>% select(State, Price)

DF_select
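
To make the comparison concrete, here is a minimal sketch of select() on a small made-up data frame, with the T-SQL form shown as a comment. The sample rows, the Units column and the table name in the SQL comment are assumptions; they are not taken from DataWranglingDemo.csv.

```r
library(dplyr)

# Hypothetical stand-in for fulldata; Units is an invented extra column.
fulldata <- data.frame(
  State = c("Alabama", "Alabama", "Texas"),
  Price = c(100, 150, 200),
  Units = c(1, 2, 3)
)

# T-SQL equivalent (table name assumed):
#   SELECT State, Price FROM fulldata;
DF_select <- fulldata %>% select(State, Price)

# Columns can also be dropped by name with a minus sign.
DF_without_units <- fulldata %>% select(-Units)

names(DF_select)
names(DF_without_units)
```

Both calls leave the same two columns here; dropping by name is handy when it is shorter to list what to leave out than what to keep.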

Filter Function

In T-SQL, the WHERE clause filters rows by one or more conditions.

dplyr offers the same capability through the filter() function.

#filter

DF_select %>% filter(State == "Alabama")

As in T-SQL, group by and summarise operations are also available in R.
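
A WHERE clause often combines several conditions, and filter() can do the same. The sketch below uses made-up rows and an assumed table name in the SQL comment, and shows one way to express AND and OR conditions.

```r
library(dplyr)

# Hypothetical sample rows standing in for DF_select.
DF_select <- data.frame(
  State = c("Alabama", "Alabama", "Texas", "Ohio"),
  Price = c(100, 150, 200, 75)
)

# T-SQL equivalent (table name assumed):
#   SELECT State, Price FROM DF_select
#   WHERE State = 'Alabama' AND Price > 120;
# Conditions separated by commas are combined with AND.
DF_filtered <- DF_select %>% filter(State == "Alabama", Price > 120)

# Use | for OR, like the OR keyword in a WHERE clause.
DF_either <- DF_select %>% filter(State == "Texas" | Price < 100)

DF_filtered
DF_either
```

Note that R compares with == where T-SQL uses a single =.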

Group by Function

#Group by

DF_Group <- DF_select %>% group_by(State)

DF_Group
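
Unlike GROUP BY in T-SQL, group_by() on its own does not collapse any rows; it only records which column defines the groups, so that a later step such as summarise() can work per group. A small sketch with made-up rows to illustrate this:

```r
library(dplyr)

# Hypothetical sample rows standing in for DF_select.
DF_select <- data.frame(
  State = c("Alabama", "Alabama", "Texas", "Ohio"),
  Price = c(100, 150, 200, 75)
)

DF_Group <- DF_select %>% group_by(State)

nrow(DF_Group)        # same number of rows as DF_select
group_vars(DF_Group)  # name of the grouping column
n_groups(DF_Group)    # number of distinct groups
```

Printing DF_Group therefore shows the same rows as DF_select, with the grouping noted in the header.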

Summarise Function

#Summarise

DF_Sum <- DF_select %>% group_by(State) %>%

  summarise(Total = sum(Price))

DF_Sum
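
The group_by() and summarise() pair corresponds to an aggregate query with GROUP BY in T-SQL. The sketch below uses made-up rows and an assumed table name in the SQL comment; the Count column is an addition for illustration and is not part of the original example.

```r
library(dplyr)

# Hypothetical sample rows standing in for DF_select.
DF_select <- data.frame(
  State = c("Alabama", "Alabama", "Texas", "Ohio"),
  Price = c(100, 150, 200, 75)
)

# T-SQL equivalent (table name assumed):
#   SELECT State, SUM(Price) AS Total, COUNT(*) AS [Count]
#   FROM DF_select
#   GROUP BY State;
DF_Sum <- DF_select %>%
  group_by(State) %>%
  summarise(Total = sum(Price), Count = n())

DF_Sum
```

Several aggregates can be computed in one summarise() call, just as several aggregate functions can appear in one SELECT list.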

About the Author

Hariharan Rajendran is a Microsoft Certified Trainer with 9+ years of experience in Database, BI and Azure platforms. He is also an active community leader, speaker and organizer: he leads the Microsoft PUG (Power BI User Group – Chennai) and the SQLPASS Power BI Local Group – Chennai, is an active speaker in the SQL Server Chennai User Group, and is a leader in the Data Awareness Program worldwide events. Hariharan also blogs frequently (www.dataap.org/blog) and provides virtual training (on an ad-hoc basis) on Microsoft Azure, Database Administration, Power BI and database development to clients and audiences worldwide.
