BLOGS

ByDr. SubraMANI Paramasivam

Create a work station for a Data Scientist – Part 2

Following my first post on this same series on creating a work station for a Data Scientist, this will be the next one.

Next you need Revolution R Enterprise 7.5.0 Server (RRE-7.5.0) and you can download this from this link.

This will download about 81MB of “Revolution-R-Enterprise-Node-SQL-7.5.0-Windows.EXE”

Once downloaded you can then start installing this exe and below screenshots are provided for a successful installation.

InstalRevolutionREnterprise75_01

InstalRevolutionREnterprise75_02

InstalRevolutionREnterprise75_03

InstalRevolutionREnterprise75_04

InstalRevolutionREnterprise75_05

InstalRevolutionREnterprise75_06

InstalRevolutionREnterprise75_07

InstalRevolutionREnterprise75_08

InstalRevolutionREnterprise75_09

InstalRevolutionREnterprise75_10

InstalRevolutionREnterprise75_11

InstalRevolutionREnterprise75_12

InstalRevolutionREnterprise75_13

InstalRevolutionREnterprise75_14

InstalRevolutionREnterprise75_15

InstalRevolutionREnterprise75_16

InstalRevolutionREnterprise75_17

InstalRevolutionREnterprise75_18

InstalRevolutionREnterprise75_19

ByHariharan Rajendran

Database Table and Primary Key – Part 2

In my previous article, I have explained about the table creation and primary keys.

In real time scenarios, a primary key field should generate values automatically whenever a new row added to the table.

If the primary key field is an integer then set up an identity

  1. Identity needs to be defined in (X, Y) format
  2. Define the value for X which is starting value and Y is an increment value.

CREATE TABLE Users

(Id int primary key identity (1,1),

Name varchar(50))

This table will generate a value for Id starting from 1 and increment by 1. The column values will be 1, 2, 3, .etc.

Primary key on unique identifier field,

CREATE TABLE Users

(Id uniqueidentifier default NEWID() primary key,

Name varchar(50))

This will generate random new GUID value for each row.

Choosing a right primary key column is very challenging.

We can create a primary key on any of the columns but we need to understand the business requirement.

Consider, if you are creating a primary key on Integer field. It has both advantages and disadvantages. Most of the time people will consider its advantages and start using integer as primary key column for all the tables.

Points to remember,

  1. Performance wise, this is the best option
  2. Applying and rebuilding the index is very easy
  3. If we consider a big environment where data storage is important. Then int will hold values up to Inserting records more than this limit will break the table.

Primary Key on unique identifier field,

Points to remember,

  1. It is very good idea to use unique identifier as a primary key column when data is less in the table
  2. It will occupy more disk space
  3. If data is huge then indexing and rebuilding are very hectic.

Primary key on varchar field, consider an emailid field

Point to remember,

  1. Sometimes we might think that our table should maintain a unique email address so considering this we may create aprimary key on email field. This is not advisable.
  2. Email address is a field which will change over the period of time.
  3. If considering this as a primary key field and involved on many to many relationships then if any changes to the email address, it will affect the whole database.
ByHariharan Rajendran

Database Table and Primary Key – Part 1

Creating a table in MS SQL Server is very easy, this can be achieved by using T-SQL scripts or GUI. This article will use T-SQL Scripts to a create a table.

Basic Pre-Requisites to create a table,

  1. Understand the purpose of the table
  2. Identify the columns and data types. Refer this link, to know about data types.

Table is the storage unit in a database which stores the data in rows and columns as like excel sheet. The output of the table result is similar to the excel sheet data.

If you are decided to store some data in a database then you must use table.

The basic table structure will look like below.

CREATE TABLE Users

(Id int)

We can add more columns in the structure, each column should be define with the right data type. Here Id is the column name and int is the data type which is nothing but an integer.

Let us discuss, what is a primary key and how to create it.

Primary key is a constraint which we need to define in any one of the column in a table. It will not allow null values in that defined column and entry to the column should be unique  i.e. it won’t allow duplicate values.

Choose an ID or Unique Identifier (GUID) column to create a primary key, this is the best way to define a table structure.

Create a table with primary Key on integer column,

CREATE TABLE Users

(Id int primary key,

Name varchar(50))

In the above script, ID column is considered as a primary key. You can check this by clicking the columns folder under dbo.Users table.

Create a table with primary key on unique identifier column,

CREATE TABLE Users

( Id uniqueidentifier primary key,

Name varchar(50))

ByDr. SubraMANI Paramasivam

Create a work station for Data Scientist – Part 1

In this post, I am going to cover how to create the work station that will make it suitable for a Data Scientist. Sounds interesting!? Go ahead reading my other parts on this.

First of all you need Microsoft R Open 3.2.2 for Revolution R Enterprise 7.5.0 which can be downloaded here.

Once you have downloaded this 56MB file, you can start installing and I have provided step by step screenshots below.

InstalRevolution322Enterprise75_01

InstalRevolution322Enterprise75_02

InstalRevolution322Enterprise75_03

InstalRevolution322Enterprise75_04

ByMohan Rajendran

How to check installed R version information in R?

There are many methods to check the installed version of R.

Method 1:

version – Provides information about the version of R with additional details

Method 2:

R.version.string – is used to check only the version information

Method 3:

R.Version () – Provides information about the version of R by list

Method 4:

getRversion() – Provides information about the  R version string number