Tuesday, 11 April 2017

Big Data - Kapil Sharma (101)

Big Data is commonly described by five characteristics, the "five Vs":

1. Volume

2. Velocity 

3. Variety 

4. Veracity 

5. Value  

More than 90% of existing big data sets were created in the last 3+ years.

Big data frameworks come into the picture to process these huge, largely unstructured data sets.

The value comes from processing, computing on and analysing the data: that is where the unique value addition lies.




Wednesday, 25 November 2015

Basic Data Science Introduction (C-01) - Kapil Sharma

Vectors: 
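The vector is a very important tool in R programming. Matrices and data frames are built from vectors. A vector can hold numeric, character or logical values, but all elements of one vector share a single type. The function c() is used to create vectors in R programming.

# The numbers are coerced to character because "xyz" is a character value
x <- c(2, 22, "xyz", -4)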
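
A minimal sketch of how c() handles types. The variable names num, chr, lgl and mixed are illustrative and not part of the original example. Each vector holds one type, and mixing types coerces every element to the most general one:

```r
# Vectors of a single type
num <- c(2, 22, -4)            # numeric
chr <- c("xyz", "abc")         # character
lgl <- c(TRUE, FALSE, TRUE)    # logical

# Mixing numbers and strings coerces every element to character
mixed <- c(2, 22, "xyz", -4)

class(num)    # "numeric"
class(mixed)  # "character"
mixed[1]      # "2" (a string, no longer a number)
```

Calling class() on a vector is a quick way to check whether coercion has happened before doing arithmetic on it.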


Factors:

Factors look like vectors, but they represent categorical data: each distinct value becomes a level.
y <- c(1,2,3,4,5,6,7)
yf <- factor(y)
yf
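
To make the difference from a plain vector more visible, here is a small sketch with repeated values. The sizes data is made up for illustration. The levels are the distinct values, and the factor stores integer codes that point into those levels:

```r
# A character vector with repeated values (hypothetical data)
sizes <- c("small", "large", "medium", "small", "large")

# Convert to a factor; by default the levels are the sorted distinct values
sf <- factor(sizes)

levels(sf)       # "large" "medium" "small"
as.integer(sf)   # 3 1 2 3 1 (internal integer codes)
table(sf)        # number of observations per level
```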

Lists:

Lists are like vectors, but their elements can be of different types. They are created with list(), not c().
a <- list(dog = "pitbull", age = 100, color = "golden", weight = TRUE)
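
A short sketch of why list() matters here. The names pet and as_vector are illustrative. list() keeps each element's own type, while c() applied to the same values coerces everything to character:

```r
# list() keeps each element's own type
pet <- list(dog = "pitbull", age = 100, color = "golden", weight = TRUE)

class(pet$dog)      # "character"
class(pet$age)      # "numeric"
class(pet$weight)   # "logical"

# c() on the same values produces a named character vector instead
as_vector <- c(dog = "pitbull", age = 100, color = "golden", weight = TRUE)
class(as_vector)    # "character"
```

Elements of a list can be accessed by name with $ or with double brackets, for example pet[["age"]].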

Matrices:

Matrices are vectors with two dimensions, arranged in rows and columns (nrow, ncol).
They can be combined by row with rbind() or by column with cbind().

# Create matrix with 4 elements:

cells <- c(3,5,16,29)
colname <- c("Jun", "Feb")
rowname <- c("Nut", "Orange")
y <- matrix(cells, nrow=2, ncol=2, byrow=TRUE, dimnames=list(rowname, colname))

       Jun Feb
Nut      3   5
Orange  16  29
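
The rbind() and cbind() functions mentioned above can be sketched on the same matrix. The extra row "Apple", the extra column "Mar" and their values are made up for illustration:

```r
# Rebuild the 2 x 2 matrix from the example above
m <- matrix(c(3, 5, 16, 29), nrow = 2, ncol = 2, byrow = TRUE,
            dimnames = list(c("Nut", "Orange"), c("Jun", "Feb")))

# Add a row with rbind(); the argument name becomes the row name
m_rows <- rbind(m, Apple = c(7, 11))

# Add a column with cbind(); the argument name becomes the column name
m_cols <- cbind(m, Mar = c(4, 20))

dim(m_rows)              # 3 2
dim(m_cols)              # 2 3
m_rows["Apple", "Feb"]   # 11
```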

Data frames:

A data frame is similar to a matrix, but its columns can hold different types, for example one character column and one numeric column.
Location <- c("Mandi", "Manali")
Distance <- c(200, 307)
df <- data.frame(Location, Distance)
df

  Location Distance
1    Mandi      200
2   Manali      307
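
As a rough sketch of working with such a data frame, columns can be pulled out with $ and rows filtered with a logical condition. The 250 threshold is an arbitrary value chosen for illustration:

```r
Location <- c("Mandi", "Manali")
Distance <- c(200, 307)
df <- data.frame(Location, Distance)

df$Distance               # 200 307
df[df$Distance > 250, ]   # keeps only the Manali row
sapply(df, class)         # column types; Location is a factor in R versions before 4.0
nrow(df)                  # 2
ncol(df)                  # 2
```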