Thursday, 28 March 2013

ITBAL Assignment 10: 3D Plotting in R

Assignment 1: 

Create 3 vectors x, y and z, choose any random values for them, ensure they are of equal length, and bind them together. Create 3-dimensional plots of the result.

Solution:

Commands:

First, create a random data set of 50 values drawn from a normal distribution with mean = 30 and standard deviation = 10

> data <- rnorm(50,mean=30,sd=10)
> data

Take a sample of 10 values (drawn without replacement) from the created data set for each of the three vectors x, y, z
> x <- sample(data,10)
> x

> y <- sample(data,10)
> y

> z <- sample(data,10)
> z

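Binding the three vectors x, y, z column-wise into a matrix T using cbind (note that T is also R's shorthand for TRUE, so this name masks it for the rest of the session)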
> T <- cbind(x,y,z)
> T
          

Data Set
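To make the data set above reproducible, a small sketch along the following lines could be used. The seed value and the name m are assumptions added here (m is used instead of T only to avoid masking TRUE); everything else follows the commands above.

```r
# Minimal sketch: reproducible version of the data set (seed is an arbitrary assumption)
set.seed(42)
data <- rnorm(50, mean = 30, sd = 10)

# Three samples of 10 values each, drawn without replacement from data
x <- sample(data, 10)
y <- sample(data, 10)
z <- sample(data, 10)

# cbind() stacks the vectors as columns of a numeric matrix
m <- cbind(x, y, z)
dim(m)        # number of rows and columns
colnames(m)   # column names are taken from the vector names
```

The result is a matrix with one row per observation and one column per vector, which is the form plot3d() accepts.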

Plotting 3d graph 

Command:

> library(rgl)   # plot3d() is provided by the rgl package
> plot3d(T[,1:3])
3D plot

Plotting of graph with labels for axes and color

Command 
> plot3d(T[,1:3], xlab="X Axis" , ylab="Y Axis" , zlab="Z Axis", col=rainbow(500))
3D plot with color

Plotting of graph with labels for axes, color and type = spheres

Command
> plot3d(T[,1:3], xlab="X Axis" , ylab="Y Axis" , zlab="Z Axis", col=rainbow(5000), type='s')
3D Plot with spheres


Plotting of graph with labels for axes, color and type = points

Command

> plot3d(T[,1:3], xlab="X Axis" , ylab="Y Axis" , zlab="Z Axis", col=rainbow(5000), type='p')
3D Plot with Points


Plotting of graph with labels for axes, color and type = lines

Command

> plot3d(T[,1:3], xlab="X Axis" , ylab="Y Axis" , zlab="Z Axis", col=rainbow(5000), type='l')
3D Plot with Lines
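Since there are only 10 points, one possible refinement is to ask rainbow() for exactly one colour per point, so that every point gets a clearly different colour. The sketch below assumes the rgl package is installed and only opens a plot window in an interactive session; the names pts and point_cols and the seed are made up for this illustration.

```r
# Minimal sketch: one distinct colour per point (seed and names are assumptions)
set.seed(1)
pts <- cbind(x = rnorm(10, mean = 30, sd = 10),
             y = rnorm(10, mean = 30, sd = 10),
             z = rnorm(10, mean = 30, sd = 10))

# rainbow(n) returns n colours spread around the colour wheel
point_cols <- rainbow(nrow(pts))

# Draw the spheres only when a display is available and rgl is installed
if (interactive() && requireNamespace("rgl", quietly = TRUE)) {
  rgl::plot3d(pts, xlab = "X Axis", ylab = "Y Axis", zlab = "Z Axis",
              col = point_cols, type = "s")
}
```

With a much longer palette such as rainbow(5000), the first 10 colours are almost identical shades of red, so the points would be hard to tell apart by colour.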



Assignment 2:

Choose 2 random variables 
Create 3 plots: 
1. X-Y 
2. X-Y|Z (introduce a variable z with 5 different categories and cbind it to x and y)
3. Color code and draw the graph 
4. Smooth and best fit line for the curve

Solution

Creating a data set for two random variables and then introducing third variable z

Commands

> x <- rnorm(5000, mean= 20 , sd=10)
> y <- rnorm(5000, mean= 10, sd=10)
> z1 <- sample(letters, 5)
> z2 <- sample(z1, 5000, replace=TRUE)
> z <- as.factor(z2)
> z
Data Set
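The three variables can also be kept together in one data frame, which is a convenient way to check that z really has 5 categories. This is only a sketch: the seed and the name df are assumptions, not part of the original commands. A data frame is used here because cbind() on a factor would store its integer codes instead of the letters.

```r
# Minimal sketch: x, y and the 5-level factor z in one data frame
set.seed(2013)                               # arbitrary seed, assumed for reproducibility
x <- rnorm(5000, mean = 20, sd = 10)
y <- rnorm(5000, mean = 10, sd = 10)
z1 <- sample(letters, 5)                     # pick 5 category labels
z <- as.factor(sample(z1, 5000, replace = TRUE))

df <- data.frame(x, y, z)
nlevels(df$z)    # number of categories in z
table(df$z)      # observations per category
```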

Creating Quick Plots

Command:

> library(ggplot2)   # qplot() is provided by the ggplot2 package
> qplot(x,y)
x and y qplot

> qplot(x,z)
x and z qplot

For semi-transparent plot

> qplot(x,z, alpha=I(2/10))
Semi-transparent Plot


For coloured plot

> qplot(x,y, color=z)

Colored plot


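For Logarithmic coloured plot (x and y contain some non-positive values, for which log() returns NaN with a warning; those points are dropped from the plot)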

> qplot(log(x),log(y), color=z)
Logarithmic Plot
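Because x and y are normally distributed, some of their values are zero or negative and have no real logarithm. A possible way to handle this, sketched below under the assumption that dropping such rows is acceptable, is to filter them out before taking logs. The seed and the names keep, lx and ly are made up for the example.

```r
# Minimal sketch: keep only rows where both x and y are positive before taking logs
set.seed(7)                                  # arbitrary seed (assumption)
x <- rnorm(5000, mean = 20, sd = 10)
y <- rnorm(5000, mean = 10, sd = 10)

keep <- x > 0 & y > 0                        # TRUE where log() is defined for both
mean(keep)                                   # share of points that survive the filter

lx <- log(x[keep])
ly <- log(y[keep])
```

The filtered vectors can then be passed to qplot() in place of log(x) and log(y), together with z[keep] for the colour.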

Best Fit and Smooth curve using "geom"

Command:

> qplot(x,y,geom=c("path","smooth"))

geom='path'

> qplot(x,y,geom=c("point","smooth"))

geom='point'


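> qplot(z,y,geom=c("boxplot","jitter"))
geom='boxplot' and 'jitter' (one box of y values per category of z, since a boxplot needs a categorical variable on the x axis)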
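For the best fit line itself, the coefficients can be obtained with lm(), and the same straight line can be drawn with geom_smooth(method = "lm"). The sketch below is an illustration under assumptions: the seed and the names df, fit and p are made up, and the plot is only built if ggplot2 is installed. Recent ggplot2 releases mark qplot() as deprecated, so the ggplot() form is used here.

```r
# Minimal sketch: least-squares best fit line for y against x
set.seed(99)                                 # arbitrary seed (assumption)
df <- data.frame(x = rnorm(5000, mean = 20, sd = 10),
                 y = rnorm(5000, mean = 10, sd = 10))

fit <- lm(y ~ x, data = df)                  # ordinary least squares
coef(fit)                                    # intercept and slope of the fitted line

# Same line drawn on top of the scatter plot (requires ggplot2)
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(df, ggplot2::aes(x, y)) +
    ggplot2::geom_point(alpha = 0.2) +
    ggplot2::geom_smooth(method = "lm", formula = y ~ x)
  # print(p) displays the plot
}
```

Since x and y are generated independently, the fitted slope should be close to zero and the line nearly horizontal.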

Saturday, 23 March 2013

Data Visualization & Representation



Data visualization is the study of the visual representation of data, meaning "information that has been abstracted in some schematic form, including attributes or variables for the units of information".

According to Friedman (2008) the "main goal of data visualization is to communicate information clearly and effectively through graphical means.

It doesn't mean that data visualization needs to look boring to be functional or extremely sophisticated to look beautiful. To convey ideas effectively, both aesthetic form and functionality need to go hand in hand, providing insights into a rather sparse and complex data set by communicating its key-aspects in a more intuitive way. Yet designers often fail to achieve a balance between form and function, creating gorgeous data visualizations which fail to serve their main purpose — to communicate information".


The tool that I used to build a data-visualization version of my resume is Visual.ly.

Tool Analysis: Visual.ly (http://visual.ly/)

About:

Visual.ly is a community platform for data visualization and infographics. It was founded by Stew Langille, Lee Sherman, Tal Siach, and Adam Breckler in 2011.

Visual.ly is structured both as a showcase for infographics and as a marketplace and community for publishers, designers, and researchers. The site allows users to search images by description, tags, and sources in a variety of categories, ranging from Education to Business or Politics. Users can publish infographics to their personal profile, which they can subsequently share through their social networks.

Visual.ly maintains a team of data analysts, journalists, and designers that create infographics and data visualizations using the Visual.ly tools. They are currently developing a tool that allows anyone to create and publish their own data visualizations. Through this tool, users will be able to gather information from databases and APIs in an automated service to produce an infographic.

By tapping into Visually's vibrant community of more than 35,000 designers, Marketplace is able to match infographic commissioners (brands, companies, agencies) with designers. Once matched, commissioners have direct access to the designers working on their projects and can communicate and transact with them in Visually's Project Center. Through such unique features as the Project Timeline, commissioners always know where their project stands and can ensure that it stays on time and on budget.

Visually partners with the world's leading publications and brands, bringing its tools, community, and talented team to bear on data visualization needs wherever bespoke creation is needed.


Some points that I found were wonderful about this tool were:

  • UI is very user friendly
  • it is open source
  • numerous options regarding visual presentation of different types of data are available
  • the full tool is available online and it is not necessary to install any software on your PC
  • it is fast
  • the results are attractive and elegant
  • themes and options suiting everyone's style and taste are available.
  • once the visual presentation of data is ready, all possible options to retain and avail that 

Here is the picture of my resume. click here

Friday, 15 March 2013

ITBAL Assignment 8

Analysis of panel data using 3 models:

1. Pooled model
2. Fixed effect model
3. Random effect model

Panel data has two types of index:

1. ID index (the unit being observed, here the state)
2. Time index (the period of observation, here the year)

Step 1:

Read the data.

Commands:
> library(plm)   # provides plm(), pFtest(), plmtest() and phtest()
> data("Produc",package = "plm")
> head(Produc)
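Before fitting the models it can help to check how the two indexes combine, for example whether every state is observed in every year (a balanced panel). The helper is_balanced() below is a hypothetical function written for this post, not part of plm; the toy data frame is also made up. The last lines apply it to Produc only if the plm package is installed.

```r
# Hypothetical helper: TRUE when every id appears exactly once in every time period
is_balanced <- function(df, id, time) {
  counts <- table(df[[id]], df[[time]])      # id x time table of row counts
  all(counts == 1)
}

# Toy panel: 3 states observed in 4 years
toy <- expand.grid(state = c("A", "B", "C"), year = 2001:2004)
is_balanced(toy, "state", "year")            # complete panel
is_balanced(toy[-1, ], "state", "year")      # one observation removed

# Same check on the Produc data, if plm is available
if (requireNamespace("plm", quietly = TRUE)) {
  data("Produc", package = "plm")
  print(is_balanced(Produc, "state", "year"))
}
```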

Step 2:

Pooled Model:

Commands:
> pool <- plm(log(pcap)~log(hwy)+log(water)+log(util)+log(pc)+log(gsp)+log(emp)+log(unemp), data=Produc, model="pooling", index=c("state","year"))
> summary(pool)

Step 3:

Fixed effect model:

Commands:
> fixed <- plm(log(pcap)~log(hwy)+log(water)+log(util)+log(pc)+log(gsp)+log(emp)+log(unemp), data=Produc, model="within", index=c("state","year"))
> summary(fixed)
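The "within" label refers to how the fixed effect slopes are obtained: each variable is demeaned within its ID group and ordinary least squares is run on the demeaned data. The sketch below illustrates this on a small simulated panel in base R; the data, the seed and all names are assumptions made for the illustration and have nothing to do with Produc. It compares the demeaned regression with a regression that includes one dummy per ID, which gives the same slope.

```r
# Minimal sketch: within (fixed effect) estimator by group demeaning, on simulated data
set.seed(10)                                          # arbitrary seed (assumption)
panel <- data.frame(id = rep(1:5, each = 8), year = rep(1:8, times = 5))
alpha <- rnorm(5, sd = 3)[panel$id]                   # unobserved effect of each id
panel$x <- rnorm(40) + alpha                          # regressor correlated with the effect
panel$y <- 2 * panel$x + alpha + rnorm(40, sd = 0.5)  # true slope is 2

# Demean x and y within each id, then run OLS without an intercept
panel$x_dm <- panel$x - ave(panel$x, panel$id)
panel$y_dm <- panel$y - ave(panel$y, panel$id)
within_slope <- coef(lm(y_dm ~ x_dm - 1, data = panel))[["x_dm"]]

# Equivalent: OLS with one dummy variable per id
lsdv_slope <- coef(lm(y ~ x + factor(id), data = panel))[["x"]]

# Pooled OLS ignores the id effects altogether
pooled_slope <- coef(lm(y ~ x, data = panel))[["x"]]

c(within = within_slope, lsdv = lsdv_slope, pooled = pooled_slope)
```

The within and dummy-variable slopes coincide, while the pooled slope differs because it ignores the ID effects.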






Step 4:

Random effect model:

Commands:

> random1 <- plm(log(pcap)~log(hwy)+log(water)+log(util)+log(pc)+log(gsp)+log(emp)+log(unemp), data=Produc, model="random", index=c("state","year"))
> summary(random1)



Step 5: Test for fixed effect model vs pooled model (F test)

Null hypothesis: Pooled model
Alternate hypothesis: Fixed effect model

Command:
> pFtest(fixed,pool)

Step 6: Test for pooled model vs random effect model (Lagrange multiplier test)

Null hypothesis: Pooled model
Alternate hypothesis: Random effect model

Command:



> plmtest(pool)





The p-value is very small, so we reject the null hypothesis and accept the alternate hypothesis, i.e. the random effect model.

Step 7: Test for random effect model vs fixed effect model (Hausman test)

Null hypothesis: Random effect model
Alternate hypothesis: Fixed effect model


Command:
> phtest(random1,fixed)

The p-value is very small, so we reject the null hypothesis and accept the alternate hypothesis, i.e. the fixed effect model.
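The decision rule used in Steps 5 to 7 can be written down as a small function. choose_model() is a hypothetical helper made up for this post, and the 0.05 significance level is an assumption; the tests themselves return standard test objects whose p-value is stored in the p.value element. The last block repeats the three tests on Produc and runs only if plm is installed.

```r
# Hypothetical helper: keep the null model unless the p-value is below alpha
choose_model <- function(p_value, null_model, alt_model, alpha = 0.05) {
  if (p_value < alpha) alt_model else null_model
}

choose_model(0.001, "pooled", "fixed")   # small p-value: null rejected
choose_model(0.40, "random", "fixed")    # large p-value: null kept

# Applying the rule to the three tests above (requires the plm package)
if (requireNamespace("plm", quietly = TRUE)) {
  data("Produc", package = "plm")
  f <- log(pcap) ~ log(hwy) + log(water) + log(util) + log(pc) +
    log(gsp) + log(emp) + log(unemp)
  idx <- c("state", "year")
  pool    <- plm::plm(f, data = Produc, model = "pooling", index = idx)
  fixed   <- plm::plm(f, data = Produc, model = "within",  index = idx)
  random1 <- plm::plm(f, data = Produc, model = "random",  index = idx)

  print(choose_model(plm::pFtest(fixed, pool)$p.value,    "pooled", "fixed"))
  print(choose_model(plm::plmtest(pool)$p.value,          "pooled", "random"))
  print(choose_model(plm::phtest(random1, fixed)$p.value, "random", "fixed"))
}
```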
Aritro Ghosh
12BM60087