Thursday, March 27, 2014

Data Frame

R is a platform, and developers can write their own packages for their own purposes.
R handles data in objects known as "data frames", which are organized in rows and columns.
Rows: observations or measurements.
Columns: values of different variables.
The body of a data frame can hold numbers, text (male, female, for example), calendar dates (23/05/04), and logical values (FALSE, TRUE).
For example, the data frame below holds numbers (2, 3, 5), text (aa, bb, cc), and logical values (TRUE, FALSE, TRUE).

> n<-c(2,3,5)
> s<- c("aa","bb","cc")
> b<- c(TRUE, FALSE, TRUE)
> df<-data.frame(n,s,b)
> df
  n  s     b
1 2 aa  TRUE
2 3 bb FALSE
3 5 cc  TRUE

To load a data frame from a file, use read.table.
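As a minimal sketch of read.table, the snippet below writes a small tab-separated file to a temporary location and reads it back. The file contents and the name df_file are made up for illustration; with your own data you would pass the path to your file instead.

```r
# Write a small tab-separated file to a temporary location (illustrative data only).
path <- tempfile(fileext = ".txt")
writeLines(c("n\ts\tb", "2\taa\tTRUE", "3\tbb\tFALSE", "5\tcc\tTRUE"), path)

# header = TRUE: the first line holds the column names.
# sep = "\t": the fields are separated by tabs.
# stringsAsFactors = FALSE: keep text columns as character vectors.
df_file <- read.table(path, header = TRUE, sep = "\t", stringsAsFactors = FALSE)

# Show the column types that read.table inferred.
str(df_file)
```

read.table guesses the type of each column from its contents, so it is worth checking the result with str before going further.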

df is the name you give to your data frame. You can use whatever name you like, and then list the objects in your workspace with the command ls().

> attach(df)
The following objects are masked _by_ .GlobalEnv:
    b, n, s
> names(df)
[1] "n" "s" "b"

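Use attach to make the variables accessible by name (b, n, s). The message above appears because vectors named b, n, and s already exist in the global environment, so those take precedence over the attached columns.
Use names to get a list of the variable names (n, s, b).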
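Because attach can collide with objects that already exist, as the masking message shows, it may be easier to reach the columns without attaching anything. The sketch below rebuilds the same data frame and uses the $ operator and with(); the names mean_n and n_when_b are only for illustration.

```r
# Same data frame as in the example above.
df <- data.frame(n = c(2, 3, 5), s = c("aa", "bb", "cc"), b = c(TRUE, FALSE, TRUE))

# Access a single column with the $ operator.
mean_n <- mean(df$n)

# Evaluate an expression using the column names, without attaching df.
n_when_b <- with(df, n[b])

print(mean_n)
print(n_when_b)
```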
> summary(df)
       n          s         b
 Min.   :2.000   aa:1   Mode :logical
 1st Qu.:2.500   bb:1   FALSE:1
 Median :3.000   cc:1   TRUE :2
 Mean   :3.333          NA's :0
 3rd Qu.:4.000
 Max.   :5.000
Use summary to get descriptive statistics for every column.

OK, let's move on. Now that you have a data frame, you may want to select certain rows and columns, say columns 1 to 3 with all rows.
Use df[,1:3] for that, df[,1] to select column 1 with all rows, and df[1,] to select the first row with all columns.
To select rows 3 to the end, use df[3:nrow(df),] (df[3:5,] for a data frame with five rows).
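The selections described above can be tried on the small data frame from earlier. This is only a sketch: the variable names are illustrative, and the last line adds selection by a logical condition, which the text does not cover.

```r
# Same data frame as in the example above.
df <- data.frame(n = c(2, 3, 5), s = c("aa", "bb", "cc"), b = c(TRUE, FALSE, TRUE))

first_col    <- df[, 1]                # column 1, returned as a plain vector
first_col_df <- df[, 1, drop = FALSE]  # column 1, kept as a data frame
first_row    <- df[1, ]                # first row, all columns
last_rows    <- df[2:nrow(df), ]       # row 2 through the last row

# Rows where n is greater than 2, keeping only the columns n and s.
matching <- df[df$n > 2, c("n", "s")]

print(matching)
```

Using nrow(df) for the upper bound avoids asking for rows that do not exist, which would otherwise come back filled with NA.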
Here is an example of processing a *.CDF file (download file here).

A CDF file has 3 dimensions: time, mass, and ion counts per second.
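One way to picture these three dimensions is as a data frame with one row per measurement. The sketch below does not read a real CDF file; the column names (time, mass, intensity) and all values are made up for illustration.

```r
# Made-up values for illustration only; not read from any real CDF file.
scans <- data.frame(
  time      = c(0.5, 0.5, 1.0, 1.0),    # hypothetical time points
  mass      = c(100, 150, 100, 150),    # hypothetical masses
  intensity = c(1200, 300, 900, 450)    # hypothetical ion counts per second
)

# All measurements for a single mass, across time.
mass_100 <- scans[scans$mass == 100, ]

print(mass_100)
```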

