---
title: "Read data"
author: "Pjotr"
date: "25/02/2020"
output: html_document
---

Set the data directory. Note that you have to use a directory on
your own system! Once it is set correctly, 'Run->all' in the
menu will recompute everything.
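
```{r setup, include=FALSE}
# Directory that holds the CSV files; change this to a path on your own system
data <- "/home/wrk/iwrk/closed/kemri/Francis_Final_TregData_Jan2020/Data/"
setwd(data)
# echo is a chunk option, root.dir is a knit option, so set them separately
knitr::opts_chunk$set(echo = TRUE)
knitr::opts_knit$set(root.dir = data)
```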

```{r}
getwd()
```
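
Because the directory differs per system, it can help to fail early with a clear message when the path is wrong. The following is a minimal sketch, not part of the original notebook; `check_data_dir` is a hypothetical helper name, and the usage line points it at R's temporary directory only so that the example runs anywhere.

```r
# Hypothetical helper (an assumption, not from the original notebook):
# stop with a readable error when the data directory does not exist.
check_data_dir <- function(path) {
  if (!dir.exists(path)) {
    stop("Data directory not found: ", path)
  }
  # Return the absolute, normalized path
  normalizePath(path)
}

# Usage with a directory that exists on every system
checked <- check_data_dir(tempdir())
```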

## Read individuals and attributes

Load a table and show its first three rows and columns:

```{r ind_attr}
# Read the per-individual attributes into a data frame
ind_attr <- read.csv("Individual_attributes.csv")
# Peek at the first three rows and columns
ind_attr[1:3, 1:3]
```

Show the data structure:

```{r}
summary(ind_attr)
```

```{r}
colnames(ind_attr)
```
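
`summary()` and `colnames()` give a first impression. Checking column types and missing values is often a useful next step. The sketch below uses a small made-up data frame; the column names `ID` and `ELISA` are assumptions chosen for illustration, not taken from the real file.

```r
# Made-up example data; the real table has different contents
toy <- data.frame(
  ID = c("A1", "A2", "A3"),
  ELISA = c(0.5, 1.2, NA),
  stringsAsFactors = FALSE
)

# Class of every column
col_types <- sapply(toy, class)
# Number of missing values per column
n_missing <- colSums(is.na(toy))

col_types
n_missing
```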

Show the first three elements of the Phenotype column:

```{r}
# R indexes from 1, so 1:3 selects the first three elements
ind_attr[["Phenotype"]][1:3]
```

or, equivalently, with the `$` operator:

```{r}
ind_attr$Phenotype[1:3]
```
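
Note that R vectors are indexed from 1. An index of 0 is silently dropped, so `x[0:3]` returns the same three elements as `x[1:3]`. A small sketch on a made-up vector illustrates this:

```r
# Made-up vector for illustration
v <- c("a", "b", "c", "d")

# Index 0 selects nothing, so both ranges give the first three elements
zero_based <- v[0:3]
one_based <- v[1:3]
same_result <- identical(zero_based, one_based)

# Indexing with 0 alone gives an empty vector
empty <- v[0]
```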

Let's do a simple plot of the ELISA values against the row index, i.e. one point per individual:

```{r}
plot(ind_attr$ELISA)
```

Next, let's plot ELISA against time to diagnosis:

```{r}
# Formula interface: y ~ x, with columns looked up in ind_attr
plot(ELISA ~ Time_to_diagnosis, data = ind_attr)
```
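
A scatter plot only suggests a relationship. One way to put a number on it is a rank correlation. The sketch below runs on made-up values, not on the study data, and assumes both columns are numeric; it is an illustration rather than the analysis used in this project.

```r
# Made-up numbers for illustration only
toy <- data.frame(
  Time_to_diagnosis = c(1, 2, 3, 5, 8, 13),
  ELISA = c(0.2, 0.4, 0.3, 0.9, 1.1, 1.6)
)

# Spearman rank correlation between the two columns
res <- cor.test(toy$ELISA, toy$Time_to_diagnosis, method = "spearman")
rho <- unname(res$estimate)
p_value <- res$p.value
```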

So it looks like late diagnosis has an effect on the ELISA values. This is just a quick example; let's continue by loading the remaining data sets from these files:

```
cytokines.csv
final_outcome_jan2020.csv
Individual_attributes.csv
pcr.csv
supernatant.csv
transcriptomics.csv
treg_phenotype_data.csv
```

```{r}
cytokines <- read.csv("cytokines.csv")
final <- read.csv("final_outcome_jan2020.csv")
pcr <- read.csv("pcr.csv")
supernatant <- read.csv("supernatant.csv")
transcriptomics <- read.csv("transcriptomics.csv")
treg <- read.csv("treg_phenotype_data.csv")
```
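
When the number of files grows, the same loading step can be written as a loop over file names. The following is a minimal sketch under the assumption that every `.csv` file in a directory should be read; it creates two small made-up files in a temporary directory so that it runs anywhere, and the names `a.csv` and `b.csv` are purely illustrative.

```r
# Create two small made-up CSV files in a temporary directory
tmp <- tempfile("csv_demo_")
dir.create(tmp)
write.csv(data.frame(id = 1:2, value = c(0.1, 0.2)),
          file.path(tmp, "a.csv"), row.names = FALSE)
write.csv(data.frame(id = 1:3, day = c(0, 7, 14)),
          file.path(tmp, "b.csv"), row.names = FALSE)

# Read every CSV file into a list of data frames
files <- list.files(tmp, pattern = "\\.csv$", full.names = TRUE)
tables <- lapply(files, read.csv)
# Name each list element after its file, without the extension
names(tables) <- tools::file_path_sans_ext(basename(files))

# Number of rows per table
sapply(tables, nrow)
```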

Once they are loaded, you can explore the data in the Environment pane at the top right, or print a few values directly:

```{r}
show(pcr$day[1:3])
```

The output shows that not all rows are labeled. That means we will need a way to cross-reference the tables by ID.
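
One common way to cross-reference two tables in base R is `merge()`. The sketch below uses two made-up data frames; the shared column name `ID` and all values are assumptions for illustration, since the actual key columns of the data sets are not shown here.

```r
# Made-up tables; the key column name "ID" is an assumption
individuals <- data.frame(
  ID = c("P1", "P2", "P3"),
  Phenotype = c("case", "control", "case"),
  stringsAsFactors = FALSE
)
measurements <- data.frame(
  ID = c("P3", "P1", "P4"),
  day = c(0, 7, 14),
  stringsAsFactors = FALSE
)

# Inner join: keep only IDs that occur in both tables
both <- merge(individuals, measurements, by = "ID")

# Left join: keep every individual; unmatched rows get NA for day
all_ind <- merge(individuals, measurements, by = "ID", all.x = TRUE)
```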