RMySQL Tutorial For Beginners
RMySQL is a database interface and MySQL driver for R. This version complies with the database interface definition as implemented in the package DBI.
install.packages("RMySQL"); library(RMySQL)
library(DBI)
Connecting to MySQL:
Once the RMySQL library is installed, we create a database connection object.
mydb = dbConnect(RMySQL::MySQL(), user='user', password='password', dbname='database_name', host='host')
mydb <- dbConnect(MySQL(),user="genome",db="hg19",host="genome-mysql.cse.ucsc.edu")
summary(mydb)
Listing Tables and Fields:
Now that a connection has been made we list the tables and fields in the database we connected to.
dbListTables(mydb)
This will return a list of the tables in our connection.
dbListFields(mydb, 'some_table'); dbListFields(mydb, 'mgcGenes')
This will return a list of the fields in some_table.
Running Queries:
Reading and sub-setting table from the connected database
Genes <- dbReadTable(mydb, " mgcGenes")
subset(Genes, subset = exonCount > 20, select = name2)
dbGetQuery(mydb, "SELECT name2 FROM mgcGenes WHERE exonCount > 20")
SQL Queries can be run using the dbGetQuery function, which gives the same output as subset function. In my experience with this package any SQL query that will run on MySQL will run using this method.
dbGetQuery(mydb, "SELECT * FROM mgcGenes WHERE exonCount > 20")
* - Select all the columns from mgcGenes table from mydb database
Retrieving data from MySQL:
Using dbSendQuery, fetch and dbClearResult function:
The following set of function gives the same result as dbGetQuery. The only difference is dbSendQuery and fetch function is useful when we are reading large set of table or data.
To retrieve data from the database we need to save a results set object.
rs = dbSendQuery(mydb, "SELECT name2 FROM mgcGenes WHERE exonCount > 20")
I believe that the results of this query remain on the MySQL server, to access the results in R we need to use the fetch function.
dbFetch(rs)
dbClearResult(rs)
Using chunk (fetch “n” option) to read table in chunk. Helpful to read large data set
Disconnecting the connection with the database
dbDisconnect(mydb)
Thanks! This was really helpful!