Inserting Pandas DataFrames Into Database using INSERT

When working with data in Python, we make use of pandas, and we’ve often got our data stored as a pandas DataFrame. Thankfully, we don’t need to do any conversions if we want to use SQL with our DataFrames; we can directly insert a pandas DataFrame into a MySQL database using INSERT.

Below are steps for the same :

Step 1 : We can here create our own DataFrame using a dictionary. We could also import data from a CSV file or create a DataFrame in any number of other ways, but for the purposes of this example, we’re just going to import the movies data as shown :

import pandas as pd

movies = pd.read_csv('http://bit.ly/imdbratings')  ## Reading a Dataframe

movies.head()     
No alt text provided for this image

Step 2: Create a table in our MySQL database.

Before inserting data into MySQL, we’re going to to create a movies table in MySQL to hold our data. If such a table already existed, we could skip this step.

Note: A table is created by using the CREATE TABLE statement, followed by table name and then list each column along with its datatype. For eg.

                       CREATE TABLE movies_details(
                              star_rating float(2,1),
                              title varchar2(100),
                              content_rating varchar2(10),
                              ........);
                              

Step 3: We can once again create a connection to the database from Python using pymysql(as discussed in previous article).

import pymysql

# Connect to the database
connection = pymysql.connect(host='localhost',
                         user='username',
                         password='*****',
                         db='movies')

# create cursor
cursor=connection.cursor()

Step 4: Next, we’ll create a column list and insert our dataframe rows one by one into the database by iterating through each row and using INSERT INTO to insert that row’s values into the database.

# creating column list for insertion
cols = "`,`".join([str(i) for i in data.columns.tolist()])

# Insert DataFrame recrds one by one.
for i,row in data.iterrows():
    sql = "INSERT INTO `movies_details` (`" +cols + "`) VALUES (" +      "%s,"*(len(row)-1) + "%s)"
    cursor.execute(sql, tuple(row))

# the connection is not autocommitted by default, so we must commit to save our 

# changes   

   connection.commit()

Step 5: Again, let’s query the database to make sure that our inserted data has been saved correctly.

# Execute query
sql = "SELECT * FROM `movies_details`"
cursor.execute(sql)

# Fetch all the records
result = cursor.fetchall()
for i in result:

    print(i)

Output :

(9.3, 'The Shawshank Redemption','R', 'Crime', 142, '[u'Tim Robbins', u'Morgan Freeman', u'Bob Gunt...')

(9.2,'The Godfather', 'R','Crime',175, '[u'Marlon Brando', u'Al Pacino', u'James Caan']'
(9.1,'The Godfather: Part II' , 'R', 'Crime',200, '[u'Al Pacino', u'Robert De Niro', u'Robert Duv...')
.......

Once we’re satisfied that everything looks right, we can close the connection.

connection.close()

In the next article, we will learn about to_sql() method to insert Pandas DataFrames into the MySQL database.


To view or add a comment, sign in

More articles by Kumar Brar

  • Part 5 -Bitwise AND operation to mask useless area in image

    In continuation to our lane line detection program, our next step is to mask out the entire region which is not useful.…

  • Part 4 - Region of Interest

    Till now, we have learnt about specific areas of Computer Vision related to Gaussian Blur and Canny images. In this…

  • Part 3-Canny function for Edge Detection

    In the previous article, we worked on getting a Gaussian blurred image( by reducing noise in the image) from our…

  • Part 2 -Remove noise in an image using Gaussian Blur

    In this article, we will discuss and understand the importance of noise reduction and how it is going to be useful for…

    2 Comments
  • Part 1-Lane detection using OpenCV

    I am learning about OpenCV and how it is used for detecting lanes by cars or more specifically, the futuristic…

  • Understand Conda and Pip

    Conda and pip are too identical yet are too different. Although, some of the functionality of these two tools overlap…

  • Definitely, we can minimize road-accidents.

    In my previous article, we were discussing about the road accidents, their statistics, self-driving cars and their…

  • Can we minimize road accidents?

    The image here is a bit funny, as the driver is laughing standing near a toppled truck. This is a usual sight in my…

  • Inserting Pandas DataFrames into a Database Using to_sql() Function

    In this article, we are going to understand how we can make use of the module sqlalchemy to create our connection and…

  • Inserting Records Into a Database From Python

    One of the basic roles of a data analyst/scientist is to extract patterns and insights from raw data. Since much of the…

Others also viewed

Explore content categories