How to connect ChatGPT to a SQL database for data retrieval and analysis

How to connect ChatGPT to a SQL database for data retrieval and analysis

What is chatGPT?
How chatGPT works
How to use ChatGPT in Data Analysis

What is chatGPT?

ChatGPT is an advanced language model that can be used to generate tasks. It can also be used to understand tasks. ChatGPT can be used to write contents, emails, and articles, even to the extent of writing and explaining codes. ChatGPT can be used to generate data, unit codes, and train machine learning modules.

How chatGPT works

To be able to work with chatGPT, head over to ChatGPT and sign up if you haven’t already. If you have signed up, all you need to do is log in


Once you click on Sign Up or Log in, it brings you the sign-up or log-in page where you will have to sign in with your Email, Apple or Microsoft account


After sign-in, you will get to see the next page where you have to input your prompt and start getting responses from ChatGPT

One thing to know is that once you prompt chatGPT, it can always refer back to the question you asked it previously, provided you are still on the same page.

ChatGPT for Data Analysis

Now let’s use ChatGPT for data analysis.
First, we will go to w3schools to get data sets. Click on the try it yourself button. it will bring you to the image below.

Change the customers to products, then click on the “Run SQL” button

Copy some entries from the table, including the headers,

Go to chatGPT and paste them.
Instruct it that this is the product table. It is going to bring out the results of the interpretation of the table.


From here, we can ask it any question of our choice.
First, let’s ask about the product with the highest price.


If you go over to the table, you will find out that the highest price is 40 for Northwood Cranberry Sauce. Indeed, it is correct.

Ask for the lowest price.

You can also prompt ChatGPT for the average of the products. It will not only give you the average, it will also show you how to calculate it.


Lets go ahead and solve this using SQL.

SELECT AVG(Price) AS AveragePrice FROM products;

Let prompt chatGPT write the SQL query for the product with the highest price

Prompt ChatGPT to get the product with the lowest price

Tell it to calculate the average


Now it’s time to make things a little bit harder. So we have to make ChatGPT join two tables together. So go back to w3schools to get the OrderDetails table

Then copy the values you get there and paste them in ChatGPT. You will get something like this below:


Also, do same for orders table and then suppliers table


Finally, you do the suppliers table


Continue by asking ChatGPT to write SQL query to calculate the averageproduct price per supplier. The code below is useful:

SELECT s.SupplierName, AVG(p.Price) AS AvgPrice
FROM Suppliers s
JOIN Products p ON s.SupplierID = p.SupplierID
GROUP BY s.SupplierName
ORDER BY AvgPrice DESC;

You can see that the query above first joins the “Suppliers” and “Products” tables on the SupplierID field. It then groups the results by SupplierName and calculates the average price for each group.

Finally, it orders the results by AvgPrice in descending order.

Replace the table and column names if they are different in your database.

Next, write an SQL query that gets the products that achieved the highest revenue

SELECT p.ProductName, SUM(od.Quantity * p.Price) AS Revenue
FROM Products p
JOIN OrderDetails od ON p.ProductID = od.ProductID
GROUP BY p.ProductName
ORDER BY Revenue DESC;

This query first joins the “Products” and “OrderDetails” tables on the ProductID field. It then groups the results by ProductName and calculates the total revenue for each product (which is the sum of the product of Quantity and Price for each order detail).

Finally, it orders the results by Revenue in descending order.

Replace the table and column names if they are different in your database. If you want to limit the result to the top product, you can add a LIMIT 1 clause at the end of the query.

Next, ChatGPT is going to write an SQL query that gets the employee that made the highest sales from the table above

SELECT e.EmployeeID, SUM(od.Quantity * p.Price) AS Sales
FROM Employees e
JOIN Orders o ON e.EmployeeID = o.EmployeeID
JOIN OrderDetails od ON o.OrderID = od.OrderID
JOIN Products p ON od.ProductID = p.ProductID
GROUP BY e.EmployeeID
ORDER BY Sales DESC
LIMIT 1;

Finally, you can prompt ChatGPT to use window functions and subqueries by asking it about employees that made the second highest sales. Check out the code below:

SELECT EmployeeID, Sales
FROM (
SELECT e.EmployeeID, SUM(od.Quantity * p.Price) AS Sales,
DENSE_RANK() OVER (ORDER BY SUM(od.Quantity * p.Price) DESC) as SalesRank
FROM Employees e
JOIN Orders o ON e.EmployeeID = o.EmployeeID
JOIN OrderDetails od ON o.OrderID = od.OrderID
JOIN Products p ON od.ProductID = p.ProductID
GROUP BY e.EmployeeID
) AS EmployeeSales
WHERE SalesRank = 2;

first of all, Sure, to find the employee that made the second highest sales, you can use a subquery to rank employees by sales and then select the one with the second highest sales. Here is an SQL query that gets the employee that achieved the second highest sales:

SELECT EmployeeID, Sales
FROM (
SELECT e.EmployeeID, SUM(od.Quantity * p.Price) AS Sales,
DENSE_RANK() OVER (ORDER BY SUM(od.Quantity * p.Price) DESC) as SalesRank
FROM Employees e
JOIN Orders o ON e.EmployeeID = o.EmployeeID
JOIN OrderDetails od ON o.OrderID = od.OrderID
JOIN Products p ON od.ProductID = p.ProductID
GROUP BY e.EmployeeID
) AS EmployeeSales
WHERE SalesRank = 2;

You can see that this query first joins the “Employees”, “Orders”, “OrderDetails”, and “Products” tables on the respective ID fields.

It then groups the results by EmployeeID and calculates the total sales for each employee (which is the sum of the product of Quantity and Price for each order detail).

The DENSE_RANK() function is used to rank the employees by their sales in descending order.

Finally, the outer query selects the employee with a SalesRank of 2, which represents the second highest sales.

Do well to replace the table and column names if they are different in your database.

If you want to get the employee name instead of the ID, you can replace e.EmployeeID with e.EmployeeName in the SELECT clause, assuming EmployeeName is the correct column name in your “Employees” table.

In conclusion, ChatGPT makes work easier and faster. But don’t rely too much on it. You can further instruct it to do more for you if you are not satisfied.

Leave a Reply

Your email address will not be published. Required fields are marked *