by Lazy Sloth
AI Query Generator Slack Bot for BigQuery
import re
import slack_sdk
from slackeventsapi import SlackEventAdapter
from flask import Flask, request
from typing import List
from abilities import llm_prompt
from google.cloud import bigquery
import logging
from prettytable import PrettyTable
import os
logger = logging.getLogger(__name__)
logging.basicConfig(level=logging.WARNING)
PROJECT_ID = os.environ['PROJECT_ID']
DATASET_ID = os.environ['DATASET_ID']
TABLE_ID = os.environ['TABLE_ID']
SLACK_BOT_TOKEN = os.environ['SLACK_BOT_TOKEN']
processed_messages = []
def generate_query(
question: str, schema: str, event_types: str, table_info: str, previous_query: str = None, errors: List[str] = None
) -> str:
Frequently Asked Questions
How can this AI Query Generator Slack Bot improve data accessibility in an organization?
The AI Query Generator Slack Bot for BigQuery significantly enhances data accessibility by allowing non-technical team members to easily query complex datasets. Users can simply ask questions in natural language within Slack, and the bot leverages ChatGPT to generate appropriate SQL queries. This democratizes data access, enabling employees across various departments to gain insights without needing to learn SQL or rely heavily on the data team.
What are some potential use cases for this Slack bot in a business setting?
The AI Query Generator Slack Bot has numerous applications in a business setting: - Sales teams can quickly query customer data for insights on buying patterns - Marketing teams can analyze campaign performance metrics on-the-fly - Product managers can access user engagement data without waiting for analyst reports - Executive teams can get real-time updates on key performance indicators - Customer support can retrieve relevant user information during support interactions
By providing instant access to data insights, this bot can significantly speed up decision-making processes across the organization.
How does this bot ensure data security and prevent unauthorized access?
The AI Query Generator Slack Bot incorporates several security measures: - It uses environment variables to store sensitive information like API keys and credentials - The bot only responds to direct mentions, reducing the risk of accidental data exposure - It operates within the Slack workspace, leveraging Slack's built-in security features - The bot uses Google Cloud's service account authentication for BigQuery access
However, it's crucial to implement additional access controls and data governance policies to ensure that users can only query data they're authorized to access.
How can I modify the bot to support multiple BigQuery tables?
To support multiple BigQuery tables, you can modify the handle_message
function to accept a table name as part of the query command. Here's an example of how you might adjust the code:
python
if message.lower().startswith("query"):
parts = message.split(" ", 2)
if len(parts) < 3:
response = "Invalid query format. Use: query <table_name> <question>"
else:
_, table_name, question = parts
table_id = get_table_id(table_name) # You'd need to implement this function
if table_id:
schema = get_table_schema(table_id)
table_info = f"Project ID: {PROJECT_ID}, Dataset ID: {DATASET_ID}, Table ID: {table_id}"
event_types = get_event_types(table_id)
query = generate_query(question, schema, event_types, table_info)
# ... rest of the query execution code ...
else:
response = f"Table '{table_name}' not found."
This modification allows users to specify which table they want to query, making the bot more versatile for organizations with multiple datasets.
Can the AI Query Generator Slack Bot handle complex joins or subqueries?
Yes, the AI Query Generator Slack Bot can handle complex joins and subqueries, thanks to the power of ChatGPT. The generate_query
function provides the LLM with the table schema and other relevant information, enabling it to create sophisticated queries. However, to improve its capabilities for very complex queries, you might want to enhance the prompt. Here's an example of how you could modify the generate_query
function:
```python def generate_query(question, schema, event_types, table_info, previous_query=None, errors=None): prompt = f"""You are a senior engineer tasked to generate a valid SQL query for BigQuery about an events dataset based on the question from a person that knows less than you about the data. Here's the table schema: {schema}, and the table info: {table_info}. These are the only event types that exist: {event_types}. Remember: - TIMESTAMP_SUB does not support YEAR so use 365 DAYS instead - The table always needs some time period in the where clause due to partitioning - Only if you expect the query to output lots of rows limit to 100 by default - You can use complex joins and subqueries if necessary to answer the question accurately - Optimize the query for performance when possible
Here's the question from the person: {question}.
"""
# ... rest of the function remains the same ...
```
This enhanced prompt explicitly tells the model that it can use complex joins and subqueries, and also asks it to optimize for performance, which should result in more sophisticated and efficient queries when needed.
Created: | Last Updated:
Introduction to the AI Query Generator Slack Bot for BigQuery Template
Welcome to the AI Query Generator Slack Bot for BigQuery template! This template is designed to help you create a Slack bot that interacts with users, allowing them to ask questions about data in a BigQuery table or request the table schema. The bot then uses the latest ChatGPT to generate a SQL query, which is executed on BigQuery to return the results. This step-by-step guide will walk you through the process of setting up and using this template on the Lazy platform.
Getting Started
To begin using this template, click on "Start with this Template" on the Lazy platform. This will pre-populate the code in the Lazy Builder interface, so you won't need to copy, paste, or delete any code manually.
Initial Setup: Adding Environment Secrets
Before testing the app, you'll need to set up some environment secrets within the Lazy Builder. These are necessary for the bot to interact with Google BigQuery and Slack. Here's what you need to do:
- Navigate to the Environment Secrets tab within the Lazy Builder.
- Add the following secrets with their respective values:
PROJECT_ID
: Your Google Cloud project ID.DATASET_ID
: The dataset ID in BigQuery where your table is located.TABLE_ID
: The table ID in BigQuery that you want to query.SLACK_BOT_TOKEN
: Your Slack bot token, which you can obtain from your Slack app settings.SLACK_SIGNING_SECRET
: Your Slack app's signing secret, also available in your Slack app settings.
- Ensure that you have the correct permissions and roles set up in Google Cloud to access BigQuery and perform queries.
Test: Pressing the Test Button
Once you have set up the environment secrets, press the "Test" button on the Lazy platform. This will begin the deployment of the app and launch the Lazy CLI. If the code requires any user input, you will be prompted to provide it through the Lazy CLI.
Entering Input
If the template requires user input, you will be prompted for it after pressing the "Test" button. Follow the instructions in the Lazy CLI to provide the necessary input.
Using the App
After the app is deployed, you can interact with the Slack bot by sending messages directly in Slack. The bot will respond to queries about the data or schema requests in a thread to the original message. If the bot generates a query, it will execute it on BigQuery and return the results in a formatted table.
Integrating the App
If you need to integrate this Slack bot with other tools or services, you may need to provide the Slack bot's server link, which will be printed by the Lazy builder CLI after deployment. For example, you might add this link to your Slack app's event subscriptions to receive messages from Slack.
Remember, this bot is designed to work within the Slack environment, so you'll need to ensure that your Slack workspace is set up to interact with the bot. This includes installing the bot to your workspace and inviting it to the channels where you want it to be active.
That's it! You're now ready to use the AI Query Generator Slack Bot for BigQuery. If you encounter any issues or have questions, refer to the documentation links provided in the code or reach out to Lazy's customer support for assistance.
Template Benefits
-
Data Democratization: This Slack bot empowers non-technical team members to access and analyze BigQuery data without requiring SQL expertise, promoting data-driven decision making across the organization.
-
Increased Productivity: By automating the query generation process, the bot significantly reduces the time and effort required to extract insights from complex datasets, allowing employees to focus on interpreting results rather than writing queries.
-
Improved Data Accessibility: The integration with Slack makes data querying seamless within existing communication channels, eliminating the need for separate tools and increasing the likelihood of data utilization in day-to-day operations.
-
Enhanced Data Governance: By centralizing data access through a controlled bot interface, organizations can better manage data access permissions and maintain an audit trail of queries, ensuring compliance with data protection regulations.
-
Scalable Business Intelligence: The AI-powered query generation allows for more complex and nuanced data analysis, enabling businesses to uncover deeper insights and patterns that might be missed with traditional BI tools or basic SQL queries.