AI Query Generator Slack Bot for BigQuery

Test this app for free
160
import re
import slack_sdk
from slackeventsapi import SlackEventAdapter
from flask import Flask, request
from typing import List
from abilities import llm_prompt
from google.cloud import bigquery
import logging
from prettytable import PrettyTable
import os

logger = logging.getLogger(__name__)
logging.basicConfig(level=logging.WARNING)

PROJECT_ID = os.environ['PROJECT_ID']
DATASET_ID = os.environ['DATASET_ID']
TABLE_ID = os.environ['TABLE_ID']
SLACK_BOT_TOKEN = os.environ['SLACK_BOT_TOKEN']

processed_messages = []

def generate_query(
    question: str, schema: str, event_types: str, table_info: str,  previous_query: str = None, errors: List[str] = None
) -> str:
Get full code

Frequently Asked Questions

How can this AI Query Generator Slack Bot improve data accessibility in an organization?

The AI Query Generator Slack Bot for BigQuery significantly enhances data accessibility by allowing non-technical team members to easily query complex datasets. Users can simply ask questions in natural language within Slack, and the bot leverages ChatGPT to generate appropriate SQL queries. This democratizes data access, enabling employees across various departments to gain insights without needing to learn SQL or rely heavily on the data team.

What are some potential use cases for this Slack bot in a business setting?

The AI Query Generator Slack Bot has numerous applications in a business setting: - Sales teams can quickly query customer data for insights on buying patterns - Marketing teams can analyze campaign performance metrics on-the-fly - Product managers can access user engagement data without waiting for analyst reports - Executive teams can get real-time updates on key performance indicators - Customer support can retrieve relevant user information during support interactions

By providing instant access to data insights, this bot can significantly speed up decision-making processes across the organization.

How does this bot ensure data security and prevent unauthorized access?

The AI Query Generator Slack Bot incorporates several security measures: - It uses environment variables to store sensitive information like API keys and credentials - The bot only responds to direct mentions, reducing the risk of accidental data exposure - It operates within the Slack workspace, leveraging Slack's built-in security features - The bot uses Google Cloud's service account authentication for BigQuery access

However, it's crucial to implement additional access controls and data governance policies to ensure that users can only query data they're authorized to access.

How can I modify the bot to support multiple BigQuery tables?

To support multiple BigQuery tables, you can modify the handle_message function to accept a table name as part of the query command. Here's an example of how you might adjust the code:

python if message.lower().startswith("query"): parts = message.split(" ", 2) if len(parts) < 3: response = "Invalid query format. Use: query <table_name> <question>" else: _, table_name, question = parts table_id = get_table_id(table_name) # You'd need to implement this function if table_id: schema = get_table_schema(table_id) table_info = f"Project ID: {PROJECT_ID}, Dataset ID: {DATASET_ID}, Table ID: {table_id}" event_types = get_event_types(table_id) query = generate_query(question, schema, event_types, table_info) # ... rest of the query execution code ... else: response = f"Table '{table_name}' not found."

This modification allows users to specify which table they want to query, making the bot more versatile for organizations with multiple datasets.

Can the AI Query Generator Slack Bot handle complex joins or subqueries?

Yes, the AI Query Generator Slack Bot can handle complex joins and subqueries, thanks to the power of ChatGPT. The generate_query function provides the LLM with the table schema and other relevant information, enabling it to create sophisticated queries. However, to improve its capabilities for very complex queries, you might want to enhance the prompt. Here's an example of how you could modify the generate_query function:

```python def generate_query(question, schema, event_types, table_info, previous_query=None, errors=None): prompt = f"""You are a senior engineer tasked to generate a valid SQL query for BigQuery about an events dataset based on the question from a person that knows less than you about the data. Here's the table schema: {schema}, and the table info: {table_info}. These are the only event types that exist: {event_types}. Remember: - TIMESTAMP_SUB does not support YEAR so use 365 DAYS instead - The table always needs some time period in the where clause due to partitioning - Only if you expect the query to output lots of rows limit to 100 by default - You can use complex joins and subqueries if necessary to answer the question accurately - Optimize the query for performance when possible

   Here's the question from the person: {question}. 
   """
   # ... rest of the function remains the same ...

```

This enhanced prompt explicitly tells the model that it can use complex joins and subqueries, and also asks it to optimize for performance, which should result in more sophisticated and efficient queries when needed.

Created: | Last Updated:

This app allows users to interact with a Slack bot, ask a question about the data in a table or request the table schema, and then uses the latest ChatGPT to generate a query that is executed on BigQuery to return the results. The app includes a retry mechanism for query generation in case of an error (up to two retries) and provides the LLM with the table info to generate more accurate queries. The table schema is only printed if it is successfully retrieved. All errors from retries are now passed to the LLM. The generated query is printed before the results, and the results are displayed in a pretty table format. The bot uses the Slack API to send and receive messages and parses the user's message to determine the action to take. The bot always responds in a thread to the original message.

Introduction to the AI Query Generator Slack Bot for BigQuery Template

Welcome to the AI Query Generator Slack Bot for BigQuery template! This template is designed to help you create a Slack bot that interacts with users, allowing them to ask questions about data in a BigQuery table or request the table schema. The bot then uses the latest ChatGPT to generate a SQL query, which is executed on BigQuery to return the results. This step-by-step guide will walk you through the process of setting up and using this template on the Lazy platform.

Getting Started

To begin using this template, click on "Start with this Template" on the Lazy platform. This will pre-populate the code in the Lazy Builder interface, so you won't need to copy, paste, or delete any code manually.

Initial Setup: Adding Environment Secrets

Before testing the app, you'll need to set up some environment secrets within the Lazy Builder. These are necessary for the bot to interact with Google BigQuery and Slack. Here's what you need to do:

  • Navigate to the Environment Secrets tab within the Lazy Builder.
  • Add the following secrets with their respective values:
    • PROJECT_ID: Your Google Cloud project ID.
    • DATASET_ID: The dataset ID in BigQuery where your table is located.
    • TABLE_ID: The table ID in BigQuery that you want to query.
    • SLACK_BOT_TOKEN: Your Slack bot token, which you can obtain from your Slack app settings.
    • SLACK_SIGNING_SECRET: Your Slack app's signing secret, also available in your Slack app settings.
  • Ensure that you have the correct permissions and roles set up in Google Cloud to access BigQuery and perform queries.

Test: Pressing the Test Button

Once you have set up the environment secrets, press the "Test" button on the Lazy platform. This will begin the deployment of the app and launch the Lazy CLI. If the code requires any user input, you will be prompted to provide it through the Lazy CLI.

Entering Input

If the template requires user input, you will be prompted for it after pressing the "Test" button. Follow the instructions in the Lazy CLI to provide the necessary input.

Using the App

After the app is deployed, you can interact with the Slack bot by sending messages directly in Slack. The bot will respond to queries about the data or schema requests in a thread to the original message. If the bot generates a query, it will execute it on BigQuery and return the results in a formatted table.

Integrating the App

If you need to integrate this Slack bot with other tools or services, you may need to provide the Slack bot's server link, which will be printed by the Lazy builder CLI after deployment. For example, you might add this link to your Slack app's event subscriptions to receive messages from Slack.

Remember, this bot is designed to work within the Slack environment, so you'll need to ensure that your Slack workspace is set up to interact with the bot. This includes installing the bot to your workspace and inviting it to the channels where you want it to be active.

That's it! You're now ready to use the AI Query Generator Slack Bot for BigQuery. If you encounter any issues or have questions, refer to the documentation links provided in the code or reach out to Lazy's customer support for assistance.



Template Benefits

  1. Data Democratization: This Slack bot empowers non-technical team members to access and analyze BigQuery data without requiring SQL expertise, promoting data-driven decision making across the organization.

  2. Increased Productivity: By automating the query generation process, the bot significantly reduces the time and effort required to extract insights from complex datasets, allowing employees to focus on interpreting results rather than writing queries.

  3. Improved Data Accessibility: The integration with Slack makes data querying seamless within existing communication channels, eliminating the need for separate tools and increasing the likelihood of data utilization in day-to-day operations.

  4. Enhanced Data Governance: By centralizing data access through a controlled bot interface, organizations can better manage data access permissions and maintain an audit trail of queries, ensuring compliance with data protection regulations.

  5. Scalable Business Intelligence: The AI-powered query generation allows for more complex and nuanced data analysis, enabling businesses to uncover deeper insights and patterns that might be missed with traditional BI tools or basic SQL queries.

Technologies

Maximize OpenAI Potential with Lazy AI: Automate Integrations, Enhance Customer Support and More  Maximize OpenAI Potential with Lazy AI: Automate Integrations, Enhance Customer Support and More
Streamline Slack Workflows with Lazy AI: Automate Notifications, API Integrations and More  Streamline Slack Workflows with Lazy AI: Automate Notifications, API Integrations and More
Python App Templates for Scraping, Machine Learning, Data Science and More Python App Templates for Scraping, Machine Learning, Data Science and More

Similar templates

FastAPI endpoint for Text Classification using OpenAI GPT 4

This API will classify incoming text items into categories using the Open AI's GPT 4 model. If the model is unsure about the category of a text item, it will respond with an empty string. The categories are parameters that the API endpoint accepts. The GPT 4 model will classify the items on its own with a prompt like this: "Classify the following item {item} into one of these categories {categories}". There is no maximum number of categories a text item can belong to in the multiple categories classification. The API will use the llm_prompt ability to ask the LLM to classify the item and respond with the category. The API will take the LLM's response as is and will not handle situations where the model identifies multiple categories for a text item in the single category classification. If the model is unsure about the category of a text item in the multiple categories classification, it will respond with an empty string for that item. The API will use Python's concurrent.futures module to parallelize the classification of text items. The API will handle timeouts and exceptions by leaving the items unclassified. The API will parse the LLM's response for the multiple categories classification and match it to the list of categories provided in the API parameters. The API will convert the LLM's response and the categories to lowercase before matching them. The API will split the LLM's response on both ':' and ',' to remove the "Category" word from the response. The temperature of the GPT model is set to a minimal value to make the output more deterministic. The API will return all matching categories for a text item in the multiple categories classification. The API will strip any leading or trailing whitespace from the categories in the LLM's response before matching them to the list of categories provided in the API parameters. The API will accept lists as answers from the LLM. If the LLM responds with a string that's formatted like a list, the API will parse it and match it to the list of categories provided in the API parameters.

Icon 1 Icon 1
196

We found some blogs you might like...