Close Menu

    Subscribe to Updates

    Get the latest creative news from FooBar about art, design and business.

    What's Hot

    Thailand Transforms Its Travel Landscape In 2025 With A Powerful Mix Of Extended Visa-Free Access, Smart Immigration Tech, And Investor-Friendly Policies

    June 9, 2025

    Elderly and Children with Adenovirus History at Increased Risk of COVID-19 Complications, ET HealthWorld

    June 9, 2025

    Meta becomes the latest big tech company turning to nuclear power for AI needs

    June 9, 2025
    Facebook X (Twitter) Instagram
    • Demos
    • Buy Now
    Facebook X (Twitter) Instagram YouTube
    14 Trends14 Trends
    Demo
    • Home
    • Features
      • View All On Demos
    • Buy Now
    14 Trends14 Trends
    Home » Build a Text-to-SQL solution for data consistency in generative AI using Amazon Nova
    AI AWS

    Build a Text-to-SQL solution for data consistency in generative AI using Amazon Nova

    adminBy adminJune 9, 2025No Comments13 Mins Read0 Views
    Facebook Twitter Pinterest LinkedIn Telegram Tumblr Email
    Share
    Facebook Twitter LinkedIn Pinterest Email


    Businesses rely on precise, real-time insights to make critical decisions. However, enabling non-technical users to access proprietary or organizational data without technical expertise remains a challenge. Text-to-SQL bridges this gap by generating precise, schema-specific queries that empower faster decision-making and foster a data-driven culture. The problem lies in obtaining deterministic answers—precise, consistent results needed for operations such as generating exact counts or detailed reports—from proprietary or organizational data. Generative AI offers several approaches to query data, but selecting the right method is critical to achieve accuracy and reliability.

    This post evaluates the key options for querying data using generative AI, discusses their strengths and limitations, and demonstrates why Text-to-SQL is the best choice for deterministic, schema-specific tasks. We show how to effectively use Text-to-SQL using Amazon Nova, a foundation model (FM) available in Amazon Bedrock, to derive precise and reliable answers from your data.

    Options for querying data

    Organizations have multiple options for querying data, and the choice depends on the nature of the data and the required outcomes. This section evaluates the following approaches to provide clarity on when to use each and why Text-to-SQL is optimal for deterministic, schema-based tasks:

    • Retrieval Augmented Generation (RAG):
      • Use case – Ideal for extracting insights from unstructured or semi-structured sources like documents or articles.
      • Strengths – Handles diverse data formats and provides narrative-style responses.
      • Limitations – Probabilistic answers can vary, making it unsuitable for deterministic queries, such as retrieving exact counts or matching specific schema constraints.
      • Example – “Summarize feedback from product reviews.”
    • Generative business intelligence (BI):
      • Use case – Suitable for high-level insights and summary generation based on structured and unstructured data.
      • Strengths – Delivers narrative insights for decision-making and trends.
      • Limitations – Lacks the precision required for schema-specific or operational queries. Results often vary in phrasing and focus.
      • Example – “What were the key drivers of sales growth last quarter?”
    • Text-to-SQL:
      • Use case – Excels in querying structured organizational data directly from relational schemas.
      • Strengths – Provides deterministic, reproducible results for specific, schema-dependent queries. Ideal for precise operations such as filtering, counting, or aggregating data.
      • Limitations – Requires structured data and predefined schemas.
      • Example – “How many patients diagnosed with diabetes visited clinics in New York City last month?”

    In scenarios demanding precision and consistency, Text-to-SQL outshines RAG and generative BI by delivering accurate, schema-driven results. These characteristics make it the ideal solution for operational and structured data queries.

    Solution overview

    This solution uses the Amazon Nova Lite and Amazon Nova Pro large language models (LLMs) to simplify querying proprietary data with natural language, making it accessible to non-technical users.

    Amazon Bedrock is a fully managed service that simplifies building and scaling generative AI applications by providing access to leading FMs through a single API. It allows developers to experiment with and customize these models securely and privately, integrating generative AI capabilities into their applications without managing infrastructure.

    Within this system, Amazon Nova represents a new generation of FMs delivering advanced intelligence and industry-leading price-performance. These models, including Amazon Nova Lite and Amazon Nova Pro, are designed to handle various tasks such as text, image, and video understanding, making them versatile tools for diverse applications.

    You can find the deployment code and detailed instructions in our GitHub repo.

    The solution consists of the following key features:

    • Dynamic schema context – Retrieves the database schema dynamically for precise query generation
    • SQL query generation – Converts natural language into SQL queries using the Amazon Nova Pro LLM
    • Query execution – Runs queries on organizational databases and retrieves results
    • Formatted responses – Processes raw query results into user-friendly formats using the Amazon Nova Lite LLM

    The following diagram illustrates the solution architecture.

    Data flow between user, Streamlit app, Amazon Bedrock, and Microsoft SQL Server, illustrating query processing and response generation

    In this solution, we use Amazon Nova Pro and Amazon Nova Lite to take advantage of their respective strengths, facilitating efficient and effective processing at each stage:

    • Dynamic schema retrieval and SQL query generation – We use Amazon Nova Pro to handle the translation of natural language inputs into SQL queries. Its advanced capabilities in complex reasoning and understanding make it well-suited for accurately interpreting user intents and generating precise SQL statements.
    • Formatted response generation – After we run the SQL queries, the raw results are processed using Amazon Nova Lite. This model efficiently formats the data into user-friendly outputs, making the information accessible to non-technical users. Its speed and cost-effectiveness are advantageous for this stage, where rapid processing and straightforward presentation are key.

    By strategically deploying Amazon Nova Pro and Amazon Nova Lite in this manner, the solution makes sure that each component operates optimally, balancing performance, accuracy, and cost-effectiveness.

    Prerequisites

    Complete the following prerequisite steps:

    1. Install the AWS Command Line Interface (AWS CLI). For instructions, refer to Installing or updating to the latest version of the AWS CLI.
    2. Configure the basic settings that the AWS CLI uses to interact with AWS. For more information, see Configuration and credential file settings in the AWS CLI.
    3. Make sure Amazon Bedrock is enabled in your AWS account.
    4. Obtain access to Amazon Nova Lite and Amazon Nova Pro.
    5. Install Python 3.9 or later, along with required libraries (Streamlit version 1.8.0 or later, Boto3, pymssql, and environment management packages).
    6. Create a Microsoft SQL Server (version 2016 or later) database with credentials to connect.
      1. Create a secret in AWS Secrets Manager for database credentials and name it mssql_secrets. For instructions, see Create an AWS Secrets Manager secret.

    Our sample code uses a Microsoft SQL Server database, but this solution supports the following services:

    For more information about prerequisites, refer to the GitHub repo.

    Set up the development environment

    In the command prompt, navigate to the folder where the code exists and run the following command:

    python3.9 -m pip install -r requirements.txt --upgrade

    This command installs the required libraries to run the application.

    Load the sample dataset in the database

    Make sure you have created a secret in Secrets Manager named mssql_secrets as mentioned in the prerequisites. If you named your secret something else, update the code in app.py (line 29) and load_data.py (line 22).

    After you create the secret, run the following command from the code folder:

    This command creates a database named Sales with tables Products, Customers, and Orders and loads the sample data in these tables.

    Run the application

    To run the application, execute the following command:

    Example queries

    In this section, we explore some sample queries.

    For our first query, we ask “Who are the customers who bought smartphones?” This generates the following SQL:

    SELECT DISTINCT CustomerName, ProductName, SUM(Quantity) AS TotalSoldQuantity FROM vw_sales WHERE ProductName LIKE '%smartphone%' GROUP BY CustomerName, ProductName, OrderDate; 

    We get the following formatted response:

    1. Alice Johnson, who bought 1 smartphone on October 14th, 2023.
    2. Ivy Martinez, who bought 2 smartphones on October 15th, 2023.

    Next, we ask “How many smartphones are in stock?” This generates the following SQL:

    SELECT DISTINCT ProductName, StockQuantity AS AvailableQuantity FROM vw_sales WHERE ProductName LIKE '%smartphone%'; 

    We get the response “There are 100 smartphones currently in stock.”

    Sales-bot web interface demonstrating natural language to SQL conversion with Amazon Bedrock, showing smartphone purchase query and results

    Code execution flow

    In this section, we explore the code execution flow. The code reference is from the GitHub repo. Do not run the different parts of the code individually.

    Retrieve schema dynamically

    Use INFORMATION_SCHEMA views to extract schema details dynamically (code reference from app.py):

    def get_schema_context(db_name, db_view_name):
        conn = connect_to_db()
        cursor = conn.cursor()
        cursor.execute(f"USE {db_name}")
        query = f"SELECT COLUMN_NAME, DATA_TYPE FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = '{db_view_name}'"
        cursor.execute(query)
        schema = cursor.fetchall()
        print("Schema:", schema)
        return '\n'.join([f"- {row[0]}: {row[1]}" for row in schema])

    Dynamic schema retrieval adapts automatically to changes by querying metadata tables for updated schema details, such as table names and column types. This facilitates seamless integration of schema updates into the Text-to-SQL system, reducing manual effort and improving scalability.

    Test this function to verify it adapts automatically when schema changes occur.

    Before generating SQL, fetch schema details for the relevant tables to facilitate accurate query construction.

    Generate a SQL query using Amazon Nova Pro

    Send the user query and schema context to Amazon Nova Pro (code reference from sql_generator.py):

    def generate_sql_query(question: str, schema_context: str, db_name: str, db_view_name: str = None) -> str:
        
        nova_client = NovaClient()
           
        # Base prompt with SQL generation rules
        base_prompt = """
        MS SQL DB {db_name} has one view names '{db_view_name}'. 
        Always use '{db_view_name}' as table name to generate your query.
        Create a MS SQL query by carefully understanding the question and generate the query between tags  and .
        The MS SQL query should selects all columns from a view named '{db_view_name}'
        In your SQL query always use like condition in where clasue.
        if a question is asked about product stock then always use 'distinct' in your SQL query.
        Never Generate an SQL query which gives error upon execution.
          
        
        Question: {question}
        
        Database Schema : {schema_context}
        
        Generate SQL query:
        """
        
        # Format the prompt with the question and schema context
        formatted_prompt = base_prompt.format(
            question=question,
            db_name=db_name,
            db_view_name=db_view_name if db_view_name else "No view name provided",
            schema_context=schema_context if schema_context else "No additional context provided"
        )
            
        # Invoke Nova model
        response = nova_client.invoke_model(
            model_id='amazon.nova-pro-v1:0',
            prompt=formatted_prompt,
            temperature=0.1  # Lower temperature for more deterministic SQL generation
        )
        
        # Extract SQL query from response using regex
        sql_match = extract_sql_from_nova_response(response)
        if sql_match:
            return sql_match
        else:
            raise ValueError("No SQL query found in the response")
        
    def extract_sql_from_nova_response(response):
        try:
            # Navigate the nested dictionary structure
            content = response['output']['message']['content']
            # Get the text from the first content item
            text = content[0]['text']
            
            # Find the positions of begin and end tags
            begin_tag = ""
            end_tag = ""
            start_pos = text.find(begin_tag)
            end_pos = text.find(end_tag)
            
            # If both tags are found, extract the SQL between them
            if start_pos != -1 and end_pos != -1:
                # Add length of begin tag to start position to skip the tag itself
                sql_query = text[start_pos + len(begin_tag):end_pos].strip()
                return sql_query
                
            return None
            
        except (KeyError, IndexError):
            # Return None if the expected structure is not found
            return None

    This code establishes a structured context for a text-to-SQL use case, guiding Amazon Nova Pro to generate SQL queries based on a predefined database schema. It provides consistency by defining a static database context that clarifies table names, columns, and relationships, helping prevent ambiguity in query formation. Queries are required to reference the vw_sales view, standardizing data extraction for analytics and reporting. Additionally, whenever applicable, the generated queries must include quantity-related fields, making sure that business users receive key insights on product sales, stock levels, or transactional counts. To enhance search flexibility, the LLM is instructed to use the LIKE operator in WHERE conditions instead of exact matches, allowing for partial matches and accommodating variations in user input. By enforcing these constraints, the code optimizes Text-to-SQL interactions, providing structured, relevant, and business-aligned query generation for sales data analysis.

    Execute a SQL query

    Run the SQL query on the database and capture the result (code reference from app.py):

    cursor.execute(sql_command)
    result = cursor.fetchall()
    print(result)

    Format the query results using Amazon Nova Lite

    Send the database result from the SQL query to Amazon Nova Lite to format it in a human-readable format and print it on the Streamlit UI (code reference from app.py):

    def interact_with_nova(user_input, llm_query, query_response, model="nova"):
        session = boto3.session.Session()
        region = session.region_name
        
        nova_client = NovaClient(region_name=region)
        
        final_prompt = f"""Human: You are a expert chatbot who is happy to assist the users. User questions in given in  tag and results in  tag. Understand the question and use information from  to generate an answer. If there are more than one entery, give a numbered list. Never retrun  and  in your response.
        for example : question - "How many mouse were sold?"
                      llm response : 
                                    " There were 3 mouse sold in total. 
                                    - 1 mouse sold to Mia Perez on October 2nd, 2023. 
                                    - 2 mouse sold to Jack Hernandez on October 1st 2023."
        
        {user_input}
        
        
        {query_response}
        """
        
        try:
            
                response = nova_client.invoke_model(
                    model_id='amazon.nova-lite-v1:0',
                    prompt=final_prompt,
                    max_tokens=4096,
                    temperature=0.7
                )
                
                content = response['output']['message']['content']
                text = content[0]['text']
                return text
                
                return "Sorry, I couldn't process your request."
        
        except Exception as e:
            print(f"Error in LLM interaction: {str(e)}")
            return "Sorry, an error occurred while processing your request."

    Clean up

    Follow these steps to clean up resources in your AWS environment and avoid incurring future costs:

    1. Clean up database resources:
    2. Clean up security resources:
    3. Clean up the frontend (only if hosting the Streamlit application on Amazon EC2):
      • Stop the EC2 instance hosting the Streamlit application.
      • Delete associated storage volumes.
    4. Clean up additional resources (if applicable):
      • Remove Elastic Load Balancers.
      • Delete virtual private cloud (VPC) configurations.
    5. Check the AWS Management Console to confirm all resources have been deleted.

    Conclusion

    Text-to-SQL with Amazon Bedrock and Amazon Nova LLMs provides a scalable solution for deterministic, schema-based querying. By delivering consistent and precise results, it empowers organizations to make informed decisions, improve operational efficiency, and reduce reliance on technical resources.

    For a more comprehensive example of a Text-to-SQL solution built on Amazon Bedrock, explore the GitHub repo Setup Amazon Bedrock Agent for Text-to-SQL Using Amazon Athena with Streamlit. This open source project demonstrates how to use Amazon Bedrock and Amazon Nova LLMs to build a robust Text-to-SQL agent that can generate complex queries, self-correct, and query diverse data sources.

    Start experimenting with Text-to-SQL use cases today by getting started with Amazon Bedrock.


    About the authors

    Mansi Sharma is a Solutions Architect for Amazon Web Services. Mansi is a trusted technical advisor helping enterprise customers architect and implement cloud solutions at scale. She drives customer success through technical leadership, architectural guidance, and innovative problem-solving while working with cutting-edge cloud technologies. Mansi specializes in generative AI application development and serverless technologies.

    Marie Yap is a Principal Solutions Architect for Amazon Web Services.  In this role, she helps various organizations begin their journey to the cloud. She also specializes in analytics and modern data architectures.



    Source link

    Share. Facebook Twitter Pinterest LinkedIn Tumblr Email
    admin
    • Website

    Related Posts

    Multi-account support for Amazon SageMaker HyperPod task governance

    June 8, 2025

    Implement semantic video search using open source large vision models on Amazon SageMaker and Amazon OpenSearch Serverless

    June 7, 2025

    Build a serverless audio summarization solution with Amazon Bedrock and Whisper

    June 7, 2025

    Modernize and migrate on-premises fraud detection machine learning workflows to Amazon SageMaker

    June 6, 2025

    How climate tech startups are building foundation models with Amazon SageMaker HyperPod

    June 5, 2025

    Impel enhances automotive dealership customer experience with fine-tuned LLMs on Amazon SageMaker

    June 4, 2025
    Leave A Reply Cancel Reply

    Demo
    Top Posts

    ChatGPT’s viral Studio Ghibli-style images highlight AI copyright concerns

    March 28, 20254 Views

    Best Cyber Forensics Software in 2025: Top Tools for Windows Forensics and Beyond

    February 28, 20253 Views

    An ex-politician faces at least 20 years in prison in killing of Las Vegas reporter

    October 16, 20243 Views

    Laws, norms, and ethics for AI in health

    May 1, 20252 Views
    Don't Miss

    Thailand Transforms Its Travel Landscape In 2025 With A Powerful Mix Of Extended Visa-Free Access, Smart Immigration Tech, And Investor-Friendly Policies

    June 9, 2025

    Home » Thailand Travel News » Thailand Transforms Its Travel Landscape In 2025 With A…

    Elderly and Children with Adenovirus History at Increased Risk of COVID-19 Complications, ET HealthWorld

    June 9, 2025

    Meta becomes the latest big tech company turning to nuclear power for AI needs

    June 9, 2025

    Build a Text-to-SQL solution for data consistency in generative AI using Amazon Nova

    June 9, 2025
    Stay In Touch
    • Facebook
    • Twitter
    • Pinterest
    • Instagram
    • YouTube
    • Vimeo

    Subscribe to Updates

    Get the latest creative news from SmartMag about art & design.

    Demo
    Top Posts

    ChatGPT’s viral Studio Ghibli-style images highlight AI copyright concerns

    March 28, 20254 Views

    Best Cyber Forensics Software in 2025: Top Tools for Windows Forensics and Beyond

    February 28, 20253 Views

    An ex-politician faces at least 20 years in prison in killing of Las Vegas reporter

    October 16, 20243 Views
    Stay In Touch
    • Facebook
    • YouTube
    • TikTok
    • WhatsApp
    • Twitter
    • Instagram
    Latest Reviews
    Demo
    About Us
    About Us

    Your source for the lifestyle news. This demo is crafted specifically to exhibit the use of the theme as a lifestyle site. Visit our main page for more demos.

    We're accepting new partnerships right now.

    Email Us: info@example.com
    Contact: +1-320-0123-451

    Facebook X (Twitter) Pinterest YouTube WhatsApp
    Our Picks

    Thailand Transforms Its Travel Landscape In 2025 With A Powerful Mix Of Extended Visa-Free Access, Smart Immigration Tech, And Investor-Friendly Policies

    June 9, 2025

    Elderly and Children with Adenovirus History at Increased Risk of COVID-19 Complications, ET HealthWorld

    June 9, 2025

    Meta becomes the latest big tech company turning to nuclear power for AI needs

    June 9, 2025
    Most Popular

    ChatGPT’s viral Studio Ghibli-style images highlight AI copyright concerns

    March 28, 20254 Views

    Best Cyber Forensics Software in 2025: Top Tools for Windows Forensics and Beyond

    February 28, 20253 Views

    An ex-politician faces at least 20 years in prison in killing of Las Vegas reporter

    October 16, 20243 Views

    Subscribe to Updates

    Get the latest creative news from FooBar about art, design and business.

    14 Trends
    Facebook X (Twitter) Instagram Pinterest YouTube Dribbble
    • Home
    • Buy Now
    © 2025 ThemeSphere. Designed by ThemeSphere.

    Type above and press Enter to search. Press Esc to cancel.