    Prompting for the best price-performance

    By admin | April 5, 2025


    In the drive to remain competitive, businesses today are turning to AI to help them minimize cost and maximize efficiency. It’s incumbent on them to find the most suitable AI model—the one that will help them achieve more while spending less. For many businesses, the migration from OpenAI’s model family to Amazon Nova represents not only a shift in model but a strategic move toward scalability, efficiency, and broader multimodal capabilities.

    In this post, we discuss how to optimize prompting in Amazon Nova for the best price-performance.

    Why migrate from OpenAI to Amazon Nova?

    OpenAI’s models remain powerful, but their operational costs can be prohibitive when scaled. Consider these figures from Artificial Analysis:

    | Model | Input Token Cost (per Million Tokens) | Output Token Cost (per Million Tokens) | Context Window | Output Speed (Tokens per Second) | Latency (Seconds to First Token) |
    |---|---|---|---|---|---|
    | GPT-4o | ~$2.50 | ~$10.00 | Up to 128K tokens | ~63 | ~0.49 |
    | GPT-4o Mini | ~$0.15 | ~$0.60 | Up to 128K tokens | ~90 | ~0.43 |
    | Nova Micro | ~$0.035 | ~$0.14 | Up to 128K tokens | ~195 | ~0.29 |
    | Nova Lite | ~$0.06 | ~$0.24 | Up to 300K tokens | ~146 | ~0.29 |
    | Nova Pro | ~$0.80 | ~$3.20 | Up to 300K tokens | ~90 | ~0.34 |

    For high-volume applications, such as global customer support or large-scale document analysis, these cost differences are substantial. Amazon Nova Pro offers more than three times the cost-efficiency of GPT-4o, and its longer context window also enables it to handle more extensive and complex inputs.
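To make the gap concrete, here is a back-of-the-envelope calculation using the list prices from the preceding table (actual pricing can vary by region, and the workload volumes are made up for illustration):

```python
# Approximate per-million-token prices (USD) from the table above.
PRICES = {
    "gpt-4o":   {"input": 2.50, "output": 10.00},
    "nova-pro": {"input": 0.80, "output": 3.20},
}

def monthly_cost(model: str, input_tokens_m: float, output_tokens_m: float) -> float:
    """Estimated monthly cost for a workload measured in millions of tokens."""
    p = PRICES[model]
    return input_tokens_m * p["input"] + output_tokens_m * p["output"]

# A hypothetical support workload: 500M input and 100M output tokens per month.
gpt = monthly_cost("gpt-4o", 500, 100)      # 500*2.50 + 100*10.00 = 2250.0
nova = monthly_cost("nova-pro", 500, 100)   # 500*0.80 + 100*3.20  ≈ 720.0
print(f"GPT-4o: ${gpt:,.2f}  Nova Pro: ${nova:,.2f}  ratio: {gpt / nova:.2f}x")
```

At these volumes the ratio works out to roughly 3.1x, matching the "over three times" figure above.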

    Breaking down the Amazon Nova suite

    Amazon Nova isn’t a single model—it’s a suite designed for various needs:

    • Amazon Nova Pro – A robust multimodal model that can process text, images, and video. It excels at tasks such as document analysis and deep data visualization. Benchmark comparisons show Amazon Nova Pro matching or even surpassing GPT-4o on complex reasoning tasks, according to section 2.1.1 of the Nova technical report and model card.
    • Amazon Nova Lite – Offers a balanced mix of multimodal processing and speed. Amazon Nova Lite is ideal for applications such as document summarization, translation, and even basic visual search, delivering quality outputs at lower latency and cost compared to GPT-4o Mini. You can find these benchmark results in section 2.1.2 of the Nova technical report and model card.
    • Amazon Nova Micro – A text-only model engineered for ultra-low latency. With an output speed of up to 195 tokens per second, Amazon Nova Micro is perfect for real-time applications such as chat-based assistants and automated FAQs. Its token costs are dramatically lower than those of GPT-4o Mini—roughly 4.3 times cheaper on a per-token basis.

    The lower per-token costs and higher output per second of Amazon Nova give you the flexibility to simplify prompts for real-time applications so you can balance quality, speed, and cost for your use case.

    Understanding the foundations

    To make the best decision about which model family fits your needs, it’s important to understand the differences in prompt engineering best practices in both OpenAI and Amazon Nova. Each model family has its own strengths, but some principles apply to both. Across both model families, accuracy and output quality are achieved through clarity of instructions, structured prompts, and iterative refinement. Whether you’re using strong output directives or clearly defined use cases, the goal is to reduce ambiguity and improve response quality.

    The OpenAI approach

    OpenAI uses a layered messaging system for prompt engineering, where system, developer, and user prompts work in harmony to control tone, safety, and output format. Their approach emphasizes:

    • Hierarchical message roles – System messages set the model’s role and behavior, ensuring that the overarching safety and style guidelines are preserved
    • Instruction placement and delimiters – Directives are placed at the beginning, with clear separation between context, examples, and queries
    • Selective chain-of-thought – Detailed, step-by-step reasoning is used when it benefits complex tasks
    • Formatting and structure – Using strong directives such as DO, MUST, and DO NOT to provide consistent outputs (for example, in JSON)
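These practices can be sketched as a chat-completion request. The role contents and directives below are illustrative, not tied to any particular application:

```python
# Layered roles: the system message fixes role, safety, and style; the user
# message carries delimited context and the actual query.
messages = [
    {
        "role": "system",
        "content": (
            "You are a support-ticket summarizer. You MUST respond in valid "
            "JSON with keys 'summary' and 'sentiment'. DO NOT include any "
            "prose outside the JSON object."
        ),
    },
    {
        "role": "user",
        "content": (
            "### Context ###\n<ticket text here>\n\n"
            "### Task ###\nSummarize the ticket and classify its sentiment."
        ),
    },
]
# These messages would be passed to client.chat.completions.create(...)
```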

    The Amazon Nova approach

    • Define the prompt use case
      • Task – What exactly the model should do
      • Role – Which role the model should assume
      • Response style – The structure or tone of the output
      • Instructions – Guidelines the model must follow
    • Chain-of-thought
      • Explicitly state response constraints – Provide clear, strong instructions that bound the model’s response
      • Structured thinking – Asking the model to think step by step encourages structured reasoning
    • Formatting and structure
      • Use delimiters to section your prompts, for example, ##Task##, ##Context##, or ##Example##
      • Specify the output format, for example, JSON, YAML, or Markdown
      • Use strong instructions and caps, such as DO, DO NOT, or MUST
      • Prefill the response to guide the model, for example, start with "{" or "```json"
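The structure above can be rolled into a small helper. This is a sketch of the convention, not a required format; the section names and directives are examples:

```python
def build_nova_prompt(task: str, context: str, example: str = "") -> str:
    """Assemble a Nova-style prompt with ##-delimited sections."""
    sections = [f"##Task##\n{task}", f"##Context##\n{context}"]
    if example:
        sections.append(f"##Example##\n{example}")
    # Strong, capitalized directives improve output consistency.
    sections.append("You MUST respond in JSON. DO NOT add commentary.")
    return "\n\n".join(sections)

prompt = build_nova_prompt(
    task="Extract the invoice total.",
    context="Invoice #123 ... Total due: $41.50",
)
# Prefilling the assistant turn with "{" then steers the model straight into JSON.
```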

    Evolving your prompt strategy: Migrating from OpenAI to Amazon Nova

    Transitioning to Amazon Nova isn’t merely a change in API endpoints—it requires retooling your prompt engineering to align with the strengths of Amazon Nova. Begin by reframing your use case definition: break down your current GPT-4o or GPT-4o Mini prompt into its core elements of task, role, response style, instructions, and success criteria. Structure these elements clearly to provide a blueprint for the model.

    To understand how to migrate an existing OpenAI prompt to work optimally for Amazon Nova Pro, consider the following example using the meeting notes summarizer. Here is the GPT-4o system prompt:

    You will be provided with meeting notes, and your task is to summarize the meeting 
    as follows: 
    -Overall summary of discussion 
    -Action items (what needs to be done and who is doing it) 
    -If applicable, a list of topics that need to be discussed more fully in the next meeting.

    The user prompt is the meeting notes that need to be summarized:

    Meeting Date: March 5th, 2050
    Meeting Time: 2:00 PM
    Location: Conference Room 3B, Intergalactic Headquarters...

    GPT produces this helpful response:

    Overall Summary of Discussion: The team discussed the recent mission to Planet Zog, 
    the space pirate issue in Sector 7, the annual Intergalactic Bake-Off, planning for 
    the upcoming charity fundraiser, and a proposed team-building retreat. The mission to 
    Planet Zog was successful, but communication with the Zogians was challenging. The team 
    also discussed the need for a better strategy to deal with space pirates. The team placed 
    second in the Intergalactic Bake-Off and is planning for the upcoming charity fundraiser. 
    A team-building retreat to the Moon Resort and Spa was proposed.
    
    Action Items:
    1. Dr. Quasar to start working on a Zogian-English dictionary.
    2. Captain Stardust to speak with Admiral Starbeam about increasing patrols in Sector 7.
    3. Dr. Quasar to finalize the prototype of the new cloaking technology.
    4. All team members to start gathering supplies and preparing games for the Intergalactic 
    Charity Bazaar.
    5. Captain Stardust to check the budget for the proposed team-building retreat to the Moon 
    Resort and Spa.
    
    Topics for Next Meeting:
    1. Update on the Zogian-English dictionary.
    2. Progress report on the cloaking technology.
    3. Results of increased patrols in Sector 7.
    4. Final preparations for the Intergalactic Charity Bazaar.

    To meet or exceed the quality of the response from GPT-4o, here is what an Amazon Nova Pro prompt might look like. The prompt uses the same best practices discussed in this post, starting with the system prompt. We used a temperature of 0.2 and a topP of 0.9 here:

    You are an experienced executive assistant skilled in meeting note analysis and 
    summarization. Your primary responsibilities include distilling complex discussions
    into clear, actionable summaries.
    Follow these instructions:
    
    ##INSTRUCTIONS##
    1. Read and understand the meeting notes found in ##NOTES##
    2. Put all of your outputs in a section called ##OUTPUTS## in markdown formatting
    3. Summarize the meeting notes in 5 sentences or less. Put this in a section called 
    "Overall Summary".
    4. Numerically list any action items for specific people and what needs to be completed. 
    Put this list in a section called "Action Items".
    5. If applicable, list the topics that need to be discussed more fully in the next meeting. 
    Put this in a section called "Topics for Next Meeting".

    Here’s the user prompt, using prefilled responses:

    ##NOTES##
    Meeting Date: March 5th, 2050
    Meeting Time: 2:00 PM
    Location: Conference Room 3B, Intergalactic Headquarters
    Attendees:
    - Captain Stardust
    - Dr. Quasar
    - Lady Nebula
    - Sir Supernova
    - Ms. Comet
    Meeting called to order by Captain Stardust at 2:05 PM
    1. Introductions and welcome to our newest team member, Ms. Comet
    2. Discussion of our recent mission to Planet Zog
    - Captain Stardust: "Overall, a success, but communication with the Zogians was difficult. 
    We need to improve our language skills."
    - Dr. Quasar: "Agreed. I'll start working on a Zogian-English dictionary right away."
    - Lady Nebula: "The Zogian food was out of this world, literally! We should consider having 
    a Zogian food night on the ship."
    3. Addressing the space pirate issue in Sector 7
    - Sir Supernova: "We need a better strategy for dealing with these pirates. They've already 
    plundered three cargo ships this month."
    - Captain Stardust: "I'll speak with Admiral Starbeam about increasing patrols in that area."
    - Dr. Quasar: "I've been working on a new cloaking technology that could help our ships avoid 
    detection by the pirates. I'll need a few more weeks to finalize the prototype."
    4. Review of the annual Intergalactic Bake-Off
    - Lady Nebula: "I'm happy to report that our team placed second in the competition! Our Martian Mud 
    Pie was a big hit!"
    - Ms. Comet: "Let's aim for first place next year. I have a secret recipe for Jupiter Jello that I 
    think could be a winner."
    5. Planning for the upcoming charity fundraiser
    - Captain Stardust: "We need some creative ideas for our booth at the Intergalactic Charity Bazaar."
    - Sir Supernova: "How about a 'Dunk the Alien' game? We can have people throw water balloons at a 
    volunteer dressed as an alien."
    - Dr. Quasar: "I can set up a 'Name That Star' trivia game with prizes for the winners."
    - Lady Nebula: "Great ideas, everyone. Let's start gathering the supplies and preparing the games."
    6. Upcoming team-building retreat
    - Ms. Comet: "I would like to propose a team-building retreat to the Moon Resort and Spa. It's a great 
    opportunity to bond and relax after our recent missions."
    - Captain Stardust: "Sounds like a fantastic idea. I'll check the budget and see if we can make it happen."
    7. Next meeting agenda items
    - Update on the Zogian-English dictionary (Dr. Quasar)
    - Progress report on the cloaking technology (Dr. Quasar)
    - Results of increased patrols in Sector 7 (Captain Stardust)
    - Final preparations for the Intergalactic Charity Bazaar (All)
    Meeting adjourned at 3:15 PM. Next meeting scheduled for March 19th, 2050 at 2:00 PM in Conference Room 3B, 
    Intergalactic Headquarters.
    
    ##OUTPUTS##

    The following example shows that the Amazon Nova response meets and exceeds the accuracy of the OpenAI example, formats the output in Markdown, and has found clear owners for each action item:

    ### Overall Summary
    The meeting covered the successful mission to Planet Zog, the ongoing issue with 
    space pirates in Sector 7, the results of the annual Intergalactic Bake-Off, plans 
    for the upcoming charity fundraiser, and a proposed team-building retreat. Action items 
    were assigned, and the next meeting's agenda was set.
    
    ### Action Items
    1. **Dr. Quasar**: Begin working on a Zogian-English dictionary.
    2. **Captain Stardust**: Speak with Admiral Starbeam about increasing patrols in Sector 7.
    3. **Dr. Quasar**: Finalize the prototype for the new cloaking technology within a few weeks.
    4. **Team**: Start gathering supplies and preparing games for the Intergalactic Charity Bazaar.
    5. **Captain Stardust**: Check the budget for the proposed team-building retreat to the Moon Resort and Spa.
    
    ### Topics for Next Meeting
    1. Update on the Zogian-English dictionary (Dr. Quasar)
    2. Progress report on the cloaking technology (Dr. Quasar)
    3. Results of increased patrols in Sector 7 (Captain Stardust)
    4. Final preparations for the Intergalactic Charity Bazaar (All)

    A few updates to the prompt achieve comparable or better results from Amazon Nova Pro at a much lower inference cost.
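Wired up against the Amazon Bedrock Converse API with the parameters used above (temperature 0.2, topP 0.9), the call would look roughly like the following sketch. The Nova Pro model ID and the boto3 invocation follow the standard Bedrock Runtime pattern; the prompt strings are elided stand-ins for the full text shown earlier:

```python
system_prompt = "You are an experienced executive assistant ..."  # full system prompt as above
meeting_notes = "Meeting Date: March 5th, 2050 ..."               # full meeting notes as above

request = {
    "modelId": "us.amazon.nova-pro-v1:0",
    "system": [{"text": system_prompt}],
    "messages": [{
        "role": "user",
        "content": [{"text": f"##NOTES##\n{meeting_notes}\n\n##OUTPUTS##"}],
    }],
    "inferenceConfig": {"temperature": 0.2, "topP": 0.9},
}

# import boto3
# client = boto3.client("bedrock-runtime")
# response = client.converse(**request)
# print(response["output"]["message"]["content"][0]["text"])
```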

    Employ Amazon Nova extended context

    Amazon Nova Lite and Amazon Nova Pro can support up to 300,000 input tokens, which means that you can include more context in your prompt if needed. Expand your background data and detailed instructions accordingly—if your original OpenAI prompt was optimized for 128,000 tokens, adjust it to use the Amazon Nova extended window.

    Tailor output constraints

    If your GPT prompt required strict formatting (for example, “Respond in JSON only”), make sure that your Amazon Nova prompt includes these directives. Additionally, if your task involves multimodal inputs, specify when to include images or video references.
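For strict JSON output on Amazon Nova, the directive-plus-prefill pattern mentioned earlier looks like this in Converse-style messages (the task and context are illustrative):

```python
messages = [
    {"role": "user", "content": [{"text": (
        "##Task##\nReturn the user's name and age as JSON.\n"
        "You MUST respond with valid JSON only. DO NOT add commentary.\n"
        "##Context##\nAlice is 34 years old."
    )}]},
    # Prefilled assistant turn: the model continues from the opening brace,
    # which strongly biases it toward emitting pure JSON.
    {"role": "assistant", "content": [{"text": "{"}]},
]
```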

    Function calling

    The rise of generative AI agents has made function calling, or tool calling, one of the most important abilities of a given large language model (LLM). A model’s ability to correctly pick the right tool for the job, in a low-latency manner, is often the difference between success and failure of an agentic system.

    Both OpenAI and Amazon Nova models share similarities in function calling, in particular their support for structured API calls. Both model families support tool selection through defined tool schemas, which we discuss later in this post. They also both provide a mechanism to decide when to invoke these tools or not.

    OpenAI’s function calling uses flexible JSON schemas to define and structure API interactions. The models support a wide range of schema configurations, which give developers the ability to quickly implement external function calls through straightforward JSON definitions tied to their API endpoints.

    Here is an example of a function:

    tools = [{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current temperature for a given location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "City and country e.g. Montevideo, Uruguay"
                    }
                },
                "required": ["location"],
                "additionalProperties": False
            },
            "strict": True
        }
    }]
    
    completion = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "What is the weather like in Punta del Este today?"}],
        tools=tools
    )
    Similar to OpenAI’s approach, Amazon Nova can call tools when passed a configuration schema, as shown in the following code example. Amazon Nova makes heavy use of greedy decoding when calling tools, so it’s advised to set temperature, topP, and topK to 1. This gives the model the highest accuracy in tool selection. These greedy decoding parameters and other tool use examples are covered in detail in Tool use (function calling) with Amazon Nova.

    The following is an example of function calling without using additionalModelRequestFields:

    tool_config = {
        "tools": [{
            "toolSpec": {
                "name": "get_recipe",
                "description": "Structured recipe generation system",
                "inputSchema": {
                    "json": {
                        "type": "object",
                        "properties": {
                            "recipe": {
                                "type": "object",
                                "properties": {
                                    "name": {"type": "string"},
                                    "ingredients": {
                                        "type": "array",
                                        "items": {
                                            "type": "object",
                                            "properties": {
                                                "item": {"type": "string"},
                                                "amount": {"type": "number"},
                                                "unit": {"type": "string"}
                                            }
                                        }
                                    },
                                    "instructions": {
                                        "type": "array",
                                        "items": {"type": "string"}
                                    }
                                },
                                "required": ["name", "ingredients", "instructions"]
                            }
                        }
                    }
                }
            }
        }]
    }
    
    # Base configuration without topK=1
    input_text = "I need a recipe for chocolate lava cake"
    messages = [{
        "role": "user",
        "content": [{"text": input_text}]
    }]
    
    # Inference parameters
    inf_params = {"topP": 1, "temperature": 1}
    
    response = client.converse(
        modelId="us.amazon.nova-lite-v1:0",
        messages=messages,
        toolConfig=tool_config,
        inferenceConfig=inf_params
    )
    # Typically produces less structured or incomplete output
     

    The following example shows how function calling accuracy can be improved by using additionalModelRequestFields:

    # Enhanced configuration with topK=1
    response = client.converse(
        modelId="us.amazon.nova-lite-v1:0",
        messages=messages,
        toolConfig=tool_config,
        inferenceConfig=inf_params,
        additionalModelRequestFields={"inferenceConfig": {"topK": 1}}
    )
    # Produces more accurate and structured function call
    

    To maximize Amazon Nova function calling potential and improve accuracy, always use additionalModelRequestFields with topK=1. This forces the model to select the single most probable token at each step, preventing random token selection. This makes output generation more deterministic and improves function call precision by about 30–40%.

    The following code examples further explain how to conduct tool calling successfully. The first scenario shows recipe generation without an explicit tool. The example doesn’t use topK, which typically results in responses that are less structured:

    input_text = """
    I'm looking for a decadent chocolate dessert that's quick to prepare. 
    Something that looks fancy but isn't complicated to make.
    """
    
    messages = [{
        "role": "user",
        "content": [{"text": input_text}]
    }]
    
    response = client.converse(
        modelId="us.amazon.nova-lite-v1:0",
        messages=messages,
        inferenceConfig={"topP": 1, "temperature": 1}
    )
    # Generates a conversational recipe description
    # Less structured, more narrative-driven response
    

    In this example, the scenario shows recipe generation with a structured tool. We add topK set to 1, which produces a more structured output:

    response = client.converse(
        modelId="us.amazon.nova-lite-v1:0",
        messages=messages,
        toolConfig=tool_config,
        inferenceConfig={"topP": 1, "temperature": 1},
        additionalModelRequestFields={"inferenceConfig": {"topK": 1}}
    )
    # Generates a highly structured, JSON-compliant recipe
    # Includes precise ingredient measurements
    # Provides step-by-step instructions

    Overall, OpenAI offers more flexible, broader schema support. Amazon Nova provides more precise, controlled output generation and is the best choice when working with high-stakes, structured data scenarios, as demonstrated in Amazon Nova’s performance on the IFEval benchmark discussed in section 2.1.1 of the technical report and model card. We recommend using Amazon Nova for applications requiring predictable, structured responses because its function calling methodology provides superior control and accuracy.

    Conclusion

    The evolution from OpenAI’s models to Amazon Nova represents a significant shift in using AI. It shows a transition toward models that deliver similar or superior performance at a fraction of the cost, with expanded capabilities in multimodal processing and extended context handling.

    Whether you’re using the robust, enterprise-ready Amazon Nova Pro, the agile and economical Amazon Nova Lite, or the versatile Amazon Nova Micro, the benefits are clear:

    • Cost savings – With token costs up to four times lower, businesses can scale applications more economically
    • Enhanced response performance – Faster response times (up to 195 tokens per second) make real-time applications more viable
    • Expanded capabilities – A larger context window and multimodal support unlock new applications, from detailed document analysis to integrated visual content

    By evolving your prompt strategy—redefining use cases, exploiting the extended context, and iteratively refining instructions—you can smoothly migrate your existing workflows from OpenAI’s GPT-4o and GPT-4o Mini models to Amazon Nova.


    About the Authors

    Claudio Mazzoni is a Sr Specialist Solutions Architect on the Amazon Bedrock GTM team. Claudio excels at guiding customers through their generative AI journey. Outside of work, Claudio enjoys spending time with family, working in his garden, and cooking Uruguayan food.

    Pat Reilly is a Sr. Specialist Solutions Architect on the Amazon Bedrock Go-to-Market team. Pat has spent the last 15 years in analytics and machine learning as a consultant. When he’s not building on AWS, you can find him fumbling around with wood projects.


