Prompt Engineering | Principles and Strategies for Writing Prompts

In order to better communicate with the large model (e.g. chatgpt), let's learn how to write prompt together!

Article directory

  • 1 Introduction
  • 2. Principles and strategies for writing prompts
    • 2.1. Write clear and specific instructions
      • 2.1.1. Strategy 1: Use separators to clearly represent different parts of the input
      • 2.1.2. Strategy 2: Require a structured output
      • 2.1.3. Strategy 3: Require the model to check whether the conditions are met
      • 2.1.4. Strategy 4: Provide a few examples
    • 2.2. Give the model time to think
      • 2.2.1. Strategy 1: Specify the steps required to complete the task
      • 2.2.2. Strategy 2: Guide the model to find its own solution before drawing conclusions
  • 3. Limitations

1. Introduction

  • With the development of large language model (LLM), LLM can be roughly divided into two types, namely base LLM and instruction fine-tuning LLM. The basic LLM is the previous pre-trained language model (such as GPT-1, GPT-2, GPT-3). Instruction fine-tuning LLM is based on the already trained basic LLM, and then fine-tunes it using a data set whose input is an instruction (prompt is a prompt) and whose output is the result it should return, requiring it to follow these instructions. This is then further refined, often using a technique called RLHF (reinforcement learning from human feedback), to make the system more helpful in following instructions. Because instruction-tuned LLMs have been trained to be helpful, honest, and harmless, they are less likely to output problematic text, such as harmful output, than base LLMs.
  • When you use instructions to fine-tune an LLM, think of it like giving instructions to another human being, assuming it is someone who is smart but doesn’t know the specifics of your task. When LLM doesn’t work properly, sometimes it’s because the instructions aren’t clear enough. So it is very important to learn how to ask questions better, that is, how to be a prompt engineer

2. Principles and strategies for writing prompt

2 principles:

  • 1. Write clear, specific instructions
  • 2. Give the model time to think

2.1. Write clear and specific instructions

Clear and specific instructions will guide the model to give more accurate and detailed output!

2.1.1, Strategy 1: Use separators to clearly represent different parts of the input

You can use any obvious punctuation mark to separate a specific text part from the rest of the prompt, the delimiter can be: `, "", <>. This is to prevent specific text from conflicting with the prompt, causing the prompt to be ambiguous.

For example, enter a prompt to let chatgpt summarize:
[Note: \ is used in the following code to adapt the text to the screen size to improve the reading experience. GPT is not affected by \, but when you call other large models, you need to consider whether \ will affect model performance. 】

text = f"""
You should provide as clear and specific instructions as possible about what you want your model to do. \
This will steer the model towards the desired output and reduce the chance of receiving irrelevant or incorrect responses. \
Don't confuse writing clear prompts with writing short prompts. \
In many cases, longer cues can provide the model with more clarity and contextual information, leading to more detailed and relevant output.
"""
# The text content that needs to be summarized
prompt = f""" Summarize the text enclosed in three backticks into one sentence. ```{<!-- -->text}```"""

output:
Provide clear and specific instructions to avoid irrelevant or incorrect responses. Don't confuse writing clear with writing short. Longer prompts can provide more clarity and context, leading to more detailed and relevant output.

2.1.2, Strategy 2: Require a structured output

You can ask the model to generate a structured output (such as Json, HTML, etc.), which can make the output of the model easier for us to parse.

prompt = f"""
Please generate a list of 3 fictional books including title, author and category, \
and provided in JSON format with the following keys: book_id, title, author, genre.
"""

Generated as follows:
{<!-- -->
  "books": [
    {<!-- -->
      "book_id": 1,
      "title": "The Shadow of the Wind",
      "author": "Carlos Ruiz Zafón",
      "genre": "Mystery"
    },
    {<!-- -->
      "book_id": 2,
      "title": "The Name of the Wind",
      "author": "Patrick Rothfuss",
      "genre": "Fantasy"
    },
    {<!-- -->
      "book_id": 3,
      "title": "The Hitchhiker's Guide to the Galaxy",
      "author": "Douglas Adams",
      "genre": "Science Fiction"
    }
  ]
}

2.1.3, Strategy 3: Ask the model to check whether the conditions are met

If the assumptions made by the task are not necessarily satisfied, we can tell the model to check these assumptions first, and if not, indicate and stop execution. You can also consider potential edge cases and how the model should handle them to avoid unexpected errors or results.

The following will let chatgpt judge whether the prompt contains clear step instructions. If it contains, it will rewrite the instruction according to the given format. If it does not contain it, it will answer that no steps are provided.

  • Text with steps:
text_1 = f"""
Making a cup of tea is easy. First, the water needs to be boiled. \
While you wait, grab a mug and put the tea bags in it. \
Once the water is hot enough, pour it over the tea bags. \
Wait a while for the tea to steep. After a few minutes, remove the tea bag. \
Add some sugar or milk to taste if you like. \
That's it, you can enjoy a delicious cup of tea.
"""
prompt = f"""You will get the text surrounded by triple quotes. If it contains a series of instructions, you need to rewrite the instructions in the following format:
Step One -...
Step Two -…
…
Step N -...
If the text does not contain a series of instructions, simply write "No steps provided". "
"""{<!-- -->text_1}"""
"""

output:
Step one - bring the water to a boil.
Step Two - Take a mug and put the tea bags in it.
Step Three - Pour boiling water over the tea bags.
Step Four - Wait a few minutes for the tea leaves to steep.
Step Five - Remove the tea bag.
Step Six - Add some sugar or milk to taste if you like.
Step Seven - That's it, you can enjoy a delicious cup of tea.
  • Text without steps:
text_2 = f"""
Today is sunny and the birds are singing. \
It's a beautiful day to go for a walk in the park. \
Flowers are blooming and the branches are swaying gently in the breeze. \
People were out enjoying the beautiful weather, some were picnicking, others were playing games or relaxing on the grass. \
It's a perfect day to spend outside and enjoy the beauty of nature.
"""
prompt = f"""
You will get the text surrounded by triple quotes. If it contains a series of instructions, they need to be rewritten in the following format:
Step One -...
Step Two -…
…
Step N -...

If the text does not contain a series of instructions, simply write "No steps provided". "
"""{<!-- -->text_2}"""
"""

output:
Steps not provided.

2.1.4, Strategy 4: Provide a few examples

That is the few-shot method in gpt3, before asking the model to perform actual tasks, provide it with a small number of examples of successful task execution

prompt = f"""
Your task is to answer questions in a consistent style.

<child>: Teach me patience.

<Grandparents>: The river that digs the deepest canyon begins at an obscure spring; the grandest symphony begins with a single note; the most intricate tapestry begins with a single thread.

<child>: Teach me resilience.
"""

output:
<Grandparents>: Resilience is like a tree. It needs to go through wind, rain, winter and heat to grow stronger. In life, we also need to experience various setbacks and difficulties in order to exercise resilience. Remember, don't give up easily, stick to it, you will find yourself stronger.

2.2. Give the model time to think

Guide the model to perform a series of related reasoning before arriving at the final answer, instructing the model to spend more time thinking about the problem, which means it spends more computing resources on the task. Prevent the model from drawing potentially erroneous conclusions in a short period of time or with a small amount of words.

2.2.1. Strategy 1: Specify the steps required to complete the task

For some complex tasks, providing the model with a series of steps of the task can help to generate more satisfactory results.

text = f"""
In a charming village, brother and sister Jack and Jill set off to fetch water from a hilltop well. \
They climbed up while singing joyful songs, \
Unfortunately, however, Jack stumbled over a rock and rolled down the hill, followed by Jill. \
Although slightly injured, they returned to their warm home. \
Despite this accident, their adventurous spirit remained undiminished, and they continued to explore with joy.
"""

prompt_2 = f"""
1- Summarize the following text enclosed by <> in one sentence.
2- Translate the abstract into English.
3- List each name in English summary.
4- Output a JSON object with the following keys: English_summary, num_names.

Please use the following format:
Text: [text to summarize]
Summary: [Summary]
Translation: [translation of abstract]
Names: [list of names in English abstract]
Output JSON: [JSON with English_summary and num_names]

Text: <{<!-- -->text}>
"""

output:
Summary: Siblings Jack and Jill have an adventure in their charming village and return home after a tragic fall, but still full of adventure.
Translation: In a charming village, siblings Jack and Jill set out to fetch water from a mountaintop well. While climbing and singing, Jack trips on a stone and tumbles down the mountain, with Jill following closely behind. Despite some bruises, they make it back home safely. Their adventurous spirit remains undifinished as they continue to explore with joy.
Name: Jack, Jill
Output JSON: {<!-- -->"English_summary": "In a charming village, siblings Jack and Jill set out to fetch water from a mountaintop well. While climbing and singing, Jack trips on a stone and tumbles down the mountain , with Jill following closely behind. Despite some bruises, they make it back home safely. Their adventurous spirit remains undiminished as they continue to explore with joy.", "num_names": 2}

2.2.2. Strategy 2: Guide the model to find its own solution before drawing conclusions

Sometimes you ask the model directly, for example: "Judge whether my solution is correct, now I want to solve a xx problem, my solution is xx."; the model is likely to be misled by my solution as correct , but actually your solution is wrong. At this time, if the model first finds a solution by itself, and then compares it with my solution, the solution obtained in this way is more likely to be model-based.

  • Examples of model mistakes:
prompt = f"""
Judge whether the student's solution is correct.

question:
I'm building a solar power plant and need help figuring out the finances.

    Land fee is $100/sqft
    I can buy solar panels for $250/sqft
    I have negotiated a maintenance contract that pays a fixed $100,000 per year plus an additional $10 per square foot
    What is the total cost of operation for the first year as a function of square footage.

Student's solution:
Let x be the size of the power station in square feet.
cost:

    Land cost: 100x
    Solar Panel Cost: 250x
    Maintenance Fee: $100,000 + 100x
    Total cost: 100x + 250x + $100,000 + 100x=450x + $100,000
"""

output:
The student's solution is correct.
  • Example of model answering correctly:
prompt = f"""
Please judge whether the student's solution is correct, please solve this problem through the following steps:

step:

    First, solve the problem yourself.
    Then compare your solution to the student's solution and assess whether the student's solution is correct. Do not decide whether a student's solution is correct until you have completed the problem yourself.

Use the following format:

    question: question text
    Student Solutions: Student Solution Text
    Actual Solutions and Steps: Actual Solutions and Steps Text
    Is the student's solution the same as the actual solution: Yes or No
    Student's Grade: Correct or Incorrect

question:

    I'm building a solar power plant and need help figuring out the finances.
    - Land fee is $100 per square foot
    - I can buy solar panels for $250 per square foot
    - I have negotiated a maintenance contract that pays a fixed $100,000 per year plus an additional $10 per square foot
    What is the total cost of operation for the first year as a function of square footage.

Student's solution:

    Let x be the size of the power station in square feet.
    cost:
    1. Land fee: 100x
    2. Solar panel fee: 250x
    3. Maintenance fee: 100,000 + 100x
    Total fee: 100x + 250x + 100,000 + 100x=450x + 100,000

Actual solution and steps:
"""

output:
Correct solution and steps:
1. Calculate land cost: $100/sqft * x sqft = 100x $
2. Calculate solar panel cost: $250/square foot * x square foot = $250x
3. Calculate maintenance costs: $100,000 + $10/sqft * x sqft = $100,000 + $10x
4. Calculate the total fee: 100x USD + 250x USD + 100,000 USD + 10x USD = 360x + 100,000 USD

Is the student's solution the same as the actual solution: No

Student's Grade: Incorrect

3. Limitations

  • Occasionally, the model will do serious nonsense, that is, generate false content that appears to be true but is not true or exists, also known as “hallucination” or hallucination.
  • So learning the skills of building prompts will help you avoid this situation when building your own LLM applications.
  • In situations where you want the model to generate answers based on text, another strategy for reducing hallucinations is to first ask the model to find any relevant references in the text, and then ask it to use those references to answer the question. It is very helpful to reduce hallucinations.

?
?
?
?
?
?
Reference link:
[1] OpenAI
[2] Teacher Wu Enda’s: DeepLearning.AI
[3] DataWhale
[4] https://learn.deeplearning.ai/