寫給開發者看的 Prompt Engineering

開始之前#

所謂 Prompt Engineering（提示工程），就是與 AI 進行有效溝通以實現預期效果的過程。至於為什麼需要 PE 及其相關原理，並不是這篇文章的重點，感興趣的同學可以看這篇文章¹。

本文想要介紹在提示工程中，一些對於開發者有用的技巧²，以及分析在開源項目中的具體應用。

一些技巧#

編寫明確具體的說明#

使用分隔符#

分隔符可以是任何形式，例如：

'''text'''

"""text"""

< text >

<tag>text</tag>

const text = `
通過提供盡可能明確和具體的說明來表達你希望模型執行的任務。
這將引導模型朝著預期的輸出方向發展，並減少收到無關或不正確回覆的可能性。
不要混淆編寫清晰提示和編寫簡短提示。
在大多數情況下，
較長的提示可以為模型提供更明確的上下文，從而產生更詳細和更具相關性的輸出結果。
`

const prompt = `
將三個雙引號括起來的文本總結為一句話。
"""${text}"""
`

結構化輸出#

比如輸出形式為 JSON 或 HTML

const prompt = `
生成三本虛構類書籍的書名、作者和類型列表。 
使用以下鍵以 JSON 格式提供：book_id、title、author、genre。
`

檢查條件是否滿足#

const text = `
泡一杯茶很容易！首先，需要燒一些水。 
在水燒開的時候，拿一個杯子並把一個茶包放進去。 
然後把開水倒在茶包上。 
讓它浸泡一會兒，茶就可以泡好了。 
幾分鐘後，取出茶包。如果你喜歡，還可以加一些糖或牛奶。 
就這樣！你可以享受泡好的茶水了。
`

const prompt = `
如果文本包含一系列說明，請按以下格式重新編寫這些說明：

步驟1 - ...
步驟2 - ...
...
步驟N - ...

如果文本不包含一系列說明，則僅寫“未提供步驟”。

"""${text}"""
`

「Few-shot」提示#

「Few-shot」提示³是指向 AI 模型提供有限數量的示例，從而引導模型更好地執行任務。這是一種常用於訓練大語言模型（LLMs）的技術。

Few-shot prompting 的步驟如下：

選擇你想讓模型生成響應的領域或主題。可以是一種文本類型、語言方式等。
為模型提供少量的示例（提示），以作為後續樣例的條件。通常只需提供 2-5 個示例即可進行「few-shot」學習。
模型將分析提示中的模式、風格和結構。它將學習定義該領域響應的屬性。
讓模型在相同領域中生成新的響應。通過提示的條件化，它可以生成符合所需風格、結構等的響應。
評估響應並提供反饋以進一步改進模型。這可以是直接反饋給模型，也可以只是記錄下一組提示生成的領域中需要改進的地方。

給模型時間「思考」#

指定完成任務所需的步驟#

const text = `
在一個美麗的村莊裡，有一對兄妹傑克和吉爾。一天他們出發去從山頂的井中取水，
當他們歡快地唱著歌爬山時，不幸降臨了——傑克被石頭絆倒了，滾下山坡，吉爾也跟著摔了下來。 
雖然受了輕傷，萬幸兩人還是平安回家了。儘管發生了不幸，但他們的冒險精神卻絲毫沒有減弱，他們將繼續探索大自然。
`

// 示例
const prompt = `
執行以下操作：
1-請使用一句話概括給出的文本內容。
2-將摘要翻譯成法語。
3-列出法語摘要中的每個名稱。
4-輸出一個包含以下鍵的 json 對象：french_summary，num_names。

請使用換行符給出答案。

文本：
"""${text}"""
`

指示模型在決定之前先自行解決問題#

const prompt = `
你的任務是確定學生的解決方案是否正確。
要解決問題，請執行以下操作：
-首先，自己解決問題。
-然後將你的解決方案與學生的解決方案進行比較，並評估學生的解決方案是否正確。
在自己解決問題之前，請不要決定學生的解決方案是否正確。

使用以下格式：

問題：

"""
問題
""

學生的解決方案：

"""
學生的解決方案
"""

實際解決方案：

"""
解決方案的步驟和您的解決方案在這裡
"""

學生的解決方案是否與剛剛計算出的實際解決方案相同：

""
是或否
"""

學生的成績：

"""
正確或不正確
"""

問題：

"""
我正在建造一個太陽能電站，我需要幫助解決財務問題。
-土地成本為每平方英尺100美元
-我可以以每平方英尺250美元的價格購買太陽能電池板
-我協商了一個維護合同，每年將花費固定的10萬美元，以及額外的每平方英尺10美元
請計算出第一年的總成本是多少。
"""

學生的解決方案：

"""
設x為安裝面積（以平方英尺為單位）。
成本：
1.土地成本：100x
2.太陽能電池板成本：250x
3.維護成本：100,000+100x
總成本：100x+250x+100,000+100x=450x+100,000
"""

實際解決方案：
`

他山之石#

ai-code-translator#

這個項目⁴可以在不同的編程語言環境下轉換代碼。使用到了前面提到過的一些技巧，如使用分隔符來指示輸入和輸出的編程語言、提供「Few-shot」提示等。

const prompt = `
You are an expert programmer in all programming languages. Translate the "${inputLanguage}" code to "${outputLanguage}" code. Do not include \`\`\`.
  
      Example translating from JavaScript to Python:
  
      JavaScript code:
      for (let i = 0; i < 10; i++) {
        console.log(i);
      }
  
      Python code:
      for i in range(10):
        print(i)
      
      ${inputLanguage} code:
      ${inputCode}

      ${outputLanguage} code (no \`\`\`):
     `;
`

gpt-engineer#

我們再來看一個更複雜的項目 —— 基於描述來生成整個完整代碼庫，其中⁵使用了大量的 propmts。下面列舉部分步驟，來分析該步驟中使用到的 prompt。

Mermaid Loading...

const prompt_on_respec = `
You are a pragmatic principal engineer at Google.
You have been asked to review a specification for a new feature by a previous version of yourself

You have been asked to give feedback on the following:
- Is there anything that might not work the way intended by the instructions?
- Is there anything in the specification missing for the program to work as expected?
- Is there anything that can be simplified without significant drawback?

You are asked to make educated assumptions for each unclear item.
For each of these, communicate which assumptions you'll make when implementing the feature.

Think step by step to make sure we don't miss anything.
`

const prompt_on_gen_code = `
Please now remember the steps:

Think step by step and reason yourself to the right decisions to make sure we get it right.
First lay out the names of the core classes, functions, methods that will be necessary, As well as a quick comment on their purpose.

Then you will output the content of each file including ALL code.
Each file must strictly follow a markdown code block format, where the following tokens must be replaced such that
FILENAME is the lowercase file name including the file extension,
LANG is the markup code block language for the code's language, and CODE is the code:

FILENAME
CODE

Please note that the code should be fully functional. No placeholders.

You will start with the "entrypoint" file, then go to the ones that are imported by that file, and so on.
Follow a language and framework appropriate best practice file naming convention.
Make sure that files contain all imports, types etc. The code should be fully functional. Make sure that code in different files are compatible with each other.
Before you finish, double check that all parts of the architecture is present in the files.
`